
QUEBEC STUDIES IN THE PHILOSOPHY OF SCIENCE

PART I

BOSTON STUDIES IN THE PHILOSOPHY OF SCIENCE

Editor
ROBERT S. COHEN, Boston University

Editorial Advisory Board


THOMAS F. GLICK, Boston University
ADOLF GRÜNBAUM, University of Pittsburgh
SAHOTRA SARKAR, McGill University
SYLVAN S. SCHWEBER, Brandeis University
JOHN J. STACHEL, Boston University
MARX W. WARTOFSKY, Baruch College of the City University of New York

VOLUME 177

HUGUES LEBLANC
Courtesy of Virginia G. Leblanc

QUEBEC STUDIES IN THE PHILOSOPHY OF SCIENCE
Part I: Logic, Mathematics, Physics and History of Science
Essays in Honor of Hugues Leblanc

Edited by
MATHIEU MARION
University of Ottawa
and
ROBERT S. COHEN
Boston University

KLUWER ACADEMIC PUBLISHERS


DORDRECHT / BOSTON / LONDON

Library of Congress Cataloging-in-Publication Data

Quebec studies in the philosophy of science / edited by Mathieu Marion and Robert S. Cohen.
p. cm.
Contents: pt. I. Logic, mathematics, physics, and history of science
(alk. paper)
1. Science--Philosophy--Congresses. 2. Logic--Congresses.
I. Marion, Mathieu, 1962- . II. Cohen, R. S. (Robert Sonne)
Q174.Q43 1996
501--dc20
95-17467

ISBN13: 9789401072045

eISBN13: 9789400915756

DOI: 10.1007/9789400915756

Published by Kluwer Academic Publishers,
P.O. Box 17, 3300 AA Dordrecht, The Netherlands.
Kluwer Academic Publishers incorporates
the publishing programmes of
D. Reidel, Martinus Nijhoff, Dr W. Junk and MTP Press.
Sold and distributed in the U.S.A. and Canada
by Kluwer Academic Publishers,
101 Philip Drive, Norwell, MA 02061, U.S.A.
In all other countries, sold and distributed
by Kluwer Academic Publishers Group,
P.O. Box 322, 3300 AH Dordrecht, The Netherlands.

Printed on acid-free paper

All Rights Reserved
© 1995 Kluwer Academic Publishers
Softcover reprint of the hardcover 1st edition 1995
No part of the material protected by this copyright notice may be reproduced or
utilized in any form or by any means, electronic or mechanical,
including photocopying, recording or by any information storage and
retrieval system, without written permission from the copyright owner.

TABLE OF CONTENTS

EDITORIAL PREFACE

LOGIC
HUGUES LEBLANC / On Axiomatizing Free Logic - And Inclusive Logic in the Bargain
FRANÇOIS LEPAGE / Partial Propositional Logic
SERGE LAPIERRE / Generalized Quantifiers and Inferences
MARIE LA PALME REYES, JOHN MACNAMARA and GONZALO E. REYES / A Category-Theoretic Approach to Aristotle's Term Logic, with Special Reference to Syllogisms
JOACHIM LAMBEK / On the Nominalistic Interpretation of Natural Languages
JEAN-PIERRE MARQUIS / If Not-True and Not Being True Are Not Identical, Which One Is False?
DANIEL VANDERVEKEN / A New Formulation of the Logic of Propositions
YVON GAUTHIER / Internal Logic. A Radically Constructive Logic for Mathematics and Physics
JUDY PELHAM / A Reconstruction of Russell's Substitution Theory

PHILOSOPHY OF MATHEMATICS
MICHAEL HALLETT / Hilbert and Logic
MATHIEU MARION / Kronecker's 'Safe Haven of Real Mathematics'

PHILOSOPHY OF PHYSICS
MARIO BUNGE / Hidden Variables, Separability, and Realism
STORRS McCALL / A Branched Interpretation of Quantum Mechanics Which Differs from Everett's
MICHEL J. BLAIS / ... and Chaos Shall Set You Free...

HISTORY AND PHILOSOPHY OF SCIENCE
PAUL M. PIETROSKI / Other Things Equal, The Chances Improve
DAVID DAVIES / The Model-Theoretic Argument Unlocked
JEAN LEROUX / Helmholtz and Modern Empiricism
WILLIAM R. SHEA / Technology and the Rise of the Mechanical Philosophy

NOTES ON THE AUTHORS

NAME INDEX

EDITORIAL PREFACE

By North-American standards, philosophy is not new in Quebec: the first mention of philosophy lectures given by a Jesuit in the College de Quebec (founded
1635) dates from 1665, and the oldest logic manuscript dates from 1679. In
English-speaking universities such as McGill (founded 1829), philosophy
began to be taught later, during the second half of the 19th century. The major
influence on English-speaking philosophers was, at least initially, that of
Scottish Empiricism. On the other hand, the strong influence of the Catholic
Church on French-Canadian society meant that the staff of the facultés of the
French-speaking universities consisted, until recently, almost entirely of
Thomist philosophers. There was accordingly little or no work in modern
Formal Logic and Philosophy of Science and precious few contacts between the
philosophical communities. In the late forties, Hugues Leblanc was a young
student wanting to learn Formal Logic. He could not find anyone in Quebec
to teach him and he went to study at Harvard University under the supervision
of W. V. Quine. His best friend Maurice L'Abbé had left, a year earlier, for
Princeton to study with Alonzo Church.
After receiving his Ph.D. from Harvard in 1948, Leblanc started his professional career at Bryn Mawr College, where he stayed until 1967. He then went
to Temple University, where he taught until his retirement in 1992, serving as
Chair of the Department of Philosophy from 1973 until 1979. His achievements
as a logician include seminal contributions to the development of Free Logic, in
particular with the groundbreaking paper, written jointly with Theodore
Hailperin, 'Nondesignating Singular Terms' (Philosophical Review 68 (1959),
pp. 239-43). After initial results by Bas van Fraassen, using supervaluation,
Hugues Leblanc and Richmond Thomason obtained completeness results in
'Completeness Theorems for Some Presupposition-Free Logic' (Fundamenta
Mathematicae 62 (1968), pp. 125-64). More recently, Leblanc also made
seminal contributions to Truth-Value Semantics (cf. his Truth-Value Semantics,
Amsterdam, North-Holland, 1976) and, inspired by appendices to Karl Popper's
Logic of Scientific Discovery, to Probability Semantics and Probability Theory,
in his paper 'Probabilistic Semantics for First-Order Logic' (Zeitschrift für
mathematische Logik und Grundlagen der Mathematik 25 (1979), pp. 498-509). In all, Leblanc has written more than one hundred scientific papers, the
more recent of them in collaboration with Peter Roeper (Australian National
University), and four books; he also collaborated on two books and edited or co-edited four. Many logic students will remember learning the subject from his
classic textbook, written with William A. Wisdom, Deductive Logic (3rd edn.,
Englewood Cliffs, Prentice Hall, 1993).
After a long and fruitful career in the United States, Hugues Leblanc is now
back in Quebec, where the philosophical milieu has changed beyond recognition since his student days. He came back to find studies in logic and in all
aspects of philosophy of science in a flourishing state. As a result of the Révolution tranquille which took place in French-speaking society in the
sixties, philosophy in Quebec opened up to external influences such as, initially,
phenomenology and Marxism and, increasingly in the past twenty years, Anglo-American analytic philosophy. As a result, there is now a growing number of
French-speaking logicians and philosophers of science - although not all of
them work from the point of view of analytical philosophy. Conditions were set
for fruitful exchanges with the English-speaking philosophical community. (But
we should add here that the essential role of immigrants in the evolution of
the philosophical life in Quebec should not be overlooked. Contributors to the
present volumes come not only from other parts of Canada, but also from
Argentina, Australia, Belgium, Germany, Ireland, Switzerland, the United
Kingdom and the United States).
Such exchanges have led recently to the creation of research groups across
Quebec. These are now joined together under the name of Groupe de recherche
sur la représentation, l'action et le langage or GRRAL. Our two volumes
of Quebec Studies in the Philosophy of Science comprise the first full-scale
collection of studies in the philosophy and history of science from French- and
English-speaking philosophers of Quebec to appear in English; they include in
particular most members of the GRRAL. As editors, we are happy to join the
contributors in dedicating these volumes to Hugues Leblanc, who is, among
philosophers, the first logicien québécois.
In our first volume, which opens with a new essay on Free Logic by Hugues
Leblanc himself, we have collected together papers in logic, philosophy of
mathematics, philosophy of physics and in general philosophy and history of
science. This volume includes members of two of the research groups forming
the GRRAL, the group on Fondements de la logique et fondement du raisonnement (M. Hallett, F. Lepage, S. Lapierre, J.-P. Marquis, and, at the time of
writing, J. Pelham) and the group Actes du discours et grammaire universelle
(D. Vanderveken & Y. Gauthier).
The papers in the section on logic show the great variety of logical investigations carried out across Quebec. Both François Lepage and Serge Lapierre
present their results, respectively, on partial functions in type theory and conditional quantifiers, within their more global context. The following three papers
reflect the fundamental research in category theory which has been taking place
in Montreal's various departments of mathematics. Marie La Palme Reyes, John
Macnamara and Gonzalo Reyes argue in their paper for the replacement of the
standard Boolean, class interpretation of syllogistic by a category-theoretical
approach. Jim Lambek argues for an extension of his nominalistic interpretation
of the language of mathematics, developed in collaboration with Jocelyne
Couture and Phil Scott, to natural languages, and Jean-Pierre Marquis studies
the distinction between true, not-true and not being true within the perspective
of topos theory.
Further to his work on the logic of illocutionary forces, Daniel Vanderveken
presents in his paper a framework for a new logic of propositions. Yvon
Gauthier presents his own system of internal logic and system of finitist arithmetic, where the schema of complete induction is replaced by that of Fermat's
infinite descent. Finally, building on joint results with Alasdair Urquhart on
structured propositions, Judy Pelham presents in her paper a reconstruction of
Russell's substitution theory (circa 1905-6) that provides a resolution of the
paradoxes which originally caused Russell to abandon the theory in favour of
the ramified theory of types.
In the section on the philosophy of mathematics, Michael Hallett studies the
role of logic in Hilbert's approach to the foundations of mathematics, while
Mathieu Marion examines the relations between Kronecker's philosophy of
mathematics and the tradition of logical foundations. The section on philosophy
of physics comprises a paper by the distinguished philosopher of science Mario
Bunge, in which he argues that recent experiments which refuted hidden
variable theories were not a refutation of realism, a paper by Storrs McCall
presenting a new 'branched' interpretation of Quantum Mechanics and a paper
by Michel Blais on Chaos Theory.
The last section of the volume contains studies in general philosophy of
science and a study by the leading historian of science, William Shea. In his
paper, Paul Pietroski presents an original conception of ceteris paribus laws
inspired by Ramsey's ideas about causation. Dave Davies argues that Putnam's
recent emphasis on conceptual relativity against metaphysical realism does
not vindicate critics of his model-theoretic argument, but actually clarifies
Putnam's objections to these very criticisms. Jean Leroux presents elements of
Helmholtz's epistemology which foreshadow modern forms of empiricism and
he argues for an anti-realist reading of his theory of science. Finally, Shea
studies the origin, in the development of new technologies from the Middle
Ages onwards, of the mechanical philosophy which underlay modern science,
from Galileo to Newton.
We would like to thank Alain Voizard for his help in writing this preface, and
also the editor of the Brazilian Journal of Physics, formerly Revista Brasileira
de Fisica, for granting us permission to reprint Mario Bunge's essay 'Hidden
Variables, Separability and Realism' (volume especial, Os 70 anos de Mario
Schönberg, 1984, pp. 150-168). We are especially grateful to Annie Kuipers for
her professional assistance on behalf of Kluwer Academic Publishers and for
her continued encouragement and patience.
Boston and Montreal
April 1995

MATHIEU MARION & ROBERT S. COHEN

HUGUES LEBLANC

ON AXIOMATIZING FREE LOGIC - AND INCLUSIVE LOGIC IN THE BARGAIN*

Free Logic owes its name to its being free of two presuppositions of Standard
Logic, one to the effect that something exists, a patent truth but surely a factual
rather than a logical one, and the other to the effect that every singular term
designates something, a patent falsehood.¹ In Standard Logic with Identity,
it is this familiar law:

A.  $(\exists X)(X = T)$,

T here any term you please, that most crisply encapsulates the two presuppositions.² In Standard Logic without Identity, it may be this little known
counterpart of it:

B.  $(\exists X)(A \equiv A(T/X))$,

T as above and A(T/X) the result of replacing X everywhere in A by T, that
does.
In such recent expositions of Free Logic as Bencivenga's [2] and Lambert's
[8], Free Logic without Identity is axiomatized first; and the axiom schemata
needed to convert it into a Free Logic with Identity, call them the axiom
schemata for '=', are added next. The procedure is instructive, to be sure,
but it is slightly inelegant. Indeed, one of the axiom schemata for Free Logic
without Identity, to wit:

C.  $(\forall X)(\forall Y)A \supset (\forall Y)(\forall X)A$,

though independent of the other axiom schemata for that logic, is provable - it so happens - in the presence of those for '='.³ So, awkwardly, C must be
dropped as one converts Free Logic without Identity into Free Logic with
Identity.⁴ But what I regret more is that (i) in the process of axiomatizing
Free Logic with Identity, Bencivenga, Lambert, and others do not exploit
Leonard's Specification Law for Free Logic with Identity in [19], to wit:

D.  $(\exists X)(X = T) \supset ((\forall X)A \supset A(T/X))$,⁵

and (ii) in the process of axiomatizing Free Logic without Identity, they do
not exploit this Identityless counterpart of D in [10]:

E.  $(\exists X)(A \equiv A(T/X)) \supset ((\forall X)A \supset A(T/X))$.

* Editorial note: a list of lettered formulas in Section I is given at the end of this essay.

M. Marion and R. S. Cohen (eds.), Quebec Studies in the Philosophy of Science I, 1-22.
© 1995 Kluwer Academic Publishers.

In contrast, I devise in Section II a partial logic with Identity whose axioms
and one primitive rule of inference, Modus Ponens, (i) suit Standard as well
as Free Logic and yet (ii) yield as theorems both D and this curious Identity
Law, used by Tarski in [24]:

F.  $(\exists X)(X = T) \supset T = T$.

Since the consequent

G.  $(\forall X)A \supset A(T/X)$

of D and that

H.  $T = T$

of F are the only two of the customary axiom schemata for Standard Logic
with Identity not among those for my partial logic, adding A to them obviously extends it to a full-fledged Standard Logic with Identity.⁶ The resulting
axiomatization of that logic may be new. In any event it is one in which,
interestingly enough, the Specification Law for Free Logic with Identity begets
G, the Specification Law for Standard Logic (with or without Identity).
In Section III I extend the partial logic of Section II to a full-fledged Free
Logic with Identity. I do this in two different - and shown equivalent - ways,
using in the first case H plus this generalization of A:

I.  $(\forall Y)(\exists X)(X = Y)$,

Y a variable distinct from X, and in the second case H plus Lambert's generalization of G in [7]:

J.  $(\forall Y)((\forall X)A \supset A(Y/X))$,

Y this time a variable foreign to A and A(Y/X) the result of course of replacing
X everywhere in A by Y.
In Section IV I drop the one axiom schema in Section II that exhibits '=',
and show the resulting partial logic to yield E as a theorem. So, enlisting B
as a substitute for A is sure to extend that logic to a full-fledged Standard Logic
without Identity. The resulting axiomatization may also be new, and this time
around it is the Specification Law for Free Logic without Identity that begets
G. I next extend the present partial logic to a full-fledged Free Logic without
Identity. Following the precedent set in Section III, I do this in two different
- and shown equivalent - ways, using in the first case this generalization of
B:

K.  $(\forall Y)(\exists X)(A \equiv A(Y/X))$,

Y a variable foreign to A, and in the second case Lambert's J. I then show
B, which I called on page 1 a counterpart of A, to be provable from A for
any statement A and A to be provable from (one special case of) B, this
given the axioms of Section II and those of the sort H.⁷
In Section V, I sketch a truth-value semantics for Free Logic as well as a
Free Probability Theory (or, as more recent usage has it, a Free Probability
Logic), and I remind the reader that Free Probability Theory does provide
an alternative semantics for Free Logic, one in which probability functions
relativized to possibly empty sets of terms substitute for truth-value functions
relativized to such sets.
In Section VI, lastly, I turn to what Quine called in [22] Inclusive Logic,
provide both an axiomatization of and a semantics for it, and study its relationship to Standard as well as to Free Logic. Used in the process will be
these two axiom schemata:
L.

(3X) (A

-A) ::J \tX)A ::J A(T/X

and
M.

(\tY) (3X) (A

-A).

Note: For brevity's sake I shall refer to the partial logic with Identity of
Section II as $L_{=}$, to the Standard Logic with Identity that $L_{=}$ extends to in
that very section as $SL_{=}$, to the Free Logic with Identity that $L_{=}$ extends to
in Section III as $FL_{=}$, to the partial logic without Identity of Section IV as
$L$, and to the Standard Logic without Identity and the Free Logic without
Identity that $L$ extends to in that very Section as $SL$ and $FL$, respectively.
As for Inclusive Logic, I shall refer to the one with Identity as $IL_{=}$ and to
the other as $IL$.
II

The primitive signs of $L_{=}$, $SL_{=}$, and $FL_{=}$ are the customary ones, to wit: (i)
countably many predicates, '=' one of them, of course, (ii) a denumerable
infinity of individual variables, say, '$x_1$', '$x_2$', ..., '$x_n$', ..., (iii) a denumerable infinity of singular or - for uniformity's sake - individual terms,
say, '$t_1$', '$t_2$', ..., '$t_n$', ...,⁸ (iv) the three logical operators '$\neg$', '$\supset$', and
'$\forall$', (v) the two parentheses '(' and ')', and (vi) the comma ','. As for the statements of $L_{=}$, $SL_{=}$, and $FL_{=}$, they are also the customary ones, except for $(\forall X)A$
counting as a statement only if A(T/X) - T once more any term you please
- counts itself as a statement. Identical quantifiers, as a result, cannot overlap,
hence the restriction placed three times in Section I on the variable Y.⁹
Parentheses will be dropped unless ambiguity threatens; and the four logical
operators '$\&$', '$\lor$', '$\equiv$', and '$\exists$' - two of them already used in Section I - will be presumed to be defined in the customary manner.
Lastly, extending my use so far of '/' and introducing its cognate '//',
suppose I and I' to be two individual variables, or two individual terms, or
one an individual variable and the other an individual term. A(I'/I) will then
be the result of replacing I everywhere in A by I', and A(I'//I) that of replacing
I at zero or more places in A by I'.¹⁰
Following in this Fitch's example in [5], I identify the axioms of $L_{=}$ (hence,
those common to $L_{=}$, $SL_{=}$, $FL_{=}$, and $IL_{=}$) recursively:

Basic Clause: Every statement of $L_{=}$ of any of these six sorts:

A1.  $A \supset (B \supset A)$
A2.  $(A \supset (B \supset C)) \supset ((A \supset B) \supset (A \supset C))$
A3.  $(\neg A \supset \neg B) \supset (B \supset A)$
A4.  $(\forall X)(A \supset B) \supset ((\forall X)A \supset (\forall X)B)$
A5.  $A \supset (\forall X)A$ ¹¹
A6.  $T = T' \supset (A \supset A(T'//T))$

counts as an axiom of $L_{=}$.

Inductive Clause: If A counts as an axiom of $L_{=}$, then so does $(\forall X)A(X/T)$,
so long as X is foreign to A.¹²
Due to the second clause, which I shall occasionally refer to as Fitch's Clause,
one rule of inference suffices, it turns out, Modus Ponens (MP, for short).
So, a finite column of statements of $L_{=}$ counts as a proof in $L_{=}$ of a statement A of $L_{=}$ if (i) every entry in the column is an axiom of $L_{=}$ or follows
by means of MP from two earlier entries in the column, and (ii) the last
entry in the column is A. And A counts as a theorem of $L_{=}$, or is said to be
provable in $L_{=}$, for short:

$\vdash_{=} A$,

if there exists a proof of A in $L_{=}$.
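The definition of a proof just given can be checked mechanically. The following is a minimal sketch in Python, restricted to the propositional schemata A1 and A2 only; the tuple encoding of conditionals and the helper names (`imp`, `is_a1`, `is_a2`, `is_proof`) are illustrative assumptions, not anything found in the essay.

```python
# Conditionals are encoded as ("imp", antecedent, consequent); atoms are strings.
# This encoding and all function names are assumptions made for illustration.

def imp(a, b):
    return ("imp", a, b)

def is_imp(f):
    return isinstance(f, tuple) and len(f) == 3 and f[0] == "imp"

def is_a1(f):
    # A1: A > (B > A)
    return is_imp(f) and is_imp(f[2]) and f[2][2] == f[1]

def is_a2(f):
    # A2: (A > (B > C)) > ((A > B) > (A > C))
    if not (is_imp(f) and is_imp(f[1]) and is_imp(f[1][2]) and is_imp(f[2])):
        return False
    a, b, c = f[1][1], f[1][2][1], f[1][2][2]
    return f[2][1] == imp(a, b) and f[2][2] == imp(a, c)

def is_axiom(f):
    # Only A1 and A2 are recognized in this sketch.
    return is_a1(f) or is_a2(f)

def follows_by_mp(entry, earlier):
    # entry follows by MP if some earlier a and earlier (a > entry) both occur.
    return any(imp(a, entry) in earlier for a in earlier)

def is_proof(column, axiom_test, goal):
    # (ii) the last entry must be the statement being proved.
    if not column or column[-1] != goal:
        return False
    # (i) every entry is an axiom or follows by MP from two earlier entries.
    for k, entry in enumerate(column):
        if not (axiom_test(entry) or follows_by_mp(entry, column[:k])):
            return False
    return True

# The familiar five-entry proof of A > A from A1 and A2.
A = "A"
column = [
    imp(A, imp(imp(A, A), A)),                      # A1
    imp(imp(A, imp(imp(A, A), A)),
        imp(imp(A, imp(A, A)), imp(A, A))),         # A2
    imp(imp(A, imp(A, A)), imp(A, A)),              # MP on entries 1, 2
    imp(A, imp(A, A)),                              # A1
    imp(A, A),                                      # MP on entries 4, 3
]
```

Calling `is_proof(column, is_axiom, imp(A, A))` accepts the column, while dropping its last entry makes the check fail, since the column then no longer ends in the goal.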


Preparatory to proving D and F, I put down eight lemmas, supplying proof
of just the two that involve quantifiers. Five of these lemmas are in effect
derived rules of inference, and the most important one of the five is of course
Lemma 6. It is a generalization of what I just called Fitch's Clause; and in
axiomatizations of Standard or Free Logic that dispense with that clause, it
must be adopted along with Modus Ponens as a primitive rule of inference.
Called the Generalization Rule, it is referred to here as Gn.
LEMMA 1. If $\vdash_{=} A \supset B$ and $\vdash_{=} B \supset C$, then $\vdash_{=} A \supset C$.
LEMMA 2. If $\vdash_{=} A \supset (B \supset C)$, then $\vdash_{=} B \supset (A \supset C)$.
LEMMA 3. If $\vdash_{=} A \supset (A \supset B)$, then $\vdash_{=} A \supset B$.
LEMMA 4. $\vdash_{=} (A \supset B) \supset (\neg B \supset \neg A)$.
LEMMA 5. If $\vdash_{=} \neg A \supset B$, then $\vdash_{=} \neg B \supset A$.
LEMMA 6. If $\vdash_{=} A$, then $\vdash_{=} (\forall X)A(X/T)$, so long as X is foreign to A.
Proof. Suppose the column made up of $A_1$, $A_2$, ..., and $A_n$ constitutes a
proof of A in $L_{=}$; suppose X is foreign to A; and for each i from 1 through
n let $A'_i$ be $A_i(X'/X)$, where X' is the alphabetically earliest individual variable
of $L_{=}$ that is foreign to all of $A_1$, $A_2$, ..., and $A_{n-1}$. It is easily verified that
(i) if $A_i$ (i = 1, 2, ..., or n) is an axiom of $L_{=}$, then so is $A'_i$, and (ii) if for
some g from 1 through i - 1 $A_i$ follows by means of MP from $A_g$ and $A_g \supset A_i$,
then $A'_i$ does so from $A'_g$ and $(A_g \supset A_i)'$, this because $(A_g \supset A_i)'$ is the
same as $A'_g \supset A'_i$. So the column made up of $A'_1$, $A'_2$, ..., and $A'_n$ constitutes a proof of $A'_n$ in $L_{=}$, and one to all of whose entries X is foreign. Consider
then the column made up of $(\forall X)A'_1(X/T)$, $(\forall X)A'_2(X/T)$, ..., and
$(\forall X)A'_n(X/T)$. This third column constitutes a proof of $(\forall X)A'_n(X/T)$ in $L_{=}$ if
none of the entries in the second column was obtained by means of MP, and
will do so in the contrary case upon the insertion in it of two entries per
entry that was thus obtained. For suppose first that $A'_i$ was an axiom of $L_{=}$.
Then, X being foreign to $A'_i$, so by Fitch's Clause is $(\forall X)A'_i(X/T)$, which
justifies its presence in the third column. Suppose then that $A'_i$ was obtained
by means of MP from $A'_g$ and $A'_g \supset A'_i$ for some g from 1 through i - 1. In
this case inserting the following two lines:

$(\forall X)(A'_g(X/T) \supset A'_i(X/T)) \supset ((\forall X)A'_g(X/T) \supset (\forall X)A'_i(X/T))$

and

$(\forall X)A'_g(X/T) \supset (\forall X)A'_i(X/T)$

after whichever of $(\forall X)A'_g(X/T)$ and $(\forall X)(A'_g \supset A'_i)(X/T)$ $(= (\forall X)(A'_g(X/T) \supset A'_i(X/T)))$ occurs second in the third column will justify the presence of
$(\forall X)A'_i(X/T)$ in that column. The first of these lines is indeed an axiom of
$L_{=}$ of the sort A4, the second follows from the first and $(\forall X)(A'_g \supset A'_i)(X/T)$
by means of MP, and $(\forall X)A'_i(X/T)$ follows from the second and $(\forall X)A'_g(X/T)$
by means of MP also. But, X being by hypothesis foreign to A and hence to
$A_n$, $(\forall X)A'_n(X/T)$ is the same as $(\forall X)A(X/T)$. So, if $\vdash_{=} A$, then $\vdash_{=} (\forall X)A(X/T)$,
so long as X is foreign to A. □
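LEMMA 7. $\vdash_{=} (\forall X)(A \supset B) \supset ((\exists X)A \supset (\exists X)B)$.
Proof. Suppose T new.¹³
(1) $\vdash_{=} (A(T/X) \supset B(T/X)) \supset (\neg B(T/X) \supset \neg A(T/X))$  (Lemma 4)
(2) $\vdash_{=} (\forall X)((A \supset B) \supset (\neg B \supset \neg A))$  (Gn, (1))
(3) $\vdash_{=} (2) \supset ((\forall X)(A \supset B) \supset (\forall X)(\neg B \supset \neg A))$  (A4)
(4) $\vdash_{=} (\forall X)(A \supset B) \supset (\forall X)(\neg B \supset \neg A)$  (MP, (2), (3))
(5) $\vdash_{=} (\forall X)(\neg B \supset \neg A) \supset ((\forall X)\neg B \supset (\forall X)\neg A)$  (A4)
(6) $\vdash_{=} (\forall X)(A \supset B) \supset ((\forall X)\neg B \supset (\forall X)\neg A)$  (Lemma 1, (4), (5))
(7) $\vdash_{=} ((\forall X)\neg B \supset (\forall X)\neg A) \supset ((\exists X)A \supset (\exists X)B)$  (Lemma 4)
(8) $\vdash_{=} (\forall X)(A \supset B) \supset ((\exists X)A \supset (\exists X)B)$  (Lemma 1, (6), (7)) □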

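LEMMA 8. $\vdash_{=} (\forall X)(A \supset B) \supset ((\exists X)A \supset B)$, so long as X is foreign to B.
Proof.
(1) $\vdash_{=} \neg B \supset (\forall X)\neg B$  (A5, hyp. on X)
(2) $\vdash_{=} (\exists X)B \supset B$  (Lemma 5, (1))
(3) $\vdash_{=} (2) \supset ((\exists X)A \supset ((\exists X)B \supset B))$  (A1)
(4) $\vdash_{=} (\exists X)A \supset ((\exists X)B \supset B)$  (MP, (2), (3))
(5) $\vdash_{=} (4) \supset (((\exists X)A \supset (\exists X)B) \supset ((\exists X)A \supset B))$  (A2)
(6) $\vdash_{=} ((\exists X)A \supset (\exists X)B) \supset ((\exists X)A \supset B)$  (MP, (4), (5))
(7) $\vdash_{=} (\forall X)(A \supset B) \supset ((\exists X)A \supset (\exists X)B)$  (Lemma 7)
(8) $\vdash_{=} (\forall X)(A \supset B) \supset ((\exists X)A \supset B)$  (Lemma 1, (7), (6)) □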

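Proofs of D and F can now be had:

THEOREM 1. $\vdash_{=} (\exists X)(X = T) \supset ((\forall X)A \supset A(T/X))$.
Proof. Suppose T' new.
(1) $\vdash_{=} A(T'/X) \supset (T' = T \supset (A(T'/X))(T/T'))$  (Lemma 2, A6)
i.e.
(1) $\vdash_{=} A(T'/X) \supset (T' = T \supset A(T/X))$  (Lemma 2, A6)
(2) $\vdash_{=} (\forall X)(A \supset (X = T \supset A(T/X)))$  (Gn, (1))
(3) $\vdash_{=} (2) \supset ((\forall X)A \supset (\forall X)(X = T \supset A(T/X)))$  (A4)
(4) $\vdash_{=} (\forall X)A \supset (\forall X)(X = T \supset A(T/X))$  (MP, (2), (3))
(5) $\vdash_{=} (\forall X)(X = T \supset A(T/X)) \supset ((\exists X)(X = T) \supset A(T/X))$  (Lemma 8)
(6) $\vdash_{=} (\forall X)A \supset ((\exists X)(X = T) \supset A(T/X))$  (Lemma 1, (4), (5))
(7) $\vdash_{=} (\exists X)(X = T) \supset ((\forall X)A \supset A(T/X))$  (Lemma 2, (6))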

THEOREM 2. $\vdash_{=} (\exists X)(X = T) \supset T = T$.
Proof. Suppose T' new.
(1) $\vdash_{=} T' = T \supset (T' = T \supset T = T)$  (A6)
(2) $\vdash_{=} T' = T \supset T = T$  (Lemma 3, (1))
(3) $\vdash_{=} (2) \supset (\neg(T = T) \supset \neg(T' = T))$  (Lemma 4)
(4) $\vdash_{=} \neg(T = T) \supset \neg(T' = T)$  (MP, (2), (3))
(5) $\vdash_{=} (\forall X)(\neg(T = T) \supset \neg(X = T))$  (Gn, (4))
(6) $\vdash_{=} (5) \supset ((\forall X)\neg(T = T) \supset (\forall X)\neg(X = T))$  (A4)
(7) $\vdash_{=} (\forall X)\neg(T = T) \supset (\forall X)\neg(X = T)$  (MP, (5), (6))
(8) $\vdash_{=} \neg(T = T) \supset (\forall X)\neg(T = T)$  (A5)
(9) $\vdash_{=} \neg(T = T) \supset (\forall X)\neg(X = T)$  (Lemma 1, (8), (7))
(10) $\vdash_{=} (\exists X)(X = T) \supset T = T$  (Lemma 5, (9))

So, as claimed on page 2, enlisting as an extra axiom schema the antecedent
A of Theorem 1 and Theorem 2 would extend the partial logic of this section
to a full-fledged Standard Logic with Identity.
It could readily be seen, by the way, that the logic in question was a partial
one with as well as without Identity. Assigning the truth-value 0 (for False)
to all the atomic statements of $L_{=}$, evaluating its negations and conditionals
in the customary manner, and assigning its quantifications the truth-value 1
(for True) ensure that all its axioms evaluate to 1 and that the consequent of
a conditional of $L_{=}$ evaluates to 1 if the conditional in question and its
antecedent themselves do. So, under this assignment, all statements of $L_{=}$
provable in $L_{=}$ evaluate to 1, but no statement of the sort H does and many
a statement of the sort G, say, '$(\forall x_1)(F(x_1) \,\&\, \neg F(x_1)) \supset (F(t_1) \,\&\, \neg F(t_1))$',
does not either.
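The valuation just described is easy to try out. The sketch below, in Python, is only an illustration under stated assumptions: the class names (`Atom`, `Not`, `Imp`, `Forall`), the helper `And`, and the definition of '&' as a negated conditional are choices made here, not the essay's.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Atom:
    name: str                 # e.g. "F(t1)" or "t1 = t1"

@dataclass(frozen=True)
class Not:
    body: object

@dataclass(frozen=True)
class Imp:
    left: object
    right: object

@dataclass(frozen=True)
class Forall:
    var: str
    body: object

def And(a, b):
    # Assumed customary definition: A & B as not-(A > not-B).
    return Not(Imp(a, Not(b)))

def value(f):
    if isinstance(f, Atom):
        return 0                          # every atomic statement gets 0
    if isinstance(f, Not):
        return 1 - value(f.body)
    if isinstance(f, Imp):
        return max(1 - value(f.left), value(f.right))
    if isinstance(f, Forall):
        return 1                          # every quantification gets 1
    raise TypeError(f"not a statement: {f!r}")

# A statement of the sort H, and the instance of G quoted in the text.
h = Atom("t1 = t1")
g = Imp(
    Forall("x1", And(Atom("F(x1)"), Not(Atom("F(x1)")))),
    And(Atom("F(t1)"), Not(Atom("F(t1)"))),
)

# Sample instances of A1, A4, A5 and A6 for comparison.
p, q = Atom("p"), Atom("q")
a1 = Imp(p, Imp(q, p))
a4 = Imp(Forall("x1", Imp(p, q)), Imp(Forall("x1", p), Forall("x1", q)))
a5 = Imp(p, Forall("x1", p))
a6 = Imp(Atom("t1 = t2"), Imp(Atom("F(t1)"), Atom("F(t2)")))
```

Here `value(h)` and `value(g)` both come out 0, whereas the sample axiom instances come out 1, in line with the argument above; the sample instances are of course no substitute for the general verification.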
III

To obtain the Free Logic with Identity promised on page 2, (i) substitute
wherever appropriate '$FL_{=}$' for '$L_{=}$' in Section II, (ii) add to the six axiom
schemata on page 4 either this axiom schema, labelled I on page 2:

FA7.  $(\forall Y)(\exists X)(X = Y)$,

or this one, labelled J on that page:

FA7'.  $(\forall Y)((\forall X)A \supset A(Y/X))$,

(iii) add also this axiom schema, familiar from page 2 as H:

FA8.  $T = T$,

and (iv) abridge

A is provable in $FL_{=}$

as

$\vdash_{F=} A$.

I now proceed to show that Theorem 1 on page 6, i.e. Leonard's
Specification Law for Free Logic with Identity, and FA7 deliver FA7'.

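THEOREM 3. $\vdash_{F=} (\forall Y)((\forall X)A \supset A(Y/X))$.
Proof. Suppose T new.
(1) $\vdash_{F=} (\forall Y)((\exists X)(X = Y) \supset ((\forall X)A \supset A(Y/X)))$  (Gn, Theorem 1)
(2) $\vdash_{F=} (1) \supset ((\forall Y)(\exists X)(X = Y) \supset (\forall Y)((\forall X)A \supset A(Y/X)))$  (A4)
(3) $\vdash_{F=} (\forall Y)(\exists X)(X = Y) \supset (\forall Y)((\forall X)A \supset A(Y/X))$  (MP, (1), (2))
(4) $\vdash_{F=} (\forall Y)((\forall X)A \supset A(Y/X))$  (MP, FA7, (3))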

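This done, I go on to show that FA7' and FA8 (not used in the foregoing proof)
deliver FA7. Two new lemmas are used in the course of the proof.
LEMMA 9. $\vdash_{F=} (A \supset \neg B) \supset (B \supset \neg A)$.
LEMMA 10. $\vdash_{F=} (\forall Y)(A(Y/X) \supset (\exists X)A)$.
Proof. Suppose T new.
(1) $\vdash_{F=} ((\forall X)\neg A \supset \neg A(T/X)) \supset (A(T/X) \supset (\exists X)A)$  (Lemma 9)
(2) $\vdash_{F=} (\forall Y)(((\forall X)\neg A \supset \neg A(Y/X)) \supset (A(Y/X) \supset (\exists X)A))$  (Gn, (1))
(3) $\vdash_{F=} (2) \supset ((\forall Y)((\forall X)\neg A \supset \neg A(Y/X)) \supset (\forall Y)(A(Y/X) \supset (\exists X)A))$  (A4)
(4) $\vdash_{F=} (\forall Y)((\forall X)\neg A \supset \neg A(Y/X)) \supset (\forall Y)(A(Y/X) \supset (\exists X)A)$  (MP, (2), (3))
(5) $\vdash_{F=} (\forall Y)(A(Y/X) \supset (\exists X)A)$  (MP, FA7', (4))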

THEOREM 4. $\vdash_{F=} (\forall Y)(\exists X)(X = Y)$.
Proof.
(1) $\vdash_{F=} (\forall Y)(Y = Y \supset (\exists X)(X = Y))$  (Lemma 10)
(2) $\vdash_{F=} (1) \supset ((\forall Y)(Y = Y) \supset (\forall Y)(\exists X)(X = Y))$  (A4)
(3) $\vdash_{F=} (\forall Y)(Y = Y) \supset (\forall Y)(\exists X)(X = Y)$  (MP, (1), (2))
(4) $\vdash_{F=} (\forall Y)(Y = Y)$  (Gn, FA8)
(5) $\vdash_{F=} (\forall Y)(\exists X)(X = Y)$  (MP, (3), (4))

So, given FA8, FA7 and FA7' are provably equivalent means of extending
the partial logic of Section II to a full-fledged Free Logic with Identity, a result
mentioned by Bencivenga in [2] and known to Lambert.
Shown in effect at the close of Section II was that statements of $L_{=}$, and
hence of $FL_{=}$, of the sort FA8 are independent of the axioms of $L_{=}$. Proof
that FA7 and FA7' are independent of the axioms of $L_{=}$ and of the statements of $FL_{=}$ of the sort FA8 calls for a bit more work.
$\Sigma_1$, $\Sigma_2$, ..., $\Sigma_n$, ... being infinitely many non-empty sets of individual
terms of $FL_{=}$, take a statement A of $FL_{=}$ to evaluate to 1 on a truth-value
assignment $\alpha$ (to the atomic statements of $FL_{=}$) relative to the sequence
$\langle\Sigma_1, \Sigma_2, \ldots, \Sigma_n, \ldots\rangle$ if the customary conditions are met when A is an
atomic statement, a negation, or a conditional; but, in the case that A is a
universal quantification $(\forall X)B$, X here the i-th individual variable of $FL_{=}$
for some i or other from 1 on, take A to evaluate to 1 on $\alpha$ relative to
$\langle\Sigma_1, \Sigma_2, \ldots, \Sigma_n, \ldots\rangle$ if, and only if, B(T/X) evaluates to 1 on $\alpha$ relative to
that sequence for every term T in $\Sigma_i$. This done, consider the truth-value assignment $\alpha$ that assigns 1 to every atomic statement of $FL_{=}$ of the sort T =
T but the truth-value 0 to every other one, and a sequence $\langle\Sigma_1, \Sigma_2, \ldots, \Sigma_n, \ldots\rangle$
that is arbitrary except for $\Sigma_1$ and $\Sigma_2$ being {'$t_1$'} and {'$t_2$'}, respectively. Then all the axioms of $L_{=}$ and statements of $FL_{=}$ of the sort FA8 evaluate
to 1 on $\alpha$ relative to $\langle\Sigma_1, \Sigma_2, \ldots, \Sigma_n, \ldots\rangle$, as does the consequent of a
conditional of $FL_{=}$ if that conditional and its antecedent themselves do.
Contrastingly, though, '$t_1 = t_2$' evaluates to 0 on $\alpha$ relative to $\langle\Sigma_1, \Sigma_2, \ldots, \Sigma_n, \ldots\rangle$,
hence '$(\forall x_1)\neg(x_1 = t_2)$' evaluates to 1 on $\alpha$ relative to that sequence
(this because '$t_1$' is the only member of $\Sigma_1$), hence '$(\exists x_1)(x_1 = t_2)$' evaluates
to 0 on $\alpha$ relative to $\langle\Sigma_1, \Sigma_2, \ldots, \Sigma_n, \ldots\rangle$, and hence '$(\forall x_2)(\exists x_1)(x_1 = x_2)$'
evaluates to 0 on $\alpha$ relative to $\langle\Sigma_1, \Sigma_2, \ldots, \Sigma_n, \ldots\rangle$ (this because '$t_2$' is
the only member of $\Sigma_2$). So, at least one statement of $FL_{=}$ of the sort FA7
evaluates to 0 on $\alpha$ relative to that sequence. So, at least one statement of
$FL_{=}$ of the sort FA7 is independent of the axioms of $L_{=}$ and of the statements
of $FL_{=}$ of the sort FA8. And at least one statement of $FL_{=}$ of the sort FA7',
to wit: '$(\forall x_2)((\forall x_1)F(x_1) \supset F(x_2))$', will prove to be independent of the axioms
of $L_{=}$ and of the statements of $FL_{=}$ of the sort FA8 if '$F(t_1)$' is assigned 1
rather than 0 by the truth-value assignment $\alpha$.¹⁴
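The countermodel above can be replayed by a small evaluator. This is a sketch in Python under stated assumptions: only the two sets Σ₁ and Σ₂ that the argument needs are represented, variables are written "x1", "x2", and every class and function name (`Eq`, `Pred`, `Forall`, `Exists`, `subst`, `value`, `make_alpha`) is hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Eq:                      # atomic statement  s = t
    left: str
    right: str

@dataclass(frozen=True)
class Pred:                    # atomic statement  F(s)
    name: str
    arg: str

@dataclass(frozen=True)
class Not:
    body: object

@dataclass(frozen=True)
class Imp:
    left: object
    right: object

@dataclass(frozen=True)
class Forall:                  # universal quantification on the index-th variable
    index: int
    body: object

def Exists(index, body):
    # Existential quantifier defined from negation and the universal one.
    return Not(Forall(index, Not(body)))

def subst(f, var, term):
    # Replace var everywhere by term; safe because identical quantifiers
    # are not allowed to overlap.
    if isinstance(f, Eq):
        return Eq(term if f.left == var else f.left,
                  term if f.right == var else f.right)
    if isinstance(f, Pred):
        return Pred(f.name, term if f.arg == var else f.arg)
    if isinstance(f, Not):
        return Not(subst(f.body, var, term))
    if isinstance(f, Imp):
        return Imp(subst(f.left, var, term), subst(f.right, var, term))
    return Forall(f.index, subst(f.body, var, term))

def value(f, alpha, sigma):
    if isinstance(f, (Eq, Pred)):
        return alpha(f)
    if isinstance(f, Not):
        return 1 - value(f.body, alpha, sigma)
    if isinstance(f, Imp):
        return max(1 - value(f.left, alpha, sigma), value(f.right, alpha, sigma))
    # A quantification on the i-th variable ranges over the terms in sigma[i].
    var = f"x{f.index}"
    return int(all(value(subst(f.body, var, t), alpha, sigma)
                   for t in sigma[f.index]))

def make_alpha(true_atoms=frozenset()):
    # 1 for atoms of the sort T = T, and for any atoms listed in true_atoms.
    def alpha(atom):
        if isinstance(atom, Eq):
            return int(atom.left == atom.right)
        return int(atom in true_atoms)
    return alpha

sigma = {1: {"t1"}, 2: {"t2"}}
fa7 = Forall(2, Exists(1, Eq("x1", "x2")))
fa8 = Eq("t1", "t1")
fa7_prime = Forall(2, Imp(Forall(1, Pred("F", "x1")), Pred("F", "x2")))
```

With `make_alpha()` the FA8 instance evaluates to 1 and the FA7 instance to 0; the FA7' instance evaluates to 0 only once 'F(t1)' is made true, as with `make_alpha({Pred("F", "t1")})`.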

IV

To obtain the partial logic $L$ promised on page 2, (i) substitute wherever appropriate '$L$' for '$L_{=}$' in Section II, (ii) drop the predicate '=' on page 3 and, as
already indicated, axiom schema A6, (iii) for the reason given on page 1,
add this axiom schema, labelled C on that page:

FA6.  $(\forall X)(\forall Y)A \supset (\forall Y)(\forall X)A$,

and (iv) abridge

A is provable in $L$

as

$\vdash A$.

And, to obtain the Free Logic $FL$ also promised on page 2, (i) substitute
wherever appropriate '$FL$' for '$L$' in Section II, (ii) add besides FA6 either this
axiom schema, labelled K on page 2:

FA7''.  $(\forall Y)(\exists X)(A \equiv A(Y/X))$,

or the familiar FA7', and (iii) abridge

A is provable in $FL$

as

$\vdash_{F} A$.

I first show that E is provable in $L$ and hence that Free Logic without
Identity boasts a Specification Law that exactly parallels

$(\exists X)(X = T) \supset ((\forall X)A \supset A(T/X))$

and thus outdoes the customary

$(\forall Y)((\forall X)A \supset A(Y/X))$,

to wit:

$(\exists X)(A \equiv A(T/X)) \supset ((\forall X)A \supset A(T/X))$.

One additional lemma is needed in the process.

LEMMA 11. $\vdash (A \equiv B) \supset (A \supset B)$.

THEOREM 5. (3X)(A == A(T/X)) :J VX)A :J A(T/X)).


Proof Suppose T' new.
(1)
(2)
(3)
(4)
(5)

f- (A(T'/X) == A(T/X ~
(A(T'/X) ~ A(T/X
f- A(T'/X) :J A(T'/X) == A(T/X ~
A(T/X
f- (VX)(A ~ A == A(T/X :J ACT/X)))
f- (3) ~ VX)A ~
(VX) A == A(T/X ~ ACT/X)))
f- (VX)A ~ (VX) A == A(T/X ~
A(T/X

(6)

f- (VX) A == ACT/X ~ A(T/X ~


3X) (A == A(T/X ~ A(T/X

(7)

f- (VX)A ~ 3X) (A == A(T/X ~

(8)

A(T/X
f- (3X) (A == A(T/X
VX)A ~ A(T/X

(Lemma 11)
(Lemma 2, (1
(Go, (2
(A4)

(MP, (3), (4
(Lemma 8)
(Lemma I, (5), (6

(Lemma 2, (7


So, as claimed on page 2, substituting the antecedent B of Theorem 5 for the axiom schema A6 of L_ would extend the partial logic L to a full-fledged Standard Logic without Identity. And proof that FA7'' is independent of the axioms of L is easily retrieved from the proof in Section III that FA7' is independent of the axioms of L_: simply write 'FL' for 'FL_', 'L' for 'L_', and '$(\forall x_2)(\exists x_1)(F(x_1) \equiv F(x_2))$' for '$(\forall x_2)((\forall x_1)F(x_1) \supset F(x_2))$'. My main concern at this point, though, is to show that FA7'' and FA7' are provably equivalent ways of extending L to a full-fledged Free Logic without Identity. Yet another lemma is needed in the process.
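THEOREM 6. $\vdash_F (\forall Y)(\exists X)(A \equiv A(Y/X)) \supset (\forall Y)((\forall X)A \supset A(Y/X))$.

Proof. Suppose T new.
(1) $\vdash_F (\forall Y)((\exists X)(A \equiv A(Y/X)) \supset ((\forall X)A \supset A(Y/X)))$ (Gn, Theorem 5)
(2) $\vdash_F (1) \supset ((\forall Y)(\exists X)(A \equiv A(Y/X)) \supset (\forall Y)((\forall X)A \supset A(Y/X)))$ (A4)
(3) $\vdash_F (\forall Y)(\exists X)(A \equiv A(Y/X)) \supset (\forall Y)((\forall X)A \supset A(Y/X))$ (MP, (1), (2))

So, by MP, FA7'' yields FA7'.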


LEMMA 12. $\vdash_F A \equiv A$.

THEOREM 7. $\vdash_F (\forall Y)(\exists X)(A \equiv A(Y/X))$.

Proof. Suppose T new.
(1) $\vdash_F (\forall Y)((A \equiv A(Y/X))(Y/X) \supset (\exists X)(A \equiv A(Y/X)))$ (Lemma 10)
i.e.
(1) $\vdash_F (\forall Y)((A(Y/X) \equiv A(Y/X)) \supset (\exists X)(A \equiv A(Y/X)))$ (Lemma 10)
(2) $\vdash_F (1) \supset ((\forall Y)(A(Y/X) \equiv A(Y/X)) \supset (\forall Y)(\exists X)(A \equiv A(Y/X)))$ (A4)
(3) $\vdash_F (\forall Y)(A(Y/X) \equiv A(Y/X)) \supset (\forall Y)(\exists X)(A \equiv A(Y/X))$ (MP, (1), (2))
(4) $\vdash_F A(T/X) \equiv A(T/X)$ (Lemma 12)
(5) $\vdash_F (\forall Y)(A(Y/X) \equiv A(Y/X))$ (Gn, (4))
(6) $\vdash_F (\forall Y)(\exists X)(A \equiv A(Y/X))$ (MP, (5), (3))

But FA7' is needed to obtain Lemma 10. So, FA7' yields FA7''.
On page 2 I talked of A and B being provably equivalent given H (= FA8). Indeed, enlist H as an extra axiom schema of L_. Then B is provable in L_ from A, and A is provable in L_ from (one special case of) B, as I proceed to show. Four additional lemmas are needed to that effect, of which I prove only the two involving '='.

LEMMA 13. If $\vdash_{=} A \supset (B \supset C)$ and $\vdash_{=} A \supset (C \supset B)$, then $\vdash_{=} A \supset (B \equiv C)$.

LEMMA 14. $\vdash_{=} T' = T \supset T = T'$.

Proof.
(1) $\vdash_{=} T' = T \supset (T' = T' \supset T = T')$ (A6)
(2) $\vdash_{=} T' = T' \supset (T' = T \supset T = T')$ (Lemma 2, (1))
(3) $\vdash_{=} T' = T'$ (H)
(4) $\vdash_{=} T' = T \supset T = T'$ (MP, (3), (2))

LEMMA 15. $\vdash_{=} T' = T \supset (A \equiv A(T/T'))$.

Proof.
(1) $\vdash_{=} T' = T \supset (A \supset A(T/T'))$ (A6)
(2) $\vdash_{=} T = T' \supset (A(T/T') \supset (A(T/T'))(T'/T))$ (A6)
i.e.
(2) $\vdash_{=} T = T' \supset (A(T/T') \supset A)$ (A6)
(3) $\vdash_{=} T' = T \supset (A(T/T') \supset A)$ (Lemmas 1 and 14, (2))
(4) $\vdash_{=} T' = T \supset (A \equiv A(T/T'))$ (Lemma 13, (1), (3))

THEOREM 8. $\vdash_{=} (\exists X)(A \equiv A(T/X))$.

Proof. Suppose T' new.
(1) $\vdash_{=} (\forall X)(X = T \supset (A \equiv A(T/X)))$ (Gn, Lemma 15)
(2) $\vdash_{=} (1) \supset ((\exists X)(X = T) \supset (\exists X)(A \equiv A(T/X)))$ (Lemma 7, (1))
(3) $\vdash_{=} (\exists X)(X = T) \supset (\exists X)(A \equiv A(T/X))$ (MP, (1), (2))
(4) $\vdash_{=} (\exists X)(X = T)$ (A)
(5) $\vdash_{=} (\exists X)(A \equiv A(T/X))$ (MP, (4), (3))

So, given H, B is provable in L_ from A, this for any statement A of L_.

LEMMA 16. If $\vdash_{=} B$, then $\vdash_{=} (A \equiv B) \supset A$.

THEOREM 9. $\vdash_{=} (\exists X)(X = T)$.

Proof. Suppose T' new.
(1) $\vdash_{=} T = T$ (H)
(2) $\vdash_{=} (T' = T \equiv T = T) \supset T' = T$ (Lemma 16, (1))
(3) $\vdash_{=} (\forall X)((X = T \equiv T = T) \supset X = T)$ (Gn, (2))
(4) $\vdash_{=} (3) \supset ((\exists X)(X = T \equiv T = T) \supset (\exists X)(X = T))$ (Lemma 7)
(5) $\vdash_{=} (\exists X)(X = T \equiv T = T) \supset (\exists X)(X = T)$ (MP, (3), (4))
(6) $\vdash_{=} (\exists X)(X = T \equiv T = T)$ (B)
(7) $\vdash_{=} (\exists X)(X = T)$ (MP, (6), (5))

So, given H, A is provable in L_ from this special case of B:

$(\exists X)(X = T \equiv T = T)$.¹⁵

V

Of the various semantic accounts of Free Logic, the truth-value one in [11] is by far the simplest. Let $\Sigma$ be a possibly empty set of terms of FL_ and $\alpha_\Sigma$ be a unary function from the statements of FL_ to 0 and 1. Then $\alpha_\Sigma$ is said to constitute an identity-normal truth-value function for FL_ if it obeys the following six constraints:

B1. $\alpha_\Sigma(\neg A) = 1$ if $\alpha_\Sigma(A) = 0$; $= 0$ otherwise
B2. $\alpha_\Sigma(A \supset B) = 1$ if $\alpha_\Sigma(A) = 0$ or $\alpha_\Sigma(B) = 1$; $= 0$ otherwise
B3. $\alpha_\Sigma((\forall X)A) = 1$ if $\Sigma = \emptyset$ or $\alpha_\Sigma(A(T/X)) = 1$ for every term T in $\Sigma$; $= 0$ otherwise
B4. $\alpha_\Sigma(T = T) = 1$
B5. If $\alpha_\Sigma(T = T') = 1$, then $\alpha_\Sigma(A) = \alpha_\Sigma(A(T'//T))$, where A is atomic.
B6. If one of T and T' belongs to $\Sigma$ but the other one does not, then $\alpha_\Sigma(T = T') = 0$.

It follows from these constraints that

$\alpha_\Sigma((\exists X)(X = T)) = 1$ if, and only if, $T \in \Sigma$.

Note indeed that $\alpha_\Sigma(T = T) = 1$ by constraint B4. So, if T belongs to $\Sigma$, then there exists a member T' of $\Sigma$ such that $\alpha_\Sigma(T' = T) = 1$. Hence $\alpha_\Sigma((\exists X)(X = T)) = 1$ by constraint B3, constraint B1, and the definition of '$\exists$'. Suppose, on the other hand, that $\alpha_\Sigma((\exists X)(X = T)) = 1$. Then $\alpha_\Sigma(T' = T) \neq 0$ for at least one member T' of $\Sigma$, and hence by constraint B6 either both T and T' belong to $\Sigma$ or neither one does. So T as well as T' belongs to $\Sigma$. The set $\Sigma$ to which the truth-value function $\alpha$ is relativized thus consists of those, and those only, among the terms of FL_ which - so far as $\alpha$ is concerned - designate something. With A required in constraint B5 to be atomic, the foregoing account of $\alpha_\Sigma$ is of course a recursive one.
These matters attended to, declare a statement A of FL_ logically true in the truth-value sense if $\alpha_\Sigma(A) = 1$ for every identity-normal truth-value function $\alpha_\Sigma$ for FL_. Proof can then be retrieved from [11] and Section 3 of [13] that

$\vdash_{F=} A$ if, and only if, A is logically true in the truth-value sense.
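The recursive character of constraints B1-B3 can be illustrated with a small evaluator. What follows is only a sketch under stated assumptions, not the construction of [11]: the tuple encoding of statements, the function names, the finite list of terms, and the choice to count an identity true only when both sides are the same term (which satisfies B4-B6 in the simplest possible way) are all mine.

```python
from itertools import combinations

# Hypothetical encoding (an assumption of this sketch):
#   ("atom", "F", "t1")   F(t1)
#   ("eq", "t1", "t2")    t1 = t2
#   ("not", A), ("imp", A, B), ("all", "x", A)

def subst(f, var, term):
    """Put term for every occurrence of var in f, i.e. f(term/var)."""
    tag = f[0]
    if tag == "atom":
        return ("atom", f[1]) + tuple(term if a == var else a for a in f[2:])
    if tag == "eq":
        return ("eq",) + tuple(term if a == var else a for a in f[1:])
    if tag == "not":
        return ("not", subst(f[1], var, term))
    if tag == "imp":
        return ("imp", subst(f[1], var, term), subst(f[2], var, term))
    if tag == "all":
        return f if f[1] == var else ("all", f[1], subst(f[2], var, term))
    raise ValueError(tag)

def exists(var, body):
    """The existential quantifier, defined from the universal one."""
    return ("not", ("all", var, ("not", body)))

def value(f, sigma, atoms):
    """Truth value of f relative to the set sigma of designating terms."""
    tag = f[0]
    if tag == "atom":
        return atoms.get(f, 0)
    if tag == "eq":
        # Simplest identity-normal choice: only T = T is true (B4-B6 hold).
        return 1 if f[1] == f[2] else 0
    if tag == "not":                                        # B1
        return 1 - value(f[1], sigma, atoms)
    if tag == "imp":                                        # B2
        return 1 if value(f[1], sigma, atoms) == 0 or value(f[2], sigma, atoms) == 1 else 0
    if tag == "all":                                        # B3, true when sigma is empty
        return 1 if all(value(subst(f[2], f[1], t), sigma, atoms) == 1 for t in sigma) else 0
    raise ValueError(tag)

def existence_matches_membership(terms):
    """Check that (Ex)(x = T) gets 1 exactly when T belongs to sigma."""
    for size in range(len(terms) + 1):
        for sigma in combinations(terms, size):
            for t in terms:
                claim = exists("x", ("eq", "x", t))
                if value(claim, sigma, {}) != (1 if t in sigma else 0):
                    return False
    return True

print(existence_matches_membership(["t1", "t2", "t3"]))
```

On this sketch every subset of the three assumed terms is tried as $\Sigma$, and the existence claim tracks membership in $\Sigma$, as the argument in the text leads one to expect.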

Free Probability Logic is the result of similarly relativizing the constraints placed upon probability functions in Standard Probability Logic. With only absolute probability functions attended to here,¹⁶ let $\Sigma$ again be a possibly empty set of terms of FL_ and let $P_\Sigma$ be a unary function from the statements of FL_ to the reals. Then $P_\Sigma$ is said to constitute an identity-normal probability function for FL_ if it obeys the following ten autonomous constraints, adaptations and simplifications of Popper's constraints in [21]:¹⁷

C1. $0 \leq P_\Sigma(A)$
C2. $P_\Sigma(\neg(A \,\&\, \neg A)) = 1$
C3. $P_\Sigma(\neg A) = 1 - P_\Sigma(A)$
C4. $P_\Sigma(A) = P_\Sigma(A \,\&\, B) + P_\Sigma(A \,\&\, \neg B)$
C5. $P_\Sigma(A \,\&\, B) \leq P_\Sigma(B \,\&\, A)$
C6. $P_\Sigma(A \,\&\, (B \,\&\, C)) \leq P_\Sigma((A \,\&\, B) \,\&\, C)$
C7. $P_\Sigma(A \,\&\, (\forall X)B) = P_\Sigma(A)$ if $\Sigma = \emptyset$, otherwise $= P_\Sigma(A \,\&\, (\ldots(B(T_1/X) \,\&\, B(T_2/X)) \,\&\, \ldots \,\&\, B(T_n/X)))$ or $= \lim_{n \to \infty} P_\Sigma(A \,\&\, (\ldots(B(T_1/X) \,\&\, B(T_2/X)) \,\&\, \ldots \,\&\, B(T_n/X)))$, where $T_1, T_2, \ldots,$ and $T_n$ in the first case and $T_1, T_2, \ldots, T_n, \ldots$ in the second are in alphabetical order the various members of $\Sigma$
C8. $P_\Sigma(T = T) = 1$
C9. If $P_\Sigma(T = T') = 1$, then $P_\Sigma(A) = P_\Sigma(A(T'//T))$, where A is atomic
C10. If one of T and T' belongs to $\Sigma$ but the other one does not, then $P_\Sigma(T = T') = 0$.

Note as regards constraint C7 that $P_\emptyset(\neg(A \,\&\, \neg A) \,\&\, (\forall X)B) = P_\emptyset(\neg(A \,\&\, \neg A))$. But $P_\Sigma(\neg(A \,\&\, \neg A) \,\&\, (\forall X)B)$ is easily shown to equal $P_\Sigma((\forall X)B)$. So, by C2, $P_\emptyset((\forall X)B) = 1$, as expected.¹⁸
These matters attended to, declare a statement A of FL_ logically true in the probability sense if $P_\Sigma(A) = 1$ for every identity-normal probability function $P_\Sigma$ for FL_. Proof can then be retrieved from Section 4 of [13] that

$\vdash_{F=} A$ if, and only if, A is logically true in the probability sense.
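For the quantifier-free part of the language, a weighted mixture of two-valued truth-value assignments is a familiar source of absolute probability functions. The sketch below checks such a mixture against the propositional constraints, assuming they read as follows: $0 \leq P(A)$; $P(\neg(A \,\&\, \neg A)) = 1$; $P(\neg A) = 1 - P(A)$; $P(A) = P(A \,\&\, B) + P(A \,\&\, \neg B)$; $P(A \,\&\, B) \leq P(B \,\&\, A)$; $P(A \,\&\, (B \,\&\, C)) \leq P((A \,\&\, B) \,\&\, C)$. The two atoms, the weights, and every name in the code are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

ATOMS = ("p", "q")  # assumed atomic statements
# Every two-valued assignment to the atoms, in a fixed order.
WORLDS = [dict(zip(ATOMS, bits)) for bits in product((0, 1), repeat=len(ATOMS))]

def Not(a):
    return ("not", a)

def And(a, b):
    return ("and", a, b)

def truth(f, world):
    """Two-valued truth value of f on one assignment."""
    tag = f[0]
    if tag == "atom":
        return world[f[1]]
    if tag == "not":
        return 1 - truth(f[1], world)
    if tag == "and":
        return truth(f[1], world) * truth(f[2], world)
    raise ValueError(tag)

def make_probability(weights):
    """P(A) = total weight of the assignments on which A is true."""
    assert sum(weights) == 1 and all(w >= 0 for w in weights)
    def P(f):
        return sum(w * truth(f, world) for w, world in zip(weights, WORLDS))
    return P

def satisfies_propositional_constraints(P, pool):
    """Check the six assumed constraints on every triple from pool."""
    for a, b, c in product(pool, repeat=3):
        ok = (P(a) >= 0
              and P(Not(And(a, Not(a)))) == 1
              and P(Not(a)) == 1 - P(a)
              and P(a) == P(And(a, b)) + P(And(a, Not(b)))
              and P(And(a, b)) <= P(And(b, a))
              and P(And(a, And(b, c))) <= P(And(And(a, b), c)))
        if not ok:
            return False
    return True

p, q = ("atom", "p"), ("atom", "q")
P = make_probability([Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)])
print(satisfies_propositional_constraints(P, [p, q, Not(p), And(p, q)]))
```

Putting all the weight on a single assignment yields a two-valued function, which agrees with the remark in Note 16 that truth-value functions are the two-valued absolute probability functions.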

Results analogous to the two just obtained hold of course for Free Logic without Identity: write 'FL' everywhere for 'FL_', delete all occurrences of the qualifier 'identity-normal', drop constraints B4-B6 on page 13 and constraints C8-C10 above, write '$\vdash_F$' in place of '$\vdash_{F=}$' in the two results in question, and the trick is done. So, as axiomatized in this paper, Free Logic with and without Identity is sound and complete in the probability as well as the truth-value sense.¹⁹
VI

Inclusive Logic is but a timid prefiguration of Free Logic, which it antedates by some eight years.²⁰ Both logics acknowledge $\emptyset$ as a domain, thus lifting the first of the two presuppositions mentioned on page 2. But, whereas Free Logic lifts the second as well, Inclusive Logic does not: given any domain D other than $\emptyset$, it requires each of the terms '$t_1$', '$t_2$', ..., '$t_n$', ... to designate a member of D. So Inclusive Logic is exactly like Standard Logic except for counting $\emptyset$ a domain.²¹ Axiomatizing it, however, is a delicate affair, and one - we now know - that has not been properly attended to in the past.²² Helpful in the process will be Quine's phrase "holding for $\emptyset$" in [22], and the test he proposed there for deciding whether a theorem of Standard Logic holds for $\emptyset$. Adapting Quine's instructions to suit the present context, mark the atomic statements of the sort T = T as true, mark the universal quantifications as true and the existential ones as false, and apply truth-value considerations. If the theorem of Standard Logic you are testing turns out to be a tautology, then the theorem in question holds for $\emptyset$; otherwise it does not.
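The test just described is mechanical enough to sketch in a few lines. The encoding of statements as tuples and all function names below are assumptions of the sketch, not Quine's or the paper's notation; atoms other than those of the sort T = T are treated as independent propositional letters.

```python
from itertools import product

# Hypothetical encoding: ("atom", "F", "t1"), ("eq", "t1", "t2"),
# ("not", A), ("imp", A, B), ("and", A, B), ("or", A, B),
# ("all", "x", A), ("some", "x", A).

def reduce_for_empty(f):
    """Mark universals true, existentials false, and T = T true."""
    tag = f[0]
    if tag == "all":
        return ("const", 1)
    if tag == "some":
        return ("const", 0)
    if tag == "eq" and f[1] == f[2]:
        return ("const", 1)
    if tag in ("atom", "eq", "const"):
        return f
    return (tag,) + tuple(reduce_for_empty(g) for g in f[1:])

def letters_of(f):
    """Collect the remaining atoms, read as propositional letters."""
    tag = f[0]
    if tag in ("atom", "eq"):
        return {f}
    if tag == "const":
        return set()
    return set().union(*(letters_of(g) for g in f[1:]))

def evaluate(f, v):
    tag = f[0]
    if tag == "const":
        return f[1]
    if tag in ("atom", "eq"):
        return v[f]
    if tag == "not":
        return 1 - evaluate(f[1], v)
    if tag == "imp":
        return max(1 - evaluate(f[1], v), evaluate(f[2], v))
    if tag == "and":
        return min(evaluate(f[1], v), evaluate(f[2], v))
    if tag == "or":
        return max(evaluate(f[1], v), evaluate(f[2], v))
    raise ValueError(tag)

def holds_for_empty(f):
    """True when the reduced statement is a tautology."""
    g = reduce_for_empty(f)
    letters = sorted(letters_of(g))
    return all(evaluate(g, dict(zip(letters, bits))) == 1
               for bits in product((0, 1), repeat=len(letters)))

Fx, Ft = ("atom", "F", "x"), ("atom", "F", "t1")
specification = ("imp", ("all", "x", Fx), Ft)
guarded = ("imp", ("some", "x", ("or", Fx, ("not", Fx))), specification)
print(holds_for_empty(specification), holds_for_empty(guarded))
```

On this sketch the unguarded specification statement fails the test, while the same statement guarded by an existential antecedent passes it, since the antecedent is marked false.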
The axioms of IL_ are to be:
(i) all the statements of IL_ of the sorts A1-A6 and FA8,
(ii) all the statements of IL_ of the sort

IA9. $(\exists X)(A \lor \neg A) \supset ((\forall X)A \supset A(T/X))$,

an axiom schema I borrow from [15],
(iii) all the statements of IL_ of the sort

IA10. $(\forall Y)(\exists X)(A \lor \neg A)$,

the axiom schema discussed in Note 22, plus of course
(iv) all the axioms that can be gotten from the foregoing by means of Fitch's Clause.
IA9 obviously ensures that when one's domain is non-empty, Inclusive Logic is exactly like Standard Logic. As for IA10, it ensures - it so turns out - that $\emptyset$ qualifies in Inclusive Logic as a domain. Note indeed that IA9 yields by Gn

$(\forall Y)((\exists X)(A \lor \neg A) \supset ((\forall X)A \supset A(Y/X)))$,

which by A4 and MP yields

$(\forall Y)(\exists X)(A \lor \neg A) \supset (\forall Y)((\forall X)A \supset A(Y/X))$,

which by IA10 and MP again yields of course

$(\forall Y)((\forall X)A \supset A(Y/X))$,

i.e. FA7'. So all the theorems of Free Logic with Identity are provable in Inclusive Logic with Identity. But Free Logic with Identity was so axiomatized as to make room for $\emptyset$ as a domain. Hence so is Inclusive Logic with Identity. Hence Inclusive Logic with Identity is so axiomatized here as to be exactly like Standard Logic with Identity except for owning $\emptyset$ as a domain.
The truth-value semantics of Section V is easily adjusted to suit Inclusive Logic with Identity, as is the probability one: take $\emptyset$ and the set {'$t_1$', '$t_2$', ..., '$t_n$', ...} of all the terms of IL_ to be the only $\Sigma$'s to which the truth-value functions and the probability functions there are relativized.²³ Proof that, given the present axiomatization of and semantics for IL_, IL_ is sound and complete in both the truth-value and the probability sense can be retrieved from [10] and [13], but the retrieval is a bit laborious at places.
IL is readily gotten from IL_: (i) drop '=' of course, (ii) substitute Fine's axiom schema FA6 for A6 and drop axiom schema FA8, and (iii) drop constraints B4-B6 on page 13 and constraints C8-C10 on page 14.
The relationship between Standard Logic, Inclusive Logic, and Free Logic can be depicted as follows, '=' ignored from now on (and without prejudice) to expedite matters:

$FL \subset IL \subset SL$

Note for proof that (i) every theorem of FL is provable in IL, as we just saw, but some statements of the sort IA9 are not provable in FL, as I shall establish below, and (ii) every theorem of IL is obviously provable in SL, but some statements of the sort $(\forall X)A \supset A(T/X)$ are not provable in FL, '$(\forall x_1)(F(x_1) \,\&\, \neg F(x_1)) \supset (F(t_1) \,\&\, \neg F(t_1))$' being the most obvious case in point. So, claims to the contrary notwithstanding, Free Logic is but a sublogic of Inclusive Logic, and of course Inclusive Logic is but a sublogic of Standard Logic.²⁴ ²⁵
So the only two items of business left concerning the present axiomatization of Inclusive Logic are showing that IA9 is independent of the axiom schemata of FL and that IA10 is independent of the other axiom schemata of IL.


As regards IA9, understand by the $t_1$-rewrite of a statement A of IL the result of deleting all the quantifiers that occur in A and substituting the term '$t_1$' for every variable that occurs in the resulting quasi-statement; and let $\alpha_\Gamma$ be any truth-value function for IL that assigns 1 to every atomic statement of IL except '$F(t_2)$'. It is clear that every statement of IL of any of the sorts A1-A5, FA6, and FA7' evaluates to 1 on $\alpha_\Gamma$. It is also clear that the $t_1$-rewrite of any statement of IL gotten from the preceding axioms by means of Fitch's Clause also evaluates to 1 on $\alpha_\Gamma$, and that the $t_1$-rewrite of the consequent of a conditional of IL is sure to evaluate to 1 under $\alpha_\Gamma$ if the $t_1$-rewrite of the conditional in question and that of its antecedent evaluate themselves to 1 under $\alpha_\Gamma$. Yet the $t_1$-rewrite

of this statement of IL of the sort IA9:

does evaluate to 0 on $\alpha_\Gamma$.
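The rewrite and the evaluation it feeds can be sketched as follows. Since the displayed statement did not survive in the text above, the instance of IA9 used here, $(\exists x_1)(F(x_1) \lor \neg F(x_1)) \supset ((\forall x_1)F(x_1) \supset F(t_2))$, is my own assumption, as are the tuple encoding, the convention that variables are the names beginning with 'x', and the function names.

```python
# Hypothetical encoding: ("atom", "F", "x1"), ("not", A), ("imp", A, B),
# ("or", A, B), ("and", A, B), ("all", "x1", A), ("some", "x1", A).

def t1_rewrite(f):
    """Delete every quantifier and put 't1' for every variable."""
    tag = f[0]
    if tag == "atom":
        return ("atom", f[1]) + tuple("t1" if a.startswith("x") else a for a in f[2:])
    if tag in ("all", "some"):
        return t1_rewrite(f[2])
    return (tag,) + tuple(t1_rewrite(g) for g in f[1:])

def evaluate(f, false_atoms):
    """Assign 1 to every atom except those listed in false_atoms."""
    tag = f[0]
    if tag == "atom":
        return 0 if f in false_atoms else 1
    if tag == "not":
        return 1 - evaluate(f[1], false_atoms)
    if tag == "imp":
        return max(1 - evaluate(f[1], false_atoms), evaluate(f[2], false_atoms))
    if tag == "or":
        return max(evaluate(f[1], false_atoms), evaluate(f[2], false_atoms))
    if tag == "and":
        return min(evaluate(f[1], false_atoms), evaluate(f[2], false_atoms))
    raise ValueError(tag)

Fx1, Fx2, Ft2 = ("atom", "F", "x1"), ("atom", "F", "x2"), ("atom", "F", "t2")
FALSE_ATOMS = {Ft2}  # every atomic statement gets 1 except F(t2)

# Assumed instance of IA9 (the original display is lost).
ia9_instance = ("imp",
                ("some", "x1", ("or", Fx1, ("not", Fx1))),
                ("imp", ("all", "x1", Fx1), Ft2))
# An instance of FA7' for comparison.
fa7_prime_instance = ("all", "x2", ("imp", ("all", "x1", Fx1), Fx2))

print(evaluate(t1_rewrite(ia9_instance), FALSE_ATOMS),
      evaluate(t1_rewrite(fa7_prime_instance), FALSE_ATOMS))
```

Under these assumptions the rewrite of the IA9 instance comes out 0 while that of the FA7' instance comes out 1, which is the contrast the independence argument turns on.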
As regards IA10, consider the following 4-valued truth-value function for IL due to Roeper:
u",(B)
~(A)

~(-A)

2/3
Ih

Ih
2/3

u",(A => B)

2h

1/3

2h
1/3

2h
1
2/3
1

1/3
Ih
1

Ih
2h

0
u",(A)

0
~(A(T/X

~VX)A)

2/3
1/3

2/3
2/3
2/3

It is easily verified that all the axioms of IL not of the sort IA10 evaluate to 1 under this function. Yet this statement of IL of the sort IA10:

evaluates to 2/3 whatever truth-value is assigned to '$F(t_1)$'. Indeed,

hence


hence

hence

hence

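LIST OF THE LETTERED FORMULAS IN SECTION I

A. $(\exists X)(X = T)$
B. $(\exists X)(A \equiv A(T/X))$
C. $(\forall X)(\forall Y)A \supset (\forall Y)(\forall X)A$ (= FA6)
D. $(\exists X)(X = T) \supset ((\forall X)A \supset A(T/X))$ (= Theorem 1)
E. $(\exists X)(A \equiv A(T/X)) \supset ((\forall X)A \supset A(T/X))$ (= Theorem 5)
F. $(\exists X)(X = T) \supset T = T$
G. $(\forall X)A \supset A(T/X)$
H. $T = T$ (= FA8)
I. $(\forall Y)(\exists X)(X = Y)$ (= FA7)
J. $(\forall Y)((\forall X)A \supset A(Y/X))$ (= FA7')
K. $(\forall Y)(\exists X)(A \equiv A(Y/X))$ (= FA7'')
L. $(\exists X)(A \lor \neg A) \supset ((\forall X)A \supset A(T/X))$ (= IA9)
M. $(\forall Y)(\exists X)(A \lor \neg A)$ (= IA10)

Université du Québec à Montréal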
NOTES

1 Free Logic dates back to 1959, the year that saw the publication of [14], a paper by Leblanc and Hailperin, and the publication of [6], a paper by Hintikka. The Free Logic in both cases is one with Identity. Free Logic without Identity dates back to 1963, the year that saw the publication of [7], a paper by Lambert. As regards the first of the two presuppositions, recall Russell's remark on p. 203 of [23]: "The primitive propositions in Principia Mathematica are such as to allow the inference that at least one individual exists. But I now regard this as a defect in logical purity."
2 A is an adaptation of an axiom of Tarski's in [24], where - free variables doing duty in effect for terms - a variable other than X occurs in place of T.
3 The result is Fine's in [4].
4 This is explicitly done in [2], but should be done as well in [8].
5 [19] predates [14] and [6], and influenced the writing of [14]. Leonard's Law appears as an axiom schema in several axiomatizations of Free Logic with Identity and in all axiomatizations of what is known as the Logic of Existence. On the latter occasions, $(\exists X)(X = T)$ is either abridged as or shown logically equivalent to $E!T$, 'E!' a predicate familiar to readers of Principia Mathematica.
6 That D, generally and rightfully held characteristic of Free Logic with Identity, nonetheless follows from axioms suiting Standard as well as Free Logic was first shown in [14]. The proof in Section II is a simplification of that proof and a later one in [9].
7 E and, on page 2, K first appeared in [10], p. 167. I also reported there that the upcoming formulas J and K are interprovable, a matter I did not pursue any further at the time. Incidentally, credit for E should be shared with Cocchiarella, who reported to me after the publication of [10] that E appeared in the original - though not in the published - version of [3].
8 The order in which the individual variables and individual terms of L_, SL_, FL_, as listed here will be known as their alphabetical order.
9 The present treatment of universal quantifications allows one to dispense with the distinction between bound variables and free ones, a distinction which has proved particularly irksome in writings on Free Logic and Inclusive Logic. See [15] on this matter.
10 When I does not occur in A, each of A(I'/I) and A(I'//I) is of course A.
11 Since A here is a statement and identical quantifiers cannot overlap, X is sure not to appear in A. A quantification $(\forall X)A$ is said to be vacuous, and its quantifier $(\forall X)$ to be a vacuous quantifier, when X does not occur in A. See Note 24 for more on the converse $(\forall X)A \supset A$ of A5.
12 The restriction on X guarantees of course that $(\forall X)A(X/T)$ is a statement of L. A like remark applies on later occasions but will not be repeated. Note that when X is foreign to A, $(\forall X)A(T/X)$ is but the vacuous quantification $(\forall X)A$.
13 The restriction placed either on T as here or on T' as in the proof of Theorem 1 and that of Theorem 2 on page 6 is crucial. One example will suffice. Though the conditional

is provable from

for every i larger than 1, it is not from

14 Bencivenga had already shown in [1] that FA7 is independent of A1-A6, FA8, and Leonard's Specification Law for Free Logic with Identity (= Theorem 1), hence in effect that the law in question could not substitute for either of FA7 and FA7' in the foregoing axiomatizations of FL_. He also showed there that this fascinating law:

$(\forall X)((\exists Y)(Y = X) \supset A) \supset (\forall X)A$,

is independent of A1-A6, FA8, and Leonard's Law. However, Bencivenga's Law - as I take leave to call it - readily follows from A4, Lemma 2, and FA7. Note indeed that

$(\forall X)((\exists Y)(Y = X) \supset A) \supset ((\forall X)(\exists Y)(Y = X) \supset (\forall X)A)$

by A4, hence

$(\forall X)(\exists Y)(Y = X) \supset ((\forall X)((\exists Y)(Y = X) \supset A) \supset (\forall X)A)$

by Lemma 2, and hence

$(\forall X)((\exists Y)(Y = X) \supset A) \supset (\forall X)A$

by FA7 and MP.
15 The relationship between

$T = T'$

and

$A(T'/T)$

is even closer than the foregoing results suggest. Note indeed that

$A \equiv A(T/T)$,
If $A \equiv A(T/T')$, then $A \equiv A(T'/T)$,

and

If $A \equiv A(T/T')$ and $A \equiv A(T'/T'')$, then $A \equiv A(T/T'')$.

So, like Identity, Substitutivity in the sense of '/' is an equivalence relation. Note also that as

$(\exists X)(X = T) \supset T = T$

is provable in FL_, so - trivially, to be sure - is its counterpart

$(\exists X)(A \equiv A(X/T)) \supset (A \equiv A(T/T))$

in FL.
16 For a similar treatment of the matter with conditional probability functions rather than absolute ones, see Section 5 of [13]. However, absolute probability functions particularly suit the present occasion: as shown in [12], truth-value functions are those (and those only) among absolute probability functions that are two-valued.
17 See [12] on this matter. The constraints in question are autonomous in that, far from presupposing this Interchange Law of other axiomatizations of absolute probability theory:

If $\vdash_{=} A \equiv B$, then $P_\Sigma(A) = P_\Sigma(B)$,

they permit proof of it.
18 For more on constraint C7, particularly the presence in it of the conjunct A, see [17]. Unlike the account of $\alpha_\Sigma$, that of $P_\Sigma$ is not a recursive one. In point of fact no recursive account of $P_\Sigma$ can be had since $P_\Sigma(A \,\&\, B)$ for atomic A and B is not always a numerical function of $P_\Sigma(A)$ and $P_\Sigma(B)$.
19 Many a model-theoretic semantics for Free Logic will be found in the literature, the earliest being undoubtedly van Fraassen's in [25], which introduced the celebrated method of supervaluation. Two others, of quite a different character, will be found in [16] and [18]. The second of these introduced the two-domains method favored by many.
20 Two of the earliest papers on Inclusive Logic are [20] and [22], which appeared in 1951 and 1954, respectively.
21 My understanding of what counts as Free Logic and what as Inclusive Logic is, I believe, the more common one. In [2], however, Bencivenga considers it characteristic of an inclusive logic that it lifts the first presupposition on page 1, and characteristic of a free logic that it lifts the second. Under this understanding of things, FL_ and FL would be free logics that are inclusive, and IL_ and IL would be inclusive logics that are not free. FL_ and FL are readily made into free logics that are not inclusive: simply enlist $(\exists X)(A \lor \neg A)$ as an extra axiom schema of both logics.

22 Absent indeed from previous axiomatizations of IL_ and IL, the one in [15] among the latter, is IA10 in (iii) on page 15, an axiom schema recently shown by Roeper to be independent of those in [15]. Yet, no matter the truth-value function $\alpha_\Sigma$ or the probability one $P_\Sigma$ for IL_, statements of IL_ of the sort IA10 all evaluate to 1 under $\alpha_\Sigma$ and $P_\Sigma$ and hence are all logically true.
23 That the statements of IL_ of the sort IA10 all evaluate to 1 under the resulting functions is obvious enough.
24 Studied in [20] and [15] are Inclusive Logics where $(\forall X)A$, when vacuous, is provably equivalent to A. Quine's test will suit them if vacuous quantifiers are deleted before a statement is subjected to the test. The axiom schemata of IL in [15] are A1-A3, A5, the converse of A5, and this restricted version of A4:

$(\forall X)(A \supset B) \supset ((\forall X)A \supset (\forall X)B)$, so long as X is foreign to B.

Whether that logic is complete in the sense of this paper has yet to be ascertained.
25 The claim in question was made in [16] for instance.
26 Thanks are due to William A. Wisdom (Temple University) whose queries concerning the various axiomatizations of FL_ prompted the writing of this paper; to Ermanno Bencivenga (University of California, Irvine) who brought to my attention his 1978 paper (hence, to what I call in Note 14 Bencivenga's Law) and to whose 1986 paper Section III owes much; to Willard V. Quine who noted that wanted in B, and hence in E and K, is the single slash rather than the double one which I had originally used; and to Peter Roeper (Australian National University) who suggested the present version of constraint B5 and contributed the closing independence proof of Section VI. Thanks are also due to Raymond Gumb (University of Lowell), Lisa Pastino (Temple University), Gilles St-Louis (l'Université du Québec à Montréal), and to William A. Wisdom (Temple University) who read earlier versions of the paper; and last, but not least, to Alain Voizard (Université du Québec à Montréal) who also read earlier versions of the paper and translated it into French. And thanks are due to the Social Sciences and Humanities Research Council of Canada which supported the research leading to it. A partial version of the text was read in December of 1992 at Concordia University (Montreal), the present version was read in October of 1993 at the Universität Salzburg, and a French translation of it was read in April of 1993 at l'Institut d'Histoire et de Philosophie des Sciences et des Techniques (Paris).

REFERENCES
1. Bencivenga, E., 1978, 'A Semantics for a Weak Free Logic', Notre Dame Journal of Formal
Logic 19, 646-652.
2. Bencivenga, E., 1986, 'Free Logics', in Handbook of Philosophical Logic, vol. 3, D.
Reidel Publishing Company, Dordrecht, pp. 373-426.
3. Cocchiarella, N. B., 1966, 'A Logic of Actual and Possible Objects', The Journal of Symbolic
Logic 31, 689-690.
4. Fine, K., 1983, 'The Permutation Principle in Quantificational Logic', Journal of
Philosophical Logic 12, 31-37.
5. Fitch, F. B., 1948, 'Intuitionistic Modal Logic with Quantifiers', Portugaliae Mathematica
7, 113-118.
6. Hintikka, J., 1959, 'Existential Presuppositions and Existential Commitments', The Journal
of Philosophy 56, 125-137.
7. Lambert, K., 1963, 'Existential Import Revisited', Notre Dame Journal of Formal Logic
4, 288-292.
8. Lambert, K., 1991, 'The Nature of Free Logic', in Philosophical Applications of Free Logic,
Oxford University Press, New York - Oxford.
9. Leblanc, H., 1968, 'On Meyer and Lambert's Quantificational Calculus FQ', The Journal
of Symbolic Logic 33, 275-280.


10. Leblanc, H., 1971, 'Truth-Value Semantics for a Logic of Existence', Notre Dame Journal
of Formal Logic 12, 153-168.
11. Leblanc, H., 1976, Truth-Value Semantics, North-Holland Publishing Company, Amsterdam
New York Oxford.
12. Leblanc, H., 1982, 'Popper's 1955 Axiomatization of Absolute Probability', Pacific
Philosophical Quarterly 63, 133-145.
13. Leblanc, H., 1983, 'Alternatives to Standard First-Order Semantics', in Handbook of
Philosophical Logic, vol. 1, D. Reidel Publishing Company, Dordrecht, pp. 189-274.
14. Leblanc, H. and Hailperin, T., 1959, 'Nondesignating Singular Terms', The Philosophical
Review 68, 239-243.
15. Leblanc, H. and Meyer, R. K., 1969, 'Open Formulas and the Empty Domain', Archiv
für mathematische Logik und Grundlagenforschung 12, 78-84.
16. Leblanc, H. and Meyer, R. K., 1982, 'On Prefacing ('v'X)A ::J A(Y/X) with ('v'Y): A Free
Quantification Theory without Identity', in Existence, Truth, and Provability, State
University of New York Press, Albany, pp. 58-75. The paper there is an amended version
of the original, which had appeared in Zeitschrift für mathematische Logik und Grundlagen
der Mathematik 12, 1971, pp. 153-168.
17. Leblanc, H. and Roeper, P. 1993, 'On Getting the Constraints on Popper's Probability
Functions Right', Philosophy of Science 60, 151-157.
18. Leblanc, H. and Thomason, R. H., 1968, 'Completeness Theorems for Some Presupposition-Free Logics', Fundamenta Mathematicae 62, 125-164.
19. Leonard, H. S., 1956, 'The Logic of Existence', Philosophical Studies 7, 49-64.
20. Mostowski, A., 1951, 'On the Rules of Proof in the Pure Functional Calculus', The Journal
of Symbolic Logic 16, 107-111.
21. Popper, K. R., 1959, The Logic of Scientific Discovery, Basic Books, Inc., New York.
22. Quine, W. V., 1954, 'Quantification and the Empty Domain', The Journal of Symbolic Logic 19, 177-179.
23. Russell, B., 1919, Introduction to Mathematical Philosophy, George Allen and Unwin, Ltd., London.
24. Tarski, A., 1965, 'A Simplified Formulation of Predicate Logic with Identity', Archiv für mathematische Logik und Grundlagen der Mathematik 7, 61-79.
25. van Fraassen, B. C., 1966, 'Singular Terms, Truth-value Gaps, and Free Logic', The Journal of Philosophy 67, 481-495.

FRANÇOIS LEPAGE

PARTIAL PROPOSITIONAL LOGIC

From a strictly formal point of view, partial logic is a particular variety of trivalent logic. It gets its unique character from the specific interpretation it gives to the third truth-value, which is not considered to be an intermediate value having a status equivalent to that of true and false, but takes the place of a specific notion, that of the undefined.
Notwithstanding this relatively rigid constraint, it remains possible to construct partial logics having very different interpretations from one another. The reason is that the notion of undefinedness can itself be the object of a variety of interpretations.
Thus, some have described languages in which the undefined possesses a genuine semantic status; the definition of truth for these languages is such that, for structural reasons, some statements have no truth-value. Others have introduced the undefined in order to give a status to paradoxical statements (we are thinking here of the paradoxical statements of naive set theory). Others, finally, have introduced the undefined as the semantic value of undemonstrable statements. These different flavours of undefinedness have given rise to systems obedient to very different laws.¹
The motivations at the root of the partial logics to be discussed here are of a completely different order, exclusively epistemic. The general intuition is relatively simple. To what extent can the rift between the pair 'formal system'/'classical interpretation' and the pair 'system concretely implanted in an agent'/'interpretation based on the knowledge of the world that the agent possesses' be mended by supposing that an interpretation based on the knowledge of the world that the agent possesses resembles a classical interpretation which contains gaps? After all, one of the characteristic properties of an agent is not to be omniscient, to have a rather piecemeal semantic representation.
This approach, once again of a purely epistemic inspiration, imposes
straightaway a formal constraint which will be the common characteristic of
all the systems under consideration here: the constraint of monotony. Returning
to the analogy that compares a partial interpretation to an interpretation with
gaps in it, the constraint of monotony simply means that these gaps must be
capable of being filled in a coherent way and, as soon as they are, the resulting
interpretations will approximate classical interpretations.
Most of what follows is situated in the context of the propositional theory
of types, where partial logic is particularly interesting. We will, furthermore,
point out certain results that are valid in type theory in general and others
that are uniquely valid for the first order.
The idea of elaborating a partial logic in type theory is relatively
M. Marion and R. S. Cohen (eds.), Quebec Studies in the Philosophy of Science I, 23-39.
1995 Kluwer Academic Publishers.


recent. Pavel Tichý [Tic82] not only formulated the project explicitly, but
gave motivations for such an enterprise. His motivations remain, in my opinion,
globally valid, and it is worth recalling them here.
The first motivation is that 'the logic underlying ordinary language - and
hence our conceptual scheme - is that of the simple type theory'. What is meant
here is, without doubt, that type theory possesses sufficient resources for
expressing the inferential structures one finds in natural languages; structures whose complexity often exceeds the expressive capacity of first order
calculi. The second motivation is that it is necessary to enrich the classical
type theory from inside 'by dropping the totality assumption and treating partial
functions on a par with total ones'. In fact, 'the partial theory seems to
provide a medium which yields an analysis for any linguistic expression, and
affords a universal explication of logical entailment'.
We see that for Tichý the idea of introducing partial functions in type theory
is not presented as the adoption of a particular hypothesis but, on the contrary,
as the lifting of an arbitrary restriction, i.e., that which consists of using only
a subset of functions for semantic values - the subset of total functions,
which are defined for all the arguments of the right category. It is difficult
to determine whether Tichý's motivation was primarily epistemic. What is
certain, at any rate, is that his motives can be interpreted in an epistemic sense.
Tichý did not completely realize his project, in part because he did not know
how to use the full power of the partial theory of types, lacking as he did a
fundamental technical notion to which I will come back later. Now we will
present the framework that will serve to introduce partial interpretations, i.e.,
the theory of propositional types.
THE THEORY OF PROPOSITIONAL TYPES

The partial theory of types results from generalizing the well-known simple
theory of types by authorizing the presence of partially defined entities. Let
us recall some elementary facts regarding type theory, in particular regarding
the theory of propositional types.
Simple type theory is a simplified version of Russell's ramified theory of
types. It was initially formulated by Chwistek and Ramsey, and the first full
presentation is that of Church [Chu40].
In its contemporary versions, the theory of types consists first of the construction of a hierarchy of functions in the following way.²

DEFINITION 1. The set of types T is the smallest set such that
(i) $e \in T$ (e is the type of individuals)
(ii) $t \in T$ (t is the type of truth values)
(iii) if $\alpha, \beta \in T$, then $(\alpha\beta) \in T$.

The domains of each type are:

DEFINITION 2. For each $\alpha \in T$, the set of entities of type $\alpha$ is the set $D_\alpha$ such that
(i) $D_e = E$ (where E is a non-empty set; E is the set of individuals);
(ii) $D_t = \{0, 1\}$ (the set of truth values);
(iii) $D_{(\alpha\beta)} = D_\beta^{D_\alpha}$ (the set of functions of $D_\alpha$ in $D_\beta$; we will simply write $D_{\alpha\beta}$).

It is possible to provide a logical calculus comprising a denumerably infinite
number of variables of each type and a very small number of logical constants,
three in fact: the functional abstractor, its converse (functional application)
and identity. Henkin [Hen50] proposed an axiomatization of such a system and
proved its completeness in a general sense. Gallin [Gal75] even proposed an
intensional version and gave a complete axiomatization, again for general models.
For the time being we will concern ourselves with an essentially simple
calculus, based on the hierarchy constructed exclusively from $D_t = \{0, 1\}$
and dropping clause (i) of Definitions 1 and 2, which we will call the theory
of propositional types. This calculus is interesting for two reasons. The first
of these is purely theoretical: it does away with the hypothesis that objects
exist. All valid statements of this calculus express properties that rest ultimately
on propositional forms in general. In that sense, this calculus is a true logic
(even though it contains variables for propositional functors of every order and
allows quantification over these variables) and has a status vis-à-vis the theory
of types in general analogous to that which the propositional calculus has
vis-à-vis first-order predicate calculus: its valid statements constitute a 'hard
core' of the set of valid statements of the simple theory of types.
The second reason is more pragmatic: every domain $D_\alpha$ is finite, a fact which
sometimes simplifies things greatly, as we will see.
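The finiteness of the propositional domains can be checked mechanically. The following Python sketch is only an illustration and not part of Henkin's or our presentation: it assumes that the type t is encoded as the string "t" and a type (αβ) as the pair (α, β), and the names `domain` and `size` are hypothetical. A function is listed by the tuple of its values.

```python
from functools import lru_cache
from itertools import product


@lru_cache(maxsize=None)
def domain(typ):
    """Enumerate D_typ for a propositional type.

    Assumed encoding: "t" is the type of truth values and a pair
    (a, b) is the type (ab) of functions from D_a into D_b.
    A function is represented by the tuple of its values, listed
    in the order in which domain(a) enumerates its arguments.
    """
    if typ == "t":
        return (0, 1)
    arg, res = typ
    return tuple(product(domain(res), repeat=len(domain(arg))))


def size(typ):
    """Number of entities of the given type."""
    return len(domain(typ))


TT = ("t", "t")
# Sizes of D_t, D_tt, D_t(tt), D_(tt)t and D_(tt)(tt).
print(size("t"), size(TT), size(("t", TT)), size((TT, "t")), size((TT, TT)))
# prints: 2 4 16 16 256
```

Every domain is finite, but the sizes grow as iterated exponentials, so exhaustive enumeration of this kind is practical only for the lowest types.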
It would be difficult to assign a date and place of birth to the theory
of propositional types, both for internal reasons - its fundamental concepts
are equally concepts of the simple theory of types and did not appear
simultaneously - and for external reasons - some concepts that were introduced
in articles which went mostly unnoticed were simply reinvented later on.
Here is an outline of the genesis of the theory in so far as I am capable of
reconstituting it. To simplify things, let us start with a text which presents
the theory in a complete way, a rather little-known text: 'A Theory of
Propositional Types' by Leon Henkin [Hen63]. The calculus presented by
Henkin uses only the abstractor $\lambda$, its converse and identity as primitive
symbols; it is complete and the proof of its completeness is constructive (in
fact, for every element of every domain $D_\alpha$ there is an expression of the
language of which that element is the value).
Here is a simplified presentation of that calculus. First of all, for every
type $\alpha$, there is a denumerable set $Var_\alpha = \{x_\alpha^i\}_{i \in \omega}$ of variables of that
type.

FRANÇOIS LEPAGE

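DEFINITION 3. The set $Trm_\alpha$ of terms of type $\alpha$ is the smallest set such that
(i) $Var_\alpha \subseteq Trm_\alpha$;
(ii) if $A, B \in Trm_\alpha$, then $[A \equiv B] \in Trm_t$;
(iii) if $A \in Trm_\alpha$ and $x \in Var_\beta$, then $\lambda x A \in Trm_{\beta\alpha}$;
(iv) if $A \in Trm_{\alpha\beta}$ and $B \in Trm_\alpha$, then $[AB] \in Trm_\beta$.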

We will call terms of type t statements. The logical constants are defined in
the following manner: 3
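DEFINITION 4.
$T =_{def} [\lambda x_t x \equiv \lambda x_t x]$
$F =_{def} [\lambda x_t x \equiv \lambda x_t T]$
$\neg_{tt} =_{def} \lambda x_t [x \equiv F]$
$\wedge_{t(tt)} =_{def} \lambda x_t \lambda y_t [\lambda f_{t(tt)} [[fx]y] \equiv \lambda f_{t(tt)} [[fT]T]]$
$\forall x_\alpha A =_{def} [\lambda x_\alpha A \equiv \lambda x_\alpha T]$
etc. (the other constants being introduced in the usual way). 4

We will write $[A \wedge B]$ for $[[\wedge A]B]$ and use $x_\alpha$, $y_\alpha$, $z_\alpha$, $f_\alpha$, $g_\alpha$ and $h_\alpha$ to name
any variables of type $\alpha$.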
We then define an assignation of value.
DEFINITION 5. An assignation of value $\mu$ is a function
$\mu: \bigcup_{\alpha \in T} Var_\alpha \to \bigcup_{\alpha \in T} D_\alpha$
such that $\mu(x_\alpha) \in D_\alpha$. We write $\mu(a/x)$ for the assignation that differs from $\mu$
at most by the fact that it assigns the value $a$ to $x$.
Finally, we define a valuation based on $\mu$.
DEFINITION 6. A valuation based on $\mu$ is a function
$V_\mu: \bigcup_{\alpha \in T} Trm_\alpha \to \bigcup_{\alpha \in T} D_\alpha$
such that
(i) $V_\mu(x_\alpha) = \mu(x_\alpha)$;
(ii) $V_\mu([A \equiv B]) = 1$ iff $V_\mu(A) = V_\mu(B)$, and $0$ otherwise;
(iii) $V_\mu([AB]) = V_\mu(A)(V_\mu(B))$;
(iv) $V_\mu(\lambda x_\alpha A)$ is the function which associates $V_{\mu(a/x)}(A)$ with every $a \in D_\alpha$.
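The four clauses of Definition 6, together with the definitions of the constants in Definition 4, are simple enough to run. The following Python sketch is an illustration under stated assumptions, not Henkin's own construction: terms are encoded as nested tuples, a function is encoded as a frozenset of (argument, value) pairs so that identity of functions is extensional equality, and all names (`domain`, `V`, `lam`, `TRUE`, `AND`, and so on) are hypothetical.

```python
from functools import lru_cache
from itertools import product


@lru_cache(maxsize=None)
def domain(typ):
    """D_typ; "t" is the base type, (a, b) the type of functions D_a -> D_b."""
    if typ == "t":
        return (0, 1)
    arg, res = typ
    args = domain(arg)
    # A function is a frozenset of (argument, value) pairs.
    return tuple(frozenset(zip(args, vals))
                 for vals in product(domain(res), repeat=len(args)))


def V(term, mu):
    """Valuation of Definition 6; mu maps variable names to values."""
    tag = term[0]
    if tag == "var":                      # clause (i)
        return mu[term[1]]
    if tag == "eq":                       # clause (ii)
        return 1 if V(term[1], mu) == V(term[2], mu) else 0
    if tag == "app":                      # clause (iii)
        return dict(V(term[1], mu))[V(term[2], mu)]
    if tag == "lam":                      # clause (iv)
        _, x, typ, body = term
        return frozenset((a, V(body, {**mu, x: a})) for a in domain(typ))
    raise ValueError(f"unknown term: {term!r}")


def var(x):
    return ("var", x)


def eq(a, b):
    return ("eq", a, b)


def app(a, b):
    return ("app", a, b)


def lam(x, typ, body):
    return ("lam", x, typ, body)


T_TT = ("t", ("t", "t"))                  # the type t(tt)

# The constants of Definition 4.
TRUE = eq(lam("x", "t", var("x")), lam("x", "t", var("x")))
FALSE = eq(lam("x", "t", var("x")), lam("x", "t", TRUE))
NOT = lam("x", "t", eq(var("x"), FALSE))
AND = lam("x", "t", lam("y", "t", eq(
    lam("f", T_TT, app(app(var("f"), var("x")), var("y"))),
    lam("f", T_TT, app(app(var("f"), TRUE), TRUE)))))


def forall(x, typ, body):
    return eq(lam(x, typ, body), lam(x, typ, TRUE))


# Truth table of the defined conjunction, read off its denotation.
conj = V(AND, {})
table = {(a, b): dict(dict(conj)[a])[b] for a in (0, 1) for b in (0, 1)}
print(V(TRUE, {}), V(FALSE, {}), table)
```

Under these assumptions the defined conjunction receives the classical truth table: the two abstracts over f agree on all sixteen functions of type t(tt) only when x and y both have the value 1.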

When a formula $A$ is closed, $V_\mu(A)$ is independent of $\mu$; $V_\mu(A)$ is then called
the denotation of $A$ and is written $A^d$.
The axiomatic system proposed by Henkin includes seven axioms and one
replacement rule.
H1: $[A_\alpha \equiv A]$
H2: $[[A_t \equiv T] \equiv A]$
H3: $[[T \wedge F] \equiv F]$
H4: $[[[f_{tt}T] \wedge [fF]] \equiv \forall x_t[fx]]$
H5: $[x_\alpha \equiv y_\alpha] \supset [[f_{\alpha\beta} \equiv g_{\alpha\beta}] \supset [[fx] \equiv [gy]]]$
H6: $[\forall x_\alpha[[f_{\alpha\beta}x_\alpha] \equiv [g_{\alpha\beta}x_\alpha]] \supset [f \equiv g]]$
H7: $[[\lambda x_\alpha A\,B] \equiv A\{B/x\}]$ where $B$ is free for $x$ and $A\{B/x\}$ is the formula
obtained from $A$ by substituting $B$ for each free occurrence of $x$.

RULE R. From $[A \equiv B]$ and $C$ we can derive $D$, where $D$ results from $C$ by
substituting an occurrence of $B$ for an occurrence of $A$. If $A$ is a theorem,
we write $\vdash A$.5
To demonstrate the completeness of this system, Henkin proceeds in the
following way:
(1) For any $a \in D_\alpha$, there is a closed formula, call it $a^n$, such that $(a^n)^d = a$.
This formula is a canonical name of $a$ and Henkin gives an explicit algorithm
for its construction. In particular $1^n = T$ and $0^n = F$, and $T^d = 1$ and $F^d = 0$.
(2) Let $A^\mu$ be the formula obtained from $A$ by substituting the formula
$(\mu(x_\alpha))^n$ for every free occurrence of the variable $x_\alpha$. Henkin shows that for
any formula $A$ we have $\vdash [A^\mu \equiv (V_\mu(A))^n]$.
(3) In particular, if $A$ is valid and closed, we will have $\vdash [A \equiv T]$ and from
that we can easily show that $\vdash A$. (If $A$ is valid and non-closed, we go by
way of its universal closure.)
This article of Henkin's seems not to have received the attention it deserves.
In fact I have found only two references to it, in Gallin's [Gal75] and Farmer's
[Far90]. Nevertheless it seems to be the first presentation of a higher-order
logic containing only three primitive symbols. Moreover, Henkin thought he
was the first to suggest reducing the quantifiers to these operators. On page
324 of the first version of the text, after having paid homage to Tarski
[Tar23]6 for having shown how to reduce the usual logical constants in terms
of quantifiers and identity, he says that he seems to be the first to propose
an explicit reduction of the quantifiers in terms of the $\lambda$ operator and identity.
In the published version, he adds a note thanking Peter Andrews for having
pointed out to him that this reduction had been done explicitly by Quine
[Qui56].7
If Henkin's system is the first to use only the three primitive symbols in
question, the first theory of propositional types is the protothetics of
Leśniewski. Unfortunately, we do not have Leśniewski's text, which may
have been destroyed during the liberation of Warsaw in 1944. The most faithful
version at our disposal is Jerzy Słupecki's [Slu53], published in the first
issue of Studia Logica, entitled 'St. Leśniewski's Protothetics', which was drawn
from notes taken by Leśniewski's students. Curiously, Henkin did not know
this text - indeed he thanks the referee for having pointed it out to him. This
is all the more curious since Henkin himself refers to Leśniewski in the version
of Andrzej Grzegorczyk [Grz55], 'The Systems of Leśniewski in Relation to
Contemporary Logical Research', which appeared in Studia Logica two years
later. It must be said that Grzegorczyk does not note the existence of Słupecki's
text. We will end this point by noting that Montague [Mon74] recapitulates the
essence of the translations of Henkin in 'Universal Grammar' without pointing
it out but refers explicitly to Tarski.
So much for the theory of propositional types. Let us now examine the
matter of partiality.
PARTIAL LOGIC

As for the theory of propositional types, it would be difficult to give a birth
date for partial logic. I noted earlier that the first explicit formulation of a
system of partial logic was Tichy's, but we can find many of the concepts
he used in texts whose main purpose is not to make a contribution to partial
logic.
The first serious attempt to make a systematic presentation of partial logic
is that of Blamey [Bla86]. In fact, Blamey's text contains the essential
ingredients for the elaboration of a general partial logic as well as the instructions
for doing so, but he stops at the threshold of the enterprise and restricts himself
to first-order logic.
First of all let us consider these essential elements. The first of these is
the introduction of the undefined as a semantic object, that is the introduction of the name of the undefined in the metalanguage. This may seem trivial
but it is a crucial step: without this 'ontological commitment', it does not seem
possible to describe the reiteration of functional application in a systematic
and coherent way. This is the fundamental notion lacking in Tichy that we
mentioned earlier, which led him to reject domains of functions in favour
of domains of relations and thus essentially to be restricted to the first
order. 8
The argument is simple. Suppose - we reason on the basis of an example,
the generalization of which goes without saying - that $A_\alpha$, $B_\beta$ and $C_\gamma$ are
three expressions such that $[BC]$ and $[A[BC]]$ are well-formed terms. The
metalinguistic notions of 'being defined' and 'being undefined' apply to the
values of the expressions (which are functions), and it is only by extension
- by a trivial abuse of language - that these notions apply also to expressions,
depending on whether their values are or are not defined. It is necessary not to
confuse our use of the notion 'undefined' with another use, that which consists
of calling an expression undefined if it is not well-formed, that is, if it is not
a meaningful term. In this case one should rather speak of nonsense.
In the classical context, the expressions $A$, $B$, $C$ would take total functions
as values, so the value of $[BC]$ would be a total function of the type of
the arguments of $A$ and thus the value of $[A[BC]]$ would be a total function
as well. What would happen if the values of these expressions, still being
of the right type, were partial? Specifically, what would happen if the value
of $B$ were a partial function undefined for the argument which is the value
of $C$? The value of $[BC]$ is not defined. In that case, $[A[BC]]$ has no
value, that is, it is not possible to give a sense to the expression 'value of
$[A[BC]]$'.

A general solution to this problem consists of giving the undefined an
intratheoretical status. The undefined becomes an object like the others and
can thus be a value and an argument of a function. This simple addition
enriches the metalanguage sufficiently for us to be able to describe the whole
hierarchy of partial functions.
The idea of explicitly giving the undefined the status of 'object' goes back
to Dana Scott [Sco73] who, curiously, introduced it in order to get rid of the
hierarchy by constructing his reflexive domains which would serve to interpret
the typeless $\lambda$-calculus. We might equally say that the idea goes back to
Kleene with his 'strong' connectives, where the undefined appeared explicitly
in the truth tables. My own idea [Lep84] was to use this trick to define
the hierarchy of partial functions. Let us see how this is done. 9

DEFINITION 7. For any $\alpha \in T$, the set $PM_\alpha$ of partial functions of type $\alpha$ is
(i) $PM_t = \{0, 1, \varphi\}$;
(ii) $PM_{\alpha\beta} = (PM_\alpha \to PM_\beta)$,
where $(PM_\alpha \to PM_\beta)$ is the set of monotone functions of $PM_\alpha$ in $PM_\beta$, the
monotony being relative to the following order:
(i) for any $x \in PM_t$, $x \leq x$ and $\varphi \leq x$;
(ii) for any $f, g \in PM_{\alpha\beta}$, $f \leq g$ if and only if for any $x \in PM_\alpha$, $f(x) \leq g(x)$.

PROPOSITION 8. For any $\alpha$, $PM_\alpha$ is a meet-semi-lattice, where the meet $\wedge$
and (when it exists) the sup $\vee$ are defined respectively by the recursive
clauses
(i) for $x, y \in PM_t$, $x \wedge y = x$ if $x = y$ and $\varphi$ otherwise;
(ii) for $f, g \in PM_{\alpha\beta}$, $f \wedge g$ is the function $h$ such that for any $x \in PM_\alpha$,
$h(x) = f(x) \wedge g(x)$;
and
(iii) for $x, y \in PM_t$, $x \vee y = x$ if $y = \varphi$ or $x = y$,
$x \vee y = y$ if $x = \varphi$, and does not exist otherwise;
(iv) for $f, g \in PM_{\alpha\beta}$, $f \vee g$ is the function $h$ such that for any $x \in PM_\alpha$,
$h(x) = f(x) \vee g(x)$ if $f(x) \vee g(x)$ exists.
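Definition 7 and Proposition 8 can be explored at the lowest functional type by brute force. The following Python sketch is an illustration only: the undefined value is encoded by the string "u", a function of type tt by the triple of its values at 0, 1 and the undefined, and the names `PM_TT`, `fmeet` and `fsup` are hypothetical.

```python
from itertools import product

U = "u"            # stands for the undefined value (phi in the text)
PM_T = (0, 1, U)   # PM_t, clause (i) of Definition 7


def leq(x, y):
    """Order on PM_t: x <= x and phi <= x."""
    return x == y or x == U


def is_monotone(f):
    """f is a triple (f(0), f(1), f(phi)); check x <= y implies f(x) <= f(y)."""
    return all(leq(f[i], f[j])
               for i, j in product(range(3), repeat=2)
               if leq(PM_T[i], PM_T[j]))


PM_TT = [f for f in product(PM_T, repeat=3) if is_monotone(f)]


def meet(x, y):
    """Clause (i) of Proposition 8."""
    return x if x == y else U


def sup(x, y):
    """Clause (iii) of Proposition 8; None signals that no sup exists."""
    if y == U or x == y:
        return x
    if x == U:
        return y
    return None


def fmeet(f, g):
    """Clause (ii): pointwise meet."""
    return tuple(meet(a, b) for a, b in zip(f, g))


def fsup(f, g):
    """Clause (iv): pointwise sup, or None if some pointwise sup is missing."""
    h = tuple(sup(a, b) for a, b in zip(f, g))
    return None if None in h else h


print(len(PM_TT))                       # 11
print(fsup((0, U, U), (U, 1, U)))       # (0, 1, 'u')
print(fsup((0, 0, 0), (1, 1, 1)))       # None
```

Of the 27 functions from a three-element set into itself, only 11 are monotone: nine send the undefined to the undefined, and the two remaining ones are the constant functions 0 and 1.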

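We can represent partial objects using the following graphic artifice. First of
all, since $PM_t$ forms a meet-semi-lattice it is natural to represent it by a
diagram in which 1 and 0 lie above $\varphi$, and a function $f \in PM_{tt}$ by the same
diagram with $f(1)$, $f(0)$ and $f(\varphi)$ in the places of 1, 0 and $\varphi$.
Generalizing, $PM_{tt}$ can be represented by Diagram 1.

[Diagram 1: the elements of $PM_{tt}$, each drawn according to the above convention.]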

Certain elements of the top of the hierarchy (which we will call partial
total functions) behave like the regular total functions (which we will henceforth call classical functions) at least when their arguments themselves behave
like classical functions. We can characterize them formally in the following
way.

DEFINITION 9. For any $\alpha \in T$, the set $PT_\alpha \subseteq PM_\alpha$ (the set of partial total
functions) is the smallest set such that
(i) for $x \in PM_t$, $x \in PT_t$ iff $x \neq \varphi$;
(ii) for $f \in PM_{\alpha\beta}$, $f \in PT_{\alpha\beta}$ iff for any $x \in PT_\alpha$, $f(x) \in PT_\beta$.

Curiously enough, the space of classical functions is not isomorphic with
that of partial total functions. Moreover we can prove that there is an
epimorphism from the set of partial total functions onto the set of classical
functions such that the inverse image of a classical function by this
epimorphism is a complete lattice of partial total functions [Lep92].
This property makes it possible to define recursively a relation of equivalence
between partial functions: two functions are equivalent when they take
equivalent values for total arguments. Formally,
DEFINITION 10. Let '$\approx$' be the following relation on $PM_\alpha$:
(i) for any $x \in PM_t$, $x \approx x$;
(ii) for any $f, g \in PM_{\alpha\beta}$, $f \approx g$ iff for any $x \in PT_\alpha$, $f(x) \approx g(x)$.

One easily verifies that '$\approx$' is a relation of equivalence. One of the
interesting properties of this notion is expressed by the following proposition.
PROPOSITION 11. For any $f, g \in PT_{\alpha\beta}$, $f \approx g$ iff $f \vee g$ exists.
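At type tt these notions can be checked exhaustively. The Python sketch below is an illustration under stated assumptions: the undefined is encoded by "u", a function by the triple of its values at 0, 1 and the undefined, and the map `classical`, which simply forgets the value at the undefined, is our assumed form of the epimorphism at this type (the general construction is in [Lep92]). All names are hypothetical.

```python
from itertools import product

U = "u"            # the undefined value (phi in the text)
PM_T = (0, 1, U)


def leq(x, y):
    return x == y or x == U


# A function of PM_tt is a triple (f(0), f(1), f(phi)); on this flat
# domain monotony reduces to f(phi) <= f(0) and f(phi) <= f(1).
PM_TT = [f for f in product(PM_T, repeat=3)
         if leq(f[2], f[0]) and leq(f[2], f[1])]

# Definition 9: total arguments (0 and 1) must receive total values.
PT_TT = [f for f in PM_TT if f[0] != U and f[1] != U]


def equivalent(f, g):
    """Definition 10 at type tt: same values on the total arguments."""
    return f[0] == g[0] and f[1] == g[1]


def sup_exists(f, g):
    """The pointwise sup exists iff the values are compatible everywhere."""
    return all(a == b or U in (a, b) for a, b in zip(f, g))


def classical(f):
    """Assumed form of the epimorphism: forget the value at phi."""
    return (f[0], f[1])


fibres = {}
for f in PT_TT:
    fibres.setdefault(classical(f), []).append(f)

print(len(PT_TT), len(fibres))   # 6 4
```

Six partial total functions lie over the four classical ones: the two constant classical functions each have two preimages (the strict constant and the everywhere-defined constant), while identity and negation have one each. On this small type the equivalence of Definition 10 coincides with the existence of the sup, as Proposition 11 asserts.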

Intuitively, partial functions of the same type can differ in two ways. The
first, classical, is that they can take incompatible values. They can also differ
in so far as they are compatible but more or less defined. This intuition can
be formalized in the following way.

DEFINITION 12. For any $\alpha \in T$, let $\neq^*_\alpha \subseteq PM_\alpha \times PM_\alpha$ be the following relation
($\neq^*_\alpha$ is the 'incompatibility' or the 'strong difference', and we read
'$x \neq^*_\alpha y$' as '$x$ is incompatible with $y$' or '$x$ differs strongly from $y$'):10
(i) for $x, y \in PM_t$, $x \neq^* y$ iff $x \neq \varphi$ and $y \neq \varphi$ and $x \neq y$;
(ii) for $f, g \in PM_{\alpha\beta}$, $f \neq^* g$ iff there is an $x \in PM_\alpha$ such that $f(x) \neq^* g(x)$.

Now we can define a partial interpretation for the terms of the theory of
propositional types.
DEFINITION 13. A partial value assignation $\mu$ is a function
$\mu: \bigcup_{\alpha \in T} Var_\alpha \to \bigcup_{\alpha \in T} PM_\alpha$
such that $\mu(x_\alpha) \in PM_\alpha$. As before, we write $\mu(a/x)$ for the assignation that
differs from $\mu$ at most by the fact that it assigns the value $a$ to $x$.
Finally, we define a partial valuation based on $\mu$.
DEFINITION 14. A partial valuation based on $\mu$ is a function
$V_\mu: \bigcup_{\alpha \in T} Trm_\alpha \to \bigcup_{\alpha \in T} PM_\alpha$
such that
(i) $V_\mu(x_\alpha) = \mu(x_\alpha)$;
(ii) $V_\mu([A_\alpha \equiv B_\alpha]) = 1$ iff $V_\mu(A) \in PT_\alpha$ and $V_\mu(A) \approx V_\mu(B)$,
$= 0$ iff $V_\mu(A) \neq^* V_\mu(B)$,
$= \varphi$ otherwise;
(iii) $V_\mu([AB]) = V_\mu(A)(V_\mu(B))$;
(iv) $V_\mu(\lambda x_\alpha A)$ is the function which associates $V_{\mu(a/x)}(A)$ with each $a \in PM_\alpha$.
Clauses (i), (iii) and (iv) are identical to the corresponding clauses of Definition
6. Clause (ii) deserves a few remarks. The adoption of clause 6 (ii) is not
possible because in that case $V_\mu([A_\alpha \equiv B_\alpha])$ would not be monotone with regard
to $V_\mu(A)$ and $V_\mu(B)$. (For example, take $A_t$ and $B_t$ such that $V_\mu(A) = \varphi$ and
$V_\mu(B) = \varphi$; according to 6 (ii) we would have $V_\mu([A \equiv B]) = 1$. But for $C$ and
$D$ such that $V_\mu(C) = 1$ and $V_\mu(D) = 0$ we have $V_\mu(A) \leq V_\mu(C)$ and $V_\mu(B) \leq V_\mu(D)$,
so that, by monotony, $V_\mu([C \equiv D]) = 1$, whereas 6 (ii) gives $V_\mu([C \equiv D]) = 0$.
Therefore 1 = 0, which is not desirable, to say the least.)
Moreover, the relation '$\approx$' between partial total functions, though weaker
than identity, is sufficient in virtue of Proposition 11: if two total functions
are different but equivalent, their supremum exists and belongs to the same
equivalence class. The fact that these functions take equivalent total values
for the same total arguments guarantees monotony.
We can show that the order on partial functions induces a similar order
on partial valuations. Thus we can speak of partial total valuations and
classical valuations. The existence of an epimorphism between partial total
functions and classical functions allows us to assimilate a classical valuation
to any element of the corresponding equivalence class of partial total
valuations, because all statements receive exactly the same value (total, of course)
according to the classical valuation and according to all the partial total
valuations of the corresponding equivalence class. For a detailed proof, see
[Læp92].
One easily verifies that (see Definition 4 and the definition of denotation),
$T^d = [\lambda x_t x \equiv \lambda x_t x]^d = 1$
$F^d = [\lambda x_t x \equiv \lambda x_t T]^d = 0$

Let us return to Diagram 1. Clearly one of its elements represents
propositional identity, and $\neg^d = \lambda x_t[x \equiv F]^d$ will be the function
represented by the following diagram.

[Diagram: the function $\neg^d$.]

What about the binary connectors? An element $f$ of $PM_{t(tt)}$ can be represented
according to the following convention.

[Diagram 2: arrangement of the nine values $f(1)(1)$, $f(1)(0)$, $f(0)(1)$, $f(0)(0)$, $f(1)(\varphi)$, $f(0)(\varphi)$, $f(\varphi)(1)$, $f(\varphi)(0)$ and $f(\varphi)(\varphi)$.]

If we execute the calculus by taking the denotation of $\wedge_{t(tt)}$ for $f$, namely
$\wedge_{t(tt)}^d = \lambda x_t \lambda y_t[\lambda f_{t(tt)}[[fx]y] \equiv \lambda f[[fT]T]]^d$
we get the function represented by Diagram 3.

[Diagram 3: the function $\wedge_{t(tt)}^d$.]

Although there are no surprises in logic, it is remarkable to find Kleene's
strong conjunction here.11 We clearly get Kleene's strong disjunction by the
usual translation $A \vee B =_{def} \neg(\neg A \wedge \neg B)$.
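The calculation that yields Kleene's strong conjunction can be carried out by brute force over the monotone functions of type t(tt). The following Python sketch is an illustration under stated assumptions and not the computation given in [Lep92]: the undefined is encoded by "u", types by nested pairs, functions by tuples of values, and clause 14 (ii) is applied directly to the two abstracts over f. All names are hypothetical.

```python
from functools import lru_cache
from itertools import product

U = "u"                      # the undefined value (phi in the text)
TT = ("t", "t")              # assumed encoding of the type tt
T_TT = ("t", TT)             # assumed encoding of the type t(tt)


def leq(a, b, typ):
    """Order of Definition 7 on PM_typ (functions are tuples of values)."""
    if typ == "t":
        return a == b or a == U
    return all(leq(x, y, typ[1]) for x, y in zip(a, b))


@lru_cache(maxsize=None)
def PM(typ):
    """Enumerate the monotone functions of the given propositional type."""
    if typ == "t":
        return (0, 1, U)
    dom, cod = PM(typ[0]), PM(typ[1])
    pairs = [(i, j) for i, a in enumerate(dom) for j, b in enumerate(dom)
             if leq(a, b, typ[0])]
    return tuple(f for f in product(cod, repeat=len(dom))
                 if all(leq(f[i], f[j], typ[1]) for i, j in pairs))


def total(a, typ):
    """Definition 9: membership in PT_typ."""
    if typ == "t":
        return a != U
    return all(total(v, typ[1])
               for x, v in zip(PM(typ[0]), a) if total(x, typ[0]))


def at(f, x, y):
    """[[f x] y] for f in PM_t(tt) and x, y in PM_t."""
    base = PM("t")
    return f[base.index(x)][base.index(y)]


def conj(x, y):
    """Clause 14 (ii) applied to the definiens of the conjunction."""
    fs = PM(T_TT)
    lhs = [at(f, x, y) for f in fs]
    rhs = [at(f, 1, 1) for f in fs]          # T denotes 1
    tot = [total(f, T_TT) for f in fs]
    if all(a != U and a == b for a, b, t in zip(lhs, rhs, tot) if t):
        return 1                              # total and equivalent
    if any(U not in (a, b) and a != b for a, b in zip(lhs, rhs)):
        return 0                              # strongly different
    return U


for x in (1, 0, U):
    print([conj(x, y) for y in (1, 0, U)])
# prints:
# [1, 0, 'u']
# [0, 0, 0]
# ['u', 0, 'u']
```

The result is false as soon as one argument is false, true only when both are true, and undefined otherwise. Monotony does the work: a function f can distinguish f(φ)(1) from f(1)(1) only if f(φ)(1) is undefined, whereas f(0)(φ) and f(1)(1) are unconstrained relative to one another.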

Let us finish these semantic remarks by examining the quantifiers. We verify
directly that
$\forall x A^d = [\lambda x_\alpha A \equiv \lambda x T]^d$
$= 1$ iff for any $a \in PT_\alpha$, $V_{\mu(a/x)}(A) = 1$
$= 0$ iff for some $a \in PM_\alpha$, $V_{\mu(a/x)}(A) = 0$
$= \varphi$ otherwise
and
$\exists x A^d = [\neg\forall x[\neg A]]^d = [\neg[\lambda x_\alpha[\neg A] \equiv \lambda x T]]^d$
$= 1$ iff for some $a \in PM_\alpha$, $V_{\mu(a/x)}(A) = 1$
$= 0$ iff for any $a \in PT_\alpha$, $V_{\mu(a/x)}(A) = 0$
$= \varphi$ otherwise.

TWO NOTIONS OF VALIDITY

In classical logic, in the sense meant here, the notions of validity and valid
inference are monolithic. $A$ is a valid consequence of a set of statements $\Gamma$,
symbolically $\Gamma \models A$, if and only if for any valuation $\mu$, if for any $B \in \Gamma$,
$V_\mu(B) = 1$, then $V_\mu(A) = 1$. In a similar way, $A$ is valid, symbolically $\models A$, iff
for any valuation $\mu$, $V_\mu(A) = 1$. An equivalent way of presenting things in
the classical context is to use the false rather than the true. Then we would
have $A$ as a valid consequence of $\Gamma$ if and only if, for any valuation $\mu$, if for
any $B \in \Gamma$, $V_\mu(B) = 1$, then $V_\mu(A) \neq 0$. In a similar way, $A$ is valid iff $V_\mu(A)
= 0$ for no valuation $\mu$.
These two notions, as we would expect, do not coincide in partial logic
on account of the presence of a third value, the undefined. We can define two
concepts of validity.12

DEFINITION 15. Given a set of statements $\Gamma$ and a statement $A_t$, we will
say that $A$ is a verifiably valid consequence of $\Gamma$, in symbols $\Gamma \models_1 A$, if and
only if for any partial valuation $\mu$, if for any $B \in \Gamma$, $V_\mu(B) = 1$, then $V_\mu(A)
= 1$. In the same way, $A$ is verifiably valid, in symbols $\models_1 A$, iff for any partial
valuation $\mu$, $V_\mu(A) = 1$.
DEFINITION 16. Given a set of statements $\Gamma$ and a statement $A_t$, we will
say that $A$ is a falsifiably valid consequence of $\Gamma$, in symbols $\Gamma \models_2 A$, if and
only if for any partial valuation $\mu$, if for any $B \in \Gamma$, $V_\mu(B) = 1$, then $V_\mu(A)
\neq 0$. In the same way, $A$ is falsifiably valid, in symbols $\models_2 A$, iff $V_\mu(A) = 0$
for no partial valuation $\mu$.

These two notions do not coincide; we can, however, easily show the following [Læp92].
PROPOSITION 17. $\Gamma \models A$ iff $\Gamma \models_2 A$.

What about the class of verifiably valid statements? This class is neither
empty nor identical to the class of classically valid statements. For example,
$[A_\alpha \equiv A_\alpha]$ is not verifiably valid but $[\lambda x_\alpha x \equiv \lambda x_\alpha x]$ is. The problem, one
will have guessed, is to give a system (complete if possible) for the class of
verifiably valid statements. The question is not yet resolved, and the major
difficulty rests in the absolute impossibility of constructing an expression $E_{tt}$
of the object language, the interpretation of which would be: $[EA]$ is true if
and only if $A$ is not defined, and false otherwise. It is easy to convince oneself
that such a functor would not be monotone.
For the time being we must content ourselves with fragmentary results
[Læp92]. First of all, if it is not possible to introduce in the object language
a functor $[EA]$ the interpretation of which would be '$A$ is undefined', it is
possible nonetheless to introduce a functor $S(A)$, the interpretation of which,
roughly, is '$A$ is total'.

This is why we say 'roughly': $S(A_\alpha)$ is true when $A$ is total and undefined
otherwise. $S(A_\alpha)$ is never false and in this way the monotony is preserved.
Such a functor is of great utility. In effect, it makes it possible to introduce
axioms of the form 'if A, B, ..., are total, then ...' and permits the
elaboration of a system of partial logic.
A DEDUCTIVE SYSTEM FOR PARTIAL PROPOSITIONAL LOGIC

The first stage consists in characterizing the subset of formulae of propositional calculus as verifiably valid tautologies. We proceed in the following
way.
DEFINITION 19. Let $FP \subseteq Trm_t$ be the set of propositional formulae, the
smallest set such that
(i) $Var_t \cup \{T, F\} \subseteq FP$;
(ii) if $A, B \in FP$, then $\neg A$, $[A \wedge B]$, $[A \supset B]$, $[A \vee B]$, $[A \equiv B] \in FP$.

Using an idea of Henkin's, we recursively define two subsets of FP:


DEFINITION 20. Let $FP^+$ and $FP^-$ be two subsets of $FP$ such that
(i) $T \in FP^+$ and $F \in FP^-$;
(ii) if $A \in FP^+$ and $B \in FP^-$, then $\neg B \in FP^+$ and $\neg A \in FP^-$;
(iii) if $A, B \in FP^+$, then $[A \wedge B] \in FP^+$, and if $A \in FP^-$ then for any
$B \in FP$, $[A \wedge B], [B \wedge A] \in FP^-$;
(iv) if $A \in FP^+$ and $B \in FP^-$, then $[A \supset B] \in FP^-$, and for any $C \in FP$,
$[C \supset A], [B \supset C] \in FP^+$;
(v) if $A \in FP^+$ and $B, C \in FP^-$, then $[B \vee C] \in FP^-$, and for any $D \in FP$,
$[A \vee D], [D \vee A] \in FP^+$;
(vi) if $A, B \in FP^+$ and $C, D \in FP^-$, then $[A \equiv B], [C \equiv D] \in FP^+$ and
$[A \equiv C], [C \equiv A] \in FP^-$;
(vii) nothing else belongs to $FP^+$ or $FP^-$.

We prove easily and directly that $A$ is a verifiably valid tautology iff
$A \in FP^+$. One can see that the existence of a decision procedure facilitates
things greatly. We can now present a formal system. The system includes
16 proper rules and 3 improper rules. One can arrange the proper rules in
three categories: those that are classically valid, those that have a classically
valid counterpart, and finally those that have no classically valid counterpart.
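The decision procedure and the two notions of validity can be illustrated on propositional formulae. The Python sketch below rests on assumptions of ours that should be read as such: the connectives are given Kleene's strong tables, the conditional is read as 'not-A or B', the biconditional follows clause 14 (ii) at type t, and membership in $FP^+$ is computed as 'value 1 when every variable is undefined', which is what the clauses of Definition 20 amount to under these readings. Formulae are encoded as nested tuples and all names are hypothetical.

```python
from itertools import product

U = None   # the undefined value (phi in the text)


def neg(a):
    return U if a is U else 1 - a


def conj(a, b):
    """Kleene's strong conjunction."""
    if a == 0 or b == 0:
        return 0
    return 1 if (a == 1 and b == 1) else U


def disj(a, b):
    return neg(conj(neg(a), neg(b)))


def imp(a, b):
    """Assumed reading of the conditional: not-A or B."""
    return disj(neg(a), b)


def equiv(a, b):
    """Clause 14 (ii) at type t: defined only when both sides are."""
    if a is U or b is U:
        return U
    return 1 if a == b else 0


OPS = {"and": conj, "or": disj, "imp": imp, "eq": equiv}


def value(A, mu):
    """Value of a propositional formula under a partial assignment mu."""
    if A == "T":
        return 1
    if A == "F":
        return 0
    if isinstance(A, str):
        return mu[A]
    if A[0] == "not":
        return neg(value(A[1], mu))
    return OPS[A[0]](value(A[1], mu), value(A[2], mu))


def variables(A):
    if isinstance(A, str):
        return set() if A in ("T", "F") else {A}
    return set().union(*(variables(B) for B in A[1:]))


def valuations(A):
    xs = sorted(variables(A))
    for vals in product((0, 1, U), repeat=len(xs)):
        yield dict(zip(xs, vals))


def verifiably_valid(A):
    return all(value(A, mu) == 1 for mu in valuations(A))


def falsifiably_valid(A):
    return all(value(A, mu) != 0 for mu in valuations(A))


def in_fp_plus(A):
    """Value 1 when every variable is undefined (assumed reading of FP+)."""
    return value(A, {x: U for x in variables(A)}) == 1


excluded_middle = ("or", "x", ("not", "x"))
print(verifiably_valid(excluded_middle), falsifiably_valid(excluded_middle))
# prints: False True
```

Because all the connectives are monotone, a formula that is true under the everywhere-undefined valuation stays true under every partial valuation, which is why a single evaluation decides verifiable validity here. The excluded middle shows the gap between the two notions: it is never false, but it is undefined when x is.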
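Classically valid rules

R1. $F \Rightarrow A$.
R2. $[\lambda x_1 [\lambda x_n A_t B_1] \ldots B_n] \Rightarrow A_t\{B_1 \ldots B_n/x_1 \ldots x_n\}$, provided
that every $B_j$ is free for $x_j$ in $A$.13
R3. $A_t[C_t/x_t], [C \equiv B_t] \Rightarrow A[B/x]$, provided that $B$ is free for $x$ in $A$.
R4. $[[A_{tt}T] \wedge [AF]] \Rightarrow \forall x_t[Ax]$.
R5. $\Rightarrow \forall z_{\alpha\beta} \forall x_\alpha \forall y_\alpha [[x \equiv y] \supset [zx \equiv zy]]$.
R6. $\Rightarrow \forall z_{\alpha\beta} \forall v_{\alpha\beta} [\forall x_\alpha[zx \equiv vx] \equiv [z \equiv v]]$.

Rules having a classical counterpart

R7. $S(B_\alpha), S(A_\beta\{B/x_\alpha\}) \Rightarrow [[\lambda x_\alpha A\,B] \equiv A_\beta\{B/x_\alpha\}]$, provided that $B$ is
free for $x$ in $A$.
R8. $A_t[C/x_\alpha], [C_\alpha \equiv B_\alpha], S(A[B/x_\alpha]) \Rightarrow A[B/x_\alpha]$, provided that $B$ is free
for $x$ in $A$.
R9. $S(A_t) \Rightarrow [[A \equiv T] \equiv A]$.

Rules having no classical counterpart

R10. $A \Rightarrow S(A)$.
R11. $[A_\alpha \equiv B_\alpha] \Rightarrow S(A)$.
R12. $[A_\alpha \equiv B_\alpha] \Rightarrow S(B)$.
R13. $S(A_\alpha), S(B_\alpha) \Rightarrow S([A \equiv B])$.
R14. $S(A_{\alpha\beta}), S(B_\alpha) \Rightarrow S([AB])$.
R15. $\forall x_1 \ldots \forall x_n S(A_\beta) \Rightarrow S(\lambda x_1 \ldots \lambda x_n A_\beta)$, where $x_1, \ldots, x_n$ are distinct
variables of any type.
R16. $\Rightarrow \forall x_\alpha S(x)$.

One easily verifies that all these rules are verifiably valid.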

DEFINITION 21. There is a proof of $A$ from a set $H$ of hypotheses (in symbols
$H \vdash A$) if and only if there is a series of $n$ formulas $A_1, \ldots, A_n$ such that $A
= A_n$ and for all $i \leq n$:
(i) $A_i \in H$, or
(ii) $A_i$ is such that $\Rightarrow A_i$ is a rule, or
(iii) $A_k, A_l \Rightarrow A_i$ (with $k, l < i$) is a rule, or
(iv) $A_i$ results from the application of one of the following improper rules:
R1* $A_i = \forall x_1 \ldots \forall x_n B$, and there is a proof of $B$ from $S(x_1), \ldots, S(x_n)$;
R2* $A_i$ is $[B \equiv C]$ and there is a proof of $C$ from $B$ and a proof of $B$
from $C$, and for a certain $k < i$, $A_k = S(B)$ or $A_k = S(C)$;
R3* $A_i = B_\beta\{C_\alpha/x_\alpha\}$ and there is a proof of $B$ from the null set and $C$
is free for $x$ in $B$.

If $H$ is empty, we write $\vdash A$.

PROPOSITION 22. If $A \in FP^+$, then $\vdash A$.

One easily obtains the following generalization:

PROPOSITION 23. If $A \in FP$ is a classical tautological form whose free
variables are $x_1, \ldots, x_n$, then $\vdash \forall x_1 \ldots \forall x_n A$.
It does not seem possible, following Henkin, to generalize this result to the
totality of valid formulae of the partial theory of propositional types. The
unavoidable problem linked to this approach is the impossibility of having a
canonical name in the object language for every partial function without
modifying the theoretical framework in an essential way. Suppose, for example,
that $\varphi$ had a name, say $\varphi^n$. One immediate consequence would be that, even
for partial total valuations, some expressions would remain undefined. Even
if such systems of logic can be of some interest, they are, from the point of
view adopted here, unacceptable.
So far, we have tried to characterize classes of valid statements by starting
from the very general context of the theory of propositional types. Finally,
let us say a few words about another way of proceeding, more classical,
which consists of defining more elementary partial logics and attempting to
enrich them. It is possible in this way to define a first-order functional calculus.
The syntax of this calculus is completely classical, but the interpretation admits
partial predicates and partial functions of arbitrary degrees of definition.
Moreover, we can provide a complete system for this calculus. The introduction
of functors of superior order, however, also raises problems of the same order
as those encountered in the theory of propositional types.

Université de Montréal

NOTES

1 For a review of these different positions, see [Urq86].
2 For obvious reasons we will use a formalism as uniform as possible. This sometimes imposes
relatively important reformulations, which we will point out in notes.
3 We present a variation inspired by Andrews [And63].
4 We will omit the type indices where that causes no confusion.
5 We have, of course, adapted Henkin's system to our own notation.
6 Another possible reason for this article's lack of success is the fact that it was published in
a Polish journal with a small distribution - the same journal, in fact, in which Tarski's 'Sur le
terme primitif de la logistique', to which Henkin refers, had been published.
7 This text of Quine's, in which the reduction of quantifiers to the $\lambda$ operator is presented for
the first time, is not particularly well-known itself. I have not seen it cited anywhere else than
in Henkin's text. I had already begun seriously to wonder whether logicians ever read each other's
work, when I discovered that this text was the departing president's address of the Association
for Symbolic Logic meeting in conjoint session with the Eastern Division of the American
Philosophical Association in Boston, on the occasion of their annual congress, December 29,
1955. Not only do they not read each other, they do not even listen to each other!
8 In spite of this deficiency, Tichy's text remains the first real attempt to systematize partial
logic. Oddly, Blamey seems to be unaware of its existence. Tichy's argument is taken up again by
Muskens in [Mus88].
9 We confine our presentation to the hierarchy of partial functions for propositional types.
For a general study of the hierarchy of partial functions in simple type theory, see [Lep92].
10 No confusion being possible, we will omit the type indices and simply write '$\neq^*$'.
11 The reader should also note that if we use the Tarski-Montague formula $x \wedge y =_{def} \forall z[x \equiv
[[zy] \equiv [zx]]]$ as our definition of conjunction, we do not get the same function as with Andrews'
definition. The function thus obtained is asymmetric: when x is false and y is undefined, the
conjunction is false, but when x is undefined and y is false, the conjunction is undefined. There
are, of course, only four possible total values for conjunction: Kleene's strong conjunction,
weak conjunction (undefined as soon as one of the arguments is undefined) and two
asymmetric conjunctions. The Tarski-Montague conjunction is strong to the left and weak to the
right; we obtain the other by replacing x with y in the right member of the formula.
12 Once again, it is difficult to pinpoint the date of birth of these notions. As Elias G. C.
Thijsse [Thi89] rightly remarked (and as we have already emphasized), the history of partial logic
is often confused with that of trivalent logic even though the motives leading to the latter's
elaboration are very different from those advanced here. The definitions presented here are
those of Thijsse in the version of [Lap92]. See also [Ben86].
13 We use the notation A{B/x} to signify that all the free occurrences of x have been replaced
by occurrences of B, and A[B/x] to signify that some free occurrences of x (perhaps none) have
been replaced by occurrences of B.

REFERENCES
[And63] Andrews, P. B., 1963, 'A Reduction of the Axioms for the Theory of Propositional
Types', Fundamenta Mathematicae LII, 345-350.
[Ben86] van Benthem, J., 1986, 'Partiality and Nonmonotonicity in Classical Logic', Logique
et Analyse 29, 225-247.
[Bla86] Blamey, S., 1986, 'Partial Logic', in Gabbay, D. and Guenthner, F. (eds.), Handbook
of Philosophical Logic Vol. III, Reidel, Dordrecht, pp. 1-70.
[Chu40] Church, A., 1940, 'A Formulation of the Simple Theory of Types', The Journal of
Symbolic Logic 5, 56-68.
[Far90] Farmer, W. M., 1990, 'A Partial Functions Version of Church's Simple Theory of
Types', The Journal of Symbolic Logic 55(3), 1269-1271.
[Gal75] Gallin, D., 1975, Intensional and Higher-Order Modal Logic, North-Holland,
Amsterdam.
[Grz55] Grzegorczyk, A., 1955, 'The Systems of Leśniewski in Relation to Contemporary
Logical Research', Studia Logica 3, 77-95.
[Hen50] Henkin, L., 1950, 'Completeness in the Theory of Types', The Journal of Symbolic
Logic 15, 81-91.
[Hen63] Henkin, L., 1963, 'A Theory of Propositional Types', Fundamenta Mathematicae 52,
323-344.
[Lap92] Lapierre, S., 1992, 'A Partial Semantics for Intensional Logic', Notre Dame Journal
of Formal Logic 33(4), 417-541.
[Læp92] Lapierre, S. and Lepage, F., 1992, 'Toward a Calculus of Partial Propositional Types',
Cahier du département de philosophie no 92-11, Université de Montréal.
[Læp93] Lapierre, S. and Lepage, F., 1993, 'La complétude du calcul propositionnel des
prédicats du premier ordre avec identité pour les interprétations partielles', Cahier du
département de philosophie no 93-04, Université de Montréal.
[Lep92] Lepage, F., 1992, 'Partial Functions in Type Theory', Notre Dame Journal of Formal
Logic 33(4), 493-516.
[Lep84] Lepage, F., 1984, 'The Object of Belief', Logique et Analyse 36, 106, 193-210.
[Mon74] Montague, R., 1974, 'Universal Grammar', in Formal Philosophy, Yale University
Press, New Haven, pp. 222-246.
[Mus88] Muskens, R., 1988, 'Going Partial in Montague Grammar', ITLI Prepublication Series,
LP-88-04, University of Amsterdam.
[Qui56] Quine, W. V., 1956, 'Unification of Universes in Set Theory', The Journal of Symbolic
Logic 21, 267-279.
[Sco73] Scott, D., 1973, 'Models for Various Type-free Calculi', in P. Suppes et al. (eds.),
Logic, Methodology and Philosophy of Science IV, North-Holland, Amsterdam,
pp. 157-187.
[Slu53] Słupecki, J., 1953, 'St. Leśniewski's Protothetics', Studia Logica 1, 44-111.
[Tar23] Tarski, A., 1923, 'Sur le terme primitif de la logistique', Fundamenta Mathematicae
IV, 59-74.
[Thi87] Thijsse, G. C. E., 1987, 'Partial Logic and Modal Logic: A Systematic Survey',
manuscript.
[Tic82] Tichy, P., 1982, 'Foundations of Partial Type Theory', Reports on Mathematical
Logic 14, 59-72.
[Urq86] Urquhart, A., 1986, 'Many-Valued Logic', in Gabbay, D. and Guenthner, F. (eds.),
Handbook of Philosophical Logic Vol. III, Reidel, Dordrecht, pp. 71-116.

SERGE LAPIERRE

GENERALIZED QUANTIFIERS AND INFERENCES

INTRODUCTION

The semantic treatment of noun phrases and determiners in Montague (1973)
presupposes the notion of "generalized quantifiers" proposed in Mostowski
(1957). The importance of this notion for natural language has been brought
out explicitly in Barwise and Cooper (1981) and Keenan and Stavi (1986). The
basic idea is to let a noun phrase DA (all men, some women, most students,
etc.) denote a set of sets of individuals, that is to say, the set of the
denotations of the verb phrases B for which (DA)B holds. For instance, given a
fixed model with a non-empty universe E,

    all A     denotes    {X ⊆ E: [[A]] ⊆ X},
    some A    denotes    {X ⊆ E: [[A]] ∩ X ≠ ∅},
    most A    denotes    {X ⊆ E: |[[A]] ∩ X| > |[[A]] - X|},

where [[A]] is the extension of the predicate A in the model. As Lindström
(1966) pointed out, this analysis suggests that determiners denote binary
relations between sets. For instance, given a non-empty universe E,

    all     denotes    {(X, Y) ∈ ℘(E) × ℘(E): X ⊆ Y};
    some    denotes    {(X, Y) ∈ ℘(E) × ℘(E): X ∩ Y ≠ ∅};
    most    denotes    {(X, Y) ∈ ℘(E) × ℘(E): |X ∩ Y| > |X - Y|}.

Such binary relations are called (local) binary generalized quantifiers. More
generally, a (global) n-ary generalized quantifier (n ≥ 1) is a function Q
which assigns to every non-empty universe E an n-ary relation Q_E between
subsets of E. From now on we will stick with global binary generalized
quantifiers and we shall simply call them quantifiers. In order to define or
exhibit a property of a quantifier Q operating on a universe E, we shall write
"Q_E XY" instead of "X, Y ⊆ E and (X, Y) ∈ Q_E". Moreover, for denoting a
familiar quantifier having a simple determiner, we shall write the determiner
itself in italic. For instance:

    all_E XY iff X ⊆ Y;
    some_E XY iff X ∩ Y ≠ ∅;
    no_E XY iff X ∩ Y = ∅;
    all and some_E XY iff X ⊆ Y and X ∩ Y ≠ ∅;
    at least half_E XY iff |X ∩ Y| ≥ |X - Y|.

This abuse of notation is to be preferred to an abundance of brackets or quotes.
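The definitions above can be tried out on small finite universes. The following is a minimal sketch, assuming sets are modelled as Python frozensets; the helper names (powerset, local_quantifier, q_all and so on) are illustrative and not part of the original text.

```python
from itertools import combinations

def powerset(universe):
    """Return all subsets of `universe` as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

# Each quantifier is modelled as a predicate on a pair of sets (X, Y).
def q_all(X, Y):
    return X.issubset(Y)

def q_some(X, Y):
    return bool(X & Y)

def q_no(X, Y):
    return not (X & Y)

def q_at_least_half(X, Y):
    return len(X & Y) >= len(X - Y)

def q_most(X, Y):
    return len(X & Y) > len(X - Y)

def local_quantifier(q, universe):
    """The relation Q_E: all pairs (X, Y) of subsets of E such that Q_E XY."""
    subsets = powerset(universe)
    return {(X, Y) for X in subsets for Y in subsets if q(X, Y)}

students = frozenset({1, 2})
smokers = frozenset({1})
print(q_all(students, smokers))            # False
print(q_at_least_half(students, smokers))  # True
print(q_most(students, smokers))           # False
```

Here local_quantifier(q_all, {1, 2}) is the inclusion relation on the four subsets of a two-element universe.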


M. Marion and R. S. Cohen (eds.), Quebec Studies in the Philosophy of Science I, 41-55.
© 1995 Kluwer Academic Publishers.

Note that the specification of the parameter E is relevant only for
context-dependent quantifiers. For instance, according to a plausible intuition
about the meaning of the determiner "many", we may set many_E XY if and only if
the proportion of Y-individuals in X is larger than the proportion of
Y-individuals in the whole universe E; formally:

    many_E XY iff |X ∩ Y|/|X| > |Y|/|E|.

Since the proportion |Y|/|E| decreases when the size of E increases, many in
this sense is not a context-free quantifier. However, most of the natural
language quantifiers are context-free and even logical in the sense that only
the cardinalities |X - Y| and |X ∩ Y| are relevant for deciding whether Q_E XY
or not. This is clearly the case for all, some, no, all and some, at least half,
most and many others. From now on, we shall restrict our attention to logical
quantifiers.
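The context-dependence of many can be seen on a two-line example. This is a sketch under the assumption that X is non-empty and that universes are finite; the function names and the sample sets are illustrative.

```python
from fractions import Fraction

def many(E, X, Y):
    """many_E XY iff |X ∩ Y|/|X| > |Y|/|E| (X is assumed to be non-empty)."""
    return Fraction(len(X & Y), len(X)) > Fraction(len(Y), len(E))

def at_least_half(E, X, Y):
    """A logical quantifier: only |X ∩ Y| and |X - Y| matter, E is ignored."""
    return len(X & Y) >= len(X - Y)

X, Y = {1, 2}, {1}
small_universe = {1, 2}
large_universe = {1, 2, 3}

# Same X and Y, different universes: the verdict of many changes.
print(many(small_universe, X, Y), many(large_universe, X, Y))  # False True
# The verdict of at least half does not.
print(at_least_half(small_universe, X, Y),
      at_least_half(large_universe, X, Y))                     # True True
```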
As relations, quantifiers have relational properties. For instance, all is
reflexive and transitive, some is symmetric, not all is connected. These
properties can be considered as inferential properties. For instance, symmetry
allows us to infer someYX from someXY. If we consider other conditions,
more inferential properties emerge. For instance, from allXY we may infer
all(X ∩ Z)Y, where Z is any set, because all is downward-left monotonic
(i.e., if QXY and X' ⊆ X, then QX'Y).
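Properties of this kind can be checked by brute force on a small universe. The sketch below assumes a three-element universe; a positive answer only establishes the property on that universe, whereas a negative answer is a genuine refutation. All names are illustrative.

```python
from itertools import combinations, product

def powerset(universe):
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def q_all(X, Y):
    return X.issubset(Y)

def q_some(X, Y):
    return bool(X & Y)

SUBSETS = powerset({1, 2, 3})

def is_symmetric(q):
    # QXY implies QYX
    return all(q(Y, X) for X, Y in product(SUBSETS, repeat=2) if q(X, Y))

def is_downward_left_monotonic(q):
    # if QXY and X' is a subset of X, then QX'Y
    return all(q(X2, Y) for X, Y, X2 in product(SUBSETS, repeat=3)
               if q(X, Y) and X2.issubset(X))

print(is_symmetric(q_some))                # True
print(is_downward_left_monotonic(q_all))   # True
print(is_downward_left_monotonic(q_some))  # False: shrink X to the empty set
```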
The inferential properties of quantifiers can be studied from two opposite,
but complementary perspectives. First, there is the "inverse logic" perspective,
which consists of considering some specific inferential properties and finding
which quantifiers have those properties. The opposite view - the "direct logic"
perspective - consists of considering some specific quantifiers and studying
their inferential behaviours. This paper belongs essentially to the inverse logic,
since our purpose is to summarize some results in the analysis of the inferential properties of some quantifiers.
By its nature, our study depends on two things. First, it depends on a prior
choice of formalism, expressing more or less the relevant inferential properties. For instance, we may decide, as in Section 1, to study only simple
relational properties; in that case a formalism consisting only of atoms of
the form QXY is sufficient. But in order to express more specific properties
of quantifiers, such as the various forms of monotonicity, we need a formalism
allowing in addition Boolean set terms in the argument of the quantifier
relation. This more expressive formalism will be used in Sections 2 and 3 when
we study some "conditional quantifiers", that is to say quantifiers which behave
as conditional relations.
Besides a prior choice of formalism, another important decision concerns
the cardinality of the universes. Must we admit finite universes only, or infinite
universes as well? Some theoreticians would prefer to stick with finite universes. There is an empirical reason for this decision: natural language seems
to require finite models only, infinite ones arising only through philosophical or scientific considerations. But there is also a methodological reason
for the finiteness restriction: it simplifies results and proofs and one notes
that most of the results obtained under the finiteness assumption also hold
on infinite universes of an arbitrary cardinality. For this reason, we will
assume the finiteness restriction in Section 1, because all of the results
obtained there hold on finite universes as well as on infinite universes. The
situation will be different when we consider conditional quantifiers, since
some of them do not have a stable behaviour when switching from finite
universes to infinite ones, or from denumerable universes to infinite ones of
higher cardinalities. Thus infinity appears relevant here and for this reason
we will be obliged to drop the finiteness restriction.
1. PURE SYLLOGISTIC THEORIES OF SOME QUANTIFIERS

There are many different formalisms, having more or less expressive power,
which can be used for expressing the inferential properties of a given
quantifier. In this section we shall stick with the minimal formalism L_syll,
which consists only of elementary formulae of the form QXY, where Q is a
quantifier symbol and X, Y are set variables. This formalism is rich enough for
expressing "pure syllogistic" patterns of inference such as:

    => QXX              reflexivity
    => QXY              universality
    QXY => QYX          symmetry
    QXY => QXX          quasi-reflexivity
    QXY => QYY          weak-reflexivity
    QXX => QXY          quasi-universality
    QXY, QYZ => QXZ     transitivity

The pure syllogistic theory of a given quantifier Q is the inferential theory
T_syll(Q) of Q in the formalism L_syll. So given a quantifier Q, a natural issue
is to axiomatize T_syll(Q) completely. For instance, it has been proven by van
Benthem (1986) that symmetry and quasi-reflexivity give the complete
axiomatization of the pure syllogistic theory of some and, more generally, of
at least n (for any n = 1, 2, 3, ...). By using the same proof strategy, similar
results may be obtained for other quantifiers.
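As a sanity check on the patterns listed above, one can enumerate which of them hold for all, some and no on a small universe. The sketch below assumes a three-element universe and encodes each pattern as a list of premises plus a conclusion; the encoding and the names are illustrative, and validity on one small universe is of course weaker than the completeness results stated in this section.

```python
from itertools import combinations, product

def powerset(universe):
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

QUANTIFIERS = {
    "all": lambda X, Y: X.issubset(Y),
    "some": lambda X, Y: bool(X & Y),
    "no": lambda X, Y: not (X & Y),
}

# A pattern is (premises, conclusion); each atom QAB is the pair (A, B).
PATTERNS = {
    "reflexivity":        ([], ("X", "X")),
    "universality":       ([], ("X", "Y")),
    "symmetry":           ([("X", "Y")], ("Y", "X")),
    "quasi-reflexivity":  ([("X", "Y")], ("X", "X")),
    "weak-reflexivity":   ([("X", "Y")], ("Y", "Y")),
    "quasi-universality": ([("X", "X")], ("X", "Y")),
    "transitivity":       ([("X", "Y"), ("Y", "Z")], ("X", "Z")),
}

def holds_on(q, pattern, subsets):
    """True if no assignment of subsets to X, Y, Z refutes the pattern."""
    premises, conclusion = pattern
    for X, Y, Z in product(subsets, repeat=3):
        env = {"X": X, "Y": Y, "Z": Z}
        if all(q(env[a], env[b]) for a, b in premises):
            if not q(env[conclusion[0]], env[conclusion[1]]):
                return False
    return True

def valid_patterns(name, universe=(1, 2, 3)):
    subsets = powerset(universe)
    q = QUANTIFIERS[name]
    return {p for p, pat in PATTERNS.items() if holds_on(q, pat, subsets)}

for name in QUANTIFIERS:
    print(name, sorted(valid_patterns(name)))
```

On this universe, some satisfies symmetry, quasi-reflexivity and weak-reflexivity, and no satisfies symmetry and quasi-universality, in line with the axiomatizations discussed here.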
THEOREM 1. Reflexivity and transitivity axiomatize the pure syllogistic theory
of all completely.
Proof. Let φ be any universal first-order sentence about Q which does not
follow from the above two principles (Q in φ is interpreted as a binary
predicate). According to Gödel's Completeness Theorem, φ is falsified in some
reflexive and transitive order. This can be transformed into a counterexample
for φ where the relation is inclusion between sets, so we are done.
First, reduce each (possible) cluster (i.e., subset X over which Q is universal
but not universal over any proper superset of X) to any one element of
it. This leaves all universal first-order sentences about Q invariant. Let us
represent this resulting first-order model by a set of sets with the inclusion
relation by the function F(x) = {y: Qyx}. Now, F is one-one, since any two
different elements are represented by two distinct sets (this follows from
reflexivity (x ∈ F(x)) and the fact that there is no cluster in the resulting
model). Moreover, Qxy if and only if F(x) ⊆ F(y). For, if Qxy, then for any z
such that Qzx, Qzy (transitivity), which means that F(x) ⊆ F(y); conversely, if
F(x) ⊆ F(y), then x ∈ F(y) (by reflexivity), which means that Qxy.
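The representation F(x) = {y: Qyx} used in this proof can be illustrated on a concrete cluster-free reflexive and transitive order. The sketch below uses divisibility on {1, 2, 3, 6} as Q; this particular order is an assumption chosen for illustration and does not come from the original text.

```python
ELEMENTS = [1, 2, 3, 6]

def Q(x, y):
    """x divides y: reflexive, transitive, and without clusters on ELEMENTS."""
    return y % x == 0

# F(x) = {y : Qyx}, the set of elements below x.
F = {x: frozenset(y for y in ELEMENTS if Q(y, x)) for x in ELEMENTS}

# F is one-one: distinct elements receive distinct sets.
one_one = len(set(F.values())) == len(ELEMENTS)

# Qxy holds exactly when F(x) is included in F(y).
faithful = all(Q(x, y) == F[x].issubset(F[y])
               for x in ELEMENTS for y in ELEMENTS)

print(sorted(F[6]), one_one, faithful)  # [1, 2, 3, 6] True True
```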

THEOREM 2. Transitivity, quasi-reflexivity and weak-reflexivity axiomatize
the pure syllogistic theory of all and some completely.
Proof. We proceed exactly as before, but in addition we contract all
(possible) isolated elements into a single one. Again no universal first-order
sentence about Q is affected. Using the same F, it is easy to verify that it is
one-one and that Qxy if and only if F(x) ⊆ F(y) and F(x) ∩ F(y) ≠ ∅.

Since quasi-reflexivity and weak-reflexivity are both trivial consequences of
reflexivity, it follows from the two previous theorems that
T_syll(all and some) ⊂ T_syll(all).
THEOREM 3. Symmetry and quasi-universality axiomatize the pure syllogistic
theory of no completely.
Proof. We proceed as before, but now we represent the model by a set of
sets with the empty intersection relation by the function F(x) = {{x, y}: not
Qxy}. One easily verifies that F is one-one. Moreover, Qxy if and only if
F(x) ∩ F(y) = ∅. For, if Qxy and x = y, then F(x) = F(y) = ∅
(quasi-universality and symmetry) and so F(x) ∩ F(y) = ∅. If Qxy and x ≠ y, then
any set in F(x) is distinct from any one in F(y), and thus F(x) ∩ F(y) = ∅.
Conversely, if F(x) ∩ F(y) = ∅, then there is no set {x, z} = {y, z'} ∈
F(x) ∩ F(y), and this implies Qxy (for, if not Qxy, then not Qyx (symmetry),
which means that {x, y} = {y, x} ∈ F(x) ∩ F(y)).


Incidentally, the latter axiomatization is good for a larger class of quantifiers.
THEOREM 4. At most n (for any n = 0, 1, 2, 3, ...) has the same pure
syllogistic theory as no.
Proof. At most 0 is no, of course. Now let QX_1Y_1, ..., QX_kY_k / QXY be any
inference refuted in a model where Q is no, and let Q_n (for any n = 1, 2,
3, ...) be at most n. Obviously, all Q_nX_iY_i hold in this model (1 ≤ i ≤ k),
whereas Q_nXY can be refuted, without affecting the previous relations
Q_nX_iY_i, by adding in X ∩ Y enough copies of any (X ∩ Y)-element in order
that |X ∩ Y| > n.
Conversely, let Q_nX_1Y_1, ..., Q_nX_kY_k / Q_nXY be any inference refuted in a
model where Q_n is at most n (for any n = 1, 2, 3, ...) and let Q be no. Add
first a new individual in X ∩ Y without adding it in any of the X_i ∩ Y_i and
then remove all old individuals in each of the X_i ∩ Y_i (1 ≤ i ≤ k). In this
resulting model we have that QX_iY_i (1 ≤ i ≤ k) but not QXY.


Now let us turn our attention to a more specific type of quantifiers.

2. CONDITIONAL QUANTIFIERS

It has been pointed out in van Benthem (1984) that there is a striking analogy
between the analysis of binary quantifiers and conditional sentences of the
form if A, (then) B, considered as expressing relations between antecedent
sets and consequent sets of situations. More precisely, given a "relevant"
non-empty universe E of situations, the functor "if" may be analysed as a
determiner denoting a binary quantifier on E, say if_E, so that given any two
terms A, B having the extensions [[A]], [[B]] ⊆ E, if A, (then) B is true if
and only if if_E [[A]][[B]].
From the inverse logic perspective, several properties, expressing more or
less a priori intuitions of conditionality, have already been suggested. The
central ones may be formulated by the following patterns of inference:
    CONS    ifXY => ifX(Y ∩ X)              conservativity
    C1      ifXY => ifX(Y ∪ Z)              }
    C2      ifX(Y ∩ Z) => if(X ∩ Y)Z        } confirmation
    C3      ifXY => if(X ∪ Z)(Y ∪ Z)        }
    R       => ifXX                         reflexivity
    BE      Replacement of Boolean Equivalents

These principles characterize the minimal conditional logic M. We note that
most current accounts of conditionals obey them.
Let us call every quantifier satisfying the minimal conditional logic M a
conditional quantifier. Though uncountably many quantifiers are conditional
in this sense, few of them have been studied in detail. Among the logical
ones, we have all, all or some and at least half. Another interesting one, which
requires universes which are at least denumerable, is:
    all but finitely manyXY iff  X ⊆ Y,           if X is finite;
                                 X - Y is finite, if X is infinite.
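For the finite-universe quantifiers, membership in the class of conditional quantifiers can be tested mechanically. The following sketch checks CONS, C1, C2, C3 and R on a three-element universe (BE holds automatically, since Boolean terms are evaluated as sets). The encoding of each principle as a pair (premise, conclusion) is an illustrative assumption, and a check on one small universe is evidence rather than a proof.

```python
from itertools import combinations, product

def powerset(universe):
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def q_all(X, Y):
    return X.issubset(Y)

def q_all_or_some(X, Y):
    return X.issubset(Y) or bool(X & Y)

def q_at_least_half(X, Y):
    return len(X & Y) >= len(X - Y)

def q_some(X, Y):
    return bool(X & Y)

# Each rule returns (truth of the premise, truth of the conclusion).
M_PRINCIPLES = {
    "CONS": lambda q, X, Y, Z: (q(X, Y), q(X, Y & X)),
    "C1":   lambda q, X, Y, Z: (q(X, Y), q(X, Y | Z)),
    "C2":   lambda q, X, Y, Z: (q(X, Y & Z), q(X & Y, Z)),
    "C3":   lambda q, X, Y, Z: (q(X, Y), q(X | Z, Y | Z)),
    "R":    lambda q, X, Y, Z: (True, q(X, X)),
}

def failed_principles(q, universe=(1, 2, 3)):
    """Names of the M-principles refuted by q on the given universe."""
    subsets = powerset(universe)
    failed = set()
    for name, rule in M_PRINCIPLES.items():
        for X, Y, Z in product(subsets, repeat=3):
            premise, conclusion = rule(q, X, Y, Z)
            if premise and not conclusion:
                failed.add(name)
                break
    return failed

print(failed_principles(q_all))            # set()
print(failed_principles(q_all_or_some))    # set()
print(failed_principles(q_at_least_half))  # set()
print(failed_principles(q_some))           # {'R'}
```

The last line shows why some is not conditional in this sense: reflexivity fails for the empty set.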

Each of these quantifiers determines, in an appropriate formalism, a conditional logic. Obviously, in order to capture most of the fundamental patterns
of conditional inference, especially the above basic M-principles, the minimal
formalism we need consists of conditional formulae of the form ifXY,
where X, Y are set variables or combinations of set variables with
parentheses and the operations "∩", "∪" and "-", to which we give their usual
meanings.
Given the formalism described above, it remains to determine the range
of the admissible sizes of the universes. Many options are available here.
For instance, we may decide to consider either finite universes only, or only
infinite universes of a fixed cardinality. On the other hand, when considering
infinite universes, we may decide to be careful not to become entangled in
higher infinite cardinalities, and so to restrict the range of admissible
universes to denumerable ones. Combining these two options, here is a list of
the logics we may consider:

    Quantifiers              Logics

    Finite universes
    all                      C       Classical conditional logic
    all or some              E       Exemplary conditional logic
    at least half            QD      Quasi democratic conditional logic

    Denumerable universes
    all but finitely many    N
    all                      C_ω
    all or some              E_ω
    at least half            QD_ω

Section 3 is about the first three logics. Section 4 is for the most part about
the others in relation to the former. However, other possibilities, such as
the inferential behaviours of the quantifiers on at most denumerable
universes, or on infinite universes of higher cardinalities, will be considered
when they seem relevant.
3. CONDITIONAL QUANTIFIERS ON FINITE UNIVERSES

The logics C, E and QD and their mutual relationships have been studied in
detail in Lapierre (1991). Figure 1 summarizes these relationships. We will
give a quick proof of each indicated relation and specifications about each
logic involved in the relation.

    [Diagram: M ⊂ E ∩ QD; E ⊄ QD and QD ⊄ E; E ∪ QD ⊂ C]

Fig. 1.

The logic C, the one of all, is the most inclusive logic in this figure.
Obviously, this logic corresponds, in our restricted formalism, to the logic
of (S5)-strict implication. So it contains, besides all M-principles, the
following additional ones:

    CNT    ifXY => if-Y-X                  contraposition
    LM     ifXY => if(X ∩ Z)Y              left-monotonicity
    TRN    ifXY, ifYZ => ifXZ              transitivity
    CNJ    ifXY, ifXZ => ifX(Y ∩ Z)        conjunction
    DSJ    ifXZ, ifYZ => if(X ∪ Y)Z        disjunction

However, it appears that all basic principles of M plus TRN are sufficient
for characterizing this logic.
THEOREM 5. All basic principles of M together with TRN axiomatize C
completely.
Proof. To begin with, one notes that the following principles are
straightforward consequences of M+TRN (0 =def X ∩ -X, X being any term):

    P1.    if00
    P2.    ifX0, ifY0 => if(X ∪ Y)0
    P3.    ifX0 => if(X ∩ Z)0
    P4.    ifXY iff if(X ∩ -Y)0

Now suppose that ifX_1Y_1, ..., ifX_nY_n ⊬ ifXY in M+TRN. Since only finitely
many terms are involved here, there must be a finite Boolean algebra with
an additional binary relation Q interpreting if which verifies every premise
and refutes the conclusion. (Note that though Boolean terms denote elements
of this Boolean algebra by means of some basic assignment from set variables,
these elements are not necessarily sets.) As usual, this Boolean algebra may
be represented isomorphically as a power set algebra and under this
representation, Q becomes a binary relation between sets, having the following
properties (where X, Y, Z are any subsets of the set B of atoms of our Boolean
algebra):

    P1'.    Q∅∅
    P2'.    QX∅, QY∅ => Q(X ∪ Y)∅
    P3'.    QX∅ => Q(X ∩ Z)∅
    P4'.    QXY iff Q(X ∩ -Y)∅

Then one shows that under an appropriate homomorphic restriction of this
power set algebra, Q becomes inclusion, so we are done.
First, let Q* be the unary predicate of subsets of B such that Q*X if and
only if QX∅. So by P4', QXY if and only if Q*(X ∩ -Y). Moreover:

    P1*.    Q*∅                        (from P1')
    P2*.    Q*X, Q*Y => Q*(X ∪ Y)      (from P2')
    P3*.    Q*X => Q*(X ∩ Y)           (from P3')

Let K = ⋃{X: X ∈ Q*}. Then Q*K by P2* and the fact that B is finite.
Now the mapping F = λX. X ∩ -K is a Boolean homomorphism from our
power set algebra into another one over the smaller base set B ∩ -K.
Moreover, Q* restricted to ℘(B ∩ -K) contains only the empty set; thus for
all X, Y ⊆ B ∩ -K, Q*(X ∩ -Y) if and only if X ⊆ Y. It remains to verify
that none of the previous relations is affected by this restriction.
Let X, Y ⊆ B and suppose that Q*(X ∩ -Y). Then Q*((X ∩ -Y) ∩ -K) by
P3*, that is to say Q*F(X ∩ -Y). Conversely, suppose that Q*F(X ∩ -Y).
Since Q*K, we have that Q*(K ∪ F(X ∩ -Y)), by P2*. But by Boolean identity,
K ∪ F(X ∩ -Y) = K ∪ (X ∩ -Y), and thus Q*(K ∪ (X ∩ -Y)). So by
P3*, Q*((K ∪ (X ∩ -Y)) ∩ (X ∩ -Y)), and therefore Q*(X ∩ -Y), by Boolean
identity.

We may now return to the relations pictured in Figure 1. First, the assertion
that E ∪ QD ⊂ C contains the following non-immediate propositions.
E ∪ QD ≠ C: this follows from the fact that neither all or some nor at least
half validates CNJ.
E ⊆ C: let ifX_1Y_1, ..., ifX_nY_n / ifXY be any inference refuted by inclusion
in some model. Already, every or at least one X_i-situation is a Y_i-situation
(1 ≤ i ≤ n), because every X_i-situation is a Y_i-situation. On the other hand,
there is an X-situation which is not a Y-situation, and thus, if there is no
(X ∩ Y)-situation, we have a C-counter-example which is also an
E-counter-example. Otherwise, consider the "homomorphic sub-model" consisting
only of all non-(X ∩ Y)-situations, behaving in exactly the same way with
regard to (non-)membership of the relevant sets. All ifX_iY_i still hold
according to inclusion, because every former inclusion must still hold in this
new model. On the other hand, ifXY is still refuted according to inclusion, but
now there is no (X ∩ Y)-situation, and thus we have a C-counter-example which
is also an E-counter-example.
QD ⊆ C: let ifX_1Y_1, ..., ifX_nY_n / ifXY be any inference refuted by
inclusion in some model. Convert this model into a homomorphic sub-model
which is both a C-counter-example and an E-counter-example (as above). Then
all ifX_iY_i hold according to at least half, because they hold according to
inclusion, while ifXY is refuted according to at least half, because it is
refuted according to all or some, and thus we have a C-counter-example which is
also a QD-counter-example.
In order to give a more precise idea of the logic E, note first that DSJ belongs
to this logic, as well as the following principles (1 =def X ∪ -X, X being
any term):

    CCNJ    ifXY, if(X ∩ Y)Z => ifX(Y ∩ Z)    cautious conjunction
    CSYM    if1X, ifXY => ifYX                conditional symmetry
    TN      if1X, ifXY => if1Y                transmissibility of necessity
    CWA     if1X, ifXY => if(X ∪ Z)Y          conditional weakening of the
                                              antecedent

However, it is easy to verify that all these principles, including DSJ, are
derivable from all basic M-principles plus CCNJ. Though that does not mean
that this set of principles axiomatizes E completely, it is a very likely
conjecture at this stage.
The assertion that E ⊄ QD and QD ⊄ E is established by the following
two facts. First, as we pointed out, DSJ is valid according to all or some.
However, DSJ is not valid according to at least half, witness the following
QD-counter-example:

Secondly, it has been established in van Benthem (1986) that at least half
validates the following principle (where X Δ Y abbreviates the symmetric
difference (X ∩ -Y) ∪ (Y ∩ -X)):

    PA    if(X Δ Y)Y, if(Y Δ Z)Z => if(X Δ Z)Z.

But PA is not valid according to all or some, as the following
E-counter-example indicates:

In order to establish that M ⊂ E ∩ QD, it is sufficient to show that
M ≠ E ∩ QD, since both E and QD include M. This is quite simple since we
easily verify that the following principle is valid according to both all or
some and at least half without being derivable in M alone:

    CDSJ    if(X Δ Y)0, ifXZ, ifYZ => if(X ∪ Y)Z    conditional disjunction

Another way to see this is to consider the class of the quantifiers all but at
most n (for any n = 1, 2, 3, ...): all of them validate all basic M-principles,
but none validates CDSJ.
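The separations between E and QD discussed in this section can be explored with a small brute-force search for counter-examples. The sketch below, on a three-element universe, looks for sets X, Y, Z refuting DSJ, CNJ and PA under all or some and under at least half. The encoding and names are illustrative; the counter-examples it finds need not coincide with the ones pictured in the original figures, and the absence of a counter-example on this universe is not by itself a proof of validity.

```python
from itertools import combinations, product

def powerset(universe):
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def all_or_some(X, Y):
    return X.issubset(Y) or bool(X & Y)

def at_least_half(X, Y):
    return len(X & Y) >= len(X - Y)

# Each rule returns (truth of all premises, truth of the conclusion).
# X ^ Y is the symmetric difference of X and Y.
PRINCIPLES = {
    "DSJ": lambda q, X, Y, Z: (q(X, Z) and q(Y, Z), q(X | Y, Z)),
    "CNJ": lambda q, X, Y, Z: (q(X, Y) and q(X, Z), q(X, Y & Z)),
    "PA":  lambda q, X, Y, Z: (q(X ^ Y, Y) and q(Y ^ Z, Z), q(X ^ Z, Z)),
}

def counterexample(q, name, universe=(1, 2, 3)):
    """Return some (X, Y, Z) refuting the principle, or None if there is none."""
    rule = PRINCIPLES[name]
    for X, Y, Z in product(powerset(universe), repeat=3):
        premises, conclusion = rule(q, X, Y, Z)
        if premises and not conclusion:
            return X, Y, Z
    return None

for q in (all_or_some, at_least_half):
    for name in PRINCIPLES:
        found = counterexample(q, name)
        print(q.__name__, name, "refuted" if found else "no counter-example")
```

For instance, X = {1, 2}, Y = {1, 3}, Z = {1} refutes DSJ under at least half, and X = {1, 2}, Y = {3}, Z = {1} refutes PA under all or some.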
4. CONDITIONAL QUANTIFIERS ON DENUMERABLE UNIVERSES

Now let us consider denumerable universes. It is not very surprising that the
logic of all as well as the logic of all or some do not change on these
universes. We give here the proofs of these identities, which will be useful
in some forthcoming demonstrations.

THEOREM 6. C_ω = C and E_ω = E.
Proof. The non-immediate propositions are the following.
C_ω ⊆ C and E_ω ⊆ E: let ifX_1Y_1, ..., ifX_nY_n / ifXY be any inference
refuted according to all (resp. all or some) in some finite model. Select one
situation in the universe of this model and add countably many copies of this
situation, behaving in exactly the same way with regard to (non-)membership
of the relevant sets. This procedure preserves inclusion as well as
overlapping, and thus we have a C_ω-counter-model (resp. an E_ω-counter-model)
for the same inference.
C ⊆ C_ω: let ifX_1Y_1, ..., ifX_nY_n / ifXY be any inference refuted by
inclusion in some denumerable model. So there is at least one
(X ∩ -Y)-situation in this model, say x. Consider the model consisting of x
alone, behaving in exactly the same way with regard to (non-)membership of the
relevant sets. None of the previous relations is affected in this new model,
and so we have a C-counter-model for the same inference.
E ⊆ E_ω: let ifX_1Y_1, ..., ifX_nY_n / ifXY be any inference refuted according
to all or some in some denumerable model. For each 1 ≤ i ≤ n, select exactly
one (X_i ∩ Y_i)-situation (if there is any) and select exactly one
(X ∩ -Y)-situation. Consider the homomorphic sub-model consisting only of these
selected situations, behaving in exactly the same way with regard to
(non-)membership of relevant sets. Clearly, this model is finite (note that
there is only a finite number of premises). Moreover, it is still the case that
every or at least one X_i-situation is a Y_i-situation (1 ≤ i ≤ n), that there
is one (X ∩ -Y)-situation, but that there is no (X ∩ Y)-situation. Thus we have
an E-counter-model for the same inference.

Incidentally, this latter result about the inferential behaviours of all and all
or some can be generalized to at most denumerable universes as well as to
infinite universes of higher cardinalities.
With at least half, matters change, as the following theorem indicates.
THEOREM 7. QD_ω ⊂ E.
Proof. DSJ is an E-principle which is not QD_ω-valid, and thus QD_ω ≠ E.
(To see that DSJ is not QD_ω-valid, consider the QD-counter-example of Section
3, and add countably many new situations outside the three relevant sets.
The result is obviously a QD_ω-counter-example.) However, every QD_ω-valid
inference is E-valid too. Indeed, let ifX_1Y_1, ..., ifX_nY_n / ifXY be any
inference refuted in some E-model. For each 1 ≤ i ≤ n, select exactly one
(X_i ∩ Y_i)-situation (if there is any) and add countably many copies of this
situation, behaving in exactly the same way with regard to (non-)membership of
the relevant sets. Clearly this new model is denumerable. Moreover, for every
1 ≤ i ≤ n, either there is no (X_i ∩ -Y_i)-situation, or there are countably
many (X_i ∩ Y_i)-situations, which means in both cases that there are no more
(X_i ∩ -Y_i)-situations than (X_i ∩ Y_i)-situations. On the other hand, there
are more (X ∩ -Y)-situations than (X ∩ Y)-situations, since there was no
(X ∩ Y)-situation at all, but at least one X-situation in the former model,
which is still the case, and thus we have a QD_ω-counter-model for the same
inference.


One notes again that this latter result about the inferential behavior of at
least half can be generalized to at most denumerable universes as well as to
infinite universes of higher cardinalities.
Given what we know so far, Theorem 7 gives us two by-products, the first
one concerning the mutual relationship between M, QD_ω, E and C.
THEOREM 8. M ⊂ QD_ω ⊂ E ⊂ C.
Proof. The non-immediate assertions are the following.
QD_ω ⊂ E ⊂ C: from Theorem 7 and the fact that E ⊂ C (Section 3).
M ⊂ QD_ω: every M-principle is a QD_ω-principle, of course; but CDSJ (from
if(X Δ Y)0, ifXZ, ifYZ to if(X ∪ Y)Z) is QD_ω-valid, as we may easily verify,
though it is not an M-principle, as we pointed out in Section 3.

THEOREM 9. QD ⊄ QD_ω.
Proof. From Theorem 7 and the fact that QD ⊄ E (Section 3).

Incidentally, the converse of this latter theorem also holds.

THEOREM 10. QD_ω ⊄ QD.
Proof. The idea is this. The principle CWA (from if1X, ifXY to if(X ∪
Z)Y) is not QD-valid, as this QD-counter-example shows:

However, it is a QD_ω-valid principle. For, suppose that if1X and ifXY are
both true according to at least half in some denumerable model. Then there
are countably many X-situations, since there are no more -X-situations than
X-situations. Therefore there are also countably many (X ∩ Y)-situations, since
there are no more (X ∩ -Y)-situations than (X ∩ Y)-situations. So a fortiori
there are countably many ((X ∪ Z) ∩ Y)-situations, and this is sufficient for
verifying the conclusion.
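A finite QD-counter-example to CWA is easy to write down explicitly. The one below, on a four-element universe, is an illustrative choice and is not claimed to be the one pictured in the original figure.

```python
def at_least_half(X, Y):
    return len(X & Y) >= len(X - Y)

E = frozenset({1, 2, 3, 4})   # the universe, i.e. the denotation of the term 1
X = frozenset({1, 2})
Y = frozenset({1})
Z = frozenset({3, 4})

premise_1 = at_least_half(E, X)       # if1X: 2 X-situations against 2 others
premise_2 = at_least_half(X, Y)       # ifXY: 1 against 1
conclusion = at_least_half(X | Z, Y)  # if(X ∪ Z)Y: 1 against 3

print(premise_1, premise_2, conclusion)  # True True False
```

On a denumerable universe this kind of counter-example is no longer available, since the two premises force infinitely many situations in X ∩ Y, as the proof above explains.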

Now, note that the situation is different for at most denumerable universes,
since every QD-invalid inference is also an invalid inference according to at
least half on finite or denumerable universes.
The logic N (the one of all but finitely many) is interesting. It contains
CNJ, which distinguishes it from both E and QD. But unlike C, none of LM,
CNT and TRN is valid according to this logic. (To see that LM, for instance,
is not an N-principle, consider the numerical model where Y = the set of
even numbers, X = Y ∪ {1} and Z = {1}; clearly, all but finitely manyXY
but not all but finitely many(X ∩ Z)Y.) Thus, N seems to be a good candidate
for a counterfactual logic. Let us compare it with the Basic subjunctive
logic S of Burgess (1981), which is completely axiomatized by R, CNJ, DSJ
and the following two additional principles:

    SIMP    ifX(Y ∩ Z) => ifXY              simplification
    CLM     ifXY, ifXZ => if(X ∩ Y)Z        cautious left-monotonicity

The following two results, due to van Benthem (1986), will be useful:
(i) S is precisely M+CNJ (all derivations are straightforward in both
ways);
(ii) S is the many-premise fragment, in our formalism, of the full
counterfactual logic of Lewis (1973). This means that S is sound and complete
with respect to comparative similarity models.
Now, since all basic M-principles and CNJ are N-principles, it follows
from (i) that N ⊇ S. But N does not exactly coincide with S. To see this,
consider the inference from if1-X, ifXY to if(X ∩ Z)Y. It is clearly N-valid,
since if1-X means here that there are only finitely many X-situations, and in
this case the inference from ifXY to if(X ∩ Z)Y is validated by inclusion.
On the other hand, this inference is not S-valid, as the following comparative
similarity model indicates (comparative similarity is distance and world 3 is
the vantage world):

    [Figure: comparative similarity model; surviving labels: Z, Y, 3]

As we see, if(X ∩ Z)Y does not hold in world 3 (i.e., there is a "closest"
(X ∩ Z)-world, namely world 1, which is not a Y-world). However, both if1-X
and ifXY hold in world 3 (in particular, there is only one "closest" E-world
from world 3, namely world 3, and this world is not an X-world). So, the
inclusion of S in N is proper.
Where is the location of N in the scheme of Figure 1? Here is the picture;
the proofs will follow.

    [Diagram: M ⊂ E ∩ QD; E ⊄ QD and QD ⊄ E; E ∪ QD ⊂ N ⊂ C]

Fig. 2.

The first non-immediate assertion to consider here is
N ⊂ C: as we pointed out, LM is a C-principle but not an N-principle,
and thus N ≠ C. But every inference which is C-invalid is N-invalid too. Indeed,
let ifX_1Y_1, ..., ifX_nY_n / ifXY be any inference refuted by inclusion in some
finite model, and thus (Theorem 6) in some denumerable model. Then in this model
all premises are verified according to all but finitely many, because they are
verified by inclusion. On the other hand, if there are countably many
(X ∩ -Y)-situations, then there are also countably many X-situations, and so
the conclusion is already refuted according to all but finitely many. If there
are only finitely many (X ∩ -Y)-situations and the set of X-situations is
denumerable, add countably many copies of any (X ∩ -Y)-situation, behaving in
exactly the same way with regard to (non-)membership of the relevant sets.
Again this procedure does not disturb any of the previous relations, but now
there are countably many (X ∩ -Y)-situations, and thus we have an
N-counter-model for the same inference.
The second non-immediate assertion to consider is E ∪ QD ⊂ N. First,
E ∪ QD ≠ N, since CNJ is an N-principle but neither an E-principle nor a
QD-principle. It remains to establish that both E and QD are included in N.
In order to do this, we need the following theorem.
THEOREM 11. Every inference ifX_1Y_1, ..., ifX_nY_n / ifXY which is refuted
according to all but finitely many in some denumerable model where X ∩ -Y
is denumerable is also refuted in some finite model according to inclusion
(all).
Proof. Let ifX_1Y_1, ..., ifX_nY_n / ifXY be any inference refuted according
to all but finitely many in some denumerable model where X ∩ -Y is
denumerable. Consider the homomorphic sub-model consisting only of all
non-⋃_i (X_i ∩ -Y_i)-situations, behaving in exactly the same way with regard
to (non-)membership of the relevant sets. Then all ifX_iY_i hold according to
inclusion in this new model, whereas X ∩ -Y is still denumerable, since
⋃_i (X_i ∩ -Y_i) was finite, and thus we have a C_ω-counter-model, and so
(Theorem 6) a C-counter-model for the same inference.

Now we may establish our two assertions.
E ⊆ N: let ifX_1Y_1, ..., ifX_nY_n / ifXY be any inference refuted according to
all but finitely many in some denumerable model. Here X ∩ -Y is either
denumerable or finite. In the first case, there is a finite model which refutes
the same inference according to inclusion (Theorem 11) and this model may
be converted into an E-counter-model for the same inference (Section 3). On
the other hand, if X ∩ -Y is finite, X ∩ Y is finite too. Moreover, we already
have that all or someX_iY_i (1 ≤ i ≤ n), because either X_i ⊆ Y_i or X_i ∩ Y_i
is denumerable. So consider the homomorphic sub-model consisting only of
all non-(X ∩ Y)-situations, behaving in exactly the same way with regard to
(non-)membership of the relevant sets. The result is a denumerable model
which is both an N-counter-example and an E_ω-counter-example for the same
inference. But this model may in its turn be converted into an E-counter-model
(Theorem 6).
QD ⊆ N: let ifX_1Y_1, ..., ifX_nY_n / ifXY be any inference refuted according
to all but finitely many in some denumerable model. Again, if X ∩ -Y is
denumerable, then there is a finite model which refutes the same inference
according to inclusion (Theorem 11) and this model may be converted into
a QD-counter-model for the same inference (Section 3). Otherwise, convert
the model into a denumerable sub-model which is both an N-counter-example
and an E_ω-counter-example for the same inference (as above). Then all ifX_iY_i
hold according to at least half in this new model, since either X_i ⊆ Y_i or
X_i ∩ Y_i is denumerable, and thus in both cases |X_i ∩ Y_i| ≥ |X_i ∩ -Y_i|. On
the other hand, ifXY does not hold according to at least half, because X ∩ Y is
empty, but not X ∩ -Y. Now, this model may be converted into a
QD-counter-example as follows: for every denumerable set X_i ∩ Y_i, select
finitely many (X_i ∩ Y_i)-situations which are not in X ∩ -Y in order that
their number be at least equal to the one of (X_i ∩ -Y_i)-situations (this last
number is finite); let these selected situations behave in exactly the same way
with regard to (non-)membership of the relevant sets; then remove from the
whole universe all other situations not in X ∩ -Y.
Incidentally, from this latter proof we may extract the following results.

(i)   $QD_\omega \subseteq E_\omega \cap N$;
(ii)  $QD \subseteq E_\omega \cap N$.

Theorem 6, (i) and (ii) imply:

(iii) $QD_\omega \subseteq E \cap N$;
(iv)  $QD \subseteq E \cap N$.


All these inclusions are proper. For DSJ is both an E-principle and an
N-principle which is neither $QD_\omega$-valid nor QD-valid, and thus
$QD_\omega \neq E_\omega \cap N = E \cap N$ and $QD \neq E_\omega \cap N = E \cap N$.
Many questions are still unanswered. For instance, where is the location
of $QD_\omega$ in the scheme of Figure 2? Theorem 7 gives us a partial answer,
which motivates us to ask whether $E \cap QD \subset QD_\omega$. Along another line,
there is the issue of the inferential behaviours of our conditional quantifiers
in a more expressive formalism allowing logical combinations between
conditional formulae, such as:

CEM    $\mathit{if}\,XY \lor \mathit{if}\,X{-}Y$
CV     $\mathit{if}\,XY,\ \neg\,\mathit{if}\,X{-}Z \Rightarrow \mathit{if}\,(X \cap Z)Y$
ALT    $\mathit{if}\,X(Y \cup Z) \Rightarrow \mathit{if}\,XY \lor \mathit{if}\,XZ$

Considering only these three principles, it is easy to verify that on finite
universes, all of them are valid according to all or some, only CEM and CV
are valid according to at least half and only CV is valid according to all.
Thus a natural question at this stage is how known facts about conditional
logics change as one varies the expressive power of the formalism
expressing conditional assertions and inferences. Finally, from the inverse logic
perspective, we may point out the following issue: given a set of patterns of
conditional inference, which quantifiers validate exactly those patterns? In
particular, is there any quantifier validating precisely the logic M?
College Bois-de-Boulogne
REFERENCES
Barwise, J. and Cooper, R., 1981, 'Generalized Quantifiers and Natural Language', Linguistics
and Philosophy 4, 159-219.
van Benthem, J., 1984, 'Foundations of Conditional Logic', Journal of Philosophical Logic
13, 303-349.
van Benthem, J., 1986, Essays on Logical Semantics, D. Reidel Publishing Company, Dordrecht.
Burgess, J. P., 1981, 'Quick Completeness Proofs for some Logics of Conditionals', Notre
Dame Journal of Formal Logic 22, 71-84.
Keenan, E. L. and Stavi, J., 1986, 'A Semantic Characterization of Natural Language
Determiners', Linguistics and Philosophy 9, 253-326.
Lapierre, S., 1991, 'Conditionals and Quantifiers', in Jaap van der Does and Jan van Eijck
(eds.), Generalized Quantifiers and Applications, Dutch Network for Language, Logic and
Information, Amsterdam, pp. 155-174. To appear in Jaap van der Does and Jan van Eijck
(eds.), Quantifiers, Logic and Language, CSLI Lecture Notes, Stanford, California, 1995.
Lewis, D. K., 1973, Counterfactuals, Harvard University Press, Cambridge, Mass.
Lindström, P., 1966, 'First-order Predicate Logic with Generalized Quantifiers', Theoria 32, 1-11.
Montague, R., 1973, 'The Proper Treatment of Quantification in Ordinary English', in J. Hintikka,
J. Moravcsik, and P. Suppes (eds.), Approaches to Natural Language: Proceedings of the
1970 Stanford Workshop on Grammar and Semantics, D. Reidel Publishing Company,
Dordrecht, pp. 221-242.
Mostowski, A., 1957, 'On a Generalization of Quantifiers', Fundamenta Mathematicae 44, 12-36.
Westerståhl, D., 1989, 'Quantifiers in Formal and Natural Languages', in D. Gabbay and F.
Guenthner (eds.), Handbook of Philosophical Logic, volume IV, D. Reidel Publishing
Company, Dordrecht, pp. 1-131.

MARIE LA PALME REYES, JOHN MACNAMARA AND
GONZALO E. REYES

A CATEGORY-THEORETIC APPROACH
TO ARISTOTLE'S TERM LOGIC,
WITH SPECIAL REFERENCE TO SYLLOGISMS

INTRODUCTION

When Aristotle invented logic, what he invented was a logic of terms. The
Stoics replaced Aristotle's term variables with propositional ones, and with
that propositional logic was born (see [16]). For a long time term logic and
propositional logic existed together. For example, William of Ockham [21]
devoted the first part of his Summa logicae to terms and the second part to
propositions. Perhaps it was Kant who was responsible for the emphasis on
propositional logic at the expense of term logic. For where Aristotle had
categories of objects and attributes, closely related to the grammatical categories of terms that normally denote them, Kant had categories of concepts.
Kant, however, derives categories of concepts from categories of judgments;
that is, from categories of propositions. With the move to categories of judgments, term logic in anything like Aristotle's sense drops from view. In this
Frege follows Kant and so does what is now called "classical logic". (These
remarks were inspired by a comment of F. W. Lawvere.)
In the exclusive pursuit of classical logic important logical problems
are neglected. Elsewhere we have studied problems of negation in natural
languages ("unhappy" versus "not happy") and problems of identity (the celebrated ship of Theseus) and we shall not repeat the discussion here (see [13]
and [14]). These problems show the special relevance of term logic for the
study of cognition; for the semantics of natural languages and cognition are
of a piece. Not that one can read the forms of cognitive operations
automatically off grammatical form, as perhaps some ordinary-language
philosophers may once have imagined; but that if a cognitive operation is
demanded for the interpretation of a non-technical expression of natural
language, then that operation must be one that is readily available to the human
mind. It seems that no operation can exist in the non-technical part of a natural
language that is not available to the untutored human mind. Since many of
these operations cannot be captured in classical logic with its set-theoretic
models but require the use of category theory, we propose the following parallel:

    Categorical logic        Calculus
    Cognition                Dynamics

Just as dynamics is expressed in the language of calculus and calculus is
the main mathematical tool for exploring properties of dynamical systems,
so categorical logic provides the language best adapted to expressing the theory
of cognition and the main mathematical tool for exploring cognitive
structures and operations. (For this claim see [10].)

M. Marion and R. S. Cohen (eds.), Quebec Studies in the Philosophy of Science I, 57-68.
© 1995 Kluwer Academic Publishers.

Among the examples which illustrate term logic, which is not in general
Boolean, we develop at length only the theory of syllogisms. The first section
discusses two problems about syllogisms. Section two introduces the standard
class interpretation and shows that it is unable to solve the two problems. In
section three we present a particularly central component of term logic, the
logic of kinds. In section four we apply this logic to syllogisms, delimiting
conditions under which syllogisms are valid. In section five we compare
the category-theoretic approach with the one that has for a considerable time
been offered as the standard approach, namely the class interpretation of
syllogisms. Some concluding remarks bring our paper to an end.
1. TWO PROBLEMS ABOUT SYLLOGISMS

Aristotle's syllogistic is presented by means of schematic letters together
with illustrative examples from such subjects as biology. This agrees with
Aristotle's view of logic as an instrument for sciences. One has the
impression that for schematic letters one can substitute grammatically appropriate
expressions of natural language, and all will be well. Aristotle himself was
well aware that this leads to difficulties, for he notes that while every thief
is a person, a good thief is not usually a good person. Here is a pseudo
syllogism to show what can go wrong:

    Every person in the ward is a baby in the ward
    Every baby in the ward is big
    Every person in the ward is big

The inference fails because even though a baby is a person, a big baby is
not a big person. The source of the trouble is that "big" as sorted by "baby"
is quite distinct from "big" as sorted by "person".
It is not generally appreciated how universal the phenomenon of such sorting
is. Take "white" as typed by "person" and "animal". Although every person
is an animal, a white person is not a white animal; white animals being exemplified by white rats and rabbits. Similar sorting effects apply to verbs and verb
phrases. The ways in which a motor, a government and a dog run are rather
different.
Another difficulty for syllogisms, noticed by Geach in [6] and Mulhern
in [20], relates to the shift of a term between subject and predicate position.
To make the problem concrete consider the following pseudo syllogism:

    All fire engines are red
    All reds are marxists
    All fire engines are marxists

"Red" in the first premiss is an adjective sorted by "fire engine" and as such
it picks out a set of fire engines (the red ones). In the second premiss, "reds"
is the plural form of the CN "red" which refers to the kind RED (consisting
of persons belonging to a certain political group). It follows that the putative
inference, made plausible by an unmarked change in grammatical category
associated with the change in grammatical role, is spurious.
Even if we lay such obvious miscarriages of validity aside, we still have
a problem. Take the perfectly unexceptionable syllogism in Barbara:

    All dogs are animals
    All animals are mortal
    All dogs are mortal

What is the logical connection between "to be an animal" which in the
first premiss denotes a predicate of "dogs" and "animals" in the second premiss,
which denotes a kind in its own right, not a predicate of "dogs"? What explains
the obvious validity of such a syllogism?
Perhaps to avoid such difficulties some writers have placed restrictions
on the class of natural language expressions that can be substituted for
Aristotle's schematic letters. Łukasiewicz [17] rules out proper names and
adjectives. Yet as Mulhern [20] points out, Aristotle gives numerous examples
of syllogisms in which proper names and adjectives feature. Elsewhere we have
dealt with syllogisms in which proper names and adjectives in both subject
and predicate position occur, delimiting the circumstances under which they
are valid (see [12] and [15]). Here we shall leave these aside and handle the
difficulties in "normal" syllogisms only.
2. THE CLASS INTERPRETATION

We have indicated two problems for syllogisms. Let us see how the standard
class interpretation handles them. But first a word on what the class interpretation is. The idea of the class interpretation is to assign to the count
nouns and predicables of a syllogism non-empty subsets of a supposed universal kind THING. The members of this kind are sometimes conceived as bare
particulars, that is attribute-free supports for attributes. Thus "baby" is interpreted as the subset of things that have the property of being a baby; "person"
as the subset of things that have the property of being a person; and the relation
between BABY and PERSON is just one of set-theoretical inclusion. Similarly,
"big" is interpreted as the subset of things that have the property of being
big.
Neither of the problems pointed out above seems to arise in this interpretation. There seems to be no problem relating to the sorting of predicables,
since predicables are not sorted or rather, what amounts to the same, are sorted
by the unique sort "thing". The problem of switching between a predicable and
a sentence subject seems also to have disappeared, since both sentence subjects
and predicables are interpreted as subsets of a set of things. So construed it
seems to make sense to claim identity, in the syllogism above, between the
subset interpreting "to be an animal" in the first premiss and the subset interpreting "animals" in the second premiss. The class interpretation, supported
by Euler or Venn diagrams, is usually presented as the interpretation of
Aristotelian syllogistic.
The appearance of well being, however, is no more than an appearance. The
class interpretation makes no provision for a predicable to change meaning
with change in sorting. It construes the interpretation of "persons who are
big babies" as things that have the three properties: of being persons, babies
and big. It follows that such things must be big persons. But this, as we have
seen, runs foul of intuition. It is simply not sensitive enough to the system
of interpreting natural-language expressions.
Notice, furthermore, that this approach does violence to grammar, since it
makes no distinction between nouns and predicables. It interprets both as
subsets of the kind THING. However, in every language that has the grammatical category of common noun (nearly all languages do), a common noun
is required for the use of quantifiers (see [1]). Another way to make the same
point is to note that we are unable to grasp conceptually and work with the
supposed universal kind THING. If we try to count the things in a room we
do not know whether to count a certain woman as one, or to count the limbs
separately, or the cells in her body, or the molecules in the cell, or ... The
point has been made by Geach [5].
With these remarks in mind return to the second problem that we noted
for syllogisms; that relating to change in grammatical role from predicate to
subject position. Because of its insensitivity to grammar, the class interpretation just ignores the problem completely.
A further trouble for the class interpretation, discussed already by medieval
logicians, is in how to construe the relation among kinds (the interpretations
of count nouns). We illustrate the difficulty with an example inspired by Gupta
[7]. The source of the trouble is that the class interpretation recognizes only
one relation among kinds: set-theoretic inclusion. Inclusion is of course
injective. What then of the relation between PASSENGER and PERSON? If
a person travels three times in one year with an airline, three passengers will
be counted in association with that person. It follows that the relation between
PASSENGER and PERSON cannot be set-theoretic inclusion. The same applies
to the way we count diners in a restaurant, clients in a shop, patients in a
hospital and countless other examples. Incidentally, the relation between BABY
and PERSON, though injective, cannot be set-theoretical inclusion; at least not
if the relation between ADULT and PERSON is also set-theoretical inclusion. That would entail a relation of identity between a baby and the adult it
later becomes; a nonsensical result, since a baby is not an adult, nor is an
adult a baby.
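
The counting argument about PASSENGER and PERSON can be illustrated with a small sketch. It assumes toy data (the member names and the `underlying` dictionary are hypothetical, not from the text) and treats the relation between the two kinds as a map rather than an inclusion; the only point is that such a map need not be injective.

```python
# Toy model (all names are illustrative assumptions, not from the text):
# kinds are finite sets, and the relation PASSENGER -> PERSON is a map.
PERSON = {"ann", "bob"}
PASSENGER = {"trip1", "trip2", "trip3", "trip4"}

# Each passenger is associated with exactly one person.
underlying = {"trip1": "ann", "trip2": "ann", "trip3": "ann", "trip4": "bob"}

def is_injective(mapping):
    """True when no two keys are sent to the same value."""
    return len(set(mapping.values())) == len(mapping)

passengers_counted = len(PASSENGER)               # what the airline counts
persons_carried = len(set(underlying.values()))   # distinct persons involved
```

Because three passengers are associated with the same person, the map cannot be an inclusion of PASSENGER into PERSON, which is the difficulty the class interpretation faces here.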

3. THE LOGIC OF KINDS

Since proper names (PNs), count nouns (CNs) and predicables are interpreted
in relation to kinds, we begin with the logic of kinds. We do not here consider
syllogisms that include mass nouns (like "water") or abstract nouns (like
"beauty" or "justice"). Readers seeking more detail on the logic of kinds should
consult [11].
3.1. Situations and Kinds

First we need the notion of possible situation, which we leave at an intuitive
level. Situations are what ground truth-conditions. They carry information
and we assume that possible situations are pre-ordered by the relation $\le$ of
"having more information than", namely $V \le U$ whenever V has more
information than U.
We now arrive at the notion of kind, which is fundamental in our work.
Kinds are typically the interpretations of CNs. We construe a kind, say
PERSON, as the set P of all persons that ever were, are or will be. Thus P does
not change as persons die and new persons are born. We call this property
modal constancy; it provides for reference to persons past, present and future
with the single word "person". This notion of kind is a simplification, adequate
for our purposes, of the one described in [11].
With the help of this notion, we may assign truth conditions to sentences.
In fact, a sentence like "John runs" is neither true nor false, but it may hold
in some situations. Assume that it holds at U. Viewing this relation as a piece
of information carried by U, we conclude that the set of situations in which
"John runs" holds is downward closed. In the technical development, this
downward closed set of situations is precisely the truth-value of the sentence
in question. More generally, the interpretation of every sentence is a downward
closed set of situations.
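
As a minimal sketch of what "downward closed" amounts to, the following assumes a finite toy pre-order of situations (the names `SITUATIONS`, `LEQ` and the sample sets are hypothetical) and reads a pair `(V, U)` as "V has more information than U".

```python
# Situations and the pre-order "V has more information than U", written V <= U.
# Both are hypothetical toy data chosen for illustration.
SITUATIONS = {"U", "V", "W"}
# Pairs (V, U) meaning V <= U; reflexive pairs are listed explicitly.
LEQ = {("U", "U"), ("V", "V"), ("W", "W"), ("V", "U")}

def is_downward_closed(truth_value):
    """S is downward closed when U in S and V <= U together give V in S."""
    return all(v in truth_value for (v, u) in LEQ if u in truth_value)

john_runs = {"U", "V"}       # holds at U, hence also at the more informative V
not_a_truth_value = {"U"}    # misses V although V <= U
```

On this reading only sets like `john_runs` can serve as truth-values of sentences; `not_a_truth_value` is ruled out because information carried by U would be lost at V.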
Proceeding in this way, we assign kinds to CNs such as "person", "baby",
"adult", etc. As we pointed out in the section on the class interpretation, we
cannot assume that the relations between these kinds are set-theoretical
inclusions. In particular, it makes no sense to say of a baby that it is identical
to a person. But if this is so, how can we identify John as a baby with John as a
person? In other words, what is the relation between the kinds BABY and
PERSON?
We construe this relation as a map $u\colon \mathrm{BABY} \to \mathrm{PERSON}$. Thus b, a member
of the kind BABY, is identified with u(b), i.e., a member of the kind PERSON.
We call u(b) the person underlying the baby b. Kinds and underlying maps
constitute a category $\mathcal{K}$ (in the sense of category theory).

3.2. Predicates

Predicates are typically the interpretation of predicables (adjectives, VP, etc.)
such as "white", "mortal", "run" and "to be a dog". We define a predicate
of a kind, say PERSON, as a map that associates with a given person, say
John, a downward closed set of situations. As an example, take the
interpretation of "run" as applied to the kind PERSON, $\mathrm{RUN}_{\mathrm{PERSON}}$. We construe this
predicate as the map which associates with a person, say John, the set of
situations in which John runs. As we saw in section 3.1, this set is downward
closed. Instead of writing "$U \in \mathrm{RUN}_{\mathrm{PERSON}}(\mathrm{JOHN})$" we shall usually write
"John runs at U", or, to emphasize the kind to which John belongs, "the person
John runs at U".
In spite of their ubiquity, situations do not play a central role in our theory.
Indeed [22] defines kinds relative to an arbitrary topos defined over a base
topos. We chose situations as a determination that is particularly suited for
purposes of exposition. This determination simplifies computations for it allows
us to use that relatively familiar apparatus of categorical logic, Kripke forcing.
We emphasize, however, that the central notion in our semantics is reference, and kinds are essential for reference. Thus our motivation is different
from that of Barwise and Perry [3] whose central concern is the pragmatic
relation between character and content (that is reference) in the sense of [8].
Situations play a basic role in their semantics but not in ours.
3.3. Nominal Categories, Interpretations and Systems of Kinds

As we have seen, CNs such as "dog" and "animal" are interpreted as kinds
and relations between CNs of the form "a dog is an animal" as an underlying
map between the corresponding kinds. However, we can be a bit more
explicit. We will see, by means of an example, that CNs themselves (in a given
"universe of discourse") constitute a preordered set and thus a category, $\mathcal{N}$,
the nominal category of the given universe. Its role is to act as a "blueprint"
to organize the CNs in such a way as to indicate the connection between the
kinds interpreting these CNs.
Assume that, like Aristotle, we are organizing and developing a given
subject, say Zoology. In Zoology we have CNs ("mammal", "whale",
"fish", "animal", etc.) and predicables ("having a heart", "breathing air", "being
a mammal") which combine with CNs to make significant sentences.
Furthermore, we have relations between CNs described by sentences of the
type "a dog is an animal", "a whale is a mammal", which we assume as
postulates. These postulates impose constraints on the interpretation of these
CNs, constraints which are partly conventional and partly empirical and are
subject to revision. Recall that for a long time people accepted that a whale
was a fish.
The CNs and their relations constitute a category $\mathcal{N}$ under the following
definition of objects and morphisms: an object of $\mathcal{N}$ is a CN $\mathcal{A}$. A morphism
$\mathcal{A} \to \mathcal{B}$ is the postulate "an $\mathcal{A}$ is a $\mathcal{B}$".
As we mentioned, CNs and their relations may be interpreted as kinds
and connections between kinds. Such an interpretation turns out to be precisely
a functor between the nominal category $\mathcal{N}$ and the category of kinds $\mathcal{K}$, the
interpretation functor. Sometimes we take the nominal category for granted
and mention only the kinds interpreting the CNs. When we do so, we use
the expression system of kinds instead of "interpretation".

3.4. Identification and Entities for a System of Kinds

We noted that the supposed universal kind THING is logically ill founded.
Partly to replace this notion we propose a new notion of ENTITY relative to
a system of bona fide kinds, a notion that is sensitive to ordinary practices
in the interpretation of natural language expressions.
For example, an airplane may transport fathers, mothers, husbands, wives,
women, stewardesses, crew members, etc. To calculate the number of seats
occupied, we do not simply add the number of fathers plus the number of
mothers plus the number of husbands plus the number of . . . , since a
particular woman, say, may also be a crew member. Similarly, a particular wife
may be a mother. We need to describe these identifications systematically.
We assume that all of these kinds are organized in a system of kinds. In our
example, a stewardess s is identified both with a woman u(s) and with a
crew member v(s), where u and v are the underlying maps of the kind
STEWARDESS into WOMAN and into CREW MEMBER, respectively, corresponding
to the expressions "a stewardess is a woman" and "a stewardess is a crew
member". This forces the identification of u(s) with v(s), i.e., the
identification of a certain woman with a crew member, even if there are no underlying
maps between WOMAN and CREW MEMBER. Similarly, a certain wife and
a certain mother could each be identified with the same woman, forcing their
identification. Notice that, whereas the first identification takes place by means
of a "backward" move, the other takes place by means of a "forward" move
(these terms referring to the direction of the movement along the underlying
maps in the system). Proceeding in this way, we obtain the required relation
of identification relative to our system of kinds. This relation turns out to be
an equivalence relation.
The ENTITY relative to our system of kinds is the set E of equivalence classes.
In our case this kind is the interpretation of "person (in the airplane)".
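
The construction of ENTITY can be sketched on a finite example. The code below is an illustration under stated assumptions: the kinds, their members and the `UNDERLYING` maps are hypothetical toy data, and the identification relation is computed with a standard union-find over members tagged by their kind, which is one convenient way (not necessarily the authors') to obtain the equivalence classes.

```python
# Hypothetical system of kinds: each kind is a set of members, and each
# underlying map sends members of one kind to members of another.
KINDS = {
    "STEWARDESS": {"s1"},
    "WOMAN": {"w1", "w2"},
    "CREW_MEMBER": {"c1", "c2"},
    "MOTHER": {"m1"},
}
UNDERLYING = {
    ("STEWARDESS", "WOMAN"): {"s1": "w1"},
    ("STEWARDESS", "CREW_MEMBER"): {"s1": "c1"},
    ("MOTHER", "WOMAN"): {"m1": "w2"},
}

def entity(kinds, underlying):
    """Quotient the disjoint union of the kinds by the identifications
    generated by the underlying maps (union-find over tagged members)."""
    parent = {(k, x): (k, x) for k, members in kinds.items() for x in members}

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]   # path halving
            node = parent[node]
        return node

    for (src, dst), mapping in underlying.items():
        for x, y in mapping.items():
            parent[find((src, x))] = find((dst, y))

    classes = {}
    for node in parent:
        classes.setdefault(find(node), set()).add(node)
    return {frozenset(c) for c in classes.values()}

E = entity(KINDS, UNDERLYING)
naive_count = sum(len(members) for members in KINDS.values())
```

In this toy system the naive sum over kinds gives six, whereas `E` has three classes, which corresponds to the number of seats occupied: the woman `w1` and the crew member `c1` are identified through the stewardess `s1` even though no underlying map relates WOMAN and CREW_MEMBER.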
Forming the entity is a particular example of the notion of colimit of a
functor in category theory (see [12] for details). This shows that the
operation of forming the colimit is a perfectly natural one for the human
mind. At times the language may fail to contain a single lexical item to express
the colimit. Take bent-wood chairs, kitchen chairs, chairs, dining tables, card
tables, tables, etc. The lexical item that covers them all is "furniture". But
"furniture" is a mass noun and does not denote the colimit of the system.
Nevertheless, there is "article of furniture" that does. Other such expressions
are "head of cattle", "articles of clothing", and "items of information". Many
languages have a much more extensive set of such "markers" ("article", "head",
"item") than English.
3.5. Coincidence Relation

Besides the notion of identification, which is independent of situations, we
define the notion of a coincidence relation relative to a system of kinds. This
is in fact a binary relation (at each situation U) and thus, whether it holds of
members of (possibly different) kinds or not, depends on which situation is
envisaged. This notion will play a central role in the handling of syllogisms
and can be explained by the following example: a certain person, John, who
is childless at V may be a father at U. It is natural to say that John coincides
with a father at U, but fails to do so at V. Assume that John is also a politician at U. Then John coincides with a politician at U and this forces the
coincidence (at U) of a certain father with a certain politician. These indications should suffice to understand this notion. Further details can be found
in [12].
3.6. Interpretation of Predicables
One of the troubles for syllogisms, discussed in Section 1, is the typing of
predicates by kinds with the corresponding change in meaning in the predicate when we go from one kind to another by means of underlying maps in
a system of kinds. How to express the idea that the interpretation of the
predicable "male", say, behaves well when we go from BABY to PERSON
by means of the underlying map u?
Notice first that since predicates are typed by kinds, we have really two
predicates corresponding to the two kinds in the system constituted by the
two kinds with the underlying map between them: $\mathrm{MALE}_{\mathrm{BABY}}$ and
$\mathrm{MALE}_{\mathrm{PERSON}}$.
The answer to our question is now straightforward: the interpreted
predicable "male" has the property that a baby is male (as a baby) in a situation
precisely when it is male (as a person) in the same situation. In symbols: if
U is a situation and b a baby, $U \in \mathrm{MALE}_{\mathrm{BABY}}(b)$ iff
$U \in \mathrm{MALE}_{\mathrm{PERSON}}(u(b))$.
On the other hand the interpretation of the predicable "big" does not have
this property.
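
The contrast between "male" and "big" can be checked mechanically on a toy model. The sketch below assumes hypothetical data: two unordered situations (so every set of situations is downward closed), an underlying map `u` from BABY to PERSON, and predicates represented as dictionaries from members to the set of situations at which the predicate holds.

```python
# Toy data (hypothetical): an underlying map and two sorted predicates.
# Situations "U" and "V" are taken as unordered, so any set of them is
# downward closed.
u = {"b1": "p1", "b2": "p2"}          # underlying map BABY -> PERSON

MALE = {
    "BABY": {"b1": {"U", "V"}, "b2": set()},
    "PERSON": {"p1": {"U", "V"}, "p2": set(), "p3": {"U"}},
}
BIG = {
    "BABY": {"b1": {"U"}, "b2": set()},                      # b1 is a big baby at U
    "PERSON": {"p1": set(), "p2": set(), "p3": {"U", "V"}},  # but p1 is not a big person
}

def is_well_behaved(pred, src, dst, underlying):
    """For every member b of src, pred_src(b) must equal pred_dst(underlying(b)).
    Equality of the two sets is the 'iff' of the text, taken at every situation."""
    return all(pred[src][b] == pred[dst][underlying[b]] for b in underlying)

male_ok = is_well_behaved(MALE, "BABY", "PERSON", u)
big_ok = is_well_behaved(BIG, "BABY", "PERSON", u)
```

Here `male_ok` holds while `big_ok` fails, because the baby `b1` is big as a baby at U while the person underlying it is not big as a person.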
We may generalize our example by defining a predicate of a system of kinds
to be a family of predicates each typed by a kind in the system. Given a system
of kinds, the interpretation of a predicable is a predicate of the system. In order
to study the effect of change of typing a predicate we need some notions
that capture the idea of a predicate being well behaved. We say that the
interpretation of a predicable $\Psi$ is functorial with respect to a system of kinds if
it is well behaved relative to the underlying maps in the system. This notion
of a predicate being functorial is easy to grasp but not always easy to compute
with. For purposes of computation, the equivalent notion of an extensional
predicate is often more convenient. We say that a predicate $\Phi$ of a system of
kinds is extensional if and only if given two kinds in the system and one
member in each, either both members have the property $\Phi$ at U or both fail
to have the property $\Phi$ at U, provided that the members in question are
coincident at U. The equivalence of these notions is proved in [12].
We mention that there already exists in the literature the notion of a
predicate being "intersecting" or "absolute" (see [9]). This notion turns out to be
a very special case of a predicable being functorial/extensional (see [12]).
3.7. Grammatical Change

As we mentioned, some predicables derive from CNs and PNs and we need
to specify the semantical connection between the interpretation of CNs and
PNs on the one hand and that of derived predicables on the other.
Let us start with a system of kinds and assume that "puppy" is a CN in
the nominal category. The interpretation of the predicable "to be a puppy" is
obtained as follows: for each kind A we define a predicate $\mathrm{TO\ BE\ A\ PUPPY}_A$
by stipulating that for each a in A and any situation U, a has the property of
being a puppy at U iff there is a puppy which coincides with a at U. Notice
that a member of a kind in the system, a dog say, may have the property of
being a puppy at U, but may fail to have it at V.
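
A small sketch may help fix the definition of the derived predicate. It assumes that the coincidence relation is simply given as data (the dictionary `COINCIDE`, listing which pairs coincide at which situation, and the member names are hypothetical), and that no ordering between the situations U and V is involved.

```python
# Hypothetical data: which pairs of members coincide at which situation.
COINCIDE = {
    "U": {("rex", "pup_rex")},   # at U the dog rex coincides with a puppy
    "V": set(),                  # at V it does not
}
PUPPY = {"pup_rex"}

def to_be_a_puppy(a):
    """Situations at which some member of PUPPY coincides with a."""
    return {s for s, pairs in COINCIDE.items()
            if any((a, p) in pairs or (p, a) in pairs for p in PUPPY)}
```

With this data the dog `rex` has the property of being a puppy at U and fails to have it at V, as in the remark above.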
The following is fundamental for validity of syllogisms:
PROPOSITION 3.7.1. The interpretation $(A_B)_{B \in |\mathcal{K}|}$ of the predicable "to be
an A" derived from the CN A is an extensional predicate.
4. INTERPRETATION OF SYLLOGISMS

We now specify the notion of validity for a syllogism. In the literature there
are two notions of validity, depending on how syllogisms are considered: either
in terms of axioms, with Łukasiewicz [17]; or in terms of rules of inference,
with Corcoran [4]. Since validity based on axioms implies validity based on
rules of inference, we will prove validity of syllogisms based on axioms.
Our interpretation makes essential use of the relation of "forcing" between
a situation U and a sentence $\sigma$, which we write $U \Vdash \sigma$. Intuitively this relation
expresses the idea that U is a truth-condition of $\sigma$ (as explained earlier in
the paper for sentences such as "John runs", see Section 3.2).
Since we do not plan to give a detailed treatment, we shall give one example.
Take the celebrated syllogism in Barbara:

    All Greeks are men
    All men are mortal
    All Greeks are mortal

We first notice that we here have two CNs: "Greek" and "man" and one
predicable: "to be mortal". The first CN occurs only in subject position in
the syllogism. The second CN changes its grammatical role from subject
position in the second premiss to predicate position in the first. On the other
hand, the predicable "to be mortal" is sorted by "man" in the second premiss
and by "Greek" in the conclusion. We have here an example of the two
problems discussed in the introduction.
Let $\mathcal{N}$ be a nominal category containing, among its objects, "Greek" and
"man" and let $I\colon \mathcal{N} \to \mathcal{K}$ be an interpretation whose values are inhabited
kinds (i.e., kinds having at least one member). To simplify the language, we
let GREEK = I("Greek") and MAN = I("man").
Since both premisses and the conclusion are of the same form, our
first question is how to interpret "All A are $\Psi$", where A is a CN and $\Psi$ is a
predicable?
We recall from Section 3.1 that the interpretation of sentences should be
a downward closed set of situations, i.e., the set of truth conditions should
be a downward closed set of situations. We say that U forces "All A are $\Psi$"
or, equivalently, that U is a truth-condition of "All A are $\Psi$" iff for every a
in A and every $V \le U$, V forces "a has the property $\Psi$", i.e., a has $\Psi$ at V.
We define a syllogism to be valid if every truth-condition of both
premisses is a truth-condition of the conclusion.
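
In symbols, the forcing clause and the definition of validity just given may be transcribed as follows. This is only a restatement under notational assumptions: $\Psi_A$ denotes the predicate interpreting $\Psi$ as typed by the kind interpreting A, the labels $\sigma_1$, $\sigma_2$ and $\tau$ are introduced here for the two premisses and the conclusion, and `\Vdash` requires the amssymb package.

```latex
U \Vdash \text{``All $A$ are $\Psi$''}
  \iff \forall a \in A \;\, \forall V \le U :\; V \in \Psi_A(a)

\sigma_1, \sigma_2 \,/\, \tau \ \text{is valid}
  \iff \forall U \,\bigl( U \Vdash \sigma_1 \ \text{and}\ U \Vdash \sigma_2
       \;\Longrightarrow\; U \Vdash \tau \bigr)
```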
Let us check the validity of our syllogism. Let U be a situation which is
a truth-condition of each premiss and let g be a Greek. Using the first premiss
and the definition of the interpretation of a predicable derived from a eN, there
is a man m such that, at U, g coincides with m. Using the second premiss, m
has the property $\mathrm{MORTAL}_{\mathrm{MAN}}$ at U. We want to conclude that g has the property
$\mathrm{MORTAL}_{\mathrm{GREEK}}$ at U. But here we get stuck, unless we know that the
interpretation of the predicable "to be Greek" is extensional, in which case we
obtain the desired conclusion. But "to be Greek", being derived from the
noun "Greek", is extensional and the syllogism is valid.
This notion of forcing is just Kripke's forcing and is a particular case of
forcing in a topos [19].
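
The validity check can be replayed on a finite toy model. The sketch below rests on several assumptions that are not in the text: two situations with V ≤ U, a coincidence relation supplied directly as data, and sorted predicates given as dictionaries. It is meant only to show how the forcing clause, the derived predicate "to be an A" and the agreement of a predicate on coincident members interact; it is not the authors' construction.

```python
# Toy model; every name and data value here is an illustrative assumption.
SITUATIONS = {"U", "V"}
LEQ = {("U", "U"), ("V", "V"), ("V", "U")}   # (V, U) means V <= U

def below(u):
    """All situations V with V <= u."""
    return {v for (v, w) in LEQ if w == u}

def to_be_a(kind_members, coincide):
    """Predicate derived from a CN: a has it at s iff some member of the
    kind coincides with a at s."""
    def pred(a):
        return {s for s in SITUATIONS
                if any((a, k) in coincide[s] or (k, a) in coincide[s]
                       for k in kind_members)}
    return pred

def forces_all(u, members, pred):
    """u forces 'All A are Psi' iff every a in A has Psi at every V <= u."""
    return all(v in pred(a) for a in members for v in below(u))

def valid(premisses, conclusion):
    """Every truth-condition of all premisses is one of the conclusion."""
    return all(conclusion(s) for s in SITUATIONS if all(p(s) for p in premisses))

def agrees_on_coincident(pred_x, pred_y, coincide):
    """Coincident members either both have the property or both lack it."""
    return all((s in pred_x[x]) == (s in pred_y[y])
               for s in SITUATIONS for (x, y) in coincide[s])

# Barbara: All Greeks are men, all men are mortal / all Greeks are mortal.
GREEK, MAN = {"g"}, {"m"}
COINCIDE = {"U": {("g", "m")}, "V": {("g", "m")}}
MORTAL_GREEK, MORTAL_MAN = {"g": {"U", "V"}}, {"m": {"U", "V"}}
barbara = valid(
    [lambda s: forces_all(s, GREEK, to_be_a(MAN, COINCIDE)),
     lambda s: forces_all(s, MAN, lambda a: MORTAL_MAN[a])],
    lambda s: forces_all(s, GREEK, lambda a: MORTAL_GREEK[a]),
)

# Pseudo syllogism of Section 1: "big" changes meaning with its sorting.
PERSON, BABY = {"p"}, {"b"}
COINCIDE_WARD = {"U": {("p", "b")}, "V": {("p", "b")}}
BIG_PERSON, BIG_BABY = {"p": set()}, {"b": {"U", "V"}}
ward = valid(
    [lambda s: forces_all(s, PERSON, to_be_a(BABY, COINCIDE_WARD)),
     lambda s: forces_all(s, BABY, lambda a: BIG_BABY[a])],
    lambda s: forces_all(s, PERSON, lambda a: BIG_PERSON[a]),
)
```

In this model `barbara` holds and `ward` fails. The difference is traced by `agrees_on_coincident`: the toy MORTAL predicates agree on the coincident pair, whereas the toy BIG predicates do not.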
As we mentioned before, Aristotle also considered "deviant syllogisms",
namely syllogisms with predicables in subject position. We shall leave them
aside. The interested reader is advised to consult [12] for details. Aristotle also
considered syllogisms in which PNs occur. For a study of these, the reader
is referred to [15].
5. THE CLASS INTERPRETATION REVISITED

Our work should not be taken as refuting the class interpretation, but as
delimiting its validity, as we show in this section. In fact our theory yields
the class interpretation under rather special circumstances.
The idea is simple enough. Assume that we have a system of inhabited kinds
and a predicable. As an example take the system involving DOG, POODLE,
etc. and the predicable "male". We define MALE, the interpretation of the
predicable "male" as a predicate of ENTITY (which we assume to be the
kind ANIMAL) by stipulating that an animal a is male at U iff there is a member
of a kind in the system (for instance a dog) that is identified with a and is
male at U. Of course, kinds themselves give rise to extensional predicates
(as shown in Section 3.7) and we obtain the class interpretation for terms
that are "homogeneous with respect to their possible positions as subjects
and predicates" as Łukasiewicz requires.
Although this definition does not require the interpretation of the predicable
to be functorial, several pseudo syllogisms would be validated if this restriction were not imposed, as we pointed out in the introduction.
CONCLUDING REMARKS

In concluding we return to our point of departure, the logic of terms. We
have studied mainly the logic of CNs and predicables. We believe we have
shown that the interpretation of natural language expressions is distorted by
insisting everywhere on set-theoretical models, especially when this is coupled
with the belief that we have access to a universal kind of bare particulars.
All of this obscures the sorting of predicables by CNs and diverts attention
from the two main problems we dealt with: change in semantical role and
change in grammatical role. We believe too that our handling of these problems
(among others) draws attention to a range of operations that, being available
to the untutored mind, are basic for the theory of cognition.
ACKNOWLEDGEMENTS

Work on this paper was funded in part by a post-doctoral fellowship awarded
by the Social Sciences and Humanities Research Council of Canada to Marie
La Palme Reyes, and by grants from the Natural Sciences and Engineering
Research Council of Canada to John Macnamara and Gonzalo E. Reyes. The
authors gratefully acknowledge this support.
McGill University and Université de Montréal (GER)
REFERENCES
1. Bach, E., 1994, 'The Semantics of Syntactic Categories: A Cross-linguistic Perspective',
in J. Macnamara and G. E. Reyes (eds.), The Logical Foundations of Cognition, Oxford
University Press, Oxford.
2. Barnes, J. (ed.), 1984, The Complete Works of Aristotle. Vol. 1, second edition 1985.
Bollingen Series, Princeton University Press, Princeton, N.J.
3. Barwise, J. and Perry, J., 1983, Situations and Attitudes, Bradford/MIT Press, Cambridge,
Mass.
4. Corcoran, J., 1974, 'Aristotle's Natural Deduction System', in J. Corcoran (ed.), Ancient
Logic and Its Modern Interpretations, D. Reidel Publishing Company, Dordrecht.
5. Geach, P. T., 1962, Reference and Generality, Cornell University Press, Ithaca.
6. Geach, P. T., 1972, Logic Matters, University of California Press, Berkeley.
7. Gupta, A. K., 1980, The Logic of Common Nouns, Yale University Press, New Haven.
8. Kaplan, D., 1978, 'Demonstratives; An Essay on the Semantics, Logic, Metaphysics and
Epistemology of Demonstratives and Other Indexicals', in P. Cole (ed.), Syntax and
Semantics 9, Academic, New York.
9. Keenan, E. L. and Faltz, L. M., 1985, Boolean Semantics for Natural Language, Synthese
Language Library, D. Reidel Publishing Company, Dordrecht.
10. Macnamara, J. and Reyes, G. E., 1994, 'Introduction', in J. Macnamara and G. E. Reyes
(eds.), The Logical Foundations of Cognition, Oxford University Press, Oxford.
11. La Palme Reyes, M., Macnamara, J., and Reyes, G. E., 1994, 'Reference, Kinds and Predicates', in J. Macnamara and G. E. Reyes (eds.), The Logical Foundations of Cognition,
Oxford University Press, Oxford.
12. La Palme Reyes, M., Macnamara, J., and Reyes, G. E., 1994, 'Functoriality and Grammatical
Role in Syllogisms', Notre Dame Journal of Formal Logic 35(1), 41-66.
13. La Palme Reyes, M., Macnamara, J., Reyes, G. E., and Zolfaghari, H., 1994, 'The Non-Boolean Logic of Natural Language Negation', Philosophia Mathematica 2(3), 45-68.
14. La Palme Reyes, M., Macnamara, J., Reyes, G. E., and Zolfaghari, H., 1994, 'A Category-theoretical Approach to Aristotle's Logic of Terms, with Special Reference to Negation',
in V. Gómez Pin (coordinador), Actas del Primer Congreso Internacional de Ontología,
Bellaterra: Publicacions de la Universitat Autònoma de Barcelona, pp. 241-249.
15. La Palme Reyes, M., Macnamara, J., Reyes, G. E., and Zolfaghari, H., 1993, 'Proper Names
and How They are Learned', Memory 1(4), 433-455.
16. Łukasiewicz, J. [1934], 1970, 'On the History of the Logic of Propositions', in L. Borkowski
(ed.), Jan Łukasiewicz Selected Works, North-Holland Publishing Company, Amsterdam.
17. Łukasiewicz, J., 1957, Aristotle's Syllogistic. Second edition enlarged. Clarendon Press,
Oxford.
18. Makkai, M. and Reyes, G. E., 1995, 'Completeness Results for Intuitionistic and Modal
Logic in a Categorical Setting', Annals of Pure and Applied Logic 72, 25-101.
19. Mac Lane, S. and Moerdijk, I., 1992, Sheaves in Geometry and Logic, Springer-Verlag, New
York.
20. Mulhern, M., 1974, 'Corcoran on Aristotle's Logical Theory', in J. Corcoran (ed.), Ancient
Logic and Its Modern Interpretations, D. Reidel Publishing Company, Dordrecht.
21. Ockham, William of, 1974 and 1980, Ockham's Theory of Terms: Part I of the "Summa
Logicae" (Translated by M. J. Loux) and Ockham's Theory of Propositions: Part II of the
"Summa Logicae" (Translated by A. J. Freddoso and H. Schuurman), University of Notre
Dame Press, Notre Dame, Ind.
22. Reyes, G. E., 1991, 'A Topos-theoretic Approach to Reference and Modality', Notre Dame
Journal of Formal Logic 32(3), 359-391.
23. Reyes, G. E. and Zolfaghari, H. (in press), 'Bi-Heyting Algebras, Toposes and Modalities', Journal of Philosophical Logic.

J. LAMBEK

ON THE NOMINALISTIC INTERPRETATION
OF NATURAL LANGUAGES*

Attempting to extend the nominalistic interpretation of mathematics to natural
languages, we are led to consider three classes of nouns and three classes of
verbs. We find that the former trichotomy plays a prominent role in the early
history of mathematics, while the latter provides a basic framework for our
prescientific view of the world.
1. INTRODUCTION

In spite of the regrettable divorce of mathematics from philosophy, I am convinced that these two disciplines can learn from one another. In collaboration
with Phil Scott (1986) and Jocelyne Couture (1991), and again in (1993), I
have argued in favour of a nominalistic interpretation of mathematics, suggesting that the world of mathematics may be constructed from the language
of mathematics. Although this view has by no means been accepted by the vast
majority of mathematicians, if they care about foundations at all, it was
intended to minimize the conflict between different traditional philosophies
of mathematics.
I would like to argue here that nominalistic interpretations can be extended
from the language of mathematics to natural languages. This idea is certainly
not new; it has been held for centuries by various philosophers, starting with
Philo of Alexandria, and more recently by linguists and anthropologists; but
I hope that at least some of my arguments have not been seen before. I shall
invoke some linguistic insights, which I arrived at while attempting to write
a mathematically sound production grammar for English, an ongoing project
which may never see completion. My claim that the structure of our world
is greatly influenced by the structure of our language will be illustrated by
episodes in the early history of mathematics, before its lamented divorce
from philosophy (see Anglin and Lambek, 1995). Unfortunately, I do not
possess a technical knowledge of classical Greek and my speculations will
be based on the assumption that there are significant parallels between ancient
Greek and modern English grammars.
2. MATHEMATICS

Let me begin with the question: what is the language of mathematics about?
Most people would say that numerical expressions such as 1 + 1 are about
numbers and that formulas such as 1 + 0 = 1 are about propositions. It is
tempting to say that numbers are numerical expressions and propositions are
formulas, but modulo provable equality. That is, numbers are equivalence
classes of numerical expressions, where α and β are equivalent if the formula
α = β obtained by placing an equal sign between α and β is provable. The
same goes for propositions as equivalence classes of formulas, but an equal
sign between formulas should be read as "if and only if".

* This research was supported by the Social Sciences and Humanities Research Council of
Canada.
M. Marion and R. S. Cohen (eds.), Quebec Studies in the Philosophy of Science I, 69-78.
1995 Kluwer Academic Publishers.

For example, what is the number 2? Leaving aside such facetious or
extravagant assertions as "2 consists of a pair of platinum balls in Paris" or
"2 is an idea in the mind of the goddess", as well as dubious attempts by
logicists to provide a logical definition of 2, such as "2 is the set of all pairs",
most people would agree that 2 is the successor of the successor of zero, usually
denoted by SS0. We shall assume here that 0 and S are symbols in a formal
language, subject to certain axioms and rules of inference (see, e.g., Lambek
and Scott (1986)). Of course 1 + 1, which is provably equal to SS0, must
also be identified with 2, and so must all numerical expressions α for which
one can prove α = SS0.
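The identification of 1 + 1 with the successor of the successor of zero can be illustrated by a small sketch. It rests on assumptions that go beyond the text: only closed expressions built from zero, successor and addition are considered, and rewriting with the usual recursion equations for addition stands in for provable equality, which it captures only for this tiny fragment. The class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Zero:
    """The symbol for zero."""

@dataclass(frozen=True)
class S:
    """The successor symbol applied to a term."""
    arg: object

@dataclass(frozen=True)
class Plus:
    """The sum of two terms."""
    left: object
    right: object

def normalize(term):
    """Rewrite a closed term to a numeral S(...S(Zero())...)."""
    if isinstance(term, Zero):
        return term
    if isinstance(term, S):
        return S(normalize(term.arg))
    if isinstance(term, Plus):
        left, right = normalize(term.left), normalize(term.right)
        # Apply x + Sy = S(x + y) repeatedly, then x + 0 = x.
        while isinstance(right, S):
            left, right = S(left), right.arg
        return left
    raise TypeError(f"not a numerical expression: {term!r}")

def equivalent(a, b):
    """Two expressions name the same number if their numerals coincide."""
    return normalize(a) == normalize(b)

ONE = S(Zero())
TWO = S(S(Zero()))
print(equivalent(Plus(ONE, ONE), TWO))
print(equivalent(Plus(ONE, Zero()), TWO))
```

On this picture a number is the whole class of expressions sharing a numeral; the numeral is merely a convenient representative.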
Other natural numbers can be obtained similarly by applying the successor
symbol a finite number of times to the symbol for zero. But now consider
the following numerical expression α:

the x such that x = 0 if G and x = 1 if not G,

where "G" is an abbreviation of Gödel's undecidable formula. Then α is not
provably equal to 0, S0, SS0, etc. Hence we are forced to conclude that there
are non-standard natural numbers, not obtained from zero by applying the
successor operation a finite number of times. However, the above definition
given by α is based on the venerable assumption that G or not G, an assumption endorsed by Aristotle and classical mathematicians, but not by Brouwer
and other intuitionists. It can indeed be shown (Lambek and Scott, 1986)
that, in intuitionistic type theory, there are no non-standard numbers: for
every closed term α of type N, the type of natural numbers, one can either
prove α = 0 or α = S0 or α = SS0 or ...
Classical mathematicians also believe that, extensionally speaking, there are
only two propositions, namely the true and the false, nowadays denoted by
⊤ and ⊥. But G is neither provably equal to ⊤ nor provably equal to ⊥. Again,
this won't bother an intuitionist, who believes that there are other truth values
than ⊤ and ⊥.
We have attempted to construct the world of mathematics from the language
of mathematics, say type theory, by declaring mathematical entities to be closed
terms of the language modulo provable equality: two terms α and β of the
same type are to be identified if the equation α = β is provable. As the above
examples show, this position is incompatible with classical mathematics.
However, it is compatible with intuitionistic mathematics, created by Brouwer
and formalized by Heyting and others.
A reasonable model of intuitionistic type theory is the so-called term model,
alias Lindenbaum-Tarski category. It is also known as the free topos, an
initial object in the category of all elementary toposes, which were introduced by Lawvere (1972). That it is a model in a technical sense, extending
Henkin's non-standard models from classical to intuitionistic type theory,
was shown in Lambek and Scott (1986). It satisfies a number of intuitionistic principles; in particular, it has the following property, which even
the Platonist Gödel would have insisted on: for any formula φ(x), with x of
type N, if ∃x∈N φ(x) is true in the model, then so is φ(0) or φ(S0) or φ(SS0)
or ...
What has been said about the language of mathematics could equally well
be said about the language of any exact science, at any stage of its development. Perhaps the world of physics will ultimately be shown to be a
17-dimensional space with certain properties and so become part of the world
of mathematics. Not wishing this paper to depend on the ultimate success of
a "theory of everything", I will look at the semantics of natural languages
instead.
3. NATURAL LANGUAGES

Can something like the nominalistic interpretation of mathematics be carried
out for a natural language such as English? Thus, we would like to assert
that the everyday world we talk about is also constructed from words, in
particular, that an entity in this world is also a word or string of words modulo
a suitable equivalence relation.¹
At first sight, such a program would seem to be doomed to failure. Consider,
for instance, the sentence
my uncle drank a glass of water.

It would seem that the nouns uncle, glass and water denote entities which exist
quite independently of our language, and even the verb form drank denotes
a past action of some extralinguistic status. A little reflection, however, will
show that the way these entities are structured is entirely language dependent, even though some of them may be made out of molecules or, according
to the latest scientific theory, out of quarks and electrons, and others may
consist of events in space-time.
Let us begin by looking at the noun uncle. This denotes a binary relation,
as in "A is an uncle of B", meaning that "A is a male sibling of a parent of
B or the male spouse of a sibling of a parent of B". To simplify the discussion, let us forget about unclehood by marriage and concentrate on the first
part of the definition: A is a male sibling of a parent, say C, of B. Nothing
is being said about the sex of C; however, in many languages it is necessary
to specify whether C is male or female, as was the case in medieval English.
In some languages, e.g. Hindi (Bhargava and Lambek, 1983), when C is male
we must also specify whether A is older or younger than C. In such a language
there are three words where the single word uncle will do in modern English;
the world has been partitioned more finely than in our language.
On the other hand, there are languages in which no distinction is made
between a mother's brother and her father. In fact, our word uncle is derived
from Latin avunculus, meaning "little (maternal) grandfather". Similarly, our
word nephew is derived from a word meaning "grandson". Even in the world
of Hindi, some partitions are less fine than in ours; for example Hindi does
not distinguish between siblings and cousins.
This is not to say that we cannot translate between Hindi and English. Thus,
Hindi mama can be translated into English "maternal uncle" and English
brother can be translated into a compound Hindi phrase which excludes cousinhood. The point is rather that sometimes Hindi sees a simple structure
where English sees a compound one, and sometimes it is the other way
round.
Next, let us look at the word drank, expressing the past tense of the verb
drink. At first sight, drink seems to refer to the process of consuming a liquid,
whereas we usually eat a solid. But even British English and American English
partition the world differently here. Americans eat soup and drink tea, while
the British drink soup and eat tea, at least when it consists of cucumber sandwiches.
On the other hand, British and American scientists agree that glass is a
liquid, contrary to common sense. Yet, when we drink a glass of water, we
do not consume the glass at all, the glass is merely a vessel for containing
or measuring the substance water. Glasses can be counted, but water must
be measured, a distinction we shall take up in the next section.
The fact that drank is the past tense of drink refers to the passage of time.
In English such reference is obligatory, whereas in some languages, e.g. in
Chinese, it is not necessary to refer to the time when the drinking took place
at all. According to Parmenides, the flow of time belongs to the world as it
appears to us humans; to the "goddess" past and future are all one. Presumably,
the goddess would feel more comfortable with Chinese than with English.
In a later section we shall consider the fact that the verb drink denotes a process
and not a state or a causal action.
4. NOUNS

Not counting names and pronouns, which might be marginally included under
the umbrella of nounhood, English has three kinds of nouns, namely
count nouns:   house, ox, pea, delivery, ... ;
mass nouns:    water, beef, rice, furniture, sadness, ... ;
plural nouns:  police, people, cattle, ..., scissors, glasses, pants, ....

The plural nouns, of course, include also the plurals of count nouns: houses,
oxen, peas, deliveries, .... One way to tell mass nouns from count nouns
is that they don't have a plural; another is that they don't allow an indefinite
article. Notice also

many a house, much water, many houses.


I learned from Brendan Gillon (1992) that the difference between count and
mass nouns in English was officially recognized only in 1909 by Jespersen.
An interesting case is the count noun pea. At one time there was a mass
noun pease, like rice, but later people preferred to think of peas as a plural,
like beans. May one conjecture that ultimately rice will be thought of as a
plural too, like mice?
Many singular nouns may occur as both count nouns and mass nouns,
depending on context, e.g. glass, stone and chicken. Even man, normally a
count noun, may occur as a mass noun when referring to the species, as in man
is mortal, or even to a substance, as when a cannibal prefers man to beef.²
It should be pointed out that the syntactic units we have called "nouns" here
need not be single words, they could be compound expressions. (Unfortunately,
the term "noun phrase" has been preempted for something else.) For example,
we have the count nouns

teacher of English, killer of whales


and the mass nouns

teaching of English, killing of whales.


The distinction between mass nouns and count nouns encapsulates a kind
of folk philosophy, distinguishing between substances to be measured and
things to be counted. We cannot count bread and water, but we can count loaves
of bread and glasses of water. This distinction became a big issue among
early Greek philosophers. Some, like Thales and other Ionian philosophers,
championed substances against things, claiming that everything is made up
from one or more substances. Thales even said there was only one substance,
namely water. Other philosophers, like Democritus, championed things against
substances, claiming that everything is made up of atoms, which can be
counted. Pythagoras went even further: to him numbers, the very instruments
of counting, were the only ultimate reality.
In a spirit of compromise, Aristotle allowed for both substances and things,
exploiting the distinction in his theory of matter and form. Thus, in loaf of
bread and cup of water, bread and water refer to the matter and loaf and
cup to the form. It seems that there was even an element of sexism here: matter
was thought of as feminine and form as masculine, hinting at the contributions of mother and father to procreation respectively.
The debate between those who favoured counting and those who favoured
measuring was extremely important in the early history of mathematics. To
count one uses natural numbers, to measure one uses real numbers, which to
the Greeks were ratios of geometric quantities. Thus, the question was whether
arithmetic or geometry was more basic. Pythagoras was all for arithmetic
and even claimed that all things are numbers, meaning natural or rational
numbers, thus embracing a reductionist philosophy, reducing all science to
mathematics. Not surprisingly, this extremist position faced criticism even in
his lifetime. According to one anecdote, Pythagoras was challenged thus:
"If everything is number, what about friendship?" Pythagoras was said to
have replied with a pun: "A friend is to me as 284 is to 220." This pair of
numbers was called "amicable", a technical term meaning that each is the
sum of the proper divisors of the other.
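The technical term can be checked directly for the pair in the anecdote. The following is a minimal sketch; the function names are hypothetical and the divisor search is the naive one, adequate only for small numbers.

```python
def sum_proper_divisors(n):
    """Sum of the divisors of n that are smaller than n itself."""
    return sum(d for d in range(1, n) if n % d == 0)

def are_amicable(a, b):
    """Two distinct numbers are amicable if each is the sum of the
    proper divisors of the other."""
    return a != b and sum_proper_divisors(a) == b and sum_proper_divisors(b) == a

print(sum_proper_divisors(220))  # 1+2+4+5+10+11+20+22+44+55+110 = 284
print(sum_proper_divisors(284))  # 1+2+4+71+142 = 220
print(are_amicable(220, 284))    # True
```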
A more serious challenge came later, when his disciples discovered that
√2 is not a number in the sense of Pythagoras; translated into modern terminology, that it is irrational. This discovery brought about a victory of those
who championed measuring over counting, of geometry over arithmetic. It thus
came about that geometry dominated Greek mathematical thinking.
Actually, Eudoxos, a pupil of Plato's, discovered that numbers, meaning
rational numbers, would do after all. He pointed out that two real numbers,
namely ratios of geometric quantities, could be compared by comparing the
sets of rational numbers below each of them. He thus anticipated the modern
definition of real numbers by Dedekind.
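The idea of comparing two magnitudes through the rationals below them can be sketched in a few lines. This is an illustration in the modern, Dedekind-style formulation rather than a reconstruction of Eudoxos's own definition, and it makes several assumptions: a real number is represented by a hypothetical predicate telling whether a given rational lies below it, and the search runs over a finite grid of fractions, so it can confirm that one number is smaller than another but can never establish equality.

```python
from fractions import Fraction

def below_sqrt2(q):
    """True if the rational q lies below the square root of 2.
    Only rational arithmetic is used: q < 0 or q*q < 2."""
    return q < 0 or q * q < 2

def below_ratio(p):
    """Lower set of the rational number p, as a predicate on rationals."""
    return lambda q: q < p

def separating_rational(lower_a, lower_b, max_den=50, bound=4):
    """Search for a rational that is below b but not below a.
    Finding one witnesses a < b; returning None proves nothing beyond
    the finite grid searched (max_den and bound are arbitrary limits)."""
    for den in range(1, max_den + 1):
        for num in range(-bound * den, bound * den + 1):
            q = Fraction(num, den)
            if lower_b(q) and not lower_a(q):
                return q
    return None

three_halves = below_ratio(Fraction(3, 2))
witness = separating_rational(below_sqrt2, three_halves)
print(witness)                                          # a rational between sqrt(2) and 3/2
print(separating_rational(three_halves, below_sqrt2))   # None: nothing is below sqrt(2) but not below 3/2
```

The irrational magnitude is never named by a number here; it enters only through the set of rationals lying below it, which is the point of the comparison.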
5. VERBS

Leaving aside auxiliary verbs for the moment, we also find three classes of
verbs in English. Before discussing these, let us look at the corresponding
situation in Latin (Lambek, 1979), where verbs belonging to three distinct
classes are usually linked semantically and morphologically in triples.
A typical example is provided by the triple amo, amavi, amor. Although
these words are usually regarded as parts of the same verb, we may also
think of them as three separate verbs, sharing the same root and having an
element of meaning in common. Each of these three verbs is equipped with
a complete 5 by 6 conjugation matrix, consisting of 30 forms made up from
5 simple tenses and 6 persons. One usually refers to amavi as the perfect
and amor as the passive of amo. (There is also a perfect passive, but this
has to be expressed as a composite made up from the past participle and the
auxiliary verb sum.) Not all triples are complete; intransitive verbs lack a
passive and a few verbs such as memini and nascor possess only the second
or third component respectively.
In English, there is a similar, though less obvious trichotomy. A typical
example is the triple: know, learn, teach. Here we have no common root, yet
three related meanings. We say that know refers to a state, learn to a process
and teach to a causal action. The analogy with Latin is not precise. We may
think of learn as a kind of passive of teach and know as the perfect of learn;
yet it seems more natural to take know as basic and to regard teach and learn
as its causative and process forms respectively. (There are languages, e.g.
Hebrew, where a causative may be formed by a systematic morphological
variation, as is the case for the passive in Latin.)
It would seem that every non-auxiliary English verb, whether transitive
or intransitive, must belong to one of these three classes. Thus, we have
many complete triples, such as:
state      process    action

be         become     make
be dead    die        kill
sit        sit down   seat
have       get        give
know       learn      teach

Depending on the meaning of know, sometimes teach should be replaced by
tell. There are many verbs belonging to one of the three columns without
correlates in the others. For example, drink is a process verb, but no simple
expression refers to the corresponding state of having quenched one's thirst,
nor is there a word for the act of causing someone to drink, where German
has tränken (usually applied to animals).³
As was the case for the three kinds of nouns, the trichotomy "state, process,
causal action" summarizes a kind of folk philosophy. Not surprisingly, the
distinction between states and processes surfaces in early Greek philosophy.
Parmenides champions the exclusive status of states, claiming that change,
which includes process as well as causation, is a human illusion. On the
other hand, Heraclitus asserts the exclusive status of process and causation,
claiming that everything changes; you cannot step into the same river
twice. He even proposes a theory as to what causes change, claiming that
change is brought about by a tension between opposites, a slogan taken up
more recently by Hegel and his Marxist followers, though lately falling into
disrepute.
The analogy
have : get : give
know : learn : teach

is quite systematic. For example, I have a cold or I get a cold may be transformed into you give me a cold or you give a cold to me, while similarly I
know English or I learn English may be transformed into you teach me English
or you teach English to me.
We may think of learn and teach as being derived in meaning from know.
Thus,
learn = get to know,
teach A = make A know.

While teach incorporates the notion of causation, make expresses pure causation. The verb make allows one to transform any state verb or process verb
into an expression denoting causal action. It is not the only verb to do this:
let, help and have perform a similar function, though with different emphasis.
When we say
A makes, lets, helps or has B come,

we attribute the responsibility for B's coming to A in the first case, to B in
the second case and jointly to A and B in the third case, while in the fourth
case there seems to be another person acting as an intermediary.
What these causation verbs have in common is the syntactic property of
not requiring the word to before the infinitive come, as compared with
A tells or causes B to come.

They share this property with a small number of verbs of perception:


A sees, hears or feels B come.

One suspects that there is here a kind of subconscious belief or folk philosophy, according to which the act of seeing brings about the event which is
seen. In fact, the Greek philosopher Empedocles, renowned for having proved
experimentally that air is a substance, proclaimed that light emanates from
the eye. More recently, the belief that perception is a kind of causation was
crucial to the philosophy of Berkeley.
I will skip discussing the auxiliary verbs be and have, although it could
be argued that the former plays a prominent role in the philosophy of
Heidegger. But I will say a few words about the modal verbs:
shall, will, can, may, must,

the first four of which possess grammatical past tenses:


should, would, could, might.

The modal verbs differ from other verbs not only syntactically, by the contexts
in which they appear, but also morphologically, by the fact that the third person
singular does not end in s. Semantically, the modal verbs carry all that English
is capable of saying about the future.
In Latin and many other languages, tenses impose a linear order on time;
but English seems to assume that, while the past is determined, the future is
not: there is one past but many possible futures. This view is reflected in
the recent philosophy of Storrs McCall. In Victorian times, if one wanted to
express the future, one had to make a choice between shall, which carried
a sense of obligation, and will, which carried a suggestion of desire. Not
surprisingly, in our more permissive times, shall is disappearing. There is a
popular attempt to restore a linear future by the introduction of a new modal
verb gonna.
If time is linear, there can be no causality, as was pointed out by Hume.
Physics seems to be in two minds about this question. While the notion of
causality plays no explicit role in the description of nature by mathematical
equations, classical physics allows for a number of possible futures, from which
one is selected by the principle of least action. Quantum mechanics achieves
the same end by invoking the collapse of the wave function.
6. CONCLUSION

Looking at English nouns, we have seen how modern English imposes a
distinction between things to be counted and substances to be measured.
Looking at English verbs, we discovered a basic trichotomy between states,
processes and causal actions. We also noted that English draws an analogy
between causation and perception and that it replaces the simple future by
modalities.
These examples document the not very original observation that the way
we partition the world and categorize the entities in it depends on the language
we speak. Sapir and Whorf have pointed out that this may be different for
different linguistic and cultural communities. Yet, as John Macnamara (1991)
has argued, translation is always possible, although the process of translation may transform a simple concept into a complicated one or vice versa. I
have, in fact, assumed that classical Greek resembles English in its basic
categorization.
It would seem that our language encapsulates a kind of collective folk
philosophy, into which professional philosophers frequently delve as a source
of intuition for their theories or sometimes as a foil for them.
Some philosophers, like the logical positivists, may be quite conscious of
this process. Others, like the existentialists, may do this unconsciously, as
they may not see words and things as distinct in the first place.
One final remark: the title originally planned for this article had been
"In defense of nominalism". However, I was persuaded to change this, as
"nominalism" has been used to describe quite different philosophies, such as
those of William Ockham and Willard V. Quine. Anyway, I do not wish to
be drawn into the debate between "nominalism" and "realism". If challenged,
I would probably argue that, when moderately stated, nominalism and realism
do not contradict one another, just as I have argued about formalism and
Platonism in mathematics.
ACKNOWLEDGEMENTS

I wish to thank Ed Keenan, Brendan Gillon, Bernie Lambek, Michael Lambek
and John Macnamara for their helpful comments on an earlier draft of this
article.
McGill University

NOTES
1 This relation, usually called "synonymity", is not as easily made precise as the relation of
provable equality in mathematics. Roughly speaking, one would wish to call strings Γ and Δ
of English words synonymous if any sentence containing Γ is known to imply and be implied
by the corresponding sentence containing Δ. There are problems with this attempted definition,
which may conceivably be overcome by putting restrictions on the contexts in which Γ appears.
Anyway, the notion of synonymity will play no explicit role in this discussion.
2 In some languages, mass nouns predominate and require a so-called "classifier" before being
modified by a numeral. For example, in Indonesian one must say five tail [of] cow, much as
we may say five head of cattle.
3 Many linguists insist that state verbs do not normally admit a continuous form, as in *he is
knowing her. This is so for know (except in India), but not for sit; thus our class of state verbs
is larger than the usual one. Zeno Vendler, in an important article, uses "state" in this narrower
sense. In place of our trichotomy, he speaks of four "time schemata": states, activities, accomplishments and achievements.

REFERENCES
Anglin, W. S. and Lambek, J., 1995, The Heritage of Thales, Springer-Verlag, New York.
Bhargava, M. and Lambek, J., 1983, 'A Production Grammar for Hindi Kinship Terminology',
Theoretical Linguistics 10, 227-245.
Couture, J. and Lambek, J., 1991, 'Philosophical Reflections on the Foundations of Mathematics',
Erkenntnis 34, 187-209.
Gillon, B. S., 1992, 'Towards a Common Semantics for English Count and Mass Nouns',
Linguistics and Philosophy 15, 597-639.
Lambek, J., 1979, 'A Mathematician Looks at Latin Conjugation', Theoretical Linguistics 6,
221-234.
Lambek, J., 1993, 'Are the Traditional Philosophies of Mathematics Really Incompatible?', Math.
Intelligencer 15, 5~2.
Lambek, J. and Scott, P. J., 1986, Introduction to Higher Order Categorical Logic, Cambridge
U. Press, Cambridge.
Lawvere, F. W., 1972, 'Introduction', in Toposes, Algebraic Geometry and Logic, Springer LNM
274, 1-12.
Macnamara, J., 1991, 'Linguistic Relativity Revisited', in R. L. Cooper and B. Spolsky (eds.),
The Influence of Language on Culture and Thought, Mouton de Gruyter, pp. 45-60.
McCall, S., 1994, A Model of the Universe, Clarendon Press, Oxford.
Vendler, Z., 1957, 'Verbs and Times', The Philosophical Review 66, 143-160.

JEAN-PIERRE MARQUIS

IF NOT-TRUE AND NOT BEING TRUE
ARE NOT IDENTICAL, WHICH ONE IS FALSE?

You've got to
Accentuate the positive,
Eliminate the negative,
Latch on the affirmative,
Don't mess with Mr. In-Between.
(Arlen and Mercer, 1944)

In classical logic, truth and falsity are highly symmetric and this symmetry
is captured by the negation operator which "transforms" truth into falsehood
and vice-versa. Moreover, the negation operation, represented in the semantics by a unary operator on the two-element Boolean algebra, is lifted to the
higher-order operation of complementation on sets. However, if we abandon
bivalence, then symmetry is problematic. For one thing, we are forced to
distinguish not-true, which is a truth-value, from not being true, the
complement of the singleton set consisting of the truth, which is not the false
and not even a truth-value. In classical logic, these two notions collapse into
one. Once they are distinguished, their relationships have to be settled and
in particular the links between complementation and negation have to be
clarified. The purpose of this paper is to explore some of these relationships
in a specific context, namely topos theory.
After a short historical survey on the question in the context of many-valued
logic, we will move to a more general framework where the principle of
bivalence is commonly false and therefore without further ado dismissed: topos
theory. Moreover, in a topos, the passage from an object to the object of its
subobjects is well regimented and thus in particular the passage from the structure of truth values to the structure of collections of truth values is also
well-regimented. Thus, it provides an interesting case study of the relationships one obtains between truth, not-true, not being true, falsehood, not being
false, etc., a topic which goes back to Aristotle. 1 We should immediately point
out that our purpose is not to establish new and profound results in topos theory.
Rather, our goal is simply to point to some interesting asymmetries which arise
in this case and these asymmetries can be exhibited directly in a simple manner,
without any complicated calculations. The main point of this paper, if there
is one, is conceptual, not technical.
Topos theory is particularly appealing for additional reasons: (1) it is well-known that a topos is "the same" as an (intuitionistic) type theory strong enough
to develop a large part of classical mathematics; furthermore the language
contains terms for truth and falsity, which are however propositional types
and not predicates; (2) every topos contains a subobject classifier which is,

M. Marion and R. S. Cohen (eds.), Quebec Studies in the Philosophy of Science I, 79-94.
1995 Kluwer Academic Publishers.

in general, a multivalent truth-structure; (3) in some toposes, it is possible
to have two distinct negations.
In a sense, topos theory provides us with the first natural examples of many-valued truth structures. However, the proper reading of these structures is
still an open problem and we believe it to be important. Urquhart, in his survey
of many-valued logic, complained that "the semantical methods involving relational model structures, 'possible worlds' and the like, which ... have proved
so fruitful in areas like modal logic, seem to have no clear connection with
traditional many-valued logic" (Urquhart, 1986, 100-101). Topos theory is one
framework in which this gap is filled, since most of the above methods which
rely on bivalent, two-valued truth-structures can be naturally extended in
arbitrary toposes in which the truth-structures are in general many-valued. 2

A FEW HISTORICAL REMARKS

In a bivalent semantics, it is impossible to distinguish not-true, or false, from
not being true. In a multivalent context, as in the context of supervaluations,
various possibilities are available, and indeed, some have already been investigated. In these contexts, it is possible to define different operators which seem
to capture essential aspects of various negation operators. For instance, Bochvar
(1938) explicitly defined two distinct negation operators as follows:
    p     ¬p     -p

    T     F      F
    F     T      T
    N     N      T


The first negation is called the "internal negation" and the second is called
the "external negation". Both are truth-functions and simply disagree on their
value for p = N. The external negation can be interpreted as being the characteristic function of the set "not being true", whereas the internal negation
can be thought of as the characteristic function of "not-true". However, in
the latter case, there is another map which could also qualify as being the characteristic map of "not-true": it differs from the internal negation in that it sends
N to F instead of sending it to N. Thus, one and the same subset of our set
of truth-values can have two distinct characteristic maps and hence there is
no bijection between the subsets of our truth structure and the characteristic
functions.
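
The observation that one subset can have two characteristic maps can be checked mechanically. The following Python sketch is only an illustration; the names `internal`, `external`, `alternative` and `classified_subset` are hypothetical, and a map is assumed to "classify" the set of truth-values it sends to T.

```python
# Three-valued truth structure: true, false, and the third value N.
T, F, N = "T", "F", "N"
VALUES = (T, F, N)

# Bochvar's two negations, as described in the text.
internal = {T: F, F: T, N: N}
external = {T: F, F: T, N: T}

# The other candidate mentioned in the text: it agrees with the
# internal negation except that it sends N to F instead of N.
alternative = {T: F, F: T, N: F}


def classified_subset(chi):
    """Return the set of truth-values that the map chi sends to T."""
    return frozenset(v for v in VALUES if chi[v] == T)


if __name__ == "__main__":
    for name, chi in [("internal", internal),
                      ("external", external),
                      ("alternative", alternative)]:
        print(name, sorted(classified_subset(chi)))
```

Under this reading, `internal` and `alternative` are different maps that classify the same subset {F}, while `external` classifies {F, N}, so the assignment of maps to subsets is not one-to-one.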
This is typical of the many-valued approaches: the structure of subsets of
the truth structure is never considered. Indeed, one has the feeling that the
classical Boolean structure is available, in other words that all maps from
the structure to itself are available, and thus all subsets of the truth-structure
are equally important and accessible and that none play any crucial role. This
is rather odd. If the truth structure has to be a truth structure, then there
should be some kind of natural restrictions on the maps allowed. These restrictions should then be reflected in the structure of subsets which will probably
not be a Boolean algebra. This is crucial, for it is then conceivable, for instance,
that the collection "not being true" looks different from the set-theoretical
complement of the singleton {true}.
As far as we know, the only attempts which have been made to incorporate
such considerations into a many-valued framework are those made by fuzzy
logicians. 3
In fuzzy logic, there is some restriction imposed on the intervals of truth
values allowed. It is not the case that any (fuzzy) subset is a truth-"value".
However, it seems that this restriction is arbitrary and is simply a matter of
convenience, for "if we allowed any fuzzy subset of [0, 1] to be a truth-value of FL [a particular fuzzy logic], then the truth-value set of FL would
be much too rich and much too difficult to manipulate" (Bellman & Zadeh,
1977, 107). In order to avoid this difficulty, they pick a countable subset of
intervals of [0, 1] in a way which seems entirely ad hoc and indicates that
the structure of these intervals can be as one pleases. Needless to say, this is
entirely unsatisfactory. We ought to have intrinsic reasons to choose certain
intervals as truth-values and these intervals should have a definite structure,
uniform in some way for all truth-structures. 4 This is precisely what happens
in topos theory. Indeed, by definition, topos theory specifies what the structure of subobjects of any object of the topos should be; in particular, the
structure of subobjects of the subobject classifier, the truth-structure, is given.
Thus the intervals of truth values are not ad hoc and reflect genuine constraints of the surrounding "universe". We believe that any approach to
many-valued truth-structure has to have such specifications. 5

TRUTH-VALUES, CONNECTEDNESS AND SEPARATION

In an extensional framework, logic deals essentially with classifications.
After all, in such a context, a property is identified with its extension and
the logical operations become operations on these extensions. (We ignore questions of size and the paradoxes here.) Within this framework, the laws of
logic are nothing more than laws of classification. The possibility to perform
such classifications rests on two things: (i) we ought to have a uniform way
to individuate objects, that is, find the individuals; (ii) we have to find the
proper principles to group them and, a fortiori, to separate them. In other words,
the possibility to collect objects together is directly related to the possibility
of separating these objects from the others in the "universe". Thus, any comprehension principle will be intimately linked to the part-whole relation existing
between an object and its parts. In the universe of sets, the axiom of separation provides a true separation between the extension of a predicate and the
extension of its negation.

The term 'separation' can even be used in its standard topological sense.
Recall that a topological space X is separated if there exist two non-empty open sets
A and B of X such that A ∪ B = X and A ∩ B = ∅. A space X is said to
be connected if there does not exist a separation of X. Obviously, every set
with the discrete topology is separable in any way one wants. Thus any extension together with its complement yields a separation. However, it is enough
that X be totally disconnected, that is, that its only connected subsets are
one-point sets. Indeed, from the topological point of view, one of the fundamental properties of the semantics of classical logic is that the underlying
universe is totally disconnected.
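
For finite spaces the notions of separation and connectedness can be tested by brute force. The sketch below is a minimal illustration in Python, assuming a space is given as a set of points together with an explicit list of its open sets; the function names are hypothetical, and both halves of a separation are required to be non-empty.

```python
from itertools import combinations


def powerset(points):
    """All subsets of a finite set, i.e. the discrete topology on it."""
    pts = sorted(points)
    return [frozenset(c) for r in range(len(pts) + 1)
            for c in combinations(pts, r)]


def separations(points, opens):
    """List the ordered pairs (A, B) of non-empty open sets with
    A ∪ B equal to the whole space and A ∩ B empty."""
    space = frozenset(points)
    open_sets = {frozenset(o) for o in opens}
    found = []
    for a in open_sets:
        b = space - a  # the only candidate partner of A is its complement
        if a and b and b in open_sets:
            found.append((a, b))
    return found


def is_connected(points, opens):
    """A space is connected when it admits no separation."""
    return not separations(points, opens)


if __name__ == "__main__":
    X = {1, 2, 3}
    print(len(separations(X, powerset(X))))              # discrete space
    print(is_connected({0, 1}, [set(), {1}, {0, 1}]))    # Sierpinski space
```

In the discrete case every proper non-empty subset yields a separation, whereas the two-point Sierpinski space, used here as a contrasting example, admits none.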
Two different types of evidence can be given at this point to support this
claim. The first type comes from general duality results in categorical logic
and the second type comes from considerations in topos theory.
Let us consider first the case of propositional logic. It is well-known that
a theory in classical propositional logic can be transformed into a Boolean
algebra via the Lindenbaum-Tarski construction. Now, the Stone duality
theorem asserts that the category of Boolean algebras with Boolean homomorphisms is dually equivalent to the category of totally disconnected compact
Hausdorff spaces, called Stone spaces. The important point here is that the
property of being totally disconnected plays a crucial role. When one abandons
it and considers the category of compact Hausdorff spaces, then one obtains
a dual equivalence with the category of commutative unital C*-algebras.
However, it is possible, but highly non-trivial, to lift the duality between
boolean algebras and Stone spaces to the (enriched) category of Boolean pretoposes, which captures the "algebras" (categories) generated by first-order
theories and the (enriched) category of ultragroupoids. Unfortunately, at this
level, the topological set-up cannot be lifted as easily. However, it is interesting to note that the category playing the role of the category of Stone spaces
is a groupoid. 6 A groupoid is in a sense totally disconnected. To see how, let
us turn to the second type of evidence.
The argument for this second type of evidence provides at best indications that there is something to be understood better. First, we define what a
totally disconnected category is. A category C is totally disconnected if for
any two distinct objects X, Y of C, there is no morphism between X and Y,
i.e. Hom(X, Y) = ∅. Hence, a totally disconnected category is simply a disjoint
union of monoids. Thus every discrete category is totally disconnected but
there are totally disconnected categories which are not discrete. We now need
the following totally trivial observations:
LEMMA. Any groupoid C is equivalent to a totally disconnected category
D.
Proof. The proof follows immediately from the fact that any category is
equivalent to one of its skeletons and that the skeleton of a groupoid is totally
disconnected (and not in general discrete), since it is a disjoint union of nonisomorphic groups.

LEMMA. Every object of the skeleton of a groupoid, together with its endomorphisms, is a group.
Proof. Since the inclusion functor of the skeleton is full and faithful, and since it is
the skeleton of a groupoid, every endomorphism has an inverse, and thus each object with its endomorphisms forms a
group.
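
The two lemmas can be illustrated on a small finite groupoid. The Python sketch below is only an illustration under stated assumptions: a groupoid is represented by its objects and a dictionary sending each (hypothetically named) morphism to its source and target, every morphism is assumed invertible, and composition is not modelled because it is not needed to find the skeleton.

```python
def skeleton(objects, morphisms):
    """Pick one object per isomorphism class of a finite groupoid and keep
    the full subcategory on these representatives.

    morphisms maps a morphism name to a pair (source, target). Since every
    morphism of a groupoid is invertible, two objects are isomorphic exactly
    when some morphism joins them directly or through a chain.
    """
    parent = {x: x for x in objects}

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for src, tgt in morphisms.values():
        a, b = find(src), find(tgt)
        if a != b:
            parent[max(a, b)] = min(a, b)  # keep the smallest name as root

    reps = sorted({find(x) for x in objects})
    kept = {name: (s, t) for name, (s, t) in morphisms.items()
            if s in reps and t in reps}
    return reps, kept


def is_totally_disconnected(morphisms):
    """True when no morphism joins two distinct objects."""
    return all(s == t for s, t in morphisms.values())


# Example: a and b are isomorphic; c carries an extra involution s.
OBJECTS = ["a", "b", "c"]
MORPHISMS = {
    "id_a": ("a", "a"), "id_b": ("b", "b"), "id_c": ("c", "c"),
    "u": ("a", "b"), "u_inv": ("b", "a"),
    "s": ("c", "c"),
}

if __name__ == "__main__":
    reps, kept = skeleton(OBJECTS, MORPHISMS)
    print(reps, sorted(kept), is_totally_disconnected(kept))
```

In this example the skeleton keeps the objects a and c; the only surviving morphisms are endomorphisms, so the skeleton is totally disconnected but not discrete, because c retains its non-identity automorphism s.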
PROPOSITION. If a category D is totally disconnected and such that each
of its objects with its morphisms forms a group, then Set^(D^op) is boolean, i.e.
its internal logic is classical.
Proof. It is a general fact that whenever C and D are equivalent categories, then so are E^(C^op) and E^(D^op), for any category E. It is also well-known
that C is a groupoid if and only if the topos of presheaves Set^(C^op) is boolean.
By the lemmas and the general fact, we obtain that whenever D satisfies the
properties of the hypothesis, then the topos Set^(D^op) is boolean.
This technically trivial result should be interpreted as follows. Conceptually,
it means that whenever our universe of discourse contains types, each of which
made up of disjoint extensions, that is, which do not have any properties in
common, thus categories in the Aristotelian sense of that expression, and
that every property of a type is obtained by an equivalence relation, then the
logic is classical. It is important to see that both properties are necessary.
For instance, it is well-known that Set^M is not boolean when M is a monoid
other than a group and it is trivial to give an example of a category C whose
objects with their endomorphisms are groups but which is not totally disconnected and such that Set^(C^op) is not boolean.
Of course, this covers only the case of toposes of presheaves and leaves
entirely open the question for toposes of sheaves. 7

CONCEPTUAL ASYMMETRIES IN SOME TOPOSES

For our purposes, it is not necessary to go deeply into topos theory. Our goal
is not to establish general results about toposes. We want to use some toposes
in order to display curious asymmetries and try to make some general conceptual remarks from them.
DEFINITION. A topos E is a category with all finite limits equipped with a
function P which assigns to each object B of E an object PB of E, and for
each object A of E, an isomorphism, natural in A,
Sub(B × A) ≅ Hom(A, PB),
where Sub(-) is a functor from E to the category of sets which associates to
each object A its lattice of subobjects.
The important point here is that what the second condition guarantees is
precisely the existence of a function expressing internally what is the structure of subobjects of an object of the topos. Thus a category is a topos whenever
the structure of subobjects of each object of the category can be determined
internally (and naturally).
One can think about toposes as a generalization of the universe of sets. A
topos can be seen as a universe of objects which are sufficiently set-like to
share with the universe of sets some of its "basic" structural properties, in
fact enough of it to develop most of classical mathematics. This is precisely
what the axioms of elementary toposes guarantee. In the universe of sets,
the elements are bare individuals, they are "abstract points". In some toposes,
we allow the elements to have a "geometric" structure, that is to say, to have
an internal structure or to satisfy a specified constraint: there is a global law
determining the admissible forms of the objects. Thus, there is a constitutive
principle, an underlying pattern determining the structure of the individuals.
In other words, the "sets" of a topos are typed, they are "sets" for which
"such and such" is the case, the "such and such" being specified by geometric "rules". The examples below will illustrate and hopefully clarify these
mysterious remarks.
But before we look at these examples, let us recall some elementary facts
about negation and complementation in an arbitrary topos.
PROPOSITION. In any topos E, the negation operator ¬: Ω → Ω is defined
by the pullback:

    1 ----------> 1
    |             |
  ⊥ |             | T
    v             v
    Ω ----------> Ω
          ¬

Negation is then used to obtain the complement of a subobject by composing a characteristic map with the negation map and then pulling back.
Thus one lifts the external negation, the propositional operator, to the inside
and applies it to predicates, that is, subobjects, as is done in set theory. So
can we apply this procedure to the predicate "not being true"? Intuitively, it
should yield a subobject of Ω, the subobject classifier, which contains everything but the whole truth. Thus, what we are looking for is an endomorphism
of Ω which sends everything different from truth to the truth, so that
when we pull back along the truth we get the required subobject. Again, in
a set-theoretical framework, this is trivial and traditionally in many-valued
logic, it is not even a question. However, in the framework of toposes, the
question is far from being trivial, as we will now try to show.
First, let us show that in some toposes, truth and falsehood are not on a
par. To affirm something, one can always use the truth, that is, pull back
along truth. However, one cannot in general affirm that something is false,
which should be done by pulling back along the false. The latter operation
cannot always be performed in a topos. Thus in general, one can only deny
that something is true. We will illustrate this by displaying the situation in what
is probably the simplest example, the topos of pointed loops.
In this topos, usually denoted Set^→, the objects are described by the following "schema": f: X → Y, where f is simply a set function and X and Y
denote collections. Thus an object in this topos is in a sense a function, or a
rule. The collections X and Y are not by themselves objects of the topos. There
are two intuitive and heuristic ways of thinking about these objects. The
first way is to conceive of an object f: X → Y as a functional data base,
where X and Y are lists of objects of (possibly) different types. The second
illustration is more geometric. An element of X is a loop without a base
point. Thus the elements of X are homeomorphic to the circle S^1. The elements
of Y are simply points. What the function f does is to transform the loops
into "pointed" loops, that is, loops with a base point. From this point of view,
the individuals are of two kinds: either bare points or "decorated" points,
that is, a collection of loops fixed to a base point. Thus, the elements of our
sets are not bare individuals anymore, but rather individuals possibly satisfying
a certain type of properties, more specifically, a rule-type property. An object
of the topos can be represented thus:

[Figure: an object X of the topos.]

An arbitrary element of an object of this topos is really a quotient of pointed
loops, i.e. it is homeomorphic to a quotient of pointed circles. (This follows
from the fact that any functor of this category, thus any object of the topos,
is a colimit of representable functors.)
Since a topos is a category, we have to describe the morphisms, the "links"
between the objects. Given two objects f: X → Y and g: U → V, a morphism
from f to g is a pair (h1, h2) of functions h1: X → U and h2: Y → V such
that h2 f = g h1. In our illustration, what this equality guarantees is that loops
are sent to loops, points are sent to points and these maps are consistent with
one another. From now on we will denote the objects of this topos by X, Y,
Z, . . . and morphisms f, g, h, . . . .
The terminal object of this topos, denoted by 1, consists of a unique pointed
loop. A point of an object X is given by a morphism from the terminal object
1 into X. By definition of a morphism, such a map picks out a pointed loop
in X. Thus, if X does not contain any loop, X has no points even though it
is not empty.
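
The objects, morphisms and points of this topos are easy to model for finite examples. The following Python sketch is an illustration only: an object is assumed to be a triple (loops, base points, f) with f a dictionary sending each loop to its base point, a morphism is a pair of dictionaries (h1, h2) subject to the commuting condition h2 ∘ f = g ∘ h1, and all names are hypothetical.

```python
# The terminal object: one loop "*" attached to one base point "o".
TERMINAL = ({"*"}, {"o"}, {"*": "o"})


def is_morphism(source, target, h1, h2):
    """Check that (h1, h2) sends loops to loops, points to points, and
    satisfies h2(f(x)) == g(h1(x)) for every loop x of the source."""
    X, Y, f = source
    U, V, g = target
    well_typed = (all(h1[x] in U for x in X)
                  and all(h2[y] in V for y in Y))
    return well_typed and all(h2[f[x]] == g[h1[x]] for x in X)


def points(obj):
    """All morphisms from the terminal object into obj. Choosing the image
    of the loop "*" forces the image of the base point "o"."""
    X, Y, f = obj
    return [({"*": x}, {"o": f[x]}) for x in sorted(X)]


# Two loops attached to p, and a bare base point q.
EXAMPLE = ({"a", "b"}, {"p", "q"}, {"a": "p", "b": "p"})
# A non-empty object with no loops at all.
BARE = (set(), {"p"}, {})

if __name__ == "__main__":
    print(len(points(EXAMPLE)), points(BARE))
```

The object `BARE` has no points although it is not empty, which is the situation described in the text for an object without loops.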
Let us now turn to the subobject classifier. Recall that from the axioms
of topos theory, it can be shown that for a universe of objects to qualify as a
topos, it has to have an object Ω and a chosen subobject, the true or T, which
allows us to classify the subobjects of an object of our universe. This means
that we have to find an object of the topos with a morphism T: 1 → Ω such
that for any object X, there should be a bijection between the subobjects Y
of X and the maps X → Ω. The idea here is that Y should be the inverse image
of the true in X. In order to see how this subobject classifier can be depicted
in our topos, we will examine directly an arbitrary object X of our universe
and see what its subobjects are.
Consider the object X displayed above. What are its subobjects? The geometric representation suggests the rule immediately: a subobject of X can have
base points with or without loops but no loops without base points. This is
natural if we think of the loops as representing given "background" properties or restrictions. Since there cannot be properties without objects, if a loop
is included in a subobject Y, automatically its base point has to be part of Y,
but a base point can be included without any of the loops attached to it, that
is, we can forget about the restricting properties. This is all we need to construct the subobject classifier Ω. Let us now pick a subobject Y of X above,
as follows:

[Figure: the subobject Y of X.]

It is clearly a legitimate subobject of X.

To construct the subobject classifier Ω, we reason thus: we need two points
to separate the base points in Y from the others excluded. So we will have a
base point 'i', for 'in', and a base point 'o', for 'out'. Now the loops. For
the loops in Y, we need a loop, which we will call T, for 'the true', with
base point i. For the loops with base point outside of Y, we need a loop with
base point o, which we will call ⊥. However, we are not done. What are we
going to do with the loops with base points in Y but which are not in Y? In
a sense, they are almost in Y. We can think of those as being elements which
would be in Y if they did not have these properties represented by the loops
outside of Y. They can be thought of as members of a club which have still
not fully paid their dues, or as members with a slightly different status. Thus
they are in the club, but we want to keep track of them since they cannot benefit
from all the privileges of a full membership. Therefore, we have to take a
third loop, which we will call 't', with base point i. The subobject classifier
obtained this way can be depicted thus:

[Figure: the subobject classifier Ω, with the loops T and t attached to the base point i and the loop ⊥ attached to the base point o.]

Hence we can now define a morphism from X to Ω such that Y is the inverse
image of T.
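
The construction just described can be written down directly for finite objects. The Python sketch below is an illustration under assumptions: an object is a triple (loops, base points, f), a subobject is a pair (sub-loops, sub-points) closed under f, the label "bot" stands for the loop ⊥, and the function names are hypothetical.

```python
# The subobject classifier: loops T and t sit on the base point i,
# the loop bot sits on the base point o.
OMEGA = ({"T", "t", "bot"}, {"i", "o"}, {"T": "i", "t": "i", "bot": "o"})


def classify(obj, sub):
    """Characteristic map of a subobject, as a pair (map on loops,
    map on base points) into OMEGA."""
    X, Y, f = obj
    SX, SY = sub
    # A subobject may not contain a loop without its base point.
    assert SX <= X and SY <= Y and all(f[x] in SY for x in SX)
    chi_points = {y: ("i" if y in SY else "o") for y in Y}
    chi_loops = {}
    for x in X:
        if x in SX:
            chi_loops[x] = "T"      # the loop is in the subobject
        elif f[x] in SY:
            chi_loops[x] = "t"      # only its base point is: "almost in"
        else:
            chi_loops[x] = "bot"    # neither loop nor base point is
    return chi_loops, chi_points


def pullback_of_true(obj, chi):
    """Inverse image of the true: loops sent to T, base points sent to i."""
    X, Y, f = obj
    chi_loops, chi_points = chi
    return ({x for x in X if chi_loops[x] == "T"},
            {y for y in Y if chi_points[y] == "i"})


# Loops a and b on p, loop c on q; the subobject keeps a and p only.
EXAMPLE = ({"a", "b", "c"}, {"p", "q"}, {"a": "p", "b": "p", "c": "q"})
SUB = ({"a"}, {"p"})

if __name__ == "__main__":
    chi = classify(EXAMPLE, SUB)
    print(chi, pullback_of_true(EXAMPLE, chi))
```

In this example the loop b, whose base point is kept while b itself is not, is sent to t, and pulling back along the true recovers the original subobject.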

It is now very easy to show that we cannot always pull back along the
morphism ⊥: 1 → Ω, which would amount to the affirmation that something
is false. Simply take X and Y as above. One immediately sees that the following
square cannot be completed:

[Diagram: the square over X and ⊥: 1 → Ω that cannot be completed.]

We now face the obvious question: which are the toposes for which the truth
and the false are on a par, or rather in which affirmation and denial are on
the same footing, e.g., the topos of sets? To obtain the answer, we first need
the following

LEMMA. In any topos E, there is an object, which we call Ω^op, which satisfies the basic property of the subobject classifier with the difference that the
pullbacks are taken along ⊥: 1 → Ω^op instead of T: 1 → Ω.
Proof. The following diagram is a particular case of the basic definition:

    Hom(A, P(1))  ---- Φ(A, 1) ---->  Sub(A)
         |                              |
         |                              |
    Hom(A', P(1)) ---- Φ(A', 1) ---->  Sub(A')

Usually, we let Φ be the identity map. But, as Barr & Wells (1985) point out,
we could be perverse and let Φ of an element of the power object be its complement. By doing so, we obtain what we call Ω^op. Ω^op has the following
basic properties: ⊥ = φT and T = φ⊥, where φ is an anti-isomorphism,
which are obvious from the definition and immediately imply the claim.
The last isomorphism of the proof is an anti-isomorphism. In some toposes,
the two structures are also isomorphic and in these cases, truth and falsehood are on a par.
PROPOSITION. Let E be a topos. Then for each monomorphism Y → X, there
is a unique map α_Y: X → Ω such that

    Y ----------> 1
    |             |
    |             | ⊥
    v             v
    X ----------> Ω
         α_Y

is a pullback if and only if Ω ≅ Ω^op.

Proof. Suppose that Ω ≅ Ω^op. Thus Ω^op is also a subobject classifier. Clearly,
T: 1 → Ω^op corresponds to ⊥: 1 → Ω. But since Ω^op is a subobject classifier, the proposition is true for T: 1 → Ω^op and therefore it holds for ⊥: 1
→ Ω. Now suppose that for each monomorphism Y → X, there is a unique
map α_Y: X → Ω satisfying the appropriate condition. This is equivalent to
saying that T: 1 → Ω^op is also a subobject classifier and therefore that Ω ≅ Ω^op,
as required.
Notice that this includes more than the toposes in which the subobject
classifier has a boolean structure. It includes toposes for which the subobject classifier has a bi-Heyting structure, that is, both a Heyting and a co-Heyting
algebra.
In the topos of pointed loops, we can pick out "not being false". Indeed,
consider the object Nf:

[Figure: the object Nf.]

There is an obvious monomorphism from Nf into Ω, sending the loops of
Nf to the loops T and t. Thus it is possible to separate a subobject of an
object X corresponding to not being false: given a characteristic map χ_Y: X
→ Ω, simply take the pullback along F: Nf → Ω. This yields a
subobject of X of which we can say that it contains all the elements of X
for which the property is not false. Of course it includes the individuals for
which it is true, but it clearly includes more. In a sense, what we are doing
here is to forget about the distinction between full members and partial members.
Notice that the characteristic map of Nf, χ_Nf: Ω → Ω, is what is called a
Lawvere-Tierney topology in a topos. 8 In fact, it is the so-called ¬¬-topology,
as can be easily checked.
We can also separate the not being true by taking the following object Nt:

[Figure: the object Nt.]

Again, there is an obvious monomorphism from Nt into Ω, sending the loops
to t and ⊥. As above, given a characteristic map χ_Y: X → Ω we can take
the appropriate pullback to separate the not being true as a subobject of X.
However, the characteristic map associated with the not true, that is the map
χ_Nt: Ω → Ω, is not a topology, since, for one thing, it does not send T to T.
This is one significant difference between the two: with the not-false, we
can form the topos of the "not-false"-sheaves, whereas we cannot form the
"not being true"-sheaves. However, the fundamental difference here is that
the subobject not-false is connected whereas the subobject not being true is
not. Now, this is a profound asymmetry between the two: there is a separation of the not being true whereas no such separation exists for the not
false.
Finally, we can also separate the subobject of elements which are almost
in a given subobject Y of X. Simply pull back along the morphism t: 1 → Ω,
which, as its name indicates, picks out the loop t. As in the last case, its
characteristic map χ_t: Ω → Ω is not a topology, since it sends T to t. Now,
the interesting thing is that none of the characteristic maps we have defined
above could reasonably qualify as a negation operator, since none of them sends
T to ⊥ and ⊥ to T.
It is impossible to define a morphism Ω → Ω such that it sends T to ⊥,
and t and ⊥ to T. The reason should now be obvious: it is precisely because
the not being false is connected and the fact that, by definition, morphisms
send connected subobjects to connected subobjects. Thus it is impossible to
separate T and t. Notice also that in this topos, the subobject classifier is
separable into the false and the not being false.
However, a different complement operation can be constructed in this topos
by pulling an arbitrary map χ: X → Ω along not being true thus:

    -Y ---------> Nt
     |             |
     v             v
     X ----------> Ω
           χ

It is possible to internalize this operation in the following manner. Since Nt
is a subobject of Ω, we can take its characteristic map, χ_Nt. Then to obtain
-Y, we simply compose χ_Nt with χ_Y and form the appropriate pullback. Thus,
χ_Nt can be considered to be a different negation operator. Some interesting
aspects of this operation and its related negation operator have to be pointed
out.
(1) It is obvious from the definition that -(-) can be read as "-p is true if
and only if p is not true", whereas the standard negation operator is "¬p is
true if and only if p is false". Of course, these are different if and only if
not being true is not the same as being false, which is our starting point here.
(2) What -(-) does is to forget about approximations: if an object fails to
be in Y because of one property, then it is enough to conclude that it should
not be in Y at all, and therefore it is in -Y. ("If you haven't paid your dues,
you are not a member (but we do keep in mind that you are almost one).")
(3) As a negation operator, χ_Nt is not "normal", to use Rescher's (1969)
terminology, since it sends T to t and not to ⊥. In other words, -T is almost
true. This, in turn, can be interpreted as follows: when we assert "x is -Y",
it is very easy to be right. One can easily convince oneself that there ought
to be such a negation operator in the context of certain types of approximations. For instance, in the context of numerical approximations, when one
asserts that the value of a certain property is not equal to a given value, one
is almost certain to be correct. Bunge (1981) gives the example of a prudent
person who is asked to give the age of a friend and answers "John is not 30
years old". In this kind of circumstance, it is much easier to be correct with
a negative sentence than with an affirmation.
(4) It is easy to prove that Y ∪ -Y = X and that -(Y ∩ -Y) = X, thus
this negation "satisfies" the law of excluded middle and the law of (non-)contradiction. Thus, -Y can be thought of as the closest we can get to a logical
contradictory in this context, whereas ¬Y should be thought of as a logical
contrary, since Y and ¬Y are mutually exclusive but need not be mutually
exhaustive.
(5) In fact, it can be shown that in every topos of presheaves, the lattice
of subobjects of an object is a bi-Heyting algebra, hence that there is such an
operation of complementation.
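
The claims in (4) can be checked on a small example in the topos of pointed loops. The Python sketch below is an illustration only and rests on one reading of the construction: an object is a triple (loops, base points, f), the complement -Y is assumed to consist of the loops not in Y together with all base points (the inverse image of Nt), and the standard negation ¬Y is assumed to consist of the loops and base points lying entirely outside Y. The function names are hypothetical.

```python
def union(a, b):
    return (a[0] | b[0], a[1] | b[1])


def meet(a, b):
    return (a[0] & b[0], a[1] & b[1])


def complement(obj, sub):
    """-Y: every loop that is not in Y, with all the base points."""
    X, Y, f = obj
    SX, SY = sub
    return (X - SX, set(Y))


def negation(obj, sub):
    """¬Y: the loops whose base point is outside Y, with those points."""
    X, Y, f = obj
    SX, SY = sub
    return ({x for x in X if f[x] not in SY}, Y - SY)


# Loops a and b on p, loop c on q; the subobject keeps a and p only.
EXAMPLE = ({"a", "b", "c"}, {"p", "q"}, {"a": "p", "b": "p", "c": "q"})
WHOLE = (EXAMPLE[0], EXAMPLE[1])
SUB = ({"a"}, {"p"})

if __name__ == "__main__":
    minus = complement(EXAMPLE, SUB)
    neg = negation(EXAMPLE, SUB)
    print(union(SUB, minus) == WHOLE)                        # excluded middle
    print(complement(EXAMPLE, meet(SUB, minus)) == WHOLE)    # -(Y ∩ -Y) = X
    print(meet(SUB, neg), union(SUB, neg) == WHOLE)          # contrary only
```

On this example Y and -Y together exhaust X, while Y and ¬Y are disjoint but miss the loop b, which is "almost in" Y.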
These latter considerations suggest that it might always be possible to define
the subobject "not being true" and therefore this new negation corresponding
to the co-Heyting complement. However, this is to forget the richness and variety
of toposes. We will now exhibit a different topos in which we have a complex
truth-structure but in which it is impossible to define that subobject "not
being true". 9 An object of this topos is a set X with an endofunction f: X →
X such that f^2 = f. Thus an object of that topos can be depicted in the following manner:

[Figure: an object (X, f) of this topos, drawn as points joined by arrows.]

An arrow between two points indicates that they are related by f and that
this relation goes in the direction of the arrow, i.e. x → y means that y =
f(x). Informally, the objects of this topos are collections of elements which
"evolve" in one step and then are stable (with the possibility of elements
evolving into one another in a loop, which is not represented above).
A subobject S of an object (X, f) in this topos is a subset S closed under
the endofunction restricted to it. The construction of the subobject classifier
is now immediate. It can be depicted thus:

[Figure: the subobject classifier of this topos.]

An interesting question about this truth-structure is whether it is trivalent or
bivalent. Recall that a topos is said to be bivalent whenever 1 has only two
subobjects. This is equivalent to the claim that Ω has only two points, i.e. there
are only two morphisms 1 → Ω. Thus, the subobject classifier has two points
and is in this sense bivalent. However, it is not boolean. This is one of the
simplest examples of a non-boolean bivalent topos.
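
Both features, bivalence and the failure of booleanness, can be observed on small objects of this topos. The Python sketch below is an illustration under assumptions: an object is a finite set with an idempotent endofunction given as a dictionary, a subobject is a subset closed under that function, and the function names are hypothetical.

```python
from itertools import combinations


def subobjects(X, f):
    """All subsets of X that are closed under the endofunction f."""
    items = sorted(X)
    result = []
    for r in range(len(items) + 1):
        for c in combinations(items, r):
            s = frozenset(c)
            if all(f[x] in s for x in s):
                result.append(s)
    return result


def has_complement(X, f, s):
    """True when some subobject is disjoint from s and covers X with it."""
    whole = frozenset(X)
    return any(s | c == whole and not (s & c) for c in subobjects(X, f))


# The terminal object: a single fixed point.
ONE = ({"*"}, {"*": "*"})
# An element x which evolves in one step into the stable element y.
STEP = ({"x", "y"}, {"x": "y", "y": "y"})

if __name__ == "__main__":
    print(len(subobjects(*ONE)))                     # subobjects of 1
    print([sorted(s) for s in subobjects(*STEP)])
    print(has_complement(STEP[0], STEP[1], frozenset({"y"})))
```

The terminal object has exactly two subobjects, whereas in the second object the stable part {y} has no complement, because {x} alone is not closed under f.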
Moreover, we cannot consider the subobject of elements for which something is not true. Indeed, it is impossible to find a subobject of Ω which
represents not being true, though there is a subobject representing the not being
false.
It is in fact easy to generate infinitely many toposes with this property.
Simply let the objects of a topos be pairs (X, f) such that fn = f, for some
n E N. When n = I, this gives us the topos of evolutive sets.
The moral is clear. For toposes of presheaves, we have two operations of
complementation, one of which can always be transferred to the whole proposition, or to use a pictorial language, moved inside out and outside in, whereas
the other cannot in general be moved outside, that is become a propositional
operator. This is a formal representation of the situation where "this man is
unhappy" is not equivalent to "this man is not happy" or "it is not true that
this man is happy". It is then natural to ask under what conditions the complementation can be brought down and be related to a negation operator.
This is equivalent to inquiring whether the subobjects Nf and Nt can be
defined internally, which, in turn, is equivalent to asking whether certain
morphisms exist in Hom(Ω, Ω). This task is more intricate than it appears at
first, and in fact we are at present incapable of providing a complete and
satisfactory general answer. Consider, for instance, the case of a simple boolean
topos: Set², where 2 is a two-point set. It is easy to show that the subobject
classifier of this topos is boolean, has four truth-values, hence is not bivalent,
and that it has 16 subobjects. We would thus expect that some of these 16
subobjects represent predicates such as "neither true nor false", "not being
true", "not being false", etc. However, none of these predicates is represented
by the subobjects. It takes a while to convince oneself that this is as it must
be: indeed, in a boolean topos, we are dealing with classical logic, and hence
not-true and not being true should be identical, as indeed they are in this
case, no matter how rich the lattice of subobjects of the subobject classifier
is.
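The counts quoted for this topos can be reproduced mechanically. The Python sketch below rests on one assumption that should be stated plainly: an object of Set² is treated as a pair of sets, with everything computed componentwise, so that the subobject classifier is the pair ({0, 1}, {0, 1}). The names `powerset`, `points`, `subobjects` and `complement` are illustrative only.

```python
from itertools import combinations, product

def powerset(s):
    """All subsets of a finite set, as frozensets."""
    elems = sorted(s)
    return [frozenset(c) for r in range(len(elems) + 1)
            for c in combinations(elems, r)]

# Assumption: the subobject classifier of Set^2, taken componentwise.
omega = ({0, 1}, {0, 1})

# A point 1 -> Omega picks one element in each component.
points = list(product(*omega))
# A subobject of Omega is a pair of subsets, one per component.
subobjects = list(product(powerset(omega[0]), powerset(omega[1])))

def complement(sub):
    """Componentwise complement of a subobject of Omega."""
    return tuple(frozenset(full) - part for full, part in zip(omega, sub))

print(len(points), len(subobjects))
print(all(complement(s) in subobjects for s in subobjects))
```

Under this coding there are four points and sixteen subobjects, and every subobject has a complement, which is what being boolean without being bivalent amounts to here.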
This example brings us to the following observation: we have seen above
an example of a topos which is bivalent but not boolean, and the last example
provides a very simple example of a boolean topos which is not bivalent. It
is well-known that a well-pointed topos is both two-valued and boolean. This
is conceptually very significant. A topos is said to be well-pointed if the
terminal object 1 generates the topos, which means that whenever f ≠ g:
A → B, there is an arrow u: 1 → A such that fu ≠ gu. This, in fact, tells a
lot about the structure of the objects of the topos. Indeed, it can be shown
that to be well-pointed is equivalent to the claim that every object A of the
topos is isomorphic to the coproduct of 1 indexed by Hom(1, A). In other
words, every object is made of the disjoint union of copies of a given basic individual, the terminal object, and thus can be decomposed "at will", just as in Sets.
It is in the general case that a deep asymmetry lies. It is always possible
to define a subobject of Ω which captures the predicate "not being false":
indeed the subobject classified by the ¬¬-topology always includes the truth
and in general much more. However, as we saw above, it is not in general
possible to find such a morphism for the predicate "not being true", unless
one is satisfied with the negation ¬, which always yields only the false. Be
that as it may, it is rather strange to draw as a moral that in general not-false
and not being false are not identical, whereas not-true and not being true are
identical. It would be more interesting to come to the conclusion that somehow
the notion not being true is not extensional, whereas being false is. But we
are a long way from this situation.
CONCLUSION

Until the advent of topos theory, many-valued logics were developed along
two complementary lines: the arithmetic and the algebraic.10 In the arithmetic framework, the truth structure is thought of as a collection of points with
a linear structure. The algebraic framework is obtained by replacing the linear
ordering by an appropriate partial ordering. In both cases, negation is a specific
endomorphism of the truth structure. However, in both cases, it seems to be
assumed that the predicate "not being true" can be represented by the set-theoretical complement of the singleton set {T} and thus, whereas negation
is a morphism of a specific type, complementation is a purely set-theoretical
operation.
In the topos theoretical framework, we have for the first time genuinely geometric truth structures. This has profound conceptual implications. For instance,
the notion of a truth-value is not as straightforward as in the case of arithmetic or algebraic structures. Indeed, the latter structures are constructed
from sets of truth-values, whereas the subobject classifier of a topos is an object
with a specific universal property which determines its internal structure. There
are always at least two points in a subobject classifier, the truth and the false.
However, there are toposes in which these are the only points of the subobject classifier, for instance the topos of evolutive sets, even though the
subobject classifier contains more "parts". These other "truth-values" are in
fact subobjects of Ω, not points of Ω. Hence, such a topos is two-valued.
What is queer is that despite the fact that it is two-valued, not being false is
not the same as true in this topos and the internal logic of such a topos is
not classical. As we have said above, for not being false to be the same as true,
the topos has to be boolean, which is totally independent of the number of
truth-values. Thus, it is a structural property and not a combinatorial one.
In a many-valued truth-structure, we recover immediately a traditional
logical distinction between negation as a term operator and negation as a propositional operator. The standard generalization of negation as an endomorphism
of the subobject classifier captures only the propositional operator. When we
consider the complementation operations not being true, not being false and
the like, we try to model the term operator. These two models should be
harmonized in the semantics and it should be possible to reflect the distinction in the syntax. The first thing we have to clarify is the way these two
operations should be linked and formalized. We have tried to take the first steps
in this direction in this paper and we hope that it will elicit further explorations
in the near future.
ACKNOWLEDGEMENTS

Much of this paper originated from discussions with G. Reyes and H.
Zolfaghari, whom I would like to thank. I would also like to thank J. Lambek
for his comments and criticisms. Needless to say, I am solely responsible for
the errors of this paper. I would finally like to thank the Université de Montréal,
the SSHRC of Canada and FCAR of Quebec for their financial support.
Université de Montréal
NOTES
1 See Horn, L. R., 1989, A Natural History of Negation, Chicago: The U of Chicago Press
for a discussion.
2 We will not discuss these questions and relationships here, but see Goldblatt, R.I., 1979, Topoi:
The Categorical Analysis of Logic, Amsterdam: North-Holland, Bell, J.L., 1988, Toposes and
Local Set Theories, Oxford: Oxford U Press, McLarty, C., 1992, Elementary Categories,
Elementary Toposes, Oxford: Oxford U Press or Mac Lane, S. & Moerdijk, I., 1992, Sheaves
in Geometry and Logic, New York: Springer for instance.
3 See for instance Zadeh, L., 1975, 'Fuzzy logic and approximate reasoning', Synthese 30,
407-428, and Bellman, R. E. & Zadeh, L., 1977, 'Local and Fuzzy Logics', Modern Uses of
Multiple-Valued Logic, J. M. Dunn & G. Epstein (eds.), Reidel, 103-165.
4 We should point out that Höhle's introduction of weak toposes and Stout's introduction of
footings might provide the proper framework to develop fuzzy set theory on a sound basis.
See Höhle, U., 1991, 'Monoidal closed categories, weak topoi and generalized logics', Fuzzy
Sets and Systems 42, 15-35 and Höhle, U. & Stout, L.N., 1991, 'Foundations of fuzzy sets', Fuzzy
Sets and Systems 40, 257-296.
5 We should also point out that the category of fuzzy sets with a certain natural class of morphisms does not form a topos. See Barr, M., 1986, 'Fuzzy Set Theory and Topos Theory', Bull.
Can. Math. Soc. 29, 501-508 and Pitts, A., 1982, 'Fuzzy sets do not form a topos', Fuzzy Sets
and Systems 8, 101-104 for the argument. However, if one modifies the presentation slightly,
which amounts to making the notion of equality fuzzy, then it turns out that the category is a
topos. For alternative categorical explorations of fuzzy subsets, see Mawanda, M.
M., 1989, 'On a categorical analysis of Zadeh generalized subsets of sets 1', Categorical Algebra
and Its Applications, F. Borceux, ed., LNM 1348, New York: Springer-Verlag and Rodabaugh,
S. E. et al., 1992, Applications of Category Theory to Fuzzy Subsets, Dordrecht: Kluwer
Academic.
6 See Makkai, M., 1982, 'Stone Duality for First Order Logic', Proceedings of the Herbrand
Symposium, Logic Colloquium '81, J. Stern, ed., 217-232, Makkai, M., 1987, 'Stone Duality
for first order Logic', Advances in Mathematics 65, 97-170, Makkai, M., 1993, Duality and
Definability in First Order Logic, Memoirs of the American Mathematical Society, vol. 105,
for the lifting of the duality theorem for first-order logic. See Johnstone, P. T., 1982, Stone Spaces,
Cambridge: Cambridge University Press, for a presentation of Stone's theorem and general Stone
dualities.

7 Of course, the interesting cases here are the localic toposes.
8 See Bell, J.L., 1988, Toposes and Local Set Theories, Oxford: Oxford U Press or Mac Lane,
S. & Moerdijk, I., 1992, Sheaves in Geometry and Logic, New York: Springer for the definition. Interestingly enough, a Lawvere-Tierney topology in a topos can be thought of as an
"assertion operator", an idea which goes back to Bochvar 1938. Indeed, his assertion operator
is a Lawvere-Tierney topology on the three-valued truth structure and so is Rescher's "weak
assertion operator" defined for Łukasiewicz three-valued logic. See Rescher N., 1969, Many-Valued Logic, New York: McGraw-Hill, pp. 31-32.
9 This topos is usually denoted Set^M, where M is the two-element monoid which is not a
group.
10 See for instance Rescher, N., 1969, Many-Valued Logic, New York: McGraw-Hill and
Rasiowa, H., 1974, An Algebraic Approach to Non-Classical Logics, North-Holland.

DANIEL VANDERVEKEN

A NEW FORMULATION OF THE LOGIC OF PROPOSITIONS

In the philosophy of language and mind, the abstract entities called propositions have a double nature. On one hand, they are units of sense of a
fundamental logical type which are expressed by the use of sentences. All
propositions represent states of affairs and are true or false depending on
how things are in the actual world. On the other hand, propositions are also
the contents of conceptual thoughts that we, human beings, have in mind
whenever we think, speak or write. As ordinary language philosophers have
shown, the primary units of meaning in the use and comprehension of language
are speech acts such as assertions, promises and requests which consist of
an illocutionary force F with a propositional content P. Moreover, many of
our mental states are attitudes like beliefs, intentions and desires which consist
of a psychological mode m with a propositional content P. Like illocutionary
acts, such attitudes are conceptual thoughts whose contents represent states
of affairs.
As Frege already noticed, the two constitutive aspects of propositions are
not logically independent. Indeed, force, sense and denotation are the three
essential components of sentence meaning in language. Thus, every proposition which is the sense of a sentence in a possible context of utterance is
also the content of the illocutionary act that this sentence could be used to
perform literally in this context. For example, the proposition which is
expressed by the sentence "John will help me" in a context of utterance is
also what the speaker of that context would mean to assert if he were using
literally that declarative sentence in that context. Unfortunately, the philosophical logics of sense and denotation which are currently used are
incompatible with the philosophical analysis of conceptual thoughts. Indeed,
standard modal, epistemic and intensional logics are based on Carnap's
definition of the logical type of propositions which reduces propositions to
their truth conditions. From a philosophical point of view, strict equivalence
(the property of having the same truth conditions) is not a sufficient criterion of propositional identity. Indeed there are many speech acts with the
same force and strictly equivalent propositional contents which are not performed in the same contexts. Similarly, there are many different attitudes
with the same psychological mode and strictly equivalent propositional
contents. Thus the assertion (or belief) that 2 ~ 4 is different from the assertion (or belief) that √2 is irrational.
The main purpose of this paper is to formulate a new logic of propositions that is compatible with the philosophical analysis of conceptual thoughts.
Incidentally, the fact that propositions are expressible in the performance of
speech acts imposes on the logic of propositions many conditions of adequacy
M. Marion and R. S. Cohen (eds.), Quebec Studies in the Philosophy of Science I, 95-105.
© 1995 Kluwer Academic Publishers.

that logicians have unfortunately tended to neglect until now. In particular,
propositional logic must account for the finiteness and the restricted cognitive abilities of human beings as well as for the creative abilities of their
linguistic competence. Human beings are neither omniscient nor perfectly
rational in their speech even if they are always in a sense minimally consistent.
The first section of this paper formulates the conditions of adequacy of a
logic of propositional contents of thought. The second section proposes a
new analysis of the logical type of propositions which is adequate for speech
act theory. The third and fourth sections present the object-language and the
model-theoretical semantics of a new minimal propositional logic. Finally,
the fifth section formulates a complete axiomatization and the last section
enumerates some valid laws of that logic.
I. CONDITIONS OF MATERIAL AND FORMAL ADEQUACY

If all the propositions which are senses of sentences are also possible contents
of illocutionary acts, then any adequate law of propositional identity must
satisfy the following criterion of substitutivity salva felicitate within the scope
of illocutionary forces. Two propositions P1 and P2 are identical only if, for
every force F, the speech acts of the forms F(P1) and F(P2) have the same
conditions of success: it is not possible for a speaker to perform speech act
F(P1) in a context of utterance without eo ipso performing the act F(P2)
in the same context. Now, it is quite obvious that the ability to perform (or
understand) an illocutionary act F(P) contains the ability to apprehend its
conditions of satisfaction and to understand the truth conditions of its propositional content. Thus to understand an assertion is to understand under which
conditions that assertion is true. Similarly, to understand a promise is to understand under which conditions that promise is kept. Consequently, an adequate
analysis of propositions must theoretically relate the logical aspects concerning
the truth conditions of propositions to the cognitive aspects concerning the
determination of these truth conditions in the use and comprehension of
language.
In my analysis, I will account for the following facts:
(1) From a cognitive point of view, propositions are complex senses provided
with a structure of constituents.

As Frege and Russell pointed out, understanding a proposition consists mainly
of understanding which attributes (properties or relations) certain objects must
possess in the actual world in order that this proposition be true. Let us
consider the proposition which is the sense of the sentence (1) "The president of Russia speaks English but not German" in this context. In order to have
in mind this proposition, we must apprehend the following aspects of its structure of constituents.


First, we must identify the various propositional constituents of that proposition, i.e., the sense of the definite description and the properties expressed
by the two verb phrases.
Second, we must also understand how these propositional constituents are
logically related in terms of predication in the two atomic propositions
out of which this proposition is composed. We must understand that the two
properties of speaking English and German are predicated of the president of Russia.
Finally, we must also understand how the truth conditions of the complete
proposition are determined from the truth possibilities of its atomic propositions. In this example, we must understand that the proposition in question
is true if and only if the first predication is true and the second false.
From a cognitive point of view, propositions are then much more structured
entities than simple functions from the set of possible worlds into the set of
truth values. In order that two propositions be identical, they must not only
have the same truth conditions. They must also be composed out of the same
atomic propositions. Moreover their truth conditions must be determined in the
same effective way from their atomic propositions.

(2) All propositions are general propositions whose constituents are senses and
not objects.
As Frege pointed out, we cannot refer to objects in an act of thought without
subsuming these objects under senses. We can only have in mind concepts
of an individual and refer indirectly to it through such concepts. Now, if one
admits both the indispensable role of concepts in reference and the hypothesis that every proposition is the possible content of an act of thought, then
one must also recognize like Church the absolute need of senses in logic.
All propositional constituents must then be attributes or concepts of objects
of reference.

(3) Propositions are senses whose structure is finite.


As is well known, human beings have restricted cognitive abilities. We cannot
refer to an infinite number of different objects or predicate an infinite number
of attributes in an act of thought just as we cannot use an infinitely long
sentence in a context of utterance. Natural languages are by definition possible
human languages. Thus the set of propositional constituents and the set of
atomic propositions out of which a proposition is composed in a semantic
interpretation are finite sets. Otherwise, certain propositions would not be
expressible.

(4) Human beings are not perfectly rational in their use and comprehension of language.
Clearly, we are often inconsistent. We often assert (or believe) propositions
whose truth is impossible. Furthermore, our illocutionary and psychological
commitments are not as strong as they should be from the logical point of view.


Thus, we often fail to make valid theoretical inferences in the use of language.
For example, we often assert (or believe) a proposition without asserting (or
believing) other propositions which are a logical consequence of the first.
Finally, we are not omniscient and we do not know all necessary truths.
Such imperfections of human thought impose constraints on propositional
logic. First, that logic must formulate a finer criterion of speaker consistency
than logical possibility. Second, propositional logic must also define a non-classical relation of implication which is stronger than Lewis' strict implication. (In modal logic, a proposition P strictly implies another proposition Q
when all the truth conditions of Q are also truth conditions of P.) Contrary
to what Hintikka wrongly assumes in his epistemic logic, the set of propositional contents of attitudes is not closed under strict implication. Indeed,
elementary rules of natural deduction like the rule of introduction of disjunction
do not generate psychological or illocutionary commitments. Thus, we can
believe that John is French without eo ipso believing that John is French or
Tupi.
(5) Even if human beings are not perfectly consistent in their use of language,
they are not totally irrational.

As is the case for the truth conditions of propositions, the success conditions
of speech acts are often logically related. Thus, certain illocutionary acts are
not simultaneously performable. For example, no one can ever mean to assert
that Cicero is and is not Roman. Furthermore, certain illocutionary acts have
more success conditions than others: it is not possible for a speaker to perform
these speech acts in an utterance without eo ipso performing the other speech
acts. Thus an assertion that 0 ~ 2 ~ 3 is an assertion that 2 ~ 3.
We need then to discover in formal semantics which stronger logical relation
of implication must exist between two propositions P1 and P2 in order that
the illocutionary acts of the form F(P1) commit the speaker to the corresponding illocutionary acts of the form F(P2). To my knowledge, most (if
not all) existing logics are incompatible with an adequate analysis of illocutionary or psychological commitment. First, classical modal, intensional and
epistemic logics which are based on Carnap's definition of propositions predict
far too many illocutionary and psychological commitments. On the contrary,
hyperintensional logic does not predict enough commitments. On one hand,
hyperintensional logic makes the right move in admitting that the contents
of attitudes are more complex entities than simple truth conditions. But it makes
a mistake in identifying these contents with structured senses called hyperintensions whose nature is too sensitive to syntactic features like the order
of predication which are not always relevant. Thus hyperintensional logic
distinguishes identical attitudes like the belief that the morning star is the
evening star and the belief that the evening star is the morning star. For their
contents are not intensionally isomorphic.

II. THE LOGICAL STRUCTURE OF THE SET OF ALL PROPOSITIONS

On the basis of the preceding considerations, I propose the following analysis
of the logical form of propositions.
First, every proposition is a structured entity composed out of a finite number
of atomic propositions which predicate attributes of a certain finite number
of entities that are subsumed under concepts. As we have seen, every atomic
proposition has a finite number of senses as propositional constituents including
one main attribute and concepts of objects of reference. Moreover, its main
attribute is predicated of the objects which fall under its concepts, and that
predication determines truth conditions. From a logical point of view, an atomic
proposition is then a pair containing the finite set of senses (which are its
propositional constituents) and the function from possible worlds into truth
values that is determined by the internal predication that is made with its
constituents. For the sake of simplicity, let us identify provisionally propositional constituents with Carnapian intensions. On this view, the atomic
proposition of the elementary proposition that the morning star is the evening
star is identified with the pair containing first a set of three senses (the binary
relation of identity and the two expressed individual concepts), and second,
the function which associates the true with a possible world if and only if these
two individual concepts apply to the same object in that world.
Second, from a cognitive point of view, the comprehension of the truth
conditions of a proposition is not the ability to associate with every possible
world the truth value of that proposition in that world. For we do not have
this ability. Thus the Carnapian explication of the truth conditions of propositions is inadequate. As Wittgenstein said in his Tractatus, to understand
the truth conditions of a proposition is just to understand what are the truth
possibilities of its atomic propositions under which it is true. In this view,
the truth value of a proposition is always a function of the truth values of its
atomic propositions. Thus, for a number n of atomic propositions, there are
2^(2^n) different propositions composed out of these n atomic propositions.
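The count can be verified by enumeration for small n. The Python sketch below is an illustration under a simplifying assumption: atomic propositions are treated as opaque labels, a truth value assignment is coded as the set of atoms it makes true, and a proposition over a fixed content is coded as the set of assignments under which it is true. The names `subsets` and `count_propositions` are hypothetical.

```python
from itertools import combinations

def subsets(items):
    """All subsets of a finite collection, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def count_propositions(n):
    """Number of propositions composed out of n given atomic propositions."""
    atoms = [f"a{i}" for i in range(n)]
    assignments = subsets(atoms)       # 2^n truth value assignments
    truth_sets = subsets(assignments)  # one proposition per set of assignments
    return len(truth_sets)

for n in range(4):
    print(n, count_propositions(n), 2 ** (2 ** n))
```

For n = 2, for instance, there are four assignments and therefore sixteen propositions with that content.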
Incidentally, this purely truth functional conception of the determination
of truth conditions is consistent with the well known phenomena of referential opacity and the intensionality of natural languages. For all modal operations
on propositions affect both the content and the truth conditions. Modal operations serve to predicate attributes of the propositions to which they are applied.
They enrich the content by introducing new atomic propositions whose truth
is necessary for the truth of the new modal propositions.
Here then is the definition of the logical type of propositions that I advocate
in philosophical logic. A complete proposition is a pair whose first element
is the finite set of its atomic propositions and whose second element is the
set of all truth value assignments to atomic propositions under which that
proposition is true. By definition, a truth value assignment to atomic propositions is a function that associates to each atomic proposition exactly one truth
value. As the truth value of a proposition P is a function of the truth values
of its atomic propositions, there always exists a unique set of truth value assignments to atomic propositions under which proposition P is true. For example,
the proposition that Paul is in France or in Belgium is true under all truth value
assignments that associate the true with at least one of its two atomic propositions. Formally, the notion of truth can then be defined as follows: a
proposition P is true in a possible world w if and only if the set of truth value
assignments under which P is true contains at least one assignment which associates with each atomic proposition of P the actual truth value of that atomic
proposition in the world w.
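This definition lends itself to a direct coding. The following Python sketch is only an illustration and simplifies in two declared ways: atomic propositions are opaque labels rather than pairs of senses and truth conditions, and a possible world is given simply by the set of atomic propositions that are true in it. The class `Proposition` and the functions `assignments` and `true_in` are hypothetical names.

```python
from dataclasses import dataclass
from itertools import combinations

@dataclass(frozen=True)
class Proposition:
    atoms: frozenset       # finite set of atomic propositions (labels here)
    true_under: frozenset  # assignments making it true, each coded as
                           # the frozenset of atoms it maps to "true"

def assignments(atoms):
    """All truth value assignments to the given atoms."""
    atoms = sorted(atoms)
    return [frozenset(c) for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)]

def true_in(p, facts):
    """facts: the atomic propositions that are actually true in world w."""
    actual = frozenset(p.atoms) & frozenset(facts)
    return actual in p.true_under

# "Paul is in France or in Belgium": true under every assignment that
# associates the true with at least one of its two atomic propositions.
atoms = frozenset({"in_France", "in_Belgium"})
paul = Proposition(atoms, frozenset(v for v in assignments(atoms) if v))

print(true_in(paul, {"in_Belgium"}), true_in(paul, set()))
```

The disjunction is true under three of the four assignments, so it is true in a world where Paul is in Belgium and false in a world where he is in neither country.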
Clearly, my criterion of propositional identity is stronger than that of Carnap
and Montague who require only strict equivalence. Indeed, strictly equivalent propositions with different propositional constituents or different atomic
propositions are distinguished in my logic. Moreover, contrary to Parry's logic,
my criterion of propositional identity requires more than the identity of content
and strict equivalence. Indeed strictly equivalent propositions with the same
content are distinguished in my logic when their truth conditions are determined by the application of different truth functions. Consider the impossible
proposition (2) that naive set theory is consistent and the contradictory proposition (3) that naive set theory is consistent and inconsistent. These two
propositions are different. For we all know a priori by virtue of linguistic competence that the contradictory proposition (3) is impossible while we have to
learn that proposition (2) is necessarily false. Unfortunately, such propositions
are identified in Parry's logic. However, in my analysis, these two propositions
are not true under the same set of truth value assignments to atomic propositions. For the truth function that serves to determine the truth conditions
of proposition (2) is the identity function. Thus the first proposition is true
under all truth value assignments that associate the true with its single atomic
proposition. On the other hand, the truth function that serves to determine
the truth conditions of the contradictory proposition (3) is the constant function
that associates the false with each truth value. There is not a single truth
value assignment to atomic propositions under which the second proposition
is true.
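The contrast between propositions (2) and (3) can be made concrete. The Python sketch below is a minimal illustration with hypothetical names: atomic propositions are opaque labels, an assignment is coded as the set of atoms it makes true, and the list `worlds` is an invented stand-in for the possible worlds, in none of which naive set theory is consistent.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Proposition:
    atoms: frozenset       # content: the atomic propositions
    true_under: frozenset  # assignments (sets of true atoms) making it true

def true_in(p, facts):
    """facts: the atomic propositions true in the world considered."""
    return (p.atoms & frozenset(facts)) in p.true_under

c = "naive_set_theory_is_consistent"
# (2): truth conditions fixed by the identity truth function.
p2 = Proposition(frozenset({c}), frozenset({frozenset({c})}))
# (3): truth conditions fixed by the constant function to the false.
p3 = Proposition(frozenset({c}), frozenset())

# Hypothetical possible worlds; the atom is false in each of them.
worlds = [set(), {"snow_is_white"}]
strictly_equivalent = all(true_in(p2, w) == true_in(p3, w) for w in worlds)

print(strictly_equivalent, p2.atoms == p3.atoms, p2 == p3)
```

The two propositions agree in truth value in every world and have the same content, yet they differ as pairs because they are not true under the same set of truth value assignments.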
On the basis of my new analysis of propositions, one can define recursively as follows the set of all propositions.
The simplest possible propositions are the elementary propositions which
are composed out of a single atomic proposition and which are true
under all truth value assignments that associate the true with that atomic
proposition.
All other propositions are complex propositions that can be obtained
from the elementary propositions by a finite number of applications of
operations. The truth functional operations are of course the simplest logical
operations on propositions. Indeed, the content of a complex proposition
that is the result of the application of a truth function is just the union of
the contents of its arguments. The truth functions do not affect the content.
They only rearrange the ways in which the truth conditions of the new
propositions are determined from the truth possibilities of the atomic
propositions.
On this account, the negation of a proposition P is just the proposition
¬P which is composed out of the same atomic propositions as P and which
is true under all truth value assignments to atomic propositions which make
P false. Similarly, the conjunction of two propositions P1 and P2 is the proposition (P1 & P2) whose content is the set of all atomic propositions of P1 and
of P2 and which is true under all truth value assignments to atomic propositions that make true both propositions P1 and P2. And similarly for other
truth functions.
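These two operations can be sketched as follows in Python. The coding is an assumption made for illustration only: a proposition is a pair (content, set of assignments), atomic propositions are opaque labels, an assignment is the set of atoms it makes true, and the function names `elementary`, `neg` and `conj` are hypothetical.

```python
from itertools import combinations

def assignments(atoms):
    """All truth value assignments to the given atoms."""
    atoms = sorted(atoms)
    return [frozenset(c) for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)]

def elementary(atom):
    """Elementary proposition: one atom, true when that atom is true."""
    return (frozenset({atom}), frozenset({frozenset({atom})}))

def neg(p):
    """Same content as p; true under exactly the assignments making p false."""
    atoms, true_under = p
    return (atoms, frozenset(assignments(atoms)) - true_under)

def conj(p, q):
    """Content is the union; true under assignments making both p and q true."""
    atoms = p[0] | q[0]
    rows = frozenset(v for v in assignments(atoms)
                     if (v & p[0]) in p[1] and (v & q[0]) in q[1])
    return (atoms, rows)

english = elementary("speaks_English")
german = elementary("speaks_German")
sentence1 = conj(english, neg(german))  # "speaks English but not German"
print(sentence1[1])
```

On this coding the proposition expressed by sentence (1) is true under exactly one assignment, the one that makes the first predication true and the second false, and negation leaves the content untouched.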
Given the fact that propositions have a content and a truth functional algorithm in addition to truth conditions, there exists in my propositional logic a
non-classical relation of strong implication between propositions which is finer
than Lewis' strict implication. A proposition P strongly implies another proposition Q if and only if all the atomic propositions of Q are atomic propositions
of P and all truth value assignments to atomic propositions that make P true
also make Q true. Like Parry's analytic implication, strong implication requires
an inclusion of content. However strong implication requires more than strict
implication in addition to the inclusion of content. It requires that P tautologically imply Q.
As I have shown, the relation of strong implication is important for the
purposes of formal semantics because it is cognitively realized by human
speakers. Every speaker who understands fully a proposition P also understands all propositions Q which are strongly implied by P and realizes that
these propositions Q are strictly implied by P. This is why strong implication is so important in illocutionary logic.
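The definition of strong implication given above can be tested on the earlier example of John being French or Tupi. The Python sketch below is illustrative only: propositions are coded as pairs (content, set of assignments), assignments as sets of true atoms, and the helper names `prop`, `tautologically_implies` and `strongly_implies` are hypothetical.

```python
from itertools import combinations

def assignments(atoms):
    """All truth value assignments to the given atoms."""
    atoms = sorted(atoms)
    return [frozenset(c) for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)]

def prop(atoms, holds):
    """Build (content, truth assignments) from a predicate on assignments."""
    atoms = frozenset(atoms)
    return (atoms, frozenset(v for v in assignments(atoms) if holds(v)))

def tautologically_implies(p, q):
    """Every assignment to the atoms of p and q that makes p true makes q true."""
    atoms = p[0] | q[0]
    return all((v & q[0]) in q[1]
               for v in assignments(atoms) if (v & p[0]) in p[1])

def strongly_implies(p, q):
    """Inclusion of content together with tautological implication."""
    return q[0] <= p[0] and tautologically_implies(p, q)

french = prop({"F"}, lambda v: "F" in v)
french_or_tupi = prop({"F", "T"}, lambda v: "F" in v or "T" in v)
french_and_tupi = prop({"F", "T"}, lambda v: {"F", "T"} <= v)

print(tautologically_implies(french, french_or_tupi))
print(strongly_implies(french, french_or_tupi))
print(strongly_implies(french_and_tupi, french))
```

The disjunction is tautologically implied by the proposition that John is French, but not strongly implied by it, because its content is not included; the conjunction, by contrast, strongly implies each conjunct.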
III. THE IDEAL OBJECT LANGUAGE OF MINIMAL PROPOSITIONAL LOGIC

The lexicon of that ideal language L contains:
- a series of individual constants c, c', c", ...
(whose senses are individual concepts),
- for each natural number n, a series of predicate constants of degree n: Rn,
R'n, R"n, ...
(whose senses are attributes of individuals of degree n) and
- the syncategorematic expressions =, t, ¬, &, □, (, and ).
The rules of formation of L are:
1. If Rn is a predicate of degree n and c1, ..., cn are n individual constants
of L, then Rn(c1 ... cn) belongs to the set La of the terms for atomic propositions of L.
Rn(c1 ... cn) expresses the atomic proposition which is true in a world
w if and only if the sequence of n individuals which fall in w under the
concepts expressed by c1, ..., cn belongs to the extension in w of the
attribute expressed by Rn.


2. If Aa ∈ La then (Aa) belongs to the set Lp of propositional terms of L.
Furthermore, if Ap and Bp belong to Lp, then ¬Ap and (Ap & Bp) also
belong to Lp.
(Aa) expresses the elementary proposition which is composed out of
the atomic proposition expressed by Aa. ¬Ap expresses the proposition which
is the negation of the proposition expressed by Ap. Finally, (Ap & Bp)
expresses the conjunction of the propositions expressed by Ap and Bp.
3. The elementary sentences of L
- If A and B are both individual terms or propositional terms of L, A = B
is an elementary sentence of L (which is true in a world if and only if
A and B express the same sense).
- If Ap ∈ Lp then t(Ap) is an elementary sentence of L (which is true in a
world if and only if the proposition expressed by Ap is true in that world).
4. The complex sentences of L
If AI and BI are sentences of L then ¬AI, □AI and (AI & BI) are new complex
sentences of L which are interpreted as usual.
The rules of abbreviation of L:
Other important propositional notions are derived as follows:
- the usual rules of abbreviation for disjunction $\vee$, material implication $\Rightarrow$, and material equivalence $\Leftrightarrow$,
- the following rules for the inclusion of content:
$(B_a) > (A_a) =_{df} (B_a) = (A_a)$
$(\neg A_p) > (A_a) =_{df} A_p > (A_a)$
$(B_p \& C_p) > (A_a) =_{df} (B_p > (A_a)) \vee (C_p > (A_a))$
$A_p > \neg B_p =_{df} A_p > B_p$
$A_p > (B_p \& C_p) =_{df} (A_p > B_p) \& (A_p > C_p)$

(Ap > Bp) means that all the atomic propositions of Bp are also atomic
propositions of Ap.
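- tautologyhood: $T A_p =_{df} A_p = (A_p \Rightarrow A_p)$
- strict implication: $A \strictif B =_{df} \Box(A \Rightarrow B)$
- strict equivalence: $(A \strictiff B) =_{df} (A \strictif B) \& (B \strictif A)$
- analytic implication: $A_p \rightarrow B_p =_{df} (A_p > B_p) \& (A_p \strictif B_p)$
- strong implication: $A_p \mapsto B_p =_{df} (A_p > B_p) \& T(A_p \Rightarrow B_p)$.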
IV. DEFINITION OF THE STRUCTURE OF A STANDARD MODEL

A possible interpretation or standard model $\mathfrak{M}$ for the ideal object language L of propositional logic is a triple $\langle I, U, \|\ \| \rangle$ where $I$ and $U$ are two sets and $\|\ \|$ is a function satisfying the following clauses:
(1) $I$ is a non-empty set of possible worlds,
(2) $U$ is a non-empty set of individual objects, and
(3) $\|\ \|$ is a function which associates with each formula A which is an individual constant, a predicate or a term of L the semantic value of that formula in the model $\mathfrak{M}$. $\|A\|$ is defined inductively as follows:

(i) $\|c\| \in U^I$,
(ii) $\|R_n\| \in (2^{U^n})^I$ where $2 = \{0, 1\}$,
(iii) $\|R_n(c_1 \ldots c_n)\| = \langle \{\|R_n\|, \|c_1\|, \ldots, \|c_n\|\}, \{j \mid j \in I \text{ and } \|R_n\|(j)(\|c_1\|(j), \ldots, \|c_n\|(j)) = 1\} \rangle$.

Notation. Let $id_1(X)$ and $id_2(X)$ represent respectively the first and the second terms of an ordered pair $X$. In the model $\mathfrak{M}$, $id_1(\|A_a\|)$ is the set of propositional constituents of the atomic proposition expressed by $A_a$ while $id_2(\|A_a\|)$ is the set of possible worlds where that proposition is true.

(iv) $\|(A_a)\| = \langle \{\|A_a\|\}, \{f \in 2^{U_a} \mid f(\|A_a\|) = 1\} \rangle$,
(v) $\|\neg A_p\| = \langle id_1(\|A_p\|), \{f \mid f \notin id_2(\|A_p\|)\} \rangle$,
(vi) $\|A_p \& B_p\| = \langle id_1(\|A_p\|) \cup id_1(\|B_p\|), id_2(\|A_p\|) \cap id_2(\|B_p\|) \rangle$. (In $\mathfrak{M}$, $id_1(\|A_p\|)$ is the content of proposition $\|A_p\|$.)

On the basis of these assignments of semantic values, one can now define
as follows the concept of a true sentence of L:
-- A sentence of the form $A = B$ is true in a world $i \in I$ under the model $\mathfrak{M}$ if and only if $\|A\| = \|B\|$.
-- A sentence of the form $t(A_p)$ is true in a possible world $i$ if and only if there exists at least one $f \in id_2(\|A_p\|)$ such that for all $u_a \in id_1(\|A_p\|)$, $f(u_a) = 1$ if and only if $i \in id_2(u_a)$.
-- A sentence of the form $\neg A_t$ is true in a possible world $i$ if and only if the sentence $A_t$ is not true in world $i$ under $\mathfrak{M}$.
-- A sentence of the form $(A_t \& B_t)$ is true in a possible world $i$ if and only if the sentences $A_t$ and $B_t$ are true in this world.
-- Finally, a sentence $\Box A_t$ is true in a world $i$ under $\mathfrak{M}$ if and only if the sentence $A_t$ is true in all possible worlds under $\mathfrak{M}$.
As in model theory, a sentence A of L is logically true or valid (symbolically $\models A$) if and only if this sentence is true in all possible worlds of all possible interpretations of L.
V. A COMPLETE AXIOMATIC SYSTEM

The following formal system is both a sound and complete axiomatisation of the laws of my minimal propositional logic. The axioms of PC are:
-- the axioms of the logic of truth connectives,
-- the axioms of S5 modal logic,
-- the usual axioms for the relation of identity plus the following law of
propositional identity:

Axiom schema 1: $(A_p \mapsto B_p \ \&\ B_p \mapsto A_p) \Rightarrow A_p = B_p$


-- the following axioms for the structure of atomic propositions

Axiom schema 2: $(R_n(c_1, \ldots, c_n) = R'_n(d_1, \ldots, d_n)) \Rightarrow (R_n(e_1, \ldots, e_n) = R'_n(e_1, \ldots, e_n))$

Axiom schema 3: $((c_1 = d_1) \& \ldots \& (c_n = d_n)) \Rightarrow ((R_n(e_1, \ldots, e_n) = R'_n(e_1, \ldots, e_n)) \Rightarrow (R_n(c_1, \ldots, c_n) = R'_n(d_1, \ldots, d_n)))$

Axiom schema 4: $(R_n(c_1 \ldots c_n) = R'_n(d_1, \ldots, d_n)) \Rightarrow ((c_i = d_1) \vee \ldots \vee (c_i = d_n))$, where $i \leq n$.

Axiom schema 5: $\neg(R_n(c_1, \ldots, c_n) = R_m(d_1, \ldots, d_m))$, where $n \neq m$.

Axiom schema 6: $\Box(tR_n(c_1, \ldots, c_n) \Leftrightarrow tR_n(d_1, \ldots, d_n)) \Rightarrow (R_n(c_1, \ldots, c_n) = R_n(d_1, \ldots, d_n))$, where $\{c_1, \ldots, c_n\} = \{d_1, \ldots, d_n\}$.
Identical elementary propositions have the same constituents.
- the usual axioms for truth conditions

Axiom schema 7: $((A_a) = (B_a)) \Rightarrow (t(A_a) \Leftrightarrow t(B_a))$

Axiom schema 8: $t(\neg A_p) \Leftrightarrow \neg t(A_p)$

Axiom schema 9: $t(A_p \& B_p) \Leftrightarrow (t(A_p) \& t(B_p))$


- and finally the following axioms for tautologies

Axiom schema 10: $T A_p$ when $A_p$ has the form of a tautology according to the method of truth tables

Axiom schema 11: $(T A_p \ \&\ T(A_p \Rightarrow B_p)) \Rightarrow T(B_p)$

Axiom schema 12: $((A_a) = (B_a)) \Rightarrow T(A_p \Leftrightarrow A'_p)$, where $A'_p$ differs at most from $A_p$ by the fact that some occurrences of the term $(A_a)$ in $A_p$ are replaced by occurrences of the term $(B_a)$.

Axiom schema 13: $\neg T((A_a^1) \vee \ldots \vee (A_a^n))$, for any $A_a^1, \ldots, A_a^n \in L_a$.

Axiom schema 14: $\neg T(\neg(A_a^1) \vee \ldots \vee \neg(A_a^n))$, for any $A_a^1, \ldots, A_a^n \in L_a$.

Axiom schema 15: $T((A_a^1) \vee \ldots \vee (A_a^n) \vee \neg(B_a^1) \vee \ldots \vee \neg(B_a^m)) \Rightarrow ((A_a^1) = (B_a^1) \vee \ldots \vee (A_a^1) = (B_a^m) \vee \ldots \vee (A_a^n) = (B_a^1) \vee \ldots \vee (A_a^n) = (B_a^m))$, for any $A_a^1, \ldots, A_a^n, B_a^1, \ldots, B_a^m \in L_a$.
These axiom schemas express well known valid laws for tautologies.
The two rules of inference of my axiomatic system PC are the rules of modus
ponens and of necessitation.
VI. MAIN VALID LAWS OF PROPOSITIONAL LOGIC

1. The relation of inclusion of content is reflexive and transitive. A proposition has the same content in all contexts.
$\models A_p > A_p$
$\models (A_p > B_p) \Rightarrow ((B_p > C_p) \Rightarrow (A_p > C_p))$
$\models (A_p > B_p) \Rightarrow \Box(A_p > B_p)$.
2. Strong implication is a relation of partial order between propositions. Two strongly equivalent propositions are identical.
3. There are two causes of failure of strong implication:
Failure of content inclusion: $\models \neg(A_p > B_p) \Rightarrow \neg(A_p \mapsto B_p)$.
Failure of tautological implication: $\models \neg T(A_p \Rightarrow B_p) \Rightarrow \neg(A_p \mapsto B_p)$.

In the first case, it is possible to have in mind the first proposition P without also apprehending the second one Q. And in the second case, we do not necessarily know a priori in virtue of linguistic competence that proposition P has more truth conditions than proposition Q.
4. The rules of elimination of conjunction, disjunction and material implication generate strong implication.
4.1. $\models (A_p \& B_p) \mapsto A_p$ and $\models (A_p \& B_p) \mapsto B_p$
4.2. $\models ((A_p \mapsto C_p) \& (B_p \mapsto C_p)) \Rightarrow ((A_p \vee B_p) \mapsto C_p)$
4.3. $\models (A_p \& (A_p \Rightarrow B_p)) \mapsto B_p$
4.4. The failure of the law of elimination of negation: $\not\models (A_p \& \neg A_p) \mapsto B_p$. Indeed, the content of $B_p$ can be new.
5. Only the rules of introduction which preserve content inclusion generate strong implication.
5.1. The failure of the law of introduction of disjunction: $\not\models A_p \mapsto (A_p \vee B_p)$.
5.2. On the contrary, the laws of introduction of negation and of conjunction are valid.
6. A proposition strongly implies all and only the tautologies whose content is included in its content.
7. A theorem of finiteness for strong implication. A proposition only strongly implies a finite number of other propositions.
8. The relation of strong implication is decidable. This confirms the thesis that strong implication is cognitively realized by virtue of competence.
9. In simple illocutionary logic, there is a law of minimal rationality of speakers: If a proposition P strongly implies proposition Q, a speech act of the form F(P) with a primitive force has more success conditions than the corresponding speech act F(Q) when Q satisfies the propositional content conditions of F each time P satisfies these conditions. And the speech acts F(P) and F($\neg$Q) are incompatible.
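
The semantic clauses of section IV and the definition of strong implication can be tried out on a small finite model. The following is a minimal sketch under simplifying assumptions that are not in the text: atomic propositions are identified by a name and by the set of worlds where they are true (their constituents are left out), and the three worlds, the atoms `a` and `b`, and all function names are hypothetical. A proposition is a pair (content, set of truth-value assignments), as in clauses (iv) to (vi).

```python
from itertools import product

# Hypothetical toy model: worlds 0, 1, 2 and two atomic propositions,
# each identified with the set of worlds where it is true.
ATOMS = {"a": frozenset({0, 1}), "b": frozenset({1, 2})}
NAMES = sorted(ATOMS)
# All truth-value assignments f to the atomic propositions.
ALL = frozenset(product((0, 1), repeat=len(NAMES)))

def elementary(name):
    """(A_a): content {A_a}, true under the assignments f with f(A_a) = 1."""
    i = NAMES.index(name)
    return (frozenset({name}), frozenset(f for f in ALL if f[i] == 1))

def neg(p):
    """Negation keeps the content and complements the truth conditions."""
    return (p[0], ALL - p[1])

def conj(p, q):
    """Conjunction unions the contents and intersects the truth conditions."""
    return (p[0] | q[0], p[1] & q[1])

def disj(p, q):
    """Disjunction, by the usual abbreviation."""
    return neg(conj(neg(p), neg(q)))

def implies(p, q):
    """Material implication, by the usual abbreviation."""
    return neg(conj(p, neg(q)))

def true_in(p, world):
    """t(A_p): some admissible f agrees with the world on every atom of the content."""
    return any(
        all((f[NAMES.index(u)] == 1) == (world in ATOMS[u]) for u in p[0])
        for f in p[1]
    )

def tautology(p):
    """A tautology is true under every assignment."""
    return p[1] == ALL

def strongly_implies(p, q):
    """Content inclusion together with tautological implication."""
    return q[0] <= p[0] and tautology(implies(p, q))

a, b = elementary("a"), elementary("b")
print(strongly_implies(conj(a, b), a))       # True  (law 4.1)
print(strongly_implies(a, disj(a, b)))       # False (law 5.1)
print(strongly_implies(conj(a, neg(a)), b))  # False (law 4.4)
```

In this sketch the second and third checks fail only because the content of the consequent is new; the material implications themselves are tautologies. Since each test inspects finitely many assignments, the sketch also illustrates why strong implication is decidable in the finite case.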

Université du Québec à Trois-Rivières
REFERENCES

Carnap, R., 1956, Meaning and Necessity, Univ. of Chicago Press.
Church, A., 1951, 'A Formulation of the Logic of Sense and Denotation', in P. Henle et al. (eds.), Structure, Method and Meaning, Liberal Arts Press, New York.
Cresswell, M., 1975, 'Hyperintensional Logic', Studia Logica 34, 25-38.
Frege, G., 1923-6, 'Gedankengefüge', in Beiträge zur Philosophie des Deutschen Idealismus 3, 36-51.
Hintikka, J., 1962, Knowledge and Belief, Cornell Univ. Press.
Montague, R., 1974, Formal Philosophy, Yale Univ. Press.
Parry, W. T., 1933, 'Ein Axiomsystem für eine neue Art von Implikation (analytische Implikation)', Ergebnisse eines mathematischen Kolloquiums, Volume 4.
Searle, J. R. and Vanderveken, D., 1985, Foundations of Illocutionary Logic, Cambridge Univ. Press.
Vanderveken, D., 1990-1, Meaning and Speech Acts, 2 Volumes, Cambridge Univ. Press; and The Logic of Propositions (forthcoming).
Wittgenstein, L., 1961, Tractatus Logico-Philosophicus, Routledge & Kegan Paul, London.

YVON GAUTHIER

INTERNAL LOGIC.
A RADICALLY CONSTRUCTIVE LOGIC
FOR MATHEMATICS AND PHYSICS

When logics die


The secret of the soil grows through the eye.
Dylan Thomas
INTRODUCTION

By the term 'internal logic' I mean what Hilbert, Weyl and Brouwer have
all called, although from different points of view, inhaltliche Logik to contrast
it with formal logic. It has sometimes been rendered in English by 'contentual';
it could also be considered as an equivalent to the 'intrinsic' logic H. Weyl
has defined in (Weyl, 1968, p. 705):
Each field of knowledge, when it crystallizes into a formal theory, seems to carry with it its
intrinsic logic which is part of the formalized symbolic system and this logic will, generally
speaking, differ in different fields.

Weyl is dealing here with quantum logic and he says that it constitutes an
integral part of the formalism, or as Hilbert would say it, the analytical apparatus, in this case, of Quantum Mechanics. Internal logic denotes the set of
logico-mathematical structures of a given scientific theory. That logic is constructive in the sense that it is not independent of the construction of a particular
theory. Arithmetic is then taken, in the spectrum of mathematical theories,
as the original building block of mathematics and in line with Kronecker's idea
of arithmetical foundations for the whole edifice of mathematics, logic is
seen as an extension of arithmetic. Arithmetical logic is a local logic, that is
it is a theory of local notions, local negation (complementation) and local implication (see Gauthier, 1985). Beyond arithmetical logic, other logics correspond
to various levels in the mathematical hierarchy, from (finite) set theory to
topology and topos theory. A topos, a creation of algebraic geometry, is a
generalisation of the notion of topological space; it has an internal logic
which is constructive (intuitionistic) and it has become one of the main objects
of study in category theory.
Topology, more than geometry, deals also with local notions and this idea
has been extended to the formalism of Hilbert space to provide Quantum
Mechanics with a constructive quantum logic (and a local observer to apply
it, see Gauthier, 1983). The concept of local interaction is the heart of measurement theory and the generalisation to cosmology requires a local logic
in the absence of the global unification often dreamed of but yet to be constructed. A phenomenon like non-commutativity (in gauge theories and in field
M. Marion and R. S. Cohen (eds.), Quebec Studies in the Philosophy of Science I, 107-122.
1995 Kluwer Academic Publishers.


theories) seems to reflect the inner logic of interaction, the local character
of processes (strings and knots in symplectic theory).
As is obvious from the preceding, constructive internal logic is of a non-Fregean variety and Boolean laws are confined to finite symmetric situations,
while processes that are not finite (i.e. that are not sets) must exhibit strict local
behaviour; consequently Cantorian set theory and set-theoretical model theory
cannot serve as foundational background for a programme that aims at giving
specific insights into the workings of mathematical and physical theory. The
logic that I delineate in the following draws upon number theory in order to
extract the internal content of an arithmetical logic which may be thought of
as the starting point of the constructivist approach in a radical foundational
enterprise.
1. SYNTAX: FEARFUL SYMMETRY

A logic that is radically constructive is not classical. I present here a system
of logic in a sequent calculus which is minimal, with no structural rules but
with new notions, i.e., two new connectives, local negation and local implication and a new quantifier called the 'effinite quantifier'. The basic concept
'sequence' is divided in two, finite sequences which are sets and effinite
sequences which are not. There are no infinite sequences. An effinite sequence
is open-ended. That is, it has a pre-positional bound, e.g. 0, but no post-positional bound, e.g. $\omega$. An effinite sequence is somewhat like Brouwer's infinitely proceeding sequences without any pre-assigned limit. When an effinite
sequence has a post-positional bound, it becomes an initial segment, i.e. a set.
Though it is minimal, the radical logic we are devising aims at providing a
natural framework for arithmetic, that is constructive theorems of number
theory, e.g., Euclid's theorem on the infinity of primes. In a way, our logic
is a finite probe for the concept of infinity. All notions are meant to be local
and the logic itself is a 'local logic' .
Symmetry is a global feature. Gentzen made strenuous efforts to give his
system of linear logic a symmetric outlook. Because of the left and right
symmetry (the sagittal correspondence) in sequent calculus and the symmetry
of intelim rules (the inversion principle) in natural deduction, he thought that
internal structure had to be reflected in the manifest structure even at the
price of artificiality. Boolean logic is symmetric, it is not constructive (except
in finite situations). Local negation is not involutive, simply because it does
not reflect a symmetric situation.
Symmetry is spontaneously broken at the core (a situation reminiscent of
Quantum Field Theory?). There are two domains, one for assertions denoted
by D (domain), one for negations (or negated assertions) denoted by E
(exterior). These domains are effinite sequences of sentences (or formulas).
Remark: This notion of domain has some similarity with the domains
(champs) of Herbrand's Fundamental Theorem where "the necessary and
sufficient condition for a proposition not to have property B is that it be false
in some infinite domain" (Herbrand, 1971). However, we do not need here
the notion of order, which has proven to be defective in Herbrand, since a post-bound (for post-positional bound) on an effinite sequence makes a (finite)
set of it. A more philosophical remark would evoke Plato's idea in The Sophist,
256e, where it is said that non-being is not opposed (symmetrically) to being,
but is different from it and thus more numerous. Effinite domains build up
the semantics of our system as we shall see later.

1.1. Vocabulary
Our first-order language L(T) for our first-order theory T has an effinite supply
of atomic symbols: (1) letters (capital and small) for formulas (and sentences), A, B, C, ..., P, Q, R, ... together with their punctuation signs, points,
commas, parentheses, brackets, etc., (2) letters for variables $x_1, x_2, \ldots, x_n$,
(3) predicate letters $P_j$, (4) function letters $f_j$ - when $f$ is 0-ary, we consider
it as a constant, (5) the connectives $\wedge$, $\vee$, $\neg$, $\to$, (6) the quantifiers $\forall$, $\exists$
and $\Xi$. The terms consist exclusively of: (1) variables, (2) sequences composed
of terms and function letters, e.g. $f_j t_1, \ldots, t_n$ - for the terms $t_1, \ldots, t_n$.
Formulas or wffs consist exclusively of: (1) atomic formulas composed of terms
and predicate letters, e.g., $P_j(t_1, \ldots, t_n)$ for the terms $t_1, \ldots, t_n$; (2) any
wff consisting of formulas composed of connectives and quantifiers.
Remarks: Sentences are closed formulas, i.e., formulas are 'open' sentences where variables occur free, that is, are not quantified upon. An instance
$A(t_1, \ldots, t_n / x_1, \ldots, x_n)$ of a formula A is the result of substituting terms $t$
for the free occurrences of a variable $x$.

1.2. Sequents
I adopt (and adapt) the standard formulation of the sequent calculus LK (Girard,
1987). A sequent is an expression $\Gamma \vdash \Delta$ where $\Gamma$ and $\Delta$ are finite sequences
of formulas; $\Gamma$ is the antecedent, e.g., $A_1 \wedge \ldots \wedge A_n$ and $\Delta$ the succedent,
e.g., $B_1, \ldots, B_m$ with the interpretation $(A_1 \wedge \ldots \wedge A_n) \to (B_1 \vee \ldots \vee B_m)$.

1.2.1. Axioms
The system of LL (Local Logic) has the axiom

Axiom 1: $A \vdash A$ auto-thesis (or self-positing)

for A an arbitrary formula. Self-positing is the identity axiom. Antithesis or
heterothesis is

$A \vdash \neg A$.

Both appear in the unified formula

$\vdash A, \neg A$.

1.2.2. Logical Rules


Logical rules are expressed in the sequent calculus with a left-right symmetry
while in a system of natural deduction, this symmetry is replaced by the intelim
rules (introduction and elimination rules). The bar indicates that the sequent
of the conclusion under the bar has been obtained from the sequent of the
premiss by the given rules. Since our system is a system of local logic (with
minimalist and intuitionistic properties), in practice we can consider only
sequents $\Gamma \vdash \Delta$, where $\Delta$ consists of a unique formula.
The logical rules are the following:
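Conjunction

L1: $\dfrac{\Gamma, A \vdash \Delta}{\Gamma, A \wedge B \vdash \Delta}\ l_1\wedge$

L2: $\dfrac{\Gamma \vdash A, \Delta \qquad \Gamma \vdash B, \Delta}{\Gamma \vdash A \wedge B, \Delta}\ r\wedge$

L3: $\dfrac{\Gamma, B \vdash \Delta}{\Gamma, A \wedge B \vdash \Delta}\ l_2\wedge$

Disjunction

L4: $\dfrac{\Gamma \vdash A, \Delta}{\Gamma \vdash A \vee B, \Delta}\ r_1\vee$

L5: $\dfrac{\Gamma \vdash B, \Delta}{\Gamma \vdash A \vee B, \Delta}\ r_2\vee$

L6: $\dfrac{\Gamma, A \vdash \Delta \qquad \Gamma, B \vdash \Delta}{\Gamma, A \vee B \vdash \Delta}\ l\vee$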

Remark: Since the logic is local, conjunction and disjunction are assumed
to be locally (individually) provable, in particular disjunction has the disjunction property of intuitionistic logic: if $\vdash A \vee B$ is provable, it means that
either $\vdash A$ or $\vdash B$ is provable - for conjunction, $\vdash A$ and $\vdash B$ are provable.
Negation

Negation being local, the minimal derivation of negation can be written

L7: $\dfrac{\Gamma, A \vdash \Delta}{\Gamma \vdash \neg A, \Delta}\ r\neg$

L8: $\dfrac{\Gamma \vdash A, \Delta}{\Gamma, \neg A \vdash \Delta}\ l\neg$

Remark: one can introduce or eliminate negation (to the right or to the
left), if one has reason to do so, i.e. has found a contradiction. Double negation
cannot be eliminated, as we shall see later.

Implication

L9: $\dfrac{\Gamma, A \vdash B, \Delta}{\Gamma \vdash A \to B, \Delta}\ r\to$

L10: $\dfrac{\Gamma \vdash A, \Delta_1 \qquad \Gamma, B \vdash \Delta_2}{\Gamma, A \to B \vdash \Delta_1, \Delta_2}\ l\to$

($\Delta_1$ and $\Delta_2$ are two different sequences).

Remark: Notice that since we do not have $a \to \neg\neg a$ (no more than $\neg\neg a \to a$ in general), implication is also local, being intimately tied with negation.
Universal quantification

L11: $\dfrac{\Gamma \vdash A, \Delta}{\Gamma \vdash \forall x A, \Delta}\ r\forall(*)$

L12: $\dfrac{\Gamma, A(t) \vdash \Delta}{\Gamma, \forall x Ax \vdash \Delta}\ l\forall(**)$

Remark: Since $\forall$ applies only to finite domains, it does not differ
from the intuitionistic (or classical) finite quantifier - of course $\forall$ as well as
$\exists$ and $\Xi$ below are subject to the usual restrictions on variables: (*) means
that $x$ is not free in $\Gamma$, $\Delta$ and (**) means that the substitute $t$ is an arbitrary
term of L.
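Existential quantification

L13: $\dfrac{\Gamma \vdash A(t), \Delta}{\Gamma \vdash \exists x A(x), \Delta}\ r\exists(**)$

L14: $\dfrac{\Gamma, A \vdash \Delta}{\Gamma, \exists x A \vdash \Delta}\ l\exists(*)$

Remark: The existence property of intuitionistic logic, i.e. $\vdash \exists x A(x)$ is
provable means that $\vdash A(t)$ is provable for some (numerical) $t$.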
Effinite quantification

L15: $\dfrac{\Gamma \vdash A(x_n), \Delta}{\Gamma \vdash \Xi x A, \Delta}\ r\Xi(*)$

L16: $\dfrac{\Gamma, A(x_n) \vdash \Delta}{\Gamma, \Xi x_n A(x_n) \vdash \Delta}\ l\Xi(*)$

Remark: Some words of explanation are in order. In the $r$ part, $\Xi$ behaves
like universal quantification and in the $l$ part, it behaves like existential
quantification; this means that effinite quantification is really existential quantification iterated effinitely, that is 'generalised existence' and not existential
generalisation. On the other hand, universal generalisation applied to an effinite
sequence means that there is no counterexample to be found, a fact similar
to Hilbert's use of the $\varepsilon$-symbol to define universal quantification

$\forall x Ax \equiv A(\varepsilon x\, \neg A(x))$.

$\Xi x_n$ means obviously that the variables in A occur effinitely often and
$A(x_n)$ means that there is an effinite sequence of variables in A (eigenvariables) not identified with those in $\Delta$; only if they are the same can $\Xi x Ax$
be eliminated, that is to say that the left rule is only there for the sake of
symmetry.

1.2.3. Structural Rules


There are no structural rules in our calculus, but there is a general principle of local shift according to which main formulas remain lexicographically
ordered on either side of the turnstile $\vdash$ in additions, deletions or exchanges
(permutations) - alphabetical order may be ascendant or descendant. The combinatorial principle is latent.

There is no cut rule either

$\dfrac{\Gamma \vdash A, \Delta \qquad \Gamma, A \vdash \Delta}{\Gamma \vdash \Delta}$

If cut should be added, it would be eliminable.

1.2.4. Negation Revisited


The rules L7 and L8 for minimal negation do not capture the essence of
intuitionist negation. In LJ (J for intuitionist), we have the rule

for the symbol of absurdity $\bot$, which amounts to a structural weakening (or
addition), while classical negation requires also

L7': $\dfrac{\Gamma, \neg A \vdash \bot}{\Gamma \vdash \neg\neg A}\ r\neg$

but these are not local. Antithesis

$A \vdash \neg A$

is the negation of autothesis (the identity axiom)

Axiom 1: $A \vdash A$

and yields negation directly; it defines otherness or the exterior of the domain
of assertions. Semantics will describe that separation of worlds. In the
meantime, we take both autothesis and antithesis as axioms and delete L7
and L8 which are derived rules. The syntax remains even, despite the fact
that the two worlds (domains) are essentially uneven. Self-duality is only an
epiphenomenon.
2. SEMANTICS

2.1. The Model

Model is taken in the usual sense of a model for a given structure $S$ which
is a triple $S = (U_S, P_S, f_S)$, where $U_S$ is the universe of the structure, $P_S$ are
the predicates and $f_S$ the functions of the language L(T) of a first-order
theory T. A structure is a model when the proper axioms of T are all valid
in the structure. I depart slightly from the classical notion, as we shall see
immediately.
For the constructivist logician (or otherwise), semantics is only a metaphor
(in many disguises). This is why it is important to be most explicit in that
matter. We begin by noticing that the local universe of our syntax is an
expanding one with a constructive (recessive) horizon which the creative
subject (observer?) never attains; this is expressed by
C3:D 3:A 3H f-oA C H)

!\

=>

-,(3:D 3A 3H f-oA

H)

for domains D, assertions A and an horizon H.


We can compare the cumulative rank structure of set theory

$V = V_\gamma$ (for an inaccessible ordinal $\gamma$)
$V_\alpha = \bigcup_{\beta < \alpha} V_\beta$ for all limit-ordinals
$V_{\alpha+1} = V_\alpha \cup P(V_\alpha)$ for all ordinals
$V_0 = \emptyset$

with the open structure of local universe


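[Diagram: the open structure of the local universe, with the horizon $H$ lying beyond the domains $D_n$, $D_{n-m}$ ($1 < m < n$), $D_{n-(n-1)}$, $D_{n-n}$.]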

Remark: Although we are not in a set-theoretic universe, the set-theoretic


symbols retain their usual meaning due to the fact that we are constructing
or observing the universe locally.
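
For comparison, the first finite ranks of the cumulative structure can be computed directly from the clause for successor ranks. This is a minimal sketch, with hypothetical function names, and it covers only the finite stages; limit ranks and the local universe itself are not modelled.

```python
from itertools import chain, combinations

def powerset(s):
    """All subsets of the finite set s, as frozensets."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return frozenset(frozenset(c) for c in subsets)

def next_rank(v):
    """Successor clause: the next rank is V united with its power set."""
    return v | powerset(v)

ranks = [frozenset()]  # V_0 is the empty set
for _ in range(4):
    ranks.append(next_rank(ranks[-1]))

print([len(v) for v in ranks])  # [0, 1, 2, 4, 16]
```

Each rank is a closed, completed set containing all the earlier ones, which is the feature the open structure of domains below a horizon is contrasted with.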
The model of local logic is a quadruple $M = (D_M, E_M, q_M, \varphi_M)$ where $D_M$
is the domain, $E_M$ its exterior and $q_M$ a relation of superposition which orders
the domains and their exteriors and $\varphi_M$ is a function which maps the (closed)
formulas of the theory into the natural numbers

$\varphi_M$: form $\to$ (0, 1)

Remark: D and E are effinite sequences where validity cannot be reduced
to finite validity - this explains why completeness does not obtain. The map
$\varphi_M$ is defined inductively in the following manner:

(1) $\varphi_M(A)[n] = 1$, iff $A \in D_{M_0}$
(2) $\varphi_M(\neg A)[n] = 1$, iff $\neg A \in E_{M_0}$
(3) $\varphi_M(A \wedge B)[n \times m] = 1$, iff $A \in D_{M_0}$ and $B \in D_{M_0}$
(4) $\varphi_M(A \vee B)[n + m] = 1$, iff $A \in D_{M_0}$ or $B \in D_{M_0}$
(5) $\varphi_M(A \to B)[n^m] = 1$, iff $A \in D_{M_0}$ implies $B \in D_{M_0}$
(6) $\varphi_M(\exists x A)[n + m + 1 \ldots] = 1$, iff $\Sigma A_n \in D_{M_0}$
(7) $\varphi_M(\forall x A)[n \times m \times 1] = 1$, iff $\Pi A_n \in D_{M_0}$
(8) $\varphi_M(\Xi x A)[n \times m \times 1 \ldots] = 1$, iff $\Pi A_n \ldots \in D_{M_0}$.

Remark: The assignment of natural numbers (and arithmetic operations)
is somewhat arbitrary (not unlike Gödel numbering), but serves only as
a general procedure for verification of validity in the sequence of natural
numbers. In clause 8, the dots mean that the sequence does not terminate
while it does in clause 7 - $\forall x$ means that we have a (finite) set and that universal quantification is limited to sets (instead of taking sums and products
over the variables, I indicate the quantification through the indexing of the
predicate).
2.2. Interpretation of the Logical Constants
I favour an arithmetical interpretation of constants, as can be seen from the
formulation of the model and that means that constants have arithmetical
existence. Not only disjunction and the existential quantifier, but also conjunction, negation, implication, universal and effinite quantification have
arithmetical import. Conjunction is seen as multiplication and the universal
quantifier as a finite product of numerical instances, disjunction as addition
and while the existential quantifier is a finite sum, the effinite quantifier
must be looked at as a continued product, as an effinite product or sequence,
not as an infinite sequence of conjunctions in set-theoretic semantics. Negation
and implication stand in a close relationship. Let us start with the relative
pseudo-complement expressed in the following way

$a \to b = \mathrm{Int}((X - a) \cup b) = c$

where $a$, $b$, $c$ are open subsets of a topological space $X$ and Int is the interior; $c$ is the greatest element different from $a$. $X - a$ is the difference or relative
complementation. Negation interpreted as arithmetic difference should be carefully distinguished from subtraction, when one remembers that the subtraction
sign is not a negation sign!
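
The relative pseudo-complement can be computed by hand in a finite topological space. The following is a minimal sketch on an assumed example, the two-point Sierpinski space; the space, the list `OPENS` and the function names are illustrative and do not come from the text. Negation is taken here as the pseudo-complement relative to the empty set.

```python
X = frozenset({0, 1})
# Hypothetical example: the Sierpinski topology on a two-point space.
OPENS = [frozenset(), frozenset({0}), X]

def interior(s):
    """Largest open set contained in s (union of all opens inside s)."""
    result = frozenset()
    for o in OPENS:
        if o <= s:
            result |= o
    return result

def implies(a, b):
    """Relative pseudo-complement of a with respect to b: Int((X - a) U b)."""
    return interior((X - a) | b)

def neg(a):
    """Negation as the pseudo-complement relative to the empty set."""
    return implies(a, frozenset())

a = frozenset({0})
print(neg(a))       # frozenset()
print(neg(neg(a)))  # frozenset({0, 1})
```

In this small space the double negation of `a` is the whole space and not `a` itself, which gives a concrete instance of a negation that is not involutive.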
If implication can be seen as a continuous curve only in a non-standard
model - the topological interpretation - we could interpret it arithmetically
as a Cauchy product of power series, since a continuous function can be
represented arithmetically by a power series

for $c_n = a_0 b_n + a_1 b_{n-1} + \ldots + a_n b_0$, that is, the Cauchy diagonal or product which
does not lead out of the realm of natural numbers unlike Cantor's diagonal
- of course, we have to reinterpret $\infty$ as the bad infinite of approximation,
but this is done easily with the effinite quantifier on constants $a_n$ and variables $x$. The net result is that we can have a concept of strict implication in
an arithmetical setting, thus dispensing with modalities but affording more than
Ackermann's positive fragment of strong implication.
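
The Cauchy product coefficients can be computed term by term, and each coefficient only involves finitely many earlier ones. The sketch below is illustrative: the function name and the truncation to finite coefficient lists are assumptions made for the example.

```python
def cauchy_product(a, b):
    """Coefficients c_n = a_0*b_n + a_1*b_(n-1) + ... + a_n*b_0,
    computed for n below the length of the shorter list."""
    n_terms = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(n_terms)]

# Truncated series with all coefficients equal to 1 (the series of 1/(1 - x)).
geometric = [1, 1, 1, 1, 1]
print(cauchy_product(geometric, geometric))  # [1, 2, 3, 4, 5]
```

With natural-number coefficients as input, every `c_n` is again a natural number, obtained by a finite sum of products.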
Remark: Modalities, like background information as in relevant logic, are
not compatible with the bare ontology of arithmetical logic and they can be
accommodated only in a less radical constructive logic.
To what extent is our constructive logic arithmetical? To see this, we
shall introduce arithmetic with (Fermat's) infinite descent and then reformulate Euclid's elementary proof of the infinity of primes which uses a primitive
form of infinite descent. Gentzen says in (Gentzen, 1969) that Euclid's proof
contains a somewhat disguised complete induction and translates it as such
in his system - although Gentzen distinguishes between infinite descent and
complete induction, he does not emphasize the constructive character of the
former. Here, I want a more direct approach than Gentzen's. From a constructivist viewpoint, complete induction (or Peano's postulate) and infinite
descent are not the same and it is important to stress the difference if one wants
to stick to the most stringent proof theory, as Gentzen undoubtedly wanted
to in his perpetuation of Hilbert's programme.
3. FERMAT ARITHMETIC

Fermat (1894) says of infinite (or indefinite) descent that it is an ἀπαγωγὴ εἰς
ἀδύνατον or a reductio ad absurdum. He applies his method to the problem
of right triangles (in rational integers) the area of which should be a square.
If there were such a triangle, Fermat says, there would be another one in smaller
integers with the same properties; and if there is a second, there must be a
third, a fourth, etc. still smaller and so on ad infinitum. But this is impossible, since there is no infinitely descending sequence in the natural numbers.
Let us remark first that the reductio is harmless here, since it is finitary and
the double negation that ensues is perfectly legitimate since it does not transcend the realm of the finite. The case is still more evident when Fermat
says that he has applied his method not only to negative questions, but also
to affirmative ones, such as "Any prime number, which is greater than a
multiple of 4 by one, must be composed of two squares". If there were a
prime number greater than a multiple of 4 by one, but not composed of squares,
there would be a smaller one of that nature and still smaller ones, till 5 is
reached, which is the least number having the said property. One must then
conclude by indirect proof that the theorem is true. Here, one might find that
we have the equivalent of the least number principle, but Fermat employs it
in a totally different context, that is, a purely arithmetical context. The essential difference lies in the strictly finite or constructive formulation of Fermat
and while infinite descent is perfectly acceptable as reductio ad absurdum,
complete induction and the least number principle when amenable to reductio


ad absurdum obey the excluded third principle via double negation for infinite
sets and are then rejected by intuitionist (Brouwerian) standards. No such reprobation affects infinite descent and I shall try to give some foundational
legitimation for infinite descent. Poincaré has insisted that infinite descent
(which he calls "recurrence") is not equivalent to complete induction (Poincaré,
1906).
What I call Fermat arithmetic is arithmetic with the schema of complete
induction replaced by the schema of infinite descent. Fermat arithmetic has
thus only one limit-ordinal, that is an ordinal without an immediate predecessor, namely 0; Peano arithmetic (as Heyting arithmetic) has at least two
such ordinals, 0 and $\omega$, and Cantor (or Gentzen) arithmetic admits of an arbitrary number of limit-ordinals - at least up to

$\lim_{n \to \infty} \left. \omega^{\omega^{\cdot^{\cdot^{\cdot^{\omega}}}}} \right\} n = \varepsilon_0.$

Fermat arithmetic is in this sense the only arithmetic that is not set-theoretic,
i.e., it does not need an infinite ordinal - even Skolem (or Herbrand or Gödel)
recursive arithmetic cannot do without.
The schema of complete induction

$\forall x([\forall y (y < x) Ay \to Ax] \to \forall x Ax)$

is deducible from Peano's induction postulate

$\forall x([A0 \wedge \forall x(Ax \to ASx)] \to \forall x Ax)$

for $Sx$ the successor of $x$; the variables $x = \{x, x_1, \ldots, x_n\}$ of the first-order
formula are replaced by the subsets (or properties) $X$ in the second-order
formula

$\forall X[X0 \wedge \forall y(Xy \to XSy)] \to \forall y Xy$;

transfinite induction is the schema which substitutes ordinals for natural numbers
in the schema of complete induction

$\forall \sigma([\forall \tau(\tau < \sigma \to A(\tau, x)) \to A(\sigma, x)] \to \forall \sigma A(\sigma, x))$

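where the $x$'s are free variables. Transfinite induction means complete induction up to $\lim(\omega) = \varepsilon_0$ in the following hierarchy:
$\omega = \lim (0, 1, 2, \ldots, n)$
$\omega 2 = \lim (\omega + n)$
$\omega^2 = \lim (\omega n)$
$\omega^\omega = \lim (\omega^n)$
$\omega^{\omega^\omega} = \lim (\omega^{\omega^n})$
$\varepsilon_0 = \lim (\omega^{\omega^{\cdot^{\cdot^{\omega}}}})$.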

Each member of the hierarchy has a last term $n$, since the hierarchy is based
upon the normal form that Cantor has given for any ordinal

$\xi = \omega^{\beta_1} n_1 + \omega^{\beta_2} n_2 + \ldots + \omega^{\beta_m} n_m$

where $\beta_1 > \beta_2 > \ldots > \beta_m$ and $m, n_1, n_2, \ldots, n_m$ are finite. This is part of Cantor's
second number class; this class contains the constructive ordinals of Kleene
and Church, and indeed it is denumerable (and recursively enumerable) up
to $\varepsilon_0$ in the set-theoretic sense because of the terms $n$, but it is not effectively enumerable in a strict sense, since one can find an ordinal $\alpha$ such that
$\alpha_n \neq \alpha$ for all $n$: one takes $\alpha = \lim \alpha_n$ and one has $\alpha_n < \alpha$.
Takeuti (1975) has attempted a justification of transfinite induction by using infinite descent in the form of strictly decreasing sequences

$$\mu > \dots > \mu_1 > \mu_0 \quad \text{for } \mu = \lim\,(\omega^{\alpha_n}).$$
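The ordering that Cantor's normal form induces on the ordinals below the limit of the hierarchy can be made concrete. What follows is only a minimal sketch under stated assumptions: the tuple encoding and the names `ZERO`, `nat`, `omega_power` and `compare` are illustrative choices, not notation from the literature cited here. An ordinal is encoded as a tuple of (exponent, coefficient) pairs with strictly decreasing exponents, each exponent being itself an ordinal in the same encoding.

```python
# Ordinals in Cantor normal form (illustrative encoding, an assumption).
# An ordinal is a tuple of (exponent, coefficient) pairs, exponents strictly
# decreasing, coefficients positive integers; exponents are ordinals too.

ZERO = ()  # the empty sum


def nat(n):
    """Encode the finite ordinal n as omega^0 * n."""
    return ((ZERO, n),) if n > 0 else ZERO


def omega_power(exponent, coefficient=1):
    """Encode omega^exponent * coefficient."""
    return ((exponent, coefficient),)


def compare(a, b):
    """Return -1, 0 or 1 according to whether a < b, a == b or a > b."""
    for (exp_a, coeff_a), (exp_b, coeff_b) in zip(a, b):
        by_exponent = compare(exp_a, exp_b)
        if by_exponent != 0:
            return by_exponent
        if coeff_a != coeff_b:
            return -1 if coeff_a < coeff_b else 1
    # All shared terms agree: the longer normal form is the larger ordinal.
    return (len(a) > len(b)) - (len(a) < len(b))


OMEGA = omega_power(nat(1))              # omega
OMEGA_TIMES_2 = omega_power(nat(1), 2)   # omega * 2
OMEGA_SQUARED = omega_power(nat(2))      # omega^2
OMEGA_OMEGA = omega_power(OMEGA)         # omega^omega
# Concatenation yields a valid normal form here because the exponents decrease.
OMEGA_PLUS_3 = OMEGA + nat(3)            # omega + 3

chain = [nat(7), OMEGA, OMEGA_PLUS_3, OMEGA_TIMES_2, OMEGA_SQUARED, OMEGA_OMEGA]
print(all(compare(x, y) == -1 for x, y in zip(chain, chain[1:])))
```

The comparison is decided term by term from the left, so only finitely many finite coefficients and nested exponents are ever inspected.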

The least number principle

$$\exists x\,Ax \to \exists x(Ax \wedge \forall y(y < x \to \neg A(y)))$$

can be obtained from the principle of complete induction by putting $\neg A$ for A in the schema of transfinite induction. While Buss (1986) introduces the $\Sigma^b_i$-LMIN axioms with the schema

$$\exists x\,Ax \to A(0) \vee \exists x(Ax \wedge \forall y \le \lfloor \tfrac{1}{2}x \rfloor\,(\neg Ay))$$

where A is a $\Sigma^b_i$ formula and $\lfloor \tfrac{1}{2}x \rfloor$ is the shift-right function (divide by two and round down), Nelson (1986) formulates the predicative version of the bounded least number principle in the following way: $\exists x_1 \dots x_\nu\, A \to \exists x_1 \dots x_\nu \min x_1 \dots x_\nu\, A$, where $\min x_1 \dots x_\nu\, A$ stands for:

$$A \wedge \neg\exists y_1 \dots \exists y_\nu\,((y_1 \le x_1 \wedge \dots \wedge y_\nu \le x_\nu) \wedge (y_1 \neq x_1 \vee \dots \vee y_\nu \neq x_\nu) \wedge A_{x_1 \dots x_\nu}[y_1 \dots y_\nu]).$$

Buss shows how $\Sigma^b_i$-LMIN axioms are equivalent to the $\Pi^b_i$-PIND or corresponding bounded induction axioms.
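For a decidable property, the least number principle and the weaker condition appearing in the LMIN schema can be checked by bounded search. The following is a minimal sketch, assuming a hypothetical decidable predicate `A` on natural numbers; the function names `descent_chain` and `is_length_minimal` are illustrative and not drawn from Buss or Nelson.

```python
def descent_chain(A, witness):
    """Descend from a known witness of A through ever smaller witnesses.

    The chain is strictly decreasing in the natural numbers, so it is finite;
    its last element is the least number satisfying A.
    """
    if not A(witness):
        raise ValueError("the starting point must satisfy A")
    chain = [witness]
    while True:
        x = chain[-1]
        # Look for the nearest smaller witness, scanning downwards from x - 1.
        smaller = next((y for y in range(x - 1, -1, -1) if A(y)), None)
        if smaller is None:
            return chain
        chain.append(smaller)


def is_length_minimal(A, x):
    """Condition of the LMIN schema: A(x) holds and A(y) fails for all y <= floor(x / 2)."""
    return A(x) and not any(A(y) for y in range(x // 2 + 1))


def A(n):
    return n % 7 == 3  # hypothetical decidable property


print(descent_chain(A, 24))
print(is_length_minimal(A, 3), is_length_minimal(A, 10))
```

A witness satisfying the second condition need not be the least one: for a property holding exactly of 5 and 6, the number 6 already satisfies it, which is why the schema is weaker than full minimisation.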

3.1. Formalisation of Infinite Descent


Fermat's arithmetic is characterised by the method of infinite descent and I
maintain that from the metamathematical point of view, that is from the
proof-theoretic point of view, infinite descent fulfills the role of induction
without requiring the notion of infinite set. It is obvious that Fermat did not
have the $\omega$-point of view in mind. Fermat says that he has invented the method
of infinite or indefinite descent, but it is already in nuce in Euclid. Take, for
example, Proposition 31 of book VII of the Elements: "Any composite number
can be divided by a prime number". The proof uses a decomposition or reduction which cannot go on indefinitely since any descending sequence of natural
numbers is finite. Fermat himself has put his method to use in his proof of


the impossibility of the Diophantine equation $x^4 + y^4 = z^2$, to which $x^4 + y^4 = z^4$ is reduced; this is a particular case of Fermat's last theorem

$$\forall n > 2\ \forall x \forall y \forall z\,(x^n + y^n \neq z^n).$$
The principle of infinite descent can be formulated as follows: if the existence of a property for a given n implies the existence of the same property for an arbitrary smaller number, then this property is possessed by still smaller numbers ad infinitum, which is impossible since any descending sequence of natural numbers is finite. In order to formalize this principle, we introduce here the quantifier $\Xi x$, the 'effinite' quantifier. In symbols, we have

$$\Xi x[Ax \wedge \exists y(y < x)\,Ay] \to \exists y\,\Xi z(z < y)\,Az$$

which means that the sequence is continuing on indefinitely, or rather 'effinitely'. We have the following schema:

$$\frac{[A(a) \in D_n \vdash A(a) \in D_{n-m}]\ (0 < m < n) \qquad A(a) \in D_i\ (0 \le i < m)}{\vdash\ \Xi x\, A(x)}$$

which corresponds to rule L15.
This principle of descent does not need a universal quantifier, only an
'effinite' quantifier for finite or rather indefinite descent; effinite still means
potentially infinite, indefinite sequences or Brouwer's 'infinitely proceeding
sequences'. To such effinite sequences, one could assign an 'unlimited' natural
number, as in (Nelson, 1986), while finite natural numbers are assigned to finite
initial segments (sets) of those sequences.
Since infinite descent is impossible - any descending sequence of positive
integers must stop at 0, the prepositional bound of the sequence of natural
numbers - one can add the following conclusion to our descent schema
$$\Xi x\{[Ax \wedge \exists y(y < x)\,Ay] \to \exists y\,\Xi z(z < y)\,Az\} \to \Xi x\,\neg Ax$$

which means that the property (or set of properties) postulated for the infinite
descent is false for all natural numbers 'effinitely'. The effinite quantifier of
the conclusion does not reach beyond what is contained in the premisses, it
is an indefinite ascent of the sequence of natural numbers.

3.2. Euclid's Theorem on the Infinity of Primes


It remains to show that our formalism can express in a most natural way
elementary theorems in number theory. Elementary has the usual meaning of non-transcendental, i.e., not relying on analytical methods like L-functions or holomorphic (entire) functions of complex analysis, infinite series, limits and so on;
elementary methods use only arithmetical properties of logarithms and finite
sums instead of infinite limits, for example. The prime number theorem which


asserts that the ratio of the number $\pi(x)$ of primes not exceeding a large x to $x/\log x$ tends to the limit 1 as x tends to infinity, that is

$$\lim_{x \to \infty} \frac{\pi(x)}{x/\log x} = 1,$$

has been proven by elementary means (by Selberg and Erdős), long after it
has been proven by analytical methods; the same holds for Dirichlet's theorem
on the infinity of primes in any arithmetical progression ax + b for a and b
relatively prime, i.e. (a, b) = 1. Since Euclid's theorem, like the fundamental
theorem of arithmetic on the unique representability of integers by a product
of primes, needs only constructive methods for its proof, it is the concept of
infinity which is at stake here. My contention is again that the concept is
dispensable and that one can eliminate it or paraphrase it as Brouwer did by
referring to 'infinitely proceeding sequences' (or, as I call them, 'effinite'
sequences). It is really an effinite process which is at work in those proofs;
Aristotle said in his Physics 203b, that the infinite is that which cannot be
crossed (ἀδιεξίτητος) - it may be worth noticing that Gentzen spoke rather
of a potential crossing or running through ("ein potentielles Durchlaufen")
of the infinite in his justification of transfinite induction. If the infinite cannot
be crossed, is the thought-experiment of a potential crossing in itself justifiable? In any case, the actual wording of Euclid's theorem is: "Prime numbers are more numerous than any definite quantity (of prime numbers)", which is Proposition 20 of Book IX of the Elements (see Davenport, 1968). It suffices for the proof to suppose that the sequence

$$p_1, \dots, p_k$$

enumerates all primes and we then form the number

$$n = p_1 \times p_2 \times \dots \times p_k + 1$$

which is equivalent, for the purposes of the proof, to $p_k! + 1$; here we use Proposition 31 of Book VII of the Elements which says: "Any composite number is divisible by a prime number". By definition, a composite number is a product of two smaller factors; if neither of them is a prime, then a factor is itself composite and can be divided further, until a prime is necessarily found, since there is no infinite descent in integers. Thus, the number n defined above must have a prime divisor and such a prime must differ from all $p_i$, $i = 1, \dots, k$, since $p_i$ does not divide n (there is a remainder).
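The two steps of the argument, the formation of the number n and the finite descent to a prime divisor, can be traced on a small case. This is a minimal sketch under my own naming assumptions (`proper_divisor`, `descend_to_prime` and `euclid_number` are illustrative helpers); it makes no claim about efficiency.

```python
import math


def proper_divisor(n):
    """Return a divisor d of n with 1 < d < n, or None if n is prime (n >= 2)."""
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return n // d  # the cofactor of the smallest divisor
    return None


def descend_to_prime(n):
    """Follow proper divisors downwards until a prime is reached."""
    chain = [n]
    while (d := proper_divisor(chain[-1])) is not None:
        chain.append(d)  # strictly smaller than the previous entry
    return chain


def euclid_number(primes):
    """Product of the given primes, plus one."""
    return math.prod(primes) + 1


primes = [2, 3, 5, 7, 11, 13]
n = euclid_number(primes)
chain = descend_to_prime(n)
new_prime = chain[-1]
print(n, chain, new_prime not in primes)
```

Every entry of the chain divides its predecessor, so the last entry is a prime divisor of n; since each listed prime leaves remainder 1, that prime lies outside the list. The chain is strictly decreasing, which is all that the descent requires.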
In short, Euclid's theorem asserts the existence of an effinite sequence of primes. Let's use $\sigma$ for that sequence. We know already that there is an effinite sequence of integers which is simply introduced by the rule of effinite induction

$$\text{L17}\qquad \frac{A(a) \in D_0 \ \vdash\ \begin{array}{c}[A(a) \in D_n]\\ A(a) \in D_{n+1}\end{array}}{\vdash\ \Xi x\, A(x)}$$


($A(a) \in D_0$ stands for $A(0)$.) In that context, infinite descent becomes the schema

$$\text{L18}\qquad \frac{\left[\begin{array}{c}A(a) \in D_n \ \vdash\ A(a) \in D_{n-1}\\ \vdots\\ A(a) \in D_{n-(n-1)}\\ A(a) \in D_{n-n}\end{array}\right]}{\vdash\ \Xi x\, \neg A(x)}$$

Here $\Xi x\, A(x)$ means $\exists\omega$ and $\Xi x\, \neg A(x)$ means $\neg\exists\omega$. L17 and L18 are analogues of L15 and L7. We can then formalise Euclid's proof in the following way:
LEMMA. Any composite number is divisible by a prime number. In symbols

$$\Xi x(\mathrm{Comp}\ x) \to \exists z(\mathrm{Prim}\ z) \wedge x|z.$$

Proof. We proceed by reductio ad absurdum and we want to prove

$$\Xi x(\mathrm{Comp}\ x) \to \neg\exists z(\mathrm{Prim}\ z) \wedge x|z$$

which we take as a formula in a domain $D_n$ ($x|z$ means that x is divisible by z):

$$\Xi x(\mathrm{Comp}\ x) \to \neg\neg\exists z(\mathrm{Prim}\ z) \wedge x|z.$$

We have a double negation, since the descent is finite. The conclusion is reached, because it is impossible to go on infinitely or rather effinitely (Fermat says also indefinitely) in a descending sequence. We pass now to the theorem on primes which says

THEOREM. "Prime numbers are more numerous than any definite quantity (of prime numbers)". In symbols, we simply write:

$$\forall\,\text{finite } t\ \Xi\omega\,((\mathrm{Prim}\ t \wedge \mathrm{Prim}\ \omega) \to t < \omega).$$

Proof. Note that $\Xi\omega$ can stand either for 'for $\omega$ effinitely' or for 'there is an effinite sequence $\omega$'. We take as given the following prime:

$$\exists z_n[z_n \le p_k! + 1 \wedge (\mathrm{Prim}\ z_n \wedge z_n > p)] \in D_n$$

defined above and show that all t's differ from it; we have to show that:


Suppose that $t = p_k!$, and $z_n \le p_k! + 1$. The only case of interest is $t = p$; but $z_n > p$, thus $t < z_n$. The fact that t is finite has been gotten by infinite descent and the statement of the theorem is obtained by effinite induction

$$\frac{(\mathrm{Prim}\ t \wedge \mathrm{Prim}\ \omega \to t < \omega) \in D_0 \ \vdash\ \begin{array}{c}[(\mathrm{Prim}\ t \wedge \mathrm{Prim}\ \omega \to t < \omega) \in D_n]\\ (\mathrm{Prim}\ t \wedge \mathrm{Prim}\ \omega \to t < \omega) \in D_{n+1}\end{array}}{\vdash\ \forall t\, \Xi\omega\,(\mathrm{Prim}\ t \wedge \mathrm{Prim}\ \omega \to t < \omega)}$$

where we have a double introduction, the universal quantifier, since it was understood that t is finite, and the effinite quantifier, for $\omega$ is not finite, being greater than t. □
Note that this induction is reducible to the (finite) induction on natural numbers,
i.e., it says simply that to any (prime) natural number there is a greater one.
Only the decomposition of composite numbers into primes needs infinite
descent. A detailed analysis of the proof would exhibit a logical structure
(with intelim rules) that is not more complicated, but more explicit than the
mathematical argument. However, the important features of Euclid's proof have
been put in the crude light of a constructive logic and shown to rest on radical
assumptions about the infinite. No infinite set, no $\omega$, no induction postulate
other than effinite descent (or induction) is necessary. Infinite descent is not
always effective and it is often used in a non-constructive way (see Ireland and
Rosen, 1982). I hope to have made it clear enough that only effinite quantification is required if arithmetic is to be given its barest logical expression.
Why such a need for a naked ontology of mathematical entities? Not because
of the paradoxes, antinomies and other oddities, but for the sake of intelligibility which amounts to foundational relevance and empirical adequacy, I mean
in agreement with mathematical and logical practice.
4. CONCLUDING REMARKS

We have obtained an internal consistency proof for arithmetic with infinite descent (Gauthier, 1993) without resorting to transfinite induction and without
the detour of an infinite set (of natural numbers). The system of constructive logic described here is translated in a calculus of polynomials with the
convolution product. Fermat arithmetic is then coupled with Kronecker's
general arithmetic of indeterminates (Unbestimmte) in order to decompose
the polynomial (logical) content of local implication and the effinite quantifier via infinite descent on the coefficients of the convolution product. The
resulting arithmetic F is shown to be consistent in a constructive fashion.
It encompasses most of number theory and a large part of contemporary
arithmetic (algebraic) geometry - from Weil to Grothendieck. Is such a
constructive arithmetic sufficient for mathematics and physics? My claim is
that a purely arithmetical interpretation (or reduction) of topological (geometrical) concepts is possible and that the minimal logic of interaction which


serves as a foundation for Quantum Mechanics (and Relativity Theory) can


be rendered into a polynomial translation which avoids infinities of all sorts,
mathematical or physical.
The practical uses of local logic, e.g., in theoretical computer science,
are not immediate but local logic and fragments of F arithmetic lend themselves directly to elementary computable (bounded) functions, since they are
straightforwardly constructive. More work needs to be done in particular cases
and I shall rest content if I can steer a middle course between gross generality and pointless detail by laying claim to a strict logic and an immodest
philosophy.
Université de Montréal

REFERENCES

Buss, S. R., 1986, Bounded Arithmetic, Bibliopolis, Naples.
Davenport, H., 1968, The Higher Arithmetic, Hutchinson, London.
Fermat, P. de, 1894, Oeuvres, Vol. 2, Gauthier-Villars, Paris.
Gauthier, Y., 1983, 'Quantum Mechanics and the Local Observer', International Journal of Theoretical Physics 22, 1141-1152.
Gauthier, Y., 1985, 'A Theory of Local Negation: The Model and Some Applications', Archiv für mathematische Logik und Grundlagenforschung 25, 127-143.
Gauthier, Y., 1989, 'Finite Arithmetic with Infinite Descent', Dialectica 43, 329-337.
Gauthier, Y., 1990, 'Logical and Philosophical Foundations for Arithmetical Logic', in A. D. Irvine (ed.), Physicalism in Mathematics, University of Western Ontario Series in Philosophy of Science, Kluwer, Dordrecht, pp. 331-342.
Gauthier, Y., 1991, De la logique interne, collection 'Mathesis', Vrin, Paris.
Gauthier, Y., 1992, La logique interne des théories physiques, collection 'Analytiques', Bellarmin/Vrin, Montréal/Paris.
Gauthier, Y., 1993, 'An Internal Consistency Proof for Arithmetic with Infinite Descent', Preprint, Cahier du département de philosophie, 93-15, Université de Montréal, Montréal.
Gauthier, Y., 1994, 'Hilbert and the Internal Logic of Mathematics', Synthese 101, 1-14.
Gentzen, G., 1969, Collected Papers, M. E. Szabo (ed.), North-Holland, Amsterdam.
Girard, J. Y., 1987, Proof Theory and Logical Complexity, vol. 1, Bibliopolis, Naples.
Herbrand, J., 1971, Logical Writings, W. Goldfarb (ed.), Harvard University Press, Cambridge, Mass.
Ireland, K. and Rosen, M., 1982, A Classical Introduction to Modern Number Theory, Springer, New York/Heidelberg/Berlin.
Nelson, E., 1986, Predicative Arithmetic, Princeton University Press, Princeton.
Poincaré, H., 1906, 'Les mathématiques et la logique', Revue de Métaphysique et de Morale 14, 17-34 & 294-317.
Takeuti, G., 1975, Proof Theory, North-Holland, Amsterdam.
Weyl, H., 1968, Gesammelte Abhandlungen, K. Chandrasekharan (ed.), vol. 3, Springer, Berlin/Heidelberg/New York.

JUDY PELHAM

A RECONSTRUCTION OF
RUSSELL'S SUBSTITUTION THEORY

In "Russellian Propositions" (Pelham, 1994), Alasdair Urquhart and I elaborate a theory of structured propositions that closely parallels Russell's
substitution theory of 1905-08. The present paper elaborates how our reconstruction of the substitution theory models Russell's philosophical ideas about
the structure of propositions as well as his axioms for substitution in writings
dating from the period around December 1905.
I. RUSSELL'S VIEW OF THE ELEMENTS OF SUBSTITUTION

In his attempts around 1905 to find a consistent resolution to all forms of Russell's paradox, Russell tried to develop a consistent type-free logic. That
is, he tried to develop a logic in which everything which counts as a term
(anything which is an entity from a metaphysical viewpoint) is a possible value
to the variable of quantification. This view is part of Russell's conviction
that logic is universal, in the sense that its elements are all possible objects
of thought.¹ In The Principles of Mathematics (Russell, 1903), Russell believes
predicates and relations are terms and hence possible values to the variable,
and this fact engenders the paradox concerning predicates which do not apply
to themselves. The central idea of the substitution theory is that predicates
are incomplete symbols, and thus can be eliminated from our ontology as
well as our logic. The notion of the substitution of one entity for another in
a proposition allows us to replace the primitive notion of predicates.
There is only one paper by Russell on the substitution theory presently published;² it is "On the Substitutional Theory of Classes and Relations" (Russell, 1906) (henceforth abbreviated STCR). In STCR a class is an incomplete symbol in the same way as a definite description is an incomplete symbol in 'On Denoting' (Russell, 1905a). Classes, relations, and predicates are parts of the grammatical form of sentences which are correctly analyzable as complexes containing propositions, terms, and substitution. A substitution sentence is of the form $p\frac{b}{a}!q$ (written $p/a;b!q$ in STCR), which means "q results from p by substituting b for a in all those places (if any) in which a occurs in p" (Russell, 1973, 168). The paradigm case is one in which p is a proposition, and a is a constituent of p. The substitution sentence "says" that p with one component replaced by another is q. So if p is 'Socrates is human', a is Socrates, and b is Plato, $p\frac{b}{a}!q$ is "'Socrates is human' with Plato substituted for Socrates is 'Plato is human'".
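The example can be traced mechanically. The following is a minimal sketch, assuming a toy encoding that is mine and not Russell's: a proposition is a nested tuple whose first element is a label for its form and whose remaining elements are its constituents. The names `substitute` and `substitution_holds` are illustrative.

```python
def substitute(p, a, b):
    """Return the result of substituting b for a wherever a occurs in p."""
    if p == a:
        return b
    if isinstance(p, tuple):
        label, *parts = p
        # The label is not an entity in this encoding, so it is never replaced.
        return (label, *(substitute(part, a, b) for part in parts))
    return p


def substitution_holds(p, a, b, q):
    """Truth of the substitution sentence: q results from p by putting b for a."""
    return substitute(p, a, b) == q


socrates_is_human = ("human", "Socrates")
print(substitute(socrates_is_human, "Socrates", "Plato"))
print(substitution_holds(socrates_is_human, "Socrates", "Plato", ("human", "Plato")))
```

When a does not occur in p, the function returns p unchanged, so the substitution sentence still receives a truth value.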
The present paper (as well as the theory developed in Pelham, 1994) is based largely on an unpublished manuscript of Russell's dated Dec. 22, 1905, and called 'On Substitution' (Russell, 1905b). This manuscript contains an initial
M. Marion and R. S. Cohen (eds.), Quebec Studies in the Philosophy of Science I, 123-133.
© 1995 Kluwer Academic Publishers.


section introducing the notion of substitution as well as "the axioms required on this subject." In this manuscript, Russell defines the notation $p\frac{b}{a}$ as a definite description meaning the result of substituting b for a in p. If p, a, and b have the same meaning as above, $p\frac{b}{a}$ stands for 'Plato is human.' Thus the symbol $p\frac{b}{a}$ is a definite description which is an abbreviation of "the unique x such that $p\frac{b}{a}!x$." The important notion of a propositional function is represented as $p\frac{x}{a}$, and thus speaking correctly propositional functions are incomplete symbols. Speaking informally, and following the example above, 'x is human' could be represented by $p\frac{x}{a}$. In the substitution theory, propositional functions are incomplete symbols, and they are not part of the Russellian ontology.
Russell extends the notion of an incomplete symbol one step further in constructing the notion of a class. Russell uses the symbol $p/a$ to represent a class, in the sense that p and a set up a matrix which associates p with all the propositions similar to it with respect to a. This results in the definition: 'b is a member of the class $p/a$' means '$p\frac{b}{a}$ is true.' So following the example above, 'Plato is a member of the class of human things' is defined to mean 'the result of substituting Plato for Socrates in p is true.' To obtain the classes of classes required for the derivation of arithmetic, Russell makes propositions the variables of substitution in complex propositions. In STCR, Russell gives the example of the construction of the cardinal number 0.

$$(x).\sim\left(p\frac{x}{a}\right) \qquad \text{(Zero)}$$

Considering (Zero) as the proposition in which substitutions take place, and p and a as the substitution constants, those substitutions which make (Zero) true will pick out those matrices, $s/c$, such that no substitution for c in s yields a true proposition. Thus the matrix (Zero)/(p, a) plays the role of the cardinal number zero in STCR. In Russell's words:
According to this definition, 0 is a relation between a proposition and an entity, namely the
relation that, whatever we may substitute for the entity in the proposition, the result is always
false. (Russell, 1973, p. 175)

The relation between the proposition and the entity that Russell speaks of
here is a logical construct, a "matrix" as he calls it. That is, relations do not
exist as independent abstract objects, but zero is a relation which is specifiable in logical terms using the notions of a proposition and an entity. As
such this sort of relation is an instance of a logical fiction.
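Class membership and the matrix playing the role of zero can be illustrated on a finite toy world. This is only a sketch under loud assumptions: Russell's variable ranges over all entities whatsoever, whereas the hypothetical lists `ENTITIES` and `TRUE_PROPOSITIONS` below are finite stand-ins, and the nested-tuple encoding of propositions is mine.

```python
def substitute(p, a, b):
    """Substitute b for a throughout the nested-tuple proposition p."""
    if p == a:
        return b
    if isinstance(p, tuple):
        return (p[0], *(substitute(part, a, b) for part in p[1:]))
    return p


# Hypothetical toy world: a finite stock of entities and of true propositions.
ENTITIES = ["Socrates", "Plato", "Bucephalus"]
TRUE_PROPOSITIONS = {("human", "Socrates"), ("human", "Plato"), ("horse", "Bucephalus")}


def is_true(p):
    return p in TRUE_PROPOSITIONS


def is_member(b, p, a):
    """'b is a member of the class p/a' means: p with b substituted for a is true."""
    return is_true(substitute(p, a, b))


def is_zero(p, a):
    """The matrix p/a has no members: every substitution for a in p is false."""
    return not any(is_member(x, p, a) for x in ENTITIES)


humans = (("human", "Socrates"), "Socrates")      # the matrix p/a as a pair (p, a)
unicorns = (("unicorn", "Socrates"), "Socrates")
print(is_member("Plato", *humans), is_member("Bucephalus", *humans))
print(is_zero(*humans), is_zero(*unicorns))
```

No class and no relation appears as an object here; only propositions, entities and the substitution operation are used, which is the sense in which the class symbol is incomplete.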
As Russell understands logic to be completely general, the variables of quantification, and so the substitution operation, are taken to apply to all entities
whatsoever. This means that although a may not occur in p, the substitution
sentence, $p\frac{x}{a}!q$, must be true or false. Russell stipulates that if a does not
occur in p, then the result of any substitution for a in p is p. Russell reveals
this tacit stipulation with a definition of 'a ex p' in STCR (Russell, 1973,
p. 169), which says a is not a constituent of p if every substitution for a in


p just results in p itself. In the earlier 'On Substitution' manuscript the same definition is written 'a out p', and it is listed as Proposition 12.14 in the definitions below.


Russell's substitution theory is designed to be an untyped theory in which
all terms, including individuals and propositions, are possible arguments to the
variable of quantification. Propositional functions are not terms, they are
replaced by constructions involving the substitution relation. In Russell's
words:
In order to get the kind of results which we used to get by considering "for any value of $\phi$", we need the idea of substitution. By this I mean the substitution of a constant for a constant, which is quite a different thing from the determination of a variable as this or that constant. $p\frac{x}{a}!q$ is to mean: "The substitution of x for a wherever a occurs in p turns p into q". (Russell, 1905b)

The elimination of predicates in this way is an attempt to resolve the contradiction concerning classes and predicates without developing a typed theory.
Russell says explicitly in STCR that propositions are to be taken as fundamental and not relations or predicates:
But in symbolic logic, it is best to start with propositions as our data; what is prior to propositions is not yet, so far as I know, amenable to symbolic treatment, ... (Russell, 1973, p. 175)

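Some of the definitions which Russell presents in the 1905 manuscript are given here:

DEFINITIONS

12.1    $x = y \ .=:\ (p, q, r, a) : p\frac{x}{a}!q \ .\ p\frac{y}{a}!r \ .\supset.\ q \equiv r$

12.14   $a \text{ out } p \ .=.\ (x).\ p\frac{x}{a}!p$

12.141  $a \text{ in } p \ .=.\ \sim(a \text{ out } p)$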

Definition 12.1 is a way of representing the Leibnizian definition of identity in terms of substitution. The definition says that two terms are equal when the results of substituting each of them into the same proposition are two equivalent propositions. This definition is abandoned in the later STCR in favour of $x\frac{y}{x}!x$ (Russell, 1973, p. 169). This definition assumes that substitution preserves equality, that is, that substitution preserves the structure which allows us to individuate and identify things. Russell's notion of the structure required to identify propositions is what the reconstruction of section II attempts to elucidate. Definitions 12.14 and 12.141 use Russell's convention that when a is not a constituent of p, $p\frac{x}{a}$ is simply p itself, to define the notion of a constituent of a proposition. A proposition has a as a constituent when there is some term which, when substituted for a in p, will yield a different proposition than p. Here are some of the axioms for substitution Russell adopted in his 1905 manuscript (Russell, 1905b):

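AXIOMS³

12.2    $\vdash:\ (\exists q).\ p\frac{x}{a}!q$

12.201  $\vdash:\ p\frac{x}{a}!q \ .\ p\frac{x}{a}!r \ .\supset.\ q = r$

12.21   $\vdash.\ p\frac{x}{p}!x$

12.211  $\vdash.\ p\frac{x}{x}!p$

12.212  $\vdash:\ a \neq \sim p \ .\ p\frac{x}{a}!q \ .\supset.\ (\sim p)\frac{x}{a}!(\sim q)$

12.22   $\vdash:\ \sim(y).\ a \text{ in } (\psi!y) \ .\supset.\ (\psi!a)\frac{x}{a}!(\psi!x)$

12.24   $\vdash:\ a \text{ in } p \ .\ p \text{ in } q \ .\supset.\ a \text{ in } q$

12.241  $\vdash:\ a \text{ in } b \ .\ b \text{ in } a \ .\supset.\ a = b$

12.25   $\vdash:\ \sim(x, y).\ a \text{ in } \phi!(x, y) \ .\supset.\ \{(y).\phi!(a, y)\}\frac{a'}{a}!\{(y).\phi!(a', y)\}$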

These axioms specify properties of substitution. Axioms 12.2 and 12.201 say
that there is a unique value for each substitution operation. Axioms 12.24
and 12.241 say that being a constituent of a proposition is a transitive and
antisymmetric notion. Those remaining say that with respect to wholes and parts,
substitution behaves in a way that preserves "logical" structure. These axioms
show that Russell did not hold that propositions with the same truth value were
substitutable for one another. They make it clear that Russell had a more
complex notion of propositions than simply the bearers of truth-values. He
thought of propositions as structured objects which seem to involve predicates,
according to axioms 12.22 and 12.25. However it is odd that Russell employs
a variable which seems to range over predicates, for after all predicates are
not supposed to be entities on the substitution theory. If they are not primitives of the theory, it is puzzling that they are mentioned in stating the
fundamental axioms. The answer to this puzzle I believe lies in the following
passage from STCR:
It should be observed that the relations identified with dual matrices are (approximately) relations in extension. If we say 'x begat y', the word begat expresses the same relation in intension as was expressed in extension by our matrix $p/(a, b)$. The drawback to relations in intension, from the standpoint of symbolic logic, is that not all propositional functions of two variables correspond to relations in intension, just as not all propositional functions of one variable correspond to predicates .... Relations in intension are of the utmost importance to philosophy and philosophical logic, since they are essential to complexity, and thence to propositions, and thence to the possibility of truth or falsehood. But in symbolic logic, it is best to start with propositions as our data; .... (Russell, 1973, pp. 174-5)

Intensional relations are not entities according to the substitution theory; instead, propositions are the fundamental data of logic and extensional relations are constructed from the substitution operation on propositions.
Intensional relations are not amenable to symbolic treatment because Russell
found that if they are admitted as possible values of the variable the logic
becomes inconsistent with some form of the contradiction. However Russell
believed intensional relations are relevant to logic because they are fundamental to the complexity of propositions. The predicate variables in the axioms
of substitution stand for relations which are not entities and thus are not in
the domain of quantification, but which are constituents of propositions and
which are necessary for determining what substitution sentences are true.
Substitution in this way allows us to construct extensional relations (those
things which we can countenance without contradiction) on the basis of a richer
and more elaborate propositional structure which includes intensional relations.
The next section presents my formal elaboration of this interpretation of
Russell's view.
II. THE RECONSTRUCTION

It is a prominent feature of Russell's work throughout this period that he did not observe the distinctions between language and metalanguage, and semantics and syntax. In failing to observe the latter distinction Russell was firm in his belief that the goal of logical analysis is the elucidation of abstract structures; he simply assumed that a correct symbolism would mirror the structure of the objects being described.⁴ For the purposes of the reconstruction we follow Russell's lack of such a distinction. The reconstruction develops a semantic structure which exhibits many features of a formal language.
Elements of the Reconstruction

Individuals: we assume there is a non-empty set I of logically simple entities.
Variables: a, b, c, ... p, q, r, s, ... x, y, z. The variables are place-holders which range over all entities.
Predicates: The only specific predicates used are:
    Identity: x is identical with y is written x = y.
    Substitution: p with q substituted for r gives s is written Spqrs, or $p\frac{q}{r}!s$.
Connectives: $\sim$ and $\supset$.
Universal and existential quantifiers: $(\xi)$, $(\exists\xi)$.

The elements given here agree for the most part with the notation Russell used, and thus with basic elements of his ontology. Russell clearly intended the


variables to range over individuals and propositions. The reconstruction introduces a category of individuals which are logically simple, that is, which
have no logical constituents. This category seems implicit in Russell's work.
The reconstruction also employs $\Sigma$ as a metavariable ranging over predicates and $\xi$ as a metavariable ranging over variables, which Russell, without the notion of a metalanguage, did not do. The connectives, like the predicates, are functions mapping the set of all entities to propositions, which clearly agrees with Russell's practice. Quantifiers are a special proposition-forming operator, since they form propositions from things which are neither individuals nor propositions. The propositional function variables, $\phi!x$, used in Russell's presentation of the substitution axioms, are construed as metavariables from the point of view of the reconstruction.
Russell understood propositions as complexes built up from individuals and
intensional predicates, and the goal of the reconstruction is to specify how
propositions are built up. For Russell the only intensional predicate which logic
requires is substitution, and identity can be defined in terms of the substitution operation. However, the reconstruction gives truth-conditions for the
substitution sentences in terms of the identity of the structure involved, so it
seems natural to employ identity and substitution as primitive.
Propositional Forms and Propositions

I introduce the notion of a propositional form as an intermediary in the construction of propositions. The set of propositional forms, R(I), is recursively defined by the following clauses:
(1) A variable standing alone is a propositional form.
(2) If $\Sigma$ is a k-place predicate, and $\alpha_1, \dots, \alpha_k$ are propositional forms or members of I, then $\Sigma(\alpha_1, \dots, \alpha_k)$ is a propositional form.
(3) If $\alpha$ is a propositional form or a member of I, then $\sim\alpha$ is a propositional form.
(4) If $\alpha$, $\beta$ are propositional forms or members of I, then $\alpha \supset \beta$ is a propositional form.
(5) If $\alpha$ is a propositional form and $\xi$ is a variable, then $(\xi)\alpha$ is a propositional form.
(6) Only formulae resulting from repeated applications of the preceding five rules are propositional forms.
R(I) is the smallest set of objects which fulfils these conditions. Each propositional form may be thought of as an ordered sequence which consists of
the main operator, followed by each of its arguments in order, and each of these
arguments is followed by each of its arguments and so on until the end as in
Polish notation. It is helpful to simply diagram a propositional form as an
inverted tree, thinking of the predicates, connectives and quantifiers as nodes
with one or more branches, and individuals, variables, and propositions as
leaves.


Using the notion of the formation tree of a propositional form, I give the definition of the identity of two propositional forms:

Two propositional forms, $\alpha$ and $\beta$, are identical iff they have the same formation tree, or if the formation tree of one is obtained from the formation tree of the other by alphabetic change of bound variables.

In this way distinct variables make distinct propositional forms, except in the case of the variables of quantification. Thus while $Fx \neq Fy$, $(x)Fx = (y)Fy$.
Propositions are those propositional forms which do not contain a free variable. An adequate definition of proposition requires that the notion of a constituent of a propositional form have a precise meaning. The notion of a constituent of a propositional form parallels the inductive definition of propositional form already given. The constituents of a propositional form are determined by the following clauses:
(1) Every propositional form is a constituent of itself.
(2) $\alpha_1, \dots, \alpha_k$ are constituents of the propositional form $\Sigma(\alpha_1, \dots, \alpha_k)$.
(3) $\alpha$ is a constituent of $\sim\alpha$.
(4) $\alpha$ and $\beta$ are constituents of $\alpha \supset \beta$.
(5) $\alpha$ is a constituent of $(\xi)\alpha$.
(6) If $\gamma$ is a constituent of $\eta$ and $\eta$ is a constituent of $\theta$, then $\gamma$ is a constituent of $\theta$, with the following exception: no bound occurrence of a variable is a constituent of any propositional form.

The notion of the occurrence of a propositional form requires some further
explanation. Abstract objects, including propositional forms, are understood
by analogy with syntactic items of language, and as such abstract objects admit
of tokens and types. A given abstract object, α, may occur in a propositional
form more than once, as in α = α. The form α = α contains two qualitatively identical
occurrences of the single abstract object α. The single abstract object is defined
by its abstract contents and its rules of formation, but it admits of numerically distinct copies. Clause 6 of the definition of propositional constituents
entails that (x)α has α as a constituent, as well as all the constituents of α
excluding x. Thus x is a constituent of x = x, in which it has two occurrences; however,
x is not a constituent of (x). x = x. With the formal definition of constituent
in hand, a proposition may now be formally defined as a propositional form
which does not contain a variable as a constituent.
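The clauses for constituents, together with the resulting definition of a proposition, can be sketched in Python as follows. This is only an illustration: the tuple encoding and the function names `constituents` and `is_proposition` are assumptions of the sketch, and identity is written "=" as an ordinary two-place predicate. Predicates are not collected, in keeping with the view that they are not constituents.

```python
# Illustrative encoding (an assumption of this sketch):
#   ("var", "x"), ("ent", "a"), ("pred", "F", (t1, ...)),
#   ("not", a), ("imp", a, b), ("all", "x", a)

def constituents(form, bound=frozenset()):
    """Collect the constituents of a form following clauses (1)-(6)."""
    kind = form[0]
    if kind == "var":
        # Exception to clause (6): a bound occurrence is not a constituent.
        return set() if form[1] in bound else {form}
    found = {form}  # clause (1): every form is a constituent of itself
    if kind == "ent":
        return found
    if kind == "pred":
        parts = form[2]               # clause (2)
    elif kind == "not":
        parts = (form[1],)            # clause (3)
    elif kind == "imp":
        parts = (form[1], form[2])    # clause (4)
    elif kind == "all":
        parts = (form[2],)            # clause (5)
        bound = bound | {form[1]}
    else:
        raise ValueError(f"unknown form: {form!r}")
    for part in parts:
        found |= constituents(part, bound)  # clause (6): transitivity
    return found


def is_proposition(form):
    """A proposition has no variable among its constituents."""
    return not any(c[0] == "var" for c in constituents(form))


x = ("var", "x")
x_eq_x = ("pred", "=", (x, x))
closed = ("all", "x", x_eq_x)

print(x in constituents(x_eq_x))   # x is a constituent of x = x
print(x in constituents(closed))   # but not of (x). x = x
print(is_proposition(closed), is_proposition(x_eq_x))
```

Because the result is a set, the two occurrences of x in x = x count as one constituent, which mirrors the distinction between an abstract object and its numerically distinct copies.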
The substitution operation is understood as the replacement of one constituent of a propositional form by another. Here is the formal definition:

α[β/γ] is the result of replacing the propositional form γ wherever it
occurs in α with the propositional form β iff γ meets the following
two conditions:
(i) no variable free in γ is bound in α
(ii) no variable free in β is bound by the substitution into α

If γ does not meet these conditions, or if γ is not identical to any constituent
of α, then α[β/γ] is α.
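A hedged Python sketch of the substitution operation may help to fix ideas. The encoding and all function names are assumptions of the sketch. Condition (i) is read here as requiring that no variable free in γ be quantified anywhere in α, and occurrences of γ are matched by plain structural equality rather than by identity up to change of bound variables; both are simplifications made for the illustration.

```python
# Illustrative encoding (an assumption of this sketch):
#   ("var", "x"), ("ent", "a"), ("pred", "F", (t1, ...)),
#   ("not", a), ("imp", a, b), ("all", "x", a)

def free_vars(form, bound=frozenset()):
    kind = form[0]
    if kind == "var":
        return set() if form[1] in bound else {form[1]}
    if kind == "ent":
        return set()
    if kind == "pred":
        return set().union(*(free_vars(t, bound) for t in form[2]))
    if kind == "not":
        return free_vars(form[1], bound)
    if kind == "imp":
        return free_vars(form[1], bound) | free_vars(form[2], bound)
    return free_vars(form[2], bound | {form[1]})  # kind == "all"


def quantified_vars(form):
    kind = form[0]
    if kind in ("var", "ent"):
        return set()
    if kind == "pred":
        return set().union(*(quantified_vars(t) for t in form[2]))
    if kind == "not":
        return quantified_vars(form[1])
    if kind == "imp":
        return quantified_vars(form[1]) | quantified_vars(form[2])
    return {form[1]} | quantified_vars(form[2])  # kind == "all"


def _replace(form, beta, gamma, bound):
    """Return (new form, whether a free variable of beta was captured)."""
    if form == gamma:
        return beta, bool(free_vars(beta) & bound)
    kind = form[0]
    if kind in ("var", "ent"):
        return form, False
    if kind == "pred":
        parts = [_replace(t, beta, gamma, bound) for t in form[2]]
        return ("pred", form[1], tuple(p for p, _ in parts)), any(c for _, c in parts)
    if kind == "not":
        body, captured = _replace(form[1], beta, gamma, bound)
        return ("not", body), captured
    if kind == "imp":
        left, c1 = _replace(form[1], beta, gamma, bound)
        right, c2 = _replace(form[2], beta, gamma, bound)
        return ("imp", left, right), c1 or c2
    body, captured = _replace(form[2], beta, gamma, bound | {form[1]})
    return ("all", form[1], body), captured  # kind == "all"


def substitute(alpha, beta, gamma):
    """alpha[beta/gamma]; returns alpha unchanged if a condition fails."""
    if free_vars(gamma) & quantified_vars(alpha):   # condition (i)
        return alpha
    result, captured = _replace(alpha, beta, gamma, frozenset())
    return alpha if captured else result            # condition (ii)


x, y, a = ("var", "x"), ("var", "y"), ("ent", "a")
Fx = ("pred", "F", (x,))
print(substitute(Fx, a, x))                                  # F applied to a
print(substitute(("all", "y", ("pred", "R", (x, y))), y, x)) # unchanged: y would be captured
```

If γ does not occur in α at all, the traversal simply rebuilds α, so the final clause of the definition needs no separate case.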
The construction presented in the last three pages gives us a semantic
structure which is three-tiered, in the sense that it contains three different kinds
of objects. The first kind is entities, and entities are possible values of the unrestricted variable of quantification; ontologically these have the status of things
in the world. They include individuals and propositions. The second kind is
propositional forms. Propositional forms are not entities, but they may be
constituents of propositions, or the patterns which propositions instantiate.
In their simplest form they are variables. The third kind is predicates, which
are the nodes which are used in building up propositional forms. Predicates
are not entities, and they are not considered constituents of propositions or
propositional forms. Propositional forms and entities are both capable of
having and of being constituents in other logical structures. Predicates neither
have constituents, nor are they capable of being constituents to any other
complex.
This three-tiered structure of the reconstruction agrees with many of the
things which Russell says in his manuscript about the nature of propositions
and their constituents. Consider the following passage from the substitution
manuscript of Dec. 1905:
An expression containing x is called a dependent variable; thus "x is a man" is a dependent
variable; so is "x - x" or "x ~ x" or any other expression of which x is a constituent. Such an
expression will be called φ!x. Here φ!x by itself is not supposed to have any meaning, so long
as x is not determined. [bold mine] (Russell, 1905b)

What I have called propositional forms, Russell thought of as dependent variables. The determining property of dependent variables or propositional forms
is that each is capable of containing real variables as constituents. Russell
believed that the variable is a constituent of the propositional function; however,
propositional functions themselves do not have any meaning. Russell continues:
φ is not a constituent of φ!a or φ!x, and when φ occurs in an expression, we cannot put (φ)
before the expression to obtain an analogue of (x). φ!x. A statement of the form "For any value
of φ, so-and-so is true" is meaningless. (Russell, 1905b)

Here Russell is speaking about what the reconstruction calls predicates. Neither
predicates nor propositional forms are entities, but propositional forms
yield a proposition under certain circumstances, and predicates do not. Further,
predicates are not capable of having constituents, as propositional forms are.
Russell's own manuscripts suggest a three-tiered hierarchy of elements.
Russell's manuscript also agrees with the reconstruction in that Russell does
not think of a bound variable as a constituent of a universally quantified
sentence, as the following theorem, taken from the Dec. 1905 manuscript
shows:
⊢: x in {(y). φ!y} .≡. (y). x in φ!y    (Russell, 1905b)

This theorem effectively says that a given object x is only a constituent of a
quantified proposition if it is a constituent of every instantiation of the propositional form quantified over.
Using the elements of the reconstruction it is easy to construct truth-conditions for the sentences of the substitution theory which make Russell's axioms
concerning non-quantified propositions valid, in the sense that every proposition which results from substituting an entity for a variable in a propositional
form is true. The truth definitions are as follows, where α and β range over
the set of propositions of the substitution theory:

α = β is true if and only if α and β have the same formation tree.

Sαβγδ is true if and only if α[β/γ] is identical with δ.

α in β is true if and only if α is a constituent of β.

α ∨ β is true if and only if either α is true or β is true.

¬α is true if and only if α is not true. (I.e. if α is either false or
not a proposition.)

The natural extension of these definitions to quantified propositions is: (x). α
is true if and only if for any object β, α[β/x] is true. However, the adoption of
this definition leads to a particular form of the paradox. Russell discovered
this paradox and reported it in a letter to Ralph Hawtrey in Jan. 1907 (Russell,
1907). The paradox can be expressed in the reconstruction of the substitution theory almost exactly as Russell reports it to Hawtrey. This substitution
paradox is explained in detail in Pelham (1994). The inability to avoid this
paradox eventually caused Russell to abandon the substitution theory in favour
of the ramified theory of types.
The reconstruction of the substitution theory given here permits a consistent three-valued interpretation of the substitution theory which would preserve
some of the merits Russell intended for it. The substitution paradox Russell
reported to Hawtrey involves an existentially quantified sentence (called 'R'
in Pelham 1994) which essentially falls within the scope of its own quantifier. 5 It is possible to avoid this paradox by constructing a partial valuation
on the basis of a fixed point construction outlined in Moschovakis (Moschovakis
1980, 404). To avoid the paradoxes one adopts a strong three-valued interpretation of the connectives ¬ and ∨, and constructs a truth-value assignment
in which only sentences which receive a truth-value in the fixed point receive
a truth-value. A universal sentence is true if and only if all its instances are
true at some earlier level. A paradoxical sentence which is an instance of itself,
will not receive a truth-value in the fixed point because at no point in the
hierarchy can it be said that all instances of it are true. It is not "grounded".
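The idea of groundedness can be illustrated by a toy fixed point computation in Python. It is a sketch under stated assumptions and not the construction of Pelham (1994) or of Moschovakis: the finite stock of named sentences is invented, the "strong three-valued interpretation" is read as the strong Kleene tables, and a universal sentence is modelled simply as a list of named instances. Starting from the valuation on which nothing has a truth-value, each stage evaluates every sentence from the values of the preceding stage, until nothing changes.

```python
# Truth-values are True, False, or None (no truth-value yet).

def neg(v):
    return None if v is None else not v

def disj(u, v):
    # Strong Kleene disjunction: one true disjunct suffices.
    if u is True or v is True:
        return True
    if u is False and v is False:
        return False
    return None

def forall(values):
    values = list(values)
    if any(v is False for v in values):
        return False
    if all(v is True for v in values):
        return True
    return None  # some instance has no truth-value at the earlier level

def evaluate(definition, valuation):
    kind = definition[0]
    if kind == "atom":
        return definition[1]
    if kind == "not":
        return neg(valuation[definition[1]])
    if kind == "or":
        return disj(valuation[definition[1]], valuation[definition[2]])
    if kind == "all":
        return forall(valuation[name] for name in definition[1])
    raise ValueError(f"unknown definition: {definition!r}")

def fixed_point(sentences):
    """Iterate from the empty valuation until no sentence changes value."""
    valuation = {name: None for name in sentences}
    while True:
        later = {name: evaluate(d, valuation) for name, d in sentences.items()}
        if later == valuation:
            return valuation
        valuation = later

# Hypothetical sentences, named only for this illustration.
sentences = {
    "p": ("atom", True),
    "q": ("atom", False),
    "not_q": ("not", "q"),
    "p_or_liar": ("or", "p", "liar"),
    "all_ok": ("all", ["p", "not_q"]),
    "liar": ("not", "liar"),            # says of itself that it is not true
    "self_inst": ("all", ["p", "self_inst"]),  # is one of its own instances
}

for name, value in fixed_point(sentences).items():
    print(name, value)
```

In the resulting valuation the grounded sentences receive a classical value, while `liar` and `self_inst` remain without one: at no stage are all the instances of `self_inst` already true, since it is itself among them.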
This method of avoiding the paradoxes is due to Kripke (Kripke 1975)
and Herzberger (Herzberger 1982), and holds that sentences of a particular
level of the hierarchy of metalanguages will receive a truth-value only if
sentences (those sentences connected to it in an appropriate way) in the levels
below it receive a truth-value. Kripke's construction is applied to a language
with a hierarchy of truth-predicates for languages of different levels. Russell
was fundamentally opposed both to the idea that we could adopt a metalanguage outside of our logical language, and to the idea that the domains of
quantification are divided into "levels" (Pelham 1993). The ramified theory
of types adopted in Principia Mathematica was a reluctant compromise adopted
upon the failure of the substitution theory. But the proposed reconstruction
of the substitution theory developed here and in Pelham 1994 allows the
elimination of the paradoxes without the adoption of a metalanguage. It treats
the elements of logic as abstract objects in the direct way Russell intended.
The result of the fixed point construction is a consistent interpretation of the
substitution theory in which paradoxical sentences are undefined, but which
nonetheless makes all sentences part of one language.
The reconstruction of the substitution theory also formulates a consistent
theory in which we have one unrestricted domain of quantification. During this
period Russell believed that there is only one unrestricted domain of quantification because he believed that logical laws are those true of any potential
object of thought. In both the simple and the ramified theory of types Russell
is forced to adopt the assumption that there exists a domain of individuals,
and a separate domain of first-order functions such that every individual is a
possible argument to every first-order predicate, and so on. Russell believed
that this division was unintuitive, and (as study of his manuscript shows) he
worked long to avoid it before adopting the ramified theory of types. The
substitution theory allows the domain of quantification to be unrestricted,
and makes any "typing" of predicates tacit, and derived from the data of
given propositions.
The leading idea of the substitution theory is that propositions are given
abstract objects and the notions of predicates and classes are derived from an
analysis of that data. The fixed point construction allows a consistent development of this idea, maintaining one domain of quantification and avoiding
the paradoxes by ensuring that paradoxical sentences do not receive a truth-value. It seems unlikely that Russell would have been inclined to adopt a
three-valued logic generally, but it is consonant with his analysis of the paradoxes using the vicious circle principle that paradoxical sentences do not
receive a truth-value because they are not grounded. It is in the spirit of his
project to insist that although simple (i.e. non-logical) propositions are either
true or false, paradoxical sentences are neither.
Concordia University, Now at York University
NOTES
1 See Russell (1903), p. 43 for his discussion of a term, and my paper "Russell's Early
Philosophy of Logic", Pelham (1993), for some explication of the universality of logic.
2 It was not published until 1973. It was received by the London Mathematical Society on
April 24, 1906, but Russell withdrew it before it could be published.
3 In the manuscript Russell writes p;!x for *12.211 but it seems clear from the rest of the
paper that this is merely a slip on Russell's part.
4 For discussion of how Russell's understanding of logic in The Principles of Mathematics
precludes these distinctions, see Pelham (1993).
5 It is a liar sentence in this respect, and soon after the discovery of this paradox Russell
began work on a manuscript called "The Paradox of the Liar".

REFERENCES
Herzberger, Hans, 1982, 'Notes on Naive Semantics', Journal of Philosophical Logic 11: 61-102.
Kripke, Saul A., 1975, 'Outline of a Theory of Truth', Journal of Philosophy 72: 690-716.
Moschovakis, Yiannis N., 1980, Descriptive Set Theory. Studies in Logic and the Foundations
of Mathematics Series. North Holland, Amsterdam.
Pelham, Judy, 1993, 'Russell's Early Philosophy of Logic', Russell and Analytic Philosophy,
University of Toronto Press, Toronto.
Pelham, Judy and Alasdair Urquhart, 1994, 'Russellian Propositions', Logic, Methodology, and
Philosophy of Science IX, Elsevier.
Russell, Bertrand, 1903, The Principles of Mathematics, Cambridge University Press, Cambridge.
Russell, Bertrand, 1905a, 'On Denoting', Mind 14 (Oct): 479-493. Reprinted in Russell (1973).
Russell, Bertrand, 1905b, 'On Substitution' (Russell Archives 220.010940). Unpublished manuscript in Russell Archives, McMaster University, Hamilton, Canada. Dated Dec. 22, 1905
in Russell's hand.
Russell, Bertrand, 1906, 'On the Substitutional Theory of Classes and Relations'. First published in Russell (1973).
Russell, Bertrand, 1907, Letter to Ralph Hawtrey, dated Jan. 22, 1907. Copy in Russell Archives
(REC.ACQ 394).
Russell, Bertrand, 1973, Essays in Analysis. Edited by Douglas Lackey. George Braziller, New
York.

MICHAEL HALLETT

HILBERT AND LOGIC*

The logical systems presented in the books by Hilbert and Ackermann (1928,
1938) and in Hilbert and Bernays (1934/39) are not too far removed from
modern, axiomatic systems, those, for instance, to be found in Kleene 1952,
Church 1956, or Mendelson 1964. What Hilbert et al. give is, at root, a
system of (many-sorted) first-order logic, suited for the deductive purposes
of all mathematical theories, and therefore (of necessity) adding no genuine
content to any theory. What we have, in fact, is systems which are minimal
when compared to those of Whitehead and Russell or Frege, a logica utens
as opposed to a logica magna, to echo van Heijenoort's distinction. Moreover,
Hilbert and Ackermann (and then Hilbert and Bernays) state clearly what
are now regarded as basic questions concerning consistency, completeness
and decidability. Thus, in short, whatever the similarities with systems earlier
than those of Hilbert, what we see in many respects is the first modern
presentation of logic.
Hilbert's involvement with, and treatment of, logic is complex, and cannot
be treated here in any full sense, even when technical developments (such
as the ε-calculus) are left to one side. Nevertheless, I want to try to shed
some light on the emergence of certain of the elements mentioned above,
and to convey a general sense of the structure and role that logic plays in
Hilbert's overall approach to the foundations of mathematics. This will constitute a modest contribution to a fuller understanding of the position of modern
logic.
I will address two main issues. The first is the emergence of the view of
logic as minimal. I will suggest that there are two central things which pushed
Hilbert in this direction: his attitude to the primitives of a mathematical system,
and his novel approach to the question of mathematical existence. Both of these
views stem from the rejection of the thesis that axioms are truths. It should
be noted that both were initially developed independently of the logical
antinomies, but these certainly served to reinforce the doctrine that a logical
calculus should be minimal, and we will examine part of the reason for this. 2

* Bibliographical items marked with an asterisk are hitherto unpublished.
M. Marion and R. S. Cohen (eds.), Quebec Studies in the Philosophy of Science I, 135-187.
1995 Kluwer Academic Publishers.

The second thing I wish to examine in more detail is Hilbert's rather
complex attitude to second-order quantification. This, too, is connected to
the desire to keep the logical resources quite separate from the existential
commitments of mathematical theories, and is again intimately connected to
Hilbert's view of mathematical existence. But it is also connected to two things
stressed by Hilbert from the very early stages of his work on foundational
matters, the role of symbols in mathematical theories, and what Hilbert and
Bernays later call the finite Einstellung. This is important, for it is then quite
natural to see logic for Hilbert as the main link between the two most important elements of his foundational work, the stress on the axiomatic method,
articulated in the period 1899-1905, although iterated throughout, and the
Hilbert programme of the 1920s, built as it is around what we call finitism.
Interestingly, Hilbert did not begin to develop a system of logic until 1905,
a development which was continued following rather different lines from
around 1917, thus a considerable time after he had set out the main aims of
his axiomatic approach. To be more explicit, Hilbert adapted logic (taken
from various sources) so that it conforms to certain features which he saw
as essential to the aims of the axiomatic method, and in this it is the finite
Einstellung which is the dominant feature. One should be careful to distinguish
between the finite Einstellung and finitary reasoning. The two are closely
connected, although I will not discuss finitary reasoning as such in this paper.
Hilbert intended all of developed mathematics to be finitary in a certain
sense, but he considered that only a certain restricted kind of arithmetic is
confined to methods of proof that can be called finitary. I will, however, say
a little at the end of the paper about where the need for finitary reasoning
arises.
The first two sections of the paper will, therefore, examine Hilbert's conception of the axiomatic method, his treatment of primitives, and why the
consistency question emerges in a natural way. This has clear implications concerning the scope of logic, and also for the nature of quantification. The third
section will give a preliminary examination of the role of the antinomies and
of part of Hilbert's attitude to second-order logic. The last section will also
look very briefly at the finite Einstellung and finitary reasoning.
1. HILBERT'S AXIOMATIC METHOD AND THE STATUS OF PRIMITIVES 3

Hilbert's view of the axiomatic method is not straightforward. In the first place,
he is clear that an axiomatic system does not have its origin ex nihilo. It begins,
rather, with an assemblage of what he calls 'facts' [Tatsachen], more or less
established; the task then is to investigate what follows from what, and what
is dispensable and what essential. For instance, in his lectures of 1905, he says:
If one has a certain body of facts [Thatsachenmaterial] at one's disposal, a body that may be
composed of certain propositions [Sätze], even doubtful ones, conjectures, etc, then one singles
out a series of these statements, separates them off, and puts them together into a system of its
own [einem eigenen System], into the axiom system. This system is then seen as the foundation, and one seeks to derive from it all the material presented through logical combinations
according to the known logical laws. 4

In Hilbert's view, the axiom system achieves with this a certain independence from the Tatsachenmaterial, as is made clear in the following passage
from some lectures of 1922-23. Hilbert is reported as saying:

The service of axiomatics is to have stressed a separation into the things of thought [die
gedanklichen Dinge] of the [axiomatic] framework and the real things of the actual world, and
then to have carried this out. 5

To emphasise what is really new here, and for brevity, I cite Bernays writing
in 1922. Bernays, of course, is talking directly about Hilbert's axiomatisation of geometry, but Hilbert's view is quite general. First, Bernays begins with
a short account of how axiom systems were viewed before Hilbert. Bernays
writes:
According to the standard view, the nature of the axiomatic method, i.e., the method of
developing a science purely logically from axioms and definitions, is the following: One starts
from a few basic principles, of whose truth one is convinced, places these at the beginning as
axioms, and, with the aid of logical deductive procedures, derives from them theorems, theorems
whose truth is therefore just as certain as that of the axioms, simply because they follow
logically from these axioms. This view draws attention to the epistemological character of the
axioms. 6

Note that, according to this view, all the axioms and theorems are simply truths,
with the axioms assigned a special role among the truths, presumably (as
Bernays says) for epistemological reasons, reasons which ought to be given
in a full statement of the axiom system. But for Hilbert, axioms are no longer
truths. Bernays goes on:
... according to this view [i.e., Hilbert's] the axioms are not judgements of which it can be
said that they are true or false. Only in connection with the axiom system as a whole do they
have a sense. And even the entire axiom system does not constitute the expression of a truth.
Rather the logical structure of axiomatic geometry is, in Hilbert's sense, purely hypothetical,
just like abstract group theory: If anywhere in reality there are three systems of objects [Bernays
is referring to the three kinds of primitive in Hilbert's geometry, points, lines and planes] which
are mutually related so that the axioms are fulfilled (i.e., that under a suitable correspondence
between the names and the objects and relations the axioms become true assertions), then all
the theorems of the system are also correct for these objects and relations. The axiom system
itself does not express a state of affairs [bringt nicht eine Tatsächlichkeit zum Ausdruck], but
rather represents a possible form of a system of connections, a system which is to be investigated according to its internal properties [innere Eigenschaften]. 7

Note, too, that Hilbert, as reported by Bernays here, relies on a notion similar
in flavour to what is now called logical consequence, and this accurately
reflects formulations of Hilbert which go back at least as far as 1894. 8 Note
also that we have here something similar to the notion of 'true in an interpretation', which also goes back to 1894, and which is mentioned in the
correspondence with Frege. Above all, though, we should note that what
seems to be expressed by an axiom system is not a set of truths, and even
the separation from the 'facts' is stressed; rather what we have is a network
of logical interconnections between propositions such that if the basic terms
are interpreted by some realm of things in such a way that the axioms do
express truths about them, then the theorems must hold of them as well.
It should not be thought that Hilbert's concerns were utterly different from
those of others who promoted something like an axiomatic view at around
the same time, even those, like Frege and Pasch, who held something akin
to the first view of the role of axiomatisation that Bernays sketches. For
example, Frege states clearly in the Grundlagen (1884, §2) that one of the
purposes of giving a proof is to show how the particular truth proved depends
on other truths. 9 Indeed, this is intrinsic to Frege's project in the Grundlagen,
for part of the point is to argue for a conception of the analyticity of a sentence
which is based, not, as with Kant, on its inner form, but rather on its deductive linkages. This expresses much of the spirit of Hilbert's investigations of
geometry. Secondly, Hilbert shares with Frege (as well as with Bolzano,
Dedekind and Cantor) the desire to show that mathematics is autonomous,
not dependent on 'foreign elements' or on appeals to intuition, at least in its
deductive development. This is connected to a third point. The main similarity between Hilbert and others in the late nineteenth-century concerned with
axiomatic development is the pursuit of rigour and precision, to be achieved
by explicit statements (laws or axioms) presented at the outset and which
must not be supplemented during the course of the theory's development. 10
To this end, both Hilbert and Frege insist, as do Pasch and others, that the
development of the theory must proceed by means of purely logical (and,
indeed, 'gapless') deduction. 11
However, the similarities mentioned here, although important, are limited,
especially in the conception of the role ascribed to logic. The best way to
approach one of the central differences is to look more closely at the matter
of precision.
It is often said that the primary purpose of logic is to facilitate, indeed
encourage, precision. In a sense, this is right, but it is important to be clear
about how to interpret this. For instance, if one wants to avoid 'foreign
elements', it is necessary to state what can (or cannot) be appealed to in a
proof, and this involves first a declaration of what primitives are to be allowed,
and of what propositions to admit (as axioms) governing these. For the particular deductive exercise in question, one might be content with such
specification, and indeed this was very much Hilbert's view. But one might
also wish to go further than this. Russell provides a prime example of this more
ambitious view. From the very beginning of his work on the foundations of
mathematical theories, Russell was concerned with isolating the right primitives. These Russell characteristically called 'indefinables', meaning not just
indefinable in the theory that one is trying to expose, but absolutely indefinable. For instance, take the following passage from Russell's 1899 reply to
Poincaré:
Both [terms, 'distance' and 'straight line'] belong, so to speak, to the geometric alphabet. They
can serve to define other terms, but they are themselves indefinable. Consequently, any proposition, whatever it may be, in which these notions figure is either an axiom or a theorem, and
not a pure definition of the word. When I say: the straight line is determined by two points, I
assume that straight line and point are terms which are already known and understood, and I
express a judgement about how they are related, a judgement which is either true or false, and
which is in no case arbitrary.12

Two things are clear here. First, there is the maintenance of the view that
the statements involving the primitives must all be true or false. But alongside this there is the view that the primitives must be ultimate. In his essay
on Leibniz, Russell says that
... the business of philosophy is just the discovery of those simple notions, and those primitive axioms, upon which any calculus or science must be based .... An idea which can be defined,
or a proposition which can be proved, is only of subordinate philosophical interest. The emphasis
should be laid on the indefinables and indemonstrables, ... 13

This makes the search for primitives the primary philosophical activity, and
knowledge of these the primary philosophical knowledge. And here Russell
says that
... no method is available save intuition. 14

But it seems that this intuition does not yield articulable knowledge of the
primitives over and above that embodied in the axioms, for, if it did, this could
then supplement the latter, or even supplant some of the axioms. As Russell
says:
... the meaning of the fundamental terms cannot be defined, but only suggested. If the suggestion does not evoke in the reader the right idea, nothing can be done. 15

And later he says in the Principles (1903) that any discussion of indefinables is
... the endeavour to see clearly, and to make others see clearly, the entities concerned, in
order that the mind may have that kind of acquaintance with them which it has with redness
or the taste of a pineapple. 16

This concern with the nature of the primitives is not just a feature of
Russell's early examination of geometry, but it continues throughout his work
on the foundations of mathematics. From the time of the Principles onwards
it becomes the issue of whether to select classes, propositional functions or
propositions as the basic elements, one of the central considerations in this
(aside from the avoidance of contradiction) being how such choices fit in
with the demand that the basic principles adopted should be principles of logic.
Even after the Principia, in the theory of logical atomism, Russell is concerned
to find primitives on which our knowledge of the physical world is based. This
is all part of what Russell would have considered logical precision, since,
for Russell, the first task of philosophy requires a rendition of statements in
their ultimate 'logical' form as opposed to their 'surface' (and often misleading)
grammatical form; only this renders the correct content of a proposition.
Thus, he held that only sentences written in a 'logically perfect language' (such
as that of Principia) express genuine propositions. Thus part of what isolating the primitives achieves is precision about what the correct logical form
of our assertions is. 17 Note that this project requires finding the right language
in which everything can be expressed.

But this is not at all what logical precision means for Hilbert. For him, there
is no absolute logical form in this sense. Of course, logical form depends on
what is taken as primitive, and then on what is assumed about the primitives in the axioms. But not only might there be different systems of axioms
based on the same primitives, but the same set of theorems might be presented with different primitives, and what is primitive in one system may
not be primitive in another. It follows that, for Hilbert, there is no unique
way of stating the facts, for there is not one privileged type of primitive,
although there might well be powerful considerations in favour of some ways
of shaping an axiom system and against others. 18 Logical precision for Hilbert,
then, is first and foremost a matter of establishing the logical interconnections between propositions within a system once one has settled on the
primitives. And for Hilbert it is not just the positive deductive links that
must be revealed, but also the absence of such links, and also the links between
the propositions of a theory and those of other theories, in so far as it is possible
to do this. 19 Thus, formalisation in the sense of stating the primitives is actually
only a pre-condition for precision, for it is simply to cast the terms and
propositions in such a way that it is possible to investigate their deductive interconnections. This is surely closely related to what Dedekind says in §3 of
his essay on the irrational numbers (1872) when he stresses the need for a
notion of continuity which can be used in a system of deduction:
Nothing is achieved by vague talk of 'uninterrupted connection in the smallest parts'. It is
surely a matter of specifying a precise characteristic of continuity which can be used as the
basis for actual deductions. 20

The extent to which what has been said about Russell's concern with
primitives also applies to Frege is not certain. For one thing, Frege gives the
strong impression in the Grundlagen (§68, footnote) that there could be ways
of defining number other than through the use of extensions, that there is
therefore nothing essential about this. Nevertheless, it is clear that Frege's
intention is to give explicit definitions of the numbers, thus to determine the
reference of the 'primitives' of arithmetic and of analysis in terms of logical
notions which are genuinely primitive. The analogy which Frege uses in the
Begriffsschrift about microscopes is also instructive. Frege makes it clear
that he thinks of a formalised, logical language as being simply an extension of natural language that enables one to reveal the logical structure that
is there but (by implication) hidden. Logic, then, extends the power of natural
language (for some purposes) just as the microscope extends the power of sight
for some purposes:
I believe I can give the clearest indication of the relationship between my Begriffsschrift and
ordinary language [Sprache des Lebens] if I compare it with that of the microscope to the eye.
Because of the range of its applicability and its transportability, by means of which it can be
shaped to fit the most diverse circumstances, the latter has an enormous superiority over the
microscope. But considered as an optical instrument, it exhibits great imperfections, which
ordinarily go unremarked because of its intimate connection with intellectual life. However, as
soon as scientific [wissenschaftliche] purposes pose demands concerning the sharpness of
differentiation, then the eye reveals itself as inadequate. The microscope, on the other hand, is
precisely suited for such purposes, but for the same reason unusable for all others.
The Begriffsschrift is just a scientific tool, invented for determinate scientific [wissenschaftliche] purposes, which should not be condemned because it is not suited for others. 21

This suggests that looking more carefully with the new 'visual' aid will reveal
the correct structure of propositions (say, those of arithmetic).22 The desire
to discern the 'correct structure' might well be simply the desire, not to discern
an 'ultimate' form, but rather simply to reveal a form suitable for carrying
out deductions in a gapless way, as Dedekind suggests. However, what Frege
says in his 1906 (p. 301) about the position of genuine primitives tends to
suggest otherwise; these primitives cannot be defined, they can only be accompanied (sometimes) by 'elucidations'. (Note the similarity with Russell.) These
elucidations, Frege says, cannot play any role in proofs because they 'lack
the requisite precision' (Frege to Hilbert, 27.xii.1899) and 'no inference
is based on them' (1906, p. 301). Frege's insistence that one must establish
the truth of axioms before one develops the system (an insistence which
emerges very strongly from the letter of 27.xii.1899) then suggests (i) that
there is a unique content to these axioms, and (ii) that the logical interrelations shown by the development of the system do not reveal anything about
this content.
I have suggested how Hilbert's view differs from this, but it is important
to pin down the source of the difference. Part of it stems from Hilbert's position
on the status of axioms. If it is assumed that the axioms are truths laid down
prior to the development of the system, then presumably they are truths about
the (relations between) the primitives which appear in them, and enough
must be said about these to indicate at least why these propositions are true.
This, in turn, indicates that we must work in an ambient language in which
it is possible to fix the primitives. For Frege, and for Russell much of the
time, this was a universal logic. However, if the assumption about the axioms
as truths is dropped, then there is no necessity to say anything about the
primitives prior to the development of the theory. Thus, in particular, there
is no necessity (as regards the primitives) for a strong ambient logic. Hilbert
does not accept that there need ever be absolutely indefinable primitives,
and holds that it would be extremely restricting to regard certain concepts as
absolutely primitive, for many of the most significant developments in mathematics and science are based precisely on giving definitions of 'indefinables'
in a modified or completely new system.23 Individual systems operate with
primitives that must be regarded always as relatively indefinable, and these are
governed solely by the axioms set out in the system in question.
It is possible to see in this a generalisation of Dedekind's approach to the
extensions of the various number systems, and what he called new 'creations'
in general. Dedekind is not concerned (at least, not primarily) to give definitions of the elements of the successive fundamental number systems, but rather
first to specify laws which govern them, and then also to state general principles as to how the laws for an extended system emerge from laws for the
system that it extends. For instance, Dedekind states that completely general
laws are to be maintained, while restrictions are lifted on laws of limited
generality. (A prime example is the shift from the natural numbers to the
integers, where subtraction, an operation of limited scope in the natural
numbers, is turned into a completely general operation like addition.)
According to Dedekind, there should be something canonical (and thus nonarbitrary) about this, and he makes a serious attempt to describe, in a general
way, how such principles are arrived at. 24 At the root of Dedekind's procedure is the insistence that the objects of the new theories are genuinely
primitive in the sense that they are assumed to have no properties other than
those which are specified in the principles laid down.
This procedure is clearest of all in the case (described above) of going
from the positive whole numbers to the integers; the modern way of describing
this would be to say that what Dedekind effects is the construction of the
free ring over the integral domain of positive integers. But it is also clear
from the way that Dedekind treats the reals and the natural numbers. In the
case of the reals, he does not define them as Dedekind cuts in the rationals.
Rather, it is simply shown that the cuts have the continuity property that
Dedekind seeks. The reals themselves are then introduced as primitive objects
which correspond to the cuts in the right way. The natural numbers, too, are
taken to be primitive things, things 'abstracted' from some 'simply infinite
system'. It is not that any error is committed by defining the objects in any
of the well-known ways, but rather that, if this is done, the essential structure will be burdened by inessentials, and possibly its relations to other
structures obscured. This is made quite clear in a letter to Weber, written in
January 1888. Dedekind mentions both the possibility of defining the irrationals as cuts, and then also refers to a suggestion made to him by Weber
that one could (as Russell later did) define the cardinal number of a set M
as the set of all N equinumerous to M. Dedekind comments as follows:
If one wishes to pursue your method - and I would strongly recommend carrying it through -
then I would still advise that by the name number [Zahl, Anzahl] is understood not the class itself,
but rather something new which corresponds to the class, something which the intellect creates.
. . . The rational numbers also generate cuts, but I certainly do not wish to pretend that the rational
numbers are identical with the cuts that they produce. And even after the introduction of the
irrational numbers one will often speak of the cut phenomena in such terms, and ascribe to
these cuts attributes which would sound very peculiar [seltsam] if applied to the corresponding
numbers themselves. Something similar holds with respect to the definition of cardinal number
as a class. One wishes to say a great deal about such a class (for example, that it is a system
of infinitely many elements, namely all similar systems), a weight one would be very unhappy
hanging around the neck of the number itself. Does anyone bother to remember that the number
four is a system of infinitely many elements? (Against this, everyone will always be conscious
of the fact that 4 is the child of the number 3 and the mother of the number 5.) For the same
reasons, I have always thought Kummer's creation of the ideal numbers thoroughly justified,
if only it is carried through with the requisite rigour.25

In effect, this is to say that the new irrational numbers are primitive objects
on the same type level, and thus of the same kind, as the rationals, and those
of the integers, and so on. Thus, for Dedekind part of the desire to avoid
constructive definitions is to avoid type inflation. Hence, in ceasing to have
the 'irrelevant' properties of the cuts themselves, an irrational number possesses only those properties ascribed by the principles of being a field and
being completely ordered. Things are numbers just in virtue of their satisfying the right properties, and this automatically admits the possibility of there
being different sorts of object satisfying the properties. In regard to the natural
numbers, Dedekind puts the position in a particularly strong way:
73. Definition [Erklärung]. When, in considering a simply infinite system N ordered by a
mapping $\varphi$, one completely overlooks the particular nature [Beschaffenheit] of the elements,
when one only keeps in mind [festhält] their differentiability and takes into account only
those relations which they have in virtue of the ordering mapping $\varphi$, then these elements are
to be called natural numbers or ordinal numbers or even just numbers, and the basic element
1 is called the base number of the number series N. In so far as this frees the elements from
all other content (abstraction), one is justified in calling these numbers a free creation of the
human intellect. 26
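Dedekind's point that different sorts of object can satisfy the same properties can be illustrated with a small sketch. The Python below is an illustration only and is not drawn from Dedekind: it sets up two realisations of a simply infinite system (integers from 1, and tally strings), spot-checks on a finite initial segment that the successor map is injective and that the base element is not a successor, and then pairs the two systems off element by element. The names `initial_segment`, `looks_simply_infinite` and `correspond` are hypothetical helpers, and a finite check can only suggest, not establish, the conditions on the infinite system.

```python
def initial_segment(base, succ, length):
    """Return the first `length` elements reached from `base` by repeated `succ`."""
    out = [base]
    while len(out) < length:
        out.append(succ(out[-1]))
    return out


def looks_simply_infinite(base, succ, length):
    """Finite spot-check on an initial segment: `succ` is injective there,
    and `base` is not the successor of any element in the segment."""
    seg = initial_segment(base, succ, length)
    images = [succ(x) for x in seg]
    injective = len(set(images)) == len(images)
    base_not_image = base not in images
    return injective and base_not_image


# Realisation 1 (assumed for illustration): integers, base 1, successor n -> n + 1.
ints = (1, lambda n: n + 1)
# Realisation 2 (assumed for illustration): tally strings, base "|", successor s -> s + "|".
tallies = ("|", lambda s: s + "|")


def correspond(sys_a, sys_b, length):
    """Pair off the elements of two systems position by position."""
    (a0, fa), (b0, fb) = sys_a, sys_b
    return dict(zip(initial_segment(a0, fa, length),
                    initial_segment(b0, fb, length)))


h = correspond(ints, tallies, 6)
# The correspondence sends base to base and commutes with the two successor maps.
commutes = all(h[n + 1] == h[n] + "|" for n in range(1, 6))
```

On this picture the integers and the tallies each carry 'extraneous' properties (parity, string length), but only the base element and the successor relation matter to their being numbers in the sense of Definition 73.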

Thus, in Dedekind what one sees is something quite different from Frege's and
Russell's analytical investigation; a halt is called to analysis even when it is
known (or strongly suspected) that the analysis could be carried further. The
reduction, if that is what it is, is minimal and not maximal; more importantly,
the nature of the objects is specified by the fundamental principles involved,
and not by the ambient logic.
Hilbert's axiomatic method generalises Dedekind's conception of mathematical theories, the major difference being that, in Hilbert's work, there is
no suggestion of 'creating' elements, or of 'freeing' or 'abstracting' them. 27
There is just the claim that the primitives of a theory will have all and only
the properties attributed to them by the principles given (the axioms). Hilbert's
paper from 1900 on the axiomatisation of real number is an example:
In the theory of the number concept, the axiomatic method takes the following form:
We think of a system of things; we call them numbers and denote them by a, b, c, ....
We think of these numbers in certain mutual relations, whose precise and complete description
is obtained through the following axioms. . . .28

Thus the numbers are indeed to have the same elementariness that Dedekind
seems to suggest; they become, in Hilbert's words, a 'system of their own'
(*1905b, pp. 11-2).29 The word 'complete' is important in this context, not
because of any allusions to the later problems of completeness, but rather
because it forms part of a clear statement that all we know (and need to
know) about the irrational numbers stems from the axioms. There is no need
to 'free' objects from 'extraneous' properties. Things that fulfill the axioms,
like Dedekind cuts, points of space or equivalence classes of Cauchy sequences,
will often have 'extraneous' properties. But, nevertheless, in doing real analysis
one is interested only in the properties that the axioms lay down. It is these,
and only these, which tell us what the primitives are like, more precisely
how they are related. As Hilbert says:

It is quite self-evident, that every theory is only a framework or schema of concepts together with
the necessary relations to one another, and the basic elements can be thought in an arbitrary
way.30

Hilbert is also quite clear that he thinks that this latter is a positive advantage of a theory.31
Dedekind achieves this only partially, for his account still depends to some
extent on the individual nature of the objects concerned. For instance, in the
case of the natural numbers there has to be some specific simply infinite system
from which to abstract (or to which to apply the free algebra construction),
whence the need for the infamous Theorem 66 of Dedekind 1888. In the case
of the irrationals, it is shown that some objects (the cuts in the rationals)
satisfy the desired properties, thus in short (as we would now put it) that
there is a model of the axioms. But for Dedekind this is not just an interesting metamathematical observation, one made once the system of real
numbers has been set out, since Dedekind actually introduces the primitive
irrational numbers by reference to the cuts.32 Thus, to prove anything about
these numbers, one first has to prove that the corresponding fact is true of
the cuts.
Hilbert's position, on the contrary, puts the emphasis on the axiom system
alone (together with the theorems the axioms give rise to), and dispenses
entirely with external explanation for the primitive elements. It is precisely
this, I think, that Hilbert alludes to in the remark in a letter to Frege that 'point'
will mean something different in different systems of geometry.33 There is
no prior 'pure' notion of 'point' taken on its own that the axiom system
attempts to capture, and which we can say has a reference. All there is is a
wealth of imprecise, intuitive ideas (thus, either not fully articulated, and so
not suited for proper deductive development, or too restrictive), and other more
or less satisfactory axiom systems. This is also, I think, the substance behind
Hilbert's remark to Frege that 'love, law and chimney sweep' would form a
Euclidean geometry provided it could be shown that the axioms (in the right
interpretation) are true of them. 34 The example is instructive, for it shows us
that the primitives can be drawn from anywhere, and in other contexts may
well have structure, thus, not be primitives in this context - chimney sweeps
and Dedekind cuts clearly have a great deal of structure. But from the point
of view of the axioms, that structure is of no importance; and since there is
no mention of it, there is no need to strip it away.
Along with the aim of slimming down axiom systems, there is the parallel
aim of cutting to a minimum the assumptions used in proofs. Much of the
point of this is stated clearly by Bernays in the passage quoted early in the
section. Suppose it is shown that P can be proved from just the axioms
$A_1, A_2, \ldots, A_n$; and now suppose it is shown subsequently that some quite
unexpected interpretation can be given to the $A_1, A_2, \ldots, A_n$, for instance
by interpreting the basic entities, not as reals, but as certain kinds of function.
Then it follows without further ado that a version of P must hold for these
objects as well. The use of such techniques, a commonplace now, was one
of the reasons why Hilbert's work had such a powerful effect.
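The transfer of a theorem from one interpretation of the axioms to another can be made concrete with a toy case. The sketch below is an assumption chosen for illustration, not an example from Hilbert or Bernays: the axioms are the group axioms, the theorem P is left cancellation (which follows from those axioms by a standard short argument), and the two interpretations take the basic entities first as numbers and then as functions (permutations under composition). The code only checks the axioms and P by brute force in each finite model; it does not itself carry out the derivation of P.

```python
from itertools import permutations, product


def is_group(elems, op, e):
    """Check closure, associativity, identity and inverses on a finite carrier."""
    elems = list(elems)
    closed = all(op(a, b) in elems for a, b in product(elems, repeat=2))
    assoc = all(op(op(a, b), c) == op(a, op(b, c))
                for a, b, c in product(elems, repeat=3))
    ident = all(op(e, a) == a and op(a, e) == a for a in elems)
    inv = all(any(op(a, b) == e and op(b, a) == e for b in elems)
              for a in elems)
    return closed and assoc and ident and inv


def left_cancellation(elems, op):
    """The theorem P: whenever a*b == a*c, also b == c."""
    elems = list(elems)
    return all(b == c
               for a, b, c in product(elems, repeat=3)
               if op(a, b) == op(a, c))


# Interpretation 1: the numbers 0..4 under addition modulo 5.
nums = range(5)
add5 = lambda a, b: (a + b) % 5

# Interpretation 2: functions, here the permutations of {0, 1, 2},
# stored as tuples p with p[i] the image of i, under composition.
perms = list(permutations(range(3)))
compose = lambda p, q: tuple(p[q[i]] for i in range(3))
identity = (0, 1, 2)

both_are_models = is_group(nums, add5, 0) and is_group(perms, compose, identity)
p_holds_in_both = left_cancellation(nums, add5) and left_cancellation(perms, compose)
```

Once the second structure is seen to satisfy the axioms, nothing further needs to be proved about it for P to apply; the brute-force check of P is redundant in principle and is included only to make the point visible.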
We are back, very firmly, to the view stated in the passage from Bernays
quoted at the outset, namely that the axioms can be differently interpreted,
something that Hilbert stresses many times, and something which he takes
to be a beneficial (and liberating) feature of the axiomatic presentation of
theories. Dedekind, it is clear, also appreciated this, hence his desire to prove
categoricity results. But his frequent appeal to "creation" might be seen as
hankering after the uniqueness that constructive definitions guarantee.
Lastly, this makes it quite clear why Hilbert is so insistent (e.g., in his 1900a)
that what he propounds is quite different from the 'genetic method' of
explaining, for instance, the genesis of the number systems in terms of the construction of new elements from existing ones. One problem with this is the one
which worried Dedekind, namely that, as well as the 'right' properties, the
objects constructed will also have properties that are not intrinsic to the structure being characterised. But there is a further worry, namely that the
construction itself requires theoretical treatment, and it may not be a trivial
matter to spell out precisely what such construction entails. This was especially so before the codification of set theoretic principles, a fact which
the antinomies turned into a serious drawback. The rejection of constructive
definitions to some extent dispenses with the need for such theoretical
explanation. 35
2. THE AXIOMATIC METHOD: CONSISTENCY

This brief survey of Hilbert's axiomatic method tells us something about
how logic is not involved; I want now to turn more to the question of how
logic is involved. Two things emerge clearly.
In the first place, logic is central to Hilbert's early work on the foundations of mathematics, for it is dependent on terms like 'deduction', 'derivable'
and 'logical consequence' from the beginning. These uses cannot simply be
rhetorical devices; they are intrinsic to Hilbert's conception of an adequate
axiom system. Hilbert's statements of the adequacy conditions on axiom
systems from 1899 on are fairly constant, basically that the system of axioms
be finite (sometimes 'finite and closed', sometimes just 'simple'), that it be
consistent, and complete, and that the axioms be independent of one another. 36
Thus, to take one obvious example, deducibility is key to the notion of there
being a complete axiomatisation of geometry, for (in Hilbert's explanation
of it) this will mean that the system can prove all the geometrical theorems
(or 'facts'). It is also fundamental to the notion of independence of an axiom
from the others, for this says that this axiom is not derivable from the others.
Consequently, lack of precision with respect to the notions of derivation or
consequence is nothing less than a lack of precision in the claims made by
Hilbert about the nature of mathematics. This makes it clear why Hilbert had
to develop a precise account of derivation, again quite independently of the
paradoxes.
The second observation is this. The abandonment of truth as a criterion
for an axiom brings to the fore the consistency question, which is a rather
unimportant and subsidiary matter if one can rely on truth, but is serious if
one cannot, for inconsistency amounts to showing nothing less than that
every proposition is deducible from ('connected to') every other.37 If we are
dealing with a stock of truths, then there is no question that they will be
compatible, even though we might not be able to see at first any connection
between them. But if one thinks it no longer makes sense to appeal to the
truth of the principles, then it is clear that compatibility is something that
has to be demonstrated. Thus, if we decide to add $\sigma$ to the system of principles $\Sigma$, we will not know, without a demonstration, that $\Sigma$ and $\sigma$ are compatible.
Hilbert raises the problem in a particularly sharp form in his letter to Frege
of 29.xii.1899. Hilbert states that it is 'impermissible and unlogical' to try
to add an 'axiom' to a 'complete and unique [eindeutiger] determination
of a concept', a mistake that physicists, in particular, often make, Hilbert
says:
Very often in physical theories, again and again in the course of an investigation new axioms
are invented which are not confronted with the assumptions made earlier, and it is never shown
whether or not they contradict any of the facts drawn from the earlier axioms. From this there
arises the purest nonsense. The method of inventing an axiom, appealing to its truth, and then
concluding from this that it is compatible with the concepts defined, this is a major source of
errors and misunderstandings in modern physical investigations. One of the main purposes of
my Festschrift [i.e., 1899] was to avoid this mistake. 38

Indeed, if the axiom system does yield a 'complete and unique [eindeutig]
concept', then according to Hilbert's characterisation of completeness as
Post-completeness, the 'purest nonsense' which arises will be a contradiction. Even if one does hold to a notion of truth for axiom systems, what
this passage points out is the danger of appealing to the truth of an isolated
proposition.
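The appeal to Post-completeness can be spelled out in modern notation. The following is a sketch under assumptions: the symbols $\Sigma$ (the axiom system), $\sigma$ (a candidate new axiom) and $\vdash$ (derivability) are notation supplied here rather than Hilbert's own, and `\nvdash` assumes the `amssymb` package.

```latex
\[
  \Sigma \text{ is Post-complete}
  \iff
  \text{for every sentence } \sigma:\quad
  \Sigma \nvdash \sigma \;\Longrightarrow\; \Sigma \cup \{\sigma\} \vdash \bot .
\]
```

On this reading, any axiom added to such a system is either already derivable, and so redundant, or else renders the enlarged system contradictory.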
Note again that the problem of consistency arises quite independently of the
discovery of antinomies.
The consistency question already emerges with Dedekind's position, that is,
where mathematical objects are regarded as 'created' according to certain principles (axioms) and on the basis of nothing else. (Thus: are the creations
even 'possible'? See Frege 1903, 139 and 143, English translation in Geach
and Black (eds.) 1966, pp. 174 and 178.) It is quite clear from his correspondence that Dedekind himself was perfectly aware of the requirement of
proving consistency, both for the system of irrational numbers and also for
the system of natural numbers. For instance, the letter to Keferstein from
1890 states quite clearly that the very point of Theorem 66, which attempts
to prove the existence of an infinite set, is to show that there is no 'inner
contradiction in the concept' of a simply infinite system, thus, in effect, the
demonstration of consistency by the provision of a model.39 Curiously, Hilbert
and Bernays (1934, p. 15) say that Frege was the first to realise the need to
prove the consistency of arithmetic, by carrying out an existence proof on
the basis of logic alone. In fact, what Frege attempted to show was that the
truths of arithmetic are truths about objects of a certain specific kind, logical
objects. (Hence Frege's sceptical question as to whether consistency is the only
criterion we need to satisfy in a creation: Frege 1903, 144, English translation in Geach and Black (eds.) 1966, p. 178.) That this then demonstrates
consistency is just an automatic consequence, as Frege points out to Hilbert
in the letter of 27.xii.1899. Nevertheless, Frege did think, with Dedekind,
that the provision of a 'model' of the axioms is the only way to prove their
consistency.40
Although Dedekind's strategy of showing that there is a model of the axioms
(or principles) is weaker than Frege's attempt to show directly that the major
assumptions are true, it still approaches the consistency problem as a problem
of existence, for it tries to show that there are objects about which the axioms
express truths. While Hilbert's axiomatic method builds on Dedekind's
approach to mathematical theories, it does not follow Dedekind in the matter
of consistency, models and existence.
First, Hilbert separates the existence question into two, quite distinct, components. The first can be called the question of internal existence. This simply
amounts to proving within a system $\Sigma$ a statement of the form $\exists x A$, even by
non-constructive means. 41 Existence in this sense is clearly relative to the axiom
system. The second component is what might be called the question of external
existence (to distinguish it from the above). And Hilbert proposes an answer
to this latter question which is quite different from either Frege's or Dedekind's.
In his letter to Frege of 29.xii.1899, Hilbert puts forward the following thesis
about consistency and existence:
You write: "[I call axioms propositions] .... From the truth of the axioms it follows that they
do not contradict one another". I was very interested to read exactly this, since for as long as
I have been thinking, writing and lecturing on these things, I have asserted exactly the opposite:
If the arbitrarily chosen axioms do not contradict each other with all their consequences, then
they are true and the things defined by the axioms exist. That for me is the criterion of truth
and existence. 42

This is repeated in many places in Hilbert's writings. Hilbert 1900b provides
a similar statement:
In the present case [Hilbert is discussing the axiom system for the real numbers], ... the
demonstration of the consistency of the axioms is at the same time the proof of the mathematical existence of the totality of all real numbers or of the continuum. In fact, when the demonstration has been fully achieved, then all objections which hitherto have been raised against
the existence of this totality will lose all justification. 43

Hilbert makes it clear (for instance, in his 1900a) that when we give an
axiom system we make an assumption that there are such things as the axioms
say there are. A proof of consistency for the axiom system is then meant as
a justification of this assumption. We might say that, for this view, a proof
of consistency will make the internal existence question the only meaningful
one. (The connections to certain well-known later views on existence are
obvious.) In another way of putting it: a proof of consistency is meant as a
replacement for the procedure of modelling which Dedekind exemplifies, which
before Hilbert had been taken to be the only way to show consistency. Again,
it is important to underline the fact that the assertions governing internal
existence in the domain all stem (on this view) from the axioms: nothing
more can be said about this than is sanctioned by the axioms. The external
existence assumptions play a very important philosophical role, as will become
clear later on.44
It should also be stated that Hilbert's position entails something similar to
the type reduction that Dedekind advocated, for it must follow that there are
not, in general, different kinds of (internal) existence. A little after the passage
from 1900b just quoted, Hilbert goes on:
The conception of the continuum, or equally the concept of the system of all functions, exists
then in precisely the same sense as does the system of rational whole numbers or that of the
higher Cantorian number-classes and powers.45

In short, very familiar and well-accepted notions are on the same footing as
abstruse and controversial ones, providing consistency can be established for
the theory that presents them. In the best cases there would also be some
sort of 'natural' integration with previous theories, either through deductive
relations, laws partially shared, or even through various different kinds of
full or partial modelling of the sort used in Hilbert's 1899. Integration was a
very important element in Dedekind's criteria for the extension of theories, and
was also stressed in Cantor's arguments that the transfinite numbers are indeed
numbers on just the same footing as the natural or irrational numbers, and
for Cantor this was clearly a criterion meant to be applied to genuinely new
theories in place of that of constructing a full model. Such integration, when
taken over by axiomatic theories, is the axiomatic correlate of the 'genesis'
exhibited by constructive definitions.
This is intimately connected with Hilbert's attitude to ideal elements, and
it is in fact the very foundation of it. According to Hilbert, an element (say
$\sqrt{-1}$) is indeed of a different status (ideal) when first added to the system of
reals. But once one has given a set of laws for the integration of this element
into the previous system, then what were previously ideal elements exist in
just the same sense as do the other elements, providing one can demonstrate
the consistency of the new, expanded system. Hilbert says the following in
his letter of 29.xii.1899 to Frege:
The proposition "Every equation has a root" is true or the existence of these roots is proven,
as soon as one adds the axiom "Every equation has a root" to the other arithmetical axioms
without, in so doing, making it possible to infer a contradiction. 46

And Hilbert makes it quite clear in some lectures from 1919 that the designation 'ideal element' is only a relative one:

The terminology of ideal elements thus properly speaking only has its justification from the point
of view of the system we start out from. In the new system we do not at all distinguish between
actual and ideal elements. 47

Thus, as with Cantor and Dedekind, so with Hilbert - the question of existence boils down to that of the acceptability of the new, more inclusive theory. 48
At first sight, this looks a little odd. Consistency is a logical property of
the entire axiomatic system, for it expresses a (rather crude) fact about the
nature of the deductive relations between the propositions. In the earlier way
of regarding axiomatic systems that Hilbert rejects, what exists or not cannot
be a property of the system, for this is surely something determined by the way
the (abstract) world is, independently of the properties of the systems we
use as a means of expression. The consistency of the system would simply
be a by-product of its expressing truths, and thus subordinate to matters of
truth and existence. For Hilbert, however, it is the other way around: consistency is taken to be a sufficient condition for the truth of the axioms and
thus for the existence of what it is that they postulate.
But it should be clear that Hilbert's position is not so strange, given the
attitude he takes towards the primitive objects of a system. If we know nothing
about the primitives of a system other than what is given through the axioms,
then it makes no sense to ask in addition whether the axioms are expressing
truths about these things. To speak vaguely, if it is the axiom system that
'constitutes' or 'defines' these things, then their existence is surely a function
of that system, a point underlined by Hilbert's claim that concepts (like 'point')
change as the system changes. 49
There are, of course, many reasons why Hilbert's claim that consistency
implies existence turned out to be questionable, but this is not something which
can be analysed here. 50 Instead, I will look at what it implies about the role
of logic.
In view of what Bernays says in the passage quoted at the beginning, and
what Hilbert himself says in earlier lectures (e.g., those from 1894), it is important to raise the question of whether, when he speaks of 'deduction', Hilbert
could really have meant the notion of logical consequence. After all, he does
use the term 'consequence' frequently. This in turn would mean that when
he talks about 'consistency' Hilbert actually means something like satisfiability, hence actually underlines, rather than repudiates, the need to produce
models of theories. Moreover, Hilbert does sometimes state categoricity as a
desirable condition on the adequacy of an axiom system, even if it is not
given as a necessary requirement. However, the categoricity condition would
be rather empty if the axiom system concerned has no models whatsoever.
More importantly, Hilbert uses precisely this notion of logical consequence
in his investigations of independence among the various axioms of geometry.
For what is shown in demonstrating the independence of an axiom A is that
there is an interpretation which makes all of the other axioms true, and which
nevertheless makes A false. Without further ado, Hilbert says that this shows
that A 'cannot be inferred logically from the remaining axioms'.51 Lastly, as
we stressed, Hilbert makes an assumption when setting up an axiom system
that there are such things with these properties, an assumption which has to
be justified with a consistency proof. It seems natural to think, therefore,
that a consistency proof is just a demonstration that there are such things,
for what better justification could there be? (Categoricity, when appropriate,
would then be a demonstration of uniqueness of that model up to isomorphism.)
In this case, there would be no interesting distinction between Hilbert's position
and that of Dedekind.
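The model-theoretic method of proving independence described above can be sketched on a toy axiom system. This is an illustration under stated assumptions, not Hilbert's geometrical case: the 'remaining axioms' are reflexivity, antisymmetry and transitivity of a relation, the axiom A whose independence is at issue is totality, and the interpretation is divisibility on the divisors of 6. All function names are hypothetical helpers.

```python
from itertools import product


def reflexive(D, R):
    return all(R(a, a) for a in D)


def antisymmetric(D, R):
    return all(a == b
               for a, b in product(D, repeat=2)
               if R(a, b) and R(b, a))


def transitive(D, R):
    return all(R(a, c)
               for a, b, c in product(D, repeat=3)
               if R(a, b) and R(b, c))


def total(D, R):
    """The axiom A under test: any two elements are comparable."""
    return all(R(a, b) or R(b, a) for a, b in product(D, repeat=2))


REMAINING_AXIOMS = [reflexive, antisymmetric, transitive]


def shows_independence(D, R, axiom, remaining):
    """True if the interpretation (D, R) makes every remaining axiom true
    and the axiom under test false."""
    return all(ax(D, R) for ax in remaining) and not axiom(D, R)


# Interpretation: divisibility on the divisors of 6 (2 and 3 are incomparable).
divisors_of_6 = [1, 2, 3, 6]
divides = lambda a, b: b % a == 0

independent = shows_independence(divisors_of_6, divides, total, REMAINING_AXIOMS)
```

The inference from such an interpretation to the claim that A 'cannot be inferred logically' from the others relies on derivations preserving truth under every interpretation, which is exactly the step Hilbert takes 'without further ado'.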
In some sense, it seems obvious that Hilbert cannot mean these things,
for active mathematics has always been concerned with proof, with the construction of a sequence of propositions constituting a chain of reasoning leading
from the assumptions to the theorems. But it is nevertheless important to see
why Hilbert does not mean these things, especially given the important place
of models of theories in Hilbert's work on geometry. Recognition of these
reasons will serve to underline the connection between Hilbert's view of the
function of logic and his finitism.
Firstly, in the passage from Bernays quoted at the beginning which stresses
the notion of consequence, he (Bernays) talks of 'logically deductive procedures'. At the very least, this indicates, as does Hilbert's use of logical
consequence in connection with his proofs of independence in his work on
geometry, that Hilbert and Bernays slipped easily back and forth between
the notions of deduction and of consequence, presumably believing them to
amount to the same thing. 52
Secondly, Hilbert states quite clearly in his 1900b what is to count as a
solution to a problem:
... I mean above all that it is possible to present the correctness of the answer in a finite
number of inferences on the basis of a finite number of assumptions, which are themselves present
in the statement of the problem, and which can always be precisely formulated. This demand
for logical deduction by means of a finite number of inferences is nothing other than the
demand for rigour in the carrying out of proofs. 53

Moreover, Hilbert often makes it clear that when he talks of consistency he
means consistency in the sense of deducibility; that is to say, a system is
consistent if it is impossible to deduce a contradiction in it by 'a finite number
of logical inferences' (Hilbert 1900b, p. 264; 1935, p. 300). And on this page
and the next, Hilbert explicitly distinguishes this from the use of number
domains to provide models of (various groups of) geometrical axioms, and explains
that what such modelling does is to show that if it is possible to deduce a
contradiction in the geometrical system, then the translation of the terms of
this into those of the number system shows that a matching contradiction
can be deduced in this as well. (See again, Hilbert 1900b, p. 265, or p. 300
of 1935.) Indeed, Hilbert says quite explicitly that for a proof of the consistency of the axioms of real number theory we require a 'direct way', and
not that of giving a model of the real number axioms using another theory.

HILBERT AND LOGIC

(See 1900b, p. 265, or 1935, p. 300.) This is made even clearer by 1905, where
Hilbert describes the notion of consistency as that of being unable to deduce
both φ and ¬φ by means of 'logical operations' (*1905b), and speaks also
of deduction as effected by 'logical combinations' of propositions. And his
work of 1905 (recorded in 1905a, *1905b and *1905c) was partly an attempt
to give an axiom system which is simultaneously an axiomatisation of propositional logic, arithmetic and the notion of infinite set. This was, presumably,
Hilbert's first attempt at giving 'a modification of the known methods of inference'. In developing a system of propositional logic in *1905b (into the details
of which we cannot go here), Hilbert sketches the first of many normal form
results, one of which is the basis of the proof of the Completeness Theorem
for propositional logic given in *1917-18. If one asks whether a 'correct'
proposition P follows from a given finite set of axioms A₁, A₂, A₃, ..., Aₙ,
then Hilbert says that there are only 'finitely many proof possibilities', and
so:
With this, we have solved, in the present and most primitive case, the old problem that every
correct result must be rendered by a finite proof. This problem, properly speaking, was the starting
point of all my investigations in this area, and its solution in the most general case, thus the
proof that in mathematics there can be no "ignorabimus", must remain the final goal. 54

Given this, it is not surprising that Hilbert discusses consistency and inconsistency in his *1905b in a syntactic way. Inconsistency would allow us to
prove far too much from a system, says Hilbert. He notes that philosophers
in general have not taken sufficient notice of the antinomies and the effect
they have on what was taken to be logic, and he says that this is a mistake,
since:
... from any contradiction, no matter how far removed, we can prove the falsehood of every
correct statement [e.g., as Hilbert says, 2 = 2]. Hence, we could say that one contradiction in
the whole realm of our knowledge [Wissen] acts like a spark in the gunpowder barrel and destroys
everything. Therefore, every science [Wissenschaft] must have an interest in dealing with a
contradiction, no matter how far removed. 55

Hilbert's later isolation of Post-completeness is clearly related to this, for
the concern is to avoid being able to prove too much (and thus the danger
of toppling into inconsistency), while at the same time wanting to be able to prove
as much as possible, a balance which Post-completeness expresses precisely.
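Hilbert's remark that in the propositional case there are only 'finitely many proof possibilities' can be illustrated with a modern device. The following Python sketch is not Hilbert's normal-form procedure; it assumes the later truth-table method, which for propositional logic settles the same question in view of the Completeness Theorem. The encoding of formulas as nested tuples and the names `atoms`, `evaluate` and `follows_from` are illustrative assumptions, not anything found in the 1905 lectures.

```python
from itertools import product

# A formula is either a propositional letter (a string) or a tuple
# ("not", A), ("and", A, B), ("or", A, B), ("implies", A, B).
# This encoding is an assumption made purely for illustration.

def atoms(formula):
    """Collect the propositional letters occurring in a formula."""
    if isinstance(formula, str):
        return {formula}
    return set().union(*(atoms(sub) for sub in formula[1:]))

def evaluate(formula, valuation):
    """Truth value of a formula under an assignment of truth values to letters."""
    if isinstance(formula, str):
        return valuation[formula]
    op = formula[0]
    if op == "not":
        return not evaluate(formula[1], valuation)
    left = evaluate(formula[1], valuation)
    right = evaluate(formula[2], valuation)
    if op == "and":
        return left and right
    if op == "or":
        return left or right
    if op == "implies":
        return (not left) or right
    raise ValueError(f"unknown connective: {op!r}")

def follows_from(axioms, conclusion):
    """Decide, in finitely many steps, whether the conclusion holds under
    every valuation that makes all the axioms true."""
    letters = sorted(set().union(atoms(conclusion), *(atoms(a) for a in axioms)))
    for values in product([False, True], repeat=len(letters)):
        valuation = dict(zip(letters, values))
        if all(evaluate(a, valuation) for a in axioms) and not evaluate(conclusion, valuation):
            return False  # a counter-valuation has been found
    return True

# Modus ponens is sound; a disjunction does not yield its first disjunct.
print(follows_from([("implies", "p", "q"), "p"], "q"))
print(follows_from([("or", "p", "q")], "p"))
# From the contradictory pair p, not-p, every statement follows.
print(follows_from(["p", ("not", "p")], "q"))
```

With n propositional letters the search inspects at most 2^n valuations, so the question is always settled after finitely many steps. The last call shows the effect Hilbert describes in the passage quoted above: from a single contradiction, any statement whatever can be obtained.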
There is also a connection between the consistency of a system, conceived
of as being able to show that one cannot prove too many statements, and the
requirement of conservativeness. Suppose a system Γ is taken to be minimal
and to yield only 'correct' arithmetical theorems (on some measure of correctness). Then to say that extensions of Γ must be conservative over Γ is to
say that they should enable no new theorems written in the language of Γ
to be proved; and this in particular is to say that the new system cannot
prove 'too many' assertions in the language of Γ, which is to say that it is
consistent over Γ.
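The notions in play here can be set side by side in modern notation. The block below is a sketch under stated assumptions: the symbols S (the minimal system), S' (an extension of it), L_S (the language of S) and the turnstile for derivability in finitely many steps are supplied for illustration and are not Hilbert's own notation. In the clause for Post-completeness, the added formula is to be read as a schema.

```latex
% Requires amsmath and amssymb. S, S', L_S are assumed names, not Hilbert's.
\begin{align*}
\text{Consistency of } S:\quad
  & \text{there is no } \varphi \text{ with } S \vdash \varphi \text{ and } S \vdash \neg\varphi. \\
\text{Explosion:}\quad
  & \varphi,\ \neg\varphi \vdash \psi \text{ for every } \psi. \\
\text{Post-completeness of } S:\quad
  & S \text{ is consistent, and } S \nvdash \varphi \text{ implies that } S + \varphi \text{ is inconsistent}. \\
\text{Conservativeness of } S' \supseteq S:\quad
  & \text{for all } \varphi \in L_S,\ S' \vdash \varphi \text{ implies } S \vdash \varphi.
\end{align*}

% Why a conservative extension of a consistent system is itself consistent:
\begin{align*}
                 & S' \text{ inconsistent} \\
\Longrightarrow\ & S' \vdash \varphi \text{ for every } \varphi \in L_S && \text{(explosion)} \\
\Longrightarrow\ & S \vdash \varphi \text{ for every } \varphi \in L_S  && \text{(conservativeness)} \\
\Longrightarrow\ & S \text{ inconsistent}                               && \text{(take some } \varphi \text{ and } \neg\varphi\text{)}.
\end{align*}
```

Read contrapositively, the second chain is the connection drawn in the text: if S is consistent and S' proves nothing new in the language of S, then S' cannot prove 'too many' statements either.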
Thirdly, in discussing the real numbers in both 1900a and 1900b, Hilbert

MICHAEL HALLETT

talks, not of a relation of logical consequence, but rather of the 'finite and
closed system of axioms for the real numbers' (1900a). For instance, in 1900a,
Hilbert writes:
Under the conception described above [i.e., the axiomatic conception], the doubts which have
been raised against the existence of the totality of all real numbers (and against the existence
of infinite sets generally) lose all justification; for by the set of real numbers we do not have
to imagine, say the totality of all possible laws according to which the progression of elements
in a fundamental sequence can be specified, but rather - as just described - a system of things
whose mutual relations are given by the finite and closed system of axioms I-IV, and about which
new statements are valid only if one can derive them from the axioms by means of a finite number
of logical inferences. 56

And in 1900b we find:


Of course, the totality of all real numbers, that is, the continuum, is not, according to the interpretation characterised above, the totality of all possible decimal expansions or the totality of
all possible laws according to which a fundamental series [i.e. Cauchy sequence] can proceed.
Rather, it is a system of things whose mutual relations are governed by the axioms set up, and
for which all and only those facts are true which can be derived from the axioms by means of
a finite number of logical inferences. Only in this sense, in my view, is the concept of the
continuum logically graspable (faßbar). 57

These passages reveal very clearly that Hilbert did not think that modelling
was primary, for they state quite explicitly that it is the possession of the
finite (or 'finite and closed') axiom system which enables us to avoid considering whether there is a 'totality' (i.e., something independent of the axiom
system) with the properties desired. To see this, we have to remember three
important factors in the background to Hilbert's remarks.
The first is the significant opposition both to the full theory of real numbers
(for instance, the theory of Dedekind cuts), to geometry (with its use of
'arithmetical' continuity principles), and to Cantor's transfinite numbers, all
of which Hilbert sought to defend. Indeed, he remarks in a passage from his
1900b which has already been cited (above, p. 145) that a proof of consistency for the system of real numbers would defuse 'all objections which have
hitherto been raised against the existence of the totality of all real numbers',
and this remark would make no sense if it just meant consistency in the sense
of providing a model for that system, for the main objection was precisely
to such completed infinite totalities. To underline this, Hilbert makes it clear
in the passages quoted above, that it is the finite nature of the axiom systems
involved together with the finiteness of proof (and, of course, a consistency
proof for the system) that makes the concepts involved unexceptionable, even
when these systems are apparently of a radically infinitistic nature. The clearest
statement of this appears in Hilbert's unpublished notebooks:
Through the Archimedean and my completeness axioms [for Euclidean geometry or the reals
respectively], the ordinary continuity axiom is divided into two completely different components.
Moreover, with my completeness axiom, not one infinite process is demanded, but we have
only a finite number of finite axioms, just as Kronecker demands. 58

I will take this up again later in Section 4.

The second thing that must be remembered is exactly what the explicit
construction of the system of real numbers from the system of rational numbers
entails, namely, in one form or another, the concept of transfinite set, and
Hilbert was well aware that no consistent axiomatic presentation of this was
then available. Part of the goal of the axiomatic method and the stress on
consistency was to make all questions of mathematical existence (even in
transfinite set theory) as uncontentious as that for the most basic systems,
say that for natural numbers. To appeal to explicit construction would not,
therefore, be an effective way of appeasing the critics of the theories of real
number.
There is a more general point here about the provision of interpretations
for theories. When one provides a model of a theory Γ, what one actually
does is to show that certain objects of another theory Σ can be used as an interpretation of Γ. This depends on existence proofs carried out within Σ, thus
on objects which are shown to exist by the theory Σ in the internal sense.
But, of course, this says nothing about the consistency of Σ itself. To be even
more explicit, there is reliance on another theory Δ (which might be Σ itself)
which can describe both Σ and Γ, and in which the construction takes place,
thus where the translation function exists. If mathematics is to aspire to self-containment and self-sufficiency, then it is surely incumbent on it to be able
to describe mathematically what takes place with such constructions, thus to
be specific about the theories in which these constructions can be carried
out. If the sole account of consistency available is what we now call satisfiability, then this suggests that there must be some uniform (and well accepted)
theory in which this can be done. But what are such theories? In 1900, the only
candidate for such a universal theory was Frege's, and this was demonstrably
inconsistent.
The third thing to be kept in mind is specific to the problems with Cantor's
transfinite set theory. In setting out a theory of the transfinite, what could
Cantor possibly have given as a model, and out of what body of accepted
material could this have been created? What is required instead is presumably something rather like what Hilbert later called the method of ideal
elements. Hilbert recognised that it is often necessary to make radical extensions to mathematical theories by adjoining new "objects", with no prior
guarantee that this can be done consistently, and for which consistency then
has to be proved. Of course, one might think this should be proved by explicit
construction wherever possible, as in the cases of Dedekind cuts or the Gauss
complex numbers or the von Staudt construction of points at infinity as equivalence classes of parallel lines. But if we do not possess a theory which we
think is universal for the provision of models, then it is far too restrictive to
insist that it be possible to model the new principles by constructions based
on elements already present. Cantor's work furnishes the prime example. What
this starts from is the desire to add transfinite ordinals to point set theory
without any prospect that the result can be modelled. Focusing on syntactic
consistency is one suggestion as to how this problem, and this apparent restriction, can be circumvented. The primary goal is then to set out clear axioms,
and to show how the resulting system extends (or is related to in some
other way) older, established systems. The next goal is to achieve a proof of
consistency, by modelling if possible, by some other means if the prospects
for modelling look hopeless. More or less the same considerations arise
when mathematicians want to proceed in some area of research by adding a
hypothesis whose status is not yet clear. The example Hilbert often cites is
analytic number theory and the Riemann Hypothesis, an example of 'an
axiomatic postulation of an as yet unproved proposition' (*1920a, p. 35).
And while insistence on the exhibition of a model as a uniform condition
is difficult enough, appeal to logical consequence as the fundamental positive
notion is even more problematic, for while Hilbert had a clear sense of what
it means to give an interpretation as early as his work on geometry in the 1890s,
and which is set out clearly in quite general terms in *1917-18, it was not clear
what it meant to quantify over all interpretations. For instance, the 1917-18
lectures (p. 131; see also Hilbert and Ackermann 1928, p. 45) state that an
interpretation is always based on one or several restricted 'species of objects'.
And, as we will see later, a domain of quantification for Hilbert (thus a 'species
of object') must always be associated with an axiom system governing the
behaviour of the objects in the domain. What then is the domain of all domains,
thus what are the axioms governing all interpretations? 59 Moreover, referring
to all interpretations which make the axioms true would thus not have made
much sense for anyone wanting to defend mathematical theories with infinitary content against those suspicious of such notions. This is presumably
why, in his 1929, Hilbert says that taking categoricity as the notion of completeness 'does not satisfy the demands of finite rigour' (1929, p. 139, 1930,
p. 6). What is really important about Hilbert's use of interpretations is just
the acceptance that it is part of the nature of the mathematical enterprise that
theories often have different, and surprising, interpretations, indeed in domains
with which we are quite familiar. Thus, we can show the independence of a
proposition from others by means of interpretations; but logical consequence
cannot be a replacement for deducibility. Hilbert might well have thought
that 'consequence' and 'derived proposition' are co-extensive or at the very
least that the latter entails the former. But the point remains that this had not
been shown, nor had the question been made precise, and hence that he could
not operate with the former in any official version of his account of the
nature of mathematics.
What emerges from these two sections about the scope and position of logic
in Hilbert's system?
First, there was the negative point. Logic for Hilbert cannot have the
function of a linguistic 'microscope' that it has for Frege and for Russell,
for there is no stock of ultimate primitives from which all mathematical propositions in their final, correct form are composed.
Second, the last section made it clear that Hilbert's theory of the axiomatic
method relies heavily on a syntactic notion of deduction, that is, on some
system of rules for the transformation of sentences that forms the basis of a
construction of a proof. However, in 1900, when Hilbert starts to stress the
axiomatic method in his foundational work, he does not have a developed logic
in anything like our sense of a logical calculus. What he has, however, is
some notion of logical consequence, though he seems to move quite easily
between 'logical consequence' and 'can be deduced by means of a finite
derivation'.
Third, Hilbert introduces consistency as a criterion of (external) existence,
a property concerned with the deductive structure of the theory. But this means,
if we avoid the provision of models, that consistency has to be investigated
by means of some other theoretical device, preferably mathematical. Hence,
the role of logic in mathematics cannot be completely autonomous.
Fourth, when Hilbert does turn his attention to the specification of a logical
calculus in 1904-5, he takes it that what he set down is a codification of
reasoning as it is actually applied in mathematical theories. This is heavily
implied, for instance, in his lectures of 1905. Having developed a calculus
for propositions (something close to a Boolean algebra, which owes much to
Schröder 1890), he says that this is to replace (or codify) the laws of logic
implicitly used in the development of mathematical theories in an axiom
system, a system such as that for real numbers. 60 He also says something
stronger than this, namely:
It is evident that the processes of organizing [Zusammenfassung] statements actually undertaken in our thought satisfy these axioms. These are the simplest of all laws of thought. Most
of the time, one does not state them as such, since thought, with respect to such organisation,
is not conscious of any sequence. Only in writing down and in expression is such a sequence
necessary. 61

This embodies a version of another of Hilbert's uniformity claims, namely
that the logical principles used in any mathematical theory are the same, or
rather it is a fairly direct statement of a special case of the more general
assertion in 1900b that one of the things that unifies mathematics is 'the
sameness of the logical tools' used in all the standard mathematical theories.
(It is also consonant with Hilbert's various strong claims about the central
nature of the axiomatic method for all scientific thought.) 62 This uniformity
thesis is quite different from the form of uniformity thesis about logic
articulated by Frege and Russell. This says that all mathematical theories are
deducible inside the basic logical theory (with the exception, for Frege, of
geometry), whereas Hilbert's merely says that mathematical theories use the
same inferential principles, and this uniformity governs only the logical interconnections, and not the primitives. It also follows from Hilbert's uniformity
thesis that any mathematical examination of the logical calculus (which he
begins in his *1905b) ought to be independent of specific theories. The same
should apply to the 'logical tools' in the full sense, once these have been
articulated.

But what should such an articulation look like? The history of this is
involved, not least because of Hilbert's reaction to the antinomies, which
itself is complex and various. What I will concentrate on here is just one
effect of the antinomies, namely in reinforcing the tendencies that have already
been identified in this essay.
3. LOGIC AND THE ANTINOMIES

In his 1905 lectures, Hilbert suggests first that he thinks the primary difficulties behind the paradoxes reveal a problem with the notion of totality
[Gesammtheit]. For instance:
One recognises immediately that the contradictions rely on collecting together [Zusammenfassung] certain totalities to a set, collections which seem not to be permitted. But nothing is
achieved simply by saying this, since all thought depends on such collection. The problem
here is rather that of distinguishing the permissible collections from the impermissible. 63

In a marginal remark to the notes of these lectures, Hilbert asks:


Why is the totality of all sets an impermissible totality, whereas that of all real numbers is a
permissible collection [Zusammenfassung]? 64

And later on, Hilbert draws a connection between this and the logical notion
of 'all':
... [T]he most difficult concept is the concept 'all' or 'every', since through its use all the
contradictions known to us arise, at least if one applies it in the traditional ways .... 65

At various places in these lectures, Hilbert is also reported as saying that the
root of the difficulties arises with the ('unscrupulous') use of 'all' in the
'traditional' or 'old' way in the formation of concepts and, through them,
classes.
From Hilbert's point of view c. 1905, it seems fairly straightforward to
say how the 'all' of universal quantification ought to be dealt with, for this
is signalled in his attitude to the system of real numbers of 1900 and is confirmed by his later reaction to the difficulty seen by Weyl in the use of
impredicative definitions. The legitimacy of using object-quantifiers in an unrestricted way simply reduces to the question of whether there exists a domain
of objects over which the quantifiers range, what Hilbert and Bernays later call
a 'fertige Gesammtheit', a 'ready made' or 'given' totality which endows
the quantified statements with a perfectly good sense. 66 The kind of existence involved here is just existence in the second, external sense; the
quantification relates to a given, fixed domain, and the existence of this domain
only makes sense in the context of an axiom system supported by a consistency proof. (The axiom system, to use the terminology Hilbert sometimes
employed, is an 'implicit definition' of the concept or domain in question,
be it Euclidean geometry, or complete ordered field, or the universe of sets.)
Something like this is implied by what Hilbert says about the existence of
the real numbers in his 1900a and 1900b, thus goes back to the beginnings
of Hilbert's concern with consistency. In explaining the axiomatic method at
the beginning of 1900a, Hilbert says (his example is again geometry):
Here, one begins with the assumption of the existence of all the elements, i.e. one puts at the
outset three systems of things, namely points, lines and planes, and then establishes relationships between these elements through certain axioms .... 67

Then having set up an axiom system for the reals, which he is convinced is
consistent, Hilbert concludes that this defuses 'the objections raised against
the existence of the totality [Inbegriff] of all real numbers and against infinite
sets of any sort'. In his 1900b he says that a consistent axiomatisation of the
real number system will show that 'the concept of the continuum exists in
the same sense as the system of rational numbers'. 68 It should also be noted
that this procedure is at the core of what Bernays calls 'restricted Platonism', 69
and it is also what is behind Hilbert's designation of the quantifiers in the 1920s
as 'transfinite elements'.
In effect, one can see in this doctrine of Hilbert's the same principle that
is behind one of Cantor's arguments for the necessity of actual infinites in
mathematics, namely that whenever it appears as if there is only a potential
infinite, the use of unbounded quantifiers shows that there is in fact an actually
infinite domain which constitutes the universe over which the quantifiers
vary. Hilbert adopts a version of this principle, namely: whenever we use
quantifiers, we must be assuming the existence of a fertige Gesammtheit over
which the quantified variables vary. Hilbert's doctrine of external existence
adds to this the important modification that this existence assumption, in
turn, must mean that the use of the quantifiers is accompanied by axioms
governing the domain of quantification. It is these axioms, and these alone,
that should govern the existence proofs for objects inside the domain.
In this regard, Hilbert's early general statements concerning set theory
are confusing. He ends his 1900a by saying that an existence proof of the
kind he believes he has given for the reals cannot be given for the Cantorian
system:
If we wanted to present a proof in the same way for the existence of a totality of all powers
(or Cantorian alephs), then we would not succeed. Indeed, the totality of all powers does not
exist, or - to use Cantor's way of expressing it - the system of all powers is an inconsistent
(or unfinished [nichtfertige]) set. 70

And even in 1905 (as we have seen) he says that the 'totality of all sets' is
'impossible'.
Hilbert's reference to Cantor is explained as follows. In 1897, through
conversation and two subsequent letters, 71 Hilbert had learnt about Cantor's
distinction between 'consistent' and 'inconsistent' (or fertige and nichtfertige) Mengen or totalities. Assuming that a set is just any 'well-defined'
collection, thus a collection for which it is clear whether or not any object
belongs to it, then Cantor's distinction (given most explicitly in the second
letter mentioned in the previous note) can be stated in the following way: a
fertige Menge is one such that 'all its elements can be thought of as being
together' or that it can be 'thought of as constituting a composite thing'. 72
The nichtfertigen Mengen are for Cantor 'Absolute' infinites, and are therefore not subject to the normal mathematical operations which can be performed
on fertigen Mengen. Given this, Cantor argues that the set of all alephs cannot
be a fertige Menge, since if it were it would be numerable and have an aleph
as cardinal number, and the well-known contradiction would follow. Cantor
uses this as the basis of a reductio, arguing that if there were a power which
is not an aleph, then all the alephs could be injected into this power, and
there would then be a fertige Menge of all alephs, which is impossible. 73 Hilbert
seems to acknowledge Cantor's conclusion in the first sentence of the passage
quoted.
Hilbert replaces Cantor's vague 'can be thought of as existing together
without contradiction' with the precise 'can be consistently axiomatised',
which, as we have noted, makes the concept of existence relative to the
axiom system concerned. Given this, the correct modification of the conclusion of Cantor's argument would be that there can be no consistent
axiomatisation of sets and cardinalities which allows the collection of all alephs
as a fertige Menge, i.e., as a thing which is subject to all the same mathematical
operations as normal sets, including the alephs themselves, thus as a thing
inside the domain. The argument would not show that the totality of all alephs
'does not exist' (thus externally), as Hilbert in fact says, only (as Hilbert
also says) that it is a nichtfertige Menge, thus does not exist internally.
Hilbert's statement in 1900b does not exhibit quite the same confusion,
but the statement is equally misleading:
The concept of the continuum or of the concept of all functions exists in just the same way as,
say, the system of whole rational numbers or even of the higher Cantorian number-classes and
powers. For I am convinced that the latter can be shown to exist in the sense I have indicated
just as clearly as the continuum can be shown to exist. This is in opposition to the system of
all powers absolutely, since for this it can be shown that a consistent system of axioms in my
sense cannot be set up. Consequently, therefore, the system of all powers is not a mathematically existing concept, according to the way I use these terms. 74

Here, Hilbert states a conjecture that the totality of all alephs can be shown
to exist in the same sense that he thinks the totality of all real numbers
exists, namely that it is possible to give a consistent axiomatisation for this
totality. But Hilbert seems to ignore Cantor's assertion that the totality of all
powers is identical to the totality of all alephs, for he says in effect that there
must always be a contradiction in the assumption that there is a totality of
all powers.
It is quite possible that Hilbert's lack of endorsement in the second statement of Cantor's proof that all powers are alephs reflects a subsequent doubt
that Cantor's proof is adequate. Indeed, it is quite clear from the section of
the same paper dealing with the continuum problem (Problem 1, discussed two
pages earlier) that Hilbert does not regard the well-ordering theorem as proved.
But in this case, what Hilbert ought to have stated is: (1) his conjecture that
we can give a consistent axiomatisation of the system of all alephs, and thus
that this totality exists in the external sense of existence (thus, in the same
sense as the totality of reals can be said to exist); (2) that Cantor has conjectured not just that every set can be well-ordered but also that all powers
are alephs, a conjecture for which as yet there is no adequate proof; and that
(3) it can be shown that there is no fertige Menge of all alephs (or all powers),
which is the same as saying that there is no Menge of all alephs inside the
totality of all alephs or all powers. Thus, in sum, the natural position for Hilbert
should have been that there exists (external sense) a finished domain of all
the Cantorian powers and number-classes (alephs) (each of which exists in
the internal sense), but that this domain itself does not exist in the internal
sense as a fertige Menge.
What I call Hilbert's 'natural position' is close to that adopted towards
set theory by Zermelo in his 1908. Zermelo's axiom system begins with the
statement:
1. Set theory has to do with a "domain" 𝔅 of objects, which we will call simply "things", and
among which are the sets. 75

To show that a set exists, says Zermelo, is to show that it belongs to the domain
𝔅, which involves just the internal sense of existence based on a proof from
the axioms of a statement of the form ∃xA, where 𝔅 is taken as the domain of
quantification. Zermelo then shows quite straightforwardly that there is no
set inside the domain 𝔅 which contains all the objects of the domain 𝔅, for
he simply turns the Russell-Zermelo paradox into a reductio which shows
that no set can contain all sets. Then:
From this theorem, it follows that not all things of the domain 𝔅 can belong to one and the
same set; i.e., the domain 𝔅 is not itself a set. 76
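
Zermelo's reductio can be written out in a few lines. The sketch below follows the standard modern presentation and assumes the Separation axiom (Zermelo's Axiom III); the name M_0 for the separated subset, and the use of a Fraktur B for the domain, are notational assumptions made here for illustration.

```latex
% Requires amsmath and amssymb (for \mathfrak). M and M_0 are assumed names.
\begin{align*}
& \text{Let } M \text{ be any set in } \mathfrak{B}, \text{ and put }
  M_0 = \{\, x \in M : x \notin x \,\} \quad \text{(a set, by Separation)}. \\
& \text{For every } x:\quad x \in M_0 \iff (x \in M \ \text{and}\ x \notin x). \\
& \text{Suppose } M_0 \in M. \text{ Taking } x = M_0 \text{ gives }
  M_0 \in M_0 \iff M_0 \notin M_0, \text{ a contradiction}. \\
& \text{Hence } M_0 \notin M: \text{ every set } M \text{ omits at least one thing of } \mathfrak{B},
  \text{ so no set contains all things of } \mathfrak{B}.
\end{align*}
```

Nothing in this derivation denies that the domain itself is given; it shows only that the domain is not one of the objects inside it.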

Thus, the issue becomes not whether there is a fertige Gesammtheit of all
sets (this is just Zermelo's domain 𝔅, or some of it), but rather whether we
can prove that there is an object in the domain (thus, capable of being a member
of other objects) which has this property. 77 The demonstration of the existence of 𝔅 itself is a separate issue, which for Hilbert would be left to a
proof of the consistency of the axiom system. 78
This points up the worry much more precisely, and makes it easier to understand Hilbert's vacillation with respect to sets, not just in 1900 but even in
1905. In the first place, in 1900 there was no axiomatisation of set theory which
does what Zermelo's promised to do, thus no means of specifying set theory
which does not rely on the universal conversion of any collection into a set
(or the association of any concept with a unique object). The danger that Hilbert
sees is clearly that the existence of the domain of all the number-classes
would automatically entail the existence of a set of all these classes, and
thus contradiction. Indeed, the possibility of this unrestrained conversion is
what Hilbert is concerned with in the passages from *1905b quoted above, and
Hilbert's major criticism of Frege's system is precisely that it allows a general
form of concept-to-object conversion which leads to proofs of the existence
of too many objects. 79
What poses the problem, and what therefore introduces much of the hesitancy, is that Hilbert believed that 'all thought depends on ... collection' of
objects to a set. Thus, given the way that Hilbert associates logic with the principles that we actually use in mathematical thought, there is a strong temptation
to take this as a logical law, a 'Law of Thought'. 80 Indeed, a version of this
view persists in the discussion in Hilbert and Bernays 1934 where it is stated
that fairly free operation of this sort of collection is fundamental to mathematics, as for instance, in the use of the fertige Gesammtheit of all natural
numbers as a set which can then act as the basis for further constructions. 81
Indeed, the view survives not only in the assertion in *1905b that a successful axiomatisation of the concept of natural number shows this latter to be a
'thing', but also indirectly in the view from 1926 (and earlier) that adding
the quantifiers is an instance of the method of adding ideal (or transfinite)
elements.
Given this, Zermelo's axiomatisation of set theory clearly addresses part
of the problem, for what it does quite consciously is replace wholesale collection-to-set conversion by piecemeal set construction principles, and aims
to do this in a way which is mathematically sufficient, that is, is such that
all those conversions which basic mathematical activity requires can be carried
out, but which proscribes the collections responsible for the paradoxes. In other
words, mathematical set construction is no longer the business of logic; if
Zermelo's axioms are the only principles available, it appears that the inference from the existence of the domain to the existence of a set of all sets is
blocked. 82
This makes it clear that Zermelo's theory of set existence not only puts a
good deal of stress on proving the consistency of the axiom system (thus,
on showing the existence of the underlying domain of quantification in
Hilbert's sense), but also on the analysis of the principles of reasoning themselves ('the logical foundations' of the disciplines in question: *1905b, p. 191),
for it shows that one natural 'Law of Thought' actually has no place in the
logical framework, whatever the temptation to adopt it.
But there is a further problem, for principles akin to Frege's Law V are
not the only problematic ways of specifying objects. Hilbert poses two questions about the theory of real numbers in his 1905 lectures. The first is: Why
is the totality of all sets an impermissible totality, while that of all real numbers
is not? The Zermelo system attempts the beginnings of an answer to this
question. The second question is quite different and is provoked by arguments like that of Richard 1905, to which Hilbert devotes a good deal of
attention, both in his 1905 lectures and later. One way of specifying a real
number is to give a 'law' (Hilbert's term) for a binary expansion. Given this,
Hilbert asks:

HILBERT AND LOGIC

What is a genuine law, e.g., for a sequence of numerals in the binary expansion for an irrational number? When is a mathematical problem correctly and clearly posed, so that we must
demand the possibility of a clear answer? Why is it, e.g., a clear question whether Mascheroni's
constant 2[2 is rational or irrational, unlike the question of whether there is a totality of binary
expansions expressed in a finite number of words?83

The latter, as Hilbert shows, using essentially the same argument as Richard
(see Hilbert *1905b, pp. 201-3, *1905e, pp. 128-30), gives rise to contradiction by an application of Cantor's diagonal method, 'one of the most
beautiful proofs in set theory', as Hilbert calls the latter (*1905b, p. 196;
*1905e, p. 125). Hilbert sees a large part of the perniciousness of the contradictions, not only in the fact that they undermine, indeed trivialise, whole
theories, but also because they undermine particular, central and 'beautiful'
proofs, like Cantor's diagonal proof. 84
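
The diagonal construction referred to here can be illustrated on a finite scale. What follows is a minimal sketch in Python and not anything found in Hilbert's or Richard's texts: the name `diagonal_complement`, and the modelling of a 'law' as a function from positions to binary digits, are assumptions made purely for illustration. Given any list of such laws, the sketch produces an expansion that differs from the n-th law at the n-th place.

```python
from typing import Callable, List

# Assumption: a 'law' is modelled as a function giving the n-th binary digit.
Law = Callable[[int], int]

def diagonal_complement(laws: List[Law]) -> Law:
    """Return a law that differs from laws[n] at digit n, for each listed n."""
    def new_law(n: int) -> int:
        if n < len(laws):
            return 1 - laws[n](n)   # flip the digit on the diagonal
        return 0                    # beyond the list any digit will do
    return new_law

# Three hypothetical laws: all zeros, all ones, alternating digits.
laws = [lambda n: 0, lambda n: 1, lambda n: n % 2]
d = diagonal_complement(laws)
digits = [d(n) for n in range(len(laws))]
```

Nothing contradictory happens as long as the list is fixed in advance. The difficulty Hilbert discusses arises only when the list is taken to be the totality of expansions 'expressed in a finite number of words' and the diagonal law is itself counted as belonging to that totality.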
While Zermelo's axiom system begins to tackle the first question, it is not
clear that it addresses the second. One of Zermelo's axioms (Axiom III, the
Aussonderungsaxiom) is designed to deal with properties in a quite general
way, at least properties that are 'definite', as Zermelo calls them. Zermelo's
characterisation is as follows:
4. A question or statement 𝔈 is called "definite" when it can be decided in a non-arbitrary
way by the basic relations of the domain, using the axioms and the generally valid logical
laws, whether the question or statement is valid or invalid. In the same way, a "class statement" 𝔈(x), in which the variable term x ranges through all individuals of a class 𝔎, is called
"definite" when the statement is definite for every individual of the class 𝔎. 85

Zermelo says clearly what the two 'basic relations' of the domain are, namely
membership (the '∈'), and the identity relation. But nothing is said about quantification over properties, and the reference to classes and "class statements"
is enough to raise at least a suspicion of circularity, and consequently to raise
the spectre of the type of definition of objects present in antinomies such as
Richard's. This shows that, unless we operate with a precisely circumscribed
(and inflexible) descriptive apparatus when discussing uncountable collections,
then we will fall into contradiction, certainly if the notion of 'definable' is
allowed to appear in a definition. And it is this sort of thing which Zermelo
does not clearly forbid.
Zermelo himself is insistent that the definitions which appear in these
antinomies are excluded by his notion of 'definite', just as the Aussonderungsaxiom itself excludes direct construction of 'sets' of the type used in Russell's
antinomy. His argument, however, amounts to nothing more than the rather
uncertain claim that such notions as 'definable in a finite number of words'
simply cannot be 'definite' because this must be decided by the 'basic relations of the domain' (and the 'generally valid laws of logic').86 Others saw
the matter differently, not least Weyl in his 1910. 87 Weyl's analysis was clearly
of profound importance to Hilbert.
In his 1910, Weyl puts forward the position that all definitions in mathematics should be direct definitions, and not 'implicit definitions' through axiom
systems, and that set theory (in something like the form presented by Zermelo)

MICHAEL HALLETT

ought to be the sole source of these direct definitions. The reasons he advances
for the choice of set theory as the universal provider of material are, crudely,
that the involvement of set theory is, at some level, inevitable (certainly if
one favours direct definition), and that it is the 'discipline that stands closest
to logic'. Thus:
So set theory appears to us today, in logical respects, as the proper foundation of mathematical science, and we will have to make a halt with set theory if we wish to formulate principles of definition which are not only sufficient for elementary geometry, but also for the
whole of mathematics. 88

What is important about this proposal in the present context are not Weyl's
reasons for it, but rather the way that it deals with the Richard antinomy,
since, for Weyl's proposal to be in any way feasible, set theory as originally
presented by Zermelo must be reformulated so as to avoid any suggestion of
antinomies like Richard's. What Weyl suggests is the use in set theory of a
circumscribed notion of concept or relation formed from the basic relations
(Zermelo's '∈' and '=') by finite iteration of certain recognised operations.
What precisely these are is left a little unclear by Weyl in 1910, but it is
made explicit in 1917 (pp. 4-8); they are, in effect, just the logical operations ¬, ∧, ∨, ∀ and ∃ (treated as quantifiers over the objects). Thus, the
proposal is to replace Zermelo's 'definite property' with (in effect) 'first-order
predicate in the language of set theory'. There is no possibility of reproducing
Richard's paradox, since the defining description employed to get the contradiction can only be reproduced if one can form a predicate which involves
quantification over, or reference to, predicates or propositions (or 'definitions')
as well as the objects. This simply cannot be done if Weyl's proposal is
followed.
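
The restricted notion of concept formation described here can be sketched as an inductive definition. The Python below is only an illustration under stated assumptions: the class names (`Atom`, `Not`, `BinOp`, `Quant`) and the checker `is_formula` are hypothetical and do not come from Weyl or Zermelo. It builds expressions from the two basic relations by finitely many applications of the logical operations, with quantifiers ranging over objects only.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Atom:
    rel: str      # one of the two basic relations: 'in' or 'eq'
    left: str     # variable names standing for objects
    right: str

@dataclass(frozen=True)
class Not:
    body: "Formula"

@dataclass(frozen=True)
class BinOp:
    op: str       # 'and' or 'or'
    left: "Formula"
    right: "Formula"

@dataclass(frozen=True)
class Quant:
    kind: str     # 'forall' or 'exists'
    var: str      # ranges over objects, never over predicates
    body: "Formula"

Formula = Union[Atom, Not, BinOp, Quant]

def is_formula(x: object) -> bool:
    """Decide whether x is generated from the basic relations by the five operations."""
    if isinstance(x, Atom):
        return x.rel in ("in", "eq")
    if isinstance(x, Not):
        return is_formula(x.body)
    if isinstance(x, BinOp):
        return x.op in ("and", "or") and is_formula(x.left) and is_formula(x.right)
    if isinstance(x, Quant):
        return x.kind in ("forall", "exists") and is_formula(x.body)
    return False

# 'x is a subset of y': for all z, not (z in x) or (z in y)
subset = Quant("forall", "z",
               BinOp("or", Not(Atom("in", "z", "x")), Atom("in", "z", "y")))
```

In this sketch a would-be predicate such as 'definable in a finite number of words' has no representation at all, since there is no clause that lets an expression mention predicates or definitions; `is_formula` rejects anything not generated by the listed clauses.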
Weyl makes three points (1910, p. 300) by way of justification for
addressing Richard's antinomy in this way.
First, rather in the spirit of Hilbert, Weyl says that it is a mistake to think
that the objects themselves are of primary importance. What matters, rather,
are what he calls the 'concepts' or 'relations'. It follows from this that to
address the Richard paradox properly means focusing attention on the way
concepts or relations are defined, for a proper account of concept formation
will tell us at the same time which definite descriptions of objects are, and
which are not, legitimate, and thus will answer Hilbert's question from 1905
as to what are the proper laws.
Second, we cannot say generally what concepts and relations are without
first saying what are the fundamental concepts and relations on the basis of
which they will be defined. There are always just a finite number of these basic
relations and concepts, and the way they behave is laid down through the
axioms of the discipline in question. In a sense, this is just a variant of the
Hilbert view that the nature of the domain (and thus, in particular, quantification) is governed by the axioms given. For example, one might view the
axiomatisation of Euclidean geometry either as an axiomatisation of points,
lines and planes, or as an axiomatisation of incidence, order, congruence and
continuity, or Zermelo's axiomatisation as an axiomatisation either of the
collection of all sets or, alternatively, as an axiomatisation of the membership relation. Both axiomatisations are, strictly speaking, both of these things,
as becomes clear when we look at different models for these axioms.
Third, the way the full stock of concepts and relations is arrived at is simply
by generation from the basic ones by repeated (and finite) application of a
finite number of generating principles, in effect, the logical formation principles mentioned above. As Weyl says, this produces something like a
'dictionary of the language [Sprachlexicon]', or rather it renders an explicit
dictionary list strictly speaking redundant by saying how every 'word' is
generated. In other words, what Weyl's procedure gives is an explicit characterisation of the mathematical language.
It is not immediately clear what positive reasons Weyl would have for
restricting the formation of concepts in this way, but this is not something
into which we need go. The important thing to note is that Weyl's schema
can be applied independently of the universalist sentiment he expresses towards
set theory in his 1910, for (as Weyl says explicitly, both here and in 1917)
the proposal can be shaped to fit any mathematical system. In this case, what
will vary as theories vary will be the basic concepts and relations, and what
will be common to all systems is what we now think of as the logical part,
that is to say, the means of generating the concepts of the theory from the basic
concepts. This, I suggest, was precisely the reason that Weyl's proposal was
attractive to Hilbert. His presentation of logic in his *1917-18 follows very
much these lines, as does the discussion of Zermelo's system in his lectures
from 1920 (see Hilbert *1920a, pp. 23-4). But there was one important
modification to, or better, clarification of, Weyl's system.
Hilbert's major modification of Weyl's proposal is that, in his hands, it gives
a universal description of the form of mathematical languages, rather than a
description of the mathematical language. One of Hilbert's constant aims
was to isolate and describe what is common to all mathematical theories.
Part of the answer Hilbert gave as early as 1900 was based on the observation that what all mathematical theories share is symbolic representation, the
ability to represent (or present) content through formulas or diagrams, that
all our contact with mathematical content is mediated through signs, formulas
and proofs, which are concatenations of formulas, which themselves are simply
concatenations of symbols. These are what we literally form and actually
manipulate, what we inspect and have direct contact with; these are what we
'write down and experiment on with chalk and pen' (to borrow precisely
Hilbert's terms from *1905b; see below). He even extends this to diagrams,
for in his 1900b, Hilbert says that 'formulas are written diagrams and diagrams
drawn formulas'.
It is clear from quite early writings that Hilbert thinks that the use of symbols
represents a general mental ability, namely the ability to think about objects
through signs, that is, through something non-abstract and quasi-sensible,
and to which we stand in an immediate, direct relation. And this is an assumption common to all use of language. The stress on signs is something Hilbert
shared with Frege. The use of signs, says Frege, is essential in order to free
ourselves from the 'sway of the inner world of ideas'. As Frege goes on:
Since the concept is something unintuitable, then it requires an intuitable representative in
order to appear [erscheinen] to us. In this way, the sensible makes accessible to us the world
of the non-sensible. 89

But what Hilbert noticed is that it is the use of signs in this way that constitutes the finitisation of all mathematics.
Given all this, it is then not surprising to find that the stress on signs is
at the root of Hilbert's conception of logic. Indeed, since logic is taken to
be a codification or an abstract reflection of the Laws of Thought, of the
way that minds actually reason, it is of the essence to be clear what it is ultimately that minds reason with, and this he thinks is the sign in combination,
the things we can 'write down and experiment on with chalk and pen'. In
his *1905b, this takes the form of
... an Axiom of Thought, or as one might say, an Axiom of the Existence of an Intelligence, which
can be formulated approximately as follows: I have the capacity to think of things and to denote
them through simple signs (a, b; ... , X, Y, ... ; ... ) in such a fully characteristic way that I
can always unequivocally recognise them again. My thinking operates with these things in this
designation [Bezeichnung] in a certain way according to determinate laws, and I am capable of
learning these laws through self-observation, and of describing them completely.90

Indeed, he remarks in a marginal note to this passage that this assumption is
'the a priori of the philosophers'. And in *1910, he says:
We start from the assumption that we possess the capacity so to name things by signs, and
that we can always recognise them again. We can then carry out certain operations with these
signs, operations which are analogous to those of arithmetic and which satisfy analogous laws. 91

Given this view, what Weyl's proposal becomes in Hilbert's hands is a
precise way of describing how mathematical language is put together from
the primitive signs, and this explains its attraction. It gives a precise and
quite general account of how we, in our dealings with a mathematical theory,
construct symbol sequences that are to stand for genuine concepts and relations once it is clear what concepts the primitive signs stand for, thus how
to construct the elements that we actually use in writing down step by step a
mathematical proof. Since this is to apply to all mathematical theories, once
again it must be independent of content. Such a view is expressed quite
clearly by Hilbert in 1920 in his description of logic as providing an empty
frame:
Here we dissected the language in its function as a universal instrument of human thought,
and we have laid bare the mechanism for carrying out a proof.
Nevertheless, this way of proceeding is incomplete, in so far as the application of the logical
calculus to a definite domain of human knowledge demands an axiom system as a foundation,
that is, a system of objects must be given, between which certain relations are considered with
definite, initial basic properties.92

This exerts further pressure in the direction of a minimal logic, for logic
described in this way gives the form of a descriptive apparatus while remaining
neutral as to content. The specific mathematical assumptions of the individual theories themselves should be kept separate from this framework.
What this amounts to, for Hilbert, is that the presentation of a mathematical theory splits into two distinctive elements, the constructive element,
which deals with the constructive generation of the formulas, and what might
be called the existential element, which deals with the axioms and therefore
with the existence of the domain(s) of quantification. Both terms are apposite.
Weyl's procedure is a perfectly constructive one, indeed, this is his term from
1910. Each concept (for Weyl) or each formula (for Hilbert) is itself a finite
object constructed in a thoroughly finite way by iteration from the finite number
of primitive concepts or formulas taken as starting point using only the four
or five principles of combination (Weyl's 'definitional principles'). In the
lecture notes *1920a (in remarks added by hand), Hilbert actually assimilates the genetic method to the constructive, for this, too, is based on a form
of generation of objects not previously constructed. He also calls the axiomatic
method of specifying objects the existential method of definition, which makes
the term 'existential' here appropriate. 93

This finally puts Zermelo's set theory (and the set theoretic assumptions
used in presentations of arithmetic) in a modern light, divorced from any
suggestion that it should be part of logic. It represents a departure from
Hilbert's position of 1905, where (in both 1905a and *1905b) he attempts to
axiomatise logic, arithmetic and the notion of 'infinite set' simultaneously.
Building on Weyl's suggestion, logic, indeed, becomes minimal, a calculus
for reasoning on the one hand (as Hilbert had recognised from the beginning), and a neutral tool for forming meaningful expressions on the other,
without any explicit content of its own.
Given the rigid separation of the two notions of constructive and existential, it is now possible to say more precisely what Hilbert thinks goes awry
in Richard's paradox. The account is a fairly straightforward application of
something like the vicious circle principle, and goes roughly like this. The
objects used to give definitions of sets or real numbers are the formulas, which
are themselves constructively presented, formed from symbols in the now
standard way, a given formula being constructed at some finite initial stage
in the generation of the domain of all formulas. This domain is certainly a
definite domain in the strong sense that it is a decidable question whether
an object belongs to it or not.
Suppose now we permit second-order quantification as one of the operations of formula formation. The central question for Hilbert is what this
quantification is quantification over. Now, says Hilbert,
... there is a difficulty.
We have to ask ourselves the question, what does it mean when we say "There is a predicate P"? In axiomatic set theory, the "there is" always refers to the domain 𝔅 we take to be
there at the foundation. In logic, we could indeed think of the predicates as collected together
to a domain. But this domain of predicates cannot be considered as something given from the
beginning; rather, it must be formed through logical operations. Only through the rules of
logical construction is the predicate-domain subsequently determined.
And now it becomes obvious that, in the rules of the logical construction of predicates,
reference to the domain of predicates cannot be permitted. Otherwise there would indeed be a
circulus vitiosus. 94

In other words, since the domain of predicates is given only through the
constructive generation process, it would be illegitimate in the course of
this generation to refer (via unrestricted quantification) to the totality of all
predicates.
This does not mean, of course, that we should disallow second-order quantification under any circumstances. For example, we might want to think of
predicates as referring to properties or propositions or subsets, and then it is
natural to make general assertions about these involving quantification over
them as well as over the underlying objects. (There will, of course, be an
extra clause governing the second-order quantifiers in the principles for the
generation of formulas.) But given Hilbert's strictures about quantification,
such a procedure must be supported by axioms specifying the domain of the
second-order quantification. This is implicitly what we do anyway when we
use second-order quantifiers, for instance when we say, in interpreting the
formulas, what is the domain D(2) that the second-order variables range over
given that the first-order variables are to range over the domain D(1). (Note that
we normally rely on some specific set theory to assure us that this D(2) exists.)
But such quantification is not quantification over the formulas themselves;
indeed, it would be utterly inappropriate for Hilbert to take it as such, and
for two reasons. The first is that formulas in Hilbert's later conception of
logic are just a formal characterisation of what it is that the mathematician
actually deals with, thus they play the role in the logical framework that the
diagrams and formulas play in the informal framework. This suggests that
formulas are not something we make existential assumptions about. We actually
have them, or rather construct them in operating the mathematical theory.
This leads to the second reason, namely that we actually have no need of
any axiomatic characterisation, for the generating principles for formulas
capture directly the objects concerned, and indeed their constructive nature,
for they reflect the restricted number of elementary ways in which we actually
form new formulas from existing ones. But since there are no existential
assumptions here, it is simply not appropriate to have quantifiers over the
predicates while specifying a predicate, and straightforwardly unnecessary.
Consequently, a formula object with a quantifier over formula objects would
make indirect reference to objects that are only generated at some subsequent stage in the construction of formulas. It is precisely this that is the
mistake in the pseudo-formula used to 'define' the number that gives the
Richard paradox.
Hilbert is particularly clear about this in the lecture manuscript *1917-18.
Having extended the logical calculus to that of second-order, and then developed versions of the known paradoxes in this extension, he says that the trouble
must lie in assumptions that were made in the extended calculus, the ordinary
(first-order) calculus being provably consistent. He says:
In the original functional calculus, we took a system or several systems (species) as given
from the beginning, and by referring to these totalities of objects, the operation with the
variables . . . was given a significance. The extension of the calculus now consisted in
regarding statements, predicates and relations as types of object, and, according to this, allowing
symbolic expressions whose logical significance demands reference to the totality of statements resp. functions.
This procedure is in fact dubious in the following way. Those expressions which obtain
their content through reference to the totality of statements resp. functions are then themselves
counted among the statements resp. functions, while on the other hand, before we can refer to
the totality of statements or functions the statements resp. functions must be considered as
determined from the beginning. Here there is a kind of logical circle, and we have grounds for
the assumption that this circle is the cause of the paradoxes. 95

Hilbert often characterises the mistake that is made if we do not observe
this distinction as the result of an impermissible mixing of the genetic (constructive) and axiomatic (existential) ways of conceiving of objects, a general
mistake that (he thinks) lies behind many of the contradictions. The genetic
view (for instance, for sets) is characterised by the adoption of one or more
construction procedures. But it is wrong to allow any of the processes to
apply quantification over the domain supposedly generated only by the
processes themselves. The existential (axiomatic) view, on the other hand,
could perfectly well use quantifiers ranging over the domain of all the objects
given by the axioms, and it could even contain objects which are defined
only in terms of all the objects in this domain. However, when the elements
are mixed, the danger is in allowing a constructive procedure to be applied
to an object which is defined only by using quantification over the whole
domain, which is tantamount to assuming that closure of the constructive
procedures themselves is itself a construction, and that its result is again
subject to constructions. The analogy with the Richard procedure should be
clear, and Hilbert also constructs 'pure' set theoretic paradoxes along similar
lines. 96
As I have said, Hilbert was not against all forms of second-order quantification. Indeed, it would be strange if he were, since the formulation of
his Completeness Axiom for the real numbers and Euclidean geometry appears
to rely on just this. As is clear from the passage from *1920a recently quoted,
Hilbert would even be happy with the assumption that the formulas given
by the original constructive characterisation form a domain, F, which means,
for Hilbert, that it would make sense to quantify over F. This produces new
formulas, but we must not assume that these formulas are themselves elements
of the domain of quantification F. What we get are genuinely new formulas,
of a logically different kind from the original formulas; if we do not observe
this, then we get all the old paradoxes back. Thus we are forced to adopt, in
effect, a form of type theory.97
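
The stratification described in this paragraph can be made concrete in a small sketch. The Python below is an illustration only, under loudly stated assumptions: the class `Pred`, its field `quantifies_over` and the function `belongs_to_domain` are hypothetical names, and the level assignment is a simplified stand-in for a type discipline rather than a rendering of Hilbert's or Russell's systems. Each predicate is placed one level above every predicate domain it quantifies over, so that a formula obtained by quantifying over F is never itself an element of F.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Pred:
    name: str
    # Levels of the predicate domains this predicate quantifies over (assumption).
    quantifies_over: Tuple[int, ...] = ()

    @property
    def level(self) -> int:
        # One level above every domain quantified over; level 0 if there is none.
        return 1 + max(self.quantifies_over, default=-1)

def belongs_to_domain(p: Pred, domain_level: int) -> bool:
    """A predicate belongs to the domain of a given level only if it has that level."""
    return p.level == domain_level

base = Pred("x in y")                         # no predicate quantifiers
new_formula = Pred("exists P in F ...", (0,)) # quantifies over the level-0 domain F
```

On this toy model the 'genuinely new formulas' sit at level 1, and treating them as members of the level-0 domain F is exactly the step that the checker refuses.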
Hilbert's objections to type theory are driven by mathematical considerations, although what concerns us most here are more the logical objections
to Russell's ramified theory, more specifically to the Axiom of Reducibility,
objections closely related to the issue of quantification which we have just
elaborated, and which therefore help to elucidate this.
Suppose one develops a view of sets as properties, something which Hilbert
explores in his *1917-18 notes and in Hilbert and Ackermann 1928. The appropriate formula U(x) serving to define a union set will be higher-order, namely
∃P[P(x) ∧ Φ(P)]. A constructive attitude towards sets will be reflected here
in a constructive attitude towards the defining properties; in this case, the
complex predicate U(x) must stand for a different kind of object from any
of the predicates over which one has quantified, for it makes reference to a
domain not available in the construction of any of the objects quantified
over, and given only by the constructive presentation of the predicates (sets).
In this case, to take the standard example, the least upper bound of a set (a
union) will not be an object of the same kind as the elements in the set, since
its defining predicate is of a different order from that of its members. This,
for Hilbert, would be an unacceptable basis on which to pursue analysis, and
is sufficient reason to abandon the constructive attitude towards sets or their
defining properties. Russell's method of overcoming the difficulty is to invoke
the Axiom of Reducibility, which assumes that there is an (unspecified) propositional function which is extensionally equivalent to the complex predicate
U(x), and which contains no quantification over higher-order entities, and therefore can stand for the same kind of object as the members of the set. But to
use quantification here, says Hilbert, is also tantamount to the abandonment
of the constructive view, thus abandonment of the view that one is dealing with
predicates (sets) in the original constructive sense, thus with expressions built
up in our language of orders from the base formulas. Russell's invocation of
Reducibility, says Hilbert, is an assumption of theory, namely an assumption
that there is a domain of objects of some sort (say, concepts or properties or
sets), a domain which has to be specified by laying down axioms. As Hilbert
puts it:
In doing this RUSSELL returns from the constructive to the axiomatic viewpoint. 98

But it was only the constructive view that led to the difficulty in the first place.
If it is to be abandoned, then it can surely be done by a much simpler system
of set theory than that Russell presents.
4. CONCLUSION: FINITISM

In this last section, I want to look briefly at some of the respects in which
this development of logic is consonant with Hilbert's finite Einstellung, and
indeed in such a way as to point in the direction of Hilbert's programme.
The first thing to emphasise is that what drives Hilbert's reaction to
Kronecker's treatment of the foundations of mathematics is the attempt to
explain the transfinite mathematics of Cantor, Dedekind, and that which it
inspired, as 'finitist' while at the same time sacrificing as little of it as possible.


Hilbert's account is based on a theory of what every mathematical mind has
direct access to, operates with and constructs, namely symbols, formulas and
proofs in finite combination. This finds its mature expression in the minimal
logical calculus, which constitutes a uniform way of describing the abstract
form of mathematical language independently of the mathematical content that
the language is intended to express. Specifying this language is a thoroughly
'finite' and constructive business. Hand in hand with this must go the investigation of key parts of mathematics to show that its central proofs can be
formulated in ways which avoid infinitary arguments, and further that these
theories, when formulated within the confines of the restricted mathematical
language, lose none of their theoretical strength. The key to achieving this is
the correct use of the quantifier, which for Hilbert depends upon axiom systems
and ultimately on consistency proofs.
That this reflects Hilbert's view from a rather early stage can be seen in
the following passage from his *1905b. Hilbert refers to Dedekind's analysis
of the concept of natural number, and says we should look carefully at
Dedekind's proofs for any proposition which contains the concept of the totality
of all numbers:
The simplest example would be:
1 + n = n + 1
as a statement for 'every' whole number. This proposition has no content which one can write
down on paper in finitely many signs, unlike a proposition such as
1 + 7 = 7 + 1
which does have such a content. Rather the content for the first of these appears at first as
infinitely large in a certain way, and the primary task is to transform this content into one that
can be written down. This happens, for instance, through the axiomatic definition of the total-concept [Gesammtbegriff] of the natural numbers as a thing, as we have done above, and that
is the motivation for all our considerations: Every mathematical proposition and proof ought
to be brought into a form which can actually be written down, and every question ought to be
rendered decidable through such inscription in a limited, finite number of signs, decidable by
"experiment with chalk and pen". [*1905c has 'with the aid of' instead of 'by'.] In this, one
must use the signs of the logical calculus with great care. 99
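
The contrast Hilbert draws between the two propositions can be illustrated in a modern proof assistant. The following is a sketch in Lean 4, which is of course not Hilbert's own formalism; the theorem name `one_add_eq` is hypothetical, while `Nat.add_comm` is assumed to be the commutativity lemma of Lean's core library. The particular instance is settled by computation on the numerals, whereas the general statement about 'every' number is settled by a finite proof term that appeals to the axiomatically characterised natural numbers.

```lean
-- The particular proposition: both sides reduce to the same numeral.
example : 1 + 7 = 7 + 1 := rfl

-- The general proposition: a finite inscription covering every n.
theorem one_add_eq (n : Nat) : 1 + n = n + 1 :=
  Nat.add_comm 1 n
```

In both cases what is checked is a finite string of signs, which is the sense in which the content of the general proposition has been 'transformed into one that can be written down'.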

That such reformulations are always possible is by no means obvious a priori,
although Zermelo's restructuring of the proof of the Well-Ordering Theorem
by means of the axiom of choice, and then the subsequent development of
set theory, are interesting examples of a successful prosecution of these
features.100 It is worth noting that Hilbert's 'finitisation' entails giving up
the view that every object is individually specifiable, something which
Hilbert himself says quite explicitly in the remarks about real numbers
from 1900a and 1900b quoted above, since one can only generate a countable number of explicit descriptions of numbers.101 It also means having
to accept, along with this, that many existence proofs might well be nonconstructive.

The motivations behind the development of this logical framework suggest
a range of (interconnected) theoretical questions. The first is the question of
how to characterise the notion of the interpretation of a mathematical language.
Connected with this, there is a range of completeness questions which emerge
naturally from Hilbert's conception of the role of logic in the foundations of
mathematical theories. The first has to do with the relation between Hilbert's
notion of logical consequence and that of deducibility in the system, whether
the latter is an adequate replacement for the former. Another version of this
completeness question is that of whether every 'correct' formula can be
deduced in the logical framework independently of any added mathematical
content. A third completeness question is that of how consistency in Hilbert's
sense is related to Dedekind's conception of consistency based on modelling
and whether it is an adequate replacement for it. Given its importance for
Hilbert's finitist project (as emerges from the passage quoted on p. 167), decidability, too, is an important issue. In short, what we see here are many of
the central issues that came to dominate modern logic.
However, from the point of view of arguing plausibly that all of mathematics is in a sense finitist, the most important question concerns the way
that consistency itself can be established. What then is the transition to Hilbert's
programme, to finitary arithmetic, finitary reasoning and finitary consistency
proofs? I want to make a few sketchy remarks.
The first important thing to note is the point made earlier in the paper:
because of the need to investigate syntactic consistency by mathematical
means, the role of logic in mathematics cannot be completely autonomous, and
hence there must be some other source of knowledge. We have seen that
what underlies Hilbert's description of mathematical language is an assumption about the mind's ability to represent through symbols, and an ability to
operate with symbols in such a way that the rules for constructing formulas
and proofs can be carried out and verified. It is stated in Hilbert's 'Axiom
of Thought' from 1905, and also more explicitly in later works, for instance
in the familiar reference to Kant from Hilbert's 1926:
Kant already taught - and indeed this formed an integral component of his teaching - that
mathematics has at its disposal a content guaranteed independently of all logic [über einen
unabhängig von aller Logik gesicherten Inhalt verfügt], which therefore can never be grounded
through logic alone. Thus for this reason, the efforts of Frege and Dedekind are bound to fail.
Rather, something must already be given in representation [Vorstellung] as a precondition for
the application of rules of inference and the performance of logical operations, namely it must
be the case that certain extra-logical concrete objects are given in intuition as immediate experience [Erlebnis] prior to all thought. If logical inference is to be certain, then these objects
must be capable of being completely surveyed in all their parts, and the facts of their occurrence, of their differentiation, their succeeding each other, or their concatenation, are also given
with the objects immediately and intuitively, and none of this can be or needs to be reduced to
anything else. This is the basic philosophical viewpoint which I hold to be necessary both for
mathematics and indeed for all intellectual [wissenschaftlichen] thought, understanding and
communication in general. And in particular in mathematics the concrete signs themselves are
the object of our consideration, and their form, in keeping with the conception we have adopted,
is immediately clear and recognisable. 102

HILBERT AND LOGIC


What is explicitly stressed in this passage is the operation with symbols, and
not so much their use in representing mathematical content to the mind; indeed,
it is claimed that we are able to verify things about elements of this formal
language by operating on them (in the ways Hilbert states) because of their
discrete, finite nature. Means of verification in mathematics must always be
constrained by the way the system is given; for example, proof in set theory
is constrained both by the formal framework (rules of inference) and by
the system of axioms. However, the objects of a (or any) formal language
are not given by an axiom system in such a language; so the constraints on
verification here are different. It is in their very nature that they are constructively (recursively) presented, so the means of verification must be
determined (and constrained) solely by the constructive (recursive) definitions.
Note that this must be the same for all mathematical languages, and therefore
for the formal representation of all mathematical theories. Let us call the
kind of reasoning involved here finitary reasoning. Thus, what we see is that
finitary reasoning (whatever the correct description of it) is determined by
Hilbert's way of explaining all mathematical theories as finitist.
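The idea that verification about a formal language simply follows the clauses of its recursive definition can be sketched roughly as follows. This is only an illustration under stated assumptions: the toy language (atoms `p`, `q`, `r`, connectives `not`, `and`, `or`), the tuple encoding, and the function names `is_formula` and `depth` are hypothetical and are not taken from Hilbert's texts.

```python
# Hypothetical toy language (an assumption for illustration only):
# atoms are the strings "p", "q", "r"; compound formulas are the tuples
# ("not", A), ("and", A, B) and ("or", A, B), where A and B are formulas.
ATOMS = {"p", "q", "r"}
BINARY = {"and", "or"}


def is_formula(x) -> bool:
    """Decide well-formedness by following the clauses of the recursive definition."""
    if isinstance(x, str):
        return x in ATOMS
    if isinstance(x, tuple) and len(x) == 2 and x[0] == "not":
        return is_formula(x[1])
    if (isinstance(x, tuple) and len(x) == 3
            and isinstance(x[0], str) and x[0] in BINARY):
        return is_formula(x[1]) and is_formula(x[2])
    return False


def depth(x) -> int:
    """Nesting depth of a formula, computed by the same recursion on its parts."""
    assert is_formula(x)
    if isinstance(x, str):
        return 0
    return 1 + max(depth(part) for part in x[1:])
```

Both functions terminate because each recursive call is applied to a strictly smaller part of a finite object; no axiom about a domain of formulas is invoked.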
Hilbert's 'finitary conjecture' now has two parts. The first part states that
this finitary reasoning is precisely the same as that which is required to carry
out proofs in a certain primitive arithmetic. The second part of the conjecture is that such reasoning (and therefore such arithmetical reasoning) can
be used to establish the consistency of all theories. About this second part,
I will say nothing here; but I do wish to say a little more about the first
part.
That operation with a logical calculus is not just 'combinatorial', but
depends explicitly on arithmetical knowledge was stressed by Poincaré in
his 1905, specifically that any operation with a logical calculus must involve
what Poincaré calls the 'principle of recurrence', by which he seems to mean
ordinary mathematical induction. (Poincaré's point is repeated by Weyl in
his 1910, where he sees it as a difficulty for his view of the language of set
theory that it presupposes a concept of number, or of 'finite repetition', while
the concept of number itself is supposedly first explained in set theory.)103
The connection between operation with the symbol system and some sort
of arithmetic (made precise by the analysis of the structure of the logical
calculus given by Gödel and Turing) was clear to Hilbert from the beginning. In his 1900b, Hilbert refers to a 'rapid, unconscious not definitive but
certain' combinatorial feel for signs.104 Note also the claim in the passage
from Hilbert *1910 quoted above that operation with signs is 'analogous to'
the operations of arithmetic.
But what exactly is 'the concept of number' presupposed by the operation with a logical calculus? A concept (such as that of natural number or
real number), where quantification over a domain (or domains) is used as
part of its specification is taken by Hilbert to be something 'defined' by an
axiom system, and this is what would be involved if indeed Poincaré were
right. What Hilbert wants to show, however, is that what is really involved
is only a much more restricted 'concept of number', a primitive notion which
does not depend on quantification or an axiom system and which mirrors the
constructive development of any of the formal languages, in short a concept
of number which is presented by a similar collection of recursive rules. The
corresponding notion of derivation for these numbers would then be determined by the recursive structure of the number-objects, but would not involve
the principle of induction.
Correspondingly, what Hilbert develops is a theory of numbers as number-signs (which makes the analogy with the formulas even tighter), a development
that is begun in the lectures *1917-18 and continued in various works in the
1920s. (For a partial description of this, see section 4 of Hallett 1994.) As is
stressed in Hilbert 1922 (p. 164, also p. 164 of Hilbert 1935), there are no
axioms, by which Hilbert means that there is no quantification and (as he
stresses in 1926) the sign-arithmetic is not developed in a logical language
(as full arithmetic must be). There are just recursive rules governing the formation of the signs and the elementary operations (such as 'addition').
Consequently, as with the formulas, the appropriate notion of demonstration
is not that of formal proof within a logical calculus, but rather proof based
on recursion following the various clauses of the definition. What this means
is that the signs are actually treated themselves as objects just as the formulas
of a language are, which would not be the case with a full, formal axiom system
for numbers. The correct way to think of it is that the number-signs might
be a way of giving an interpretation of the formal language for arithmetic,
but this language, by itself, is strictly speaking meaningless.105
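As a minimal sketch of the kind of sign-arithmetic just described, one might model number-signs as strings of strokes generated by a single recursive rule, with 'addition' defined by recursion on the second sign. The stroke encoding as a Python string and the names `is_number_sign`, `add` and `commutes` are illustrative assumptions, not Hilbert's own formalism; the point is only that each check proceeds by constructing and decomposing concrete signs, without quantifiers or axioms.

```python
STROKE = "1"  # assumed concrete sign; the number-signs are 1, 11, 111, ...


def is_number_sign(s: str) -> bool:
    """Formation rule: '1' is a number-sign; if s is one, so is s followed by '1'."""
    if s == STROKE:
        return True
    return len(s) > 1 and s.endswith(STROKE) and is_number_sign(s[:-1])


def add(a: str, b: str) -> str:
    """'Addition' by recursion on b: a + 1 appends a stroke; a + (b'1) is (a + b')1."""
    assert is_number_sign(a) and is_number_sign(b)
    if b == STROKE:
        return a + STROKE
    return add(a, b[:-1]) + STROKE


def commutes(a: str, b: str) -> bool:
    """Contentual check for two given signs: build both sums and compare them."""
    return add(a, b) == add(b, a)
```

For instance, `commutes("1", "1111111")` checks the particular statement 1 + 7 = 7 + 1 by operating on the two given signs; it establishes nothing about all n, which is where quantification, and with it an axiom system, would enter.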
The point, presumably, is that the analogy between this arithmetic of signs
and the system of formulas in a logical language is now meant to be transparent. Hilbert states it in his 1922-23 lectures, Wissen und mathematisches
Denken:
The invocation of mathematical methods in investigating the logical language is not artificial,
but entirely appropriate and even inevitable. For the role of the language in the expression of
the logical connections between thoughts corresponds to the sign language in calculation. In
following a logical passage of thought with the help of this logical language, we carry out at
the same time a calculation, in which manifold logically elementary processes are put together
according to practised rules. It is even self-evident that, when we exclude the accidental features
[zufälligen Momente] in the derivation of words, then a form of mathematical sign language
arises. 106

And then, as Hilbert says, the restriction 'to the essential' will make it possible
... to frame the rules of the grammar in such a surveyable way that logical inference can be
carried through automatically by calculation according to simple, determined rules.107

But the crucial thing about Hilbert's intention is summed up in this passage
from 1922:
When we develop number-theory in this way, there are no axioms and no contradictions of
any sort are possible. We simply have concrete signs as objects; we operate with them, and
we make contentual statements about them. And in regard in particular to the proof just given
that a + b = b + a, I would like to stress that this proof is merely a procedure that rests solely
on the construction and decomposition of number-signs and that it is essentially different from
the principle that plays such a prominent role in higher arithmetic, namely, the principle of
complete induction or of inference from n to n + 1. This principle is rather, as we shall see, a
formal principle which goes well beyond this and which belongs to a higher level; a principle
which needs proof and which is capable of proof. 108

What we have, then, is an arithmetical paradigm of finitary knowledge obtained
by finitary reasoning.
It is worth noting that the recognition of the fundamental difference between
this form of primitive arithmetic and full arithmetic developed in a formal
language goes back to the 1905 lectures. At the very end of these lectures,
Hilbert is reported as making the following remarks about quantification and
the infinite content of quantified statements. Referring again to Dedekind,
Hilbert says:
... the proofs are really much more complicated than the theorems being derived, in that the
concept 'all' in these proofs is used in that indefinite way which has been the occasion for
contradiction, while the 'all' in the theorems appears to have a quite definite and restricted
extent [Umfange]. Rendering this concept precise, as I have repeatedly stressed, is therefore
particularly important, for this concept is really responsible for all the difficulties. Many significant things will be altered in the new view as opposed to the older. Thus, for example, one
will not be able to interpret a determinate numerical proposition like
1 + 7 = 7 + 1
as a special case of the theorem
1 + n = n + 1
which holds for the totality, since the propositions refer to quite different axiom systems, and
since also the general concept is an individually quite determinate concept, and the proposition containing this is also quite determinate. Only through the addition of new axioms, e.g.,
n = 1 + 1 + ... + 1, will the transition from one proposition to the other become possible,
and only then will the first proposition appear as a special case of the second. 109

(This passage immediately follows the other passage from *1905b quoted at
the beginning of this section.) What is interesting about the statement is
the remark that the judgement involving universal quantification and the particular numerical judgement are of quite different kinds, not necessarily
connected, since these refer to 'different axiom systems'. This contains more
than an echo of Hilbert's statement to Frege that 'point' will mean different
things in different axiom systems. But more interesting is the fact that the
difference which is pointed out here corresponds in the later view to the
differences between the elementary sign-arithmetic and full arithmetic. The
most significant change is that, as we have seen, Hilbert adopts the view (in
his *1920a) that if a domain of objects is constructively presented through a
few recursive rules, there is no need for an explicit axiom system, and hence
no need for a formal system. Hence the axiomatic system of arithmetic
presented in a full logical calculus is not a simple extension of the system
of numeral-objects given; at least, it is not without further interpretation.110


One final remark. Hilbert's insistence that, in representing the formulas
or sign-arithmetic there is no use of axioms, is at the very root of his attitude
to the finitary. Put conversely, the use of quantifiers (and hence for Hilbert
an axiom system) means a decisive shift away from elementary, 'finitary' mathematics to 'transfinite' mathematics. It now becomes clear why Hilbert in
the papers 1922, 1926 and 1928 says that the logical axioms governing quantifiers are 'transfinite' axioms, and what we have with their use is an 'ideal
extension'. In fact, using quantifiers (over the natural numbers, say) is really
just the same as the addition of ideal elements, for example, adding √−1 to
the field of real numbers. This is because the use of quantifiers automatically implies for Hilbert that there must also be some axioms which 'define'
a domain of quantification, in other words, an assumption that there exists
a domain of quantification - a concept, a fertige Gesammtheit, a 'thing' in
Hilbert's earlier way of speaking. This assumption of the existence of a
'completed totality' is an assumption which has to be justified in precisely
the same way as that of the addition of ideal elements, namely by a consistency proof.
McGill University
NOTES

An early version of this paper was read at a session of the Boston Colloquium for the
Philosophy of Science in November 1993. I wish to thank the organisers of the Colloquium,
particularly Jaakko Hintikka and Fred Tauber, for their invitation. I also wish to thank George
Boolos, Emily Carson, William Demopoulos, William Ewald, Richard Heck, Moshé Machover,
Mihály Makkai, Ulrich Majer, John Mayberry, Stephen Menn and Wilfried Sieg for useful discussions on this and related work, and Mathieu Marion and Robert Cohen for their patience.
The Niedersächsische Staats- und Universitätsbibliothek and the Mathematisches Institut of the
Georg-August Universität, Göttingen kindly granted permission to quote from various unpublished lecture notes and manuscripts of Hilbert. The support of the Alexander von Humboldt
Stiftung, the Deutsche Forschungsgemeinschaft, the Social Sciences and Humanities Research
Council of Canada and the FCAR of Quebec is gratefully acknowledged.
Unless otherwise stated, the translations below are my own, although I have tried to give additional references to published translations wherever possible.
1
See van Heijenoort 1979.
2
We should not, however, make the mistake of thinking either that Hilbert's attitude to the
antinomies is simple, or that he underestimated their importance, as is clear from the following
passage from a report of his 1905 lectures:
The paradoxes which we got to know in the above [insertion in Hilbert's own hand: and which
are just a precise mathematical version of the Kantian antinomies] show only too well that
an examination of and a new approach [Neuaufführung] to the foundations of mathematics
and logic are absolutely necessary. (Hilbert *1905b, p. 215.)
3
A fuller discussion can be found in Hallett 1994.
4
Hilbert *1905b, pp. 11-12. Helmholtz uses the word 'Tatsache' in the title of his 1868, and
in his 1890 Klein refers to the geometrical facts to be captured by an axiom system. Hilbert refers
to the basis of Tatsachen repeatedly leading up to 1899 and after. I suspect that Hilbert's use
of the term is somewhat different, in that he means that, whatever the status of the propositions examined, and whatever reasons one might have for examining them, they are for the
time what one starts from.

5
Hilbert *1922-23b, p. 122 (p. 87 of the new typescript).
6
Bernays 1922b, p. 95.
7
Bernays 1922b, pp. 95-6.
8 See Hilbert *1894, p. 60 and footnote; this passage is quoted in Toepell 1986, p. 85. There
is also a similar passage in Hilbert's letter to Frege of 29.xii.1899, reproduced in Frege 1976
(English translation in Frege 1980). Unfortunately not all of the relevant remark of Hilbert's is
rendered in the English translation in Frege 1980. See also Hallett 1994.
9 One also finds this statement in the Begriffsschrift:
It behoves us to deduce the more complex among these judgements from simpler ones, not
so as to render them more certain, which in most cases would be completely unnecessary,
but rather to exhibit the relations of the judgements to one another. Merely to know the
laws is obviously not the same as to know them together with the connections some have
to others. (Frege 1879, Part II, §13, p. 24, English translation, pp. 28-9.)
'Laws' here clearly refers to all general arithmetic truths, not just to basic principles.
10
In Frege's case, this aim is particularly clear in the Vorwort to the Begriffsschrift.
11
See §1 of Hallett 1994.
12
Russell 1899, pp. 701-2. Compare Russell's statement here to Frege's statement about axioms
in his letter to Hilbert of 27.xii.1899 in Frege 1976 or Frege 1980.
13
Russell 1900, pp. 170-1. As against this, much of the spirit of Russell 1901 is that the essence
of mathematics is the study of 'what follows from what'.
14
Russell 1900, p. 171.
15
Russell 1899, pp. 702-3. Again, compare this statement to Frege's position on primitives
in his letter to Hilbert of 27.xii.1899, and in his 1906.
16
Russell 1903, p. xv.
17
See Russell 1918-19. That Principia is meant to present something like a 'logically perfect
language' is stated by Russell on pp. 197-8 of this work (pp. 58-9 of the Pears edition). Russell
himself says that there is a connection between the projects of Principia and that of logical
atomism. At the beginning of the very first lecture, he states:
The kind of philosophy I wish to advocate, which I call Logical Atomism, is one which
has forced itself upon me in the course of thinking about the philosophy of mathematics,
although I should find it hard to say exactly how far there is a definite logical connection
between the two. (Russell 1918-19, p. 178, p. 35 of the Pears edition.)
However, it is clear that Russell did not intend logic to have the same primacy in the realm
of empirical theories as in mathematics, as the following striking passage from 1912 makes
clear:
What has happened in the case of space and time has happened, to some extent, in other directions as well. The attempt to prescribe to the universe by means of a priori principles has
broken down; logic, instead of being, as formerly, the bar to possibilities, has become the
great liberator of the imagination, presenting innumerable alternatives which are closed to
unreflective common sense, and leaving to experience the task of deciding, where decision
is possible, between the many worlds which logic offers for our choice. (Russell 1912,
p. 148.)
However, the role of 'logical construction' (and hence Russell's version of the theory of classes)
remains central in the theory of logical atomism.
18
This is clearly reminiscent of the position adopted by Poincare towards the geometry to be
employed in accounts of physical space. It is also reminiscent of elements of the later views
of Carnap and Quine.
19
Both of these are reflected in Hilbert's interest in the independence of propositions from
others pursued in his work on geometry, where he uses models from separate realms (usually
algebraic structures built over various number domains).
20
Dedekind 1872, §3, p. 10, pp. 10-11 of the English translation.
It should be noted that Russell sometimes expressed awareness that the primary thing in
mathematics is the logical interconnections established by the deductive system and not the nature
of the primitives. For instance, in his 1897 he writes:

All geometrical reasoning is, in the last resort, circular: if we start by assuming points,
they can only be defined by the lines or planes which relate them; and if we start by
assuming lines and planes, they can only be defined by the points through which they pass.
This is an inevitable circle, ... It is, therefore, somewhat arbitrary to start either with points
or with lines, as the eminently projective principle of duality mathematically illustrates;
... (Russell 1897, p. 120.)
This implies (as is explicit in Hilbert's statements of the axiomatic method) that one can only
hope to know exactly what the nature of points or lines or planes is after the axioms have
been given and the system developed. However, the last remark suggests that Russell might
here be stating a view peculiar to projective geometry, and that in the realm of metrical geometry,
intuition will be sufficiently refined to isolate the correct starting point. Whatever Russell's
attitude in 1897, there is certainly a passage in his 1919 (p. 59) where Russell states explicitly, and quite generally, much the same attitude as Hilbert, thus stressing the logical
interconnections and playing down the importance of primitives.
21
Frege 1879, Vorwort, p. IX, English translation, p. 6.
22
Compare this to what Russell says on the need for a special philosophical language in his
reply to Strawson, Russell 1957, pp. 123-4.
23
This, after all, is part of the point of Hilbert's idea of Tieferlegung der Fundamente ('driving
the foundations deeper'). See Hilbert 1918.
24
See, above all, Dedekind 1854. Dedekind also discusses here cases other than those of the
extension of the number systems.
25
Dedekind, letter to Weber, 24.i.1888, in Dedekind 1932, item LXVI, pp. 489-90.
26
Dedekind 1888, §73.
27
Weyl, in effect, makes just this same point about how the axiomatic method generalises
Dedekind's procedure. He writes:
The method of implicit definition consists, not in explaining the sense [Sinn] of each individual concept on the basis of others taken to be known. Rather it consists only of setting
up a system of propositions or axioms in which these concepts are involved. This method
of implicit definitions has very often been employed in mathematics. It has the advantage
that the most important properties of the concepts can be set out immediately at the beginning, whereas by setting down a proper [i.e., explicit] definition, these properties might appear
only as very distant consequences of the definitions. (Weyl 1910, p. 301.)
28
Hilbert 1900a, p. 181.
29
The full passage is quoted above, at the beginning of the section.
30
Hilbert to Frege, 29.xii.1899, in Frege 1976 or Frege 1980. This idea, too, goes back to Hilbert
*1894.
31
See the letter mentioned in the previous note; the English translation omits the relevant
remark.
32
See Dedekind 1872, IV, p. 13, p. 15 of the English translation. See also Dummett's highly
illuminating discussions of Dedekind in his 1991, especially Ch. 5 and pp. 247-51.
33
See the letter of Hilbert to Frege of 29.xii.1899 in Frege 1976 (or Frege 1980). We should
note in passing that Gödel and others deny this with respect to the addition of axioms to
standard set theory.
34
See Hilbert's letter to Frege of 29.xii.1899, in Frege 1976 or Frege 1980. The remark continues the quote referred to in n. 30. Hilbert made a similar remark in 1891, according to
Blumenthal 1935 (pp. 402-3), although this time about 'table', 'chair' and 'beer mug'.
35
We should note that extra theories might be needed for a precise formulation of certain of
the axioms. This was surely the case with Hilbert's own Completeness Axiom for the real numbers
or for Euclidean geometry, although we cannot go into this here.
36
The completeness criterion seems, on the surface, somewhat weaker than completeness as
it is now understood. Hilbert sometimes states it thus: The system is complete when it has 'all
the facts presented to us as logical consequences' (*1905b, p. 12). His later formulations of
completeness are a good deal more precise, and apparently much stronger. For example, in Hilbert
1929 and 1930 two formulations of the completeness of Σ are its Post-completeness (Σ is Post-complete if whenever σ is unprovable from Σ, then Σ ∪ {σ} is inconsistent: see Post 1921), and
that if σ is consistent with Σ, then it is actually provable in Σ. (Hilbert's statement of Post-completeness pre-dates Post; it is used in the formulation of the property of completeness of
the propositional calculus in his 1917-18 lectures, which Hilbert then proves. See *1917-18,
pp. 152-3.)
Hilbert also sometimes lays weight on the categorical nature of the axioms, pointing out
that his own axiom system for the real numbers is categorical. See, for instance, Hilbert *1905b,
p. 21. In one of his notebooks, Hilbert even says that categoricity of an axiom system shows
that the concept 'defined' by the system is uniquely defined: see Hilbert, Notebook 3, Cod.
Ms. 600, III, p. 131. The passage is undated. Hilbert regarded categoricity as a version of completeness; see his 1929 (p. 139) and 1930 (p. 6). See also Weyl 1944, pp. 155-6. But it is important
to recognise why Hilbert prefers (syntactic) completeness to categoricity. See above, p. 152.
37
Hilbert often points out this consequence of inconsistency; see, for instance, *1905b,
p. 217, or *1917, p. 138, or *1917-18, p. 218.
38
Hilbert to Frege, 29.xii.1899, in Frege 1976, English translation in 1980.
39
See the letter to Keferstein, 27.ii.1890, point (7), p. 275 of Dedekind 1890, p. 101 of the
English translation. See also the letter to Lipschitz of 27.vi.1876, in Dedekind 1932, item LXV,
p. 477, or Scharlau (ed.) 1986, pp. 77-8.
40
See Frege 1903, §143, English translation in Geach and Black (eds.) 1966, p. 178. Even Weyl,
as late as his 1910, is of the opinion that modelling is the only way of showing consistency.
For instance, after discussing the advantages of the axiomatic method, he says this:
But the implicit definition through axioms is nevertheless only something provisional, since
one can only depend on it in case the axioms are free from contradiction, in other words,
in case a system of explicitly defined concepts can be set up which satisfies them. A good
example of what has been said here is offered by the treatment which Lebesgue has given
of the concept of integral in Ch. IV of his "Leçons sur l'intégration" (Paris 1904). Lebesgue
makes precisely the same distinction between explicit and implicit definitions, which he
distinguishes as "constructive" and "descriptive" respectively. (Weyl 1910, pp. 301-02 of
the reprint.)
It is worth noting that the condition of categoricity on axiom systems is an attempt to go as
far as possible to satisfy the stronger demands of the Frege approach while remaining within
the Hilbert framework, i.e., while still allowing that there need not be unique referents for the
primitive terms. Dedekind certainly wanted to prove categoricity for his systems. His letter to
Keferstein of 27.ii.1890 states this concern explicitly for the natural number system. It is somewhat
more involved in the case of the real number system. Part of the question is settled by Dedekind's
Proposition IV from his 1872, V, which shows that the system of cuts in the rationals is complete
in the sense that one gets no new numbers by considering cuts in the ordered system of cuts in
the rationals. (This is mirrored by Cantor in §1 of his 1872, where he argues that for any
Cauchy sequence of reals there must be an equi-convergent sequence of rationals.) What categoricity then depends on is the isomorphism of any two rational fields.
41
The use of what would now be called non-constructive existence proofs was a hallmark of
Hilbert's work from very early on, the most famous example being the proof of the Hilbert
Basis Theorem which goes back to 1890.
42
Hilbert to Frege, 29.xii.1899, in Frege 1976; English translation in Frege 1980. Hilbert's view
is also clear in his 1900a, p. 184, and 1900b, pp. 265-6 (1935, pp. 300-1).
43
Hilbert 1900b, p. 266 (1935, p. 301).
44
Bernays (1935) calls the position based on such external existence assumptions a 'restricted
platonism', and somewhat the same position is put forward by Bernays in 1922a. It is also clearly
set out in some undated notes in Bernays's hand in Hilbert's Nachlaß, Cod. Ms. 685, 3, Blätter
13-20. The main difference in Bernays's later statement is that Bernays at this point recognises
quite clearly that such assumptions can be weaker or stronger, and that they have what we
now call a 'consistency strength', something which was quite unclear before Gödel's and
Gentzen's work. Hilbert's work is also intimately tied to his views on quantification: see below,
pp. 154-8. Note also the various connections to Cantor.

45
Hilbert 1900b, p. 266 (1935, p. 301).
46
See Frege 1976, English translation in Frege 1980.
47
Hilbert *1919, p. 149.
48
For further discussion of Hilbert's theory of ideal elements, and especially an argument as
to why Hilbert's theory is not a version of instrumentalism, see §4 of Hallett 1990.
49
This is not to endorse the view (expressed in the passages from Weyl cited above, and repeated
by Schlick in 1918) that axiom systems are 'implicit definitions' of the primitive terms. See
Hallett 1994, n. 36, pp. 192-3.
50
For a discussion of some of the difficulties of Hilbert's position, see Bernays 1950.
51
See, for instance, Hilbert 1899, pp. 21-24.
52
Actually, it seems to be soundness that Hilbert assumes automatically at this early stage.
The question of completeness was formulated explicitly later, in *1917-18. However, the stress
on (finite) proof must indicate some implicit belief in its adequateness, thus on completeness.
53
Hilbert 1900b, p. 257, p. 293 of 1935. Hilbert states immediately after this that the demand
for rigour in this sense is a 'requirement of reason'. A look at the infinitary arguments given
for the Well-Ordering Theorem is enough to show that a plea for rigour in this finitary sense
was by no means an empty demand.
54
Hilbert *1905b, p. 249.
55
Hilbert *1905b, p. 217 (or *1905c, pp. 141-2).
56
Hilbert 1900a, p. 184.
57
Hilbert 1900b, pp. 265-6; p. 301 of 1935.
58
Hilbert, Notebook 3, Cod. Ms. 600, III, p. 113. The passage is undated, although the
surrounding material suggests that it was probably written in the period between 1900 and
1905. One sees, of course, what Hilbert is getting at, but his own Completeness Axiom is a
second-order axiom for which it is not clear how the second-order quantifiers are to be understood.
59
Gödel's work on completeness shows that what counts for first-order systems is just interpretation in countable domains, as, in fact, had Skolem's work. See Gödel 1929, especially
p. 62, where Gödel states that his proof of completeness avoids any use of the uncountable.
Note the analogy with Hilbert's statement in the passage quoted on p. 150 above about avoiding
infinite processes. (Hilbert makes no reference to Skolem, and no reference in this context to
Löwenheim.) The problem of quantifying over all interpretations certainly does arise in connection with Hilbert's Completeness Axiom for geometry, for this makes reference to all
interpretations of the other geometric axioms.
60
See especially *1905b, p. 228 (p. 150 in the version due to Born, *1905c).
61
Hilbert *1905c, p. 148. (The *1905b version says 'does not know' where Born says 'is
not conscious'; see pp. 225-6.) (Interestingly, Hilbert says this before he states the axioms
concerning negation and the 0 and 1 of the algebra.) On p. 250 of *1905b, Hilbert states that
the aim of foundational and logical study is the attainment of a 'complete description of mathematical thought, and therewith also theoretical thought generally'. (For this, see the passages
quoted in n. 80. See also n. 90 and the text to that note.) That logic has its roots in the description of thought seems to have been a fairly constant view of Hilbert's. For instance, in his
1928 Hilbert refers to the rules of operation with a logical calculus as expressing 'the technique of our thinking' (p. 79, p. 475 of the English translation).
62
It is also clear that, for Hilbert, 'ordinary logical operation' is much more narrowly confined
than the ordinary notion of 'follows from' appears to be for Tarski in his 1936, which, judging
from his own example (an infinitary inference about the natural numbers), is always specific
to the particular domain of objects under consideration. For Hilbert, the challenge would be to
formalise such examples within the framework of ordinary finitary logic, augmenting the proper
axioms if necessary.
63
Hilbert *1905b, p. 215.
64
It is not known when this was added, but it is in Hilbert's hand.
65
Hilbert *1905b, p. 254.
66
See Hilbert and Bernays 1934, pp. 15-30. They set out just this connection between the
existence of the domain of quantification and consistency proofs. The adjective 'fertig' is
clearly borrowed from Cantor; see p. 155.
67
Hilbert 1900a, p. 181.
68
Hilbert 1900b, p. 266, p. 301 of Hilbert 1935. The whole passage is cited below.
69
See n. 44.
70
Hilbert 1900a, p. 184.
71
See the letters of 26.ix.1897 and 2.x.1897, reprinted in Purkert and Ilgauds 1987, pp. 224-7.
See also Cantor's correspondence with Dedekind from 1899, excerpted in Cantor 1932 (English
translation in van Heijenoort (ed.) 1967).
72
Cantor asserts in the second letter to Hilbert mentioned in the previous note that this distinction is what lies behind his definition of set given at the beginning of his 1895.
73
The proof appears to rely on a step-by-step choice argument, not referred to in these letters,
but hinted at in the letters to Dedekind. See Hallett 1984, pp. 169-70.
74
Hilbert 1900b, p. 266, p. 301 of Hilbert 1935. The passage comes in a discussion of Hilbert's
Problem 2: 'The consistency of the arithmetical axioms'.
75
Zermelo 1908, p. 262, p. 201 of the English translation.
76
Zermelo 1908, p. 265, English translation, p. 203. Thus, he was in effect making use of
the inconsistencies in much the same way that Cantor had.
77
Zermelo himself states that the Aussonderungsaxiom 'forms a certain replacement for the
general and untenable definition of set given in the Introduction', a definition, in effect, incorporating the Comprehension Principle. See Zermelo 1908, pp. 263-4, English translation,
p. 202.
78
Zermelo himself does not go this far; all he says is that he does not elucidate in his paper
the further questions as to 'the origin and domain of validity' of his principles, and that:
I have not even been able to show the 'consistency' of my axioms, something which it is
quite essential to do; ... (Zermelo 1908, p. 262, English translation, pp. 200-01.)
79
Hilbert's two clearest statements of this criticism are in his 1905a, p. 175 (p. 130 of the
English translation), and 1922, p. 162 (p. 162 of the reprint).
80
In *1905b (p. 191), Hilbert describes the main problem of the foundations of mathematics
as being:
... to recognise and describe completely the driving force [Getriebe] of correct mathematical thought, and thereby of logical thought generally.
Note the mechanical analogy in this. Hilbert also says that in carrying out the 'clear and exact
construction of the logical foundation' for mathematics,
... one will and must obtain a complete description of mathematical thought, and thereby
theoretical thought. (Hilbert *1905b, p. 250.)
81
See Hilbert and Bernays 1934, p. 15 and then p. 39. The drive to ever more inclusive
totalities makes clearer one of the reasons why an axiomatisation of set theory adequate for
the development of the whole of basic mathematics would have been thought desirable.
82
See, in particular, the first paragraph of Zermelo 1908, p. 261, p. 200 of the English translation. It should be noted that Zermelo's system still falls short of showing what Hilbert claims
in 1900, for in Zermelo's treatment, among other shortcomings, there is no way of reproducing
the full Cantorian theory of ordinals and alephs. This had to await von Neumann's work. See
Hallett 1984, chs. 7 and 8, and also n. 100, below.
83
Hilbert *1905b, p. 215. The mention of 'Mascheroni's constant' $2^{\sqrt{2}}$ is important. Hilbert's
Seventh Problem in 1900b is: To show, for an algebraic number $\alpha$ and an algebraic irrational
exponent $\beta$, that $\alpha^{\beta}$ is always transcendental or at least irrational. He gives the above number,
and $e^{\pi}$ (in the form $i^{-2i}$), as specific examples of the problem. (See Hilbert 1900b,
p. 274, Hilbert 1935, p. 308.) The fact that such problems as this, or that of whether there are
infinitely many twin primes, were unsolved at the time is used by Hilbert in *1905b to give
examples of 'non-algorithmic' binary expansions in which the presence of a 1 or a 0 depends
on the outcome of their solution. (See *1905b, p. 199, or *1905c, pp. 127-8.) Hilbert regards
these as perfectly good specifications of expansions, even though we cannot calculate the numbers
involved, and this is (presumably) based on his faith in the solubility of every well-posed
mathematical question. He gives a strong statement of this conviction in 1900b, pp. 261-2 (Hilbert
1935, p. 298), and this is repeated on pp. 193 and 249 of *1905b. (This is another interesting
example of admitting metamathematical formulations directly into ordinary
mathematics, Hilbert's Completeness Axiom being the first, a practice which has become
extremely important, particularly in set theory, since Gödel's Incompleteness Theorem.) On
the other hand, Hilbert does not regard specifications of binary expansions given by throws of
a die as proper specifications (see *1905b, pp. 199-200, *1905c, p. 128).
Hilbert's Seventh Problem was solved by a result proved by Gel'fond and Schneider independently in 1934, one of whose forms states: if $\alpha$, $\beta$ are algebraic, with $\alpha \neq 0, 1$ and $\beta$
irrational, then $\alpha^{\beta}$ is transcendental. (See Tijdeman 1976, pp. 242-43.) In 1929, Gel'fond proved
a special case of the result which shows that 2r::2 must be transcendental, and in 1930 Siegel
extended the proof to cover 2n. Siegel himself recalls having heard Hilbert say in a lecture in
1920 that no one in the room would live to see this problem solved! See Reid 1970, p. 164.
Hilbert's use of the term 'Mascheroni's constant' is somewhat mysterious. As far as I can
tell, 'Mascheroni's constant' is another term for the very different Euler constant. Indeed,
Hilbert himself refers to the 'Euler-Mascheroni constant' when speaking of Euler's constant in
his 1900b (p. 261, p. 297 of Hilbert 1935).
84
Since Hilbert makes no reference here to Richard, it is hard to say whether he knew of the
Richard paradox through Richard 1905, or an account of it, or whether he had made this discovery separately. (There is no mention of the paradox in Hilbert 1905a.) After 1905, Hilbert
always refers to this antinomy as Richard's without saying that he had discovered it independently, and even the talk in 1905 of 'laws' which define real numbers, and the assertion that a
reasonable condition on laws is that they be specifiable in a finite number of words (see *1905b,
p. 200, *1905c, p. 128) is very reminiscent of Poincaré's position in 1905 and 1906. The most
important thing to note, however, is that Hilbert seems to have taken the Richard antinomy
very seriously, at least up until about 1917. There are important discussions of it, not just in
Hilbert *1905b, but also in *1917 (pp. 129-32), and *1920a, pp. 2-5, not to mention Weyl
1910, 1917 and 1919.
85
Zermelo 1908, p. 263, p. 201 of the English translation.
86
See Zermelo 1908, p. 264, p. 202 of the English translation.
87
See Weyl 1910, and also Weyl 1917. See also Skolem 1923.
88
Weyl 1910, p. 302 of the reprint.
89
Frege 1882a, pp. 48-50. The need to investigate the 'associated ideas' is later played down,
especially in the Grundlagen (1884). In Frege 1880-81, there is another remarkable similarity
with Hilbert's later position:
... this [artificial language] differs from ordinary language [Wortsprache] in yet a further
way, in that it is designed for the eye and not the ear. Of course, ordinary writing is so designed
also, but this is simply an image [Abbildung] of ordinary, spoken language [Wortsprache],
and in this sense it does not come any closer to the Begriffsschrift than this latter does.
Indeed, it is even further removed, since it is composed of signs that derive from signs and
not from content [Sachen]. (Frege 1880-81, pp. 13-14.)
90
Hilbert *1905b, p. 219. Virtually the same passage is to be found in the other Ausarbeitung
of these lectures stemming from Max Born (*1905c, p. 143). In his 1905a (p. 176; p. 131 of
the English translation), Hilbert hints at much the same thing, for there he begins his presentation of his logical theory:
Let an object of our thought be called a thought-object [Gedankending] or, briefly, a thing
[Ding] and let it be denoted by a sign.
91
Hilbert *1910, p. 159. In the last section, we will return briefly to the remark 'analogous
to arithmetic'.
92
Hilbert *1920b, pp. 46-7.
93
See *1920a, p. 11.
94
Hilbert *1920a, p. 31.
95
Hilbert *1917-18, pp. 218-20.
96
See Hilbert *1920b, pp. 10-12, or *1924-25, p. 108. What Hilbert, in effect, does in forming
these paradoxes is to invoke the danger of what have been called 'indefinitely extensible' collections, or what Russell in 1906 called 'self-reproductive classes'.
97
More correctly, we would have to assume that the formulas denote propositions or propositional functions or even sets, but with the assumption that these objects exhibit the same
complexity as their defining formulas.
98
Hilbert *1920a, p. 32.
99
Hilbert *1905b, pp. 274-5.
100
Of course, we now take it for granted that, with a system like Zermelo's set theory (or
ZF), the constructive approach to the underlying language is not too restrictive, in other words,
that the constructive attitude to formulas is not an undue restriction on the Separation or
Replacement Axioms in particular, and that we can prove the existence of all the sets we want
to show exist. This is by no means obvious without demonstration, and showing this was one
of the prime concerns of early set theory. Much of the development of set theory in the 1920s,
though, pursues the 'axiomatic' approach to the higher-order notions that Hilbert suggests is
necessary once quantification over them is applied, thus with various attempts to axiomatise
Zermelo's notion of 'definite property', and then to show that this does not weaken the theory,
i.e., that all the essential results still follow, and that it is possible to develop Cantor's theory
of transfinite numbers within such frameworks. This is true of all of Fraenkel, von Neumann
and Zermelo himself. Skolem seems to have been the major exception in the early stages. See
Skolem 1930, and Hallett 1984, 7.4, 8.2. We should note that even Bernays's axiomatisation of set theory in the late 1930s and 1940s attempts to show, in effect, how first-order predicates
can be axiomatised (as classes) alongside sets, the key assumption being that any predicate which
does not involve quantification over classes (predicates!) determines a class. The connection with
the attitude of Hilbert to predicates cannot be accidental.
101
See also Weyl 1910, p. 304. One might consider these statements of Hilbert and Weyl to
be premonitions of the Skolem paradox. Something Dedekind says might also be seen as an
early recognition of this:
Whether the symbolic language [Zeichensprache] will suffice in order to denote all the new
objects that are to be created individually is really of no importance. It is always sufficient
to signify all the individuals needed for a (limited) investigation. (Letter to Weber, 24.i.1888,
in Dedekind 1932, item LXVI, p. 490.)
102
Hilbert 1926, pp. 170-1; p. 376 of the English translation. See also the passages cited on
p. 187 of Hallett 1994. The passage quoted here is expanded from Hilbert 1922, p. 163
(p. 163 of the reprint), and is then repeated in somewhat abbreviated form in Hilbert 1928, pp.
65-6 (pp. 464-5 of the English translation), and very abbreviated form in Hilbert 1929, p. 140.
Neither the 1922 passage nor those from 1928 and 1929 contain the express reference to Kant.
103
See Weyl 1910, p. 304.
104
Hilbert 1900b, p. 260, p. 296 of Hilbert 1935.
105
See Hallett 1994, pp. 184-5.
106
Hilbert *1922-23b, p. 130 (p. 93 of the new typescript). See also pp. 128-30 (resp.
pp. 92-94).
107
Hilbert *1921-22, p. 76.
108
Hilbert 1922, p. 164, p. 164 of Hilbert 1935.
109
Hilbert *1905b, pp. 275-6.
110
Hilbert asserts this explicitly about the sign-arithmetic in his 1922, p. 164 (also p. 164 of
Hilbert 1935).

REFERENCES
Benacerraf, Paul and Putnam, Hilary (eds.) 1964: Philosophy of mathematics: selected readings.
First edition. Oxford: Basil Blackwell.
Benacerraf, Paul and Putnam, Hilary (eds.) 1983: Philosophy of mathematics: selected readings.
Second edition. Cambridge: Cambridge University Press.
Bernays, Paul 1922a: 'Über Hilberts Gedanken zur Grundlegung der Arithmetik', Jahresbericht
der deutschen Mathematiker-Vereinigung, 31, 10-19.
Bernays, Paul 1922b: 'Die Bedeutung Hilberts für die Philosophie der Mathematik', Die
Naturwissenschaften, 10, 93-9.
Bernays, Paul 1935: 'Sur le platonisme dans les mathématiques', L'enseignement mathématique, 34, 52-69. English translation by Charles Parsons in Benacerraf and Putnam (eds.)
1964, 274-86, and 1983, 258-71. German translation by Peter Bernath in Bernays 1976,
62-78.
Bernays, Paul 1950: 'Mathematische Existenz und Widerspruchsfreiheit' in Études de Philosophie
des Sciences, Neuchâtel: Éditions de Griffon, 1950, 11-25. Republished in Bernays 1976,
92-106.
Bernays, Paul *1956: 'Are pure mathematics between logic and geometry?', lecture delivered
at Columbia on May 10th, 1956, unpublished, Bernays Nachlass (Hs 973: 18), Wissenschaftshistorische Sammlung, ETH-Bibliothek, Zürich.
Bernays, Paul 1976: Abhandlungen zur Philosophie der Mathematik. Darmstadt: Wissenschaftliche Buchgesellschaft.
Blumenthal, Otto 1935: 'Lebensgeschichte', in Hilbert 1935, 388-429.
Browder, Felix (ed.) 1976: Mathematical developments arising from Hilbert problems.
Proceedings of symposia in pure mathematics, volume 28, parts 1 and 2. Providence, Rhode
Island: American Mathematical Society.
Cantor, Georg 1872: 'Über die Ausdehnung eines Satzes aus der Theorie der trigonometrischen
Reihen', Mathematische Annalen, 5, 123-32. Reprinted in Cantor 1932, 92-102.
Cantor, Georg 1883: 'Über unendliche lineare Punctmannigfaltigkeiten, 5', Mathematische
Annalen, 21, 545-91. Reprinted in Cantor 1932, 165-209; English translation in Ewald
(ed.) 1996.
Cantor, Georg 1895: 'Beiträge zur Begründung der transfiniten Mengenlehre, I', Mathematische
Annalen, 46, 481-512. Reprinted in Cantor 1932, 282-311. English translation in Georg
Cantor: Contributions to the founding of the theory of transfinite numbers, with an
historical introduction and notes by P. E. B. Jourdain, La Salle, Illinois: Open Court
Publishing Company, 1915, republished by Dover Publications, New York, 1955.
Cantor, Georg 1897: 'Beiträge zur Begründung der transfiniten Mengenlehre, II', Mathematische
Annalen, 49, 207-246. Reprinted in Cantor 1932, 312-56. English translation in Georg Cantor:
Contributions to the founding of the theory of transfinite numbers, with an historical introduction and notes by P. E. B. Jourdain, La Salle, Illinois: Open Court Publishing Company,
1915, republished by Dover Publications, New York, 1955.
Cantor, Georg 1932: Gesammelte Abhandlungen mathematischen und philosophischen Inhalts.
Edited by Ernst Zermelo. Berlin: Julius Springer Verlag.
Church, Alonzo 1956: Introduction to mathematical logic. Volume 1. Princeton: Princeton
University Press.
Cohen, Robert and Elkana, Yehuda (eds.) 1977: Hermann von Helmholtz: Epistemological
Writings. Dordrecht, Holland: D. Reidel Publishing Co. [Boston Studies in the Philosophy
of Science, vol. 37] Translation of Helmholtz 1921.
Dedekind, Richard 1854: 'Über die Einführung neuer Funktionen in der Mathematik:
Habilitationsvortrag, gehalten im Hause des Prof. Hoeck, in Gegenwart von Hoeck, Gauss,
Weber, Waitz, 30 Juni 1854' in Dedekind 1932, item LX, 428-38. English translation in
Ewald (ed.) 1996.
Dedekind, Richard 1872: Stetigkeit und irrationale Zahlen. Braunschweig: Vieweg und Sohn.
Latest reprint, 1965. Also reprinted in Dedekind 1932, 315-332. English translation in
Dedekind 1901, with a revision of this translation in Ewald (ed.) 1996.
Dedekind, Richard 1888: Was sind und was sollen die Zahlen? Braunschweig: Vieweg und Sohn.
Latest reprint, 1969. Also reprinted in Dedekind 1932, 335-91. English translation in Dedekind
1901, with a revision of this translation in Ewald (ed.) 1996.
Dedekind, Richard 1890: 'Brief an Keferstein, vom 27ten Februar, 1890', in Sinaceur
1974 (together with a French translation), 271-8. English translation by Hao Wang and Stefan
Bauer-Mengelberg in van Heijenoort (ed.) 1967, 98-103. The original is in the
Niedersächsische Staats- und Universitätsbibliothek, Göttingen (Cod. Ms. Dedekind, 13).
Dedekind, Richard 1901: Essays on the theory of numbers. English translations by W. W.
Beman of Dedekind 1872 and 1888. LaSalle, Illinois: Open Court Publishing Co. Reprinted
by Dover Publications, New York, 1963.
Dedekind, Richard 1932: Gesammelte mathematische Werke, Band 3. Herausgegeben von Robert
Fricke, Emmy Noether and Oystein Ore. Braunschweig: Friedrich Vieweg und Sohn. Reprinted
by Chelsea Publishing Co., New York, 1969, as three volumes in two, with some omissions from the third volume.
Dummett, Michael 1991: Frege: philosophy of mathematics. London: George Duckworth or
Cambridge, Ma.: Harvard University Press.
Ewald, William (ed.) 1996: Readings in the philosophy of mathematics. Oxford: Clarendon Press.
Frege, Gottlob 1879: Begriffsschrift, eine der arithmetischen nachgebildete Formelsprache des
reinen Denkens. Halle an die Saale: Verlag von Louis Nebert. Reprinted in Frege 1964; English
translation in van Heijenoort (ed.) 1967, 1-82.
Frege, Gottlob 1880-81: 'Booles rechnende Logik und die Begriffsschrift' in Frege 1964,
9-52.
Frege, Gottlob 1882: 'Über die wissenschaftliche Berechtigung einer Begriffsschrift', Zeitschrift
für Philosophie und philosophische Kritik, 81, 48-56. Reprinted in Frege 1969 and in Patzig
(ed.) 1986, 91-7.
Frege, Gottlob 1884: Die Grundlagen der Arithmetik. Breslau: Wilhelm Koebner. Reprinted
by Felix Meiner Verlag, 1986, 1988. English translation by J. L. Austin as The foundations
of arithmetic, Oxford: Basil Blackwell, second edition, 1953. The Austin edition is a bilingual one, and the page numbers are the same for both the German and English texts.
Frege, Gottlob 1903: Grundgesetze der Arithmetik, Band 2. Jena: Hermann Pohle. Reprinted in
Frege 1966.
Frege, Gottlob 1906: 'Über die Grundlagen der Geometrie', Jahresbericht der deutschen
Mathematiker-Vereinigung, 15, 293-309, 377-403, 423-30. English translation in Kluge,
E-H. W. (ed.) 1971, 49-112, reprinted in Frege 1984, 293-340. Both the translation in the
former and its reprinting in the latter give the page numbers of the original.
Frege, Gottlob 1964: Begriffsschrift und andere Aufsätze. Zweite Auflage. Mit E. Husserls und
H. Scholz' Anmerkungen herausgegeben von Ignacio Angelelli. Darmstadt: Wissenschaftliche
Buchgesellschaft.
Frege, Gottlob 1969: Nachgelassene Schriften und Wissenschaftlicher Briefwechsel: Erster Band,
Nachgelassene Schriften, edited by H. Hermes, F. Kambartel and F. Kaulbach. Hamburg:
Felix Meiner.
Frege, Gottlob 1976: Nachgelassene Schriften und Wissenschaftlicher Briefwechsel: Zweiter Band,
Wissenschaftlicher Briefwechsel, edited by G. Gabriel, H. Hermes, F. Kambartel, F. Kaulbach,
C. Thiel and A. Veraart. Hamburg: Felix Meiner.
Frege, Gottlob 1980: Philosophical and mathematical correspondence. Oxford: Basil Blackwell.
Partial English translation by Hans Kaal of Frege 1976.
Geach, Peter and Black, Max (eds.) 1966: Translations from the philosophical writings of Gottlob
Frege. Oxford: Basil Blackwell.
George, Alexander (ed.) 1994: Mathematics and mind. (Proceedings of the Conference on
Mathematics and Mind held at Amherst College, Massachusetts in April, 1991.) New York:
Oxford University Press.
Gödel, Kurt 1929: Über die Vollständigkeit des Logikkalküls. Doctoral dissertation, University
of Vienna. Published, with an English translation, in Gödel 1986, 60-101. Page numbers in
the text refer to this reprinting.
Gödel, Kurt 1986: Kurt Gödel: Collected works, volume 1. Oxford: Clarendon Press.
Hallett, Michael 1984: Cantorian set theory and limitation of size. Oxford: Clarendon Press.
Hallett, Michael 1990: 'Physicalism, reductionism and Hilbert' in Irvine (ed.) 1990, 183-257.
Hallett, Michael 1994: 'Hilbert's axiomatic method and the laws of thought' in George (ed.) 1994,
158-200.
Heijenoort, Jean van 1979: 'Absolutism and relativism in logic' in van Heijenoort 1985, 75-83.
Previously unpublished.
Heijenoort, Jean van 1985: Selected essays. Naples: Bibliopolis.
Heijenoort, Jean van (ed.) 1967: From Frege to Gödel: a source book in mathematical logic.
Cambridge, Mass.: Harvard University Press.
Helmholtz, Hermann von 1868: 'Über die Tatsachen, die der Geometrie zum Grunde liegen',
Nachrichten von der königlichen Gesellschaft der Wissenschaften zu Göttingen, 1868,
193-221. Reprinted in Helmholtz 1921.
Helmholtz, Hermann von 1921: Schriften zur Erkenntnistheorie, edited by Paul Hertz
and Moritz Schlick. Berlin: Julius Springer. English translation, Cohen and Elkana (eds.)
1977.
Hilbert, David *1894: Die Grundlagen der Geometrie. Sommersemester 1894: Hilberts eigene
Manuskript. Niedersächsische Staats- und Universitätsbibliothek, Göttingen. Cod. Ms. 541.
Hilbert, David 1899: 'Grundlagen der Geometrie', in Festschrift zur Feier der Enthüllung des
Gauss-Weber-Denkmals in Göttingen, 1899. Leipzig: B. G. Teubner.
Hilbert, David 1900a: 'Über den Zahlbegriff', Jahresbericht der deutschen Mathematiker-Vereinigung, 8, 180-4. English translation in Ewald (ed.) 1996.
Hilbert, David 1900b: 'Mathematische Probleme', Nachrichten von der königlichen Gesellschaft
der Wissenschaften zu Göttingen, mathematisch-physikalische Klasse, 1900, 253-96. English
translation in Ewald (ed.) 1996.
Hilbert, David 1905a: 'Über die Grundlagen der Logik und Arithmetik' in Krazer (ed.) 1905,
174-85. English translation in van Heijenoort (ed.) 1967, 129-38.
Hilbert, David *1905b: Logische Principien des mathematischen Denkens. Sommersemester 1905:
Ausgearbeitet von Ernst Hellinger. Mathematisches Institut, Georg-August Universität,
Göttingen. 277 pages, handwritten.
Hilbert, David *1905c: Logische Principien des mathematischen Denkens. Sommersemester 1905:
Ausgearbeitet von Max Born. Niedersächsische Staats- und Universitätsbibliothek,
Göttingen: Cod. Ms. Hilbert, 558a. 188 pages, handwritten.
Hilbert, David *1908: Prinzipien der Mathematik. Sommersemester 1908: Ausarbeitung
unbekannt. Mathematisches Institut, Georg-August Universität, Göttingen. 206 pages, handwritten.
Hilbert, David *1910: Elemente und Prinzipienfragen der Mathematik. Sommersemester 1910:
Ausgearbeitet von Richard Courant. Mathematisches Institut, Georg-August Universität,
Göttingen. 163 pages, handwritten.
Hilbert, David *1917: Mengenlehre. Sommersemester 1917: Ausgearbeitet von Margarethe Loeb.
Mathematisches Institut, Georg-August Universität, Göttingen. 166 + IV pages, typewritten.
Hilbert, David *1917-18: Prinzipien der Mathematik. Wintersemester 1917-18: Ausgearbeitet
von Paul Bernays. Mathematisches Institut, Georg-August Universität, Göttingen. vii + 246
pages, typewritten.
Hilbert, David 1918: 'Axiomatisches Denken', Mathematische Annalen, 78, 405-15. Reprinted
in Hilbert 1935, pp. 146-56. English translation in Ewald (ed.) 1995, volume 2.
Hilbert, David *1919: Natur und mathematisches Erkennen. Herbstsemester 1919: Ausgearbeitet
von Paul Bernays. Mathematisches Institut, Georg-August Universität, Göttingen. vi + 165
pages, typewritten. Retyped and published in a limited edition by the Mathematisches Institut,
Göttingen, 1989, 117 pages.
Hilbert, David *1920a: Probleme der mathematischen Logik. Sommersemester 1920:
Ausgearbeitet von N. [sic] Schönfinkel and Paul Bernays. Mathematisches Institut, Georg-August Universität, Göttingen. i + 46 pages, typewritten.
Hilbert, David *1920b: Logik-Kalkül. Wintersemester 1920-21: Ausgearbeitet von Paul Bernays.
Mathematisches Institut, Georg-August Universität, Göttingen. 62 pages, typewritten.
Hilbert, David *1921-22: Die Grundlagen der Mathematik. Wintersemester 1921-22:
Ausgearbeitet von Paul Bernays. Mathematisches Institut, Georg-August Universität,
Göttingen. 100 + 9 + 38 pages, typewritten.
Hilbert, David 1922: 'Neubegründung der Mathematik. Erste Mitteilung', Abhandlungen aus dem
mathematischen Seminar der Hamburgischen Universität, 1, 157-77. Reprinted in Hilbert
1935, pp. 157-78. English translation in Ewald (ed.) 1996.
Hilbert, David *1922-23a: Logische Grundlagen der Mathematik. Wintersemester 1922-23:
Ausarbeitung unbekannt, vermutlich von Paul Bernays. Niedersächsische Staats- und
Universitätsbibliothek, Göttingen, Cod. Ms. 567. 33 pages typewritten (pages 21-22 are
missing), plus 25 sheets in Hilbert's hand.
Hilbert, David *1922-23b: Wissen und mathematisches Denken. Wintersemester 1922-23:
Ausgearbeitet von Wilhelm Ackermann. Mathematisches Institut, Georg-August Universität,
Göttingen. iii + 138 pages, typewritten. Retyped and published in a limited edition by the
Mathematisches Institut, Göttingen, 1988, 99 pages.
Hilbert, David 1923: 'Die logischen Grundlagen der Mathematik', Mathematische Annalen,
88, 151-85. Reprinted in Hilbert 1935, pp. 178-91. English translation in Ewald (ed.)
1996.
Hilbert, David *1924-25: Über das Unendliche. Wintersemester 1924-25: Ausgearbeitet von
Lothar Nordheim. Mathematisches Institut, Georg-August Universität, Göttingen. 136 pages,
typewritten.
Hilbert, David 1926: 'Über das Unendliche', Mathematische Annalen, 95, 161-90. English
translation in van Heijenoort (ed.) 1967, 367-92.
Hilbert, David 1928: 'Die Grundlagen der Mathematik', Abhandlungen aus dem mathematischen Seminar der Hamburgischen Universität, 6, 65-85. English translation in van Heijenoort
(ed.) 1967, 464-79.
Hilbert, David 1929: 'Probleme der Grundlegung der Mathematik' in Atti del Congresso
Internazionale dei Matematici, Bologna 3-10 Settembre 1928, volume I, Bologna: Nicola
Zanichelli, 1929, 135-41. Reprinted, with omissions, as Hilbert 1930.
Hilbert, David 1930: 'Probleme der Grundlegung der Mathematik', Mathematische Annalen, 102,
151-65. Reprint, with omissions, of Hilbert 1929.
Hilbert, David 1933: Gesammelte Abhandlungen. Zweiter Band. Berlin: Julius Springer.
Hilbert, David 1935: Gesammelte Abhandlungen. Dritter Band. Berlin: Julius Springer.
Hilbert, David: Cod. Ms.: 600, I, II, III, 'Aufzeichnungen zu allgemeinen und besonderen
Problemen der Mathematik'. (3 Hefte.) Niedersächsische Staats- und Universitätsbibliothek,
Göttingen, Hilbert Nachlass.
Hilbert, David: Cod. Ms.: 685, 3, 'Bernays: Weyls Kritik der Analysis', in Bernays's handwriting.
Niedersächsische Staats- und Universitätsbibliothek, Göttingen, Hilbert Nachlass.
Hilbert, David and Ackermann, Wilhelm 1928: Grundzüge der theoretischen Logik. Berlin: Julius
Springer.
Hilbert, David and Bernays, Paul 1934: Grundlagen der Mathematik. Band I. Berlin: Julius
Springer.
Hilbert, David and Bernays, Paul 1939: Grundlagen der Mathematik. Band II. Berlin: Julius
Springer.
Husserl, Edmund 1902: 'Notiz einer mündlichen Mitteilung Zermelos an Husserl' in Rang (ed.)
1979, 399, translated by Dallas Willard as 'Memorandum of verbal communication from
Zermelo to Husserl', in Willard (ed.) 1994, 442.
Irvine, Andrew (ed.) 1990: Physicalism in mathematics. Dordrecht, Holland: D. Reidel Publishing
Co.
Kleene, S. C. 1952: Introduction to metamathematics. New York: Van Nostrand. Also published by North-Holland: Amsterdam, and Noordhoff: Groningen.
Klein, Felix 1890: 'Zur Nicht-Euklidischen Geometrie', Mathematische Annalen, 37, 544-72.
Reprinted in Klein 1921, 353-83.
Klein, Felix 1921: Gesammelte mathematische Abhandlungen, Erster Band. Berlin: Verlag von
Julius Springer.
Kluge, E-H. W. (ed.) 1971: Gottlob Frege on the foundations of geometry and formal theories
of arithmetic. New Haven and London: Yale University Press.
Krazer, A. (ed.) 1905: Verhandlungen des dritten internationalen Mathematiker-Kongresses in
Heidelberg, 1904. Leipzig: B. G. Teubner.
Mendelson, Elliot 1964: Introduction to mathematical logic. New York: D. Van Nostrand
Company. (Third edition, Monterey, California: Wadsworth and Brooks, 1987.)
Patzig, Günther (ed.) 1986: Gottlob Frege: Funktion, Begriff, Bedeutung. Sechste Auflage.
Göttingen: Vandenhoeck und Ruprecht.
Poincaré, Henri 1905: 'Les mathématiques et la logique', Revue de métaphysique et de morale,
13, 815-35.
Poincaré, Henri 1906: 'Les mathématiques et la logique', Revue de métaphysique et de morale,
14, 294-317. Reprinted with some changes in Poincaré 1908, 192-214.
Poincaré, Henri 1908: Science et méthode. Paris: Ernest Flammarion.
Post, Emil 1921: 'Introduction to a general theory of elementary propositions', American journal
of mathematics, 43, 163-85. Reprinted in van Heijenoort (ed.) 1967, 264-83.
Purkert, Walter and Ilgauds, Hans Joachim 1987: Georg Cantor 1845-1918. Basel, Boston and
Stuttgart: Birkhäuser.
Rang, Bernhard (ed.) 1979: Edmund Husserl: Aufsätze und Rezensionen (1890-1910).
Husserliana: Edmund Husserls Gesammelte Werke, Band XXII. Den Haag: Martinus Nijhoff.
Reid, Constance 1970: Hilbert. New York, Heidelberg, Berlin: Springer-Verlag.
Richard, Jules 1905: 'Les principes des mathématiques et le problème des ensembles', Revue
générale des sciences pures et appliquées, 16, 541. English translation by Jean van Heijenoort
as 'The principles of mathematics and the problem of sets' in van Heijenoort (ed.) 1967,
142-9.
Russell, B. A. W. 1897: An essay on the foundations of geometry. Cambridge: Cambridge
University Press. Republished by Dover Publications, Inc. in 1937, with a new introduction
by Morris Kline.
Russell, B. A. W. 1899: 'Sur les axiomes de la géométrie', Revue de métaphysique et de morale,
7, 685-707.
Russell, B. A. W. 1900: A critical exposition of the philosophy of Leibniz. London: George
Allen and Unwin. Second edition, 1937.
Russell, B. A. W. 1901: 'Recent work in the philosophy of mathematics', The international
monthly, 1901, reprinted as 'Mathematics and the metaphysicians' in Russell 1917.
Russell, B. A. W. 1903: The principles of mathematics. Volume 1. Cambridge: Cambridge
University Press. Second edition published as The principles of mathematics, with a new introduction, by George Allen and Unwin, London, 1937.
Russell, B. A. W. 1906: 'On some difficulties in the theory of transfinite numbers and order
types', Proceedings of the London mathematical society, 4 (second series), 29-53. Reprinted
in Russell 1973, 135-64.
Russell, B. A. W. 1912: The problems of philosophy. The Home University Library of Modern
Knowledge. Oxford: Oxford University Press.
Russell, B. A. W. 1917: Mysticism and Logic. London: George Allen and Unwin.
Russell, B. A. W. 1918-19: 'Lectures on logical atomism', The Monist, 28 (1918), 495-527,
29 (1919), 33-63, 190-222, 345-80. Reprinted in Russell 1956, 177-281. Also reissued as
The philosophy of logical atomism, edited and introduced by D. F. Pears, Open Court
Publishing Co., LaSalle, Illinois, 1985.
Russell, B. A. W. 1919: Introduction to mathematical philosophy. London: George Allen and
Unwin.
Russell, B. A. W. 1956: Logic and Knowledge. Edited by R. C. Marsh. London: George Allen
and Unwin.
Russell, B. A. W. 1957: 'Mr. Strawson on referring', Mind, 66, 385-9. Reprinted in Russell 1973,
120-6. Page numbers in the text refer to this reprinting.
Russell, B. A. W. 1973: Essays in analysis. Edited by Douglas Lackey. London: George Allen
and Unwin.
Scharlau, Winfried (ed.) 1986: Rudolf Lipschitz: Briefwechsel mit Cantor, Dedekind, Helmholtz,
Kronecker, Weierstrass und anderen. Braunschweig: Vieweg und Sohn.
Schlick, Moritz 1918: Allgemeine Erkenntnislehre. Berlin: Julius Springer.

Schröder, Ernst 1890-1905: Vorlesungen über die Algebra der Logik (exacte Logik). 3 Bände.
Leipzig: B. G. Teubner.
Sinaceur, M.-A. 1974: 'L'infini et les nombres. Commentaires de R. Dedekind à Zahlen.
La correspondance avec Keferstein', Revue d'histoire des sciences, 27, 251-78.
Skolem, Thoralf 1923: 'Einige Bemerkungen zur axiomatischen Mengenlehre'. Reprinted in
Skolem 1970, 137-52. English translation in van Heijenoort (ed.) 1967, 290-301.
Skolem, Thoralf 1930: 'Einige Bemerkungen zu der Abhandlung von Ernst Zermelo "Über die
Definitheit in der Axiomatik"', Fundamenta mathematicae, 15, 337-41. Reprinted in Skolem
1970, 275-9.
Skolem, Thoralf 1970: Selected papers in logic. Edited by Jens Erik Fenstad. Oslo:
Universitetsforlaget.
Tarski, Alfred 1936: 'Über den Begriff der logischen Folgerung', Actes du Congrès International
de Philosophie Scientifique, 7, 1-11. English translation by J. H. Woodger in Tarski 1956,
409-420.
Tarski, Alfred 1956: Logic, semantics, metamathematics: papers from 1923-1938, translated
by J. H. Woodger. Oxford: Clarendon Press. Second, corrected edition, with a new introduction by John Corcoran, published by Hackett Publications, Indianapolis, 1983.
Tijdeman, R. 1976: 'Hilbert's seventh problem: on the Gel'fond-Baker method and its applications' in Browder (ed.), 241-68.
Toepell, Michael-Markus 1986: Über die Entstehung von David Hilberts "Grundlagen der
Geometrie". (Studien zur Wissenschafts-, Sozial- und Bildungsgeschichte, Band 2.) Göttingen:
Vandenhoeck and Ruprecht.
Weyl, Hermann 1910: 'Über die Definition der mathematischen Grundbegriffe', Mathematisch-naturwissenschaftliche Blätter, 7, 93-5, 109-13. Reprinted in Weyl 1968, volume
1, pp. 298-304.
Weyl, Hermann 1917: Das Kontinuum. Reprinted by Chelsea Publishing Co., New York.
Weyl, Hermann 1919: 'Der circulus vitiosus in der heutigen Begründung der Analysis',
Jahresbericht der deutschen Mathematiker-Vereinigung, 28, 85-92. Reprinted in Weyl 1968,
volume 2, 43-50.
Weyl, Hermann 1944: 'David Hilbert and his mathematical work', Bulletin of the American
mathematical society, 50, 612-54. Reprinted in Weyl 1968, volume 4, 130-72; page numbers
in the text refer to this reprinting. Also partially reprinted in Reid 1970, 245-83.
Weyl, Hermann 1968: Gesammelte Abhandlungen. Volumes 1-4. Berlin, Heidelberg, New
York: Springer-Verlag.
Whitehead, A. N. and Russell, B. A. W. 1910-13: Principia mathematica. Volumes 1-3.
Cambridge: Cambridge University Press.
Willard, Dallas (ed.) 1994: Edmund Husserl: early writings in the philosophy of logic and
mathematics. Dordrecht: Kluwer Academic Publishers.
Zermelo, Ernst 1908: 'Untersuchungen über die Grundlagen der Mengenlehre, I', Mathematische
Annalen, 65, 261-81. English translation in van Heijenoort (ed.) 1967, 199-215.
Zermelo, Ernst 1930: 'Über Grenzzahlen und Mengenbereiche', Fundamenta mathematicae,
16, 29-47. English translation in Ewald (ed.) 1996.

MATHIEU MARION

KRONECKER'S 'SAFE HAVEN OF REAL MATHEMATICS'

ὁδὸς ἄνω κάτω μία καὶ ὡυτή
Heraclitus, Fr. 60

The mathematical legacy of Kronecker is impressive. The list of mathematicians who took up his problems includes Adolf Hurwitz, David Hilbert,
Kurt Hensel, Julius König, Ernst Steinitz, Erich Hecke, Helmut Hasse, Carl
Ludwig Siegel, Hermann Weyl and, recently, André Weil, Robert Langlands and
Harold Edwards. But, although nobody has ever doubted the brilliance of
Kronecker's results and insights, it is relatively safe to say with Edwards
that, more than a century after the publication of his complete works, "there
are important passages in Kronecker's work that no one, ever has fully understood, other than Kronecker himself" (1987, p. 29). This situation is
the result of a lack of interest caused by disdain for Kronecker's well-known
foundational stance and for his style of mathematics. Indeed, Kronecker's
mathematical practice features an insistence on providing algorithms which
is alien to the abstract, axiomatic approach which has been so dominant in
the past hundred years, in particular in Bourbaki's treatises. From this point
of view, algebra consists in the study of (highly) abstract notions - the structures-mères - such as groups, rings, fields, and so forth. Dedekind presented
and refined his theory of ideals in successive versions of his Supplement XI
to Lejeune-Dirichlet's Vorlesungen über Zahlentheorie (1879). Adopted by
Hilbert in his Zahlbericht and studied carefully by Emmy Noether and her
students, Dedekind's ideals played a crucial role in the development of modern
algebraic number theory.
Kronecker (1882) had an altogether different approach from Dedekind.
He considered in his theory of algebraic functions polynomials of the form:

where the coefficients $a_0, a_1, \ldots, a_n$ are, in the old terminology, 'rational
integral functions' of a variable x, i.e.

with the difference that, while in the Dedekind-Weber theory (1882) x is a
real or complex variable, and an algebraic function $\xi(x)$ is any zero of a
polynomial:

$$f(\xi) = a_0\xi^n + a_1\xi^{n-1} + \cdots + a_{n-1}\xi + a_n$$



M. Marion and R. S. Cohen (eds.), Quebec Studies in the Philosophy of Science I, 189-215.
1995 Kluwer Academic Publishers.


where $a_0, a_1, \ldots, a_n$ are rational functions of x, in Kronecker's theory the
variable x and the coefficients $c_{k,l}$ can only be rational. These limitations proved
intolerable to followers of Riemann (i.e. not only Dedekind and Weber, but
also Weierstrass, Hilbert and others) who worked on the theory of functions
of a real or complex variable.
Dedekind (and Weber) assumed the field of complex numbers C as the
ground field. Since the Fundamental Theorem of Algebra asserts that every
equation f(x) = 0 in Q has a root a in C, Dedekind worked with the subfield
Q(a). Hermann Weyl described this approach in these words:
This algebraic number field is cut out from the continuum [C]. The standpoint thus described
is that of analysis. (1940, p. 10)

Kronecker, however, wanted to make the arithmetic of algebraic numbers independent of the theory of complex numbers and he avoided C by using, for
example, congruences modulo $u^2 + 1$ instead of the imaginary unit $i = \sqrt{-1}$.
From this viewpoint, the expression $\sqrt{-1}$ is only a "symbol", while the equality
$i^2 = -1$, which defines $i$, is said to be "real" and $f(i) = 0$ ($f$ with rational coefficients) is equivalent to

$$f(u) \equiv 0 \pmod{u^2 + 1},$$

with u an indeterminate.2 Kronecker would take an irreducible polynomial f(x)
of Q[x] and consider Q[x] mod f(x).
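A small computational illustration may help to make the point about congruences concrete. The sketch below is mine and not Kronecker's; it assumes polynomials are stored as lists of rational coefficients, lowest degree first, and the names `poly_mod`, `mul_mod` and `vanishes_at_i` are hypothetical. It computes with residues modulo u^2 + 1 using rational operations only, so that the complex numbers are never invoked.

```python
from fractions import Fraction

MODULUS = [1, 0, 1]  # u^2 + 1, coefficients listed lowest degree first


def poly_mul(f, g):
    """Product of two polynomials with rational coefficients."""
    out = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


def poly_mod(f, m):
    """Remainder of f on division by m, by ordinary long division."""
    rem = [Fraction(c) for c in f]
    while len(rem) >= len(m):
        q = rem[-1] / m[-1]
        shift = len(rem) - len(m)
        for i, c in enumerate(m):
            rem[shift + i] -= q * c
        rem.pop()  # the leading coefficient has just been cancelled
    return rem


def mul_mod(f, g):
    """Multiply two residues modulo u^2 + 1."""
    return poly_mod(poly_mul(f, g), MODULUS)


def vanishes_at_i(f):
    """f(i) = 0 holds exactly when f(u) is congruent to 0 modulo u^2 + 1."""
    return all(c == 0 for c in poly_mod(f, MODULUS))


# (1 + 2u)(3 - u) = 3 + 5u - 2u^2, which reduces to 5 + 5u,
# mirroring the complex product (1 + 2i)(3 - i) = 5 + 5i.
product = mul_mod([1, 2], [3, -1])
```

The residue of u times u is -1, which is the "real" equality behind the symbol; and u^4 + 2u^2 + 1, being the square of the modulus, is congruent to 0.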
The use of indeterminates was felt at the time to be cumbersome, and
needlessly so, since Q[x] mod f(x) is algebraically isomorphic to Dedekind's
Q(a). Dedekind himself disliked Kronecker's use of indeterminates. In his
own words, he "could not make friends with the 'method of the indeterminates'" (1895, p. 52) since "mixing [number theory] with the theory of functions
of variables muddies the purity of the theory" (1895, p. 55). This ideal of purity
had won almost universal acceptance by the turn of the century,3 partly because
of Hilbert's adoption of Dedekind's ideals, and Kronecker's programme was
almost entirely abandoned. In the words of André Weil:
[Kronecker's] grandiose conception has been allowed to fade out of our sight, partly because
of the intrinsic difficulties of carrying it out, partly owing to historical accidents and to the
temporary successes of the partisans of purity and of Dedekind. (1950, p. 90)

It is in this context that the modern schools of philosophy of mathematics - formalism in particular - took shape. This historical fact is of great importance
and should not be overlooked.
To most philosophers of mathematics Kronecker is also a villain, because
he committed the unpardonable sin of opposing Cantor's pioneering work in
set theory.4 Moreover, Kronecker is accused of having made vicious personal
attacks on Cantor. I believe, however, that Kronecker's reputation as a klein
Despot - this expression is from Cantor himself (Dugac, 1976, p. 252) - is
exaggerated.5 At any rate, such attacks on the personality of Kronecker do
not constitute for a moment an argument against the particular views he
happened to hold. It is a shame to think that they have been, all too often in
the past, sufficient pretext not to open his books.
The limited space of this study does not allow for an adequate presentation of Kronecker's work in algebraic number theory or of his work on
elliptic functions, even if I were able to write it.6 Instead, I shall give an
overview of Kronecker's philosophy of mathematics, and shall give my reasons
for believing that we do a great injustice to Kronecker when we try to understand his work through the spectacles of modern foundational research (i.e.
misrepresenting requisites (1)-(3), below). For this purpose, I shall draw
some comparisons with his great scientific rival Dedekind, discuss Hilbert's
attitude towards his legacy and illustrate some unexpected points of contact
with Wittgenstein, another unfashionable name among philosophers of
mathematics.7
The most accurate - not to say, most sympathetic - account of Kronecker's
approach to foundations was written by his student Kurt Hensel: 8
... I must also point out a requirement which Kronecker consciously imposed on the definitions and proofs of general arithmetic (allgemeinen Arithmetik), the strict observance of which
distinguishes his treatment of number theory and algebra from almost all the others. He believed
that in these domains one could and must formulate each definition in such a way that one can
verify in a finite number of steps if it applies to a given magnitude or not. Similarly, a proof
of the existence of a magnitude can only be seen as completely rigorous if it contains a method
by which the magnitude whose existence is being claimed can really be found. Kronecker was
far from wanting to reject entirely a definition or proof which did not satisfy the highest demands;
but he believed that in this case something was lacking and he held that any improvement in
this direction was an important problem, through which our knowledge of an essential point could
be extended. Besides, he believed that a formulation which was rigorous in that respect would
take in general a simpler form than another which did not fulfill these requirements, ...
(Kronecker, 1901; vi)9

From the point of view of logical foundations, Kronecker's philosophy of
mathematics can be summed up in three tenets. The first thesis is embodied
in his most famous remark, stated in a lecture in Berlin (1886) and quoted
by Weber in his obituary: "Die ganzen Zahlen hat der liebe Gott gemacht, alles
andere ist Menschenwerk" (Weber, 1893, p. 19). This statement should be
understood as meaning that:
(1) everything must be constructed from the natural numbers
(and not as an ontological statement), to which one hastens to add the
following requisites:
(2) no completed infinities,
(3) no proof of existence or definition without an algorithm.


The first requisite is best understood by looking at the role played by his
allgemeine Arithmetik - this expression refers to the arithmetic of polynomials
with rational or integer coefficients.10 The motto according to which "natural
numbers were created by God, everything else is the work of men" reflects
Kronecker's conviction that number theory and (what we now call) algebraic
geometry can be constructed from Gattungen of quantities algebraic over
natural fields (natürliche Rationalitätsbereiche), i.e., the fields of quotients
of the ring of polynomials $Z[x_1, x_2, \ldots, x_n]$ in some indeterminates
$x_1, x_2, \ldots, x_n$ with integral coefficients. The Gattungsbereiche are constructed by
adjoining to a natural field one root of a polynomial f(x) irreducible over
the field. It seems that Kronecker realized that these Bereiche are all that he
needed.11 André Weil expressed essentially the same idea while commenting
on Kronecker's Grundzüge:
... in the final analysis, every statement we can make can be thought of as a theorem in algebraic geometry over an absolutely algebraic-ground field, i.e., either over a finite field or over
an algebraic number-field of finite degree. While this realization, of course, cannot in any way
detract from the methodological importance of arbitrary ground-fields as one of the chief tools
of modern algebraic geometers, it gives us some insight into the deep meaning of Kronecker's
view, according to which the absolutely algebraic fields are the natural ground fields of algebraic geometry, at any rate as long as purely algebraic methods (as distinct from analytical or
topological methods) are being used. (1950, p. 444)

The second of these requisites was spelled out in a footnote to his paper
'Über einige Anwendungen der Modulsysteme auf elementare algebraische
Fragen':
It seems to me that these considerations are in opposition to the introduction of Dedekind's
concepts of 'module', 'ideal', etc.; as well as to the introduction of various new concepts,
which have been used in many recent attempts (first of all by Heine) to grasp and to give foundations to the concept of 'irrational' in all its generality. The general concept of an infinite series,
e.g. one which increases according to definite powers of variables, is in my opinion permissible only with the reservation that, in every special case, certain assumptions must be shown
to hold, on the basis of the arithmetical laws of construction of the terms (or coefficients),
..., which allow the use of the series as finite expressions and thus make it really unnecessary to go beyond the concept of a finite series. (1886a, p. 156)

(The last sentence expresses Kronecker's stance on the notion of arbitrary
function, to which I shall come back in the next section.) Kronecker's
objection to the use of completed infinities in mathematics is simple enough:
they are unnecessary. This remark was written before the discovery of
the paradoxes of set theory and Kronecker was not claiming that these
Begriffsbildungen lead to contradictions, as all constructivists did afterwards,
but rather that we can simply do without them.12 There is indeed no attempt
in Kronecker's writings at viewing his domains as completed; there is no
need to. When introducing new domains, he was not introducing new numbers,
but extending the possibility to undertake given calculations.
For example, in 'Über den Zahlbegriff' (1887a), the notion of algebraic
or 'real algebraic irrational' numbers is replaced by the manipulation of
functions whose zero values cut out intervals in the real line in which such
numbers can be isolated from their Galois conjugates, and for Kronecker, to
talk here of Existenz is simply to talk of possible calculations with these
intervals (1887a, p. 272): there is no need to introduce new entities the way
Dedekind did. In this respect, it is important to understand that Kronecker
did not limit himself to the use of elementary or algebraic tools (such as Sturm's
Theorem) and calculations sometimes also involved analytic tools: it is the
ability to calculate which was his prime concern.
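To illustrate what 'calculations with these intervals' can amount to, here is a minimal sketch of my own; it is not a reconstruction of Kronecker's text. It assumes that a real algebraic number is handed to us as a polynomial with rational coefficients (highest degree first) together with a rational interval on which the polynomial changes sign exactly once. Obtaining such an isolating interval is the job of a tool like Sturm's Theorem, which is not implemented here. The names `evaluate` and `refine` are hypothetical.

```python
from fractions import Fraction


def evaluate(coeffs, x):
    """Horner evaluation; coeffs are listed highest degree first."""
    result = Fraction(0)
    for c in coeffs:
        result = result * x + c
    return result


def refine(coeffs, lo, hi, width):
    """Shrink an isolating interval [lo, hi] until it is no wider than width.

    Assumes the polynomial takes non-zero values of opposite sign at lo and hi
    and has exactly one root in between.
    """
    lo, hi = Fraction(lo), Fraction(hi)
    positive_at_lo = evaluate(coeffs, lo) > 0
    while hi - lo > width:
        mid = (lo + hi) / 2
        value = evaluate(coeffs, mid)
        if value == 0:
            return mid, mid  # the root happens to be rational
        if (value > 0) == positive_at_lo:
            lo = mid  # the sign change lies in the upper half
        else:
            hi = mid  # the sign change lies in the lower half
    return lo, hi


# The two roots of x^2 - 2 are separated by the intervals [1, 2] and [-2, -1].
positive_root = refine([1, 0, -2], 1, 2, Fraction(1, 1000))
negative_root = refine([1, 0, -2], -2, -1, Fraction(1, 1000))
```

Every step is a finite rational computation, and the interval can be narrowed as far as any given calculation requires; nothing beyond the polynomial and the two rational endpoints is ever introduced.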
In these remarks, I have already partly dealt with the third requisite, which
was, according to Hensel's quotation, the distinguishing feature of Kronecker's
mathematical practice. Kronecker required that proofs of existence contain a
method to find, in a finite number of steps, an arbitrarily good approximation to the number whose existence was proven, and he held that a
definition, in number theory or algebra, is acceptable only if it can be checked
in a finite number of steps whether any given number falls under it or not.
To give an example, Kronecker rejected the usual definition of irreducibility
according to which if a polynomial f(x) has a rational factor, it is reducible,
otherwise it isn't, because
... [it] is devoid of a sure foundation until a method is given by means of which it can be decided
whether a given function is irreducible or not by the definition. (1882, pp. 256-257)

A similar remark is to be found in Julius Molk's study 'Sur une notion qui
comprend celle de la divisibilité et sur la théorie générale de l'élimination':
Definitions should be algebraic and not just logical. It is not sufficient to say: 'an object is or
isn't'. One must explain what is meant by being and not-being, in the particular domain in
which we are moving. Only then are we making a step forward. If we define, for example, an
irreducible function as a function which is not reducible, i.e. decomposable in other functions
of a precise nature, we are not giving an algebraic definition, but a simple logical truth. In
order for us to be able to give this definition in Algebra it must be preceded by a method by
which we can obtain, in a finite number of rational operations, the factors of a reducible function.
Only this method gives to the words reducible and irreducible an algebraic meaning. (1885,
p. 8)

Using Lagrange's interpolation formula, Kronecker and Molk produced an
algorithm which determines in a finite number of steps whether f(x) is reducible
or not, and which also determines, in the case that it is, its divisors (Molk,
1885, p. 15f.). In 1940, Hermann Weyl was already pointing out in his book
The Algebraic Theory of Numbers, that the presence of a test for divisibility
is undeniably an important advantage of Kronecker's theory over Dedekind's
(1940, p. 67).13
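The procedure can be sketched in a few lines. What follows is a modern textbook rendering of the method usually attributed to Kronecker, offered as an illustration and not as a transcription of Molk (1885); it assumes a polynomial with integer coefficients, stored lowest degree first, and the function names are hypothetical. The idea is that a divisor g of f must take, at each integer point, a value dividing the value of f there. Since those values have finitely many divisors, and a polynomial of degree at most m is fixed by m + 1 values through Lagrange's formula, only finitely many candidates need to be tried.

```python
from fractions import Fraction
from itertools import product


def evaluate(f, x):
    """Horner evaluation; coefficients are listed lowest degree first."""
    result = 0
    for c in reversed(f):
        result = result * x + c
    return result


def divisors(n):
    """All positive and negative divisors of a non-zero integer."""
    positive = [d for d in range(1, abs(n) + 1) if n % d == 0]
    return positive + [-d for d in positive]


def interpolate(xs, ys):
    """Lagrange's formula: the polynomial of degree < len(xs) through the points."""
    result = [Fraction(0)] * len(xs)
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        basis, denom = [Fraction(1)], Fraction(1)
        for j, xj in enumerate(xs):
            if j != i:
                basis = [Fraction(0)] + basis       # multiply by x ...
                for k in range(len(basis) - 1):
                    basis[k] -= xj * basis[k + 1]   # ... then subtract xj times the old basis
                denom *= xi - xj
        for k, c in enumerate(basis):
            result[k] += yi * c / denom
    while result and result[-1] == 0:
        result.pop()                                # drop vanishing leading terms
    return result


def remainder(f, g):
    """Remainder of f on division by g over the rationals."""
    rem = [Fraction(c) for c in f]
    while len(rem) >= len(g):
        q = rem[-1] / g[-1]
        shift = len(rem) - len(g)
        for i, c in enumerate(g):
            rem[shift + i] -= q * c
        rem.pop()
    return rem


def kronecker_factor(f):
    """Return a non-constant integer polynomial of lower degree dividing f, or None."""
    degree = len(f) - 1
    if degree < 2:
        return None
    xs = list(range(degree // 2 + 1))               # sample points 0, 1, ..., degree // 2
    values = [evaluate(f, x) for x in xs]
    for x, v in zip(xs, values):
        if v == 0:
            return [-x, 1]                          # x is a root, so (X - x) divides f
    for ys in product(*(divisors(v) for v in values)):
        g = interpolate(xs, ys)
        if len(g) < 2 or any(c.denominator != 1 for c in g):
            continue                                # constant or non-integral candidate
        if all(c == 0 for c in remainder(f, g)):
            return [int(c) for c in g]
    return None
```

Applied to x^4 + 4, given as `[4, 0, 0, 0, 1]`, the search returns a quadratic divisor; applied to x^2 + 1 or x^2 - 2 it exhausts the candidates and returns `None`. The enumeration is hopelessly slow for large coefficients, but that is beside the point: it is the existence of a finite test that gives the word 'irreducible' what Molk calls an algebraic meaning.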
In modern terminology, one is tempted to describe Kronecker's position
by saying that concepts must be decidable. But this is misleading: Kronecker
did not reject 'undecidable' concepts, but only asked for the additional information contained in the constructive definition, which would enhance the
comparison with non-algorithmic objects. As Hensel said in the quotation
above, Kronecker "believed" that with a logical proof "something was lacking
and he held that any improvement in this direction was an important problem,
through which our knowledge of an essential point could be extended". Indeed
Lagrange's interpolation formula and the like are essential for generalizations, not just extraneous information.
This feature of Kronecker's practice was entirely alien to Dedekind. His
ideals came with no specific means of computation; this was not acceptable
to Kronecker. Dedekind's reaction to the remark by Kronecker just quoted was
published in a footnote to Was sind und was sollen die Zahlen?, two years later.
The footnote occurred just after the introduction of a "system" or set S as "completely determined when with respect to every thing it is determined whether
it is an element of S or not" (1888, 2). Dedekind was objecting to (3),
inasmuch as it implies that all concepts have to be decidable; according to him,
that a given object falls under a given concept or not is determined independently of our knowledge:
In what manner this determination is brought about, and whether we know a way of deciding
upon it, is a matter of indifference for all that follows; general laws to be developed in no way
depend upon it; they hold under all circumstances. I mention this expressly because Kronecker
not long ago ... has endeavoured to impose certain limitations upon the free formation of concepts
in mathematics which I do not believe to be justified ... (1888 2, note a)

(As an early statement of the position of "realism" in mathematics, this is as
clear as anything one can get.) To my mind, to accuse Kronecker of imposing
"certain limitations upon the free formation of concepts in mathematics" is
to misrepresent his position.
The algorithmic style of Kronecker was also that of his great predecessors, Leibniz, Euler, Jacobi, Kummer, etc. But, as the result of the influence
of Hilbert and Noether in particular, it is the decidedly non-algorithmic style
of Dedekind which has prevailed since, and which totally dominated twentieth-century
philosophy of mathematics - this time as the result of the influence of Frege
and Russell. A perhaps unique exception is Wittgenstein, who also put the
emphasis on the algorithmic aspects of mathematics, as these quotations will
show:
Mathematics consists entirely of calculations.
In mathematics everything is algorithm and nothing is meaning; even when it doesn't look
like that because we seem to be using words to talk about mathematical things. Even these
words are used to construct an algorithm. (1974, p. 468)
Let's remember that in mathematics, the signs themselves do mathematics, they don't describe
it. The mathematical signs are like the beads of an abacus. (1965, 157)
... we can't describe mathematics, we can only do it. (And that of itself abolishes every
'set theory'.) (1965, 159)

Although the insistence on (3) is the same, Wittgenstein's motives were different. As a philosopher, he had no interest in pursuing work in foundations
of mathematics. At any rate, he knew little about it. But in the course of
their activity philosophers often emphasize aspects of mathematics which
are relevant to their discussion. It seems to me that Wittgenstein sought to
emphasize the algorithmic aspect of mathematics, not in order to state a
thesis about the nature of mathematics (although the above quotations do not
show this), but rather in order to break the spell of the "descriptivist" imagery,
which goes hand in hand with the non-algorithmic style exemplified by
Dedekind.14
Another well-known aspect of Kronecker's foundational stance, not mentioned
by Hensel, is the rejection of the general notion of an arbitrary function. In
1829, Lejeune-Dirichlet proved that the Fourier series of a (piecewise
monotone) function converges to the function. 15 The assumptions needed to
prove this were very weak, but the class of piecewise monotone functions is
rather wide. Dirichlet was thus led to introduce a general concept of a function
of a real variable, according to which y is called a function of x if within a
definite interval there is a definite value of y for every value of the variable
x; while it does not matter whether y is dependent on x according to the same
rule within the whole interval or not and whether the dependence can be
expressed by means of mathematical operations or not (Lejeune-Dirichlet,
1837, p. 135).
The adoption of Dirichlet's notion meant abandoning the idea of a function
being determined by a formula in favour of functions 'given by a graph',
i.e. an arbitrary infinite subset of R × R. Riemann was one of the prominent
figures in this drive to discard the notion of a function defined by a rule,
and he had a considerable influence on Dedekind and Weber. It was indeed
Riemann's work on Abelian functions which motivated their masterpiece,
'Theorie der algebraischen Funktionen einer Veränderlichen' (1882).16
The note to Kronecker (1886a) quoted above contains the only statement
of his views on the notion of arbitrary function:17 an infinite series is admissible only if a rule for computing its terms is given. There is also a good
statement of this viewpoint in Molk's revised French translation (1909) of
Alfred Pringsheim's contribution to the Encyklopädie der mathematischen
Wissenschaften (1899).18 According to Molk, Dirichlet's notion would make
sense only if one was able to write down an "ideal table":
In order to bring to the fore the arithmetical dependence of a (real) variable x and a function y
of this variable x in a domain (x), in the general sense of the word given by G. Lejeune-Dirichlet,
one would need to draw a kind of 'ideal table' in which each value of y is facing the corresponding value of x. (1909, p. 20)

But, apart from the case where the domain is finite, the "ideal table" must
contain an infinity of elements, and
... one cannot see how it could be effectively realized. There is no need for this since, in
order to study in a precise manner the arithmetical dependence of y and of x, it is not necessary to encompass it all in one look; it suffices to obtain at any moment, in a rigorous fashion,
those of the elements which are needed.
This is the case when the ideal table is, so to speak, condensed in a computational procedure from which one obtains effectively the value of y corresponding to each value of x in the
domain (x). (1909, p. 20)

So Molk concluded his discussion by requesting that functions be given by
a rule:
... when we say with G. Lejeune-Dirichlet, that a (real) variable y is a (real) one-to-one
function of a (real) variable x, in a domain (x), when to each value of x in the domain (x) corresponds a definite value of y, we cannot dispense with the supposition that this definite value
of y is, either directly or indirectly, defined with the help of the corresponding value of x by
some procedure of computation; ... (1909, p. 22)19

We can see here that the rejection of the notion of an arbitrary function is
linked with requisite (3), i.e. with the emphasis on algorithms. As an example
of the opposite standpoint, I shall quote Philip Jourdain's objection (to
Pringsheim's original article):
This implies that the function must be defined by at most an enumerable aggregate of specifications. However, such a restriction, which would reduce the cardinal number of all functions
to be considered from [2"1 to [Nol, and would in general, exclude integrable functions, is certainly not necessary for the theorems on the upper and lower limits of a function, and, in any
case, is a practical necessity irrelevant to our contemplation of functions sub specie aeternitatis. (1905, pp. 185-186n.)

Here, the order of priority is clearly stated: the requisites of mathematical
practice are brushed aside in favour of a "contemplation sub specie aeternitatis". In philosophy of mathematics, Molk's words were echoed more recently
by Wittgenstein, who is on the record as pointing out that "set theory starts
from Dirichlet's concept of a function" (1979, p. 102). Wittgenstein also
thought that Dirichlet's concept of a function of a real variable made sense
only if one could produce an "ideal table", which he here calls a "list":
A law is not another method of giving what a list gives. The list cannot give what the law
gives. No list is imaginable any more. We are actually dealing with two absolutely different
things. People always pretend that the one is an indirect method of doing the other. I could supply
a list; but as that is too complicated or beyond my powers, I will supply a law. This sounds
like saying, Up to now I have been talking to you; when I am in England I shall have to write
to you. (1979, p. 103)

The rejection of the notion of arbitrary function is one of the main features
of Wittgenstein's philosophy of mathematics. 20
I would like to compare now the different philosophical approaches behind
Dedekind's and Kronecker's algebraic number theories. There is a fascinating
similarity in the way Dedekind conceived his three fundamental notions of
ideals, cuts and chains. 21 In order to avoid having to produce irreducible factors,
for a given arbitrary algebraic field Q(a), he introduced his ideals as (possibly
infinite) sets of integers of Q(a) behaving as if they were divisible by some
common irreducible factor. In Stetigkeit und irrationale Zahlen, Dedekind
introduced cuts (Schnitte) as any division of the rational numbers into two
classes such that any number in the first class is less than any number in the
second. For classes $A_1$ and $A_2$, the cut is denoted by $(A_1, A_2)$ (1872, pp. 12-13).
According to this conception, it suffices to talk of irrational numbers, such
as $\pi$, in terms of sets of rational numbers behaving as if separated off by
numbers $A_1$ and $A_2$.22 In Was sind und was sollen die Zahlen?, Dedekind did
not define a particular class as the natural numbers, but a class of "simply
infinite systems", any one of which is able to serve as the subject of arithmetic, if we abstract from their specific properties (1888, 73).23 A class N
is a simply infinite system if there is a one-to-one function $\varphi$ mapping N
into itself, an object a which is not a value of $\varphi$ for an argument in N, and
N is the intersection of all classes which contain a and which contain $\varphi(x)$ whenever they contain x, i.e. if N contains a and the chain (Kette) of a (1888, 37 & 44). This leads
to the logicist definition of natural numbers as the set of all classes which
behave as if possessing just as many members as that class.
In all three cases, Dedekind was able to offer a most general (and nonalgorithmic) definition. The root of this common approach is, to my mind, a
general methodological requirement for the introduction and axiomatization of
new abstract notions. This is the requirement that the introduction of new
concepts should not be marred by explicit, arbitrary representations:
My efforts in number theory have been directed towards basing the work not on arbitrary
representations or expressions but on simple foundational concepts, and thereby - although the
comparison may sound a bit grandiose - to achieve in number theory something analogous to
what Riemann achieved in function theory, in which connection I cannot suppress the passing
remark that Riemann's principles are not being adhered to in a significant way by most writers
- for example, even in the newest works on elliptic functions. Almost always they mar the
purity of the theory by unnecessarily bringing in forms of representation which should be
results, not tools, of the theory. (1932, p. 477)

This requirement meant a lot to Dedekind since, as he explained himself in
his second exposition of his theory of ideals (1887), it was while following
it that he was able to overcome the deficiencies of Kummer's approach:
I succeeded in obtaining a general theory, with no exceptions, ... , only after completely abandoning the older more formal approach, and after replacing it by another approach which starts
from the simplest fundamental conception, ... In this approach I do not need any new creation
anymore, such as Kummer's ideal number, and the consideration of this system of really existing
numbers, which I call an ideal, suffices entirely. (1887, p. 268)

With $\lambda$ denoting a prime number and $\alpha$ an imaginary root of the equation
$\alpha^\lambda = 1$, Kummer posed the problem of the resolution into prime factors of
the cyclotomic integers, i.e. numbers of the form

$$a_0 + a_1\alpha + a_2\alpha^2 + \cdots + a_{\lambda-1}\alpha^{\lambda-1},$$

where $a_0, a_1, a_2, \ldots, a_{\lambda-1}$ are integers. (See Kummer, 1851, for example.)
Unique factorization into primes was not valid in this field, and Kummer
had to introduce "ideal prime factors", as he called them, in order to reestablish it. With the theory of cyclotomic fields with prime $\lambda$, Kummer was able
to prove Fermat's Last Theorem, i.e.

for prime exponents. But his interest was higher reciprocity laws. He succeeded (in 1859) in generalizing his theory to extensions of these cyclotomic
fields obtained by adjoining a $\lambda$th root, in order to prove his general reciprocity law.24
Dedekind was dissatisfied with Kummer's theory for many reasons: first,
he complained that Kummer never explained what an ideal prime factor is,
but only defined such ideal prime factors by providing a procedure for determining the multiplicity with which a given cyclotomic integer is divisible
by them. 25 Secondly, in cyclotomic fields the factorization of a rational prime
p in the number field can be determined from the factorization of a cyclotomic polynomial modulo p, but Kummer's approach didn't lend itself to a
generalization, because of ramification, and this is why Dedekind abandoned
it. Dedekind saw in this "arbitrary representation" the source of Kummer's
mistakes: 26
Kummer has not defined ideal numbers themselves, but only divisibility by these numbers. If
a number $\alpha$ possesses a given property A, always consisting in the fact that $\alpha$ satisfies one or
many congruences, he says that $\alpha$ is divisible by a determined ideal number, corresponding to
the property A. Although this introduction of new numbers is perfectly legitimate, it is to be
feared that, as a result of the mode of expression one has chosen, according to which one
speaks of ideal numbers and their products, and by the presumed analogy with rational numbers,
one would be drawn to hasty conclusions and thus to insufficient proofs; this pitfall was indeed
not always entirely avoided. (1887, p. 268)

It is the general methodological requirement just mentioned, i.e., that one
should always avoid arbitrary representations when introducing new concepts,
which presided over Dedekind's introduction of his ideals - as for the introduction
of cuts, Dedekind himself wrote a lengthy footnote to that effect (1877,
p. 269). This requirement is in many ways paradigmatic of the modern style
of mathematics which was alluded to at the very beginning of this paper.
It is easy to see why Dedekind could not accept Kronecker's requisites (1)-(3).
Requisite (1): As a result of their interest in Riemann's work on Abelian
functions, Dedekind and Weber worked with subfields of C, generated from
Q by the root of an irreducible polynomial f(x) with integer coefficients.
Since the Fundamental Theorem of Algebra asserts that every such equation f(x) = 0 has
a root $\alpha$ in C, one can form such an extension $Q(\alpha)$. Kronecker avoided C
using, as we saw, congruences modulo $x^2 + 1$. One can thus create the field
$k(\alpha)$ out of a given field k by a purely algebraic construction, the adjunction
of the root $\alpha$ of f: with k an arbitrary field, and f(x) an irreducible polynomial
in k of degree n, Kronecker considers the field $k(\alpha)$ of polynomials in an
indeterminate $\alpha$ with coefficients in k, such that two polynomials are equivalent
if they are congruent mod $f(\alpha)$. Therefore, Kronecker did not need an embedding
field such as C, where f(x) = 0 has a solution. It was the former approach,
that of Dedekind-Weber, which was adopted and popularized by Hilbert in
his Zahlbericht, but there are reasons to believe that Kronecker's
approach was ultimately more fruitful. (I shall come back to this question.)
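
The purely algebraic construction described above can be made concrete with a short
sketch. It is only an illustration, not Kronecker's own formalism: polynomials are
assumed to be stored as coefficient lists (lowest degree first), the ground field is
taken to be the rationals, and the names `poly_mod` and `mul_mod` are hypothetical.
Elements of k(a) are handled as remainders modulo a monic f, so no embedding field
such as C is ever used.

```python
from fractions import Fraction

def poly_mod(p, f):
    """Remainder of p modulo the monic polynomial f.

    Both are coefficient lists, lowest degree first (an assumed encoding).
    """
    p = [Fraction(c) for c in p]
    n = len(f) - 1                      # degree of f
    while len(p) > n:
        lead = p.pop()                  # coefficient of the current top power
        shift = len(p) - n              # top power = x^shift * x^n
        for i in range(n):              # replace x^n by -(f[0] + ... + f[n-1] x^(n-1))
            p[shift + i] -= lead * f[i]
    return p + [Fraction(0)] * (n - len(p))

def mul_mod(a, b, f):
    """Product of two residues a and b in k[x]/(f)."""
    prod = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] += Fraction(ai) * Fraction(bj)
    return poly_mod(prod, f)

X2_PLUS_1 = [1, 0, 1]     # x^2 + 1
X2_MINUS_2 = [-2, 0, 1]   # x^2 - 2
x = [0, 1]                # the indeterminate itself

print(mul_mod(x, x, X2_PLUS_1))             # the class of x*x modulo x^2 + 1
print(mul_mod([1, 1], [1, 1], X2_MINUS_2))  # (1 + x)^2 modulo x^2 - 2
```

In this sketch the residue class of x modulo x^2 + 1 squares to the class of -1, and
modulo x^2 - 2 it squares to 2, without any appeal to complex or real numbers.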

There is an important advantage in this approach, as opposed to Dedekind's,
which ought to be mentioned at this point. It has to do with the formulation
of a place or point on an algebraic curve; precisely the notion which motivated
the work of Dedekind and Weber (1882). Since this paper, it is customary in
algebraic geometry to consider function fields over an algebraically closed field
such as the field of the complex numbers rather than over Q. The decomposition
of a (global) divisor as a product of powers of places requires the
adjunction of constants, i.e. extending k. In Dedekind's theory the ambient
field is fixed, so the constants must all be included at the outset, i.e. $Q(\alpha)$ must
be replaced by the ring of polynomials with coefficients in an algebraically
closed field. This algebraically closed ground field is a set-theoretic construction
right at the foundation of the theory of algebraic curves. On the other
hand, the theory of divisors developed by Edwards on the basis of Kronecker's
Grundzüge is devised to be independent of the ambient field and simply calls
for adjoining new constants algebraically, as they are needed. In Edwards'
words:
In the Kroneckerian approach, the transfinite construction of algebraically closed fields is avoided
by the simple expedient of adjoining new algebraic numbers to Q as needed. (1990, p. 97)

This is quite in line with Kronecker's remark against Dedekind: completed
infinities are simply unnecessary.
Requisite (2): it is easy to see that Dedekind's mathematics involved in
essential ways completed infinities or infinite totalities, in contrast with
Kronecker's requisite. The following description of the product of two ideals
should show this. According to Dedekind's definition, an ideal $I$ is a possibly
infinite set of algebraic integers such that if $\alpha$ and $\beta$ belong to $I$, so do
$\alpha + \beta$ and $\mu\alpha$, for $\mu$ any algebraic integer. In a field $k$, each ideal, written

$[\alpha_1, \alpha_2, \dots]$,

has a 'basis' of algebraic integers, $\alpha_1, \alpha_2, \dots$, such that its elements have
the form

$\mu_1\alpha_1 + \mu_2\alpha_2 + \dots$

where $\mu_1, \mu_2, \dots$ are algebraic integers of $k$. The product $r$ of two ideals $I$ and $J$
of the field $k$ is the smallest ideal which contains the product $\alpha\beta$ of each $\alpha$
of $I$ and each $\beta$ of $J$. So $I$ and $J$ are factors of $r$ if it is the product of $I$ and
$J$. (The ideal $r$ is prime if $r = IJ$ only if $r = I$ or $r = J$.) For example, 3 and 5
are replaced by the ideals [3] = (3, 6, 9, ...) and [5] = (5, 10, 15, ...), i.e.
by the set of all numbers divisible by, respectively, 3 and 5. The number 15
is replaced by the ideal [15] = (15, 30, 45, ...), which contains all the products
$3m \times 5n$. These are in [3] and [5], so $[15] = [3] \times [5]$. From this definition,
Dedekind was eventually able to prove that every ideal has a unique representation
as the product of prime ideals. We can see here that Dedekind's theory
of ideals consists in manipulations of infinite sets. Although these are relatively
harmless, Dedekind opened the door to Cantorian set theory.27
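
The example of [3], [5] and [15] can be checked mechanically on a finite window of
the integers. The following is a minimal sketch under stated assumptions: the ideals
are truncated to their positive elements up to an arbitrary bound (the infinite sets
themselves cannot, of course, be listed), and the helper names are hypothetical.

```python
def ideal(generator, bound):
    """Positive elements of the ideal [generator] of Z, up to `bound`."""
    return set(range(generator, bound + 1, generator))

def product_elements(I, J, bound):
    """All products a*b with a in I and b in J that stay within the window."""
    return {a * b for a in I for b in J if a * b <= bound}

BOUND = 300                      # arbitrary size of the finite window
I3, I5, I15 = ideal(3, BOUND), ideal(5, BOUND), ideal(15, BOUND)
products = product_elements(I3, I5, BOUND)

print(sorted(products)[:5])      # smallest products 3m x 5n in the window
print(products == I15)           # do the products fill out [15] in the window?
print((I3 & I5) == I15)          # numbers lying in both [3] and [5]
```

In general the product ideal consists of finite sums of such products; for principal
ideals of the ordinary integers, as here, the products alone already suffice.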

Requisite (3): in order to avoid the requisite about the existence of individual
real numbers, which he could not satisfy, Dedekind was forced to turn
his attention instead to the existence of their complete system (denkbar
vollständigstes Grössen-Gebiet), i.e. to move, so to speak, from local to global
questions. 28 This procedure is even more transparent in Was sind und was sollen
die Zahlen?, where Dedekind starts with an informal introduction of the natural
numbers by means of the notions of system, 1 and successor, which leads
directly to the definition of a simply infinite system. Dedekind then proves
the existence of a simply infinite system (1888, §66) and, further, that it has
a simply infinite subsystem (1888, §72). Dedekind thought that these "logical"
proofs of the existence of these systems had secured the consistency of these
notions, as seen from his letter to Keferstein:
After the essential nature of the simply infinite system, whose abstract type is the number sequence
N, had been recognized in my analysis (articles 71 and 73), the question arose: does such a system
exist at all in the realm of our ideas? Without a logical proof of existence it would always
remain doubtful whether the notion of such a system might not perhaps contain internal contradictions. Hence the need for such proofs (articles 66 and 72 of my essay). (1890, p. 101)

We can see in Dedekindian mathematics the origin of the modern axiomatic
method which was to be developed fully by Hilbert. We can also see the source
of the problem of consistency, which was to take shape in Hilbert's metamathematical programme (Sieg, 1990, pp. 265f.).
Requisites (1)-(3) were considered as unnecessarily restrictive not only by
Dedekind but also by Weierstrass, who used completed infinities in defining
functions and in proving existence theorems such as the Bolzano-Weierstrass theorem,
and by Hilbert, who adopted the analytic approach of Dedekind and Weber
in his Zahlbericht (1897) and later caricatured Kronecker as a Verbotsdiktator
(1922, pp. 159 & 161).29 Such rhetorical excesses have greatly helped to perpetuate
a negative image of Kronecker which is, to my mind, utterly
misleading. To begin with, it has masked the influence of Kronecker on
Hilbert's programme, which was belatedly recognized in 'Die Grundlegung
der elementaren Zahlenlehre', when Hilbert admitted that "Kronecker had a clearly
defined conception ... which corresponds today essentially to our finitist viewpoint" (1931, p. 487).30 Moreover, these remarks are unfair because they hide
the crucial role played by Kronecker's ideas in forging some of Hilbert's
most important results. I shall give two examples, one taken from his early
work on invariant theory, the other from his Zahlbericht.
It seemed for a while that mathematicians had exhausted the methods of
invariant theory, with results such as Paul Gordan's for binary quadratic forms, which
show that they have a finite complete system of (rational integral) invariants.
But the young Hilbert succeeded in proving, in his paper 'Über die
Theorie der algebraischen Formen' (1890), that any collection of infinitely
many forms of any degrees has a "basis" (1890, p. 143). In the language of
modular systems (i.e. systems of polynomials such that if G and H belong
to a system M then so do G + H and AG, where A is any homogeneous polynomial
in $x_1, x_2, \dots, x_n$), Hilbert proved that any modular system M has a
basis such that any polynomial F of M can be written as

$F = A_1F_1 + A_2F_2 + \dots + A_mF_m$

where $F_1, F_2, \dots, F_m$ is a finite set of polynomials in M and $A_1, A_2, \dots, A_m$
are suitable polynomials, not necessarily in M. This result is often cited
as a decisive victory of the new abstract methods over cumbersome old ones.
For example, E. T. Bell wrote in The Development of Mathematics that "the
[algorithmic] method was incapable of revealing the general underlying principle of which Gordan's theorems are but special manifestations" (1940, pp.
211-212).31 But it is seldom said, as Hilbert himself pointed out (1890, p. 150),
that the result was inspired by Kronecker's ideas about homogeneous polynomials.
Poincaré is the exception here; he wrote in his report for the Bolyai
Prize on the works of Hilbert that
This is a consequence of the fundamental notion of [modular system] introduced by Kronecker.
This means, in Kronecker's language, that the divisors common to any [modular system], even
where they are infinite in number, are submultiples of one of them which is their greatest common
divisor ... (1911, p. 754)32
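
The finiteness phenomenon described by Poincaré has a very simple analogue, which
may help to fix ideas. The sketch below is only that analogue, under a loudly stated
simplifying assumption: modular systems of polynomials are replaced by collections
of ordinary integers, where the "basis" reduces to a single greatest common divisor.
The function names are hypothetical and nothing here reproduces Hilbert's proof.

```python
from math import gcd

def stabilising_gcd(generators):
    """Return (g, k): the gcd of all generators, and the number of leading
    generators after which the running gcd no longer changes."""
    g, k = 0, 0
    for index, value in enumerate(generators, start=1):
        new = gcd(g, value)
        if new != g:
            g, k = new, index
    return g, k

def bezout(a, b):
    """Extended Euclid: return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = bezout(b, a % b)
    return g, y, x - (a // b) * y

generators = [84, 126, 210, 330, 462, 858]
g, k = stabilising_gcd(generators)
print(g, k)                  # common divisor, and how many generators were needed

d, x, y = bezout(84, 126)
print(d, 84 * x + 126 * y)   # the gcd written as an integer combination
```

However long the list of generators, the running greatest common divisor can change
only finitely many times, and it is itself an integer combination of finitely many
of them; this is the one-generator shadow of the statement quoted above.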

Another important aspect of Kronecker's influence, not clearly perceived
by Hilbert himself, is the use by Hilbert in his Zahlbericht (1897) of Dedekind's
Prague Theorem. The key idea of divisor theory is that the "content of a product
is the product of the contents" (Edwards, 1990, p. vi & p. 5). It generalizes
the so-called Gauss lemma, which states in modern terms that, given two monic
polynomials f and g in one indeterminate with integer coefficients, the content
of fg is the content of f times the content of g. The content of a polynomial
is the greatest common divisor of its coefficients. Gauss's lemma cannot be
generalized directly to algebraic integers because the notion of greatest common divisor
is not defined for algebraic numbers. But, as a result of his careful study of
Kronecker's Grundzüge (1882),33 Dedekind proved his Prague Theorem (1892),
which generalizes Gauss's lemma to the algebraic case: let f and g be polynomials
in one indeterminate with algebraic number coefficients; if all coefficients
of fg are algebraic integers, then the product of any coefficient of f and any
coefficient of g is an algebraic integer (Edwards, 1990, p. 2). But this theorem,
which brings about an important simplification of Kronecker's theory
(Edwards, 1980, p. 366), is in fact a consequence of an earlier theorem by
Kronecker (Edwards, 1990, p. 2f.), of which Hurwitz obtained a proof in 'Über
einen Fundamentalsatz der arithmetischen Theorie der algebraischen Grössen'
(1895). Now, although Hilbert used Dedekind's ideals instead of Kronecker's
divisors, he derived their properties by making crucial use of the Prague
Theorem. 34
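
The slogan "the content of a product is the product of the contents" is easy to check
in the classical case of integer coefficients. The following minimal sketch assumes
polynomials stored as coefficient lists (lowest degree first) and uses hypothetical
helper names; it illustrates Gauss's lemma for ordinary integers only, not the Prague
Theorem, and the two sample polynomials are arbitrary.

```python
from functools import reduce
from math import gcd

def content(poly):
    """Greatest common divisor of the integer coefficients of poly."""
    return reduce(gcd, (abs(c) for c in poly))

def poly_mul(f, g):
    """Product of two polynomials given as coefficient lists."""
    prod = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            prod[i + j] += fi * gj
    return prod

f = [6, 10, 4]     # 6 + 10x + 4x^2, content 2
g = [9, 3, 15]     # 9 + 3x + 15x^2, content 3
fg = poly_mul(f, g)

print(fg)                                   # coefficients of the product
print(content(f), content(g), content(fg))  # contents of f, g and fg
```

The point of the Prague Theorem is precisely that a statement of this kind survives
when greatest common divisors are no longer available for the coefficients.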
Finally, although Hilbert is usually credited for having initiated, with Weber,
class field theory, its fundamental concepts were known to Kronecker, who had
somewhat anticipated the principal divisor theorem of class field theory by
finding out that for imaginary quadratic fields, the 'singular moduli' generated
the algebraic extensions K of a given algebraic number field k such that
all the divisors in k become principal divisors in K. (Kronecker called these
extensions "zu assoziierenden Gattungen".) His investigations of modular and
elliptic functions with 'singular moduli' led Kronecker to the formulation of
his "liebster Jugendtraum" (1880, p. 453). Class field theory was developed
further by students of Hilbert, Furtwängler, Takagi, Artin and, later, Chevalley
and Hasse,35 with Teiji Takagi finally proving the Jugendtraum in 1920.
Nowhere is the relevance of Kronecker's work to contemporary mathematics
as obvious as in the case of the theory of elliptic functions with complex
multiplication. Building on earlier work of Gauss, Abel and Jacobi, Kronecker
initiated number-theoretic investigations of complex multiplication. For
example, the main result of 'Zur Theorie der elliptischen Funktionen XI'
(1886b, p. 164) is, in the words of Serge Vladut, "a profound theorem of the
arithmetic of modular functions, which can also be used as a technical basis
for complex multiplication theory" (1991, p. 75). From this result, Kronecker
obtained his 'fundamental relation' (1886b, p. 164), of which he was to give
three proofs, and the so-called 'Kronecker congruence relation' (1886b, p. 177).
Already as a young man, Kronecker saw very far: he found out that Kummer's
cyclotomic fields are the only Abelian extensions of the rational number
field and conjectured two fundamental theorems: the Kronecker-Weber
theorem (1853, p. 101), which says that every Abelian extension of the rationals
is a subfield of a cyclotomic field $Q(\zeta)$, with $\zeta$ a root of unity, and his
"liebsten Jugendtraum", i.e., the dream of constructing all these Abelian extensions
by using special values of analytic functions. 36 Later, Hilbert tried to generalize
these results and this led him to the formulation of his 12th problem (1902, pp. 18-20).37
It is usually assumed that Kronecker only had in mind that the division of
elliptic functions with complex multiplication provides all Abelian extensions, i.e. the completeness theorem. But, according to Weil, Kronecker was
even more ambitious: in parallel to Eisenstein's extension of Kummer's work
on cyclotomy to the lemniscate, Kronecker wanted to extend Kummer's work
on the factors of class-numbers of cyclotomic fields and their p-adic properties to all imaginary quadratic fields and their Abelian extensions (Weil,
1976, p. 88).
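
The simplest instances of the Kronecker-Weber theorem mentioned above can be checked
numerically: a quadratic field is generated by a sum of roots of unity, the quadratic
Gauss sum. The sketch below is an illustration only; the function names are
hypothetical, the computation is in floating point (so it verifies nothing
rigorously), and it relies on the classical fact that the square of the Gauss sum
for an odd prime p equals p or -p according as p is congruent to 1 or 3 modulo 4.

```python
import cmath

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def gauss_sum(p):
    """Sum of (a/p) * zeta^a over a = 1, ..., p-1, with zeta = exp(2*pi*i/p)."""
    zeta = cmath.exp(2j * cmath.pi / p)
    return sum(legendre(a, p) * zeta ** a for a in range(1, p))

for p in (5, 7, 13):
    g = gauss_sum(p)
    sign = 1 if p % 4 == 1 else -1
    # g*g should agree with sign*p up to rounding error
    print(p, g * g, sign * p)
```

For p = 5 the sum is a combination of fifth roots of unity whose square is 5, which
exhibits the quadratic field generated by the square root of 5 inside the cyclotomic
field of fifth roots of unity.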
I would like to make two remarks. First, it is a striking fact that there is
no equivalent line of inquiry in Dedekind's work. Secondly, the interest of
results such as the Kronecker-Weber theorem or the Jugendtraum lies in the
intimate relation between algebraic extension and special values of purely
analytical functions. These results would be of no interest if one had constructed
the extensions with non-constructive methods or if the functions
were shown to have an algebraic origin. Moreover, these results can be interpreted
both ways, i.e. as generating number fields 'analytically' or as defining
continuous quantities 'algebraically'.38 This clearly shows that talking of a
"putsch" (Hilbert, 1922, p. 160) or of an attempt at abolishing analysis (Bell,
1940, p. 257), in relation to Kronecker's work, is just incredibly misleading.
Why would Kronecker have bothered to obtain brilliant analytic results such as,
say, his 'limit formulas'?39 It is easy to jump to the conclusion that there
was no coherence between Kronecker's mathematical practice and his scattered
philosophical remarks and that his own practice was contradicting his
foundational position. But we would do better by simply getting rid of the
caricature of Kronecker as a klein Despot or as a Verbotsdiktator.
To come back to the development of the theory of elliptic and modular functions,
Serge Vladut has shown in the third part of his book Kronecker's
Jugendtraum and Modular Functions (1991) that a good deal of the modern
arithmetic theory of modular functions of one variable is derived from
Kronecker's work, i.e. results, such as the congruence relations, obtained in
papers such as 'Zur Theorie der elliptischen Funktionen XI' (1886b). One clear
example of this Kroneckerian heritage is the Eichler-Shimura congruence
relation, whose prototype is Kronecker's main theorem of (1886b) mentioned
above (Vladut, 1991, p. 76). The multidimensional generalization of the theory
of complex multiplication, started by Hecke and developed by Deuring, Weil,
Shimura and Taniyama,40 and the theory of Shimura varieties are also among
modern developments owing to the work of Kronecker. Finally, it is impossible
to avoid mentioning here the so-called 'Langlands philosophy' or
'Langlands Programme',41 which consists in a series of conjectures about the
connections (characterized by L-functions) between finite-dimensional
representations of the Galois group of a local or a global field and infinite-dimensional
complex representations of linear groups over this field. Langlands
has discussed the Kroneckerian heritage of his work in 'Some Contemporary Problems with Origins in the Jugendtraum' (1976).
Hilbert's rhetorical excesses against Kronecker were somewhat excusable
in the early XXth century. The use of set-theoretic tools led to great advances in
analysis and to the development of whole disciplines such as point-set topology.
Even in algebraic number theory, Kronecker's approach was felt not to have
produced results - the notable exception being Hensel's work on p-adic
numbers, whose importance was at any rate not immediately understood - while
Dedekind's ideal-theoretic approach led to early successes, starting
with the proof of the Riemann-Roch Theorem by Dedekind and Weber (1882).
But this rhetoric was already becoming somewhat misleading by the mid-century,
as is recorded by Helmut Hasse, in the preface to the first edition
of Number Theory (1949):
It seemed at first that the ideal-theoretic approach was superior to the divisor-theoretic, not
only because it led to its goal more rapidly and with less effort, but also because of its usefulness in more advanced number theoretic research. For Hilbert and, after him, Furtwängler and
Takagi succeeded in constructing on this foundation the imposing structure of class field theory,
including the general reciprocity law for algebraic numbers, whereas on Hensel's side no such
progress was recorded. More recently however, it turned out, first in the theory of quadratic
forms and then especially in the theory of hypercomplex numbers (algebras), not only that the
divisor-theoretic or valuation-theoretic approach is capable of expressing the arithmetic structural laws more simply and naturally, by making it possible to carry over the well-known
connection between local and global relations from function theory to arithmetic, but also that
the true significance of class field theory and the general reciprocity law of algebraic numbers
are revealed only through this approach. Thus, the scales now tip in favour of the divisor-theoretic approach. (1949, p. vi)42

Hasse's statement is of interest; it records that the tide has turned. This is
an important change in XXth century mathematics, seldom acknowledged
by philosophers. Often, contemporary philosophers take their lead from predecessors, whose opinions reflect a past state of affairs in mathematics. Thus,
although such important changes do not necessarily invalidate the conclusions of past philosophers, much of contemporary philosophy of mathematics
runs the risk of becoming irrelevant to the preoccupations of mathematicians.
One is reminded here of Jean Dieudonné's words, when he complained about
philosophers "who believed that they spoke to today's mathematics, while they
considered in fact mathematics of the day before yesterday" (1981, p. 27).
Kronecker's foundational ambition, as he expressed it in 'Über den Zahlbegriff',
was to achieve an "arithmetization" (Arithmetisierung) of algebra and
analysis, i.e., to found them "solely on the concept of number understood in
its narrowest sense" (1887a, pp. 252-253). It was indeed his belief that we
could "base all of pure mathematics on the theory of whole numbers"
(Meschowski, 1967, p. 238). But most mathematicians (and philosophers)
believe, as Cantor did, that an arithmetization of analysis could not be "effected
without the use of the actual infinite in some form" (Meschowski, 1967,
p. 250). Kronecker's programme of a strict arithmetization, with his requisites
(1)-(3), is not, however, a prima facie unrealizable dream, as shown by
Reverse Mathematics. The latter was devised by Harvey Friedman and his
collaborators in order to answer the question: "Which set existence axioms
are needed to prove the theorems of ordinary mathematics?" What is meant
here by ordinary mathematics is "mainstream or non-set-theoretic mathematics,
i.e. mathematics as it was before the abstract set theorists got hold of it"
(Simpson, 1985, p. 461). As a starting point, theorems are formalized within
a system of second order arithmetic called $Z_2$ (or, sometimes, $\Pi^1_\infty$-CA$_0$).
A hierarchy of subsystems of $Z_2$ is obtained by restricting set existence axioms;
restricting, for example, the comprehension scheme to arithmetical formulas
(the resulting system being called ACA$_0$). The basic, weakest system considered
in this hierarchy is RCA$_0$ (recursive comprehension axiom), from which
the remaining systems are built up. Although it is quite weak, it is strong
enough to develop basic results about continuous functions of a real variable
and countable algebraic structures. The task of Reverse Mathematics is to
find out which parts of 'ordinary mathematics' can be developed within
which of these subsystems. The principal phenomenon uncovered is this:
Very often, if a theorem of ordinary mathematics is proved from the weakest possible set existence axioms, it will be possible to "reverse" the theorem by proving that it is equivalent to
those axioms over a weak base theory. (Simpson, 1985, p. 467)

Now investigations in Reverse Mathematics have shown that a good deal
of analysis and algebra can be done within conservative extensions of primitive
recursive arithmetic. 43 On the other hand, Reverse Mathematics could
certainly improve by looking beyond Hilbert's programme at Kronecker's.
Reverse Mathematics is calibrated to show how much set theory is needed
to obtain given results, but it is not refined enough. In fact, the 'ordinary
mathematics' considered is far from being "non-set-theoretic mathematics":
a typical result of (countable) algebra "reversed" by Friedman, Simpson and
Smith (1983) is the existence and uniqueness of countable algebraic closures. 44
That this result is found to be equivalent to some set-theoretic principle - in
their terminology, it is equivalent to WKL$_0$ (Weak König's Lemma) over RCA$_0$
- should not come as a surprise since it is a set-theoretic elaboration, originally
due to Steinitz,45 of combinatorial results about polynomials. In
Kronecker's approach, one does not need to bother about uniqueness; one
knows, so to speak, how to proceed locally.
At any rate, the surprisingly insignificant number of remarks of a purely philosophical
nature in the whole of Kronecker's writings points to a lack of interest
in philosophical or foundational questions. Indeed, Kronecker admitted in a
letter to Cantor that he assigned to these questions only a "secondary value"
(Meschowski, 1967, p. 238). In the same letter, he compared himself with
Kummer:
... I recognized, as he did, the unreliability of all speculations and I took refuge in the safe haven
of real mathematics. (Meschowski, 1967, p. 238)

adding a little bit later:


In the field of mathematics, I find a real scientific value only in concrete mathematical truths
or, to put it more pointedly, 'only in mathematical formulas'. History of mathematics has shown
that only these are everlasting. The various theories about the foundations of mathematics (such
as that of Lagrange) were put aside over time, but Lagrange's resolvent is here to stay!
(Meschowski, 1967, pp. 238-239)

(In modern terms, Lagrange resolvents are called Gaussian sums.) These words
should come as no surprise, considering the importance of the requisite (3)
for Kronecker. To my mind, Jourdain's "contemplation sub specie aeternitatis"
of the concept of function, mentioned above, is a perfect example of the
theorizing that Kronecker wanted to avoid in his work. It could be argued
that Cantorian set theory cannot be dissociated from a Platonist or realist
philosophy, i.e. that it is essential to Cantor that, in parallel to our intensional
understanding of a concept such as that of natural numbers, its extension
should already exist in its totality - e.g. that the infinite table is somehow already
written down, irrespective of our ability to read it over. While Cantor
devoted much space to this philosophical question, Kronecker wanted nothing
of the sort, no "speculation".
Reading Kronecker's letter to Cantor, one cannot but be reminded of
Wittgenstein,46 who thought that:
It is a strange mistake of some mathematicians to believe that something inside mathematics
might be dropped because of a critique of the foundations. Some mathematicians have the right
instinct: once we have calculated something it cannot drop out and disappear! And in fact,
what is caused to disappear by such a critique are names and allusions that occur in the calculus.
(1979, p. 149)

I already pointed out that Wittgenstein shares Kronecker's emphasis on
"mathematical formulas", i.e. algorithms, a fact that makes him almost unique
among XXth century philosophers.
I would like to conclude this paper with some remarks of a more speculative
nature, which have deep implications for the philosophy of mathematics.
Philosophers seldom pay attention to the fact that early 'constructivists' such
as Kronecker or Poincaré professed to have little or no interest in either
mathematical logic or the foundations of mathematics. Isn't it possible that
in construing Kronecker's (or Poincaré's) constructivism within logical categories,
we are missing some insights which were underlying his (or their)
so-called 'constructivist' convictions?
Kronecker's lack of interest in foundations should indicate that his foundational
stance was not dictated by some kind of misplaced philosophical
thesis, that it was not an ill-fated attempt to found mathematics on, so to speak,
a smaller basis (with no completed infinities, etc.). As I pointed out earlier,
this is not an appropriate description of Kronecker's work. It would be a
mistake to claim that Kronecker sought an arithmetization of analysis
on a stricter basis only because of philosophical beliefs such as (1)-(3). His
allgemeine Arithmetik should be looked at as part of an attempt to broaden
the field of mathematics rather than to narrow it, by opening a vast field of
investigations - the development of an arithmetic algebraic geometry in this
century. I already mentioned the relevance of Kronecker's ideas for the modern
theory of modular functions. This legacy speaks for itself. A very good example
of the kind of deep arithmetical result which can be obtained is Fermat's
Last Theorem: it has been shown recently that a proof of the Taniyama-Weil
conjecture in algebraic geometry, which is part of the 'Langlands programme',
would entail Fermat's Last Theorem. 47 It is something very much like the
connection between number theory and algebraic geometry displayed here
which was sought by Kronecker, and it seems to me that this is where we should
look for an explanation of his attitude towards foundations and questions of
methodology. As for Poincaré, his attitude was similar: he saw, for example,
that the theory of elliptic curves would "render great services to arithmetic"
(1901, p. 483), but that in order for it to be fruitful, it had to be treated
arithmetically. It should not be forgotten that it was Poincaré who gave the
impetus to the arithmetic study of elliptic curves, with the "research programme"
which he set up in his paper 'Sur les propriétés arithmétiques des
courbes algébriques' (1901), and which led to fundamental results on the
number of rational points on elliptic curves by Mordell, Weil and, recently,
Faltings (1983).
Perhaps it is also worth mentioning in this context the recent topic of
arithmetic geometry, i.e., the generalization of the work of Arakelov (1974a,
1974b) and Faltings (1984) for arithmetic surfaces, in particular by Bismut,
Gillet and Soulé. In Arakelov theory, numerical results of algebraic geometry
are transposed to the number field, by completing divisors to include those
at infinity - this is being done by considering Hermitian structures. Here
Hermitian geometry becomes a '∞-adic arithmetic' and, in the words of Serge
Lang, "analysis becomes number theory at infinity" (1988, p. v). The most
important result of this approach is the recent proof of an arithmetic analogue
of the Riemann-Roch-Grothendieck Theorem by Gillet and Soulé (1992).
Now, it would not be entirely wrong to say that this current trend towards
a merger of arithmetic and algebraic geometry is close, if not in the details
at least in spirit, to Kronecker's programme. But exclusive consideration of
(1)-(3) in the tradition of logical foundations has drawn attention away from
this aspect of the latter; it has prevented philosophers from understanding the
profound conception of the relations between arithmetic and geometry which
underlies it. A reassessment of Kronecker's achievements along those lines
is urgently needed, and philosophy of mathematics stands to gain much from
it. 48
University of Ottawa
NOTES
1 "The path up and down is one and the same". See Kirk, Raven & Schofield (1983, p. 188).
2 In 'Über den Zahlbegriff', Kronecker's Buchstabenrechnung made essential use of Gauss'
notion of Unbestimmten (1887a, pp. 260f.) in an attempt at avoiding (vermeiden) the introduction
of symbols for negative numbers, irrationals, etc. Indeterminates are to be distinguished
here from variables: the former cannot be disposed of during a proof, but a value can be given
to them after the proof, while the latter can acquire a value during a proof. See also Molk
(1885, pp. 3-8).
3 Louis Couturat's negative comments in De l'infini mathematique are a perfect example of
this widespread attitude: "[Kronecker's] theory masks the true nature of fractions, and it hides
their relations and combinations under a cumbersome notation, while mixing in foreign notions.
Surely, it justifies the rules for operating on these numbers; but it does so by burdening them
with so many indeterminates that it would render calculations with fractions extremely difficult, not to say practically impossible" (1896, p. 615). These comments reflect more Couturat's
ability as a polemicist than any sound judgment. Because of his philosophical bias as a supporter of Dedekind, Couturat obstinately did not see any advantages in Kronecker's theory.
Able mathematicians such as Hurwitz, Konig, Lasker and Macaulay did. For example, Hurwitz,
obtained an important proof of a fundamental theorem of Kronecker in his (1895) and Konig
obtained further results, generalizing Kronecker's ideas from 'relatively whole' to 'absolutely
whole' (ganze ganzzahlige) algebraic functions (see (1903, p. 78) for an important theorem).
Macaulay's study, The Algebraic Theory of Modular Systems (1916), which is a milestone in
the development of modern commutative algebra, contains a useful introduction to Kronecker's
theory.
4 Joseph Dauben writes: "No one could have been more opposed to Cantor's ideas, nor have
done more damage to his early career, than Leopold Kronecker" (1979, p. 66). (The claim that
Kronecker has done "more damage" to Cantor's career than anyone else needs to be substantiated!) Dauben gives as an example of this opposition a delay of one year in the publication
of Cantor's 'Ein Beitrag zur Mannigfaltigkeitslehre' (1878), which is attributed (on which basis?)
to Kronecker's reluctance to have it published. This is a supposedly telling fact: "The Beitrag,
in signaling the lengths to which its adversaries would carry their attempts to defeat it, also
reflected the revolutionary ideas with which Cantor had begun to work" (Dauben, 1979,
p. 66).
5 Contrary to Cantor, there is precious little verbal abuse in Kronecker's writings; evidence upon
which this reputation rests is mostly hearsay (see for example the letter from Mittag-Leffler to
Hermite in Dugac (1973, pp. 161f.)) or invention. For example, the well-known accusation
that Cantor was a "corruptor of the youth" (ein Verderber der Jugend) was never made by
Kronecker: it is Schoenflies (1927, p. 2), a follower of Cantor, who assumed that Kronecker
was of that opinion. (Many stories about Kronecker are simply not backed by any hard evidence.
One such story is that of Kronecker's comment to Lindemann, on his proof of the transcendence of π: "Of what value is your beautiful proof, since irrational numbers do not exist?"
(Bell, 1940, p. 253, Dauben, 1979, p. 69). As far as I know, there is no primary source for
this implausible story.) I do not wish to enter a sterile debate on Kronecker's personality: as
philosophers, we ought not to pay too much attention to the (largely successful) attempts to smear
Kronecker's reputation - mathematicians don't, as we shall see. (In fact, we could turn around
the quotation of Dauben and say: the lengths to which adversaries of Kronecker's ideas carried
their attempts to suppress them (for more than a hundred years!) also reflect their revolutionary
nature.) Dedekind himself did not share Cantor's negative opinion of Kronecker. Compare also
Dauben's remarks, in the footnote above, with (Edwards, 1989, pp. 71f.) and with Edwards'
discussion of the relations between Kronecker and Dedekind in (1980, pp. 368-372).
6 At any rate, there are some very useful studies. An early introduction to Kronecker's algebraic number theory is Hermann Weyl's Algebraic Theory of Numbers (1940). In Divisor
Theory (1990), Harold Edwards develops a theory of divisors on the basis of Kronecker's
Grundzüge (1882), and both André Weil's Elliptic Functions according to Eisenstein and
Kronecker (1976), and Serge Vladut's Kronecker's Jugendtraum and Modular Functions (1991)
concentrate on his work on modular and elliptic functions. Edwards has also published a series
of articles (1980, 1987, 1989, 1992a, b), expounding Kronecker's foundational standpoint and
rectifying common misconceptions about it.
7 The mathematics involved here is quite difficult, especially for a non-specialist, and it is
hard to present it concisely in such a restricted space. I apologize
for the shortcomings of my presentation and hope to have at least given enough references
for readers to be able to find the relevant information.
8 All translations from German or French are mine, except when an already available translation is listed in the bibliography. In that case, the references are to the translation.
9 The last point is very interesting, if we think in terms of Methodenreinheit: it is usually
assumed, in particular by Hilbert, that the use of impure methods was simply a shorter, simpler
way to obtain results. Steinitz's proof of the fundamental theorem of algebra (1910), based on
the so-called Kronecker-Steinitz construction, is a good example to the contrary. In 'The Modern
Algebraic Method', Hasse remarked that "The objection that this construction is formal and yields
nothing new in terms of content is without merit. To my mind, this procedure is conceptually
richer and less burdened with formal computations than the so-called algebraic proofs of the
so-called fundamental theorem of algebra" (1930, p. 19), the latter being in fact a theorem of
analysis. It should be said, however, that 'impure' proofs have, in Hasse's words, a greater - and surely non-negligible - Beziehungsreichtum, which is lost when a purer proof is given. I
hasten to add also that the expression 'Kronecker-Steinitz' is somewhat misleading. It is true
that Steinitz - who was a student of Hensel - built on Kronecker's ideas but, on the other hand,
he did not respect requisite (2), stated below: his result on the uniqueness of an algebraic
closure uses the axiom of choice. Compare also Steinitz's remarks on Zermelo's Auswahlprinzip:
"Many mathematicians are still opposed to the axiom of choice. With the increasing recognition that there are questions in mathematics which cannot be decided without this axiom, the
opposition to it must increasingly disappear. On the other hand, it seems expedient in the
interest of the purity of method to avoid the said axiom when the nature of the question
does not require its application. I have endeavoured to make these limits conspicuous" (1910,
pp. 170-171). If, however, Kronecker's approach to algebraic number theory is followed faithfully, there is simply no need to discuss uniqueness. See Weyl (1940, p. 12).
10 On the allgemeine Arithmetik, see Kronecker's remarks at the beginning of 'Ein Fundamentalsatz der allgemeinen Arithmetik' (1887b, p. 212) and at the end of 'Über den Zahlbegriff'
(1887a, pp. 273-274), Hensel's remarks in his introduction to Kronecker's Vorlesungen über
Mathematik (Kronecker, 1901, p. v), and the comments in Edwards (1992b, pp. 132-133).
11 To give only one example, in his 1886 paper 'Zur Theorie der elliptischen Funktionen XI',
Kronecker used the modular function p = K + 1/K (with K being the Legendre modulus), and
the Rationalitätsbereich corresponded to Q(p) and the ground ring to Z[p], so that the divisor
theory concerned rings which are integral over Z[p].
12 Kronecker also wrote to Cantor, in 1884, that "the new conceptual constructions are at
least not necessary" (Meschkowski, 1967, p. 238). Edwards also insisted on this point (1992b,
p. 138).
13 See also Edwards (1990, pp. vi-vii & 24) and Vladut (1991, p. 45).
14 For more details on Wittgenstein on the foundations of mathematics, see Marion (forthcoming
(a)).
15 In modern terms, Dirichlet proved that the Fourier series converges pointwise to the function
for any periodic function with period 2π that is piecewise monotone and has at most a finite
number of discontinuities, at each of which the function has a left-hand limit and a right-hand
limit and a value that is the average of these two limits.
16 This interest in Riemann's work is quite explicit in the introduction to their paper (1882,
pp. 238-241). See also Dedekind's letter to Cantor, dated February 17, 1882, where he says
that in contrast to Kronecker, his work with Weber was motivated by Riemann's notion of a
"point" (Dugac, 1976, p. 252).
17 There are many reports that he stated his views forcefully in his lectures, exasperating
Weierstrass (for example, see Dugac, 1973, pp. 161-162).
18 In the following, I shall quote from Molk's translation (1909): although it is presented as
written by Pringsheim, it is almost twice the size of the original text, and contains many developments not to be found in the latter, including the first two passages quoted below.
19 For Pringsheim's original statement, see his (1899, pp. 9f.).
20 For example, it helps us in explaining his objections to Ramsey's notion of 'functions in
extension'. There is a telling reference to Dirichlet's notion of an arbitrary function occurring in
a passage where Wittgenstein criticizes Ramsey's 'functions in extension' in Philosophical
Grammar (1974, p. 315). On this topic, see Marion (forthcoming (b)).
21 On the unity of Dedekindian mathematics, see Kneebone (1963, p. 154), Dugac (1976,
pp. 141-142) and Edwards (1983, p. 13).
22 Strictly speaking, Dedekind insisted that an irrational number is "created by the human mind"
and "corresponds to" or "produces" a cut (1872, p. 15). Others, such as Weber, suggested that
irrational numbers should be taken to be the cuts. For Dedekind's objections to this suggestion, see his letter to Weber, January 24, 1888 (1932, vol. 3, p. 489).
23 This "freeing the elements from every other content" is a justification, according to Dedekind,
for also calling the natural numbers a "free creation of the human mind" (1888, 73).
24 In this very paper, Kummer announced the publication of "a work of Mr. Kronecker which
will appear soon, in which the theory of the most general complex numbers (... ) is developed
fully and with great simplicity" (Kummer, 1859, p. 737). Both Dedekind's and Kronecker's divisor
theories were conceived as generalizations, but, while Dedekind's theory extended Kummer's
theory to algebraic functions of one variable, Kronecker's extended it to algebraic functions in
general. (It is very striking to think that Kronecker's algebraic number theory was intended as
a generalization of Kummer's theory from cyclotomic fields to all algebraic extensions of
natural fields.) In order to do so, Kronecker consciously developed his general theory without
having in mind factorization into primes as the ultimate goal, as Dedekind did (Kronecker, 1882,
pp. 309f.). (See, however, the result about 'prime forms' in (1882, p. 352).) But Kronecker published his results in his Grundzüge, in 1882, that is eleven years after the second version of
Dedekind's Supplement XI.

25 This objection sounds "curious" to modern number theoreticians. For example, see Mazur
(1977, p. 982). In Divisor Theory, Edwards also adopted the Kummer-Kronecker approach: "The
contemporary style of mathematics trains mathematicians to ask "What is a divisor?" and to want
the answer to be framed in terms of set theory. Those trained in this tradition will want to
think of a divisor as an equivalence class of polynomials, when the equivalence of polynomials is the property of representing the same divisor. I believe, however, that instead of asking
what a divisor is one should ask what it does. It divides things" (1990, p. 19). Philosophers should
reflect on the difference between the algorithmical approach of Kummer and Kronecker
and the set-theoretic attitude of Dedekind taken for granted by most of them. For Kummer's
divisibility tests, see Edwards (1980, p. 338 or 1992a, p. 60).
26 One common mistake at the time, which Kummer avoided, was precisely to assume the validity
of unique factorization. See Edwards (1975).
27 These completed infinities are rather harmless compared to those involved in the theory of
irrationals as cuts, but it has been said that, because of these essential uses, Dedekind's Supplement
XI is one of the origins of set theory, alongside Dirichlet's concept of an arbitrary function.
See Dugac (1976, p. 29).
28 See Dedekind's letter to Lipschitz, July 27, 1876 (1932, vol. 3, p. 477). For a similar presentation of Dedekind's viewpoint, see Sieg (1990, pp. 263f.).
29 Even Poincare wrote, in his obituary of Weierstrass, that Kronecker obtained results only
when he forgot that "he was a philosopher, and by voluntarily abandoning his principles, which
were condemned in advance to sterility" (Poincare, 1899, p. 17). It should be said that this remark
is based, however, on a misunderstanding of Kronecker's position (and, worse, it is all too
often quoted out of context). The principle in question is not (1) - Poincare ascribes it also to
Weierstrass - but something like: 'natural numbers being the foundations of everything, they
must be in evidence everywhere'. It is hard to see how one could ascribe such a principle to
Kronecker. See Edwards (1989, pp. 72-7).
30 On this topic, see Gauthier (1994).
31 Hilbert could then prove that for any system of forms every (rational integral) invariant
could be expressed as a linear combination of a finite set of them (1890, pp. 208f.). The proof
was criticized for its shortcomings by Paul Gordan, who supposedly cried out: "This is not
mathematics, it is theology!" (Bell, 1940, p. 403). While recognizing that the proof was correct,
he felt there was a gap in its execution, because Hilbert "limited himself to the proof of the
existence of that system of invariants, and renounced any discussion of their properties", such
as the upper bound to their number (1893, p. 132). Hilbert recognized in a subsequent paper
that his proof "provides no way of constructing such a system of invariants by means of a
finite number of processes which can be surveyed before the start of the computation so that,
for example, an upper bound for the number of the invariants of the system or for their degrees
in the coefficients of the ground form can be given" (1893, p. 268).
32 Perhaps it is worth mentioning in this context Dirichlet's 'other principle', expressed by
Hermann Minkowski: "to tackle problems with a minimum of blind calculations and a maximum
of insightful thoughts" (1905, pp. 460-461). Surely, Hilbert's result, as opposed to Gordan's,
is the perfect example of a result satisfying this principle. But there is no reason to associate
here Kronecker's work with Gordan's. On the contrary, Kronecker's papers contain a maximum
of thoughts so insightful that they were not understood by his contemporaries.
33 For Dedekind's notes on Kronecker's paper, see Edwards, Neumann & Purkert (1982).
34 It has been argued that the approach using the Prague Theorem is preferable to Dedekind's
fourth and last version of the Supplement XI (Edwards, 1980, p. 352). But it brings the treatment of algebraic number theory closer to Kronecker's and it violates Dedekind's methodological
requirement. This may explain why Dedekind objected to uses of his own theorem. See Dedekind
(1895) for Dedekind's negative reaction to Hurwitz's paper and Edwards (1980, sec. 13) for a
discussion of these matters.
35 It is worth mentioning here Herbrand's (posthumous) contributions to class field theory, such
as a classic result on local fields in (1931) and his contributions to the theory of number fields
of infinite degree in (1932) and (1933). Herbrand introduced a (now common) method based
on the notions of 'inductive limit' and 'projective limit' in connection with generalisations of
splitting groups and inertia groups when the fields k and K ⊃ k (K being the Galois extension
of the number field k) are of infinite degree over Q. Claude Chevalley followed Herbrand in
extending the theory of class fields to the number fields of infinite degree. After introducing
his notion of idele, Chevalley went on to rework class field theory, getting rid of analysis by
giving algebraic proofs of all its main theorems (1940). For example, Chevalley gave an arithmetical proof of the so-called (in old terminology) 'first fundamental inequality', while Takagi
previously used Dirichlet's L-series. See Tate (1967, pp. 180f.). This is another example of
Methodenreinheit. On the other hand, Chevalley made room, following Artin, for the local-global
principle in class field theory.
36 For an adequate statement of the theorems see the chapter by W.J. & F. Ellison in Abrege
d'histoire des mathematiques 1700-1900 (Dieudonne, 1986, pp. 188-189).
37 Again, for an adequate statement, see Dieudonne (1986, p. 189). Hilbert's student Erich Hecke
made the first breakthroughs on the problem. It should be noted, however, that Hecke made ample
use of analytic continuation, which was never mentioned by Kronecker, presumably for philosophical reasons (Weil, 1976, p. 55).
38 I owe this important point to David Reed.
39 On Kronecker's first and second limit formulas, see Lang (1987, chaps. 20-22).
40 See the classic Shimura & Taniyama (1961).
41 For an introduction to the 'Langlands programme', see Gelbart (1984) or Arthur (1981).
42 See also Hasse (1967, p. 266n.) and Weil (1950, p. 90) for a similar conclusion. In his preface
to his book Algebraic Theory of Numbers, in 1940, Weyl wrote: "I have axiomatized Kronecker's
approach to the problem of divisibility, which has recently been completely neglected; [... ]
The ultimate verdict may be that the one outstanding way for any deeper penetration into the
subject is the Kummer-Hensel p-adic theory" (1940, p. i). We are now in a better position to
evaluate the role of p-adic numbers, and it is so obvious that Weyl's prediction was entirely
correct that it is not even worth arguing for.
43 For details about the programme and the results, see Simpson (1985, 1988).
44 Steinitz (1910) was the first to establish the existence and uniqueness of an algebraic
closure for a given field, a theorem which is considered by Hasse as the Fundamental theorem
of Algebra, since the original (Gaussian) Fundamental Theorem cannot be proved without analytic
tools and its validity is limited to the field of complex numbers (Hasse, 1926-27, pp. 247-248).
See Steinitz (1910, p. 33) and the editorial note 43 by Baer and Hasse (1910, p. 153). Bart
van der Waerden's Modern Algebra contains a description of Steinitz's existence and uniqueness theorems for countable fields (1931, pp. 193-196). Compare the treatment by Birkhoff &
MacLane (1941, pp. 114f. & 393).
45 Even in the countable case, the axiom of choice is needed for the proof of the uniqueness
theorem. See van der Waerden (1931, vol. I, p. 195). See notes 9 and 44 for other remarks
about Steinitz (1910).
46 One is also reminded of Poincare's answer to Russell's inquietude: when Russell stated
that until "a complete solution of our difficulties [the paradoxes] ... is found we cannot be
sure how much mathematics it will leave intact" (1906, p. 53), Poincare replied that "only
Cantorism and Logistic are called into question; real mathematics, ... , will continue to develop
according to their own principles" (1906, p. 307).
47 This much was shown by Frey and Ribet. See Barry Mazur's 'Number Theory as Gadfly'
(1991). On the recent spectacular attempt at a proof of the conjecture by Andrew Wiles, which
has since been temporarily withdrawn, see Ribet's report in Notices of the American Mathematical
Society (1993).
48 Research for this paper was made possible by Post-doctoral Fellowships from the Social
Sciences and Humanities Research Council of Canada and from the Fonds pour la Formation
de Chercheurs et l'Aide à la Recherche (Quebec), while I was successively a Research Associate
at the Center for the Philosophy and History of Science at Boston University and a Post-Doctoral
Fellow at the Universite de Montreal. I would like to thank Harold Edwards, Daniel
Isaacson, Yvon Gauthier, Angus Macintyre and David Reed for their comments on earlier versions
of this paper which led to substantial improvements.

REFERENCES
Arakelov, S. J., 1974a, 'Intersection Theory of Divisors on an Arithmetic Surface', Math. USSR
Izvestija, 1167-1180.
Arakelov, S. J., 1974b, 'Theory of Intersections on the Arithmetic Surface', Proceedings
of the International Congress of Mathematicians. Vancouver, 1974, vol. 1, 405-408.
Arthur, J., 1981, 'Automorphic Representations and Number Theory', Canadian Mathematical
Society/Societe Canadienne de mathematiques. Conference Proceedings, Volume 1: 1980
Seminar on Harmonic Analysis, 3-51.
Bell, E. T., 1940, The Development of Mathematics, McGraw-Hill, New York.
Birkhoff, G. and MacLane, S., 1941, A Survey of Modern Algebra, Macmillan, New York.
Cantor, G., 1878, 'Ein Beitrag zur Mannigfaltigkeitslehre', Journal für die reine und angewandte
Mathematik 77, 242-258; reprint in E. Zermelo (ed.): 1966, Gesammelte Abhandlungen,
reprint, Hildesheim, Georg Olms, 119-133.
Cassels, J. W. S. and Frohlich, J. (eds.), 1967, Algebraic Number Theory, Academic Press,
London/New York.
Chevalley, C., 1940, 'La théorie du corps de classes', Annals of Mathematics 41, 394-417.
Couturat, L., 1896, De l'infini mathématique, reprint: 1980, Blanchard, Paris.
Dauben, J., 1979, Georg Cantor. His Mathematics and Philosophy of the Infinite, Harvard
University Press, Cambridge Mass.
Dedekind, R., 1872, Stetigkeit und irrationale Zahlen, Vieweg & Sohn, Brunswick; English translation: 'Continuity and Irrational Numbers', in Dedekind (1963), 1-27.
Dedekind, R., 1877, 'Sur la théorie des nombres entiers algébriques', in Dedekind (1932), vol.
3, 262-296.
Dedekind, R., 1888, Was sind und was sollen die Zahlen?, Vieweg & Sohn, Brunswick, second
edition: 1893, third edition: 1911; English translation, 'The Nature and Meaning of Numbers',
in Dedekind (1963), 29-115.
Dedekind, R., 1890, 'Letter to Keferstein', in J. van Heijenoort (ed.): 1967, From Frege to Godel.
A Sourcebook in Mathematical Logic, 1879-1931, Harvard University Press, Cambridge
Mass., 98-103.
Dedekind, R., 1892, 'Über einen arithmetischen Satz von Gauss', in Dedekind (1932), vol. 2,
28-39.
Dedekind, R., 1895, 'Über die Begründung der Idealtheorie', in Dedekind (1932), vol. 2, 50-58.
Dedekind, R., 1932, Gesammelte mathematische Werke, 3 vols., R. Fricke, E. Noether, and O.
Ore (eds.), Vieweg & Sohn, Brunswick.
Dedekind, R., 1963, Essays on the Theory of Numbers, reprint, Dover, New York.
Dedekind, R. and Weber, H., 1882, 'Theorie der algebraischen Funktionen einer Veränderlichen',
in Dedekind (1932), vol. 1, 238-349.
Dieudonne, J., 1981, 'La philosophie des mathématiques de Bourbaki', in Choix d'oeuvres
mathématiques, Hermann, Paris, vol. 1, 27-39.
Dieudonne, J. (ed.), 1986, Abrege d'histoire des mathematiques 1700-1900, Hermann, Paris.
Dugac, P., 1973, 'Éléments d'analyse de Karl Weierstrass', Archive for History of Exact Sciences
10, 41-176.
Dugac, P., 1976, Richard Dedekind et les fondements des mathématiques, Librairie philosophique
J. Vrin, Paris.
Edwards, H. M., 1975, 'The Background of Kummer's Proof of Fermat's Last Theorem for
Regular Primes', Archive for History of Exact Sciences 14, 219-236.
Edwards, H. M., 1980, 'The Genesis of Ideal Theory', Archive for History of Exact Sciences
23, 322-378.
Edwards, H. M., 1983, 'Dedekind's Invention of Ideals', Bulletin of the London Mathematical
Society 15, 8-17.
Edwards, H. M., 1987, 'An Appreciation of Kronecker', The Mathematical Intelligencer 9(1),
28-35.
Edwards, H. M., 1989, 'Kronecker's Views on the Foundations of Mathematics', in Rowe, D.
E. and J. McCleary (eds.) The History of Modern Mathematics. Volume I: Ideas and their
Reception, Academic Press, London, 67-77.
Edwards, H. M., 1990, Divisor Theory, Birkhäuser, Boston-Basel-Berlin.
Edwards, H. M., 1992a, 'Mathematical Ideas, Ideals, and Ideology', The Mathematical
Intelligencer 14(2), 6-19.
Edwards, H. M., 1992b, 'Kronecker's Arithmetical Theory of Algebraic Quantities', Jahresbericht
der deutschen Mathematiker-Vereinigung 94, 130-139.
Edwards, H., Neumann, O., and Purkert, W., 1982, 'Dedekinds "Bunte Bemerkungen" zu
Kroneckers "Grundzüge"', Archive for History of Exact Sciences 27, 49-85.
Faltings, G., 1983, 'Endlichkeitssätze für abelsche Varietäten über Zahlkörpern', Inventiones
Mathematicae 73, 349-366.
Faltings, G., 1984, 'Calculus on Arithmetic Surfaces', Annals of Mathematics 119, 387-424.
Friedman, H. M., Simpson, S. G. and Smith, R. L., 1983, 'Countable Algebra and Set Existence
Axioms', Annals of Pure and Applied Logic 25, 141-181.
Gauthier, Y., 1994, 'Hilbert and the Internal Logic of Mathematics', Synthese 101, 1-14.
Gelbart, S., 1984, 'An elementary Introduction to the Langlands Program', Bulletin of the
American Mathematical Society 10, 177-219.
Gillet, H. and Soule, C., 1992, 'An Arithmetic Riemann-Roch Theorem', Inventiones
Mathematicae 110, 473-543.
Gordan, P., 1893, 'Über einen Satz von Hilbert', Mathematische Annalen 42, 132-142.
Hasse, H., 1926-27, Höhere Algebra I, II, Walter de Gruyter, Berlin, second edition: 1933,
third edition: 1951; English translation: 1954, Higher Algebra, Frederick Ungar, New York.
Hasse, H., 1930, 'Die moderne algebraische Methode', Jahresbericht der deutschen
Mathematiker-Vereinigung 39, 22-34; English translation: 1986, 'The Modern Algebraic
Method', The Mathematical Intelligencer 8(2), 18-23.
Hasse, H., 1949, Zahlentheorie, Springer, Berlin; English translation: 1980, Number Theory,
Springer, Berlin.
Hasse, H., 1967, 'History of Class Field Theory', in Cassels & Frohlich (1967), 266-279.
Herbrand, J., 1931, 'Sur la théorie des groupes de décomposition, d'inertie et de ramification',
Journal de mathématiques pures et appliquées 10, 481-498.
Herbrand, J., 1932, 'Théorie arithmétique des corps de nombres de degré infini. I. Extensions
algébriques finies de corps infinis', Mathematische Annalen 106, 473-501.
Herbrand, J., 1933, 'Théorie arithmétique des corps de nombres de degré infini. II. Extensions
algébriques de degré infini', Mathematische Annalen 107, 699-717.
Hilbert, D., 1890, 'Uber die Theorie der algebraischen Formen', Mathematische Annalen 36,
473-534, reprint in Hilbert (1935), vol. 2, 199-257; English translation: 'On the theory of
Algebraic Forms', in Hilbert (1978), 143-224.
Hilbert, D., 1893, 'Uber die vollen Invariantensysteme', Mathematische Annalen 42, 313-373,
reprint in Hilbert (1935), vol. 2, 287-344; English translation: 'On the Complete systems
of Invariants', in Hilbert (1978), 225-302.
Hilbert, D., 1897, 'Die Theorie der algebraischen Zahlkörper', Jahresbericht der deutschen
Mathematiker-Vereinigung 4, 175-546; reprint in Hilbert (1935), vol. 1, 63-363.
Hilbert, D., 1900, 'Mathematische Probleme', Göttinger Nachrichten, 253-297; reprint in Hilbert
(1935), vol. 3, 290-329; English translation: 1902, 'Mathematical Problems', Bulletin of the
American Mathematical Society 8, 437-479; reprint in 1976, Proceedings of Symposia in Pure
Mathematics 28, 1-34.
Hilbert, D., 1922, 'Neubegründung der Mathematik (Erste Mitteilung)', reprint in Hilbert (1935),
vol. 3, 157-177.
Hilbert, D., 1931, 'Die Grundlegung der elementaren Zahlenlehre', Mathematische Annalen
104, 485-494.
Hilbert, D., 1935, Gesammelte Abhandlungen, Springer, Berlin.
Hilbert, D., 1978, Hilbert's Invariant Theory Papers, M. Ackerman, R. Hermann (eds.), Math
Sci Press, Brookline Mass.
Hurwitz, A., 1895, 'Über einen Fundamentalsatz der arithmetischen Theorie der algebraischen
Größen', in: 1932, Mathematische Werke, Birkhäuser, Basel, vol. 2, 198-207.
Jourdain, P. E. B., 1905, 'On the General Theory of Functions', Journal für die reine und angewandte Mathematik 128, 169-210.
Kirk, G. S., Raven, J. E., and Schofield, M., 1983, The Presocratic Philosophers, second
edition, Cambridge University Press, Cambridge.
Kneebone, G. T., 1963, Mathematical Logic and the Foundations of Mathematics. An Introductory
Survey, Van Nostrand, Toronto/New York/Princeton.
Konig, J., 1903, Einleitung in die allgemeine Theorie der algebraischen Grossen, Teubner,
Leipzig.
Kronecker, L., 1853, 'Uber die algebraisch auflosbaren Gleichungen', in Kronecker (1899),
vol. 4, 27-37.
Kronecker, L., 1880, 'Auszug aus einem Briefe von L. Kronecker an R. Dedekind', in Kronecker
(1899), vol. 5, 453-457.
Kronecker, L., 1882, 'Grundzüge einer arithmetischen Theorie der algebraischen Größen', in
Kronecker (1899), vol. 2, 287-296.
Kronecker, L., 1886a, 'Über einige Anwendungen der Modulsysteme auf elementare algebraische
Fragen', in Kronecker (1899), vol. 3, 147-208.
Kronecker, L., 1886b, 'Zur Theorie der elliptischen Funktionen XI', in Kronecker (1899), vol.
4, 389-470.
Kronecker, L., 1887a, 'Über den Zahlbegriff', in Kronecker (1899), vol. 3, 251-274.
Kronecker, L., 1887b, 'Ein Fundamentalsatz der allgemeinen Arithmetik', in Kronecker (1899),
vol. 3, 211-240.
Kronecker, L., 1899, Werke, 5 vols., K. Hensel (ed.), Teubner, Leipzig.
Kronecker, L., 1901, Vorlesungen über Zahlentheorie, K. Hensel (ed.), Teubner, Leipzig.
Kummer, E. E., 1851, 'Mémoire sur la théorie des nombres complexes composés de racines de
l'unité et de nombres entiers', Journal de mathématiques pures et appliquées 15, 377-498,
reprint in Kummer (1975), vol. 1, 363-484.
Kummer, E. E., 1859, 'Über die allgemeinen Reciprocitätsgesetze unter den Resten und Nichtresten
der Potenzen, deren Grad eine Primzahl ist', Mathematische Abhandlungen der Königlichen
Akademie der Wissenschaften zu Berlin, 19-159; reprint in Kummer (1975), vol. 1, 699-838.
Kummer, E. E., 1975, Collected Papers, 2 vols., A. Weil (ed.), Springer, Berlin.
Lang, S., 1987, Elliptic Functions, second edition, Springer, Berlin.
Lang, S., 1988, Introduction to Arakelov Theory, Springer, Berlin.
Langlands, R. P., 1976, 'Some Contemporary Problems with Origins in the Jugendtraum',
Proceedings of Symposia in Pure Mathematics 28, 401-418.
Lejeune-Dirichlet, P. G., 1837, 'Über die Darstellung ganz willkürlicher Funktionen durch Sinus- und Cosinusreihen', in P. G. Lejeune-Dirichlet: 1969, Werke, L. Kronecker (ed.), reprint,
Chelsea, New York, vol. 1, 135-160.
Lejeune-Dirichlet, P. G., 1879, Vorlesungen über Zahlentheorie, R. Dedekind (ed.), Vieweg &
Sohn, Brunswick.
Macaulay, F. S., 1916, The Algebraic Theory of Modular Systems, Cambridge University Press,
Cambridge.
Marion, M., Forthcoming (a), Wittgenstein, Finitism, and the Foundations of Mathematics, Oxford
University Press, Oxford.
Marion, M., Forthcoming (b), 'Wittgenstein and Ramsey on Identity', to appear in J. Hintikka
(ed.): 1995, Essays on the Development of the Foundations of Mathematics, Kluwer,
Amsterdam.
Mazur, B., 1977, 'Review of E. E. Kummer, Collected papers', in Bulletin of the American
Mathematical Society 83, 976-988.
Mazur, B., 1991, 'Number Theory as Gadfly', American Mathematical Monthly 98, 593-610.
Meschkowski, H., 1967, Probleme des Unendlichen. Werk und Leben Georg Cantors, Vieweg
& Sohn, Brunswick.
Molk, J., 1885, 'Sur une notion qui comprend celle de la divisibilité et sur la théorie générale de
l'élimination', Acta Mathematica 6, 1-165.
Molk, J., 1909, 'Principes fondamentaux de la théorie des fonctions', in Encyclopédie des sciences
mathématiques pures et appliquées, Gauthier-Villars, Paris, vol. II.1, 1-112.
Poincare, H., 1899, 'L'oeuvre mathematique de Weierstrass', Acta Mathematica 22, 1-18.
Poincare, H., 1901, 'Sur l'arithmétique des courbes algébriques', in: 1950, Oeuvres de Henri
Poincare, Gauthier-Villars, Paris, vol. 5, 483-548.
Poincare, H., 1906, 'Les mathematiques et la logique', Revue de metaphysique et de morale
14, 294-317.
Poincare, H., 1911, 'The Bolyai Prize. Report on the Works of Hilbert', Science 34, 753-765
& 801-808.
Pringsheim, A., 1899, 'Grundlagen der allgemeinen Funktionenlehre', in Encyklopädie der
mathematischen Wissenschaften mit Einschluss ihrer Anwendungen, Teubner, Leipzig, vol.
II A 1, 1-41.
Russell, B., 1906, 'On some Difficulties in the Theory of Transfinite Numbers and Order
Types', Proceedings of the London Mathematical Society 4, 29-53.
Schoenflies, A., 1927, 'Die Krisis in Cantor's Mathematischen Schaffen', Acta Mathematica
50, 1-23.
Shimura, G. and Taniyama, Y., 1961, Complex Multiplication of Abelian Varieties and its
Applications to Number Theory, Publications of the Mathematical Society of Japan 6.
Sieg, W., 1990, 'Relative Consistency and Accessible Domains', Synthese 84, 259-297.
Simpson, S., 1985, 'Reverse Mathematics', Proceedings of Symposia in Pure Mathematics 42, 461-471.
Simpson, S., 1988, 'Partial Realizations of Hilbert's Program', Journal of Symbolic Logic 53,
359-363.
Steinitz, E., 1910, 'Algebraische Theorie der Körper', Journal für die reine und angewandte
Mathematik 137, 167-309. Book edition: 1930, Algebraische Theorie der Körper, R. Baer
& H. Hasse (eds.), Springer, Berlin; reprint: 1950, Chelsea, New York.
Tate, J. T., 1967, 'Global Class Field Theory', in Cassels & Frohlich (1967), 162-203.
van der Waerden, B. L., 1931, Moderne Algebra, Springer, Berlin; English translation: 1949,
Modern Algebra, Frederick Ungar, New York.
Vladut, S. G., 1991, Kronecker's Jugendtraum and Modular Functions, Gordon & Breach,
New York.
Weber, H., 1893, 'Leopold Kronecker', Jahresbericht der deutschen Mathematiker-Vereinigung
2, 5-31.
Weil, A., 1950, 'Number-Theory and Algebraic Geometry', in A. Weil: 1970, Oeuvres
Scientifiques. Collected Papers, Springer, Berlin, vol. 1, 442-452.
Weil, A., 1976, Elliptic Functions According to Eisenstein and Kronecker, Springer, Berlin.
Weyl, H., 1940, The Algebraic Theory of Numbers, Princeton University Press, Princeton.
Wittgenstein, L., 1965, Philosophical Remarks, Blackwell, Oxford.
Wittgenstein, L., 1974, Philosophical Grammar, Blackwell, Oxford.
Wittgenstein, L., 1979, Ludwig Wittgenstein and the Vienna Circle, Blackwell, Oxford.

MARIO BUNGE

HIDDEN VARIABLES, SEPARABILITY, AND REALISM*

The entire family of hidden variables theories, containing inequalities of the
Bell type, has recently been refuted experimentally. However, there is confusion as to what exactly this proves. Some claim that realism has been refuted,
others that determinism is definitely out, and still others that the loser is
the conjunction of the EPR criterion of reality and the classical principle of
separability or locality. The aim of this paper is to clarify some of the key
concepts involved in order to help make a decision among these alternatives.
First the original goals of hidden variables theories are listed and examined.
Along the way several philosophical concepts, particularly that of realism,
are elucidated. It is shown that philosophical realism, i.e. the thesis that
nature exists on its own, is different from the EPR criterion of physical
reality, which is actually only a classicist tenet.
Then the experiments of Aspect and his coworkers are briefly analyzed.
It is concluded that, far from refuting philosophical realism, they presuppose
it. It is also concluded that those experiments signal the demise of the EPR
tenets that every physical property has a definite value at all times, and that
spatial separation of the components of a complex system suffices to dismantle
it.
The superposition principle and inseparability ("non-locality") are seen as
the victors of the recent episode. On the other hand the Copenhagen interpretation of the mathematical formalism looks just as defective and avoidable
as before the downfall of the Bell inequalities. It is suggested that the realistic interpretation advanced elsewhere by the author has not been touched
by this event.
The original EPR Gedankenexperiment is criticized in the Appendix from
a realistic viewpoint.

In this paper we will discuss some developments triggered by the famous paper
by Einstein, Podolsky and Rosen [1] - henceforth EPR - and culminating
with the experimental tests of hidden variables theories, in particular Aspect
et al. [2, 3].
EPR held that the current quantum theory is incomplete for being about
ensembles of similar things rather than about individual entities. On this view
individual microphysical objects would have precise though different positions,
momenta, energies, etc. Randomness would not be a basic mode of being
but an appearance originating in individual differences among the components
of an ensemble of mutually independent things. A complete theory,
EPR claimed, should contain hidden (classical) variables only, and it should
deduce all the probability distributions instead of postulating them.

* First published in Revista Brasileira de Fisica, volume especial os 70 anos de Mario Schonberg,
pp. 150-168 (1984).
M. Marion and R. S. Cohen (eds.), Quebec Studies in the Philosophy of Science I, 217-227.
1995 Kluwer Academic Publishers.

A hidden variable is a function possessing definite or sharp values all the
time instead of being spread out: it is dispersion-free like the classical position
and the classical field intensities. This type of variable was called "hidden"
in contrast with the alleged "observables" and for presumably referring to a
sub-quantum level underlying that covered by quantum theory. (See Bohm
[4] for a fascinating account.) It was hoped that quantum theory could be
enriched or completed with hidden variables, or even superseded by a theory
containing only variables of this type. Bohm's theory [5], which revived and
expanded de Broglie's ideas on the pilot wave, was of the first kind. Stochastic
quantum mechanics (de la Peña-Auerbach [6]) and stochastic quantum electrodynamics (e.g. Claverie and Diner [7]) are of the second type. Moreover,
the latter theories, far from being thoroughly causal, contain random hidden
variables like those occurring in the classical theory of Brownian motion.
Hidden variables theories were supposed to accomplish five or six tasks
at the same time. It will pay to list and examine them separately because
they are often conflated, with the result that there is still no consensus on
what exactly the experimental refutation of these theories has refuted. Those
tasks are:
(i) To restore realism in the philosophical sense of the word, i.e. to supply
an objective account of the physical world, rather than describe the information, expectations, and uncertainties of the knowing subject (in particular the
observer).
(ii) To restore realism in EPR's idiosyncratic sense of the word. That is,
to comply with the oft-quoted EPR criterion of physical reality: "If, without
in any way disturbing a system, we can predict with certainty (i.e., with probability equal to unity) the value of a physical quantity, then there exists an
element of physical reality corresponding to this physical quantity" (Einstein
et al. [1], p. 777).
(iii) To restore classical determinism by deducing chance from causality
- i.e. to explain probability distributions in terms of individual differences
and the mutual independence of the components of statistical ensembles in
the sense of Gibbs.
(iv) To complete the job begun by quantum theory, regarded as a statistical theory, by accounting for the behavior of individual physical entities.
(v) To replace quantum theory, regarded as a phenomenological (or black
box) theory, with a mechanismic theory that would explain quantum behavior
instead of providing only successful recipes for calculating it. I.e., to exhibit
the mechanisms underlying quantum behavior - e.g. electron diffraction and
the tunnel effect - deducing quantum theory as a particular case, in some limit,
or for special circumstances.
(vi) To restore the separability or independence of things that, having
been components of a closely knit system in the past, have now become widely
separated in space - i.e., to eliminate the distant or EPR correlations. More
on this below. Suffice it now to recall that, according to quantum theory,
once a (complex) system, always a system.
The first objective - the restoration of philosophical realism - was legitimate. However, it can be attained without modifying the mathematical
formalism of quantum theory and, in particular, without introducing hidden
variables. Indeed it can be shown that the Copenhagen interpretation is adventitious and that the realistic interpretation, which focuses on things in
themselves rather than on the observer, is not only possible but the one actually
employed by physicists when not in a philosophical mood. For example, no
assumptions about the observer and his experimental equipment are made when
calculating energy levels. (See Bunge [8, 9].)
The second goal, though often mistaken for the first, has nothing to do
with philosophical realism: it is simply classicism, and it involves the denial
of the superposition principle, according to which the dynamical properties
of a quantum object normally have distributions of values rather than sharp
values. (In the non-relativistic theory only mass and electric charge have
sharp values all the time - but these are not dynamical variables.) In the
light of the indisputable success of quantum theory and the failure of hidden
variables theories, that classicist principle must be regarded as unjustified:
we shall call it the EPR dogma.
The third objective - determinism - was pursued by the early hidden variables theorists, but was no longer an aim of those who worked from about
1960 on. In fact, as we recalled a while ago, there are theories containing
random hidden variables, namely stochastic quantum mechanics and electrodynamics. The introduction of these theories was important to distinguish
the concepts of (i) philosophical realism, (ii) classicism or EPR realism, and
(iii) determinism - as we realize with hindsight.
The fourth goal (completeness) makes sense only when quantum theory
is regarded as a statistical theory dealing only with ensembles. It evaporates
as soon as it is realized that the elementary theory refers to individual entities.
The fifth objective (mechanismic theory) was soon abandoned in the most
advanced work on hidden variables: that of Bell [10, 11]. In fact, Bell's hidden
variables have no precise physical interpretation: they are only adjustable parameters in a theory that is far more phenomenological than quantum mechanics.
Moreover, such hidden variables are not subject to any precise law statements, hence they are no part of a physical theory proper. But this is a virtue
rather than a shortcoming, for it shows that Bell's famous inequalities hold
for the entire family of local hidden variables theories, regardless of the precise
law statements that may be assumed.
So far, only the first objective - namely the restoration of philosophical
realism - seems worth being pursued. However, it can be attained without
the help of hidden variables, namely by a mere change of interpretation of
the standard formalism of the theory. On the other hand the sixth objective,
namely separability, remained plausible for about half a century. Indeed, the
EPR distant correlations are quite counter-intuitive. How is it possible for
the members of a divorced couple to behave as if they continued to be married,
i.e., in such a manner that whatever happens to one of them affects the other?
The quantum-theoretical answer is simple if sibylline: there was no divorce.
However, let us not rush things: it will pay to have a look at the experiment
suggested by Einstein et al. [1]. The original experiment is described and
analyzed in the Appendix. Here we shall deal with an updated version of it
in the context of Bell's general hidden variables theory, which contains the
now famous Bell's inequalities.
According to quantum theory the three components of the spin of a quantum
object do not commute, hence (in our realistic interpretation) they have no
definite or sharp values all at the same time - as a consequence of which they
cannot be measured exactly and simultaneously. Simpler: the spin components are not the components of a vector, hence there is no vector to be
measured. On the other hand, if the spin were an ordinary vector (hence a
hidden variable), its components would commute and consequently it should
be possible to measure all three components at the same time with the accuracy
allowed by the state of the art.
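
The non-commutation invoked above can be checked numerically. The following is a minimal sketch, assuming the standard Pauli-matrix representation of the spin components of a spin-1/2 object (the text does not specify a representation); the helper names `matmul`, `commutator` and `scale` are illustrative only.

```python
# Pauli matrices as nested lists of complex numbers.
# Assumption: spin-1/2, where each spin component is (hbar/2) times one of these.
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]
SIGMA_Z = [[1, 0], [0, -1]]


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def commutator(a, b):
    """Return [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]


def scale(c, a):
    """Multiply every entry of the 2x2 matrix a by the number c."""
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


# The x and y components do not commute: their commutator is 2i times sigma_z.
xy = commutator(SIGMA_X, SIGMA_Y)
print(xy == scale(2j, SIGMA_Z))
```

Had the three components been those of an ordinary vector, every such commutator would vanish, as `commutator(SIGMA_X, SIGMA_X)` does.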
Apply the preceding considerations to a pair of quantum objects that are
initially close together with spins pointing in opposite directions, which subsequently move apart without being interfered with. Compute the probability
P(x, y) that thing 1 has spin in the positive x-direction, and thing 2 in the
positive y-direction. According to quantum mechanics this probability exceeds
the sum of the probabilities P(x, z) and P(y, z):
QM: $P(x, y) \geq P(x, z) + P(y, z)$.

On the other hand any hidden variables theory predicts the exact reversal of
the inequality sign:

HV: $P(x, y) \leq P(x, z) + P(y, z)$.


This is one of Bell's inequalities. Bell himself, and others, derived several other
inequalities that hold in all local hidden variables theories, and that involve
exclusively measurable quantities such as coincidence counter rates. (See
Clauser and Shimony [12] for a masterly review.)
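
A small numerical sketch may help to see how the quantum prediction can exceed the hidden variables bound. It assumes the standard quantum-mechanical result for a pair with opposite spins (singlet state), namely that the joint probability for two directions enclosing an angle theta is (1/2) sin^2(theta/2); this formula and the particular angles are assumptions of the sketch, not taken from the text. The labels x, y, z are treated simply as three analyzer directions, and the function names are illustrative.

```python
import math


def joint_prob(angle_deg):
    """Assumed singlet-state probability that thing 1 has spin 'up' along one
    direction and thing 2 has spin 'up' along another direction, where
    angle_deg is the angle between the two directions."""
    return 0.5 * math.sin(math.radians(angle_deg) / 2) ** 2


def excess(angle_xy, angle_xz, angle_yz):
    """P(x, y) - [P(x, z) + P(y, z)].
    A positive value means the hidden variables bound is exceeded."""
    return joint_prob(angle_xy) - (joint_prob(angle_xz) + joint_prob(angle_yz))


# Three coplanar directions, with z bisecting the 120-degree angle between x and y.
coplanar = excess(120, 60, 60)

# Three mutually orthogonal directions.
orthogonal = excess(90, 90, 90)

print(coplanar, orthogonal)
```

Under these assumptions the excess is positive for the coplanar choice and negative for the mutually orthogonal one, so whether the bound is exceeded depends on the directions chosen for the analyzers.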
Thanks to this theoretical work a number of ingenious high precision
measurements were designed. Some of them, involving photons, have been
performed. The latest version is that of Aspect et al. [2, 3]. An atom emits two
photons in opposite directions at the same time. Their linear polarizations
are measured by polarizers oriented in directions that are adjustable at will.
The photons coming out of the polarizers activate photomultipliers the coincidence rate of which is monitored by a counter. The result is that the
polarization states of the two photons are highly correlated. That is, the state
of polarization measured by one of the polarizers depends upon that measured
by the other polarizer and conversely (Aspect [2]). Essentially the same result
obtains if the polarizers are rotated during the flight of the photons and before
they strike the polarizers, to prevent any communication between them through
signals propagating at a finite speed [3].
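
The kind of test described above can be illustrated with a short sketch. It uses the CHSH combination, one member of the family of Bell-type inequalities that is not written out in the text, together with the ideal quantum predictions for a polarization-entangled photon pair: correlation cos 2(a - b) and coincidence probability (1/2) cos^2(a - b) for polarizer angles a and b. These formulas, the angle settings and all names are assumptions of the sketch; real experiments involve detector efficiencies and other corrections that are ignored here.

```python
import math


def correlation(a_deg, b_deg):
    """Assumed ideal quantum polarization correlation E(a, b) = cos 2(a - b)."""
    return math.cos(2 * math.radians(a_deg - b_deg))


def coincidence_prob(a_deg, b_deg):
    """Assumed ideal probability that both photons pass: (1/2) cos^2(a - b)."""
    return 0.5 * math.cos(math.radians(a_deg - b_deg)) ** 2


def chsh(a, a2, b, b2):
    """CHSH combination S of four correlations.
    Local hidden variables theories require |S| <= 2."""
    return (correlation(a, b) - correlation(a, b2)
            + correlation(a2, b) + correlation(a2, b2))


# Polarizer settings (degrees) commonly chosen to maximize the quantum value.
s_value = chsh(0, 45, 22.5, 67.5)
print(s_value, 2 * math.sqrt(2))

# The coincidence probability is largest for parallel polarizers
# and vanishes for crossed ones.
print(coincidence_prob(0, 0), coincidence_prob(0, 90))
```

With these idealized inputs S comes out at about 2.83, above the bound of 2 that any local hidden variables theory must respect.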
Although there is consensus that Bell's inequalities, and with them all of
the hidden variables theories, have been experimentally refuted, it is not clear
what is at issue: philosophical realism, determinism, the EPR dogma that all
properties have sharp values at all times, or the EPR hypothesis that distant
things behave independently of one another.
Most authors - e.g. Clauser and Shimony [12], d'Espagnat [13], Aspect
et al. [2] - claim that realism is the casualty, without however clarifying
what they mean by "realism", and ostensibly conflating philosophical realism
with what we have called classicism or the EPR dogma. Yet it should be
clear that the very design and performance of measurements presuppose the
reality (independent existence) of the entire measurement system, hence of
every component of it (object, apparatus, and experimenter). So, philosophical realism is definitely not at stake: the downfall of Bell's inequalities has
not refuted the principle that the physical world manages to exist without
the help of those who attempt to know it. Nor is determinism at issue, since
some hidden variables theories entailing Bell-type inequalities contain random
variables. So, only the EPR dogma and the separability (locality) hypotheses
are at issue. As a matter of fact this is the conclusion Clauser and Shimony
[12] reach after their careful analysis: Hidden variables (sharpness) & Locality
(separability) => Bell's inequalities. Hence, the experimental refutation of
the latter refutes either separability or the EPR dogma - or both. We shall argue
in a while that the or is inclusive, for the EPR dogma entails separability
(locality). But before doing this we must take another glance at the Aspect
experiment.
The experiment poses two problems. One is to ascertain whether the photons
emitted by the source possess all their properties, in particular polarization,
or acquire them when interacting with the polarizers. The second problem is
to account for the correlations at a distance, or EPR correlations, between
the results obtained with one of the analysers and those obtained with the other
- or, equivalently, to explain the coincidences registered by the coincidence
counter.
Let us start with the first problem. According to both classical and hidden
variables theories, the fragments resulting from the process at the source
have sharp properties (eigenvalues) from start to finish: the analysers only
exhibit what is already there. On the other hand, according to quantum theory
the entities leaving the source are in a superposition of polarization states. This
superposition collapses (reduces) to eigenstates through interaction of the
system with the analysers. In other words, the analysers do not just measure
an existing polarization but also produce it. (In Pauli's terminology, they are
instruments of the second kind.) Clearly, this explanation will satisfy only
those who believe that accurate measurement is accompanied by a projection of the state function - even though one need not believe the dogmas
that this projection is subjective, instantaneous, and unrelated to the smooth
processes described by the Schrödinger equation. (See Cini [14], Bunge and
Kalnay [15].)
The second problem is this: if only one of the analyzers is read, we may
infer the result obtained with the other analyzer without looking at it. (This
cannot be done with the apparatus designed by Aspect et al. [2, 3], which
registers only coincidences. The equivalent operation here is to alter the angle
between the analyzers and note the change in coincidence rate. The experiment shows that the coincidence rate attains its maximum when the two
analyzers are parallel.) This result is nicely explained by any hidden variables theory jointly with the theorem of conservation of the total spin. Indeed,
the first quantum object would arrive at the analyzer possessing a definite
spin value, say 1, so that the second must have spin -1, since the total spin
of the system at the source was 0. The trouble with this explanation is that
it takes hidden variables for granted. Since experiment has refuted hidden variables theories we must try to explain the same results with the help of quantum
theory.
If the two quantum objects leaving the source were to become independent after a while - as they should according to classical physics and to EPR
- the result obtained with one of the polarizers should be independent of the
result obtained with the other. Thus it should be possible to observe, say, the
left quantum object with its spin pointing in the x-direction, and the right
one with its spin pointing in the z-direction. But this is not what quantum theory
predicts: according to this theory there is total spin conservation, so that if
the left polarizer projects the spin onto the positive x-axis, the right polarizer will project it onto the negative x-axis. In other words, there is a strong
correlation at a distance between the polarizations of the two former components of the original system. This correlation is correctly predicted by quantum
theory, which treats the "two" quantum objects as constituting a single entity.
Indeed, only the state function for the entire system satisfies the Schrödinger
equation. (In other words, the superposition principle is not optional but mandatory in this case.)
The quantum-theoretical explanation of distant correlation is often found
unsatisfactory or even mysterious because it involves the counter-intuitive
("paradoxical") hypothesis that a complex system may continue to be one even
after its components have moved far apart. In fact all known classical forces
fall off quickly with increasing distance. For that reason a host of more or
less unorthodox explanations of the EPR distant correlation have been
proposed. Among them we note those in terms of hidden variables and a
peculiar quantum potential that does not occur in the hamiltonian but derives
from the state function of the system (Bohm and Hiley [16]); the occurrence
of actions at a distance - and even psychokinesis. We reject all three explanations for the following reasons. The first occurs in a hidden variables theory,
hence one containing some of Bell's false inequalities. The second violates
special relativity and electrodynamics (both classical and quantal), which
involve the principle of nearby action (or locality in the field-theoretic sense).
And the third violates the principle of conservation of energy, since every
message must ride on a physical process - unless, of course, one is prepared
to believe in the existence of non-physical things.
We need not look for any special mechanism explaining the distant or
EPR correlations, any more than we need to explain the Lorentz "contractions"
and "dilatations" in terms of mechanisms (Paty [17]). Quantum mechanics
accounts for distant correlations as follows. If two quantum objects are initially independent from one another, then the state function of the aggregate
they constitute is correctly represented by the product of their individual
state functions. So, separability or "locality" obtains in this case just as in classical physics. But if two quantum objects have initially been part of a system,
i.e., if they have interacted strongly at the beginning, then the state function
of the whole cannot be so factorized even after the components have moved
far apart. In other words, the state of each component is determined not only
by the local conditions (i.e. the state of the immediate surroundings of the
spatially separated quantum objects), but also by their still belonging to a
system.
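
The difference between a factorizable and a non-factorizable state function can be made concrete in the simplest case. The sketch below assumes that each component has just two basis states (u_1, u_2 for the first and v_1, v_2 for the second), so that the state of the pair is fixed by a 2x2 table of coefficients; in that case the state is a single product exactly when the determinant of the table vanishes. The function names and the example states are illustrative assumptions.

```python
import math


def is_factorizable(c, tol=1e-12):
    """c[i][j] is the coefficient of u_(i+1) v_(j+1) in the state of the pair.
    For two two-state components, the state is a product of individual
    state functions exactly when det(c) = 0."""
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    return abs(det) < tol


def product_state(a, b):
    """Coefficient table of (a[0] u_1 + a[1] u_2)(b[0] v_1 + b[1] v_2)."""
    return [[a[i] * b[j] for j in range(2)] for i in range(2)]


r = 1 / math.sqrt(2)

# Two initially independent objects: the state of the aggregate is a product.
independent = product_state([r, r], [1, 0])

# An equal-weight sum of two products, (u_1 v_1 + u_2 v_2) / sqrt(2):
# it cannot be written as a single product.
entangled = [[r, 0], [0, r]]

print(is_factorizable(independent), is_factorizable(entangled))
```

Nothing in the coefficient table refers to the distance between the components, which is why moving them apart does not by itself restore factorizability.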
So, although physical separation entails spatial separation, the converse is
false. Quantum theory is then non-local in this peculiar sense or, as we prefer
to say, it is systemic in that it does not incorporate the classical principle
that widely separated things cannot belong to the same system. (Note again
the ambiguity of the word "local". In field physics, whether classical or quantal,
"locality" means that nearby action, or non-action-at-a-distance, obtains: all
actions are assumed to proceed de proche en proche. Quantum theory is local
in this traditional sense of the word.)
We submit that the non-separability (or non-"locality") inherent in quantum
theory is a consequence of the superposition principle together with the
Schrödinger equation. In fact, once a system is in a state consisting of an
inextricable merger (not just either a sum or a product but a sum of products)
of the states of its components, the Schrödinger equation guarantees that the
system will evolve in such a way that its state function will retain that structural property even though the relative contribution of every individual
eigenstate is likely to change in the course of time. (In the case of the EPR
experiment, the total system 1 + 2 is in some such state as $2^{-1/2}(u_1 v_1 + u_2 v_2)$
or $2^{-1/2}(u_1 v_1 - u_2 v_2)$. If the u's and v's are spin eigenstates, the preceding
state functions are distance-independent: the components continue to be entangled no matter how far apart they move.)
The EPR paradox is such only if one presupposes the classical principle
that the behavior of things that are far apart cannot be correlated. (See Einstein
[18] and Bell [11] for statements and discussions of this classical principle
of "locality" or separability.) The physicist who takes this principle for granted
is bound to puzzle over how it could be possible for two things to continue
interacting after they have stopped being components of a system. Quantum
theory solves the "mystery" much in the same way as a detective solves all
alleged murder cases by proving that there was no murder to begin with:
that the system was not dismantled, that the distant components are still part
of the original system, that the "wave" function has not split into two parts
but continues to cover the entire system.
What the quantum theorist may puzzle over is a different question, namely:
How, that being the case, could one effectively dismantle the system, and
thus factorize its state function? We submit that the answer to this query is:
The original system becomes dismantled only when at least one of its original
components gets integrated into another system - e.g. when it is captured or
absorbed by an atom.
Philosophers have been quick to exploit the downfall of Bell's inequalities. Thus Fine [19] has interpreted it as a refutation of determinism, and
van Fraassen [20] as a refutation of philosophical realism. We have seen that
it is neither. In fact, some hidden variables theories, such as stochastic quantum
mechanics, are non-deterministic in that they contain random variables. (See
additional criticism by Nordin [21] and Eberhard [22].) As for philosophical
realism, it would indeed be dead if, as van Fraassen contends, it involved
the hypothesis that there is a causal mechanism underlying every correlation, since no such mechanism is known to exist in the case of the EPR
distant correlations. But such a version of philosophical realism is a straw man.
Philosophical realism happens to be an epistemological, not an ontological,
view, and it boils down to the thesis that nature exists even if it is neither
perceived nor conceived. (If in doubt consult any philosophical dictionary.)
Moreover, philosophical realism is compatible with alternative ontologies, in
particular neodeterminism, which acknowledges non-causal (in particular probabilistic) types of determination (Bunge [23]).
If anything, the experimental refutation of Bell's inequalities, like any
other experiment, has confirmed philosophical realism once more, since every
well designed experiment involves a clear distinction between knowing subject
(in particular experimenter), apparatus, and object of knowledge. (See Wheeler
[24], Rohrlich [25], and Bunge [26].) Only "realism" in the idiosyncratic sense
of EPR has been refuted along with Bell's inequalities - but that was only a
classicist dogma. Actually this dogma went down the moment quantum theory
succeeded in solving the problems that classical physics had left unsolved.
Since then we have learned that reality is smudged rather than neat: that
every quantum object possesses some properties that, far from having sharp
values all the time, have distributions of values; and that, unlike rigid bodies
but rather like fluids and fields, quantum objects are extremely sensitive to
their environments, whether natural or artificial. This discovery did not alter
nature and, in particular, it did not enslave nature to the supposedly omnipotent Observer. It only taught us that nature is composed not only of classical
objects but also of quantum and semiquantum objects. It altered our representation of the world, not the world itself; it was a scientific revolution, not
a cosmic cataclysm.
Hidden variables theorizing remained outside the mainstream of physics
until Bell and others succeeded in deriving formulas capable of being subjected
to rather direct crucial experimental tests. Then it came briefly to the fore,
and it may soon become little more than a historical curiosity. However, the
failure of hidden variables theories has taught us a valuable lesson: that, like
other revolutionary theories before, quantum theory has got to be understood
in its own terms. (See Bunge [8, 9].) Besides, hidden variable theorists have
rendered a valuable service to the scientific and philosophical communities.
Firstly, they have exhibited some of the inconsistencies and obscurities of
the Copenhagen orthodoxy. (Unfortunately they have often confused the issues
- e.g. the problem of objectivity with that of determinism: see Bunge [27].)
Secondly, they have helped others remove those inconsistencies and obscurities by reinterpreting the standard mathematical formalism of quantum theory
in a strictly physical, hence realistic, fashion. Thirdly, they suggested the
very experiments that were to refute classicism.
Now that the hidden variables episode seems to be closed for good we
can look forward to theories even less intuitive than quantum theory. One
wonders whether Einstein was right in believing that God (sive natura) is
not malicious.
ACKNOWLEDGEMENTS

I thank Andrés Kálnay (Sección Física, IVIC, Caracas) and Maria E. Burgos
(Depto. de Fisica, Universidad de Merida, Merida) for many long and enlightening discussions on EPR and related matters.
APPENDIX. THE EPR "PARADOX"
Let us recall and examine the original EPR "paradox" [1]. Consider a complex system, such
as a molecule, that breaks up into two components that fly off in opposite directions. Call $p = p_1 + p_2$
the total linear momentum of the system - a quantity that is conserved - and $x = x_1 - x_2$
the relative distance between its components. Since $p_1 + p_2$ commutes with $x_1 - x_2$, both are
definite (have sharp values) and can be measured simultaneously with as much accuracy as the
state of the art will allow. At all times after break-up, the system is said to be in a state represented by

where $\delta$ is Dirac's delta, and $(x_1 + x_2)/2$ is the position coordinate of the center of mass of the
system.
By measuring $x_1$ we can infer (without measurements) the exact position of component 2,
given by $x_2 = x_1 - a$. It would seem that component 2 "knows" what its place is even though
no information about $x_1$ has reached it. Alternatively, if $p_1$ is measured we can infer the exact
momentum $p_2$ of component 2, given by $p_2 = p - p_1$. In both cases it seems clear that the property
of the component that is being inferred (not measured directly) exists without measurement, which
contradicts the Copenhagen doctrine. Moreover, the information on the position (or the
momentum) of component 2 is not included in the state function (1). Therefore, according to
EPR, quantum mechanics is incomplete.
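
The commutation of the total momentum with the relative distance, on which the argument above rests, can be verified in one line. The worked equation below is a sketch assuming the canonical commutation relations $[x_j, p_k] = i\hbar\,\delta_{jk}$ for the two components, with $x_1, x_2$ their positions and $p_1, p_2$ their momenta.

```latex
\[
\begin{aligned}
[x_1 - x_2,\; p_1 + p_2]
  &= [x_1, p_1] + [x_1, p_2] - [x_2, p_1] - [x_2, p_2] \\
  &= i\hbar + 0 - 0 - i\hbar \\
  &= 0 .
\end{aligned}
\]
```

The two nonvanishing terms cancel, so the relative distance and the total momentum commute even though each position fails to commute with its own conjugate momentum.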
I do not profess to understand the criticisms leveled by Bohr [28, 29] - except that they
presuppose the very operationist interpretation that Einstein and his coworkers were criticizing.
It seems to me, on the other hand, that the EPR paper can be criticized as follows.

Firstly, the "wave" function (1) does not describe correctly the system in question because
it is not a possible eigenfunction of any operator: $\delta$ is not a square integrable function. The
delta must be replaced with a well-behaved (bounded and square integrable) function sharply
peaked at $x_1 - x_2 = a$, symmetric about a, and with a nonvanishing width (standard deviation)
$\Delta x$. But this amounts to saying that the relative distance $x = x_1 - x_2$ has an objective spread or
indeterminacy $\Delta x$. Hence, measuring $x_1$ exactly does not allow us to infer $x_2$ with certainty but
only to within $\Delta x$. Therefore Heisenberg's inequalities are not violated. (Dirac [30] conjectured that the infinite length of the state vectors corresponding to sharp position eigenvalues
"is connected with their unrealizability". As a matter of fact nobody has measured a sharp position
value: all one can do is to locate a quantum object within a small though nonvanishing region
of space of, say, vi in linear dimension.)
Secondly, although $x_1 - x_2$ and $p_1 + p_2$ commute, in practice they cannot be measured accurately at the same time. In fact, measuring $x_1 - x_2$ involves localizing component 1 at $x_1$ and
component 2 at $x_2$. But by so doing both $p_1$ and $p_2$ get smudged according to Heisenberg's inequalities. Consequently the total momentum $p_1 + p_2$ is not conserved while measuring $x_1 - x_2$. The
classical explanation (adopted by Bohr [29]) is that there are exchanges of momentum $p_1$ and
$p_2$ between the quantum objects and the measuring instrument. The quantum-theoretical explanation is that the quantum objects are thrown into superpositions of momentum states.
In our view either of the above objections disposes of the EPR claim that quantum mechanics
is incomplete. On the other hand neither touches on the real crux of the matter, namely the
inseparability inherent in quantum theory - which Schrödinger [31] regarded as "the characteristic trait of quantum mechanics".

McGill University
REFERENCES
1. Einstein, A., Podolsky, B., and Rosen, N., 1935, Phys. Rev. 47, 777.
2. Aspect, A., Grangier, P., and Roger, G., 1981, Phys. Rev. Lett. 47, 460.
3. Aspect, A., Dalibard, J., and Roger, G., 1982, Phys. Rev. Lett. 49, 1804.
4. Bohm, D., 1957, Causality and Chance in Modern Physics, Routledge & Kegan Paul, London.
5. Bohm, D., 1952, Phys. Rev. 85, 166-180.
6. de la Peña-Auerbach, L., 1969, J. Math. Phys. 10, 1620.
7. Claverie, P. and Diner, S., 1977, Intern. J. Quantum Chemistry XX, Suppl. 1, 41.
8. Bunge, M., 1967, Foundations of Physics, Springer, New York.
9. Bunge, M., 1973, Philosophy of Physics, Reidel, Dordrecht. French translation 1975, Philosophie de la physique, Seuil, Paris.
10. Bell, J., 1964, Physics 1, 195.
11. Bell, J., 1966, Rev. Mod. Phys. 38, 447.
12. Clauser, J. F. and Shimony, A., 1978, Rep. Prog. Phys. 41, 1881.
13. d'Espagnat, B., 1979, Sci. Amer. 241(5), 158.
14. Cini, M., 1983, Nuovo Cim. 73B, 27.
15. Bunge, M. and Kalnay, A., 1983, Nuovo Cim. 77, 1.
16. Bohm, D. and Hiley, B., 1975, Founds. Phys. 5, 93.
17. Paty, M., Centre de Recherches Nucleaires de Strasbourg HE, 81-10.
18. Einstein, A., 1948, Dialectica 2, 320.
19. Fine, A., 1982, Phys. Rev. Lett. 48, 291.
20. van Fraassen, B., 1982, Synthese 52, 25.
21. Nordin, I., 1979, Synthese 42, 71.
22. Eberhard, P. H., 1982, Phys. Rev. Lett. 49, 1474.
23. Bunge, M., 1959, Causality, Harvard University Press, Cambridge. Rev. ed.: New York, Dover, 1979.

24. Wheeler, J. A., in R. G. Jahn, ed., The Role of Consciousness in the Physical World, Westview Press, Boulder, Colo.
25. Rohrlich, F., 1983, Science 221, 1251.
26. Bunge, M., 1983, Treatise on Basic Philosophy 6, Understanding the World, Reidel, Boston.
27. Bunge, M., 1979, Lecture Notes in Phys. 100, 204.
28. Bohr, N., 1935, Phys. Rev. 48, 696.
29. Bohr, N., 1949, in P. A. Schilpp, ed., Albert Einstein: Philosopher-Scientist, The Library of Living Philosophers, Evanston, Ill.
30. Dirac, P. A. M., The Principles of Quantum Mechanics, 4th ed., Clarendon Press, Oxford.
31. Schrödinger, E., 1935, Proc. Cambridge Phil. Soc. 31, 555.

STORRS McCALL

A BRANCHED INTERPRETATION OF QUANTUM MECHANICS WHICH DIFFERS FROM EVERETT'S

In his 1932 axiomatization of quantum mechanics, von Neumann postulated two very different types of change which quantum systems undergo:
Schrödinger evolution and state vector reduction. Since that time, many
attempts have been made to reconstruct or to reaxiomatize quantum theory
so as to eliminate reduction or collapse and retain unitary evolution as the
sole principle of change.
In this paper an interpretation of quantum mechanics, called the "branched"
interpretation, is proposed which (i) accommodates both of von Neumann's
evolutionary processes, (ii) provides a physical model of state vector reduction, and (iii) is non-local in a precise sense to be explained below. The
branched interpretation is discussed in chapter 4 of McCall (1994).
The theory which the branched interpretation resembles most closely is
Everett's, but at the same time there are deep and fundamental differences
between the two. As Everett presented it, his interpretation did not envisage
a literal spatio-temporal splitting of the world on measurement, but a branching
only of the observer's state, with the result that each element of the superposition of states which describes the observer finds itself on a separate
branch (Everett, 1957, p. 146). Although Healey (1984) cautions against it, the
interpretation is generally understood today as involving full-blooded spatiotemporal branching. DeWitt describes the theory as the many-worlds or
"many-universes" interpretation, and speaks of the world as "constantly splitting into a stupendous number of branches" (1972, p. 178).
The branched interpretation of this paper differs from the many-worlds interpretation in the following respects.
(1) In the many-worlds interpretation there are multiple present states of the
world, resulting from past measurement interactions. In the branched interpretation, the present and past are unique, and only the future is branched.
(2) In the many-worlds interpretation new branches are continually being
created by measurement-like interactions (DeWitt, 1970, p. 161), whereas
in the branched interpretation all branches have existed since the creation
of the universe. The flow of time is objectively modelled in the branched
interpretation as the progressive vanishing of all branches but one at the first
branch point to reveal a unique past. In this interpretation, the universe has
the form of a tree which loses branches to reveal a progressively lengthening
trunk.
(3) The many-worlds interpretation has difficulty interpreting probabilities of quantum events. In the branched interpretation the probability of a future
event is given by the proportion of branches on which the event in question
occurs. Since branch selection is random, probability = proportionality.
M. Marion and R. S. Cohen (eds.), Quebec Studies in the Philosophy of Science I, 229-241.
© 1995 Kluwer Academic Publishers.


(4) The branched interpretation accounts for quantum entanglement, and for
the distant correlations of measurements performed at different locations
upon an entangled system, by the fact that two different measurement outcomes
may occur on the same branch, or on a fixed proportion of branches, no
matter how far apart the measurements may have been made. Consequently,
the joint probability of two distant events may not equal the product of the
probabilities of the events taken separately, and in that case it may appear as
if the occurrence of one event influences the occurrence of the other. This nonlocal character of the interpretation arises from the fact that all splitting of
branches is along space-like hyperplanes.
(5) The many-worlds interpretation was specifically devised to eliminate
the need for von Neumann's projection postulate. It will appear to each
observer on each branch that the measured system has "jumped" into an eigenstate of the measured observable, but in fact all branches corresponding to
the initial superposition are equally actual and no jump has taken place (Everett,
1957, p. 146). In the branched interpretation on the other hand, one and only
one branch is selected as actual and state vector reduction occurs when branch
attrition takes place within a specific kind of branched structure. This is the
physical feature which models "collapse".
In what follows these points of difference between the branched and the
many-worlds interpretation will be enlarged upon and made explicit. Before
starting, it might be asked in what sense the word "interpretation" is being
used. To this day, no agreement on a correct interpretation of quantum
mechanics exists. In particular, the principle of superposition poses severe
difficulties in this regard. Everett attempted to explain why observers are never
in fuzzy superpositions of observational states by postulating that every element
of such a superposition is actualized on a separate branch. Those who, following von Neumann, attempt to avoid superpositions of observational states
by positing "reduction" or "collapse", still have the problem of interpreting
"collapse" itself, i.e., specifying what kind of process it is. The aim of the
branched interpretation is to provide what Einstein sought for and was never
prepared to abandon, namely "the idea of direct representation of physical
reality in space and time" (Einstein, 1940, quoted in Fine, 1986, p. 97). In
its branching space-time structure the interpretation gives one possible answer
(cf. Healey, 1989, p. 6 and Hughes, 1989, p. 296) to the question, "What
would the world have to be like, in order for quantum mechanics to be
true?"
(1) Shape of the Model. Roughly and broadly, the many-worlds interpretation in its branching space-time version envisages a model in the shape of
a bush, with no distinguished branch and with multiple present moments,
each containing perfect or slightly imperfect copies of oneself and other
existing individuals. In the branched interpretation by contrast there is only
one present and one past, and all branching lies in the future. The model is
tree-like, not bush-like. Every path through the tree is a complete four-dimensional space-time manifold, and all branching is along space-like hyperplanes


or hypersurfaces, so that the upper limit-point of the intersection of any two


branches is a hypersurface. Because special relativity denies that there is any
unique global set of instants simultaneous with "now", there will be a family
of parallel space-like hyperplanes orthogonal to the time axis of every coordinatization of space-time. Consequently there will be many different branching
tree-like structures, one for each coordinate frame.
Although the fact that the exact branching structure of each model is relative
not only to a time but also to a coordinate frame might appear to be a complication which makes the branched interpretation unworkable, such is not
the case. In fact, as will be seen, the frame-dependent nature of each model
is exactly what is needed to deal with the reciprocal non-local influence of
distant measurement outcomes in the EPR experiment.
It can be shown, although it will not be shown here, that every frame-dependent model is a picture in a given coordinate frame of one and the same
frame-independent reality. This reality, which can be regarded as the (branched)
set of all point-events in the universe, underlies each and every one of its frame-dependent models. Independently of any coordinate frame or metric, the set
of all point-events of the universe forms a branched topological space, of which
a frame-invariant or Lorentz-invariant description can be given. For present
purposes, however, only the frame-dependent models are important, and it will
be assumed here that exactly the same set of point-events (i.e. the Universe)
can be suitably arranged in any one of these branched four-dimensional frame-dependent structures.
(2) Branch Attrition. The second difference between the two interpretations is that in Everett's new branches are continually being added, while
in the branched interpretation all branches were there from the beginning.
The only thing that "happens" in the branched interpretation is that the
branches drop off. This branch attrition is the "process" that the universe undergoes, and, because it occurs in each frame-dependent model, it too is
frame-dependent.
Albert and Loewer criticize the fact that the branches of the many-worlds
interpretation constantly sub-divide and split as implying, if taken literally, a
violation of mass-energy conservation (1988, pp. 200-1). And in view of the
fact that what Keith Campbell would call the "ontological gross tonnage" of
the branched model is constantly diminishing through branch attrition, perhaps
Albert and Loewer would also make the same objection to the branched interpretation. However, the principle of the conservation of mass-energy was never
designed to apply to branched four-dimensional structures. Within each four-dimensional branch it applies, to be sure, but not to the whole model.
(3) Quantum Probabilities. No natural or even consistent account of
quantum probabilities is given by the many-worlds interpretation. If I am
observing an atom of radium which may or may not decay, it makes no sense
to say that the probability of my observing its decay in the next 30 minutes
is p, since according to the many-worlds interpretation in 30 minutes' time I
will be on a branch on which I observe its decay, and I also will be on a


branch on which I do not observe its decay. The concept of probability plays
no role here, since what will happen in the future is certain. (Cf. Albert and
Loewer, 1988, p. 201.) In the branched interpretation on the other hand
the notion of probability is central, and the interpretation provides not
only a physical meaning, but in addition a precise value, to all quantum
probabilities.
In the branched interpretation the probability of any future event, including
quantum events, is given by the proportion of branches on which the event
in question is located. For example, suppose that a vertically-polarized photon
is entering a two-channel polarizer which is inclined at an angle φ to the
vertical. This polarizer has two exit channels, φ+ and φ-, and quantum theory
tells us that the probabilities of the photon exiting in the φ+ and φ- channels
are cos² φ and sin² φ respectively. In the branched model there are two kinds
of branch above the space-time point A where the photon enters the polarizer, and the proportionality of φ+ branches to φ- branches is cos² φ to sin² φ
(see Figure 1).

Fig. 1.

There is a problem here, as was first pointed out to the author by Bas van
Fraassen, and the problem stems from the fact that in general cos² φ and
sin² φ will not be rational numbers. If they always took rational values n/m
then finite branching at point A would serve to represent these values, but if
cos² φ is irrational then infinite branching at A would seem to be required,
and Cantor showed long ago that there are no fixed proportionalities in infinite
sets. (Consider for example, what the proportion is of even numbers in the
set of natural numbers.) It would appear that if the set of branches at A were
infinite, no means would exist of designating exactly cos² φ of them as φ+
branches.
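The parenthetical remark about the even numbers can be made concrete with a short sketch. It is an illustration only, not part of the author's argument, and the function names are hypothetical. The code counts the fraction of even numbers among the first n items of two different enumerations of the same set of natural numbers; the fraction settles on a different value in each case, which is one way of seeing why a bare infinite set carries no fixed proportionality.

```python
from fractions import Fraction
from itertools import count, islice

def standard_order():
    """Enumerate the naturals as 0, 1, 2, 3, ..."""
    yield from count(0)

def skewed_order():
    """Enumerate the same naturals as odd, odd, even, odd, odd, even, ..."""
    odds, evens = count(1, 2), count(0, 2)
    while True:
        yield next(odds)
        yield next(odds)
        yield next(evens)

def even_fraction(enumeration, n):
    """Exact fraction of even numbers among the first n items."""
    first_n = islice(enumeration, n)
    evens = sum(1 for k in first_n if k % 2 == 0)
    return Fraction(evens, n)

print(even_fraction(standard_order(), 3000))  # 1/2
print(even_fraction(skewed_order(), 3000))    # 1/3
```

Both generators eventually list every natural number exactly once, so the difference lies in the ordering and not in the set.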
There is however a way in which an infinite set of branches can contain
subsets of fixed proportionality corresponding to any real number between 0
and 1, despite what Cantor says. In the branched interpretation there is at
point A a highly ramified branched structure of a certain specific kind. This
structure is of temporal height Δt, where Δt can be as small as we please.
The branched structure, which we shall call a "decenary tree", takes the following form. At time t, when the photon first enters the polarizer, the decenary
tree branches in 10, and each branch branches in 10 at each of the subsequent times t + 1/2 Δt, t + 3/4 Δt, t + 7/8 Δt, .... By t + Δt there will be 10^ℵ₀
branches. Of this non-denumerably infinite set, there exist subsets of any
desired proportionality.
Suppose for example cos² φ = 0.75, and a decenary tree on which exactly
3/4 of the branches are φ+ branches is desired. On such a tree, seven of the
first ten branches which split at t are φ+, and two are φ- branches. One
branch is neither φ+ nor φ-, but at time t + 1/2 Δt, when it divides in ten,
five of the new branches are φ+ and five are φ-. Once a branch is φ+ or
φ- (i.e. once the photon emerges in the φ+ or φ- channel on that branch)
then it and all its descendants in the decenary tree remain so until t + Δt.
Counting up now, by time t + 3/4 Δt seventy-five of the first 100 branches in
the decenary tree are φ+, and twenty-five are φ-. This proportionality remains
until t + Δt, hence exactly 3/4 of the non-denumerable infinity of branches
in the tree are φ+.
If cos² φ is irrational, say equal to π/10 or 0.314159 ... , then a decenary
tree in which exactly π/10 of the branches are φ+ takes the following form.
Three of the first ten branches are φ+ and six φ-, with one unlabelled; of
the next ten, one is φ+ and eight φ-; and so on. Given any decimal between
zero and one, we can find a decenary tree corresponding to it. Conversely,
given a decenary tree containing φ+ branches, we can find the decimal which
gives the exact proportionality of the set of φ+ branches.
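The correspondence between decimals and decenary trees can be sketched in code. This is a minimal illustration under stated assumptions, not the author's own formalism: the function names are hypothetical, only a finite prefix of the decimal expansion is processed, and a flag marks whether the decimal terminates at the last digit supplied. Each digit d yields d branches labelled φ+ and, unless it is the final digit of a terminating decimal, one unlabelled branch that splits again at the next level.

```python
from fractions import Fraction

def decenary_levels(digits, terminating):
    """For each decimal digit, give (plus, minus, unlabelled) counts
    among the ten branches created at that level of the tree."""
    levels = []
    for i, d in enumerate(digits):
        is_last = terminating and i == len(digits) - 1
        if is_last:
            levels.append((d, 10 - d, 0))   # no branch is left undecided
        else:
            levels.append((d, 9 - d, 1))    # one branch splits again later
    return levels

def plus_proportion(levels):
    """Proportion of all branches labelled plus after the given levels."""
    total = Fraction(0)
    undecided = Fraction(1)  # share of the tree not yet labelled
    for plus, _minus, unlabelled in levels:
        total += undecided * Fraction(plus, 10)
        undecided *= Fraction(unlabelled, 10)
    return total

# The 0.75 example from the text: seven plus, two minus, one unlabelled; then five and five.
print(decenary_levels([7, 5], terminating=True))                   # [(7, 2, 1), (5, 5, 0)]
print(plus_proportion(decenary_levels([7, 5], terminating=True)))  # 3/4

# First six digits of pi/10: three plus, six minus, one unlabelled; then one plus, eight minus.
pi_levels = decenary_levels([3, 1, 4, 1, 5, 9], terminating=False)
print(pi_levels[:2])               # [(3, 6, 1), (1, 8, 1)]
print(plus_proportion(pi_levels))  # 314159/1000000
```

For a non-terminating decimal the running proportion approaches the target value from below as more digits are supplied, which mirrors the way the tree fixes the proportionality level by level.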
The branched structure, henceforth called a prism, which gives the photon
its probability, need not be a decenary tree. A binary or ternary or any other
proportionality-preserving tree would do as well. We humans prefer decenary
trees because we have ten fingers, but nature decides, not us, and it is part
of the branched interpretation that there is a fact of the matter as to which
type of prism is located at which space-like hyperplane, giving the quantum
events on that hyperplane their probabilities. On every hyperplane there will
be many such events or "eventualities", as they are called by Howard Stein:
a photon passing through a polarizer on earth, a radioactive atom about to
decay on Venus, an electron moving to a lower energy level in a hydrogen
atom on Mars. All these eventualities, on the branched model, are given their
probability values in the interval Δt by one and the same prism, which though
very short in temporal extension is very broad in spatial extension.
At the tip of every branch of every prism there sits another prism, so that
the entire branched model is a stack of prisms. Since probability values
multiply going up the tree, and add across disjoint subsets of branches at
any level, the model gives a fixed and precise value to the probability of
any future event.
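The rule that probability values multiply going up the tree and add across disjoint sets of branches can be sketched as a short recursion. The representation below is an assumption made for illustration (a prism is a list of (proportion, label, sub-tree) triples and a branch tip is an empty list); the two-level toy stack and its labels are hypothetical, with proportions borrowed from the case of a polarizer angle of 30 degrees.

```python
from fractions import Fraction

def event_probability(node, event, path=()):
    """Sum, over all branch tips where `event` holds, of the product of
    the proportions met along the path from the base to that tip."""
    if not node:  # branch tip reached
        return Fraction(1) if event(path) else Fraction(0)
    return sum(
        (p * event_probability(child, event, path + (label,))
         for p, label, child in node),
        Fraction(0),
    )

# Hypothetical stack of two prisms: left outcome first, right outcome second.
stack = [
    (Fraction(1, 2), "v", [(Fraction(1, 4), "phi+", []), (Fraction(3, 4), "phi-", [])]),
    (Fraction(1, 2), "h", [(Fraction(3, 4), "phi+", []), (Fraction(1, 4), "phi-", [])]),
]

print(event_probability(stack, lambda path: "phi+" in path))            # 1/2
print(event_probability(stack, lambda path: path == ("v", "phi+")))     # 1/8
```

Finite lists stand in here for the infinite branch sets of the text, so the sketch captures only the arithmetic of proportions and not the structure of a prism itself.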
To sum up, probability values for future quantum events, and indeed for
all future events, are built into the branched space-time structure of the interpretation. The probability of a future event is given by the proportionality of
the set of branches on which the event is located. Unlike the many-worlds
interpretation, where every quantum mechanically possible future event
gets actualized on some branch or other, the branched interpretation permits


only very few possible future events to become actual. Where p(e) is the
probability for e to become actual, the value of p(e) is given by branch
proportionality.
(4) Non-Locality. Suppose, in a version of the EPR experiment which
measures the plane polarization of two photons in a singlet state, that the
left measuring device is a two-channel horizontal-vertical polarizer, while
on the right there is a similar polarizer set at an angle <I> to the vertical (see
Figure 2).

Fig. 2.

In this experimental set-up there are four different possible joint outcomes,
namely (v, φ+), (v, φ-), (h, φ+), (h, φ-), with probabilities 1/2 sin² φ, 1/2 cos² φ,
1/2 cos² φ and 1/2 sin² φ respectively. In the branched interpretation there
will be four different kinds of branch, and the probabilities of the joint
outcomes on left and right are given by the proportionality of the respective
branch sets. Denoting the event consisting of the left photon entering its
polarizer by A, and the corresponding event for the right photon by B, the
branched model for the frame of reference in which A and B are simultaneous takes the form as shown in Figure 3.


Fig. 3.

Except for the case where φ = 45°, which yields p(v, φ+) = p(v, φ-) = p(h, φ+)
= p(h, φ-) = 1/4, it is apparent on inspection of this diagram that the outcomes
of the measurements at A and B cannot be statistically independent. The condition for the two outcomes v on the left and φ+ on the right to be independent
is that p(v, φ+) = p(v) × p(φ+). In the terminology of Fine (1980, p. 536),
the two outcomes are independent if the probability of the joint outcome
(v, φ+) is "factorizable" into the product of the probability of v with the
probability of φ+. But except where φ = 45° this equality fails. For example
if φ = 30°, p(v, φ+) = 1/2 sin² φ = 1/8 and p(v, φ-) = 1/2 cos² φ = 3/8. Then


p(v) = p(v, φ+) + p(v, φ-) = 1/8 + 3/8 = 1/2; p(φ+) = p(v, φ+) + p(h, φ+) =
1/8 + 3/8 = 1/2; and p(v) × p(φ+) = 1/4 ≠ p(v, φ+). The two measurements at A and B are space-like separated, but their outcomes are not
independent.
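The factorizability check worked through above for 30 degrees can be repeated for any polarizer angle. The following is a minimal sketch, with hypothetical function names, that builds the four joint probabilities quoted in the text, computes the marginals by summation, and tests whether each joint probability equals the product of its marginals.

```python
import math

def joint_probabilities(phi_degrees):
    """Joint outcome probabilities for the left (v/h) and right (+/-) polarizers."""
    phi = math.radians(phi_degrees)
    sin2, cos2 = math.sin(phi) ** 2, math.cos(phi) ** 2
    return {
        ("v", "+"): 0.5 * sin2,
        ("v", "-"): 0.5 * cos2,
        ("h", "+"): 0.5 * cos2,
        ("h", "-"): 0.5 * sin2,
    }

def is_factorizable(joint):
    """True if every joint probability is the product of its two marginals."""
    for (left, right), p in joint.items():
        p_left = sum(q for (l, _), q in joint.items() if l == left)
        p_right = sum(q for (_, r), q in joint.items() if r == right)
        if not math.isclose(p, p_left * p_right, abs_tol=1e-12):
            return False
    return True

print(is_factorizable(joint_probabilities(30)))  # False
print(is_factorizable(joint_probabilities(45)))  # True
```

Floating-point values are compared with a tolerance, since sin² and cos² of 45 degrees are not represented exactly.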
The existence of a statistical correlation between the separate measurements
performed at A and B would not be puzzling if a common-cause explanation
or a local hidden variables explanation of the correlation could be given. But
Bell's theorem rules this out. A non-local hidden variables explanation would
presumably involve the transfer of information between the two polarizers at
faster-than-light velocities, and such transfers contradict relativity theory. At
present we seem to be left with no explanation at all of the distant correlations revealed by the EPR experiment.
One possible reaction to this situation would be simply to accept the correlations as brute fact, and forego any attempt to provide an explanation of
them. This course of action is proposed in Fine (1989). However, an explanation of the non-local reciprocal influence exhibited by the measurements
performed at A and B can be extracted from the branched interpretation. The
kind of explanation that the interpretation provides emerges most clearly if
we consider a frame of reference in which the left measurement of the EPR
experiment precedes the right measurement.
Assume that the left polarizer is positioned close to the source on earth,
and the right polarizer on the moon. The left photon passes through its polarizer and is measured v. Here it is natural to take a frame of reference in
which the measurement event A on earth occurs before B. Given that the left
photon has been observed to be vertically polarized, any subsequent measurement on the right photon will reveal it to be polarized horizontally. If it
is passed through a φ polarizer it will have probability sin² φ of being
measured φ+. If it is passed through an HV polarizer it will invariably be
measured h. In every conceivable respect, the right photon behaves exactly
as if it had the property of being horizontally polarized. The question is,
when did it acquire this property? Consider the following three questions,
derived from Mermin (1985, pp. 46-47):
(i) Did the right photon have the property of being horizontally polarized
prior to the measurement performed on the left photon? The answer would
seem to be no. If for example the left photon had not been measured v, but
had emerged from its polarizer in the h channel, then the right photon would
not have had the property of being horizontally polarized, but would instead
have been vertically polarized. If at the last moment the left polarizer is
removed, so that no measurement at all is performed on the left photon, then
tests on the right photon reveal no consistent angle of polarization. The photon
passes + or - through polarizers at any angle with equal frequency. All the
evidence suggests that the right photon did not have the property of being
horizontally polarized before the left measurement was performed.
(ii) Did the right photon have the property of horizontal polarization after
the measurement on earth? Yes. No matter what tests are performed on the


right photon after the left measurement, it will behave like a horizontally polarized photon.
(iii) If the right photon did not possess the property of being horizontally
polarized before the left measurement, but did possess it afterward, how did
it acquire the property, given that it was then travelling towards the moon?
Alternatively, how did it receive instructions (as it apparently did) about how
to behave when it encountered the right polarizer?
In the branched interpretation, question (iii) receives an answer that is
both simple and natural. It is this. The measurement performed on the left
photon has two possible outcomes, represented on the branched model by
two sets of branches. On one of these sets the left photon is measured as
being vertically polarized, and on all the branches which belong to this set
the right photon behaves like a horizontally polarized photon. For example,
its probability of passing φ+ on the moon is sin² φ. But on the other set,
where the left photon is horizontally polarized, the right photon behaves
as if it were vertically polarized. Its probability of passing φ+ is cos² φ, not
sin² φ. In a frame of reference in which the measurement event A precedes B
the branched model takes the form as shown in Figure 4.

Fig. 4.

At the instant the left photon passes through its polarizer, one and only
one of the branches which divide along the hyperplane containing A is selected,
and the rest vanish. If the surviving branch is a v-branch, the right photon
behaves exactly as if it were horizontally polarized - on that branch, in fact,
it is horizontally polarized. If the surviving branch is an h-branch, then the
right photon is vertically polarized. Branch attrition ensures that one and
only one branch is selected, and hence that one and only one of these joint
outcomes is realized. Through the mechanism of branch selection, a measurement performed on earth can instantaneously bestow a property on a photon
which is travelling to the moon.
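One way to read the arithmetic of branch selection is as conditioning on the surviving set of branches. The sketch below is an illustration under that assumption, with hypothetical names; it uses the branch proportions of the 30 degree example, discards every branch whose left outcome differs from the one realized, and renormalizes what remains.

```python
from fractions import Fraction

# Branch proportions for a polarizer angle of 30 degrees, as in the worked example.
JOINT = {
    ("v", "phi+"): Fraction(1, 8),
    ("v", "phi-"): Fraction(3, 8),
    ("h", "phi+"): Fraction(3, 8),
    ("h", "phi-"): Fraction(1, 8),
}

def surviving_proportions(joint, left_outcome):
    """Drop every branch with a different left outcome, then renormalize."""
    survivors = {right: p for (left, right), p in joint.items() if left == left_outcome}
    total = sum(survivors.values())
    return {right: p / total for right, p in survivors.items()}

print(surviving_proportions(JOINT, "v"))  # phi+ has proportion 1/4, phi- has 3/4
print(surviving_proportions(JOINT, "h"))  # phi+ has proportion 3/4, phi- has 1/4
```

After a v outcome on the left, the proportion of φ+ branches equals sin² of 30 degrees, and after an h outcome it equals cos² of 30 degrees, in agreement with the values stated in the text.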
Suppose now that the experimental arrangement on earth contains an ultrasonic switch, as in the latest Aspect version of the EPR experiment, so that


at the last moment the left photon can be diverted to a φ polarizer instead
of an HV polarizer. Here again, if the left photon were measured φ+, the
property of being polarized in the φ- direction would be instantaneously
bestowed on the right photon, and the proportion of branches on which it
was measured φ+ on the moon would be zero. The branched diagram for this
experimental arrangement would be a little more complicated in that there
would be four types of branch for the left outcomes (v, h, φ+ and φ- instead
of just v and h), but the principle would be the same.
The branched interpretation, then, can explain how it is that a measurement performed on earth can instantaneously affect the outcome of a
measurement performed on the moon, without there being anything which
"travels" from one location to the other. Since the two measurement events
A and B are space-like separated, there will also be a frame of reference in
which the measurement on the moon precedes that on earth. A branched model
constructed according to this coordinate frame will show how the moon measurement can instantaneously affect the earth measurement. The three different
models, in which A is respectively simultaneous with, earlier than, or later than
B, are all relativistic variants of one another - i.e. all pictures of one and the
same set of events in different coordinate frames. The fact of there being equivalent frame-dependent models of the same underlying spatio-temporal reality
is what provides an explanation of the unique reciprocal influence exerted
on one another by the left and right measurements in the EPR experiment. 1
In their (1978, p. 1883), Clauser and Shimony state that the experimental
violation of Bell's inequality would appear to entail that in physics either
the thesis of realism or that of locality must be abandoned. They remark that
"Either choice will dramatically change our concepts of reality and of spacetime". In the branched interpretation it is locality, not realism, that is sacrificed.
The branched model is completely realistic, and the fact that two or more
distant events can be linked by their co-presence (or co-absence) on branches
which separate along space-like hyperplanes means that non-locality is one
of the model's central features.
(5) Reduction of the State-Vector. In the EPR experiment with two photons,
measurement of the angle of polarization of one photon instantaneously affects
measurement of the other. Thus if the outcome of the first measurement on
earth is v, the probability of the outcome φ+ on the moon is sin² φ, whereas
if the first outcome had been h, the probability of the second outcome would
have been different. This is, for friends of "collapse", a paradigm case of
state vector reduction. At the instant at which the first measurement is made,
the state vector is reduced (relative of course to a coordinate frame, which
determines which measurement is "first"). This instantaneous collapse is
modelled in the branched interpretation by branch attrition. Since future
branches vanish instantaneously along the entire length of the hyperplane at
which they intersect the actual world, branch attrition, i.e. state vector reduction, is necessarily a non-local phenomenon. In the branched model, it is
both non-local and relativistic.


A qualification must be made - not all branch attrition is state vector


reduction. In the branched model, there are two different kinds of prism stacks.
(A prism stack, of finite temporal height, is a branched structure composed
entirely of prisms, with a single prism at the base and another prism located
at the tip of every branch of every prism except those in the topmost row.)
Let S be any quantum or macroscopic physical system, and let w be some
state of S. (If S is a quantum system, w might be an eigenstate of some quantum
dynamical variable O of S; if S is macroscopic, w might be a state such as
"pointing north-east".) Let S be in a certain state (which may or may not be
w) at the base node of a prism stack P, and consider the state of S at the top
of P. If S is in w at the tip of every branch of P, or not in w at the tip of
every branch, then we shall say that P is a U-type prism stack for S with respect
to w (see Figure 5). If on the other hand S is in w at the tip of some branches
of P, and not in w at the tip of others, then we say that P is an R-type prism
stack for S with respect to w.
Fig. 5. (Three prism stacks, labelled U-Type, U-Type and R-Type.)

With respect to any state w which it is physically possible for S to move


into over the time interval t, every prism stack of temporal height t which
has S at its base node will be either a U-type or an R-type prism stack. If S
is at the base node n of a U-type prism stack, and if the hyperplane n is
"present", so that the prism stack is about to begin the process of branch
attrition, then S undergoes unitary or Schrodinger evolution over the interval
t. If on the other hand S is at the base node of an R-type prism stack, then
S undergoes measurement over t.
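The definition of U-type and R-type prism stacks amounts to a simple test on the branch tips. The sketch below is illustrative only: the function name is hypothetical, and a finite list of tip states stands in for what the text describes as an infinite set of branches.

```python
def classify_prism_stack(tip_states, w):
    """Classify a prism stack for a system S with respect to state w.

    tip_states lists the state of S at the tip of every branch of the stack.
    """
    if not tip_states:
        raise ValueError("a prism stack needs at least one branch")
    in_w = [state == w for state in tip_states]
    if all(in_w) or not any(in_w):
        return "U-type"   # S is in w on every tip, or on none
    return "R-type"       # S is in w on some tips and not on others

print(classify_prism_stack(["w", "w", "w"], "w"))       # U-type
print(classify_prism_stack(["x", "y", "x"], "w"))       # U-type
print(classify_prism_stack(["w", "x", "w", "x"], "w"))  # R-type
```

On this reading, unitary evolution corresponds to the first two cases and measurement to the third.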
The difference between these two types of prism stack is a purely objective feature of the branched model. Therefore, with respect to any future
possible state w, a system S either undergoes Schrödinger evolution or measurement. Whether it be one or the other, is in the branched interpretation a
matter of empirical fact. The difference between U-type and R-type prism
stacks having been established, we can define state vector reduction as branch
attrition within an R-type prism stack. Since branch selection is random, if
the system S is not in w at the base of an R-type stack, it may "jump" into
w abruptly and unpredictably as branch attrition proceeds. Or again it may not.
Whether it does or not, is a matter of chance. But if on the other hand S
finds itself at the base node of a U-type prism stack, nothing unpredictable
or chance-like will happen. Branch attrition will take place, but the state of
S will evolve smoothly and predictably. Schrödinger evolution is deterministic,


and measurement is indeterministic. Given the difference between U-type


and R-type prism stacks, the branched interpretation is able to provide an objective physical model of state vector reduction.
With regard to the measurement problem, the branched interpretation is a
concrete representation of physical reality in space and time, and therefore
medium-sized physical bodies are not to be found in macroscopic superpositions on any of its branches. In the branched interpretation, a pointer on the
dial of a laboratory instrument cannot point in two different directions at
once or be in a fuzzy superposition of states. The branched interpretation
provides a mechanism which requires macroscopic superpositions to be completely and immediately reduced. To employ Shimony's colourful terminology,
the treatment of macroscopic superpositions in the branched interpretation
resembles contraception rather than abortion (Shimony, 1989a, p. 36). In the
branched interpretation the reduction of superpositions, when it takes place,
is instantaneous and complete. It resembles the theory of Ghirardi, Rimini,
Weber, and Pearle in treating reduction as an objective physical process, but
the process differs from theirs in being complete and leaving no tails or vestiges
of "what might have been" (Ghirardi, Rimini and Weber, 1986, 1988; Pearle,
1989; cf. Albert and Loewer, 1990, p. 284).
In conclusion, two further differences between the branched and the many-worlds interpretations must be mentioned. The first is related to the "preferred
basis" objection to the many-worlds interpretation (Hellman, 1984, p. 564;
Shimony, 1986, p. 201). According to the many-worlds interpretation, what
occasions the splitting of one branch of the universe into many branches is
a measurement. But in quantum theory, there are as many different possible
"measurements" as there are observables. If O and O' are two different observables, does the universe split into eigenstates of O or into eigenstates of O'?
If a supporter of the many-worlds interpretation answers that the universe splits
into eigenstates of O, then he or she must provide a reason why the universe
prefers this basis rather than the basis for O', or for O'' or O'''. But on this matter
the many-worlds interpretation has nothing to say. The universe divides, but
the many-worlds interpretation cannot specify how it divides.
In the case of the branched interpretation, however, the situation is different.
At any node n, the precise way in which the branches divide is specified in
detail. In particular, whether at n the branches divide into eigenstates of O,
or of O', or of both, or of neither, will be a matter of empirical fact, determined by what events are located on the branches. In the branched interpretation, there is no "democracy of bases" which governs branching (Albert
and Loewer, 1988, p. 201).
If a free electron is moving through empty space, its wave function evolves
according to the Schrödinger equation. There is branching, certainly, but
the particle is faced with a U-type prism stack, not an R-type stack with
different spin, position or momentum eigenstates on its branches. Only if,
for example, the electron moves into an environment which contains a Stern-Gerlach apparatus or is otherwise "reactive" to spin (Shimony, 1989b,
p. 380) will it be faced with an R-type stack. There is, in sum, no "democracy of bases" in the branched interpretation, but at each point every physical
system is faced by a specific U-type or R-type prism stack for each dynamical variable which characterizes it.
Finally, the branched interpretation differs from the many-worlds interpretation in making Heisenberg's "transition from potentiality to actuality" one
of its central features (Heisenberg, 1958, pp. 54-55). The multiplicity of future
branches represents what is potential, the single trunk represents what is actual,
and branch attrition represents the transition. (Cf. the "selection process"
referred to in Stapp, 1991). In connection with the idea of the passage from
potentiality to actuality, Howard Stein cautions us that such an idea, although
attractive in itself, "should be entertained only with the reservation that we
do not yet know of any clear case that can be characterized as such a passage"
(Stein, 1982, p. 576). It is suggested in this paper that the vanishing of branches
in the branched interpretation is the "clear case" to which Stein alludes.
McGill University
NOTE
1 In the same way, the branched interpretation accounts for correlations in the spin measurements on entangled three-particle systems in the GHZ thought-experiment (see Greenberger,
Horne, Shimony and Zeilinger, 1990, Mermin, 1990, and Clifton, Redhead and Butterfield, 1991).
Suppose that a system of three spin-1/2 particles is emitted in the state $\psi = \frac{1}{\sqrt{2}}\left(|1, 1, 1\rangle - |-1, -1, -1\rangle\right)$, where 1 or -1 denotes spin-up or spin-down in the direction of propagation, i.e.
along the z-axis of each particle. If measurements are made of the x-component of spin of each
particle at different locations, then quantum theory predicts that the product of such spin measurements will always equal -1. As a consequence, if one particle is observed to be spin-up,
then this measurement will constrain the x-spins of the other two particles to be different, whereas
if it is observed to be spin-down, then the x-spins of the other particles will be the same. The
answer to the question, how can a measurement of one particle's spin instantaneously affect
the outcomes of measurements performed on the others, is the same as in the EPR experiment.
In the branched interpretation there are no branches with the joint outcomes (1, 1, 1), (1, -1, -1),
(-1, 1, -1) or (-1, -1, 1) for measurements of the x-component of spin of the three particles.
The lack of branches of these four kinds, combined with branch attrition, is what produces the
instantaneous non-local influence of one measurement upon the others. For example, since
there are no (1, -1, -1) branches, if the first particle is measured 1, then if the second particle
is measured -1, the third particle will perforce be measured 1.
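
The quantum prediction used in this note can be checked with a few lines of arithmetic. The following Python sketch is an illustration only and is not part of the branched interpretation itself. It assumes the usual convention for the x-spin states, |+x> = (|1> + |-1>)/sqrt(2) and |-x> = (|1> - |-1>)/sqrt(2), and the names `GHZ`, `x_overlap` and `x_probabilities` are hypothetical helpers introduced here.

```python
from itertools import product
from math import sqrt

# GHZ state of the note in the z-basis: (|1,1,1> - |-1,-1,-1>) / sqrt(2).
GHZ = {(1, 1, 1): 1 / sqrt(2), (-1, -1, -1): -1 / sqrt(2)}


def x_overlap(x_outcome, z_value):
    """Amplitude <x_outcome|z_value> for one spin-1/2 particle.

    Assumed convention: |+x> = (|1> + |-1>)/sqrt(2) and
    |-x> = (|1> - |-1>)/sqrt(2), so only <-x|-1> carries a minus sign.
    """
    sign = -1.0 if (x_outcome == -1 and z_value == -1) else 1.0
    return sign / sqrt(2)


def x_probabilities(state):
    """Probability of each joint outcome when all three x-spins are measured."""
    probabilities = {}
    for outcome in product((1, -1), repeat=3):
        amplitude = 0.0
        for z_values, coefficient in state.items():
            term = coefficient
            for x_outcome, z_value in zip(outcome, z_values):
                term *= x_overlap(x_outcome, z_value)
            amplitude += term
        probabilities[outcome] = amplitude ** 2  # all amplitudes are real here
    return probabilities


probs = x_probabilities(GHZ)
for outcome, p in sorted(probs.items(), reverse=True):
    x_product = outcome[0] * outcome[1] * outcome[2]
    print(outcome, "product", x_product, "probability", round(p, 6))
```

Under these assumptions the four joint outcomes whose product is +1 receive probability zero, which corresponds to the four kinds of missing branches, and each of the four outcomes whose product is -1 receives probability 1/4.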

REFERENCES
Albert, D. and Loewer, B., 1988, 'Interpreting the Many-Worlds Interpretation', Synthese 77,
195-213.
Albert, D. and Loewer, B., 1990, 'Wanted Dead or Alive: Two Attempts to Solve Schrödinger's
Paradox', PSA 1990 1, 277-285.
Clifton, R. K., Redhead, M. L. G., and Butterfield, J. N., 1991, 'Generalization of the Greenberger-Horne-Zeilinger Algebraic Proof of Nonlocality', Foundations of Physics 21, 149-184.
Clauser, J. F. and Shimony, A., 1978, 'Bell's Theorem: Experimental Tests and Implications',
Reports on Progress in Physics 44, 1881-1927.
Cushing, J. T. and McMullin, E. (eds.), 1989, Philosophical Consequences of Quantum Theory,
Notre Dame University Press, Notre Dame.
DeWitt, B. S. and Graham, N. (eds.), 1973, The Many Worlds Interpretation of Quantum
Mechanics, Princeton.
DeWitt, B. S., 1970, 'Quantum Mechanics and Reality', reprinted in DeWitt and Graham, 1973,
pp. 155-165.
DeWitt, B. S., 1972, 'The Many-Universes Interpretation of Quantum Mechanics', reprinted in
DeWitt and Graham, 1973, pp. 167-218.
Einstein, A., 1940, 'Considerations Concerning the Fundaments of Theoretical Physics', Science
91, 487-492.
Everett, H., 1957, '"Relative State" Formulation of Quantum Mechanics', reprinted in DeWitt
and Graham, 1973, pp. 141-149.
Fine, A., 1980, 'Correlations and Physical Locality', PSA 1980 2, 535-562.
Fine, A., 1986, The Shaky Game, Chicago.
Fine, A., 1989, 'Do Correlations Need to be Explained?', in Cushing and McMullin, 1989.
Ghirardi, G. C., Rimini, A., and Weber, T., 1986, 'Unified Dynamics for Microscopic and
Macroscopic Systems', Physical Review D34, 470-491.
Ghirardi, G. C., Rimini, A., and Weber, T., 1988, 'The Puzzling Entanglement of Schrödinger's
Wave Function', Foundations of Physics 18, 1-27.
Greenberger, D. M. et al., 1990, 'Bell's Theorem Without Inequalities', American Journal of
Physics 58, 1131-1143.
Healey, R., 1984, 'How Many Worlds?', Nous 18, 591-616.
Healey, R., 1989, The Philosophy of Quantum Mechanics, Cambridge.
Heisenberg, W., 1958, Physics and Philosophy, New York.
Hellman, G., 1984, 'Introduction', Nous 18, 557-567.
Hughes, R. I. G., 1989, The Structure and Interpretation of Quantum Mechanics, Harvard
University Press, Cambridge, Mass.
McCall, S., 1994, A Model of the Universe, Oxford.
Mermin, D., 1985, 'Is the Moon There When Nobody Looks? Reality and the Quantum Theory',
Physics Today, April 1985, pp. 38-47.
Mermin, D., 1990, 'What's Wrong with These Elements of Reality?', Physics Today, June
1990, pp. 9-11.
Pearle, P., 1989, 'Combining Stochastic Dynamical State-Vector Reduction with Spontaneous
Localization', Physical Review A39, 2277-2289.
Shimony, A., 1986, 'Events and Processes in the Quantum World', in Penrose and Isham (eds.),
Quantum Concepts in Space and Time, Oxford.
Shimony, A., 1989a, 'Search for a Worldview which Can Accommodate our Knowledge of
Microphysics', in Cushing and McMullin, 1989, pp. 25-37.
Shimony, A., 1989b, 'Conceptual Foundations of Quantum Mechanics', in Paul Davies (ed.), The
New Physics, Cambridge.
Stapp, H., 1991, 'Quantum Measurement and the Mind-Brain Connection', in P. Lahti and P.
Mittelstaedt (eds.), Symposium on the Foundations of Modern Physics 1990, Singapore.
Stein, H., 1982, 'On the Present State of the Philosophy of Quantum Mechanics', PSA 1982 2,
563-581.

MICHEL J. BLAIS

. . . AND CHAOS SHALL SET YOU FREE . . .

INTRODUCTION

In the long tradition of pondering the possibility of human freedom, no single
event has more influenced the basic framework within which freedom has been
discussed than the advent of Newtonian physics. This elegant and powerful
system, permitting not only the explanation but also the prediction both of
planetary motion and of earthbound trajectories, set the tone of modern thought
in many domains, and unequivocally so in the domain of ethical thought.
Laplace is the philosopher-scientist who set forth the basic postulate of
modern determinism, that the universe is predictable through the use of
appropriate mathematical functions.¹ Once found, these linear functions (differential equations, for Laplace) should guarantee the existence and uniqueness
of the solutions to physical problems. With knowledge of the initial position
of a body and of the force vectors acting upon it, the physicist should in
principle be able to predict its future trajectories and to retrodict its past
movements.
Also, for Laplace, chance is but a name for our ignorance, since his basic
deterministic postulate requires that each and every body answer to mathematically expressed physical laws. Thus, probability theory is useful not
because it derives its validity from the fact that some events are only imperfectly determined and subject to chance, but because it permits us to predict
global macro-physical behavior in systems too complex to permit the calculation of this behavior as the sum of the individual micro-behaviors. This
viewpoint was carried forward by Boltzmann in the field of thermodynamics.
INCOMPATIBILISM AND COMPATIBILISM

Given such a deterministic outlook on the physical universe, some philosophers have argued that human freedom can exist only if humans are not
themselves (or at least, if their spirits or souls or minds are not) subject to
the same laws that govern other material bodies of the universe. This view,
known as incompatibilism, requires that human beings be in some way able
to act for reasons other than purely physical, deterministic ones; it holds that freedom
is incompatible with determinism. Incompatibilism does not require that there
be uncaused events, only that voluntarily caused events not be subject to
physical law. And so, a distinction is often drawn between event-causation and
agent-causation.²

M. Marion and R. S. Cohen (eds.), Quebec Studies in the Philosophy of Science I, 243-258.
© 1995 Kluwer Academic Publishers.

The opposite view - compatibilism, as its name implies - considers freedom
to be compatible with determinism. David Hume is a typical compatibilist
philosopher; rather than deal with the ontological reality behind the terms,
he dissected various uses of them in order to show that in one, legitimate,
sense, we are quite free and our actions are not deterministically necessitated, while in another, illegitimate, sense, we are not and cannot be free.
The legitimate sense of 'free' that Hume recognizes is that usage whereby
one is said to be free if one's action is guided by nothing else but internal
reasons. As long as one's actions are not externally coerced, one can correctly be said to be free, even if the internal motivations themselves are
determined.³ To act freely in Hume's sense means only to act without the influence of any external deterministic cause, and does not require the absolute
absence of deterministic cause. On the contrary, some form of determinism
is required in Hume's view, for otherwise one cannot be said to have done
an act, in the sense of being responsible for it. It is thus possible to be both
free and determined, according to Hume; indeed, the latter is required for
the former to be possible.
For an example of an incompatibilist viewpoint, one can turn to a contemporary figure, Peter van Inwagen, whose construal of freedom is such
that the free individual must be able in some way to act in independence of
physically determined law or to change the course of physically determined
events. For the incompatibilist, Hume's analysis of freedom is tantamount to
saying that it is an illusion, for a Humean free act, although free from external
coercion, is nonetheless internally determined and so not really free from
the incompatibilist's point of view.⁴
DETERMINISM

Both the compatibilist and the incompatibilist require that the universe be determined in the Laplacian sense. Both need determinism, but for different reasons.
Hume needs it for the agent to be responsible. Van Inwagen needs it as an
essential feature that differentiates the free agent from the rest of the physically determined world. Both regard the world as determined, and determinism
in both cases is construed as necessarily implying predictability. Hume's determinism, of course, is grounded a posteriori on experienced constant
conjunction, and he refuses to countenance any ontological reality behind
this perceived conjunction.
In van Inwagen's well-known proposal, determinism is analyzed as the conjunction of two theses:
(a) For every instant of time, there is a proposition that expresses the state of the world at that
instant.
(b) If A and B are any propositions that express the state of the world at some instants, then
the conjunction of A with the laws of physics entails B.⁵

But the laws of physics (the natural laws of the universe) do not include as
a subset the voluntaristic laws governing the behavior of rational agents; otherwise, one could not claim that rational agents are free, since determinism as
defined above excludes any deviation from physical law, and this conception of physical law implies strict (albeit theoretical only, in many cases)
predictability. Also, if a rational agent can exercise his or her will and bring
it about that some proposition P is not true, then P can't be a law of physics,
by definition of 'law of physics,' for laws of physics are stipulated to be
necessarily true - in some non-logical sense of the word - and no agent is
supposed to be able to render a law of nature false. Van Inwagen construes
a law of nature ontologically rather than epistemically; although our knowledge of laws of nature may be imperfect and subject to subsequent revision,
a genuine law of nature is immutable, else it simply is not a law at all. Also,
his view of determinism is quite Laplacian; where P₀ is the proposition that
expresses the state of the world at T₀, P is the proposition that expresses the
state of the world at T, and L is the conjunction into a single proposition of
all laws of physics, "[i]f determinism is true, then the conjunction of P₀ and
L entails P."⁶
Just what does determinism entail? Laplacian determinism implies perfect
reversibility and predictability such that, given the state of a system at time
T₀ and the causal physical laws in question, the state of the system at any
later time T+n (or any earlier time T-n) could be effectively calculated. Any
deviation from calculated results should be attributed to ever refinable
measurement errors, it being assumed that the world is ontologically and
thus rigidly determined. Obviously a small error in determining the initial
conditions can lead to a corresponding error in the resulting calculation; but
the resulting error is supposed to always be roughly commensurate to the initial
error and, more importantly, initial measurement error is supposed to be at least
theoretically eliminable, thereby eliminating the resulting error (see Figure 6
(left), below). Such a deterministic system is considered to be so constrained
by the initial conditions and the inflexible rules governing its evolution that,
were it possible to reverse the flow of time, the system would return to its
exact starting point. With the advent of thermodynamics and the recognition
of the inexorable flow of entropy, reversibility in many deterministic systems
is no longer considered a realistic supposition. But predictability remains to
this day securely linked to determinism, as one can readily see in van Inwagen's
aforementioned proposal.
DETERMINISTIC CHAOS

Traditionally, determinism and chaos have been set in diametrical opposition, for determinism seemed to necessarily imply predictability, whereas chaos
seemed to imply the exact opposite: impossibility of prediction of future
states (and impossibility of retrodiction of past states). Also, the absence of
predictability usually signaled the possible presence of chaotic (random)
fluctuations totally ungoverned by law. The "new science of chaos" has contributed much to the deeper understanding of determinism, of chaos and of
predictability.⁷
Some types of unpredictability are due to random motions that are in effect
unknowable and thus unpredictable: such is the case for Brownian motion
where the progress of a microscopic mote is the result of innumerable collisions on the atomic level; not only is it in practice impossible to take into
account each and every molecule that collides with the mote, it is in principle impossible to do so, because of quantum indeterminacy.⁸ However,
quantum indeterminacy only sets in when the measurement is made; until then, a quantum
system evolves in a perfectly deterministic manner.⁹ Macro-scale unpredictability, such as the spatial path of a rapidly
deflating untethered balloon, has nothing to do with quantum effects, and
although the balloon and a pitched baseball obey the same physical laws,
yet the baseball is relatively predictable in flight (witness the batting averages
of some batting aces), whereas the balloon is not. The difference between
the two, classical physics tells us, depends upon the turbulent flow of the
balloon and the laminar flow of the baseball.
Regular, laminar flow, exemplified by the rush of air around a spinning
baseball or over the wing of a speeding aircraft, is quite predictable and understandable because such flows are calculable through differential equations;
turbulent flows such as the flight of the balloon through the air or the disorderly
crash of water through a rapids are much more difficult to understand and to
predict, since they escape the neat formalisms of the calculus. It was thought
that the unruliness of turbulent systems was directly linked to their basically
chaotic origins, and that chaos was basically unpredictable. But the new science
of chaos has changed the meanings of all these terms.
A rock poised on the crest of a hill is sensitive to initial conditions: a
very small push one way or the other will cause it to wind up in different
valleys. However, this type of sensitivity is restricted to the initial conditions; after the rock has started down one or the other side of the hill, its
behavior is largely controlled by well-known laws of physics, and small deviations in its course down the hillside will cause commensurately small
modifications of its final resting place at the bottom. A chaotic system,
however, is sensitive not only to initial conditions but to conditions at every
point of its evolution.
Another example¹⁰ is that of the exact prediction of the motion of a billiard
ball; assuming no loss of energy, if the gravitational effect of only one electron
at the edge of the galaxy is neglected in the calculation of the ball's motion
among the other balls on the table, the predicted motion is off after only one
minute. Such a large-scale uncertainty stems from the curvature of the balls,
which amplifies small accumulated differences in impact points into large deviations in trajectories. This high sensitivity to small influences is one of the
characteristics of chaos.
The uncertainties inherent in quantum phenomena can always be accommodated by a Laplacian determinist because, although initial measurements
are subject to Heisenberg's quantum indeterminacy, it is still possible to predict
in the long run what the behavior of a given quantum system will be on
average: the probabilities in a sense converge to specific values. But the uncertainties inherent in chaotic phenomena are not so limited and do not permit
such a global convergence, for small influences give rise to very large
inaccuracies in the predicted outcome.
STRANGE AND NORMAL ATTRACTORS

A dynamical system can be described as having two parts: a state (essential
information about the system) and a dynamic (rules for its evolution). These
two parts combine to define a state (or phase) space, an abstract construct
whose coordinates are the components of the state. For a mechanical system,
the variables might be position and velocity; for an ecological system comprising various niches, the variables might be the populations of different
species.
When explaining the concept of attraction, the usual introductory examples
that are presented are pendulums, either damped (subject to friction) or
undamped (idealized as frictionless); the variables are position and velocity,
and a state is a point in a plane whose coordinates are position and velocity.
At either extreme position of its swing, the pendulum's velocity decreases
to zero and then begins increasing in the opposite direction; at the middle of
its swing, it is at maximum velocity. If position is plotted as the x-coordinate and velocity as the y-coordinate, then the state space for a damped
pendulum gives a picture of the orbit traced by the successive points at
regular time intervals.
The damped pendulum describes an ever shortening arc and finally ends
up stationary; the corresponding diagram in state space (see Figure 1) shows
a spiral whose constituent points correspond to the pendulum's velocity and
position. The velocity and the position slowly converge on the origin, and
the origin acts like an attractor for the pendulum in state space: it is in this
case a limit point attractor. In the case of the undamped pendulum, the velocity
and position cycle endlessly between their maxima and zero, and the corresponding diagram in state space shows a limit cycle attractor in the form of
a circle. An attractor is just a compact region of state space to which the set
of all orbits converge, and the set of points that eventually evolve to an attractor
is the attractor's basin of attraction. Complex systems may have any number
of basins of attraction.
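
The two pendulum orbits described above can be traced numerically. The Python sketch below is only an illustration under stated assumptions: it uses the small-angle (linear) pendulum equation x'' = -x - c x' with unit frequency, a classical fourth-order Runge-Kutta step, and hypothetical names (`simulate`, `radius`) and parameter values (c = 0.5, step 0.005) that do not come from the text.

```python
import math


def simulate(damping, steps=20000, dt=0.005, x0=1.0, v0=0.0):
    """Integrate x'' = -x - damping * x' with classical RK4.

    Returns the list of (position, velocity) points, i.e. the orbit in
    state space. The linear restoring force is a small-angle assumption.
    """
    def deriv(x, v):
        return v, -x - damping * v

    x, v = x0, v0
    orbit = [(x, v)]
    for _ in range(steps):
        k1x, k1v = deriv(x, v)
        k2x, k2v = deriv(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
        k3x, k3v = deriv(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
        k4x, k4v = deriv(x + dt * k3x, v + dt * k3v)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        orbit.append((x, v))
    return orbit


def radius(point):
    """Distance of a state-space point from the origin."""
    return math.hypot(point[0], point[1])


damped = simulate(damping=0.5)
undamped = simulate(damping=0.0)
print("damped: final distance from origin =", radius(damped[-1]))
print("undamped: min and max distance =",
      min(map(radius, undamped)), max(map(radius, undamped)))
```

In this idealization the damped orbit spirals in towards the origin (the fixed point), while the undamped orbit stays on a circle of constant radius in the position-velocity plane.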

[Figure 1 graphic not reproduced. Panels: damped pendulum with its orbit in state space; undamped pendulum with its orbit in state space. Labelled axis: velocity.]

Fig. 1. Representation of the state space of a pendulum: damped [left] and undamped [right].

Not every basin of attraction of a system's state space is necessarily visited
on a regular basis. The dynamics of the system may only sporadically cause
it to enter a particular basin, or an external influence may force the system into
the basin at a time when, left to itself, the system would have remained elsewhere in its state space. Basins of attraction may be visualized in two
dimensions as shown in Figure 2.

[Figure 2 graphic not reproduced. Left panel: a simple system's two basins of attraction (A & C). Right panel: a more complex system's four basins of attraction (D, F, H & J).]

Fig. 2. A system's basins of attraction can be pictured as valleys (troughs) separated by crests
(barriers), where the depth of a trough is an indication of the "attractiveness" of the basin and
the height of a barrier that of the relative "difficulty" of leaving the basin; a ball rolling up
and down the inclines will be trapped in a trough unless it has the energy to carry it over a barrier.
Although trough C [left] is lower than trough A, the system cannot reach it unless sufficient
energy is on hand to get it over the barrier B. In more complex systems [right], some troughs
may be visited only rarely, (for example, F) but, once reached, may capture the system for a
lengthy period, keeping it from returning to an inherently more stable trough (for instance, J).

Simple systems such as that of a pendulum have a closed-form solution,
that is, a formula that expresses any future state in terms of the initial state.
This formula serves as an algorithm for calculating any desired point in state
space (position and velocity) at any desired time; the amount of time necessary for computing the future state is roughly independent of how far ahead
the target time lies. It only takes a few minutes or a few hours to calculate the state
of the moon many years hence, and thus to predict, say, the cycle of eclipses.
It is with such solutions in mind that Laplace conceived of his deterministic
ontology; success in predicting various phenomena led him to suppose that
all phenomena were equally predictable as far into the future (or back into
the past) as one cared to carry out the required calculation.
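
What a closed-form solution buys can be seen in a small sketch. The Python example below assumes an undamped linear oscillator (the small-angle pendulum) with angular frequency `omega`; the function name `state_at` and the numerical values are hypothetical and chosen only for illustration. A single evaluation of the formula yields the state at an arbitrarily distant time, the same formula run with a negative time retrodicts the past, and an error in the initial position reappears at the same size in the prediction.

```python
import math


def state_at(t, x0, v0, omega=1.0):
    """Closed-form state (position, velocity) of an undamped linear oscillator.

    The cost of the evaluation does not depend on how large |t| is;
    a negative t retrodicts an earlier state.
    """
    c, s = math.cos(omega * t), math.sin(omega * t)
    x = x0 * c + (v0 / omega) * s
    v = -x0 * omega * s + v0 * c
    return x, v


start = (0.2, 0.0)
later = state_at(1.0e6, *start)      # one evaluation, however distant the time
back = state_at(-1.0e6, *later)      # retrodiction with the same formula
print("state at t = 1e6:", later)
print("recovered initial state:", back)

# An error of 1e-6 in the initial position stays of that size in the prediction.
perturbed = state_at(1.0e6, start[0] + 1e-6, start[1])
error = math.hypot(perturbed[0] - later[0], perturbed[1] - later[1])
print("prediction error:", error)
```

This is the Laplacian picture in miniature: prediction, retrodiction, and a resulting error commensurate with the initial error.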
A system that eventually comes to rest (for example, the damped pendulum)
is characterized by a fixed point in state space. A system that evolves
indefinitely (for example, the driven pendulum of a clock) is characterized
by a periodic limit cycle in state space. (Actually, the clock's pendulum has
two attractors in its state space: the limit cycle to which it is attracted after
having been given a sufficiently generous starting push, and a fixed point to
which it is attracted if the push is insufficient to engage the escapement
mechanism.)
The next most complicated systems have a torus-shaped attractor, where
two independent oscillations drive the motion; one oscillation defines the motion
of the point in state space along the long axis of the torus, and the other defines
the motion along the short axis. Higher dimensional attractors have also been
studied, but their representations require a correspondingly higher number of
state space dimensions; in order to make them "visible," one may choose to
sample the motion at fixed intervals, in effect "intercepting" the orbit with
an orthogonal plane that then records a scattering of points representing the
state of the system at regular intervals.¹¹
In Figure 3, the resulting Poincaré section through a torus (an orthogonal
planar cut) has the shape of a circle which is traced out by a large number
of intersections with the Poincaré plane of the orbit of the attractor in its
phase space. Once the circle is completed, no new information is gained by
examining the section, for the attractor cycles through the same orbit forever.
A Poincaré section through the orbit of a strange attractor, however, displays
a pattern that never repeats itself and that shows more and more self-similar
detail as one magnifies it (see Figure 4).

Fig. 3. Torus-shaped attractor [left] and Poincaré section [right] through the torus, showing
the resulting set of intersecting points on the Poincaré section. Successive blow-ups of a portion
of the Poincaré section show no more detail than the original.

Fig. 4. Strange attractor [left] and associated Poincaré section [right] through the attractor. Successive blow-ups of a portion of the Poincaré section show self-similar but never
identical detail.

Until 1963, fixed points, limit cycles and tori were the only known attractors, and they were all "normal."¹² In that year, Edward Lorenz¹³ published
results showing for the first time how relatively simple systems could display
behavior exhibiting the structure of a "chaotic" or "strange" attractor. In such
an attractor, microscopic perturbations are amplified to affect macroscopic
behavior in such a way that

[t]wo orbits with nearby initial conditions diverge exponentially fast and so stay close together
for only a short time. The situation is qualitatively different for non-chaotic attractors. For
these, nearby orbits stay close to one another, small errors remain bounded and the behavior is
predictable.¹⁴

This amplification has received the name of "butterfly effect," so called
by Lorenz who gained insight into chaotic mechanisms while studying
atmospheric phenomena. The butterfly effect gets its name from the startling
realization that an exceedingly small influence such as the beating of a butterfly's wings in Tanzania can have an enormous effect such as a tornado in
Florida.
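
The amplification can be observed directly in a numerical experiment. The sketch below is an illustration under assumptions that the text does not supply: it uses the standard three-variable Lorenz equations with the conventional parameter values sigma = 10, rho = 28, beta = 8/3, a classical fourth-order Runge-Kutta step of 0.01, and two starting points that differ by one part in 10^8 in a single coordinate. The function names are hypothetical.

```python
import math


def lorenz_step(state, dt, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Advance the Lorenz equations by one classical RK4 step."""
    def deriv(s):
        x, y, z = s
        return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = deriv(state)
    k2 = deriv(shifted(state, k1, 0.5 * dt))
    k3 = deriv(shifted(state, k2, 0.5 * dt))
    k4 = deriv(shifted(state, k3, dt))
    return tuple(
        s + dt * (a + 2 * b + 2 * c + d) / 6
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def separation_history(delta0=1e-8, steps=4000, dt=0.01):
    """Distance between two orbits whose initial x-values differ by delta0."""
    a = (1.0, 1.0, 1.0)
    b = (1.0 + delta0, 1.0, 1.0)
    history = [math.dist(a, b)]
    for _ in range(steps):
        a = lorenz_step(a, dt)
        b = lorenz_step(b, dt)
        history.append(math.dist(a, b))
    return history


history = separation_history()
for step in (0, 1000, 2000, 3000, 4000):
    print("t = %5.1f  separation = %.3e" % (step * 0.01, history[step]))
```

With these assumed values, a difference far too small to measure in practice grows until the two orbits are as far apart as the attractor allows, which is the sense in which a microscopic perturbation comes to affect macroscopic behavior.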
What makes an attractor strange? Its complex structure is dictated by the
fact that its orbit in state space, being non-periodic, must fold in upon itself
indefinitely without crossing because the attractor is a finite object and cannot
diverge forever (if the orbit did recross itself, it would of course be periodic).
The only way the attractor can continuously traverse new regions of its finite
state space is by folding over upon itself over and over again in at least three
dimensions, thereby permitting different paths to pass close to one another
without ever overlapping. An easy way of envisioning this phenomenon is
the "Baker's transformation."
Suppose a baker wishes to dye a ball of dough with vegetable coloring.
To do so, he might inject a spot of color into it and then proceed to stretch
the ball out with a rolling pin. The spot of color in the ball of dough will stretch
out as the baker rolls the dough out and then cuts it in half and superposes
the two halves before rolling it out again. After a number of rolls and cuts,
the individual molecules of color that were quite close together at the beginning will be separated by a large distance. A few dozen of these transformations
will stretch the original spot of color millions of times while flattening it to
molecular thickness.
As can be seen in Figure 5, the baker's transformation can maintain the
basic size of an object while stretching and folding it in such a way that two
proximate points become rapidly separated.

Fig. 5. Baker's transformation. The starting square is flattened and stretched, then cut in half;
the resulting halves are superposed, and the transformation reiterated. Two proximate points
can be rapidly separated by this type of transformation.
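
A minimal sketch of the transformation on the unit square may help. It assumes the standard mathematical form of the baker's map (stretch to double width and half height, cut, and stack the right half on top of the left) and uses exact rational arithmetic so that no rounding error intrudes; the starting points and the gap of 1/2^20 are arbitrary choices made for illustration.

```python
from fractions import Fraction


def baker(point):
    """One step of the baker's transformation on the unit square.

    The square is stretched to double width and half height, cut in half,
    and the right half is stacked on top of the left half.
    """
    x, y = point
    if x < Fraction(1, 2):
        return 2 * x, y / 2
    return 2 * x - 1, (y + 1) / 2


def iterate(point, n):
    """Apply the transformation n times."""
    for _ in range(n):
        point = baker(point)
    return point


eps = Fraction(1, 2 ** 20)                  # initial horizontal gap
p = (Fraction(1, 3), Fraction(1, 3))
q = (Fraction(1, 3) + eps, Fraction(1, 3))

for n in (0, 5, 10, 15):
    pn, qn = iterate(p, n), iterate(q, n)
    print(n, "steps: horizontal gap =", qn[0] - pn[0])
```

For these two points the horizontal gap doubles at every step until the cut falls between them, so fifteen steps enlarge the gap by a factor of 2^15 while the square itself keeps its size.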

There are only three kinds of attractor: equilibrium (fixed point); periodic
(limit cycle) and quasi-periodic (torus); and chaotic.¹⁵ Periodic and quasi-periodic motion, even if very complex, is predictable; chaotic motion is not.
Also, strange attractors require a state space of three or more dimensions.
DETERMINISM WITHOUT PREDICTABILITY

Strange or chaotic attractors are characterized by two basic laws: they are
extremely sensitive to initial conditions, and nearby trajectories rapidly diverge
(see Figure 6). In normal, non-chaotic attractors, small deviations in the
initial conditions will result in corresponding relatively small deviations of
orbits after a given time; in strange attractors, very slight differences in initial
conditions rapidly cause arbitrarily large deviations in the trajectories after
even a short time. In normal attractors, narrowing down the measurement errors
for the initial conditions will correspondingly narrow down the errors in the
calculated outcomes; but in strange attractors no amount of initial precision
will permit a corresponding precision in the calculated outcome.
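
The asymmetry between the two cases can be made quantitative with toy models. The sketch below is an illustration only: instead of a strange attractor in three or more dimensions it uses two one-dimensional maps, the doubling map x -> 2x mod 1 (the horizontal action of the baker's transformation) as the chaotic case and the contraction x -> x/2 as the non-chaotic case. The tolerance of 1/10, the starting point 1/3 and all function names are assumptions made for the example. Exact fractions are used so that the only error present is the initial measurement error.

```python
from fractions import Fraction


def doubling(x):
    """Chaotic toy map x -> 2x mod 1 (exact arithmetic with Fractions)."""
    return (2 * x) % 1


def halving(x):
    """Non-chaotic toy map x -> x/2, whose fixed-point attractor is 0."""
    return x / 2


def circle_distance(a, b):
    """Distance between two points of [0, 1) regarded as a circle."""
    d = abs(a - b)
    return min(d, 1 - d)


def horizon(step, x0, error, tolerance=Fraction(1, 10), max_steps=1000):
    """Number of steps until the prediction error first reaches the tolerance.

    Returns None if the error stays below the tolerance for max_steps steps.
    """
    true_state, measured = x0, x0 + error
    for n in range(1, max_steps + 1):
        true_state, measured = step(true_state), step(measured)
        if circle_distance(true_state, measured) >= tolerance:
            return n
    return None


x0 = Fraction(1, 3)
for digits in (6, 12, 24):
    error = Fraction(1, 10 ** digits)
    print(digits, "digits:",
          "chaotic horizon =", horizon(doubling, x0, error),
          "| non-chaotic horizon =", horizon(halving, x0, error))
```

In the contracting map the prediction error never reaches the tolerance, however long one waits. In the doubling map an initial error of 10^-6 reaches the tolerance after 17 steps and an error of 10^-12 after 37 steps: improving the measurement a millionfold extends the prediction by only twenty steps, and any finite precision is exhausted after a number of steps proportional to the number of digits measured.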

Fig. 6. In a non-chaotic system [left] two nearby trajectories, arbitrarily close together in state
space at the outset, diverge relatively little as they evolve. In a chaotic system, however, [right]
two nearby trajectories diverge quite rapidly. Chaotic systems are very sensitive to slight
differences in initial conditions, whereas non-chaotic ones are not.

Chaotic systems are basically turbulent, as opposed to smoothly evolving
laminar systems that can be understood in linear terms.
The chief qualitative difference between laminar and turbulent flow is in the direction of information flow between the macroscopic and microscopic length scales. In laminar flow, motion
is governed by boundary and initial conditions, no new information is generated by the flow,
hence the motion is in principle predictable. Turbulent motion on the other hand is governed
by information generated continuously by the flow itself, this fact precludes both predictability
and reversibility.16

Normal attractors (such as those associated with laminar flows) are theoretically predictable, because it is possible to extract all the information about
them, once and for all. Strange attractors (such as those associated with turbulent flows) are not theoretically predictable, because they are information
sources: they continuously create information at each folding or bifurcation.
The seeming randomness is the result of the shuffling or folding process.


Because they create information, chaotic attractors reveal ever more detail
as one scrutinizes more finely: this is their self-similar fractal 17 nature.
In a normal, non-chaotic, attractor, small uncertainties give a certain amount
of information that is preserved with time; this is the sense in which they
are predictable, because these systems are not overly sensitive to measurement
errors. Because their trajectories settle down in a finite time either to a fixed
point or an endlessly repeating limit cycle, no information is lost and none
is created. But in a chaotic attractor, initial information is removed and replaced
by new information; instead of mixing up the dough (in the Baker's transformation), the attractor mixes up the state space.
The stretching and folding operation of a chaotic attractor systematically removes the initial information and replaces it with new information: the stretch makes small-scale uncertainties larger,
the fold brings widely separated trajectories together and erases large-scale information. Thus
chaotic attractors act as a kind of pump bringing microscopic fluctuations up to a macroscopic
expression. [...] [t]here is simply no causal connection between past and future. 18
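
The stretching half of this operation can be made concrete with the Baker's transformation of Figure 5. The following is a sketch under stated assumptions: the unit square is stretched to twice its width and half its height, cut in half, and the right half is stacked on the left one. Exact rational arithmetic is an implementation choice made here because repeated doubling quickly exhausts floating-point precision; the starting points are hypothetical.

```python
from fractions import Fraction


def baker(point):
    """One step of the baker's transformation on the unit square.

    Stretch: x is doubled and y is halved.
    Cut and stack: the right half is placed on top of the left half.
    """
    x, y = point
    if x < Fraction(1, 2):
        return (2 * x, y / 2)
    return (2 * x - 1, (y + 1) / 2)


def iterate(point, steps):
    """Apply the transformation the given number of times."""
    for _ in range(steps):
        point = baker(point)
    return point


def horizontal_gap(p, q, steps):
    """Horizontal distance between two points after the given number of steps."""
    return abs(iterate(p, steps)[0] - iterate(q, steps)[0])


if __name__ == "__main__":
    # Two points that start about one billionth apart (illustrative values).
    p = (Fraction(1, 3), Fraction(1, 3))
    q = (Fraction(1, 3) + Fraction(1, 2**30), Fraction(1, 3))
    for steps in (0, 10, 20, 28):
        print(steps, float(horizontal_gap(p, q, steps)))
```

For these starting points the horizontal gap doubles at every step, so a difference far below any realistic measurement precision becomes a quarter of the square's width after 28 steps.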

This results in the unlinking of determinism from predictability. What makes
a system predictable is the possibility of measuring some initial state and
calculating some future state on the basis of the system's dynamic (its governing laws). The ultimate predictability - deterministic predictability - permits
the calculation of an event far into the future with (relatively speaking, almost)
as much ease as of an event in the next instant; one only needs to insert the
desired time into the pertinent equation and to perform the calculation. This
is the model that was postulated by Laplace, and it is, one supposes, the one
van Inwagen has in mind when discussing his incompatibilism. But, chaotic
systems governed by strange attractors are not predictable in the same way.
Systems governed by strange attractors are, however, thoroughly deterministic; each point in the state space is determined by the preceding point
and the particular rules of the dynamic that govern the attractor's evolution.
But the characteristic stretching and folding and the resultant bifurcations
make it impossible to calculate very far into the future: the basic sensitivity
to initial conditions amplifies uncertainties to the extent that they rapidly
grow and thus reduce the usefulness of the rules as instruments for long-term prediction.
[...] it may be observed that in the chaotic region arbitrarily close initial conditions can lead
to trajectories which, after a sufficiently long time, diverge widely. This means that, even if
we have a simple model in which all the parameters are determined exactly, long term prediction is nevertheless impossible. 19
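
The amplification of uncertainty described here is often summarized, in the standard literature on chaos rather than in the present text, by an exponential growth estimate. As a hedged sketch, assume that an initial uncertainty $\delta_0$ grows at an average rate $\lambda > 0$ (the largest Lyapunov exponent, a symbol introduced here only for illustration), and let $\Delta$ be the largest error one is willing to tolerate:

```latex
\[
  \delta(t) \approx \delta_0 \, e^{\lambda t}
  \qquad\Longrightarrow\qquad
  t_{\max} \approx \frac{1}{\lambda} \ln\frac{\Delta}{\delta_0}
\]
```

On this estimate, measuring the initial state a thousand times more precisely lengthens the usable forecast by only $\ln(1000)/\lambda \approx 6.9/\lambda$: the prediction horizon grows logarithmically with initial precision, which is one way of reading the claim that long-term prediction is out of reach.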

The very existence of deterministic chaos thus breaks the traditional link
between determinism and predictability, because the only way to calculate a
future event for these systems is to actually step through each and every
individual calculation until finally reaching the point sought at the desired
future time. In other words, to find out the future state of a strange attractor,
it is necessary to calculate its whole history, one step at a time. In the case
of normal attractors, it is necessary to perform one calculation only, where
the desired future time is substituted into the equation.
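
The two calculation routes contrasted in this paragraph can be sketched as follows. This is an illustration under assumptions, not the author's own procedure: the linear rule x -> a x + b stands in for a "normal" system whose future state follows from a single formula, and the logistic rule with r = 4 stands in for a chaotic one. All parameter values are hypothetical. The sketch shows the two routes only; it does not by itself establish that no shortcut formula exists for a given chaotic rule.

```python
def linear_closed_form(a, b, x0, n):
    """State after n steps of x -> a*x + b, obtained in one calculation.

    Uses the fixed point b / (1 - a), so it assumes a != 1.
    """
    fixed_point = b / (1.0 - a)
    return fixed_point + (a ** n) * (x0 - fixed_point)


def step_through(rule, x0, n):
    """State after n steps, obtained by computing every intermediate state."""
    x = x0
    for _ in range(n):
        x = rule(x)
    return x


if __name__ == "__main__":
    a, b, x0, n = 0.5, 1.0, 10.0, 40
    # Non-chaotic rule: the desired time n is simply inserted into a formula.
    direct = linear_closed_form(a, b, x0, n)
    stepped = step_through(lambda x: a * x + b, x0, n)
    print(direct, stepped)  # the two routes agree up to rounding

    # Chaotic rule: in this sketch the state is reached one step at a time.
    print(step_through(lambda x: 4.0 * x * (1.0 - x), 0.2, n))
```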


CHOICE AND PREDICTION

In a perfectly determined Laplacian world, in which agents' actions are completely governed by physical laws, what appear to be genuine choices for an
agent are in fact only imagined options because they are physically unrealizable; although the agent may think that two options A and B are really open,
strict determinism in fact forces the result of the choice. In a world that is
not perfectly determined, in which agents would be genuinely able to alter
the course of at least some events, even if an agent has chosen option A, it
is theoretically possible to have chosen option B; the agent can thus imagine
a reversal of time's arrow, permitting a return to the moment of decision where
option B is chosen instead of option A.
Obviously, not many people believe that time can actually be reversed.
But the undetermined agent's inherent freedom, unconstrained by physical law,
should permit the choice of option B if the agent were again placed in exactly
the same circumstances once again (see Figure 7 [right]). The determined agent,
however, must necessarily choose option A all over again were time reversed,
because option B was never real and thus was never really open (see Figure
7 [left]).


Fig. 7. An agent's time line [left] encounters a bifurcation at t1, where two options are open:
A and B; the agent chooses, and the time line consequently continues along line A; option B
is unrealized. According to some views of freedom of choice, were it possible [right] to reverse
the flow of time at t2, an agent could return to t1 and choose option B instead.

Predictability and determinism are closely linked in the Laplacian outlook
upon the world. If a free agent is supposed to be able to change the course
of some events, then escape from physical determinism must be possible,
else the presumed freedom of choice turns out to be but imagined and in
fact unrealizable. Thus it is that incompatibilists, having construed the physical
world in Laplacian terms, find it necessary to re-inject the real possibility of
choice through such means as agent causation that is presented as somehow
different from ordinary, physical causation.
Agent causation is designed to permit real action wherever the laws of
physics do not expressly forbid it. Thus it is not possible to choose to soar
like a bird without mechanical aid, because the laws of thermodynamics decree
that humans are not properly built for flying; but it is possible to leave for
work at 8 o'clock by bus rather than at 8:15 by car, because, supposedly, no
physical law precludes one or the other: thus, the agent is deemed to have a
choice.
These possibilities are often mapped out in terms of possible worlds, where
a possible world is deemed accessible to an agent if nothing necessarily forbids
an action that would or could have different consequences from the ones
actually brought about in the real world. But possible world semantics, although
perhaps useful for the description and analysis of choice, does not explain how
freedom can coexist with determinism because the prior supposition of the
existence of possible and accessible worlds assumes precisely what should
instead be proven; real choices (possible worlds) are hypothesized from the
outset, instead of being proven to exist. The staunch determinist can always
call the possible-world theorist to task and require that the real - as opposed
to logical - existence of these alternate worlds be demonstrated; without such
a demonstration, the supposed free choice of the agent just boils down to
imagined but unreal courses of action that are not really open to the agent
and that, although logically possible, are physically excluded if the agent is
basically part and parcel of the physical world. According to the staunch determinist, an agent is always constrained to repeat history (see Figure 7 [left]),
were it possible to reverse the flow of time, whereas the possible-world determinist maintains that constraints are not so severe as to exclude the redirection
of history (Figure 7 [right]).
All of this talk about the real or only imagined possibility of choice presupposes that the physical world is determined. The compatibilist framework
will typically allow that humans are constrained by the same laws as the rest
of the universe, but will deem free any action that traces its constraints to
the agent, and non-free any that depends upon external constraints. The incompatibilist framework recognizes that strict physical determinism precludes
the possibility of real choice, so a special sort of (non-physical) causation
is invoked to permit the agent to effectively choose options not explicitly
excluded by physical laws.
If an agent's future action can be predicted, then that action can't be free.
Why? Because if it can be predicted, it is because knowledge of the initial
conditions and of the pertinent deterministic laws not only allows the possibility of prediction, but also guarantees, at least theoretically, that it will be
accurate (see van Inwagen's analysis of determinism as the conjunction of
two theses, in the section titled "Determinism," above). Even if the prediction is only probabilistic, the action can still be considered to be determined,
for the inaccuracies can be laid on the doorstep of imperfect knowledge of
initial conditions, and increased knowledge of these conditions should correspondingly increase the accuracy of the resulting prediction. Determinism
and predictability are thus thoroughly linked both for the incompatibilist and
for the compatibilist.


DETERMINISM AND PREDICTABILITY UNLINKED

Recent research has shown deterministic chaos to exist not only in inert
physical and chemical systems but also in biological systems. Of
particular interest is the demonstration that electroencephalograms of the
human brain and electrocardiograms of the human heart can be modeled by
deterministic chaos. In both cases, the chaotic regimes characterized by
epilepsy and by fibrillation can be effectively modeled by deterministic
chaos. Ocean dynamics, sunspots, animal population dynamics, blood pressure
variations and hormonal concentrations are other examples of systems that
are being better understood for being modeled on deterministic chaos. 20 Without
going so far as to assume that the human mind is purely biological, it is still
fascinating to discover that human bodies respond to chaos.
The most important characteristic of deterministic chaos is the impossibility
of predicting behavior beyond the short term into the far future, for the very
reason that chaotic systems, being extremely sensitive to initial conditions,
generate information as they evolve. It is thus impossible at time t0 to predict
what exactly will happen at a later time t1 without actually calculating each
intermediate state. However, and this is the paradoxical second most important characteristic, each of these intervening states is as perfectly determined
as a Laplacian could wish. Thus, even though an agent is perfectly determined,
if the agent is governed even partly by deterministic but chaotic laws, then
the agent's behavior may be inherently unpredictable.
The possibilities that are considered open to the agent at the moment of
choice can be quite real, and quite determined, but the outcome of the choice
can also be quite unpredictable. Even though an option is determined, it is
not true that the very same option would or even could be chosen, were time
reversed. This is because subtle differences in initial conditions can bring about
large differences in final behavior, on the one hand, and it is impossible to
really be sure that a system has been brought back to its exact prior starting
point, on the other. Figure 8 shows the difference between a normal [left]
and a chaotic [right] interpretation of a time-reversal in a determined world.
The left-hand portion of Figure 8 shows the assumption that freedom should
permit the choice of a different option B at t1 were time reversed - small uncertainties in the initial conditions give rise to at most small variations in the
resulting effects; this, in essence, is a combination of Figure 6 [left] with Figure
7 [right]. The right-hand portion of Figure 8 shows the assumption that freedom
would not necessarily permit the same repetition - small uncertainties that
can never be completely eliminated can give rise to very large variations in
the resulting effects; this combines Figure 6 [right] with Figure 7 [right].

Fig. 8. Normal depiction [left] of the possibility of making a different choice were time-reversal possible; small discrepancies would not fundamentally change the chosen option, B.
Chaotic depiction [right] of the same possibility; small discrepancies can give rise to a quite
different option, C.

There is thus from the standpoint of deterministic chaos some truth to the
claim that the free agent could have done otherwise; this however doesn't entail
that the agent wasn't determined. It just means that, all other things being
just about equal, very slight differences could have led to a different trajectory in state space. In Figure 8 [left] the imagined return to the exact prior
point t1 in state space is impossible from the standpoint of chaotic determinism,
for a chaotic attractor would preclude this return by the very fact of its nonperiodicity; it would at best be possible to return to a very proximate point
in state space. This very close but non-identical point entails that the trajectories can in fact quickly diverge (Figure 8 [right]), and so the time-line B could
be perhaps only approximately followed for a short while and the final
outcome, C, could be quite different.
What of moral responsibility? An agent is not normally held responsible for
an action unless it was possible for the agent to have done otherwise; this is
reflected in Figure 8 by the bifurcations at t1. Without a bifurcation, no other
course was open to the agent, and assessments of moral responsibility accordingly will take this into account. This picture of moral responsibility holds
good whether a normal or a chaotic description is used, and old conundrums
about the agent's really having a choice or not - about the overwhelming necessity of determinism or the agent's being able to overcome it - carry over intact.
So modeling the agent's possibilities of action along the lines of chaos theory
doesn't remove any of the old problems pertaining to moral responsibility.
What it does provide, however, is an indication that the presence of unpredictability does not necessarily entail the absence of determinism: chaotic
systems are perfectly determined without being predictable.
Modeling along the lines suggested by chaos theory provides a new analysis
of the expression "the agent could have done otherwise." The phrase should
not be interpreted in the sense of Figure 7 [right], nor even in that of Figure
8 [left], but in the sense of Figure 8 [right]. This is the sense according to
which not only is no exact return to a prior point t1 possible (precluding the
exact repetition of events up to the point of bifurcation), but also that the
only possible, that is approximate, return to a prior point renders it extremely
unlikely that the bifurcation unfolds into the very same possibilities. "The agent
could have done otherwise" means that the agent, at time t1, was at some critical
crest in state space (see Figure 2, above), and no forecast at t0 could have unambiguously predicted which of the possible basins of attraction would in fact
attract the agent. Whether the basin was freely chosen by the agent, or physically coerced by natural causes, chaos theory cannot decide. All one can
say is that in similar past circumstances, this particular agent (or any agent,
in general) followed different time-lines.
Informal modeling of human behavior along the lines of deterministic chaos
doesn't of course solve the free-will problem. Probably nothing will, if only
because humans feel that deliberation makes a difference, and nothing seems
able to reduce this feeling to any form of determinism. But deterministic chaos
sheds new light on the free-will problem by showing that determinism doesn't
necessarily entail predictability and, conversely, that unpredictability doesn't
necessarily entail freedom from determinism either.
So, if chaos doesn't set you free, it at least allows you to eat your deterministic cake and have your non-predictability, too.

University of Sherbrooke

NOTES
1 "All events, even those which on account of their insignificance do not seem to follow the
great laws of nature, are a result of it just as necessarily as the revolutions of the sun. In ignorance of the ties which unite such events to the entire system of the universe, they have been
made to depend upon final causes or upon hazard, according as they occur and are repeated
with regularity, or appear without regard to order; but these imaginary causes have gradually
receded with the widening bounds of knowledge and disappear entirely before sound philosophy, which sees in them only the expression of our ignorance of the true causes. [... ] We
ought then to regard the present state of the universe as the effect of its anterior state and as
the cause of the one which is to follow. Given for one instant an intelligence which could
comprehend all the forces by which nature is animated and the respective situation of the
beings who compose it - an intelligence sufficiently vast to submit these data to analysis - it
would embrace in the same formula the movements of the greatest bodies of the universe and
those of the lightest atom; for it, nothing would be uncertain and the future, as the past, would
be present to its eyes." Pierre Simon, Marquis de Laplace: A Philosophical Essay on Probabilities.
Dover Publications, Inc., New York, 1951 (translated from the 6th edition by F. W. Truscott
and F. L. Emory), pp. 3-4.
2 In this regard, see for example: Antony Flew and Godfrey Vesey: Agency and Necessity. Basil
Blackwell, Great Debates in Philosophy, Oxford & N.Y., 1987.
3 See Section VIII entitled "Of Liberty and Necessity" in David Hume: An Enquiry Concerning
Human Understanding. 3rd edition, edited by L. A. Selby-Bigge & P. H. Nidditch, Oxford,
Clarendon Press (1748) (1902) 1975. Hume's definition of freedom: "By liberty, then, we can
only mean a power of acting or not acting, according to the determinations of the will."
[p. 95]. His assimilation of volition to natural causation: "But being once convinced that we know
nothing farther of causation of any kind than merely the constant conjunction of objects, and
the consequent inference of the mind from one to another, and finding that these two circumstances are universally allowed to have place in voluntary actions; we may be more easily led
to own the same necessity common to all causes." [p. 92].
4 See the following section for more details on van Inwagen's incompatibilism.
5 Peter van Inwagen: "Incompatibility of Free Will and Determinism." Philosophical Studies
27, no. 3 (March 1975, pp. 185-199), p. 186.
6 [van Inwagen, 1975, p. 191].
7 See Henri Poincaré: Science et méthode. Flammarion, Paris, [1908] 1947. "A very small cause,
which we do not notice, determines a considerable effect that we cannot but see, and then
we say that the effect is due to chance. If we were to exactly know the laws of nature and the
situation of the universe at the initial instant, we could exactly predict the situation of the same
universe at an ulterior instant. But, even if the laws of nature no longer held any secrets for
us, we could only know the initial situation but approximately. If that permitted us to predict
the ulterior situation with the same approximation, that is all we require, we say that the phenomenon has been predicted, that it is governed by laws; but it is not always thus, it may
happen that small differences in the initial conditions generate very large ones in the final phenomena; a small error in the former will produce an enormous one in the latter. Prediction becomes
impossible and we have a fortuitous phenomenon." [pp. 68-9 - translated from the original
French]
In the 1960's, Lorenz sparked what has become an explosion of theory and of practical
applications of deterministic chaos; see Edward N. Lorenz: "Deterministic Nonperiodic Flow."
Journal of the Atmospheric Sciences 20, 1963, pp. 130-141. Many excellent introductory texts exist
on the rapidly burgeoning subject of deterministic chaos. See for example James Gleick: Chaos
- Making a New Science. Viking, New York, 1987.
The following sketch of deterministic chaos follows in part: James P. Crutchfield et al.:
"Chaos." Scientific American 255, no. 6 (December 1986, pp. 46-57).
8 The interested reader may wish to consult Jesse Hobbs: "Chaos and Indeterminism." Canadian
Journal of Philosophy 21, no. 2 (June 1991, pp. 141-164), in which the relationship between
chaos and quantum indeterminacy is explored. In the present article, this particular aspect of
chaotic behavior will not be pursued.
9 On this subject, see for example Richard P. Feynman: QED - The Strange Theory of Light
and Matter. Princeton University Press, Princeton, N.J., 1985.
10 Also given by [Crutchfield et al., 1986, pp. 48-9].
11 This is the method pioneered by Poincaré.
12 For the history of chaos theory, see for example the introduction to: Hao Bai-Lin (ed.): Chaos.
World Scientific, Singapore, 1984.
13 See [Lorenz, 1963].
14 [Crutchfield et al., 1986, p. 51].
15 For a detailed discussion of various types of chaotic attractors, see for example: J. M. T.
Thompson and H. B. Stewart: Nonlinear Dynamics and Chaos. John Wiley and Sons, Toronto
& N.Y., 1987.
16 Robert Shaw: "Strange Attractors, Chaotic Behavior, and Information Flow." Zeitschrift
für Naturforschung 36a (1981, pp. 80-112), p. 106.
17 See Benoit Mandelbrot: The Fractal Geometry of Nature. Freeman, N.Y., 1977.
18 [Crutchfield et al., 1986, p. 53].
19 Robert M. May: "Simple Mathematical Models with Very Complicated Dynamics." In [Bai-Lin, 1984, pp. 149-157].
20 See for example Hermann Haken and Arne Wunderlin: "Le Chaos déterministe." La
Recherche 21, no. 225 (October 1990, pp. 1248-1255), p. 1255. See also the numerous examples
cited in: Dahan Dalmedico, A. et al. (eds): Chaos et déterminisme. Points/Sciences, Seuil,
Paris, 1992.

PAUL M. PIETROSKI

OTHER THINGS EQUAL, THE CHANCES IMPROVE

Ramsey (1929) offers a characteristically clear formulation of an attractive idea
about causation:
The world, or rather that part of it with which we are acquainted, exhibits as we must all agree
a good deal of regularity of succession. I contend that over and above that it exhibits no feature
called causal necessity, but that we make sentences called causal laws ... and [we] say that a
fact asserted in a proposition which is an instance of causal law is a case of causal necessity
(p. 160).

This idea has fallen on hard times. But rumors of its death have been exaggerated, in my view, by the mistaken view that laws are exceptionless universal
generalizations. We can construe Ramsey's proposal in terms of a more relaxed
(and empirically motivated) conception of law. 1 The resulting account of
causation is worth taking seriously. It also shows that (plausible) covering-law and probability-raising conceptions of causation are not so different after
all; and while neither conception leads to a simple theory, I think we have
to live with complexity here.
1.

Let a "strict" law be a true, finitely statable sentence of the form


(1)   $\forall x[Fxt \rightarrow \exists y(Gyt+\epsilon)]$

where x and y range over nomologically possible events, 'F' and 'G' are
(perhaps complex) predicates, t is a time, and $\epsilon$ is an interval. An ordered event
pair $(e_i, e_j)$ instantiates (1), iff: $e_i$ satisfies 'F' at t, and $e_j$ satisfies 'G' at
some time between t and $t + \epsilon$. Instead of appealing to nonactual events to
handle accidental generalizations, one might require "projectible" predicates.
But I do not share Goodman's (1979) ontological concerns; and I think we
have some sense of nomological possibility independent of our commitment
to any given generalization. In any case, I will be appealing only to actual
events, or events easily made actual by performing experiments.
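
The definition of instantiation can be rendered as a small check. This is a sketch under assumptions: events are represented as records carrying a time and a set of predicate labels, "between t and t + ε" is read inclusively, and all event names and times are hypothetical. The second function picks out the kind of case that would falsify a strict law: an event satisfying 'F' with no event satisfying 'G' in the allotted interval.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Event:
    name: str          # hypothetical label, for readability only
    time: float        # time at which the event occurs
    kinds: frozenset   # labels of the predicates the event satisfies


Predicate = Callable[[Event], bool]


def instantiates(e_i: Event, e_j: Event, F: Predicate, G: Predicate,
                 epsilon: float) -> bool:
    """True iff e_i satisfies F at its time t and e_j satisfies G
    at some time between t and t + epsilon (inclusive, by assumption)."""
    t = e_i.time
    return F(e_i) and G(e_j) and t <= e_j.time <= t + epsilon


def is_counterinstance(e_i: Event, events: Iterable[Event], F: Predicate,
                       G: Predicate, epsilon: float) -> bool:
    """True iff e_i satisfies F but no listed event satisfies G in time."""
    return F(e_i) and not any(
        instantiates(e_i, e_j, F, G, epsilon) for e_j in events
    )


def is_F(e: Event) -> bool:
    return "F" in e.kinds


def is_G(e: Event) -> bool:
    return "G" in e.kinds


if __name__ == "__main__":
    cause = Event("pressure rises", 0.0, frozenset({"F"}))
    prompt = Event("barometer rises", 1.0, frozenset({"G"}))
    late = Event("barometer rises late", 5.0, frozenset({"G"}))
    print(instantiates(cause, prompt, is_F, is_G, epsilon=2.0))
    print(is_counterinstance(cause, [late], is_F, is_G, epsilon=2.0))
```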
The key feature of strict laws is their universality: whenever the antecedent
is satisfied, the consequent must be satisfied. So any instance of $[Fxt \,\&\, \neg\exists y(Gyt+\epsilon)]$ is inconsistent with (1). I assume that we are excluding, for
present purposes, generalizations that are true by virtue of logic or meaning.
Still further requirements - e.g., that antecedents not be otiose, and the absence
of the word 'cause' - would be needed to define a notion of causal law without
circularity. But this is not my goal, although I offer a tentative proposal

M. Marion and R. S. Cohen (eds.), Quebec Studies in the Philosophy of Science I, 259-273.
1995 Kluwer Academic Publishers.


below. So henceforth, let 'law' be short for 'causal law' . That said, we might
render Ramsey's proposal as follows:
$e_i$ causes $e_j$, if there is an ordered n-tuple of events $(e_1, e_2, \ldots, e_n)$,
such that $e_i = e_1$, $e_j = e_n$, and for each $e_k$, $(e_k, e_{k+1})$ instantiates a strict
law.
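
Read computationally, this rendering of Ramsey's proposal is a reachability condition: one event causes another if a chain of law-instantiating pairs leads from the first to the second. The sketch below assumes the relevant pairs are simply given as a set (`law_pairs`, a hypothetical input); the event names are invented for illustration and nothing here decides which pairs really instantiate a strict law.

```python
def causes_by_chain(e_i, e_j, law_pairs):
    """Sufficient condition from the text: there is a chain e_1, ..., e_n
    with e_1 = e_i and e_n = e_j such that every adjacent pair belongs to
    law_pairs (pairs assumed to instantiate some strict law)."""
    frontier = [e_i]
    seen = {e_i}
    while frontier:
        current = frontier.pop()
        for first, second in law_pairs:
            if first != current:
                continue
            if second == e_j:
                return True
            if second not in seen:
                seen.add(second)
                frontier.append(second)
    return False


if __name__ == "__main__":
    # Hypothetical pairs, each assumed to instantiate a strict law.
    law_pairs = {
        ("match struck", "spark"),
        ("spark", "flame"),
        ("pressure rises", "barometer A rises"),
        ("pressure rises", "barometer B rises"),
    }
    print(causes_by_chain("match struck", "flame", law_pairs))
    print(causes_by_chain("barometer A rises", "barometer B rises", law_pairs))
```

On these assumed inputs the first query succeeds through the intermediate spark, while the two barometer readings are not linked, since no pair leads from one to the other even though they share a common antecedent.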
This thesis is unsatisfactory; though not because events can have a common
cause, or because of causal asymmetries. The rising of one barometer does
not cause the rising of another. But neither is there a strict law here. If Dudley
tinkers with his barometer on a fair day, his barometer may rise while others
remain constant. Similarly, a few experiments will show that no strict law determines the height of a flagpole as a function of (i) the length of a nearby pattern
of illumination on the ground, and (ii) the position of the sun: paint a long
streak on the ground near a short pole (or absent any pole at all) on a cloudy
day; erect a pole on a steep hill; build a (dark) wall near the pole; etc. More
restricted shadow-to-flagpole generalizations may avoid particular counterexamples. But no finitely statable version will be exceptionless. 2
Of course, such "defenses" of the strict covering law account serve to
illustrate its basic inadequacy: there are few, if any, strict laws. Even if there
are no counterexamples to a thesis that some condition suffices for causation, such a thesis will be unexplanatory if the condition is never satisfied. And
it is not a strict law that barometers rise when atmospheric pressure rises.
Barometers can malfunction; and various things can interfere with the "normal"
effects of atmospheric pressure on operational barometers. Similarly, an
indefinite number of (possibly conspiring) interfering factors might result in
a shadow that is too long (or short), according to some flagpole-to-shadow
generalization. Nor is this difficulty confined to homey examples.
It is not a strict law that if organisms possessing a (heritable) property P
are better able to survive and reproduce than organisms possessing an alternative property P*, then the proportion of organisms in the population having
P will increase. As Sober (1984) notes, genetic drift, pleiotropy, cataclysmic
events, etc., can work against the "fittest". Joseph (1980) and Cartwright (1983)
have argued that, absent a unified field theory, the same point applies to
physics. To take Joseph's example: Coulomb's Law tells us that a point charge
produces (the relativistically invariant analog of) a spherically symmetric electromagnetic field; but if there is a dense mass in the vicinity, the quanta
of the electromagnetic field will respond to the gravitational influence, thus
distorting their spherically symmetric distribution. Cartwright offers similar
examples.
One might reply that laws do not make claims about the behavior of objects,
not even the trajectories of fundamental particles. This claim has some plausibility in physics, since the "fundamental" laws are often interpreted as saying
that whenever certain initial conditions are met, a field is produced; where a
field is held to be an entity in its own right quite apart from the objects upon
which it acts. Creary (1981), for example, seems to hold that fields cause
the behavior of particles, and that only a compositional "Law of Total Force"
would make claims about such behavior. While this suggestion is plausible
a priori, Joseph notes that nature has thus far been recalcitrant. We cannot
just "put the parts of physics together," because they do not form "a single
consistent theory demonstrably possessing a physical model (779-80)."
It seems less likely that the laws of other sciences can be construed a la
Creary. Natural selection is not any kind of field that causes changes in gene
pools. And once we consider psychological or economic generalizations, it
becomes implausible to say that (in general) laws should not be interpreted
as entailing claims about the behavior of objects in the relevant domains.
Moreover, even if a unified field theory is imminent, there is no reason to
expect parallel unifications of biology or other special sciences. The "force"
of natural selection, for example, can be swamped by nonbiological cataclysmic
events (e.g., earthquakes or meteor strikes). In Davidson's (1970) terms, special
science generalizations are "heteronomic," and this would seem to make
them irremediably nonstrict. For similar reasons, it is Panglossian to expect
a "Law of Total Force" stated in the vocabulary of the special sciences. But
without composition laws, appeals to forces cannot be translated into laws
that particular events instantiate.
I take these considerations to motivate the claim that laws have implicit
ceteris paribus (henceforth, cp) clauses; though without some explanation of
their theoretical role, appeal to such clauses merely relabels the problem at
hand. In discussing the relevance of these issues for deductive-nomological
models of explanation, Hempel (1988) notes that the idea of a cp clause "is
itself vague and elusive .... What other things, and equal to what?" Schiffer
(1991) wants to know which proposition (if any) is expressed by a putative
cp-law. For on pain of rendering

(2)   $cp\{\forall x[Fxt \rightarrow \exists y(Gyt+\epsilon)]\}$

as a trivial tautology, the cp-clause cannot mean, "Barring cases of $[Fxt \,\&\, \neg\exists y(Gyt+\epsilon)]$." And as Schiffer's real worry is that (2) is meaningless, it is
no help to say that 'cp{$\Phi$}' means $\Phi$, cp. (Cf. 'All mimsy were the borogoves.' means that the borogoves were all mimsy.) Let us call instances of
$[Fxt \,\&\, \exists y(Gyt+\epsilon)]$ and $[Fxt \,\&\, \neg\exists y(Gyt+\epsilon)]$, respectively, "Normal" and
"Abnormal" instances of (2). If there are cp-laws, they are compatible with
some nomologically possible Abnormal instances, else they would be strict.
This tells us something about what doesn't make a cp-law false - viz., the mere
fact that it faces an Abnormal instance; though we still need an informative
claim about what does make a putative cp-law false. But if we can say under
what conditions (2) would be false, then (2) must be meaningful.
Joseph (1980) considers interpreting laws as holding "ceteris absentibus,"
by which he means that (say) Coulomb's law would describe the trajectories of relevant quanta in situations where all non-electromagnetic forces
were absent. But he rejects this idea for good reason: the subjunctives will
often be contralegals of a radical sort. Given objects with mass and charge,

262

PAUL M. PIETROSKI

counterfactual situations in which gravitational forces are absent will be
situations in which massive objects either lack mass or fail to exert gravitational force. Our faith in Coulomb's law hardly depends on what we think
would happen in such scenarios. For intuitions about contralegals seem to
depend on an antecedent grasp of relevant laws. Dicke et al. (1965) argued
that temperatures in excess of 10¹⁰ K in the early stages of the universe would
be detectable in the form of temperatures about 3.5° higher than those expected
on the basis of known sources of radiation. Independently, Penzias and Wilson
(1965) discovered just this. But if some local radiation is residue from the
Big Bang, we cannot adopt ceteris absentibus interpretations of the generalizations that led Penzias and Wilson to expect a lower temperature. For such
generalizations are not claims about how much radiation there would be, had
the Big Bang not occurred. So it is at best useless to say that a law is false,
if an associated contralegal is false.
Cartwright (1983) considers interpreting laws as indicative conditionals
of the form
(3)  ∀x{IA → [Fxt → ∃y(Gyt+ε)]}

where 'IA' is a statement of relevant idealizing assumptions. An event pair
(eᵢ, eⱼ) would instantiate (3), only if eᵢ occurred when IA obtained. But again,
we are typically unable to formulate all our idealizing assumptions as a finitely
statable extra antecedent. Too many things can go wrong. Idealizing assumptions are also false: planes are not frictionless; gas molecules do take up
space and attract one another; etc. So even if IA is finitely statable, (3) will
be only vacuously true (in nomologically possible worlds). If we interpret
Coulomb's Law this way, IA will have to include something like, 'No objects
with mass are present'. Few if any events will instantiate Coulomb's law
thus interpreted. And as Laymon (1985) notes, given (3), an instance of
[Fxt & ¬∃y(Gyt+ε)] tells us only what we already knew: IA is false.
The moral is that covering-law accounts are illuminating, only if we have
strict laws, a unified theory, or nonvacuously true cp-laws. Finding all three
unavailable, Cartwright (1989) returns to the idea that objects have causal
powers or "capacities." The "fundamental" laws express a commitment to
the existence of particular capacities; and explanation takes the form of citing
such capacities (typically in combination). For example, Newton told us that
the sun has the capacity to make the earth accelerate in the manner described
by the equations F = Gmm'/d² and F = ma. But manifestations of capacities
will typically be "impure". For the behavior of an actual object will (almost?)
always be a manifestation of more than one capacity. (Cartwright holds that
explaining events as the results of interacting capacities does not require
composition laws.) I find little to disagree with here, except for the implicit
contention that appeals to capacities somehow supersede appeals to cp-laws.³
And I take Joseph and Cartwright to have shown that subjunctive and indicative conditional analyses of cp-laws are inadequate, not that appeal to such
laws is hopeless. In the next section, I offer an account of cp-laws that is

OTHER THINGS EQUAL, THE CHANCES IMPROVE

more fully developed in Pietroski and Rey (1995), and applied to other domains
in Pietroski (1993, 1994); and I show how this account can be deployed in
the service of a covering-law conception of causation.
2.

If we need cp clauses because we idealize when stating laws, Abnormal
instances of a cp-law should be explicable by citing some fact(s) - or, if you
like, factor(s) - we have idealized away from (i.e., ignored) in stating the
law. Bodies significantly affected by friction (wind resistance, etc.) present
Abnormal instances of the generalization, 'cp, falling bodies accelerate towards
the earth at 32 ft./sec²'. But we can cite the fact that such bodies are affected
by friction (wind resistance, etc.) in explaining why they do not accelerate at
exactly 32 ft./sec². This suggests the following condition on cp-laws:

'cp{∀x[Fxt → ∃y(Gyt+ε)]}' is true, only if
∀x{Fxt → [∃y(Gyt+ε) ∨ ∃H∃z([Hzt*] explains [¬∃y(Gyt+ε)])]}.

Whenever the initial conditions of a cp-law obtain, either (i) the consequent
condition obtains, or (ii) some Hzt* (i.e., the fact that z has property H at
t*) explains why the consequent condition did not obtain. I will sometimes
abbreviate (ii) by saying that the presence of (the factor) H explains the
Abnormal instance. Strict laws, if such there be, are special cases of cp-laws,
for which explanations of Abnormal instances are never required. Far from
being tautologous or trivial, cp-laws would have substantive empirical consequences when their initial conditions are met; though the consequences would
be disjunctive, and so (as we should expect) weaker than those of strict laws.
It is, perhaps, worth noting here that 'The exception proves the rule.' originally meant that exceptions test a rule: we judge rules according to how well
they stand up against apparent counterexamples.
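
The necessary condition on cp-laws stated above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Instance` record, its field names, and the sample cases are hypothetical, and the field `explaining_factor` simply takes the notion of a correct explanation as given (as the text does) rather than analyzing it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Instance:
    """One case to which a putative cp-law might apply (hypothetical record)."""
    f_holds: bool                            # the initial condition Fxt obtains
    g_follows: bool                          # the consequent condition obtains
    explaining_factor: Optional[str] = None  # a factor H that correctly explains a failure of G


def satisfies_cp_condition(instances):
    """Check the necessary condition only: every case in which F obtains is
    either Normal (G follows) or an Abnormal instance with an explanation."""
    for inst in instances:
        if not inst.f_holds:
            continue      # the condition is silent when F does not obtain
        if inst.g_follows:
            continue      # Normal instance
        if inst.explaining_factor is None:
            return False  # inexplicable Abnormal instance
    return True


# Illustrative cases for 'cp, falling bodies accelerate at 32 ft./sec^2'.
falling_bodies = [
    Instance(f_holds=True, g_follows=True),
    Instance(f_holds=True, g_follows=False, explaining_factor="friction"),
]

# A made-up generalization facing an Abnormal instance that nothing explains.
unexplained = [
    Instance(f_holds=True, g_follows=True),
    Instance(f_holds=True, g_follows=False),
]
```

On this sketch, `satisfies_cp_condition(falling_bodies)` holds while `satisfies_cp_condition(unexplained)` fails. Passing the check does not show that a cp-law is true, since the condition is stated only as necessary.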
For present purposes, I take the notion of explanation more or less as
given. This is not because I think it easy to say what explanations are, but
because we have a better pre-theoretic grip on the notion of explanation than
on that of a cp-law. The current project is an attempt to illuminate the notion
of causation (via Ramsey's proposal) by appealing to cp-laws - and hence,
by appealing (inter alia) to explanations. (Cf. Hanson (1958). The resulting
proposal about causation might also be seen as an attempt to cash out some
suggestive remarks by Hart and Honore (1959, pp. 45-6).) But I have no reductionistic ambitions; and I doubt that an illuminating account of explanation
can avoid appealing to the notions of law and causation. Nonetheless, we
can try to make interesting claims that take one of the notions in this family
(law, cause, explains, counterfactual, etc.) as given. It is, however, worth
clarifying the relevant sense of 'explains' (for these purposes) in several
respects.
First, since we are concerned with the truth conditions of putative
cp-laws, only correct explanations are germane. Second, Hzt* explains
[¬∃y(Gyt+ε)] (or not) regardless of whether anyone ever offers this explanation. Third, what calls for explanation is the fact that [¬∃y(Gyt+ε)] - as
opposed to [∃y(Gyt+ε)] - despite the fact that Fxt and the (provisional)
assumption that the putative cp-law is true. The "why-question" presented
by an Abnormal instance of a cp-law is, therefore, somewhat complex; and
an answer must presuppose the cp-law in question.⁴ One cannot explain away
an Abnormal instance of 'cp, swans are white.' just by saying that the black
swan in question has a gene that (in its environment) makes it black. Even
if this is a correct explanation of the swan's blackness, it is no explanation
of how a true cp-law comes to face an Abnormal instance; for it does not accept
- even if it does not overtly deny - the presumption that swans are white. But
an elaboration along the following lines might explain the Abnormal instance:
Gene g* is found in all (except mutant) swans, white or black; and having
g*, together with other swannish traits, usually leads to white feathering
because of the effect g* has on pigmentation; but some Australian swans
also have another gene that, in combination with g* and other swannish
traits, leads to black feathering.⁵
Perhaps the most important constraint on explanation is the following: If
the presence of H explains an Abnormal instance of a cp-law, then any Normal
instance of the cp-law in the presence of H calls for explanation. We will be
surprised if a falling body affected by friction accelerates towards the earth
at exactly 32 ft./sec²; and this Normal instance of the generalization - together
with the claim that friction can explain Abnormal instances - commits us to
saying that the presence of another factor H* explains why the effects of
friction were "cancelled out" in this case. Moreover, I intend this constraint
to be iterative: if the presence of H* explains the occurrence of a Normal
instance of a cp-law L despite the presence of a factor H that can itself
explain Abnormal instances of L, then Abnormal instances of L in the presence
of H and H* call for explanation; etc. So explanatory commitments proliferate.
But Normal and Abnormal instances differ here.
Explanations of why Normal instances occur despite the presence of interfering factors need not assume (though they must not deny) the truth of the
cp-law in question. Returning to our earlier example, if Dudley has painted
his pet Australian swan white, this is an explanation of the Normal instance;
and the explanation is agnostic with regard to the Normal color of swans. If
Dudley painted an ordinary swan white, its whiteness would be overdetermined. But instead, Dudley's swan is an "accidentally Normal" instance of
'cp, swans are white'. For intuitively, the cp-law is irrelevant to the swan's
whiteness (assuming Dudley chose white paint at random). Let (eᵢ, eⱼ) be a
nonaccidentally Normal instance of 'cp{∀x[Fxt → ∃y(Gyt+ε)]}', just in
case (eᵢ, eⱼ) is a Normal instance of the cp-law, and: if there is any factor H
present, such that if eᵢ occurred but ¬∃y(Gyt+ε) then the presence of H
could explain the Abnormal instance, then the presence of some other factor
H* explains why ∃y(Gyt+ε) despite the presence of H. And I intend this
constraint to be iterative, just as in the last paragraph. Intuitively, a nonaccidentally Normal instance is one that occurs when any potentially interfering
factors are themselves interfered with. For in the presence of "undefeated"
interference, we expect Abnormal instances of cp-laws; but (as with Dudley's
swan) the consequent of a cp-law can be satisfied when the antecedent is
satisfied, even if the latter does not explain the former.
The task for the covering (cp) law theorist, as I see it, is to make explicit
the kinds of constraints on explanation I am gesturing at. Given such constraints, it may be that the necessary condition on cp-laws proposed above
can also serve as a sufficient condition. Following Ramsey, the idea would
be that events exhibit a good deal of regularity of succession - though not
so much regularity that we can state strict laws. A putative cp-law would serve
to set a standard for what counts as a Normal case; and the standard would
be correct, if every Abnormal case "left over" can be handled - where there
are constraints on what counts as "handling" an Abnormal instance. But the
present task is to show how there can be non-trivial cp-laws; so (substantial) necessary conditions are more important than sufficient conditions. And
the crucial claim is that we quantify over Abnormal instances of a putative
cp-law, holding that each must be explicable, instead of trying to state in
advance a condition (in the form of an extra antecedent) that covers every
possible Abnormal case.
If this is correct, events can instantiate cp-laws even when other things
are not perfectly equal. Differential fitness can lead to evolution in the direction of the fitter trait in the presence of some counteracting drift, pleiotropy,
etc. So many Normal instances of Darwin's principle would not instantiate
an indicative conditional whose antecedent is, 'If there is no drift or pleiotropy
or . . .'. Of course, given a generalization that purports to state the precise
quantitative effects of selection on a gene pool, the problem arises again.
But qualitative laws are often what we want; and we settle for them when
they are all we are likely to get. More importantly, while it may be a tautology that A leads to B unless it doesn't, the same is not true of the claim that
A leads to B unless there is an explanation for why it doesn't. In particular,
(4)  cp, if a barometer B rises, then barometers near B rise.

is false. If Dudley manually applies pressure to the relevant mechanism in
his barometer on a fair day, we get an inexplicable Abnormal instance of
(4). For the fact that Dudley, as opposed to the atmosphere, applied the pressure
does not explain why (4) - a generalization stated in terms of a relation between
barometers - faces an Abnormal instance.⁶
Assuming that cp-laws are real laws, we can render Ramsey's claim as
follows:
eᵢ causes eⱼ, if there is an ordered n-tuple of events (e₁, e₂, ..., eₙ),
such that: eᵢ = e₁, eⱼ = eₙ, and for each eₖ, (eₖ, eₖ₊₁) is a nonaccidentally Normal instance of a law.
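
The chain condition just stated can be sketched in Python as a search for a sequence of events linked by covered pairs. This is only an illustration under stated assumptions: the function name, the event labels, and the set `covered_pairs` are hypothetical, and which pairs count as nonaccidentally Normal instances of a law is supplied as an input rather than determined by the code.

```python
from collections import deque


def causes(e_i, e_j, covered_pairs):
    """Return True if there is a chain e_i = e_1, ..., e_n = e_j in which every
    consecutive pair belongs to covered_pairs. The proposal states a sufficient
    condition only, so False means the condition is unmet, not that e_i fails
    to cause e_j."""
    frontier = deque([e_i])
    seen = set()
    while frontier:
        current = frontier.popleft()
        for first, second in covered_pairs:
            if first != current:
                continue
            if second == e_j:
                return True
            if second not in seen:
                seen.add(second)
                frontier.append(second)
    return False


# Hypothetical event pairs assumed to be nonaccidentally Normal instances of laws.
covered_pairs = {("strike", "flame"), ("flame", "smoke")}
```

Here `causes("strike", "smoke", covered_pairs)` holds by way of the intermediate event, while `causes("smoke", "strike", covered_pairs)` does not.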

If there turn out to be no strict laws - i.e., if all (causal) laws are cp-laws - then it would be tempting to say that causal laws are those laws that have
nomologically possible Abnormal instances, thus excluding logical and analytic
truths by virtue of the fact that any such generalizations will be exceptionless. The motto would be, "No causation without exceptions." I find this idea
attractive, although I won't press it here. The current proposal says nothing
about events covered by no law. But given the relaxed notion of law (which
I will relax still further) there may be enough laws to go around; though
many laws may be scientifically uninteresting. Suppose, however, that there
is an event e*, such that for no eₖ is (eₖ, e*) a Normal instance of a law. It
does not follow that the proposal is false, or even that we should look for
an alternative sufficient condition with wider explanatory scope. For I don't
see why we have to assume that every event has a cause. Perhaps "No uncaused
events" slogans express an ideal: strive to find laws that cover all events.
Moreover, if the current rendering of Ramsey's proposal has the consequence
that there are uncaused events, perhaps we can allow that some events "just
happen;" where this just means that they cannot be located in a pattern of nomic
regularity. But I propose to set aside such questions. For the most serious worry
about the current proposal, I suspect, will be that the maneuvering used to
defend appeal to cp-laws is unnecessary. One might hope that we can appeal
to probabilistic laws instead of cp-laws, and that we can appeal directly to
probability-raising instead of laws in our account of causation, thereby getting
a theoretically simpler account. But I think that such simplicity is not to be
had.
3.

Let us represent probabilistic laws as modifications of strict lawlike statements
as follows:
(5)  ∀x{Fxt → with probability N[∃y(Gyt+ε)]}.

A strict law would be a special case of a probabilistic law, with N = 1. For
simplicity, I will write (5) and the corresponding cp-law
(2)  cp{∀x[Fxt → ∃y(Gyt+ε)]}

as: if F, then N%(G); and if F, then cp[G]. (The account of cp-laws offered
in section two, recall, renders the consequent of such laws disjunctive in form.)
We cannot dispense with cp-clauses in favor of probability operators, because
these two ways of modifying the traditional (strict) form of laws scratch different itches. We need probability operators because the world is not
deterministic; whereas we need cp-clauses because the world is complex and
full of interacting factors. Consider again Coulomb's law, which faces
Abnormal instances given nearby dense masses. We cannot make a strict statement of this law true just by adding a probability operator. For we cannot
determine the probability of there being a nearby dense mass; and there is
no reason to think there even is a single (or an average) probability across
nomologically possible worlds. Nor can we dispense with probability-operators in favor of cp-clauses. If (5) expresses a genuinely indeterministic relation
- as opposed to there being "hidden" interacting factors that give rise to
(5) - then some instances of (Fxt & ¬[∃y(Gyt+ε)]) cannot be explained by
citing other factors. And we have reason to think that generalizations concerning the behavior of quanta, for example, will be genuinely indeterministic.
It is not surprising that we need both ways of relaxing the traditional account
of laws. For the world is both complex and indeterministic; and simplicity
is not the same as determinacy. We need to idealize and probabilize. So I
think we should allow for laws of the form: if F, then cp[N%(G)]. Such "cp-probabilistic" laws would say that, if initial conditions are met, then either
there is an N% chance that the consequent condition is met, or there is some
explanation for why there is not an N% chance that the consequent condition is met. A probabilistic claim about certain quanta might be incorrect,
because it idealized away from the gravitational influences of any nearby
masses. And given cp-probabilistic laws, the covering-law account at the end
of section two can accommodate cases in which one event causes another, even
though there was only a chance that it would do so - e.g., cases involving
uranium and geiger counters, or particles and slits. (The notion of a nonaccidentally Normal instance of a cp-probabilistic law introduces some
complications; but I do not think these are insuperable.)⁷
One might worry, however, that the covering law account - which seemed
so simple in Ramsey's formulation - is growing increasingly complex and
unwieldy. Of course, the basic idea remains the same: causation is a matter
of instantiating appropriate laws. But if in stating such laws as we can state
in various domains, we discover no single form of law, the covering law theorist
must allow for the associated "messiness" in her account of causation. (Indeed,
she should welcome such messiness for the reasons I have discussed.) So
the covering law theorist owes an argument that the obvious alternative - direct
appeal to probability - will not lead to a simple theory of causation. But the
makings of such an argument are already in the literature. Let 'eᵢ[Tₘ]' mean
that eᵢ is a token of type Tₘ; let 'P(Tₙ)' refer to the probability that an event
of type Tₙ will occur; and let 'P(Tₙ/Tₘ)' refer to the probability that an event
of type Tₙ will occur given the occurrence of an event of type Tₘ. Then the
simplest theory is:
eᵢ causes eⱼ, if there are types Tₘ and Tₙ such that: eᵢ[Tₘ], eⱼ[Tₙ],
and P(Tₙ/Tₘ) > P(Tₙ).
(This theory can be mimicked, in a given case, by a qualitative probabilistic
law of the form: if F, then there is a greater than N% chance of G; where N
is the "baseline" percentage chance of G.) But to take the now familiar example
of Simpson's Paradox: consider a population in which most of the smokers
- but few of the non-smokers - exercise, because a single genetic factor is
responsible for both a disposition to smoke and a disposition to exercise. In
such a population, the chance of getting heart disease given that one smokes
may be lower than the incidence of heart disease in the whole population.
This would not show that smoking is a prophylactic against heart disease in
the population. Indeed, smoking would still be a cause of heart disease. But
the simple theory above offers no explanation of this fact.
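
The smoking and exercise scenario can be made concrete with invented numbers. The Python sketch below is an illustration only: the counts are hypothetical, chosen so that most smokers exercise and few non-smokers do, and the helper names are not drawn from any source. It compares the heart disease rate among smokers with the rate in the whole population, and then makes the same comparison separately among exercisers and among non-exercisers.

```python
# (smokes, exercises) -> (people, heart-disease cases); all counts are invented.
counts = {
    (True, True): (800, 80),     # smokers who exercise
    (True, False): (200, 80),    # smokers who do not exercise
    (False, True): (200, 10),    # non-smokers who exercise
    (False, False): (800, 240),  # non-smokers who do not exercise
}


def rate(cells):
    """Heart-disease rate pooled over the given cells."""
    people = sum(counts[c][0] for c in cells)
    cases = sum(counts[c][1] for c in cells)
    return cases / people


def raises_probability(cells):
    """True if the rate among smokers exceeds the overall rate in these cells."""
    smokers = [c for c in cells if c[0]]
    return rate(smokers) > rate(cells)


whole = list(counts)
exercisers = [c for c in counts if c[1]]
non_exercisers = [c for c in counts if not c[1]]
```

With these made-up counts the rate among smokers is 0.16 against 0.205 in the whole population, so `raises_probability(whole)` fails. Yet the rate among smokers is 0.10 against 0.09 for exercisers and 0.40 against 0.32 for non-exercisers, so the comparison comes out the other way within each group.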
A now standard response to this problem is to partition the population, in
this case into exercisers and non-exercisers. Eells and Sober (1983), for
example, suggest that smoking causes heart disease, as long as the chance
of getting heart disease given that one smokes is higher than the incidence
of heart disease in each relevant sub-population. This complicates matters
significantly, since there will be a relevant partition for every factor (other than
the putative cause in question) that is causally relevant to the effect in question.
In the context of the present example, this would mean considering partitions for diet, genetic predispositions to heart disease, exercise, etc. The appeal
to causal relevance will render the resulting theory unsuitable as a reduction
of the causal to the non-causal. But I am not hoping for such a reduction.
And while it is difficult in practice to apply such a theory to cases, it might
still be true that
eᵢ causes eⱼ, if there are types Tₘ and Tₙ such that
eᵢ[Tₘ], eⱼ[Tₙ], and in each causally relevant partition,
P(Tₙ/Tₘ) > P(Tₙ).
Dupre (1984, 1993) calls this requirement "contextual unanimity" and points
out that it is unduly strong.⁸ If scientists (employed by the tobacco industry)
discover a rare physiological condition P, such that those with P are less
likely to get lung cancer if they smoke, this would not count as a discovery
that smoking does not cause lung cancer. But Dupre thinks that if having P
were the "rule rather than the exception," we could conclude that "smoking
was a prophylactic against lung cancer;" and this conclusion would not be
refuted by the fact that "those abnormal and unfortunate individuals who lacked
this physiological advantage were actually more likely to get lung cancer if
they smoked (1984, p. 172)." According to Eells and Sober, if smoking raises
the probability of (lung) cancer in some causally relevant partitions, but lowers
it in others, then there is "no such thing as the causal role of smoking" with
respect to cancer "in the population as a whole" (p. 37, their emphasis). And
indeed, smoking does not have a uniform causal role in the population Dupre
describes. But a contextual unanimity theory leaves us with no explanation
of the fact that smoking causes cancer in many of those who lack P.
If one restricts the population from, say, North Americans to North
Americans without P, Dupre could offer a similar example for the restricted
(and rather arbitrary) population. And repeated restrictions threaten to limit
(eventually to zero) the population size, thus threatening the very idea of probability raising. More importantly, as Dupre (1984, p. 174) suggests, imposing
the same condition on every causally relevant partition is the analog in the
probability raising approach of demanding that laws be strict in the covering
law approach. Since the generalization 'If x smokes, x gets cancer' faces
counterexamples, the traditional covering law theorist hopes for an alternative lawlike statement in which every factor that might have a bearing on
(not getting) cancer is explicitly mentioned in a more elaborate antecedent.
A contextual unanimity theory mirrors this hope. Dupre fails to note the important difference between (i) explicitly mentioning all relevant factors in a lawlike
statement and (ii) quantifying over partitions. If there are infinitely many
relevant factors, only (ii) is even possible. But the unanimity condition is
just as implausible as the strictness condition. Dupre's case serves to remind
us that, even in the context of probability raising, there can always be Abnormal
scenarios. And since we cannot expect unanimity, we must be prepared for
dissent.
Dupre himself rejects contextual unanimity in favor of "average effect."
Instead of asking whether the cause raises the probability of the effect in all
causally relevant partitions, he asks whether the cause raises the probability
of the effect on the whole, given a fair sample. In the smoking/exercise case,
Dupre thinks the sample of smokers is biased by the common-cause connection between smoking and exercising - appeal to partitioning and unanimity
constituting an unsuccessful attempt to remedy the unfair representation of
exercisers in the pool of smokers. While recognizing that explicating the notion
of a "fair sample" presents difficulties, Dupre claims that it imposes a weaker
and more plausible condition on probability raising theories. (But cf. Eells,
1987.) I too have doubts on this score. Returning again to Coulomb's Law:
what is the average effect of charge, if the effect depends on whether a dense
mass is nearby? But putting such worries aside, Dupre faces an analog of Eells
and Sober's problem. If condition P were widespread, then on Dupre's view,
smoking would be an inhibitor of cancer. I have no objection to this claim. But
an average effect theory leaves us with no explanation of the fact - and I
take it to be a fact - that smoking would still cause cancer in many of those
who lack P.
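
The contrast between the two verdicts can be illustrated with a hypothetical population in which condition P is widespread. In the Python sketch below the counts are invented, P is assumed to be distributed independently of smoking (a crude stand-in for a fair sample), and "average effect" is approximated by the comparison across the whole population; none of this is offered as Dupre's or Eells and Sober's own formalism.

```python
# (has_P, smokes) -> (people, cancer cases); invented counts, with P held by 80%.
counts = {
    (True, True): (400, 12),
    (True, False): (400, 40),
    (False, True): (100, 30),
    (False, False): (100, 10),
}


def rate(cells):
    """Cancer rate pooled over the given cells."""
    people = sum(counts[c][0] for c in cells)
    cases = sum(counts[c][1] for c in cells)
    return cases / people


def smoking_raises_probability(cells):
    """True if the rate among smokers exceeds the overall rate in these cells."""
    smokers = [c for c in cells if c[1]]
    return rate(smokers) > rate(cells)


with_p = [c for c in counts if c[0]]
without_p = [c for c in counts if not c[0]]
whole = list(counts)

# Unanimity demands probability raising in every partition.
contextual_unanimity = all(smoking_raises_probability(p) for p in (with_p, without_p))
# The rough average-effect check looks only at the population as a whole.
average_effect = smoking_raises_probability(whole)
```

On these numbers both `contextual_unanimity` and `average_effect` come out false, although `smoking_raises_probability(without_p)` holds. Neither summary verdict registers what happens among those who lack P.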
If one grants that a cause need not raise the probability of its effect in
every partition, appeal to "average effect" is an obvious alternative. But there
are other possibilities. Cartwright (1989) suggests that whenever a cause fails
to raise the probability of its effect, there must be a reason. So perhaps in every
causally relevant partition, either the cause raises the probability of the effect
or there is an explanation for why it doesn't. (Note that this is the same kind
of condition proposed in the account of cp-laws offered in section two: Given
initial conditions, either the consequent condition is satisfied, or there is an
explanation for why it isn't.) This is a weaker, and more plausible, condition than contextual unanimity. It also captures something that Dupre's proposal
misses. On his theory, a cause failing to raise the probability of its effect is
simply a matter of chance. But there is an important difference, for example,
between (i) a group of smokers who do not get heart disease because they
exercise, and (ii) a group of uranium atoms that do not decay after a long
time because "that's just the way it goes sometimes." In deterministic cases,
it seems perfectly reasonable to ask for explanations of why causes fail to
produce their effects; and at least often, it is reasonable to ask why indeterministic causes fail to raise the probability of their effects. In Dupre's own
example, we are inclined to ask if having condition P explains why smoking
reduces the chance of cancer in those with P.
Dupre rejects Cartwright's suggestion, because he thinks it can be a mere
matter of chance that a cause fails to raise the probability of its effect. If the
relation between smoking and cancer is indeterministic, an individual smoker
can simply be "lucky" and not get cancer. So, Dupre asks, couldn't a group
of smokers (say those with the condition P) be "second-order lucky" in that
their chance of getting cancer wasn't raised, without there being any reason
for this? If so, I agree that this would not show that smoking wasn't a cause
of cancer. And like Dupre, I don't think we have evidence against the claim
that the world is "chancy" in just this way. Nor should we assume that all
such probabilistic truths must be manifestations of underlying (competing)
causal capacities. So it may be too much to demand an explanation whenever
a cause fails to raise the probability of its effect. But again, "average effect"
proposals will miss something by never requiring such an explanation. This
is, I hope, enough to suggest that an adequate probability-raising theory is
unlikely to be simple. Moreover, I think we can get between Cartwright and
Dupre by completing our relaxation of the notion of law.
I have argued that we should allow for laws with: cp-clauses, probability
operators, and both. But modifying the strict form in two ways presents the
possibility of a scope difference. Recall that 'if F, then cp[N%(G)]' says: given
F, either (i) there is an N% chance that G will obtain, or (ii) there is an
explanation for why (i) is false. We can think of a cp-probabilistic law as
the result of probabilizing and then idealizing. If we idealized and then probabilized, we would get a law that says: given initial conditions, there is an
N% chance that either (i) the consequent condition will obtain, or (ii) there
is an explanation for why (i) is false. A "probabilistic-cp" law of the form
'If F, then N%(cp[G])' would allow for a percentage of cases in which there
is no explanation for why initial conditions do not lead to consequent conditions. But not every case in which (say) smoking does not lead to cancer would
be written off as a mere matter of chance. We could distinguish scenarios in
which some smokers fail to get cancer because they have condition P and
scenarios in which smokers with condition P just get lucky.⁹
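
The scope contrast can be set out schematically. The notation below is an assumption introduced only for illustration: Pr stands for chance, N is the probability written as a fraction, and Expl(H, φ) abbreviates "the presence of H explains φ". It is a sketch of the two readings, not a definitive formalization.

```latex
\begin{align*}
\text{cp-probabilistic, } \mathrm{cp}[N\%(G)]\text{: }\quad
  & F \rightarrow \bigl[\Pr(G) = N \;\lor\; \exists H\,\mathrm{Expl}\bigl(H,\ \Pr(G) \neq N\bigr)\bigr] \\
\text{probabilistic-cp, } N\%(\mathrm{cp}[G])\text{: }\quad
  & F \rightarrow \Pr\bigl(G \;\lor\; \exists H\,\mathrm{Expl}(H,\ \neg G)\bigr) = N
\end{align*}
```

On the first reading the explanation clause sits outside the probability operator; on the second it sits inside, so a residue of cases (with chance 1 - N) is left in which G fails and nothing explains the failure.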
The resulting covering-law account does not reject probability-raising conceptions of causation. For it appeals to a range of law forms. In so doing, it
appeals to probability operators and the notion of quantifying over (and
explaining) Abnormal instances. Moreover, suppose that, as I have suggested,
a plausible probability-raising theory will quantify over partitions, and appeal
to explanations of why causes fail to raise the probability of effects in some
partitions. Then in some sense, the two approaches converge. But Dupre's case
provides reason for preserving the covering law conception. For I think his
case is one in which smoking causes cancer in some members of the population, while it inhibits cancer in others; and if there are such cases, we
cannot cash out the notion of causation solely in terms of probabilities and
(non ad hoc) populations. For smoking cannot raise and lower the chance of
cancer in a population.
The covering law theorist, however, can describe Dupre's population as one
governed by a (probabilistic-cp) law to the effect that smoking leads to cancer
(perhaps with condition P as an interfering factor), and another law to the
effect that smoking inhibits cancer in those with condition P. For nonstrict laws
will be nonmonotonic. CP, falling bodies accelerate at 32 ft./sec² - Abnormal
instances of this generalization to be explained by citing, inter alia, friction.
But it does not follow that cp, bodies affected by friction accelerate at 32
ft./sec². Indeed, if we can specify the (normal) effects of friction (in a given
environment), we can say that cp, bodies fall at some other rate R; though
de re, as it were, each body affected by friction would still be such that it accelerates at 32 ft./sec², cp.¹⁰ Dupre thinks that whether smoking causes or inhibits
cancer depends on whether P is the rule or the exception in the population,
where this is a statistical matter (given a fair sample). I think what the "rules"
(and hence the exceptions) are is a matter of which laws hold, where this is
not just a matter of probability. So a smoker getting cancer can be the rule
or the exception, depending on the law in question.
By way of conclusion, let me offer the following familiar observation: In
saying that one event causes another, we commit ourselves to something
regarding similar events. Strict laws represent too strong a commitment. But
this is because laws need not be strict, not because the notion of a covering
law cannot help us give substance to the slogan, "Same cause, same effect."
A more relaxed notion of law makes the relevant commitments rather more
open-ended and easier to satisfy, without making them vacuous. This rendition of Ramsey's proposal may not provide the rigorous and tidy picture of
the world's causal order that some covering-law theorists hoped for. But I don't
think we need the rigor. And I see no reason to think the world is tidy.¹¹
McGill University
NOTES
1 I will not, however, defend Ramsey's claim that causal laws do not express propositions.
2 Perhaps, by definition, the shadow of x is caused by x. But flagpoles may cause a disturbed
Dudley to elongate the next comparative darkness he sees; or there might be a large refracting
lens between a pole and its shadow. And to say the shadow of x must be caused by x in "the
normal way" is to grant that there are no strict laws involving shadows.
3 Cartwright must distinguish among false sentences - e.g., F = Gmm'/d² vs. F = Gmm'. If
we use 'FE' to label the false but explanatory (by virtue of expressing a capacity) generalizations, then some nomic generalizations will have the form: FE{∀x[Fxt → ∃y(Gyt+ε)]}. Arguably,
the difference between this proposal and appeal to cp-laws is notational. Cartwright would not
subscribe to a covering-law account of causation. But that is another matter.
4 But I do not stipulate that we have an explanation only if the putative cp-law is true. That
would make the proposal viciously circular. See, e.g., van Fraassen (1980) for discussion of
the role of contrast classes and presuppositions in explanation.
S Or to take a familiar case, while there is an explanation of why Mercury orbits the sun as it
does, this does not explain the Abnormal instance of Newton's laws. The anomaly of Uranus'
orbit, however, was explicable by citing the presence of Neptune.
6
Only if atmospheric pressure is rising, will (4) be free from inexplicable Abnormal instances;
and building this condition into the antecedent would make it otiose. Similarly, there will be
no "shadow-to-flagpole" cp-Iaws. The presence of a "shadow-like" streak on the ground, or a
refracting lens between the pole and its shadow (see note 3 above) will not explain the relevant
Abnormal instances. (Again, my goal is not to say why such appeals would not explain why
the flagpole was "too short," much less to do so without appealing to causation.)
1 Suppose we have laws of the form 'if F, then cp[N%(G)], and 'if K, then cp[M%(G)]'. If
K obtains, it seems that an instance of F&G can be an accidentally Normal instance of the
first law, even if there is no undefeated interference with respect to either law. For
intuitively, there is an M% chance that K caused G and a (IOO-N)% chance that F did not
cause G. I am not entirely sure that we should take such intuitions at face value. But if we do,
we can also speak of the chance that an event is overdetermined, and the probability that we have
an accidentally Normal instance of a cp-probabilistic law.
8 Skyrms (1980) aruges that a cause need only raise the probability of its effect in some background condition and not lower the probability of its effect in any. But Dupre's point is unaffected,
since this still requires unanimity with respect to not lowering probability.
9 Are there cp-probabilistic-cp laws, and so on? I take the question to be an empirical one.
But at some point, it will become impossible to gather evidence for iteratively hedged laws.
10
See Pietroski (1993), pp. 504-5. The Ideal Gas Law and van der Waal's equation provide
a similar example; see Pietroski and Rey (1995) for further discussion. The prevalence of an
interfering condition (like friction) is irrelevant to whether it can explain Abnormal instances.
And cp-Iaws will not be transitive for the same reason they are not monotonic.
II
My thanks to FCAR and SSHRCC for financial support, and to Susan Dwyer for helpful
discussion.
4

REFERENCES
Cartwright, N., 1983, How the Laws of Physics Lie, Clarendon, Oxford.
Cartwright, N., 1989, Nature's Capacities and Their Measurement, Clarendon, Oxford.
Creary, L., 1981, 'Causal Explanation and the Reality of Natural Component Forces', Pacific
Philosophical Quarterly 62, 148-157.
Davidson, D., 1970, 'Mental Events', in Essays on Actions and Events, Clarendon, Oxford (1980).
Dicke, Peebles et al., 1965, 'Cosmic Black-Body Radiation', Astrophysical Journal 142, 414-419.
Dupré, J., 1984, 'Probabilistic Causality Emancipated', Midwest Studies in Philosophy 9, 169-175.
Dupré, J., 1993, The Disorder of Things, Harvard, Cambridge.
Eells, E., 1987, 'Probabilistic Causality: Reply to John Dupré', Philosophy of Science 53,
52-64.
Eells, E. and Sober, E., 1983, 'Probabilistic Causality and the Question of Transitivity',
Philosophy of Science 50, 35-57.
Goodman, N., 1979, Fact, Fiction, and Forecast, Harvard University Press, Cambridge.
Hanson, N., 1958, Patterns of Discovery, Cambridge University Press, Cambridge.
Hart, H. and Honoré, A., 1959, Causation in the Law, Clarendon, Oxford.
Hempel, C., 1988, 'Provisoes: A Problem Concerning the Inferential Function of Scientific
Theories', Erkenntnis 28, 147-164.
Joseph, G., 1980, 'The Many Sciences and the One World', Journal of Philosophy 77, 773-790.
Laymon, R., 1985, 'Idealization and the Testing of Theories by Experimentation', in P. Achinstein
and O. Hannaway (eds.), Observation, Experiment, and Hypothesis in Modern Physical
Science, MIT Press, Cambridge.
Penzias, A. and Wilson, R., 1965, 'A Measurement of Excess Antenna Temperature at 4080 Mc/s',
Astrophysical Journal 142, 419-421.
Pietroski, P., 1993, 'Prima Facie Obligations, Ceteris Paribus Laws in Moral Theory', Ethics 103,
489-515.
Pietroski, P., 1994, 'Mental Causation for Dualists', Mind and Language 9, 336-66.
Pietroski, P. and Rey, G., 1995, 'When Other Things Aren't Equal: Saving Ceteris Paribus
Laws from Vacuity', British Journal for the Philosophy of Science 46, 81-110.
Ramsey, F., 1929, 'General Propositions and Causality', in H. Mellor (ed.), Philosophical Papers,
Cambridge University Press, Cambridge (1990).
Schiffer, S., 1991, 'Ceteris Paribus Laws', Mind 100, 1-18.
Sober, E., 1984, The Nature of Selection, MIT, Cambridge.
Skyrms, B., 1980, Causal Necessity, New Haven, Yale.
van Fraassen, B., 1980, The Scientific Image, Clarendon, Oxford.

DAVID DAVIES

THE MODEL-THEORETIC ARGUMENT UNLOCKED

1. In the Preface to Realism with a Human Face, Hilary Putnam remarks
that "the difference between the present volume and my work prior to The
Many Faces of Realism is a shift in emphasis: a shift from emphasizing model-theoretic arguments against metaphysical realism to emphasizing conceptual
relativity" (Putnam, 1990, pp. x-xi). One might be tempted to view this as
at least partially vindicating those defenders of metaphysical realism (henceforth 'Realism') who have been urging all along that the so-called 'model-theoretic argument' (MTA) is flawed. I argue, however, that Putnam's remark
should not be thought to provide such succour to the Realist, and that, quite
the contrary, appeals to 'conceptual relativity' help to clarify why Putnam takes
certain standard Realist responses to the MTA to be inadequate. In so doing,
I also attempt to clarify another aspect of Putnam's 'internalist turn' stressed
in his most recent writings, namely, the sense in which this turn is 'Kantian'
in nature. I argue, against those who hold that central features of our scientific picture of the world are threatened by Putnam's assault on Realism, and
overlooked in the formulation of the MTA, that Putnam's 'internal realism',
like Kant's 'empirical realism', preserves the relevant features of that picture
while challenging a particular metaphysical interpretation of it. The Kantian
roots of Putnam's argument against Realism become further apparent in the
appeal to the doctrine of 'conceptual relativity', which affirms that the objects
of knowledge are subject to 'epistemic constraints'.
2. Let me begin with a brief account of certain relevant notions and dialectical strategies in the Putnamian literature. More specifically, I shall sketch
(i) the distinction between 'Metaphysical' and 'internal' forms of realism,
(ii) the model-theoretic argument against Metaphysical Realism, and (iii)
what I shall term the canonical response to this argument.
'Metaphysical Realism' (or 'externalism') and 'internal realism' (or 'internalism') are presented as two mutually exclusive philosophical perspectives.
Metaphysical Realism purports to provide a model of what it is for any particular theory to be correct. The Realist assumes that there is a unique and
determinate relation of reference between terms in a natural language and
elements, or sets of elements, in THE WORLD, where the latter comprises
some fixed totality of mind-independent, or representation-independent,
entities. A correct theory is one that is 'true', where truth consists in a correspondence relation, mediated by reference so construed, between sentences
of the theory and states of THE WORLD. THE WORLD's independence of
our representational capacities renders Realist truth "radically non-epistemic",
in the sense that even an "epistemically ideal" theory might be false. Insofar
as this model is to serve for all theories, the notions employed in characterising the model cannot receive their interpretation from within any particular
theory. (Putnam, 1976b, p. 125)

M. Marion and R. S. Cohen (eds.), Quebec Studies in the Philosophy of Science I, 275-286.
1995 Kluwer Academic Publishers.

In his earlier writings, Putnam himself embraced a form of Realism, a
position he termed "sophisticated realism" (see especially Putnam, 1976a):
on such a view, language users construct representations of THE WORLD, and
the contribution of language use to the successful prosecution of our practical goals justifies the hypothesis that these representations refer to and are
approximately true of THE WORLD. 'Internal realism', which Putnam now
embraces, "employs a similar picture within a theory" (Putnam, 1976b, p. 125).
'The world' of which language users can be said to construct representations
is the world as it is conceived through one or more of our various representations: it is to objects so represented that our terms refer. For the internal
realist, the 'causal interactions' with the world through which knowledge is
acquired are themselves "part of the subject-matter of the representation"
that is our scientific account of knowledge.
The model-theoretic argument challenges the externalist thesis that truth
is radically non-epistemic - that even an 'Ideal Theory' satisfying the highest
standards of rational assertibility might still be false. Putnam argues that the
externalist is not entitled to something that is a precondition for the possible
falsity of such a theory. The precondition is that there be a unique, determinate, and pre-theoretic relation of reference between terms in a language and
entities in THE WORLD whereby states of THE WORLD confer determinate truth-values on sentences containing such terms, and might therefore
confer the determinate truth-value 'false' on some sentence comprised by an
'Ideal Theory'. The MTA, which purports to establish that no such relation
of reference is available to the Realist, is formulated from within the externalist perspective and proceeds roughly as follows.
Assume the externalist conception of THE WORLD, and of truth as reference-mediated correspondence to THE WORLD. Then, given an Ideal Theory
(IT) satisfying all operational and theoretical constraints and of an appropriate cardinality:
(a) There is a mapping, and hence a 'reference' relation, from the terms
of the language of IT onto elements, or sets of elements, in THE WORLD such
that THE WORLD satisfies IT - call this mapping SAT;
(b) Nothing in our use of the language can serve to exclude this mapping
as an 'unintended' interpretation, since IT, and hence SAT, satisfies all operational and theoretical constraints (ex hypothesi), and since SAT also satisfies
our general intention to refer in such a way that our beliefs come out (for
the most part) true;
(c) Furthermore, no other constraints available to the externalist can exclude
SAT as an 'unintended' interpretation; so
(d) If truth consists in a reference-mediated correspondence between a theory
and THE WORLD, then IT cannot fail to be true.
The foregoing sketches the presentation of the MTA in Putnam, 1976b.
In Putnam, 1980, the point is made in a slightly different way. It is argued that,
given a set of operational constraints formulated in terms of one vocabulary
- what is, relatively speaking, the 'observational' vocabulary - we cannot
single out a unique 'interpretation-independent' reference for the terms in
another vocabulary - relatively speaking, the 'theoretical' vocabulary - by
means of the operational constraints in conjunction with a plausible set of
theoretical constraints. Putnam establishes this result for various choices of
'observational' and 'theoretical' vocabularies. For example, we may take the
'theoretical' vocabulary to comprise terms that purport to refer to 'unobservables' postulated in scientific theories, where the corresponding 'observational'
vocabulary contains terms that refer to things, events, and properties "observable with the human sensorium". Alternatively, we may take the 'observational'
vocabulary to comprise only terms that refer to present sense-data, and include
all terms that purport to refer to past and future sense-data in the 'theoretical' vocabulary. In either case, it is claimed, we can run what is essentially
the argument analysed above [(a)-(d)] to establish that terms in the 'theoretical' vocabulary fail to receive a unique and determinate interpretation and
must therefore be viewed as "formal constructs variously interpreted in various
models" (Putnam, 1980, p. 475).
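Step (a) of the argument sketched above can be illustrated in a toy finite setting. The following Python sketch is only an illustration under strong simplifying assumptions and is no part of Putnam's own presentation: the "theory" is a handful of atomic sentences rather than a first-order Ideal Theory, THE WORLD is a bare three-element set, operational and theoretical constraints are ignored, and every name in the code (`THEORY`, `WORLD`, `induced_interpretation`, and so on) is hypothetical.

```python
from itertools import permutations

# Toy stand-in for an "Ideal Theory": atomic sentences (predicate, name).
# All names are hypothetical and chosen only for illustration.
THEORY = {("Cat", "tibbles"), ("Cat", "felix"), ("Cherry", "c1")}
NAMES = ["tibbles", "felix", "c1"]
PREDICATES = ["Cat", "Cherry"]

# THE WORLD as a bare set of elements with the same cardinality as NAMES.
WORLD = ["w1", "w2", "w3"]


def induced_interpretation(bijection):
    """Assign each predicate an extension in WORLD that makes THEORY true.

    `bijection` maps each name to a WORLD element; a predicate's extension is
    the image of the names to which THEORY applies that predicate.
    """
    return {
        pred: {bijection[name] for (p, name) in THEORY if p == pred}
        for pred in PREDICATES
    }


def satisfies(bijection, extensions):
    """True if exactly the atomic sentences in THEORY hold under the mapping."""
    for pred in PREDICATES:
        for name in NAMES:
            asserted = (pred, name) in THEORY
            holds = bijection[name] in extensions[pred]
            if asserted != holds:
                return False
    return True


def all_sat_mappings():
    """Pair every name-to-WORLD bijection with its induced interpretation."""
    results = []
    for perm in permutations(WORLD):
        bijection = dict(zip(NAMES, perm))
        extensions = induced_interpretation(bijection)
        results.append((bijection, extensions, satisfies(bijection, extensions)))
    return results
```

In this toy case every one of the six bijections yields an interpretation under which the theory comes out true, and nothing in the theory itself singles out any one of them as 'intended'. The sketch does not address steps (b) to (d), which concern whether further constraints could exclude such mappings.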
What I shall term the 'canonical' response to the MTA consists in granting
- albeit, perhaps, for the sake of argument - steps (a) and (b) above, but balking
at step (c). The externalist maintains that, in the words of David Lewis, "there
must be some additional constraint on reference: some constraint that might,
if we are unlucky in our theorising, eliminate all the allegedly intended interpretations that make the theory come true" (Lewis, 1984, p. 224). More
specifically, the canonical response consists in some variation on the following
line of reasoning:
(1) Putnam assumes that the only plausible constraints on reference available to the Realist are those imposed by the acceptance of an uninterpreted,
or partially interpreted, set of sentences (see, e.g., Lewis, 1984, p. 225;
Goldman, 1986, p. 155; Demopoulos, 1982, p. 138); but
(2) This assumption is incorrect: our use of language in our more general
commerce with each other and with the world (THE WORLD) furnishes
additional constraints on reference, and the realist can appeal to such constraints in explaining how a unique, determinate, and pre-theoretic reference
relation is possible, and why SAT-like interpretations are unacceptable.
The general thrust of the canonical response is well expressed in the following remarks by Ian Hacking:
All [Putnam] has shown is that you cannot succeed in reference by stating a set of truths expressed
in first-order logic ... Language is embedded in a wide range of doings in the world ... Language
is more than talking ... Cherries are for eating, cats, perhaps, for stroking. Once speech
becomes embedded in action, talk of Löwenheim and Skolem seems scholastic ... We can do
nothing with very large numbers except talk about them. With cats we relate in ways other
than speech ... Assuring reference is not primarily a matter of uttering truths, but of interacting with the world ... (Hacking, 1983, pp. 104-8; see also Blackburn, 1984, p. 301)

As to the precise mechanism whereby our "interacting with the world" generates a unique and determinate relation of reference between language and
THE WORLD, most exponents of the canonical response have taken the
relevant constraints to consist in the existence of appropriate causal relationships between terms and their referents (see, e.g., Glymour, 1982, p. 177;
Devitt, 1983, p. 298; Brueckner, 1984). An alternative approach championed
by David Lewis takes the desired additional constraints to be furnished by
the 'structure' of THE WORLD itself (Lewis, 1984, p. 227; see also Merrill,
1980).
That the use of natural language occurs in the context of a more general
intercourse with non-linguistic reality is obvious. No less obvious, it would
seem, is that this feature of language use might yield constraints upon interpretation. The contention, implicit or explicit in the canonical response, that
Putnam ignores such possible resources for the Realist cause therefore invites
the question of how he could have overlooked something so transparent,
especially in the light of his own earlier criticisms of the 'descriptivist' programme in the philosophy of language. It is the recognition of this anomaly
confronting the canonical response that motivates Carsten Hansen's "Lockean"
reading of the model-theoretic argument. According to Hansen, Putnam
assumes that any adequate Realist response to the model-theoretic argument
must satisfy the condition that "we have knowledge of, or access to (mental)
representations only", where the latter "are in a significant respect like Lockean
'ideas' - they are mental entities of a sort that form a 'veil' between us and
the external world." (Hansen, 1987, p. 92)
Given this "Lockean epistemological premiss", Putnam's offhand dismissal
of Realist appeals to "causal connections" and "non-linguistic facts" is easily
understood, for "it will be question-begging to assume the possibility of having
knowledge of the external, non-representational objects in terms of which
the language is to be interpreted" (Hansen, 1987, p. 93). But, if the model-theoretic argument does presuppose such a Lockean premiss, then, Hansen
argues, it has little force against Realism. For the Realist is not committed
to any such premiss, nor is the latter plausible on independent grounds. Thus
she can answer Putnam by simply rejecting the "Lockean" epistemology and
offering the sort of 'causal' story about reference beloved of (most) devotees
of the canonical response.
3. The aim of the MTA is to undermine Realism while leaving internal realism
unscathed.¹ The Realist contends that truth is a matter of correspondence to
THE WORLD, grounded in a determinate relation of reference. Putnam challenges the Realist to provide an account of how such a relation of reference
is established. But - and this point is crucial - an adequate Realist response
to this challenge must advert to constraints on reference that allow for the
possibility, at least in principle, that, given the extensions determined for the
terms in a language L, even an epistemically Ideal theory in L could be false.
This point is not undermined by the contention (see Hansen, 1987, and Devitt,
1983) that the Realist is not committed to a general fallibilism concerning
our beliefs about "non-representational reality". The Realist may indeed
concede that some extension of our present methods of justification might in
fact be sufficient to determine the truth of some (or even all) such beliefs:
but this fails to bridge the conceptual gap between Realist truth and some form
of warranted assertibility. The Realist remains committed to the possibility,
at least in principle, that an Ideal Theory might be false, and must therefore
offer an account of the constraints on reference compatible with such a possibility. To put this point another way, the Realist's purposes will be served
only if she can show how it is possible to determinately refer to classes of
entities concerning whose members even our most warranted beliefs could
at least in principle be false. This is what the "mind-independence" of THE
WORLD amounts to.
Consider, then, Hacking's account, cited earlier, of the manner in which
the Realist is able to secure determinate reference. Hacking's principal concern
is the debate between scientific realists and anti-realists, and his principal
contention is that entity realism, but not theory realism, is warranted to the
extent that we manipulate certain entities in experimental contexts. Our
capacity to represent and have knowledge of such things as electrons derives
from our interventions into nature, not from the construction of linguistic structures in which the term 'electron' plays a certain role. Hacking believes that
the sort of response that tells against the scientific anti-realist also tells against
the metaphysical anti-realism promoted by the MTA. The additional constraint
on interpretation that the Realist requires is provided by the ways in which
"language is embedded in a wide range of doings in the world", and we are
offered, as examples, such 'doings' as our eating the things we call cherries
and our stroking the things we call cats. We may term this proposed constraint the "interaction" constraint.
The salient question, then, is whether the interaction constraint suffices to
determine referents that are 'mind-independent' in the sense just defined. How
is this constraint supposed to operate? One suggestion might be that the
extension of a term - 'cat', for example - comprises just those entities with
which we interact, or could interact, in the sorts of ways specified by Hacking.
The class of cats would be carved out of "noumenal dough" (see Putnam, 1985)
by our interactive practice. But can such an account make sense of the claim
that some of the things we treat as cats - things with which we engage or might
engage in standard human-feline interactions - might really not be cats at
all? Clearly, it can, by the simple expedient of enlarging the class of relevant
human-feline interactions to incorporate not only such benign 'doings' as
stroking and feeding, but also the more devious 'interventions' of actual or
possible laboratory scientists. Suppose, however, we further specify that the
non-cathood of some of the things we treat as cats in actual or possible practice
might be in principle undetectable for beings like us - that even an 'idealisation' of our interactive practice might furnish no reason to doubt the cathood
of these non-cats. This eventuality is obviously not one that could be accommodated on the suggested interpretation of the interaction constraint. But this
is precisely the sort of possibility that must be accommodated if the classes
of entities singled out by the interaction constraint are to be 'mind-independent' in the required Realist sense.
It might be thought that the Realist can respond as follows: even if our most
warranted beliefs cannot mis-specify the membership of a reference-class
singled out by the interaction constraint, they can falsely ascribe certain properties to its members. For example, our 'Ideal cat-theory' might include the
claim that all cats have a certain property p when at least some cats fail, 'in
reality', to possess such a property. But it should be readily apparent that
the argument we have just run for the class of cats could equally well be
run for the class of things having the property p. If the constraints on reference allowed by Putnam are augmented solely by an appeal to interactive
practice understood in the manner suggested above, the class of entities having
the property p will be precisely the class carved out by an idealisation of
the relevant fragment of our practice, and the claim, within our 'Ideal
cat-theory', that all cats have property p will be true.
4. Facts about the embedding of language use in the broader context of human
activity cannot, by themselves, furnish the kind of constraint that the Realist
requires. But, we may now note, such facts do provide precisely the sort of
constraint that might be incorporated into an internalist account of how terms
acquire a determinate reference:
For an internalist . . . , signs do not intrinsically correspond to objects, independently of how
those signs are employed and by whom. But a sign that is actually employed in a particular
way by a particular community of users can correspond to particular objects within the conceptual
scheme of those users. "Objects" do not exist independently of conceptual schemes. We cut
the world up into objects when we introduce one or another scheme of description. Since the
objects and the signs are alike internal to the scheme of description, it is possible to say what
matches what. (Putnam, 1981, p. 52)

Indeed, internal realism can also incorporate the sort of 'causal' account of
the relationship between representations and what they represent that writers
like Devitt and Glymour have urged as a counter to the MTA. For internal
realism preserves the empirical content of Putnam's earlier "sophisticated
realism", now interpreted within a theory, and this includes precisely the kind
of scientific picture of linguistic representation advocated by Putnam's Realist
opponents. In 'Models and Reality', Putnam rejects the suggestion that a causal
theory of reference might enable the Realist to evade the force of the MTA,
and then immediately adds the following remarks:
This is not to say that the construction of such a theory would be worthless as philosophy or
as natural science. The program of cognitive psychology already alluded to, the program of
describing our brains as computers which construct an "internal representation of the environment", seems to require that mentalese utterances be, in some cases at least, describable as the
causal products of devices in the brain and nervous system which "transduce" information from
the environment, and such a description might well be what the causal theorists are looking
for. And the program of realism in the philosophy of science - of empirical realism, not metaphysical realism - is to show that scientific theories can be regarded as better and better
representations of an objective world with which we are interacting; and if such a view is to
be part of science itself, as empirical realists contend it should be, then the interactions with
the world by means of which this representation is formed must themselves be part of the
subject matter of the representation. But the problem as to how the whole representation, including
the empirical theory of knowledge that is a part of it, can determinately refer is not a problem
that can be solved by developing more and better empirical theory. (Putnam, 1980, p. 479)

It should be apparent from the foregoing that, in presenting the MTA against
externalism, Putnam doesn't neglect the role of "non-linguistic facts" and
our interactions with the world in securing reference. Rather, he takes such
things to be elements in the sort of internalist picture he endorses. But such
resources cannot aid the cause of externalism, for they fail to determine reference-classes that are 'mind-independent' in the required sense. It is because
the Realist requires such reference-classes that she needs to avail herself of
constraints that go beyond those furnished by our causal, or quasi-causal, interactions with things in our everyday and scientific practice. She has traditionally
relied upon an appeal to 'operational and theoretical constraints', but the
'Skolemisation' argument of 'Models and Reality' purports to show that
such an appeal fails to secure determinate reference to classes in THE WORLD,
whatever kinds of entities such classes are taken to comprise - 'theoretical'
entities in science, middle-sized physical objects, or past and future sense-data.
The problem is not the 'Lockean' one of determinately referring to extra-mental
entities given that we have immediate referential access only to our own
"ideas"; it is, rather, the problem of securing determinate reference to any class
of entities that satisfies the Realist's notion of 'mind-independence'. One way
of setting up this problem is indeed in terms of a 'Lockean' epistemology:
assume that we have immediate referential access to our own inner states,
and try to secure determinate reference to external objects that are 'mind-independent' in the Realist's sense. But the problem can be set up equally
well with a Kantian veil rather than a Lockean one: assume that experience
provides us with immediate referential access to spatio-temporal objects,
and try to establish the possibility of determinately referring to noumenal
objects.
The foregoing distinction between Lockean and Kantian veils requires, of
course, that we resist a 'phenomenalist' reading of Kant's talk of spatio-temporal objects as "appearances". I follow, here, Henry Allison's influential
construal of Kant's Transcendental Idealism (Allison, 1983). Allison stresses
the distinction between transcendental and empirical senses of 'mind-dependence', 'appearance', and 'thing-in-itself'. Only in the empirical sense does
the 'mind-dependence' of an entity entail that it is a mental representation
in the Lockean sense; taken transcendentally, the 'mind-dependence' of an
object consists in its being considered qua subject to what Allison terms "epistemic conditions", conditions necessary for the representation of an object
or an objective state of affairs. For beings like ourselves who possess a discursive intellect - who know by bringing intuitions under concepts - something
can be an object of knowledge only insofar as it is represented in accordance
with some set of epistemic conditions. Parallel to the two readings of 'mind-dependence' are two readings of the distinction between "appearances" and
"things-in-themselves". Empirically, this is a distinction between two distinct
modes of being or two distinct classes of objects - spatio-temporal objects
(empirically real), on the one hand, and mental representations (empirically
ideal), on the other. Transcendentally, however, it is a distinction between
two ways in which, at the level of philosophical reflection, we can consider
empirical objects, real and ideal. Firstly, we can consider them as objects of
knowledge, and thus as subject of necessity to certain epistemic conditions,
not the least of which are space and time, as formal conditions of our sensibility; or, secondly, we can consider them as they are independently of these
conditions.
For Allison, then, Kant's distinction between Transcendental Realism and
Transcendental Idealism is a distinction between two metaphilosophical
perspectives which differ over the status of the objects of knowledge: the
Realist holds, and the Idealist denies, that the latter are 'mind-independent'
in the transcendental sense. The argument for Transcendental Idealism is
that (1) there are epistemic conditions, and (2) one cannot ascribe such conditions to things-in-themselves, transcendentally conceived. I suggest that
the debate between Putnam's externalist and internalist perspectives closely
parallels this Kantian debate - when Putnam characterises Kant as the first
internalist, he interprets Kant in a manner similar to Allison (see, e.g. Putnam,
1985, pp. 41ff). The externalist and the internalist differ over whether the
objects of knowledge are 'mind-dependent' in the transcendental sense. The
argument against externalism purports to establish that there are 'epistemic
conditions' - now reconceived, given the linguistic turn, as conditions on
the possibility of reference - that objects of knowledge must satisfy. Putnam's
rejection of the "surd metaphysical truths" required to salvage "metaphysical materialism" (Putnam, 1983) is a rejection of the idea that such 'epistemic
conditions' can be viewed as properties of things in THE WORLD. This, as
we shall now see, bears crucially on an assessment of what I shall term the
'fortified' version of the canonical response to the MTA.
5. We have seen that an appeal to "non-linguistic facts" is insufficient, by
itself, to provide the sort of additional constraint on reference that the Realist
requires, because the reference-classes picked out by such facts are not 'mind-independent' in the desired sense. What the Realist wants to say, of course,
is that, in using language in a certain way as part of our interaction with
experienced objects, we bring it about that our language "hooks onto" classes
of entities that are properly 'mind-independent' - classes whose membership, and the properties of whose members, may be incorrectly represented
by even our most warranted theories. But this will involve recourse to additional constraints of the sort cited by Lewis - to classes in THE WORLD
that pre-exist our employment of representational schemata in our attempts
to cut up the "noumenal dough", and to 'causal relations' between these classes
and terms in our language. It will involve, in other words, what Putnam has
termed a "Ready-Made World" (Putnam, 1983, pp. 205 ff.).
The 'fortified' version of the canonical response appeals not merely to additional constraints on interpretation deriving from the use of language, but
also to pre-existing structure in THE WORLD. It is because THE WORLD
itself contains the kind of 'mind-independent' reference classes that the Realist
requires that our use of language, whereby terms are associated with members
of those classes, furnishes the additional constraints on reference that can
rule out SAT interpretations of an Ideal Theory. I shall now suggest that the
Kantian reading of Putnam's argument sketched above illuminates Putnam's
reasons for rejecting such a strategy.2
Consider the following kind of counter which Putnam offers to the contention that reference is a causal relation in THE WORLD, and to the further
contention that what he terms 'conceptual relativity' - the relativity of facts
about 'what there is' to choice of conceptual system (see below) - is simply
a matter of there being different ways of carving the 'noumenal dough'. In
"Why There Isn't a Ready-Made World" (Putnam, 1983, pp. 205-228), Putnam
argues that reference cannot be a relation in THE WORLD because it is a
notion whose extension depends upon what we are willing to count as reference in our attempts to understand and explain the linguistic and non-linguistic
behaviour of others: reference is
a flexible, interest-relative notion: what we count as referring to something depends on background knowledge and our willingness to be charitable in interpretation. To read a relation
so deeply human and so pervasively intentional into the world and to call the resulting metaphysical picture satisfactory ... is absurd. (Putnam, 1983, p. 225)

In a similar vein, Putnam (1985, 1988) has recently argued that Realism cannot
account for the phenomenon of conceptual relativity, because any attempt to
appeal to different ways of cutting the 'noumenal dough' falls foul of the
question, of what does the 'dough' itself consist? To offer any answer to this
question is to privilege one among the many categorial frameworks countenanced by the thesis of conceptual relativity.
But why is it not open to the Realist to reply that, while we may not be
able to tell which of our categorial frameworks corresponds to the way in which
THE WORLD divides itself into kinds, this doesn't count against the Realist
contention that there is "a way THE WORLD is"? Why, in other words, is
the damage sustained by Realism in the face of Putnamian arguments for
'conceptual relativity' any more than epistemological damage? (See, again,
Lewis, 1984, for this charge). Exactly parallel considerations apply to the
arguments over reference. Granted that our practices of ascribing referents
are permeated by our explanatory interests, and granted that these features
cannot with any plausibility be imputed to the Realist's WORLD, why should
this reflect upon the nature of reference itself, the thing we attempt to correctly
characterise through such practices?

Our earlier reflections on the Kantian roots of Putnam's anti-Realism suggest
the following parallel. Allison, in discussing the thesis of the Transcendental
Aesthetic, notes that a common charge against Kant is that he fails to consider
a third alternative to the theses that (i) space and time are transcendentally real,
and (ii) space and time are merely forms of our sensibility and therefore
transcendentally ideal. Why might it not be the case that space and time are
both forms of our sensible intuition and also properties of things in themselves?
The reason why Kant rejects this "neglected alternative", according to Allison,
is that it follows from the very notion of a form of sensibility, as an "epistemic condition", that such a feature cannot be ascribed to things independently
of our cognition of them. Put more forthrightly, it is a corollary of the central
Kantian thesis that the objects of knowledge are in their very nature representable (see Allison, 1983, chapter 2), that no constitutive feature of the mode
whereby such objects are represented can be ascribed to things considered
as logically distinct from their representability.
It obviously falls beyond the purview of this paper to evaluate either
Allison's reading of Kant on these matters or the Kantian thesis so read. But
there is an interesting parallel between this Kantian response to the "neglected
alternative" and Putnam's response to the 'fortified' canonical response to
the MTA, where the latter is viewed as a proposed third alternative to an
Externalism committed to "a fixed totality of mind-independent objects", and
an Internalism committed to the thesis that objects don't exist independently
of conceptual schemes. The Realist, like Kant's critics, may propose that we
can quite consistently hold both (a) that certain categorial notions are products
of our 'conceptual scheming', and (b) that such notions may represent properties of 'things in themselves', i.e. categories inherent in the structure of
THE WORLD itself (see, e.g., Hacking, 1983, chap. 5). Putnam's response
is that the notions in question - 'cause', 'fact', 'object', 'existence', and the
semantic notions of 'reference' and 'truth' - are inextricably embedded in
our cognitive and representational endeavours, as beings with a particular
biologically determined nature, with particular interests and saliences. Our
notion of reference, for example, has its place in our mutual interpretations
of one another as rational agents. Our notion of truth and of facthood cannot
be separated from criteria of relevance that reflect our fundamental values, and
from judgments of right assertibility that draw upon our full cognitive capacity
(see, e.g., Putnam, 1981, chap. 9). Our notions of reference, truth, and facthood
cannot be separated from our cognitive efforts, as beings whose cognition is
'bounded' by our nature, to represent to ourselves a knowable world and a
community of fellow cognitive agents. Such notions, then, cannot be taken
to stand for properties of a representation-independent WORLD, because
they pertain to a mode of representation.
Similar considerations apply if we consider the Realist's contention that
the world of objects that we represent through our various 'versions' may,
perhaps unknowably, correspond to the order of objects comprised by THE
WORLD itself. For a precondition for the intelligibility of such a claim is
that objects can be postulated independently of the representation of objects,
where 'representation' is understood in the 'transcendental' rather than the
'empirical' sense. It is the coherence of such a precondition that Putnam,
following Allison's Kant, has challenged in his recent discussions of "conceptual relativity":
What is (by commonsense standards) the same situation can be described in many different ways,
depending on how we use the words. The situation does not itself legislate how words like
"object", "entity", and "exist" must be used. What is wrong with the notion of objects existing
"independently" of conceptual schemes is that there are no standards for the use of even the
logical notions apart from conceptual choices ... We can and should insist that some facts are
there to be discovered and not legislated by us. But this is something to be said when one has
adopted a way of speaking, a language, a "conceptual scheme". To talk of "facts" without
specifying the language to be used is to talk of nothing; the word "fact" no more has its
use fixed by the world itself than does the word "exist" or the word "object". (Putnam, 1988,
p. 114)

The notion of an 'object' is necessarily bound up with those mechanisms of
individuation, enumeration, and reidentification that are - notoriously - responsible for the purported inscrutability of reference (Quine, 1960, chapter 2).
It only makes sense to talk of objects, then, where there exist criteria that determine when we have the same object on different occasions, or different objects
on the same occasion. If the claim that there is a representation-independent
order of objects is to be coherent, then the logical categories necessary for such
individuation must apply to THE WORLD in a determinate way independently
of our practice. But, as with the individuation of referents, it seems impossible to separate such categories determinately applying to a given domain
from their being applied as part of the cognitive activity of rational agents.
THE WORLD is no more capable of individuating objects than it is of sorting
objects into kinds, and for similar reasons. Thus for Putnam, as for Allison's
Kant, the "neglected alternative" to transcendental realism and transcendental
idealism is incoherent because it assumes that certain features of the 'mode
of representation' can sensibly be ascribed to a WORLD that is, by definition, logically independent of its own representability.
6. I have argued, in this paper, against two versions of the canonical response
to the MTA. The simpler version, which underlies Hansen's Lockean interpretation of the argument, was seen to rest upon inadequate attention to the
Kantian roots of the critique of Realism. If remedying this deficiency is the
key to 'unLockeing' the model-theoretic argument, further attention to the parallels between Putnam and Kant developed in our discussion of Hansen, and
to their bearing on the doctrine of 'conceptual relativity' to which Putnam now
accords a central role in his arguments against Realism, may help us to understand Putnam's answer to the 'fortified' version of the canonical response,
thereby helping to unlock the argument as well.
McGill University

NOTES

1 Hansen claims that his concern is to defend what he terms "minimal realism", and that
Putnam's characterisation of Realism incorporates certain substantive assumptions that neither
belong to nor are implied by minimal realism. But the central tenets of minimal realism, as
identified by Hansen, seem identical to those of the Realism sketched above - namely, a commitment to "the notion of a mind-independent world and to a non-epistemic notion of truth"
(Hansen, 1987, p. 78).
2 It should be noted that both components of the fortified canonical strategy are necessary if
it is to have a chance of succeeding where its unfortified counterpart failed. As Gregory Currie
(1982) has pointed out, no additional purchase on THE WORLD is obtained by merely asserting
(as Lewis does) that we intend our terms to pick out "elite classes"; for any such talk can be
added to our Ideal Theory, and a SAT interpretation can be given for the IT so augmented.

REFERENCES
Allison, H., 1983, Kant's Transcendental Idealism: An Interpretation and Defense, Yale
University Press, New Haven.
Blackburn, S., 1984, Spreading the Word, Clarendon Press, Oxford.
Brueckner, A., 1984, 'Putnam's Model-Theoretic Argument Against Metaphysical Realism',
Analysis 44, 134-140.
Currie, G., 1982, 'A Note on Realism', Philosophy of Science 49, 263-267.
Demopoulos, W., 1982, 'The Rejection of Truth-Conditional Semantics by Dummett and Putnam',
Philosophical Topics 13.1, 135-153.
Devitt, M., 1983, 'Realism and the Renegade Putnam', Noûs XVII, 291-301.
Glymour, C., 1982, 'Conceptual Scheming', Synthese 51, 169-180.
Goldman, A., 1986, Epistemology and Cognition, Harvard University Press, Cambridge.
Hacking, I., 1983, Representing and Intervening, Cambridge University Press, Cambridge.
Hansen, C., 1987, 'Putnam's Indeterminacy Argument: The Skolemisation of Absolutely Everything', Philosophical Studies 51, 77-99.
Lewis, D., 1984, 'Putnam's Paradox', Australasian Journal of Philosophy 62, 221-236.
Merrill, G. H., 1980, 'The Model-Theoretic Argument Against Realism', Philosophy of Science
47, 69-81.
Putnam, H., 1976a, 'The Locke Lectures: Meaning and Knowledge', in Putnam, 1978, pp.
7-80.
Putnam, H., 1976b, 'Realism and Reason', in Putnam, 1978, pp. 123-140.
Putnam, H., 1978, Meaning and the Moral Sciences, Routledge and Kegan Paul, London.
Putnam, H., 1980, 'Models and Reality', Journal of Symbolic Logic XLV, 464-482.
Putnam, H., 1981, Reason, Truth, and History, Cambridge University Press, Cambridge.
Putnam, H., 1983, Realism and Reason: Philosophical Papers Volume III, Cambridge University
Press, Cambridge.
Putnam, H., 1985, The Many Faces of Realism, Open Court, LaSalle.
Putnam, H., 1988, Representation and Reality, M.I.T. Press, Cambridge.
Putnam, H., 1990, Realism with a Human Face, Harvard University Press, Cambridge.
Quine, Willard Van Orman, 1960, Word and Object, M.I.T. Press, Cambridge.

JEAN LEROUX

HELMHOLTZ AND MODERN EMPIRICISM

I. Helmholtz's works in physical geometry, the semiotics associated with his
theory of perception and, in a more general vein, the Kantian influence of
his epistemology have not failed to draw the attention of philosophers.
However, Helmholtz's attitude towards scientific realism has scarcely been discussed.1 While Helmholtz surely held a realist position toward laws of nature
(thus espousing what could be called nomological realism), I want to underline significant aspects of his epistemology that indicate a rather sceptical stand
towards the realist thesis, and put these aspects in a historical-philosophical
perspective. With this in view, I will first indicate how Helmholtz, on the basis
of his investigations in physiology, came to consider sensations as signs, this
semiotic conception of sensations being at the center of his views on scientific realism. I will then discuss ensuing aspects of Helmholtz's theory of
science that show strong anti-realist tendencies and appear to anticipate major
themes of latter-day empiricism.

II. Helmholtz's century has occasionally been characterized as having witnessed in physics the refutation of Kantian philosophy of science, whereas
it was in fact marked by multiple attempted transformations of the
latter.2 Many physicists, Helmholtz among them, felt the need to emend the
Kantianism in which they had been trained. If German physicists of the first
half of the nineteenth century had been more than willing to discuss philosophical topics, they had also been unanimous in their hostility towards the
Naturphilosophie associated with the then prevailing idealism in German
universities.3 Helmholtz himself deplored this unhappy relation which, under
the influence of what he called the "philosophy of identity" of Schelling and
Hegel, had come to exist between philosophy and natural science. Talking
about the generation of physicists that preceded him, Helmholtz remarked
that Hegel's philosophy of nature "seemed, at least to natural philosophers,
absolutely meaningless. Of all the distinguished scientists who were his contemporaries, not one was found to stand up for his ideas".4
On the other hand, Helmholtz claimed to have remained a faithful Kantian
throughout his career, considering that what he had altered in Kant's philosophy of geometry was of secondary importance compared to what had been
Kant's fundamental results.5 By this, he meant the demonstration of the transcendental character of the principle of causality (as a principle of the
intelligibility of natural phenomena). "The law of causality", writes Helmholtz,
"is in reality a transcendental law, a law which is given a priori. It is impossible to prove it by experience, for [...] even the most elementary levels of
experience are impossible [...] without the law of causality";6 nor can it be
M. Marion and R. S. Cohen (eds.), Quebec Studies in the Philosophy of Science I, 287-296.
1995 Kluwer Academic Publishers.

refuted by any possible experience, for "if we founder anywhere in applying
the law of causation, we do not conclude that it is false, but simply that we
do not yet completely understand the complex of causes mutually interacting
in the given phenomenon."7
The transcendental character of causality underlies Helmholtz's theory of
perception, which he views as both the scientific completion of Kantian philosophy and the epistemological extension of his own investigations in optical
and acoustical physiology. We know that Kant had distinguished between
two faculties of representation: sensibility, a purely receptive faculty, and
understanding, which he qualified as "spontaneous". These mutually exclusive functions had to play a complementary role in the constitution of valid,
objective knowledge. Kant called "synthesis" the fundamental activity of understanding, by which a multiplicity of representations are united in a single
one. One could read in the Transcendental Analytic how understanding is
already active in the formation of our perceptions that are the product of a
synthesis of a multiplicity of sensations (the latter constituting for Kant the
given data of sensibility). Helmholtz adopted the same view while putting
emphasis on the causal dimension involved in perception and allowing a larger
part to experience and certain unconscious activities of the mind in the process
of perception.8
The cornerstone of Helmholtz's theory of perception is the so-called "theory
of the specific nerve energies", which he adopted from his former teacher
Johannes Müller. This theory postulates the existence of a causal relation
between external objects and our sensations. Our sensations are the effects
caused by external objects that produce excitations of our sense organs. The
different properties of the objects perceived by the senses are simply the
specific effects that external objects produce on the organs and, as such, depend
essentially upon the physiological composition of our senses. The quality of
the sensation caused by the external object depends on the peculiarities of both
the body producing it and the body which is affected. That is to say, all properties attributable to external objects through sensation do not denote something
that is peculiar to the individual object by itself, but invariably imply some
relation to our sense organs. Thus, there is no relation of similarity between
the qualities of our sensations and those of the external objects. In Helmholtz's
terms, this means that sensations are not images depicting external objects, but
only signs or symbols indicating their presence as agents:
Our sensations are simply effects which are produced in our organs by objective causes; precisely how these effects manifest themselves depends principally and in essence upon the type
of apparatus that reacts to the objective causes. What information then, can the qualities of
such sensations give us about the characteristics of the external causes and influences which
produce them? Only this: our sensations are signs, not images, of such characteristics. One expects
an image to be similar in some respect to the object of which it is an image [...]. A sign, however,
need not be similar in any way to that of which it is a sign. The sole relationship between
them is that the same object, appearing under the same conditions, must evoke the same sign;
thus different signs always signify different causes or influences.9

All signs are subject to interpretation. Our perceptions result from an
interpretation which is accomplished at an unconscious level. More precisely,
this interpretation is carried out by unconscious inferences where, as we have
seen, the principle of causality is already at work.10 More generally, perceptions arise from the unconscious activity of the mind that compares and
associates a large quantity of signs, learning thus to use these signs in pretty
much the same way that we learn to use words. Perception is learned and
this learning is conceived by Helmholtz in an analogous way to the learning
of a mother tongue. These semiotic systems differ essentially in that sensations, contrary to linguistic signs, are natural signs.11
The fact that properties of objects as they affect us bear no similarity to
the properties of objects in the external world seems to put into question our
ability to form an image of reality, or at least an objective image of reality.
But time, together with causality, heals all things. Since there is a causal
relation between the sign and the external object of which it is a sign, and since
the principle of causality assures us that under similar conditions, like causes
produce like effects, we can rest assured that different signs, as effects, refer
to different causes. In other words, since sensations are causally related to
external objects, the causal relations between the latter will be reflected by
the regular uniformity of our sensations. This lawful regularity that we observe
in the temporal sequence of our sensations is an image, and not only a sign,
of the lawful regularity of the events that cause them, so that what we can
make ourselves pictures of what is lawful in reality. If we do not know and
cannot know what our sensations refer to in the domain of reality, the principle of causality ensures us that differences in the realm of sensorial signs
correspond to differences in the order of things; we can indeed make ourselves
a picture of reality, but the latter is objective only as image of changes and
laws that govern these changes. The main point is that this is all that science
needs.
What we have called above Helmholtz's "nomological realism" follows
quite directly from his transcendental viewpoint on causality, his semiotic conception of perception and the basic idea that the temporal relation is the only
respect in which there can be an agreement between perceptions and reality.
Similar views would later be reiterated with force by his former student
Heinrich Hertz in his introduction to The Principles of Mechanics:
We form for ourselves images or symbols of external objects; and the form which we give
them is such that the necessary consequents of the images in thought [denknotwendige Folgen]
are always the images of the necessary consequents in nature [naturnotwendige Folgen] of the
things pictured. In order that this requirement may be satisfied, there must be a certain conformity between nature and our thought. [...] The images which we here speak of are our
conceptions of things. With the things themselves they are in conformity in one important respect,
namely, in satisfying the above-mentioned requirement. For our purpose it is not necessary
that they should be in conformity with the things in any other respect whatever. As a matter of
fact, we do not know, nor have we any means of knowing, whether our conceptions of things
are in conformity with them in any other than this one fundamental respect.12

From a historical point of view, an argument could be made here that
Wittgenstein's picture theory of the Tractatus was not only borrowed from
Hertz and generalized (from a temporal to a logical interpretation of the
term Folge) for epistemological purposes, but that the fundamental idea
originated with Helmholtz. Indeed, we can trace back to Helmholtz the solution
to the problem of knowing what a picture must have in common with what
is pictured (in order for a picture to be a picture of something) when the
elements of the picture and the pictured elements are of different nature. In
contemporary terms, we could say that the relation between the sequence of
internal sensations and the sequence of their external causes is a relation of
structural identity or isomorphism. It is through this isomorphism that we
are enabled to have a picture of the nomological regularity of the external
world:
I need not go into the fact that it is a contradictio in adjecto to try to present the actual [das
Reelle] or Kant's "thing in itself" through direct determinations that do not take the form of
our representation into account. This has already been often discussed. What we can attain,
however, is knowledge of the lawful order in the realm of reality [des Wirklichen], this order
being of course only represented in the sign system or for sense impressions.13

Helmholtz inaugurated a picture theory that (in Kantian spirit) explains the
possibility of Newtonian mechanics as a description of movements of matter
in space. It is noteworthy that Helmholtz's theory of science bases knowledge of the external world on a single fundamental process of symbol formation
(the "linguistic turn"?) and not, as was the case with Kant, on two mutually
exclusive processes of intuition and intellection.
III. This semiotic conception of sensations is at the center of a complex position
regarding scientific realism.14 Contrary to the opinion that Helmholtz always
remained a consistent realist, although his realism faltered for a moment in
his later Introduction to the Lectures on Theoretical Physics (1894), I want
to contend that Helmholtz had from the start a more nuanced standpoint on
the question.
IIIa. Already in the Memoir on the Conservation of Force,15 which was his
first article on physics, Helmholtz took up considerations that run counter to
the realist thesis. One important aspect of this anti-realist standpoint concerns
as it were the notion of force in mechanics.
The Memoir starts with general considerations concerning the aim of
science. "The task of the physical sciences", says Helmholtz, "is to discover
laws so that natural phenomena can be traced back to, and deduced from,
general principles".16 The search for such laws is the task of experimental
physics, while theoretical physics tries to find the causes of natural phenomena
on the basis of their observable effects. We are justified in and indeed compelled to this search for causes by the fundamental principle according to which
all changes in nature must have a sufficient cause. These causes that we
attribute to phenomena are variable or invariable; in the former case, the same
principle that makes us ask for the cause of changes will make us ask anew
for the cause of this variation, so that only the discovery of constant causes
will achieve complete comprehension of natural phenomena.17
Helmholtz goes on to say that in order to achieve this task, we have recourse
to two abstractions: force and matter. These abstractions are correlative in
the sense that they only have definite meanings in relation to and in combination with each other. A pure force, for instance, would be contradictory
and self-refuting, since it would correspond to a law describing change where
there is nothing to undergo change. In an appendix to the Memoir, Helmholtz
notes that this impossibility to conceive of force and matter without one another
follows simply from the fact that a force always presupposes certain conditions under which it is realized. "A force separated from matter would be
the objectification of a law, which lacked the conditions for its realization".18
To talk about forces in general terms without giving the conditions under which
they are realized is meaningless. This principle motivates the ontological
distrust regarding the diverse forces posited by mechanics, as it also underlies the descriptivist standpoint in mechanics of which Kirchhoff was the
most influential proponent.19 If he does not completely adhere to Kirchhoff's
elimination from physics of the search for causes and causal explanations,
Helmholtz, on the other hand, subordinates the notions of force and cause
to that of law. We only speak of causes and forces, he explains, when we
recognize them as independent of our will. What is independent of our will
in turn can only be recognized when we are able to recognize the uniform
effects of our will. Hence, lawlikeness is the essential presupposition of
what has the character of reality.
In short, the Helmholtz of the Memoir acknowledges the reality of causes
because he judges that the category of causality is necessary to our comprehension of natural phenomena. However, his realism about causality is a critical
realism since our knowledge of causes always remains hypothetical and subject
to revision. As for causes themselves, they are real insofar as they are effective (Effectivity - Wirklichkeit - being the German word for reality). If it is
true that Helmholtz, at the time of the Memoir, considered central forces as objective, this reality of forces as causes of movement was nevertheless reduced to
the reality of laws. This brand of realism constitutes metaphysical realism
and not scientific realism, since it is the realm of laws, and not laws themselves, that, for Helmholtz, is devoid of hypothetical character.
IIIb. Helmholtz's scepticism about forces is accentuated in his Introduction
to the Lectures on Theoretical Physics, where he draws close to the descriptivist view alluded to above. Calling upon the tradition of Faraday, Maxwell
and Kelvin, and using Kirchhoff's own words, he states that the diverse laws
of movement constitute only a description of observable events, while adding
that a complete and simplest possible description of natural phenomena cannot
do without dynamical forces. Helmholtz states that he prefers to talk about
"laws of forces" and not "forces", since, according to the descriptivist view,
the notion of force, qua substance, has no factual content. "In a factual sense,
to be sure, we express nothing more and nothing less by this abstraction than
what is contained in the mere description of the phenomena."20 This instrumentalist standpoint, which only sees the necessity of the notion of force in
the context of prediction and in view of the economy of theoretical description, foreshadows indeed the logical empiricists' discussion of "theoretical
terms" or "auxiliary expressions" and their attempted syntactical and semantical elimination. I refer here of course to the use that Carnap and Hempel have
made of W. Craig's and F. P. Ramsey's formal results on definability within
first-order axiomatic systems. The idea that a force, that is, a capacity to
produce effects, cannot be conceived of independently of the effects it
produces, finds its definitive formulation in Helmholtz's thesis that a force,
separated from its conditions of realization, has no factual content. Logical
Empiricism was later to take over this notion of factual content and develop
it as a metatheoretical notion generalizable to entire theories. The factual
content of a physical or empirical theory was either defined in syntactical terms
as the class of all observational logical consequences of the theory's axioms,
or in semantical terms as the class of all observational substructures of the
models of the theory. It is interesting to note that already for Helmholtz, the
notion of factual content or empirical content was a semantic notion: Helmholtz
used the term "factual meaning".
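The two later characterizations of factual content mentioned above can be written out schematically. What follows is only a sketch in notation that is not the author's: T stands for the axioms of a theory, L_O for an assumed observational sublanguage with vocabulary V_O, Mod(T) for the class of models of T, and the restriction of a model to V_O is taken, as one possible reading, to be what is meant by an "observational substructure".

```latex
% Syntactic reading: the observational sentences derivable from the axioms T
\mathrm{Cont}_{\mathrm{syn}}(T) \;=\; \{\, \varphi \in L_O \;:\; T \vdash \varphi \,\}

% Semantic reading: the observational parts of the models of T
% (the reduct to V_O is an assumption about how "substructure" is made precise)
\mathrm{Cont}_{\mathrm{sem}}(T) \;=\; \{\, \mathfrak{M}|_{V_O} \;:\; \mathfrak{M} \in \mathrm{Mod}(T) \,\}
```

On either reading, two theories that differ only in their non-observational vocabulary can have the same factual content, which is the sense in which a force "separated from its conditions of realization" adds nothing factual.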
IIIc. The Introduction to the Lectures brings forward new considerations
enabling Helmholtz to present more systematically his views on the ontological status of forces. The systematic character of his standpoint lies in the
fact that it is now fully integrated into a theory of language. In the first Section
of the Lectures, Helmholtz explains that we are naturally inclined to use the
substantive mode of expression and, instead of speaking of "laws of forces"
- that term being closer to the factual meaning - we prefer to speak substantially of forces acting in a specific way. "As long as its meaning was not
completely clear", writes Helmholtz, "this mode of expression caused many
errors. More specifically, the abstract substantive force was thought to denote
something actually existent, and some believed that they were entitled to make
statements about the real properties of forces. [...] Of this hypothetical substantive, for that is what we must consider a force to be, we know nothing
except that it lies in its nature to produce a specific effect".21
We knew that forces were idealizations: we now know that they are no
more and no less than the substantivization of laws of force, and this
substantivization leads us to adopt the realist stance. This explanation of
realism as originating in the use of language (to adopt the substantive
mode as "preferred mode of expression") and as "leading to statements about
the real properties of forces that, however, either are tautologies or have only
an apparently real content", prefigures Carnap's distinction between the
material mode of speech (inhaltliche Redeweise) and the formal mode of
speech. 22 According to Carnap, ontological theses are as such metalinguistic
statements pertaining to the linguistic frame to be adopted in order to describe
a domain of discourse. This formal resolution of the issue between realism and
idealism was later generally taken over by the logical empiricists. Ernest
Nagel's classical work, The Structure of Science (1961), is typical in this
respect. Summarizing the debate between realism and instrumentalism (as a
form of anti-realism) and adopting Carnap's standpoint, Nagel concludes:
"In brief, the opposition between these views is a conflict over preferred modes
of speech".23
Helmholtz again foreshadows logical empiricist tenets by stating that the
thesis of scientific realism can only be stated in the material mode of speech.
But this thesis is external to science, since physical laws (and this is manifest
when formulated in the formal mode) have no ontological import, i.e. they
do not carry any presuppositions on the nature of the objects of the domain
of discourse. In the Appendix to 'The Facts of Perception', Helmholtz had
already incorporated his position on the realism issue within a theory of
language: "In this first section", writes Helmholtz, "I shall retain the realistic hypothesis and use its language. I shall also assume that the things we
perceive as objective really do exist and that they act on our senses. I do
this only in order to be able to use the simple, understandable language of daily
life [...]. I shall drop the realistic hypothesis in later paragraphs and repeat
the same discussion in abstract language, without making any special assumptions concerning the nature of reality."24 Carnap's notion of "linguistic frames",
as well as the notion of "tolerance", seem already present in these lines.
Helmholtz considered the thesis of subjective idealism just as admissible as
the thesis of realism. As with the theses of materialism and spiritualism, they
were for him equally possible, equally irrefutable, equally metaphysical. 25 It
is in view of their heuristic value that Helmholtz accorded to metaphysical
hypotheses their right of existence in science. However, the principles of
scientific explanations that he applied in his own scientific works were clearly
of an empiricist nature and counterbalanced his tolerance towards metaphysics. 26
Finally, I would like to mention very briefly a topic that would require an
autonomous treatment and has implications for the realism issue: it has to
do with Helmholtz's conception of truth. This is summarized in the following
passage of the Treatise on Physiological Optics:
In my opinion ... there can be no possible sense in speaking of any other truth of our ideas
except of a practical truth. Our ideas of things cannot be anything but symbols, natural signs
for things which we learn how to use in order to regulate our movements and actions. [...]
To ask whether the idea I have of a table, its form, strength, colour, weight, and so on, is true
per se, apart from any practical use I can make of this idea, and whether it corresponds to the
real thing, or is false and due to an illusion, has just as much sense as to ask whether a certain
musical note is red, yellow, or blue. 27

While it is generally recognized that the notion of truth as correspondence
to facts or reality has affinities with realism, it is equally significant that
Helmholtz's semiotic conception of sensations directly leads him to reject
the correspondence theory of truth in favor of a pragmatic theory of truth.
IV. Summing up, let us note that already in the Memoir (1847), Helmholtz took
up considerations that put the realist thesis directly into question. The same
considerations were further developed in the third Volume of his Physiological
Optics (1867) and in 'The Facts of Perception' (1878), a major article that
summarizes the results of his investigations in physiology, discusses their philosophical implications and accentuates the anti-realist features of his theory
of science. The same views were later to be reiterated in a more definite and
systematic manner in the Lectures on Theoretical Physics (1894), where, for
instance, Helmholtz, coming extremely close to endorsing Kirchhoff's descriptivism, states that the notion of force as such has no factual content. Taken
together, Helmholtz's construal of the realist thesis as a metaphysical thesis,
his recourse to the idea of "preferred modes of speech" in order to account
for the realist thesis, his adoption of a pragmatic notion of truth and his
views on the ontological and epistemological status of the notion of force in
mechanics all foreshadow some of the major tenets and developments of twentieth century Empiricism.
University of Ottawa
NOTES
1
Notorious analyses are found in Lenin's Materialism and Empirio-Criticism (1909), which
is essentially polemical, and Cassirer's Philosophy of Symbolic Forms, III (1929), which treats
the question in an incidental manner. Russell Kahl, editor and translator of Selected Writings
of Hermann von Helmholtz (Middletown, Conn.: Wesleyan University Press, 1971) addresses
the issue in the introduction; his main contention (to which I will come back later) is that:
"On the whole he [Helmholtz] continued to be a persistent realist, arguing among other things
that a correct analysis of perception, while it does not provide conclusive evidence, does support
the realist position" (p. xx). More recently, G. Hatfield (The Natural and the Normative. Theories
of Spatial Perception from Kant to Helmholtz. Cambridge (Mass.): The MIT Press, 1990) has
provided a thoroughgoing discussion of the question with respect to spatial realism and idealism.
2 As F. Suppe points out, "By the turn of the [nineteenth] Century, the three main philosophic
positions held in the German scientific community were mechanistic materialism, neo-Kantianism,
and Machian neo-positivism, with neo-Kantianism being the most commonly held" (The Structure
of Scientific Theories, 2nd ed., Urbana: University of Illinois Press, 1977, p. 10).
3 Taking a look at the physics textbooks of the epoch, C. Jungnickel and R. McCormmach
(Intellectual Mastery of Nature. Vol. I: Theoretical Physics from Ohm to Einstein. Chicago:
University of Chicago Press, 1986) note that "almost to a man", the early nineteenth century
German university physicists declared themselves to be influenced by Kant's work and demonstrated it in their publications. "The move away from Aristotelian textbooks, which had occurred
not long before, to textbooks that stressed experience had been followed by renewed attention
to the nature of objective knowledge and particularly to the process by which experience becomes
scientific knowledge" (p. 24). Further on, they add: "Except for matters of emphasis, the physicists' quarrel was not with philosophy as a whole but only with a part of it, Naturphilosophie,
which they often did not even dignify with the name of philosophy. The heated campaign of
the physicists against the nature philosophers did not stop with denunciations in their textbooks. These were faint echoes of what they said in private letters, reviews, addresses, periodicals, and elsewhere" (p. 27).
4
'Über das Verhältnis der Naturwissenschaften zur Gesamtheit der Wissenschaft' (1862),
p. 126 in Kahl, Selected Writings, trans. as 'The Relation of the Natural Sciences to Science in
General', pp. 122-143.
5 However, what Helmholtz had emended could barely be considered of secondary importance: it had to do with the epistemological status of the axioms of geometry. Appealing to
previous work done by Beltrami and reaching to results on n-dimensional manifolds that Riemann
had obtained just before, Helmholtz showed that physical geometry is based on certain physical
facts and constitutes as such an empirical science. Eventually, Helmholtz admitted that space
could be considered as an a priori form of intuition; but the question as to whether this space
is Euclidian or not (a question that is involved in the choice of axioms of geometry) remained
for him an empirical question.
6 'Die Tatsachen in der Wahrnehmung' (1878), p. 390 in Kahl, Selected Writings, trans. as
'The Facts of Perception', pp. 366-408.
7
Handbuch der physiologischen Optik, III (1867), p. 33 in J. P. Southall ed., Helmholtz's
Treatise on Physiological Optics, III. New York: Dover, 1962.
8 "What we unquestionably can find as a fact, without any hypothetical element whatsoever,
is the lawful regularity of the phenomena. From the very first, in the case where we perceive
stationary objects distributed before us in space, this perception involves the recognition of a
uniform or lawlike connection between our movements and the sensations which result from
them. Thus even the most elementary ideas contain a mental element and occur in accordance
with the laws of thought" ('The Facts of Perception', p. 386 in Kahl, Selected Writings).
9 'The Facts of Perception', p. 372 in Kahl, Selected Writings.
10
Helmholtz talks about "unconscious conclusions" that are the product of inductive judgements. This is tied with Helmholtz's associational account of the processes underlying perception.
With the benefit of hindsight, one would judge Helmholtz's conception of logic, especially his
associative account of inference, as psychologistic. As Hatfield (The Natural and the Normative,
p. 207) remarks: "Retrospectively describing his [Helmholtz's] account of the inductive inferences of perception, he asserted that they depend not merely on the laws of association but
also on the Kantian 'law of causality' ".
11
Cf. 'Über das Ziel und die Fortschritte der Wissenschaft' (1869), trans. as 'The Aim and
Progress of Physical Science', pp. 223-245 in Kahl, Selected Writings: "From these [Müller's
law of the specific energy of the senses] and related facts we are led to the very important
conclusion that our sensations are, insofar as their quality is concerned, only signs of external
objects, not images with any sort of resemblance to them. An image must be similar in some
respect to an object. [... ]. For a sign, it is sufficient that it appear whenever that which it
signifies makes an appearance, the correspondence between them being restricted to their
appearing simultaneously. The correspondence existing between our sensations and the objects
producing them is precisely of this kind. Sensations are signs which we have learned to decipher.
They form a language given to us by our physical make-up, a language by which external
objects speak to us. It is, however, a language, which, like our mother tongue, we can learn
only by practice and experience" (pp. 241-242).
12
Die Prinzipien der Mechanik (1894), pp. 1-2 in The Principles of Mechanics, trans. by D.
E. Jones and J. T. Walley. New York: Dover, 1956.
13
This translation differs somewhat from Kahl's (p. 388 in Selected Writings). Cf. H. von
Helmholtz, Die Tatsachen in der Wahrnehmung/Zählen und Messen. Darmstadt: Wissenschaftliche Buchgesellschaft, 1959, p. 45.
14
Helmholtz's stand on the question is surely more complex than the above quoted main
contention from Kahl would lead one to believe, for Kahl feels the need to add: "Sometimes,
especially in the Introduction to the Lectures on Theoretical Physics, his [Helmholtz's] scientific realism faltered for a moment and he favored the position that to use abstract substantives
such as force was only a façon de parler. It was convenient and consistent with the natural,
common-sense realism of everyday life; but a formulation, for example, of a law of motion
using the term force ("the force of gravity") was, he maintained at these times, empirically no
more than a description of motion" (Selected Writings, p. xx).
15
'Über die Erhaltung der Kraft' (1847), pp. 12-75 in Wissenschaftliche Abhandlungen,
I (Leipzig: Barth, 1882), also reproduced as monograph, Über die Erhaltung der Kraft. Weinheim:
Physik-Verlag, 1883, trans. as 'The Conservation of Force: A Physical Memoir', pp. 3-55 in
Kahl, Selected Writings.
16
Memoir, p. 3 in Kahl, Selected Writings.
17
One of the major points of the memoir was indeed to insist on the necessity of reducing all
natural phenomena to invariant forces of attraction and repulsion, to central movement forces,
the intensities of which are dependent only upon the mutual distance of material bodies.
18
Memoir, Appendix I, pp. 49-50 in Kahl, Selected Writings.
19
Cf. Gustav Kirchhoff, Vorlesungen über mathematische Physik: Mechanik. 2nd ed., Leipzig,
1877. Preface: "We usually define Mechanics as the science of forces, and forces, as causes which
produce movements or have the capacity to produce movements. [... ]. However, [this definition] is beset with the obscurity inherent in the concepts of cause and capacity. Given the rigor
of demonstrations otherwise obtained in Mechanics, it has appeared desirable to me to remove
such obscurities, even if this proves to be only feasible at the cost of restricting its assigned
task. Hence I give to Mechanics the task to describe the movements that occur in nature, in a
complete and simplest possible way. By this, I mean that Mechanics has solely to determine
what are the phenomena that occur, and not to search for their causes." (My translation)
20
Einleitung zu den Vorlesungen über theoretische Physik [Koenig & Runge (eds.). Leipzig:
Barth, 1903], p. 521 in Kahl, Selected Writings (trans. as 'Introduction to the Lectures on
Theoretical Physics', pp. 513-529).
21
Ibid., pp. 521-4.
22
Cf. Rudolf Carnap, 'On the Character of Philosophical Problems', Philosophy of Science 1
(1934) 5-19.
23
E. Nagel, The Structure of Science. 2nd ed., Indianapolis: Hackett, 1979, p. 152.
24
'The Facts of Perception, Appendix 1', in Kahl, Selected Writings, p. 398.
25
"Undoubtedly the realistic hypothesis is the simplest that can be formulated. [...]. We cannot
assert that it is necessarily true, for opposed to it there is always the possibility of other irrefutable
idealistic hypotheses. [... ]. It is always well to keep this in mind in order not to infer from
the facts more than can be rightly inferred from them. The various idealistic and realistic interpretations are metaphysical hypotheses which, as long as they are recognized as such, are
scientifically completely justified. They may become dangerous, however, if they are presented
as dogmas or as alleged necessities of thought" ('The Facts of Perception', pp. 385-6 in Kahl,
Selected Writings).
26
The Treatise on Physiological Optics, which closes on a chapter dealing with "The
Foundations of the Empirical Theory" is eloquent in that respect: "I acknowledge that we are still
far from a real scientific comprehension of psychic phenomena. We may agree with the idealistic philosophers [Spiritualisten] and take the ground that it is absolutely impossible to
comprehend them, or we may take precisely the contrary view of the materialistic school,
according as we are inclined toward one speculation or the other. The natural philosopher must
stick to the facts and try to find out their laws; and he has no means of deciding between these
two kinds of speculation, because materialism, it should be remembered, is just as much a
metaphysical speculation or hypothesis as idealism, and therefore it has no right to decide
about matters of fact in natural philosophy except on a basis of facts. [... ] It is safer, in my
opinion, to connect the phenomena of vision with other processes that are certainly present
and actually effective [... ] instead of trying to base these phenomena on perfectly unknown
hypotheses [... ] which have been invented for the purpose and have no analogy of any sort"
(p. 532). On Helmholtz's methodological empiricism (in relation with his attack on nativistic
theories of vision), see Hatfield, The Natural and the Normative, Chapter 5, Section 2.
27
Ibid., p. 19.

WILLIAM R. SHEA

TECHNOLOGY AND THE RISE OF
THE MECHANICAL PHILOSOPHY

INTRODUCTION

The method of scientific investigation that became prevalent in the seventeenth
century rests on the assumption that the universe can be understood on the
analogy of a machine rather than on that of an organism. On this view, the
basic explanatory elements are matter and motion, where matter is characterized by size and shape, and motion is described by a small number of
rules based on the principle of inertia. This mechanical philosophy, as it
came to be known, was considered the simplest, as well as the most economical
and comprehensive of all possible accounts of nature. With Galileo and Newton
it triumphed not only in mechanics but also in cosmology and, with Descartes,
in the reduction of the phenomena of life to the working of a clock.
Why did natural philosophers come to expect that the correct explanation
of nature could be couched, indeed must be couched, in mechanical terms?
How did the Soul of the World become the Perfect Clock? By what steps
did the Machina Mundi, which meant little more than the Frame or System
of the World, acquire the narrow mechanical sense in which it is customarily used in the 17th century? There is no easy and obvious link between
progress in engineering and technology and the rise of a new philosophical
paradigm because we are dealing with a shift in expectations rather than a
straightforward causal connection. As A. Rupert Hall pointed out, "When
we speak of seventeenth century mechanics, we mean not a theory of machines
but a mathematical science of motion".1 The mechanization of nature was
rendered possible by the mathematization of the conceptual tools used to interpret nature, but mathematics alone would not have sufficed. Arithmetic and
geometry enhanced the understanding of machines as sources of power; they
did not create them. From the Middle Ages onwards, technology made steady
if unspectacular progress and acquainted an ever-increasing number of people
with the benefits of machines. I argue in this paper that the development of
mechanical power and the rise of technology created the intellectual climate
for the mechanization of the world-picture. 2
PUTTING NATURE TO WORK

The power of water was used in several earlier civilizations but it was nowhere
exploited so widely as in medieval Latin Christendom. In England in 1086
there were 5,624 water-mills south of the Trent and Severn. Given the population at the time, this is about one to every fifty households, enough to make
a profound difference in lifestyle. In the eleventh century, the tide-mill was
M. Marion and R. S. Cohen (eds.), Quebec Studies in the Philosophy of Science I, 297-307.
1995 All rights reserved.

introduced on the Adriatic and soon spread to other countries on the
Mediterranean and in Northern Europe. The water-wheel had been used to
grind corn by the Romans but it was now made to serve a variety of other
purposes, beginning with fulling, a process of beating woollen cloth in water
to shrink it in order to increase its density and durability. Initially performed
by the stamping of feet or the beating of clubs, it was transformed by the introduction of trip-hammers raised by cams on the shaft of a water-wheel, a method
that was common by the thirteenth century. Around that time water-driven
forge hammers and forge bellows were developed, soon followed by pulping
mills for paper-making, stamp-mills for crushing ore and similar devices for
pounding woad to obtain the popular blue dye that was later superseded by
indigo. In the thirteenth century, water power was also used for saw-mills
and grinding cutlery, and in the fourteenth for wire-drawing, grinding pigments
and driving lathes. By the fifteenth century, the water-wheel, which had initially been used to mill corn, had become a generalised prime mover useful
in many branches of industry and familiar to all social classes as a way of
relieving human effort. 3
Wind was another source of power that was harnessed in the late twelfth
century. Windmills had been erected in Islam since the seventh century but
these were of a different type, consisting of vertical sails fastened to the rim
of a horizontal wheel on a vertical axle that directly drove the stone that ground
the flour. The European version was, from the outset, the one with which
we are acquainted and consists of sails fanning out from a horizontal axle from
which the drive is transmitted by a series of gears. The inspiration was clearly
the water-mill described by Vitruvius in the first century B.C. but turned upside
down and with sails replacing the water-wheel. The windmill was not as easily
adaptable as the water-mill to purposes other than corn-milling but from about
1400 onwards it was used for water-lifting in the Netherlands where it became
the linchpin of their drainage system. In Bohemia, it was successfully adapted
to power a mine hoist.
The evolution of the European windmill is an interesting case of technological progress. The earliest models were of the post-mill type in which the
whole mechanism was rotated to face the wind. This severely restricted the
size of the mill. In the fourteenth century, the machinery began to be housed
in a fixed tower capped by a revolving turret that carried the sails. This
improved version could give two to three times as much power and was fully
developed by Dutch engineers in the sixteenth century.
As important as the replacement of animal and human muscle power by
wind and water was the development of the mechanical clock, the first complex
machine to become a public attraction. The Ancients had relied on water-clocks
that measured time by the amount of water that dripped from a container
through a small hole. They had added regulating devices and mechanical
connections to show the time by means of a pointer on a scale but they do
not seem to have thought of an alternative design.
The appearance of the mechanical clock, driven by a falling weight and
depending for its time-keeping on an escapement, namely a device that
regularly interrupts the motion of the clockwork, is not well attested before
the fourteenth century. From then on, these clocks quickly became very
common and were a matter of civic pride. The most famous was built in
Strasbourg in 1354 and featured a cock that stretched its neck, flapped its wings
and crowed. Around 1480 spring-driven clocks were introduced, and before
the end of the century portable versions - too large to be classed as watches
today - were available. Clock-making demanded great accuracy of workmanship and set new standards of precision. When allied to the skill of the
millwrights and the builders of other power-driven machinery it opened the
door to a new technological age.
THE STATUS OF CRAFTSMEN

The development of machinery is associated with a change of attitude towards
technology which has often been contrasted with the viewpoint of the Ancients,
especially that of Aristotle and Plato. In his Politics, Aristotle excluded
"mechanics" or artisans from citizenship in the ideal state on the grounds
that to be a citizen in the full sense one must be "released from necessary
services" (1278a 8-11). In this he echoed Plato who, in The Laws, declared,
"no resident citizen shall be numbered among those who engage in technical
crafts" (846d).
But Antiquity was not of one mind on this issue as on so many others.
We see this in Plutarch's ambiguous appraisal of the remarkable machines
that Archimedes designed in order to defend Syracuse. On the one hand,
Plutarch notes the interest of King Hiero in applied science and mentions
that the King asked Archimedes to "reduce to practice some part of his
admirable speculation in science, and by accommodating the theoretic truth
to sensation and ordinary use, bring it more within the appreciation of the
people in general". On the other hand, since Plato disparaged mechanics,
Plutarch felt he had to assure his readers that "Archimedes possessed so high
a spirit, so profound a soul, and such treasures of knowledge that ... he
repudiated as sordid and ignoble the whole trade of engineering".4 Nonetheless
Plutarch recognized that Archimedes' "renown of more than human sagacity"
rested on his technological discoveries and innovations. 5 Hence the disparagement of the crafts existed alongside a strong current of admiration for
technology in Antiquity. The Mechanica, a work of a follower of Aristotle
but commonly ascribed to the master himself, contains an important discussion of the lever, the basis of all machines, a word that was applied to any
device interposing a number of moving parts between the driving force and
the object moved. Traditionally, the five simple machines were the lever, the
capstan, the wedge, the screw and the windlass.
Archytas of Tarentum was famous for the invention of a wooden dove
that could fly, and Philo of Byzantium and Hero of Alexandria (around first
century A.D.) used compressed air to open and shut doors, move figures and
blow whistles. In the preface to book VIII of his influential Mathematical
Collections, Pappus (fourth century A.D.) tells us that the science of mechanics
"is held by philosophers to be worthy of the highest esteem, and is zealously
studied by mathematicians, because it takes almost first place in dealing with
the nature of the elements of the universe".6 Pappus' positive appraisal of
mechanics was shared by others. Vitruvius' great compendium of architecture, De Architectura, contains a number of machines, and many Renaissance
descriptions of devices for raising water and lifting or drawing weights were
inspired by those he mentioned.
THE SIGNIFICANCE OF MINING

The positive valuation of crafts was reinforced in the writings of a number
of sixteenth-century authors on mining, on mechanics and on machines. The
Sienese Vannoccio Biringuccio (1480-ca. 1538) wrote a treatise, Pirotechnia,
on ores, assaying, and smelting that is remarkable for its freshness and self-confidence. Biringuccio stressed the openness of scientific knowledge and
denounced the secret operations of alchemy as an impediment to progress.
He emphasized the accurate crediting of authorship as a form of openness
and he derided alchemists who concealed their ignorance behind a smokescreen
of citations.
Georg Bauer, a physician working in the mining regions of South Germany,
and known by the latinized form of his name, Agricola, discussed various
aspects of the extraction and preparation of metals in De re metallica, published in Basel in 1556. The sixth book, which deals with machines used in
mines for pumping out water, ventilating the shafts and hauling up the ore,
is the most lavishly illustrated section of the whole work. It caught the attention of the readers more than other parts that deal more strictly with the
nature of the various metals and how they are to be worked. Biringuccio and
Agricola were instrumental in changing a prevalent attitude towards mining
and, hence, towards nature itself. Because the earth was considered a living
being in Antiquity, the formation of metals was seen as the result of a long
gestation in womblike matrices deep below the surface. This idea carried ethical
implications for mining. In his Natural History, Pliny (23-79) warned against
invading the womb of mother earth, and conjectured that earthquakes were her
way of expressing her indignation at this violation. 7 "How innocent", he added,
"how blissful, nay even how luxurious life might be, if it coveted nothing from
any source but the surface of the earth, and, to speak briefly, nothing but
what lies ready to her hand".8 Ovid, in the Metamorphoses, contrasts the happy
state of mankind before mining was practiced with the evil let loose in the
form of greed and trickery as men dug into the earth's entrails in the age of
iron. 9 Seneca also lamented the greed that made men pry into the bosom of
the earth: "What powerful necessity bent man down, man ordinarily erect to
the stars, and buried him and plunged him to the bottom of the innermost
earth so that he might dig out gold, no less dangerous to search for than it
is to possess?".10 These texts were often quoted in the sixteenth century, for
instance by writers like Henricus Cornelius Agrippa, who published his De
incertitudine et vanitate omnium scientiarum et artium in 1530. The Ovidian
theme is also echoed in the two greatest epic poems in English: Edmund
Spenser's Faerie Queene (1595) and John Milton's Paradise Lost (1667).
Spenser laments the day when mining began:
Then gan a cursed hand the quiet wombe
Of his great Grandmother with steele to wound,
And the hid treasures in her sacred tombe
With Sacrilege to dig ... 11

Milton describes "bands of pioners with Spade and Pickaxe" who, led by
Mammon,
Ransacked the Center, and with impious hands
Rifled the bowels of their mother Earth
For Treasures better hid. Soon had his crew
Opened into the Hill a spacious wound
And Diged out ribbs of Gold 12

Biringuccio and Agricola defended mining against these strictures. They
argued that minerals and metals were blessings from heaven and that those
who did not avail themselves of them wronged themselves and their fatherland. Just as man catches fish out of the deep blue sea, so he hauls up bounty
from the deepest recesses of the Earth. Biringuccio and Agricola did not
make a frontal attack on the metaphor of the Earth as a nurturing mother,
but their vindication of mining and their praise of machinery contributed to
the demise of the organic model and prepared the rise of the mechanistic image
that replaced it.
These technological developments provided an environment where natural
philosophers had their attention directed to processes of artisans that they might
otherwise have overlooked. The relation of craftsmanship to the mechanical
philosophy was not straightforward, however. The success of craft empiricism is undeniable but it would be an overstatement to say that achievements
of the arts were so spectacular that they forced upon scholars the realization
that their procedure was barren and that they should apprentice themselves
to the craftsman. Nonetheless, philosophers changed their attitude towards
the crafts, partly because of technological progress, partly because of the recognition of a new purpose for natural philosophy.
Renaissance books on machines played a major role in this respect, especially Jacques Besson's Theatre des instruments mathematiques et mechaniques
(Lyon, 1578), Vittorio Zonca's Novo teatro di machine et edificij (Padua,
1607), and Agostino Ramelli's Le diverse et artificiose machine (Paris, 1588).
Besson's book has 60 plates, Ramelli's contains nearly 200 of exceptional
quality. All these works are addressed to a wide public conscious of the benefits
that better machinery could give them, and they are, especially Ramelli's,
aesthetically attractive. Although the engineer's workshop was new territory
for most of the illustrators, they rose to the challenge and produced pictures
that are both pleasing to the eye and accurate in the rendering of technical
details. The importance of illustrations in conveying a precise idea of the nature
and functions of the parts of a machine is obvious, and it is surprising that
these three great books should have appeared so late. The delay in publishing
sketches of mechanical inventions may have something to do with a world that
knew little of patents or copyrights, and in which inventors had no interest
in publicizing their work for others to plagiarize.
Closely related to the growing interest in machines was the increased
number of automata based on or developed from the models found in antiquity.13 The singing birds of Philo and Hero had been powered by compressed
air or steam. An important innovation of the sixteenth century that made
possible the reproduction of sound within a self-contained unit was the
revolving pinned barrel or cylinder. The action of pins or pegs attached to
the circumference of the cylinder or barrel at right angles to the axis could
be transmitted some distance by means of simple levers as the cylinder
revolved. If these levers were placed in contact with valves of organ pipes,
the pipes would sound for as long as the pins continued to make contact
with the levers. This invention made possible the completely mechanical
performance of automatic sounding instruments. One of the earliest applications of this idea can be found in an organ clock that was presented as a gift
from Queen Elizabeth to the Sultan of Turkey in 1599.
The fascination with machinery is illustrated in the writings of Renaissance
engineers such as Leonardo da Vinci (1452-1519) and Francesco di Giorgio
Martini (1439-1501). By the time Montaigne went on an extended tour of
Switzerland, Southern Germany and Italy in 1580-1581, it had become fashionable to be on the look-out for technological innovations, especially if they
had entertainment value. In his Journal de voyage, Montaigne notes practical devices for hoisting and distributing water, and he particularly admired
the fine display of fountains at the Villa d'Este in Tivoli. He comments enthusiastically on the hydraulic organs that played music to the accompaniment
of the fall of water and devices that imitated the sound of trumpets. He relates
how birds began to sing and how, when an owl appeared on a rock, the birdsong ceased abruptly. He does not seem to have realised that this sequence was
borrowed from Hero of Alexandria's Pneumatics which Federico Commandino
had made available in a Latin translation in 1575. The rest of Europe sought
to emulate Italian achievements and Henri IV borrowed from the Grand Duke
Ferdinand I (1551-1609) the services of Tommaso Francini and his brother,
Alessandro, to design the water-works at Saint-Germain-en-Laye. Their creations were to inspire Descartes who either saw them personally or read about
them in Salomon de Caus' illustrated La raison des forces mouvantes avec
diverses machines tant utiles que plaisantes ausquelles sont adjoints plusieurs
desseings de grotes et fontaines (Frankfurt, 1615).

CRAFT EMPIRICISM AND THE GUIDING HAND OF MATHEMATICS

The Italian artisans, much more given to reading than their ancestors, were
readier to look for theoretical explanations and to expect improvement from
a knowledge of mathematics. To build his cupola in Florence Brunelleschi
needed the guidance of theory, and the guidance of theory presupposed faith
in theory. Renaissance projects were impressive and needed master-builders
but it is important to remember that megaprojects, as such, do not lead to a
knowledge of the laws of nature. The Egyptian builders of the pyramids left
no treatises on statics and no writings on the resistance of materials. The
Renaissance books on machines stress how important it is to apply theoretical considerations to machines as opposed to merely putting them together
by rule of thumb. Craft knowledge had been craft secret. Vannoccio Biringuccio
inveighed against this and called for openness in science. The medieval
craftsman suffered from lack of familiarity with any theoretical notions and
from a narrow specialization and an ingrained belief that there was only one right way of doing
things. Biringuccio realized that an analysis of the action of machines
is essential to the realization of their full potential.
The spokesmen of this new school of thought were numerous but none
was more eloquent than the Lord Chancellor, Francis Bacon. In the opening
lines of the Novum Organum he wrote:
I. Man, as the minister and interpreter of nature, does and understands as much as his observations on the order of nature, either with regard to things or the mind, permit him, and neither
knows nor is capable of more.
II. The unassisted hand and the understanding left to itself possess but little power. Effects
are produced by the means of instruments and helps, which the understanding requires no less
than the hand; and as instruments either promote or regulate the motion of the hand, so those
that are applied to the mind prompt or protect the understanding.
III. Knowledge and human power are synonymous.14

Bacon stressed that knowledge was power and that nature was not only
to be modified but improved. In a significant passage of his Advancement of
Learning, he declares,
But if my judgment be of any weight, the use of history mechanical is of all others the most
radical and fundamental towards natural philosophy; such natural philosophy as shall not vanish
in the fume of subtile, sublime, or delectable speculation, but such as shall be operative to the
endowment and benefit of man's life: for it will not only minister and suggest for the present
many ingenious practices in all trades, ... but further, it will give a more true and real illumination concerning causes and axioms than is hitherto attained.15

BACONIAN IDEALS

Knowledge confers power and for the Lord Chancellor power was to be used
for the establishment of the kingdom of man over nature. This utilitarianism
has been illuminated in recent years from a rather unexpected source, the
hermetic tradition. The intent to manipulate nature for useful ends was a goal
of Paracelsian chemists and many of Bacon's ringing phrases hold levels of
meaning that are surprising. In his New Atlantis, the Head of Solomon's House
is made to say, "The End of our Foundation is the knowledge of Causes, and
secret motions of things, and the enlarging the bounds of Human Empire, to
the effecting of all things possible".16 The effecting of all things possible - Bacon's self-proclaimed goal - partakes more of the alchemist's dream than
of the practical vision of the engineer, and Bacon makes it abundantly clear
that he expected new substances to be created by manipulating the powers
and active virtues of matter. Solomon's House had deep mines where artificial metals were produced by burying materials under conditions that were
believed to be used by nature to generate metals. Earths were skilfully mixed
to produce plants and to transform one kind into another. Even animals were
produced using the same principles:
We make a number of kinds of serpents, worms, flies, fishes, of putrefaction; whereof some
are advanced (in effect) to be perfect creatures, like beasts or birds, and have sexes and do
propagate. Neither do we this by chance, but we know beforehand of what matter and commixture
what kind of those creatures will arise. 17

CARTESIAN RIGOUR

Descartes was equally insistent on the domination of nature as the aim of
the "practical" philosophy which he compared to the knowledge of craftsmen.
"His eyes were opened", he says,
to the possibility of gaining knowledge which would be very useful in life, and of discovering
a practical philosophy which might replace the speculative philosophy taught in the schools.
Through this philosophy we could know the power and action of fire, water, air, the stars, the
heavens and all the other bodies in our environment, as distinctly as we know the various crafts
of our artisans; and we could use this knowledge - as the artisans use theirs - for all the
purposes for which it is appropriate, and thus make ourselves, as it were, the lords and masters
of nature. 18

Bacon's science is still Faustian; Descartes' is completely mechanical. The
material world, all the material world, is made up of cogs and wheels, of cranks
and shafts. The phenomena that the craftsman handles and the scientist explains
are bits and pieces of machinery. Machines, regardless of their size or complexity, have no end of their own. They are to be used to promote human
interests and serve "the general welfare of mankind" as Descartes puts it.19
Nature is no longer the object of awe or even respect. On the analogy of the
operator who pushes and pulls the levers of a machine, man is not part of nature
but stands outside it, tinkering with its mechanism and seeking to enhance
its utility for mankind. The world-machine expresses order and intelligence
but it is not the kind of intelligence that is immanent and it can no longer
be compared to a Cosmic Soul. It is an intelligence that is separate and distinct
from the Nature that it produces. Man, created in God's image, also transcends nature and, in a mechanical universe, exercises his stewardship as a
mechanic rather than as a shepherd. When the world was understood as an
organism, to intervene in nature was to act upon entities endowed with their
own vital forces. Now it is only a matter of winding up, regulating and oiling
the clockwork mechanism. 20

NOVELTY AND PROGRESS

The mechanical philosophy denied that the world was an organism and as a
consequence repudiated any attempt to explain change as an endeavour to
realize forms not yet existing. The quest for tendencies or final causes was
denounced as misguided and true knowledge was identified with the discovery
of the mechanical structure that produced a specific kind of motion. This vastly
enlarged man's possibility of acting on nature because the laws that governed
the world as a whole ceased to be different from those with which he was
familiar in machines.
The fascination with clockwork figures lay in the demonstration that the
actions of living creatures could be mimicked by the simplest movements of
springs, cords and levers. There is no doubt of their influence on Descartes'
understanding of animals. In his Treatise on Man, he refers to water-powered
automata to explain the operations of the human body which, he claims, is
nothing but a machine:
Indeed, one may compare the nerves of the machine I am describing with the pipes in the
works of these fountains, its muscles and tendons with the various devices and springs which
serve to set them in motion, its animal spirits with the water which drives them, the heart with
the source of the water, and the cavities of the brain with the storage tanks. Moreover, breathing
and other such activities which are normal and natural to this machine, and which depend on
the flow of the spirits, are like the movements of a clock or mill, which the normal flow of
water can render continuous. 21

To a correspondent, Descartes says of swallows returning in the spring, "they
behave like clocks".22 In some respects, Descartes' idealized automata were
more perfect than human beings. He marvelled at their regularity, consistency and reliability, all easily explained in terms of the shape, size and motions
of their parts. If Descartes had never seen automata it is unlikely that he would
have had such faith in machines. Descartes' belief was shared by others. Robert
Boyle repeatedly referred to the world as a "great automaton"23 on the analogy
of the clock of Strasbourg. 24 As early as 1541, Joachim Rheticus wrote in
the Narratio Prima, one of the first accounts of Copernicanism: "Should we
not attribute to God, the creator of nature, that skill which we observe in the
common makers of clocks".25 The world was a machine in the literal sense
of an arrangement of material parts designed, assembled and set in motion
for a purpose by an intelligent mind outside itself, the supreme Watchmaker
and Ruler of Nature. This view depended on the Christian notion of a creative
and omnipotent God but it was also based on the experience of designing
and building machines. It may have promised more than it could deliver but
it could point to real achievements, and it came to embody an overpowering
faith in progress. Without the success of technological innovation, the mechanical
philosophy would not have been given the enthusiastic welcome it
received.
McGill University
NOTES
1 A. Rupert Hall, 'The Changing Technical Act', Technology and Culture 3 (1962), p. 513.
2 The nature of the contribution of technology to the rise of the mechanical philosophy remains
a moot question and forms a subsection of the ongoing debate on the cultural antecedents of
the scientific revolution of the seventeenth century. Leonardo Olschki in Geschichte der
neusprachlichen wissenschaftlichen Literatur, 3 vols. (Leipzig 1919-1922, Halle, 1927), and
Edgar Zilsel in 'The Sociological Roots of Science', American Journal of Sociology 47 (1942),
pp. 544-562, claimed that artisans, because of their respect for handiwork, played an important role in the development of a more empirical viewpoint. This view is generally considered
exaggerated although it has been revised and defended with qualifications by more recent scholars
such as Paolo Rossi in I filosofi e le macchine (Milan, 1962), A. Rupert Hall, 'The Scholar
and the Craftsman in the Scientific Revolution', in M. Clagett (ed.), Critical Problems in the
History of Science, Madison, 1959, pp. 3-23, and 'The Changing Technical Act', Technology
and Culture 3 (1962), pp. 501-515. More general reassessments are to be found in Alexander
Keller, 'Has Science Created Technology', Minerva XXII (1984), pp. 160-182 and Richard S.
Westfall, 'Robert Hooke, Mechanical Technology and Scientific Investigation', in John G. Burke
(ed.), The Uses of Science in the Age of Newton, Berkeley, 1983, pp. 85-110.
3 Works that carry sections on the period with which we are concerned include Abbott Payson
Usher, A History of Mechanical Inventions, revised edition, Cambridge, MA, 1954; Charles
Singer, E. J. Holmyard, A. Rupert Hall, and Trevor I. Williams (eds.), A History of Technology,
especially vol. II, The Mediterranean Civilization and the Middle Ages A.D. 1500 and vol.
III, From the Renaissance to the Industrial Revolution 1500-c. 1700, Oxford, 1957; T. K.
Derry and Trevor I. Williams, A Short History of Technology, Oxford, 1960; U. Forti, Storia
della tecnica italiana, Florence, 1940; George Basalla, The Evolution of Technology, Cambridge,
1988; Samuel Lilley, Men, Machines and History: the Story of Tools and Machines in Relation
to Social Progress, New York, 1966.
4 I quote from John Dryden's seventeenth-century translation of Plutarch's Lives revised by
Arthur Hugh Clough. New York: Modern Library, no date, being a reprint of Clough's edition
of 1864. The passages are from The Life of Marcellus XIV, 4, and XVII, 3-4.
5 Ibid.
6 Quoted in Ivor Thomas (editor and translator), Greek Mathematical Works, 2 vols. Loeb
Classical Library. Cambridge, MA, 1941, reprinted 1981, vol. 2, p. 615.
7 Historia Naturalis, book XXXIII, 1. Pliny, Natural History, translated by H. Rackham (Loeb
Classical Library). Cambridge, MA, 1968, vol. 9, p. 3.
8 Historia Naturalis, book XXXIII, 1, ibid., p. 5.
9 Metamorphoses, book I, lines 137-150.
10 Naturales Questiones, book V, 15.3, in Seneca, Naturales Questiones, translated by Thomas
A. Corcoran (Loeb Classical Library). Cambridge, MA, 1972, vol. 2, p. 103.
11 Edmund Spenser, The Faerie Queene, book II, canto 7, stanza 17.
12 Paradise Lost, book I, lines 686-690.
13 The role of automata is analysed by Derek J. De Solla Price in 'Automata and the Origins
of Mechanism and Mechanistic Philosophy', Technology and Culture 5 (1964), pp. 9-23, and
Silvio A. Bedini, 'The Role of Automata in the History of Technology', Technology and Culture
5 (1964), pp. 24-42. On the influence of Hero, see the excellent survey by Marie Boas, 'Hero's
Pneumatica: A Study of its Transmission and Influence', Isis 40 (1949), pp. 38-48.
14 Novum Organum, book I, aphorisms 1-3, in The Works of Francis Bacon, edited by James
Spedding, Robert Leslie Ellis and Douglas Denon Heath, 13 vols. London, 1857-1874, vol. I,
p. 157.
15 Francis Bacon, Advancement of Learning (Everyman's Library). New York, 1965, pp. 72-73.
16 Ibid., vol. III, p. 156.
17 Ibid., p. 159.
18 The Philosophical Works of Descartes, translated by J. Cottingham, R. Stoothoff and D.
Murdoch, 2 vols. Cambridge, MA, 1985, vol. I, pp. 142-143.
19 Ibid., p. 142.
20 The significance of the evolution of technology and the mechanical philosophy is discussed
by J. A. Bennett, 'The Mechanic's Philosophy and the Mechanical Philosophy', History of Science
XXIV (1986), pp. 1-28. Other important studies include Andre Leroi-Gourhan, Evolution et
technique, Paris, 1971; Gilbert Simondon, Du mode d'existence des objets techniques, Paris,
1958; P. M. Schuhl, Machinisme et philosophie, Paris, 1947. An excellent survey is Marie
Boas' 'The Establishment of the Mechanical Philosophy', Osiris 10 (1952), pp. 412-541.
21 The Philosophical Works of Descartes, vol. I, pp. 100-101.
22 Letter to the Marquis of Newcastle, 23 November 1646, Oeuvres de Descartes, edited by
C. Adam and P. Tannery. Paris, 1897-1913, reprinted 1956-1973, vol. IV, p. 575.
23 Robert Boyle, The Origin of Forms and Qualities According to the Corpuscular Philosophy
in Selected Philosophical Papers of Robert Boyle, edited by M. A. Stewart, Indianapolis, 1991,
p. 71; A Free Enquiry into the Vulgarly Received Notion of Nature, Section II, ibid., pp. 190-191.
24 See, for instance, An Essay Containing a Requisite Digression, Concerning Those That Would
Exclude the Deity from Intermeddling with Matter, ibid., pp. 160, 170, 174.
25 The Narratio Prima was reprinted by Kepler as an appendix to his Mysterium Cosmographicum in 1596. I quote from Edward Rosen's translation in Three Copernican Treatises, 3rd
edition. New York, 1971, p. 137.

NOTES ON THE AUTHORS

MICHEL BLAIS, born in New Hampshire, holds Bachelor's (1973) and
Master's (1975) degrees in Philosophy from the Universite de Sherbrooke
and a Ph.D. in Philosophy (1983) from the Universite de Montreal. He is
Full Professor at the Universite de Sherbrooke in the Departement de Sciences
Humaines and is currently secretary and member of the Executive Committee
of the Faculte des Lettres et Sciences Humaines. His primary interests are in
Formal and Informal Logic, Philosophy of Mathematics and Epistemology.
Among his publications are La logique - une introduction and various articles,
most notably in The Journal of Philosophy and in Philosophia Mathematica.
MARIO BUNGE, born in Buenos Aires (1919), holds a doctorate in Physico-Mathematical Sciences from the Universidad Nacional de La Plata. He is
currently the Frothingham Professor of Logic and Metaphysics at McGill
University, and has been a Professor of Theoretical Physics at the Universities
of Buenos Aires, La Plata, Temple and Delaware, as well as a Visiting Professor
of Philosophy at the Universities of Pennsylvania, Texas, Freiburg, Aarhus,
Mexico, Geneva, Fribourg and Genova. He is the author of more than 80 books
and 400 papers on Physics, Sociology, and Philosophy, among them Causality,
Foundations of Physics, Scientific Research (in two volumes), The Mind-Body Problem, Scientific Materialism, and Treatise on Basic Philosophy (in
8 volumes). His main current fields of interest are the Philosophies of Biology,
Social Sciences, and Technology. His work has been analyzed in numerous
articles, as well as in the volumes Studies on Mario Bunge's Treatise, edited
by P. Weingartner and G. Dorn, and Entretiens avec Mario Bunge, by L.-M.
Vacher.
DAVID DAVIES, born in London, England (1949), studied Politics, Philosophy
and Economics at Oxford University. He holds an M.A. in Philosophy from
the University of Manitoba and a Ph.D. in Philosophy, specializing in
Philosophy of Science, from the University of Western Ontario. He has worked
as a Canada Research Fellow in the Department of Philosophy at McGill
University, where he is now an Assistant Professor. His fields of interest include
Philosophy of Social Science, Philosophy of Mind, Philosophy of Language
and Realism, and he has published a number of articles in these fields.
YVON GAUTHIER, born in Drummondville, Quebec (1941) has studied in
Heidelberg (Dr. Phil., 1966) and has been a Research Fellow in the Departments
of Mathematics at Berkeley (1972) and Leningrad (1986). He has taught at
Laurentian University of Sudbury and at the University of Toronto. He has been
Professor of Formal Logic and Philosophy of Science at the Universite de
Montreal since 1973. His main fields of interest are Foundations of Mathematics and Foundations of Physics. He has published six books, including
Fondements de mathematiques (1976), Logique interne (1991) and La logique
interne des theories physiques (1992), and he is the author of numerous articles
in specialized journals.

M. Marion and R. S. Cohen (eds.), Quebec Studies in the Philosophy of Science I, 309-314.
1995 Kluwer Academic Publishers.

MICHAEL HALLETT was born in Bristol, England (1950) and received
both his B.Sc. (1972) and Ph.D. (1979) from the University of London. From
1980 to 1981, he was an Alexander von Humboldt Research Fellow at the
Georg-August Universität, Göttingen, and from 1981 to 1984 he was a Junior
Research Fellow at Wolfson College, Oxford. He has been Associate Professor of
Philosophy at McGill University since 1985. He is the author of Cantorian
Set Theory and the Limitation of Size (1986) and numerous papers in
philosophy of mathematics, including 'Putnam and the Skolem Paradox',
'Physicalism, Reductionism and Hilbert' and 'Hilbert's Axiomatic Method and
the Laws of Thought'. He is currently writing a book on Hilbert's foundational
work, and he is one of four editors engaged in a long term editorial project
to publish many of Hilbert's unpublished writings on the foundations of
mathematics and physics.
JOACHIM (JIM) LAMBEK was born in Leipzig (1922) and obtained all his
degrees in Mathematics from McGill University, where he has spent the last
50 years, except for sabbaticals in Princeton, Zurich, Paris and Oxford. He
has written Lectures on Rings and Modules and co-authored Introduction to
Higher-Order Categorical Logic with Phil Scott. His present interests range
from Categorical Logic to Mathematical Linguistics.
MARIE LA PALME REYES, born in Montreal (1942), studied Mathematics
at the Universite de Montreal and Philosophy of Language at Concordia
University. She holds the degree of CAPES et L. Sc. in Mathematics
(Universite de Montreal, 1966), M.Sc. in Mathematics (Universite de Montreal,
1970) and Ph.D. in Humanities (Concordia University, 1989). She is a Post-Doctoral Fellow at McGill University. Her main interests include applications
of the (mathematical) theory of categories to semantics of natural languages
and fictional texts. She has published a book of poems Poemes I et II, literary
criticisms and articles on semantics of natural languages.
SERGE LAPIERRE, born in Montreal (1958), holds a Ph.D. in Philosophy
from the Universite du Quebec a Trois-Rivieres (1990) and has worked as a Post-Doctoral Fellow at the Institute for Language, Logic and Information, based
at the University of Amsterdam. His primary interests are in Formal Logic,
Philosophical Logic and Philosophy of Language. Among his publications
are the papers 'A Functional Partial Semantics for Intensional Logic' and
'Structured Meanings and Reflexive Domains'.
HUGUES LEBLANC, born in Canada (1924), holds an M.A. in Philosophy
from the Universite de Montreal (1946) and a Ph.D. in Philosophy from
Harvard University (1948). He has taught at Bryn Mawr College (1948-67)
and Temple University (1967-92), and served as Chair of the Department of
Philosophy at the latter (1973-79). He is currently a Research Professor at
the Universite du Quebec a Montreal. He held a Fulbright Research Scholarship
in Belgium (1953-54), a Eugenia Chase Research Fellowship from Bryn Mawr
College (1958-59), a Guggenheim Fellowship (1965-66), and a Paul W.
Eberman Research Award from Temple University (1982). He was awarded an
Honorary Doctorate in Philosophy by the Universite de Montreal (1980) and
Dalhousie University (1982), and a Reconnaissance du Merite Scientifique
by the Universite du Quebec a Montreal (1985). Among his main interests
are Free Logic, Truth-Value and Probability Semantics, and Probability Theory.
He has authored four books (among them Statistical and Inductive Probabilities
and Truth-Value Semantics), co-authored two (among them Deductive Logic),
and edited or co-edited four. He has also authored or co-authored more than
one hundred papers, thirty-five of which are collected in Existence, Truth,
and Provability.
FRANÇOIS LEPAGE, born in Saint-Jerome, Quebec (1950), holds a B.Sc.
in Physics from the Universite de Montreal and a Doctorat de troisieme cycle
in Logic from the Universite de Paris. He is Professor of Philosophy at the
Universite de Montreal (Logic, Philosophical Logic, Decision Theory). He has
published an introductory book in logic, Elements de logique contemporaine
and many articles, among others, in Logique et Analyse, Dialectica and Notre Dame Journal of Formal Logic.
JEAN LEROUX, born in Montreal (1948), studied philosophy and linguistics at the Georg-August Universität, Göttingen (M.A. 1976) and earned his
Ph.D. in philosophy at the Universite de Montreal (1984). He is Associate
Professor of Philosophy at the University of Ottawa. He has worked mainly
on the logical empiricist and structuralist approaches to scientific theories
and has authored La semantique des theories physiques. His main fields of
interest are General Methodology of Science, Philosophy of Language, and,
recently, 19th and early 20th Century epistemological conceptions of the
semantics of physical theories in the German tradition.
JOHN MACNAMARA was born in Ireland and holds a B.A. (Latin and Irish)
from University College Dublin, an M.Ed. and a Ph.D. (psychology) from
Edinburgh University. He has taught at St. Patrick's College, Dublin and
McGill University where he is Professor in the Department of Psychology.
He has been Visiting Professor at Harvard University and Carleton College,
Minnesota. His special interest is the application of the mathematical theory
of categories to problems in the semantics of natural languages and in the
learning of such languages by children. The author of numerous papers, his
books include Names for Things (1982) and A Border Dispute (1986). With
G. E. Reyes he has edited The Logical Foundations of Cognition (1994).
MATHIEU MARION, born in Montreal (1962), studied philosophy at the
Universite de Montreal (B.A., 1984 and M.A. 1986) and at New College,
Oxford (D. Phil., 1992). He was a Visiting Scholar at the Department of
Logic & Metaphysics of the University of St. Andrews in 1991-92, a Research
Associate at the Center for the Philosophy and History of Science at Boston
University in 1992-93 and a Post-Doctoral Fellow at the Universite de
Montreal in 1993-94. He is currently Assistant Professor at the University
of Ottawa. His fields of interest are Wittgenstein, Philosophy of Mathematics
and Philosophical Logic. He has published a number of articles and he is
the author of Wittgenstein, Finitism, and the Foundations of Mathematics
(forthcoming).
JEAN-PIERRE MARQUIS, born in Quebec City (1961), studied philosophy
at McGill (Ph.D. 1988) and did post-doctoral research in Paris with the
REHSEIS team and then spent a year at the Center for Philosophy of Science
in Pittsburgh as a Research Fellow. His interests are Philosophy of Logic,
Philosophy of Mathematics, Philosophy of Science and General Epistemology.
He has published articles on topics in Philosophy of Science and Philosophy
of Logic and Mathematics.
STORRS McCALL, born in Montreal in 1930, studied philosophy at McGill
and at Oxford, where he obtained a B. Phil. and a D. Phil. He taught at
McGill from 1955 to 1963, then over the next 11 years spent 6 years at the
University of Pittsburgh and 5 years at Makerere University in Uganda, where
he started courses of formal instruction in Philosophy in 1965, and where he
was later joined by Andre Gombay and Ian Hacking. In 1974-75 he worked
on defining the concept of quality of life (QOL) in the Department of
Environment in Ottawa, following which he returned to McGill. He is the editor
of Polish Logic 1920-1939, and is the author of Aristotle's Modal Syllogisms
(1963) and A Model of the Universe (1994).
JUDITH PELHAM was born in Montreal (1961), although her family's roots
are dug into Nova Scotia soil. She completed her B.A. at Dalhousie University,
her M.A. at Simon Fraser University and her Ph.D. at the University of Toronto.
She is interested in Russell's early work in Philosophy and Logic, and in
resolutions to the paradox, as well as Philosophical Logic generally. She is
at present Assistant Professor of Philosophy at Concordia University.
PAUL M. PIETROSKI, born in the U.S. (1964), studied History at Rutgers
University (1986), and received his Ph.D. in Philosophy from the Massachusetts Institute of Technology (1990). He is an Assistant Professor of
Philosophy at McGill University. His primary interests are in the Philosophy
of Mind and Language, and in particular, the structure of intentional explanation. He is currently working on a manuscript, Idealization and Intentional
Explanation, which draws together a number of articles dealing with mental
content, mental causation, the semantics of opaque constructions, and the
role of ceteris paribus clauses in various explanatory enterprises.
GONZALO E. REYES, born in Santiago, Chile (1937), studied Electrical
Engineering and Mathematics at the University of Chile (Santiago), and Logic
and Methodology of Sciences at the University of California (Berkeley). He
holds the degrees of Licenciado en Matematicas (University of Chile, Santiago,
1961) and Ph.D. in Logic and Methodology of Sciences (University of
California, Berkeley, 1967). He is Professor of Mathematics at the Universite
de Montreal. He has been Visiting Professor in several Universities including
Louvain-la-Neuve (Belgium), Lille (France), Sydney (Australia) and the
Catholic University of Chile (Santiago). His current interests include applications of Category Theory to Modal Logic and to Semantics of Natural
Languages. He has co-authored with M. Makkai First-Order Categorical Logic
and with I. Moerdijk Models for Smooth Infinitesimal Analysis. He has co-edited
with J. Macnamara The Logical Foundations of Cognition and has published
many articles.
WILLIAM R. SHEA, born in the province of Quebec (1937), holds a Ph.D.
in Philosophy from the University of Cambridge. He is Professor of History
and Philosophy of Science at McGill University. He is a former Fellow of
Harvard University and the Institute for Advanced Studies in Berlin. He was
Directeur d'études at the École des Hautes Études en Sciences Sociales in Paris
and a Visiting Professor at the University of Rome. He is a Fellow of the Royal
Society of Canada, and in 1993 he was awarded the Koyré Medal, the most
prestigious award of the International Academy of the History of Science.
His main interest is the History and Philosophy of the Scientific Revolution.
He is the author of Galileo's Intellectual Revolution and The Magic of Numbers
and Motion: The Scientific Career of René Descartes. He has edited several
books, including Reason, Experiment and Mysticism in the Scientific Revolution;
Nature, Experiment and the Sciences; Creativity in the Arts and Sciences;
and Nature Mathematized.
DANIEL VANDERVEKEN, born in Belgium (1949), received a Ph.D. in
Philosophy from the Université Catholique de Louvain in 1974. He was First
Researcher at that University (1970-71) and at the Fonds National Belge de
la Recherche Scientifique (1971-75). He was later a Post-Doctoral
Research Fellow at the University of California at Berkeley, supported by the
NATO Scientific Committee (1975-76) and by the Institute of Humanities of
Berkeley (1976-77). He is Professor and Director of the Research Group in
Analytic Philosophy at the Université du Québec à Trois-Rivières. His primary
research interests are in Logic and Philosophy of Language, Speech Act theory
and Universal Grammar. He is the co-author, with John Searle, of Foundations
of Illocutionary Logic (1985). He is also the author of Meaning and Speech
Acts, Volume 1: Principles of Language Use and Volume 2: Formal Semantics
of Success and Satisfaction (1990-91).

NAME INDEX

Abel, N.H. 202


Ackermann, W. 135, 154, 168, 185
Agricola (G. Bauer) 300, 301
Agrippa, H.C. 301
Albert, D. 231, 232, 239, 240
Allison, H. 281, 282, 284-286
Andrews, P. 27, 38n, 38
Arakelov, S. I. 206, 212
Archimedes 299
Archytas (of Tarentum) 299
Aristotle 57-59, 62, 66, 70, 73, 79, 119, 299,
312
Arthur, J. 211n, 212
Artin, E. 202
Aspect, A. 217, 220-222, 226n

Bach, E. 67
Bacon, F. 303, 304, 306n
Baer, R. 211n
Bai-Lin, H. 258n
Barnes, J. 67
Barr, M. 87, 93n
Barwise, J. 41, 55, 62, 67
Basalla, G. 306n
Bedini, S.A. 306n
Bell, E.T. 201, 202, 208n, 210n, 212
Bell, J.L. 93n, 94n, 217, 219-224, 226n, 235,
237
Bellman, R.E. 81, 93n
Beltrami, E. 295n
Benacerraf, P. 181, 182
Bencivenga, E. 1, 8, 19n-21n, 21
Bennet, J.A. 307n
Berkeley, G. 76
Bernays, P. 135, 136-138, 144, 145, 147,
149, 150, 156, 157, 160, 175n, 177n-179n,
181n, 182, 185
Besson, J. 301
Bhargava, M. 71, 78
Biringuccio, V. 300, 301, 303
Birkhoff, G. 211n, 212
Bismut, J.-M. 207
Black, M. 146, 147, 177n, 183
Blackburn, S. 277, 286
Blais, M. xi, 309
Blamey, S. 28, 38n, 38
Blumenthal, O. 176n, 181
Boas, M. 306n, 307n
Bochvar, D.A. 80, 94n
Bohm, D. 218, 222, 226n
Bohr, N. 225, 226, 227n
Boltzmann, L. 243
Bolyai, J. 201
Bolzano, B. 138, 200
Boolos, G. 174n
Borceux, F. 93n
Born, M. 178n, 180n
Bourbaki, N. 189
Boyle, R. 305, 307n
Brouwer, L.E.J. 70, 107, 108, 118, 119
Browder, F. 182
Brueckner, A. 278, 286
Brunelleschi, F. 303
Bunge, M. ix, 90, 219, 222, 224, 225, 226n,
227n, 309
Burgess, J.P. 52, 55
Burgos, M.E. 225
Buss, S.R. 117, 122
Butterfield, J.N. 240n, 240

Campbell, K. 231
Cantor, G. 114, 116, 117, 138, 148, 149, 152,
153, 157-159, 161, 168, 177n-179n,
181n, 182, 190, 191, 205, 207n-209n,
212, 232
Carnap, R. 95, 98, 100, 105, 175n, 292, 296n
Carson, E. 174n
Cartwright, N. 260, 262, 269, 270, 271n,
272
Cassels, J. W. S. 212
Cassirer, E. 294n
Cauchy, A. 114, 143, 177n
Caus, S. de 302
Chevalley, C. 202, 211n, 212
Church, A. ix, 24, 38, 97, 105, 117, 135,
182
Chwistek, L. 24
Cini, M. 222, 226n
Clauser, J.F. 220, 221, 226n, 237, 241
Claverie, P. 218, 226n

Clifton, P.K. 240n, 240
Clough, A.H. 306n
Cocchiarella, N. 19n, 21
Cohen, R.S. 174n, 182
Commandino, F. 302
Cooper, R. 41, 55
Corcoran, J. 65, 67
Coulomb, C.A. de 260-262, 266, 269
Couturat, L. 207n, 212
Couture, J. x, 69, 78
Craig, W. 292
Creary, L. 260, 272
Cresswell, M. 105
Crutchfield, J.P. 258n
Currie, G. 286n, 286
Cushing, J.T. 241
Dalibard, J. 226n
Dalmedico, D. 258n
Darwin, C. 265
Dauben, J. 207n, 208n, 212
Davenport, H. 119, 122
Davidson, D. 261, 272
da Vinci, L. 302
Davis, D. xi, 309
de Broglie, L. 218
de la Peña-Auerbach, L. 218, 226n
De Solla Price, D.J. 306n
Dedekind, R. 74, 138, 140, 141-150, 152,
153, 168-170, 173, 175n-177n, 179n,
181n, 182, 183, 189-191, 193-203, 207n,
209n, 210n, 212
Democritus 73
Demopoulos, W. 174n, 277, 286
Derry, T.K. 306n
Descartes, R. 297, 302, 304, 305, 313
d'Espagnat, B. 221, 226n
Deuring, M. 203
Devitt, M. 278, 280, 286
DeWitt, B. S. 229, 241
Dicke, R. 262, 272
Dieudonné, J. 204, 211n, 212
di Giorgio Martini, F. 302
Diner, S. 218, 226n
Dirac, P.A.M. 225, 226, 227n
Dirichlet, P.-G. Lejeune- 189, 195, 196,
209n, 210n, 211n, 214
Dorn, G. 309
Dryden, J. 306n
Dugac, P. 191, 208n-210n, 212
Dummett, M.A.E. 176n, 183
Dupré, J. 268-271, 272n, 272
Dwyer, S. 272n

Eberhard, P.H. 224, 226n


Edwards, H.M. 189, 199, 201, 208n-210n,
212n, 212
Eells, E. 268, 269, 272
Eichler, M. 203
Einstein, A. 217-219, 223, 225, 226n, 230,
241
Eisenstein, F.M.1. 202
Elkana, Y. 182
Ellison, W.J. & F. 211n
Empedocles 76
Epstein, G. 93n
Erdos, P. 119
Euclid 108, 115-121
Eudoxus 74
Euler, L. 60, 180n, 194
Everett, H. 229-231, 241
Ewald, W. 174n, 183
Faltings, G. 206, 207, 212
Faltz, L.M. 68
Faraday, M. 291
Farmer, W.M. 27, 38
Fermat, P. de xi, 115-118, 120-122, 197, 206
Feynman, R.P. 258n
Fine, A. 16, 18n, 21, 224, 226n, 230, 234,
235, 241
Fitch, F.B. 4, 5, 15, 17, 21
Flew, A. 257n
Fourier, J. 195, 209n
Fraenkel, A. 181n
Francini, T. 302
Frege, G. 95-97, 105, 135, 137, 138, 140,
141, 143, 144, 146, 147, 153-155, 160,
164, 170, 173, 175n-178n, 180n, 183, 194
Frey, G. 211n
Friedman, H. 204, 205, 213
Frohlich, J. 212
Furtwangler, P. 202, 203
Galilei, G. xi, 297, 313
Gallin, D. 25, 27, 39
Galois, E. 193, 203, 211n
Gauss, C.F. 153, 201, 202, 207n
Gauthier, Y. x, xi, 107, 121, 122, 210n, 212n,
213, 309
Geach, P.T. 58, 60, 68, 146, 147, 177n, 183
Gel'fond, A.O. 180n
Gelbart, S. 211n, 213
Gentzen, G. 108, 115, 116, 122, 178n
George, A. 183
Ghirardi, G.C. 239, 241
Gillet, H. 207, 213
Gillon, B.S. 73, 77, 78

Girard, J.-Y. 109, 122
Gleick, J. 258n
Glymour, C. 278, 280, 286
Gödel, K. 43, 70, 114, 116, 171, 177n, 178n,
180n, 183
Goldblatt, R.I. 93n
Goldman, A. 277, 286
Gombay, A. 312
Goodman, N. 259, 272
Gordan, P. 200, 201, 210n, 213
Grangier, P. 226n
Greenberger, D.M. 240n, 241
Grothendieck, A. 121, 207
Grzegorczyk, A. 28, 39
Gupta, A.K. 60, 68
Hacking, I. 277, 279, 284, 286, 312
Hailperin, T. ix, 18n
Haken, H. 258n
Hall, R.A. 297, 306n
Hallett, M. x, xi, 172, 174n, 175n, 177n,
179n, 181n, 183, 310
Hansen, C. 278, 285, 286n, 286
Hanson, N.R. 263,272
Hart, H. 263, 272
Hasse, H. 189, 202-204, 208n, 211n, 212
Hatfield, G. 294n-296n
Hausdorff, F. 82
Hawtrey, R. 131
Healey, R. 229, 230, 241
Heck, R. 174
Hecke, E. 189, 203, 211n
Hegel, G.W.F. 75, 287
Heidegger, M. 76
Heine, H. 192
Heisenberg, W. 226, 240, 241, 246
Hellman, G. 239, 241
Helmholtz, H.L.F. von 287-293, 294n, 295n
Hempel, C. 261, 272, 292
Henkin, L. 25, 27, 28, 37, 38n, 39
Hensel, K. 189, 191, 193, 195, 203, 209n,
211n
Heraclitus 75, 189
Herbrand, J. 108, 109, 116, 122, 210n, 211n,
213
Hermite, C. 208n
Hero (of Alexandria) 299, 302, 306n
Hertz, H. 289
Heyting, A. 70, 116
Hilbert, D. xi, 107, 111, 115, 135-138, 140,
141, 143-174, 175n-181n, 184, 189-191,
194, 198, 200, 201-203, 205, 208n, 210n,
213, 214, 309
Hiley, B. 222, 226n

Hintikka, J. 18n, 21, 98, 105, 174n


Hobbs, J. 258n
Hohle, U. 93n
Holmyard, E.J. 306n
Honore, A. 263, 272
Horn, L.R. 93n
Horne, M.A. 240n
Hughes, R.I.G. 230, 241
Hume, D. 76, 244, 257n
Hurwitz, A. 189, 201, 207n, 210n, 214
Husserl, E. 185
Ilgauds, H.J. 179n, 186
Ireland, K. 121, 122
Irvine, A. 185
Isaacson, D. 212n
Jacobi, C.G. 194, 202
Jahn, R. G. 227n
Jespersen, O. 73
Johnstone, P.T. 93n
Joseph, G. 260, 261, 262, 272
Jourdain, P.E.B. 196,205,214
Jungnickel, C. 294n
Kahl, R. 294n, 295n
Kálnay, A. 222, 225, 226n
Kant, I. 57, 138, 170, 181n, 275, 281, 282,
284, 286, 287, 290, 294n
Kaplan, D. 68
Keenan, E.L. 41, 55, 68, 77
Keferstein, H. 146, 177n, 200
Keller, A. 306n
Kelvin, Lord (W. Thomson) 291
Kepler, J. 307n
Kirchhoff, G.R. 291, 294, 296n
Kirk, G.S. 207n, 214
Kleene, S.C. 29, 33, 38n, 117, 135, 185
Klein, F. 185
Kluge, E.-H. W. 185
Kneebone, G.T. 209n, 214
König, J. 189, 205, 207n, 214
Kratzer, A. 185
Kripke, S. 62, 66
Kronecker, L. xi, 107, 121, 152, 168, 189-196,
198-207, 208n-211n, 214
Kuipers, A. xi
Kummer, E.E. 142, 194, 197, 198, 202, 205,
209n, 210n, 214
La Palme Reyes, M. x, 68, 310
Lagrange, J.-L. de 193, 194, 205
Lambek, B. 77
Lambek, J. x, 70, 71, 74, 77, 78, 93, 310

Lambert, K. 1, 2, 8, 18n, 21
Lang, S. 207, 211n, 214
Langlands, R.P. 189, 203, 206, 214
Lapierre, S. x, 39, 46, 55, 310
Laplace, P.S. Marquis de 243, 248, 252,
257n
Lasker, E. 207n
Lawvere, F.W. 57, 71, 78, 88, 94n
Laymon, R. 262, 273
Lebesgue, H. 177n
Leblanc, H. ix, x, 18n, 21, 22, 311
Legendre, A.M. 209n
Leibniz, G.W. 139, 194
Lejeune-Dirichlet, P.-G. see Dirichlet
Lenin (V.I. Oulianov) 294n
Leonard, H.S. 1, 7, 18n, 19n, 22
Lepage, F. x, 39, 311
Leroi-Gourhan, A. 307n
Leroux, J. xi, 311
Lesniewski, S. 28
Lewis, D.K. 52, 55, 98, 101, 277, 278, 282,
283, 286n, 286
Lilley, S. 306n
Lindemann, C.L.F. 208n
Lindenbaum, A. 71, 82
Lindstrom, P. 41, 55
Lipschitz, R. 210n
Loewer, B. 231, 232, 239, 240
Lorentz, H.A. 223, 231
Lorenz, E.N. 249, 250, 258n
Lowenheim, L. 178n, 277
Lukasiewicz, J. 59, 65, 67, 68, 94n
Macaulay, F.S. 207n, 214
Machover, M. 174n
Macintyre, A. 212n
MacLane, S. 68, 93n, 94n, 211n, 212
Macnamara, J. x, 68, 77, 78, 311, 313
Majer, U. 174n
Makkai, M. 68, 93n, 174n, 313
Mandelbrot, B. 258n
Marion, M. xi, 174n, 209n, 214, 215, 312
Marquis, J.-P. x, 312
Mascheroni, L. 161, 179n, 180n
Mawanda, M.M. 93n
Maxwell, J.C. 291
May, R.M. 258n
Mayberry, J. 174n
Mazur, B. 209n, 211n, 215
McCall, S. xi, 76, 78, 229, 241, 312
McCormmach, R. 294n
McLarty, C. 93n
McMullin, E. 241
Mendelson, E. 135, 186

Menn, S. 174n
Mermin, D. 235, 240n, 241
Merrill, G.H. 278, 286
Meschowski, H. 204, 205, 209n, 215
Milton, J. 301
Minkowski, H. 210n
Mittag-Leffler, G. 208n
Moerdijk, I. 68, 93n, 94n, 313
Molk, J. 193, 195, 196, 207n, 209n, 215
Montague, R. 28, 38n, 39, 41, 55, 100, 105
Montaigne, M.E. de 302
Mordell, L.J. 206
Moschovakis, Y.N. 131, 132
Mostowski, A. 41, 55
Mulhern, M. 58, 59, 68
Müller, J. 288, 295n
Muskens, R. 38n, 39
Nagel, E. 293, 296n
Nelson, E. 117, 118, 122
Neumann, O. 213
Newton, I. xi, 262, 272n, 297
Noether, E. 189, 194
Nordin, I. 224, 226n
Ockham, W. of 57, 68, 77
Olschki, L. 306n
Ovid 300
Pappus 300
Parmenides 72, 75
Parry, W.T. 100, 101, 105
Pasch, M. 138
Pastino, L. 21n
Paty, M. 223, 226n
Patzig, G. 186
Pauli, W. 221
Peano, G. 115, 116
Pearle, P. 239, 241
Peebles, P. 272
Pelham, J. x, xi, 123, 131, 132n, 132, 312
Penzias, A. 262, 273
Perry, J. 62, 67
Philo (of Alexandria) 69
Philo (of Byzantium) 299, 302
Pietroski, P.M. xi, 263, 272n, 273, 312
Pitts, A. 93n
Plato 74, 109, 299
Pliny 300
Plutarch 299, 306n
Podolsky, B. 217, 226n
Poincaré, H. 116, 122, 138, 171, 175n, 180n,
186, 201, 206, 210n, 211n, 215, 249,
257n

Popper, K.R. ix, 14, 224
Post, E. 146, 151, 177n, 186
Pringsheim, A. 195, 209n, 215
Purkert, W. 179n, 186, 210n, 213
Putnam, H. xi, 181, 182, 275-285, 286n, 286,
310
Pythagoras 73, 74
Quine, W.V. ix, 3, 15, 21n, 22, 27, 38n, 39,
77, 175n, 285, 286
Ramelli, A. 301
Ramsey, F.P. 24, 209n, 259, 260, 263,
265-267, 271, 273, 292
Rang, B. 186
Rasiowa, H. 94n
Raven, J.E. 207n, 214
Redhead, M.L.G. 240n, 240
Reed, D. 211n, 212n
Reid, C. 180n, 186
Rescher, N. 89, 94n
Rey, G. 263, 272n
Reyes, G.E. x, 68, 93, 313
Rheticus, J. 305
Ribet, K. 211n
Richard, J. 160-162, 165-167, 180n, 186
Riemann, B. 154, 190, 195, 197, 198, 203,
207, 209n, 295n
Rimini, A. 239, 241
Roch, G. 203, 207
Rodabaugh, S.E. 93n
Roeper, P. ix, 17, 21n
Roger, G. 226n
Rohrlich, F. 224, 226n
Rosen, M. 121, 122
Rosen, N. 217, 226n
Rossi, P. 306n
Russell, B. xi, 18n, 22, 24, 96, 123-131,
132n, 132, 135, 138-143, 154, 155, 168,
175n, 176n, 181n, 186, 194, 211n, 215,
312
Sapir, E. 77
Scharlau, W. 177n, 186
Schelling, F.W.J. von 287
Schiffer, S. 261, 273
Schlick, M. 178n, 186
Schneider, T. 180n
Schoenflies, A. 208n, 215
Schofield, M. 207n, 214
Schönberg, M. xi, 217n
Schröder, E. 155, 186
Schrödinger, E. 222, 223, 226, 227n, 229,
238, 239

Schuhl, P.-M. 307n


Scott, D. 29, 39
Scott, P. J. x, 69-71, 78, 310
Searle, J.R. 105, 314
Selberg, A. 119
Seneca 300, 306n
Shaw, R. 258n
Shea, W. xi, 313
Shimony, A. 220, 221, 226n, 237, 239, 240n,
241
Shimura, G. 203, 211n, 215
Sieg, W. 174n, 210n, 214
Siegel, K.L. 180n, 189
Simondon, G. 307n
Simpson, S.G. 204, 205, 211n, 213, 215, 267
Sinaceur, M.-A. 187
Singer, C. 306n
Skolem, T. 116, 178n, 180n, 181n, 187, 277,
310
Skyrms, B. 272n, 273
Slupecki, J. 28, 39
Smith, R.L. 205, 212
Sober, E. 260, 268, 269, 272, 273
Soule, C. 207, 213
Spencer, E. 301, 306n
St-Louis, G. 21n
Stapp, H. 240, 241
Stavi, J. 41, 55
Stein, H. 233, 240, 241
Steinitz, E. 189, 205, 208n, 211n, 215
Stem, J. 93n
Stewart, H.B. 258n
Stone, A.H. 82
Stout, L.N. 93n
Strawson, P.F. 176n
Sturm, C.F. 193
Suppe, F. 294n
Takagi, T. 202, 203, 211n
Takeuti, G. 117, 122
Taniyama, Y. 203, 206, 211n, 215
Tarski, A. 2, 18n, 22, 27, 28, 38n, 39, 71,
82, 178n, 187
Tate, J.T. 211n, 215
Tauber, F. 174n
Thales 73
Thijsse, E. 38n, 39
Thomason, R. ix
Thompson, J.M.T. 258n
Tichy, P. 24, 28, 38n, 39
Tierney, M. 88, 94n
Tijdeman, R. 180n, 187
Toepell, M.-M. 175n, 187
Turing, A.M. 171

Urquhart, A. xi, 39, 80, 123, 132


Usher, A.P. 306n
Vacher, L.-M. 309
van Benthem, J. 38, 43, 45, 49, 52, 55
van der Waerden, B. 211n, 215
van der Wall, J. 272n
van Fraassen, B.C. ix, 20n, 22, 224, 226n,
232, 272n, 273
van Heijenoort, J. 174n, 179n, 184
van Inwagen, P. 244, 245, 252, 254, 257n
Vanderveken, D. x, xi, 105, 313
Vendler, Z. 78
Venn, J. 60
Vesey, G. 257n
Vitruvius 298, 300
Vladut, S. 202, 203, 208n, 209n, 215
Voizard, A. xi, 21n
von Helmholtz, H. xi, 174n, 184
von Neumann, J. 179n, 181n, 210n, 229,
230
von Staudt, F. 153
Weber, H. 142, 176n, 181n, 189-191, 195,
198-203, 209n, 212, 215
Weber, T. 239, 241
Weierstrass, K. 190, 200, 209n, 210n
Weil, A. 121, 189, 190, 192, 202, 203, 206,
208n, 211n, 215
Weingartner, P. 309
Westerståhl, D. 55
Westfall, R.S. 306n
Weyl, H. 107, 122, 156, 161-165, 171,
176n-178n, 180n, 181n, 187, 189, 190,
193, 208n, 209n, 211n, 215
Wheeler, J.A. 224, 227n
Whitehead, A.N. 135, 187
Whorf, B. 77
Wiles, A. 211n
Willard, D. 187
Williams, T.I. 306n
Wilson, R. 262, 273
Wisdom, W.A. ix, 21n
Wittgenstein, L. 99, 105, 191, 194, 196, 205,
206, 209n, 215, 290, 312
Wunderlin, A. 258n
Zadeh, L. 81, 93n
Zeilinger, A. 240n
Zermelo, E. 159-163, 165, 169, 179n-181n,
187, 208n
Zilsel, E. 306n
Zolfaghari, H. 93
Zonca, V. 301

Boston Studies in the Philosophy of Science


Editor: Robert S. Cohen, Boston University
1. M.W. Wartofsky (ed.): Proceedings of the Boston Colloquium for the
Philosophy of Science, 1961/1962. [Synthese Library 6] 1963
ISBN 90-277-0021-4
2. R.S. Cohen and M.W. Wartofsky (eds.): Proceedings of the Boston Colloquium
for the Philosophy of Science, 1962/1964. In Honor of P. Frank. [Synthese
Library 10] 1965
ISBN 90-277-9004-0
3. R.S. Cohen and M.W. Wartofsky (eds.): Proceedings of the Boston Colloquium
for the Philosophy of Science, 1964/1966. In Memory of Norwood Russell
Hanson. [Synthese Library 14] 1967
ISBN 90-277-0013-3
4. R.S. Cohen and M.W. Wartofsky (eds.): Proceedings of the Boston Colloquium
for the Philosophy of Science, 1966/1968. [Synthese Library 18] 1969
ISBN 90-277-0014-1
5. R.S. Cohen and M.W. Wartofsky (eds.): Proceedings of the Boston Colloquium
for the Philosophy of Science, 1966/1968. [Synthese Library 19] 1969
ISBN 90-277-0015-X
6. R.S. Cohen and R.J. Seeger (eds.): Ernst Mach, Physicist and Philosopher.
[Synthese Library 27] 1970
ISBN 90-277-0016-8
7. M. Capek: Bergson and Modern Physics. A Reinterpretation and Re-evaluation.
[Synthese Library 37] 1971
ISBN 90-277-0186-5
8. R.C. Buck and R.S. Cohen (eds.): PSA 1970. Proceedings of the 2nd Biennial
Meeting of the Philosophy of Science Association (Boston, Fall 1970). In
Memory of Rudolf Carnap. [Synthese Library 39] 1971
ISBN 90-277-0187-3; Pb 90-277-0309-4
9. A.A. Zinov'ev: Foundations of the Logical Theory of Scientific Knowledge
(Complex Logic). Translated from Russian. Revised and enlarged English
Edition, with an Appendix by G.A. Smirnov, E.A. Sidorenko, A.M. Fedina and
L.A. Bobrova. [Synthese Library 46] 1973
ISBN 90-277-0193-8; Pb 90-277-0324-8
10. L. Tondl: Scientific Procedures. A Contribution Concerning the
Methodological Problems of Scientific Concepts and Scientific Explanation.
Translated from Czech. [Synthese Library 47] 1973
ISBN 90-277-0147-4; Pb 90-277-0323-X
11. R.J. Seeger and R.S. Cohen (eds.): Philosophical Foundations of Science.
Proceedings of Section L, 1969, American Association for the Advancement of
Science. [Synthese Library 58] 1974
ISBN 90-277-0390-6; Pb 90-277-0376-0
12. A. Grünbaum: Philosophical Problems of Space and Time. 2nd enlarged ed.
[Synthese Library 55] 1973
ISBN 90-277-0357-4; Pb 90-277-0358-2
13. R.S. Cohen and M.W. Wartofsky (eds.): Logical and Epistemological Studies
in Contemporary Physics. Proceedings of the Boston Colloquium for the
Philosophy of Science, 1969/72, Part I. [Synthese Library 59] 1974
ISBN 90-277-0391-4; Pb 90-277-0377-9

14. R.S. Cohen and M.W. Wartofsky (eds.): Methodological and Historical Essays
in the Natural and Social Sciences. Proceedings of the Boston Colloquium for
the Philosophy of Science, 1969/72, Part II. [Synthese Library 60] 1974
ISBN 90-277-0392-2; Pb 90-277-0378-7
15. R.S. Cohen, J.J. Stachel and M.W. Wartofsky (eds.): For Dirk Struik.
Scientific, Historical and Political Essays in Honor of Dirk J. Struik. [Synthese
Library 61] 1974
ISBN 90-277-0393-0; Pb 90-277-0379-5
16. N. Geschwind: Selected Papers on Language and the Brain. [Synthese Library
68] 1974
ISBN 90-277-0262-4; Pb 90-277-0263-2
17. B.G. Kuznetsov: Reason and Being. Translated from Russian. Edited by C.R.
Fawcett and R.S. Cohen. 1987
ISBN 90-277-2181-5
18. P. Mittelstaedt: Philosophical Problems of Modern Physics. Translated from
the revised 4th German edition by W. Riemer and edited by R.S. Cohen.
[Synthese Library 95] 1976
ISBN 90-277-0285-3; Pb 90-277-0506-2
19. H. Mehlberg: Time, Causality, and the Quantum Theory. Studies in the
Philosophy of Science. Vol. I: Essay on the Causal Theory of Time. Vol. II:
Time in a Quantized Universe. Translated from French. Edited by R.S. Cohen.
1980
Vol. I: ISBN 90-277-0721-9; Pb 90-277-1074-0
Vol. II: ISBN 90-277-1075-9; Pb 90-277-1076-7
20. K.F. Schaffner and R.S. Cohen (eds.): PSA 1972. Proceedings of the 3rd
Biennial Meeting of the Philosophy of Science Association (Lansing,
Michigan, Fall 1972). [Synthese Library 64] 1974
ISBN 90-277-0408-2; Pb 90-277-0409-0
21. R.S. Cohen and J.J. Stachel (eds.): Selected Papers of Leon Rosenfeld.
[Synthese Library 100] 1979
ISBN 90-277-0651-4; Pb 90-277-0652-2
22. M. Capek (ed.): The Concepts of Space and Time. Their Structure and Their
Development. [Synthese Library 74] 1976
ISBN 90-277-0355-8; Pb 90-277-0375-2
23. M. Grene: The Understanding of Nature. Essays in the Philosophy of Biology.
[Synthese Library 66] 1974
ISBN 90-277-0462-7; Pb 90-277-0463-5
24. D. Ihde: Technics and Praxis. A Philosophy of Technology. [Synthese Library
130] 1979
ISBN 90-277-0953-X; Pb 90-277-0954-8
25. J. Hintikka and U. Remes: The Method of Analysis. Its Geometrical Origin and
Its General Significance. [Synthese Library 75] 1974
ISBN 90-277-0532-1; Pb 90-277-0543-7
26. J.E. Murdoch and E.D. Sylla (eds.): The Cultural Context of Medieval
Learning. Proceedings of the First International Colloquium on Philosophy,
Science, and Theology in the Middle Ages, 1973. [Synthese Library 76] 1975
ISBN 90-277-0560-7; Pb 90-277-0587-9
27. M. Grene and E. Mendelsohn (eds.): Topics in the Philosophy of Biology.
[Synthese Library 84] 1976
ISBN 90-277-0595-X; Pb 90-277-0596-8
28. J. Agassi: Science in Flux. [Synthese Library 80] 1975
ISBN 90-277-0584-4; Pb 90-277-0612-3

29. J.J. Wiatr (ed.): Polish Essays in the Methodology of the Social Sciences.
[Synthese Library 131] 1979
ISBN 90-277-0723-5; Pb 90-277-0956-4
30. P. Janich: Protophysics of Time. Constructive Foundation and History of Time
Measurement. Translated from German. 1985
ISBN 90-277-0724-3
31. R.S. Cohen and M.W. Wartofsky (eds.): Language, Logic, and Method. 1983
ISBN 90-277-0725-1
32. R.S. Cohen, C.A. Hooker, A.C. Michalos and J.W. van Evra (eds.): PSA 1974.
Proceedings of the 4th Biennial Meeting of the Philosophy of Science
Association. [Synthese Library 101] 1976
ISBN 90-277-0647-6; Pb 90-277-0648-4
33. G. Holton and W.A. Blanpied (eds.): Science and Its Public. The Changing
Relationship. [Synthese Library 96] 1976
ISBN 90-277-0657-3; Pb 90-277-0658-1
34. M.D. Grmek, R.S. Cohen and G. Cimino (eds.): On Scientific Discovery. The
1977 Erice Lectures. 1981
ISBN 90-277-1122-4; Pb 90-277-1123-2
35. S. Amsterdamski: Between Experience and Metaphysics. Philosophical
Problems of the Evolution of Science. Translated from Polish. [Synthese
Library 77] 1975
ISBN 90-277-0568-2; Pb 90-277-0580-1
36. M. Markovic and G. Petrovic (eds.): Praxis. Yugoslav Essays in the Philosophy
and Methodology of the Social Sciences. [Synthese Library 134] 1979
ISBN 90-277-0727-8; Pb 90-277-0968-8
37. H. von Helmholtz: Epistemological Writings. The Paul Hertz / Moritz Schlick
Centenary Edition of 1921. Translated from German by M.F. Lowe. Edited
with an Introduction and Bibliography by R.S. Cohen and Y. Elkana. [Synthese
Library 79] 1977
ISBN 90-277-0290-X; Pb 90-277-0582-8
38. R.M. Martin: Pragmatics, Truth and Language. 1979
ISBN 90-277-0992-0; Pb 90-277-0993-9
39. R.S. Cohen, P.K. Feyerabend and M.W. Wartofsky (eds.): Essays in Memory of
Imre Lakatos. [Synthese Library 99] 1976
ISBN 90-277-0654-9; Pb 90-277-0655-7
40. Not published.
41. Not published.
42. H.R. Maturana and F.J. Varela: Autopoiesis and Cognition. The Realization of
the Living. With a Preface to 'Autopoiesis' by S. Beer. 1980
ISBN 90-277-1015-5; Pb 90-277-1016-3
43. A. Kasher (ed.): Language in Focus: Foundations, Methods and Systems.
Essays in Memory of Yehoshua Bar-Hillel. [Synthese Library 89] 1976
ISBN 90-277-0644-1; Pb 90-277-0645-X
44. T.D. Thao: Investigations into the Origin of Language and Consciousness.
1984
ISBN 90-277-0827-4
45. Not published.
46. P.L. Kapitza: Experiment, Theory, Practice. Articles and Addresses. Edited by
R.S. Cohen. 1980
ISBN 90-277-1061-9; Pb 90-277-1062-7

47. M.L. Dalla Chiara (ed.): Italian Studies in the Philosophy of Science. 1981
ISBN 90-277-0735-9; Pb 90-277-1073-2
48. M.W. Wartofsky: Models. Representation and the Scientific Understanding.
[Synthese Library 129] 1979
ISBN 90-277-0736-7; Pb 90-277-0947-5
49. T.D. Thao: Phenomenology and Dialectical Materialism. Edited by R.S.
Cohen. 1986
ISBN 90-277-0737-5
50. Y. Fried and J. Agassi: Paranoia. A Study in Diagnosis. [Synthese Library
102] 1976
ISBN 90-277-0704-9; Pb 90-277-0705-7
51. K.H. Wolff: Surrender and Catch. Experience and Inquiry Today. [Synthese
Library 105] 1976
ISBN 90-277-0758-8; Pb 90-277-0765-0
52. K. Kosik: Dialectics of the Concrete. A Study on Problems of Man and World.
1976
ISBN 90-277-0761-8; Pb 90-277-0764-2
53. N. Goodman: The Structure of Appearance. [Synthese Library 107] 1977
ISBN 90-277-0773-1; Pb 90-277-0774-X
54. H.A. Simon: Models of Discovery and Other Topics in the Methods of Science.
[Synthese Library 114] 1977
ISBN 90-277-0812-6; Pb 90-277-0858-4
55. M. Lazerowitz: The Language of Philosophy. Freud and Wittgenstein.
[Synthese Library 117] 1977
ISBN 90-277-0826-6; Pb 90-277-0862-2
56. T. Nickles (ed.): Scientific Discovery, Logic, and Rationality. 1980
ISBN 90-277-1069-4; Pb 90-277-1070-8
57. J. Margolis: Persons and Minds. The Prospects of Nonreductive Materialism.
[Synthese Library 121] 1978
ISBN 90-277-0854-1; Pb 90-277-0863-0
58. G. Radnitzky and G. Andersson (eds.): Progress and Rationality in Science.
[Synthese Library 125] 1978
ISBN 90-277-0921-1; Pb 90-277-0922-X
59. G. Radnitzky and G. Andersson (eds.): The Structure and Development of
Science. [Synthese Library 136] 1979
ISBN 90-277-0994-7; Pb 90-277-0995-5
60. T. Nickles (ed.): Scientific Discovery. Case Studies. 1980
ISBN 90-277-1092-9; Pb 90-277-1093-7
61. M.A. Finocchiaro: Galileo and the Art of Reasoning. Rhetorical Foundation of
Logic and Scientific Method. 1980
ISBN 90-277-1094-5; Pb 90-277-1095-3
62. W.A. Wallace: Prelude to Galileo. Essays on Medieval and 16th-Century
Sources of Galileo's Thought. 1981
ISBN 90-277-1215-8; Pb 90-277-1216-6
63. F. Rapp: Analytical Philosophy of Technology. Translated from German. 1981
ISBN 90-277-1221-2; Pb 90-277-1222-0
64. R.S. Cohen and M.W. Wartofsky (eds.): Hegel and the Sciences. 1984
ISBN 90-277-0726-X
65. J. Agassi: Science and Society. Studies in the Sociology of Science. 1981
ISBN 90-277-1244-1; Pb 90-277-1245-X
66. L. Tondl: Problems of Semantics. A Contribution to the Analysis of the
Language of Science. Translated from Czech. 1981
ISBN 90-277-0148-2; Pb 90-277-0316-7
67. J. Agassi and R.S. Cohen (eds.): Scientific Philosophy Today. Essays in Honor
of Mario Bunge. 1982
ISBN 90-277-1262-X; Pb 90-277-1263-8

68. W. Krajewski (ed.): Polish Essays in the Philosophy of the Natural Sciences.
Translated from Polish and edited by R.S. Cohen and C.R. Fawcett. 1982
ISBN 90-277-1286-7; Pb 90-277-1287-5
69. J.H. Fetzer: Scientific Knowledge. Causation, Explanation and Corroboration.
1981
ISBN 90-277-1335-9; Pb 90-277-1336-7
70. S. Grossberg: Studies of Mind and Brain. Neural Principles of Learning,
Perception, Development, Cognition, and Motor Control. 1982
ISBN 90-277-1359-6; Pb 90-277-1360-X
71. R.S. Cohen and M.W. Wartofsky (eds.): Epistemology, Methodology, and the
Social Sciences. 1983
ISBN 90-277-1454-1
72. K. Berka: Measurement. Its Concepts, Theories and Problems. Translated from
Czech. 1983
ISBN 90-277-1416-9
73. G.L. Pandit: The Structure and Growth of Scientific Knowledge. A Study in the
Methodology of Epistemic Appraisal. 1983
ISBN 90-277-1434-7
74. A.A. Zinov'ev: Logical Physics. Translated from Russian. Edited by R.S.
Cohen. 1983
ISBN 90-277-0734-0
See also Volume 9.
75. G-G. Granger: Formal Thought and the Sciences of Man. Translated from
French. With an Introduction by A. Rosenberg. 1983
ISBN 90-277-1524-6
76. R.S. Cohen and L. Laudan (eds.): Physics, Philosophy and Psychoanalysis.
Essays in Honor of Adolf Grünbaum. 1983
ISBN 90-277-1533-5
77. G. Böhme, W. van den Daele, R. Hohlfeld, W. Krohn and W. Schäfer:
Finalization in Science. The Social Orientation of Scientific Progress.
Translated from German. Edited by W. Schäfer. 1983
ISBN 90-277-1549-1
78. D. Shapere: Reason and the Search for Knowledge. Investigations in the
Philosophy of Science. 1984
ISBN 90-277-1551-3; Pb 90-277-1641-2
79. G. Andersson (ed.): Rationality in Science and Politics. Translated from
German. 1984
ISBN 90-277-1575-0; Pb 90-277-1953-5
80. P.T. Durbin and F. Rapp (eds.): Philosophy and Technology. [Also Philosophy
and Technology Series, Vol. 1] 1983
ISBN 90-277-1576-9
81. M. Markovic: Dialectical Theory of Meaning. Translated from Serbo-Croat.
1984
ISBN 90-277-1596-3
82. R.S. Cohen and M.W. Wartofsky (eds.): Physical Sciences and History of
Physics. 1984.
ISBN 90-277-1615-3
83. E. Meyerson: The Relativistic Deduction. Epistemological Implications of the
Theory of Relativity. Translated from French. With a Review by Albert
Einstein and an Introduction by Milic Capek. 1985
ISBN 90-277-1699-4
84. R.S. Cohen and M.W. Wartofsky (eds.): Methodology, Metaphysics and the
History of Science. In Memory of Benjamin Nelson. 1984 ISBN 90-277-1711-7
85. G. Tamas: The Logic of Categories. Translated from Hungarian. Edited by R.S.
Cohen. 1986
ISBN 90-277-1742-7
86. S.L. de C. Fernandes: Foundations of Objective Knowledge. The Relations of
Popper's Theory of Knowledge to That of Kant. 1985
ISBN 90-277-1809-1

87. R.S. Cohen and T. Schnelle (eds.): Cognition and Fact. Materials on Ludwik
Fleck. 1986
ISBN 90-277-1902-0
88. G. Freudenthal: Atom and Individual in the Age of Newton. On the Genesis of
the Mechanistic World View. Translated from German. 1986
ISBN 90-277-1905-5
89. A. Donagan, A.N. Perovich Jr and M.V. Wedin (eds.): Human Nature and
Natural Knowledge. Essays presented to Marjorie Grene on the Occasion of
Her 75th Birthday. 1986
ISBN 90-277-1974-8
90. C. Mitcham and A. Huning (eds.): Philosophy and Technology II. Information
Technology and Computers in Theory and Practice. [Also Philosophy and
Technology Series, Vol. 2] 1986
ISBN 90-277-1975-6
91. M. Grene and D. Nails (eds.): Spinoza and the Sciences. 1986
ISBN 90-277-1976-4
92. S.P. Turner: The Search for a Methodology of Social Science. Durkheim,
Weber, and the 19th-Century Problem of Cause, Probability, and Action. 1986.
ISBN 90-277-2067-3
93. I.C. Jarvie: Thinking about Society. Theory and Practice. 1986
ISBN 90-277-2068-1
94. E. Ullmann-Margalit (ed.): The Kaleidoscope of Science. The Israel Colloquium: Studies in History, Philosophy, and Sociology of Science, Vol. 1. 1986
ISBN 90-277-2158-0; Pb 90-277-2159-9
95. E. Ullmann-Margalit (ed.): The Prism of Science. The Israel Colloquium:
Studies in History, Philosophy, and Sociology of Science, Vol. 2. 1986
ISBN 90-277-2160-2; Pb 90-277-2161-0
96. G. Markus: Language and Production. A Critique of the Paradigms. Translated
from French. 1986
ISBN 90-277-2169-6
97. F. Amrine, F.J. Zucker and H. Wheeler (eds.): Goethe and the Sciences: A
Reappraisal. 1987
ISBN 90-277-2265-X; Pb 90-277-2400-8
98. J.C. Pitt and M. Pera (eds.): Rational Changes in Science. Essays on Scientific
Reasoning. Translated from Italian. 1987
ISBN 90-277-2417-2
99. O. Costa de Beauregard: Time, the Physical Magnitude. 1987
ISBN 90-277-2444-X
100. A. Shimony and D. Nails (eds.): Naturalistic Epistemology. A Symposium of
Two Decades. 1987
ISBN 90-277-2337-0
101. N. Rotenstreich: Time and Meaning in History. 1987
ISBN 90-277-2467-9
102. D.B. Zilberman: The Birth of Meaning in Hindu Thought. Edited by R.S.
Cohen. 1988
ISBN 90-277-2497-0
103. T.F. Glick (ed.): The Comparative Reception of Relativity. 1987
ISBN 90-277-2498-9
104. Z. Harris, M. Gottfried, T. Ryckman, P. Mattick Jr, A. Daladier, T.N. Harris
and S. Harris: The Form of Information in Science. Analysis of an Immunology
Sublanguage. With a Preface by Hilary Putnam. 1989
ISBN 90-277-2516-0

105. F. Burwick (ed.): Approaches to Organic Form. Permutations in Science and
Culture. 1987
ISBN 90-277-2541-1
106. M. Almasi: The Philosophy of Appearances. Translated from Hungarian. 1989
ISBN 90-277-2150-5
107. S. Hook, W.L. O'Neill and R. O'Toole (eds.): Philosophy, History and Social
Action. Essays in Honor of Lewis Feuer. With an Autobiographical Essay by L.
Feuer. 1988
ISBN 90-277-2644-2
108. I. Hronszky, M. Feher and B. Dajka: Scientific Knowledge Socialized. Selected
Proceedings of the 5th Joint International Conference on the History and
Philosophy of Science organized by the IUHPS (Veszprem, Hungary, 1984).
1988
ISBN 90-277-2284-6
109. P. Tillers and E.D. Green (eds.): Probability and Inference in the Law of
Evidence. The Uses and Limits of Bayesianism. 1988
ISBN 90-277-2689-2
110. E. Ullmann-Margalit (ed.): Science in Reflection. The Israel Colloquium:
Studies in History, Philosophy, and Sociology of Science, Vol. 3. 1988
ISBN 90-277-2712-0; Pb 90-277-2713-9
111. K. Gavroglu, Y. Goudaroulis and P. Nicolacopoulos (eds.): Imre Lakatos and
Theories of Scientific Change. 1989
ISBN 90-277-2766-X
112. B. Glassner and J.D. Moreno (eds.): The Qualitative-Quantitative Distinction in
the Social Sciences. 1989
ISBN 90-277-2829-1
113. K. Arens: Structures of Knowing. Psychologies of the 19th Century. 1989
ISBN 0-7923-0009-2
114. A. Janik: Style, Politics and the Future of Philosophy. 1989
ISBN 0-7923-0056-4
115. F. Amrine (ed.): Literature and Science as Modes of Expression. With an
Introduction by S. Weininger. 1989
ISBN 0-7923-0133-1
116. J.R. Brown and J. Mittelstrass (eds.): An Intimate Relation. Studies in the
History and Philosophy of Science. Presented to Robert E. Butts on His 60th
Birthday. 1989
ISBN 0-7923-0169-2
117. F. D'Agostino and I.C. Jarvie (eds.): Freedom and Rationality. Essays in Honor
of John Watkins. 1989
ISBN 0-7923-0264-8
118. D. Zolo: Reflexive Epistemology. The Philosophical Legacy of Otto Neurath.
1989
ISBN 0-7923-0320-2
119. M. Kearn, B.S. Philips and R.S. Cohen (eds.): Georg Simmel and Contemporary
Sociology. 1989
ISBN 0-7923-0407-1
120. T.H. Levere and W.R. Shea (eds.): Nature, Experiment and the Sciences. Essays
on Galileo and the Nature of Science. In Honour of Stillman Drake. 1989
ISBN 0-7923-0420-9
121. P. Nicolacopoulos (ed.): Greek Studies in the Philosophy and History of
Science. 1990
ISBN 0-7923-0717-8
122. R. Cooke and D. Costantini (eds.): Statistics in Science. The Foundations of
Statistical Methods in Biology, Physics and Economics. 1990
ISBN 0-7923-0797-6

123. P. Duhem: The Origins of Statics. Translated from French by G.F. Leneaux,
V.N. Vagliente and G.H. Wagner. With an Introduction by S.L. Jaki. 1991
ISBN 0-7923-0898-0
124. H. Kamerlingh Onnes: Through Measurement to Knowledge. The Selected
Papers, 1853-1926. Edited and with an Introduction by K. Gavroglu and Y.
Goudaroulis. 1991
ISBN 0-7923-0825-5
125. M. Capek: The New Aspects of Time: Its Continuity and Novelties. Selected
Papers in the Philosophy of Science. 1991
ISBN 0-7923-0911-1
126. S. Unguru (ed.): Physics, Cosmology and Astronomy, 1300-1700. Tension and
Accommodation. 1991
ISBN 0-7923-1022-5
127. Z. Bechler: Newton's Physics and the Conceptual Structure of the Scientific
Revolution. 1991
ISBN 0-7923-1054-3
128. E. Meyerson: Explanation in the Sciences. Translated from French by M-A.
Siple and D.A. Siple. 1991
ISBN 0-7923-1129-9
129. A.I. Tauber (ed.): Organism and the Origins of Self. 1991
ISBN 0-7923-1185-X
130. F.J. Varela and J-P. Dupuy (eds.): Understanding Origins. Contemporary
Views on the Origin of Life, Mind and Society. 1992
ISBN 0-7923-1251-1
131. G.L. Pandit: Methodological Variance. Essays in Epistemological Ontology
and the Methodology of Science. 1991
ISBN 0-7923-1263-5
132. G. Munevar (ed.): Beyond Reason. Essays on the Philosophy of Paul
Feyerabend. 1991
ISBN 0-7923-1272-4
133. T.E. Uebel (ed.): Rediscovering the Forgotten Vienna Circle. Austrian Studies
on Otto Neurath and the Vienna Circle. Partly translated from German. 1991
ISBN 0-7923-1276-7
134. W.R. Woodward and R.S. Cohen (eds.): World Views and Scientific Discipline
Formation. Science Studies in the [former] German Democratic Republic.
Partly translated from German by W.R. Woodward. 1991
ISBN 0-7923-1286-4
135. P. Zambelli: The Speculum Astronomiae and Its Enigma. Astrology, Theology
and Science in Albertus Magnus and His Contemporaries. 1992
ISBN 0-7923-1380-1
136. P. Petitjean, C. Jami and A.M. Moulin (eds.): Science and Empires. Historical
Studies about Scientific Development and European Expansion.
ISBN 0-7923-1518-9
137. W.A. Wallace: Galileo's Logic of Discovery and Proof. The Background,
Content, and Use of His Appropriated Treatises on Aristotle's Posterior
Analytics. 1992
ISBN 0-7923-1577-4
138. W.A. Wallace: Galileo's Logical Treatises. A Translation, with Notes and
Commentary, of His Appropriated Latin Questions on Aristotle's Posterior
Analytics. 1992
ISBN 0-7923-1578-2
Set (137 + 138) ISBN 0-7923-1579-0

139. M.J. Nye, J.L. Richards and R.H. Stuewer (eds.): The Invention of Physical
Science. Intersections of Mathematics, Theology and Natural Philosophy since
the Seventeenth Century. Essays in Honor of Erwin N. Hiebert. 1992
ISBN 0-7923-1753-X
140. G. Corsi, M.L. dalla Chiara and G.C. Ghirardi (eds.): Bridging the Gap:
Philosophy, Mathematics and Physics. Lectures on the Foundations of Science.
1992
ISBN 0-7923-1761-0
141. C.-H. Lin and D. Fu (eds.): Philosophy and Conceptual History of Science in
Taiwan. 1992
ISBN 0-7923-1766-1
142. S. Sarkar (ed.): The Founders of Evolutionary Genetics. A Centenary Reappraisal. 1992
ISBN 0-7923-1777-7
143. J. Blackmore (ed.): Ernst Mach - A Deeper Look. Documents and New
Perspectives. 1992
ISBN 0-7923-1853-6
144. P. Kroes and M. Bakker (eds.): Technological Development and Science in the
Industrial Age. New Perspectives on the Science-Technology Relationship.
1992
ISBN 0-7923-1898-6
145. S. Amsterdamski: Between History and Method. Disputes about the Rationality
of Science. 1992
ISBN 0-7923-1941-9
146. E. Ullmann-Margalit (ed.): The Scientific Enterprise. The Bar-Hillel Colloquium: Studies in History, Philosophy, and Sociology of Science, Volume 4.
1992
ISBN 0-7923-1992-3
147. L. Embree (ed.): Metaarchaeology. Reflections by Archaeologists and Philosophers. 1992
ISBN 0-7923-2023-9
148. S. French and H. Kamminga (eds.): Correspondence, Invariance and Heuristics. Essays in Honour of Heinz Post. 1993
ISBN 0-7923-2085-9
149. M. Bunzl: The Context of Explanation. 1993
ISBN 0-7923-2153-7
150. I.B. Cohen (ed.): The Natural Sciences and the Social Sciences. Some Critical
and Historical Perspectives. 1994
ISBN 0-7923-2223-1
151. K. Gavroglu, Y. Christianidis and E. Nicolaidis (eds.): Trends in the Historiography of Science. 1994
ISBN 0-7923-2255-X
152. S. Poggi and M. Bossi (eds.): Romanticism in Science. Science in Europe,
1790-1840. 1994
ISBN 0-7923-2336-X
153. J. Faye and H.J. Folse (eds.): Niels Bohr and Contemporary Philosophy. 1994
ISBN 0-7923-2378-5
154. C.C. Gould and R.S. Cohen (eds.): Artifacts, Representations, and Social
Practice. Essays for Marx W. Wartofsky. 1994
ISBN 0-7923-2481-1
155. R.E. Butts: Historical Pragmatics. Philosophical Essays. 1993
ISBN 0-7923-2498-6
156. R. Rashed: The Development of Arabic Mathematics: Between Arithmetic and
Algebra. Translated from French by A.F.W. Armstrong. 1994
ISBN 0-7923-2565-6

157. I. Szumilewicz-Lachman (ed.): Zygmunt Zawirski: His Life and Work. With
Selected Writings on Time, Logic and the Methodology of Science. Translations by Feliks Lachman. Ed. by R.S. Cohen, with the assistance of B. Bergo.
1994
ISBN 0-7923-2566-4
158. S.N. Haq: Names, Natures and Things. The Alchemist Jābir ibn Ḥayyān and
His Kitāb al-Aḥjār (Book of Stones). 1994
ISBN 0-7923-2587-7
159. P. Plaass: Kant's Theory of Natural Science. Translation, Analytic Introduction
and Commentary by Alfred E. and Maria G. Miller. 1994
ISBN 0-7923-2750-0
160. J. Misiek (ed.): The Problem of Rationality in Science and its Philosophy. On
Popper vs. Polanyi. The Polish Conferences 1988-89. 1995
ISBN 0-7923-2925-2
161. I.C. Jarvie and N. Laor (eds.): Critical Rationalism, Metaphysics and Science.
Essays for Joseph Agassi, Volume I. 1995
ISBN 0-7923-2960-0
162. I.C. Jarvie and N. Laor (eds.): Critical Rationalism, the Social Sciences and the
Humanities. Essays for Joseph Agassi, Volume II. 1995
ISBN 0-7923-2961-9
Set (161-162) ISBN 0-7923-2962-7
163. K. Gavroglu, J. Stachel and M.W. Wartofsky (eds.): Physics, Philosophy, and
the Scientific Community. Essays in the Philosophy and History of the Natural
Sciences and Mathematics. In Honor of Robert S. Cohen. 1995
ISBN 0-7923-2988-0
164. K. Gavroglu, J. Stachel and M.W. Wartofsky (eds.): Science, Politics and
Social Practice. Essays on Marxism and Science, Philosophy of Culture and
the Social Sciences. In Honor of Robert S. Cohen. 1995
ISBN 0-7923-2989-9
165. K. Gavroglu, J. Stachel and M.W. Wartofsky (eds.): Science, Mind and Art.
Essays on Science and the Humanistic Understanding in Art, Epistemology,
Religion and Ethics. Essays in Honor of Robert S. Cohen. 1995
ISBN 0-7923-2990-2
Set (163-165) ISBN 0-7923-2991-0
166. K.H. Wolff: Transformation in the Writing. A Case of Surrender-and-Catch.
1995
ISBN 0-7923-3178-8
167. A.J. Kox and D.M. Siegel (eds.): No Truth Except in the Details. Essays in
Honor of Martin J. Klein. 1995
ISBN 0-7923-3195-8
168. J. Blackmore: Ludwig Boltzmann, His Later Life and Philosophy, 1900-1906.
Book One: A Documentary History. 1995
ISBN 0-7923-3231-8
169. R.S. Cohen, R. Hilpinen and Q. Renzong (eds.): Realism and Anti-Realism in
the Philosophy of Science. Beijing International Conference, 1992. 1995
ISBN 0-7923-3233-4
170. I. Kuçuradi and R.S. Cohen (eds.): The Concept of Knowledge. The Ankara
Seminar. 1995
ISBN 0-7923-3241-5

171. M.A. Grodin (ed.): Meta Medical Ethics: The Philosophical Foundations of
Bioethics. 1995
ISBN 0-7923-3344-6
172. S. Ramirez and R.S. Cohen (eds.): Mexican Studies in the History and
Philosophy of Science. 1995
ISBN 0-7923-3462-0
173. C. Dilworth: The Metaphysics of Science. An Account of Modern Science in
Terms of Principles, Laws and Theories. 1995
ISBN 0-7923-3693-3
174. J. Blackmore: Ludwig Boltzmann, His Later Life and Philosophy, 1900-1906.
Book Two: The Philosopher. 1995
ISBN 0-7923-3464-7
175. P. Damerow: Abstraction and Representation. Essays on the Cultural Evolution
of Thinking. 1995
ISBN 0-7923-3813-8
176. G. Tarozzi (ed.): Karl Popper, Philosopher of Science.
(in prep.)
177. M. Marion and R.S. Cohen (eds.): Quebec Studies in the Philosophy of Science.
Part I: Logic, Mathematics, Physics and History of Science. Essays in Honor of
Hugues Leblanc. 1995
ISBN 0-7923-3559-7
178. M. Marion and R.S. Cohen (eds.): Quebec Studies in the Philosophy of Science.
Part II: Biology, Psychology, Cognitive Science and Economics. Essays in
Honor of Hugues Leblanc. 1996
ISBN 0-7923-3560-0
Set (177-178) ISBN 0-7923-3561-9
179. F. Dainian and R.S. Cohen (eds.): Chinese Studies in the History and
Philosophy of Science and Technology. 1995
ISBN 0-7923-3463-9
180. P. Forman and J.M. Sanchez-Ron (eds.): National Military Establishments and
the Advancement of Science and Technology. Studies in 20th Century History.
1995
ISBN 0-7923-3541-4
181. E.J. Post: Quantum Reprogramming. Ensembles and Single Systems: A Two-Tier Approach to Quantum Mechanics. 1995
ISBN 0-7923-3565-1

Also of interest:
R.S. Cohen and M.W. Wartofsky (eds.): A Portrait of Twenty-Five Years Boston
Colloquia for the Philosophy of Science, 1960-1985. 1985
ISBN Pb 90-277-1971-3
Previous volumes are still available.
KLUWER ACADEMIC PUBLISHERS - DORDRECHT / BOSTON / LONDON
