Springer Handbook
of Model-Based Science
Springer Handbooks provide a concise compilation of approved key information on methods of research, general principles, and functional relationships in physical and applied sciences. The world’s leading experts in the fields of physics and engineering will be assigned by one or several renowned editors to write the chapters comprising each volume. The content is selected by these experts from Springer sources (books, journals, online content) and other systematic and approved recent publications of scientific and technical information.
The volumes are designed to be useful as a readable desk reference book to give a fast and comprehensive overview and easy retrieval of essential reliable key information, including tables, graphs, and bibliographies. References to extensive sources are provided.
Springer Handbook of Model-Based Science
Lorenzo Magnani, Tommaso Bertolotti (Eds.)
Editors
Lorenzo Magnani
University of Pavia
Department of Humanities
Piazza Botta 6
Pavia 27100, Italy
Tommaso Bertolotti
University of Pavia
Department of Humanities
Piazza Botta 6
Pavia 27100, Italy
Foreword: Thinking Inside and Outside
Hardly a minute of our lives goes by without reasoning, without “going beyond the information given” (Bruner, 1966). Reasoning happens without our awareness and without our intention. Our visual system rapidly parses light arriving at the retina into figures and ground, into the faces, bodies, objects, scenes, and actions that we recognize in fractions of a second and that enter awareness. The gist stays with us, but the details are not recorded, a process revealed in change blindness. Perception is deeply and inherently tied to action, epitomized in mirror neurons, single cells in a monkey cortex that respond both to perceiving an action and to performing it, linking perception and action in a single neuron. The world we see is constantly changing; in order to act, perception is used to predict what happens next, allowing us to catch balls and avoid collisions. Experienced chess and basketball players parse and predict at a more sophisticated level, but the underlying processes seem to be similar: extensive practice in seeing patterns, interpreting them, and selecting appropriate actions. This kind of reasoning is fast thinking (Kahneman, 2011), and the kind of reasoning thoughtfully and thoroughly analyzed in this impressive volume is slow thinking, thinking that is deliberate and reflective and that can unfold over weeks and years and even centuries.
How does deliberative reasoning happen? It happens in a multitude of ways, as is shown insightfully in the many domains analyzed in the chapters that follow. Might there be a way to encompass all of them? Here is one view of thinking; let us call it the inside-outside view, a view inspired by theories of Bruner (1966), Piaget (1954), Norman and Rumelhart (1975), Shepard (1984), and Vygotsky (1962), among others, who, however, cannot be held responsible for the present formulation. The inside part is the thinking that happens between the ears; the outside part is the thinking that gets put into the body and the world. This view is iterative, as what gets put in the world can get internalized and then get worked on inside. Inside thinking can be further separated into representations and operations. Representations and operations are useful fictions, ways to think and talk about thinking. Do not expect philosophical precision from them (there is a temperamental difference between psychologists and philosophers: psychologists live on generalities that ignore sloppy variability, philosophers live on elegant distinctions) or look for them in the brain. On this view, representations are internalized perceptions. However, representations cannot be copies; they are highly processed. They are interpretations of the content that is the focus of thought. They may select some information from the world and ignore other information, they may rework the information selected, and they may add information, drawing on information already stored in the brain. In this sense, representations are models. On this view, operations are internalized actions, which are analogous to actions in the world. Operations act on representations, transforming them and thereby creating new representations. Examples are in order. We may form a representation of the arrangement of furniture in a room or a corporate structure. We can then draw inferences from the representations by imagining or carrying out actions on the representations, such as comparing parts or tracing paths, for example that the coffee table is too far from the couch or the route from one division to another is too circuitous. You have probably noted that those inferences also depend on functional information stored between the ears. We can then transform the arrangements to create new configurations and act on those to draw further inferences. Seen this way, representations can be created by operations; these processes are iterative and reductive. However, momentarily, representations are regarded as static and transformations as active, changing representations to generate inferences that go beyond the information given to yield new thoughts and discoveries.
The ways that we talk about thinking suggest generality to this view. When we understand, we say that we see something; we have an image or an idea, or a thought or a concept. These are static and they stay still to be acted on. Then we pull ideas together, compare them, turn them inside out, divide them up, reorganize them, or toss them out.
Forming representations, keeping them in mind, and transforming them can quickly overwhelm the mind. When thought overwhelms the mind, the mind puts thought in the body and the world. Counting is a paradigmatic example: actions of the finger or the hand (or the head or the eye) on an array of objects in the world, pointing to them or moving them, while keeping track of the count with number words. Counting is a sequence of actions on objects linked one-to-one to a sequence of concepts. If representations are internalized perceptions and transformations of thoughts are internalized actions, re-externalizing representations and transformations should promote thought. Moreover, they do, as counting and countless other examples demonstrate. The actions of the body on the objects in the world exemplify the outside kind of thinking and, importantly, they are linked to internal actions, in this case, keeping track of the count.
Putting thought into the world expands the mind. Putting thought into the world allows physical actions to substitute for mental ones. Putting thought in the world makes thought evident to others and to ourselves at other times. Putting thought into the world separates the representing and the transforming and makes them apparent. To be effective, both inside and outside, the representations and the transformations, should be congruent with thought (e.g., Tversky, 2011, 2015). Maps preserve the spatial relations of landmarks and paths. Discrete one-to-one actions help children add and continuous actions help children estimate (Segal, Tversky, and Black, 2014). Gesturing the layout of an environment helps adults remember the spatial relations (Jamalian, Giardino, and Tversky, 2013).
This analysis, representations and operations, inside and outside, is simple, even simplistic. The complexity comes from the interactions between and within inside and outside, in the constructions of the representations, and in the subtleties of the inferences. Representations and operations are intimately interlinked. Representations, internal or external, carry content, but they also have a format, i.e., the way that information is captured and arrayed. The format encourages certain inferences and discourages others. The Arabic number system is friendlier to arithmetic computations than the Roman number system. Maps are friendlier to inferences about routes, distances, and directions than tables of GPS coordinates. Finding the right representation, i.e., the one that both captures the information at hand and enables productive inferences, can be hard work; the history of science is filled with such struggles from the structure and workings of the universe to those of the smallest particles. In The Double Helix (1968), Watson describes the intricate back-and-forth between empirical findings, theory, hypotheses, conversation, photographs, and models that led to the discovery of the model of DNA that succeeded in integrating the biological and chemical structures and phenomena. Typically, there is no single right representation exactly because different representations capture different information, highlight different relationships, and encourage different inferences. A pilot’s map will not serve a hiker or a bicyclist, or a surveyor. Chemists have a multitude of representations of molecules, human biologists of the body, statisticians of a set of data, each designed to simplify, highlight, explain, understand, explore, or discover different aspects of multifaceted, often elusive, phenomena.
This very simple view hides enormous complexity. It can be accused of being an oversimplification. If it were all that simple, design, science, and mathematics would be done with, and they are not. Fortunately, the volume at hand corrects that oversimplification. Introducing this volume is a humbling task. So many kinds of reasoning are revealed so perceptively in so many domains. The diverse thoughtful and thought-provoking contributions reveal fascinating intricacies in model-based reasoning, the nuances of finding suitable representations (models), and the complexities of using them to draw inferences. The many insights in each contribution and in the section overviews cannot readily be summarized; they must be savored. They will be a continuing source of inspiration.

Barbara Tversky
Stanford University and Columbia Teachers College

References
J. S. Bruner: On cognitive growth. In: Studies in Cognitive Growth, ed. by J. S. Bruner, R. R. Olver, P. M. Greenfield (Wiley, Oxford 1966) pp. 1–29
A. Jamalian, V. Giardino, B. Tversky: Gestures for thinking, Proc. 35th Annu. Conf. Cogn. Sci. Soc., ed. by M. Knauff, M. Pauen, N. Sebanz, I. Wachsmuth (Cognitive Science Society, Austin 2013)
J. D. Watson: The Double Helix: A Personal Account of the Discovery of the Structure of DNA (Atheneum, New York 1968)
D. Kahneman: Thinking, Fast and Slow (Macmillan, New York 2011)
D. A. Norman, D. E. Rumelhart: Explorations in Cognition (WH Freeman, San Francisco 1975)
J. Piaget: The Construction of Reality in the Child (Basic Books, New York 1954)
A. Segal, B. Tversky, J. B. Black: Conceptually congruent actions can promote thought, J. Res. Mem. Appl. Cogn. 3, 124–130 (2014)
R. N. Shepard: Ecological constraints on internal representation: Resonant kinematics of perceiving, imagining, thinking, and dreaming, Psychol. Rev. 91, 417–447 (1984)
B. Tversky: Visualizations of thought, Top. Cogn. Sci. 3, 499–535 (2011)
B. Tversky: The cognitive design of tools of thought, Rev. Philos. Psychol. 6, 99–116 (2015), Special Issue on Pictorial and Diagrammatic Representation
L. Vygotsky: Thought and Language (MIT Press, Cambridge 1962)
Foreword
Otávio Bueno
Preface
The debate about models has run through philosophy over the centuries, ranging from the most speculative to the most pragmatic and cognitive outlooks. Recently, epistemological perspectives and both scientific and cognitive insights sparked new interdisciplinary studies about how models are created and used. The relevance of the discourse about models transcended the boundaries of philosophy of science, as it was immensely boosted by the progress being made in computation since the 1950s, making the discourse on models not only relevant to scientists and philosophers, but also to computer scientists, programmers, and logicians. Another field of study, closely connected to modeling, was the study of inferential processes that would go beyond traditional logic and yet play a crucial role in the creation and use of models. The most relevant field, in this respect, concerns abduction and studies on hypothetical cognition.
To provide an initial definition, we can agree that a model is something we use in order to gain some benefit in the understanding or explanation of something else, which can be called the target. “A model allows us to infer something about the thing modeled,” as summarized by the late John Holland in his 1995 book Hidden Order. A model lets us understand the target, and consequently behave in a way that would not be possible without it; different models usually optimize the understanding of different aspects of the target.
This definition of a model should make it easy to appreciate how many situations that we face every day are tackled by making use of models; to deal with other people we make models of their minds and their intentions, to operate machinery we make models of their functioning, in the remote case of trying to escape from wild animals we make models of their hunting strategies and perceptual systems, to explore novel environments we make models of their spatial configurations, to mention only a few. We make use of models in a wide array of circumstances, but what all models actually share is a dimension of non-abstractness; we create them, or make use of models that have already been constructed by other people, and models usually display a distributed nature, since they are either built on external, material supports (i.e., by means of artifacts, paper sheets, sound waves, body gestures) or, in the case of mental models, are encoded in brain wirings by synapses and chemicals (a mental map, for instance, is the mental simulation of the action of drawing a map – a powerful model construction activity – whose embodiment in the brain was made possible by the enhancement of human cognitive capabilities).
In order to grasp to the fullest the rich universe of models, their relevance in model-based science, but also as cognitive architectures, we divided the handbook into nine parts. The first three parts can be seen as the ABC of the discourse, providing a cognitive and theoretical alphabet for the understanding of model-based science, while the remaining six parts each deal with precise, and applied, fields of model-based science.
Part A – Theoretical Issues in Models, edited by Demetris Portides, sets the foundation for all of the subsequent debates, exploring the relationships between models and core notions such as those of theory, representation, explanation, and simulation; furthermore, the part extensively lays out the contemporary and complex debate about the ontology of models, that is, the different stances concerning their existence and reality, answering questions such as “How real are models?”, “Are they fictitious?”, and “Do they exist like an object exists or like an idea exists?”.
In Part B – Theoretical and Cognitive Issues in Abduction and Scientific Inference, Editor Woosuk Park selected contributions exploring the fundamental aspects of a key inference in the production of models, both at the cognitive and scientific levels: abduction. This can be defined as the most basic building block of hypothetical reasoning, or the process of inferring certain facts and/or laws and hypotheses that render some sentences plausible, that explain (and also sometimes discover) a certain (possibly new) phenomenon or observation.
Atocha Aliseda edited Part C – The Logic of Hypothetical Reasoning, Abduction, and Models, offering a broad perspective on how different kinds of logic can be employed to model modeling itself, and how this sheds light on model-building processes. As a bridge between the more theoretical and the more specific parts, Part D – Model-Based Reasoning in Science and History of Science, edited by Nora Alejandrina Schwartz, frames some issues of exemplar theory and cases concerning the use and the understanding of models in the history and philosophy of physics, biology, and social sciences, but is also about the relevant subject of thought experiments.
Albrecht Heeffer edited Part E – Models in Mathematics, which illuminates crucial issues such as the role of diagrams in mathematical reasoning, the importance of models in actual mathematical practice, and the role played by abductive inferences in the emergence of mathematical knowledge.
In Part F – Model-Based Reasoning in Cognitive Science, Editor Athanassios Raftopoulos has selected a number of contributions highlighting the close relationship between model-based science and model-based reasoning (cognitive science being both a model-based science and the science of modeling), namely the model-based processes underpinning vision and diagrammatic reasoning, but also the relevance of deeper cognitive mechanisms such as embodiment and the neural correlates to model-based reasoning.
Francesco Amigoni and Viola Schiaffonati edited the contributions composing Part G – Modeling and Computational Issues, concerning the main intersections between computation, engineering, and model-based science, especially with respect to computational rendering and the simulation of model-based reasoning in artificial cognitive processes, up to robotics.
Part H – Current and Historical Perspectives on the Use of Models in Physics, Chemistry, and Life Sciences, edited by Mauro Dorato and Matteo Morganti, offers an exemplary outlook on the fundamental aspects concerning models in hard sciences and life sciences, from a perspective that is not chiefly historical (a task already covered by Part D), but rather focuses on practical and theoretical issues as they happen in actual scientific practice.
Cameron Shelley edited the final Part I – Models in Engineering, Architecture, and Economical and Human Sciences, providing a series of stimulating and innovative contributions focusing on less represented examples of model-based reasoning and science, for instance in archaeology, economics, architecture, design and innovation, but also social policing and moral reasoning. The focus of this closing part also resides in its ability to show that model-based sciences go beyond the tradition of exact and life sciences, as indeed the reliance on models affects nearly all human endeavors.
The brief excursus on the contents does little justice to the richness and the extensive variety of topics reviewed by the Handbook, but it should be enough to convey one of the main ideas of the Handbook: Models are everywhere, and the study thereof is crucial in any human science, discipline, or practice. This is why we conceived this book, hoping to make it highly relevant not only for the philosophy, epistemology, cognitive science, logic, and computational science communities, but also for theoretical biologists, physicists, engineers, and other human scientists dealing with models in their daily work.
We like to see the ample theoretical breadth of this Handbook as having a counterpart in its editorial genesis. Indeed, when we were offered the opportunity to be general editors of the Springer Handbook of Model-Based Science, an intense activity of decision-making followed. To be able to make a decision, we had to think about what editing a handbook was like. In other words, in order to decide we had to know better, and in order to know better we had to make ourselves a model of handbook editing. This complex model was partly in our heads, partly in sketches and emails. Part of it was deduced from evidence (other handbooks), part of it came out as hypotheses. Once the model was sufficiently stabilized, giving us a good projection of the major criticalities and some (wishful) scheduling, we accepted the challenge, and the model – continuously updating the progression of the work – would guide our behavior step by step.
We undertook the editing of this Handbook because, so far, there is a vast amount of literature on models, on the inferential and logical processes underpinning them, and on the philosophy of model-based science, but it is dispersed in more or less recent (and variably authoritative) collections and monographs, journal articles, and conference proceedings. The aim of this Handbook is to offer the possibility to access the core and the state-of-the-art of these studies in a unique, reliable source on the topic, authored by a team of renowned experts.
The present Handbook is the exemplary fruit of research and collaboration. As general editors, we were able to rely on the formidable team of editors we mentioned above, who took the reins of their parts: Atocha Aliseda, Francesco Amigoni, Mauro Dorato, Albrecht Heeffer, Matteo Morganti, Woosuk Park, Demetris Portides, Athanassios Raftopoulos, Viola Schiaffonati, Nora Alejandrina Schwartz, and Cameron Shelley. They are all remarkable and hard-working academics, and we are most grateful to them for taking the time and shouldering the burden to contact authors, inspire and review contributions, whilst keeping in touch with us. In turn, the editors could count on some of the most renowned and promising scholars in each discipline and field; they are too many to mention here, but our undying recognition and gratitude go to them as well. In addition to our recognition, all of the editors and authors certainly have our congratulations and admiration upon the completion of this work.
Many of the editors and contributors were already part of the ever-growing MBR (model-based reasoning) community, an enthusiastic group of philosophers, epistemologists, logicians, cognitive scientists, computer scientists, engineers, and other academics working in the different and multidisciplinary facets of what is known as model-based reasoning, especially focusing on hypothetical-abductive reasoning and its role in scientific rationality. The outreach of this handbook goes far beyond the theoretical and personal borders of the MBR community, but it can nevertheless be saluted as a celebration of the 17 years of work and exchange since the first MBR conference was held in Pavia, Italy, in 1998. For us, this Handbook is also a recognition of the work and lives of the many beautiful minds who came to join us, or interacted with the MBR community, who are no longer among us but who will be forever remembered and appreciated.
Last but clearly not least, we are most grateful to Springer’s editorial and production team for their constant trust, encouragement, and support. In particular, we wish to thank Leontina Di Cecco, Judith Hinterberg, and Holger Schaepe, as their resilient help and collaboration made a difference in achieving this Handbook.
Finally, beyond its tutorial value for our community, it is our hope that the Handbook will serve as a useful source to attract new researchers to model-based science (and model-based reasoning), and inspire decades of vibrant progress in this fascinating interdisciplinary area. The contents of this Handbook admirably present a very useful bringing together of the vast accomplishments that have taken place in the last 50 years. Certainly, the contents of this Handbook will serve as a valuable tool and guide to those who will produce the even more capable and diverse next generations of research on models.

Pavia, Italy
Lorenzo Magnani
Tommaso Bertolotti
About the Part Editors

University of Cyprus
Dept. of Classics and Philosophy
1678 Nicosia, Cyprus
portides@ucy.ac.cy
Demetris Portides received his PhD from the University of London (LSE) in 2000. He teaches in the Department of Classics and Philosophy at the University of Cyprus. His research interests include topics in philosophy of science, particularly idealization and abstraction, scientific models, and scientific representation. He is currently working on the ways by which idealization and abstraction are employed in the construction of scientific models.

Korea Advanced Institute of Science and Technology (KAIST)
School of Humanities and Social Science
Daejeon, 34141, Korea
woosukpark@kaist.ac.kr
Woosuk Park is currently Full Professor of Philosophy at Korea Advanced Institute of Science and Technology (KAIST). He received his PhD degree from the State University of New York at Buffalo with his dissertation on Duns Scotus’ haecceity theory of individuation. He was interested in the issues on the border between logic and ontology. He has expanded the scope of research primarily by his historical approach. His research area now includes the history of logic and axiomatic methods and the history of medieval science and philosophy. Recently, he has become most interested in abductive cognition. In a series of papers, he has dealt with Lorenzo Magnani’s innovative ideas such as animal abduction or visual abduction, thereby examining to what extent Magnani goes with and beyond Peirce. He has published articles in international journals, including Journal of Applied Logic, Erkenntnis, Foundations of Science, Review of Metaphysics, and The Modern Schoolman.

Universidad Nacional Autónoma de México (UNAM)
Instituto de Investigaciones Filosóficas
04510 Ciudad de México, Mexico
atocha@filosoficas.unam.mx
Atocha Aliseda received her PhD from Stanford University in Philosophy and Symbolic Systems (1997) and later held a postdoctoral position at Groningen University (2000–2002). She is Full Professor at the Institute for Philosophical Research at the National Autonomous University of México (UNAM). She has published and edited a number of books and articles on logic and the philosophy of science and specializes in abduction and the logics of scientific discovery. She is currently working on clinical reasoning.

Universidad de Buenos Aires
Faculty of Economics
Buenos Aires, Argentina
nora_schwartz@yahoo.com.ar
Nora A. Schwartz received her Professor degree in Philosophy from the Universidad de Buenos Aires, Argentina, where she has been teaching since 1997. Professor Schwartz is a member of Asociación de Filosofía de la República Argentina and Sociedad Argentina de Análisis Filosófico. Her research focuses on physical models implicated in creative scientific reasoning leading to conceptual innovation. She takes the cognitive-historical approach developed by Nancy Nersessian in a critical way, emphasizing the cultural dimension of physical models with which scientists reason. She also analyzes the features that satisfactory scientific models should have in order to draw inferences about the reality modeled by them. Currently, she is investigating Luigi Galvani’s “discovery of animal electricity”. In particular, she is interested in his selection of different three-dimensional objects as model candidates to reason about the human neuromuscular system and in the improvement that this historical research case may make to the understanding of analogy.

Ghent University
Centre for Logic and Philosophy of Science
9000 Ghent, Belgium
albrecht.heeffer@ugent.be
Albrecht Heeffer holds a degree in Engineering and a PhD in Philosophy. He publishes on the history of optics, the history of mathematics, and the philosophy of mathematical practice. He is a Research Fellow at Ghent University, Belgium. Albrecht has been a Visiting Fellow at Kobe University (2008), the Sydney Center for the Foundations of Science (Sydney University, 2011), and The Max Planck Institute for History of Science (Berlin, 2014).

University of Cyprus
Department of Psychology
1678 Nicosia, Cyprus
raftop@ucy.ac.cy
Athanassios Raftopoulos received his PhD from the Johns Hopkins University in 1993. He teaches in the Department of Psychology at the University of Cyprus. His research interests include philosophy of science, philosophy of perception, epistemology, philosophy of the mind, and cognitive science. He is currently working on the relation between cognition and perception.

Politecnico di Milano
Dipartimento di Elettronica, Informazione e Bioingegneria
20133 Milano, Italy
francesco.amigoni@polimi.it
Francesco Amigoni received his PhD in Computer Engineering and Automatica from the Politecnico di Milano (Italy) in 2000. He has been an Associate Professor at the Dipartimento di Elettronica, Informazione e Bioingegneria of the Politecnico di Milano since 2007. His main research interests include agents and multiagent systems, autonomous mobile robotics, and the philosophical aspects of artificial intelligence.

Politecnico di Milano
Dipartimento di Elettronica, Informazione e Bioingegneria
20133 Milano, Italy
viola.schiaffonati@polimi.it
Viola Schiaffonati has Laurea (Milano, 1999) and PhD (Genova, 2004) degrees. She is Associate Professor of Logic and Philosophy of Science at the Dipartimento di Elettronica, Informazione e Bioingegneria of Politecnico di Milano. Her main research interests include the philosophical foundations of artificial intelligence and robotics and the philosophy of computing sciences and information, with particular attention to the philosophical issues of computational science and the epistemology of experiments.

University of Rome
Dipartimento di Filosofia, Comunicazione e Spettacolo
00144 Rome, Italy
mauro.dorato@uniroma3.it
Mauro Dorato earned his PhD in Philosophy from the Johns Hopkins University (1992). He is Full Professor for Philosophy of Science at the University of Rome ‘Tre’. He has been Director of the PhD Program in Philosophy since 2013. His research focuses on the philosophy of physics, philosophy of time and the nature of scientific laws. Currently he is Co-Editor of the European Journal for Philosophy of Science.

University of Rome
Dipartimento di Filosofia, Comunicazione e Spettacolo
00144 Rome, Italy
matteo.morganti@uniroma3.it
Matteo Morganti is Associate Professor at the University of Rome ‘Tre’. He works in the field of philosophy of science and is particularly interested in the interplay between science and analytic metaphysics and in the issue of scientific realism versus antirealism and the interpretation of contemporary physical theories. He earned his PhD at the London School of Economics and has worked at IHPST in Paris and the University of Konstanz.

University of Waterloo
Centre for Society, Technology & Values
Waterloo, N2L 3G1, Canada
cam_shelley@yahoo.ca
Cameron Shelley received his PhD in Philosophy from the University of Waterloo in 1999. He is currently a Lecturer at the Centre for Society, Technology & Values in the Department of Systems Design Engineering at the University of Waterloo. His research focuses on model-based reasoning and philosophical issues in technology, such as the involvement of ideology and fairness in technological design.
List of Authors
Contents
20 Model-Based Diagnosis
Antoni Ligęza, Bartłomiej Górny
20.1 A Basic Model for Diagnosis
20.2 A Review and Taxonomy of Knowledge Engineering Methods for Diagnosis
20.3 Model-Based Diagnostic Reasoning
20.4 A Motivation Example
20.5 Theory of Model-Based Diagnosis
20.6 Causal Graphs
20.7 Potential Conflict Structures
20.8 Example Revisited. A Complete Diagnostic Procedure
20.9 Refinement: Qualitative Diagnoses
20.10 Dynamic Systems Diagnosis: The Three-Tank Case
20.11 Incremental Diagnosis
20.12 Practical Example and Tools
20.13 Concluding Remarks
References

37 Biorobotics
Edoardo Datteri
37.1 Robots as Models of Living Systems
37.2 A Short History of Biorobotics

41 Models in Geosciences
Alisa Bokulich, Naomi Oreskes
41.1 What Are Geosciences?
41.2 Conceptual Models in the Geosciences
41.3 Physical Models in the Geosciences
41.4 Numerical Models in the Geosciences
41.5 Bringing the Social Sciences Into Geoscience Modeling
41.6 Testing Models: From Calibration to Validation
41.7 Inverse Problem Modeling
41.8 Uncertainty in Geoscience Modeling
41.9 Multimodel Approaches in Geosciences
41.10 Conclusions
References
List of Abbreviations
F
F.O.K. feeling of knowing
FBM feature-based modeling
FD fractal dimension
FEF front eye fields
FEL fast enabling link
FEM finite element model
FFNN feedforward neural network
FFS feedforward sweep
fMRI functional magnetic resonance imaging
FOL first-order logic
FT fundamental theorem
G
GCM general circulation model
GDI general definition of information
GEM good experimental methodology
GG General Griceanism
GIS geographical information system
GLUE generalized likelihood uncertainty estimation
GOLEM geomorphic-orogenic landscape evolution model
GPL general purpose language
GRM Gaussian random matrix
GRP global recurrent processing
GTF geomorphic transport function
GW Gabbay–Woods
H
HD hypothetico-deductive model of confirmation
HM Hopf monoids
HS hypothetico-structural
HSF high spatial frequency
I
IBAE inference to the best available explanation
IBE inference to the best explanation
ICB idle cash balance
ICT information and communication technologies
IPE intuitive physics engine
IQ intelligence quotient
ISB integrative systems biology
IT inferotemporal cortex in the brain
J
J.O.L. judgment of learning
K
KB knowledge base
KE knowledge engineering
KJ Kawakita Jiro
KL Kullback–Leibler
L
LDI logics of deontic (in)consistency
LEM landscape evolution model
LFI logics of formal inconsistency
LLL lower limit logic
LOC lateral occipital complex
LRP local recurrent processing
LSF low spatial frequency
LT Logic Theorist
LTM long term memory
LTP long term potentiation
LTWM long term working memory
M
MABS multi-agent-based simulation
MASON multi-agent simulator of neighborhoods
MAYA most advanced yet acceptable
MBR model-based reasoning
MG mechanism governing
MI mechanism implemented
MMH massive modularity hypothesis
MoralDM moral decision-making
MSE mean squared error
N
NCF net cash inflow
NFC nonfinancial corporation
O
OBS observations
OMG object management group
OOP object oriented programming
P
P.T.R. prediction of total recall
PCN prenex conjunctive normal
PCS potential conflict structure
pdf probability density function
PDP parallel distributed processing
PI processes of induction
POPI principle of property independence
PSI principles of synthetic intelligence
PT prospect theory
PTS possible-translations semantics
Q
QALY quality adjusted life year
QM quantum mechanics
S
SCN suprachiasmatic nucleus
SD system description
SES socioeconomic status
SeSAm Shell for Simulated Agent Systems
SEU subjective expected utility
SF standard format
SMG specialist modelling group
SSA social structure of accumulation
SSK sociology of scientific knowledge
U
UML unified modeling language
V
VaR value at risk
W
WM working memory
Part A Theoretical Issues in Models
Ed. by Demetris Portides
It is not hard to notice the lack of attention paid to scientific models in mid-twentieth century philosophy of science. Models were, for instance, absent from philosophical theories of scientific explanation; they were also absent from attempts to understand how theoretical concepts relate to experimental results. In the last few decades, however, this has changed, and philosophers of science are increasingly turning their attention to scientific models. Models and their relation to other parts of the scientific apparatus are now under philosophical scrutiny; at the same time, they are instrumental parts of approaches that aim to address certain philosophical questions.
After recognizing the significance of models in scientific inquiry and in particular the significance of models in linking theoretical concepts to experimental reports, philosophers have begun to explore a number of questions about the nature and function of models. There are several philosophically interesting questions that could fit very well into the theme of this set of chapters. For example, what is the function of models? and what is the role of idealization and abstraction in modeling? It is, however, not the objective of this set of chapters to address every detail about models that has gained philosophical interest over time. In this part of the book five model-related philosophical questions are isolated from others and are explored in separate chapters:
1. What is a scientific model?
2. How do models and theories relate?
3. How do models represent phenomena?
4. How do models function in scientific explanation?
5. How do models and other modes of scientific theorizing, such as simulations, relate?
Of course, the authors of these chapters are all aware that isolating these questions is only done in order to reach an intelligible exposition of the explored problems concerning models, and not because different questions have been kept systematically apart in the philosophical literature that preceded this work. In fact, the very nature of some of these questions dictates an interrelation with others and attempts to address one leads to overlaps with attempts to address others. For example, how one addresses the question what sort of entities are models? or how one conceives the theory–model relation affects the understanding of their scientific representation and scientific explanation, and vice versa. Although this point becomes evident in the subsequent chapters, a conscious attempt was made by each author to focus on the one question of concern of their chapter and to attempt to extrapolate and explicate the different proposed philosophical accounts that have been offered in the quest to answer that particular question. We hope that the final outcome is helpful and illuminating to the reader.
Axel Gelfert in his contribution, Chap. 1: The Ontology of Models, explicates the different ways in which philosophers have addressed the issue of what a scientific model is. For historical reasons, he begins by examining the view that was foremost almost a century ago, which held that models could be understood as analogies. He then quickly turns his attention to a debate that took place in the second half of the twentieth century between advocates of logical positivism, who held that models are interpretations of a formal calculus, and advocates of the semantic view, which maintained that models are directly defined mathematical structures. He continues by examining the more recent view, which identifies models with fictional entities. He closes his chapter with an explication of what he calls the more pragmatic accounts, which hold that models can best be understood with the use of a mixed ontology.
In Chap. 2: Models and Theories, Demetris Portides explicates the two main conceptions of the structure of scientific theories (and subsequently the two main conceptions of the theory–model relation) in the history of the philosophy of science, the received and the semantic views. He takes the reader through the main arguments that led to the collapse of the received view and gives the reader a lens by which to distinguish the different versions of the semantic view. He finally presents the main arguments against the semantic view and in doing so he explicates a more recent philosophical trend that conceives the theory–model relation as too complex to sufficiently capture with formal tools.
Roman Frigg and James Nguyen, in Chap. 3: Models and Representation begin by analyzing the concept of representation and clarifying its main characteristics and the conditions of adequacy any theory of representation should meet. They then proceed to explain the main theories of representation that have been proposed in the literature and explain with reference to their proposed set of characteristics and conditions of adequacy where each theory is found wanting. The similarity, the structuralist, the inferential, the fictionalist, and the denotational accounts of representation are all thoroughly explained and critically assessed. By doing this the authors expose and explicate many of the weaknesses of the different accounts of representation.
In Chap. 4: Models and Explanation, Alisa Bokulich explains that by recognizing the extensive use of models in science and by realizing that models are more often than not highly idealized and incomplete descriptions of phenomena that frequently incorporate fictional elements, philosophers have been led to revise previous philosophical accounts of scientific explanation. By scrutinizing different model-based accounts of scientific explanation offered in the literature and exposing the problems involved, she highlights the difficulties involved in resolving the issue of whether or not the falsehoods present in models are operative in scientific explanation.
Finally, Nancy Nersessian and Miles McLeod, in Chap. 5: Models and Simulations, explicate a more recent issue that is increasingly gaining the interest of philosophers: how scientific models, i.e., the mathematical entities that scientists traditionally use to represent phenomena, relate to simulations, particularly computational simulations. They give a flavor of the characteristics of computational simulations both in the context of well-developed overarching theories and in the context where an overarching theory is absent. The authors also highlight the epistemological significance of simulations for all such contexts by elaborating on how simulations introduce novel problems that should concern philosophers. Finally, they elaborate on the relation between simulations and other constructs of human cognition such as thought experiments.
In most cases, in all chapters the technical aspects of the philosophical arguments have been kept to a minimum in order to make them accessible even to readers working outside the sphere of the philosophy of science. Suppressing the technical aspects has not, however, introduced misrepresentation or distortion to philosophical arguments.
1. The Ontology of Models
Axel Gelfert
The philosophical discussion about models has emerged from a cluster of concerns, which span a range of theoretical, formal, and practical questions across disciplines ranging from logic and mathematics to aesthetics and artistic representations. In what follows, the term models will normally be taken as synonymous to scientific models, and any departure from this usage – for example, when discussing the use of models in nonscientific settings – will either be indicated explicitly or will be clear from context. Focusing on scientific models helps to clarify matters, but still leaves a wide range of competing philosophical approaches for discussion. This chapter will summarize and critically discuss a number of such approaches, especially those that shed light on the question what is a model?; these will range from views that, by now, are of largely historical interest to recent proposals at the cutting edge of the philosophy of science. While the emphasis throughout will be on the ontology of models, it will often be necessary to also reflect on their function, use, and construction. This is not meant to duplicate the discussion provided in other chapters of this handbook; rather, it is the natural result of scientific models having traditionally been defined either in terms of their function (e.g., to provide representations of target systems) or via their relation to other (purportedly) better understood entities, such as scientific theories.
The rest of this chapter is organized as follows: Sect. 1.1 will set the scene by introducing a number of examples of scientific models, thereby raising the question of what degree of unity any philosophical account of scientific models can reasonably aspire to. Section 1.2 will characterize models as functional entities and will provide a general taxonomy for how to classify various possible philosophical approaches. A first important class of specific accounts, going back to nineteenth-century scientists and philosophers, will be discussed in Sect. 1.3, which focuses on models as analogies. Section 1.4 is devoted to formal approaches that dominated much of twentieth-century discussion of scientific models. In particular, it will survey the syntactic view of theories and models and its main competitor, the semantic view, along with recent formal approaches (such as the partial structures approach) which aim to address the shortcomings of their predecessors. Section 1.5 provides a sketch of what has been called the folk ontology of models – that is, a commonly shared set of assumptions that inform the views of scientific practitioners. On this view, models are place-holders for imaginary concrete systems and as such are not unlike fictions. The implications of fictionalism about models are discussed in Sect. 1.6. Finally, in Sect. 1.7, recent pragmatic accounts are discussed, which give rise to what may be called a mixed ontology, according to which models are best conceived of as a heterogeneous mixture of elements.
$$E = -\sum_{i,j} J_{ij} S_i S_j ,$$
with the variable $S_i$ representing the orientation ($+1$ or $-1$) of an elementary magnet at site $i$ in the crystal lattice and $J_{ij}$ representing the strength of interaction between two such elementary magnets at different lattice sites $i$ and $j$.
Contrast this with model organisms in biology, the most famous example of which is the fruit fly Drosophila melanogaster. Model organisms are real organisms – actual plants and animals that are alive and can reproduce – yet they are used as representations either of another organism (e.g., when rats are used in place of humans in medical research) or of a biological phenomenon that is more universal (e.g., when fruit flies are used to study the effects of crossover between homologous chromosomes). Model organisms are often bred for specific purposes and are subject to artificial selection pressures, so as to purify and standardize certain features (e.g., genetic defects or variants) that would not normally occur, or would occur only occasionally, in populations in the wild. As Ankeny and Leonelli put it, in their ideal form “model organisms are thought to be a relatively simplified form of the class of organism of interest” [1.3, p. 318]; yet it often takes considerable effort to work out the actual relationships between the model organism and its target system (whether it be a certain biological phenomenon or a specific class of target organisms). Tractability and various experimental desiderata – for example, a short life cycle (to allow for quick breeding) and a relatively small and compact genome (to allow for the quick identification of variants) – take precedence over theoretical questions in the choice of model organisms; unlike for the Ising model, there is no simple mathematical formula that one can rely on to study how one’s model behaves, only the messy world of real, living systems.
The Ising model of ferromagnetism and model organisms such as Drosophila melanogaster may be at opposite ends of the spectrum of scientific models. Yet the diversity among those models that occupy the middle ground between theoretical description and experimental system is no less bewildering. How, one might wonder, can a philosophical account of scientific models aspire to any degree of unity or generality in the light of such variety? One obvious strategy is to begin by drawing distinctions between different overarching types of models. Thus, Black [1.4] distinguishes between four such types:
1. Scale models
2. Analog models
3. Mathematical models
4. Theoretical models.
The basic idea of scale and analog models is straightforward: a scale model increases or decreases certain (e.g., spatial) features of the target system, so as to render them more manageable in the model; an analog model also involves the change of medium (as in once popular hydraulic models of the economy, where the flow of money was represented by the flow of liquids through a system of pumps and valves). Mathematical models are constructed by first identifying a number of relevant variables and then developing empirical hypotheses concerning the relations that may hold between the variables; through (often drastic) simplification, a set of mathematical equations is derived, which may then be evaluated analytically or numerically and tested against novel observations. Theoretical models, finally, begin usually by extrapolating imaginatively from a set of observed facts and regularities, positing new entities and mechanisms, which may be integrated into a possible theoretical account of a phenomenon; comparison with empirical data usually comes only at a later stage, once the model has been formulated in a coherent way.
Achinstein [1.5] includes mathematical models in his definition of theoretical model, and proposes an analysis in terms of sets of assumptions about a model’s target system. This allows him to include Bohr’s model of the atom, the DNA double-helix model (considered as a set of structural hypotheses rather than as a physical ball-and-stick model), the Ising model, and the Lotka–Volterra model among the class of theoretical systems. Typically, when a scientist constructs a theoretical model, she will help herself to certain established principles of a more fundamental theory to which she is committed. These will then be adapted or modified, notably by introducing various new assumptions specific to the case at hand. Typically, an inner structure or mechanism is posited which is thought to explain the features of the target system. At the same time, there is the (often explicit) acknowledgment that the target system is far more complex than the model is able to capture: in this sense, a theoretical model is believed by the practitioner to be false as a description of the target system. However, this acknowledgment of the limits of applicability of models also allows researchers to simultaneously use different models of the same target system alongside each other. Thus understood, theoretical models usually involve the combination of general theoretical principles and specific auxiliary assumptions, which may only be valid for a narrow range of parameters.
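To illustrate what it means for a mathematical model to be evaluated numerically, the following minimal Python sketch computes the Ising energy E = −Σ J_ij S_i S_j (with each S_i equal to +1 or −1) for a small system. The function names `ising_energy` and `nearest_neighbour_chain`, the four-site chain, and the convention of storing each bond once in the upper triangle of the coupling matrix are assumptions made for this illustration; they are not taken from the chapter.

```python
from itertools import product


def ising_energy(spins, coupling):
    """Return E = -sum over (i, j) of J_ij * S_i * S_j.

    spins    -- list of +1 / -1 values, one per lattice site
    coupling -- square matrix (list of lists) with entries J_ij
    """
    n = len(spins)
    return -sum(
        coupling[i][j] * spins[i] * spins[j]
        for i, j in product(range(n), repeat=2)
    )


def nearest_neighbour_chain(n, strength=1.0):
    """Hypothetical helper: coupling matrix for an open chain of n sites.

    Assumption for this sketch: each bond is stored once (upper triangle
    only), so the double sum in ising_energy counts every bond once.
    """
    coupling = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        coupling[i][i + 1] = strength
    return coupling


if __name__ == "__main__":
    J = nearest_neighbour_chain(4)
    print(ising_energy([1, 1, 1, 1], J))    # all magnets aligned: -3.0
    print(ising_energy([1, -1, 1, -1], J))  # alternating magnets: 3.0
```

With a positive coupling strength the aligned configuration has the lower energy, which is the sense in which the model favors ferromagnetic order. Nothing comparable can be written down for a model organism, which is the contrast the text draws.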
1.2 The Nature and Function of Models
The great variety of models employed in scientific practice, as illustrated by the long list given in the preceding section, suggests two things. First, it makes vivid just how central the use of models is to the scientific enterprise and to the self-image of scientists. As von Neumann put it, with some hyperbole [1.6, p. 492]: “The sciences do not try to explain, they hardly even try to interpret, they mainly make models.” Whatever shape and form the scientific enterprise might take without the use of models, it seems safe to say that it would not look anything like science as we presently know it. Second, one might wonder whether it is at all reasonable to look for a unitary philosophical account of models. Given the range of things we call models, and the diversity of uses to which they are being put, it may simply not be possible to give a one-size-fits-all answer to the question what is a model? This has led some commentators to propose quietism as the only viable attitude toward ontological questions concerning models and theories. As French puts it [1.7, p. 245],
“whereas positing the reality of quarks or genes may contribute to the explanation of certain features of the physical world, adopting a similar approach toward theories and models – that is, reifying them as entities for which a single unificatory account can be given – does nothing to explain the features of scientific practice.”
While there are good grounds for thinking that quietism should only be a position of last resort in philosophy, the sentiment expressed by French may go some way toward explaining why there has been a relative dearth of philosophical work concerning the ontology of models. The neglect of ontological questions concerning models has been remarked upon by a number of contributors, many of whom, like Contessa, find it [1.8, p. 194]
“surprising if one considers the amount of interest raised by analogous questions about the ontology and epistemology of mathematical objects in the philosophy of mathematics.”
A partial explanation of this discrepancy lies in the arguably greater heterogeneity in what the term scientific models is commonly thought to refer to, namely, anything from physical ball-and-stick models of chemical molecules to mathematical models formulated in terms of differential equations. (If we routinely included dividers, compasses, set squares, and other technical drawing tools among, say, the class of geometrical entities, the ontology of mathematical entities, too, would quickly become rather unwieldy!)
In the absence of any widely accepted unified account of models – let alone one that would provide a conclusive answer to ontological questions arising from models – it may be natural to assume, as indeed many contributors to the debate have done, that “if all scientific models have something in common, this is not their nature but their function” [1.8, p. 194]. One option would be to follow the quietist strategy concerning the ontology of models and “refuse to engage with this issue and ask, instead, how can we best represent these features [and functions of models] in order that we can understand” [1.7, p. 245] the practice of scientific modeling. Alternatively, however, one might simply accept that the function of models in scientific inquiry is our best – and perhaps only – guide when exploring answers to the question what is a model? At the very least, it is not obvious that an exploration of the ontological aspects of models is necessarily fruitless or misguided. Ducheyne puts this nicely when he argues that [1.9, p. 120],
“if we accept that models are functional entities, it should come as no surprise that when we deal with scientific models ontologically, we cannot remain silent on how such models function as carriers of scientific knowledge.”
As a working assumption, then, let us treat scientific models as functional entities and explore how much ontological unity – over and above their mere functional role – we can give to the notion of scientific model.
Two broad classes of functional characterizations of models can be distinguished, according to which it is either instantiation or representation that lies at the heart of how models function. As Giere [1.10] sees it, on the instantial view, models instantiate the axioms of a theory, where the latter is understood as being comprised of linguistic statements, including mathematical statements and equations. (For an elaboration of how such an account might turn out, see Sect. 1.4.) By contrast, on the representational view, “language connects not directly with the world, but rather with a model, whose characteristics may be precisely defined”; the model then connects with the world “by way of similarity between a model and the designated parts of the world” [1.10, p. 156]. Other proponents of the representational view have de-emphasized the role of similarity, while still endorsing representation as one of the key functions of scientific models. Generally speaking, proponents of the representational view consider models to be “tools for representing the world,” whereas those who favor the instantial view regard them primarily as “providing a means for interpreting formal systems” [1.10, p. 44].
Within the class of representational views, one can further distinguish between views that emphasize the informational aspects of models and those that take their pragmatic aspects to be more central. Chakravartty nicely characterizes the informational variety of the representational view as follows [1.11, p. 198]:
“The idea here is that a scientific representation is something that bears an objective relation to the thing it represents, on the basis of which it contains information regarding that aspect of the world.”
The term objective here simply means that the requisite relation obtains independently of the model user’s beliefs or intentions as well as independently of the specific representational conventions he or she might be employing. Giere’s similarity-based view of representation – according to which scientific models represent in virtue of their being similar to their target systems in certain specifiable ways – would be an example of such an informational view: similarity, as construed by Giere, is a relation that holds between the model and its target, irrespective of a model user’s beliefs or intentions, and regardless of the cognitive uses to which he or she might put the model. Other philosophical positions that are closely aligned with the informational approach might posit that, for a model to represent its target, the two must stand in a relation of isomorphism, partial isomorphism, or homomorphism to one another.
By contrast, the pragmatic variety of the representational view of models posits that models function as representations of their targets in virtue of the cognitive uses to which human reasoners put them. The basic idea is that a scientific model facilitates certain cognitive activities – such as the drawing of inferences about a target system, the derivation of predictions, or perhaps a deepening of the scientific understanding – on the part of its user and, therefore, necessarily involves the latter’s cognitive interests, beliefs, or intentions. Hughes [1.12], for example, emphasizes the interplay of three cognitive–theoretical processes – denotation, demonstration, and interpretation – which jointly give rise to the representational capacity of (theoretical) models in science. On Hughes’ (aptly named) DDI account of model-based representation, denotation accounts for the fact that theoretical elements of a model purport to refer to elements in the physical world. The possibility of demonstration from within a model – in particular, the successful mathematical derivation of results for models that lend themselves to mathematical derivation techniques – attests both to the model’s having a nontrivial internal dynamic and to its being a viable object of fruitful theoretical investigation. Through successful interpretation, a model user then relates the theoretically derived results back to the physical world, including the model’s target system. Clearly, the DDI account depends crucially on there being someone who engages in the activities of interpreting and demonstrating – that is, it depends on the cognitive activities of human agents, who will inevitably draw on their background knowledge, cognitive interests, and derivational skills in establishing the requisite relations for bringing about representation.
The contrast between informational and pragmatic approaches to model-based representation roughly maps onto another contrast, between what Knuuttila has dubbed dyadic and triadic approaches. The former takes “the model–target dyad as a basic unit of analysis concerning models and their epistemic values” [1.13, p. 142]. This coheres well with the informational approach which, as discussed, tends to regard models as (often abstract) structures that stand in a relation of isomorphism, or partial isomorphism, to a target system. By contrast, triadic accounts – in line with pragmatic views of model-based representation – shift attention away from models and the abstract relations they stand in, toward modeling as a theoretical activity pursued by human agents with cognitive interests, intentions, and beliefs. On this account, model-based representation cannot simply be a matter of any abstract relationship between the model and a target system since one cannot, as Suárez puts it, “reduce the essentially intentional judgments of representation users to facts about the source and target object or systems and their properties” [1.14, p. 768]. Therefore, so the suggestion goes, the model–target dyad needs to be replaced by a three-place relation between the model, its target, and the model user. Suárez, for example, proposes an inferentialist account of model-based representation, according to which a successful model must allow “competent and informed agents to draw specific inferences regarding” [1.14, p. 773] the target system – thereby making the representational success of a model dependent on the qualities of a (putative) model user.
Some scholars trace the emergence of the concept of a scientific model to the second half of the nineteenth century [1.15]. Applying our contemporary concept of model to past episodes in the history of science, we can of course identify prior instances of models being employed in science; however, until the nineteenth century scientists were engaged in little systematic self-reflection on the uses and limitations of models. Philosophy of science took even longer to pay attention to models in science, focusing instead on the role and significance of scientific theories. Only from the middle of the twentieth century onward did philosophical interest in models acquire the requisite momentum to carry the debate forward. Yet in both science and philosophy, the term model underwent important transformations, so it will be important to identify some of these shifts, in order to avoid unnecessary ambiguity and confusion in our exploration of the question What is a model?

Take, for example, Duhem’s dismissal, in 1914, of what he takes to be the excessive use of models in Maxwell’s theory of electromagnetism, as presented in an English textbook published at the end of the nineteenth century [1.16, p. 7]:

“Here is a book intended to expound the modern theories of electricity and to expound a new theory. In it there are nothing but strings which move round pulleys which roll around drums, which go through pearl beads, which carry weights; and tubes which pump water while others swell and contract; toothed wheels which are geared to one another and engage hooks. We thought we were entering the tranquil and neatly ordered abode of reason, but we find ourselves in a factory.”

What Duhem is mocking in this passage, which is taken from a chapter titled Abstract Theories and Mechanical Models, is a style of reasoning that is dominated by the desire to visualize physical processes in purely mechanical terms. His hostility is thus directed at mechanical models only – as the implied contrast in the chapter title makes clear – and does not extend to the more liberal understanding of the term scientific model in philosophy of science today.

Indeed, when it comes to the use of analogy in science, Duhem is much more forgiving. The term analogy, which derives from the Greek expression for proportion, itself has multiple uses, depending on whether one considers its use as a rhetorical device or as a tool for scientific understanding. Its general form is that of “pointing to a resemblance between relations in two different domains, that is, A is related to B like C is related to D” [1.17, p. 110]. An analogy may be considered merely formal, when only the relations (but not the relata) resemble another, or it may be material, when the relata from the two domains (i.e., A and B on one side, C and D on the other) have certain attributes or characteristics in common. Duhem’s understanding of analogy is more specific, in that he conceives of analogy as being a relation between two sets of statements, such as between one theory and another [1.16, p. 97]:

“Analogies consist in bringing together two abstract systems; either one of them already known serves to help us guess the form of the other not yet known, or both being formulated, they clarify the other. There is nothing here that can astonish the most rigorous logician, but there is nothing either that recalls the procedures dear to ample but shallow minds.”

Consider the following example: When Christiaan Huygens (1629–1695) proposed his theory of light, he did so on the basis of analogy with the theory of sound waves: the relations between the various attributes and characteristics of light are similar to those described by acoustic theory for the rather different domain of sound. Thus understood, analogy becomes a legitimate instrument for learning about one domain on the basis of what we know about another. In modern parlance, we might want to say that sound waves provided Huygens with a good theoretical model – at least given what was known at the time – for the behavior of light.

There is, however, a risk of ambiguity in that last sentence – an ambiguity which, as Mellor [1.18, p. 283] has argued, it would be wrong to consider harmless. Saying that sound waves provide a good model for the theory of light appears to equate the model with the sound waves – as though one physical object (sound waves) could be identified with the model. At first sight, this might seem unproblematic, given that, as far as wave-like behavior is concerned, we do take light and sound to be relevantly analogous. However, while it is indeed the case that “some of the constructs called analogy in the nineteenth century would today be routinely referred to as models” [1.19, p. 46], it is important to distinguish between, on the one hand, analogy as the similarity relation that exists between a theory and another set of statements and, on the other hand, the latter set of statements as the analog of the theory. Furthermore, we need to distinguish between the analog (e.g., the theory of sound waves, in Huygens’s case) and the set of entities of which the analog is true (e.g., the sound waves themselves). (On this point, see [1.18, p. 283].)

What Duhem resents about the naïve use of what he refers to as mechanical models is the hasty conflation of the visualized entities – (imaginary) pulleys, drums,
The Ontology of Models 1.3 Models as Analogies and Metaphors 11
pearl beads, and toothed wheels – with what is in fact scientifically valuable, namely the relation of analogy that exists between, say, the theory of light and the theory of sound.

This interpretation resolves an often mentioned tension – partly perpetuated by Duhem himself, through his identification of different styles of reasoning (the English style of physics with its emphasis on mechanical models, and the Continental style which prizes mathematical principles above all) – between Duhem’s account of models and that of the English physicist Norman Campbell. Thus, Hesse, in her seminal essay Models and Analogies in Science [1.20], imagines a dialogue between a Campbellian and a Duhemist. At the start of the dialogue, the Campbellian attributes to the Duhemist the following view: “I imagine that along with most contemporary philosophers of science, you would wish to say that the use of models or analogs is not essential to scientific theorizing and that [...] the theory as a whole does not require to be interpreted by means of any model.” To this, the Duhemist, who admits that “models may be useful guides in suggesting theories,” replies: “When we have found an acceptable theory, any model that may have led us to it can be thrown away. Kekulé is said to have arrived at the structure of the benzene ring after dreaming of a snake with its tail in its mouth, but no account of the snake appears in the textbooks of organic chemistry.” The Campbellian’s rejoinder is as follows: “I, on the other hand, want to argue that models in some sense are essential to the logic of scientific theories” [1.20, pp. 8–9]. The quoted part of Hesse’s dialogue has often been interpreted as suggesting that the bone of contention between Duhem and Campbell is the status of models in general (in the modern sense that includes theoretical models), with Campbell arguing in favor and Duhem arguing against. But we have already seen that Duhem, using the language of analogy, does allow for theoretical models to play an important role in science. This apparent tension can be resolved by being more precise about the target of Duhem’s criticism: “Kekulé’s snake dream might illustrate the use of a visualizable model, but it certainly does not illustrate the use of an analogy, in Duhem and Campbell’s sense” [1.18, p. 285]. In other words, Duhem is not opposed to scientific models in general, but to their mechanical variety in particular. And, on the point of over-reliance on mechanical models, Campbell, too, recognizes that dogmatic attachment to such a style of reasoning is open to criticism. Such a dogmatic view would hold “that theories are completely satisfactory only if the analogy on which they are based is mechanical, that is to say, if the analogy is with the laws of mechanics” [1.21, p. 154]. Campbell is clearly more sympathetic than Duhem toward our “craving for mechanical theories,” which he takes to be firmly rooted in our psychology. But he insists that [1.21, p. 156]

“we should notice that the considerations which have been offered justify only the attempt to adopt some form of theory involving ideas closely related to those of force and motion; it does not justify the attempt to force all such theories into the Newtonian mold.”

To be sure, significant differences between Duhem and Campbell remain, notably concerning what kinds of uses of analogies in science (or, in today’s terminology, of scientific – including theoretical – models) are appropriate. For Duhem, such uses are limited to a heuristic role in the discovery of scientific theories. By contrast, Campbell claims that “in order that a theory may be valuable [...] it must display analogy” [1.21, p. 129] – though it should be emphasized again, not necessarily analogy of the mechanical sort. (As Mellor argues, Duhem and Campbell differ chiefly in their views of scientific theories and less so in their take on analogy, with Duhem adopting a more static perspective regarding theories and Campbell taking a more realist perspective [1.18].)

It should be said, though, that Hesse’s Campbellian and Duhemist are at least partly intended as caricatures and serve as a foil for Hesse’s own account of models as analogies. The account hinges on a three-part distinction between positive, negative, and neutral analogies [1.20]. Using the billiard ball model of gases as her primary example, Hesse notes that some characteristics are shared between the billiard balls and the gas atoms (or, rather, are ascribed by the billiard ball model to the gas atoms); these include velocity, momentum, and collision. Together, these constitute the positive analogy. Those properties we know to belong to billiard balls, but not to gas atoms – such as color – constitute the negative analogy of the model. However, there will typically be properties of the model (i.e., the billiard ball system) of which we do not (yet) know whether they also apply to its target (in this case, the gas atoms). These form the neutral analogy of the model. Far from being unimportant, the neutral analogy is crucial to the fruitful use of models in scientific inquiry, since it holds out the promise of acquiring new knowledge about the target system by studying the model in its place [1.20, p. 10]:

“If gases are really like collections of billiard balls, except in regard to the known negative analogy, then from our knowledge of the mechanics of billiard balls, we may be able to make new predictions about the expected behavior of gases.”
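Hesse’s three-part distinction between positive, negative, and neutral analogies can be pictured, very roughly, as a partition of the model’s properties. The following Python sketch is only an illustration under stated assumptions: the function name and the property lists are hypothetical (in particular, "elasticity" is an invented placeholder for a property whose status is still open), and nothing in Hesse’s account commits her to a set-theoretic reading.

```python
def classify_analogy(model_properties, known_shared, known_not_shared):
    """Split a model's properties into positive, negative, and neutral analogy.

    All three arguments are plain sets of property names (an assumption made
    here for illustration only).
    """
    positive = model_properties & known_shared        # known to hold of the target
    negative = model_properties & known_not_shared    # known not to hold of the target
    neutral = model_properties - positive - negative  # status not (yet) known
    return positive, negative, neutral


# Hypothetical property lists for the billiard ball model of gases.
billiard_ball_properties = {"velocity", "momentum", "collision", "color", "elasticity"}
known_shared_with_gas = {"velocity", "momentum", "collision"}
known_not_shared_with_gas = {"color"}

positive, negative, neutral = classify_analogy(
    billiard_ball_properties, known_shared_with_gas, known_not_shared_with_gas
)
print(sorted(positive))  # ['collision', 'momentum', 'velocity']
print(sorted(negative))  # ['color']
print(sorted(neutral))   # ['elasticity']
```

On this toy reading, the neutral set is exactly where the model may still yield new predictions about its target.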
In dealing with scientific models we may choose to disregard the negative analogy (which results in what Hesse calls model₁) and consider only the known positive and neutral analogies – that is, only those properties that are shared, or for all we know may turn out to be shared, between the target system and its analog. (On the terminology discussed in Sect. 1.1, due to Black and Achinstein, model₁ would qualify as a theoretical model.) This, Hesse argues, typically describes our use of models for the purpose of explanation: we resolve to treat model₁ as taking the place of the phenomena themselves. Alternatively, we may actively include the negative analogy in our considerations, resulting in what Hesse calls model₂ or a form of analog model. Given that, let us assume, the model system (e.g., the billiard balls) was chosen because it was observable – or, at any rate, more accessible than the target system (e.g., the gas) – model₂ allows us to study the similarities and dissimilarities between the two analogous domains; model₂, qua being a model for its target, thus has a deeper structure than the system of billiard balls considered in isolation – and, like model₁, importantly includes the neutral analogy, which holds out the promise of novel insights and predictions. As Hesse puts it, in the voice of her Campbellian interlocutor [1.20, pp. 12–13]:

“My whole argument is going to depend on these features [of the neutral analogy] and so I want to make it clear that I am not dealing with static and formalized theories, corresponding only to the known positive analogy, but with theories in the process of growth.”

Models have been discussed not only in terms of analogy, but also in terms of metaphor. Metaphor, more explicitly than analogy, refers to the linguistic realm: a metaphor is a linguistic expression that involves at least one part that is being transferred from a domain of discourse where it is common to another – the target domain – where it is uncommon. The existence of an analogy may facilitate such a transfer of linguistic expression; at the same time, it is entirely possible that “it is the metaphor that prompts the recognition of analogy” [1.17, p. 114] – both are compatible with one another and neither is obviously prior to the other. Metaphorical language is widespread in science, not just in connection with models: for example, physicists routinely speak of black holes and quantum tunneling as important predictions of general relativity theory and quantum theory, respectively. Yet, as Soskice and Harré note, there is a special affinity between models and metaphor [1.22, p. 302]:

“The relationship of model and metaphor is this: if we use the image of a fluid to explicate the supposed action of the electrical energy, we say that the fluid is functioning as a model for our conception of the nature of electricity. If, however, we then go on to speak of the rate of flow of an electrical current, we are using metaphorical language based on the fluid model.”

In spite of this affinity, it would not be fruitful to simply equate the two – let alone jump to the conclusion that, in the notion of metaphor, we have found an answer to the question What is a model? Models and metaphors both issue in descriptions, and as such they may draw on analogies we have identified between two otherwise distinct domains; more, however, needs to be said about the nature of the relations that need to be in place for something to be considered a (successful) model of its target system or phenomenon.
1.4.1 Models and the Study of Formal Languages

Model theory originated as the study of formal languages and their interpretations, starting from a Tarski-style truth theory based only on notions from syntax and set theory. On a broader understanding, the restriction to formal languages may be dropped, so as to include scientific languages (which are often closer to natural language than to logic), or even natural languages. However, the distinction between the syntax and the semantics of a language, which is sharpest in logic, also provides a useful framework for studying scientific languages and has guided the development of both the syntactic and the semantic views of theories and models. The syntax of a language L is made up of the vocabulary of L, along with the rules that determine which sequence of symbols counts as a well-formed expression in L; in turn, the semantics of L provides interpretations of the symbolic expressions in L, by mapping them onto another relational structure R, such that all well-formed expressions in L are rendered intelligible (e.g., via rules of composition) and can be assessed in terms of their truth or falsity in R.

The contrast between the syntax and the semantics of a language allows for two different approaches to the notion of a theory. A theory T may either be defined syntactically, as the set of all those sentences that can be derived, through a proper application of the syntactic rules, from a set of axioms (i.e., statements that are taken to be fundamental); or it may be defined semantically, as all those (first-order) sentences that a particular structure, M, satisfies. An example of the former would be Euclidean geometry, which consists of five axioms and all the theorems derivable from them using geometrical rules; an example of the latter would be group theory, which simply consists of all those first-order sentences that a set of groups – definable in terms of set-theoretic entities – satisfies. (This example, and much of the short summary in this section, is owed to [1.23]; for further discussion, see references therein.) The syntactic and semantic definitions of what a theory is are closely related: starting from the semantic definition, to see whether a particular structure M is a model of an axiomatizable first-order theory T, all that one needs to show is that M satisfies the axioms.

1.4.2 The Syntactic View of Theories

The syntactic view of theories originated from the combination of the insights – or, to put it a little more cautiously, fundamental tenets – of two research programs: the philosophical program, aligned with Pierre Duhem (Sect. 1.3) and Henri Poincaré, of treating (physical) theories as systems of hypotheses designed to save the phenomena, and the mathematical program, pioneered by David Hilbert, which sought to formalize (mathematical) theories as axiomatic systems. By combining the two, it seemed possible to identify a theory with the set of logical consequences that could be derived from its fundamental principles (which were to be treated as axioms), using only the rules of the language in which the theory was formulated. In spite of its emphasis on syntax, the syntactic view is not entirely divorced from questions of semantics. When it comes to scientific theories, we are almost always dealing with interpreted sets of sentences, some of which – the fundamental principles or axioms – are more basic than others, with the rest derivable using syntactic rules. The question then arises at which level interpretation of the various elements of a theory is to take place. This is where the slogan to save the phenomena points us in the right direction: on the syntactic view, interpretation only properly enters at the level of matching singular theoretical predictions, formulated in strictly observational terms, with the observable phenomena. Higher level interpretations – for example, pertaining to purely theoretical terms of a theory (such as posited unobservable entities, causal mechanisms, laws, etc.) – would be addressed through correspondence rules, which offered at least a partial interpretation, so that some of the meaning of such higher level terms of a theory could be linked up with observational sentences.

As an example, consider classical mechanics. Similar to how Euclidean geometry can be fully derived from a set of five axioms, classical mechanics is fully determined by Newton’s laws of mechanics. At a purely formal level, it is possible to provide a fully syntactic axiomatization in terms of the relevant symbols, variables, and rules for their manipulation – that is, in terms of what Rudolf Carnap calls the calculus of mechanics. If one takes the latter as one’s starting point, it requires interpretation of the results derived from within this formal framework, in order for the calculus to be recognizable as a theory of mechanics, that is, of physical phenomena. In the case of mechanics, we may have no difficulty stating the axioms in the form of the (physically interpreted) Newtonian laws of mechanics, but in other cases – perhaps in quantum mechanics – making this connection with observables may not be so straightforward. As Carnap notes [1.24, p. 57]:

“[t]he relation of this theory [= the physically interpreted theory of mechanics] to the calculus of mechanics is entirely analogous to the relation of physical to mathematical geometry.”
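The difference between an uninterpreted set of axioms and a structure that satisfies them (the model-theoretic notion introduced in Sect. 1.4.1, using group theory as the example) can be illustrated in miniature. The following Python sketch is an illustration only: it checks by brute force whether a small finite structure satisfies the group axioms. The function name and the two candidate structures are assumptions introduced here, not taken from the sources cited in the text.

```python
from itertools import product


def satisfies_group_axioms(domain, op, identity):
    """Check by brute force whether the finite structure (domain, op, identity)
    satisfies the group axioms: closure, associativity, identity, inverses."""
    elems = list(domain)
    closed = all(op(a, b) in domain for a, b in product(elems, repeat=2))
    if not closed:
        return False
    associative = all(
        op(op(a, b), c) == op(a, op(b, c)) for a, b, c in product(elems, repeat=3)
    )
    has_identity = all(op(identity, a) == a and op(a, identity) == a for a in elems)
    has_inverses = all(
        any(op(a, b) == identity and op(b, a) == identity for b in elems)
        for a in elems
    )
    return associative and has_identity and has_inverses


# Two hypothetical interpretations of the same axioms over the domain {0, 1, 2, 3}.
domain = {0, 1, 2, 3}


def add_mod4(a, b):
    return (a + b) % 4


def mul_mod4(a, b):
    return (a * b) % 4


print(satisfies_group_axioms(domain, add_mod4, 0))  # True: a model of the axioms
print(satisfies_group_axioms(domain, mul_mod4, 1))  # False: 0 has no inverse
```

The axioms themselves are the same in both calls; only the structure supplied as their interpretation differs, and only the first counts as a model in the logical sense.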
As in the Euclidean case, the syntactic view identifies the theory with a formal language or calculus (including, in the case of scientific theories, relevant correspondence rules), “whose interpretation – what the calculus is a theory of – is fixed at the point of application” [1.25, p. 125].

On the syntactic view of theories, models play at best a very marginal role as limiting cases or approximations. This is for two reasons. First, since the nonobservational part of the theory – that is, the theory proper, as one might put it – does not admit of direct interpretation, the route to constructing theoretical models on the basis of our directly interpreting the core ingredients of the theory is obstructed. Interpretation at the level of observational statements, while still available to us, is insufficient to imbue models with anything other than a purely one-off auxiliary role. Second, as Cartwright has pointedly argued in criticism directed at both the syntactic and the semantic views, there is a shared – mistaken – assumption that theories are a bit like vending machines [1.26, p. 247]:

“[Y]ou feed it input in certain prescribed forms for the desired output; it gurgitates for a while; then it drops out the sought-for-representation, plonk, on the tray, fully formed, as Athena from the brain of Zeus.”

This limits what we can do with models, in that there are only two stages [1.26, p. 247]:

“First, eyeballing the phenomenon, measuring it up, trying to see what can be abstracted from it that has the right form and combination that the vending machine can take as input; secondly, [...] we do either tedious deduction or clever approximation to get a facsimile of the output the vending machine would produce.”

Even if this caricature seems a little too extreme, the fact remains that, by modeling theories after first-order formal languages, the syntactic view limits our understanding of what theories and models are and what we can do with them.

1.4.3 The Semantic View

One standard criticism of the syntactic view is that it conflates scientific theories with their linguistic formulations. Proponents of the semantic view argue that by adding a layer of (nonlinguistic) structures between the linguistic formulations of theories and our assessment of them, one can side-step many of the problems faced by the syntactic view. According to the semantic view, a theory should be thought of as the set of set-theoretic structures that satisfy the different linguistic formulations of the theory. A structure that provides an interpretation for, and makes true, the set of sentences associated with a specific linguistic formulation of the theory is called a model of the theory. Hence, the semantic view is often characterized as conceiving of theories as collections of models. This not only puts models – where these are to be understood in the logical sense outlined earlier – center stage in our account of scientific theories, but also renders the latter fundamentally extra-linguistic entities.

An apt characterization of the semantic view is given by Suppe as follows [1.27, pp. 82–83]:

“This suggests that theories be construed as propounded abstract structures serving as models for sets of interpreted sentences that constitute the linguistic formulations. [...] [W]hat the theory does is directly describe the behavior of abstract systems, known as physical systems, whose behaviors depend only on the selected parameters. However, physical systems are abstract replicas of actual phenomena, being what the phenomena would have been if no other parameters exerted an influence.”

According to a much-quoted remark by one of the main early proponents of the semantic view, Suppes, “the meaning of the concept of model is the same in mathematics and in the empirical sciences.” However, as Suppe’s quote above makes clear, models in science have additional roles to play, and it is perhaps worth noting that Suppes himself immediately continues: “The difference to be found in these disciplines is to be found in their use of the concept” [1.28, p. 289]. Supporters of the semantic view often claim that it is closer to the scientific practices of modeling and theorizing than the syntactic view. On this view, according to van Fraassen [1.29, p. 64],

“[t]o present a theory is to specify a family of structures, its models; and secondly, to specify certain parts of those models (the empirical substructures) as candidates for the direct representation of observable phenomena.”

Unlike what the syntactic view suggests, scientists do not typically formulate abstract theoretical axioms and only interpret them at the point of their application to observable phenomena; rather, “scientists build in their mind’s eye systems of abstract objects whose properties or behavior satisfy certain constraint (including law)” [1.23, p. 154] – that is, they engage in the construction of theoretical models.

Unlike the syntactic view, then, the semantic view appears to give a more definite answer to the question what is a model? In line with the account sketched so far, a model of a theory is simply a (typically extra-
The Ontology of Models 1.4 Models Versus the Received View: Sentences and Structures 15
linguistic) structure that provides an interpretation for, and makes true, the set of axioms associated with the theory (assuming that the theory is axiomatizable). Yet it is not clear that, in applying their view to actual scientific theories, the semanticists always heed their own advice to treat models as both giving an interpretation, and ensuring the truth, of a set of statements. More importantly, the model-theoretic account demands that, in a manner of speaking, a model should fulfil its truth-making function in virtue of providing an interpretation for a set of sentences. Other ways of ensuring truth – for example by limiting the domain of discourse for a set of fully interpreted sentences, thereby ensuring that the latter will happen to be true – should not qualify. Yet, as Thomson-Jones [1.30] has argued, purported applications of the semantic view often stray from the original model-theoretic motivation. As an example, consider Suppes’ axiomatization of Newtonian particle physics. (The rest of this subsection follows [1.30, pp. 530–531].) Suppes [1.31] begins with the following definition (in slightly modified form)

Definition 1.1
A system $\beta = \langle P, T, s, m, f, g \rangle$ is a model of particle mechanics if and only if the following seven axioms are satisfied:

Kinematical axioms:
1. The set $P$ is finite and nonempty.
2. The set $T$ is an interval of real numbers.
3. For $p$ in $P$, $s_p$ is twice differentiable.

Dynamical axioms:
4. For $p$ in $P$, $m(p)$ is a positive real number.
5. For $p$ and $q$ in $P$ and $t$ in $T$,
   $f(p, q, t) = -f(q, p, t)$.
6. For $p$ and $q$ in $P$ and $t$ in $T$,
   $s(p, t) \times f(p, q, t) = -s(q, t) \times f(q, p, t)$.
7. For $p$ in $P$ and $t$ in $T$,
   $m(p)\, D^2 s_p(t) = \sum_{q \in P} f(p, q, t) + g(p, t)$.

At first sight, this presentation adheres to core ideas that motivate the semantic view. It sets out to define an extra-linguistic entity, $\beta$, in terms of a set-theoretical predicate; the entities to which the predicate applies are then to be singled out on the basis of the seven axioms. But as Thomson-Jones points out, a specific model S defined in this way “is not a serious interpreter of the predicate or the axioms that compose it” [1.30, p. 531]; it merely fits a structure to the description provided by the fully interpreted axioms (1)–(7), and in this way ensures that they are satisfied, but it does not make them come out true in virtue of providing an interpretation (i.e., by invoking semantic theory). To Thomson-Jones, this suggests that identifying scientific models with truth-making structures in the model-theoretic sense may, at least in the sciences, be an unfulfilled promise of the semantic view; instead, he argues, we should settle for a less ambitious (but still informative) definition of a model as “a mathematical structure used to represent a (type of) system under study” [1.30, p. 525].

1.4.4 Partial Structures

Part of the motivation for the semantic view was its perceived greater ability to account for how scientists actually go about developing models and theories. Even so, critics have claimed that the semantic view is unable to accommodate the great diversity of scientific models and faces special challenges from, for example, the use of inconsistency in many models. In response to such criticisms, a philosophical research program has emerged over the past two decades, which seeks to establish a middle ground between the classical semantic view of models discussed in the previous section and those who are sceptical about the prospects of formal approaches altogether. This research program is often called the partial structures approach, which was pioneered by Newton da Costa and Steven French and whose vocal proponents include Otávio Bueno, James Ladyman, and others; see [1.32] and references therein.

Like many adherents of the semantic view, partial structures theorists hold that models are to be reconstructed in set-theoretic terms, as ordered n-tuples of sets: a set of objects with (sets of) properties, quantities and relations, and functions defined over the quantities. A partial structure may then be defined as $\mathcal{A} = \langle D, R_i \rangle_{i \in I}$, where $D$ is a nonempty set of n-tuples of just this kind and each $R_i$ is an n-ary relation. Unlike on the traditional semantic view, the relations $R_i$ need not be complete isomorphisms, but crucially are partial relations: that is, they need not be defined for all n-tuples of elements of $D$. More specifically, for each partial relation $R_i$, in addition to the set of n-tuples for which the relation holds and the set of n-tuples for which it does not hold, there is also a third set of n-tuples for which it is underdetermined whether or not it holds. (There is a clear parallel here with Hesse’s notion of positive, negative, and neutral analogies which, as da Costa and French put it, “finds a natural home in the context of partial structures” [1.32, p. 48].) A total structure is said to extend a partial structure, if it subsumes the first two
sets without change (i.e., includes all those objects and definite relations that exist in the partial structures) and renders each extended relation well defined for every n-tuple of objects in its domain. This gives rise to a hierarchy of structures and substructures, which together with the notion of partial isomorphism loosens the requirements on representation, since all that is needed for two partial models $\mathcal{A}$ and $\mathcal{A}'$ to be partially isomorphic is that a partial substructure of $\mathcal{A}$ be isomorphic to a partial substructure in $\mathcal{A}'$.

Proponents of the partial structures approach claim that it “widens the framework of the model-theoretic approach and allows various features of models and theories – such as analogies, iconic models, and so on – to be represented,” [1.33, p. 306] that it can successfully contain the difficulties arising from inconsistencies in models, and that it is able to capture “the existence of a hierarchy of models stretching from the data up to the level of theory” [1.33]. Some critics have questioned such sweeping claims. One frequent criticism concerns the proliferation of partial isomorphisms, many of which will trivially obtain; however, if partial relations are so easy to come by, how can one tell the interesting from the vast majority of irrelevant ones? (Pincock speaks in this connection of the “danger of trivializing our representational relationships” [1.34, p. 1254].) Suárez and Cartwright add further urgency to this criticism, by noting that the focus on set-theoretical structures obliterates all those uses of models and aspects of scientific practice that do not amount to the making of claims [1.35, p. 72]:

“So all of scientific practice that does not consist in the making of claims gets left out. [...] Again, we maintain that this inevitably leaves out a great deal of the very scientific practice that we are interested in.”

It is perhaps an indication of the limitations of the partial structures approach that, in response to such criticism, its proponents need to again invoke heuristic factors, which cannot themselves be subsumed under the proposed formal framework of models as set-theoretic structures with partial relations.
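The three-way split that characterizes a partial relation, and the sense in which a total relation extends it, can be made concrete with a small Python sketch. This is an illustration under simplifying assumptions, not the formalism of da Costa and French: the domain is taken to be a small finite set of objects, relations are represented as sets of tuples, and the function names and example data are hypothetical.

```python
from itertools import product


def undetermined(domain, arity, holds, fails):
    """Return the tuples over `domain` for which the partial relation is left open.

    `holds` and `fails` are the sets of tuples for which the relation is known
    to hold and known not to hold, respectively.
    """
    if not holds.isdisjoint(fails):
        raise ValueError("a tuple cannot both satisfy and fail the relation")
    all_tuples = set(product(domain, repeat=arity))
    return all_tuples - holds - fails


def extends(total_holds, domain, arity, holds, fails):
    """True if the total relation `total_holds` agrees with the partial relation
    on every tuple whose status is already settled."""
    all_tuples = set(product(domain, repeat=arity))
    return holds <= total_holds <= all_tuples and total_holds.isdisjoint(fails)


# Hypothetical binary partial relation on a two-element domain.
domain = {"a", "b"}
holds = {("a", "b")}
fails = {("b", "a")}

open_tuples = undetermined(domain, 2, holds, fails)
print(sorted(open_tuples))  # [('a', 'a'), ('b', 'b')]

# One way of settling the open cases that respects the partial relation ...
print(extends({("a", "b"), ("a", "a")}, domain, 2, holds, fails))  # True
# ... and one candidate that contradicts what is already known.
print(extends({("a", "b"), ("b", "a")}, domain, 2, holds, fails))  # False
```

A single partial relation typically admits several total extensions, one for each way of settling the open tuples, which is one way of seeing how the hierarchy of structures mentioned in the text arises.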
tities we are imagining when we contemplate models itself, information in various formats, including linguis-
as imagined concrete things, we can focus on the con- tic, formulaic, visual, auditory, kinesthetic, can be used
scious processes that attend such imaginings (or, if one in its construction” [1.39, p. 12].
prefers a different way of putting it, the phenomenol- How does this apply to the case of scientific mod-
ogy of interacting with models). Foremost among these els? As an example, Nersessian considers James Clerk
is the mental imagery that is conjured up by the de- Maxwell’s famous molecular vortex model, which vi-
scriptions of models. (Indeed, as we shall see in the sualized the lines of magnetic force around a magnet
next section, on certain versions of the fictionalist view, as though they were vortices within a continuous fluid
a model prescribes imaginings about its target sys- (Fig. 1.1).
tem.) How much significance one should attach to the As Nersessian sees it, Maxwell’s drawing “is a vi-
mental pictures that attend our conscious considera- sual representation of an analogical model that is ac-
tion of models has been a matter of much controversy: companied with instructions for animating it correctly
recall Duhem’s dismissal of mechanical imagery as in thought” [1.39, p. 13]. And indeed Maxwell gives de-
a way of conceptualizing electromagnetic phenomena tailed instructions regarding how to interpret, and bring
(Sect. 1.3). to life, the model of which the reader is only given a mo-
Focusing on the mental processes that accompany mentary snapshot [1.40, p. 477]:
the use of scientific models might lead one to propose
an analysis of models in terms of their cognitive foun- “Let the current from left to right commence in AB.
dations. Nancy Nersessian has developed just such an The row of vortices gh above AB will be set in mo-
analysis, which ties the notion of models in science tion in the opposite direction to a watch [. . . ]. We
closely to the cognitive processes involved in mental shall suppose the two of vortices kl still at rest, then
modeling. Whereas the traditional approach in psychol- the layer of particles between these rows will be
ogy had been to think of reasoning as consisting of the acted on by the row gh,”
mental application of logical rules to propositional rep-
resentations, mounting empirical evidence of the role and so forth. It does seem plausible to say that such
of heuristics and biases suggested that much of human instructions are intended to prescribe certain mental
reasoning proceeds via mental models [1.38], that is, models on the part of the reader. Convincing though
by carrying out thought experiments on internal mod- this example may be, it still begs the question of what,
els. A mental model, on this account, is “a structural in general, a mental model is. At the same time, it
analog of a real-world or imaginary situation, event, or illustrates what is involved in conjuring up a mental
process” as constructed by the mind in reasoning (and, model and which materials – in this case, spatial repre-
presumably, realized by certain underlying brain pro- sentations, along with intuitions about the mechanical
cesses) [1.39, pp. 11–12]: motion of parts in a larger system – are involved in its
constitution.
“What it means for a mental model to be a struc-
tural analog is that it embodies a representation of
the spatial and temporal relations among, and the
causal structures connecting the events and enti-
ties depicted and whatever other information that
is relevant to the problem-solving talks. [. . . ] The
k l
essential points are that a mental model can be non-
linguistic in form and the mental mechanisms are p q
such that they can satisfy the model-building and
simulative constraints necessary for the activity of g h
mental modeling.”
A B
While this characterization of mental models may
have an air of circularity, in that it essentially defines
mental models as place-holders for whatever it takes
to support the activity of mental modeling, it nonethe-
less suggests a place to look for the materials from
which models are constructed: the mind itself, with its
various types of content and mental representation. As Fig. 1.1 Maxwell’s drawing of the molecular vortex model
Nersessian puts it: “Whatever the format of the model (after [1.40])
1.6 Models and Fiction
As noted in the previous section, the face-value practice of scientific modeling and its concomitant folk ontology, according to which models are imagined concrete things, have a natural affinity to the way we think about fictions. As one proponent of models as fictions puts it [1.41, p. 253]:
“The view of model systems that I advocate regards them as imagined physical systems, that is, as hypothetical entities that, as a matter of fact, do not exist spatiotemporally but are nevertheless not purely mathematical or structural in that they would be physical things if they were real.”
Plausible though this may sound, the devil is in the details. A first – perhaps trivial – caveat concerns the restriction that model systems would be physical things if they were real. In order to allow for the notion of model to be properly applied to the social and cognitive sciences, such as economics and psychology, it is best to drop this restriction to physical systems. (On this point, see [1.30, p. 528].) This leaves as the gist of the folk-ontological view the thought that model systems, if they were real, would be just as we imagine them (or, more carefully, just as the model instructs us to imagine them).
In order to sharpen our intuitions about fictions, let us introduce an example of a literary fiction, such as the following statement from Doyle’s The Adventure of the Three Garridebs (1924) [1.42]: “Holmes had lit his pipe, and he sat for some time with a curious smile upon his face.” There is, of course, no actual human being that this statement represents: no one is sitting smilingly at 221B Baker Street, filling up the room with smoke from their pipe. (Indeed, until the 1930s, the address itself had no real-world referent, as the highest number on Baker Street then was No. 85.) And yet there is a sense in which this passage does seem to represent Sherlock Holmes and, within the context of the story, tells us something informative about him. In particular, it seems to lend support to certain statements about Sherlock Holmes as opposed to others. If we say Holmes is a pipe smoker, we seem to be asserting something true about him, whereas if we say Holmes is a nonsmoker, we appear to be asserting something false. One goal of the ontology of fictions is to make sense of this puzzle.
Broadly speaking, there are two kinds of philosophical approaches – realist and antirealist – regarding fictions. On the realist approach, even though Sherlock Holmes is not an actual human being, we must grant that he does exist in some sense. Following Meinong [1.43], we might, for example, distinguish between being and existence and consider Sherlock Holmes to be an object that has all the requisite properties we normally attribute to him, except for the property of existence. Or we might take fictions to have existence, but only as abstract entities, not as objects in space and time. By contrast, antirealists about fictions deny that they have independent being or existence and instead settle for other ways of making sense of how we interpret fictional discourse. Following Bertrand Russell, we might paraphrase the statement Sherlock Holmes is a pipe smoker and resides at 221B Baker Street without the use of a singular term (Sherlock Holmes), solely in terms of a suitably quantified existence claim: There exists one and only one x such that x is a pipe smoker and x resides at 221B Baker Street. However, while this might allow us to parse the meaning of further statements about Sherlock Holmes more effectively, it does not address the puzzle that certain claims (such as He is a pipe smoker) ring true, whereas others do not – since it renders each part of the explicated statement false. This might not seem like a major worry for the case of literary fictions, but it casts doubt on whether we can fruitfully think about scientific models in those terms, given the epistemic role of scientific models as contributors to scientific knowledge.
In recent years, an alternative approach to fictions has garnered the attention of philosophers of science, which takes Walton’s notion of “games of make-believe” as its starting point. Walton introduces this notion in the context of his philosophy of art, where he characterizes (artistic) representations as “things possessing the social function of serving as props in games of make-believe” [1.44, p. 69]. In games of make-believe, participants engage in behavior akin to children’s pretend play: when a child uses a banana as a telephone to call grandpa, this action does not amount to actually calling her grandfather (and perhaps not even attempting to call him); rather, it is a move within the context of play – where the usual standards of realism are suspended – whereby the child resolves to treat the situation as if it were one of speaking to her grandfather on the phone.
The banana is simply a prop in this game of make-believe. The use of the banana as a make-believe telephone may be inspired by some physical similarity between the two objects (e.g., their elongated shape, or the way that each can be conveniently held to one’s ear and mouth at the same time), but it is clear that props can go beyond material objects to include, for example, linguistic representations (as would be the case with
the literary figure of Sherlock Holmes). While the rules governing individual pretend play may be ad hoc, communal games of make-believe are structured by shared normative principles which authorize certain moves as legitimate, while excluding other moves as illegitimate. It is in virtue of such principles that fictional truths can be generated: for example, a toy model of a bridge at the scale of 1:1000 prescribes that, “if part of the model has a certain length, then, fictionally, the corresponding part of the bridge is a thousand times that length” [1.45, p. 38] – in other words, even though the model itself is only a meter long, it represents the bridge as a thousand meters long. Note that the scale model could be a model of a bridge that is yet to be built – in which case it would still be true that, fictionally, the bridge is a thousand meters long: props, via the rules that govern them, create fictional truths.
One issue of contention has been what kinds of metaphysical commitments such a view of models entails. Talk of imagined concrete things as the material from which models are built has been criticized for amounting to an indirect account of modeling, by which [1.46, p. 308, fn. 14]
“prepared descriptions and equations of motion ask us to imagine an imagined concrete system which then bears some other form of representation relation to the system being modelled.”
A more thoroughgoing direct view of models as fictions is put forward by Toon, who considers the following sentence from Wells’s The War of the Worlds: “The dome of St. Paul’s was dark against the sunrise, and injured, I saw for the first time, by a huge gaping cavity on its western side” [1.47, p. 229]. As Toon argues [1.46, p. 307]:
“There is no pressure on us to postulate a fictional, damaged, St. Paul’s for this passage to represent; the passage simply represents the actual St. Paul’s. Similarly, on my account, our prepared description and equation of motion do not give rise to a fictional, idealised bouncing spring since they represent the actual bouncing spring.”
By treating models as prescribing imaginings about the actual objects (where these exist and are the model’s target system), we may resolve to imagine all sorts of things that are, as a matter of fact, false; however, so the direct view holds, this is nonetheless preferable to the alternative option of positing independently existing fictional entities [1.45, p. 42]. Why might one be tempted to posit, as the indirect view does, that fictional objects fitting the model descriptions must exist? An important motivation has to do with the assertoric force of our model-based claims. As Giere puts it: “If we insist on regarding principles as genuine statements, we have to find something that they describe, something to which they refer” [1.48, p. 745]. In response, proponents of the direct view have disputed the need “to regard theoretical principles formulated in modeling as genuine statements”; instead, as Toon puts it, “they are prescriptions to imagine” [1.45, p. 44].
One potential criticism the models as fictions view needs to address is the worry that, by focusing on the user’s imaginings, what a model is becomes an entirely subjective matter. A similar worry may be raised with respect to the mental models view discussed in Sect. 1.5: if a model is merely a place-holder for whatever is needed to sustain the activity of mental modeling (or imagining) on the part of an agent, how can one be certain that the same kinds of models (or props) reliably give rise to the same kinds of mental modeling (or imaginings)? In this respect, at least, the models as fictions view appears to be in a stronger position. Recall that, unlike in individual pretend play (or unconstrained imagining), in games of make-believe certain imaginations are sanctioned by the prop itself and the – public, shared – rules of the game. As a result, “someone’s imaginings are governed by intersubjective rules, which guarantee that, as long as the rules are respected, everybody involved in the game has the same imaginings” [1.41, p. 264] – though it should be added, not necessarily the same mental images.
In his 1962 book, Models and Metaphors, Black expressed his hope that an “exercise of the imagination, with all its promise and its dangers” may help pave the way for an “understanding of scientific models and archetypes” as “a reputable part of scientific culture” [1.4, p. 243]. Even though Black was writing in general terms (and perhaps for rhetorical effect), his characterization would surely be considered apt by the proponents of the models as fictions view, who believe that models allow us to imagine their targets to be a certain way, and that, by engaging in such imaginings, we can gain new scientific insights.
In Sect. 1.1, a distinction was drawn between informational views of models, which emphasize the objective, two-place relation between the model and what it represents, and pragmatic views, according to which a model depends at least in part on the user’s beliefs or intentions, thereby rendering model-based representation a three-place relation between model, target, and user. Unsurprisingly, which side one comes down on in this debate will also have an effect on one’s take on the ontology of scientific models. Hence, structuralist approaches (e.g., the partial structures approach discussed in Sect. 1.4.4) are a direct manifestation of the informational view, whereas the models as fictions approach – especially insofar as it considers models to be props for the user’s imagination – would be a good example of the pragmatic view. The pragmatic dimension of scientific representation has received growing attention in the philosophical literature, and while this is not the place for a detailed survey of pragmatic accounts of model-based representation in particular, the remainder of this section will be devoted to a discussion of the ontological consequences of several alternative pragmatic accounts of models. Particular emphasis will be placed on what I shall call mixed ontologies, that is, accounts of models that emphasize the heterogeneity and diversity of their components.
1.7.1 Models as Mediators
Proponents of pragmatic accounts of models usually take scientific practice as the starting point of their analysis. This often directly informs how they think about models; in particular, it predisposes them to treat models as the outcome of a process of model construction. On this view, it is not only the function of models – for example, their capacity to represent target systems – which depends on the beliefs, intentions, and cognitive interests of a model user, but also the very nature of models which is dependent on human agents in this way. In other words, what models are is crucially determined by their being the result of a deliberate process of model construction. Model construction, most pragmatic theorists of models insist, is marked by “piecemeal borrowing” [1.35, p. 63] from a range of different domains. Such conjoining of heterogeneous components to form a model cannot easily be accommodated by structuralist accounts, or so it has been claimed; at the very least, there is considerable tension between, say, the way that the partial structures approach allows for a nested hierarchy of models (connected with one another via partial isomorphisms) and the much more ad hoc manner in which modelers piece together models from a variety of ingredients. (On this point, see especially [1.35, p. 76].)
A number of such accounts have coalesced into what has come to be called the models as mediators view (see [1.49] for a collection of case studies). According to this view, models are to be regarded neither as a merely auxiliary intermediate step in applying or interpreting scientific theories, nor as constructed purely from data. Rather, they are thought of as mediating between our theories and the world in a partly autonomous manner. As Morrison and Morgan put it, models “are not situated in the middle of an hierarchical structure between theory and the world,” but operate outside the hierarchical “theory-world axis” [1.50, pp. 17–18]. A central tenet of the models as mediators view is the thesis that models “are made up from a mixture of elements, including those from outside the domain of investigation”; indeed, it is thought to be precisely in virtue of this heterogeneity that they are able to retain “an element of independence from both theory and data (or phenomena)” [1.50, p. 23].
At one level, the models as mediators view appears to be making a descriptive point about scientific practice. As Morrison and Morgan [1.50] point out, there is “no logical reason why models should be constructed to have these qualities of partial independence” [1.50, p. 17], though in practice they do exhibit them, and examples that involve the integration of heterogeneous elements beyond theory and data “are not the exception but the rule” [1.50, p. 15]. Yet, there is also the further claim that models could not fulfil their epistemic function unless they are partially autonomous entities: “we can only expect to use models to learn about our theories or our world if there is at least partial independence of the model from both” [1.50, p. 17]. Given that models are functional entities (in the sense discussed in Sect. 1.2), this has repercussions for the ontological question of what kind of entities models are. More often than not, models will integrate – perhaps imperfectly, but in irreducible ways – heterogeneous components from disparate sources, including (but not limited to) “elements of theories and empirical evidence, as well as stories and objects which could form the basis for modeling decisions” [1.50, p. 15]. As proponents of the models as mediators view are at pains to show, even in cases where models initially seem to derive straightforwardly from fundamental theory or empirical data, closer inspection reveals the presence of other elements – such as “simplifications and approximations which have to be decided independently of the theoretical requirements or of data conditions” [1.50, p. 16].
For the models as mediators approach, any answer to the question what is a model? must be tailored to the specific case at hand: models in high-energy physics will have a very different composition, and will consist of an admixture of different elements, than, say, models in psychology. However, as a general rule, no model – or, at any rate, no interesting model – will ever be fully reducible to theory and data; attempts to clean up the ontology of scientific models so as to render them either purely theoretical or entirely empirical, according to the models as mediators view, misconstrue the very nature and function of models in science.
1.7.2 Models as Epistemic Artifacts
A number of recent pragmatic approaches take the models as mediators view as their starting point, but suggest that it should be extended in various ways. Thus, Knuuttila acknowledges the importance of mediation between theory and data, but a richer account of models is needed to account for how this partial independence comes about. For Knuuttila, materiality is the key enabling factor that imbues models with such autonomy: it is “the material dimension, and not just additional elements, that makes models able to mediate” [1.51, p. 48]. Materiality is also seen as explaining the various epistemic functions that models have in inquiry, not least by way of analogy with scientific experiments. For example, just as in experimentation much effort is devoted to minimizing unwanted external factors (such as noise), in scientific models certain methods of approximation and idealization serve the purpose of neutralizing undesirable influences. Models typically draw on a variety of formats and representations, in a way that enables certain specific uses, but at the same time constrains them; this breaks with the traditional assumption that we can “clearly tell apart those features of our scientific representations that are attributable to the phenomena described from the conventions used to describe them” [1.52, p. 268].
On the account sketched thus far, attempting to characterize the nature and function of models in the language of theories and data would, in the vast majority of cases, give a misleading impression; instead, models are seen as epistemic tools [1.52, p. 267]:
“Concrete artifacts, which are built by various representational means, and are constrained by their design in such a way that they enable the study of certain scientific questions and learning through constructing and manipulating them.”
This links the philosophical debate about models to questions in the philosophy of technology, for example concerning the ontology of artifacts, which are likewise construed as both material bodies and functional objects. It also highlights the constitutive role of design and construction, which applies equally to models with a salient material dimension – such as scale models in engineering or ball-and-stick models in chemistry – and to largely theoretical models. For example, it has been argued that mathematical models (e.g., in many-body physics) may be fruitfully characterized not only in theoretical terms (say, as a Hamiltonian) or as mathematical entities (as an operator equation), but also as the output of a mature mathematical formalism (in this case, the formalism of second quantization) – that is, a physically interpreted set of notational rules that, while embodying various theoretical assumptions, is not usually reducible to fundamental theory [1.53].
As in the case of the models as mediators approach, the ontological picture that emerges from the artifactual approach to models is decidedly mixed: models will typically consist of a combination of different materials, media and formats, and deploy different representational means (such as pictorial, symbolic, and diagrammatic notations) as well as empirical data and theoretical assumptions. Beyond merely acknowledging the heterogeneity of such a mixture of elements, however, the artifactual approach insists that it is in virtue of their material dimension that the various elements of a model, taken together, enable and constrain its representational and other epistemic functions.
1.8 Summary
As the survey in this chapter demonstrates, the term model in science refers to a great variety of things: physical objects such as scale models in engineering, descriptions and sets of sentences, set-theoretic structures, fictional objects, or an assortment of all of the above. This makes it difficult to arrive at a uniform characterization of models in general. However, by paying close attention to philosophical accounts of model-based representation, it is possible to discern certain clusters of positions. At a general level, it is useful to think of models as functional entities, as this allows one to explore how different functional perspectives lead to different conceptions of the ontology of models. Hence, with respect to the representational function of models, it is possible to distinguish between informational views, which we found to be closely associated with structuralist accounts of models, and pragmatic views, which tend to give rise to more heterogeneous accounts, according to which models may be thought of as props for the imagination, as partly autonomous mediators between theory and data, or as epistemic artifacts consisting of an admixture of heterogeneous elements.
When nineteenth century physicists began to reflect systematically on the role of analogy in science, they did so out of a realization that it may not always be possible to apply fundamental theory directly to reality, either because any attempt to do so faces insurmountable complexities, or because no such fundamental theory is as yet available. At the beginning of the twenty-first century, these challenges have not diminished, and scientists find themselves turning to an ever greater diversity of scientific models, a unified philosophical theory of which is still outstanding.
References
1.1 R. Frigg: Models in science. In: Stanford Encyclopedia of Philosophy, ed. by E.N. Zalta http://plato.stanford.edu/entries/models-science/ (Spring 2012 Edition)
1.2 N. Goodman: Languages of Art (Bobbs-Merrill, Indianapolis 1968)
1.3 R. Ankeny, S. Leonelli: What’s so special about model organisms?, Stud. Hist. Philos. Sci. 42(2), 313–323 (2011)
1.4 M. Black: Models and Metaphors: Studies in Language and Philosophy (Cornell Univ. Press, Ithaca 1962)
1.5 P. Achinstein: Concepts of Science: A Philosophical Analysis (Johns Hopkins, Baltimore 1968)
1.6 J. von Neumann: Method in the physical sciences. In: Collected Works Vol. VI. Theory of Games, Astrophysics, Hydrodynamics and Meteorology, ed. by A.H. Taub (Pergamon, Oxford 1961) pp. 491–498
1.7 S. French: Keeping quiet on the ontology of models, Synthese 172(2), 231–249 (2010)
1.8 G. Contessa: Editorial introduction to special issue, Synthese 2010(2), 193–195 (2010)
1.9 S. Ducheyne: Towards an ontology of scientific models, Metaphysica 9(1), 119–127 (2008)
1.10 R. Giere: Using models to represent reality. In: Model-Based Reasoning in Scientific Discovery, ed. by L. Magnani, N. Nersessian, P. Thagard (Plenum, New York 1999) pp. 41–57
1.11 A. Chakravartty: Informational versus functional theories of scientific representation, Synthese 217(2), 197–213 (2010)
1.12 R.I.G. Hughes: Models and representation, Proc. Philos. Sci., Vol. 64 (1997) pp. S325–226
1.13 T. Knuuttila: Some consequences of the pragmatist approach to representation. In: EPSA Epistemology and Methodology of Science, ed. by M. Suárez, M. Dorato, M. Rédei (Springer, Dordrecht 2010) pp. 139–148
1.14 M. Suárez: An inferential conception of scientific representation, Proc. Philosophy of Science, Vol. 71 (2004) pp. 67–779
1.15 M. Jammer: Die Entwicklung des Modellbegriffs in den physikalischen Wissenschaften, Stud. Gen. 18(3), 166–173 (1965)
1.16 P. Duhem: The Aim and Structure of Physical Theory (Princeton Univ. Press, Princeton 1954), Transl. by P.P. Wiener
1.17 D. Bailer-Jones: Models, metaphors and analogies. In: The Blackwell Guide to the Philosophy of Science, ed. by P. Machamer, M. Silberstein (Blackwell, Oxford 2002) pp. 108–127
1.18 D.H. Mellor: Models and analogies in science: Duhem versus Campbell?, Isis 59(3), 282–290 (1968)
1.19 D. Bailer-Jones: Scientific Models in Philosophy of Science (Univ. Pittsburgh Press, Pittsburgh 2009)
1.20 M. Hesse: Models and Analogies in Science (Sheed Ward, London 1963)
1.21 N.R. Campbell: Physics: The Elements (Cambridge Univ. Press, Cambridge 1920)
1.22 J.M. Soskice, R. Harré: Metaphor in science. In: From a Metaphorical Point of View: A Multidisciplinary Approach to the Cognitive Content of Metaphor, ed. by Z. Radman (de Gruyter, Berlin 1995) pp. 289–308
1.23 C. Liu: Models and theories I: The semantic view revisited, Int. Stud. Philos. Sci. 11(2), 147–164 (1997)
1.24 R. Carnap: Foundations of Logic and Mathematics (Univ. Chicago Press, Chicago 1939)
1.25 R. Hendry, S. Psillos: How to do things with theories: An interactive view of language and models in science. In: The Courage of Doing Philosophy: Essays Presented to Leszek Nowak, ed. by J. Brzeziński, A. Klawiter, T.A.F. Kuipers, K. Lastowksi, K. Paprzycka, P. Przybyzs (Rodopi, Amsterdam 2007) pp. 123–158
1.26 N. Cartwright: Models and the limits of theory: Quantum hamiltonians and the BCS model of superconductivity. In: Models as Mediators, ed. by M. Morrison, M. Morgan (Cambridge Univ. Press, Cambridge 1999) pp. 241–281
1.27 F. Suppe: The Semantic Conception of Theories and Scientific Realism (Univ. Illinois Press, Urbana 1989)
1.28 P. Suppes: A comparison of the meaning and uses of models in mathematics and the empirical sciences, Synthese 12(2/3), 287–301 (1960)
1.29 B. van Fraassen: The Scientific Image (Oxford Univ. Press, Oxford 1980)
1.30 M. Thomson-Jones: Models and the semantic view, Philos. Sci. 73(4), 524–535 (2006)
1.31 P. Suppes: Introduction to Logic (Van Nostrand, Princeton 1957)
1.32 N. da Costa, S. French: Science and Partial Truth: A Unitary Approach to Models and Scientific Reasoning (Oxford Univ. Press, New York 2003)
1.33 S. French: The structure of theories. In: The Routledge Companion to Philosophy of Science, 2nd edn., ed. by M. Curd, S. Psillos (Routledge, London 2013) pp. 301–312
1.34 C. Pincock: Overextending partial structures: Idealization and abstraction, Philos. Sci. 72(4), 1248–1259 (2005)
1.35 M. Suárez, N. Cartwright: Theories: Tools versus models, Stud. Hist. Philos. Mod. Phys. 39(1), 62–81 (2008)
1.36 P. Godfrey-Smith: The strategy of model-based science, Biol. Philos. 21(5), 725–740 (2006)
1.37 M. Thomson-Jones: Missing systems and the face value practice, Synthese 172(2), 283–299 (2010)
1.38 P.N. Johnson-Laird: Mental Models (Harvard Univ. Press, Cambridge 1983)
1.39 N. Nersessian: Model-based reasoning in conceptual change. In: Model-Based Reasoning in Scientific Discovery, ed. by L. Magnani, N. Nersessian, P. Thagard (Plenum, New York 1999) pp. 5–22
1.40 J.C. Maxwell: The Scientific Papers of James Clerk Maxwell, Vol. 1 (Cambridge Univ. Press, Cambridge 1890), ed. by W.D. Niven
1.41 R. Frigg: Models and fiction, Synthese 172(2), 251–268 (2010)
1.42 A.C. Doyle: The Adventure of the Three Garridebs. In: The Casebook of Sherlock Holmes, ed. by J. Miller (Dover, 2005)
1.43 A. Meinong: Untersuchungen zur Gegenstandstheorie und Psychologie (Barth, Leipzig 1904)
1.44 K. Walton: Mimesis as Make-Believe: On the Foundations of the Representational Arts (Harvard Univ. Press, Cambridge 1990)
1.45 A. Toon: Models as Make-Believe: Imagination, Fiction, and Scientific Representation (Palgrave-Macmillan, Basingstoke 2012)
1.46 A. Toon: The ontology of theoretical modelling: Models as make-believe, Synthese 172(2), 301–315 (2010)
1.47 H.G. Wells: War of the Worlds (Penguin, London 1897), 1978
1.48 R. Giere: How models are used to represent reality, Proc. Philosophy of Science, Vol. 71 (2004) pp. S742–S752
1.49 M. Morrison, M. Morgan (Eds.): Models as Mediators (Cambridge Univ. Press, Cambridge 1999)
1.50 M. Morrison, M. Morgan: Models as mediating instruments. In: Models as Mediators, ed. by M. Morrison, M. Morgan (Cambridge Univ. Press, Cambridge 1999) pp. 10–37
1.51 T. Knuuttila: Models as Epistemic Artefacts: Toward a Non-Representationalist Account of Scientific Representation (Univ. Helsinki, Helsinki 2005)
1.52 T. Knuuttila: Modelling and representing: An artefactual approach to model-based representation, Stud. Hist. Philos. Sci. 42(2), 262–271 (2011)
1.53 A. Gelfert: How to Do Science with Models: A Philosophical Primer (Springer, Cham 2016)
Demetris Portides
2. Models and Theories
Part A | 2
Both the received view (RV) and the semantic view (SV) of scientific theories are explained. The
arguments against the RV are outlined in an effort to highlight how focusing on the syntactic
character of theories led to the difficulty in characterizing theoretical terms, and thus to the
difficulty in explicating how theories relate to experiment. The absence of the representational
function of models in the picture drawn by the RV becomes evident; and one does not fail to see
that the SV is in part a reaction to what its adherents consider to be an excessive focus on
syntax by its predecessor, and in part a reaction to the complete absence of models from its
predecessor's philosophical attempt to explain the theory–experiment relation. The SV is
explained in an effort to clarify its main features but also to elucidate the differences between
its different versions. Finally, two kinds of criticism are explained that affect all versions of
the SV but which do not affect the view that models have a warranted degree of importance in
scientific theorizing.

2.1 The Received View of Scientific Theories
  2.1.1 The Observation–Theory Distinction
  2.1.2 The Analytic–Synthetic Distinction
  2.1.3 Correspondence Rules
  2.1.4 The Cosmetic Role of Models According to the RV
  2.1.5 Hempel's Provisos Argument
  2.1.6 Theory Consistency and Meaning Invariance
  2.1.7 General Remark on the Received View
2.2 The Semantic View of Scientific Theories
  2.2.1 On the Notion of Model in the SV
  2.2.2 The Difference Between Various Versions of the SV
  2.2.3 Scientific Representation Does not Reduce to a Mapping of Structures
  2.2.4 A Unitary Account of Models Does not Illuminate Scientific Modeling Practices
  2.2.5 General Remark on the Semantic View
References
Scientists use the term model with reference to iconic or scaled representations, analogies, and
mathematical (or abstract) descriptions. Although all kinds of models in science may be
philosophically interesting, mathematical models stand out. Representation with iconic or scale
models, for instance, mostly applies to a particular state of a system at a particular time, or it
requires the mediation of a mathematical (or abstract) model in order to relate to theories.
Representation via mathematical models, on the other hand, is of utmost interest because it
applies to types of target systems and it can be used to draw inferences about the time-evolution
of such systems, but more importantly for our purposes because of its obvious link to scientific
theories.

In the history of philosophy of science, there have been two systematic attempts to explicate the
relation of such models to theory. The first is what has been labeled the received view (RV) of
scientific theories, which grew out of the logical positivist tradition. According to this view,
theories are construed as formal axiomatic calculi whose logical consequences extend to
observational sentences. Models are thought to have no representational role; their role is
understood metamathematically, as interpretative structures of subsets of sentences of the formal
calculus. Ultimately it became clear that such a role ascribed to models does not do justice to
how science achieves theoretical representations of phenomena. This conclusion was reached largely
due to the advent of the second systematic attempt to explore the relation between theory and
models, the semantic view (SV) or model-theoretic view of scientific theories. The semantic view
regards theories as classes of models that are directly defined without resort to a formal
calculus. Thus, models in this view are integral parts of theories, but they are also the devices
by which representation of phenomena is achieved.
26 Part A Theoretical Issues in Models
Although the SV recognized the representational capacity of models and exposed that which was
concealed by the logical positivist tradition, namely that one of the primary functions of
scientific models is to apply the abstract theoretical principles in ways that actual physical
systems can be represented, it also generated a debate concerning the complexities involved in
scientific representation. This recent debate has significantly enhanced our understanding of the
representational role of scientific models. At the same time it gave rise, among other things, to
questions regarding the relation between models and theory. The adherents of the SV claim that a
scientific theory is identified with a class of models, hence that models are constitutive parts
of theory and thus they represent by means of the conceptual apparatus of theory. The critics of
the SV, however, argue that those models that are successful representations of physical systems
utilize a much richer conceptual apparatus than that provided by theory and thus claim that they
should be understood as partially autonomous from theory.

A distinguishing characteristic of this debate is the notion of representational model, that is,
a scientific entity which possesses the necessary features that render it representational of a
physical system. In the SV, theoretical models, that is, mathematical models that are constitutive
parts of theory structure, are considered to be representational of physical systems. Its critics,
however, argue that in order to provide a model with the capacity to represent actual physical
systems, the theoretical principles from which the model arises are typically supplemented with
ingredients that derive from background knowledge, from semiempirical results and from experiment.
In order to better understand the character of successful representational models, according to
this latter view, we must move away from a purely theory-driven view of model construction and
also place our emphasis on the idea that representational models are entities that consist of
assortments of the aforementioned sorts of conceptual ingredients.

In order to attain insight into how models could relate to theory and also be able to use that
insight in addressing other issues regarding models, in what follows, I focus on the RV and the SV
of scientific theories. Each of the two led to a different conception of the nature of theory
structure and subsequently to a different suggestion for what scientific models are, what they are
used for, and how they function. In the process of explicating these two conceptions of theory
structure, I will also review the main arguments that have been proposed against them. The RV has
long been abandoned for reasons that I shall explore in Sect. 2.1, but the SV lives on despite
certain inadequacies that I shall also explore in Sect. 2.2. Toward the end of Sect. 2.2, in
Sect. 2.2.4, I shall very briefly touch upon the more recent view that the relation between theory
and models is far more complex than advocates of the RV or the SV have claimed, that models in
science demonstrate a certain degree of partial autonomy from the theory that prompts their
construction, and that because of this a unitary account of models obscures significant features
of scientific modeling practices.
laws), consist only of terms from VT. This construal of theories is a syntactic system, which
naturally requires semantics in order to be useful as a model of scientific theories.

It is further assumed that the terms of VO refer to directly observable physical objects and
directly observable properties and relations of physical objects. Thus the semantic interpretation
of such terms, and the sentences belonging to LO, is provided by direct observation. The terms of
VT, and subsequently all the other sentences of L not belonging to LO, are partially interpreted
via the theoretical postulates, T, and a finite set of postulates that has come to be known as the
correspondence rules, C. The latter are mixed sentences of L, that is, they are constructed with
at least one term from each of the two classes VT and VO. (The reader could consult Suppe [2.1]
for a detailed exposition of the RV, but also for a detailed philosophical study of the
developments that the RV underwent under the weight of several criticisms until it reached what
Suppe calls the "final version of the RV".)

We could synopsize how scientific theories are conceived according to the RV as follows: The
scientific laws, which as noted constitute the axioms of the theory, specify relations holding
between the theoretical terms. Via a set of correspondence rules, theoretical terms are reduced
to, or defined by, observation terms. Observation terms refer to objects and relations of the
physical world and thus are interpreted. Hence, a scientific theory, according to the RV, is a
formal axiomatic system having as point of departure a set of theoretical postulates, which when
augmented with a set of correspondence rules has deductive consequences that stretch all the way
to terms, and sentences consisting of such terms, that refer to the directly observable physical
objects. Since according to this view, the backbone of a scientific theory is the set of
theoretical postulates, T, and a partial interpretation of L is given via the set of
correspondence rules, C, let TC (i.e., the union set of T and C) designate the scientific theory.

From this sketch, it can be inferred that the RV implies several philosophically interesting
things. For the purposes of this chapter, it suffices to limit the discussion only to those
implications of the RV that are relevant to the criticisms that have contributed to its downfall.
These implications, which in one way or another relate to the difficulty in characterizing VT
terms, are:

1. It relies on an observational–theoretical distinction of the terms of L.
2. It embodies an analytic–synthetic distinction of the sentences of L.
3. It employs the obscure notion of correspondence rules to account for the interpretation of
   theoretical terms and to account for theory application.
4. It does not assign a representational function to models.
5. It assigns a deductive status to the relation between empirical theories and experiment.
6. It commits to a theory consistency condition and to a meaning invariance condition.

2.1.1 The Observation–Theory Distinction

The separation of L into VO and VT terms implies that the RV requires an observational–theoretical
distinction in the terms of the vocabulary of the theory. This idea was criticized in two ways.
The first kind of objection to the observation–theory distinction relied on a twofold argument. On
the one hand, the critics claim that an observation–theory distinction of scientific terms cannot
be drawn; and on the other, that a classification of terms following such a distinction would give
rise to a distinction of observational–theoretical statements, which also cannot be drawn for
scientific languages. The second kind of objection to the distinction relies on attempts to
establish accounts of observation that are incompatible with the observation–theory distinction
and on showing that observation statements are theory laden.

The Untenability of the Observation–Theory Distinction

The argument of the first kind that focuses on the untenability of the observation–theory
distinction is due to Achinstein [2.2, 3] and Putnam [2.4]. Achinstein explores the sense of
observation relevant to science, that is, "the sense in which observing involves visually
attending to something," and he claims that this sense exhibits the following characteristics:

1. Observation involves attention to the various aspects or features of an item depending on the
   observer's concerns and knowledge.
2. Observation does not necessarily involve recognition of the item.
3. Observation does not imply that whatever is observed is in the visual field or in the line of
   sight of the observer.
4. Observation could be achieved indirectly.
5. The description of what one observes can be done in different ways. (The reader could refer to
   Achinstein [2.3, pp. 160–165] for an explication of these characteristics of observation by the
   use of specific examples.)
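
The list-based way of drawing the observational–theoretical distinction, which the argument that follows takes issue with, can be made concrete with a minimal sketch. It assumes, purely for illustration, two fixed term lists and represents a sentence of L by the set of nonlogical terms it contains; the names V_O, V_T and classify_sentence, as well as the sample terms, are hypothetical and are not taken from the RV literature.

```python
# Hypothetical term lists standing in for the RV's two vocabularies.
V_O = frozenset({"red", "spectral_line", "pointer_position"})  # observational terms (assumed)
V_T = frozenset({"electron", "wave_function"})                 # theoretical terms (assumed)


def classify_sentence(terms):
    """Classify a sentence of L by the nonlogical terms it contains."""
    terms = set(terms)
    unknown = terms - V_O - V_T
    if unknown:
        raise ValueError(f"terms outside both lists: {sorted(unknown)}")
    has_o = bool(terms & V_O)
    has_t = bool(terms & V_T)
    if has_o and has_t:
        return "mixed"          # at least one term from each class, as in correspondence rules
    if has_t:
        return "theoretical"
    if has_o:
        return "observational"  # a sentence of L_O
    return "logical"            # no nonlogical terms at all


print(classify_sentence({"red"}))                        # observational
print(classify_sentence({"electron"}))                   # theoretical
print(classify_sentence({"electron", "spectral_line"}))  # mixed
```

The sketch works only if every term sits on exactly one list once and for all, and this is the assumption that Achinstein and Putnam call into question.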
If now one urges an observation–theory distinction by simply constructing lists of observable and
unobservable terms (as proponents of the RV according to Achinstein do), the distinction becomes
untenable. For example, according to typical lists of unobservables, electron is a theoretical
term. But based on points (3) and (4) above, Achinstein claims, this could be rejected. Similarly,
based on point (5), Achinstein also rejects the tenability of such a distinction at the level of
statements, because "what scientists as well as others observe is describable in many different
ways, using terms from both vocabularies" [2.3, p. 165].

Furthermore, if, as proponents of the RV have often claimed (for instance, Hempel [2.5],
Carnap [2.6], and [2.7]), items in the observational list are directly observable whereas those in
the theoretical list are not, then Achinstein [2.3, pp. 172–177] claims that a close construal of
"directly observable" reveals that the desired classification of terms into the two lists fails.
He explains that "directly observable" could mean that the item can be observed without the use of
instruments. If this is what advocates of the RV require, then it does not warrant the
distinction. First, it is not precise enough to classify things seen by images and reflections.
Second, if "not observable without instruments" means that no aspect of the item is observable
without instruments, then things like temperature and mass would be observables, since some
aspects of them are detected without instruments. If, however, "directly observable" means that no
instruments are required to detect the item's presence, then it would be insufficient because one
cannot talk about the presence of temperature. Finally, if it means that no instruments are
required to measure the item or its properties, then such terms as volume, weight, etc. would have
to be classified as theoretical terms. Hence, Achinstein concludes that the notion of direct
observability is unclear and thus fails to draw the desired observation–theory distinction.

Along similar lines, Putnam [2.4] argues that the distinction is completely broken-backed mainly
for three reasons. First, if an observation term is one that only refers to observables then there
are no observation terms. For example, the term red is in the observable class but it was used by
Newton to refer to a theoretical term, namely red corpuscles. Second, many terms that refer
primarily to the class of unobservables are not theoretical terms. Third, some theoretical terms,
which are of course the outcome of a scientific theory, refer primarily to observables. For
example, the theory of evolution, as put forward by Darwin, referred to observables by employing
theoretical terms.

What these arguments accomplish is to highlight the fact that scientific languages employ terms
that cannot clearly and easily be classified into observational or theoretical. They do not,
however, show the untenability of the observation–theory distinction as employed by the RV. As
Suppe [2.8] correctly observes, what they show is that the RV needs a sufficiently rich artificial
language for science, no matter how complex it may turn out to be. Such a language, in which
presumably the observation–theory distinction is tenable, must have a plethora of terms, such
that, to use his example, the designated term red_o will refer to the observable occurrences of
the predicate red, and the designated term red_t will refer to the unobservable occurrences.

The Theory-Ladenness of Observation

Hanson's argument is a good example of the second kind, in which an attempt is made to show that
there is no theory-neutral observation language and that observation is theory-laden, and thus to
establish an account of observation that is incompatible with the observation–theory distinction
required by the RV (Hanson [2.9, pp. 4–30]; Hanson [2.10, pp. 59–198]; also Suppe [2.1,
pp. 151–166]). He does this by attempting to establish that an observation language that
intersubjectively can be given a theory-independent semantic interpretation, as the RV purports,
cannot exist.

He begins by asking whether two people see the same things when holding different theories. We
could follow his argument by asking whether Kepler and Tycho Brahe see the same thing when looking
at the sun rising. Kepler, of course, holds that the earth revolves around the sun, while Tycho
holds that the sun revolves around the earth. Hanson addresses this question by considering
ambiguous figures, that is, figures that sometimes can be seen as one thing and other times as
another. The most familiar example of this kind is the duck–rabbit figure.

When confronted with such figures, viewers see either a duck or a rabbit depending on the
perspective they take, but in both cases they see the same distal object (i.e., the object that
emits the rays of light that impinge on the retina). Hanson uses this fact to develop a sequence
of arguments to counter the two standard interpretations of his time. The first was that the
perceptual system delivers the same visual representation and then cognition (thought) interprets
this either as a duck or as a rabbit. The other was that the perceptual system outputs both
representations and then cognition chooses one of the two. Both interpretations are strongly
linked with the idea that the perceptual process and the cognitive process function independently
of one another, that is, the perceptual system delivers its output independent of any cognitive
influences. However, Hanson challenges the assumption that the two observers see the same thing
and via thought they interpret it differently. He claims that perception does not deliver either a
duck or a rabbit, or an ambiguous figure, and then via some other independent process thought
chooses one or the other. On the contrary, the switch from seeing one thing to seeing the other
seems to take place spontaneously and moreover a process of back and forth seeing without any
thinking seems to be involved. He goes on to ask what could account for the difference in what is
seen. His answer is that what changes is the organization of the ambiguous figure as a result of
the conceptual background of each viewer. This entails that what one sees, the percept, depends on
the conceptual background that results from one's experience and knowledge, which means that
thought affects the formation of the percept; thus perception and cognition become intertwined.
When Tycho and Kepler look at the sun, they are confronted with the same distal object but they
see different things because their conceptual organizations of their experiences are vastly
different. In other words, Hanson's view is that the percept depends on background knowledge,
which means that cognition influences perceptual processing. Consequently, observation is theory
laden, namely, observation is conditional on background knowledge.

By this argument, Hanson undermines the RV's position, which entails that Kepler and Brahe see the
same thing but interpret it differently; and also establishes that conceptual organizations are
features of seeing that are indispensable to scientific observation and thus that Kepler and Brahe
see two different things because perception inherently involves interpretation, since the former
is conditional on background knowledge. It is, however, questionable whether Hanson's arguments
are conclusive. Fodor [2.11–13], Pylyshyn [2.14, 15], and Raftopoulos [2.16–18], for example, have
extensively argued on empirical grounds that perception, or at least a part of it, is theory
independent and have proposed explanations of the ambiguous figures that do not invoke cognitive
effects in explaining the percept and the switch between the two interpretations of the figure.
This debate, therefore, has not yet reached its conclusion; and many today would argue that fifty
or so years after Hanson the arguments against the theory ladenness of observation are much more
tenable.

2.1.2 The Analytic–Synthetic Distinction

The RV's dependence on the observation–theory distinction is intimately linked to the requirement
for an analytic–synthetic distinction. An argument to defend this claim is given by Suppe [2.1,
pp. 68–80]. Here is a sketch of that argument. The analytic–synthetic distinction is embodied in
the RV, because (as suggested by Carnap [2.19]) implicit in TC are meaning postulates (or
semantical rules) that specify the meanings of sentences in L. However, if meaning specification
were the only function of TC then TC would be analytic, and in such case it would not be subject
to empirical investigation. TC must therefore have a factual component, and the meaning postulates
must separate the meaning from the factual component. This would imply an analytic–synthetic
separation, if those sentences in L that are logical truths or logical consequences of the meaning
postulates are analytic and all nonanalytic sentences are understood to be synthetic. Moreover,
any nonanalytic sentence in L taken in conjunction with the class of meaning postulates would have
certain empirical consequences. If the conjunction is refuted or confirmed by directly observable
evidence, this will reflect only on the truth value of the conjunction and not on the meaning
postulates. Hence such conjunctive sentences can only be synthetic. Thus every nonanalytic
sentence of LO and every sentence of L constituted by a mixed vocabulary is synthetic. So the
observation–theory distinction supports an analytic–synthetic distinction of the sentences of L.

The main criticism against the analytic–synthetic distinction consists of attempts to show its
untenability. Quine [2.20] points out that there are two kinds of analytic statements: (a) logical
truths, which remain true under all interpretations, and (b) statements that are true by virtue of
the meaning of their nonlogical terms, for example, "No bachelor is married." He then argues that
the analyticity of statements of the second kind cannot be established without resort to the
notion of synonymy, and that the latter notion is just as problematic as the notion of
analyticity. The argument runs roughly as follows. Given that meaning (or intension) is clearly
distinguished from its extension, that is, the class of entities to which it refers, a theory of
meaning is primarily concerned with cognitive synonymy (i.e., the synonymy of linguistic forms).
For example, to say that bachelor and unmarried man are cognitively synonymous is to say that they
are interchangeable in all contexts without change of truth value. If such were the case then the
statement "No bachelor is married" would become "No unmarried man is married," which would be a
logical truth. In other words, statements of kind (b) are reduced to statements of kind (a) if
only we could interchange synonyms for synonyms. But as Quine argues, the notion of
interchangeability salva veritate is an extensional concept and hence does not help with
analyticity. In fact, no analysis of the interchangeability salva veritate account of synonymy is
possible without recourse to analyticity, thus making such an effort circular, unless
interchangeability is "[...] relativized to a language whose extent is specified in relevant
respects" [2.20, p. 30]. That is to say,
we first need to know what statements are analytic in order to decide which expressions are
synonymous; hence appeal to synonymy does not help with the notion of analyticity.

Similarly, White [2.21] argues that an artificial language, $L_1$, can be constructed with
appropriate definitional rules, in which the predicates $P_1$ and $Q_1$ are synonymous whereas
$P_1$ and $Q_2$ are not, hence making such sentences as $\forall x\,(P_1(x) \to Q_1(x))$ logical
truths and such sentences as $\forall x\,(P_1(x) \to Q_2(x))$ synthetic. In a different artificial
language $L_2$, $P_1$ could be defined to be synonymous with $Q_2$ and not with $Q_1$, hence
making the sentence $\forall x\,(P_1(x) \to Q_2(x))$ a logical truth and the sentence
$\forall x\,(P_1(x) \to Q_1(x))$ synthetic. This relies merely upon convention. However, he asks,
in a natural language what rules are there that dictate what choice of synonymy can be made such
that one formula is a synthetic truth rather than analytic? The key point of the argument is
therefore that in a natural language or in a scientific language, which are not artificially
constructed and which do not contain definitional rules, the notion of analyticity is unclear.

Nevertheless, it could be argued that such arguments as the above are not entirely conclusive,
primarily because the RV is not intended as a description of actual scientific theories. Rather,
the RV is offered as a rational reconstruction of scientific theories, that is, an explication of
the structure of scientific theories. It does not aim to describe how actual theories are
formulated, but only to indicate a logical framework (i.e., a canonical linguistic formulation) in
which theories can be essentially reformulated. Therefore, all that proponents of the RV needed to
show was that the analytic–synthetic distinction is tenable in some artificial language (with
meaning postulates) in which scientific theories could potentially be reformulated. In view of
this, in order for the RV to overcome the obscurity of the notion of analyticity, pointed out by
Quine and White, it would require the conclusion of a project that Carnap began: to spell out a
clear way by which to characterize meaning postulates for a specified theoretical language. (This
is clearly Carnap's intention in his [2.19].)

2.1.3 Correspondence Rules

In order to distinguish the character and function of theoretical terms from speculative
metaphysical ones (e.g., unicorn), logical positivists sought a connection of theoretical to
observational terms by giving an analysis of the empirical nature of theoretical terms contrary to
that of metaphysical terms. This connection was formulated in what we can call, following
Achinstein [2.22], the Thesis of Partial Interpretation, which is basically the following: As
indicated above, in the brief sketch of the main features of the RV, the RV allows that a complete
empirical semantic interpretation in terms of direct observables is given to VO terms and to
sentences that belong to LO. However, no such interpretation is intended for VT terms and
consequently for sentences of L containing them. It is TC as a whole that supplies the empirical
content of VT terms. Such terms receive a partial observational meaning indirectly by being
related to sets of observation terms via correspondence rules. To use one of Achinstein's
examples [2.22, p. 90]:

  "it is in virtue of [a correspondence-rule] which connects a sentence containing the
  theoretical term electron to a sentence containing the observational term spectral line that
  the former theoretical term gains empirical meaning within the Bohr theory of the atom"

Correspondence rules were initially introduced to serve three functions in the RV:

1. To define theoretical terms.
2. To guarantee the cognitive significance of theoretical terms.
3. To specify the empirical procedures for applying theory to phenomena.

In the initial stages of logical positivism it was assumed that if observational terms were
cognitively significant, then theoretical terms were cognitively significant if and only if they
were explicitly defined in terms of observational terms. The criteria of explicit definition and
cognitive significance were abandoned once proponents of the RV became convinced that
dispositional terms, which are cognitively significant, do not admit of explicit definitions
(Carnap [2.23, 24], also Hempel [2.25, pp. 23–29], and Hempel [2.5]). Consider, for example, the
dispositional term tearable (let us assume all the necessary conditions for an object to be torn
apart hold). If we try to explicitly define it in terms of observables we end up with something
like this:

  "An object x is tearable if and only if, if it is pulled sharply apart at time t then it will
  tear at t (assuming for simplicity that pulling and tearing occur simultaneously)."

The above definition could be rendered as $\forall x\,(T(x) \leftrightarrow \forall t\,(P(x,t) \to Q(x,t)))$,
where T is the theoretical term tearable, P is the observational term pulled apart, and Q is the
observational term tears. But this does not correctly define the actual dispositional property
tearable, because the right-hand side of the biconditional will be true of objects that are never
pulled apart. As a result, some objects that are not tearable and have never been pulled apart
will by definition have the property tearable.
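
The failure of the explicit definition can be checked mechanically on a small finite model. The following is a minimal sketch under stated assumptions: a toy domain with three time points, and an object's history encoded as two dictionaries mapping times to truth values (the names TIMES, forced_by_explicit_definition and admitted_by_reduction_sentence are hypothetical). It computes the value that the explicit definition forces on T(a) and, for contrast, the values of T(a) left open by the bilateral reduction sentence $\forall x\,\forall t\,(P(x,t) \to (Q(x,t) \leftrightarrow T(x)))$ that Carnap proposed in its place.

```python
TIMES = [0, 1, 2]  # assumed finite set of time points


def forced_by_explicit_definition(pulled, tears):
    """Value of T(a) fixed by T(a) <-> forall t (P(a,t) -> Q(a,t))."""
    return all((not pulled[t]) or tears[t] for t in TIMES)


def admitted_by_reduction_sentence(pulled, tears):
    """Values of T(a) consistent with forall t (P(a,t) -> (Q(a,t) <-> T(a)))."""
    return [
        value
        for value in (True, False)
        if all((not pulled[t]) or (tears[t] == value) for t in TIMES)
    ]


# An object that is never pulled apart (and hence never tears).
never_pulled = {t: False for t in TIMES}
never_tears = {t: False for t in TIMES}

# The explicit definition makes the conditional vacuously true, so T(a) is forced.
print(forced_by_explicit_definition(never_pulled, never_tears))   # True

# The reduction sentence is satisfied whichever value T(a) takes.
print(admitted_by_reduction_sentence(never_pulled, never_tears))  # [True, False]

# An object pulled at t = 1 that tears at t = 1: both accounts agree.
pulled_once = {0: False, 1: True, 2: False}
tears_once = {0: False, 1: True, 2: False}
print(forced_by_explicit_definition(pulled_once, tears_once))     # True
print(admitted_by_reduction_sentence(pulled_once, tears_once))    # [True]
```

For the never-pulled object the explicit definition returns True regardless of what the object is actually like, which is the defect described above, whereas the reduction sentence simply leaves the question open.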
Because of this, Carnap [2.23, 24] proposed to re- they both conclude that any attempt to elucidate the
place the construal of correspondence rules as explicit notion of partial interpretation is problematic and that
definitions, by reduction sentences that partially de- partial interpretation of VT terms cannot be adequately
termine the observational content of theoretical terms. explicated. For example, Putnam gives the following
A reduction sentence defined the dispositional property plausible explications for partial interpretation:
tearable as follows: $\forall x\,\forall t\,(P(x,t) \to (Q(x,t) \leftrightarrow T(x)))$.
1. To partially interpret VT terms is to specify a class
That is, (Carnap calls such sentences bilateral reduc-
of intended models.
tion sentences [2.23, 24]):
2. To partially interpret a term is to specify a veri-
“If an object x is pulled-apart at time t, then it tears fication–refutation procedure that applies only to
at time t if and only if it is tearable.” a proper subset of the extension of the term.
3. To partially interpret a formal language L is to inter-
Unlike the explicit definition case, if a is a non-
pret only part of the language.
tearable object that is never pulled apart then it is not
implied that $T(a)$ is true. What will be implied, in such In similar spirit, Achinstein gives three other plau-
case, is that $\forall t\,(P(a,t) \to (Q(a,t) \leftrightarrow T(a)))$ is true. sible explications. One of Putnam’s counterexamples
Thus the above shortcoming of explicit definitions is is that (1) above cannot meet its purpose because the
avoided, because a reduction sentence does not com- class of intended models, that is, the semantic struc-
pletely define a disposition term. In fact, this is also tures or interpretations that satisfy TC and which are
the reason why correspondence rules supply only par- so intended by scientists, is not well defined (A critical
tial observational content, since many other reduction assessment of these arguments can be found in [2.1]).
sentences can be used to supply other empirical as- The other function of correspondence rules, that of
pects of the term tearable, for example, being torn specifying empirical procedures for theory application
by excessively strong shaking. Consequently, although to phenomena, also came under criticism. Suppe [2.1,
correspondence rules were initially meant to provide pp. 102–109] argued that the account of correspondence
explicit definitions and cognitive significance to VT rules inherent in the RV is inadequate for understanding
terms, these functions were abandoned and substituted actual science on the following three grounds:
by reduction sentences and partial interpretation (A
1. They are mistakenly viewed as components of the
detailed explication of the changes in the use of cor-
theory rather than as auxiliary hypotheses.
respondence rules through the development of the RV
2. The sorts of connections (e.g., explanatory causal
can be found in [2.1]).
chains) that hold between theories and phenomena
Therefore, in its most defensible version the RV
are inadequately captured.
could be construed to assign the following functions to
3. They oversimplify the ways in which theories are
correspondence rules: First, they specify empirical pro-
applied to phenomena.
cedures for the application of theory to phenomena and
second, as a constitutive part of TC, they supply VT and The first argument is that the RV considers TC as
LT with partial interpretation. Partial interpretation in postulates of the theory. Hence C is assumed to be an
the above sense is all the RV needs since, given its goal integral part of the theory. But, if a new experimental
of distinguishing theoretical from speculative meta- procedure is discovered it would have to be incorpo-
physical terms, it only needs a way to link the VT terms rated into C, and the result would be a new set of rules
to the VO terms. The version of the RV that employs cor- C0 that subsequently leads to a new theory TC0 . But ob-
respondence rules for these two purposes motivated two viously the theory does not undergo any change. When
sorts of criticisms. The first concerns the idea that cor- new experimental procedures are discovered we only
respondence rules provide partial interpretation to VT improve our knowledge of how to apply theory to phe-
terms, and the second concerns the function of corre- nomena. So we must think of correspondence rules as
spondence rules for providing theory application. auxiliary hypotheses distinct from theory.
The thesis of partial interpretation came under at- The second argument is based upon Schaffner’s
tack from Putnam [2.4] and Achinstein [2.3, 22]. The [2.26] observation that there is a way in which theo-
structure of their arguments is similar. They both think ries are applied to phenomena, which is not captured
that partial interpretation is unclear and they attempt by the RV’s account of correspondence rules. This is
to clarify the concept. They do so by suggesting plau- the case when various auxiliary theories (independent
sible explications for partial interpretation. Then they of T) are used to describe a causal sequence, which
show that for each plausible explication that each of obtains between the states described by T and the obser-
them suggests partial interpretation is either an incoher- vation reports. These causal sequences are descriptions
ent notion or inadequate for the needs of the RV. Thus, of the mechanisms involved within physical systems to
cause the measurement apparatus to behave as it does. Thus, they supplement theoretical explanations of the observed behavior of the apparatus by linking the theory to the observation reports via a causal story. For example, such auxiliary hypotheses are used to establish a causal link between the motion of an electron (VT term) and the spectral line (VO term) in a spectrometer photograph. Schaffner's point is that the relation between theory and observation reports is frequently achieved by the use of these auxiliary hypotheses that establish explanations of the behavior of physical systems via causal mechanisms. Without recognizing the use of these auxiliaries the RV may only describe a type of theory application whereby theoretical states are just correlated to observational states. If these kinds of auxiliaries were to be viewed as part of C then it is best that C is dissociated from the core theory and is regarded as a separate set of auxiliary hypotheses required for establishing the relation between theory and experiment, because such auxiliaries are obviously not theory driven, but if they are not to be considered part of C then C does not adequately explain the theory–experiment relation.
Finally, the third argument is based on Suppes' [2.27, 28] analysis of the complications involved in relating theoretical predictions to observation reports. Suppes observes that in order to reach the point where the two can meaningfully be compared, several epistemologically important modifications must take place on the side of the observation report. For example, Suppes claims, on the side of theory we typically have predictions derived from continuous functions, and on the side of an observation report we have a set of discrete data. The two can only be compared after the observation report is modified accordingly. Similarly, the theory's predictions may be based on the assumption that certain idealizing conditions hold, for example, no friction. Assuming that in the actual experiment these conditions did not hold, it would mean that to achieve a reasonable comparison between theory and experiment the observational data will have to be converted into a corresponding set that reflects the result of an ideal experiment. In other words, the actual observational data must be converted into what they would have been had the idealizing conditions obtained. According to Suppes, these sorts of conversion are obtained by employing appropriate theories of data. So, frequently, there will not be a direct comparison between theory and observation, but a comparison between theory and observation-altered-by-theory-of-data.
By further developing Suppes' analysis, Suppe [2.8] argues that because of its reliance on the observation–theory distinction, the RV employs correspondence rules in such a way as to blend together unrelated aspects of the scientific enterprise. Such aspects are the design of experiments, the interpretation of theories, the various calibration procedures, the employment of results and procedures of related branches of science, etc. All these unrelated aspects are compounded into the correspondence rules. Contrary to the implications of the RV, Suppe claims, in applying a theory to phenomena we do not have any direct link between theoretical terms and observational terms. In a scientific experiment we collect data about the phenomena, and often enough the process of collecting the data involves rather sophisticated bodies of theory. Experimental design and control, instrumentation, and reliability checks are necessary for the collection of data. Moreover, sometimes generally accepted laws or theories are also employed in collecting these data. All these features of experimentation and data collection are then employed in ways as to structure the data into forms (which Suppe calls, hard data) that allow meaningful comparison to theoretical predictions. In fact, theory application according to Suppe involves contrasting theoretical predictions to hard data, and not to something directly observed [2.8, p. 11]:
"Accordingly, the correspondence rules for a theory should not correlate direct-observation statements with theoretical statements, but rather should correlate hard data with theoretical statements."
In a nutshell, although both Suppes' and Suppe's arguments do not establish with clarity how the theory–experiment relation is achieved they do make the following point: Actual scientific practice, and in particular theory–application, is far more complex than the description given by the RV's account of correspondence rules.
2.1.4 The Cosmetic Role of Models According to the RV
The objection that the RV obscures several epistemologically important features of scientific theories is implicitly present in all versions of the SV of theories. Suppe, however, brings this out explicitly in the form of a criticism (Suppe [2.1, 29, 30]). To clarify the sort of criticism presented by Suppe, we need to make use of some elements of the alternative picture of scientific theories given by the SV, which we shall explore in detail in Sect. 2.2.
The reasoning behind Suppe's argument is the following. Science, he claims, has managed so far to go about its business without involving the observation–theory distinction and all the complexities that it gives rise to. Since, he suggests, the distinction is not required by science, it is important to ask not only whether an
analysis of scientific theories that employs the distinction is adequate or not, that is, the issue on which (as we have seen so far) many of the criticisms of the RV have focused, but whether or not the observation–theory distinction which leads to the notion of correspondence rules subsequently steers toward obscuring epistemological aspects of scientific theorizing.
The sciences, he argues, do not deal with all the detailed features of phenomena and not with phenomena in all their complexity. Rather they isolate a certain number of physical parameters by abstraction and idealization and use these parameters to characterize physical systems (Suppe's terminology is idiosyncratic, he uses the term physical system to refer to the abstract entity that an idealized model of the theory represents and not to the actual target physical system), which are highly abstract and idealized replicas of phenomena. A classical mechanical description of the earth–sun system of our solar system, would not deal with the actual system, but with a physical system in which some relevant parameters are abstracted (e.g., mass, displacement, velocity) from the complex features of the actual system. And in which some other parameters are ignored, for example, the intensity of illumination by the sun, the presence of electromagnetic fields, the presence of organic life. In addition, these abstracted parameters are not used in their full complexity to characterize the physical system. Indeed, the description would idealize the physical system by ignoring certain factors or features of the actual system that may plausibly be causally relevant to the actual system. For instance, it may assume that the planets are point masses, or that their gravitational fields are uniform, or that there are no disturbances to the system by external factors and that the system is in a vacuum. What scientific theories do is attempt to characterize the behavior of such physical systems not the behavior of directly observable phenomena.
Although this is admittedly a rough sketch of Suppe's view, it is not hard to see that the aim of the argument is to lead to the conclusion that the directly observable phenomena are connected to a scientific theory via the physical system. That is to say, (if we put together this idea with the one presented at the end of Sect. 2.1.3 above) the connection between the theory and the phenomena, according to Suppe, requires an analysis of theories and of theory–application that involves a two-stage move. The first move involves the connection between raw phenomena and the hard data about the particular target system in question. The second move involves the connection between the physical system that represents the hard data and the theoretical postulates of the theory. According to Suppe's understanding of the theory–experiment relation, the physical system plays the intermediate role between phenomena and theory and this role, which is operative in theory–application, is what needs to be illuminated. The RV implies that the correspondence rules "[...] amalgamate together the two sorts of moves [...] so as to eliminate the physical system" [2.29, p. 16], thus obscuring this important epistemological feature of scientific theorizing.
So, according to Suppe, correspondence rules must give way to this two-stage move, if we are to identify and elucidate the epistemic features of physical systems. Suppe's suggestion is that the only way to accommodate physical systems into our understanding of how theories relate to phenomena is to give models of the theory their representational status. The representational means of the RV are linguistic entities, for example, sentences. Models, within the RV, are denied any representational function. They are conceived exclusively as interpretative devices of the formal calculus, that is, as structures that satisfy subsets of sentences of the theory. This reduces models to metamathematical entities that are employed in order to make intelligible the abstract calculus, which amounts to treating them as more or less cosmetic aspects of science. But this understanding of the role of models leads to the incapacity of the RV to elucidate the epistemic features of physical systems, and thus obscures – what Suppe considers to be – epistemologically important features of scientific theorizing.
2.1.5 Hempel's Provisos Argument
In one of his last writings, Hempel [2.31] raises a problem that suggests a flaw in interpreting the link between empirical theories and experimental reports as mere deduction. Assuming that a theory is a formal axiomatic system consisting of T and C, as we did so far, consider Hempel's example. If we try to apply the theory of magnetism for a simple case we are faced with the following inferential situation. From the observational sentence "b is a metal bar to which iron filings are clinging" ($S_O^1$), by means of a suitable correspondence rule we infer the theoretical sentence "b is a magnet" ($S_T^1$). Then by using the theoretical postulates in T, we infer "if b is broken into two bars, then both are magnets and their poles will attract or repel each other" ($S_T^2$). Finally using further correspondence rules we derive the observational sentence "if b is broken into two shorter bars and these are suspended, by long thin threads, close to each other at the same distance from the ground, they will orient themselves so as to fall into a straight line" ($S_O^2$) ([2.31, p. 20]). If the inferential structure is assumed to be deductive then the above structure can be read as follows: $S_O^1$ in
combination with the theory deductively implies $S_O^2$. Hempel concludes that this deductivist construal faces a difficulty, which he calls the problem of provisos.
To clarify the problem of provisos, we must look into the third inferential step from $S_T^2$ to $S_O^2$. What is necessary here is for the theory of magnetism to provide correspondence rules that would turn this step into a deductive inference. The theory however, as Hempel points out, clearly does not do this. In fact, the theory allows for the possibility that the magnets orient themselves in a way other than a straight line, for example, if an external magnetic field of suitable strength and direction is present. This leads to recognizing that the third inferential step presupposes the additional assumption that there are no disturbing influences to the system of concern. Hempel uses the term provisos, "[...] to refer to assumptions [of this kind] [...], which are essential, but generally unstated, presuppositions of theoretical inferences" [2.31, p. 23]. Therefore, provisos are presupposed in the application of a theory to phenomena (The problem we saw in Sect. 2.1.3 which Suppes raises, namely that in science theoretical predictions are not confronted with raw observation reports but with observation-altered-by-theory-of-data reports, neighbors this problem but it is not the same. Hempel's problem of provisos concerns whether it is possible to deductively link theory to observational statements no matter how the latter are constructed).
What is the character of provisos? Hempel suggests we may view provisos as assumptions of completeness. For example, in a theoretical inference from a sentence $S_1$ to another $S_2$, a proviso is required that asserts that in a given case "[...] no factors other than those specified in $S_1$ are present that could affect the event described by $S_2$" [2.31, p. 29]. As, for example, is the case in the application of the Newtonian theory to a two-body system, where it is presupposed that their mutual gravitational attraction are the only forces the system is subjected to. It is clear that [2.31, p. 26]:
"[...] a proviso as here understood is not a clause that can be attached to a theory as a whole and vouchsafe its deductive potency by asserting that in all particular situations to which the theory is applied, disturbing factors are absent. Rather, a proviso has to be conceived as a clause that pertains to some particular application of a given theory and asserts that in the case at hand, no effective factors are present other than those explicitly taken into account."
Thus, if a theory is conceived as a deductively closed set of statements and its axioms conceived as empirical universal generalizations, as the RV purports, then to apply theory to phenomena, that is, to deductively link theoretical to observational statements, provisos are required. However, in many theory applications there would be an indefinitely large number of provisos, thus trivializing the concept of scientific laws understood as empirical universal generalizations. In other cases, some provisos would not even be expressible in the language of the theory, thus making the deductive step impossible. Hempel's challenge is that theory–applications presuppose provisos and this does not cohere with the view that theory relates to observation sentences deductively (For an interesting discussion of Hempel's problem of provisos, see [2.32–35]).
2.1.6 Theory Consistency and Meaning Invariance
Feyerabend criticized the logical positivist conception of scientific theories on the ground that it imposes on them a meaning invariance condition and a consistency condition. By the consistency condition he meant that [2.36, p. 164]
"[...] only such theories are [...] admissible in a given domain which either contain the theories already used in this domain, or which are at least consistent with them inside the domain."
By the condition of meaning invariance he meant that [2.36, p. 164]:
"[...] meanings will have to be invariant with respect to scientific progress; that is, all future theories will have to be framed in such a manner that their use in explanations [or reductions] does not affect what is said by the theories, or factual reports to be explained"
Feyerabend's criticisms are not aimed directly at the RV, but rather at two other claims of logical positivism that are intimately connected to the RV, namely the theses of the development of theories by reduction and the covering law model of scientific explanation.
A brief digression, in order to look into the aforementioned theses, would be helpful. The development of theories by reduction involves the reduction of one theory (secondary) into a second more inclusive theory (primary). In such developments, the former theory may employ [2.37, p. 342]
"[...] in its formulations [...] a number of distinctive descriptive predicates that are not included in the basic theoretical terms or in the associated rules of correspondence of the primary [theory] [...]."
That is to say, the VT terms of the secondary theory are not necessarily all included in the theoretical vocabulary of the primary theory. Nagel builds up his
case based on the example of the reduction of thermodynamics to statistical mechanics. There are several requirements that have to be satisfied for theory reduction to take place, two of which are: (1) the VT terms for both theories involved in the reduction must have unambiguously fixed meanings by codified rules of usage or by established procedures appropriate to each discipline, for example, theoretical postulates or correspondence rules. (2) for every VT term in the secondary theory that is absent from the theoretical vocabulary of the primary theory, assumptions must be introduced that postulate suitable relations between these terms and corresponding theoretical terms in the primary theory. (See Nagel [2.37, pp. 345–358]. In fact Nagel presents a larger set of conditions that have to hold in order for reduction to take place [2.37, pp. 336–397], but these are the only two relevant to Feyerabend's arguments).
The covering law model of scientific explanation is, in a nutshell, explanation in terms of a deductively valid argument. The sentence to be explained (explanandum) is a logical consequence of a set of law-premises together with a set of premises consisting of initial conditions or other particular facts involved (explanans). For the special case when the explanandum is a scientific theory, T′, the covering law model can be formulated as follows: A theory T explains T′ if and only if T together with initial conditions constitute a deductively valid inference with consequence T′. In other words, if T′ is derivable from T together with statements of particular facts involved then T′ is explained by T. It seems that reduction and explanation of theories go hand in hand, that is, if T′ is reduced to T, then T explains T′ and conversely.
Feyerabend points out that Nagel's two assumptions – (1) and (2) above – for theory reduction respectively impose a condition of meaning invariance and a consistency condition to scientific progress. The thesis of development of theories by reduction condemns science to restrict itself to theories that are mutually consistent. But the consistency condition requires that terms in the admissible theories for a domain must be used with the same meanings. Similarly, it can be shown that the covering law model of explanation also imposes these two conditions. In fact, the consistency condition follows from the requirement that the explanandum must be a logical consequence of the explanans, and since the meanings of the terms and statements in a logically valid argument must remain constant, an obvious demand for explanation – imposed by the covering law model – is that meanings must be invariant. Feyerabend objects to the meaning invariance and the consistency conditions and argues his case inductively by drawing from historical examples of theory change. For example, the concept of mass does not have the same meaning in relativity theory as it does in classical mechanics. Relativistic mass is a relational concept between an object and its velocity, whereas in classical mechanics mass is a monadic property of an object. Similarly, Galileo's law asserts that acceleration due to gravity is constant, but if Newton's law of gravitation is applied to the surface of the earth it yields a variable acceleration due to gravity. Hence, Galileo's law cannot be derived from Newton's law. By such examples, he attempts to undermine Nagel's assumptions (1) and (2) above and establish that neither meaning invariance nor the related notion of theory consistency characterize actual science and scientific progress (see Feyerabend [2.36, 38–40]. Numerous authors have criticized Feyerabend's views. For instance, objections to his views have been raised based on his idiosyncratic analysis of meaning, on which his arguments rely. His views are hence not presented here as conclusive criticisms of the RV; but only to highlight that they cast doubt on the adequacy of the theses of theory development by reduction and the covering law model of explanation).
2.1.7 General Remark on the Received View
The RV is intended as an explicative and not a descriptive view of scientific theories. We have seen that even as such it is vulnerable to a great deal of criticism. One way or another, all these criticisms rely on one weakness of the RV: Its inability to clearly spell out the nature of theoretical terms (and how they acquire their meaning) and its inability to specify how sentences consisting of such terms relate to experimental reports. This is a weakness that has been understood by the RV's critics to stem from the former's focus on syntax. By shifting attention away from the representational function of models and attempting to characterize theory structure in syntactic terms, the RV makes itself vulnerable to such objections. Despite all of the above criticisms pointing to the difficulty in explicating how theoretical terms relate to observation, I do not think that any one of them is conclusive in the ultimate sense of rebutting the RV. Nevertheless, the subsequent result was that under the weight of all of these criticisms together the RV eventually made room for its successor.
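The sense in which the bilateral reduction sentence for tearable, discussed earlier in Sect. 2.1, only partially determines T can also be checked mechanically. The sketch below is an illustration under stated assumptions and not Carnap's own formalism: the history of an object a is represented as a finite list of (pulled, tears) truth-value pairs, one per time t, and the hypothetical function admissible_values returns every truth value of T(a) that is compatible with the sentence "for all t, P(a,t) implies (Q(a,t) if and only if T(a))".

```python
def reduction_sentence_holds(history, tearable):
    """Check 'for all t: P(a,t) -> (Q(a,t) <-> T(a))' over a finite history.

    history  -- list of (pulled, tears) booleans, one pair per time t
    tearable -- candidate truth value for T(a)
    """
    return all((not pulled) or (tears == tearable) for pulled, tears in history)


def admissible_values(history):
    """Truth values of T(a) that the reduction sentence leaves open."""
    return {value for value in (True, False)
            if reduction_sentence_holds(history, value)}


# Three illustrative histories (hypothetical data).
never_pulled = [(False, False)] * 3                # a is never pulled apart
pulled_and_tore = [(False, False), (True, True)]   # pulled once, and it tore
pulled_and_held = [(True, False)]                  # pulled once, did not tear

if __name__ == "__main__":
    for name, history in [("never pulled", never_pulled),
                          ("pulled and tore", pulled_and_tore),
                          ("pulled and held", pulled_and_held)]:
        print(name, sorted(admissible_values(history)))
```

For the object that is never pulled apart both truth values remain admissible, which is the formal counterpart of the claim that the reduction sentence does not imply that T(a) is true in that case; once a test condition is realized, exactly one value survives.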
of theory is identified with, or presented as, classes of models. A logical consequence of identifying theory structure with classes of models is that models and modeling are turned into crucial components of scientific theorizing. Indeed, this has been one of the major contributions of the SV, since it unquestionably assisted in putting models and modeling at the forefront of philosophical attention. However, identifying theory structure with classes of models is not a logical consequence of the thesis that models (and modeling) are important components of scientific theorizing. Some philosophers who came to this conclusion have since defended the view that although models are crucial to scientific theorizing, the relation between theory and models is much more complex than that of set-theoretical inclusion. I shall proceed in this section by articulating the major features of the SV; in the process I shall try to clarify the notion of model inherent in the view and also explain – what I consider to be – the main difference among its proponents, and finally I will briefly discuss the criticisms against it, which, nevertheless, do not undermine the importance of models in science.
Patrick Suppes was the first to attempt a model-theoretic account of theory structure. He was one of the major denouncers of the attempts by the logical positivists to characterize theories as first-order calculi supplemented by a set of correspondence rules. (See [2.27, 28, 41–43]; much of the work developed in these papers is included in [2.44]). His objections to the RV led him on the one hand to suggest that in scientific practice the theory–experiment relation is more sophisticated than what is implicit in the RV and that theories are not confronted with raw experimental data (as we have seen in Sect. 2.1) but with, what has since been dubbed, models of data. On the other hand, he proposed that theories be construed as collections of models. The models are possible realizations (in the Tarskian sense) that satisfy sets of statements of theory, and these models, according to Suppes, are entities of the appropriate set-theoretical structure. Both of these insights have been operative in shaping the SV.
Suppes urged against standard formalizations of scientific theories. First, no substantive example of a scientific theory is worked out in a formal calculus, and second the [2.28, p. 57]
way by which to overcome the shortcomings of standard formalization. As mentioned by Gelfert, Chap. 1, Suppes' example of a set-theoretical axiomatization is classical particle mechanics (CPM). Three axioms of kinematics and four axioms of dynamics (explicitly stated in Chap. 1 of this volume: The Ontology of Models) are articulated by the use of predicates that are defined in terms of set theoretical notions. The structure $\mathfrak{P} = \langle P, T, s, m, f, g \rangle$ can then be understood to be a model of CPM if and only if it satisfies those axioms [2.41, p. 294]. Such a structure is what logicians would label a (semantic) model of the theory, or more accurately a class of models. In general, the model–theoretic notion of a structure, S, is that of an entity consisting of a nonempty set of individuals, D, and a set of relations defined upon the former, R, that is, $S = \langle D, R \rangle$. The set D specifies the domain of the structure and the set R specifies the relations that hold between the individuals in D. (Note that as far as the notion of a structure is concerned, it only matters how many individuals there are and not what they are, and it only matters that the relations in R hold between such and such individuals of D and not what the relations are. For more on this point and a detailed analysis of the notion of structure see Frigg and Nguyen, Chap. 3).
Models of data, according to Suppes, are possible realizations of the experimental data. It is to models of data that models of the theory are contrasted. The RV would have it that the theoretical predictions have a direct analogue in the observation statements. This view however, is, according to Suppes, a distorting simplification. As we have seen in Sect. 2.1.3, Suppes defends the claim that by the use of theories of experimental design and other auxiliary theories, the raw data are regimented into a structural form that bears a relation to the models of the theory. To structure the data, as we saw earlier, various influencing factors that the theory does not account for, but are known to influence the experimental data, must be accommodated by an appropriate conversion of the data into canonical form. This regimentation results in a finished product that Suppes dubbed models of data, which are structures that could reasonably be contrasted to the models of the theory. Suppes' picture of science as an enterprise of theory construction and empirical testing of theories involves establishing a hierarchy of models,
roughly consisting of the general categories of models of the theory and models of the data. Furthermore, since the theory–experiment relation is construed as no more than a comparison (i.e., a mapping) of mathematical structures, he invokes the mathematical notion of isomorphism of structure to account for the link between theory and experiment. (An isomorphism between structures U and V exists, if there is a one-to-one function that maps the elements of U onto the elements of V and preserves the relations defined on them). Hence, Suppes can be read as urging the thesis that defining the models of the theory and checking for isomorphism with models of data, is a rational reconstruction that does more justice to actual science than the RV does.
The backbone of Suppes' account is the sharp distinction between models of theory and models of data. In his view, the traditional syntactic account of the relation between theory and evidence, which could be captured by the schema: $(T \,\&\, A) \to E$ (where, T stands for theory, A for auxiliaries, E for empirical evidence), is replaced by theses (1), (2), and (3) below:
1. $M_T \subseteq T_S$, where $M_T$ stands for model of the theory, $T_S$ for the theory structure, and $\subseteq$ for the relation of inclusion
2. $(A \,\&\, E \,\&\, D) \mapsto M_D$, where $M_D$ stands for model of data, A for auxiliary theories, E for theories of experimental design etc., D for raw empirical data, and $\mapsto$ for "... used in the construction of ..."
3. $M_T \approx M_D$, where $\approx$ stands for mapping of the elements and relations of one structure onto the other.
$M_T \subseteq T_S$ expresses Suppes' view that by defining a theory structure a class of models is laid down for the representation of physical systems. $(A \,\&\, E \,\&\, D) \mapsto M_D$ is meant to show how Suppes distances himself from past conceptions of the theory–experiment relation, by claiming that theories are not directly confronted with raw experimental data (collected from the target physical systems) but rather that the latter are used, together
the SV toward the end of this section. For now, let me turn our attention to some putative differences between the various proponents of the SV.
Despite agreeing about focusing on the mathematical structure of theories for giving a unitary account of models, it is not hard to notice in the relevant literature that different proponents of the SV have spelled out the details of thesis (1) in different ways. This is because different proponents of the SV have chosen different mathematical entities with which to characterize theory structure. As we saw above, Suppes chooses set theoretical predicates, a choice that seems to be shared by da Costa and French [2.45, 46]. Van Fraassen [2.47] on the other hand prefers state-spaces, and Suppe [2.30] uses relational systems.
Let us, by way of example, briefly look into van Fraassen's state-space approach. The objects of concern of scientific theories are physical systems. Typically, mathematical models represent physical systems that can generally be conceived as admitting of a certain set of states. State-spaces are the mathematical spaces the elements of which can be used to represent the states of physical systems. It is a generic notion that refers to what, for example, physicists would label as phase space in classical mechanics or Hilbert space in quantum mechanics. A simple example of a state-space would be that of an n-particle system. In CPM, the state of each particle at a given time is specified by its position $q = (q_x, q_y, q_z)$ and momentum $p = (p_x, p_y, p_z)$ vectors. Hence the state-space of an n-particle system would be a Euclidean 6n-dimensional space, whose points are the 6n-tuples of real numbers
$\langle q_{1x}, q_{1y}, q_{1z}, \ldots, q_{nx}, q_{ny}, q_{nz}, p_{1x}, p_{1y}, p_{1z}, \ldots, p_{nx}, p_{ny}, p_{nz} \rangle$.
More generally, a state-space is the collection of mathematical entities such as, vectors, functions, or numbers, which is used to specify the set of possible states for
a particular physical system. A model, in van Fraassen’s
with much of the rest of the scientific inventory, in the
characterization of theory structure, is a particular se-
construction of data structures, MD . These data struc-
quence of states of the state-space over time, that is, the
tures are then contrasted to a theoretical model, and the
state of the modeled physical system evolves over time
theory–experiment relation consists in an isomorphism,
according to the particular sequence of states admitted
or more generally in a mapping of a data onto a theo-
by the model. State-spaces unite clusters of models of
retical structure, that is, MT MD . The proponents of
a theory, and they can be used to single out the class of
the SV would, I believe, concur to the above three gen-
intended models just as set-theoretical predicates would
eral theses. Furthermore, they would concur with two of
in Suppes’ approach. The presentation of a scientific
the theses’ corollaries: that scientific representation of
theory, according to van Fraassen, consists of a descrip-
phenomena can be explicated exclusively by mapping
tion of a class of state-space types. As van Fraassen
of structures, and that all scientific models constructed
explains [2.47, p. 44]:
within the framework of a particular scientific theory
are united under a common mathematical or relational “[w]henever certain parameters are left unspecified
structure. We shall look into these two contentions of in the description of a structure, it would be more
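To make the state-space construction for an n-particle system more concrete, here is a minimal sketch in Python. It is an illustration only and not part of van Fraassen's account: the function names `make_state` and `free_particle_model` are hypothetical, and the dynamics (free particles with constant momenta) is an assumption chosen purely to produce a simple sequence of states. A state is a 6n-tuple ordered as in the text (all position components, then all momentum components), and a model in the state-space sense is represented as a list of such states over time.

```python
from typing import List, Sequence, Tuple

Vector3 = Tuple[float, float, float]
State = Tuple[float, ...]  # a point in the 6n-dimensional state-space


def make_state(positions: Sequence[Vector3], momenta: Sequence[Vector3]) -> State:
    """Pack n position and n momentum vectors into one 6n-tuple.

    Ordering follows the text: all position components first,
    then all momentum components.
    """
    if len(positions) != len(momenta):
        raise ValueError("need one momentum vector per position vector")
    coords: List[float] = []
    for q in positions:
        coords.extend(q)
    for p in momenta:
        coords.extend(p)
    return tuple(coords)


def free_particle_model(initial: State, masses: Sequence[float],
                        dt: float, steps: int) -> List[State]:
    """Return a sequence of states over time.

    Assumption for illustration only: the particles are free, so momenta
    stay constant and each position advances by (p / m) * dt per step.
    """
    n = len(masses)
    if len(initial) != 6 * n:
        raise ValueError("state must have 6 coordinates per particle")
    trajectory = [initial]
    state = list(initial)
    for _ in range(steps):
        for i in range(n):
            for axis in range(3):
                q_index = 3 * i + axis          # position slot of particle i
                p_index = 3 * n + 3 * i + axis  # matching momentum slot
                state[q_index] += state[p_index] / masses[i] * dt
        trajectory.append(tuple(state))
    return trajectory


if __name__ == "__main__":
    s0 = make_state(positions=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
                    momenta=[(2.0, 0.0, 0.0), (0.0, 4.0, 0.0)])
    model = free_particle_model(s0, masses=[1.0, 2.0], dt=0.5, steps=2)
    print(len(s0))    # 12 coordinates for n = 2
    print(model[-1])  # final state of the sequence
```

On this reading, different initial states or different dynamics yield different sequences, while the 6n-dimensional space itself is what the sequences have in common.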
The different choices of different authors on how theory structure is characterized, however, belong to the realm of personal preference and do not introduce any significant differences on the substance of thesis (1) of the SV, which is that all models of the theory are united under an all-inclusive theory structure. So, irrespective of the particular means used to characterize theory structure, the SV construes models as structures (or structure types) and theories as collections of such structures. Neither have disagreements been voiced regarding thesis (2). On the contrary, there seems to be a consensus among adherents of the SV that models of theory are confronted with models of data and not the direct result of an experimental setup. (Not much work has been done to convincingly analyze particular scientific examples and to show the details of the use of models of data in science; rather, adherents of the SV repeatedly use the notion with reference to something very general with unclear applications in actual scientific contexts.)

2.2.1 On the Notion of Model in the SV

An obvious objection to thesis (1) would be that a standard formalization could be used to express the theory and subsequently define the class of semantic models metamathematically, as the class of structures that satisfy the sentences of the theory, despite Suppes’ suggestion that such a procedure would be unnecessarily complex and tedious.

In fact, proponents of the SV have often encouraged this objection. Van Fraassen and Suppe are notable examples, as the following quotations suggest [2.48, p. 326]:

“There are natural interrelations between the two approaches [i.e., the RV and the SV]: An axiomatic theory may be characterized by the class of interpretations which satisfy it, and an interpretation may be characterized by the set of sentences which it satisfies; though in neither case is the characterization unique. These interrelations [. . . ] would make implausible any claim of philosophical superiority for either approach. But the questions asked and methods used are different, and with respect to fruitfulness and insight they may not be on a par with specific contexts or for special purposes.”

for a number of different, and possibly nonequivalent, sets of sentences or linguistic formulations of the theory.”

From such remarks, one is justifiably led to believe that propounding a theory as a class of models directly defined, without recourse to its syntax, only aims at convenience in avoiding the hassle of constructing a standard formalization, and at easier adaptability of our reconstruction to common scientific practices. Epigrammatically, the difference between the SV and the RV would then be methodological and heuristic. Reasons such as this have led some authors to question the logical difference between defining the class of models directly as opposed to metamathematically.

Examples are Friedman and Worrall, who in their separate reviews of van Fraassen [2.47] ask whether the class of models that constitutes the theory, according to the proponents of the SV, is to be identified with an elementary class, that is, a class that contains all the models (structures) that satisfy a first-order theory. They both notice that not only do van Fraassen and other proponents of the SV offer no reason to oppose such a supposition, but they even encourage it (as in the above quotations). But if that is the case [2.49, p. 276]:

“[t]hen the completeness theorem immediately yields the equivalence of van Fraassen’s account and the traditional syntactic account [i.e., that of the RV].”

In other words [2.50, p. 71]:

“So far as logic is concerned, syntax and semantics go hand-in-hand – to every consistent set of first-order sentences there corresponds a nonempty set of models, and to every normal (elementary) set of models there corresponds a consistent set of first-order sentences.”

If we assume (following Friedman and Worrall) that the proponents of the SV are referring to the elementary class of models, then the preceding argument is sound. The SV, in agreement with the logical positivists, retains formal methods as the primary tool for philosophical analysis of science. The only new elements of its own would be the suggestions that first it is more convenient that rather than developing these methods
using proof theory we should instead use formal semantics (model theory), and second we should assign to models (i.e., the semantic interpretations of sets of sentences) a representational capacity.

Van Fraassen, however, resists the construal of the class of models of the SV as an elementary class (see van Fraassen [2.51, pp. 301–303] and his [2.52]). Let me rehearse his argument. The SV claims that to present a theory is to define a class M of models. This is the class of structures the theory makes available for modeling its domain. For most scientific theories, the real number continuum would be included in this class. Now, his argument goes, if we are able to formalize what is meant to be conveyed by M in some appropriate language, then we will be left with a class N of models of the language, that is, the class of models in which the axioms and theorems of the language are satisfied. Our hope is that every structure in M occurs in N. However, the real number continuum is infinite and [2.52, p. 120]:

“[t]here is no elementary class of models of a denumerable first-order language each of which includes the real numbers. As soon as we go from mathematics to metamathematics, we reach a level of formalization where many mathematical distinctions cannot be captured.”

Furthermore, “[t]he Löwenheim–Skolem theorems [. . . ] tell us [. . . ] that N contains many structures not isomorphic to any member of M” [2.51, p. 302]. Van Fraassen relies, here, on the following reasoning: The Löwenheim–Skolem theorem tells us that all satisfiable first-order theories that admit infinite models will have models of all different infinite cardinalities. Now, models of different cardinality are nonisomorphic. Consequently, every theory that makes use of the real number continuum will have models that are not isomorphic to the intended models (i.e., nonstandard interpretations) but which satisfy the axioms of the theory. So van Fraassen is suggesting that M is the intended class of models, and since the limitative meta-theorems tell us that it cannot be uniquely determined by any set of first-order sentences, we can only define it directly. Here is his concluding remark [2.51, p. 302]:

“The set N contains [. . . ] [an] image M* of M, namely, the set of those members of N which consist of structures in M accompanied by interpretations therein of the syntax. But, moreover, [. . . ] M* is not an elementary class.”

Evidently, van Fraassen’s argument aims to establish that the directly defined class of models is not an elementary class. It is hard, however, to see that defining the models of the theory directly without resort to formal syntax yields only the intended models of theory (i.e., excludes all nonstandard models), despite the possibility that one could see the prospect of the SV being heuristically superior to the RV. (Of course, we must not forget that this superiority would not necessarily be the result of thesis (1) of the SV, but it could be the result of its consequence of putting particular emphasis on the significance of scientific models that, as noted earlier, does not logically entail thesis (1).)

Let us, for the sake of argument, ignore the Friedman–Worrall argument. Now, according to the SV, models of theory have a dual role. On the one hand, they are devices by which phenomena are represented, and on the other, they are structures that would satisfy a formal calculus were the theory formalized. The SV requires this dual role: first, because the representational role of models is the way by which the SV accounts for scientific representation without the use of language; and second, because the role of interpreting a set of axioms ensures that a unitary account of models is given. Now, Thompson-Jones [2.53] notices that the notion of model implicit in the SV is either that of an interpretation of a set of sentences or a mathematical structure (the disjunction is of course inclusive). He analyzes the two possible notions and argues that the SV becomes more tenable if the notion of model is only understood as that of a mathematical structure that functions as a representation device. If that were the case, then the adherents of the SV could possibly claim that defining the class of structures directly indeed results in something distinct from the metamathematical models of a formal syntax. Thompson-Jones’ suggestion, however, would give rise to new objections. Here is one. It would give rise to the following question: How could a theory be identified with a class of models (i.e., mathematical structures united under an all-inclusive theory structure) if the members of such a class do not attain membership in the class because they are interpretations of the same set of theory axioms? In other words, the proponents of the SV would have to explain what it is that unites the mathematical models other than the satisfaction relation they have to the theoretical axioms. To my knowledge, proponents of the SV have not offered an answer to this question. If Thompson-Jones’ suggestion did indeed offer a plausible way to overcome the Friedman–Worrall argument, then the SV would have to abandon the quest of giving a unitary account of models. Given the dual aim of the SV, namely to give a unitary account of models and to account for scientific representation by means of structural relations, it seems that the legitimate notion of model integral to this view must have these two hard-to-reconcile roles; namely, to function both as an interpretation of sets of sentences and as a representation of phenomena. (Notice that this dual function of models is
an aspect of all versions of the SV, independent of how one chooses to characterize theory structure and of how one chooses to interpret that structure.)

2.2.2 The Difference Between Various Versions of the SV

The main difference among the various versions of the SV relates to two intertwined issues that relate to thesis (3), namely how the theory structure is construed and how the theory–experiment mapping relation is construed. To a first approximation we could divide the different versions of the SV, from the perspective of these two issues, into two sorts: those in which particular emphasis is given to the presence of abstraction and idealization in scientific theorizing for explicating the theory–experiment (or model–experiment) relation, and those in which the significance of this nature of scientific theorizing is underrated.

Idealization and Abstraction Underrated

Van Fraassen (Suppes most probably could be placed in this group too), for example, seems to be a clear case of this sort. Here is how he encapsulates his conception of scientific theories and of how theory relates to experiment [2.47, p. 64]:

“To present a theory is to specify a family of structures, its models; and secondly, to specify certain parts of those models (the empirical substructures) as candidates for the direct representation of observable phenomena. The structures which can be described in experimental and measurement reports we can call appearances: The theory is empirically adequate if it has some model such that all appearances are isomorphic to empirical substructures of that model.”

Appearances (which is van Fraassen’s term for models of data) are relational structures of measurements of observable aspects of the target physical system, for example, relative distances and velocities. For example, in the Newtonian description of the solar system, as van Fraassen points out, the relative motions of the planets “[. . . ] form relational structures defined by measuring relative distances, time intervals, and angles of separation” [2.47, p. 45]. Within the theoretical model for this physical system, “[. . . ] we can define structures that are meant to be exact reflections of those appearances [. . . ]” [2.47, p. 45]. Van Fraassen calls these empirical substructures. When a theory structure is defined, each of its models, which are candidates for the representation of phenomena, includes empirical substructures. So within representational models we could specify a division between observable/nonobservable features (albeit this division is not drawn in linguistic terms), and the empirical substructures of such models are assumed to be isomorphic to the observable aspects of the physical system. In other words, the theory structure is interpreted as having distinctly divided observable and nonobservable features, and the theory–experiment relation is interpreted as being an isomorphic relation between the data model and the observable parts of the theoretical model. Now, the state-space is a class of models; it thus includes, for CPM, many models in which the world is a Newtonian mechanical system. In fact, it seems that the state-space includes (unites) all logically possible models, as the following dictum suggests ([2.52, p. 111], [2.54, p. 226]):

“In one such model, nothing except the solar system exists at all; in another the fixed stars also exist, and in a third, the solar system exists and dolphins are its only rational inhabitants.”

According to van Fraassen, the theory is empirically adequate if we can find a model of the theory in which we can specify empirical substructures that are isomorphic to the data model. The particular view of scientific representation that resides within this idea is this: A model represents its target if and only if it is isomorphic to a data model constructed from measurements of the target. Not much else seems to matter for a representation relation to hold but the isomorphism condition. Many would argue, however, that such a condition for the representation relation is too strong to explicate how actual scientific models relate to experimental results, and would object to this view on the ground that for isomorphism to occur it would require that target physical systems occur under highly idealized conditions or in isolated circumstances. (Admittedly, it would not be such a strong requirement for models that would only describe observable aspects of the world. In such cases isomorphism could be achieved, but at the expense of the model’s epistemic significance. I do not think, for instance, that such models would be of much value to a science like physics as, more often than not, they would be useless in predicting the future behavior of their targets.)

Idealization and Abstraction Highlighted

In the second camp of the SV, we encounter several varieties. One of these is Suppe [2.30], who interprets theory structure and the theory–experiment relation as follows. Theories characterize particular classes of target systems. However, target systems are not characterized in their full complexity, as already mentioned in Sect. 2.1.4. Instead, Suppe’s understanding is that certain parameters are abstracted and employed in this characterization. In the case of CPM, these are the position and momentum vectors. These two parameters are abstracted from all other characteristics that target systems may possess. Furthermore, once the factors, which are assumed to influence the class of target systems in the theory’s intended scope, have been abstracted, the characterization of physical systems (as mentioned in Sect. 2.1.4, physical systems in Suppe’s terminology refer to the abstract entities that models of the theory represent and not to the actual target systems) still does not fully account for target systems. Physical systems are not concerned with the actual values of the parameters the particulars possess, for example, actual velocities, but with the values of these parameters under certain conditions that obtain only within the physical system itself. Thus in CPM, where the behavior of dimensionless point-masses is studied in isolation from outside interactions, physical systems characterize this behavior only by reference to the positions and momenta of the point-masses at given times.

An example can serve to demonstrate Suppe’s idea in a bit more detail. The linear harmonic oscillator, that is, a mathematical instrument, is expressed by the following equation of motion $\ddot{x} + (k/m)x = 0$, which is the result of applying Newton’s second law to a linear restoring force. The mathematical model is interpreted (and thus characterizes a physical system) as follows: Periodic oscillations are assumed to take place with respect to time, x is the displacement of an oscillating mass-point, and k and m are constant coefficients that may be replaced by others. When the mathematical parameters in the above equation are linked to features of a specific object, the equation can be used to model for instance the torsion pendulum, that is, an elastic rod connected to a disk that oscillates about an equilibrium position. This sort of linking of mathematical terms to features of objects could be understood to be a manifestation of what Giere calls identification. Giere introduces a useful distinction between interpretation and identification [2.55, p. 75]:

“[. . . ] [Interpretation] is the linking of the mathematical symbols with general terms, or concepts, such as position[. . . ] [Identification] is the linking of a mathematical symbol with some feature of a specific object, such as the position of the moon.”

In the torsion pendulum model, x is identified with the angle of twist, k with the torsion constant, and m with the moment of inertia. By linking the mathematical symbols of a model to features of a target system we can reasonably assume, according to Suppe, that the model could be associated with an actual system of the world; the model characterizes, as Suppe would say in his own jargon, “a causally possible physical system.” However, even when a certain mathematical product of theory is identified with a causally possible physical system, we still know that typically the situation described by the physical system does not obtain. The actual torsion pendulum apparatus is subject to a number of different factors (or may have a number of different characteristics) that may or may not influence the process of oscillation. Some influencing factors are the amplitude of the angle of oscillation, the mass distribution of the rod and disc, the nonuniformity of the gravitational field of the earth, the buoyancy of the rod and disc, the resistance of the air, and the stirring up of the air due to the oscillations. In modeling the torsion pendulum by means of the linear harmonic oscillator, the physical system is abstracted from factors assumed to influence the oscillations in the same manner as from those assumed not to. Therefore, the replicating relation between the physical system, P, and the target system, S, which Suppe urges, cannot be understood as one of identity or isomorphism. Suppe is explicit about this [2.30, p. 94]:

“The attributes in P determine a sequence of states over time and thus indicate a possible behavior of S [. . . ] Accordingly, P is a kind of replica of S; however, it need not replicate S in any straight-forward manner. For the state of P at t does not indicate what attributes the particulars in S possess at t; rather, it indicates what attributes they would have at t were the abstracted parameters the only ones influencing the behavior of S and were certain idealized conditions met. In order to see how P replicates S we need to investigate these abstractive and idealizing conditions holding between them.”

In summary, the replicating relation is counterfactual: If the conditions assumed to hold for the description of the physical system were to hold for the target system, then the target system would behave in the way described by the physical system. The behavior of actual target systems, however, may be subject to other unselected parameters or other conditions, for which the theory does not account.

The divergence of Suppe’s view from that of van Fraassen is one based primarily on the representation relation of theory to phenomena. Suppe understands the theory structure as being a highly abstract and idealized representation of the complexities of the real world. Van Fraassen disregards this because he is concerned with the observable aspects of theories and assumes that these can, to a high degree of accuracy, be captured by experiments. Thus van Fraassen regards theories as containing empirical substructures that stand in isomorphic relations to the observable aspects of the world. Suppe’s
understanding of theory structure, however, points to a significant drawback present in van Fraassen’s view: How can isomorphism obtain between a data model and an empirical substructure of the model, given that the model is abstract and idealized? Suppe’s difference with van Fraassen’s view of the representation relation and of the epistemic inferences that can be drawn from it is this: if indeed it is the case that isomorphism obtains between a data model and an empirical substructure, then it is so for either of two reasons: (1) the experiment is highly idealized, or (2) the data model is converted to what the measurements would have been if the influences that are not accounted for by the theory did not have any effect on the experimental setup. This is a significantly different claim from what van Fraassen would urge, to wit that the world or some part of it is isomorphic to the model. According to Suppe’s understanding of theory structure, no part of the world is or can be isomorphic to a model of the theory, because abstraction and idealization are involved in scientific theorizing.

Giere [2.55] is another example of a version of the SV that places the emphasis on abstraction and idealization. Following Suppes and van Fraassen, Giere understands theories as classes of models. He does not have any special preference about the mathematical entities by which theory structure is characterized, but he is interested in looking at the characteristics of actual science and how these could be captured by the SV. This leads him to a claim similar to Suppe’s. He claims that although he does not see any logical reason why a real target system could not be isomorphic to a model, nevertheless for the examples of models found in mechanics texts, typically, no claim of isomorphism is made; indeed “[. . . ] the texts often explicitly note respects in which the model fails to be isomorphic to the real system” [2.55, p. 80]. He attributes this to the abstract and idealized nature of models of the theory. His solution is to substitute the strict criterion of isomorphism, as a way by which to explicate the theory–experiment relation, with that of similarity in relevant respects and degrees between the model and its target.

Finally, there is another example of a version of the SV that also gives attention to idealization and abstraction, namely the version advocated by da Costa and French in [2.45, 46, 56]. They do this indirectly by interpreting theories as partial structures, that is, structures consisting of a domain of individuals and a set of partial relations defined on the domain, where a partial relation is one that is not defined for all the n-tuples of individuals of the domain for which it presumably holds. If models of theory are interpreted in this manner and if it is assumed that models of data are also partial structures, then the theory–experiment relation is explicated by da Costa and French [2.46] as a partial isomorphism. A partial isomorphism between two partial structures U and V exists when a partial substructure of U is isomorphic to a partial substructure of V. In other words, partial isomorphism exists when some elements of the set of relations in U are mapped onto elements of the set of relations in V. If a model of theory is partially isomorphic to a data model then, da Costa and French claim, the model is partially true. The notion of partial truth is meant to convey a pragmatic notion of truth, which plausibly could avoid the problems of correspondence or complete truth, and capture the commonplace idea that theories (or models) are incomplete or imperfect or abstract or idealized descriptions of target systems.

In conclusion, if we could speak of different versions of the SV and not just different formulations of the same idea, if, in other words, the proposed versions of the semantic conception of theories can be differentiated from one another in any significant way, it is on the basis of how thesis (3) is conceived: There are those that understand the representation relation, $M_T \approx M_D$, as a strict isomorphic relation, and those that construe it more liberally, for example, as a similarity relation. In particular, van Fraassen prefers an isomorphic relation between theory and experiment, whereas Suppe and others understand theories as being abstract and idealized representations of phenomena. It would seem therefore that particular criticisms would not necessarily target both versions. This has not been the case, however, as we shall examine in the next two subsections. Critics of the SV have either targeted theses (1) and (2) and the unitary account of models implicit in the SV, or thesis (3) and the representation relation however the latter is conceived. The arguments against the unitary account of scientific models, which obviously aim indiscriminately at all versions of the SV, will be explored in Sect. 2.2.4. The arguments against the nature of the representation relation implied by the SV, which shall be explored in Sect. 2.2.3, if properly adapted affect both versions of the SV.

2.2.3 Scientific Representation Does not Reduce to a Mapping of Structures

Suarez [2.57] presents five arguments against the idea that scientific representation can be explicated by appealing to a structural relation (like isomorphism or similarity) that may hold between the representational device and the represented target. (Suarez [2.57] also develops his arguments for other suggested interpretations of thesis (3), such as partial isomorphism.) These arguments, which are summarized below, imply that
the representational capacity of scientific models can- also tells us why X fails to produce representational
not derive from having a structural relation with its devices that are isomorphic or similar to their targets.
target. Suarez’s first argument is that in science many A different argument but with the same conclusion is
disparate things act as representational devices, for ex- given by Portides [2.59], who argues that isomorphism,
ample, a mathematical equation, or a Feynman diagram, or other forms of structural mapping, is not necessary
or an architect’s model of a building, or the double helix for representation because it is possible to explicate the
macro-model of the DNA molecule. Neither isomor- representational function of some successful quantum
phism nor similarity can be applied to such disparate mechanical models, which are not isomorphic to their
representational devices in order to explicate their rep- targets. Suarez’s final argument is that neither isomor-
resentational function. A similar point is also made by phism nor similarity is sufficient for representation. In
Downes [2.58], who by also exploring some examples other words, even though there may not be a represen-
of scientific models, argues that models in science re- tation relation between A and B, A and B may, however,
late to their target systems in various ways, and that be isomorphic or similar.
attempts to explicate this relation by appeal to isomor- Aiming at the same feature of the SV as Suarez,
phism or similarity does little to serve the purpose of Frigg [2.60] reiterates some of the arguments above
understanding the theory–experiment relation. and gives further reasons to fortify them, but he also
The second argument concerns the logical proper- presents two more arguments that undermine the notion
ties of representation vis-a-vis those of isomorphism of representation as dictated by thesis (3) of the SV. Em-
and similarity. Suarez explains that representation is ployed in his first argument is a particular notion of ab-
nonsymmetric, nonreflexive and nontransitive. If scien- stractness of concepts advocated by Cartwright [2.61].
tific representation is a type of representation then any A concept is considered abstract in relation to a set
attempt to explicate scientific representation cannot im- of more concrete concepts if for the former to apply
ply different logical features from representation. But it is necessary that one of its concrete instances apply.
appeal to a structural relation does not accomplish this, One of Frigg’s intuitive examples is that the concept of
because “[. . . ] similarity is reflexive and symmetric, traveling is more abstract than the concept of sitting in
and isomorphism is reflexive, symmetric and transitive” a moving train. So according to this sense of abstract-
[2.55, p. 233]. ness the concept of traveling applies whenever one is
His third argument is that any explication of rep- sitting in a moving train and that the abstract concept
resentation must allow for misrepresentation or inac- does not apply if one is not performing some action
curate representation. Misrepresentation, he explains, that belongs to the set of concrete instances of traveling.
occurs either when the target of a representation is mis- Frigg then claims, “[. . . ] that possessing a structure is
taken or when a representation is inaccurate because it abstract in exactly this sense and it therefore does not
is either incomplete or idealized. Neither isomorphism apply without some more concrete concepts applying
nor similarity allows for the first kind of misrepresen- as well” [2.60, p. 55]. He defends this claim with the
tation and isomorphism does not allow for the second following argument. Since to have a structure means to
kind. Although, similarity does account for the second consist of a set of individuals which enter into some
kind of representation, Suarez argues, it does so in a re- relations, then it follows that whenever the concept of
strictive sense. That is, if we assume that an incomplete possessing a structure applies to S the concept of being
representation is given according to theory X then sim- an individual applies to members of a set of S and the
ilarity does account for misrepresentation. However, if concept of being in a relation applies to some parts of
a complete representation were given according to the- that set. The concepts of being an individual and being
ory X (i. e., if we have similarity in all relevant respects in a relation are abstract in the above sense. For exam-
that X dictates) but the predictions of this representation ple, given the proper context, for being an individual
still diverge from measurements of the values of the tar- to apply, occupying a certain space-time region has to
get’s attributes then similarity does not account for this apply. Similarly, given the proper context, for being in
kind of misrepresentation. a relation to apply it must be the case that being greater
The fourth argument is that neither isomorphism than applies. Therefore, both being an individual and
nor similarity is necessary for representation. Our in- being in a relation are abstract. Thus Frigg concludes,
tuitions about the notion of representation allow us to possessing a structure is abstract; hence for it to apply,
accept the representational device derived from the- it must be the case that a concrete description of the tar-
ory X as a representation of its target, even though we get applies. Because, the claim that the representation
may know that isomorphism or similarity does not ob- relation can be construed as an isomorphism (or similar-
tain because, for example, an alternative theory Y not ity) of structures presupposes that the target possesses
only gives us better predictions about the target but a structure, Frigg concludes that such a claim “[. . . ] pre-
44 Part A Theoretical Issues in Models
supposes that there is a more concrete description that quence of most of these arguments is that the unitary
is true of the [target] system” [2.60, p. 56]. This argu- account of models that the SV provides through the-
ment shows that to reduce the representation relation to sis (1) that all models are constitutive parts of theory
a mapping of structures the proponents of the SV need structure, obscures the particular features that represen-
to invoke nonstructural elements into their account of tational scientific models demonstrate.
representation, so pure and simple reduction fails. One such example is Morrison [2.62], who ar-
Frigg’s second argument, as he states, is inductive. gues that models are partially autonomous from the
He examines several examples of systems from differ- theories that may be responsible for instigating their
Part A | 2.2
ent contexts in order to support the claim that a target construction. This partial autonomy is something that
system does not have a unique structure. For a sys- may derive from the way they function but also from
tem to have a structure it must be made of individuals the way they are constructed. She discusses Prandtl’s
and relations, but slicing up the physical systems of hydrodynamic model of the boundary layer in order to
the world into individuals and relations is dependent on mark out that the inability of theory to provide an expla-
how we conceptualize the world. The world itself does nation of the phenomenon of fluid flow did not hinder
not provide us with a unique slicing. “Because differ- scientific modeling. Prandtl constructed the model with
ent conceptualizations may result in different structures little reliance on high-level theory and with a concep-
there is no such thing as the one and only structure of tual apparatus that was partially independent from the
a system” [2.60, p. 57]. One way that Frigg’s argument conceptual resources of theory. This partial indepen-
could be read is this: Thesis (2) of the SV implies that dence in construction, according to Morrison, gives
the measurements of an experiment are structured to rise to functional independence and renders the model
form a data model. But, according to Frigg, this struc- partially autonomous from theory. Furthermore, Morri-
turing is not unique. So the claim of thesis (3), that there son raises another issue (see [2.62], as well as [2.63]);
is, for example, an isomorphism between a theoretical that theories, and hence theoretical models as direct
model and a data model is not epistemically informative conceptual descendants of theory, are highly abstract
since there may be numerous other structures that could and idealized descriptions of phenomena, and therefore
be constructed from the data that are not isomorphic to they represent only the general features of phenom-
the theoretical model. ena and do not explain the specific mechanisms at
work in physical systems. In contrast, actual repre-
2.2.4 A Unitary Account of Models sentational scientific models – that she construes as
Does not Illuminate partially autonomous mediators between theories and
Scientific Modeling Practices phenomena – are constructed in ways that allow them
to function as explanations of the specific mechanisms
The second group of criticisms against the SV consists and thus function as sources of knowledge about corre-
of several heterogeneous arguments stemming from sponding target systems and their constitutive parts. (As
different directions and treating a variety of features she makes clear in Morrison [2.64], to regard a model
and functions of models. Despite this heterogeneity, as partially independent from theory does not mean
they can be grouped together because they all indi- that theory plays an unimportant role in its construc-
rectly undermine the idea that the unitary account of tion). This argument, in which representational capacity
scientific models given by employing a set theoreti- is correlated to the explanatory power of models, is
cal (or other mathematical) characterization of theory meant to achieve two goals. Firstly, to offer a way
structure is adequate for understanding the notion of by which to go beyond the narrow understanding of
representational model and the model–experiment re- scientific representation as a mapping relation of struc-
lation. This challenge to the SV is indirect because the ture, and second, to offer a general way to understand
main purpose of these arguments is to illuminate partic- the representational function of both kinds of models
ular features of actual scientific models. In highlighting that physicists call theory-driven and phenomenologi-
these features, these arguments illustrate that actual rep- cal (In Portides [2.65] a more detailed contrast between
resentational models in science are constructed in ways Morrison’s view of the representation relation and that
that are incompatible with the SV, they function in ways of the SV is offered). Cartwright et al. [2.66] and
that the SV does not adequately account for and they Portides [2.67] have also argued that by focusing ex-
represent in ways that is incompatible with the SV’s clusively on theory-driven models and the mapping
account of representation; furthermore, they indicate relation criterion, the SV obscures the representational
that models in science are complex entities that can- function of phenomenological models and also many
not be thoroughly understood by unitary accounts such aspects of scientific theorizing that are the result of phe-
as set-theoretical inclusion. In other words, a conse- nomenological methods.
It is noteworthy that the unitary account that the SV offers may be applicable to theory-driven models. Whether that is helpful or not is debatable. However, more often than not representation in science is achieved by the use of phenomenological models or phenomenological elements incorporated into theory-driven models. One aspect of Morrison’s argument is that if we are not to dismiss the representational capacity of such models we should give up unitary accounts of models. Cartwright makes a similar point, but she approaches the same problem from another angle.

Cartwright [2.61, 68] claims that theories are highly abstract and thus do not and cannot represent what happens in actual situations. Cartwright’s observation seems similar to versions of the SV such as Suppe’s; however, her approach is much more robust. To claim that theories represent what happens in actual situations, she argues, is to overlook that the concepts used in them – such as force functions and Hamiltonians – are abstract. Such abstract concepts could only apply to the phenomena whenever more concrete descriptions (such as those present in models) can stand in for them, and for this to happen the bridge principles of theory must mediate. Hence the abstract terms of theory apply to actual situations via bridge principles, and this makes bridge principles an operative aspect of theory-application to phenomena. It is only when bridge principles sanction the use of theoretical models that we are led to the construction of a model – with a relatively close relation to theory – that represents the target system. But Cartwright observes that there are only a small number of such theoretical models that can be used successfully to construct representations of physical systems, and this is because there are only a handful of theory bridge principles. In most other cases, where no bridge principles exist that enable the use of a theoretical model, concrete descriptions of phenomena are achieved by constructing phenomenological models. Phenomenological models are constructed with minimal aid from theory, and surely there is no deductive (or structural) relation between them and theory. The relation between the two should be sought in the nature of the abstract–concrete distinction between scientific concepts, according to Cartwright. Models in science, whether constructed phenomenologically or by the use of available bridge principles, encompass descriptions that are in some way independent from theory because they are made up of more concrete conceptual ingredients. A weak reading of this argument is that the SV could be a plausible suggestion for understanding the structure of scientific theories for use in foundational work. But in the context of utilizing the theory to construct representations of phenomena, focusing on the structure of theory does not illuminate much, because it is not sufficient to account for the abstract–concrete distinction that exists between theory and models. A stronger reading of the argument is that the structure of theories is completely irrelevant to how theories represent the world, because they just do not represent it at all. Only models represent pieces of the world, and they are partially independent from theory because they are constituted by concrete concepts that apply only to particular physical systems.

Other essays in the volume by Morgan and Morrison [2.69] discuss different aspects of partial independence of models from theory. Here are two brief examples that aim to show the partial independence of model construction from theory. Suarez [2.70] explains how simplifications and approximations that are introduced into representational models (such as the London brothers’ model of superconductivity) are decided independently of theory and of theoretical requirements. This process gives rise to a model that mediates in the sense that the model itself is the means by which corrections are established that may be incorporated into theory in order to facilitate its applications. But even in cases of models that are strongly linked to theory, such as the MIT-bag model of quark confinement, Hartmann [2.71] argues, many parts of the model are not motivated by theory but by an accompanying story about quarks. From the empirical fact that quarks were not observed, physicists were eventually led to the hypothesis that quarks are confined. But confinement is not something that follows from theory. Nevertheless, via the proper amalgam of theory and story about quarks the MIT-bag model was constructed to account for quark confinement.

I mentioned earlier in Sect. 2.2.2 that Giere [2.55] is also an advocate of the SV. However, his later writings [2.72, 73] suggest that he makes a gradual shift from his earlier conception of representational models in science to a view that neighbors that of Morrison and Cartwright. Even in Giere [2.55] the reader notices that he, unlike most other advocates of the SV, is less concerned with the attempt to give a unitary account of models and more concerned with the importance of models in actual scientific practices. But in [2.72] and [2.73] this becomes more explicit. Giere [2.55] espouses the idea that the laws of a theory are definitional devices of theoretical models. This view is compatible with the use of scientific laws in the SV. However, in Giere [2.72, p. 94] he suggests that scientific laws “[...] should be understood as rules devised by humans to be used in building models to represent specific aspects of the natural world.” It is patent that operating as rules for building models is quite a different thing from understanding laws to be the means by which models are defined. The latter view is in line with the three theses of the SV; the former, however, is only in line with the view that models are important in scientific theorizing. Moreover, in Giere [2.73] he takes a more radical step in distinguishing between the abstract models (which he calls abstract objects) defined by the laws and those models used by scientists to represent physical systems (which he calls representational models). The latter [2.73, p. 63]

that scientific representation is no more than a mapping relation between these two kinds of structures. As we have seen, serious arguments against the idea that representation can be reduced to structural mapping have surfaced; and these arguments counter the SV independently of how the details of the mapping relation are construed. Furthermore, the SV implies that by defining a the-
References
2.1 F. Suppe: The search for philosophic understanding of scientific theories. In: The Structure of Scientific Theories, ed. by F. Suppe (Univ. Illinois Press, Urbana 1974) pp. 1–241
2.2 P. Achinstein: The problem of theoretical terms, Am. Philos. Q. 2(3), 193–203 (1965)
2.3 P. Achinstein: Concepts of Science: A Philosophical Analysis (Johns Hopkins, Baltimore 1968)
2.4 H. Putnam: What theories are not. In: Logic, Methodology and Philosophy of Science, ed. by E. Nagel, P. Suppes, A. Tarski (Stanford Univ. Press, Stanford 1962) pp. 240–251
2.5 C. Hempel: Theoretician’s dilemma: A study in the logic of theory construction. In: Aspects of Scientific Explanation and Other Essays in the Philosophy of Science, ed. by C. Hempel (Free Press, New York 1958) pp. 173–226
2.6 R. Carnap: The methodological character of theoretical concepts. In: Minnesota Studies in the Philosophy of Science: The Foundations of Science and the Concepts of Psychology and Psychoanalysis, Vol. 1, ed. by H. Feigl, M. Scriven (Univ. Minnesota Press, Minneapolis 1956) pp. 38–76
2.7 R. Carnap: Philosophical Foundations of Physics (Basic Books, New York 1966)
2.8 F. Suppe: Theories, their formulations, and the operational imperative, Synthese 25, 129–164 (1972)
2.9 N.R. Hanson: Patterns of Discovery: An Inquiry into the Conceptual Foundations of Science (Cambridge Univ. Press, Cambridge 1958)
2.10 N.R. Hanson: Perception and Discovery: An Introduction to Scientific Inquiry (Freeman, San Francisco 1969)
2.11 J. Fodor: The Modularity of Mind (MIT, Cambridge 1983)
2.12 J. Fodor: Observation reconsidered, Philos. Sci. 51, 23–43 (1984)
2.13 J. Fodor: The modularity of mind. In: Meaning and Cognitive Structure, ed. by Z. Pylyshyn, W. Demopoulos (Ablex, Norwood 1986)
2.14 Z. Pylyshyn: Is vision continuous with cognition?, Behav. Brain Sci. 22, 341–365 (1999)
2.15 Z. Pylyshyn: Seeing and Visualizing: It’s Not What You Think (MIT, Cambridge 2003)
2.16 A. Raftopoulos: Is perception informationally encapsulated?, The issue of the theory-ladenness of perception, Cogn. Sci. 25, 423–451 (2001)
2.17 A. Raftopoulos: Reentrant pathways and the theory-ladenness of observation, Philos. Sci. 68, 187–200 (2001)
2.18 A. Raftopoulos: Cognition and Perception (MIT, Cambridge 2009)
2.19 R. Carnap: Meaning postulates, Philos. Stud. 3(5), 65–73 (1952)
2.20 W.V. Quine: Two dogmas of empiricism. In: From a Logical Point of View (Harvard Univ. Press, Massachusetts 1980) pp. 20–46
2.21 M.G. White: The analytic and the synthetic: An untenable dualism. In: Semantics and the Philosophy of Language, ed. by L. Linsky (Univ. Illinois Press, Urbana 1952) pp. 272–286
2.22 P. Achinstein: Theoretical terms and partial interpretation, Br. J. Philos. Sci. 14, 89–105 (1963)
2.23 R. Carnap: Testability and meaning, Philos. Sci. 3, 420–468 (1936)
2.24 R. Carnap: Testability and meaning, Philos. Sci. 4, 1–40 (1937)
2.25 C. Hempel: Fundamentals of Concept Formation in Empirical Science (Univ. Chicago Press, Chicago 1952)
2.26 K.F. Schaffner: Correspondence rules, Philos. Sci. 36, 280–290 (1969)
2.27 P. Suppes: Models of data. In: Logic, Methodology and Philosophy of Science, ed. by E. Nagel, P. Suppes, A. Tarski (Stanford Univ. Press, Stanford 1962) pp. 252–261
2.28 P. Suppes: What is a scientific theory? In: Philosophy of Science Today, ed. by S. Morgenbesser (Basic Books, New York 1967) pp. 55–67
2.29 F. Suppe: What’s wrong with the received view on the structure of scientific theories?, Philos. Sci. 39, 1–19 (1972)
2.30 F. Suppe: The Semantic Conception of Theories and Scientific Realism (Univ. Illinois Press, Urbana 1989)
2.31 C. Hempel: Provisos: A problem concerning the inferential function of scientific theories. In: The Limitations of Deductivism, ed. by A. Grünbaum, W.C. Salmon (Univ. California Press, Berkeley 1988) pp. 19–36
2.32 M. Lange: Natural laws and the problem of provisos, Erkenntnis 38, 233–248 (1993)
2.33 M. Lange: Who’s afraid of ceteris paribus laws?, or: How I learned to stop worrying and love them, Erkenntnis 57, 407–423 (2002)
2.34 J. Earman, J. Roberts: Ceteris paribus, there is no problem of provisos, Synthese 118, 439–478 (1999)
2.35 J. Earman, J. Roberts, S. Smith: Ceteris paribus lost, Erkenntnis 57, 281–301 (2002)
2.36 P.K. Feyerabend: Problems of empiricism. In: Beyond the Edge of Certainty, ed. by R.G. Colodny (Prentice-Hall, New Jersey 1965) pp. 145–260
2.37 E. Nagel: The Structure of Science (Hackett Publishing, Indianapolis 1979)
2.38 P.K. Feyerabend: Explanation, reduction and empiricism. In: Minnesota Studies in the Philosophy of Science: Scientific Explanation, Space and Time, Vol. 3, ed. by H. Feigl, G. Maxwell (Univ. Minnesota Press, Minneapolis 1962) pp. 28–97
2.39 P.K. Feyerabend: How to be a good empiricist – A plea for tolerance in matters epistemological. In: Philosophy of Science: The Delaware Seminar, Vol. 2, ed. by B. Baumrin (Interscience, New York 1963) pp. 3–39
2.40 P.K. Feyerabend: Problems of empiricism, Part II. In: The Nature and Function of Scientific Theories, ed. by R.G. Colodny (Univ. Pittsburgh Press, Pittsburgh 1970) pp. 275–353
2.41 P. Suppes: Introduction to Logic (Van Nostrand, New York 1957)
2.42 P. Suppes: A comparison of the meaning and uses of models in mathematics and the empirical sciences. In: The Concept and the Role of the Model in Mathematics and the Natural and Social Sciences, ed. by H. Freudenthal (Reidel, Dordrecht 1961) pp. 163–177
2.43 P. Suppes: Set-Theoretical Structures in Science (Stanford Univ., Stanford 1967), mimeographed lecture notes
2.44 P. Suppes: Representation and Invariance of Scientific Structures (CSLI Publications, Stanford 2002)
2.45 N.C.A. Da Costa, S. French: The model-theoretic approach in the philosophy of science, Philos. Sci. 57, 248–265 (1990)
2.46 N.C.A. Da Costa, S. French: Science and Partial Truth, a Unitary Approach to Models and Scientific Reasoning (Oxford Univ. Press, Oxford 2003)
2.47 B.C. Van Fraassen: The Scientific Image (Oxford Univ. Press, Oxford 1980)
2.48 B.C. Van Fraassen: On the extension of Beth’s semantics of physical theories, Philos. Sci. 37, 325–339 (1970)
2.49 M. Friedman: Review of Bas C. van Fraassen: The scientific image, J. Philos. 79, 274–283 (1982)
2.50 J. Worrall: Review article: An unreal image, Br. J. Philos. Sci. 35, 65–80 (1984)
2.51 B.C. Van Fraassen: An Introduction to the Philosophy of Time and Space, 2nd edn. (Columbia Univ. Press, New York 1985)
2.52 B.C. Van Fraassen: The semantic approach to scientific theories. In: The Process of Science, ed. by N.J. Nersessian (Martinus Nijhoff, Dordrecht 1987) pp. 105–124
2.53 M. Thompson-Jones: Models and the semantic view, Philos. Sci. 73, 524–535 (2006)
2.54 B.C. Van Fraassen: Laws and Symmetry (Oxford Univ. Press, Oxford 1989)
2.55 R.N. Giere: Explaining Science: A Cognitive Approach (The Univ. Chicago Press, Chicago 1988)
2.56 S. French: The structure of theories. In: The Routledge Companion to the Philosophy of Science, ed. by S. Psillos, M. Curd (Routledge, London 2008) pp. 269–280
2.57 M. Suarez: Scientific representation: Against similarity and isomorphism, Int. Stud. Philos. Sci. 17(3), 225–244 (2003)
2.58 S.M. Downes: The importance of models in theorising: A deflationary semantic view, PSA 1992, Vol. 1, ed. by D. Hull, M. Forbes, K. Okruhlik (Philosophy of Science Association, Chicago 1992) pp. 142–153
2.59 D. Portides: Scientific models and the semantic view of scientific theories, Philos. Sci. 72(5), 1287–1298 (2005)
2.60 R. Frigg: Scientific representation and the semantic view of theories, Theoria 55, 49–65 (2006)
2.61 N.D. Cartwright: The Dappled World: A Study of the Boundaries of Science (Cambridge Univ. Press, Cambridge 1999)
2.62 M.C. Morrison: Models as autonomous agents. In: Models as Mediators, ed. by M.S. Morgan, M. Morrison (Cambridge Univ. Press, Cambridge 1999) pp. 38–65
2.63 M.C. Morrison: Modelling nature: Between physics and the physical world, Philos. Naturalis 35, 65–85 (1998)
2.64 M.C. Morrison: Where have all the theories gone?, Philos. Sci. 74, 195–228 (2007)
2.65 D. Portides: Models. In: The Routledge Companion to the Philosophy of Science, ed. by S. Psillos, M. Curd (Routledge, London 2008) pp. 385–395
2.66 N.D. Cartwright, T. Shomar, M. Suarez: The tool-box of science. In: Theories and Models in Scientific Processes, Poznan Studies, Vol. 44, ed. by E. Herfel, W. Krajewski, I. Niiniluoto, R. Wojcicki (Rodopi, Amsterdam 1995) pp. 137–149
2.67 D. Portides: Seeking representations of phenomena: Phenomenological models, Stud. Hist. Philos. Sci. 42, 334–341 (2011)
2.68 N.D. Cartwright: Models and the limits of theory: Quantum Hamiltonians and the BCS models of superconductivity. In: Models as Mediators, ed. by M.S. Morgan, M. Morrison (Cambridge Univ. Press, Cambridge 1999) pp. 241–281
2.69 M.S. Morgan, M. Morrison (Eds.): Models as Mediators: Perspectives on Natural and Social Science (Cambridge Univ. Press, Cambridge 1999)
2.70 M. Suarez: The role of models in the application of scientific theories: Epistemological implications. In: Models as Mediators: Perspectives on Natural and Social Science, ed. by M.S. Morgan, M. Morrison (Cambridge Univ. Press, Cambridge 1999) pp. 168–196
2.71 S. Hartmann: Models and stories in hadron physics. In: Models as Mediators: Perspectives on Natural and Social Science, ed. by M.S. Morgan, M. Morrison (Cambridge Univ. Press, Cambridge 1999) pp. 326–346
2.72 R. Giere: Science Without Laws (Univ. Chicago Press, Chicago 1999)
2.73 R. Giere: Scientific Perspectivism (Univ. Chicago Press, Chicago 2006)
if they represent the selected parts or aspects of the world we investigate. This raises an important question: In virtue of what do scientific models represent their target systems? In this chapter we first disentangle five separate questions associated with scientific representation and offer five conditions of adequacy that any successful answer to these questions must meet. We then review the main contemporary accounts of scientific representation – similarity, isomorphism, inferentialist, and fictionalist accounts – through the lens of these questions. We discuss each of their attributes and highlight the problems they face. We finally outline our own preferred account, and suggest that it provides the most promising way of addressing the questions raised at the beginning of the chapter.

3.3.1 Similarity and ER-Problem
3.3.2 Accuracy and Style
3.3.3 Problems of Ontology
3.4 The Structuralist Conception
3.4.1 Structures and the Problem of Ontology
3.4.2 Structuralism and the ER-Problem
3.4.3 Accuracy, Style and Demarcation
3.4.4 The Structure of Target Systems
3.5 The Inferential Conception
3.5.1 Deflationary Inferentialism
3.5.2 Inflating Inferentialism: Interpretation
3.5.3 The Denotation, Demonstration, and Interpretation Account
3.6 The Fiction View of Models
3.6.1 Models and Fiction
3.6.2 Direct Representation
3.6.3 Parables and Fables
3.6.4 Against Fiction
3.7 Representation-as
3.7.1 Exemplification and Representation-as
3.7.2 From Pictures to Models: The Denotation, Exemplification, Keying-up and Imputation Account
3.8 Envoi
References
Models play a central role in contemporary science. Scientists construct models of atoms, elementary particles, polymers, populations, genetic trees, economies, rational decisions, airplanes, earthquakes, forest fires, irrigation systems, and the world’s climate – there is hardly a domain of inquiry without models. Models are essential for the acquisition and organization of scientific knowledge. We often study a model to discover features of the thing it stands for. How does this work? The answer is that a model can instruct us about the nature of its subject matter if it represents the selected part or aspect of the world that we investigate. So if we want to understand how models allow us to learn about the world, we have to come to understand how they represent.

The problem of representation has generated a sizable literature, which has been growing fast in particular over the last decade. The aim of this chapter is to review this body of work and assess the strengths and weaknesses of the different proposals. This enterprise
faces an immediate difficulty: Even a cursory look at the literature on scientific representation quickly reveals that there is no such thing as the problem of scientific representation. In fact, we find a cluster of interrelated problems. In Sect. 3.1 we try to untangle this web and get clear on what the problems are and on how they relate to one another (for a historical introduction to the issue, see [3.1]). The result of this effort is a list with five problems and five conditions of adequacy, which provides the analytical lens through which we look at the different accounts. In Sect. 3.2 we discuss Griceanism and stipulative fiat. In Sect. 3.3 we look at the time-honored similarity approach, and in Sect. 3.4 we examine its modern-day cousin, the structuralist approach. In Sect. 3.5 we turn to inferentialism, a more recent family of conceptions. In Sect. 3.6 we discuss the fiction view of models, and in Sect. 3.7 we consider the conception of representation-as.

Before delving into the discussion, a number of caveats are in order. The first is that our discussion in no way presupposes that models are the sole unit of scientific representation, or that all scientific representation is model-based. Various types of images have their place in science, and so do graphs, diagrams, and drawings (Perini [3.2–4] and Elkins [3.5] provide discussions of visual representation in the sciences). In some contexts scientists use what Warmbrōd [3.6] calls natural forms of representation and what Peirce [3.7] would have classified as indices: tree rings, fingerprints, disease symptoms. These are related to thermometer readings and litmus paper indications, which are commonly classified as measurements. Measurements also provide representations of processes in nature, sometimes together with the subsequent condensation of measurement results in the form of charts, curves, tables and the like (Tal [3.8] provides a discussion of measurement). And, last but not least, many would hold that theories represent too. At this point the vexing problem of the nature of theories and the relation between theories and models rears its head again. We refer the reader

perform a number of functions other than representation. To mention but a few: Knuuttila [3.9, 10] points out that the epistemic value of models is not limited to their representational function and develops an account that views models as epistemic artifacts that allow us to gather knowledge in diverse ways; Morgan and Morrison [3.11] emphasize the role models play in the mediation between theories and the world; Hartmann [3.12] discusses models as tools for theory construction; Peschard [3.13] investigates the way in which models may be used to construct other models and generate new target systems; and Bokulich [3.14] and Kennedy [3.15] present nonrepresentational accounts of model explanation (Woody [3.16] and Reiss [3.17] provide general discussions of the relation between representation and explanation). Not only do we not see projects like these as being in conflict with a view that sees some models as representational; we think that the approaches are in fact complementary.

Finally, there is a popular myth according to which a representation is a mirror image, a copy, or an imitation of the thing it represents. In this view representation is ipso facto realistic representation. This is a mistake. Representations can be realistic, but they need not be. And representations certainly need not be copies of the real thing. This, we take it, is the moral of the satire about the cartographers who produce maps as large as the country itself only to see them abandoned. The story has been told by Lewis Carroll in Sylvie and Bruno and Jorge Luis Borges in On Exactitude in Science. Throughout this review we encounter positions that make room for nonrealistic representation and hence testify to the fact that representation is a much broader notion than mirroring.

There is, however, a sense in which we presuppose a minimal form of realism. Throughout the discussion we assume that target systems exist independently of human observers, and that they are how they are irrespective of what anybody thinks about them. That is, we assume that the targets of representation exist
to Portides’ contribution to this volume, Chap. 2, for independently of the representation. This is a presuppo-
a discussion of this issue. Whether these other forms of sition not everybody would share. Constructivists (and
scientific representation have features in common with other kinds of metaphysical antirealists) assume that
how models represent is an interesting question, but this there is no phenomenon independent of its represen-
is a problem for another day. Our aim here is a more tation: representations constitute the phenomena they
modest one: to understand how models represent. To represent (this view is expounded for instance by Lynch
make the scope of our investigation explicit we call and Wooglar [3.18]; Giere [3.19] offers a critical dis-
the kind of representation we are interested in model- cussion). It goes without saying that an assessment of
representation. the constructivist program is beyond the scope of this
The second point to emphasize is that our dis- review. It is worth observing, though, that many of the
cussion is not premised on the claim that all models discussions to follow are by no means pointless from
are representational; nor does it assume that repre- a constructivist perspective. What in the realist idiom
sentation is the only (or even primary) function of is conceptualized as the representation of an object in
models. It has been emphasized variously that models the world by a model would, from the constructivist
Models and Representation 3.1 Problems Concerning Model-Representation 51
perspective, turn into the study of the relation between gets are not identical, and the fact that targets are
a model and another representation, or an object con- representationally constituted would not obliterate the
stituted by another representation. This is because even differences between a target representation and scien-
from a constructivist perspective, models and their tar- tific model.
Part A | 3.1
[…]ter around a relatively well-circumscribed set of issues. The aim of this section is to make these issues explicit and formulate five problems that an account of model-representation has to answer. These problems will help us in structuring the discussion in later sections and put views and positions into perspective. In the course of doing so we also articulate five conditions of adequacy that every account of model-representation has to satisfy.

Models are representations of a selected part or aspect of the world. This is the model’s target system. The first and most fundamental question about a model therefore is: In virtue of what is a model a representation of something else? Attention has been drawn to this issue by Frigg ([3.20, p. 17], [3.21, p. 50]), Morrison [3.22, p. 70], and Suárez [3.23, p. 230]. To appreciate the thrust of this question it is instructive to briefly ponder the same problem in the context of pictorial representation. When seeing, say, Soutine’s The Groom or the Bellboy we immediately realize that it depicts a man in a red dress. Why is this? Per se the painting is a plane surface covered with pigments. How does an arrangement of pigments on a surface represent something outside the picture frame? Likewise, models, before being representations of atoms, populations, or economies, are equations, structures, fictional scenarios, or mannerly physical objects. The problem is: what turns equations and structures, or fictional scenarios and physical objects into representations of something beyond themselves? It has become customary to phrase this problem in terms of necessary and sufficient conditions and throughout this review we shall follow suit (some may balk at this, but it’s worth flagging that the standard arguments against such an analysis, e.g., those surveyed in Laurence and Margolis [3.24], lose much of their bite when attention is restricted to core cases as we do here). The question then is: What fills the blank in “M is a model-representation of T iff ___”, where M stands for model and T for target system?

To spare ourselves difficulties further down the line, this formulation needs to be adjusted in light of […] models rather than on reality itself, and this is done with the aim of discovering features of the things models stand for. Every acceptable theory of scientific representation has to account for how reasoning conducted on models can yield claims about their target systems. Let us call this the surrogative reasoning condition.

The term surrogative reasoning was introduced by Swoyer [3.25, p. 449], and there seems to be widespread agreement on this point (although Callender and Cohen [3.26], whose views are discussed in Sect. 3.2, provide a noteworthy exception). To mention just a few writers on the subject: Bailer-Jones [3.27, p. 59] emphasizes that models “tell us something about certain features of the world” (original emphasis); Bolinska [3.28] and Contessa [3.29] both call models epistemic representations; Frigg ([3.21, p. 51], [3.30, p. 104]) sees the potential for learning as an essential explanandum for any theory of representation; Liu [3.31, p. 93] emphasizes that the main role for models in science and technology is epistemic; Morgan and Morrison [3.11, p. 11] regard models as investigative tools; Suárez ([3.23, p. 229], [3.32, p. 772]) submits that models license specific inferences about their targets; and Weisberg [3.33, p. 150] observes that the “model-world relation is the relationship in virtue of which studying a model can tell us something about the nature of a target system”. This distinguishes models from lexicographical representations such as words. Studying the internal constitution of a model can provide information about the target. Not so with words. The properties of a word (consisting of so and so many letters and syllables, occupying this or that position in a dictionary, etc.) do not matter to its functioning as a word; and neither do the physical properties of the ink used to print words on a piece of paper. We can replace one word by another at will (which is what happens in translations from one language to another), and we can print words with other methods than ink on paper. This is possible because the properties of a word as an object do not matter to its semantic function.
52 Part A Theoretical Issues in Models
This gives rise to a problem for the schema “M is a model-representation of T iff ___”. The problem is that any account of representation that fills the blank in a way that satisfies the surrogative reasoning condition will almost invariably end up covering other kinds of representations too. Geographical maps, graphs, diagrams, charts, drawings, pictures, and photographs often provide epistemic access to features of the items they represent, and hence are likely to fall under an account of representation that explains this sort of reasoning. This is a problem for an analysis of model-representation in terms of necessary and sufficient conditions because if something that is not prima facie a model (for instance a map or a photograph) satisfies the conditions of an account of model-representation, then one either has to conclude that the account fails because it does not provide sufficient conditions, or that first impressions are wrong and other representations (such as maps or photographs) are in fact model-representations.

Neither of these options is appealing. To avoid this problem we follow a suggestion of Contessa’s [3.29] and broaden the scope of the investigation. Rather than analyzing the relatively narrow category of model-representation, we analyze the broader category of epistemic representation. This category comprises model-representations, but it also includes other representations that allow for surrogative reasoning. The task then becomes to fill the blank in “M is an epistemic representation of T iff ___”. For brevity we use $R(M, T)$ as a stand-in for “M is an epistemic representation of T”, and so the biconditional becomes “$R(M, T)$ iff ___”. We call the general problem of figuring out in virtue of what something is an epistemic representation of something else the epistemic representation problem (ER-problem, for short), and the above biconditional the ER-scheme. So one can say that the ER-problem is to fill the blank in the ER-scheme. Frigg [3.21, p. 50] calls this the “enigma of representation” and in Suárez’s [3.23, p. 230] terminology this amounts to identifying the constituents of a representation (although he questions whether both necessary and sufficient conditions can be given; see Sect. 3.5 for further discussion on how his views fit into the ER-framework).

Analyzing the larger category of epistemic representation and placing model-representations in that category can be seen as giving rise to a demarcation problem for scientific representations: How do scientific model-representations differ from other kinds of epistemic representations? We refer to this question as the representational demarcation problem. Callender and Cohen [3.26, p. 69] formulate this problem, but then voice skepticism about our ability to solve it [3.26, p. 83]. The representational demarcation problem has received little, if any, attention in the recent literature on scientific representation, which would suggest that other authors either share Callender and Cohen’s skepticism, or regard it as a nonissue to begin with. The latter seems to be implicit in approaches that discuss scientific representation alongside pictorial representation such as Elgin [3.34], French [3.35], Frigg [3.21], Suárez [3.32], and van Fraassen [3.36]. But a dismissal of the problem is in no way a neutral stance. It amounts to no less than the admission that model-representations are not fundamentally different from other epistemic representations, or that we are unable to pin down what the distinguishing features are. Such a stance should be made explicit and, ideally, justified.

Two qualifications concerning the ER-scheme need to be added. The first concerns its flexibility. Some might worry that posing the problem in this way prejudges what answers can be given. The worry comes in a number of variants. A first variant is that the scheme presupposes that representation is an intrinsic relation between M and T (i.e., a relation that only depends on intrinsic properties of M and T and on how they relate to one another rather than on how they relate to other objects) or even that it is naturalisable (a notion further discussed in Sect. 3.3). This is not so. In fact, R might depend on any number of factors other than M and T themselves, and on ones that do not qualify as natural ones. To make this explicit we write the ER-scheme in the form $R(M, T)$ iff $C(M, T, x_1, \ldots, x_n)$, where n is a natural number and C is an $(n+2)$-ary relation that grounds representation. The $x_i$ can be anything that is deemed relevant to epistemic representation, for instance a user’s intentions, standards of accuracy, and specific purposes. We call C the grounding relation of an epistemic representation.

Before adding a second qualification, let us introduce the next problem in connection with model-representation. Even if we restrict our attention to scientific epistemic representations (if they are found to be relevantly different from nonscientific epistemic representations as per the demarcation problem above), not all representations are of the same kind. In the case of visual representations this is so obvious that it hardly needs mention: An Egyptian mural, a two-point perspective ink drawing, a pointillist oil painting, an architectural plan, and a road map represent their respective targets in different ways. This pluralism is not limited to visual representations. Model-representations do not all seem to be of the same kind either. Woody [3.37] argues that chemistry as a discipline has its own ways to represent molecules. But differences in style can also appear in models from the same discipline. Weizsäcker’s liquid drop model represents the nucleus of an atom in a manner that seems to be different from the one of the shell
model. A scale model of the wing of a plane represents the wing in a way that is different from how a mathematical model of its cross section does. Or Phillips and Newlyn’s famous hydraulic machine and Hicks’ mathematical models both represent a Keynesian economy but they seem to do so in different ways. This gives rise to the question: What styles are there and how can they be characterized? This is the problem of style [3.21, p. 50]. There is no expectation that a complete list of styles be provided in response. Indeed, it is unlikely that such a list can ever be drawn up, and new styles will be invented as science progresses. For this reason a response to the problem of style will always be open-ended, providing a taxonomy of what is currently available while leaving room for later additions.

With this in mind we can now turn to the second qualification concerning the ER-scheme. The worry is this: The scheme seems to assume that representation is a monolithic concept and thereby makes it impossible to distinguish between different kinds of representation. The impression is engendered by the fact that the scheme asks us to fill a blank, and the blank is filled only once. But if there are different kinds of representations, we should be able to fill the blank in different ways on different occasions because a theory of representation should not force upon us the view that the different styles are all variations of one overarching concept of representation.

The ER-scheme is more flexible than it appears at first sight. There are at least three ways in which different styles of representations can be accommodated. For the sake of illustration, and to add some palpability to an abstract discussion, let us assume that we have identified two styles: analogue representation and idealized representation. The result of an analysis of these relations is the identification of their respective grounding relations. Let $C_A(M, T, \ldots)$ and $C_I(M, T, \ldots)$ be these relations. The first way of accommodating them in the ER-scheme is to fill the blank with the disjunction of the two: $R(M, T)$ iff $C_A(M, T, \ldots)$ or $C_I(M, T, \ldots)$. In plain English: M represents T if and only if M is an analogue representation of T or M is an idealized representation of T. This move is possible because, first appearances notwithstanding, nothing hangs on the grounding relation being homogeneous. The relation can be as complicated as we like and there is no prohibition against disjunctions. In the above case we have $C = [C_A \text{ or } C_I]$. Furthermore, the grounding relation could even be an open disjunction. This would help accommodate the above observation that a list of styles is potentially open-ended. In that case there would be a grounding relation for each style and the scheme could be written as $R(M, T)$ iff $C_1(M, T, \ldots)$ or $C_2(M, T, \ldots)$ or $C_3(M, T, \ldots)$ or $\ldots$, where the $C_i$ are the grounding relations for different styles. This is not a new scheme; it’s the old scheme where $C = [C_1 \text{ or } C_2 \text{ or } C_3 \text{ or } \ldots]$ is spelled out.

Alternatively one could formulate a different scheme for every kind of representation. This would amount to changing the scheme slightly in that one does not analyze epistemic representation per se. Instead one would analyze different kinds of epistemic representations. Consider the above example again. Let $R_1(M, T)$ stand for “M is an analogue epistemic representation of T” and $R_2(M, T)$ for “M is an idealized epistemic representation of T”. The response to the ER-problem then consists in presenting the two biconditionals $R_1(M, T)$ iff $C_A$ and $R_2(M, T)$ iff $C_I$. This generalizes straightforwardly to the case of any number of styles, and the open-endedness of the list of styles can be reflected in the fact that an open-ended list of conditionals of the form $R_i(M, T)$ iff $C_i$ can be given, where the index ranges over styles.

In contrast with the second option, which pulls in the direction of more diversity, the third aims for more unity. The crucial observation here is that the grounding relation can in principle be an abstract relation that can be concretized in different ways, or a determinable that can have different determinates. On the third view, then, the concept of representation is like the concept of force (which is abstract in that in a concrete situation force is gravity or electromagnetic attraction or some other specific force), or like color (where a colored object must be blue or green or …). This view would leave $R(M, T)$ iff $C(M, T, x_1, \ldots, x_n)$ unchanged and take it as understood that C is an abstract relation.

At this point we do not adjudicate between these options. Each has its own pros and cons, and which one is the most convenient to work with depends on one’s other philosophical commitments. What matters is that the ER-scheme does have the flexibility to accommodate different representational styles, and that it can in fact accommodate them in at least three different ways.

The next problem in line for the theory of model-representation is to specify standards of accuracy. Some representations are accurate; others aren’t. The Schrödinger model is an accurate representation of the hydrogen atom; the Thomson model isn’t. On what grounds do we make such judgments? In Morrison’s words: “how do we identify what constitutes an accurate representation?” [3.22, p. 70]. We call this the problem of standards of accuracy. Answering this question might make reference to the purposes of the model and model user, and thus it is important to note that by accuracy we mean something that can come in degrees and may be context dependent. Providing a response to the problem of accuracy is a crucial aspect of an account of epistemic representation.
This problem goes hand in hand with a second condition of adequacy: the possibility of misrepresentation. Asking what makes a representation an accurate representation already presupposes that inaccurate representations are representations too. And this is how it should be. If M does not accurately portray T, then it is a misrepresentation but not a nonrepresentation. It is therefore a general constraint on a theory of epistemic representation that it has to make misrepresentation possible. This can be motivated by a brief glance at the history of science, but is plausibly also part of the concept of representation, and as such is found in discussions of other kinds of representation (Stich and Warfield [3.38, pp. 6–7], for instance, suggest that a theory of mental representation should be able to account for misrepresentation, as do Sterelny and Griffiths [3.39, p. 104] in their discussion of genetic representation). A corollary of this requirement is that representation is a wider concept than accurate representation and that representation cannot be analyzed in terms of accurate representation.

A related condition concerns models that misrepresent in the sense that they lack target systems. Models of ether, phlogiston, four-sex populations, and so on, are all deemed scientific models, but ether, phlogiston, and four-sex populations don’t exist. Such models lack (actual) target systems, and one hopes that an account of epistemic representation would allow us to understand how these models work. We call this the problem of targetless models (or models without targets).

The fourth condition of adequacy for an account of model-representation is that it must account for the directionality of representation. Models are about their targets, but (at least in general) targets are not about their models. So there is an essential directionality to representations, and an account of model-representation has to identify the root of this directionality. We call this the requirement of directionality.

Many scientific models are highly mathematized, and their mathematical aspects are crucial to their cognitive as well as their representational function. This forces us to reconsider a time-honored philosophical puzzle: the applicability of mathematics in the empirical sciences. Even though the problem can be traced back at least to Plato’s Timaeus, its canonical modern expression is due to Wigner, who famously remarked that “the enormous usefulness of mathematics in the natural sciences is something bordering on the mysterious and that there is no explanation for it” [3.40, p. 2]. One need not go as far as seeing the applicability of mathematics as an inexplicable miracle, but the question remains: How does mathematics hook onto the world?

The recent discussion of this problem has taken place in a body of literature that grew out of the philosophy of mathematics (see Shapiro [3.41, Chap. 8] for a review). But, with the exception of Bueno and Colyvan [3.42], there has been little contact with the literature on scientific modeling. This is a regrettable state of affairs. The question of how a mathematized model represents its target implies the question of how mathematics applies to a physical system. So rather than separating the question of model-representation from the problem of the applicability of mathematics and dealing with them in separate discussions, they should be seen as the two sides of the same coin and be dealt with in tandem. For this reason, our fifth and final condition of adequacy is that an account of representation has to explain how mathematics is applied to the physical world. We call this the applicability of mathematics condition.

In answering the above questions one invariably runs up against a further problem, the problem of ontology: What kinds of objects are models? Are they structures in the sense of set theory, fictional entities, descriptions, equations or yet something else? Or are there no models at all? While some authors develop an ontology of models, others reject an understanding of models as things and push a program that can be summed up in the slogan modeling without models [3.43]. There is also no presupposition that all models be of the same kind. Some models are material objects, some are things that one holds in one’s head rather than one’s hands (to use Hacking’s phrase [3.44, p. 216]). For the most part, the focus in debates about representation has been on nonmaterial models, and we will follow this convention. It is worth emphasizing, however, that the seemingly straightforward material models also raise interesting philosophical questions: Rosenblueth and Wiener [3.45] discuss the criteria for choosing an object as a model; Ankeny and Leonelli [3.46] discuss issues that arise when using organisms as models; and the contributors to [3.47] discuss representation in the laboratory.

A theory of representation can recognize different kinds of models, or indeed no models at all. The requirement only asks us to be clear on our commitments and provide a list with things, if any, that we recognize as models and give an account of what they are in case these entities raise questions (what exactly do we mean by something that one holds in one’s head rather than one’s hands?).

In sum, an account of model-representation has to do the following:

1. Provide an answer to the epistemic representation problem (filling the blank in the ER-scheme: M is an epistemic representation of T iff ...).
2. Take a stand on the representational demarcation problem (the question of how scientific epistemic representations differ from other kinds of epistemic representations).
3. Respond to the problem of style (what styles are there and how can they be characterized?).
4. Formulate standards of accuracy (how do we identify what constitutes an accurate representation?).
5. Address the problem of ontology (what kinds of objects are models?).

Any satisfactory answer to these five issues will have to meet the following five conditions of adequacy:

1. Surrogative reasoning condition (models represent their targets in a way that allows us to generate hypotheses about them).
2. Possibility of misrepresentation (if M does not accurately represent T, then it is a misrepresentation but not a nonrepresentation).
3. Targetless models (what are we to make of scientific representations that lack targets?).
4. Requirement of directionality (models are about their targets, but targets are not about their models).
5. Applicability of mathematics condition (how the mathematical apparatus used in M latches onto the physical world).

To frame the problem in this way is not to say that these are separate and unrelated issues, which can be dealt with one after the other in roughly the same way in which we first buy a ticket, walk to the platform and then take a train. This division is analytical, not factual. It serves to structure the discussion and to assess proposals; it does not imply that an answer to one of these questions can be dissociated from what stance we take on the other issues.

3.2 General Griceanism and Stipulative Fiat
“The questions about the utility of these representational vehicles are questions about the pragmatics of things that are representational vehicles, not questions about their representational status per se.”

So, in sum, scientific representation [3.26, p. 78]

“is constituted in terms of a stipulation, together with an underlying theory of representation for mental states, isomorphism, similarity, and inference generation are all idle wheels.”

The first question we are faced with when assessing this account is the relation between GG and stipulative fiat (Definition 3.1). Callender and Cohen do not comment on this issue, but that they mention both in the same breath would suggest that they regard them as one and the same doctrine, or at least as the two sides of the same coin. This is not so. Stipulative fiat (Definition 3.1) is just one way of fleshing out GG, which only requires that there be some explanation of how derivative representations relate to fundamental representations; GG does not require that this explanation be of a particular kind, much less that it consists of nothing but an act of stipulation ([3.48, pp. 77–78], [3.49, p. 244]). Even if GG is correct, it doesn’t follow that stipulative fiat is a satisfactory answer to the ER-problem. Model-representation can, in principle, be reduced to fundamental representation in many different ways (some of which we will encounter later in this chapter). Conversely, the failure of stipulative fiat does not entail that we must reject GG: one can uphold the idea that an appeal to the intentions of model users is a crucial element in an account of scientific representation even if one dismisses stipulative fiat (Definition 3.1).

Let us now examine stipulative fiat (Definition 3.1). Callender and Cohen emphasize that anything can be a representation of anything else [3.26, p. 73]. This is correct. Things that function as models don’t belong to a distinctive ontological category, and it would be a mistake to think that some objects are, intrinsically, representations and others are not. This point has been made by others too (including Frigg [3.50, p. 99], Giere [3.51, p. 269], Suárez [3.32, p. 773], Swoyer [3.25, p. 452], and Teller [3.52, p. 397]) and, as we shall see, it is a cornerstone of several alternative accounts of representation.

But just because anything can, in principle, be a representation of anything else, it doesn’t follow that a mere act of stipulation suffices to turn M into a representation of T. Furthermore, it doesn’t follow that an object elevated to the status of a representation by an act of fiat represents its target in a way that can appropriately be characterized as an instance of epistemic representation. We discuss both concerns in reverse order.

Stipulative fiat (Definition 3.1) fails to meet the surrogative reasoning condition: it fails to provide an account of how claims about Madagascar could be extracted from reasoning about the salt shaker. Even if we admit that stipulative fiat (Definition 3.1) establishes that models denote their targets (and as we will see soon, there is a question about this), denotation is not sufficient for epistemic representation. Both the word Napoleon and Jacques-Louis David’s portrait of Napoleon serve to denote the French general. But this does not imply that they represent him in the same way, as noted by Toon [3.48, pp. 78–79]. Bueno and French [3.53, pp. 871–874] gesture in the same direction when they point to Peirce’s distinction between icon, index and symbol and dismiss Callender and Cohen’s views on the grounds that they cannot explain the obvious differences between different kinds of representations.

Supporters of stipulative fiat (Definition 3.1) could try to mitigate the force of this objection in two ways. First, they could appeal to additional facts about the object, as well as its relation to other items, in order to account for surrogative reasoning. For instance, the salt shaker being to the right of the pepper mill might allow us to infer that Madagascar is to the east of Mozambique. Moves of this sort, however, invoke (at least tacitly) a specifiable relation between features of the model and features of the target (similarity, isomorphism, or otherwise), and an invocation of this kind goes beyond mere stipulation. Second, the last quotation from Callender and Cohen suggests that they might want to relegate surrogative reasoning to the realm of pragmatics and deny that it is part of the relation properly called epistemic representation. This, however, in effect amounts to a removal of the surrogative reasoning condition from the desiderata of an account of scientific representation, and we have argued in Sect. 3.1 that surrogative reasoning is one of the hallmarks of scientific representation. And even if it were pragmatics, we still would want an account of how it works.

Let us now turn to our first point, that a mere act of stipulation is insufficient to turn M into a representation of T. We take our cue from a parallel discussion in the philosophy of language, where it has been pointed out that it is not clear that stipulation is sufficient to establish a denotational relationship (which is weaker than epistemic representation). A position similar to stipulative fiat (Definition 3.1) faces what is known as the Humpty Dumpty problem, named in reference to Lewis Carroll’s discussion of Humpty using the word glory to mean a nice knockdown argument [3.54, 55] (it’s worth noting that this debate concerns meaning,
Models and Representation 3.3 The Similarity Conception 57
Part A | 3.3
rather than denotation, but it's plausible that it can be reconstructed in terms of the latter). If stipulation is all that matters, then as long as Humpty simply stipulates that glory means a nice knockdown argument, then it does so. And this doesn't seem to be the case. Even if the utterance glory could mean a nice knockdown argument – if, for example, Humpty was speaking a different language – in the case in question it doesn't, irrespective of Humpty's stipulation. In the contemporary philosophy of language the discussion of this problem focuses more on the denotation of demonstratives rather than proper names, and work in that field focuses on propping up existing accounts so as to ensure that a speaker's intentions successfully establish the denotation of demonstratives uttered by the speaker [3.56]. Whatever the success of these endeavors, their mere existence shows that successfully establishing denotation requires moving beyond a bare appeal to stipulation, or brute intention. But if a brute appeal to intentions fails in the case of demonstratives – the sorts of terms that such an account would most readily be applicable to – then we find it difficult to see how stipulative fiat (Definition 3.1) will establish a representational relationship between models and their targets. Moreover, this whole discussion supposed that an intention-based account of denotation is the correct one. This is controversial – see Reimer and Michaelson [3.57] for an overview of discussions of denotation in the philosophy of language. If this is not the correct way to think about denotation, then stipulative fiat (Definition 3.1) will fail to get off the ground at all.

It now pays that we have separated GG from stipulative fiat (Definition 3.1). Even though stipulative fiat (Definition 3.1) does not provide an adequate answer to the ER-problem, one can still uphold GG. As Callender and Cohen note, all that it requires is that there is a privileged class of representations (they take them to be mental states but are open to the suggestion that they might be something else [3.26, p. 82]), and that other types of representations owe their representational capacities to their relationship with the primitive ones. So philosophers need an account of how members of this privileged class of representations represent, and how derivative representations, which include scientific models, relate to this class.

This is a plausible position, and when stated like this, many recent contributors to the debate on scientific representation can be seen as falling under the umbrella of GG. As we will see below, the more developed versions of the similarity (Sect. 3.3) and isomorphism (Sect. 3.4) accounts of scientific representation make explicit reference to the intentions and purposes of model users, even if their earlier iterations did not. And so do the accounts discussed in the later sections, where the intentions of model users (in a more complicated manner than that suggested by stipulative fiat (Definition 3.1)) are invoked to establish epistemic representation.
ity holds between properties themselves, then T would have to instantiate properties similar to M (however, it is worth noting that this kind of knowledge transfer can cause difficulties in some contexts; Frigg et al. [3.63] discuss these difficulties in the context of nonlinear dynamic modeling).

However, appeal to similarity in the context of representation leaves open whether similarity is offered as an answer to the ER-problem, the problem of style, or whether it is meant to set standards of accuracy. Proponents of the similarity account typically have offered little guidance on this issue. So we examine each option in turn and ask whether similarity offers a viable answer. We then turn to the question of how the similarity view deals with the problem of ontology.

3.3.1 Similarity and ER-Problem

Understood as a response to the ER-problem, a similarity view of representation amounts to the following:

Definition 3.2 Similarity 1
A scientific model M represents a target T iff M and T are similar.

A well-known objection to this account is that similarity has the wrong logical properties. Goodman [3.64, pp. 4–5] submits that similarity is symmetric and reflexive yet representation isn't. If object A is similar to object B, then B is similar to A. But if A represents B, then B need not (and in fact in most cases does not) represent A: the Newtonian model represents the solar system, but the solar system does not represent the Newtonian model. And everything is similar to itself, but most things do not represent themselves. So this account does not meet our third condition of adequacy for an account of scientific representation insofar as it does not provide a direction to representation. (Similar problems also arise in connection with other logical properties, e.g., transitivity; see Frigg [3.30, p. 31] and Suárez [3.23, pp. 232–233].)

Yaghmaie [3.65] argues that this conclusion – along with the third condition itself – is wrong: epistemic representation is symmetric and reflexive (he discusses this in the context of the isomorphism view of representation, which we turn to in the next section, but the point applies here as well). His examples are drawn from mathematical physics, and he presents a detailed case study of a symmetric representation relation between quantum field theory and statistical mechanics. His case raises interesting questions, but even if one grants that Yaghmaie has identified a case where representation is reflexive and symmetrical it does not follow that representation in general is. The photograph in Jane's passport represents Jane; but Jane does not represent her passport photograph; and the same holds true for myriads of other representations. Goodman is correct in pointing out that typically representation is not symmetrical and reflexive: a target T does not represent model M just because M represents T.

A reply diametrically opposed to Yaghmaie's emerges from the writings of Tversky and Weisberg. They accept that representation is not symmetric, but dispute that similarity fails on this count. Using a gradual notion of similarity (i.e., one that allows for statements like A is similar to B to degree d), Tversky found that subjects in empirical studies judged that North Korea was more similar to China than China was to North Korea [3.66]; similarly Poznic [3.67, Sect. 4.2] points out with reference to the characters in a Polanski movie that the similarity relation between a baby and the father need not be symmetric.

So allowing degrees into one's notion of similarity makes room for an asymmetry (although degrees by themselves are not sufficient for asymmetry; metric-based notions are still symmetric). This raises the question of how to analyze similarity. We discuss this thorny issue in some detail in the next subsection. For now we concede the point and grant that similarity need not always be symmetrical. However, this does not solve Goodman's problem with reflexivity (as we will see, on Weisberg's notion of similarity everything is maximally similar to itself); nor does it, as we will see now, solve other problems of the similarity account.

However the issue of logical properties is resolved, there is another serious problem: similarity is too inclusive a concept to account for representation. In many cases neither one of a pair of similar objects represents the other. Two copies of the same book are similar but neither represents the other. Similarity between two items is not enough to establish the requisite relationship of representation; there are many cases of similarity where no representation is involved. And this won't go away even if similarity turns out to be nonsymmetric. That North Korea is similar to China (to some degree) does not imply that North Korea represents China, and that China is not similar to North Korea to the same degree does not alter this conclusion.

This point has been brought home in a now-classical thought experiment due to Putnam [3.68, pp. 1–3] (but see also Black [3.69, p. 104]). An ant is crawling on a patch of sand and leaves a trace that happens to resemble Winston Churchill. Has the ant produced a picture of Churchill? Putnam's answer is that it didn't because the ant has never seen Churchill and had no intention to produce an image of him. Although someone else might see the trace as a depiction of Churchill, the trace itself does not represent Churchill. This, Putnam concludes,
shows that “[s]imilarity [...] to the features of Winston Churchill is not sufficient to make something represent or refer to Churchill” [3.68, p. 1]. And what is true of the trace and Churchill is true of every other pair of similar items: similarity on its own does not establish representation.

There is also a more general issue concerning similarity: it is too easy to come by. Without constraints on what counts as similar, any two things can be considered similar to any degree [3.70, p. 21]. This, however, has the unfortunate consequence that anything represents anything else because any two objects are similar in some respect. Similarity is just too inclusive to account for representation. An obvious response to this problem is to delineate a set of relevant respects and degrees to which M and T have to be similar. This suggestion has been made explicitly by Giere [3.71, p. 81] who suggests that models come equipped with what he calls theoretical hypotheses, statements asserting that model and target are similar in relevant respects and to certain degrees. This idea can be molded into the following definition:

Definition 3.3 Similarity 2
A scientific model M represents a target T iff M and T are similar in relevant respects and to the relevant degrees.

On this definition one is free to choose one's respects and degrees so that unwanted similarities drop out of the picture. While this solves the last problem, it leaves the others untouched: similarity in relevant respects and to the relevant degrees is reflexive (and symmetrical, depending on one's notion of similarity); and presumably the ant's trace in the sand is still similar to Churchill in the relevant respects and degrees but without representing Churchill. Moreover, similarity 2 (Definition 3.3) introduces three new problems.

First, a misrepresentation is one that portrays its target as having properties that are not similar in the relevant respects and to the relevant degrees to the true properties of the target. But then, on similarity 2 (Definition 3.3), M is not a representation at all. Ducheyne [3.72] embraces this conclusion when he offers a variant of a similarity account that explicitly takes the success of the hypothesized similarity between a model and its target to be a necessary condition on the model representing the target. In Sect. 3.2 we argued that the possibility of misrepresentation is a condition of adequacy for any acceptable account of representation and so we submit that misrepresentation should not be conflated with nonrepresentation ([3.20, p. 16], [3.23, p. 235]).

Second, similarity in relevant respects and to the relevant degrees does not guarantee that M represents the right target. As Suárez points out [3.23, pp. 233–234], even a regimented similarity can obtain with no corresponding representation. If John dresses up as Pope Innocent X (and he does so perfectly), then he resembles Velázquez's portrait of the pope (at least insofar as the pope himself resembled the portrait). In cases like these, which Suárez calls mistargeting, a model represents one target rather than another, despite the fact that both targets are relevantly similar to the model. As in the case of Putnam's ant, the root cause of the problem is that the similarity is accidental. In the case of the ant, the accident occurs at the representation end of the relation, whereas in the case of John's dressing up the accidental similarity occurs at the target end. Both cases demonstrate that similarity 2 (Definition 3.3) cannot rule out accidental representation.

Third, there may simply be nothing to be similar to because some representations represent no actual object [3.64, p. 26]. Some paintings represent elves and dragons, and some models represent phlogiston and the ether. None of these exist. As Toon points out, this is a problem in particular for the similarity view [3.49, pp. 246–247]: models without objects cannot represent what they seem to represent because in order for two things to be similar to each other both have to exist. If there is no ether, then an ether model cannot be similar to the ether.

It would seem that at least the second problem could be solved by adding the requirement that M denote T (as considered, but not endorsed, by Goodman [3.64, pp. 5–6]). Amending the previous definition accordingly yields:

Definition 3.4 Similarity 3
A scientific model M represents a target T iff M and T are similar in relevant respects and to the relevant degrees and M denotes T.

This account would also solve the problem with reflexivity (and symmetry), because denotation is directional in a way similarity is not. Unfortunately similarity 3 (Definition 3.4) still suffers from the first and the third problems. It would still lead to the conflation of misrepresentations with nonrepresentations because the first conjunct (similar in the relevant respects) would still be false. And a nonexistent system cannot be denoted and so we have to conclude that models of, say, the ether and phlogiston represent nothing. This seems an unfortunate consequence because there is a clear sense in which models without targets are about something. Maxwell's writings on the ether provide a detailed and intelligible account of a number of properties of the
60 Part A Theoretical Issues in Models
ether, and these properties are highlighted in the model. If ether existed then similarity 3 (Definition 3.4) could explain why these were important by appealing to them as being relevant for the similarity between an ether model and its target. But since ether does not exist, no such explanation is offered.

A different version of the similarity view sets aside the moves made in similarity 3 (Definition 3.4) and tries to improve on similarity 2 (Definition 3.3). The crucial move is to take the very act of asserting a specific similarity between a model and a target as constitutive of the scientific representation.

Definition 3.5 Similarity 4

representation. This involves adopting an agent-based notion of representation that focuses on “the activity of representing” [3.60, p. 743]. Analyzing epistemic representation in these terms amounts to analyzing schemes like “S uses X to represent W for purposes P” [3.60, p. 743], or in more detail [3.51, p. 274]:

“Agents (1) intend; (2) to use model, M; (3) to represent a part of the world W; (4) for purposes, P. So agents specify which similarities are intended and for what purpose.”

This conception of representation had already been proposed half a century earlier by Apostel when he urged the following analysis of model-representation [3.75,
users directly into an answer to the ER-problem, similarity 5 (Definition 3.6) is explicitly not a naturalistic account (in contrast, for example, to similarity 1 (Definition 3.2)). As noted in Sect. 3.2 we do not demand a naturalistic account of model-representation (and as we will see later, many of the more developed answers to the ER-problem are also not naturalistic accounts).

Does this suggest that similarity 5 (Definition 3.6) is a successful similarity-based solution to the ER-problem? Unfortunately not. A closer look at similarity 5 (Definition 3.6) reveals that the role of similarity has shifted. As far as offering a solution to the ER-problem is concerned, all the heavy lifting in similarity 5 (Definition 3.6) is done by the appeal to agents and similarity has in fact become an idle wheel. Giere implicitly admits this when he writes [3.60, p. 747]:

“How do scientists use models to represent aspects of the world? What is it about models that makes it possible to use them in this way? One way, perhaps the most important way, but probably not the only way, is by exploiting similarities between a model and that aspect of the world it is being used to represent. Note that I am not saying that the model itself represents an aspect of the world because it is similar to that aspect. There is no such representational relationship. [footnote omitted] Anything is similar to anything else in countless respects, but not anything represents anything else. It is not the model that is doing the representing; it is the scientist using the model who is doing the representing.”

But if similarity is not the only way in which a model can be used as a representation, and if it is the use by a scientist that turns a model into a representation (rather than any mind-independent relationship the model bears to the target), then similarity has become otiose in a reply to the ER-problem. A scientist could invoke any relation between M and T and M would still represent T. Being similar in the relevant respects and to the relevant degrees now plays the role either of a representational style, or of a normative criterion for accurate representation, rather than of a grounding of representation. We assess in the next section whether similarity offers a cogent reply to the issues of style and accuracy.

A further problem is that there seems to be a hidden circularity in the analysis. As Toon [3.49, pp. 251–252] points out, having a scientist form a theoretical hypothesis about the similarity relation between two objects A and B and exploit this similarity for a certain purpose P is not sufficient for representation. A and B could be two cars in a showroom and an engineer inspects car A and then uses her knowledge about similarities to make assertions about B (for instance if both cars are of the same brand she can infer something about B's quality of manufacturing). This, Toon submits, is not a case of representation: neither car is representational. Yet, if we delete the expression to represent on the right-hand side of the biconditional in similarity 5 (Definition 3.6), the resulting condition provides an accurate description of what happens in the showroom. So the only difference between the nonrepresentational activity of comparing cars and representing B by A is that in one case A is used to represent and in the other it's only used. So representation is explained in terms of to represent, which is circular. So similarity 5 (Definition 3.6) does not provide nontrivial conditions for something to be used as a representation.

One way around the problem would be to replace to represent by to denote. This, however, would bring the account close to similarity 3 (Definition 3.4), and it would suffer from the same problems.

Mäki [3.79] suggested an extension of similarity 5 (Definition 3.6), which he explicitly brands as “a (more or less explicit) version” of Giere's. Mäki adds two conditions to Giere's: the agent uses the model to address an audience E and adds a commentary C [3.79, p. 57]. The role of the commentary is to specify the nature of the similarity. This is needed because [3.79, p. 57]:

“representation does not require that all parts of the model resemble the target in all or just any arbitrary respects, or that the issue of resemblance legitimately arises in regard to all parts. The relevant model parts and the relevant respects and degrees of resemblance must be delimited.”

What these relevant respects and degrees of resemblance are depends on the purposes of the scientific representation in question. These are not determined in the model as it were, but are pragmatic elements. From this it transpires that in effect C plays the same role as that played by theoretical hypotheses in Giere's account. Certain aspects of M are chosen as those relevant to the representational relationship between M and T.

The addition of an audience, however, is problematic. While models are often shared publicly, this does not seem to be a necessary condition for the representational use of a model. There is nothing that precludes a lone scientist from coining a model M and using it representationally. That some models are easier to grasp, and therefore serve as more effective tools to drive home a point in certain public settings, is an indisputable fact, but one that has no bearing on a model's status as a representation. The pragmatics of communication and the semantics of modeling are separate issues.

The conclusion we draw from this discussion is that similarity does not offer a viable answer to the ER-problem.
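The logical objections raised in this subsection can be made concrete in a toy setting. The following Python sketch is only an illustration under stated assumptions: the object names, their property sets, and the `DENOTES` relation are all hypothetical, and similarity is read in the crudest co-instantiation sense (sharing at least one property), with the relevant respects and degrees of Definitions 3.3 and 3.4 left implicit. It shows that a bare similarity criterion (Definition 3.2) is symmetric, reflexive, and overinclusive, whereas adding a directional denotation requirement (Definition 3.4) removes those particular verdicts.

```python
# Toy objects: each name maps to a set of properties (all hypothetical).
PROPS = {
    "newtonian_model": {"central_force", "orbits"},
    "solar_system": {"central_force", "orbits", "made_of_matter"},
    "book_copy_1": {"paper", "same_text"},
    "book_copy_2": {"paper", "same_text"},
}

# Toy stand-in for denotation: a set of ordered (model, target) pairs.
DENOTES = {("newtonian_model", "solar_system")}


def similar(a, b):
    """Assumed co-instantiation reading: a and b share at least one property."""
    return bool(PROPS[a] & PROPS[b])


def represents_sim1(m, t):
    """Definition 3.2: M represents T iff M and T are similar."""
    return similar(m, t)


def represents_sim3(m, t):
    """Definition 3.4 in toy form: M and T are similar and M denotes T."""
    return similar(m, t) and (m, t) in DENOTES


# Similarity 1 inherits symmetry and reflexivity from similarity itself.
print(represents_sim1("solar_system", "newtonian_model"))
print(represents_sim1("newtonian_model", "newtonian_model"))
# Similarity 3 is directional because the denotation pairs are ordered.
print(represents_sim3("newtonian_model", "solar_system"))
print(represents_sim3("solar_system", "newtonian_model"))
```

The sketch does not touch the remaining objections: a misrepresenting model would still fail the similarity conjunct, and a targetless model such as an ether model has no entry to compare against at all.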
3.3.2 Accuracy and Style

Accounting for the possibility of misrepresentation resulted in a shift of the division of labor for the more developed similarity-based accounts. Rather than being the relation that grounds representation, similarity should be considered as setting a standard of accuracy or as providing an answer to the question of style (or both). The former is motivated by the observation that a proposed similarity between M and T could be wrong, and hence if the model user's proposal does in fact hold (and M and T are in fact similar in the specified way) then M is an accurate representation of T. The latter transpires from the simple observation that a judgment of accuracy in fact presupposes a choice of respects in which M and T are claimed to be similar. Simply proposing that they are similar in some unspecified respect is vacuous. But delineating relevant properties could potentially provide an answer to the problem of style. For example, if M and T are proposed to be similar with respect to their causal structure, then we might have a style of causal modeling; if M and T are proposed to be similar with respect to structural properties, then we might have a style of structural modeling; and so on and so forth. So the idea is that if M representing T involves the claim that M and T are similar in a certain respect, the respect chosen specifies the style of the representation; and if M and T are in fact similar in that respect (and to the specified degree), then M accurately represents T within that style.

In this section we investigate both options. But before delving into the details, let us briefly step back and reflect on possible constraints on viable answers. Taking his cue from Lopes' [3.59] discussion of pictures, Downes [3.80, pp. 421–422] proposes two constraints on allowable notions of similarity. The first, which he calls the independence challenge, requires that a user must be able to specify the relevant representation-grounding similarity before engaging in a comparison between M and T. Similarities that are recognizable only with hindsight are an unsound foundation of a representation. We agree with this requirement, which in fact is also a consequence of the surrogative reasoning condition: a model can generate novel hypotheses only if (at least some of the) similarity claims are not known only ex post facto.

Downes' second constraint, the diversity constraint, is the requirement that the relevant notion of similarity has to be identical in all kinds of representation and across all representational styles. So all models must bear the same similarity relations to their targets. Whatever its merits in the case of pictorial representation, this observation does not hold water in the case of scientific representation. Both Giere and Teller have insisted – rightly, in our view – that there need not be a substantive sense of similarity uniting all representations (see also Callender and Cohen [3.26, p. 77] for a discussion). A proponent of the similarity view is free to propose different kinds of similarity for different representations and is under no obligation to also show that they are special cases of some overarching conception of similarity.

We now turn to the issue of style. A first step in the direction of an understanding of styles is the explicit analysis of the notion of similarity. Unfortunately the philosophical literature contains surprisingly little explicit discussion about what it means for something to be similar to something else. In many cases similarity is taken to be primitive, possible worlds semantics being a prime example. The problem is then compounded by the fact that the focus is on comparative overall similarity rather than on similarity in respects and degrees; for a critical discussion see [3.81]. Where the issue is discussed explicitly, the standard way of cashing out what it means for an object to be similar to another object is to require that they co-instantiate properties. This is the idea that Quine [3.82, pp. 117–118] and Goodman [3.83, p. 443] had in mind in their influential critiques of the notion. They note that if all that is required for two things to be similar is that they co-instantiate some property, then everything is similar to everything else, since any pair of objects has at least one property in common.

The issue of similarity seems to have attracted more attention in psychology. In fact, the psychological literature provides more fully worked-out formal accounts that aim to capture it directly. The two most prominent suggestions are the geometric and contrast accounts (see [3.84] for an up-to-date discussion). The former, associated with Shepard [3.85], assigns objects a place in a multidimensional space based on values assigned to their properties. This space is then equipped with a metric and the degree of similarity between two objects is a function of the distance between the points representing the two objects in that space.

This account is based on the strong assumption that values can be assigned to all features relevant to similarity judgments, which is deemed unrealistic. This problem is supposed to be overcome in Tversky's contrast account [3.86]. This account defines a gradated notion of similarity based on a weighted comparison of properties. Weisberg ([3.33, Chap. 8], [3.87]) has recently introduced this account into the philosophy of science where it serves as the starting point for his so-called weighted feature matching account of model-world relations. This account is our primary interest here.
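The two psychological accounts just introduced can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a rendering of any author's implementation: the exponential decay used for the geometric account is one assumed choice of decreasing function of distance, the contrast score takes the shared-minus-unshared form that the text goes on to state, and the parameter names `theta`, `alpha`, `beta`, the default weights, and the example property sets are all hypothetical labels chosen here.

```python
import math


def geometric_similarity(x, y):
    """Geometric account: similarity is a decreasing function of distance.

    x and y are equal-length tuples of feature values. The exponential
    decay is an assumed choice; any decreasing function of a metric
    yields a symmetric notion of similarity.
    """
    return math.exp(-math.dist(x, y))


def count(props):
    """Simplest ranking function f: the number of properties in the set."""
    return len(props)


def contrast_similarity(model, target, theta=1.0, alpha=0.5, beta=0.5, f=count):
    """Contrast (weighted feature matching) score S(M, T).

    S = theta*f(M & T) - alpha*f(M - T) - beta*f(T - M), where model and
    target are the sets of relevant properties instantiated by M and T.
    """
    return (theta * f(model & target)
            - alpha * f(model - target)
            - beta * f(target - model))


# Hypothetical property sets for a toy model and its target.
model_props = {"has_mass", "point_particle", "inverse_square_force"}
target_props = {"has_mass", "inverse_square_force", "extended_body", "has_moons"}

# With alpha != beta the score depends on the order of the arguments.
s_mt = contrast_similarity(model_props, target_props, alpha=0.8, beta=0.2)
s_tm = contrast_similarity(target_props, model_props, alpha=0.8, beta=0.2)
print(s_mt, s_tm)

# Comparing an object with itself leaves no unshared properties to penalize.
print(contrast_similarity(model_props, model_props, alpha=0.8, beta=0.2))
```

With equal values for `alpha` and `beta` the contrast score becomes symmetric again, which mirrors the earlier remark that degrees alone do not produce asymmetry.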
The account introduces a set $\Delta$ of relevant properties. Let then $\Delta_M$ be the set of properties from $\Delta$ that are instantiated by the model M; likewise $\Delta_T$ is the set of properties from $\Delta$ instantiated by the target system. Furthermore let $f$ be a ranking function assigning a real number to every subset of $\Delta$. The simplest version of a ranking function is one that assigns to each set the number of properties in the set, but rankings can be more complex, for instance by giving important properties more weight. The level of similarity between M and T is then given by the following equation [3.87, p. 788] (the notation is slightly amended)

$$S(M,T) = \theta f(\Delta_M \cap \Delta_T) - \alpha f(\Delta_M - \Delta_T) - \beta f(\Delta_T - \Delta_M),$$

where $\alpha$, $\beta$ and $\theta$ are weights, which can in principle take any value. This equation provides “a similarity score that can be used in comparative judgments of similarity” [3.87, p. 788]. The score is determined by weighing the properties the model and target have in common against those they do not. (Thus we note that this account could be seen as a quantitative version of Hesse's [3.88] theory of analogy in which properties that M and T share are the positive analogy and ones they don't share are the negative analogy.) In the above formulation the similarity score $S$ can in principle vary between any two values (depending on the choice of the ranking function and the value of the weights). One can then use standard mathematical techniques to renormalize $S$ so that it takes values in the unit interval $[0, 1]$ (these technical moves need not occupy us here and we refer the reader to Weisberg for details [3.33, Chap. 8]).

The obvious question at this point is how the various blanks in the account can be filled. First in line is the specification of a property set $\Delta$. Weisberg is explicit that there are no general rules to rely on and that “the elements of $\Delta$ come from a combination of context, conceptualization of the target, and theoretical goals of the scientist” [3.33, p. 149]. Likewise, the ranking function as well as the values of weighting parameters depend on the goals of the investigation, the context, and the theoretical framework in which the scientists operate. Weisberg further divides the elements of $\Delta$ into attributes and mechanisms. The former are “the properties and patterns of a system” while the latter are the “underlying mechanism[s] that generates these properties” [3.33, p. 145]. This distinction is helpful in the application to concrete cases, but for the purpose of our conceptual discussion it can be set aside.

Irrespective of these choices, the similarity score $S$ has a number of interesting features. First, it is asymmetrical for $\alpha \neq \beta$, which makes room for the possibility of M being similar to T to a different degree than T is similar to M. So $S$ provides the asymmetrical notion of similarity mentioned in Sect. 3.3.1. Second, $S$ has a property called maximality: everything is maximally similar to itself and every other nonidentical object is equally or less similar. Formally: $S(A,A) \geq S(A,B)$ for all objects A and B as long as $A \neq B$ [3.33, p. 154].

What does this account contribute to a response to the question of style? The answer, we think, is that it has heuristic value but does not provide a substantive account. In fact, stylistic questions stand outside the proposed framework. The framework can be useful in bringing questions into focus, but eventually the substantive stylistic questions concern inclusion criteria for $\Delta$ (what properties do we focus on?), the weight given by $f$ to properties (what is the relative importance of properties?) and the value of the parameters (how significant are disagreements between the properties of M and T?). These questions have to be answered outside the account. The account is a framework in which questions can be asked but which does not itself provide answers, and hence no classification of representational styles emerges from it.

Some will say that this is old news. Goodman denounced similarity as “a pretender, an impostor, a quack” [3.83, p. 437] not least because he thought that it merely put a label to something unknown without analyzing it. And even some proponents of the similarity view have insisted that no general characterization of similarity was possible. Thus Teller submits that [3.52, p. 402]:

“[t]here can be no general account of similarity, but there is also no need for a general account because the details of any case will provide the information which will establish just what should count as relevant similarity in that case.”

This amounts to nothing less than the admission that no analysis of similarity (or even different kinds of similarity) is possible and that we have to deal with each case in its own right.

Assume now, for the sake of argument, that the stylistic issues have been resolved and full specifications of relevant properties and their relative weights are available. It would then seem plausible to say that $S(M,T)$ provides a degree of accuracy. This reading is supported by the fact that Weisberg paraphrases the role of $S(M,T)$ as providing “standards of fidelity” [3.33, p. 147]. Indeed, in response to Parker [3.89], Weisberg claims that his weighted feature matching account is supposed to answer the ER-problem and provide standards of accuracy.

As we have seen above, $S(M,T)$ is maximal if M is a perfect replica of T (with respect to the properties
in $\Delta$), and the fewer properties M and T share, the less accurate the representation becomes. This lack of accuracy is then reflected in a lower similarity score. This is plausible and Weisberg’s account is indeed a step forward in the direction of quantifying accuracy.
Weisberg’s account is an elaborate version of the co-instantiation account of similarity. It improves significantly on simple versions, but it cannot overcome that account’s basic limitations. Niiniluoto distinguishes between two different kinds of similarities [3.90, pp. 272–274]: partial identity and likeness (which also feature in Hesse’s discussion of analogies, see, for instance [3.88, pp. 66–67]). Assume M instantiates the relevant properties $P_1, \ldots, P_n$ and T instantiates the relevant properties $Q_1, \ldots, Q_n$. If these properties are identical, i.e., if
[...]
similar occurs outside of the formal account itself. The inclusion criteria on what goes into $\Delta$ now not only have to delineate relevant properties, but, at least for the quantitative ones, also have to provide an interval defining when they qualify as similar. Furthermore, it remains unclear how to account for M and T being alike with respect to their qualitative properties. The similarity between genuinely qualitative properties cannot be accounted for in terms of numerical intervals. This is a particularly pressing problem for Weisberg, because he takes the ability to compare models and their targets with respect to their qualitative properties as a central desideratum for any account of similarity between the two [3.33, p. 136].
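The behavior of a weighted feature-matching score can be illustrated with a small sketch. It does not reproduce Weisberg's formula, which is stated earlier in the chapter. It assumes a Tversky-style contrast form, takes set cardinality as the ranking function, and uses hypothetical names throughout (`contrast_score`, `theta`, `alpha`, `beta`, and the feature labels `p1` to `p5`). Under those assumptions it shows that the score is asymmetric when the two penalty weights differ, symmetric when they are equal, and largest when the model shares every feature of the target.

```python
def contrast_score(model, target, theta=1.0, alpha=0.5, beta=0.5, rank=len):
    """Tversky-style contrast score (an assumed form, not the chapter's own).

    Weighted shared features minus weighted features found only in the
    model and only in the target. `rank` is the ranking function; here it
    defaults to set cardinality.
    """
    shared = model & target
    only_model = model - target
    only_target = target - model
    return (theta * rank(shared)
            - alpha * rank(only_model)
            - beta * rank(only_target))


# Hypothetical feature sets; the labels carry no meaning.
M = {"p1", "p2", "p3"}
T = {"p2", "p3", "p4", "p5"}

# Unequal weights: comparing M to T differs from comparing T to M.
m_to_t = contrast_score(M, T, alpha=0.75, beta=0.25)
t_to_m = contrast_score(T, M, alpha=0.75, beta=0.25)

# Equal weights: the score is symmetric.
sym_a = contrast_score(M, T)
sym_b = contrast_score(T, M)

# A perfect replica of T has nothing subtracted from its score.
replica = contrast_score(T, T)

print(m_to_t, t_to_m, sym_a, sym_b, replica)
```

Note that everything philosophically contentious sits outside the function: which features go into the sets, and how the weights are chosen, are inputs that the formal score does not itself supply.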
objects is rather large. Numbers and other objects of pure mathematics, classes, propositions, concepts, the letter A, and Dante’s Inferno are abstract objects [3.95], and Hale [3.96, pp. 86–87] lists no less than 12 different possible characterizations of abstract objects. At the very least this list shows that there is great variety in abstract objects and classifying models as abstract objects adds little specificity to an account of what models are. Giere could counter that he limits attention to those abstract objects that possess “all and only the characteristics specified in the principles” [3.60, p. 745], where principles are general rules like Newton’s laws of motion. He further specifies that he takes “abstract entities to be human constructions” and that “abstract models are definitely not to be identified with linguistic entities such as words or equations” [3.60, p. 747]. While this narrows down the choices somewhat, it still leaves many options and ultimately the ontological status of models in a similarity account remains unclear.
Giere fails to expand on this ontological issue for a reason: he dismisses the problem as one that philosophers of science can set aside without loss. He voices skepticism about the view that philosophers of science “need a deeper understanding of imaginative processes and of the objects produced by these process” [3.97, p. 250] or that “we need say much more [...] to get on with the job of investigating the functions of models in science” [3.97].
We remain unconvinced about this skepticism, not least because there is an obvious yet fundamental issue with abstract objects. No matter how the above issues are resolved (and irrespective of whether they are resolved at all), at the minimum it is clear that models are abstract in the sense that they have no spatiotemporal location. Teller [3.52, p. 399] and Thomson-Jones [3.98] supply arguments suggesting that this alone causes serious problems for the similarity account. The similarity account demands that models can instantiate properties and relations, since this is a necessary condition on them being similar to their targets. In particular, it requires that models can instantiate the properties and relations mentioned in theoretical hypotheses or commentaries. But such properties and relations are typically physical. And if models have no spatiotemporal location, then they do not instantiate any such properties or relations. Thomson-Jones’ example of the idealized pendulum model makes this clear. If the idealized pendulum is abstract then it is difficult to see how to make sense of the idea that it has a length, or a mass, or an oscillation period of any particular time.
An alternative suggestion due to Teller [3.52] is that we should instead say that whilst “concrete objects HAVE properties [...] properties are PARTS of models” [3.52, p. 399] (original capitalization). It is not entirely clear what Teller means by this, but our guess is that he would regard models as bundles of properties. Target systems, as concrete objects, are the sorts of things that can instantiate properties delineated by theoretical hypotheses. Models, since they are abstract, cannot. But rather than being objects instantiating properties, a model can be seen as a bundle of properties. A collection of properties is an abstract entity that is the sort of thing that can contain the properties specified by theoretical hypotheses as parts. The similarity relation between models and their targets shifts from the co-instantiation of properties, to the idea that targets instantiate (relevant) properties that are parts of the model. With respect to what it means for a model to be a bundle of properties Teller claims that the “[d]etails will vary with ones account of instantiation, of properties and other abstract objects, and of the way properties enter into models” [3.52].
But as Thomson-Jones [3.98, pp. 294–295] notes, it is not obvious that this suggestion is an improvement on Giere’s abstract objects. A bundle view incurs certain metaphysical commitments, chiefly the existence of properties and their abstractness, and a bundle view of objects, concrete or abstract, faces a number of serious problems [3.99]. One might speculate that addressing these issues would push Teller either towards the kind of more robust account of abstract objects that he endeavored to avoid, or towards a fictionalist understanding of models.
The latter option has been discussed by Giere, who points out that a natural response to Teller’s and Thomson-Jones’ problem is to regard models as akin to imaginary or fictional systems of the sort presented in novels and films. It seems true to say that Sherlock is a smoker, despite the fact that Sherlock is an imaginary detective, and smoking is a physical property. At times, Giere seems sympathetic to this view. He says [3.97, p. 249]:
“it is widely assumed that a work of fiction is a creation of human imagination [...] the same is true of scientific models. So, ontologically, scientific models and works of fiction are on a par. They are both imaginary constructs.”
And he observes that [3.51, p. 278]:
“novels are commonly regarded as works of imagination. That, ontologically, is how we should think of abstract scientific models. They are creations of scientists imaginations. They have no ontological status beyond that.”
However, these seem to be occasional slips and he recently positioned himself as an outspoken opponent of any approach to models that likens them to literary
fiction. We discuss these approaches as well as Giere’s criticisms of them in Sect. 3.6.
In sum, the similarity view is yet to be equipped with a satisfactory account of the ontology of models.
posed as an account of theory structure rather than model-representation. The driving idea behind the position is that scientific theories are best thought of as collections of models. This invites the questions: What are these models, and how do they represent their target systems? Defenders of the semantic view of theories take models to be structures, which represent their target systems in virtue of there being some kind of mapping (isomorphism, partial isomorphism, homomorphism, ...) between the two. (It is worth noting that Giere, whose account of scientific representation we discussed in the previous section, is also associated with the semantic view, despite not subscribing to either of these positions.)
This conception has two prima facie advantages. The first advantage is that it offers a straightforward answer to the ER-problem, and one that accounts for surrogative reasoning: the mappings between the model and the target allow scientists to convert truths found in the model into claims about the target system. The second advantage concerns the applicability of mathematics. There is a time-honored position in the philosophy of mathematics that sees mathematics as the study of structures; see, for instance, Resnik [3.108] and Shapiro [3.109]. It is a natural move for the scientific structuralist to adopt this point of view, which, without further ado, provides a neat explanation of how mathematics is used in scientific modeling.
3.4.1 Structures and the Problem of Ontology
Almost anything from a concert hall to a kinship system can be referred to as a structure. So the first task for a structuralist account of representation is to articulate what notion of structure it employs. A number of different notions of structure have been discussed in the literature (for a review see Thomson-Jones [3.110]), but by far the most common and widely used is the notion
[...]
widely used in mathematics and logic; see for instance Machover [3.111, p. 149], Hodges [3.112, p. 2], and Rickart [3.113, p. 17]. It is convenient to write these as $S = \langle U, R \rangle$, where $\langle \, , \rangle$ denotes an ordered tuple. Sometimes operations are also included in the definition of a structure. While convenient in some applications, operations are redundant because operations reduce to relations (see Boolos and Jeffrey [3.114, pp. 98–99]).
It is important to be clear on what we mean by object and relation in this context. As Russell [3.115, p. 60] points out, in defining the domain of a structure it is irrelevant what the objects are. All that matters from a structuralist point of view is that there are so and so many of them. Whether the object is a desk or a planet is irrelevant. All we need are dummies or placeholders whose only property is objecthood. Similarly, when defining relations one disregards completely what the relation is in itself. Whether we talk about being the mother of or standing to the left of is of no concern in the context of a structure; all that matters is between which objects it holds. For this reason, a relation is specified purely extensionally: as a class of ordered n-tuples. The relation literally is nothing over and above this class. So a structure consists of dummy objects between which purely extensionally defined relations hold.
Let us illustrate this with an example. Consider the structure with the domain $U = \{a, b, c\}$ and the following two relations: $r_1 = \{a\}$ and $r_2 = \{\langle a, b \rangle, \langle b, c \rangle, \langle a, c \rangle\}$. Hence R consists of $r_1$ and $r_2$, and the structure itself is $S = \langle U, R \rangle$. This is a structure with a three-object domain endowed with a monadic property and a transitive relation. Whether the objects are books or iron rods is of no relevance to the structure; they could be literally anything one can think of. Likewise $r_1$ could be literally any monadic property (being green, being waterproof, etc.) and $r_2$ could be any (irreflexive) transitive relation (larger than, hotter than, more expensive than, etc.).
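The example can be written out as a short sketch. It encodes the structure with domain {a, b, c}, a monadic property and a binary relation as plain sets of tuples, checks that the binary relation is transitive and irreflexive, and then relabels the objects to show that nothing structural changes. All function names and the replacement labels ("book", "rod", "planet") are hypothetical. The helper `isomorphic` searches for a relation-preserving bijection, and pairing relations by their position in the list is an assumption of this sketch, not something the text prescribes.

```python
from itertools import permutations

# The structure from the example: a domain and a list of relations,
# each relation given purely extensionally as a set of tuples.
U = {"a", "b", "c"}
r1 = {("a",)}                                # monadic property
r2 = {("a", "b"), ("b", "c"), ("a", "c")}    # binary relation
S = (U, [r1, r2])


def is_transitive(rel):
    """True if (x, y) and (y, z) in rel always implies (x, z) in rel."""
    return all((x, z) in rel
               for (x, y) in rel
               for (y2, z) in rel
               if y == y2)


def is_irreflexive(rel):
    """True if no object stands in the relation to itself."""
    return all(x != y for (x, y) in rel)


def relabel(structure, mapping):
    """Replace every object by its image under `mapping`."""
    domain, relations = structure
    return ({mapping[x] for x in domain},
            [{tuple(mapping[x] for x in tup) for tup in rel}
             for rel in relations])


def isomorphic(s1, s2):
    """Brute-force search for a bijection that carries each relation of s1
    exactly onto the relation of s2 at the same list position."""
    (u1, rels1), (u2, rels2) = s1, s2
    if len(u1) != len(u2) or len(rels1) != len(rels2):
        return False
    source = list(u1)
    for image in permutations(list(u2)):
        f = dict(zip(source, image))
        if all({tuple(f[x] for x in tup) for tup in ra} == rb
               for ra, rb in zip(rels1, rels2)):
            return True
    return False


# Swapping the dummy objects for other objects leaves the structure intact.
S_things = relabel(S, {"a": "book", "b": "rod", "c": "planet"})

# Dropping one pair from r2 gives a different, non-transitive structure.
S_chain = (U, [r1, {("a", "b"), ("b", "c")}])

print(is_transitive(r2), is_irreflexive(r2))
print(isomorphic(S, S_things), isomorphic(S, S_chain))
```

The search tries every bijection between the two domains, which is only practical for very small structures; it is meant to make the extensional reading of relations concrete, not to be an efficient test.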
It is worth pointing out that this use of structure differs from the use one sometimes finds in logic, where linguistic elements are considered part of the model as well. Specifically, over and above $S = \langle U, R \rangle$, a structure is also taken to include a language (sometimes called a signature) L, and an interpretation function ([3.112, Chap. 1] and [3.116, pp. 80–81]). But in the context of the accounts discussed in this section, a structure is the ordered pair $S = \langle U, R \rangle$ as introduced above and so we disregard this alternative use of structure.
The first basic posit of the structuralist theory of representation is that models are structures in the above sense (the second is that models represent their targets by being suitably morphic to them; we discuss morphisms in the next subsection). Suppes has articulated this stance clearly when he declared that “the meaning of the concept of model is the same in mathematics and the empirical sciences” [3.117, p. 12]. Likewise, van Fraassen posits that a “scientific theory gives us a family of models to represent the phenomena”, that “[t]hese models are mathematical entities, so all they have is structure [...]” [3.118, pp. 528–529] and that therefore [3.118, p. 516]
“[s]cience is [...] interpreted as saying that the entities stand in relations which are transitive, reflexive, etc. but as giving no further clue as to what those relations are.”
Redhead submits that “it is this abstract structure associated with physical reality that science aims, and to some extent succeeds, to uncover [...]” [3.119, p. 75]. Finally, French and Ladyman affirm that “the specific material of the models is irrelevant; rather it is the structural representation [...] which is important” [3.120, p. 109]. Further explicit statements of this view are offered by: Da Costa and French [3.121, p. 249], Suppes ([3.122, p. 24], [3.123, Chap. 2]) and van Fraassen ([3.101, pp. 43, 64], [3.118, pp. 516, 522], [3.124, p. 483], [3.125, p. 6]).
These structuralist accounts have typically been proposed in the framework of the so-called semantic view of theories. There are differences between them, and formulations vary from author to author. However, as Da Costa and French [3.126] point out, all these accounts share a commitment to analyzing models as structures. So we are presented with a clear answer to the problem of ontology: models are structures. The remaining issue is what structures themselves are. Are they platonic entities, equivalence classes, modal constructs, or yet something else? This is a hotly debated issue in the philosophy of logic and mathematics; for different positions see for instance Dummett [3.127, 295ff.], Hellman [3.128, 129], Redhead [3.119], Resnik [3.108], and Shapiro [3.109]. But philosophers of science need not resolve this issue and can pass off the burden of explanation to philosophers of mathematics. This is what usually happens, and hence we don’t pursue this matter further.
An extension of the standard conception of structure is the so-called partial structures approach (for instance, Da Costa and French [3.102] and Bueno et al. [3.130]). Above we defined relations by specifying which tuples belong to them. This naturally allows a sorting of all tuples into two classes: ones that belong to the relation and ones that don’t. The leading idea of partial structures is to introduce a third option: for some tuples it is indeterminate whether or not they belong to the relation. Such a relation is a partial relation. A structure with a set R containing partial relations is a partial structure (formal definitions can be found in references given above). Partial structures make room for a process of scientific investigation where one begins not knowing whether a tuple falls under the relation and then learns whether or not it does.
Proponents of that approach are more guarded as regards the ontology of models. Bueno and French emphasize that “advocates of the semantic account need not be committed to the ontological claim that models are structures” [3.53, p. 890] (original emphasis). This claim is motivated by the idea that the task for philosophers of science is to represent scientific theories and models, rather than to reason about them directly. French [3.131] makes it explicit that according to his account of the semantic view of theories, a scientific theory is represented as a class of models, but should not be identified with that class. Moreover, a class of models is just one way of representing a theory; we can also use an intrinsic characterization and represent the same theory as a set of sentences in order to account for how they can be objects of our epistemic attitudes [3.132].
He therefore adopts a quietist position with respect to what a theory or a model is, declining to answer the question [3.131, 133]. There are thus two important notions of representation at play: representation of targets by models, which is the job of scientists, and representation of theories and models by structures, which is the job of philosophers of science. The question for this approach then becomes whether or not the structuralist representation of models and epistemic representation – as partial structures and morphisms that hold between them – is an accurate or useful one. And the concerns raised below remain when translated into this context as well.
There is an additional question regarding the correct formal framework for thinking about models in the structuralist position. Landry [3.134] argues that in certain contexts group, rather than set, theory should
be used when talking about structures and morphisms between them, and Halvorson [3.135, 136] argues that theories should be identified with categories rather than classes or sets. Although these discussions highlight important questions regarding the nature of scientific theories, the question of how individual models represent remains unchanged. Halvorson still takes individual models to be set-theoretic structures. And Landry’s paper is not an attempt to reframe the representational relationship between models and their targets (see [3.137] for her skepticism regarding how structuralism deals with this question). Thus, for reasons of simplicity we will focus on the structuralist view that identifies models with set-theoretic structures throughout the rest of this section.
3.4.2 Structuralism and the ER-Problem
The most basic structuralist conception of scientific representation asserts that scientific models, understood as structures, represent their target systems in virtue of being isomorphic to them. Two structures $S_a = \langle U_a, R_a \rangle$ and $S_b = \langle U_b, R_b \rangle$ are isomorphic iff there is a mapping $f: U_a \to U_b$ such that (i) f is one-to-one (bijective) and (ii) f preserves the system of relations in the following sense: the members $a_1, \ldots, a_n$ of $U_a$ satisfy the relation $r_a$ of $R_a$ iff the corresponding members $b_1 = f(a_1), \ldots, b_n = f(a_n)$ of $U_b$ satisfy the relation $r_b$ of $R_b$, where $r_b$ is the relation corresponding to $r_a$ (for difficulties in how to cash out this notion of correspondence without reference to an interpretation function see Halvorson [3.135] and Glymour [3.138]).
Assume now that the target system T exhibits the structure $S_T = \langle U_T, R_T \rangle$ and the model is the structure $S_M = \langle U_M, R_M \rangle$. Then the model represents the target iff it is isomorphic to the target:
Definition 3.7 Structuralism 1
A scientific model M represents its target T iff $S_M$ is isomorphic to $S_T$.
This view is articulated explicitly by Ubbink, who posits that [3.139, p. 302]
“a model represents an object or matter of fact in virtue of this structure; so an object is a model [...] of matters of fact if, and only if, their structures are isomorphic.”
Views similar to Ubbink’s seem operable in many versions of the semantic view. In fairness to proponents of the semantic view it ought to be pointed out, though, that for a long time representation was not the focus of attention in the view and the attribution of (something like) structuralism 1 (Definition 3.7) to the semantic view is an extrapolation. Representation became a much-debated topic in the first decade of the 21st century, and many proponents of the semantic view then either moved away from structuralism 1 (Definition 3.7), or pointed out that they never held such a view. We turn to more advanced positions shortly, but to understand what motivates such positions it is helpful to understand why structuralism 1 (Definition 3.7) fails.
An immediate question concerns the target end structure $S_T$. At least prima facie target systems aren’t structures; they are physical objects like planets, molecules, bacteria, tectonic plates, and populations of organisms. An early recognition that the relation between targets and structures is not straightforward can be found in Byerly, who emphasizes that structures are abstracted from objects [3.103, pp. 135–138]. The relation between structures and physical targets is indeed a serious question and we will return to it in Sect. 3.4.4. In this subsection we grant the structuralist the assumption that target systems are (or at least have) structures.
The first and most obvious problem is the same as with the similarity view: isomorphism is symmetrical, reflexive, and transitive, but epistemic representation isn’t. This problem could be addressed by replacing isomorphism with an alternative mapping. Bartels [3.140], Lloyd [3.141], and Mundy [3.142] suggest homomorphism; van Fraassen [3.36, 101, 118] and Redhead isomorphic embeddings [3.119]; advocates of the partial structures approach prefer partial isomorphisms [3.102, 120, 121, 143–145]; and Swoyer [3.25] introduces what he calls $\Delta/\Psi$-morphisms. We refer to these collectively as morphisms.
This solves some, but not all problems. While many of these mappings are asymmetrical, they are all still reflexive, and at least some of them are also transitive. But even if these formal issues could be resolved in one way or another, a view based on structural mappings would still face other serious problems. For ease of presentation we discuss these problems in the context of the isomorphism view; mutatis mutandis other formal mappings suffer from the same difficulties (for detailed discussions of homomorphism and partial isomorphism see Suárez [3.23, pp. 239–241] and Pero and Suárez [3.146]; Mundy [3.142] discusses general constraints one may want to impose on morphisms).
Like similarity, isomorphism is too inclusive: not all things that are isomorphic represent each other. In the case of similarity this point was brought home by Putnam’s thought experiment with the ant crawling on the beach; in the case of isomorphism a look at the history of science will do the job. Many mathematical structures have been discovered and discussed long before they have been used in science. Non-Euclidean geometries were studied by mathematicians long before
Einstein used them in the context of spacetime theories, and Hilbert spaces were studied by mathematicians prior to their use in quantum theory. If representation was nothing over and above isomorphism, then we would have to conclude that Riemann discovered general relativity or that Hilbert invented quantum mechanics. This is obviously wrong. Isomorphism on its own does not establish representation [3.20, p. 10].
Isomorphism is more restrictive than similarity: not everything is isomorphic to everything else. But isomorphism is still too abundant to correctly identify the extension of a representation (i.e., the class of systems it represents), which gives rise to a version of the mistargeting problem. The root of the difficulties is that the same structures can be instantiated in different target systems. The $1/r^2$ law of Newtonian gravity is also the mathematical skeleton of Coulomb’s law of electrostatic attraction and the weakening of sound or light as a function of the distance to the source. The mathematical structure of the pendulum is also the structure of an electric circuit with condenser and solenoid (a detailed discussion of this case is provided by Kroes [3.147]). Linear equations are ubiquitous in physics, economics and psychology. Certain geometrical structures are instantiated by many different systems; just think about how many spherical things we find in the world. This shows that the same structure can be exhibited by more than one target system. Borrowing a term from the philosophy of mind, one can say that structures are multiply realizable. If representation is explicated solely in terms of isomorphism, then we have to conclude that, say, a model of a pendulum also represents an electric circuit. But this seems wrong. Hence isomorphism is too inclusive to correctly identify a representation’s extension.
One might try to dismiss this point as an artifact of a misidentification of the target. Van Fraassen [3.101, p. 66] mentions a similar problem under the heading of “unintended realizations” and then expresses confidence that it will “disappear when we look at larger observable parts of the world”. Even if there are multiply realizable structures to begin with, they vanish as science progresses and considers more complex systems because these systems are unlikely to have the same structure. Once we focus on a sufficiently large part of the world, no two phenomena will have the same structure. There is a problem with this counter, however. To appeal to future science to explain how models work today seems unconvincing. It is a matter of fact that we currently have models that represent electric circuits and sound waves, and we do not have to await future science providing us with more detailed accounts of a phenomenon to make our models represent what they actually already do represent.
As we have seen in the last section, a misrepresentation is one that portrays its target as having features it doesn’t have. In the case of an isomorphism account of representation this presumably means that the model portrays the target as having structural properties that it doesn’t have. However, isomorphism demands identity of structure: the structural properties of the model and the target must correspond to one another exactly. A misrepresentation won’t be isomorphic to the target. By the lights of structuralism 1 (Definition 3.7) it is therefore not a representation at all. Like simple similarity accounts, structuralism 1 (Definition 3.7) conflates misrepresentation with nonrepresentation.
Muller [3.148, p. 112] suggests that this problem can be overcome in a two-stage process: one first identifies a submodel of the model, which in fact is isomorphic to at least a part of the target. This reduced isomorphism establishes representation. One then constructs “a tailor-made morphism on a case by case basis” [3.148, p. 112] to account for accurate representation. Muller is explicit that this suggestion presupposes that there is “at least one resemblance” [3.148, p. 112] between model and target because “otherwise one would never be called a representation of the other” [3.148, p. 112]. While this may work in some cases, it is not a general solution. It is not clear whether all misrepresentations have isomorphic submodels. Models that are gross distortions of their targets (such as the liquid drop model of the nucleus or the logistic model of a population) may well not have such submodels. More generally, as Muller admits, his solution “precludes total misrepresentation” [3.148, p. 112]. So in effect it just limits the view that representation coincides with correct representation to a submodel. However, this is too restrictive a view of representation. Total misrepresentations may be useless, but they are representations nevertheless.
Another response refers to the partial structures approach and emphasizes that partial structures are in fact constructed to accommodate a mismatch between model and target and are therefore not open to this objection [3.53, p. 888]. It is true that the partial structures framework has a degree of flexibility that the standard view does not. However, we doubt that this flexibility stretches far enough. While the partial structures approach deals successfully with incomplete representations, it does not seem to deal well with distortive representations (we come back to this point in the next subsection). So the partial structures approach, while enjoying an advantage over the standard approach, is nevertheless not yet home and dry.
Like the similarity account, structuralism 1 (Definition 3.7) has a problem with nonexistent targets because no model can be isomorphic to something that doesn’t
exist. If there is no ether, a model can’t be isomorphic to it. Hence models without target cannot represent what they seem to represent.
Most of these problems can be resolved by making moves similar to the ones that lead to similarity 5 (Definition 3.6): introduce agents and hypothetical reasoning into the account of representation. Going through the motions one finds:
Definition 3.8 Structuralism 2
A scientific model M represents a target system T iff there is an agent A who uses M to represent a target system T by proposing a theoretical hypothesis H specifying an isomorphism between $S_M$ and $S_T$.
[...]
obvious response to this challenge: one can represent a system by coming up with a model that is structurally isomorphic to it. We call this the isomorphism-style. This style also offers a clear-cut condition of accuracy: the representation is accurate if the hypothesized isomorphism holds; it is inaccurate if it doesn’t.
This is a neat answer. The question is what status it has vis-à-vis the problem of style. Is the isomorphism-style merely one style among many other styles which are yet to be identified, or is it in some sense privileged? The former is uncontentious. However, the emphasis many structuralists place on isomorphism suggests that they do not regard isomorphism as merely one way among others to represent something. What they seem
ception is French, who discusses isomorphism accounts in the context of pictorial representation [3.35]. He discusses in detail Budd’s [3.151] account of pictorial representation and points out that it is based on the notion of a structural isomorphism between the structure of the surface of the painting and the structure of the relevant visual field. Therefore representation is the perceived isomorphism of structure [3.35, pp. 1475–1476] (this point is reaffirmed by Bueno and French [3.53, pp. 864–865]; see Downes [3.80, pp. 423–425] for a critical discussion). In a similar vein, Bueno claims that the partial structures approach offers a framework in which different representations – among them “outputs of various instruments, micrographs, templates, diagrams, and a variety of other items” [3.150, p. 94] – can be accommodated. This would suggest that an isomorphism account of representation at least has a claim to being a universal account covering representations across different domains.
This approach faces a number of questions. First, neither a visual field nor a painting is a structure, and the notion of there being an isomorphism in the set theoretic sense between the two at the very least needs unpacking. The theory is committed to the claim that paintings and visual fields have structures, but, as we will see in the next subsection, this claim faces serious issues. Second, Budd’s theory is only one among many theories of pictorial representation, and most alternatives do not invoke isomorphism. So there is a question whether a universal claim can be built on Budd’s theory. In fact, there is even a question about isomorphism’s universality within scientific representation. Nonmathematized sciences work with models that aren’t structures. Godfrey-Smith [3.152], for instance, argues that models in many parts of biology are imagined concrete objects. There is a question whether isomorphism can explain how models of that kind represent.
This points to a larger issue. The structuralist view is a rational reconstruction of scientific modeling, and as such it has some distance from the actual practice. Some philosophers have worried that this distance is too large and that the view is too far removed from the actual practice of science to be able to capture what matters to the practice of modeling (this is the thrust of many contributions to [3.11]; see also [3.73]). Although some models used by scientists may be best thought of as set theoretic structures, there are many where this seems to contradict how scientists actually talk about, and reason with, their models. Obvious examples include physical models like the San Francisco bay model [3.33], but also systems such as the idealized pendulum or imaginary populations of interbreeding animals. Such models have the strange property of being concrete-if-real and scientists talk about them as if they were real systems, despite the fact that they are obviously not. Thomson-Jones [3.98] dubs this face value practice, and there is a question whether structuralism can account for that practice.
3.4.4 The Structure of Target Systems
Target systems are physical objects: atoms, planets, populations of rabbits, economic agents, etc. Isomorphism is a relation that holds between two structures and claiming that a set theoretic structure is isomorphic to a piece of the physical world is prima facie a category mistake. By definition, all of the mappings suggested – isomorphism, partial isomorphism, homomorphism, or isomorphic embedding – only hold between two structures. If we are to make sense of the claim that the model is isomorphic to its target we have to assume that the target somehow exhibits a certain structure $S_T = \langle U_T, R_T \rangle$. But what does it mean for a target system – a part of the physical world – to possess a structure, and where in the target system is the structure located?
The two prominent suggestions in the literature are that data models are the target end structures represented by models, and that structures are, in some sense, instantiated in target systems. The latter option comes in three versions. The first version is that a structure is ascribed to a system; the second version is that systems instantiate structural universals; and the third version claims that target systems simply are structures. We consider all suggestions in turn.
What are data models? Data are what we gather in experiments. When observing the motion of the moon, for instance, we choose a coordinate system and observe the position of the moon in this coordinate system at consecutive instants of time. We then write down these observations. The data thus gathered are called the raw data. The raw data then undergo a process of cleansing, rectification and regimentation: we throw away data points that are obviously faulty, take into consideration what the measurement errors are, take averages, and usually idealize the data, for instance by replacing discrete data points by a continuous function. Often, although not always, the result is a smooth curve through the data points that satisfies certain theoretical desiderata (Harris [3.153] and van Fraassen [3.36, pp. 166–168] elaborate on this process). These resulting data models can be treated as set theoretic structures. In many cases the data points are numeric and the data model is a smooth curve through these points. Such a curve is a relation over $\mathbb{R}^n$ (for some $n$), or subsets thereof, and hence it is a structure in the requisite sense.
Suppes [3.122] was the first to suggest that data models are the targets of scientific models: models don’t represent parts of the world; they represent data
structures. This approach has then been adopted by van Fraassen, when he declares that “[t]he whole point of having theoretical models is that they should fit the phenomena, that is, fit the models of data” [3.154, p. 667]. He has defended this position numerous times over the years ([3.77, p. 164], [3.101, p. 64], [3.118, p. 524], [3.155, p. 229] and [3.156, p. 271]) including in his most recent book on representation [3.36, pp. 246, 252]. So models don’t represent planets, atoms or populations; they represent data that are gathered when performing measurements on planets, atoms or populations.
This revisionary point of view has met with stiff resistance. Muller articulates the unease about this position as follows [3.148, p. 98]:
“the best one could say is that a data structure D seems to act as simulacrum of the concrete actual being B [...] But this is not good enough. We don’t want simulacra. We want the real thing. Come on.”
Muller’s point is that science aims (or at least has to aim) to represent real systems in the world and not data structures. Van Fraassen calls this the “loss of reality objection” [3.36, p. 258] and accepts that the structuralist must ensure that models represent target systems, rather than finishing the story at the level of data. In his [3.36] he addresses this issue in detail and offers a solution. We discuss his solution below, but before doing so we want to articulate the objection in more detail. To this end we briefly revisit the discussion about phenomena and data which took place in the 1980s and 1990s.
Bogen and Woodward [3.157], Woodward [3.158], and more recently (and in a somewhat different guise) Teller [3.159], introduced the distinction between phenomena and data and argued that models represent phenomena, not data. The difference is best introduced with an example: the discovery of weak neutral currents [3.157, pp. 315–318]. What the model at stake consists of is particles: neutrinos, nucleons, and the $Z^0$ particle, along with the reactions that take place between them. (The model we are talking about here is not the so-called standard model of elementary particles as a whole. Rather, what we have in mind is one specific model about the interaction of certain particles of the kind one would find in a theoretical paper on this experiment.) Nothing of that, however, shows in the relevant data. CERN (Conseil Européen pour la Recherche Nucléaire) in Geneva produced 290 000 bubble chamber photographs of which roughly 100 were considered to provide evidence for the existence of neutral currents. The notable point in this story is that there is no part of the model (provided by quantum field theory) that could be claimed to be isomorphic to these photographs. Weak neutral currents are the phenomenon under investigation; the photographs taken at CERN are the raw data, and any summary one might construct of the content of these photographs would be a data model. But it’s weak neutral currents that occur in the model; not any sort of data we gather in an experiment.
This is not to say that these data have nothing to do with the model. The model posits a certain number of particles and informs us about the way in which they interact both with each other and with their environment. Using this knowledge we can place them in a certain experimental context. The data we then gather in an experiment are the product of the elements of the model and of the way in which they operate in that context. Characteristically this context is one that we are able to control and about which we have reliable knowledge (knowledge about detectors, accelerators, photographic plates and so on). Using this and the model we can derive predictions about what the outcomes of an experiment will be. But, and this is the salient point, these predictions involve the entire experimental setup and not only the model and there is nothing in the model itself with which one could compare the data. Hence, data are highly contextual and there is a big gap between observable outcomes of experiments and anything one might call a substructure of a model of neutral currents.
To underwrite this claim Bogen and Woodward notice that parallel to the research at CERN, the National Accelerator Laboratory (NAL) in Chicago also performed an experiment to detect weak neutral currents, but the data obtained in that experiment were quite different. They consisted of records of patterns of discharge in electronic particle detectors. Though the experiments at CERN and at NAL were totally different and as a consequence the data gathered had nothing in common, they were meant to provide evidence for the same theoretical model. But the model, to reiterate the point, does not contain any of these contextual factors. It posits certain particles and their interaction with other particles, not how detectors work or what readings they show. That is, the model is not idiosyncratic to a special experimental context in the way the data are and therefore it is not surprising that it does not contain a substructure that is isomorphic to the data. For this reason, models represent phenomena, not data.
It is difficult to give a general characterization of phenomena because they do not belong to one of the traditional ontological categories [3.157, p. 321]. In fact, phenomena fall into many different established categories, including particular objects, features, events, processes, states, states of affairs, or they defy classification in these terms altogether. This, however, does not detract from the usefulness of the concept of a phenomenon because specifying one particular ontological category to which all phenomena belong is inessential to the purpose of this section. What matters to the problem at hand is the distinctive role they play in connection with representation.
What then is the significance of data, if they are not the kind of things that models represent? The answer to this question is that data perform an evidential function. That is, data play the role of evidence for the presence of certain phenomena. The fact that we find a certain pattern in a bubble chamber photograph is evidence for the existence of neutral currents. Thus construed, we do not denigrate the importance of data in science, but we do not have to require that data have to be embeddable into the model at stake.
Those who want to establish data models as targets can reply to this in three ways. The first reply is an appeal to radical empiricism. By postulating phenomena over and above data we leave the firm ground of observable things and start engaging in transempirical speculation. But science has to restrict its claims to observables and remain silent (or at least agnostic) about the rest. Therefore, so the objection goes, phenomena are chimeras that cannot be part of any serious account of science. It is, however, doubtful that this helps the data model theorist. Firstly, note that it even rules out representing observable phenomena. To borrow van Fraassen’s example, on this story a population model of deer reproduction would represent data, rather than deer [3.36, pp. 254–260]. Traditionally, empiricists would readily accept that deer, and the rates at which they reproduce, are observable phenomena. Denying that they are represented, by replacing them with data models, seems to be an implausible move. Secondly, irrespective of whether one understands phenomena realistically [3.157] or antirealistically [3.160], it is phenomena that models portray and not data. To deny the reality of phenomena just won’t make a theoretical model represent data. Whether we regard neutral currents as real or not, it is neutral currents that are portrayed in a field-theoretical model, not bubble chamber photographs. Of course, one can suspend belief about the reality of these currents, but that is a different matter.
The second reply is to invoke a chain of representational relationships. Brading and Landry [3.137] point out that the connection between a model and the world can be broken down in two parts: the connection between a model and a data model, and the connection between a data model and the world [3.137, p. 575]. So the structuralist could claim that scientific models represent data models in virtue of an isomorphism between the two and additionally claim that data models in turn represent phenomena. But the key questions that need to be addressed here are: (a) What establishes the representational relationship between data models and phenomena? and (b) Why, if a scientific model represented some data model, which in turn represented some phenomenon, would that establish a representational relationship between the model and the phenomenon itself? With respect to the first question, Brading and Landry argue that it cannot be captured within the structuralist framework [3.137, p. 575]. The question has just been pushed back: rather than asking how a scientific model qua mathematical structure represents a phenomenon, we now ask how a data model qua mathematical structure represents a phenomenon. With respect to the second question, although representation is not intransitive, it is not transitive [3.20, pp. 11–12]. So more needs to be said regarding how a scientific model representing a data model, which in turn represents the phenomenon from which data are gathered, establishes a representational relationship between the first and last element in the representational chain.
The third reply is due to van Fraassen [3.36]. His Wittgensteinian solution is to defuse the loss of reality objection. Once we pay sufficient attention to the pragmatic features of the contexts in which scientific and data models are used, van Fraassen claims, there actually is no difference between representing data and representing a target (or a phenomenon in Bogen and Woodward’s sense) [3.36, p. 259]:
“in a context in which a given [data] model is someone’s representation of a phenomenon, there is for that person no difference between the question whether a theory [theoretical model] fits that representation and the question whether that theory fits the phenomenon.”
Van Fraassen’s argument for this claim is long and difficult and we cannot fully investigate it here; we restrict attention to one crucial ingredient and refer the reader to Nguyen [3.161] for a detailed discussion of the argument.
Moore’s paradox is that we cannot assert sentences of the form p and I don’t believe that p, where p is an arbitrary proposition. For instance, someone cannot assert that Napoleon was defeated in the battle of Waterloo and assert, at the same time, that she doesn’t believe that Napoleon was defeated in the battle of Waterloo. Van Fraassen’s treatment of Moore’s paradox is that speakers cannot assert such sentences because the pragmatic commitments incurred by asserting the first conjunct include that the speaker believes that p. This commitment is then contradicted by the assertion of the second conjunct. So instances of Moore’s paradox are pragmatic contradictions. Van Fraassen then draws an analogy between this paradox and scientific representation. He
submits that a user simply cannot, on pain of pragmatic contradiction, assert that a data model of a target system be embeddable within a theoretical model without thereby accepting that the theoretical model represents the target.
However, Nguyen [3.161] argues that in the case of using a data model as a representation of a phenomenon, no such pragmatic commitment is incurred, and therefore no such contradiction follows when accompanied by doubt that the theoretical model also represents the phenomenon. To see why this is the case, consider a more mundane example of representation: a caricaturist can represent Margaret Thatcher as draconian without thereby committing himself to the belief that Margaret Thatcher really is draconian. Pragmatically speaking, acts of representation are weaker than acts of assertion: they do not incur the doxastic commitments required for van Fraassen’s analogy to go through. So it seems van Fraassen doesn’t succeed in dispelling the loss of reality objection. How target systems enter the picture in the structuralist account of scientific representation remains therefore a question that structuralists who invoke data models as providing the target-end structures must address. Without such an account the structuralist account of representation remains at the level of data, a position that seems implausible, and contrary to actual scientific practice.
We now turn to the second response: that a structure is instantiated in the system. As mentioned above, this response comes in three versions. The first is metaphysically more parsimonious and builds on the systems’ constituents. Although target systems are not structures, they are composed of parts that instantiate physical properties and relations. The parts can be used to define the domain of individuals, and by considering the physical properties and relations purely extensionally, we arrive at a class of extensional relations defined over that domain (see for instance Suppes’ discussion of the solar system [3.100, p. 22]). This supplies the required notion of structure. We might then say that physical systems instantiate a certain structure, and it is this structure that models are isomorphic to.
As an example consider the methane molecule. The molecule consists of a carbon atom and four hydrogen atoms grouped around it, forming a tetrahedron. Between each hydrogen atom and the carbon atom there is a covalent bond. One can then regard the atoms as objects and the bonds as relations. Denoting the carbon atom by $a$, and the four hydrogen atoms by $b$, $c$, $d$, and $e$, we obtain a structure $S$ with the domain $U = \{a, b, c, d, e\}$ and the relation $r = \{\langle a, b\rangle, \langle b, a\rangle, \langle a, c\rangle, \langle c, a\rangle, \langle a, d\rangle, \langle d, a\rangle, \langle a, e\rangle, \langle e, a\rangle\}$, which can be interpreted as being connected by a covalent bond.
The main problem facing this approach is the underdetermination of target-end structure. Underdetermination threatens in two distinct ways. Firstly, in order to identify the structure determined by a target system, a domain of objects is required. What counts as an object in a given target system is a substantial question [3.21]. One could just as well choose bonds as objects and consider the relation sharing a node with another bond. Denoting the bonds by $a'$, $b'$, $c'$ and $d'$, we obtain a structure $S'$ with the domain $U' = \{a', b', c', d'\}$ and the relation $r = \{\langle a', b'\rangle, \langle b', a'\rangle, \langle a', c'\rangle, \langle c', a'\rangle, \langle a', d'\rangle, \langle d', a'\rangle, \langle b', c'\rangle, \langle c', b'\rangle, \langle b', d'\rangle, \langle d', b'\rangle, \langle c', d'\rangle, \langle d', c'\rangle\}$. Obviously $S$ and $S'$ are not isomorphic. So which structure is picked out depends on how the system is described. Depending on which parts one regards as individuals and what relation one chooses, very different structures can emerge. And it takes little ingenuity to come up with further descriptions of the methane molecule, which lead to yet other structures.
There is nothing special about the methane molecule, and any target system can be presented under alternative descriptions, which ground different structures. So the lesson learned generalizes: there is no such thing as the structure of a target system. Systems only have a structure under a particular description, and there are many nonequivalent descriptions. This renders talk about a model being isomorphic to a target system simpliciter meaningless. Structural claims do not stand on their own in that their truth rests on the truth of a more concrete description of the target system. As a consequence, descriptions are an integral part of an analysis of scientific representation.
In passing we note that Frigg [3.21, pp. 55–56] also provides another argument that pulls in the same direction: structural claims are abstract and are true only relative to a more concrete nonstructural description. For a critical discussion of this argument see Frisch [3.162, pp. 289–294] and Portides, Chap. 2.
How much of a problem this is depends on how austere one’s conception of models is. The semantic view of theories was in many ways the result of an antilinguistic turn in the philosophy of science. Many proponents of the view aimed to exorcise language from an analysis of theories, and they emphasized that the model-world relationship ought to be understood as a purely structural relation. Van Fraassen, for instance, submits that “no concept which is essentially language dependent has any philosophical importance at all” [3.101, p. 56] and observes that “[t]he semantic view of theories makes language largely irrelevant” [3.155, p. 222]. And other proponents of the view, while less vocal about the irrelevance of language, have not assigned language a systematic place in their analysis of theories.
For someone of that provenance the above argument is bad news. However, a more attenuated position could integrate descriptions in the package of modeling, but this would involve abandoning the idea that representation can be cashed out solely in structural terms. Bueno and French have recently endorsed such a position. They accept the point that different descriptions lead to different structures and explain that such descriptions would involve “at the very least some minimal mathematics and certain physical assumptions” [3.53, p. 887]. Likewise, Munich structuralists explicitly acknowledge the need for a concrete description of the target system [3.163, pp. 37–38], and they consider these informal descriptions to be internal to the theory. This is a plausible move, but those endorsing this solution have to concede that there is more to epistemic representation than structures and morphisms.
The second way in which structural indeterminacy can surface is via Newman’s theorem. The theorem essentially says that any system instantiates any structure, the only constraint being cardinality (a practically identical conclusion is reached in Putnam’s so-called model-theoretic argument; see Demopoulos [3.164] for a discussion). Hence, any structure of cardinality C is isomorphic to a target of cardinality C because the target instantiates any structure of cardinality C (see Ketland [3.165] and Frigg and Votsis [3.166] for discussions). This problem is not unsolvable, but all solutions require that among all structures formally instantiated by a target system one is singled out as being the true or natural structure of the system. How to do this in the structuralist tradition remains unclear (Ainsworth [3.167] provides a useful summary of the different solutions).
Newman’s theorem is both stronger and weaker than the argument from multiple descriptions. It’s stronger in that it provides more alternative structures than multiple descriptions. It’s weaker in that many of the structures it provides are unphysical because they are purely set theoretical combinations of elements. By contrast, descriptions pick out structures that a system can reasonably be seen as possessing.
The second version of the second response emerges from the literature on the applicability of mathematics. Structural platonists like Resnik [3.108] and Shapiro [3.41, 109, 168] take structures to be ante rem universals. In this view, structures exist independently of physical systems, yet they can be instantiated in physical systems. In this view systems instantiate structures and models are isomorphic to these instantiated structures.
This view raises all kinds of metaphysical issues about the ontology of structures and the instantiation relation. Let us set aside these issues and assume that they can be resolved in one way or another. This would still leave us with serious epistemic and semantic questions. How do we know a certain structure is instantiated in a system and how do we refer to it? Objects do not come with labels on their sleeves specifying which structures they instantiate, and proponents of structural universals face a serious problem in providing an account of how we access the structures instantiated by target systems. Even if – as a brute metaphysical fact – target systems only instantiate a small number of structures, and therefore there is a substantial question regarding whether or not scientific models represent them, this does not help us understand how we could ever come to know whether or not the isomorphism holds. It seems that individuating a domain of objects and identifying relations between them is the only way for us to access a structure. But then we are back to the first version of the response, and we are again faced with all the problems that it raises.
The third version of the second response is more radical. One might take target systems themselves to be structures. If this is the case then there is no problem with the idea that they can be isomorphic to a scientific model. One might expect ontic structural realists to take this position. If the world fundamentally is a structure, then there is nothing mysterious about the notion of an isomorphism between a model and the world. Surprisingly, some ontic structuralists have been hesitant to adopt such a view (see French and Ladyman [3.120, p. 113] and French [3.169, p. 195]). Others, however, seem to endorse it. Tegmark [3.170], for instance, offers an explicit defense of the idea that the world simply is a mathematical structure. He defines a seemingly moderate form of realism – what he calls the external reality hypothesis (ERH) – as the claim that “there exists an external physical reality completely independent of us humans” [3.170, p. 102] and argues that this entails that the world is a mathematical structure (his “mathematical universe hypothesis”) [3.170, p. 102]. His argument for this is based on the idea that a so-called theory of everything must be expressible in a form that is devoid of human-centric baggage (by the ERH), and the only theories that are devoid of such baggage are mathematical, which, strictly speaking, describe mathematical structures. Thus, since a complete theory of everything describes an external reality independent of humans, and since it describes a mathematical structure, the external reality itself is a mathematical structure.
This approach stands or falls on the strength of its premise that a complete theory of everything will be formulated purely mathematically, without any human baggage, which in turn relies on a strict reductionist account of scientific knowledge [3.170, pp. 103–104]. Discussing this in any detail goes beyond our current
purposes. But it is worth noting that Tegmark’s discussion is focused on the claim that fundamentally the world is a mathematical structure. Even if this were the case, it seems irrelevant for many of our current scientific models, whose targets aren’t at this level. When modeling an airplane wing we don’t refer to the fundamental super-string structure of the bits of matter that make up the wing, and we don’t construct wing models that are isomorphic to such fundamental structures. So Tegmark’s account offers no answer to the question about where structures are to be found at the level of nonfundamental target systems.
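The methane example from earlier in this subsection can be checked mechanically. The following is a minimal sketch, not part of the original discussion: it encodes the two descriptions of the molecule (atoms as objects, bonds as objects) as finite structures and tests by brute force whether any bijection between the domains preserves the relation. The function names `is_isomorphic` and `symmetric`, and the variable names `U_prime` and `r_prime`, are illustrative assumptions introduced here.

```python
from itertools import permutations


def symmetric(pairs):
    """Close a set of ordered pairs under symmetry: add (y, x) for every (x, y)."""
    return set(pairs) | {(y, x) for (x, y) in pairs}


def is_isomorphic(U1, r1, U2, r2):
    """Brute-force test for an isomorphism between two finite structures.

    Returns True iff some bijection f: U1 -> U2 maps the relation r1
    exactly onto r2. Only suitable for very small domains.
    """
    if len(U1) != len(U2) or len(r1) != len(r2):
        return False  # different cardinalities rule out any isomorphism
    domain = sorted(U1)
    for image in permutations(sorted(U2)):
        f = dict(zip(domain, image))
        if {(f[x], f[y]) for (x, y) in r1} == set(r2):
            return True
    return False


# Description 1: atoms as objects (a = carbon; b, c, d, e = hydrogen),
# relation "is connected by a covalent bond".
U = {"a", "b", "c", "d", "e"}
r = symmetric({("a", "b"), ("a", "c"), ("a", "d"), ("a", "e")})

# Description 2: bonds as objects, relation "shares a node with another bond".
# Every pair of distinct bonds shares the carbon atom.
U_prime = {"a'", "b'", "c'", "d'"}
r_prime = symmetric({
    ("a'", "b'"), ("a'", "c'"), ("a'", "d'"),
    ("b'", "c'"), ("b'", "d'"), ("c'", "d'"),
})

# The two descriptions of the same molecule yield non-isomorphic structures.
same_structure = is_isomorphic(U, r, U_prime, r_prime)
```

Here the check fails already on cardinality (five objects against four), which is all the text's "obviously" needs; the permutation search only matters for structures of equal size, such as a relabeled copy of S.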
ous accounts discussed, a model’s inferential capacity dropped out of whatever it was that was supposed to answer the ER-problem: proposed morphisms or similarity relations between models and their targets for example. The accounts discussed in this section build the notion of surrogative reasoning directly into the conditions on epistemic representation.
3.5.1 Deflationary Inferentialism
Suárez argues that we should adopt a “deflationary or minimalist attitude and strategy” [3.32, p. 770] when addressing the problem of epistemic representation. We will discuss deflationism in some detail below, but in order to formulate and discuss Suárez’s theory of representation we need at least a preliminary idea of what is meant by a deflationary attitude. In fact two different notions of deflationism are in operation in his account. The first is [3.32, p. 771]:
“abandoning the aim of a substantive theory to seek universal necessary and sufficient conditions that are met in each and every concrete real instance of scientific representation [...] necessary conditions will certainly be good enough.”
We call the view that a theory of representation should provide only necessary conditions n-deflationism (n for necessary). The second notion is that we should seek “no deeper features to representation other than its surface features” [3.32, p. 771] or “platitudes” [3.171, p. 40], and that we should deny that an analysis of a concept “is the kind of analysis that will shed explanatory light on our use of the concept” [3.172, p. 39]. We call this position s-deflationism (s for surface feature). As far as we can tell, Suárez intends his account of representation to be deflationary in both senses.
Suárez dubs the account that satisfies these criteria inferentialism [3.32, p. 773]:
allows competent and informed agents to draw specific inferences regarding T.
Notice that this condition is not an instantiation of the ER-scheme: in keeping with n-deflationism it features a material conditional rather than a biconditional and hence provides necessary (but not sufficient) conditions for M to represent T. We now discuss each condition in turn, trying to explicate in what way they satisfy s-deflationism.
The first condition is designed to make sure that M and T indeed enter into a representational relationship, and Suárez stresses that representational force is “necessary for any kind of representation” [3.32, p. 776]. But explaining representation in terms of representational force seems to shed little light on the matter as long as no analysis of representational force is offered. Suárez addresses this point by submitting that the first condition can be “satisfied by mere stipulation of a target for any source” [3.32, p. 771]. This might look like denotation as in Sect. 3.2. But Suárez stresses that this is not what he intends for two reasons. Firstly, he takes denotation to be a substantive relation between a model and its target, and the introduction of such a relation would violate the requirement of s-deflationism [3.172, p. 41]. Secondly, M can denote T only if T exists. Thus including denotation as a necessary condition on scientific representation “would rule out fictional representation, that is, representation of nonexisting entities” [3.32, p. 772], and “any adequate account of scientific representation must accommodate representations with fictional or imaginary targets” [3.172, p. 44].
The second issue is one that besets other accounts of representation too, in particular similarity and isomorphism accounts. The first reason, however, goes right to the heart of Suárez’s account: it makes good on the s-deflationary condition that nothing other than surface features can be included in an account of representation.
Models and Representation 3.5 The Inferential Conception 77
At a surface level one cannot explicate representational force at all and any attempt to specify what representational force consists in is a violation of s-deflationism.

The second necessary condition, that models allow competent and informed agents to draw specific inferences about their targets, is in fact just the surrogative reasoning condition we introduced in Sect. 3.1, now taken as a necessary condition on epistemic representation. The sorts of inferences that models allow are not constrained. Suárez points out that the condition “does not require that [M] allow deductive reasoning and inference; any type of reasoning – inductive, analogical, abductive – is in principle allowed” [3.32, p. 773]. (The insistence on inference makes Suárez’s account an instance of what Chakravartty [3.173] calls a functional conception of representation.)

A problem for this approach is that we are left with no account of how these inferential rules are generated: what is it about models that allows them to license inferences about their targets, or what leads them to license some inferences and not others? Contessa makes this point most stridently when he argues that [3.29, p. 61]:

“On the inferential conception, the user’s ability to perform inferences from a vehicle [model] to a target seems to be a brute fact, which has no deeper explanation. This makes the connection between epistemic representation and valid surrogative reasoning needlessly obscure and the performance of valid surrogative inferences an activity as mysterious and unfathomable as soothsaying or divination.”

This seems correct, but Suárez can dismiss this complaint by appeal to s-deflationism. Since inferential capacity is supposed to be a surface-level feature of scientific representation, we are not supposed to ask for any elucidation about what makes an agent competent and well informed and how inferences are drawn.

For these reasons Suárez’s account is deflationary both in the sense of n-deflationism and of s-deflationism. His position provides us with a concept of epistemic representation that is cashed out in terms of an inexplicable notion of representational force and of an inexplicable capacity to ground inferences. This is very little indeed. It is the adoption of a deflationary attitude that allows him to block any attempt to further unpack these conditions and so the crucial question is: why should one adopt deflationism?

We turn to this question shortly. Before doing so we want to briefly outline how the above account fares with respect to the other problems introduced in Sect. 3.1. The account provides a neat explanation of the possibility of misrepresentation [3.32, p. 776]:

“part (ii) of this conception accounts for inaccuracy since it demands that we correctly draw inferences from the source about the target, but it does not demand that the conclusions of these inferences be all true, nor that all truths about the target may be inferred.”

Models represent their targets only if they license inferences about them. They represent them accurately to the extent that the conclusions of these inferences are true.

With respect to the representational demarcation problem, Suárez illustrates his account with a large range of representations, including diagrams, equations, scientific models, and nonscientific representations such as artistic portraits. He explicitly states that “if the inferential conception is right, scientific representation is in several respects very close to iconic modes of representation like painting” [3.32, p. 777] and he mentions the example of Velázquez’s portrait of Innocent X [3.32]. It is clear that the conditions of inferentialism 1 (Definition 3.9) are met by nonscientific as well as scientific epistemic representations. So, at least without sufficient conditions, there is no clear way of demarcating between the different kinds of epistemic representation.

Given the wide variety of types of representation that this account applies to, it’s unsurprising that Suárez has little to say about the ontological problem. The only constraint that inferentialism 1 (Definition 3.9) places on the ontology of models is that “[i]t requires [M] to have the internal structure that allows informed agents to correctly draw inferences about [T]” [3.32, p. 774]. And relatedly, since the account is supposed to apply to a wide variety of entities, including equations and mathematical structures, the account implies that mathematics is successfully applied in the sciences, but in keeping with the spirit of deflationism no explanation is offered about how this is possible.

Suárez does not directly address the problem of style, but a minimalist answer emerges from what he says about representation. On the one hand he explicitly acknowledges that many different kinds of inferences are allowed by the second condition in inferentialism 1 (Definition 3.9). In the passage quoted above he mentions inductive, analogical and abductive inferences. This could be interpreted as the beginning of a classification of representational styles. On the other hand, Suárez remains silent about what these kinds are and about how they can be analyzed. This is unsurprising because spelling out what these inferences are, and what features of the model ground them, would amount to giving a substantial account, which is something Suárez wants to avoid.
Let us now return to the question about the motivation for deflationism. As we have seen, a commitment to deflationism about the concept is central to Suárez’s approach to scientific representation. But deflationism comes in different guises, which Suárez illustrates by analogy with deflationism with respect to truth. Suárez [3.172] distinguishes between the redundancy theory (associated with Frank Ramsey and also referred to as the no theory view), abstract minimalism (associated with Crispin Wright) and the use theory (associated with Paul Horwich). What all three are claimed to have in common is that they accept the disquotational schema – i.e., instances of the form: P is true iff P. Moreover they [3.172, p. 37]

rect analysis of truth. This, however, is far from an established fact. Different positions are available in the debate and whether deflationism (or any specific version of it) is superior to other proposals remains a matter of controversy (see, for instance, Künne [3.174]). But as long as it’s not clear that deflationism about truth is a superior position, it’s hard to see how one can muster support for deflationism about representation by appealing to deflationism about truth.

Moreover, a position that allows only necessary conditions on epistemic representation faces a serious problem. While such an account allows us to rule out certain scenarios as instances of epistemic representation (for example a proper name doesn’t allow for
If one takes conditions (i) and (ii) to refer to “features of activities within a normative practice, [that] do not stand for relations between sources and targets” [3.172, p. 46], then we arrive at a use-based account of epistemic representation. In order to understand a particular instance of a model M representing a target T we have to understand how scientists go about establishing that M’s representational force points towards T, and the inferential rules, and particular inferences from M to T, they use and make.

Plausibly, such a focus on practice amounts to looking at the inferential rules employed in each instance, or type of instance, of epistemic representation. This, however, raises a question about the status of any such analysis vis-à-vis the general theory of representation as given in inferentialism 2 (Definition 3.10). There seem to be two options. The first is to affirm inferentialism 2’s (Definition 3.10) status as an exhaustive theory of representation. This, however, would imply that any analysis of the workings of a particular model would fall outside the scope of a theory of representation because any attempt to address Contessa’s objection would push the investigation outside the territory delineated by s-deflationism. Such an approach seems to be overly purist. The second option is to understand inferentialism 2 (Definition 3.10) as providing abstract conditions that require concretization in each instance of epistemic representation (abstraction can here be understood, for instance, in Cartwright’s [3.74] sense). Studying the concrete realizations of the abstract conditions is then an integral part of the theory. This approach seems plausible, but it renders deflationism obsolete. Thus understood, the view becomes indistinguishable from a theory that accepts the surrogative reasoning condition and the requirement of directionality as conditions of adequacy and analyzes them in pluralist spirit, that is, under the assumption that these conditions can have different concrete realizers in different contexts. But this program can be carried out without ever mentioning deflationism.

One might reply that the first option unfairly stacks the deck against inferentialism and point out that different inferential practices can be studied within the inferentialist framework. One way of making good on this idea would be to submit that the inferences from models to their targets should be taken as conceptually basic, denying that they need to be explained; in particular, denying that they need to be grounded by any (possibly varying) relation(s) that might hold between models and their targets. Such an approach is inspired by Brandom’s inferentialism in the philosophy of language where the central idea is to reverse the order of explanation from representational notions – like truth and reference – to inferential notions – such as the validity of argument [3.175, 176]. Instead, we are urged to begin from the inferential role of sentences (or propositions, or concepts, and so on) – that is the role that they play in providing reasons for other sentences (or propositions etc.), and having such reasons provided for them – and from this reconstruct their representational aspects.

Such an approach is developed by de Donato Rodríguez and Zamora Bonilla [3.177] and seems like a fruitful route for future research, but for want of space we will not discuss it in detail here. There is no evidence that Suárez would endorse such an approach. And, more worrying for inferentialism 2 (Definition 3.10), it is not clear whether such an approach would satisfy s-deflationism. Each investigation into the inferential rules utilized in each instance, or type of instance of epistemic representation will likely be a substantial (possibly sociological or anthropological) project. Thus the s-deflationary credentials of the approach – at least if they are taken to require that nothing substantial can be said about scientific representation in each instance, as well as in general – are called into question.

Finally, if the conditions in inferentialism 2 (Definition 3.10) are taken to be abstract platitudes then we arrive at an abstract minimalism. Although inferentialism 2 (Definition 3.10) defines the concept of epistemic representation, the definition does not suffice to explain the use of any particular instance of epistemic representation for ([3.172, p. 48], cf. [3.171]):

“on the abstract minimalism here considered, to apply this notion to any given concrete case of representation requires that some additional relation obtains between [M] and [T], or a property of [M] or [T], or some other application condition.”

Hence, according to this approach representational force and inferential capacity are taken to be abstract platitudes that suffice to define the concept of scientific representation. However, because of their level of generality, they fail to explain any particular instance of it. To do this requires reference to additional features that vary from case to case. These other conditions can be “isomorphism or similarity” and they “would need to obtain in each concrete case of representation” ([3.171, p. 45], [3.32, p. 773], [3.172, p. 43]). These extra conditions are called the means of representation, the relations that scientists exploit in order to draw inferences about targets from their models, and are to be distinguished from conditions (i) and (ii), the constituents of representation, that define the concept ([3.23, p. 230], [3.171, p. 43], [3.172, p. 46], [3.178, pp. 93–94]). We are told that the means cannot be reduced to the constituents but that [3.171, p. 43]:
“all representational means (such as isomorphism and similarity) are concrete instantiations, or realizations, of one of the basic platitudes that constitute representation”

and that “there can be no application of representation without the simultaneous instantiation of a particular set of properties of [M] and [T], and their relation” [3.171, p. 44].

Such an approach amounts to using conditions (i) and (ii) to answer the ER-problem, but again with the caveat that they are abstract conditions that require concretization in each instance of epistemic representation. In this sense it is immune to Contessa’s objection about the mysterious capacity that models have to license

Contessa offers a detailed formal characterization of an interpretation, which we cannot repeat here for want of space (see [3.29, pp. 57–62] for details). The leading idea is that the model user first identifies a set of relevant objects in the model, and a set of properties and relations these objects instantiate, along with a set of relevant objects in the target and a set of properties and relations these objects instantiate. The user then:

1. Takes M to denote T.
2. Takes every identified object in the model to denote exactly one object in the target (and every relevant object in the target has to be so denoted and as a result there is a one-to-one correspondence between
“not mean to imply that all interpretation of vehicles [models] in terms of the target are necessarily analytic. Epistemic representations whose standard interpretations are not analytic are at least conceivable.”

Even with this in mind, it is clear that he intends that some interpretation is a necessary condition on epistemic representation.

Let’s now turn to how interpretation fares with respect to our questions for an account of epistemic representation as set out in Sect. 3.2. Modulo the caveat about nonanalytical interpretations, interpretation (Definition 3.11) provides necessary and sufficient conditions on epistemic representation and hence answers the ER-problem. Furthermore, it does so in a way that explains the directionality of representation: interpreting a model in terms of a target does not entail interpreting a target in terms of a model.

Contessa does not comment on the applicability of mathematics but since his account shares with the structuralist account an emphasis on relations and one-to-one model-target correspondence, Contessa can appeal to the same account of the applicability of mathematics as the structuralist.

With respect to the demarcation problem, Contessa is explicit that “[p]ortraits, photographs, maps, graphs, and a large number of other representational devices” perform inferential functions [3.29, p. 54]. Since nothing in the notion of an interpretation seems restricted to scientific models, it is plausible to regard interpretation (Definition 3.11) as a universal theory of epistemic representation (a conclusion that is also supported by the fact that Contessa [3.29] uses the example of the London Underground map to motivate his account; see also [3.179]). As such, interpretation (Definition 3.11) seems to deny the existence of a substantial distinction between scientific and nonscientific epistemic representations (at least in terms of their representational properties). It remains unclear how interpretation (Definition 3.11) addresses the problem of style. As we have seen earlier, in particular visual representations fall into different categories. It is a question for future research how these can be classified within the interpretational framework.

With respect to the question of ontology, interpretation (Definition 3.11) itself places few constraints on what scientific models are, ontologically speaking. All it requires is that they consist of objects, properties, relations, and functions. For this reason our discussion in Sect. 3.3.3 above rears its head again here. As before, how to apply interpretation (Definition 3.11) to physical models can be understood relatively easily. But how to apply it to nonphysical models is less straightforward. Contessa [3.180] distinguishes between mathematical models and fictional models, where fictional models are taken to be fictional objects. We briefly return to his ontological views in Sect. 3.6.

In order to deal with the possibility of misrepresentation, Contessa notes that “a user does not need to believe that every object in the model denotes some object in the system in order to interpret the model in terms of the system” [3.29, p. 59]. He illustrates this claim with an example of contemporary scientists using the Aristotelian model of the cosmos to represent the universe, pointing out that “in order to interpret the model in terms of the universe, we do not need to assume that the sphere of fixed stars itself [...] denotes anything in the universe” [3.29].

From this example it is clear that the relevant sets of objects, properties and functions isolated in the construction of the analytic interpretation do not need to exhaust the objects, properties, relations, and functions of either the model or the target. The model user can identify a relevant proper subset in each instance. This allows interpretation (Definition 3.11) to capture the common practice of abstraction in scientific models: a model need only represent some features of its target, and moreover, the model may have the sort of surplus features that are not taken to represent anything in the target, i.e., not all of a model’s features need to play a direct representational role.

This suggestion bears some resemblance to partial structures, and it suffers from the same problem too. In particular distortive idealizations are a source of problems for interpretation (Definition 3.11), as several commentators have observed (see Shech [3.181] and Bolinska [3.28]). Contessa is aware of this problem and illustrates it with the example of a massless string. His response to the problem is to appeal to a user’s corrective abilities [3.29, p. 60]:

“since models often misrepresent some aspect of the system or other, it is usually up to the user’s competence, judgment, and background knowledge to use the model successfully in spite of the fact that the model misrepresents certain aspects of the system.”

This is undoubtedly true, but it is unclear how such a view relates to, or even derives from, interpretation (Definition 3.11). An appeal to the competence of users seems to be an ad hoc move that has no systematic grounding in the idea of an interpretation, and it is an open question how the notion of an interpretation could be amended to give distortive idealizations a systematic place.

Ducheyne [3.182] provides a variant of interpretation (Definition 3.11) that one might think could be used
to accommodate these distortive idealizations. The details of the account, which we won’t state precisely here for want of space, can be found in [3.182, pp. 83–86]. The central idea is that each relevant relation specified in the interpretation holds precisely in the model, and corresponds to the same relation that holds only approximately (with respect to a given purpose) in the target. For example, the low mass of an actual pendulum’s string approximates the masslessness of the string in the model. The one-to-one correspondence between (relevant) objects and relations in the model and target is retained, but the notion of a user taking relations in the model to denote relations in the target is replaced with the idea that the relations in the target are approximations of the ones they correspond to. Ducheyne calls this the pragmatic limiting case account of scientific representation (the pragmatic element comes from the fact that the level of approximation required is determined by the purpose of the model user).

However, if this account is to succeed in explaining how distortive idealizations are scientific representations, then more needs to be said about how a target relation can approximate a model relation. Ducheyne implicitly relies on the fact that relations are such that “we can determine the extent to which [they hold] empirically” [3.182, p. 83] (emphasis added). This suggests that he has quantifiable relations in mind, and that what it means for a relation r in the target to approximate a relation r’ in the model is a matter of comparing numerical values, where a model user’s purpose determines how close they must be if the former is to count as an approximation of the latter. But whether this exhausts the ways in which relations can be approximations remains unclear. Hendry [3.183], Laymon [3.184], Liu [3.185], Norton [3.186], and Ramsey [3.187], among others, offer discussions of different kinds of idealizations and approximations, and Ducheyne would have to make it plausible that all these can be accommodated in his account.

More importantly, Ducheyne’s account has problems dealing with misrepresentations. Although it is designed to capture models that misrepresent by being approximations of their targets, it remains unclear how it deals with models that are outright mistaken. For example, it seems a stretch to say that Thomson’s model of the atom (now derogatively referred to as the plum pudding model) is an approximation of what the quantum mechanical shell model tells us about atoms, and it seems unlikely that there is a useful sense in which the relations that hold between electrons in Thomson’s model approximate those that hold in reality. But this does not mean that it is not a scientific representation of the atom; it’s just an incorrect one. It does not seem to be the case that all cases of scientific misrepresentation are instances where the model is an approximation of the target (or even conversely, it is not clear whether all instances of approximation need to be considered cases of misrepresentation in the sense that they license falsehoods about their targets).

3.5.3 The Denotation, Demonstration, and Interpretation Account

Our final account is Hughes’ denotation, demonstration, and interpretation (DDI) account of scientific representation [3.188] and [3.189, Chap. 5]. This account has inspired both the inferential (see Suárez [3.32, p. 770] and [3.172]) and the interpretational account (see Contessa [3.179, p. 126]) discussed in this section.

Quoting directly from Goodman [3.64, p. 5], Hughes takes a model of a physical system to “be a symbol for it, stand for it, refer to it” [3.188, p. 330]. Presumably the idea is that a model denotes its target in the same way that a proper name denotes its bearer, or, stretching the notion of denotation slightly, a predicate denotes elements in its extension. (Hughes [3.188, p. 330] notes that there is an additional complication when the model has multiple targets but this is not specific to the DDI account and is discussed in more detail in Sect. 3.8.) This is the first D in DDI. What makes models epistemic representations, and thereby distinguishes them from proper names, are the demonstration and interpretation conditions.

The demonstration condition, the second D in DDI, relies on a model being “a secondary subject that has, so to speak, a life of its own. In other words, [a] representation has an internal dynamic whose effects we can examine” [3.188, p. 331] (that models have an internal dynamic is all that Hughes has to say about the problem of ontology). The two examples offered by Hughes are both models of what happens when light is passed through two nearby slits. One model is mathematical where the internal dynamics are “supplied by the deductive resources of the mathematics they employ” [3.188], the other is a physical ripple chamber where they are supplied by “the natural processes involved in the propagation of water waves” [3.188, p. 332].

Such demonstrations, on either mathematical models or physical models, are still primarily about the models themselves. The final aspect of Hughes’ account – the I in DDI – is interpretation of what has been demonstrated in the model in terms of the target system. This yields the predictions of the model [3.188, p. 333]. Unfortunately Hughes has little to say about what it means to interpret a result of a demonstration on a model in terms of its target system, and so one has
Models and Representation 3.6 The Fiction View of Models 83
Part A | 3.6
in light of this. On one reading, he can be seen as describing how we use models. As such, DDI functions as a diachronic account of what a model user does when using a model in an attempt to learn about a target system. We first stipulate that the model stands for the target, then prove what we want to know, and finally transfer the results obtained in the model back to the target. Details aside, this picture seems by and large correct. The problem with the DDI account is that it does not explain why and how this is possible. Under what conditions is it true that the model denotes the target? What kinds of things are models that they allow for demonstrations? How does interpretation work; that is, how can results obtained in the model be transferred to the target? These are questions an account of epistemic representation has to address, but which are left unanswered by the DDI account thus interpreted. Accordingly, DDI provides an answer to a question distinct from the ER-problem. Although a valuable answer to the question of how models are used, it does not help us too much here, since it presupposes the very representational relationship we are interested in between models and their targets.

An alternative reading of Hughes’ account emerges when we consider the developments of the structuralist and similarity conceptions discussed previously, and the discussion of deflationism in Sect. 3.5.1: perhaps the very act of using a model, with all the user intentions and practices that brings with it, constitutes the epistemic representation relationship itself. And as such, perhaps the DDI conditions could be taken as an answer to the ER-problem:

cashed out in the same way as Contessa’s analytic interpretation, then the account will be vulnerable to the same issues as those discussed previously. In another place Hughes endorses Giere’s semantic view of theories, which he characterizes as connecting models to the target with a theoretical hypothesis [3.190, p. 121]. This suggests that an interpretation is a theoretical hypothesis in this sense. If so, then Hughes’ account collapses into a version of Giere’s.

Given that Hughes describes his account as “designedly skeletal [and in need] to be supplemented on a case-by-case basis” [3.188, p. 335], one option available is to take the demonstration and interpretation conditions to be abstract (in the sense of abstract minimalism discussed above), which require filling in each instance, or type of instance, of epistemic representation. As Hughes notes, his examples of the internal dynamics of mathematical and physical models are radically different, with the demonstrations of the former utilizing mathematics, and the latter physical properties such as the propagation of water waves. Similar remarks apply to the interpretation of these demonstrations, as well as to denotation. But as with Suárez’s account, the definition sheds little light on the problem at hand as long as no concrete realizations of the abstract conditions are discussed. Despite Hughes’ claims to the contrary, such an account could prove a viable answer to the ER-problem, and it seems to capture much of what is valuable about both the abstract minimalist version of inferentialism 2 (Definition 3.10) as well as interpretation (Definition 3.11) discussed above.
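
The diachronic reading of DDI described in this subsection (stipulate that the model stands for the target, derive a result within the model, transfer the result to the target) can be illustrated with a toy computation. The following Python sketch is not drawn from Hughes. It assumes the ideal pendulum as the model, and the names `denotation`, `demonstrate_period`, and `interpret`, as well as all numerical values, are hypothetical choices made purely for illustration.

```python
import math

# Step 1 (denotation): stipulate which model element stands for which
# target element. These labels are illustrative assumptions only.
denotation = {
    "point mass": "pendulum bob",
    "massless string": "thin cord",
}


def demonstrate_period(length_m, g=9.81):
    """Step 2 (demonstration): derive a result inside the model.

    Uses the small-angle period of an ideal pendulum,
    T = 2 * pi * sqrt(L / g).
    """
    return 2 * math.pi * math.sqrt(length_m / g)


def interpret(model_period_s):
    """Step 3 (interpretation): restate the model result as a claim
    about the target system."""
    return {
        "claim": "the real pendulum swings with roughly this period",
        "predicted_period_s": model_period_s,
    }


model_result = demonstrate_period(length_m=1.0)
prediction = interpret(model_result)
print(round(prediction["predicted_period_s"], 2))  # prints 2.01
```

The sketch simply presupposes steps 1 and 3: nothing in it says why the stipulated mapping holds or why the transfer from model to target is licensed. In that respect it mirrors the objection raised above against DDI read as an account of model use.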
3.6.1 Models and Fiction

Scientific discourse is rife with passages that appear to be descriptions of systems in a particular discipline, and the pages of textbooks and journals are filled with discussions of the properties and the behavior of those systems. Students of mechanics investigate at length the dynamical properties of a system consisting of two or three spinning spheres with homogeneous mass distributions gravitationally interacting only with each other. Population biologists study the evolution of one species that reproduces at a constant rate in an unchanging environment. And when studying the exchange of goods, economists consider a situation in which there are only two goods, two perfectly rational agents, no restrictions on available information, no transaction costs, no money, and dealings are done immediately. Their surface structure notwithstanding, no one would mistake descriptions of such systems for descriptions of an actual system: we know very well that there are no such systems (of course some models are actual systems – a scale model of a car in a wind tunnel for example – but in this section we focus on models that are not of this kind). Scientists sometimes express this fact by saying that they talk about model land (for instance [3.191, p. 135]).

Thomson-Jones [3.98, p. 284] refers to such a description as a “description of a missing system”. These descriptions are embedded in what he calls the “face value practice” [3.98, p. 285]: the practice of talking and thinking about these systems as if they were real. We observe that the amplitude of an ideal pendulum remains constant over time in much the same way in which we say that the Moon’s mass is approximately $7.34 \times 10^{22}$ kg. Yet the former statement is about a point mass suspended from a massless string – and there is no such thing in the world.

The face value practice raises a number of questions. What account should be given of these descriptions and what sort of objects, if any, do they describe? How should we analyze the face value practice? Are we putting forward truth-evaluable claims when putting forward descriptions of missing systems? An answer to these questions emerges from the following passage by Peter Godfrey-Smith [3.152, p. 735]:

“[...] I take at face value the fact that modelers often take themselves to be describing imaginary biological populations, imaginary neural networks, or imaginary economies. [...] Although these imag-
mind entities like Sherlock Holmes’ London, and Tolkien’s Middle Earth. [...] the model systems of science often work similarly to these familiar fictions.”

This is the core of the fiction view of models: models are akin to places and characters in literary fiction. When modeling the solar system as consisting of ten perfectly spherical spinning tops physicists describe (and take themselves to describe) an imaginary physical system; when considering an ecosystem with only one species biologists describe an imaginary population; and when investigating an economy without money and transaction costs economists describe an imaginary economy. These imaginary scenarios are tellingly like the places and characters in works of fiction like Madame Bovary and Sherlock Holmes.

Although hardly at the center of attention, the parallels between certain aspects of science and literary fiction have not gone unnoticed. Maxwell discussed in great detail the motion of “a purely imaginary fluid” in order to understand the electromagnetic field [3.192, pp. 159–160]. The parallel between science and fiction occupied center stage in Vaihinger’s [3.193] philosophy of the as if. More recently, the parallel has also been drawn specifically between models and fiction. Cartwright observes that “a model is a work of fiction” [3.194, p. 153] and later suggests an analysis of models as fables [3.73, Chap. 2]. McCloskey [3.195] emphasizes the importance of narratives and stories in economics. Fine notes that modeling natural phenomena in every area of science involves fictions in Vaihinger’s sense [3.196, p. 16], and Sklar highlights that describing systems as if they were systems of some other kind is a royal route to success [3.197, p. 71]. Elgin [3.198, Chap. 6] argues that science shares important epistemic practices with artistic fiction. Hartmann [3.199] and Morgan [3.200] emphasize that stories and narratives play an important role in models, and Morgan [3.201] stresses the importance of imagination in model building. Sugden [3.202] points out that economic models describe “counterfactual worlds” constructed by the modeler. Frigg [3.30, 203] suggests that models are imaginary objects, and Grüne-Yanoff and Schweinzer [3.204] emphasize the importance of stories in the application of game theory. Toon [3.48, 205] has formulated an account of representation based on a theory of literary fiction. Contessa [3.180] provides a fictional ontology of models and Levy [3.43, 206] discusses models as fictions.
ined entities are puzzling, I suggest that at least But simply likening modeling to fiction does not
much of the time they might be treated as simi- solve philosophical problems. Fictional discourse and
lar to something that we are all familiar with, the fictional entities face well-known philosophical ques-
imagined objects of literary fiction. Here I have in tions, and hence explaining models in terms of fictional
Models and Representation 3.6 The Fiction View of Models 85
characters seems to amount to little more than to explain obscurum per obscurius. The challenge for proponents of the fiction view is to show that drawing an analogy between models and fiction has heuristic value.

A first step towards making the analogy productive is to get clear on what the problem is that the appeal to fiction is supposed to solve. This issue divides proponents of the fiction view into two groups. Authors belonging to the first camp see the analogy with fiction as providing an answer to the problem of ontology. Models, in that view, are ontologically on par with literary fiction while there is no productive parallel between models and fiction as far as the ER-problem (or indeed any other problem of representation) is concerned. Authors belonging to the second group hold the opposite view. They see the analogy with fiction first and foremost as providing an answer to the ER-problem (although, as we have seen, this may place restrictions on the ontology of models). Scientific representation, in this view, has to be understood along the lines of how literary fiction relates to reality. Positions on ontology vary. Some authors in this group also adopt a fiction view of ontology; some remain agnostic about the analogy’s contribution to the matters of ontology; and some reject the problem of ontology altogether.

This being a review of models and representation, we refer the reader to Gelfert’s contribution to this book (Chap. 1) for an in-depth discussion of the ontology of models, and focus on the fiction view’s contribution to semantics. Let us just note that those who see fiction as providing an ontology of models are spoiled for choice. In principle every option available in the extensive literature on fiction is a candidate for an ontology of models; for reviews of these options see Friend [3.207] and Salis [3.208]. Different authors have made different choices, with proposals being offered by Contessa [3.180], Ducheyne [3.72], Frigg [3.203], Godfrey-Smith [3.209], Levy [3.43], and Sugden [3.210]. Cat [3.211], Liu [3.212, 213], Pincock [3.214, Chap. 12], Thomson-Jones [3.98] and Toon [3.205] offer critical discussions of some of these approaches.

Even if these ontological problems were settled in a satisfactory manner, we would not be home and dry yet. Vorms [3.215, 216] argues that what’s more important than the entity itself is the format in which the entity is presented. A fiction view that predominantly focuses on understanding the fictional entities themselves (and, once this task is out of the way, their relation to the real-world targets) misses an important aspect, namely how agents draw inferences from models. This, Vorms submits, crucially depends on the format under which they are presented to scientists, and different formats allow scientists to draw different inferences. This ties in with Knuuttila’s insistence that we ought to pay more attention to the “medium of representation” when studying models [3.9, 217].

One last point stands in need of clarification: the meaning of the term fiction. Setting aside subtleties that are irrelevant to the current discussion, the different uses of fiction fall into two groups: fiction as falsity and fiction as imagination [3.218]. Even though not mutually exclusive, the senses should be kept separate. The first use of fiction characterizes something as deviating from reality. We brand Peter’s account of events a fiction if he does not report truthfully how things have happened. In the second use, fiction refers to a kind of literature, literary fiction. Rife prejudice notwithstanding, the defining feature of literary fiction is not falsity. Neither is everything that is said in, say, a novel untrue (novels like War and Peace contain correct historical information); nor does every text containing false reports qualify as fiction (a wrong news report or a faulty documentary do not by that token turn into fiction – they remain what they are, namely wrong factual statements). What makes a text fictional is the attitude that the reader is expected to adopt towards it. When reading a novel we are not meant to take the sentences we read as reports of fact; rather we are supposed to imagine the events described.

It is obvious from what has been said so far that the fiction view of models invokes the second sense of fiction. Authors in this tradition do not primarily intend to brand models as false; they aim to emphasize that models are presented as something to ponder. This is not to say the first sense of fiction is irrelevant in science. Traditionally fictions in that sense have been used as calculational devices for generating predictions, and recently Bokulich [3.14] emphasized the explanatory function of fictions. The first sense of fiction is also at work in philosophy where antirealist positions are described as fictionalism. For instance, someone is a fictionalist about numbers if she thinks that numbers don’t exist (see Kalderon [3.219] for a discussion of several fictionalisms of this kind). Scientific antirealists are fictionalists about many aspects of scientific theories, and hence Fine characterizes fictionalism as an “antirealist position in the debate over scientific realism” [3.196, 220, 221], a position echoed in Winsberg [3.222] and Suárez [3.223]. Morrison [3.224] and Purves [3.225] offer critical discussions of this approach, which the latter calls fiction as “truth conducive falsehood” [3.225, p. 236]; Woods [3.226] offers a critical assessment of fictionalism in general. Although there are interesting discussions to be had about the role that fictions of this kind play in the philosophy of science, it is not our interest here.
86 Part A Theoretical Issues in Models
This view faces the problem of ontology because it has to say what kind of things model systems are. This view contrasts with what Toon [3.205, p. 43] and Levy [3.43, p. 790] call a direct view of representation (Levy [3.206, p. 741] earlier also referred to it as the worldly fiction view). This view does not recognize model systems and aims instead to explain epistemic representation as a form of direct description. Model descriptions (like the description of an ideal pendulum) provide an “imaginative description of real things” [3.206, p. 741] such as actual pendula, and there is no such thing as a model system of which the pendulum description is literally true [3.205, pp. 43–44]. In what follows we use Toon’s terminology and refer to this approach as direct representation.

Toon and Levy both reject the indirect approach because of metaphysical worries about fictional entities, and they both argue that the direct view has the considerable advantage that it does not have to deal with the vexed problem of the ontology of model systems and their comparison with real things at all. Levy [3.43, p. 790] sees his approach as “largely complimentary to Toon’s”. So we first discuss Toon’s approach and then turn to Levy’s.

Toon [3.48, 205, 228] takes as his point of departure Walton’s [3.229] theory of representation in the arts. At the heart of this theory is the notion of a game of make believe. The simplest examples of these games are children’s plays [3.229, p. 11]. In one such play we imagine that stumps are bears and if we spot a stump we imagine that we spot a bear. In Walton’s terminology the stumps are props, and the rule that we imagine a bear when we see a stump is a principle of generation. Together a prop and a principle of generation prescribe what is to be imagined. If a proposition is so prescribed to be imagined, then the proposition is fictional in the relevant game. The term fictional has nothing to do with falsity; on the contrary, it indicates that the proposition is true in the game. The set of propositions actually imagined by someone need not coincide with the set

[...]

cial expressions. We are not mandated to imagine that Napoleon was made of bronze, or that he hasn’t moved for more than 100 years.

The second important kind of props are works of literary fiction. In this case the text is the prop, which, together with principles of generation appropriate for literary fictions of a certain kind, generates fictional truths by prescribing readers to imagine certain things. For instance, when reading The War of the Worlds [3.205, p. 39] we are prescribed to imagine that the dome of St Paul’s Cathedral has been attacked by aliens and now has a gaping hole on its western side.

In Walton’s theory something is a representation if it has the social function of serving as a prop in a game of make believe, and something is an object of a representation if the representation prescribes us to imagine something about the object [3.229, pp. 35, 39]. In the above examples the statue and the written text are the props, and Napoleon and St Paul’s Cathedral, respectively, are the objects of the representations.

The crucial move now is to say that models are props in games of make believe. Specifically, material models – such as an architectural model of the Forth Road Bridge – are like the statue of Napoleon [3.205, p. 37]: the model is the prop and the bridge is the object of the representation. The same observation applies to theoretical models, such as a mechanical model of a bob bouncing on a spring. The model portrays the bob as a point mass and the spring as perfectly elastic. The model description represents the real ball and spring system in the same way in which a literary text represents its objects [3.205, pp. 39–40]: the model description prescribes imaginings about the real system – we are supposed to imagine the real spring as perfectly elastic and the bob as a point mass.

We now see why Toon’s account is a direct view of modeling. Theoretical model descriptions represent actual concrete objects: the Forth Road Bridge and the bob on a spring. There is no intermediary entity of which model descriptions are literally true and which is doing the representing. Models prescribe imaginings about a real world target, and that is what representation consists in.

This is an elegant account of representation, but it is not without problems. The first issue is that it does not offer an answer to the ER-problem. Imagining that the target has a certain feature does not tell us how the imagined feature relates to the properties the target actually has, and so there is no mechanism to transfer model results to the target. Imagining the pendulum bob to be a point mass tells us nothing about which, if any, claims about point masses are also true of the real bob. Toon mentions this problem briefly. His response is that [3.205, pp. 68–69]:

“Principles of generation often link properties of models to properties of the system they represent in a rather direct way. If the model has a certain property then we are to imagine that system does too. If the model is accurate, then the model and the system will be similar in this respect. [...] [But] not all principles of generation are so straightforward. [...] In some cases similarity seems to play no role at all.”

In as far as the transfer mechanism is similarity, the view moves close to the similarity view, which brings with it both some of the benefits and the problems we have discussed in Sect. 3.3. The cases in which similarity plays no role are left unresolved and it remains unclear how surrogative reasoning with such models is supposed to happen.

The next issue is that not all models have a target system, which is a serious problem for a view that analyzes representation in terms of imagining something about a target. Toon is well aware of this issue and calls them models without objects [3.205, p. 76]. Some of these are models of discredited entities like the ether and phlogiston, which were initially thought to have a target but then turned out not to have one [3.205, p. 76]. But not all models without objects are errors: architectural plans of buildings that are never built or models of experiments that are never carried out fall into the same category [3.205, p. 76].

Toon addresses this problem by drawing another analogy with fiction. He points out that not all novels are like The War of the Worlds, which has an object. Passages from Dracula, for instance, “do not represent any actual, concrete object but are instead about fictional characters” [3.205, p. 54]. Models without a target are like passages from Dracula. So the solution to the problem is to separate the two cases neatly. When a model has a target then it represents that target by prescribing imaginings about the target; if a model has no target it prescribes imaginings about a fictional character [3.205, p. 54].

Toon immediately admits that models without targets “give rise to all the usual problems with fictional characters” [3.205, p. 54]. However, he seems to think that this is a problem we can live with because the more important case is the one where models do have a target, and his account offers a neat solution there. He offers the following summative statement of his account [3.205, p. 62]:

Definition 3.13 Direct Representation
A scientific model M represents a target system T iff M functions as a prop in a game of make believe.

This definition takes it to be understood that the imaginings prescribed are about the target T if there is a target, and about a fictional character if there isn’t because there need not be any object that the model prescribes imaginings about [3.205, p. 81].

This bifurcation of imaginative activities raises questions. The first is whether the bifurcation squares with the face value practice. Toon’s presentation would suggest that the imaginative practices involved in models with targets are very different from the ones involved in models without them. Moreover, they require a different analysis because imagining something about an existing object is different from imagining something about a fictional entity. This, however, does not seem to sit well with scientific practice. In some cases we are mistaken: we think that the target exists but then find out that it doesn’t (as in the case of phlogiston). But does that make a difference to the imaginative engagement with a phlogiston model of combustion? Even today we can understand and use such models in much the same way as their original protagonists did, and knowing that there is no target seems to make little, if any, difference to our imaginative engagement with the model. Of course the presence or absence of a target matters to many other issues, most notably surrogative reasoning (there is nothing to reason about if there is no target!), but it seems to have little importance for how we imaginatively engage with the scenario presented to us in a model.

In other cases it is simply left open whether there is a target when the model is developed. In elementary particle physics, for instance, a scenario is often proposed simply as a suggestion worth considering and only later, when all the details are worked out, the question is asked whether this scenario bears an interesting relation to what happens in nature, and if so what the relation is. So, again, the question of whether there is or isn’t a target seems to have little, if any, influence
on the imaginative engagement of physicists with scenarios in the research process. This does not preclude different philosophical analyses being given of modeling with and without a target, but any such analysis will have to make clear the commonalities between the two.

Let us now turn to a few other aspects of direct representation (Definition 3.13). The view successfully solves the problem of asymmetry. Even if it uses similarity in response to the ER-problem, the imaginative process is clearly directed towards the target. An appeal to imagination also solves the problem of misrepresentation because there is no expectation that our imaginations are correct when interpreted as statements about the target. Given its roots in a theory of representation in art, it’s natural to renounce any attempts to demarcate scientific representation from other kinds of representation [3.205, p. 62]. The problem of ontology is dispelled for representations with an object, but it remains unresolved for representations without one. However, direct representation (Definition 3.13) offers at best a partial answer to the ER-problem, and nothing is said about either the problem of style or standards of accuracy. Similarly, Toon remains silent about the applicability of mathematics.

Levy also rejects an indirect view primarily because of the unwieldiness of its ontology and endorses a direct view of representation ([3.43, pp. 780–790], [3.206, pp. 744–747]). Like Toon, he develops his version of the direct view by appeal to Walton’s notion of prop-oriented make believe. When, for instance, we’re asked where in Italy the town of Crotone lies, we can be told that it’s in the arch of the Italian boot. In doing so we are asked to imagine something about the shape of Italy and this imagination is used to convey geographical information. Levy then submits that “we treat models as games of prop-oriented make believe” [3.206, p. 791]. Hence modeling consists in imagining something directly about the target.

Levy pays careful attention to the ER-problem. In his [3.206, p. 744] he proposed that the problem be conceptualized in analogy with metaphors, but immediately added that this was only a beginning which requires substantial elaboration. In his [3.43, pp. 792–796] he takes a different route and appeals to Yablo’s [3.230] theory of partial truth. The core idea of this view is that a statement is partially true “if it is true when evaluated only relative to a subset of the circumstances that make up its subject matter – the subset corresponding to the relevant content-part” [3.43, p. 792]. Levy submits that this will also work for a number of cases of modeling, but immediately adds that there are other sorts of cases that don’t fit the mold [3.43, p. 794]. Such cases often are ones in which distortive idealizations are crucial and cannot be set aside. These require a different treatment and it’s an open question what this treatment would be.

Levy offers a radical solution to the problem of models without targets: there aren’t any! He first broadens the notion of a target system, allowing for models that are only loosely connected to targets [3.43, pp. 796–797]. To this end he appeals to Godfrey-Smith’s notion of hub-and-spoke cases: families of models where only some have a target (which makes them the hub models) and the others are connected to them via conceptual links (spokes) but don’t have a specific target. Levy points out that such cases should be understood as having a generalized target. If something that looks like a model doesn’t meet the requirement of having even a generalized target, then it’s not a model at all. Levy mentions structures like the game of life and observes that they are “bits of mathematics” rather than models [3.43, p. 797]. This eliminates the need for fictional characters in the case of targetless models.

This is a heroic act of liberation, but questions about it remain. The direct view renders fictional entities otiose by positing that a model is nothing but an act of imagining something about a concrete actual thing. But generalized targets are not concrete actual things, and often not even classes of such things. There is a serious question whether one can still reap the (alleged) benefits of a view that analyzes modeling as imaginings about concrete things, if the things about which we imagine something are no longer concrete. Population growth or complex behavior are not concrete things like rabbits and stumps, and this would seem to pull the rug from underneath a direct approach to representation. Likewise, the claim that models without targets are just mathematics stands in need of further elucidation. Looking back at Toon’s examples of such models, a view that considers them just mathematics does not come out looking very natural.

3.6.3 Parables and Fables

Cartwright [3.231] focuses on highly idealized models such as Schelling’s model of social segregation [3.232] and Pissarides’ model of the labor market [3.233]. The problem with these models is that the objects and situations we find in such models are not at all like the things in the world that we are interested in. Cities aren’t organized as checkerboards and people don’t move according to simple algorithmic rules (as they do in Schelling’s model), and there are no laborers who are solely interested in leisure and income (as is the case in Pissarides’ model). Yet we are supposed to learn something about the real world from these models. The question is how.
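To make concrete what is meant by a checkerboard city whose inhabitants move according to simple algorithmic rules, the following is a minimal sketch of a Schelling-style segregation model. It is not taken from Schelling [3.232] or from Cartwright's discussion: the grid size, the share of empty cells, the eight-cell neighbourhood, the contentment threshold of 0.5, the random relocation rule and all function names are illustrative assumptions.

```python
import random

EMPTY = None  # marker for an unoccupied cell (assumption of this sketch)


def make_grid(size, empty_frac, rng):
    """Fill a size x size checkerboard with agents 'A'/'B' and empty cells."""
    grid = []
    for _ in range(size):
        row = []
        for _ in range(size):
            if rng.random() < empty_frac:
                row.append(EMPTY)
            else:
                row.append(rng.choice("AB"))
        grid.append(row)
    return grid


def neighbours(grid, r, c):
    """Return the agents in the (up to eight) cells surrounding (r, c)."""
    size = len(grid)
    found = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            rr, cc = r + dr, c + dc
            if 0 <= rr < size and 0 <= cc < size and grid[rr][cc] is not EMPTY:
                found.append(grid[rr][cc])
    return found


def like_fraction(grid, r, c):
    """Share of an agent's neighbours of its own type (None if it has none)."""
    near = neighbours(grid, r, c)
    if not near:
        return None
    return sum(1 for n in near if n == grid[r][c]) / len(near)


def is_content(grid, r, c, threshold):
    """An agent is content if isolated or if enough neighbours are alike."""
    frac = like_fraction(grid, r, c)
    return frac is None or frac >= threshold


def step(grid, threshold, rng):
    """Move every currently discontented agent to a random empty cell.

    Returns the number of agents that moved in this round.
    """
    size = len(grid)
    cells = [(r, c) for r in range(size) for c in range(size)]
    movers = [(r, c) for r, c in cells
              if grid[r][c] is not EMPTY and not is_content(grid, r, c, threshold)]
    empties = [(r, c) for r, c in cells if grid[r][c] is EMPTY]
    rng.shuffle(movers)
    moved = 0
    for r, c in movers:
        if not empties:
            break
        i = rng.randrange(len(empties))
        er, ec = empties[i]
        grid[er][ec], grid[r][c] = grid[r][c], EMPTY
        empties[i] = (r, c)  # the vacated cell becomes available
        moved += 1
    return moved


def mean_like_fraction(grid):
    """Average like-neighbour share over all agents that have neighbours."""
    size = len(grid)
    fracs = []
    for r in range(size):
        for c in range(size):
            if grid[r][c] is not EMPTY:
                frac = like_fraction(grid, r, c)
                if frac is not None:
                    fracs.append(frac)
    return sum(fracs) / len(fracs) if fracs else 0.0


if __name__ == "__main__":
    rng = random.Random(0)
    grid = make_grid(20, 0.2, rng)
    print("mean like-neighbour share before:", round(mean_like_fraction(grid), 2))
    for _ in range(50):
        if step(grid, 0.5, rng) == 0:
            break
    print("mean like-neighbour share after: ", round(mean_like_fraction(grid), 2))
```

Nothing in this sketch resembles an actual city, which is precisely the feature of such models that the discussion turns on: whatever is learned from running it has to be carried over to real neighbourhoods by some further step.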
Cartwright submits that an answer to this question emerges from a comparison of models with narratives, in particular fables and parables. An example of a fable is the following: “A marten eats the grouse; a fox throttles the marten; the tooth of the wolf, the fox. Moral: the weaker are always prey to the stronger” [3.231, p. 20]. The characters in the fable are highly idiosyncratic, and typically we aren’t interested in them per se – we don’t read fables to learn about foxes and martens. What we are interested in is the fable’s general and more abstract conclusion, in the above example that the weaker are always prey to the stronger. In the case of the fable the moral is typically built into the story and explicitly stated [3.231].

Cartwright then invites us to consider the parable of the laborers in the vineyard told in the Gospel of Matthew [3.231]. A man goes to the market to hire day laborers. He hires the first group early in the morning, and then returns several times during the day to hire more laborers, and he hires the last group shortly before dusk. Some worked all day, while some hardly started when the day ended. Yet he pays the same amount to all of them. Like in a fable, when engaging with a parable the reader takes no intrinsic interest in the actors and instead tries to extract a more general moral. But unlike in fables, in parables no moral appears as part of the parable itself [3.231, p. 29]. Hence parables need interpretation, and alternative interpretations are possible. The above parable is often interpreted as being about the entry to God’s kingdom, but, as Cartwright observes, it can just as well be interpreted as making the market-based capitalist point that you get what you contract for, and should not appeal to higher forms of justice [3.231, p. 21].

These are features models share with fables and parables: “like the characters in the fable, the objects in the model are highly special and do not in general resemble the ones we want to learn about” [3.231, p. 20] and the “lesson of the model is, properly, more abstract than what is seen to happen in the model” [3.231, p. 28]. This leaves the question whether models are fables or parables. Some models are like fables in that they have the conclusion explicitly stated in them. But most models are like parables [3.231, p. 29]: their lesson is not written in the models themselves [3.231, p. 21], and worse: “a variety of morals can be attributed to the models” [3.231, p. 21]. A model, just like a parable, is interpreted against a rich background of theory and observation, and the conclusion we draw depends to a large extent on the background [3.231, p. 30].

So far the focus was on deriving a conclusion about the model itself. Cartwright is clear that one more step is needed: “In many cases we want to use the results of these models to inform our conclusions about a range of actually occurring (so-called target) situations” [3.231, p. 22] (original emphasis). In fact, making this transfer of model results to the real world is the ER-problem. Unfortunately she does not offer much by way of explaining this step and merely observes that “a description of what happens in the model that does not fit the target gets recast as one that can” [3.231, p. 20]. This gestures in the right direction, but more would have to be said about how exactly a model description is recast to allow for transfer of model results to target systems. In earlier work Cartwright observed that what underlies the relationship between models and their targets is a “loose notion of resemblance” [3.73, pp. 192–193] and [3.74, pp. 261–262]. This could be read as suggesting that she would endorse some kind of similarity view of representation. Such a view, however, is independent of an appeal to fables and parables.

In passing we would like to mention that the same kind of models is also discussed in Sugden [3.202, 210]. However, his interest is in induction rather than representation, and if reframed in representational terms then his account becomes a similarity account like Giere’s. See Grüne-Yanoff [3.234] and Knuuttila [3.235] for a discussion.

3.6.4 Against Fiction

The criticisms we have encountered above were intrinsic criticisms of particular versions of the fiction view, and as such they presuppose a constructive engagement with the view’s point of departure. Some critics think that any such engagement is misplaced because the view got started on the wrong foot entirely. There are five different lines of attack. The first criticism is driven by philosophical worries about fiction. Fictions, so the argument goes, are intrinsically dubious and are beset with so many serious problems that one should steer away from them whenever possible. So it could be claimed that assigning them a central role in science is a manifestation of philosophical masochism. This, however, overstates the problems with fictions. Sure enough, there is controversy about fictions. But the problems pertaining to fictions aren’t more devastating than those surrounding other items on the philosophical curriculum, and these problems surely don’t render fictions off limits.

The second criticism, offered for example by Giere [3.97, p. 257], is that the fiction view – involuntarily – plays into the hands of irrationalists. Creationists and other science skeptics will find great comfort, if not powerful rhetorical ammunition, in the fact that philosophers of science say that scientists produce fiction. This, so the argument goes, will be seen
as a justification of the view that religious dogma is on par with, or even superior to, scientific knowledge. Hence the fiction view of models undermines the authority of science and fosters the cause of those who wish to replace science with religious or other unscientific worldviews.

Needless to say, we share Giere’s concerns about creationism. In order not to misidentify the problem it is important to point out that Giere’s claim is not that the view itself – or its proponents – support creationism; his worry is that the view is a dangerous tool when it falls into the wrong hands. What follows from this, however, is not that the fiction view itself should be abandoned; but rather that some care is needed when dealing with the press office. As long as the fiction view of models is discussed in informed circles, and, when popularized, is presented carefully and with the necessary qualifications, it is no more dangerous than other ideas, which, when taken out of context, can be put to uses that would (probably) send shivers down the spines of their progenitors (think, for instance, of the use of Darwinism to justify eugenics).

The third objection, also due to Giere, has it that the fiction view misidentifies the aims of models. Giere agrees that from an ontological point of view scientific models and works of fiction are on par, but emphasizes that “[i]t is their differing function in practice that makes it inappropriate to regard scientific models as works of fiction” [3.97, p. 249]. Giere identifies three functional differences [3.97, pp. 249–252]. First, while fictions are the product of a single author’s individual endeavors, scientific models are the result of a public effort because scientists discuss their creations with their colleagues and subject them to public scrutiny. Second, there is a clear distinction between fiction and nonfiction books, and even when a book classified as nonfiction is found to contain false claims, it is not reclassified as fiction. Third, unlike works of fiction, whose prime purpose is to entertain (although some works can also give insight into certain aspects of human life), scientific models are representations of certain aspects of the world.

These observations, although correct in themselves, have no force against the fiction view of models. First, whether a fiction is the product of an individual or a collective effort has no impact on its status as a fiction;

of a work. Second, as we have seen in Sect. 3.6.1, falsity is not a defining feature of fiction. We agree with Giere that there is a clear distinction between texts of fiction and nonfiction, but we deny that this distinction is defined by truth or falsity; it is the attitude that we are supposed to adopt towards the text’s content that makes the difference. Once this is realized, the problem fades away. Third, many proponents of the fiction view (those belonging to the first group mentioned in Sect. 3.6.1) are clear that problems of ontology should be kept separate from function and agree that it is one of the prime functions of models to represent. This point has been stressed by Godfrey-Smith [3.209, pp. 108–111] and it is explicit in other views such as Frigg’s [3.203].

The fourth objection is due to Magnani, who dismisses the fiction view for misconstruing the role of models in the process of scientific discovery. The fundamental role played by models, he emphasizes [3.236, p. 3]:

“is the one we find in the core conceptual discovery processes, and that these kinds of models cannot be indicated as fictional at all, because they are constitutive of new scientific frameworks and new empirical domains.”

This criticism seems to be based on an understanding of fiction as falsity because falsities can’t play a constitutive role in the constitution of new empirical domains. We reiterate that the fiction view is not committed to the fiction as falsity account and hence is not open to this objection.

The fifth objection is that fictions are superfluous and hence should not be regarded as forming part of (let alone being) scientific models because we can give a systematic account of how scientific models work without invoking fictions. This point has been made in different ways by Pincock [3.214, Chap. 12] and Weisberg [3.33, Chap. 4] (for a discussion of Weisberg’s arguments see Odenbaugh [3.237]). We cannot do justice to the details of their sophisticated arguments here, and will concern ourselves only with their main conclusion. They argue that scientific models are mathematical objects and that they relate to the world due to the fact that there is a relationship between the mathematical properties of the model and the properties found in the target system (in Weisberg’s version similarity
a collectively produced fiction is just a different kind relations to a parametrized version of the target). In
of fiction. Even if War and Peace (to take Giere’s ex- other words, models are mathematical structures and
ample) had been written in a collective effort by all they represent due to there being certain mathematical
established Russian writers of Tolstoy’s time, it would relations between these structures and a mathematical
still be a fiction. Vice versa, even if Newton had never rendering of the target system. (Weisberg includes fic-
discussed his model of the solar system with anybody tions as convenient folk ontology that may serve as
before publishing it, it would still be science. The his- a crutch when thinking about the model, but takes them
tory of production is immaterial to the fictional status to be ultimately dispensable when it comes to explain-
Models and Representation 3.7 Representation-as 91
ing how models relate to the world.) This, however, far from unproblematic. So it is at best an open ques-
brings us back to a structuralist theory of representa- tion whether getting rid of fiction provides an obvious
tion, and this theory, as we have seen in Sect. 3.4, is advantage.
3.7 Representation-as
In this section we discuss approaches that depart from Goodman’s notion of representation-as [3.64]. In his account of aesthetic representation the idea is that a work of art does not just denote its subject, but moreover it represents it as being thus or so. Elgin [3.34] further developed this account and, crucially, suggested that it also applies to scientific representations. This is a vital insight and it provides the entry point to what we think of as the most promising account of epistemic representation.

In this section we present Goodman and Elgin’s notion of representation-as, and outline how it is a complex type of reference involving a mixture of denotation and what they call exemplification. We introduce the term of art representation-as to indicate that we are talking about the specific concept that emerges from Goodman’s and Elgin’s writings. We then discuss how the account needs to be developed in the context of scientific representation. And finally we present our own answer to the ER-problem, and demonstrate how it answers the questions laid out in Sect. 3.1.

3.7.1 Exemplification and Representation-as

Many instances of epistemic representation are instances of representation-as. Caricatures are paradigmatic examples: Churchill is represented as a bulldog, Thatcher is represented as a boxer, and the Olympic Stadium is represented as a UFO. Using these caricatures we can attempt to learn about their targets: attempt to learn about a politician’s personality or a building’s appearance. The notion applies beyond caricatures. Holbein’s Portrait of Henry VIII represents Henry as imposing and powerful and Stoddart’s statue of David Hume represents him as thoughtful and wise. The leading idea is that scientific representation works in much the same way. A model of the solar system represents the sun as a perfect sphere; the logistic model of growth represents the population as reproducing at fixed intervals of time; and so on. In each instance, models can be used to attempt to learn about their targets by determining what the former represent the latter as being. So representation-as relates, in a way to be made more specific below, to the surrogative reasoning condition discussed in Sect. 3.1.

The locution of representation-as functions in the following way: An object X (e.g., a picture, statue, or model) represents a subject Y (e.g., a person or target system) as being thus or so (Z). The question then is what establishes this sort of representational relationship? The answer requires presenting some of the tools Goodman and Elgin use to develop their account of representation-as.

One of the central posits of Goodman’s account is that denotation is “the core of representation” [3.64, p. 5]. Stoddart’s statue of David Hume denotes Hume and a model of the solar system denotes the solar system. In that sense the statue and the model are representations of their respective targets. To distinguish representation of something from other notions of representation we introduce the technical term representation-of. Denotation is what establishes representation-of. (For a number of qualifications and caveats about denotation see our [3.238, Sect. 2]).

Not all representations are a representation-of. A picture showing a unicorn is not a representation-of a unicorn because things that don’t exist can’t be denoted. Yet there is a clear sense in which such a picture is a representation. Goodman and Elgin’s solution to this is to distinguish between being a representation-of something and being a something-representation ([3.34, pp. 1–2], [3.64, pp. 21–26]). What makes a picture a something-representation (despite the fact it may fail to denote anything) is that it is the sort of symbol that denotes. Elgin argues [3.34, pp. 1–2]:

“A picture that portrays a griffin, a map that maps the route to Mordor [...] are all representations, although they do not represent anything. To be a representation, a symbol need not itself denote, but it needs to be the sort of symbol that denotes. Griffin pictures are representations then because they are animal pictures, and some animal pictures denote animals. Middle Earth maps are representations because they are maps and some maps denote real locations. [...] So whether a symbol is a representation is a question of what kind of symbol it is.”

These representations can be classified into genres, in a way that does not depend on what they are
representations-of (since some may fail to denote), but instead on what they portray. In the case of pictures, this is fairly intuitive (how this is to be developed in the case of scientific models is discussed below). If a picture portrays a man, it is a man-representation, if it portrays a griffin it is a griffin-representation and so on. In general, a picture X is a Z-representation if it portrays Z. The crucial point is that this does not presuppose that X be a representation-of Z; indeed X can be a Z-representation without denoting anything. A picture must denote a man to be a representation-of a man. But it need not denote anything to be a man-representation.

The next notion we need to introduce is exemplification. An item exemplifies a property if it at once

This gives a name to the crucial step: imputation. This step can be analyzed in terms of stipulation by a user of a representation. When someone uses X as a representation-as, she has to stipulate that certain properties that are exemplified in X be imputed to Y. We emphasize that imputation does not imply truth: Y may or may not have the properties imputed to it by X. So the representation can be seen as generating a claim about Y that can be true or false; it should not be understood as producing truisms.

Applied to scientific models, the account of epistemic representation that emerges from Goodman and Elgin’s discussion of representation can then be summarized as follows:
3.7.2 From Pictures to Models: The Denotation, Exemplification, Keying-up and Imputation Account

According to Goodman and Elgin, for a picture to be a Z-representation it has to be the kind of symbol that denotes. On the face of it, there is a mismatch between pictures and scientific models in this regard. The Schelling model represents social segregation with a checkerboard; billiard balls are used to represent molecules; the Phillips–Newlyn model uses a system of pipes and reservoirs to represent the flow of money through an economy; and the worm Caenorhabditis elegans is used as a model of other organisms. But neither checkerboards, billiard balls, pipes, nor worms seem to belong to classes of objects that typically denote. The same observation applies to scientific fictions (frictionless planes, utility maximizing agents, and so on) and the mathematical objects used in science. In fact, matrices, curvilinear geometries, Hilbert spaces etc. were all studied as mathematical objects before they became important in the empirical sciences.

Rather than relying on the idea that scientific models belong to classes of objects that typically denote we propose directly introducing an agent and grounding representation in this agent’s actions. Specific checkerboards, systems of pipes, frictionless planes and mathematical structures are epistemic representations because they are used by an agent to represent a system. When an agent uses an object as a representation, we call it a base.

What allows us to classify bases into Z-representations is also less clear in the case of scientific representation. We approach this issue in two steps. The first is to recognize the importance of the intrinsic constitution of the base. Pictures are typically canvases covered with paint. They are classified as Z-representations because under appropriate circumstances the canvas is recognized as portraying a Z. Much can be said about the canvas’ material constitution (the thickness or chemical constitution of the paint, etc.), but these are generally of little interest to understanding what the picture portrays. By contrast, the properties of a scientific model, qua material object, do matter. How water flows through the pipes in the Phillips–Newlyn model is crucial to how it represents the movement of money in an economy. That Caenorhabditis elegans is a biological organism is of vital importance for how it is used representationally. In fact, models are frequently classified according to what their material base is. We talk about a pipe model of the economy or worm model of cell division because their bases are pipes and worms. Here we introduce a term of art to recognize that scientific models are generally categorized according to their material constitution. An O-object specifies the kind of object something is, qua physical object.

O-objects become representations when they are used as such. But how are they classified as Z-representations? How does the Phillips–Newlyn machine become an economy-representation, or how does a collection of billiard balls become a gas-representation? (Again, recall that this is not because they denote economies or gases.) We suggest, and this is the second step, that this requires an act of interpretation (notice that we do not use interpretation in the same sense as Contessa). In the case of pictures, the nature of this interpretation has been the center of attention for a good while: how one sees a canvas covered with paint as showing a cathedral is regarded by many as one of the important problems of aesthetics. Schier [3.240, p. 1] dubbed it the “enigma of depiction”, and an entire body of literature has been concerned with it (Kulvicki [3.241] provides a useful review). In the case of scientific models we don’t think a simple and universal account of how models are interpreted as Z-representations can be given. Interpreting an O-object as a Z-representation requires attributing properties of Zs to the object. How this is done will depend on disciplinary traditions, research interests, background theory and much more. In fact, interpretation is a blank to be filled, and it will be filled differently in different cases.

Some examples should help elucidate what we mean by this. In the case of scale models the interpretation is close to the O-object in that it interprets the object in its own terms. The small car is interpreted as a car-representation and the small ship is interpreted as a ship-representation. Likewise, in the case of the Army Corps’ model of the San Francisco bay [3.33], parts of the model bay are interpreted in terms of the real bay. In cases like these, the same predicates that apply to the base (qua O-object) are applied to the object in order to make it into a Z-representation (here O = Z). But this is not always the case. For example, the Phillips–Newlyn machine is a system of pipes and reservoirs, but it becomes an economy-representation only when the quantity and flow of water throughout the system are interpreted as the quantity and flow of money throughout an economy. The system is interpreted in terms of predicates that do not apply to the object (qua O-object), but turn it into a Z-representation (here O and Z come apart). In sum, an O-object that has been chosen as the base of a representation becomes a Z-representation if O is interpreted in terms of Z.

Next in line is exemplification. Much can be said about exemplification in general, but the points by and large carry over from the general discussion to the case of models without much ado. There is one difference,
though, in cases like the Phillips–Newlyn machine. Recall that exemplification was defined as the instantiation of a property P by an object in such a way that the object thereby refers to P. How can the Phillips–Newlyn machine exemplify economic properties when it does not, strictly speaking, instantiate them? The crucial point is that nothing in the current account depends on instantiation being literal instantiation. On this point we are in agreement with Goodman and Elgin, whose account relies on nonliteral instantiation. The portrait of Henry cannot, strictly speaking, instantiate the property of being male, even if it represents him as such. Goodman and Elgin call this metaphorical instantiation ([3.64, pp. 50–51], [3.239, p. 81]).

What matters is that properties are epistemically accessible and salient, and this can be achieved with what we call instantiation-under-an-interpretation I, I-instantiation for short. An economic interpretation of the Phillips–Newlyn machine interprets amounts of water as amounts of money. It does so by introducing a clearly circumscribed rule of proportionality: x liters of water correspond to y millions of the model-economy’s currency. This rule is applied without exception when the machine is interpreted as an economy-representation. So we say that under the economic interpretation $I_e$ the machine $I_e$-instantiates money properties. With the notion of I-instantiation at hand, exemplification poses no problem.

The final issue to clear is the imputation of the model’s exemplified properties to the target system. In particular, which properties are so imputed? Elgin describes this as the imputation of the properties exemplified by M or related ones. The observation that the properties exemplified by a scientific model and the properties imputed to its target system need not be identical is correct. In fact, few, if any, models in science portray their targets as exhibiting exactly the same features as the model itself. The problem with invoking related properties is not its correctness, but its lack of specificity. Any property can be related to any other property in some way or other, and as long as no specific relation is specified it remains unclear which properties are imputed onto the system.

In the context of science, the relation between the properties exemplified and the ones ascribed to the system is sometimes described as one of simplification [3.198, p. 184], idealization [3.198, p. 184] and approximation [3.34, p. 11]. This could suggest that related ones means idealized, at least in the context of science (we are not attributing this claim to Elgin; we are merely considering the option), perhaps similar to the way in which Ducheyne’s account discussed above took target properties to be approximations of model properties. But shifting from related to idealized or approximated (or any of their cognates) makes things worse, not better. For one, idealization can mean very different things in different contexts and hence describing the relation between two properties as idealization adds little specificity (see Jones [3.242] for a discussion of different kinds of idealization). For another, while the relationship between some representation-target properties may be characterized in terms of idealization, many cannot. A map of the world exemplifies a distance of 29 cm between the two points labeled Paris and New York; the distance between the two cities is 5800 km; but 29 cm is not an idealization of 5800 km. A scale model of a ship being towed through water is not an idealization of an actual ship, at least not in any obvious way. Or in standard representations of Mandelbrot sets the color of a point indicates the speed of divergence of an iterative function for the parameter value associated with that point, but color is not an idealization of divergence speed.

For this reason it is preferable, in our view, to build a specification of the relationship between model properties and target properties directly into an account of epistemic representation. Let $P_1, \ldots, P_n$ be the properties exemplified by M, and let $Q_1, \ldots, Q_m$ be the related properties that M imputes to Y (where n and m are positive natural numbers that can but need not be equal). Then the representation M must come with a key K that specifies how exactly $P_1, \ldots, P_n$ are converted into $Q_1, \ldots, Q_m$ [3.50]. Borrowing notation from algebra (somewhat loosely) we can write $K(\langle P_1, \ldots, P_n \rangle) = \langle Q_1, \ldots, Q_m \rangle$. K can, but need not be, the identity function; any rule that associates a unique set $Q_1, \ldots, Q_m$ with $P_1, \ldots, P_n$ is admissible. The relevant clause in the definition of representation-as then becomes: M exemplifies $P_1, \ldots, P_n$ and the representation imputes properties $Q_1, \ldots, Q_m$ to T where the two sets of properties are connected to each other by a key K.

The above examples help illustrate what we have in mind. Let us begin with the example of the map (in fact the idea of a key is motivated by a study of maps; for a discussion of maps see Galton [3.243] and Sismondo and Chrisman [3.244]). P is a measured distance on the map between the point labeled New York and the point labeled Paris; Q is the distance between New York and Paris in the world; and K is the scale of the map (in the above case, 1:20,000,000). So the key allows us to translate a property of the map (the 29 cm distance) into a property of the world (that New York and Paris are 5800 km apart). But the key involved in the scale model of the ship is more complicated. One of the Ps in this instance is the resistance the model ship faces when moved through the water in a tank. But this doesn’t translate into the resistance faced by the actual ship in the same way in which distances in a map trans-
late into distances in reality. In fact, the resistance of the model and the resistance of the real ship stand in a complicated nonlinear relationship because smaller models encounter disproportionate effects due to the viscosity of the fluid. The exact form of the key is often highly nontrivial and emerges as the result of a thoroughgoing study of the situation; see Sterrett [3.245] for a discussion of fluid mechanics. In the representation of the Mandelbrot set in [3.246, p. 660] a key is used that translates color into divergence speed [3.246, p. 695]. The square shown is a segment of the complex plane and each point represents a complex number. This number is used as parameter value for an iterative function. If the function converges for number c, then the point in the plane representing c is colored black. If the function diverges, then a shading from yellow through green to blue is used to indicate the speed of divergence, where yellow is slow, green is in the middle and blue is fast.

Neither of these keys is obvious or trivial. Determining how to move from properties exemplified by models to properties of their target systems can be a significant task, and should not go unrecognized in an account of scientific representation. In general K is a blank to be filled, and it depends on a number of factors: the scientific discipline, the context, the aims and purposes for which M is used, the theoretical backdrop against which M operates, etc. Building K into the definition of representation-as does not prejudge the nature of K, much less single out a particular key as the correct one. The requirement merely is that there must be some key for M to qualify as a representation-as.

With these modifications in place we can now formulate our own account of representation [3.238, 247]. Consider an agent who chooses an O-object as the base of representation and turns it into a Z-representation by adopting an interpretation I. Let M refer to the package of the O-object together with the interpretation I that turns it into a Z-representation. Then:

Definition 3.15 DEKI
A scientific model M represents a target T iff:
1. M denotes T (and, possibly, parts of M denote parts of T)
2. M is a Z-representation exemplifying properties $P_1, \ldots, P_n$
3. M comes with a key, K, specifying how $P_1, \ldots, P_n$ are translated into a set of features $Q_1, \ldots, Q_m$: $K(\langle P_1, \ldots, P_n \rangle) = \langle Q_1, \ldots, Q_m \rangle$
4. The model imputes at least one of the properties $Q_1, \ldots, Q_m$ onto T.

We call this the DEKI account of representation to highlight its key features: denotation, exemplification, keying-up and imputation.

Before highlighting some issues with this account, let us clarify how the account answers the questions we laid out in Sect. 3.1. Firstly, as an answer to the ER-problem, DEKI (Definition 3.15) provides an abstract framework in which to think about epistemic representation. In general, what concretizes each of the conditions needs to be investigated on a case-by-case basis. But far from being a defect, this degree of abstractness is an advantage. Epistemic representation, and even the narrower model-representation, are umbrella terms covering a vast array of different activities in different fields, and a view that sees representations in fields as diverse as elementary particle physics, evolutionary biology, hydrology and rational choice theory work in exactly the same way is either mistaken or too coarse to make important features visible. DEKI (Definition 3.15) occupies the right middle ground: it is general enough to cover a large array of cases and yet it highlights what all instances of scientific representation have in common. At the same time the account offers an elegant solution to the problem of models without targets: a model that apparently represents Z while there is no Z is a Z-representation but not a representation of a Z.

It should be clear how we can use models to perform surrogative reasoning about their targets according to DEKI (Definition 3.15). The account requires that we investigate the properties that are exhibited by the model. These are then translated into a set of properties that are imputed onto the target. This act of imputation supplies a hypothesis about the target system: does it, or does it not, have those properties? This hypothesis does not have to be true, and as such DEKI (Definition 3.15) allows for the possibility of misrepresentation in a straightforward manner.

DEKI’s (Definition 3.15) abstract character also allows us to talk about different styles of representation. Style, on the DEKI (Definition 3.15) account, is not a monolithic concept; instead it has several dimensions. Firstly, different O-objects can be chosen. In this way we may speak, say, of the checkerboard style and of the cellular automaton style. In each case a specific kind of object has been chosen for various modeling purposes. Secondly, the notion of an interpretation allows us to talk about how closely connected the properties of the model are to those that the object I-instantiates. Thirdly, different types of keys could be used to characterize different styles. In some instances the key might be the identity key, which would amount to a style of modeling that aims to construct replicas of target systems; in other cases the key incorporates different kinds of ideal-
izations or abstractions, which gives rise to idealization and abstraction keys. But different keys may be associated with entirely different representational styles.

Similarly, DEKI (Definition 3.15) suggests that there is no significant difference between scientific representations and other kinds of epistemic representation, at least at the general level. However, this is not to say that the two cannot be demarcated whatsoever. The sorts of interpretations under which pictures portray Zs seem to be different to the sorts of interpretations that are adopted in the scientific framework. Whether or not this can be cashed out more specifically is an interesting question that we cannot investigate here.

Many details in DEKI (Definition 3.15) still need to be spelled out. But the most significant difficulty, perhaps, arises in connection with the problem of ontology. It is not by accident that we have illustrated the account with a physical model, the Phillips–Newlyn machine. Exemplification requires instantiation, which is easily understood for material models, but is highly problematic in the context of nonconcrete models. One option is to view models as fictional entities as discussed in Sect. 3.6. But whether, and if so how, fictional entities instantiate properties is controversial and more philosophical work is needed to make sense of such a notion. It is therefore an open question how this account works for nonconcrete models; for a discussion and a proposal see Frigg and Nguyen [3.248].

Finally, the account provides us with resources with which to think about the applicability of mathematics. As with the problem of style, various options are available. Firstly, mathematical structures themselves can be taken to be O-objects and feature as bases of representation. They can be interpreted on their own terms and therefore exemplify strictly mathematical properties. If one were of a structuralist bent, then the appropriate mathematical properties could be structural, which could then be imputed onto the target system (although notice that this approach faces a similar problem to the question of target-end structure discussed in Sect. 3.4.4). Alternatively, the key could provide a translation of these mathematical properties into ones more readily applicable to physical systems. A third alternative would be to take scientific models to be fictional objects, and then adopt an interpretation towards them under which they exemplify mathematical properties. Again, these could be imputed directly onto the target system, or translated into an alternative set of properties. Finally, these fictional models could themselves exemplify physical properties, but in doing so exemplify structural ones as well. Whenever a physical property is exemplified, this provides an extensional relation defined over the objects that instantiate it. The pros and cons of each of these approaches demand further research, but for the purposes of this chapter we simply note that DEKI (Definition 3.15) puts all of these options on the table. Using the framework of O-objects, interpretations, exemplification, keys, and imputation provides a novel way in which to think about the applicability of mathematics.
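To make the four conditions of Definition 3.15 more tangible, the map example discussed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the DEKI account itself: the names `Model`, `scale_key` and `impute` are hypothetical, properties are reduced to named numbers, and the key is the simple linear scale 1:20,000,000 from the map case. For brevity the sketch imputes every keyed property, whereas Definition 3.15 only requires that at least one be imputed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

# Hypothetical simplification: a property is a named numerical magnitude.
Properties = Dict[str, float]
Key = Callable[[Properties], Properties]

CM_PER_KM = 100_000


@dataclass
class Model:
    """A model M: an O-object (base) together with an interpretation."""
    base: str                 # the O-object, e.g. a paper map
    z_kind: str               # the kind of Z-representation it is
    denotes: Optional[str]    # the target T, or None if nothing is denoted
    exemplified: Properties   # P_1, ..., P_n
    key: Key                  # K, mapping the Ps to the Qs


def scale_key(scale: int) -> Key:
    """Key for a map of scale 1:scale, taking cm on the map to km in the world."""
    def key(ps: Properties) -> Properties:
        return {name: cm * scale / CM_PER_KM for name, cm in ps.items()}
    return key


def impute(model: Model) -> Optional[Properties]:
    """Return the properties Q_1, ..., Q_m imputed to the target.

    The result is a hypothesis about the target that may be false.
    A model that denotes nothing is still a Z-representation, but
    there is no target to which anything could be imputed.
    """
    if model.denotes is None:
        return None
    return model.key(model.exemplified)


world_map = Model(
    base="paper map",
    z_kind="world-representation",
    denotes="the surface of the Earth",
    exemplified={"Paris-New York": 29.0},  # distance on the map in cm
    key=scale_key(20_000_000),
)

griffin_picture = Model(
    base="canvas covered with paint",
    z_kind="griffin-representation",
    denotes=None,
    exemplified={},
    key=lambda ps: ps,  # identity key
)

claims = impute(world_map)
print(claims)                   # {'Paris-New York': 5800.0}
print(impute(griffin_picture))  # None
```

A nonlinear key, such as the one relating the resistance of a model ship to that of the real ship, would replace `scale_key` with a different function of the same type; nothing else in the sketch would change.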
3.8 Envoi
We reviewed theories of epistemic representation. That each approach faces a number of challenges and that there is no consensus on the matter will not have come as a surprise to anybody. We hope, however, that we managed to map the lay of the land and to uncover the fault lines, and thereby aid future discussions.

Acknowledgments. The authors are listed alphabetically; the chapter is fully collaborative. We would like to thank Demetris Portides and Fiora Salis for helpful comments on an earlier draft.
References
3.1 G. Boniolo: On Scientific Representations: From Kant to a New Philosophy of Science (Palgrave Macmillan, Hampshire, New York 2007)
3.2 L. Perini: The truth in pictures, Philos. Sci. 72, 262–285 (2005)
3.3 L. Perini: Visual representation and confirmation, Philos. Sci. 72, 913–926 (2005)
3.4 L. Perini: Scientific representation and the semiotics of pictures. In: New Waves in the Philosophy of Science, ed. by P.D. Magnus, J. Busch (Macmillan, New York 2010) pp. 131–154
3.5 J. Elkins: The Domain of Images (Cornell Univ. Press, Ithaca, London 1999)
3.6 K. Warmbrōd: Primitive representation and misrepresentation, Topoi 11, 89–101 (1992)
3.7 C. Peirce: Principles of philosophy and elements of logic. In: Collected Papers of Charles Sanders Peirce, Volumes I and II: Principles of Philosophy and Elements of Logic, ed. by C. Hartshorne, P. Weiss (Harvard Univ. Press, Cambridge 1932)
3.8 E. Tal: Measurement in science. In: Stanford Encyclopedia of Philosophy, ed. by E.N. Zalta, http://
pp. 49–67
3.13 I. Peschard: Making sense of modeling: Beyond representation, Eur. J. Philos. Sci. 1, 335–352 (2011)
3.14 A. Bokulich: Explanatory fictions. In: Fictions in Science. Philosophical Essays on Modelling and Idealization, ed. by M. Suárez (Routledge, London, New York 2009) pp. 91–109
3.15 A.G. Kennedy: A non representationalist view of model explanation, Stud. Hist. Philos. Sci. 43, 326–332 (2012)
3.16 A.I. Woody: More telltale signs: What attention to representation reveals about scientific explanation, Philos. Sci. 71, 780–793 (2004)
3.17 J. Reiss: The explanation paradox, J. Econ. Methodol. 19, 43–62 (2012)
3.18 M. Lynch, S. Woolgar: Representation in Scientific Practice (MIT, Cambridge 1990)
3.19 R.N. Giere: No representation without representation, Biol. Philos. 9, 113–120 (1994)
3.20 R. Frigg: Models and Representation: Why Structures Are Not Enough, Measurement in Physics and Economics Project Discussion Paper, Vol. DP MEAS 25/02 (London School of Economics, London 2002)
3.21 R. Frigg: Scientific representation and the semantic view of theories, Theoria 55, 49–65 (2006)
3.22 M. Morrison: Models as representational structures. In: Nancy Cartwright’s Philosophy of Science, ed. by S. Hartmann, C. Hoefer, L. Bovens (Routledge, New York 2008) pp. 67–90
3.23 M. Suárez: Scientific representation: Against similarity and isomorphism, Int. Stud. Philos. Sci. 17, 225–244 (2003)
and Convention: Representation in Art and Science, ed. by R. Frigg, M.C. Hunter (Springer, Berlin, New York 2010) pp. 1–18
3.35 S. French: A model-theoretic account of representation (or, I don’t know much about art ... but I know it involves isomorphism), Philos. Sci. 70, 1472–1483 (2003)
3.36 B.C. van Fraassen: Scientific Representation: Paradoxes of Perspective (Oxford Univ. Press, Oxford 2008)
3.37 A.I. Woody: Putting quantum mechanics to work in chemistry: The power of diagrammatic representation, Philos. Sci. 67, S612–S627 (2000)
3.38 S. Stich, T. Warfield (Eds.): Mental Representation: A Reader (Blackwell, Oxford 1994)
3.39 K. Sterelny, P.E. Griffiths: Sex and Death: An Introduction to Philosophy of Biology (Univ. Chicago Press, London, Chicago 1999)
3.40 E. Wigner: The unreasonable effectiveness of mathematics in the natural sciences, Commun. Pure Appl. Math. 13, 1–14 (1960)
3.41 S. Shapiro: Philosophy of Mathematics: Structure and Ontology (Oxford Univ. Press, Oxford 1997)
3.42 O. Bueno, M. Colyvan: An inferential conception of the application of mathematics, Nous 45, 345–374 (2011)
3.43 A. Levy: Modeling without models, Philos. Stud. 152, 781–798 (2015)
3.44 I. Hacking: Representing and Intervening: Introductory Topics in the Philosophy of Natural Science (Cambridge Univ. Press, Cambridge 1983)
3.45 A. Rosenblueth, N. Wiener: The role of models in science, Philos. Sci. 12, 316–321 (1945)
3.24 S. Laurence, E. Margolis: Concepts and cognitive 3.46 R.A. Ankeny, S. Leonelli: What’s so special about
science. In: Concepts: Core Readings, ed. by S. Lau- model organisms?, Stud. Hist. Philos. Sci. 42, 313–
rence, E. Margolis (MIT, Cambridge 1999) pp. 3–81 323 (2011)
3.25 C. Swoyer: Structural representation and surroga- 3.47 U. Klein (Ed.): Tools and Modes of Representation in
tive reasoning, Synthese 87, 449–508 (1991) the Laboratory Sciences (Kluwer, London, Dordrecht
3.26 C. Callender, J. Cohen: There is no special problem 2001)
about scientific representation, Theoria 55, 7–25 3.48 A. Toon: Models as make-believe. In: Beyond
(2006) Mimesis and Convention: Representation in Art and
3.27 D.M. Bailer-Jones: When scientific models repre- Science, ed. by R. Frigg, M. Hunter (Springer, Berlin
sent, Int. Stud. Philos. Sci. 17, 59–74 (2003) 2010) pp. 71–96
3.28 A. Bolinska: Epistemic representation, informative- 3.49 A. Toon: Similarity and scientific representation,
ness and the aim of faithful representation, Syn- Int. Stud. Philos. Sci. 26, 241–257 (2012)
98 Part A Theoretical Issues in Models
3.50 R. Frigg: Fiction and scientific representation. In: 3.71 R.N. Giere: Explaining Science: A Cognitive Ap-
Beyond Mimesis and Convention: Representation in proach (Chicago Univ. Press, Chicago 1988)
Art and Science, ed. by R. Frigg, M. Hunter (Springer, 3.72 S. Ducheyne: Towards an ontology of scientific
Berlin, New York 2010) pp. 97–138 models, Metaphysica 9, 119–127 (2008)
3.51 R.N. Giere: An agent-based conception of models 3.73 N. Cartwright: The Dappled World: A Study of the
and scientific representation, Synthese 172, 269– Boundaries of Science (Cambridge Univ. Press, Cam-
281 (2010) bridge 1999)
3.52 P. Teller: Twilight of the perfect model model, 3.74 N. Cartwright: Models and the limits of theory:
Erkenntnis 55, 393–415 (2001) Quantum hamiltonians and the BCS models of
3.53 O. Bueno, S. French: How theories represent, Br. superconductivity. In: Models as Mediators: Per-
J. Philos. Sci. 62, 857–894 (2011) spectives on Natural and Social Science, ed. by
3.54 A.F. MacKay: Mr. Donnellan and Humpty Dumpty on M. Morgan, M. Morrison (Cambridge Univ. Press,
referring, Philos. Rev. 77, 197–202 (1968) Cambridge 1999) pp. 241–281
3.55 K.S. Donnellan: Putting Humpty Dumpty together 3.75 L. Apostel: Towards the formal study of models in
again, Philos. Rev. 77, 203–215 (1968) the non-formal sciences. In: The Concept and the
3.56 E. Michaelson: This and That: A Theory of Reference Role of the Model in Mathematics and Natural and
for Names, Demonstratives, and Things in Between, Social Sciences, ed. by H. Freudenthal (Reidel, Dor-
Part A | 3
Ph.D. Thesis (Univ. California, Los Angels 2013) drecht 1961) pp. 1–37
3.57 M. Reimer, E. Michaelson: Reference. In: Stanford 3.76 A.-M. Rusanen, O. Lappi: An information semantic
Encycloledia of Philosophy, ed. by E.N. Zalta, http:// account of scientific models. In: EPSA Philosophy
plato.stanford.edu/archives/win2014/entries/ of Science: Amsterdam 2009, ed. by H.W. de Regt,
reference/ (Winter Edition 2014) S. Hartmann, S. Okasha (Springer, Dordrecht 2012)
3.58 C. Abell: Canny resemblance, Philos. Rev. 118, 183– pp. 315–328
223 (2009) 3.77 B.C. van Fraassen: The Empirical Stance (Yale Univ.
3.59 D. Lopes: Understanding Pictures (Oxford Univ. Press, New Haven, London 2002)
Press, Oxford 2004) 3.78 H. Putnam: The Collapse of the Fact-Value Distinc-
3.60 R.N. Giere: How models are used to represent re- tion (Harvard Univ. Press, Cambridge 2002)
ality, Philos. Sci. 71, 742–752 (2004) 3.79 U. Mäki: Models and the locus of their truth, Syn-
3.61 R.N. Giere: Visual models and scientific judgement. these 180, 47–63 (2011)
In: Picturing Knowledge: Historical and Philosoph- 3.80 S.M. Downes: Models, pictures, and unified ac-
ical Problems Concerning the Use of Art in Science, counts of representation: Lessons from aesthetics
ed. by B.S. Baigrie (Univ. Toronto Press, Toronto for philosophy of science, Perspect. Sci. 17, 417–428
1996) pp. 269–302 (2009)
3.62 B. Kralemann, C. Lattmann: Models as icons: Mod- 3.81 M. Morreau: It simply does not add up: The trou-
eling models in the semiotic framework of Peirce’s ble with overall similarity, J. Philos. 107, 469–490
theory of signs, Synthese 190, 3397–3420 (2013) (2010)
3.63 R. Frigg, S. Bradley, H. Du, L.A. Smith: Laplace’s de- 3.82 W.V.O. Quine: Ontological Relativity and Other Es-
mon and the adventures of his apprentices, Philos. says (Columbia Univ. Press, New York 1969)
Sci. 81, 31–59 (2014) 3.83 N. Goodman: Seven strictures on similarity. In:
3.64 N. Goodman: Languages of Art (Hacket, Indi- Problems and Projects, ed. by N. Goodman (Bobbs-
anapolis, Cambridge 1976) Merrill, Indianapolis, New York 1972) pp. 437–446
3.65 A. Yaghmaie: Reflexive, symmetric and transitive 3.84 L. Decock, I. Douven: Similarity after Goodman,
scientific representations, http://philsci-archive. Rev. Philos. Psychol. 2, 61–75 (2011)
pitt.edu/9454 (2012) 3.85 R.N. Shepard: Multidimensional scaling, tree-fit-
3.66 A. Tversky, I. Gati: Studies of similarity. In: Cogni- ting, and clustering, Science 210, 390–398 (1980)
tion and Categorization, ed. by E. Rosch, B. Lloyd 3.86 A. Tversky: Features of similarity, Psychol. Rev. 84,
(Lawrence Elbaum Associates, Hillside New Jersey 327–352 (1977)
1978) pp. 79–98 3.87 M. Weisberg: Getting serious about similarity, Phi-
3.67 M. Poznic: Representation and similarity: Suárez los. Sci. 79, 785–794 (2012)
on necessary and sufficient conditions of scien- 3.88 M. Hesse: Models and Analogies in Science (Sheed
tific representation, J. Gen. Philos. Sci. (2015), Ward, London 1963)
doi:10.1007/s10838-015-9307-7 3.89 W. Parker: Getting (even more) serious about sim-
3.68 H. Putnam: Reason, Truth, and History (Cambridge ilarity, Biol. Philos. 30, 267–276 (2015)
Univ. Press, Cambridge 1981) 3.90 I. Niiniluoto: Analogy and similarity in scientific
3.69 M. Black: How do pictures represent? In: Art, reasoning. In: In Analogical Reasoning: Perspec-
Perception, and Reality, ed. by E. Gombrich, tives of Artificial Intelligence, Cognitive Science,
J. Hochberg, M. Black (Johns Hopkins Univ. Press, and Philosophy, ed. by D.H. Helman (Kluwer, Dor-
London, Baltimore 1973) pp. 95–130 drecht 1988) pp. 271–298
3.70 J.L. Aronson, R. Harré, E. Cornell Way: Realism 3.91 M. Weisberg: Biology and philosophy symposium
Rescued: How Scientific Progress is Possible (Open on simulation and similarity: Using models to un-
Court, Chicago 1995) derstand the world: Response to critics, Biol. Phi-
los. 30, 299–310 (2015)
Models and Representation References 99
3.92 A. Toon: Playing with molecules, Stud. Hist. Philos. 3.116 H.B. Enderton: A Mathematical Introduction to
Sci. 42, 580–589 (2011) Logic (Harcourt, San Diego, New York 2001)
3.93 M. Morgan, T. Knuuttila: Models and modelling in 3.117 P. Suppes: A comparison of the meaning and uses
economics. In: Philosophy of Economics, ed. by of models in mathematics and the empirical sci-
U. Mäki (Elsevier, Amsterdam 2012) pp. 49–87 ences. In: Studies in the Methodology and Founda-
3.94 M. Thomson-Jones: Modeling without mathemat- tions of Science: Selected Papers from 1951 to 1969,
ics, Philos. Sci. 79, 761–772 (2012) ed. by P. Suppes (Reidel, Dordrecht 1969) pp. 10–23,
3.95 G. Rosen: Abstract objects. In: The Stanford 1960
Encyclopedia of Philosophy, ed. by E.N. Zalta, 3.118 B.C. van Fraassen: Structure and perspective:
http://plato.stanford.edu/archives/fall2014/entries/ Philosophical perplexity and paradox. In: Logic
abstract-objects/ (Fall 2014 Edition) and Scientific Methods, ed. by M.L. Dalla Chiara
3.96 S. Hale: Spacetime and the abstract–concrete dis- (Kluwer, Dordrecht 1997) pp. 511–530
tinction, Philos. Stud. 53, 85–102 (1988) 3.119 M. Redhead: The intelligibility of the universe. In:
3.97 R.N. Giere: Why scientific models should not be re- Philosophy at the New Millennium, ed. by A. O’Hear
garded as works of fiction. In: Fictions in Science. (Cambridge Univ. Press, Cambridge 2001)
Philosophical Essays on Modelling and Idealiza- 3.120 S. French, J. Ladyman: Reinflating the semantic ap-
tion, ed. by M. Suárez (Routledge, London 2009) proach, Int. Stud. Philos. Sci. 13, 103–121 (1999)
Part A | 3
pp. 248–258 3.121 N.C.A. Da Costa, S. French: The model-theoretic ap-
3.98 M. Thomson-Jones: Missing systems and face value proach to the philosophy of science, Philos. Sci. 57,
practise, Synthese 172, 283–299 (2010) 248–265 (1990)
3.99 D.M. Armstrong: Universals: An Opinionated Intro- 3.122 P. Suppes: Models of data. In: Studies in the
duction (Westview, London 1989) Methodology and Foundations of Science: Selected
3.100 P. Suppes: Representation and Invariance of Scien- Papers from 1951 to 1969, ed. by P. Suppes (Reidel,
tific Structures (CSLI Publications, Stanford 2002) Dordrecht 1969) pp. 24–35, 1962
3.101 B.C. van Fraassen: The Scientific Image (Oxford Univ. 3.123 P. Suppes: Set-Theoretical Structures in Science
Press, Oxford 1980) (Stanford Univ., Stanford 1970), lecture notes
3.102 N.C.A. Da Costa, S. French: Science and Partial Truth: 3.124 B.C. van Fraassen: Quantum Mechanics: An Empiri-
A Unitary Approach to Models and Scientific Rea- cist View (Oxford Univ. Press, Oxford 1991)
soning (Oxford Univ. Press, Oxford 2003) 3.125 B.C. van Fraassen: A philosophical approach to
3.103 H. Byerly: Model-structures and model-objects, Br. foundations of science, Found. Sci. 1, 5–9 (1995)
J. Philos. Sci. 20, 135–144 (1969) 3.126 N.C.A. Da Costa, S. French: Models, theories, and
3.104 A. Chakravartty: The semantic or model-theoretic structures: Thirty years on, Philos. Sci. 67, 116–127
view of theories and scientific realism, Synthese (2000)
127, 325–345 (2001) 3.127 M. Dummett: Frege: Philosophy of Mathematics
3.105 C. Klein: Multiple realizability and the semantic (Duckworth, London 1991)
view of theories, Philos. Stud. 163, 683–695 (2013) 3.128 G. Hellman: Mathematics Without Numbers: To-
3.106 D. Portides: Scientific models and the semantic wards a Modal-Structural Interpretation (Oxford
view of theories, Philos. Sci. 72, 1287–1289 (2005) Univ. Press, Oxford 1989)
3.107 D. Portides: Why the model-theoretic view of the- 3.129 G. Hellman: Structuralism without structures, Phi-
ories does not adequately depict the methodology los. Math. 4, 100–123 (1996)
of theory application. In: EPSA Epistemology and 3.130 O. Bueno, S. French, J. Ladyman: On representing
Methodology of Science, ed. by M. Suárez, M. Do- the relationship between the mathematical and
rato, M. Rédei (Springer, Dordrecht 2010) pp. 211–220 the empirical, Philos. Sci. 69, 452–473 (2002)
3.108 M.D. Resnik: Mathematics as a Science of Patterns 3.131 S. French: Keeping quiet on the ontology of mod-
(Oxford Univ. Press, Oxford 1997) els, Synthese 172, 231–249 (2010)
3.109 S. Shapiro: Thinking About Mathematics (Oxford 3.132 S. French, J. Saatsi: Realism about structure: The
Univ. Press, Oxford 2000) semantic view and nonlinguistic representations,
3.110 M. Thomson-Jones: Structuralism about scientific Philos. Sci. 73, 548–559 (2006)
representation. In: Scientific Structuralism, ed. by 3.133 S. French, P. Vickers: Are there no things that are
A. Bokulich, P. Bokulich (Springer, Dordrecht 2011) scientific theories?, Br. J. Philos. Sci. 62, 771–804
pp. 119–141 (2011)
3.111 M. Machover: Set Theory, Logic and Their Limita- 3.134 E. Landry: Shared structure need not be shared set-
tions (Cambridge Univ. Press, Cambridge 1996) structure, Synthese 158, 1–17 (2007)
3.112 W. Hodges: A Shorter Model Theory (Cambridge 3.135 H. Halvorson: What scientific theories could not be,
Univ. Press, Cambridge 1997) Philos. Sci. 79, 183–206 (2012)
3.113 C.E. Rickart: Structuralism and Structure: A Math- 3.136 H. Halvorson: Scientific theories. In: The Ox-
ematical Perspective (World Scientific Publishing, ford Handbook of Philosophy of Science, ed. by
Singapore 1995) P. Humphreys (Oxford Univ. Press, Oxford 2016)
3.114 G.S. Boolos, R.C. Jeffrey: Computability and Logic 3.137 K. Brading, E. Landry: Scientific structuralism: Pre-
(Cambridge Univ. Press, Cambridge 1989) sentation and representation, Philos. Sci. 73, 571–
3.115 B. Russell: Introduction to Mathematical Philoso- 581 (2006)
phy (Routledge, London, New York 1993)
100 Part A Theoretical Issues in Models
3.138 C. Glymour: Theoretical equivalence and the se- 3.160 J.W. McAllister: Phenomena and patterns in data
mantic view of theories, Philos. Sci. 80, 286–297 sets, Erkenntnis 47, 217–228 (1997)
(2013) 3.161 J. Nguyen: On the pragmatic equivalence between
3.139 J.B. Ubbink: Model, description and knowledge, representing data and phenomena, Philos. Sci. 83,
Synthese 12, 302–319 (1960) 171–191 (2016)
3.140 A. Bartels: Defending the structural concept of rep- 3.162 M. Frisch: Users, structures, and representation, Br.
resentation, Theoria 21, 7–19 (2006) J. Philos. Sci. 66, 285–306 (2015)
3.141 E. Lloyd: A semantic approach to the structure of 3.163 W. Balzer, C.U. Moulines, J.D. Sneed: An Archi-
population genetics, Philos. Sci. 51, 242–264 (1984) tectonic for Science the Structuralist Program (D.
3.142 B. Mundy: On the general theory of meaningful Reidel, Dordrecht 1987)
representation, Synthese 67, 391–437 (1986) 3.164 W. Demopoulos: On the rational reconstruction of
3.143 S. French: The reasonable effectiveness of math- our theoretical knowledge, Br. J. Philos. Sci. 54,
ematics: Partial structures and the application of 371–403 (2003)
group theory to physics, Synthese 125, 103–120 3.165 J. Ketland: Empirical adequacy and ramsification,
(2000) Br. J. Philos. Sci. 55, 287–300 (2004)
3.144 O. Bueno: Empirical adequacy: A partial structure 3.166 R. Frigg, I. Votsis: Everything you always wanted to
approach, Stud. Hist. Philos. Sci. 28, 585–610 (1997) know about structural realism but were afraid to
Part A | 3
3.145 O. Bueno: What is structural empiricism? Scientific ask, Eur. J. Philos. Sci. 1, 227–276 (2011)
change in an empiricist setting, Erkenntnis 50, 59– 3.167 P. Ainsworth: Newman’s objection, Br. J. Philos. Sci.
85 (1999) 60, 135–171 (2009)
3.146 F. Pero, M. Suárez: Varieties of misrepresentation 3.168 S. Shapiro: Mathematics and reality, Philos. Sci. 50,
and homomorphism, Eur. J. Philos. Sci. 6(1), 71–90 523–548 (1983)
(2016) 3.169 S. French: The Structure of the World. Metaphysics
3.147 P. Kroes: Structural analogies between physical sys- and Representation (Oxford Univ. Press, Oxford
tems, Br. J. Philos. Sci. 40, 145–154 (1989) 2014)
3.148 F.A. Muller: Reflections on the revolution at Stan- 3.170 M. Tegmark: The mathematical universe, Found.
ford, Synthese 183, 87–114 (2011) Phys. 38, 101–150 (2008)
3.149 E.W. Adams: The foundations of rigid body me- 3.171 M. Suárez, A. Solé: On the analogy between cog-
chanics and the derivation of its laws from those of nitive representation and truth, Theoria 55, 39–48
particle mechanics. In: The Axiomatic Method: With (2006)
Special Reference to Geometry and Physics, ed. by 3.172 M. Suárez: Deflationary representation, inference,
L. Henkin, P. Suppes, A. Tarski (North-Holland, Am- and practice, Stud. Hist. Philos. Sci. 49, 36–47 (2015)
sterdam 1959) pp. 250–265 3.173 A. Chakravartty: Informational versus functional
3.150 O. Bueno: Models and scientific representations. theories of scientific representation, Synthese 172,
In: New Waves in Philosophy of Science, ed. by 197–213 (2010)
P.D. Magnus, J. Busch (Pelgrave MacMillan, Hamp- 3.174 W. Künne: Conceptions of Truth (Clarendon, Oxford
shire 2010) pp. 94–111 2003)
3.151 M. Budd: How pictures look. In: Virtue and Taste, 3.175 R.B. Brandom: Making it Explicit: Reasoning, Rep-
ed. by D. Knowles, J. Skorupski (Blackwell, Oxford resenting and Discursive Commitment (Harvard
1993) pp. 154–175 Univ. Press, Cambridge 1994)
3.152 P. Godfrey-Smith: The strategy of model-based sci- 3.176 R.B. Brandom: Articulating Reasons: An Introduc-
ence, Biol. Philos. 21, 725–740 (2006) tion to Inferentialism (Harvard Univ. Press, Cam-
3.153 T. Harris: Data models and the acquisition and ma- bridge 2000)
nipulation of data, Philos. Sci. 70, 1508–1517 (2003) 3.177 X. de Donato Rodriguez, J. Zamora Bonilla: Credibil-
3.154 B.C. van Fraassen: Theory construction and ex- ity, idealisation, and model building: An inferential
periment: An empiricist view, Proc. Philos. Sci. 2, approach, Erkenntnis 70, 101–118 (2009)
663–677 (1981) 3.178 M. Suárez: Scientific Representation, Philos. Com-
3.155 B.C. van Fraassen: Laws and Symmetry (Clarendon, pass 5, 91–101 (2010)
Oxford 1989) 3.179 G. Contessa: Scientific models and representation.
3.156 B.C. van Fraassen: Empricism in the philoso- In: The Continuum Companion to the Philosophy
phy of science. In: Images of Science: Essays of Science, ed. by S. French, J. Saatsi (Continuum
on Realism and Empiricism with a Reply from Press, London 2011) pp. 120–137
Bas C. van Fraassen, ed. by P.M. Churchland, 3.180 G. Contessa: Scientific models and fictional objects,
C.A. Hooker (Univ. Chicago Press, London, Chicago Synthese 172, 215–229 (2010)
1985) pp. 245–308 3.181 E. Shech: Scientific misrepresentation and guides
3.157 J. Bogen, J. Woodward: Saving the phenomena, to ontology: The need for representational code
Philos. Rev. 97, 303–352 (1988) and contents, Synthese 192(11), 3463–3485 (2015)
3.158 J. Woodward: Data and phenomena, Synthese 79, 3.182 S. Ducheyne: Scientific representations as limiting
393–472 (1989) cases, Erkenntnis 76, 73–89 (2012)
3.159 P. Teller: Whither constructive empiricism, Philos. 3.183 R.F. Hendry: Models and approximations in quan-
Stud. 106, 123–150 (2001) tum chemistry. In: Idealization IX: Idealization in
Contemporary Physics, ed. by N. Shanks (Rodopi,
Models and Representation References 101
Amsterdam 1998) pp. 123–142 3.207 S. Friend: Fictional characters, Philos. Compass 2,
3.184 R. Laymon: Computer simulations, idealizations 141–156 (2007)
and approximations, Proc. Bienn. Meet. Philos. Sci. 3.208 F. Salis: Fictional entities. In: Online Companion to
Assoc., Vol. 2 (1990) pp. 519–534 Problems in Aanalytical Philosophy, ed. by J. Bran-
3.185 C. Liu: Explaining the emergence of cooperative quinho, R. Santos, doi:10.13140/2.1.1931.9040 (2014)
phenomena, Philos. Sci. 66, S92–S106 (1999) 3.209 P. Godfrey-Smith: Models and fictions in science,
3.186 J. Norton: Approximation and idealization: Why the Philos. Stud. 143, 101–116 (2009)
difference matters, Philos. Sci. 79, 207–232 (2012) 3.210 R. Sugden: Credible worlds, capacities and mech-
3.187 J.L. Ramsey: Approximation. In: The Philosophy of anisms, Erkenntnis 70, 3–27 (2009)
Science: An Encyclopedia, ed. by S. Sarkar, J. Pfeifer 3.211 J. Cat: Who’s afraid of scientific fictions?: Mauricio
(Routledge, New York 2006) pp. 24–27 Suárez (Ed.): Fictions in Science. Philosophical Es-
3.188 R.I.G. Hughes: Models and representation, Philos. says on Modeling and Idealization, J. Gen. Philos.
Sci. 64, S325–S336 (1997) Sci. 43, 187–194 (2012), book review
3.189 R.I.G. Hughes: The Theoretical Practises of Physics: 3.212 C. Liu: A Study of model and representation based
Philosophical Essays (Oxford Univ. Press, Oxford on a Duhemian thesis. In: Philosophy and Cogni-
2010) tive Science: Western and Eastern studies, ed. by
3.190 R.I.G. Hughes: Laws of nature, laws of physics, and L. Magnani, P. Li (Springer, Berlin, Heidelberg 2012)
Part A | 3
the representational account of theories, ProtoSo- pp. 115–141
ciology 12, 113–143 (1998) 3.213 C. Liu: Symbolic versus modelistic elements in sci-
3.191 L.A. Smith: Chaos: A Very Short Introduction (Oxford entific modeling, Theoria 30, 287–300 (2015)
Univ. Press, Oxford 2007) 3.214 C. Pincock: Mathematics and Scientific Representa-
3.192 W.D. Niven (Ed.): The Scientific Papers of James tion (Oxford Univ. Press, Oxford 2012)
Clerk Maxwell (Dover Publications, New York 1965) 3.215 M. Vorms: Representing with imaginary models:
3.193 H. Vaihinger: The Philosophy of as if: A System of Formats matter, Stud. Hist. Philos. Sci. 42, 287–295
the Theoretical, Practical, and Religious Fictions of (2011)
Mankind (Kegan Paul, London 1911) p. 1924, English 3.216 M. Vorms: Formats of representation in scientific
translation theorising. In: Models, Simulations, and Represen-
3.194 N. Cartwright: How the Laws of Physics Lie (Oxford tations, ed. by P. Humphreys, C. Imbert (Routledge,
Univ. Press, Oxford 1983) New York 2012) pp. 250–274
3.195 D.N. McCloskey: Storytelling in economics. In: Nar- 3.217 T. Knuuttila, M. Boon: How do models give us
rartive in Culture. The uses of Storytelling in the knowledge? The case of Carnot’s ideal heat engine,
Sciences, Philosophy, and Literature, ed. by C. Nash Eur. J. Philos. Sci. 1, 309–334 (2011)
(Routledge, London 1990) pp. 5–22 3.218 R. Frigg: Fiction in science. In: Fictions and Models:
3.196 A. Fine: Fictionalism, Midwest Stud. Philos. 18, 1–18 New Essays, ed. by J. Woods (Philiosophia, Munich
(1993) 2010) pp. 247–287
3.197 L. Sklar: Theory and Truth. Philosophical Critique 3.219 M.E. Kalderon (Ed.): Fictionalism in Metaphysics
Within Foundational Science (Oxford Univ. Press, (Oxford Univ. Press, Oxford 2005)
Oxford 2000) 3.220 A. Fine: Fictionalism. In: Routledge Encyclopedia
3.198 C.Z. Elgin: Considered Judgement (Princeton Univ. of Philosophy, ed. by E. Craig (Routledge, London
Press, Princeton 1996) 1998)
3.199 S. Hartmann: Models and stories in hadron physics. 3.221 A. Fine: Science fictions: Comment on Godfrey-
In: Models as Mediators. Perspectives on Natural Smith, Philos. Stud. 143, 117–125 (2009)
and Social Science, ed. by M. Morgan, M. Morrison 3.222 E. Winsberg: A function for fictions: Expanding the
(Cambridge Univ. Press, Cambridge 1999) pp. 326– scope of science. In: Fictions in Science: Philosoph-
346 ical Essays in on Modeling and Idealization, ed. by
3.200 M. Morgan: Models, stories and the economic M. Suárez (Routledge, New York 2009) pp. 179–191
world, J. Econ. Methodol. 8, 361–384 (2001) 3.223 M. Suárez: Scientific fictions as rules of inference.
3.201 M. Morgan: Imagination and imaging in model In: Fictions in Science: Philosophical Essays in on
building, Philos. Sci. 71, 753–766 (2004) Modeling and Idealization, ed. by M. Suárez (Rout-
3.202 R. Sugden: Credible worlds: The status of theoreti- ledge, New York 2009) pp. 158–178
cal models in economics, J. Econ. Methodol. 7, 1–31 3.224 M. Morrison: Fictions, representations, and real-
(2000) ity. In: Fictions in Science: Philosophical Essays on
3.203 R. Frigg: Models and fiction, Synthese 172, 251–268 Modeling and Idealization, ed. by M. Suárez (Rout-
(2010) ledge, New York 2009) pp. 110–135
3.204 T. Grüne-Yanoff, P. Schweinzer: The roles of sto- 3.225 G.M. Purves: Finding truth in fictions: Identify-
ries in applying game theory, J. Econ. Methodol. ing non-fictions in imaginary cracks, Synthese 190,
15, 131–146 (2008) 235–251 (2013)
3.205 A. Toon: Models as Make-Believe. Imagination, 3.226 J. Woods: Against fictionalism. In: Model-Based
Fiction and Scientific Representation (Palgrave Reasoning in Science and Technology: Theoretical
Macmillan, Basingstoke 2012) and Cognitive Issues, ed. by L. Magnani (Springer,
3.206 A. Levy: Models, fictions, and realism: Two pack- Berlin, Heidelberg 2014) pp. 9–42
ages, Philos. Sci. 79, 738–748 (2012)
102 Part A Theoretical Issues in Models
3.227 M. Weisberg: Who is a modeler?, Br. J. Philos. Sci. tice: Nancy Cartwright and the Nature of Scientific
58, 207–233 (2007) Reasoning, ed. by H.-K. Chao, R. Julian, C. Szu-Ting
3.228 A. Toon: The ontology of theoretical modelling: (Springer, New York 2017), in press
Models as make-believe, Synthese 172, 301–315 3.239 C.Z. Elgin: With Reference to Reference (Hackett, In-
(2010) dianapolis 1983)
3.229 K.L. Walton: Mimesis as Make-Believe: On the 3.240 F. Schier: Deeper in Pictures: An Essay on Pictorial
Foundations of the Representational Arts (Harvard Representation (Cambridge Univ. Press, Cambridge
Univ. Press, Cambridge 1990) 1986)
3.230 S. Yablo: Aboutness (Princeton Univ. Press, Prince- 3.241 J. Kulvicki: Pictorial representation, Philos. Com-
ton 2014) pass 1, 535–546 (2006)
3.231 N. Cartwright: Models: Parables v fables. In: Beyond 3.242 M. Jones: Idealization and abstraction: A frame-
Mimesis and Convention. Representation in Art and work. In: Idealization XII: Correcting the Model-
Science, ed. by R. Frigg, M.C. Hunter (Springer, Idealization and Abstraction in the Sciences, ed. by
Berlin, New York 2010) pp. 19–32 M. Jones, N. Cartwright (Rodopi, Amsterdam 2005)
3.232 T. Schelling: Micromotives and Macrobehavior pp. 173–218
(Norton, New York 1978) 3.243 A. Galton: Space, time, and the representation of
3.233 C.A. Pissarides: Loss of skill during unemployment geographical reality, Topoi 20, 173–187 (2001)
Part A | 3
and the persistence of unemployment shocks, Q. 3.244 S. Sismondo, N. Chrisman: Deflationary meta-
J. Econ. 107, 1371–1391 (1992) physics and the nature of maps, Proc. Philos. Sci.
3.234 T. Grüne-Yanoff: Learning from minimal economic 68, 38–49 (2001)
models, Erkenntnis 70, 81–99 (2009) 3.245 S.G. Sterrett: Models of machines and models
3.235 T. Knuuttila: Isolating representations versus cred- of phenomena, Int. Stud. Philos. Sci. 20, 69–80
ible constructions? Economic modelling in theory (2006)
and practice, Erkenntnis 70, 59–80 (2009) 3.246 J.H. Argyris, G. Faust, M. Haase: Die Erforschung des
3.236 L. Magnani: Scientific models are not fictions: Chaos: Eine Einführung für Naturwissenschaftler
Model-based science as epistemic warfare. In: Phi- und Ingenieure (Vieweg Teubner, Braunschweig
losophy and Cognitive Science: Western and East- 1994)
ern Studies, ed. by L. Magnani, P. Li (Springer, 3.247 R. Frigg, J. Nguyen: The Turn of the Valve:
Berlin, Heidelberg 2012) pp. 1–38 Representing with Material Models, Unpublished
3.237 J. Odenbaugh: Semblance or similarity?, Reflec- Manuscript
tions on simulation and similarity, Biol. Philos. 30, 3.248 R. Frigg, J. Nguyen: The fiction view of models
277–291 (2015) reloaded, forthcoming in The Monist, July 2016
3.238 R. Frigg, J. Nguyen: Scientific representation is rep-
resentation as. In: Philosophy of Science in Prac-
Alisa Bokulich
4. Models and Explanation
[...] practice do real explanatory work? Do some highly
abstract and mathematical models exhibit a non-causal
form of scientific explanation? How can one
distinguish an exploratory how-possibly model
explanation from a genuine how-actually model
explanation? Do modelers face tradeoffs such that
a model that is optimized for yielding explanatory
insight, for example, might fail to be the most
predictively accurate, and vice versa? This chapter
explores the various answers that have been given
to these questions.
Explanation is one of the central aims of science, and the attempt to
understand the nature of scientific explanation is at the heart of the
philosophy of science. An explanation can be analyzed as consisting of two
parts, a phenomenon or event to be explained, known as the explanandum, and
that which does the job of explaining, the explanans. On the traditional
approach, to explain a phenomenon is either to deduce the explanandum
phenomenon from the relevant laws of nature and initial conditions, such as
on the deductive-nomological (DN) account [4.1], or to trace the detailed
causal chain leading up to that event, such as on the causal–mechanical
account [4.2]. Underlying this traditional approach are the assumptions
that, in order to genuinely explain, the explanans must be entirely true,
and that the more complete and detailed the explanans is, the better the
scientific explanation.
As philosophers of science have turned to more careful examinations of
actual scientific practice, however, there have been three key observations
that have challenged this traditional approach: first, many of the
phenomena scientists seek to explain are incredibly complex; second, the
laws of nature supposedly needed for explanation are either few and far
between or entirely absent in many of the sciences; and third, a detailed
causal description of the chain of events and interactions leading up to
a phenomenon is often either beyond our grasp or not in fact what is most
important for a scientific understanding of the phenomenon.
More generally, there has been a growing recognition that much of science
is a model-based activity. (For an overview of many different types of
models in science, and some of the philosophical issues regarding the
nature and use of such models, refer to [4.3].)
Models are by definition incomplete and idealized descriptions of the
systems they describe. This practice raises all sorts of epistemological
questions, such as how can it be that false models lead to true insights?
And most relevant to our discussion here, how might the extensive use of
models in science lead us to revise our philosophical account of scientific
explanation?
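The deductive-nomological pattern described above can be laid out schematically. What follows is only an illustrative sketch: the symbols $L_1, \dots, L_n$ (laws of nature), $C_1, \dots, C_k$ (initial conditions), and $E$ (the explanandum) are notation assumed here for convenience and are not taken from the chapter.

```latex
% Schematic form of a deductive-nomological (DN) explanation.
% L_i, C_j and E are assumed labels, introduced only for illustration.
% Requires amsmath for \text.
\[
\underbrace{L_1, \dots, L_n}_{\text{laws of nature}},\;
\underbrace{C_1, \dots, C_k}_{\text{initial conditions}}
\;\vdash\;
\underbrace{E}_{\text{explanandum}}
\]
```

Read this way, the explanans is the whole set of premises to the left of the inference sign. The two traditional assumptions then amount to requiring that every $L_i$ and $C_j$ be true and that the list of premises be as complete and detailed as possible, which is exactly what the use of idealized models puts under pressure.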
sources. The model demonstrates that such a strategy is stable and successful, and hence can be used as part of the explanation for why we find this polymorphism among sparrows (see [4.4, 5] for further discussion).
There are, of course, many perils in assuming that, just because we see a phenomenon or pattern exhibited in a model, it therefore explains why we see it in the real world: the same pattern or phenomenon could be produced in multiple, very different ways, and hence it might be only a phenomenological model at best, useful for prediction, but not a genuine explanation. Explanation and the concomitant notion of understanding are what we call success terms: if the purported explanation is not, in fact, right (right in some sense that will need to be spelled out) and the understanding is only illusory, then it is not, in fact, a genuine explanation. Determining what the success conditions are for a genuine explanation is the central philosophical problem in scientific explanation.
Those who have defended the explanatory power of models have typically argued that further conditions must be met in order for a model’s exhibiting of a salient pattern or phenomenon to count as part of a genuine explanation of its real-world counterpart. Not all models are explanatory, and an adequate account of model explanation must provide grounds for making such discriminations. As we will see, however, different approaches have filled in these further requirements in different ways.
One of the earliest defenses of the view that models can explain is McMullin’s [4.6] hypothetico-structural (HS) account of model explanations. In an HS explanation, one explains a complex phenomenon by pos-
writes [4.7, p. 261]:
“If techniques for which no theoretical justification can be given have to be utilized to correct a formal idealization, this is taken to count against the explanatory propriety of that idealization. The model itself in such a case is suspect, no matter how good the predictive results it may produce.”
He further notes that a theoretical justification for the de-idealization process will only succeed if the original model has successfully captured the real structure of the phenomenon of interest.
As an example, McMullin [4.8] describes the fertility of the continental drift model in explaining why the continents seem to fit together like pieces of a puzzle and why similar fossils are found at distant locations. The continental drift model involved all sorts of idealizations and gaps: most notably, the chief proponent of this approach, Alfred Wegener, could offer no account of the forces or mechanisms by which the massive continents could move. Strictly speaking, we now know that the continental drift model is false, and has been supplanted by plate tectonics. But as McMullin notes, the continental drift model nonetheless captures key features of the real structure of the phenomenon of interest, and, hence, succeeds in giving genuine explanatory insight.
While McMullin’s account of HS model explanations fits in many cases, there are other examples of model explanations in the sciences that do not seem to fit his account. First, there seem to be examples of model explanations where the idealizations are ineliminable, and, hence, they cannot be justified through
Models and Explanation 4.1 The Explanatory Function of Models 105
anything like the de-idealization analysis that McMullin describes [4.9]. Second, not all models are related to their target phenomena via an idealization: some models represent through a fictionalization [4.10]. Third, insofar as McMullin’s HS model explanations are a subspecies of causal explanations, they do not account for noncausal model explanations. These sorts of cases will be discussed more fully in subsequent sections.
Another early account of the explanatory power of models is Cartwright’s [4.11] simulacrum account of explanation, which she introduces as an alternative to the DN account of explanation and elaborates in her book How the Laws of Physics Lie. Drawing on Duhem’s [4.12] theory of explanation, she argues [4.11, p. 152]:
“To explain a phenomenon is to find a model that fits it into the basic framework of the theory and that thus allows us to derive analogues for the messy and complicated phenomenological laws which are true of it.”
According to Cartwright, the laws of physics do not describe our real messy world, only the idealized world we construct in our models. She gives the example of the harmonic oscillator model, which is used in quantum mechanics to describe a wide variety of systems. One describes a real-world helium-neon laser as if it were a van der Pol oscillator; this is how the phenomenon becomes tractable and we are able to make use of the mathematical framework of our theory. The laws of quantum mechanics are true in this model, but this model is just a simulacrum of the real-world phenomenon. By model, Cartwright means “an especially prepared, usually fictional description of the system under study” [4.11, p. 158]. She notes that while some of the properties ascribed to the objects in the models are idealizations, there are other properties that are pure fictions; hence, one should not think of models in terms of idealizations alone.
Although Cartwright’s simulacrum account is highly suggestive, it leaves unanswered many key questions, such as when a model should or should not be counted as explanatory. Elgin and Sober [4.13] offer a possible emendation to Cartwright’s account that they argue discriminates which sorts of idealized causal models can explain. The key, according to their approach, is to determine whether or not the idealizations in the model are what they call harmless. A harmless idealization is one that if corrected “wouldn’t make much difference in the predicted value of the effect variable” [4.13, p. 448]. They illustrate this approach using the example of optimality models in evolutionary biology. Optimality models are models that determine what value of a trait maximizes fitness (is optimal) for an organism given certain constraints (e.g., the optimal length of a bear’s fur, given the benefits of longer fur and the costs of growing it, or the optimal height at which crows should drop walnuts in order to crack open the shells, given the costs of flying higher, etc.). If organisms are indeed fitter the closer a trait is to the optimal value, and if natural selection is the only force operating, then the optimal value for that trait will evolve in the population. Thus, optimality models are used to explain why organisms have trait values at or near the optimal value (e.g., why crows drop walnuts from an average of 3 m high [4.14]).
As Elgin and Sober note, optimality models contain all sorts of idealizations: “they describe evolutionary trajectories of populations that are infinitely large in which reproduction is asexual with offspring always resembling their parents, etc.” [4.13, p. 447]. Nonetheless, they argue that these models are genuinely explanatory when it can be shown that the value described in the explanandum is close to the value predicted by the idealized model; when this happens we can conclude that the idealizations in the model are harmless [4.13, p. 448]. Apart from this concession about harmless idealizations, Elgin and Sober’s account of explanation remains close to the traditional DN account in that they further require:
1. The explanans must cite the cause of the explanandum
2. The explanans must cite a law
3. All of the explanans propositions must be true [4.13, p. 446]
though their condition 3 might better be stated as all the explanans propositions are either true or harmlessly false.
As a general account of model explanations, however, one might argue that the approaches of Cartwright, Elgin, and Sober are too restrictive. As noted before, this approach still depends on there being laws of nature from which the phenomenon is to be derived, and such laws just might not be available. Moreover, it is not clear that explanatory models will contain only harmless idealizations. There may very well be cases in which the idealizations make a difference (are not harmless) and yet are essential to the explanation (e.g., [4.15, 16]).
While the simulacrum approach of Cartwright, especially as further developed by Elgin and Sober, largely draws its inspiration from the traditional DN approach to explanation, there are other approaches to model explanation that are tied more closely to the traditional causal–mechanical approach to explanation. Craver [4.17], for example, has argued that models are explanatory when they describe mechanisms. He writes “[. . . ] the distinction between explanatory and nonexplanatory models is that the [former], and not the [latter] describe mechanisms” [4.17, p. 367]. The central notion of mechanism, here, can be understood as consisting of the various components or parts of the phenomenon of interest, the activities of those components, and how they are organized in relation to each other.
Craver imposes rather strict conditions on when such mechanistic models can be counted as explanatory; he writes, “To characterize the phenomenon correctly and completely is the first restrictive step in turning a model into an acceptable mechanistic explanation” [4.17, p. 369]. (Some have argued that if one has a complete and accurate description of the system or phenomenon of interest, then it is not clear that one has a model [4.18]). Craver analyzes the example of the Hodgkin–Huxley mathematical model of the action potential in an axon (nerve fiber). Despite the fact that this model allowed Hodgkin and Huxley to derive many electrical features of neurons, and the fact that it was based on a number of fundamental laws of physics and chemistry, Craver argues that it was not in fact an explanatory model. He describes it instead as merely a phenomenological model because it failed to accurately describe the details of the underlying mechanism.
A similar mechanistic approach to model explanation has been developed by Kaplan [4.19], who introduces what he calls the mechanism–model–mapping (or 3M) constraint. He defines the 3M constraint as follows [4.19, p. 347]:
“A model of a target phenomenon explains that phenomenon to the extent that (a) the variables in the model correspond to identifiable components, activities, and organizational features of the target mechanism that produces, maintains, or underlies the phenomenon, and (b) the (perhaps mathematical) variables in the model correspond to causal relations among the components of the target mechanism.”
Kaplan takes this 3M constraint to provide a demarcation line between explanatory and nonexplanatory models. He further notes that [4.19, p. 347]
“3M aligns with the highly plausible assumption that the more accurate and detailed the model is for a target system or phenomenon the better it explains that phenomenon.”
Models that do not comply with 3M are rejected as nonexplanatory, being at best phenomenological models, useful for prediction, but giving no explanatory insight. In requiring that explanatory models describe the real components and activities in the mechanism that are in fact responsible for producing the phenomenon ([4.17, p. 361], [4.19, p. 353]), Craver and Kaplan rule out the possibility that fictional, metaphorical, or strongly idealized models can be explanatory.
One of the most comprehensive defenses of the explanatory power of models is given by Bokulich [4.18, 20–22], who argues that model explanations such as the three discussed previously (McMullin, Cartwright–Elgin–Sober, and Craver–Kaplan) can be seen as special cases of a more general account of the explanatory power of models. Bokulich’s approach draws on Woodward’s counterfactual account of explanation, in which [4.23, p. 11]
“the explanation must enable us to see what sort of difference it would have made for the explanandum if the factors cited in the explanans had been different in various possible ways.”
She argues that model explanations typically share the following three features: first, the explanans makes essential reference to a scientific model, which, as is the case with all models, will be an idealized, abstracted, or fictionalized representation of the target system. Second, the model explains the explanandum by showing how the elements of the model correctly capture the patterns of counterfactual dependence in the target system, enabling one to answer a wide range of what Woodward calls what-if-things-had-been-different questions. Finally, there must be what Bokulich calls a justificatory step, specifying the domain of applicability of the model and showing where and to what extent the model can be trusted as an adequate representation of the target for the purpose(s) in question [4.18, p. 39]; see also [4.22, p. 730]. She notes that this justificatory step can proceed bottom-up through something like a de-idealization analysis (as McMullin, Elgin, and Sober describe), top-down through an overarching theory (such as in the semiclassical mechanics examples Bokulich [4.20, 21] discusses), or through some combination.
Arguably one of the advantages of Bokulich’s approach is that it is not tied to one particular conception of scientific explanation, such as the DN or mechanistic accounts. By relaxing Woodward’s manipulationist construal of the counterfactual condition, Bokulich’s approach can even be extended to highly abstract, structural, or mathematical model explanations. She argues that the various subspecies of model explanation can be distinguished by noting what she calls the origin or ground of the counterfactual dependence. She explains, it could be either [4.18, p. 40]
“the elements represented in the model causally producing the explanandum (in the case of causal
model explanations), the elements of the model being the mechanistic parts which make up the explanandum-system whole (in the case of mechanistic model explanations), or the explanandum being a consequence of the laws cited in the model (in the case of covering law model explanations).”
She goes on to identify a fourth type of model explanation, which she calls structural model explanation, in which the counterfactual dependence is grounded in the typically mathematical structure of the theory, which limits the sorts of objects, properties, states, or behaviors that are admissible within the framework of that theory [4.18, p. 40]. Bokulich’s approach can be thought of as one way to flesh out Morrison’s suggestive, but unelaborated, remark that “the reason models are explanatory is that in representing these systems, they exhibit certain kinds of structural dependencies” [4.24, p. 63].
More recently, Rice [4.25] has drawn on Bokulich’s account to develop a similar approach to the explanatory power of models that likewise uses Woodward’s counterfactual approach without the manipulation condition. He writes [4.25, p. 20]:
“The requirement that these counterfactuals must enable one to, in principle, intervene in the system restricts Woodward’s account to specifically causal explanations. However, I think it is a mistake to require that all scientific explanations must be causal. Indeed, if one looks at many of the explanations offered by scientific modelers, causes are not mentioned.”
Compare this to Bokulich’s statement [4.18, p. 39]:
“I think it is a mistake to construe all scientific explanation as a species of causal explanation, and more to the point here, it is certainly not the case that all model explanations should be understood as causal explanations. Thus while I shall adopt Woodward’s account of explanation as the exhibiting of a pattern of counterfactual dependence, I will not construe this dependence narrowly in terms of the possible causal manipulations of the system”
Rice rightly notes that the question of causation is conceptually distinct from the question of what explains. He further requires on this approach that model explanations provide two kinds of counterfactual information, namely both what the phenomenon depends on and what sorts of changes are irrelevant to that phenomenon. Following Batterman [4.9, 15, 26], he notes that for explanations of phenomena that exhibit a kind of universality, an important part of the explanation is understanding that the particular causal details or processes are irrelevant – the same phenomenon would have been reproduced even if the causal details had been different in certain ways.
As an illustration, Rice discusses the case of optimality modeling in biology. He notes that optimality models are not only highly idealized, but also can be understood as a type of equilibrium explanation, where “most of the explanatory work in these models is done by synchronic mathematical representations of structural features of the system” [4.25, p. 8]. He connects this to the counterfactual account of model explanation as follows [4.25, p. 17]:
“Optimality models primarily focus on noncausal counterfactual relations between structural features and the system’s equilibrium point. Moreover, these features can sometimes explain the target phenomenon without requiring any additional causal claims about the relationships represented in the model.”
These causal details are irrelevant because the structural features cited in the model are multiply realizable; indeed, this is what allows optimality models to be used in explaining a wide variety of features across a diversity of biological systems.
In the approaches to model explanations discussed here, two controversial issues have arisen that merit closer scrutiny: first, whether the fictions or falsehoods in models can themselves do real explanatory work (i.e., even when they are neither harmless, de-idealizable, nor eliminable), and second, whether many model explanations illustrate an important, but often overlooked, noncausal form of explanation. These issues will be taken up in turn in the next two sections.
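Rice's requirement that a model explanation convey both what the phenomenon depends on and what is irrelevant to it can be made concrete with a toy optimality model. The following Python sketch is purely illustrative and is not drawn from any of the works cited: the functional form of the cost, the parameter names (hardness, min_height, cost_per_meter), and all numerical values are assumptions, chosen only so that the toy optimum falls at 3 m. It assumes that the expected number of drops needed to crack a shell is 1 + hardness / (height - min_height), and that each drop costs an ascent proportional to the height.

```python
# Toy optimality model (hypothetical; not taken from the cited literature).
# Assumption: expected drops needed = 1 + hardness / (height - min_height),
# and each drop costs cost_per_meter * height in ascent effort.


def expected_cost(height, hardness, min_height, cost_per_meter):
    """Expected total ascent cost of cracking one shell from a given height."""
    expected_drops = 1.0 + hardness / (height - min_height)
    return cost_per_meter * height * expected_drops


def optimal_height(hardness, min_height, cost_per_meter,
                   step=0.01, max_height=20.0):
    """Grid search for the drop height that minimizes the expected cost."""
    n_steps = int(round((max_height - min_height) / step))
    # Start one step above min_height to avoid dividing by zero.
    candidates = [min_height + step * i for i in range(1, n_steps + 1)]
    return min(
        candidates,
        key=lambda h: expected_cost(h, hardness, min_height, cost_per_meter),
    )


if __name__ == "__main__":
    baseline = optimal_height(hardness=4.0, min_height=1.0, cost_per_meter=1.0)
    # Irrelevance information: rescaling the cost of flight.
    costlier = optimal_height(hardness=4.0, min_height=1.0, cost_per_meter=5.0)
    # Dependence information: changing how hard the shell is to crack.
    harder = optimal_height(hardness=9.0, min_height=1.0, cost_per_meter=1.0)
    print(f"baseline optimum:        {baseline:.2f} m")
    print(f"flight five times dearer: {costlier:.2f} m")
    print(f"harder shell:            {harder:.2f} m")
```

Under these assumptions the optimum can also be found analytically as min_height + sqrt(hardness * min_height). Multiplying cost_per_meter by five leaves the optimum unchanged, whereas raising hardness shifts it: the first is the kind of irrelevance information, and the second the kind of dependence information, that Rice asks a model explanation to supply.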
proach (such as [4.28]) are explored in a special issue of the Journal of Economic Methodology (volume 20, issue 3).)
The field has largely split into two camps on this issue: those who think it is only the true parts of models that do explanatory work and those who think the falsehoods play an essential role in the model explanation. Those in the former camp rely on things like de-idealization and harmless analyses to show that the falsehoods do not get in the way of the true parts of the model that do the real explanatory work. Those in the latter camp have the challenging task of showing that some idealizations are essential and some fictions yield true insights.
The received view is that the false parts of models only concern those things that are explanatorily irrelevant. Defenders of the received view include Strevens, who in his book detailing his kairetic account of scientific explanation (Strevens takes the term kairetic from the ancient Greek word kairos, meaning crucial moment [4.29, p. 477]), writes, “No causal account of explanation – certainly not the kairetic account – allows nonveridical models to explain” [4.29, p. 297]. He spells out more carefully how such a view is to be reconciled with the widespread use of idealized models to explain phenomena in nature, by drawing the following distinction [4.29, p. 318]:
“The content of an idealized model, then, can be divided into two parts. The first part contains the difference-makers for the explanatory target. [. . . ] The second part is all idealization; its overt claims are false but its role is to point to parts of the actual
is a continuum (rather than describing it veridically as a collection of discrete gas or water molecules). These false continuum assumptions are essential for obtaining the desired explanation. In the breaking drop case, it turns out that different fluids of different viscosities dripping from faucets of different widths will all exhibit the same shape upon breakup. The explanation depends on a singularity that exists only in the (false) continuum model; such an explanation does not exist on the de-idealized molecular dynamics approach [4.15, pp. 442–443]. Hence, he concludes [4.15, p. 427],
“continuum idealizations are explanatorily ineliminable and [. . . ] a full understanding of certain physical phenomena cannot be obtained through completely detailed, nonidealized representations.”
If such analyses are right, then they show that not all idealizations can be de-idealized, and, moreover, those falsehoods can play an essential role in the explanation.
Bokulich [4.10, 20–22] has similarly defended the view that it is not just the true parts of models that can do explanatory work, arguing that in some cases even fictions can be explanatory. She writes, “some fictions can give us genuine insight into the way the world is, and hence be genuinely explanatory and yield real understanding” [4.10, p. 94]. She argues that some fictions are able to do this by capturing in their fictional representation real patterns of structural dependencies in the world. As an example, she discusses semiclassical models whereby fictional electron orbits are used to explain peculiar features of quantum spectra. Although, according to quantum mechanics, electrons do not follow definite trajectories or orbits (i.e., such orbits are
Models and Explanation 4.2 Explanatory Fictions: Can Falsehoods Explain? 109
fictions), physicists recognized that puzzling peaks in the recurrence spectrum of atoms in strong magnetic fields have a one-to-one correspondence with particular closed classical orbits [4.30, pp. 2789–2790] (quoted in [4.10, p. 99]):
“The resonances [. . . ] form a series of strikingly simple and regular organization, not previously anticipated or predicted. [. . . ] The regular type resonances can be physically rationalized and explained by classical periodic orbits of the electron on closed trajectories starting at and returning to the proton as origin.”
As she explains, at no point are these physicists challenging the status of quantum mechanics as the true, fundamental ontological theory; rather, they are deploying the fiction with the express recognition that it is indeed a literally false representation (interestingly, this was one of Vaihinger’s criteria for a scientific fiction [4.31, p. 98]). Nonetheless, it is a representation that is able to yield true physical insight and understanding by carefully capturing in its fictional representation the appropriate patterns of counterfactual dependence of the target phenomenon.
Bokulich [4.10, 20–22] offers several such examples of explanatory fictional models from semiclassical mechanics, where the received explanation of quantum phenomena appeals to classical structures, such as the Lyapunov (stability) exponents of classical trajectories, that have no clear quantum counterpart. Moreover, she notes that these semiclassical models with their fictional assumption of classical trajectories are valued not primarily as calculation tools (often they require calculations that are just as complicated), but rather are valued as models that provide an unparalleled level of physical insight into the structure of the quantum phenomena. Bokulich is careful to note that not just any fiction can do this kind of explanatory work; indeed, most fictions cannot. She shows more specifically how these semiclassical examples meet the three criteria of her account of model-based explanation, discussed earlier (e.g., [4.10, p. 106]).
A more pedestrian example of an explanatory fiction, and one that brings out some of the objections to such claims, is the case of light rays postulated by the ray (or geometrical) theory of optics. Strictly speaking, light rays are a fiction. The currently accepted fundamental theory of wave optics denies that they exist. Yet, light rays seem to play a central role in the scientific explanation of lots of phenomena, such as shadows and rainbows. The physicists Kleppner and Delos, for example, note [4.32, p. 610]:
“When one sees the sharp shadows of buildings in a city, it seems difficult to insist that light-rays are merely calculational tools that provide approximations to the full solution of the wave equation.”
Similarly, Batterman argues [4.33, pp. 154–155]:
“One cannot explain various features of the rainbow (in particular, the universal patterns of intensities and fringe spacings) without ultimately having to appeal to the structural stability of ray theoretic structures called caustics – focal properties of families of rays.”
Batterman is quite explicit that he does not think that an explanatory appeal to these ray-theoretic structures requires reifying the rays; they are indeed fictions.
Some, such as Belot, want to dismiss ray-optics models as nothing but a mathematical device devoid of any physical content outside of the fundamental (wave) theory. He writes [4.34, p. 151]:
“The mathematics of the less fundamental theory is definable in terms of that of the more fundamental theory; so the requisite mathematical results can be proved by someone whose repertoire of interpreted physical theories included only the latter.”
The point is roughly this: it looks as though, in Batterman’s examples, one is making an explanatory appeal to fictional entities from a less fundamental theory that has been superseded (e.g., ray optics or classical mechanics). However, all one needs from that superseded theory is the mathematics – one does not need to give those bits of mathematics a physical interpretation in terms of the fictional entities or structures. Moreover, that mathematics appears to be definable in terms of the mathematics of the true fundamental theory. Hence, those fictional entities are not, in fact, playing an explanatory role.
Batterman has responded to these objections, arguing that in order to have an explanation, one does, in fact, need the fictional physical interpretation of that mathematics, and hence the explanatory resources of the nonfundamental theory. He explains [4.33, p. 159]:
“Without the physical interpretation to begin with, we would not know what boundary conditions to join to the differential equation. Neither, would we know how to join those boundary conditions to the equation. Put another way, we must examine the physical details of the boundaries (the shape, reflective and refractive details of the drops, etc.) in order to set up the boundary conditions required for the mathematical solution to the equation.”
In other words, without appealing to the fictional rays, we would not have the relevant information we need to appropriately set up and solve the mathematical model that is needed for the explanation.
In a paper with Jansson, Belot has raised similar objections against Bokulich’s arguments that classical structures can play a role in explaining quantum phenomena. They write [4.35, p. 82]:
“Bokulich and others see explanations that draw on semiclassical considerations as involving elements of classical physics as well as of quantum physics. [. . . ] But there is an alternative way of thinking of semiclassical mechanics: [. . . ] starting with the formalism of quantum mechanics one proves theorems about approximate solutions – theorems that happen to involve some of the mathematical apparatus of classical mechanics. But this need not tempt us to think that there is [classical] physics in our explanations.”
Once again, we see the objection that it is just the bare mathematics, not the mathematics with its physical interpretation, that is involved in the explanation. On Bokulich’s view, however, it is precisely by connecting that mathematical apparatus to its physical interpretation in terms of classical mechanics that one gains a deeper physical insight into the system one is studying. On her view, explanation is importantly about advancing understanding, and for this the physical interpretation is important. (Potochnik [4.5, Chap. 5] has also argued for a tight connection between explanation and understanding, responding to some of the traditional objections against this association. More broadly, she emphasizes the communicative function of explanation over the ontological approach to explanation, which makes more room for nonveridical model explanations than the traditional approach.) Even though classical mechanics is not the true fundamental theory, there are important respects in which it gets things right, and hence reasoning with fictional classical structures within the well-established confines of semiclassical mechanics can yield explanatory insight and deepen our understanding.
As we have seen, these claims that fictions can explain (in special cases such as ray optics and classical structures) remain controversial and involve subtle issues. These debates are not entirely new, however, and they have some interesting historical antecedents, for example, in the works of Niels Bohr and James Clerk Maxwell. More specifically, when Bohr is articulating his widely misunderstood correspondence principle (for an accessible discussion see [4.36]), he argues that one can explain why only certain quantum transitions between stationary states in atoms are allowed by appealing to which harmonic components appear in the Fourier decomposition of the electron’s classical orbit (see [4.20, Sect. 4.2] and references therein). He does this even long after he has conceded to the new quantum theory that classical electron trajectories in the atom are impossible (i.e., they are a fiction). Although Heisenberg used this formulation of the correspondence principle to construct his matrix mechanics, he argued that “it must be emphasized that this correspondence is a purely formal result” [4.37, p. 83], and should not be thought of as involving any physical content from the other theory. Bohr, by contrast, was dissatisfied with this interpretation of the correspondence principle as pure mathematics, arguing instead that it revealed a deep physical connection between classical and quantum mechanics. Even earlier, we can see some of these issues arising in the work of Maxwell, who, in exploiting the utility of fictional models and physical analogies between disparate fields, argued ([4.38, p. 187]; for a discussion, see [4.39]):
“My aim has been to present the mathematical ideas to the mind in an embodied form [. . . ] not as mere symbols, which convey neither the same ideas, nor readily adapt themselves to the phenomena to be explained.”
Three other challenges have been raised against the explanatory power of fictional models. First, there is a kind of slippery-slope worry that, once we admit some fictional models as explanatory, we will not have any grounds on which to dismiss other fictional models as nonexplanatory. Bokulich [4.22] introduces a framework for addressing this problem. Second, Schindler [4.40] has raised what he sees as a tension in Bokulich’s account. He claims that on the one hand she says semiclassical explanations of quantum phenomena are autonomous in the sense that they provide more insight than the quantum mechanical ones. Yet, on the other hand, she notes that semiclassical models are justified through semiclassical theory, which connects these representations as a kind of approximation to the full quantum mechanics. Hence, they cannot be autonomous. This objection seems to trade on an equivocation of the term autonomous: in the first case, autonomous is used to mean “a representation of the phenomenon that yields more physical insight” and in the second case autonomous is used to mean “cannot be mathematically justified through various approximation methods”. These seem to be two entirely different concepts, and, hence, not really in tension with each other. Moreover, Bokulich never uses the term autonomous to describe either, so this seems to be a misleading reading of her view.
Schindler also rehearses the objection, raised by Belot and Jansson [4.35], that by eliminating the interventionist condition in Woodward’s counterfactual approach to explanation she loses what he calls “the asymmetry-individuating function”, by which he means her account seems susceptible to the traditional problem of asymmetry that plagued the DN account of explanation (e.g., that falling barometers could be used to explain impending storms or shadows could be used to explain the height of flag poles, to recall Sylvain Bromberger’s well-known examples). This problem was taken to be solved by the causal approach to explanation, whereby one secures the explanatory asymmetry simply by appealing to the asymmetry of causation. It is important to note, however, that this is not an objection specifically to Bokulich’s account of structural model explanation, but rather is a challenge for any noncausal account of explanation (Bokulich outlines a solution to the problem of asymmetry for her account in [4.22]). Since many examples of explanatory models purport to be noncausal explanations, we will examine this topic more fully in the next section.
relating cognition to neural structures. Finally, there is also fictionalization, which, as he describes [4.41, p. 331],
“involves putting components into a model that are known not to correspond to any element of the modeled system, but which serve an essential role in getting the models to operate correctly.”
He gives as an example of a fiction in cognitive modeling what are called fast enabling links (FELs), which are independent of the channels by which cells actually communicate and are assumed to have functionally infinite propagation speeds, allowing two cells to fire in synchrony [4.41, p. 331]. Despite being false in these ways, some modelers take these fictions to be essential to the operation of the model and not likely to be eliminated in future versions.
Weiskopf concludes that models involving reifications, functional abstractions, and fictions, can nonetheless in some cases succeed in “meeting the general normative constraints on explanatory models perfectly well” [4.41, p. 332], and hence such models can be
Another context in which this issue about the ex- counted as genuinely explanatory. Although Weiskopf
planatory power of fictional models arises is in con- recognizes the many great successes of mechanistic ex-
nection with cognitive models in psychology and cog- planations in biological and neural systems, he wants to
nitive neuroscience. Weiskopf, for example, discusses resist an imperialism that attempts to reduce all cases of
how psychological capacities are often understood in model explanations in these fields to mechanistic model
terms of cognitive models that functionally abstract explanations.
from the underlying real system. More specifically, he More recently, Buckner [4.42] has criticized
notes [4.41, p. 328]: Weiskopf’s arguments that functionalist models involv-
ing fictions, abstractions, and reification can be ex-
“In attempting to understand the high level dynam-
planatory and defended the mechanist’s maxim (e.g., as
ics of complex systems like brains, modelers have
articulated by Craver and Kaplan) that only mechanis-
recourse to many techniques for constructing such
tic models can genuinely explain. Buckner employs two
indirect accounts [. . . ] reification, functional ab-
strategies in arguing against Weiskopf: first, in cases
straction, and fictionalization.”
where the models do explain, he argues that they are
By reification, he means “positing something with really just mechanism sketches, and where they cannot
the characteristics of a more or less stable and endur- be reconstructed mechanistically, he dismisses them as
ing object, where in fact no such thing exists” [4.41, p. impoverished explanations. He writes [4.42, p. 3]:
328]. He gives as an example the positing of symbolic
“Concerning fictionalization and reification, I con-
representations in classical computational systems, even
cede that models featuring such components cannot
though he notes that nothing in the brain seems to stand
be interpreted as mechanism sketches, but argue that
still or be manipulable in the way symbols do. Func-
interpreting their nonlocalizable components as nat-
tional abstraction, he argues occurs when we [4.41, p.
ural kinds comes with clear costs in terms of those
329]
models’ counterfactual power. [. . . ] Functional ab-
“decompose a modeled system into subsystems and straction, on the other hand, can be considered a le-
other components on the basis of what they do, gitimate source of kinds, but only on the condition
rather than their correspondence with organizations that the functionally abstract models be interpreted
and groupings in the target system.” as sketches that could be elaborated into a more
complete mechanistic model.”
He notes that this occurs when there are cross-
cutting functional groupings that do not map onto the An essential feature of mechanistic models seems
structural or anatomical divisions of the brain. He notes to be that their components are localizable. Weiskopf
that this strategy emphasizes networks, not locations in argues, however, that his functional kinds are multi-
112 Part A Theoretical Issues in Models
ply realizable, that is, they apply to many different chrony we might have a deeper explanation (at least
kinds of underlying mechanisms, and that in some on the assumption that this true account of synchrony
cases, they are distributed in the sense that they as- would allow us to answer a wider range of what-if-
cribe to a given model component capacities that are things-had-been-different questions) (For an account
distributed amongst distinct parts of the physical sys- of explanatory depth, see [4.43]). However, the ex-
tem. Hence, without localization, such models cannot planation involving the fiction might still be perfectly
be reconstructed as mechanistic models. adequate for the purpose for which it is being deployed,
What of Buckner’s claim that fictional models will and hence it need not even be counted as impoverished.
be impoverished with regard to their counterfactual For example, there might be some explananda (ones
power? Consider again Weiskopf’s example of the fic- other than the explanadum of how do cells achieve syn-
tional FELs, which are posited in the model to allow chrony) for which it simply does not matter how cells
the cells to achieve synchrony. Buckner argues expla- achieve synchrony; the fact that they do achieve syn-
nations involving models with FELs are impoverished chrony might be all that is required for some purposes.
in that if one had a true account of synchrony, that Weiskopf is not alone in trying to make room for
model explanation would support more counterfactual nonmechanistic model explanations; Irvine [4.44] and
knowledge. It is not clear, however, that this objection Ross [4.45] have also recently defended nonmechanis-
undermines the explanatory power of models involv- tic model explanations in cognitive science and biology.
ing FELs per se; rather it seems only to suggest that Their approaches argue for noncausal forms of model
if we knew more and had the true account of syn- explanation, which we will turn to next.
Part A | 4.3
to their large-scale behavior. This story demonstrates, rather than assumes, a kind of stability or robustness of the large-scale behavior we want to explain under drastic changes in the various details of the system.”
They make two further claims about these minimal model explanations. First, they argue that these explanations are “distinct from various causal, mechanical, difference making, and so on, strategies prominent in the philosophical literature” [4.47, p. 349]. Second, they argue that the explanatory power of minimal models cannot be accounted for by any kind of mirroring or mapping between the model and target system (what they call the common features account). Instead, these noncausal explanations work by showing that the minimal model and diverse real-world systems fall into the same universality class. This latter claim has been challenged by Lange [4.48] who, though sympathetic to their claim that minimal models are a noncausal form of model explanation, argues that their explanatory power does in fact derive from the model sharing features in common with the diverse systems it describes (i. e., the common features account Batterman and Rice reject).
Ross [4.45] has applied the minimal models account to dynamical model explanations in the neurosciences. More specifically, she considers as an explanandum phenomenon the fact that a diverse set of neural systems (e.g., rat hippocampal neurons, crustacean motor neurons, and human cortical neurons [4.45, p. 48]), which are quite different at the molecular level, nonetheless all exhibit the same type I excitability behavior. She shows that the explanation for this involves applying mathematical abstraction techniques to the various detailed models of each particular type of neural system and then showing that all these diverse systems converge on one and the same canonical model (known as the Ermentrout–Kopell model). After defending the explanatory power of these canonical models, Ross then contrasts this kind of noncausal model explanation with the causal–mechanical model approach [4.45, p. 46]:
“The canonical model approach contrasts with Kaplan and Craver’s claims because it is used to explain the shared behavior of neural systems without revealing their underlying causal–mechanical structure. As the neural systems that share this behavior consist of differing causal mechanisms [. . . ] a mechanistic model that represented the causal structure of any single neural system would no longer represent the entire class of systems.”
Her point is that a noncausal explanation is called for in this case because the particular causal details are irrelevant to the explanation of the universal behavior of class I neurons. The minimal models approach, as we saw above, is designed precisely to capture these sorts of explanations involving universality.
More generally, many highly abstract or highly mathematical model explanations also seem to fall into this general category of noncausal model explanations. Pincock, for example, identifies a type of explanation that he calls abstract explanation, which could be extended to model-based explanations. He writes “the best recent work on causal explanation is not able to naturally accommodate these abstract explanations” [4.49, p. 11]. Although some of the explanations Pincock cites, such as the topological (graph theory) explanation for why one cannot cross the seven bridges of Königsberg exactly once in a nonbacktracking circuit, seem to be genuinely noncausal explanations, it is not clear that all abstract explanations are necessarily noncausal. Reutlinger and Andersen [4.50] have recently raised this objection against Pincock’s account, arguing that an explanation’s being abstract is not a sufficient condition for its being noncausal. They argue that many causal explanations can be abstract too, and so more work needs to be done identifying what makes an explanation truly noncausal. This is a particularly pressing issue in model-based explanations, since many scientific models are abstract in this sense of leaving out microphysical or concrete causal details about the explanandum phenomenon.
Lange [4.51] has also identified a kind of noncausal explanation that he calls a distinctively mathematical explanation. Lange considers a number of candidate mathematical explanations, such as why one cannot divide 23 strawberries evenly among three children, why cicadas have life-cycle periods that are prime, and why honeybees build their combs on a hexagonal grid. Lange notes that whether these are to count as distinctively mathematical explanations depends on precisely how one construes the explanandum phenomenon. If we ask why honeybees divide the honeycomb into hexagons, rather than other polygons, and we cite that it is selectively advantageous for them to minimize the wax used, together with the mathematical fact that a hexagonal grid has the least total perimeter, then it is an ordinary causal explanation (it works by citing selection pressures). If, however [4.51, p. 500]:
“we narrow the explanandum to the fact that in any scheme to divide their combs into regions of equal area, honeybees would use at least the amount of wax they would use in dividing their combs into hexagons. [. . . ] this fact has a distinctively mathematical explanation.”
114 Part A Theoretical Issues in Models
As Lange explains more generally [4.51, p. 485]:
“These explanations are noncausal, but this does not mean that they fail to cite the explanandum’s causes, that they abstract away from detailed causal histories, or that they cite no natural laws. Rather, in these explanations, the facts doing the explaining are modally stronger than ordinary causal laws.”
The key issue is not whether the explanans cites the explanandum’s causes, but whether the explanation works by virtue of citing those causes. Distinctively mathematical (noncausal) explanations show the explanandum to be necessary to a stronger degree than would result from the causal powers alone.
As this literature makes clear, distinguishing causal from noncausal explanations is a subtle and open problem, but one crucial for understanding the widespread use of abstract mathematical models in many scientific explanations.
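To make the Königsberg example concrete, the graph-theoretic fact behind it can be checked in a few lines. The following is a minimal sketch, not drawn from any of the works cited: it assumes the standard reading of the puzzle as a multigraph with four land masses and seven bridges, and relies on Euler's result that a connected multigraph has a closed walk using every edge exactly once only if every vertex has even degree (and an open one only if exactly zero or two vertices have odd degree). The labels A to D and all function names are illustrative assumptions.

```python
from collections import Counter, defaultdict

# The seven bridges as edges of a multigraph. Labels are illustrative:
# A = central island, B and C = the two river banks, D = eastern land mass.
BRIDGES = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"), ("B", "D"), ("C", "D"),
]


def degrees(edges):
    """Count how many bridge ends touch each land mass."""
    count = Counter()
    for u, v in edges:
        count[u] += 1
        count[v] += 1
    return dict(count)


def is_connected(edges):
    """Check that every land mass is reachable from every other one."""
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    if not neighbours:
        return True
    start = next(iter(neighbours))
    seen, stack = {start}, [start]
    while stack:
        for nxt in neighbours[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == len(neighbours)


def has_euler_circuit(edges):
    """Closed walk over every edge exactly once: all degrees must be even."""
    return is_connected(edges) and all(d % 2 == 0 for d in degrees(edges).values())


def has_euler_trail(edges):
    """Open or closed walk over every edge exactly once: 0 or 2 odd degrees."""
    odd = sum(d % 2 for d in degrees(edges).values())
    return is_connected(edges) and odd in (0, 2)


print(degrees(BRIDGES))
print(has_euler_circuit(BRIDGES), has_euler_trail(BRIDGES))
```

Every land mass turns out to have odd degree, so neither kind of walk exists. Nothing in the check refers to the materials of the bridges, the walker, or any other causal detail, which is the feature of such explanations that the authors discussed above are trying to characterize.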
Bokulich [4.55] defends another construal of the how-possibly/how-actually distinction and applies it to model-based explanations more specifically. She considers, as an example, model-based explanations of a puzzling ecological phenomenon known as tiger bush. Tiger bush is a striking periodic banding of vegetation in semi-arid regions, such as southwest Niger. A surprising feature of tiger bush is that it can occur for a wide variety of plants and soils, and it is not induced by any local heterogeneities or variations in topography. By tracing how scientists use various idealized models (e.g., Turing models or differential flow models) to explain phenomena such as this, Bokulich argues that a new insight into the how-possibly/how-actually distinction can be gained.
The first lesson she draws is that there are different levels of abstraction at which the explanandum phenomenon can be framed, which correspond to different explanatory contexts [4.55, p. 33]. These different explanatory contexts can be clarified by considering the relevant contrast class of explanations (for a discussion of contrast classes and their importance in scientific explanation, see [4.56, Chap. 5]). Second, she argues pace Craver that the how-possibly/how-actually distinction does not track how detailed the explanation is. She explains [4.55, p. 334]:
“It is not the amount of detail that is relevant, but rather whether the mechanism represented in the model is the mechanism operating in nature. Indeed as we saw in the tiger bush case, the more abstractly the explanatory mechanism is specified, the easier it is to establish it as a how-actually explanation; whereas the more finely the explanatory mechanism is specified, the less confident scientists typically are that their particular detailed characterization of the mechanism is the actual one.”
Hence, somewhat counterintuitively, model explanations at a more fine-grained level are more likely to be how-possibly model explanations, even when they are nested within a higher-level how-actually model explanation of a more abstract characterization of the phenomenon. She concludes that when assessing model explanations, it is important to pay attention to what might be called the scale of resolution at which the explanandum phenomenon is being framed in a particular explanatory context.
ous aims, such as scientific explanation, remains a methodologically important, though underexplored topic.
More recently, Bokulich [4.60] has explored such tradeoffs in the context of modeling in geomorphology, which is the study of how landscapes and coastlines change over time. Even when it comes to a single phenomenon, such as braided rivers (i. e., rivers in which there are a number of interwoven channels and bars that dynamically shift over time), one finds that scientists use different kinds of models depending on whether their primary aim is explanation or prediction. When they are interested in explaining why rivers braid, geomorphologists tend to use what are known as reduced complexity models, which are typically very simple cellular automata models with a highly idealized representation of the fluvial dynamics [4.61]. The goal is to try to abstract away and isolate the key mechanisms responsible for the production of the braided pattern. This approach is contrasted with an alternative approach to modeling in geomorphology known as reductionist
Bokulich uses cases such as these to argue for what she calls a division of cognitive labor among models [4.60, p. 121]:
“If one’s goal is explanation, then reduced complexity models will be more likely to yield explanatory insight than simulation models; whereas if one’s goal is quantitative predictions for concrete systems, then simulation models are more likely to be successful. I shall refer to this as the division of cognitive labor among models.”
As Bokulich notes, however, one consequence of this division of cognitive labor is that a model that was designed to optimize explanatory insight might fail to make quantitatively accurate predictions (a different cognitive goal). She continues [4.60, p. 121]:
“This failure in predictive accuracy need not mean that the basic mechanism hypothesized in the explanatory model is incorrect. Nonetheless, explanatory models need to be tested to determine whether the explanatory mechanism represented in the model
4.6 Conclusion
There is a growing realization that the use of idealized models to explain phenomena is pervasive across the sciences. The appreciation of this fact has led philosophers of science to begin to introduce model-based accounts of explanation in order to bring the philosophical literature on scientific explanation into closer agreement with actual scientific practice.
A key question here has been whether the idealizations and falsehoods inherent in modeling are harmless in the sense of doing no real explanatory work, or whether they have an essential – maybe even ineliminable – role to play in some scientific explanations. Are such fictions compatible with the explanatory aims of science, and if so, under what circumstances? While some inroads have been made on this question, it remains an ongoing area of research. As we saw, yet another controversial issue concerns the fact that many highly abstract and mathematical models seem to exemplify a noncausal form of explanation, contrary to the current orthodoxy in scientific explanation. Determining what is or is not to count as a causal explanation turns out to be a subtle issue.
Finally, just because a model or computer simulation can reproduce a pattern or behavior that is strikingly like the phenomenon to be explained, does not mean that it thereby explains that phenomenon. An important distinction here is that between a how-possibly model explanation and a how-actually model explanation. Despite the wide agreement that such a distinction is important, there has been less agreement concerning how precisely these lines should be drawn.
Although significant progress has been made in recent years in understanding the role of models in scientific explanation, there remains much work to be done in further clarifying many of these issues. However, as the articles reviewed here reveal, exploring just how and when models can explain is a rich and fruitful area of philosophical investigation and one essential for understanding the nature of scientific practice.
References
4.1 C. Hempel: Aspects of Scientific Explanation and Other Essays in the Philosophy of Science (Free Press, New York 1965)
4.2 W. Salmon: Scientific Explanation and the Causal Structure of the World (Princeton Univ. Press, Princeton 1984)
4.3 R. Frigg, S. Hartmann: Models in science. In: The Stanford Encyclopedia of Philosophy, ed. by E.N. Zalta (Stanford Univ., Stanford 2012)
4.4 J. Maynard Smith: Evolution and the Theory of Games (Cambridge Univ. Press, Cambridge 1982)
4.5 A. Potochnik: Idealization and the Aims of Science (Univ. Chicago Press, forthcoming)
4.6 E. McMullin: Structural explanation, Am. Philos. Q. 15(2), 139–147 (1978)
4.7 E. McMullin: Galilean idealization, Stud. Hist. Philos. Sci. 16(3), 247–273 (1985)
4.8 E. McMullin: A case for scientific realism. In: Scientific Realism, ed. by J. Leplin (Univ. California Press, Berkeley 1984)
4.9 R. Batterman: Critical phenomena and breaking drops: Infinite idealizations in physics, Stud. Hist. Philos. Modern Phys. 36, 25–244 (2005)
4.10 A. Bokulich: Explanatory Fictions. In: Fictions in Science: Philosophical Essays on Modeling and Idealization, ed. by M. Suárez (Routledge, London 2009) pp. 91–109
4.11 N. Cartwright: How the Laws of Physics Lie (Clarendon Press, Oxford 1983)
4.12 P. Duhem: The Aim and Structure of Physical Theory (Princeton Univ. Press, Princeton 1914/1954)
4.13 M. Elgin, E. Sober: Cartwright on explanation and idealization, Erkenntnis 57, 441–450 (2002)
4.14 D. Cristol, P. Switzer: Avian prey-dropping behavior. II. American crows and walnuts, Behav. Ecol. 10, 220–226 (1999)
4.15 R. Batterman: Idealization and modeling, Synthese 169, 427–446 (2009)
4.16 A. Kennedy: A non representationalist view of model explanation, Stud. Hist. Philos. Sci. 43(2), 326–332 (2012)
4.17 C. Craver: When mechanistic models explain, Synthese 153, 355–376 (2006)
4.18 A. Bokulich: How scientific models can explain, Synthese 180, 33–45 (2011)
4.19 D.M. Kaplan: Explanation and description in computational neuroscience, Synthese 183, 339–373 (2011)
4.20 A. Bokulich: Reexamining the Quantum-Classical Relation: Beyond Reductionism and Pluralism (Cambridge Univ. Press, Cambridge 2008)
4.21 A. Bokulich: Can classical structures explain quantum phenomena?, Br. J. Philos. Sci. 59(2), 217–235 (2008)
4.22 A. Bokulich: Distinguishing explanatory from nonexplanatory fictions, Philos. Sci. 79(5), 725–737 (2012)
4.23 J. Woodward: Making Things Happen: A Theory of Causal Explanation (Oxford University Press, Oxford 2003)
4.24 M. Morrison: Models as autonomous agents. In: Models and Mediators: Perspectives on Natural and Social Science, ed. by M. Morgan, M. Morrison (Cambridge Univ. Press, Cambridge 1999) pp. 38–65
4.25 C. Rice: Moving beyond causes: Optimality models and scientific explanation, Noûs 49(3), 589–615 (2015)
4.26 R. Batterman: Devil in the Details: Asymptotic Reasoning in Explanation, Reduction, and Emergence (Oxford University Press, Oxford 2002)
4.27 J. Reiss: The explanation paradox, J. Econ. Methodol. 19(1), 43–62 (2012)
4.28 U. Mäki: On a paradox of truth, or how not to obscure the issue of whether explanatory models can be true, J. Econ. Methodol. 20(3), 268–279 (2013)
4.29 M. Strevens: Depth: An Account of Scientific Explanation (Harvard Univ. Press, Cambridge 2008)
4.30 J. Main, G. Weibusch, A. Holle, K.H. Welge: New quasi-Landau structure of highly excited atoms: The hydrogen atom, Phys. Rev. Lett. 57, 2789–2792 (1986)
4.31 H. Vaihinger: The Philosophy of ‘As If’: A System of the Theoretical, Practical, and Religious Fictions of Mankind, 2nd edn. (Lund Humphries, London [1911] 1952), translated by C.K. Ogden
4.32 D. Kleppner, J.B. Delos: Beyond quantum mechanics: Insights from the work of Martin Gutzwiller, Found. Phys. 31, 593–612 (2001)
4.33 R. Batterman: Response to Belot’s “Whose Devil? Which Details?”, Philos. Sci. 72, 154–163 (2005)
4.34 G. Belot: Whose Devil? Which Details?, Philos. Sci. 52, 128–153 (2005)
4.35 G. Belot, L. Jansson: Review of reexamining the quantum-classical relation, Stud. Hist. Philos. Modern Phys. 41, 81–83 (2010)
4.36 A. Bokulich: Bohr’s correspondence principle. In: The Stanford Encyclopedia of Philosophy, ed. by E.N. Zalta (Stanford Univ., Stanford 2014), http://plato.stanford.edu/archives/spr2014/entries/bohr-correspondence/
4.37 W. Heisenberg: In: The Physical Principles of the Quantum Theory, ed. by C. Eckart, F. Hoyt (Univ. Chicago Press, Chicago 1930)
4.38 J.C. Maxwell: On Faraday’s Lines of Force. In: The Scientific Papers of James Clerk Maxwell, ed. by W. Niven (Dover Press, New York [1855/56] 1890) pp. 155–229
4.39 A. Bokulich: Maxwell, Helmholtz, and the unreasonable effectiveness of the method of physical analogy, Stud. Hist. Philos. Sci. 50, 28–37 (2015)
4.40 S. Schindler: Explanatory fictions – For real?, Synthese 191, 1741–1755 (2014)
4.41 D. Weiskopf: Models and mechanism in psychological explanation, Synthese 183, 313–338 (2011)
4.42 C. Buckner: Functional kinds: A skeptical look, Synthese 192, 3915–3942 (2015)
4.43 C. Hitchcock, J. Woodward: Explanatory generalizations: Part II. Plumbing explanatory depth, Noûs 37(2), 181–199 (2003)
4.44 E. Irvine: Models, robustness, and non-causal explanation: A foray into cognitive science and biology, Synthese 192, 3943–3959 (2015), doi:10.1007/s11229-014-0524-0
4.45 L. Ross: Dynamical models and explanation in neuroscience, Philos. Sci. 82(1), 32–54 (2015)
4.46 J. Saatsi, M. Pexton: Reassessing Woodward’s account of explanation: Regularities, counterfactuals, and noncausal explanations, Philos. Sci. 80(5), 613–624 (2013)
4.47 R. Batterman, C. Rice: Minimal model explanations, Philos. Sci. 81(3), 349–376 (2014)
4.48 M. Lange: On ‘Minimal model explanations’: A reply to Batterman and Rice, Philos. Sci. 82(2), 292–305 (2015)
4.49 C. Pincock: Abstract explanations in science, Br. J. Philos. Sci. 66(4), 857–882 (2015), doi:10.1093/bjps/axu016
4.50 A. Reutlinger, H. Andersen: Are explanations non-causal by virtue of being abstract?, unpublished manuscript
4.51 M. Lange: What makes a scientific explanation distinctively mathematical?, Br. J. Philos. Sci. 64, 485–511 (2013)
4.52 W. Dray: Law and Explanation in History (Oxford Univ. Press, Oxford 1957)
4.53 R. Brandon: Adaptation and Environment (Princeton Univ. Press, Princeton 1990)
4.54 P. Forber: Confirmation and explaining how possible, Stud. Hist. Philos. Biol. Biomed. Sci. 41, 32–40 (2010)
4.55 A. Bokulich: How the tiger bush got its stripes: ‘How possibly’ vs. ‘How actually’ model explanations, The Monist 97(3), 321–338 (2014)
4.56 B. van Fraassen: The Scientific Image (Oxford University Press, Oxford 1980)
4.57 R. Giere: The nature and function of models, Behav. Brain Sci. 24(6), 1060 (2001)
4.58 R. Levins: The Strategy of model building in population biology, Am. Sci. 54(4), 421–431 (1966)
4.59 J. Matthewson, M. Weisberg: The structure of tradeoffs in model building, Synthese 170(1), 169–190 (2008)
4.60 A. Bokulich: Explanatory models versus predictive models: Reduced complexity modeling in geomorphology. In: EPSA11 Perspectives and Foundational Problems in Philosophy of Science, ed. by V. Karakostas, D. Dieks (Springer, Cham, Heidelberg, New York, Dordrecht, London 2013)
4.61 A.B. Murray: Contrasting the goals, strategies, and predictions associated with simplified numerical models and detailed simulations. In: Prediction in Geomorphology, ed. by P. Wilcock, R. Iverson (American Geophysical Union, Washington 2003) pp. 151–165
Nancy J. Nersessian, Miles MacLeod
5. Models and Simulations
Part A | 5.1
lation, and this is the focus of our review. However, we wish to note that the chief distinguishing characteristic between a model and a simulation (model) is that the latter is dynamic. They can be run either as constructed or under a range of experimental conditions. Thus, the broad class of simulation models should be understood as com-
turn to issues of whether and how simulation modeling introduces novel concerns for the philosophy of science in Sect. 5.3. Finally, we conclude in Sect. 5.4 by addressing the question of the relation between human cognition and computational simulation, including the relationship between the latter and thought experimenting.
time. Simulations open up a whole new set of philosophical issues concerning the practices and reliability of much modern science.
Winsberg’s analysis of theory-based simulation shares much with Cartwright’s [5.6] and Morgan and Morrison’s [5.7] challenges to the role of theories. Like them, he starts by strongly disputing the presupposition that simulations are somehow deductive derivations from theory. Simulations are applied principally in the physical sciences when the equations generated from a theory to represent a particular phenomenon are not analytically solvable. The path from a theory to a simulation requires processes of computerization, which transform equations into tractable computable structures by relying on practices of discretization and idealization [5.8]. These practices employ specific transformations and simplifications in combination with those used to make tractable the application of theoretical equations to a specific phenomenon, such as boundary conditions and symmetry assumptions. As such, simulations are, according to Winsberg [5.1], better construed as particular articulations of a theory rather than derivations from theory. They make use of theoretical information and the credibility, explanatory scope and depth of well-established theories to provide warrant to simulations of particular phenomena. Inferences drawn by computational simulations have several features in this regard; they are downward, motley and autonomous [5.9]. Inferences are downward because they move from theory to the real world (rather than from the real world to theory). They are motley because they depend not just on theory but on a large range of extra-theoretical techniques and resources in order to derive inferences, such as approximation and simplification techniques, numerical methods, algorithmic methods, computer languages and hardware, and much trial and error. Finally, simulations are autonomous, in the sense of being autonomous from both theory and data. Simulations, according to Winsberg, are principally used to study phenomena where data is sparse or unavailable. These three conditions on inference from simulation require a specific philosophical evaluation of their reliability.
Such evaluation is complicated by the fact that relations between theory and inferences drawn from the simulation model are unclear and difficult to untangle. As Winsberg [5.1, 9] suggests, it is a complex task to unpack what role theories play in the final result given all these intervening steps. The fact that much validation of simulations is done through matching simulation outputs to the data muddies the water further (see also [5.10]). A well-matched simulation constructed through a downward, motley and autonomous process from a nonetheless well-established theory raises the question of the extent to which the confirmation afforded to the theory flows down to the simulation [5.2]. For instance, although fitting a certain data set might well be the dominant mode of validation of a simulation model, the model could be considered to hold outside the range of that data because the model applies a well-accepted theory of the phenomenon thought to hold under very general conditions.
There is widespread agreement that untangling the relations between theories and simulations, and the reliability of simulations built from theories, will require more in-depth investigation of the actual practices scientists use to justify the steps they make when building a simulation model. In the absence of such investigations, discussions of justification are limited to considerations about whether a simulation fits the observational data or not. Among other things, this limitation hides from view important issues about the warrant of the various background steps that transform theoretical information into simulations [5.10]. In general, what is required is an epistemology of simulation which can discover rigorous grounds upon which scientists can and do sanction their results, and more properly the role of theory in modern science.
The concern with practices of simulation has opened up a new angle on the older discussion about the structure of theories. Humphreys [5.11] has used the entanglement of theory and simulation in modern scientific practice to reflect more explicitly upon the proper philosophical characterization of the structure of physical theories. Simulations, as with other models, are not logical derivations from theory, which is a central, but incorrect, feature of the syntactic view. Humphreys also argues, however, that the now dominant semantic view of theories, which treats theories as nonlinguistic entities, is not adequate either. On the semantic view a syntactical formulation of a theory, and whether different formulations might be solvable or not, is not important for philosophical assessment of relations of representations to the world. Relations of representation are only in fact sensibly held by models, not theories. Both Humphreys and Winsberg construe the semantic view as dismissing the role of theories in both normative and descriptive accounts of science, in place of models. But as Humphreys [5.12, p. 620] puts it, “the specific syntactic representation used is often crucial to the solvability of a theory’s equations”, and thus, the solvability of models derived from it. Computational tractability, as well as choices of approximation and simplification techniques, will depend on the particular syntax of a theory. Hence both the semantic and syntactic views are inadequate for describing theory in ways that capture its role in science.
Models and Simulations 5.2 Simulation not Driven by Theory 121
Part A | 5.2
domain of theory-driven science reveals whole new practices of scientific model production using computational simulations that are not in fact theory-based, in the sense of traditional physical sciences. Some of the most compelling and innovative fields in science today, including, for instance, big-data biology, systems biology and neuroscience, and much modeling in the social sciences, are not theory-driven. As Winsberg [5.5] admits (in response to Parker [5.13]), his description of simulation modeling is theory-centric, and neither necessarily applicable to understanding the processes by which simulation models are built in the absence of theory, nor an appropriate framework for assessing the reliability and informativeness of models built that way. This is not to say that characteristics of theory-based simulation are irrelevant to simulations that are not. Both theory-based and nontheory-based simulations share an independence of theory and there are likely to be similarities between them, but there are also profound differences.
One kind of simulation that is important in this regard is agent-based modeling. Keller [5.14] has labeled much agent-based modeling as modeling from above in the sense that such models are not constructed using a mathematical theory that governs the motions of agents. Agents follow local interaction rules. In many
ogy and economics, commonly simplify very complex interactions in order to create computationally tractable simulations. If a simplistic model captures a known behavior, can we trust its predictions? To address questions such as these we need an epistemology that can evaluate proposed techniques for establishing the robustness of agent-based models. One alternative is to argue that agent-based models require a novel epistemology that is able to rationalize their function as types of fictions rather than as representations [5.18, 19]. Another alternative, presented by Grüne-Yanoff and Weirich [5.16], is to argue that agent-based models provide in many cases functional rather than causal explanations of the phenomena they simulate [5.20]. Agent-based model simulations rarely control for all the potential explanatory factors that might be relevant to a given phenomenon, and any choice of particular interaction mechanism is usually thoroughly underdetermined. In practice, not all possible mechanisms can be explored. But agent-based models can show reliably how particular lower-level capacities behave in certain ways, when modeled by suitably general interaction rules, and can constitute higher-level capacities no matter how multiply realized those interactions might be. Hence, such models, even though greatly simplified, can extract useful
122 Part A Theoretical Issues in Models
information despite a large space of potential explananda.
Nontheory-driven forms of simulation such as agent-based models provide a basis for reflecting more broadly on the role theory plays in the production of simulations, and the warrant a theory brings to simulations based on it. Comparative studies of the kinds of arguments used to justify relying on a simulation should expose the roles well-established theories play. Our investigations of integrative systems biology (ISB) have revealed that not all equation-based modeling is theory-driven, if theory is construed in terms of theory in the physical sciences. The canonical meaning based on the physical sciences is something like a background body of laws and principles of a domain.
In the case of systems biology, researchers generally do not have access to such theory and in fact the kinds of theory they do make use of have a function different from what is usually meant by theory in fields like physics [5.21]. There are certain canonical theories in systems biology of how to mathematically represent interactions among, for instance, metabolites, in the form of sets of ordinary differential equations. These posit particular canonical mathematical forms for representing a large variety of interactions (see Biochemical Systems Theory [5.22]). In principle, for any particular metabolic network, if all the interactions and reactants are known, the only work for the modeler is to write down the equations for a particular network and calculate the parameters. The mathematics will take care of the rest since the mathematical formulations of interactions are general enough that any potential nonlinear behaviors should be represented if parameters are correctly fixed.
For the most part, however, these canonical frameworks do not provide the basic ontological information from which a representation of a system is ultimately drawn, in the way, say, that the Navier-Stokes equations of fluid dynamics describe fluids and their component interactions in a particular way. In practice, modelers in systems biology need to assemble that information themselves in the form of pathway diagrams which more or less list the molecules involved and then make their own decisions about how to represent molecular interactions. A canonical framework is better interpreted as a theory of how to approximate and simplify the information that the systems biologist has assembled about a pathway in order to reliably simulate the dominant dynamics of a network given sparse data and complex nonlinear dynamics. Hence, there is no real theory articulation in Winsberg’s terms. Researchers do not articulate a general theory for a particular application. The challenge for systems biologists is to build a higher level or system level representation out of the lower level information they possess. We have found that canonical templates mediate this process by providing a possible structure for gluing together this lower level information in a tractable way [5.21]. These theories do not offer any direct explanatory value by virtue of their use.
Theory can in fact be used not just to describe a body of laws and theoretical principles, but also to describe principles that instruct scientists on how to reliably build models of given classes of phenomena from a background theory. As Peck puts it [5.18, p. 393]:
“In traditional mathematical modeling, there is a long established research program in which standard methods, such as those used for differential equation modeling, are used to bring about certain ends. Once the variables and parameters and their relationships are chosen for the representation of the model, standard formulations are used to complete the modeling venture.”
If one talks about what physical scientists often start with, it is not just the raw theory itself but well-established rules for formulating the theory and applying it with respect to a particular phenomenon. We might refer to this latter sense of theory as a theory of how to apply a background theory to reliably represent a phenomenon. The two senses of theory are exclusive. In the case of the canonical frameworks, what is meant by theory is something closer to this latter rather than former sense.
Additionally, the modelers we have studied are never in a position to rely on these frameworks uncritically and in fact no theory exists that specifies which representations to use that will reliably lead to a good representation in all data situations. In integrative systems biology the variety of data situations is very complex, and the data are often sparse and are rarely adequate for applying a set mathematical framework. This forces researchers in practice into much more intensive and adaptive model-building processes that certainly share much in common with the back and forth processes Winsberg talks about in the context of theory application. But these processes have the added and serious difficulty that the starting points for even composing the mathematical framework out of which a model should be built are open-ended and need to be decided based on thorough investigation of the possibilities with the specific data available.
Canonical frameworks are just an option for modelers and do not drive the model-building process in the way physical theories do. Currently, systems biology generally lacks effective theory of either kind. Modelers have many different choices about how to confront a particular problem that do not necessarily
involve picking up a canonical framework or sticking to it. MacLeod and Nersessian [5.21] have documented how the nontheory-derived model-building processes work in these contexts. Models are strategic adaptations to a complex set of constraints systems biologists are working under [5.23]. Among these constraints are:
Constraints of the biological problem: A model must address the constraints of the biological problem, such as how the redox environment is maintained in a healthy cell. The system involved is often of considerable complexity.
Informational/data constraints: There are constraints on the accessibility and availability of experimental data and molecular and system parameters for constructing models.
Cost constraints: ISB is data-intensive and relies on data that often go beyond what are collected by molecular biologists in small scale experiments. However, data are very costly to obtain.
Collaboration constraints: Constraints on the ability to communicate effectively with experimental collaborators with different backgrounds or in different fields in order to obtain expert advice or new data. Molecular biologists largely do not understand the nature of simulation modeling, do not understand the data needs of modeling, and do not see the cost-benefit of producing the particular data systems biologists ask from them.
Time-scale constraints: Different time scales operate with respect to generating molecular experimental data versus computational model testing and construction.
Infrastructure constraints: There is little in the way of standardized databases of experimental information or standardized modeling software available for systems biologists to rely upon.
Knowledge constraints: Modelers’ lack of knowledge of biological systems and experimental methods limits their understanding of what is biologically plausible and what reliable extrapolations can be made from the data sets available.
Cognitive constraints: Constraints on the ability to process and manipulate models because of their complexity, and thus constraints on the ability to comprehend biological systems through modeling.
Working with these constraints requires modelers to be adaptive problem-solvers. Given the complexity of the systems, lack of data, and the ever-present problem of computational tractability, researchers have to experiment with different mathematical formulations, different parameter-fixing algorithms and approximation techniques in highly intensive trial and error processes. They build models in nest-like fashion in which bits of biological information and data and mathematical and computational techniques get combined to create stable models. These processes transform not only the shape of the solutions, but also the problems, as researchers figure out what actual problem can be solved with the data at hand. Simulation plays a central exploratory role in the process. This point goes further than Lenhard’s idea of an explorative cooperation between experimental simulation and models [5.8]. Simulation in systems biology is not just for experimenting on systems in order to sound out the consequences of a model [5.8, p. 181], but plays a fundamental role in incrementally building the model and learning the relevant known and sometimes unknown features of a system and gaining an understanding of its dynamics. Simulation’s roles as a cognitive resource make the construction of representations of complex systems without a theoretical basis possible (see also [5.24, 25]).
Similar conclusions have been drawn by Peck for ecology, which shares with systems biology the complexity in its problems and a lack of generalizable theory. As Peck [5.18, p. 393] points out:
“there are no formal methodological procedures for building these types of models suggesting that constructing an ecological simulation can legitimately be described as an art.”
This situation promotes methodological pluralism and creative methodological exploration by modelers. Modelers in these contexts thus focus our attention on the deeper roles (sometimes called heuristic roles [5.5]) that simulation plays in the ability of researchers to explore potential solutions in order to solve complex problems.
These roles have added epistemological importance when it is realized that the downward character of simulation can in fact be reversed in both senses we have mentioned above. This is a potentially significant difference between cases of theory and nontheory-driven simulation. Consider again systems biology. Firstly, the methodological exploration we witness amongst the researchers we have studied can be rationalized as precisely an attempt by the field to establish a good theory of how to build models of biological systems that work well given a variety of data situations. Since the complexities of these systems and computational constraints make this difficult to know at the outset, the field needs its freedom to explore the possibilities. Lab directors do encourage exploration, and part of the reason they do is to try to glean which practices work well and which do not given a lack of knowledge of what will work well for a given problem.
Secondly, systems biology aspires to a theory of biological systems which will detail not only general system-level characteristics of biological systems but also the design principles underlying biological networks [5.26, 27]. What is interesting about this theory, if it does emerge, is that it will in fact be theory generated by simulation rather than the other way around. Simulation makes possible the exploration of quite complex systems for generalities that can form the basis of a theory of systems biology. As such the use of simulations can also be upwards, not just downwards, to perhaps an unprecedented extent. Upward uses of simulation require analysis that appears to fit better with more traditional philosophical analysis of how theories are in fact justified, only in this case robust simulation models will possibly be the more significant source of evidence rather than traditional experiment and observation. How this affects the nature and reliability of our inferences to theory, and what kind of resemblance such theory might have to theory in physics, is something that will need investigation. Thus, further exploration of nontheory-driven modeling practices stands to provide a rich ground for investigation of novel practices that are emerging with simulation, but also for exploring the roles and meanings of theory.
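As a concrete illustration of what a canonical mathematical form of the kind discussed in this section can look like, the following minimal Python sketch assumes the S-system power-law form commonly associated with Biochemical Systems Theory, in which each variable has one aggregated production term and one aggregated degradation term. The two-metabolite pathway, all parameter values, and the function names (`s_system_rhs`, `simulate`) are hypothetical and chosen only for illustration; they are not taken from the studies cited above.

```python
from math import prod

def s_system_rhs(x, alpha, g, beta, h):
    """S-system right-hand side:
    dx_i/dt = alpha_i * prod_j x_j**g[i][j] - beta_i * prod_j x_j**h[i][j]."""
    n = len(x)
    return [
        alpha[i] * prod(x[j] ** g[i][j] for j in range(n))
        - beta[i] * prod(x[j] ** h[i][j] for j in range(n))
        for i in range(n)
    ]

def simulate(x0, alpha, g, beta, h, dt=0.01, steps=5000):
    """Forward-Euler integration (crude, but sufficient for a sketch)."""
    x = list(x0)
    for _ in range(steps):
        dx = s_system_rhs(x, alpha, g, beta, h)
        # Keep concentrations positive so fractional powers stay defined.
        x = [max(xi + dt * dxi, 1e-12) for xi, dxi in zip(x, dx)]
    return x

# Hypothetical pathway: input -> X1 -> X2 -> output, with X2 inhibiting
# the production of X1 (negative kinetic order). All values are assumed.
alpha = [2.0, 1.0]                 # production rate constants
beta = [1.0, 1.0]                  # degradation rate constants
g = [[0.0, -0.5], [0.5, 0.0]]      # production kinetic orders
h = [[0.5, 0.0], [0.0, 1.0]]       # degradation kinetic orders

steady = simulate([1.0, 1.0], alpha, g, beta, h)
print(steady)
```

The point of the sketch is that the form of the equations is fixed in advance; the modeler's work lies in choosing the network structure (which kinetic orders are nonzero) and fixing the parameters, which is where sparse data make the framework hard to apply uncritically.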
concern of philosophy of science with the justification of theory, and 2) the relative autonomy of simulations and simulation-building from the theory. The steps involved in generating simulations, such as applying approximation methods designed to generate computational tractability, are novel to science. These steps do not gain their legitimacy from a theory but are “autonomously sanctioned” [5.1, p. 837]. Winsberg argues, for instance, that while idealization and approximation methods have been discussed in the literature, it has mostly been from a representational perspective in terms of how idealized and approximate models represent or resemble the world and in turn justify the theories on which they are based. But since simulations are often employed where data are sparse, they cannot usually be justified by being compared with the world alone. Simulations must be assessed according to the reliability of the processes used to construct them, and these often distinct and novel techniques require separate philosophical evaluation. Mainstream philosophy of science with its focus on theoretical justification does not have the conceptual resources for accounting for applications using computational methods. Even where theory is concerned, both Humphreys and Winsberg maintain that neither of the established semantic and syntactic conceptions of theories, conceptions which fo-
the disagreement is over how one construes new issues or new questions for philosophy, since certainly at some level the basic philosophical questions about how representations represent and what makes them reliably do so, are still the same questions.
To some extent, part of the debate might be construed as a disagreement over the relevance of contexts of discovery to philosophy of science. Classically contexts of discovery, the scientific contexts in which model-building takes place, are considered irrelevant to normative philosophical assessments of whether those models are justified or not. Winsberg [5.3] and Humphreys [5.12] seem willing to assert that one of the lessons for philosophy of science from simulation is that practical constraints on scientific discovery matter for constructing relevant normative principles – both in terms of evaluating current practice, which in the case of simulation-building is driven by all kinds of practical constraints, and in terms of normatively directing practice sensitively within those constraints.
Part of the motivation for using the discovery/justification distinction to define philosophical interest and relevance is the belief that there is a clear distinction between the two contexts. Arguably Frigg and Reiss are reinforcing the idea of a clear distinction by relying on the widespread presupposition that
Models and Simulations 5.3 What is Philosophically Novel About Simulation? 125
validation and verification are distinct independent processes [5.4]. Validation is the process of establishing that a simulation is a good representation, a quintessential concept of justification. Verification is the process of ensuring that a computational simulation adequately captures the equations from which it is constructed. Verification, according to Frigg and Reiss, represents the only novel aspect of modeling that simulation introduces. Yet it is a purely mathematical exercise that is of no relevance to questions of validation. As such, simulations involve no new issues of justification beyond those of ordinary models. Winsberg [5.3, 4], however, counters that there is, in practice, no clear division between processes of verification and validation. The equations chosen to represent a system are not simply selected on the basis of how valid they are, but also on the basis of decisions about computational tractability. Much of what validates a representation in practice occurs at the end stage, after all the necessary techniques of numerical approximation and discretization have been applied, by comparing the results of simulations with the data. As such, [5.5]:
“If we want to understand why simulation results are taken to be credible, we have to look at the epistemology of simulation as an integrated whole, not as clearly divided into verification and validation – each of which would look inadequate to the task.”
Hence what would otherwise seem to be distinct discovery and justification processes are, in the context of computational simulation, interwoven.
Frigg and Reiss are right at some level that simulations do not change basic epistemological questions connected to the justification of models. They are also right that Winsberg, in his downward, motley and autonomous description of simulation, does not reveal any fundamentally new observations on model-building that have not already been identified as issues by philosophers discussing traditional modeling. However, what appears to be really new in the case of simulation is: 1) the complexity of the philosophical problems of representation and reliability, and 2) the different methodological and epistemological strategies that have become available to modelers as a result of simulation. Winsberg, in reply to Frigg and Reiss, has clarified what he thinks is novel about theory-based simulation as the simultaneous confluence of downward, motley and autonomous features of model-building [5.4]. It is the reliability and validity of the complex modeling processes instantiated by these three features that must be accounted for by an epistemology of simulation, and no current philosophical approaches are adequate to do so, particularly not those within traditional philosophical boundaries of analysis.
As a first step in helping with this task of assessing reliability and validity of simulation, philosophers such as Winsberg [5.29] have drawn lessons from comparison with experimentation, which they argue shares much with simulation in both function (enabling, for instance, in silico experiments) and also in terms of how the reliability of simulations is generated. Scientific researchers try to control for error in their simulations, and fix parameters, in ways that seem analogous to how experimenters calibrate their devices. Simulations build up credibility over long time scales and may have lives of their own independent of developments in other parts of science. These observations suggest a potentially rich analogy between simulations and Hacking’s account of experimentation [5.29]. In a normative step, based on these links, Parker [5.10] has suggested that in fact Mayo’s [5.30] rigorous error-statistical approach for experimentation should be an appropriate starting point for more thorough evaluation of the results of simulations. Simulations need to be evaluated by the degree to which they avoid false positives when it comes to testing hypotheses by successfully controlling for potential sources of error that creep in during the simulation process. At the same time a rather vigorous debate has emerged concerning the clarification of the precise epistemological dissimilarities or disanalogies between simulation and traditional experimentation (see for instance [5.31–36]). This question is in itself of independent philosophical interest for assessing the benefits and value of each as alternatives, but should also help define the limits of the relevance of experimentation as a model for understanding and assessing simulation practices.
From our perspective, however, the new methodological and epistemological strategies that modelers are introducing in order to construct and guarantee the reliability of simulation models could prove to be the most interesting and novel aspect of simulation with which philosophers will have to grapple. Indeed, while much attention has focused on the contrasts and similarities between simulations, experiments and simulation experiments, no one has called attention to the fact that real-world experiments and simulations are also being used in concert to enhance the ability of researchers to handle uncertain complex systems. One of the labs we have studied conducts bimodal modeling, where the modelers conduct their own experiments in the service of building their models. We have analyzed the case of one modeler’s behavior in which model-building, simulation and experimentation were tightly interwoven [5.37]. She used a conjunction of experiment and simulation to triangulate on errors and uncertainties in her model, thus demonstrating that the two can be combined in practice in sophisticated ways.
Her model-building would not have been possible without the affordances of both simulation and her ability to perform experimentation precisely adapted to test questions about the model as she was in the process of formulating it. Simulation and experiment closely coupled in this fashion offer the possibility of extending the capacity to produce reliable models of complex phenomena.
Bimodal modeling is relatively easy to characterize epistemologically since experimentation is used to validate and check the simulations as the model is being constructed. Simulations are not relied on independent of experimental verification. Often, however, experimental or any kind of observational data are hard to come by for practical or theoretical reasons. More philosophically challenging will be to evaluate the new epistemological strategies researchers are in fact developing for drawing inferences in these often deeply uncertain and complex contexts with the aid of computation. Parker [5.38, 39], for instance, identifies the practice in climate science and meteorology of ensemble modeling. No theory of model-building exists that tells climate and weather modelers how to go from physical theory to reliable models. Different formulations using different initial conditions, model structures and different parameterizations of those models that fit the observational data can be developed from the physical theory. In this situation modelers average over results from large collections of models, using different weighting schemas, and argue for the validity of these results on the basis that these models collectively represent the possibility space. However, considerable philosophical questions emerge as to the underlying justifiability of these ensemble practices and the probability weightings being relied upon. Background theory can provide little guidance in this context and in the case of climate modeling there is little chance for predictively testing performance. Further, the robustness of particular ensemble choices is often very low and justifications for picking out particular ensembles are rarely carefully formulated.
The ability to generate and compare large numbers of complex models in this way is a development of modern computational power. In our studies we have also come across novel argumentation, particularly connected with parameter-fixing [5.40]. Because the parameter spaces these modelers have to deal with are so complex, there is almost no chance of getting a best fit solution. Instead modelers produce multiple models, often using Monte Carlo techniques, that converge on similar behavior and output. These models have different parameterizations and ultimately represent the underlying mechanisms of the systems differently. However, modelers can nonetheless make specific arguments about network structure and dynamic relationships among specific variables. There is not usually any well-established theory that licenses these arguments. The fact that the models converge on the same relevant results is motivation for inferring that these models are right at least about those aspects of the system for which they are designed to account. Unfortunately, because access to real-world experimentation is quite difficult, it is hard to judge how reliable this technique is in producing robust models. What is novel about this kind of strategy is that it implicitly treats parameter-fixing as an opportunity, not just a problem, for modelers. If instead of trying to capture the dynamics of whole systems modelers just fix their goals on capturing robust properties and relations of a system, the potential of finding results that work within these constraints in large parameter-spaces increases, and from the multiple models obtained modelers can pare down to those that converge. The more complex problem thus seems to allow a pathway for solving a simpler one. Nonetheless, whether we should accept these kinds of strategies as reliable and the models produced as robust remains the fundamental question, and an overarching question for the field itself. It is a reasonable reaction to suspect that something important is being given up in the process, which will affect how well scientists can assess the reliability and importance of the models they produce. Whether the power of computational processes can adequately compensate for the potential distortions or errors introduced is one of the most critical and novel epistemological questions for philosophy today.
The kinds of epistemological innovations we have been considering raise deeper questions about the purposes of simulation, particularly in terms of traditional epistemic categories like understanding, explanation and so on. Of course at one extreme some simulation of the purely data-driven kind is purely phenomenological. Theory plays no role in its generation, and is not sought as its outcome. However in other cases some form of understanding at least is sought. In many cases though, where theory might be thought the essential agent of understanding, the complexity of the equations and resulting complexity of the computational processes that instantiate them simply block any way of decomposing the theory or theoretical model in order to understand how the theory might explain a phenomenon and thus assess the accuracy and plausibility of the underlying mechanisms it might prescribe. Humphreys labels this epistemic opacity [5.11]. Lenhard [5.41] in turn identifies a form of pragmatic understanding that can replace theoretical understanding when a simulation model is epistemically opaque. This form of understanding is pragmatic in the sense of being an understanding of how
Models and Simulations 5.4 Computational Simulation and Human Cognition 127
to control and manipulate phenomena, rather than explain them using background theoretical principles and laws. Settling for this form of understanding is a choice made by researchers in order to handle more complex problems and systems using simulations. But it is a novel one in the context of physics and chemistry. In systems biology we recognize something similar [5.40]. Researchers give up accurate mechanistic understanding of their systems for more pragmatic goals of gaining network control, at least over specific variables. To do so they use simplification and parameter-fitting techniques that obscure the extent to which their models capture the underlying mechanisms. Mechanistic explanation is thus given up, for some weaker form of understanding.
Finally, computational modeling and simulation in the situations we have been considering in this section are driving a profound shift in the nature and level of human cognitive engagement in scientific production processes and their outputs [5.12, 24, 25, 42, 43]. So much of philosophy of science has been based on intuitive notions of human cognitive abilities. Our concepts of explanation and understanding are constructed implicitly on the basis of what we can grasp as humans. With simulation and big-data science those kinds of characterizations may no longer be accurate or relevant [5.44].
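
The parameter-fixing practice discussed in this section, in which many candidate parameterizations are sampled and only those that reproduce the data are retained, can be illustrated with a minimal sketch. The toy steady-state model, the rate names k_syn and k_deg, the synthetic data and the tolerance are all assumptions made for illustration. They are not taken from the studies cited, and real systems-biology models are far larger.

```python
import random


def steady_state(k_syn, k_deg, u):
    """Toy model (illustrative assumption): steady-state level of a species
    produced at rate k_syn * u and degraded at rate k_deg * level."""
    return k_syn * u / k_deg


def fit_error(k_syn, k_deg, data):
    """Worst-case absolute deviation between model output and the data."""
    return max(abs(steady_state(k_syn, k_deg, u) - y) for u, y in data)


def monte_carlo_ensemble(data, n_samples=20000, tol=0.2, seed=0):
    """Sample parameter pairs at random and keep every pair that fits.

    Returns a list of (k_syn, k_deg) tuples: an ensemble of models rather
    than a single best-fit solution.
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_samples):
        # Log-uniform sampling over two orders of magnitude (0.1 to 10).
        k_syn = 10 ** rng.uniform(-1, 1)
        k_deg = 10 ** rng.uniform(-1, 1)
        if fit_error(k_syn, k_deg, data) <= tol:
            accepted.append((k_syn, k_deg))
    return accepted


# Synthetic "experimental" data (input level, observed steady state),
# generated by hand so that output = 2 * input.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
ensemble = monte_carlo_ensemble(data)

k_syn_values = [k_syn for k_syn, _ in ensemble]
predictions = [steady_state(k_syn, k_deg, 2.0) for k_syn, k_deg in ensemble]

print("models accepted:", len(ensemble))
print("k_syn range:", min(k_syn_values), "to", max(k_syn_values))
print("prediction range at u = 2:", min(predictions), "to", max(predictions))
```

In this toy case the data constrain only the ratio k_syn/k_deg. The accepted models therefore agree closely in their output while disagreeing widely about the individual rates. This is one simple way in which an ensemble can converge on similar behavior without settling which parameterization captures the underlying mechanism.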
5.4 Computational Simulation and Human Cognition
“[…] representational and computational issues. . . . We are now faced with a problem, which we can call the anthropocentric predicament, of how we, as humans, can understand and evaluate computationally-based scientific methods that transcend our own abilities.”
Unlike machine-learning contexts, computational modeling is in many cases a practice of using computation to extend traditional modeling practices and our own capabilities to draw insight out of low-data contexts and complex systems for which theory provides at best a limited guide. In this way cognitive capacities are often heavily involved. The hybrid nature of computational science thus motivates the need for understanding how human agents cognitively engage with and control opaque computational processes, and in turn draw information out of them. Evaluating these processes – their productiveness and reliability – requires in the first step having some understanding of them. As we will see, although computational calculation processes are beyond our abilities, at least in the case of systems biology the use of computation by modelers is often far more integrated with their own cognitive processes and understanding, and thus far more under their control, than we might think.
As we have seen there are several lines of philosophical research on computational simulation that un- […] running of their models under various conditions getting a feel for the model, which enables them to get a feel for the dynamics of the system.
In our investigations we have witnessed that modelers (mainly engineers) with little understanding of biology have been able to provide novel insights and highly significant predictions, later confirmed by biological collaborators, for the systems they are investigating through simulation. How is it possible that engineers with little to no biological training can be making significant biological discoveries? A related question concerns how complete novices are making scientific discoveries through simulations crowdsourced by means of video games such as Foldit and EteRNA, which appear to enable nonscientists to quickly build accurate/veridical structures representing molecular entities they had no prior knowledge of [5.45, 46]. Nersessian and Chandrasekharan, individually and together [5.24, 25, 42, 47–49], have argued that the answer to this question lies in understanding how computational simulation enhances human cognition in discovery processes. Because of the visual and manipulative nature of the crowdsourcing cases, the answer points in the direction of the coupling of the human sensorimotor systems with simulation models. These crowdsourcing models re-represent conceptual knowledge developed by the scientific community (e.g., structure of proteins) as computational representations with a control interface that can be manipulated through the gamer’s actions. The interface enables these novices to build new representations drawing on tacit/implicit sensorimotor processes. Although the use of crowdsourcing simulations in scientific problem solving is new, the human sensorimotor system has been used explicitly to detect patterns, especially in dynamic data generated by computational models, since the dawn of computational modeling. Entire disciplines and methods have been built using visualized patterns on computer screens. Complexity theory [5.50, 51], artificial life [5.52, 53] and computational chemistry [5.54, 55] provide a few exemplars where significant discoveries have been made.
Turning back now to the computational simulations used by scientists that we have been discussing, all of the above suggests that the model-building processes facilitate a close coupling between the model and the researcher’s mental modeling processes even in the absence of a dynamic visualization. The building process manipulates procedural and declarative knowledge in the imagination and in the representation, creating a coupled cognitive system of model and modeler [5.25, 42, 43, 48, 56, 57]. This coupling can lead to explicit understanding of the dynamics of the system under investigation. The notion of a coupled cognitive system is best understood in terms of the framework of distributed cognition [5.58, 59], which was developed to study cognitive processes in complex task environments, particularly where external representations and other cognitive artifacts and, possibly, groups of people, accomplish the task. The primary unit of analysis is the socio-technical system that generates, manipulates and propagates representations (internal and external to people). Research leading to the formation of the distributed cognition framework has focused largely on the use of existing representational artifacts and less so on the building/creation of the artifacts. The central metaphor is that of the human offloading complex cognitive processes such as memory to the artifact, which, for example, in the canonical exemplar of the speed bug that marks critical airspeeds for a particular flight, replaces complex cognitive operations with a perceptual operation and provides a publicly available representation that is shared between pilot and co-pilot.
In the research cited above, we have been arguing that offloading is not the right metaphor for understanding the cognitive enhancements provided through the building of novel computational representations. Rather, the metaphor should be that of coupling between internal and external representations. Delving into the modifications needed of the distributed cognition framework to accommodate the notion of a coupled cognitive system would take us too far afield in this review (but see [5.25]). Instead, we will flesh out the notion a bit by noting some of the ways in which building and using simulation models enhance human cognitive capabilities and, in particular, extend the capability of the imagination system for simulative model-based reasoning.
A central, but not yet well-researched, premise of distributed cognition is, as Hutchins has stated succinctly, that “humans create cognitive powers by creating the environments in which they exercise those powers” [5.58, p. 169]. Since building modeling-environments for problem solving is a major component of scientific research [5.49], scientific practices provide an especially good locus for examining the human capability to extend and create cognitive powers. In the case of simulation model-building, the key question is: What are the cognitive changes involved in building a simulation model and how do these lead to discoveries? The key cognitive change is that over the course of many iterations of model-construction and simulation, the model gradually becomes coupled with the modeler’s imagination system (mental model simulation), which enables the modeler to explore different scenarios. The coupling allows what-if questions in the mind of the modeler to be turned into detailed explorations of the system, which would not be possible in the mind alone. The computational model enables this exploration because as it is incrementally built using many data sets, the model’s behavior, in the systems biology case, for instance, comes to parallel the dynamics of the pathway. Each replication of experimental results adds complexity to the model and the process continues until the model is judged to fit all available data well. This judgment is complex, as it is based on a large number of iterations where a range of factors such as sensitivity, stability, consistency, computational complexity and so forth are explored. As the model gains complexity it starts to reveal or expose many details of the system’s behavior enabling the modeler to interrogate the model in ways that are not possible in the mind alone (thought experimenting) or in real-world experiments. It makes evident many details of the system’s behavior that the modeler could not have imagined alone because of the fine grain and complexity of the details.
The parallel between computational simulation experimenting and thought experimenting is one philosophers have commented on, but the current framing of the discussion primarily centers on the issue of interpreting simulations and whether computational simulations should be construed as opaque thought experiments [5.60, 61]. Di Paolo et al. [5.60] have argued that computational models are more opaque than thought experiments, and as such, require more systematic enquiry through probing of the model’s behavior. In a similar vein, Lenhard [5.61] has claimed that thought experiments are more lucid than computational models, though it is left unclear what is meant by lucid in this context, particularly given the extensive discussions around what specific thought experiments actually demonstrate. In the context of the discussion of the relation of thought experimenting and computational simulation, we have argued that the discussion should be shifted from issues of interpretation to a process-oriented analysis of modeling [5.47]. Nersessian [5.62] casts thought experimenting as a form of simulative model-based reasoning, the cognitive basis of which is the human capacity for mental modeling. Thought experiments (conceptual models), physical models [5.63] and computational models [5.47, 48] form a spectrum of simulative model-based reasoning in that all these types of modeling generate and test counterfactual situations that are difficult (if not impossible) to implement in the real world. Both thought experiments and computational models support simulation of counterfactual situations; however, while thought experiments are built using concrete elements, computational models are built using variables. Simulating counterfactual scenarios beyond the specific one constructed in the thought experiment is difficult and requires complex cognitive transformations to move away from the concrete case to the abstract, generic case. On the other hand, computational simulation constructs the abstract, generic case from the outset. Since computational models are made entirely of variables, they naturally support thinking about parameter spaces, possible variations to the design seen in nature, and why this variation occurs rather than the many others that are possible.
Thought experiments are a product of a resource environment in science where the only tools available were writing implements, paper (blackboards, etc.) and the brain. Computational models create cognitive enhancements that go well beyond those resources and enable scientists to study the complex, dynamic and nonlinear behaviors of the phenomena that are the focus of contemporary science.
Returning to the nature of the cognitive enhancements created, the coupling of the computational model with the modeler’s imagination system significantly enhances the researcher’s natural capacity for simulative model-based reasoning, particularly in the following ways:
It allows running many more simulations, with many variables at gradients not perceivable or manipulable by the mind, which can be compared and contrasted.
It allows testing what-if scenarios with changes among many variables that would be impossible to do in the mind.
It allows stopping the simulation at various points and checking and tracking its states. If some desirable effect is seen, variables can be tweaked in process to get that effect consistently.
It allows taking the system apart as modules, simulating them, and putting them together in different combinations.
It allows changing the time in which intermediate processes kick in.
These complex manipulations expose the modeler to system-level behaviors that are not possible to examine in either thought alone or in real-world experimentation. The processes involved in building the distributed model-based reasoning system comprising simulation model and modeler enhance several cognitive abilities. Here we will conclude by considering three (for a fuller discussion see [5.25]). First, the model-building process brings together a range of experimental data. Given Internet search engines and online data bases, current models synthesize more data than ever before and create a synthesis that exists nowhere in the literature and would not be possible for modelers or biologists to produce on their own. In effect, the model becomes a running literature review. Thus, modeling enhances the synthesizing and integrating capabilities of the modeler, which is an important part of the answer as to how a modeler with scant biological knowledge can make important discoveries. Second, an important cognitive effect of the model-building is to enhance the modeler’s powers of abstraction. Most significantly, through the gradual process of thousands of runs of simulations and analyses of system dynamics for these, the modeler gains an external, global view of the system as a whole. Such a global view would not be possible to develop just from mental simulation, especially since the interactions among elements are complex and difficult to keep track of separately. The system view, together with the detailed understanding of the dynamics, provides the modeler with an intuitive sense (a feeling for the model) of the biological mechanisms that enables her to extend the pathway structure in a constrained fashion to accommodate experimental data that could not be accounted for by the current pathway from which the model started. Additionally, this intuitive sense of the mechanism built from interaction with the model helps to explain the success of the crowdsourcing models noted above (see also [5.64]).
Finally, the model enhances the cognitive capacity for counterfactual or possible-worlds thinking. As noted in our discussion of thought experimenting, the
model-building process begins by capturing the reactions/interactions using variables. Variables provide a place-holder representation, which when interpreted with combinations of numbers for these variables, can generate model data that parallels the known experimental data. One interesting feature of the place-holder representation is that it provides the modeler with a flexible way of thinking about the reactions, as opposed to the experimentalist who works with only one set of values. Once the model is using the experimental values, the variables can take any set of values, as long as they generate a fit with the experimental data. The modeler is able to think of the real-world values as only one possible scenario, to examine why this scenario is commonly seen in nature, and envision other scenarios that fit. Thinking in variables supports both the objective modelers often have of altering or redesigning a reaction (such as the thickness of lignin in plant wall for biofuels) and the objective of developing generic design patterns and principles. More broadly, the variable representation significantly expands the imagination space of the modeler, enabling counterfactual explorations of possible worlds that far outstrip the potential of thought experimenting alone.
A more microscopic focus like this one on the actual processes by which computational simulation is coupled with the cognitive processes of the modeler begins to help break down some of the mystery and seeming inscrutability surrounding computation conveyed by the idea that computational processes are offloaded automated processes from which inferences are derived. The implications of this research into the hybrid nature of simulation modeling are that modelers might often have more control over and insight into their models and their alignment with the phenomena than philosophers have realized. Given the emphasis placed in published scientific literature on fitting the data and predictive success for validating simulations, we might be missing out on the important role that these processes internal to the model-building or discovery context appear to be playing (from a microanalysis of practice) in support of the models constructed. Indeed, the ability of computational modeling to support highly exploratory investigative processes makes it particularly relevant for philosophers to have fine-grained knowledge of model-building processes in order to begin to understand why models work as well as they do and how reliable they can be considered to be.
Acknowledgments. We gratefully acknowledge the support of the US National Science Foundation grant DRL097394084. Our analysis has benefited from collaboration with members of the Cognition and Learning in Interdisciplinary Cultures (CLIC) Research Group at the Georgia Institute of Technology, especially with Sanjay Chandrasekharan. Miles MacLeod’s participation was also supported by a postdoctoral fellowship at the TINT Center, University of Helsinki.
References
5.1 E. Winsberg: Sanctioning models: The epistemology of simulation, Sci. Context 12(2), 275–292 (1999)
5.2 E. Winsberg: Models of success vs. the success of models: Reliability without truth, Synthese 152, 1–19 (2006)
5.3 E. Winsberg: Computer simulation and the philosophy of science, Philos. Compass 4/5, 835–845 (2009)
5.4 E. Winsberg: Science in the Age of Computer Simulation (Univ. of Chicago Press, Chicago 2010)
5.5 E. Winsberg: Computer simulations in science. In: The Stanford Encyclopedia of Philosophy, ed. by E.N. Zalta (Stanford Univ., Stanford 2014), http://plato.stanford.edu/cgi-bin/encyclopedia/archinfo.cgi?entry=simulations-science
5.6 N. Cartwright: The Dappled World: A Study of the Boundaries of Science (Cambridge Univ. Press, Cambridge 1999)
5.7 M.S. Morgan, M. Morrison: Models as mediating instruments. In: Models as Mediators: Perspectives on Natural and Social Science, ed. by M.S. Morgan, M. Morrison (Cambridge Univ. Press, Cambridge 1999)
5.8 J. Lenhard: Computer simulation: The cooperation between experimenting and modeling, Philos. Sci. 74(2), 176–194 (2007)
5.9 E. Winsberg: Simulations, models, and theories: Complex physical systems and their representations, Philos. Sci. 68(3), 442–454 (2001)
5.10 W. Parker: Computer simulation through an error-statistical lens, Synthese 163(3), 371–384 (2008)
5.11 P. Humphreys: Extending Ourselves: Computational Science, Empiricism, and Scientific Method (Oxford Univ. Press, New York 2004)
5.12 P. Humphreys: The philosophical novelty of computer simulation methods, Synthese 169, 615–626 (2009)
5.13 W. Parker: Computer simulation. In: The Routledge Companion to Philosophy of Science, ed. by S. Psillos, M. Curd (Routledge, London 2013) pp. 135–145
5.14 E. Fox Keller: Models, simulation, and computer experiments. In: The Philosophy of Scientific Experimentation, ed. by H. Radder (Univ. of Pittsburgh Press, Pittsburgh 2003) pp. 198–215
5.15 S. Peck: Agent-based models as fictive instantiations of ecological processes, Philos. Theory Biol. 4, 1–12 (2012)
5.16 T. Grüne-Yanoff, P. Weirich: Philosophy of simulation, simulation and gaming, Interdiscip. J. 41(1), 1–31 (2010)
5.17 M.A. Bedau: Weak emergence and computer simulation. In: Models, Simulations, and Representations, ed. by P. Humphreys, C. Imbert (Routledge, New York 2011) pp. 91–114
5.18 S. Peck: The Hermeneutics of ecological simulation, Biol. Philos. 23(3), 383–402 (2008)
5.19 R. Frigg: Models and fiction, Synthese 172(2), 251–268 (2010)
5.20 T. Grüne-Yanoff: The explanatory potential of artificial societies, Synthese 169(3), 539–555 (2009)
5.21 M. MacLeod, N.J. Nersessian: Building simulations from the ground-up: Modeling and theory in systems biology, Philos. Sci. 80(4), 533–556 (2013)
5.22 E.O. Voit: Computational Analysis of Biochemical Systems: A Practical Guide for Biochemists and Molecular Biologists (Cambridge Univ. Press, Cambridge 2000)
5.23 M. MacLeod, N.J. Nersessian: The creative industry of systems biology, Mind Soc. 12, 35–48 (2013)
5.24 S. Chandrasekharan, N.J. Nersessian: Building cognition: The construction of external representations for discovery, Cogn. Sci. 39(8), 1727–1763 (2015), doi:10.1111/cogs.12203
5.25 S. Chandrasekharan, N.J. Nersessian: Building cognition: The construction of computational representations for scientific discovery, Cogn. Sci. 39(8), 1727–1763 (2015)
5.26 H. Kitano: Looking beyond the details: A rise in system-oriented approaches in genetics and molecular biology, Curr. Genet. 41(1), 1–10 (2002)
5.27 H.V. Westerhoff, D.B. Kell: The methodologies of systems biology. In: Systems Biology: Philosophical Foundations, ed. by F.C. Boogerd, F.J. Bruggeman, J.S. Hofmeyr, H.V. Westerhoff (Elsevier, Amsterdam 2007) pp. 23–70
5.28 R. Frigg, J. Reiss: The philosophy of simulation: Hot new issues or same old stew, Synthese 169, 593–613 (2009)
5.29 E. Winsberg: Simulated experiments: Methodology for a virtual world, Philos. Sci. 70(1), 105–125 (2003)
5.30 D.G. Mayo: Error and the Growth of Experimental Knowledge (Univ. of Chicago Press, Chicago 1996)
5.31 N. Gilbert, K. Troitzsch: Simulation for the Social Scientist (Open Univ. Press, Philadelphia 1999)
5.32 F. Guala: Models, simulations, and experiments. In: Model-based reasoning: Science, technology, values, ed. by L. Magnani, N.J. Nersessian (Kluwer Academic/Plenum Publishers, New York 2002) pp. 59–74
5.33 F. Guala: Paradigmatic experiments: The ultimatum game from testing to measurement device, Philos. Sci. 75, 658–669 (2008)
5.34 M. Morgan: Experiments without material intervention: Model experiments, virtual experiments and virtually experiments. In: The Philosophy of Scientific Experimentation, ed. by H. Radder (University of Pittsburgh Press, Pittsburgh 2003) pp. 216–235
5.35 W. Parker: Does matter really matter? Computer simulations, experiments and materiality, Synthese 169(3), 483–496 (2009)
5.36 E. Winsberg: A tale of two methods, Synthese 169(3), 575–592 (2009)
5.37 M. MacLeod, N.J. Nersessian: Coupling simulation and experiment: The bimodal strategy in integrative systems biology, Stud. Hist. Philos. Sci. Part C 44, 572–584 (2013)
5.38 W.S. Parker: Predicting weather and climate: Uncertainty, ensembles and probability, Stud. Hist. Philos. Sci. Part B 41(3), 263–272 (2010)
5.39 W.S. Parker: Whose probabilities? Predicting climate change with ensembles of models, Philos. Sci. 77(5), 985–997 (2010)
5.40 M. MacLeod, N.J. Nersessian: Modeling systems-level dynamics: Understanding without mechanistic explanation in integrative systems biology, Stud. Hist. Philos. Sci. Part C 49(1), 1–11 (2015)
5.41 J. Lenhard: Surprised by a nanowire: Simulation, control, and understanding, Philos. Sci. 73(5), 605–616 (2006)
5.42 N.J. Nersessian: Creating Scientific Concepts (MIT Press, Cambridge 2008)
5.43 N.J. Nersessian: How do engineering scientists think? Model-based simulation in biomedical engineering research laboratories, Top. Cogn. Sci. 1, 730–757 (2009)
5.44 W. Callebaut: Scientific perspectivism: A philosopher of science’s response to the challenge of big data biology, Stud. Hist. Philos. Sci. Part C 43(1), 69–80 (2012)
5.45 J. Bohannon: Gamers unravel the secret life of protein, Wired 17 (2009), http://www.wired.com/medtech/genetics/magazine/17-05/ff_protein, Last accessed 06-06-2016
5.46 F. Khatib, F. DiMaio, Foldit Contenders Group, Foldit Void Crushers Group, S. Cooper, M. Kazmierczyk, M. Gilski, S. Krzywda, H. Zabranska, I. Pichova, J. Thompson, Z. Popovic, M. Jaskolski, D. Baker: Crystal structure of a monomeric retroviral protease solved by protein folding game players, Nat. Struct. Mol. Biol. 18(10), 1175–1177 (2011)
5.47 S. Chandrasekharan, N.J. Nersessian, V. Subramanian: Computational modeling: Is this the end of thought experiments in science? In: Thought Experiments in Philosophy, Science and the Arts, ed. by J. Brown, M. Frappier, L. Meynell (Routledge, London 2013) pp. 239–260
5.48 S. Chandrasekharan: Building to discover: A common coding model, Cogn. Sci. 33(6), 1059–1086 (2009)
5.49 N.J. Nersessian: Engineering concepts: The interplay between concept formation and modeling practices in bioengineering sciences, Mind Cult. Activ. 19, 222–239 (2012)
5.50 C.G. Langton: Self-reproduction in cellular automata, Physica D 10, 135–144 (1984)
5.51 C.G. Langton: Computation at the edge of chaos: Phase transitions and emergent computation, Physica D 42, 12–37 (1990)
5.52 C. Reynolds: Flocks, herds, and schools: A distributed behavioral model, Comp. Graph. 21(4), 25–34 (1987)
5.53 K. Sims: Evolving 3D morphology and behavior by competition, Artif. Life 1(4), 353–372 (1994)
5.54 W. Banzhaf: Self-organization in a system of binary strings. In: Artificial Life IV, ed. by R. Brooks, P. Maes (MIT Press, Cambridge MA 2011) pp. 109–119
5.55 L. Edwards, Y. Peng, J. Reggia: Computational models for the formation of protocell structure, Artif. Life 4(1), 61–77 (1998)
5.56 N.J. Nersessian, E. Kurz-Milcke, W.C. Newstetter, J. Davies: Research laboratories as evolving distributed cognitive systems, Proc. 25th Annu. Conf. Cogn. Sci. Soc. (2003) pp. 857–862
5.57 L. Osbeck, N.J. Nersessian: The distribution of representation, J. Theor. Soc. Behav. 36, 141–160 (2006)
5.58 E. Hutchins: Cognition in the Wild (MIT Press, Cambridge 1995)
5.59 E. Hutchins: How a cockpit remembers its speeds, Cogn. Sci. 19(3), 265–288 (1995)
5.60 E.A. Di Paolo, J. Noble, S. Bullock: Simulation models as opaque thought experiments. In: Artificial Life VII, ed. by M.A. Bedau, J.S. McCaskill, N.H. Packard, S. Rasmussen (MIT Press, Cambridge 2000) pp. 497–506
5.61 J. Lenhard: When experiments start. Simulation experiments within simulation experiments, Int. Workshop Thought Exp. Comput. Simul. (2010)
5.62 N.J. Nersessian: In the theoretician’s laboratory: Thought experimenting as mental modeling, Proc. Philos. Assoc. Am., Vol. 2 (1992) pp. 291–301
5.63 N.J. Nersessian, C. Patton: Model-based reasoning in interdisciplinary engineering. In: Handbook of the Philosophy of Technology and Engineering Sciences, ed. by A. Meijers (Elsevier, Amsterdam 2009) pp. 687–718
5.64 S. Chandrasekharan: Becoming knowledge: Cognitive and neural mechanisms that support scientific intuition. In: Rational Intuition: Philosophical Roots, Scientific Investigations, ed. by L.M. Osbeck, B.S. Held (Cambridge University Press, Cambridge 2014) pp. 307–337
Part B
Theoretical and Cognitive Issues on Abduction and Scientific Inference
Ed. by Woosuk Park
In the last century, abduction was extensively studied in logic, semiotics, the philosophy of science, computer science, artificial intelligence, and cognitive science. The surge of interest in abduction derived largely from serious reflection on the neglect of the logic of discovery at the hands of logical positivists and Popper, especially their distinction between the context of discovery and the context of justification. At the same time, the desire to recover the rationality of science that has been seriously challenged by the publication of Kuhn’s The Structure of Scientific Revolutions might be another important factor. However, the consensus is that researchers have failed to secure the core meaning of abduction, let alone to cover the full range of its applications. The controversial status of abduction can be immediately understood if we consider our inability to answer the following questions satisfactorily:
What are the differences between abduction and induction?
What are the differences between abduction and the well-known hypothetico-deductive method?
What does Peirce mean when he says that abduction is a kind of inference?
Does abduction involve only the generation of hypotheses or their evaluation as well?
Are the criteria for the best explanation in abductive reasoning epistemic or pragmatic, or both?
How many kinds of abduction are there?
Fortunately, the situation has improved much in the last two decades. To say the least, some ambitious attempts to attain a unified overview of abduction have been made, e.g., in Gabbay and Woods (2005), Magnani (2001), and Aliseda (2006). Each of these attempts emphasizes its own strengths and achievements. For example, Aliseda’s book represents some logical and computational approaches to abduction quite well. Gabbay and Woods, by introducing the distinction between explanatory/non-explanatory abductions, adopt a broadly logical approach comprehending practical reasoning of real-life logical agents. By introducing his multiple distinctions between different kinds of abduction, i.e., selective/creative, theoretical/manipulative, and sentential/model-based, Magnani (2001, 2009) develops an eco-cognitive view of abduction, according to which instances of abduction are found not only in science and any other human enterprises, but also in animals, bacteria, and brain cells. Part B of this Handbook presents an overview of the most recent research on the foundational and cognitive issues on abduction inspired by all this.
In Chap. 6 John Woods provides us with the broader context in which the significance of abductive reasoning can be appreciated. He asks whether abduction’s epistemic peculiarities can be readily accommodated in philosophy’s mainline theories of knowledge, and whether abduction provides any reason to question the assumption that the goodness of drawing a conclusion from premises depends on an underlying relation of logical consequence. His answer to these questions amounts to a timely response to Hintikka’s announcement of abduction as the central problem in contemporary epistemology, as well as the signal of a naturalistic turn in logic.
Gerhard Schurz’s Chap. 7 presents a thorough classification of different patterns of abduction. In particular, it attempts the most comprehensive treatment of the patterns of creative abduction, such as theoretical model abduction, common cause abduction, and statistical factor analysis. This is significant, for, compared to selective abductions, creative abductions are rarely discussed, although they are essential in science. By appealing to independent testability and explanatory unification, a demarcation between scientifically fruitful abductions and speculative abductions is also proposed. Applications of abductive inference in the domains of belief revision and instrumental/technological reasoning represent the author’s most recent interest in the border between logic and the philosophy of science.
In Chap. 8 Gerhard Minnameier, by appropriating all recent studies on abduction, presents a well-rounded overview of the intricate relationships among deduction, induction, and abduction. By taking seriously Peirce’s claim (1) that there are only three kinds of reasoning, i.e., abduction, deduction, and induction, and (2) that these are mutually distinct, he wants to clarify the very notion of abduction. For this purpose, Minnameier carefully examines the fundamental features of the three inferences. He also suggests a novel distinction between two dimensions, i.e., levels of abstraction and domains of reasoning. To say the least, his taxonomy of inferential reasoning seems to provide us with a nice framework in which different forms of inferences can be systematically accommodated.
Finally, Woosuk Park counts Lorenzo Magnani’s discovery of manipulative abduction as one of the most important developments in recent studies on abduction in Chap. 9. After briefly introducing Magnani’s distinction between theoretical and manipulative abduction, Park discusses how and why Magnani counts diagrammatic reasoning in geometry as the prime example of manipulative abduction. Among the commentators of Peircean theorematic reasoning, Magnani is unique in equating theorematic reasoning itself with abduction. Park also discusses what he counts as some common characteristics of manipulative abductions, and how and why Magnani views manipulative abduction as a form of practical reasoning. Ultimately, he argues that it is manipulative abduction that enables Magnani to extend abduction to all directions to develop the eco-cognitive model of abduction.
The authors of this part follow the following commonly accepted abbreviation used to refer to the editions of Peirce’s work:
[…] Mathematics by Charles S. Peirce, Vol. IV (Mouton, The Hague, 1976), ed. by C. Eisele
MS: manuscript: Peirce manuscript, followed by a number in Richard R. Robin, Annotated Catalogue of the Papers of Charles S. Peirce Amherst: University of Massachusetts, 1967.
References
D. Gabbay, J. Woods: A Practical Logic of Cognitive Systems. The Reach of Abduction: Insight and Trial, Vol. 2 (Elsevier, Amsterdam 2005)
L. Magnani: Abduction, Reason, and Science: Processes of Discovery and Explanation (Kluwer, New York 2001)
CP: Collected papers: C.S. Peirce: Reviews, Corre- A. Aliseda: Abductive Reasoning. Logical Investi-
spondence, and Bibliography, Collected Papers of gations into Discovery and Explanation (Springer,
Charles Sanders Peirce, Vol. 8 (Harvard Univ. Press, Dordrecht 2006)
Cambridge 1958), ed. by A.W. Burks L. Magnani: Abductive Cognition. The Epistemo-
NEM: New Elements of Mathematics: C.S. Peirce: logical and Eco-cognitive Dimensions of Hypothet-
Mathematical Philosophy, The New Elements of ical Reasoning (Springer, Berlin, Heidelberg 2009)
6. Reorienting the Logic of Abduction
John Woods
Three facts about today’s logic stand out:

1. Never has it been done with such technical virtuosity
2. Never has there been so much of it
3. Never has there been so little consensus about its common subject matters.

It would seem that the more we have of it, the less our inclination to get to the bottom of its sprawlingly incompatible provisions. There is nothing remotely like this in real analysis, particle physics or population genetics. There is nothing like it in the premiss-conclusion reasonings of politics and everyday life. Left undealt with, one might see in logic’s indifference to its own rivalries some sign of not quite knowing its own mind.

It could be said that one of logic’s more stimulating events in our still-young century is the revival of the idea that it is a universal discipline, that when all is said and done there is a core structure to which all the multiplicities of our day are ultimately answerable. If the historical record is anything to go on, the cornerstone of that core structure is the relation of logical consequence. It occasions some sensible operational advice: If in your work you seek to enlarge logic’s present multiplicities, have the grace to say why you think it qualifies as logic, that is, embodies logic’s structural core. This is not idle advice. I hope to give it heed in the pages to follow, as we turn our attention to the logic of abduction.

Although logic’s dominant focus has been the consequence relation, in the beginning its centrality owed comparatively little to its intrinsic appeal. Consequence was instrumentally interesting; it was thought to be the relation in virtue of which premiss-conclusion reasoning is safe, or whose absence would expose it to risk. Reasoning in turn had an epistemic motivation. Man may be many kinds of animal, but heading the list is his cognitive identity. He is a knowledge-seeking and knowledge-attaining being to which his survival and prosperity are indissolubly linked, indispensable to which is his capacity to adjust what he believes to what follows from what. We might say then that as
138 Part B Theoretical and Cognitive Issues on Abduction and Scientific Inference
long as logic has retained its interest in good and bad reasoning it has retained this same epistemic orientation. Accordingly, a logic of good and bad reasoning carries epistemological presuppositions that aren’t typically explicitly developed.

It would be premature to say that abduction by now has won a central and well-established place in the research programs of modern logic, but there are some hopeful signs of progress (important sources include [6.1–13]). In the literature to date there are two main theoretical approaches, each emphasizing the different sides of a product-process distinction. The logical (or product) approach seeks for truth conditions on abductive consequence relations and of such other properties as may be interdefinable with it. The computational (or process) approach constructs computational models of how hypotheses are selected for use in abductive contexts. It is not a strict partition. Between the logical and computational paradigms, abductive logic programming and semantic tableaux abduction occupy a more intermediate position. Whatever its precise details, the logic-computer science dichotomy is not something I welcome. It distributes the theory of abductive reasoning into different camps that have yet to learn how to talk to one another in a systematic way. A further difficulty is that whereas abduction is now an identifiable research topic in logic – albeit a minority one – it has yet to attain that status in computer science. Such abductive insights as may occur there are largely in the form of obiter dicta attached to the main business at hand (I am indebted to Atocha Aliseda for insightful advice on this point). This leaves us awkwardly positioned. The foundational work for a comprehensive account of abductive reasoning still awaits completion.
6.1 Abduction
6.1.1 Peirce’s Abduction

Although there are stirrings of it in Aristotle’s notion of apagogē [6.14], we owe the modern idea of abduction to Peirce. It is encapsulated in the Peircean abduction schema, as follows [6.15, CP 5.189]:

“The surprising fact C is observed.
But if A were true, C would be a matter of course.
Hence there is reason to suspect that A is true.”

Peirce’s schema raises some obvious questions. One is how central to abduction is the factor of surprise. Another is the issue of how we are to construe the element of suspicion. A third concerns what we are expected to do with propositions that creep thus into our suspicions. A fourth is what we are to make of the idea that an occurrence of something is a matter of course. As with so many of his better ideas and deeper insights, Peirce has nothing like a fully developed account of abduction. Even so, the record contains some important ideas, seven of which I’ll mention here:

P1 Abduction is triggered by surprise [6.15, CP 5.189].
P2 Abduction is a form of guessing, underwritten innately by instinct ([6.16, p. 128], [6.15, CP 5.171], [6.17, CP 7.220]).
P3 A successful abduction provides no grounds for believing the abduced proposition to be true [6.16, p. 178].
P4 Rather than believing them, the proper thing to do with abduced hypotheses is to send them off to experimental trial ([6.15, CP 5.599], [6.18, CP 6.469–6.473], [6.17, CP 7.202–219]).
P5 The connection between the truth of the abduced hypothesis A and the observed fact C is subjunctive [6.15, CP 5.189].
P6 The inference that the abduction licenses is not to the proposition A, but rather that A’s truth is something that might plausibly be suspected [6.15, CP 5.189].
P7 The hence of the Peircean conclusion is ventured defeasibly [6.15, CP 5.189].

Let us note that P3 conveys something of basic importance. It is that successful abductions are evidentially inert. They offer no grounds for believing the hypotheses abduced. What, then, is the good of them?

6.1.2 Ignorance Problems

Seen in Peirce’s way, abductions are responses to ignorance problems. An agent has an ignorance problem in relation to an epistemic target when it can’t be attained by the cognitive resources presently at his command, or within easy and timely reach of it. If, for some proposition A, you want to know whether A is the case, and you lack the information to answer this question, or to draw it out by implication or projection from what you currently do know, then you have an ignorance problem with respect to A.

Two of the most common responses to ignorance problems are (1) subduance and (2) surrender. In the first case, one’s ignorance is removed by new knowledge, and an altered position is arrived at, which may serve as a positive basis for new action. In the second case, one’s ignorance is fully preserved, and is so in a way that cannot serve as a positive basis for new action (new action is action whose decision to perform is lodged in reasons that would have been afforded by that knowledge). For example, suppose that you’ve forgotten when Barb’s birthday is. If her sister Joan is nearby you can ask her, and then you’ll have got what you wanted to know. This is subduance. On the other hand if Joan is traveling incognito in Peru and no one else is about, you might find that knowing Barb’s birthday no longer interests you. So you might rescind your epistemic target. This would be surrender.

There is a third response that is sometimes available. It is a response that splits the difference between the prior two. It is abduction. Like surrender, abduction is ignorance-preserving, and like subduance, it offers the agent a positive basis for new action. With subduance, the agent overcomes his ignorance. With surrender, his ignorance overcomes him. With abduction, his ignorance remains, but he is not overcome by it. It offers a reasoned basis for new action in the presence of that ignorance. No one should think that the goal of abduction is to maintain that ignorance. The goal is to make the best of the ignorance that one chances to be in.

6.1.3 The Gabbay–Woods Schema

The nub of abduction can be described informally. You want to know whether something A is the case. But you don’t know and aren’t in a position here and now to get to know. However, you observe that if some further proposition H were true, then it together with what you already know would enable you to answer your question with regard to A. Then, on the basis of this subjunctive connection, you infer that H is a conjecturable hypothesis and, on that basis, you release it provisionally for subsequent inferential work in the relevant contexts.

More formally, let T be an agent’s epistemic target at a time, and K his knowledge base at that time. Let $K^{*}$ be an immediate successor of K that lies within the agent’s means to produce in a timely way. Let R be an attainment relation for T and let $\rightsquigarrow$ denote the subjunctive conditional relation. $K(H)$ is the revision of K upon the addition of H. $C(H)$ denotes the conjecture of H and $H^{c}$ its activation. Accordingly, the general structure of abduction can be captured by what has come to be known as the Gabbay–Woods schema [6.6, 19, 20]:

1. $T!E$ [The ! operator sets T as an epistemic target with respect to some state of affairs E]
2. $\neg R(K, T)$ [fact]
3. Subduance is not presently an option [fact]
4. Surrender is not presently an option [fact]
5. $H \notin K$ [fact]
6. $H \notin K^{*}$ [fact]
7. $\neg R(H, T)$ [fact]
8. $\neg R(K(H), T)$ [fact]
9. $H \rightsquigarrow R(K(H), T)$ [fact]
10. H meets further conditions $S_1, \ldots, S_n$ [fact]
11. Therefore, $C(H)$ [sub-conclusion, 1–7]
12. Therefore, $H^{c}$ [conclusion, 1–8].

It is easy to see that the distinctive epistemic feature of abduction is captured by the schema. It is a given that H is not in the agent’s knowledge set K. Nor is it in its immediate successor $K^{*}$. Since H is not in K, then the revision of K by H is not a knowledge-successor set to K. Even so, $H \rightsquigarrow R(K(H), T)$. But that subjunctive fact is evidentially inert with respect to H. So the abduction of H leaves the agent no closer than he was before to achieving the knowledge he sought. Though abductively successful, H doesn’t enable the abducer to attain his epistemic target. So we have it that successful abduction is ignorance-preserving. Of course, the devil is in the details. Specifying the $S_i$ is perhaps the hardest open problem for abductive logic. In much of the literature it is widely accepted that K-sets must be consistent and that their consistency must be preserved by $K(H)$. This strikes me as unrealistic. Belief sets are often, if not routinely, inconsistent. Also commonly imposed is a minimality condition. There are two inequivalent versions of it. The simplicity version advises that complicated hypotheses should be avoided as much as possible. It is sometimes assumed that truth tends to favor the uncomplicated. I see no reason to accept that. On the other hand, simplicity has a prudential appeal. Simple ideas are more easily understood than complicated ones. But it would be overdoing things to elevate this desideratum to the status of a logically necessary condition. The other version is a form of Quine’s maxim of minimum mutilation. It bids the theorist to revise his present theory in the face of new information in ways that leave as much as possible of the now-old theory intact. It advises the revisionist to weigh the benefits of admitting the new information against the costs of undoing the theory’s current provisions. This, too, is little more than prudence. No one wants to rule out Planck’s introduction of the quantum to physics, never mind the mangling of old physics that ensued. Another of the standard conditions is that $K(H)$ must entail the proposition for which abductive support has been sought. In some variations inductive implication is substituted. Both I think are too strong. Note also that none of the three – consistency, minimality or implication – could be thought of as process protocols. The $S_i$ are conditions on hypothesis selection. I have no very clear idea about how this is done, and I cannot but think that my ignorance is widely shared. Small wonder that logicians have wanted to offload the logic of discovery to psychology. I will come back to this briefly in due course. Meanwhile let’s agree to regard line (10) as a promissory note [6.21, Chap. 11].

6.1.4 The Yes-But Phenomenon

Perhaps it won’t come as much of a surprise to learn of the resistance with which the ignorance-preservation claim has been met when the Gabbay–Woods schema has been presented to (what is by now a sizable number of) philosophical audiences. There are those who think that precisely because it strips good abductions of evidential force, the G–W schema misrepresents Peirce. Others think that precisely because it is faithful to Peirce’s conditions the G–W schema discredits the Peircean concept of abduction. Of particular interest is the hesitation shown by philosophers who are actually inclined to accept the schema, and accept the Peircean notion. It may be true, they seem to think, that abduction is ignorance-preserving, but it is not a truth to which they take kindly. Something about it they find unsatisfying. There is a conventional way of giving voice to this kind of reticence. One does it with the words, “Yes, but ...”. So we may speak of this class of resisters as the ignorance-preservation yes-buts.

Some philosophers are of the view that there are at least three grades of evidential strength. There is evidential strength of the truth-preserving sort; evidential strength of the probability-enhancing sort; and evidential strength of a weaker kind. This latter incorporates a notion of evidence that is strong in its way without being either deductively strong or inductively strong. It is, as we might say, induction’s poor cousin. Proponents of this approach are faced with an interesting challenge. They must try to tell us what it is for premisses nondeductively to favor a conclusion for which there is no strong inductive support. If the weak cousin thesis is false, lots of philosophers are nevertheless drawn to it. So perhaps the better explanation of the yes-buts’ resistance to the ignorance-preservation claim is that they think that it overstates the poor cousin thesis, that it makes of abduction a poorer thing than it actually is. The poor cousin thesis says that abduction is the weakest evidential relation of the family. But the ignorance-preservation thesis says that it is an evidential relation of no kind, no matter how weak. Accordingly, what the yes-buts are proposing is tantamount to retention of the G–W schema for abduction minus Peirce’s clause P3. This would allow successfully abduced hypotheses the promise of poor-cousin evidential backing; but it wouldn’t be backing with no evidential force. It is an attractive idea, but it cuts too far.

There are too many cases in which successful reasoning, indeed brilliant reasoning, has the very characteristic the reformers would wish to suppress. A case in point is Planck’s quantum hypothesis. In the physics of the 1900s, black body radiation lacked unifying laws for high and low frequencies. Planck was disturbed by this. Notwithstanding his lengthy acquaintanceship with it, the disunification of the black body laws was a surprising event. It was, for physics, not a matter of course. Planck wanted to know what it would take to ease his cognitive irritation. Nothing he knew about physics answered this question. Nothing he would come to know about physics would answer it either, as long as physics was done in the standard way. Planck recognized that he would never attain his target until physics were done in a new way, in a way sufficiently at odds with the present paradigm to get some movement on this question; yet not so excessively ajar from it as to make it unrecognizable as physics. That day in 1900 when he announced to his son that he had overturned Newton, Planck was drawn to the conditional that if the quantum hypothesis Q were true then $K(Q)$ – that is, physics as revised by the incorporation of Q – would enable him to reach his target. So he put it to work accordingly. At no stage did Planck think that Q was true. He thought it lacked physical meaning. He thought that his reasoning provided no evidence that Q was true and no grounds for believing it to be true. Peirce wanted a logic that respected this kind of thinking. This is what I want too. The poor cousin thesis doesn’t do this, and cannot.

Ignorance removal is prompted by the reasoner’s desire to know something he doesn’t now know, or to have more knowledge of it than he currently does. What are the conditions under which this happens? It seems right to say that without an appreciation of the general conditions under which a human reasoner is in a state of knowledge, this is a question without a principled answer. If, as I aver, there are abductive modes of reasoning prompted by the desire to improve one’s epistemic condition which, even when wholly successful, do not fulfill that objective, there must be two particular considerations thanks to which this is so. One would have to do with abduction. The other has to do with knowledge. A fair part of this first factor is captured by the Gabbay–Woods schema (or so I say). The second is catered for by the right theory of knowledge, if there is one. We asked why, if a philosopher accepted the Gabbay–Woods schema for abduction, would he dislike its commitment to the ignorance-preservation claim? The possibility that we’re now positioned to consider is that his yes-but hesitancy flows from how he approaches the general question of knowledge. That is to say, it is his epistemology that makes him nervous, not his logic. If so, the yes part of “Yes, but ...” is directed to the logic, but the but part is directed to the epistemology.
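The ignorance-preserving character of the Gabbay–Woods schema in Sect. 6.1.3 can be made concrete with a toy model. What follows is only a minimal sketch under stated assumptions, not anything proposed in the chapter: propositions are plain strings, the attainment relation R is a hypothetical Boolean function on sets of propositions, the subjunctive conditional of line (9) is crudely approximated by testing R on the hypothetical revision K(H), and the conditions of line (10) are passed in as an unspecified predicate. Lines (3), (4), and (6) are simply taken as given. All names (`Agent`, `abduce`, `unifies_black_body_laws`) are illustrative inventions.

```python
from dataclasses import dataclass, field
from typing import Callable, FrozenSet, Iterable, Optional, Set

Proposition = str
# Hypothetical stand-in for the attainment relation R(., T):
# does this set of propositions suffice to reach the epistemic target T?
Attains = Callable[[FrozenSet[Proposition]], bool]


@dataclass
class Agent:
    knowledge: FrozenSet[Proposition]                          # K
    activated: Set[Proposition] = field(default_factory=set)   # hypotheses released as H^c


def abduce(agent: Agent,
           attains: Attains,
           candidates: Iterable[Proposition],
           further_conditions: Callable[[Proposition], bool]) -> Optional[Proposition]:
    """Return the first candidate H that passes the modelled lines of the schema."""
    if attains(agent.knowledge):
        return None                      # line 2 fails: K already attains T
    for h in candidates:
        if h in agent.knowledge:         # line 5: H must not be in K
            continue
        if attains(frozenset({h})):      # line 7: H alone must not attain T
            continue
        # Line 9 (assumption): "if H were true, K(H) would attain T" is
        # approximated by evaluating R on the merely hypothetical revision.
        if not attains(agent.knowledge | {h}):
            continue
        if not further_conditions(h):    # line 10: the unspecified S_1..S_n
            continue
        agent.activated.add(h)           # lines 11-12: conjecture, then activate
        return h                         # note: K itself is never updated
    return None


# Planck's case, schematically.
CLASSICAL = "classical physics"
Q = "quantum hypothesis Q"


def unifies_black_body_laws(props: FrozenSet[Proposition]) -> bool:
    return {CLASSICAL, Q} <= props


planck = Agent(knowledge=frozenset({CLASSICAL}))
chosen = abduce(planck, unifies_black_body_laws, [Q], lambda h: True)
```

After the call, Q has been released for inferential work, yet it is not in `planck.knowledge`, and the target is still unattained relative to what Planck knows. That is the sense in which a successful abduction leaves the agent's ignorance in place.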
6.2 Knowledge
6.2.1 Epistemology

I said in the abstract that epistemological considerations affecting the goodness or badness of premiss-conclusion reasoning are little in evidence in mainstream logic. In so saying, I intend no slight to the now large, growing and prospering literature on epistemic logics [6.22–24]. For the most part these logics construct formal representations of the standard repertoire of properties – consequence, validity, derivability, consistency, and so on – defined for sentences in which symbols for “it is known that” and “it is believed that” function as sentence operators. A central task for these logics is to construct a formal semantics for such sentences, typically on the assumption that these epistemic expressions are modal operators, hence subject to a possible worlds treatment. Notwithstanding their explicitly epistemic orientation, it remains true that there is in this literature virtually no express contact with any of the going epistemologies. So here, too, if they operate at all, epistemological considerations operate tacitly as part of an unrevealed epistemological background information. I intend something different here. I want to bring epistemology to the fore, which is precisely where it belongs in logics of premiss-conclusion reasoning of all kinds.

I want also to move on to what I think may be the right explanation of the yes-buts’ dissatisfactions. Before getting started, a caveat of some importance should be flagged. The explanation I’m about to proffer attributes to the yes-buts an epistemological perspective that hardly anyone shares; I mean by this that hardly any epistemologist shares it, a notable exception being [6.25]. There is a good chance that whatever its intrinsic plausibility, this new explanation will lack for takers. Even so, for reasons that will appear, I want to persist with it for a while. Here is what it proposes.

The Right-Wrong Thesis
While the Gabbay–Woods schema gets something right about abduction, it nevertheless gets ignorance-preservation wrong. What it gets right is that good abductions are evidentially inert. What it gets wrong is that this lack of evidential heft entails a corresponding failure to lift the abducer in any degree from his present ignorance.

Corollary 6.1
There are abductive contexts in which knowledge can be attained in the absence of evidence.

The idea of knowledge without supporting evidence isn’t entirely new or in the least shocking. There is a deeply dug-in inclination to apply this characterization to quite large classes of cases. Roughly, these are the propositions knowledge of which is a priori or independent of experience; or, as with Aristotle’s first principles, are known without the necessity or even possibility of demonstration; or, as some insist, are the immediate disclosures of sense and introspection. Disagreements have arisen, and still do, about whether these specifications are accurate or sustainable, but it would be a considerable exaggeration to call this sort of evidential indifference shocking, and wildly inaccurate as a matter of historical fact to think of it as new.

In truth, apriorism is beside the point of the right-wrong thesis and its corollary. The knowledge that falls within their intended ambit is our knowledge of contingent propositions, whether of the empirical sciences or of the common experience of life. The right-wrong claim is that there are contingent propositions about the world which, without being in any way epistemically privileged, can be ignorance-reducing by virtue of considerations that lend them no evidential weight. So what is wanted is a theory of knowledge that allows this to happen.

The historically dominant idea in philosophy is that knowledge is true belief plus some other condition, usually identified as justification or evidence. This, the J-condition, has been with us at least since Plato’s Theaetetus, and much scholarly ink has been spilled over how it is best formulated and whether it might require the corrective touch of some further condition. But, as a general idea, the establishment bona fides of the J-condition are as rock solid as anything in philosophy.

The account of knowledge I am looking for arises at the juncture of two epistemological developments. One is the trend towards naturalism [6.26] and the other is the arrival of reliabilism [6.27]. It is a theory in which the J-condition fails as a general constraint on epistemically unprivileged contingent knowledge. Accordingly, my first task is to try to downgrade the condition, to deny it a defining role. Assuming some success with the
first, my second task will be to find at the intersection of these trends an epistemological orientation – perhaps I would better call it an epistemological sensibility – which might without too much strain be reconciled to the loss of the J-condition. For ease of reference let me baptize this orientation, this sensibility, the causal response turn.

Whereupon task number three, which is to identify those further features of the causal response model that link up the notions of evidence and knowledge in the heterodox ways demanded by the right-wrong thesis.

6.2.2 Losing the J-Condition

The J-condition has attracted a huge literature and underwritten a good deal of strategic equivocation. On engaged readings of the condition, a person’s belief is justified or evidenced only if he himself has produced his justification then and there, or he has presented the evidence for it on the spot. On disengaged readings, a person is justified in believing if a justification exists but hasn’t been invoked, or evidence exists but hasn’t been adduced or even perhaps found. The engaged and disengaged readings raise an interesting question. How deeply engaged does one have to be to meet the J-condition on knowledge? Most epistemologists formulate the engaged-disengaged distinction as one between internalist and externalist justification.

Engagement here is a matter of case making. The two readings of J define a spectrum, but for present purposes there is little that needs saying of what lies within. It suffices to note that in its most engaged sense a belief is justified or evidenced only if the believer can himself make the case for it here and now. At the other extreme, the belief is justified or evidenced if a case for it is available in principle to someone or other. In the first case, the individual in question has a high degree of case-making engagement. In the other, his engagement is a gestural, anonymous and proxied one: it is engagement in name only.

Suppose the following were true. Suppose that, for every piece of epistemically unprivileged contingent knowledge p, there were a structure of facts in virtue of which p is the case. Suppose further that for every such p a person knows, it would be possible in principle to discern this structure of the facts and the in-virtue-of relation it bears to p’s truth. (I don’t think there is any realistic chance of this being so, but let’s assume it for the point at hand.) Suppose, finally, that we agreed to say that when in principle knowledge of that structure and that relation exists with respect to a p that a subject S knows, there exists a justification of S’s belief that p. For ease of reference, let’s call these factive justifications. Factive justifications are justifications at their most disengaged. They stand in radical contrast to highly engaged justifications, which we may call forensic.

By construction of the case presently in view, factive justification will be the constant companion of any piece of epistemically unprivileged contingent knowledge that S chances to have. But we have in this constancy not conditionhood but concomitance. Factive justification is a faithful accompaniment of such knowledge, but it is not a constituent of it. Forensic justification is another story. We might grant that if, when S knows that p, he has a forensic justification for his belief, then his justification will have made a contribution to this knowledge. But in relation to all that S knows it is comparatively rare that there is a forensic justification. Here is a test case, with a tip of the hat to Peirce: Do you know who your parents are? Of course you do! Very well, then, let’s have your forensic justification.

This is troublesome. If we persist in making forensic justification a condition on knowledge, the result is skepticism on an undesirable scale. If, on the other hand, we decide to go with factive justification, then justifications exist whenever knowledge exists, but they aren’t conditions on this knowledge. They are not a structural element of it. Whereupon we are met with the J-condition dilemma.

J-Condition Dilemma
Depending on how it is read, the J-condition is either an irrelevant concomitant of knowledge, or a skepticism-inducing discouragement of it.

The forensic-factive ambiguity runs through all the idioms of J-attribution. Concerning his belief that p there might be evidence for p that S adduces or there may be evidence for p that exists without attribution. There may be reasons for it that S gives, or reasons for it that exist without being given. Like confusions repose in careless uses of have. If we allow that S has a justification or has evidence or has reasons whenever these things exist factively, we mislicense the inference from the factive to the forensic, allowing, in so doing, S to have justifications that he’s never heard of.

6.2.3 The Causal Response Model of Knowledge

The causal response (CR) model of knowledge is rightly associated with reliabilism. In all the going forms of it, the J-condition is preserved [6.28]. One of the few places in the reliabilist literature where we see stirrings of the pure version of the causal model is Alvin Goldman’s first reliabilist paper, which appeared in 1967. It is a rare place in Goldman’s foundational corpus where the J-condition, if there at all, is given shortest shrift. In some versions, the J-condition is
satisfied when one’s belief has been reached by reliable procedures. In others, the condition is met when the belief was reliably produced, that is, produced by belief-forming mechanisms that were working reliably. In contrast to the standard versions, the pure version is one in which the J-condition is eliminated, rather than reinterpreted along reliabilist lines. As a first approximation, the pure theory characterizes knowledge as follows:

“S knows that p if and only if p is true, S believes that p, the belief was produced by belief-forming devices, in good working order, operating as they should on good information and in the absence of Gettier nuisances and other hostile externalities.”

Fundamental to what I’ve been calling the pure theory is the conviction that knowledge is not in any essential or general way tied to case making, that knowing is one thing and showing another. This is not to say that case making is never implicated in knowledge. There are lots of beliefs that would not have been had in the absence of the case makings that triggered their formation. Think here of a mother’s sad realization that her son is guilty of the crime after all, or a nineteenth century mathematician’s grudging acknowledgment of the transfinite. But as a general constraint, case making is rejected by pure causalists; by causalists of the sort that Goldman was trying to be in 1967.

aptitude for knowledge. But there are cases galore in which aptitude requires the supplementation of vocation and talent – and training. CR theorists are no less aware of this than their J rivals. The difference between them lies in where the emphasis falls. Among J theorists there is a tendency to generalize the hard cases. Among CR theorists there is a contrary tendency to keep the hard cases in their place.

Let me say again that J-theories give an exaggerated, if equivocal, place to the role of showing in knowing. Contrary to what might be supposed, the CR model is no disrespecter of the showing-knowing distinction, albeit with a more circumscribed appreciation of showing. I want to turn to this now.

6.2.5 Showing and Knowing

Consider the case of Fermat’s Last Theorem. The theorem asserts that for positive integers x, y, and z, the equation $x^{n} + y^{n} = z^{n}$ lacks a solution when $n > 2$. Fermat famously left a marginal note claiming to have found a proof of his theorem. I want to simplify the example by stipulating that he did not have a proof and did not think or say that he did. The received wisdom is that Fermat went to his grave not knowing that his theorem is true. The received wisdom is that no one knew whether the theorem is true until Andrew Wiles’ proof of it in 1995. If the forensically conceived J model were
true, this would be pretty much the way we would ex-
6.2.4 Naturalism pect the received wisdom to go.
If the J model is hard on knowledge, the CR model
Epistemology’s naturalized turn supplies a welcoming is a good deal more accommodating. It gives to knowl-
habitat for the CR model. Naturalism comes in vari- edge a generous provenance. But I daresay that it will
ous and competing versions, but at the core of them come as a surprise that, on some perfectly plausible
all is the insistence that human knowledge is a natural assumptions, Fermat did indeed know the truth of his
phenomenon, achieved by natural beings in accordance theorem, never mind (as we have stipulated) that he was
with their design and wherewithal, interacting in the all at sea about its proof. Fermat was no rookie. He was
causal nexi in which the human organism lives out his a gifted and experienced mathematician. He was im-
life. Unlike the J theorist, the CR theorist is a respecter mersed in a sea of mathematical sophistication. He was
of the passive side of knowledge. He knows that there a mathematical virtuoso. Fermat knew his theorem if
are large classes of cases in which achieving a knowl- the following conditions were met: It is true (as indeed
edge of something is a little more than just being awake it is), he believed it (as indeed he did), his highly trained
and on the scene. Even where some initiative is re- belief-forming devices were in good order (as indeed
quired by the knower, the resultant knowledge is always they were) and not in this instance misperforming (as
a partnership between doing and being done to. So even indeed they were not), and their operations were not
worked-for knowledge is partly down to him and partly compromised by bad information or Gettier nuisances
down to his devices. (as indeed was the case). So Fermat and generations
It would be wrong to leave the impression that, on of others like-placed knew the theorem well before its
the CR model, knowing things is just a matter of do- proof could be contrived.
ing what comes naturally. There are ranges of cases in We come now to a related point about showing
which knowledge is extremely difficult to get, if get- and knowing. Showing and knowing mark two distinct
table at all. There are cases in which knowledge is goals for science, and a corresponding difference in
unattainable except for the intelligence, skill, training their satisfaction conditions. Not unlike the law, science
and expertise of those who seek it. Everyone has an is in significant measure a case-making profession –
a forensic profession – made so by the premium it places on knowing when knowledge has been achieved, rather than just achieving it. This has something to do with its status as a profession, subject to its own exacting requirements for apprenticeship, advancement and successful practice. These are factors that impose on people in the showing professions expectations that regulate public announcement. Fermat may well have known the truth of his theorem and may have had occasion to say so to a trusted friend or his mother. But he was not to say it for publication. Publication is a vehicle for case making, and case making is harder than knowing. Journal editors don’t give a toss for what you know. But they might sit up and notice if you can show what you know.
6.2.6 Explaining the Yes-Buts
The ignorance-preservation claim is rooted in the idea of the no evidence-no knowledge thesis.
The No Evidence-No Knowledge Thesis
Since successful abduction is evidentially inert, it is also epistemically inert. But this is justificationism: No advance in knowledge without some corresponding advance in evidence.
The CR model jettisons justificationism. It denies the very implication in which the ignorance-preservation thesis is grounded. It is not hard to see that the evidence, whose abductive absence Peirce seizes upon, is not evidence in the factive sense. Peirce insists that we have no business believing a successfully abduced hypothesis. Peirce certainly doesn’t deny that behind any plausibly conjectured hypothesis there is a structure of facts in virtue of which it owes its truth value. Peirce thinks that our track record as abductive guessers is remarkably good. He is struck by the ratio of right guesses to guesses. He is struck by our aptitude for correcting wrong guesses. The evidence whose absence matters here is forensic; it is evidence by which an abducer could vindicate his belief in the hypothesis at hand. But Peirce thinks that in the abductive context nothing vindicates that belief.
We come now to a critical observation. There is nothing in Peirce’s account that tells us that abduced hypotheses aren’t believed as a matter of fact. Some clearly are not. At the time of their respective advancements, Planck didn’t believe the quantum hypothesis and Gell-Mann didn’t believe the quark hypothesis. But it takes no more than simple inspection to see that there are masses of cases to the contrary, that abductive success is belief-inducing on a large scale.
There is in this commonplace fact something for the CR theories to make something of. Let H be one of those successfully abduced hypotheses that happen to be true and, contrary to Peirce’s advice, believed by its abducer S. What would it take to get us seriously to propose that, when these conditions are met, S’s belief-forming devices are malfunctioning or are in poor operating order? Notice that a commonly held answer is not available here, on pain of question begging. It cannot be said that unevidenced belief is itself evidence of malfunction and disorder. That is, it cannot be said to the CR-theorist, since implicit in his rejection of justificationism is his rejection of this answer.
Is there, then, any reason to suppose that the arousal of unevidenced belief might be some indication of properly functioning belief formation? Ironically enough, there is an affirmative answer in Peirce himself. Peirce is much taken with our capacity for right guessing. Our facility with guessing is so impressive that Peirce is driven to the idea that good guessing is something the human animal is built for. But if we are built for good guessing, and good abduction is a form of guessing, how can the abduction of true hypotheses not be likewise something we’re built for? Accordingly, there is a case for saying that:
Knowledge Enhancement
In the CR model of knowledge, there are numbers of instances in which successful abduction is not only not ignorance preserving, but actually knowledge enhancing.
Part of what makes for the irony of Peirce’s enthusiasm for right guessing is his insistence that guesses not be indulged by belief. In this he is a justificationist. Abducers have no business in believing unevidenced propositions, never mind their abductive allure. This is enough of a basis to pin the ignorance-preservation thesis on Peirce, but not on a CR theorist who accepts the Gabbay–Woods schema. What this shows is that theirs is not a disagreement about abduction. It is a disagreement about knowledge.
There isn’t much likelihood that yes-buts will flock to this accommodation. The reason is that hardly anyone (any philosopher anyway) thinks the CR model is true in its pure form. There is no space left to me to debate the ins and outs of this. Suffice it to say that it offers the abductive logician the very relief that the yes-buts pine for. Besides, the CR theory just might be true [6.21].
6.2.7 Guessing
In line (10) of the G–W schema the $S_i$ occur as placeholders for conditions on hypothesis selection. Previously, I said that I didn’t know what these conditions
are [6.7]. In point of fact there are two things that I don’t know. One is the normative conditions in virtue of which the selection made is a worthy choice. The other is the causal conditions that enable the choice to be made. It is easy to see that there are a good many Hs that could serve as antecedents in line (9)’s $H \rightsquigarrow R(K(H), T)$ without disturbing its truth value. It is also easy to see that a good many of those Hs would never be abductively concluded, never mind their occurrence there. It is clear that a reasonable choice of H must preserve the truth of (9). It is also clear that this is not enough for abductive significance. A reasonable choice must have some further features. I am especially at a loss to describe how beings like us actually go about finding things like that. Perhaps it will be said that my difficulty is a reflection on me, not on the criteria for hypothesis selection. It is true that the number of propositions that could be entertained is at least as large as the number of Hs that slot into the antecedent of (9) in a truth-preserving way. Let’s think of these as constituting the hypothesis-selection space. Selection, in turn, is a matter of cutting down this large space to a much smaller proper subset, ideally a unit set. Selection, to this same effect, would be achieved by a search engine operating on the hypothesis-selection space. Its purpose would be to pluck from that multiplicity the one, or few ones, that would serve our purposes.
There is nothing remotely mystifying or opaque about search engines (why else would we bother with Google?). So isn’t the problem I’m having with the $S_i$ that I’m not a software engineer? Wouldn’t it be prudent to outsource the hypothesis-selection task to someone equipped to perform it? To which I say: If that is a doable thing we should do it. There is no doubt that algorithms exist in exuberant abundance for search tasks of considerable variety and complexity. There are algorithms that cut down a computer system’s search space to one answering to the algorithm’s flags. Perhaps such an arrangement could be said to model hypothesis selection. But it is another thing entirely as to whether, when we ourselves are performing them, our hypothesis selections implement the system’s algorithms. So I am minded to say that my questions about the $S_i$ are not comprehensively answerable by a software engineer.
Here is where guessing re-enters the picture, which is what Peirce thinks that hypothesis selection is. Peirce is struck by how good we are at it. By this he needn’t have meant that we have more correct guesses than incorrect. It is enough that, even if we make fewer correct guesses than incorrect, the ratio of correct to incorrect is still impressively high. We get it right, rather than wrong, with a notable frequency. Our opportunities for getting it wrong are enormous. Relative to the propositions that could have been guessed at, the number of times that they are rightly guessed is amazing; so much so that Peirce is led to surmise that our proclivity for right guesses is innate. Of course, not all good guessing is accurate. A good guess can be one that puts the guessed-at proposition in the ball park, notwithstanding that it might actually not be true. Here, too, good guesses might include more incorrect ones than correct. But as before, the ratio of correct to merely good could be notably high. So the safer claim on Peirce’s behalf is that beings like us are hardwired to make good, although not necessarily correct, guesses with a very high frequency. It is lots easier to make a ball-park guess than a true one; so much so that the hesitant nativist might claim a hardwired proclivity for ball-park, yet not for truth, save as a welcome contingency, which in its own turn presents itself with an agreeable frequency. Thus the safe inference to draw from the fact that H was selected is that H is in the ball park. The inference to H’s truth is not dismissable, but it is weaker.
Needless to say, nativism has problems all its own. But what I want to concentrate on is a problem it poses for Peircean abduction. At the heart of all is what to make of ball-park guesses. The safest thing to propose is that, even when false, a ball-park hypothesis in a given context is one that bears serious operational consideration there. There might be two overarching reasons for this. One is that ball-park hypotheses show promise of having a coherently manageable role in the conceptual spaces of the contexts of their engagement. Take again the example of Planck. The quantum hypothesis was a big wrench to classical physics. It didn’t then have an established scientific meaning. It entered the fray without any trace of a track record. Even so, for all its foreignness, it was a ball-park hypothesis. What made it so was that $P(Q)$ was a theory revision recognizable as physics. Contrast Q with “the gold fairy will achieve the sought-for unification.” Of course, all of this turns on the assumption that Peirce got it right in thinking that hypothesis selection is guessing, and in noting that good guessing is innate. Call this the innateness hypothesis. The second consideration is that the ratio of true hypotheses to ball-park hypotheses is notably high.
Whether or not he (expressly) knows how it’s done, when an abductive agent is going through his paces, there is a point at which he selects a hypothesis H. If the innateness thesis holds, then the agent has introduced a proposition that has an excellent shot at being ball-park, and a decent shot at being true. On all approaches to the matter, an abduction won’t have been performed in the absence of H; and on the G–W approach, it won’t have been performed correctly unless H is neither believed nor (however weakly) evidenced by its own abductive success. On the other hand, our present reflections suggest
that the very fact that H was selected is evidence that it is ball-park, and less good but not nonexistent evidence that it is true. Moreover, H is the antecedent of our subjunctive conditional (9) $H \rightsquigarrow R(K(H), T)$. If H is true so is $R(K(H), T)$ by modus ponens; and if $R(K(H), T)$ holds the original ignorance problem is solved by a form of subduance. In which case, the abduction simply lapses. It lapses because the nonevidential weight lent to a successfully abduced hypothesis is, on the G–W model, weaker than the evidential support given it by way of the innateness hypothesis as regards its very selection.
If, on the other hand, H is not true, but ball-park – hence favorably evidenced – and being evidenced is closed under consequence, then the reasoning at hand also goes through under the obvious adjustments.
The problem is that there are two matters on which Peirce can’t have his cake and eat it too. If he retains the innateness thesis he can’t have the ignorance-preservation thesis. Equally, if he keeps ignorance preservation he must give up innateness, which nota bene is not the thesis that guessing is innate but that good guessing is innate. Yet if we give up innateness we’re back to where we started, with less than we would like to say about the actual conditions for which the G–W $S_i$ are mere placeholders. I leave the innateness-ignorance preservation clash as an open problem in the abduction research program. Since, by our earlier reasoning, there is an epistemology (CR) that retains ignorance preservation only as a contingent property of some abductions, my present uncertain inclination is to retain G–W as modified by CR and to rethink innateness. But I’m open to offers. I’ll get back to this briefly in the section to follow.
Having had my say about the epistemological considerations that play out in the logic of abduction, I want to turn to the question of how, or to what extent, a logic of abduction will meet universalist conditions on logic. I want to determine whether or to what extent abductive theories embody the structural core assumed by universalists to be common to any theory that qualifies for admittance to the province of logic.
Whatever the details, abduction is a form of premiss-conclusion reasoning. By the conclusions-consequence thesis, whenever the reasoning is good the conclusion that’s drawn is a consequence of those premisses. As logics have proliferated, so too have the consequences, albeit not exactly in a strict one-to-one correspondence. If today there are more logics than one can shake a stick at, there is a concomitant plenitude of consequence relations. Much of what preoccupies logicians presently is the classification, individuation, and interrelatedness of this multiplicity. Whatever their variations, there is one distinction to which they all answer. Some consequence relations are truth-preserving; all the others aren’t. Truth-preserving consequence is (said to be) monotonic. (It isn’t. To take an ancient example, Aristotle’s syllogistic consequence is truth-preserving but nonmonotonic.) Premisses from which a conclusion follows can be supplemented at will and the conclusion will still follow. One way of capturing this point is that truth-preserving consequence is impervious to the openness of the world. As far as consequencehood is concerned, the world might as well be closed. Once a consequence of something, always a consequence of it. It is strikingly otherwise with non-truth-preserving consequence. It is precisely this indifference to the openness of the world that is lost.
6.2.8 Closed Worlds
When we were discussing the J condition on knowledge, we called upon a distinction between the factive justification of a belief and its forensic justification. In a rough and ready way, a factive justification is down to the world, whereas a forensic justification is down to us. We find ourselves at a point at which the idea of factivity might be put to further good use. To see how, it is necessary to acknowledge that the distinction between open and closed worlds is systematically ambiguous. In one sense it marks a contrast between information states at a time, with the closed world being the state of total information, and open ones states of incomplete information. In the other sense, a closed world can be called factive. A closed world at t is everything that is the case at t. It is the totality of facts at t. A closed world is also open at t, not with regard to the facts that close it at t, but in respect of the facts thence to come. We may suppose that the world will cease to be open at the crack of doom, and that the complete inventory of all the facts that ever were would be logged in the right sort of Doomsday Book. It is not, of course, a book that any of us will get to read. Like it or not, we must make do with openness. Both our information states and the world are open at any t before the crack. But the diachronics of facticity outpace the accuracy of information states. When there is a clash, the world at t always trumps our information about it at $t - n$.
At any given time the world will be more closed than its concurrent information states. At any given time the state of the world outreaches the state of our knowledge of it. When we reason from premisses to conclusions we are not negotiating with the world. We are negotiating with informational reflections of the world. We are negotiating with information states. Given the limitations on human information states, our representations of the world are in virtually all respects
open, and most premiss-conclusion relations are susceptible to rupture. Truth-preserving consequences are an interesting exception. The world can be as open as openness gets, but a truth-preserving consequence of something is always a consequence of it, never mind the provisions at any t of our information states. Nonmonotonic consequence is different: Today a consequence, tomorrow a nonconsequence.
We might think that the more prudent course is to cease drawing conclusions and postpone the decisions they induce us to make until our information state closes, until our information is permanently total. The ludicrousness of the assumption speaks for itself. Cognitive and behavioral paralysis is not an evolutionary option. Thus arises the closed world assumption. Given that belief and action cannot await the arrival of total information, it behooves us to draw our conclusions and take our decisions when the likelihood of informational defeat is least high, at which point we would invoke the assumption that for the matter at hand the world might just as well be closed.
The key question about the closed world assumption is the set of conditions under which it is reasonable to invoke it. The follow-up question is whether we’re much good at it. I am not much inclined to think that we have done all that well in answering the first question. But my answer to the second is that, given the plenitude of times and circumstances at which to invoke it, our track record is really quite good; certainly good enough to keep humanity’s knowledge-seeking project briskly up and running. Even so, the closed world assumption is vulnerable to two occasions of defeat. One is by way of later information about later facts. Another is by way of later information about the facts now in play. It is easy to see, and no surprise at all, that new facts will overturn present information about present facts with a frequency that matches the frequency of the world’s own displacement of old facts by new. Less easy to see is how we manage as well as we do in invoking closure in the absence of information about the present destructive facts currently beyond our ken. Here, too, we have a cut-down problem. We call upon closure in the hopeful expectation that no present unannounced fact will undo the conclusions we now draw and the decisions they induce us to make. Comparatively speaking, virtually all the facts there are now are facts that no one will ever know. That’s quite a lot of facts, indeed it is nondenumerably many (for isn’t it a fact that, for any real number, it is a number, and is self-identical, and so on?).
There is a point of similarity between hypothesis selection and the imposition of world closure. Our good track record with both invites a nativist account each time. Oversimplified, we are as good as we are at selecting hypotheses because that’s the way we were built. We are as good as we are at closing the world because that too is the way we were built. I suggested earlier that in abductive contexts the very fact that H has been selected is some evidence that it is true (and even better evidence that it is ball-park). But this seems to contradict the Peircean thesis that abductive success confers on H nothing stronger than the suspicion that it might be true. Since Peirce’s account of abduction incorporates both the innateness thesis and the no-evidential-support thesis, it would appear that Peirce’s account is internally inconsistent. I said a section ago that I had a slight leaning for retaining the no-evidence thesis and lightening up on the innateness thesis. Either way is Hobson’s choice. That, anyhow, is how it appears.
In fact, however, the appearance is deceptive. There is no contradiction. Peirce does not make it a condition on abductive hypothesis-selection that H enter the fray entirely untouched by reasons to believe it or evidence that supports it. He requires that the present support-status of H has no role to play in the abductive process. That H is somewhat well supported doesn’t, if true, have any premissory role here. Moreover, it is not the goal of abduction to make any kind of case for H’s truth. The goal is to find an H which, independently of its own epistemic status, would if true enable a reasoner to hit his target T. But whatever the target is, it’s not the target of wanting to know whether H is true. It is true that, if all goes well, Peirce asserts that it may be defeasibly concluded that there is reason to suspect that H might be true. But, again, abduction’s purpose is not to make a case for H, no matter how weakly. The function of the suspectability observation is wholly retrospective. It serves as a hypothesis-selection vindicator. You’ve picked the (or a) right hypothesis only if the true subjunctive conditional in which it appears as antecedent occasions the abducer’s satisfaction that that, in and of itself, would make it reasonable to suspect that H might be so. In a way, then, the G–W schema misrepresents this connection. It is not that the abduction implies H’s suspectability, but rather that the abduction won’t succeed unless the truth of line (9) induces the suspectability belief (for more on the causal role in inference, readers could again consult [6.21]). And that won’t happen if the wrong H has been selected, never mind that it preserves (9)’s truth. For the point at hand, however, we’ve arrived at a good result. The innateness thesis and the no-support thesis are both implicated in the Peircean construal of abduction, but are in perfect consistency.
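The contrast this section draws between truth-preserving consequence and conclusions drawn under world closure can be made concrete with a small sketch. It is an illustration only and not part of the chapter's apparatus: the helper names (`atom`, `implies`, `entails`, `cwa_concludes_not`) and the sample facts are hypothetical. The first half checks classical truth-table consequence, which survives any supplementation of the premisses. The second half draws a conclusion by treating the current information state as closed, and that conclusion is lost once a further fact is recorded.

```python
from itertools import product


def atom(name):
    """Return a formula (a function of a valuation) that is true iff `name` is true."""
    return lambda v: v[name]


def implies(a, b):
    """Material conditional built from two formulas."""
    return lambda v: (not a(v)) or b(v)


def entails(premisses, conclusion, atoms):
    """Classical consequence by truth table: the conclusion holds in every
    valuation that makes all the premisses true."""
    for values in product([True, False], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if all(p(v) for p in premisses) and not conclusion(v):
            return False
    return True


def cwa_concludes_not(info_state, fact):
    """Closed-world conclusion (assumed reading): take `fact` to be false
    whenever the current information state does not record it."""
    return fact not in info_state


atoms = ["p", "q", "r"]
p, q, r = atom("p"), atom("q"), atom("r")

# Monotonic case: adding the premiss r does not disturb the consequence q.
before = entails([p, implies(p, q)], q, atoms)
after = entails([p, implies(p, q), r], q, atoms)

# Nonmonotonic case: a conclusion drawn under closure at t is undone
# when later information about the same matter arrives.
info_at_t = {"alarm_set"}
info_later = info_at_t | {"window_open"}
closed_at_t = cwa_concludes_not(info_at_t, "window_open")
closed_later = cwa_concludes_not(info_later, "window_open")
```

Here `before` and `after` agree, while `closed_at_t` and `closed_later` differ, which is the sense in which the second kind of conclusion is "today a consequence, tomorrow a nonconsequence."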
6.3 Logic
6.3.1 Consequences and Conclusions
I said at the beginning that for nearly two and a half millennia the central focus of logic has been the consequence relation. More basic still was a concomitant preoccupation with premiss-conclusion reasoning. For a very long time logicians took it as given that these two matters are joined at the hip.
Conclusions and Consequences
When someone correctly draws a conclusion from some premisses, his conclusion is a consequence of them.
Corollary 6.2
If a conclusion drawn from some premisses is not a consequence of them, then the conclusion is incorrectly drawn.
If this were so, it could be seen at once that there is a quite intuitive distinction between the consequences that a premiss set has and the consequences that a reasonable reasoner would conclude from it. In any treatment of logic in which this distinction is at least implicitly present, there is a principled role for agents, for the very beings who draw what conclusions they will from the consequences that flow from the premisses at hand. In any such logic there will be at least implicit provision for the nature of the agent’s involvement. In every case the involvement is epistemically oriented. People want to know what follows from what. They want to know how to rebut an opponent. They want to know whether, when this follows from that, they can now be said to know that. In a helpful simplification, it could be said that logic got out of the agency business in 1879. It is not that agency was overlooked entirely, but rather that it was scandalously short-sheeted. For consequence, the having-drawing distinction would fold into having; and having, it would be said, would be the very things drawn by an ideally rational reasoner. Of course, this downplaying of cognitive agency was never without its dissenters. Indeed today we are awash in game-theoretic exuberance, to name just one development of note.
6.3.2 Semantics
Consequence derives its semantic character from its attachment to truth, itself a semantic property in an odd baptismal bestowal by Tarski. In the deductive case, it is easy to see how truth is implicated in consequence and how, in turn, consequence assumes its status as a semantic relation. Not only does truth ground the very definition of consequence, but it makes for a relation that is also truth-preserving. The monotonicity of consequence provides the sole instance in which a consequence is impervious to the informational openness of the world. It is the one case in informational openness at t that is indifferent to the world’s factive closure at t, to say nothing of its final closure at the crack of doom. It has long been known that logicians, then and now, harbor an inordinate affection for deductive consequence. It’s not hard to see why. Deductive consequence has proved more responsive to theoretical treatment than any of the nondeductive variety. But more centrally, it is the only consequence relation that captures permanent chunks of facticity.
Whatever else we might say, we can’t say that nonmonotonic relations are relations of semantic consequence. If B is a nonmonotonic consequence of A it holds independently of whatever makes for the truth of A and B. Sometimes perhaps it holds on account of probability conditions on A and B, but probability has nothing to do with truth. If there is such a thing as probabilistic consequence – think here of Carnap’s partial entailment – it is not a semantic relation. We may have it now that the evidence strongly supports the charge against Spike in last night’s burglary. We might come to know better tomorrow. We might learn that at the time of the offense Spike was spotted on the other side of town. So the world at t didn’t support then the proposition that Spike did do it, never mind the state of information the day after t.
No one doubts that yesterday there existed between the evidence on hand and the charge against Spike a relation of epistemic and decisional importance, a kind of relation in whose absence a survivable human life would be impossible. But a fair question nevertheless presses for attention: Where is the gain in conceptualizing these vital premiss-conclusion relations as relations of logical consequence? Where is the good of trying to construe nonmonotonic relations on the model of attenuated and retrofitted monotonic consequences? My own inclination is to say that talk of nonmonotonic consequence misconceives the import of nonmonotonicity. We tend to think of it as a distinguishing feature of consequence relations, when what it really is is the defining feature of non-truth preservation.
When premiss-conclusion reasoning is good but not truth-preserving, it is made so by an underlying relation. Any theory of premiss-conclusion reasoning had better have something to say about this, about its nature and how it operates. We should give it the name that it deserves and that better reflects how it actually does function. Let’s call it conclusionality. Conclusionality is an epistemic or epistemic/prudential relation. It is a relation
Reorienting the Logic of Abduction References 149
that helps rearrange our belief states, hence possessing decisional significance. Any struggle to discern whether it is also a consequence relation seems to me to be sailing into the wind.

Abductive conclusions are on the receiving end of this relation; they are occupants of its converse domain. If our present reflections can be made to stand, there is no relation of abductive consequence; and it will cause us no end of distraction trying to figure out how to make it one. It hardly needs saying that depriving a logic of abduction of its own relation of abductive consequence must of necessity rearrange how abductive logic is conceptualized. There are plenty of logicians more than ready to say that a logic without consequence relations is a logic in name only – a logic façon de parler, hence a logic that fails universalistic prescriptions. I am otherwise minded. Logic started with conclusionality relations. It was adventitiousness, not essence, that brought it about that the ones first considered were also consequence relations. Logic has had a good innings right from the beginning. In a way, this has been unfortunate. The success we’ve had with consequence has obscured our view of conclusionality. It has led us to think that the more we can get conclusionality in general to be a species of consequence, the faster we’ll achieve some theoretical respectability. We would be better served to place conclusionality at the core of logic and to place consequence in an annex of less central importance. If we did that, we could reinstate the logic of abduction and equip it for admittance into universalist respectability. But we could also reinvest to good effect all that energy we’ve devoted to consequentializing the conclusionality relation, in a refreshed effort to figure how conclusionality actually works in the epistemically sensitive environments in which, perforce, the human organism must operate.

Acknowledgments. I would know a good deal less than I presently do about abduction without stimulating instruction from Dov Gabbay, Lorenzo Magnani, Atocha Aliseda, Ahti-Veikko Pietarinen, Peter Bruza, Woosuk Park, Douglas Niño and more recently – especially in relation to sections 11 and 12 – Madeleine Ransom. To all my warmest thanks. My student Frank Hong has also pitched in with astute suggestions; equal gratitude to him. For technical support and everything else that matters, Carol Woods is my go-to gal. Without whom not.
Gerhard Schurz
7. Patterns of Abductive Inference

This article understands abductive inference as encompassing several special patterns of inference to the best explanation whose structure determines a promising explanatory conjecture (an abductive conclusion) for phenomena that are in need of explanation (Sect. 7.1). A classification of different patterns of abduction is given in Sect. 7.2, which is intended to be as complete as possible. A central distinction is that between selective abductions, which choose an optimal candidate from a given multitude of possible explanations (Sects. 7.3 and 7.4), and creative abductions, which introduce new theoretical models or concepts (Sects. 7.5–7.7). While the discussion of selective abduction has dominated the literature, creative abductions are rarely discussed, although they are essential in science. This paper introduces several kinds of creative abduction, such as theoretical model abduction, common-cause abduction, and statistical factor analysis. A demarcation between scientifically fruitful abductions and speculative abductions is proposed, by appeal to two interrelated criteria: independent testability and explanatory unification. Section 7.8 presents applications of abductive inference in the domains of belief revision and instrumental/technological reasoning.

7.1 General Characterization of Abductive Reasoning and IBE
7.2 Three Dimensions for Classifying Patterns of Abduction
7.3 Factual Abduction
7.3.1 Observable-Fact Abduction
7.3.2 First-Order Existential Abduction
7.3.3 Unobservable-Fact Abduction
7.3.4 Logical and Computational Aspects of Factual Abduction
7.4 Law Abduction
7.5 Theoretical-Model Abduction
7.6 Second-Order Existential Abduction
7.6.1 Micro-Part Abduction
7.6.2 Analogical Abduction
7.6.3 Hypothetical (Common) Cause Abduction
7.7 Hypothetical (Common) Cause Abduction Continued
7.7.1 Speculative Abduction Versus Causal Unification: A Demarcation Criterion
7.7.2 Strict Common-Cause Abduction from Correlated Dispositions and the Discovery of New Natural Kinds
7.7.3 Probabilistic Common-Cause Abduction and Statistical Factor Analysis
7.7.4 Epistemological Abduction to Reality
7.8 Further Applications of Abductive Inference
7.8.1 Abductive Belief Revision
7.8.2 Instrumental Abduction and Technological Reasoning
References
152 Part B Theoretical and Cognitive Issues on Abduction and Scientific Inference
Thesis 7.3 (The strategic role of abduction as means for discovery)
All inferences have a justificatory (or inferential) and a strategic (or discovery) function, but to a different degree. The justificatory function consists of the justification of the conclusion, conditional on the justification of the premises. The strategic function consists of searching for the most promising conjecture (conclusion), which is set out for further empirical testing, or in Hintikka’s words, which stimulates new questions [7.14, p. 528], [7.15, Sect. 14].

In deductive inferences the justificatory function is fully realized, because the premises guarantee the truth of the conclusion. Deductive inferences may also serve important strategic functions, because many different conclusions can be derived from the same premises. In inductive inferences, there is not much search strategy involved, because the inductive conclusions of a premise set are narrowly defined by the operations of generalization over instances. So the major function of inductive inferences is justificatory, but their justificatory value is uncertain. In contrast, in abductive inferences, the strategic function becomes dominant. Different from the situation of induction, in abduction problems we are often confronted with thousands of possible explanatory conjectures – anyone in the village might be the murderer. The essential function of abductions is their role as search strategies that tell us which explanatory conjecture we should set out first for further inquiry [7.14, p. 528] – or more generally, which suggest a short and most promising (though not necessarily successful) path through the exponentially explosive search space of possible explanatory reasons.

In contrast, the justificatory function of abductions is minor. Peirce pointed out that abductive hypotheses are prima facie not even probable, as inductive hypotheses are, but merely possible [7.3, CP 5.171]. Only upon being confirmed by further tests may an abductive hypothesis become probable. However, I cannot completely agree with Peirce or other authors [7.14, 16], [7.17, p. 192] who think that abductions are merely a discovery procedure whose justificatory value is zero. As Niiniluoto pointed out, “abduction as a motive for pursuit cannot always be sharply distinguished from considerations of justification” [7.12, S442]. Niiniluoto’s point is confirmed by a Bayesian analysis: If a hypothesis $H$ explains an explanandum $E$ (where $P(H), P(E) \neq 0, 1$), then $P(E \mid H)$ (E’s posterior probability) has been raised compared to $P(E)$ (E’s prior probability), which implies (by probability theory) that $P(H \mid E) > P(H)$, i. e., E raises H’s probability, if only a little bit.

It is essential for a good search strategy that it leads us to an optimal conjecture in a reasonable amount of time. In this respect, the rule of IBE fails completely. It just tells us that we should choose the best (available) explanation without giving us any clue of how to find it. To see the problem by way of a humorous example, think of someone in a hurry who asks an IBE-philosopher for the right way to the railway station and receives the following answer: Find out which is the shortest way among all ways between here and the train station – this is the route you should choose. In other words, IBE merely reflects the justificatory but misses the strategic function of abduction, which in fact is its essential function. For this reason, the rule of IBE is, epistemically, rather uninformative [7.18, p. 281].

Peirce once remarked that there are sheer myriads of possible hypotheses that would explain a given experimental phenomenon, yet scientists usually manage to find the true hypothesis after only a small number of guesses [7.19, CP 6.5000]. But Peirce did not propose any abductive rules for conjecturing new theories; he rather explained this miraculous ability of human minds by their abductive instincts [7.20, CP 5.47, fn. 12; 5.172; 5.212]. The crucial question seems to be whether there can be anything like a logic of discovery. Popper and the logical positivists correctly observed that the justification of a hypothesis is independent from the way it was discovered. This does not imply, however, that it would not be desirable to have in addition good rules for discovering explanatory hypotheses – if there only were such rules [7.16]. This paper intends to show that there are such rules; in fact, every kind of abduction pattern presented in this paper constitutes such a rule.

The majority of the recent literature on abduction has aimed at one most general schema of abduction that matches every particular case. I do not think that good heuristic rules for generating explanatory hypotheses can be found along this route, because such rules are dependent on the specific type of abductive scenario, for example, concerning whether the abduction is mainly selective or creative (etc.). In the remainder of this paper, I will pursue a different route to characterizing abduction, which consists of modeling various particular schemata of abduction, each fitting a particular kind of conjectural situation. Two major results of my paper can be summarized as follows:

Result 7.1
There exist rather different kinds of abductive patterns. While some of them enjoy a broad discussion in the literature, others have been neglected, although they play an important role in science. The epistemological role and the evaluation criteria of abduction are different for the different patterns of abduction.
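
The Bayesian step invoked above, that an explanatory hypothesis is confirmed at least slightly by its explanandum, can be spelled out as a short derivation. This is only a sketch under the text's own assumption that $P(H)$ and $P(E)$ are neither 0 nor 1; the numbers in the final line are hypothetical and serve purely as illustration.

```latex
\begin{align*}
P(H \mid E) &= \frac{P(E \mid H)\,P(H)}{P(E)}
  && \text{(Bayes' theorem, using } P(E) > 0\text{)} \\
\frac{P(H \mid E)}{P(H)} &= \frac{P(E \mid H)}{P(E)}
  && \text{(dividing by } P(H) > 0\text{)} \\
P(E \mid H) > P(E) &\;\Longrightarrow\; P(H \mid E) > P(H) \\[4pt]
\text{e.g. } P(H) = 0.01,\; P(E) = 0.2,\; P(E \mid H) = 0.9
  &\;\Longrightarrow\; P(H \mid E) = \frac{0.9 \cdot 0.01}{0.2} = 0.045 > 0.01
\end{align*}
```

In the illustrative case the hypothesis remains improbable after the evidence, which matches the remark that the support may be only slight.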
Factual abduction may also be called retroduction; Chisholm [7.21, Ch. IV.2] speaks of inverse induction. This kind of abduction has the following structure (the double line ===== indicates that the inference is uncertain and preliminary)

Known law: If $Cx$, then $Ex$
Known evidence: $Ea$ has occurred
======================
Abducted conjecture: $Ca$ could be the reason.

One may call the factual abduction schema the official Peirce abduction schema, since Peirce [7.2, CP 2.619–2.644] formalized abduction in this way and named it hypothesis; later he generalized abduction in the way described in Sect. 7.1. Factual abductions are omnipresent in common sense reasoning, and presumably rely on inborn abductive instincts of hominids. Prototypical examples are detective stories [7.22], or more generally, all sorts of causal interpretations of traces. The AI literature is focused almost exclusively on factual abductions (Sect. 7.3.4). Depending on the epistemological nature of the abducted fact, one can distinguish between the following three subpatterns.

7.3.1 Observable-Fact Abduction

Here one reasons according to the fact-abduction schema from observed effects ($Ea$) to non-observed but observable causes ($Ca$) in the background of known laws. The follow-up test procedure consists of the attempt to gain direct evidence for the abducted conjecture. In the example of a murder case, such direct evidence would be given, for example, by a confession of the putative murderer.

In the setting of factual abduction, the problem often consists of the combinatorial explosion of the search space of possible causes, in the presence of a rich background store of laws but in the absence of rich factual knowledge. Thus, factual abductions are primarily selective in the sense of Magnani [7.8, p. 20], and their epistemic support depends on the degree to which the background knowledge increases their probability in comparison to alternative possible causes. Consider the following example: If your evidence consists of the trace of the imprints of sandals [...] are drawn into the sand or blown by the wind, etc. The majority of these possible abductive conjectures will never be considered by us because they are extremely improbable. The major strategic algorithm that we apply in factual abduction cases of this sort is a probabilistic elimination technique, which usually works in an unconscious manner: our mind quickly scans through our large memory store containing millions of memorized possible scenarios, and only those that have at least a minimal plausibility pop up in our consciousness.

So, probabilistic evaluation of possible causes given certain effects and elimination of implausible causes plays a central role in factual abductions. Fumerton [7.23, p. 592] has gone further and argued that factual abduction can even be reduced to ordinary inductive-statistical inference. More precisely, he argues that the first inference pattern (below) can be reduced to the second inference pattern (below) in the following way (where $P(\cdot)$ denotes subjective-epistemic and $p(\cdot)$ statistical probability, and $K$ expresses background knowledge).

Abductive inference:
$L$: $\forall x (Fx \to Gx)$
$Ga$
=============
$Fa$ (presupposition: $P(Fa \mid Ga \wedge L \wedge K) = \text{high}$)

Fumerton’s reduction: the pattern above is transformed into the pattern below.

Inductive-statistical inference:
$L'$: $p(Fx \mid Gx) = \text{high}$
$Ga$
============= [$P(Fa \mid Ga \wedge L') = \text{high}$]
$Fa$

Although Fumerton’s reduction seems reasonable in some cases, I see two reasons why his argument is not generally correct. Firstly, the abductive hypothesis is probabilistically evaluated not merely in the light of the evidence $Ga$ and an inverse statistical law $L'$, but in the light of the entire background knowledge $K$. Fumerton may reply that the second inference pattern may be appropriately extended so that it includes background
knowledge. But secondly, Fumerton’s proposed transformation does not correspond to psychological reality, nor would it be strategically recommendable. Every individual case (or effect) is different, and hence, only a small fraction of possible cause-effect scenarios are encountered frequently enough in a human lifetime in order to be represented by Fumerton-like conditional probabilities. For example, if you are not a turtle expert and you observe the trace of a turtle in the sand, then the only way in which you may arrive at the right guess that there was a turtle crawling here is by careful backward reasoning combined with elimination. Unless you are a turtle hunter, it is unlikely that you will have explicitly stored information concerning the typical sand traces of turtles via a corresponding forward conditional of the sort proposed by Fumerton.

7.3.2 First-Order Existential Abduction

This subcase of factual abduction occurs when the antecedent of a law contains so-called anonymous variables, i. e., variables that are not contained in the consequent of the law. In the simplest case, the formal structure of first-order existential abduction is as follows (cf. also [7.24, p. 57]).

$L$: $\forall x \forall y (Ryx \to Hx)$
logically equivalent: $\forall x (\exists y\, Ryx \to Hx)$
$Ha$
============
Conjecture: $\exists y\, Rya$

Instantiating the consequent of the law with $a$ and backward chaining yields a law-antecedent in which one variable remains uninstantiated ($Rya$). In such a case, the safest abductive conjecture is one in which we existentially quantify over this variable. We have already discussed an example of this sort in Sect. 7.3.1: from the footprint in the sand we abductively infer that some man was walking on the beach. Prendinger and Ishizuka [7.25, p. 322] call first-order existential abduction element-creative abduction, because here the existence of an unknown object is hypothesized.

Note, however, that only in some cases will we be satisfied with the existential conjecture. In other cases, in particular in criminal cases, all depends on finding out which individual is the one whose existence we conjecture – who was the murderer? Here one is not satisfied with a first-order existential abduction but wants to have a proper, fully-instantiated fact abduction.

In observable-fact abduction, the abducted hypothesis may at later stages of inquiry be confirmed by direct observation – for example, when we later meet the man who had walked yesterday on this beach. In this case, the weak epistemic support that the abductive inference conveys to the conjecture gets replaced by the strong epistemic support provided by the direct evidence: abduction has played an important strategic role, but it no longer plays a justificatory role. This is different, however, in all of the following patterns of abduction, in which the abductive hypothesis is not directly observable, but only indirectly confirmable via its empirical consequences.

7.3.3 Unobservable-Fact Abduction

This kind of abduction has the same formal structure as observable-fact abduction, but the abducted fact is unobservable. The typical cases of unobservable-fact abduction are historical-fact abductions, in which the abducted fact is unobservable because it is located in the distant past. The abducted fact may also be unobservable in principle, because it is a theoretical fact. However, in such a case the abduction is usually not driven by simple implicational laws, but by a quantitative theory, and the abducted theoretical fact corresponds to a theoretical model of the observed phenomenon: This sort of abduction differs crucially from law-driven factual abduction and is therefore treated under the separate category of theoretical-model abduction (Sect. 7.5).

Historical-fact abductions are of obvious importance for all historical sciences [7.12, p. 442]. Assume, for example, that biologists discover marine fossil records, say fish bones, in the ground of dry land. They conjecture abductively, given their background theories, that some geological time span ago there was a sea here. Their hypothesis cannot be directly verified by observations. So the biologists look for further empirical consequences that follow from the abducted conjecture plus background knowledge – for example, further geological indications such as calcium deposits, or marine shell fossils, etc. If the latter findings are observationally verified, the abductive conjecture is confirmed. Logically speaking, an unobservable-fact abduction performs a combination of abductive backward reasoning and deductive or probabilistic forward reasoning to consequences that can be put to further test. This is graphically displayed by the bold arrow in Fig. 7.2.

If the empirical consequence $Ea$ is verified, then both pieces of evidence $Ga$ and $Ea$ provide epistemic support for the abducted hypothesis $Ha$ (modulo probabilistic considerations in the light of the background knowledge). So, the initial abductive inference has not only a strategic value, but keeps its justificatory value.
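
Returning to the first-order existential schema of Sect. 7.3.2, the backward-chaining step can be illustrated by a minimal Python sketch. The `Atom` type, the function name `existential_abduction`, and the textual rendering of the quantifier are hypothetical choices made for this illustration only; the sketch handles a single law with a one-atom antecedent, as in the schema above.

```python
from typing import NamedTuple, Optional, Tuple


class Atom(NamedTuple):
    """A predicate applied to variables or constants, e.g. R(y, x)."""
    pred: str
    args: Tuple[str, ...]


def existential_abduction(body: Atom, head: Atom, goal: Atom) -> Optional[str]:
    """Backward-chain the law `body -> head` on the observed `goal`.

    Variables of the head are bound to the constants of the goal; any
    variable that occurs only in the body (an anonymous variable) stays
    uninstantiated and is existentially quantified in the conjecture.
    """
    if head.pred != goal.pred or len(head.args) != len(goal.args):
        return None  # the law does not apply to this goal
    binding = dict(zip(head.args, goal.args))
    # Anonymous variables, in order of first occurrence, without duplicates.
    anonymous = list(dict.fromkeys(v for v in body.args if v not in binding))
    instantiated = [binding.get(v, v) for v in body.args]
    prefix = "".join(f"exists {v}: " for v in anonymous)
    return f"{prefix}{body.pred}({', '.join(instantiated)})"


# Law L: for all x, y: R(y, x) -> H(x); evidence: H(a).
LAW_BODY = Atom("R", ("y", "x"))
LAW_HEAD = Atom("H", ("x",))
conjecture = existential_abduction(LAW_BODY, LAW_HEAD, Atom("H", ("a",)))
print(conjecture)
```

The sketch stops at the existential conjecture; finding out which individual satisfies it (the murderer, in the criminal case) would be a further, fully instantiated fact abduction.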
Fig. 7.2 Historical-fact abduction (the bold arrow indicates the route of the abduction process)

7.3.4 Logical and Computational Aspects of Factual Abduction

If one’s background knowledge does not contain general theories but just a finite set of (causal) implicational laws, then the set of possible abductive conjectures is finite and can be generated by backward-chaining inference procedures. In this form, abductive inference has frequently been studied in AI research [7.26, 27]. Given is a knowledge base $K = \langle L[x], F[a] \rangle$ in the form of a finite set $L[x]$ of monadic implicational laws going from conjunctions of open literals to literals, and a finite set $F[a]$ of facts (closed literals) about the individual case $a$. (A literal is an atomic formula or its negation.) Given is moreover a certain goal, which is a (possibly conjunctive) fact $Ga$ that needs to be explained. One is not interested just in any hypotheses that (if true) would explain the goal $Ga$ given $K$, but only in those hypotheses that are not further potentially explainable in $K$ [7.28, p. 133], [7.29].

So formally, the candidates for abducible hypotheses are all closed literals $A[a]$ such that $A[a]$ is neither a fact in $F[a]$, nor the consequent (head) of a law, i. e., $A[a]$ cannot be further explained by other laws in $K$. The set of all possible abductive conjectures $A[a]$ for arbitrary abduction tasks in $K$ is called the set of abducibles $H[a]$. The abductive task for goal $Ga$ is then defined as follows: Find all possible explanations, i. e., all minimal sets $E[a]$ of singular statements about $a$ such that:

(i) $E[a] \subseteq F[a] \cup H[a]$

[...] tree [7.30, Ch. 13]. The labeled nodes of an and-or tree correspond to literals, unlabeled nodes represent conjunctions of them, and the directed edges (arrows) correspond to laws in $L[x]$. Arrows connected by an arc are and-connected; those without an arc are or-connected. Written as statements, the laws underlying Fig. 7.3 are $\forall x (Fx \to Gx)$, $\forall x (Hx \to Gx)$, $\forall x (Q_1x \wedge Q_2x \to Gx)$, $\forall x (R_1x \wedge R_2x \to Fx)$, $\forall x (Sx \to Hx)$, $\forall x (T_1x \wedge T_2x \to Hx)$, $\forall x (Ux \to Q_1x)$, $\forall x (Vx \to Q_2x)$. Besides the goal $Ga$, the only known fact is $T_1a$.

Algorithms for this sort of task have been implemented, for example, in the programming language Prolog in the form of backward-chaining with backtracking to all possible solutions.

The task of finding all possible explanations has exponential complexity and, thus, is intractable (that is, the time of this task increases exponentially in the number of data points and possible hypotheses). Only the task of finding some explanation has polynomial complexity and is tractable [7.26, Ch. 7, p. 165, Th. 7.1, 7.2]. Therefore it is crucial to constrain the search space by probabilistic (or plausibilistic) evaluation methods. A simple heuristic strategy is the best-first search: for each or-node one processes only that successor that has the highest plausibility value (among all successors of this node). The route of a best-first abduction search is depicted in Fig. 7.3 by the bold arrow.

Fig. 7.3 Search space for a factual abduction problem (and-or tree with root $Ga$ and successor nodes $Fa$, $Ha$, $Q_1a$, $Q_2a$). + indicates a known fact; possible abductive hypotheses are marked separately. The numbers are probability values (they do not add up to 1 because of an unknown residual probability). The bold arrow indicates the route of a best-first search, which leads to the abductive conjecture $T_2a$

A related but more general framework for factual abductions via backward reasoning is abduction within Beth tableaux [7.15], [7.7, Ch. 4]. Even more general frameworks are Gabbay’s labeled deductive systems [7.31, Part III] and abduction within epistemic dynamic logic [7.32]. An alternative approach for computing abductive hypotheses is abductive reasoning in the framework of adaptive logic [7.33, 34]. Here one infers singular explanatory hypotheses by backward chaining defeasibly and excludes them as soon as contradicting abnormality statements turn out to be derivable in later stages of the proof. As a result, one doesn’t compute all possible minimal explanations (consistent with the knowledge base) but only those that are undefeated.

Besides probabilistic elimination, the second major technique of constraining the search space is intermediate information acquisition: not only the ultimately abducted conjectures, but also intermediate conjectures (nodes) along the chosen search path can be set out for further empirical test – or in the framework of Hintikka et al. [7.15], they may stimulate further interrogative inquiry [7.35, Ch. 6]. As an example, consider again a criminal case: If backward reasoning leads to the possibility that the butler could have been the murderer, and along an independent path, that the murderer must have been left handed, then before continuing the abductive reasoning procedure one had better first find out whether the butler is indeed left handed. There are also some AI abduction systems that incorporate question-asking modules. For example, the RED system, designed for the purpose of red-cell antibody identification based on antigen-reactions of patient serum, asks intermediate questions of a database [7.26, pp. 72].
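
The backward-chaining search described in this section can be sketched in Python for the eight laws listed above, with $T_1a$ as the only known fact. Several things here are assumptions made for illustration rather than part of the text: an explanation is taken to be a set of abducibles that, together with the facts and laws, derives the goal; the laws are assumed to be acyclic; and the plausibility weights attached to the laws are hypothetical, chosen only so that the best-first route ends in $T_2a$ as the caption of Fig. 7.3 states.

```python
from itertools import product

# Laws as (body, head) pairs about the individual a; names follow the text.
LAWS = [
    (("F",), "G"), (("H",), "G"), (("Q1", "Q2"), "G"),
    (("R1", "R2"), "F"), (("S",), "H"), (("T1", "T2"), "H"),
    (("U",), "Q1"), (("V",), "Q2"),
]
FACTS = {"T1"}                       # the only known fact besides the goal
HEADS = {head for _, head in LAWS}   # literals that laws can still explain

# Hypothetical plausibility values per law (an assumption for illustration).
PLAUSIBILITY = {
    (("F",), "G"): 0.3, (("H",), "G"): 0.4, (("Q1", "Q2"), "G"): 0.2,
    (("S",), "H"): 0.3, (("T1", "T2"), "H"): 0.5,
}


def explanations(goal):
    """All sets of abducibles from which the goal is derivable (acyclic laws)."""
    if goal in FACTS:
        return [frozenset()]          # already known, nothing to assume
    if goal not in HEADS:
        return [frozenset({goal})]    # abducible: not explainable further
    found = []
    for body, head in LAWS:           # or-branching over applicable laws
        if head != goal:
            continue
        parts = [explanations(b) for b in body]   # and-branching over the body
        for combo in product(*parts):
            found.append(frozenset().union(*combo))
    return found


def minimal(sets):
    """Keep only those sets that have no proper subset among the candidates."""
    unique = set(sets)
    return [s for s in unique if not any(t < s for t in unique)]


def best_first(goal):
    """Follow only the most plausible law at each or-node."""
    if goal in FACTS:
        return frozenset()
    if goal not in HEADS:
        return frozenset({goal})
    candidates = [law for law in LAWS if law[1] == goal]
    body, _ = max(candidates, key=lambda law: PLAUSIBILITY.get(law, 1.0))
    return frozenset().union(*(best_first(b) for b in body))


all_explanations = minimal(explanations("G"))
print(sorted(sorted(e) for e in all_explanations))
print(sorted(best_first("G")))
```

The exhaustive function enumerates every branch, which is where the exponential cost arises in larger knowledge bases, whereas `best_first` visits one successor per or-node and so returns a single, not necessarily correct, conjecture.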
tively formulated. The abductive task consists in finding nomenon to be explained. If such a theoretical model
theoretical (initial and boundary) conditions that de- is found, this is usually celebrated as a great scientific
scribe the causes of the phenomenon in the theoretical success.
language and that allow the mathematical derivation of the phenomenon from the theory. Halonen and Hintikka [7.36] argue that this task makes up the essential point of the scientist’s explanatory activity. Formally, these theoretical conditions are expressed by factual or law-like statements, but their semantic content corresponds to what one typically calls a theoretical model for a particular kind of phenomenon within a given theory, whence I speak of theoretical-model abduction. Note also that with my notion of a model I do not imply a particular kind of formalization of models: they can be represented by statements or by set-theoretical models [7.37, p. 109]. A general translation between sentential and model-theoretic theory representations is developed in Schurz [7.38].
As an example, consider Archimedes’ theoretical model of the phenomenon of buoyancy. Here one searches for a theoretical explanation of the fact that certain substances like stones and metals sink in water while others like wood and ice float on water, solely in terms of mechanical and gravitational effects. Archimedes’ ingenious abductive conjecture was that the amount of water that is supplanted by the floating or sinking body tends to lift the body upwards, with a force $f_W$ that equals the weight of the supplanted water (Fig. 7.4). If this force is greater than the weight of the body ($f_B$) the body will float, otherwise it will sink. Since the volume of supplanted water equals the volume of the part of the body that is underwater, and since the weight is proportional to the mass of a body, it follows that the body will sink exactly if its density (mass per volume) is greater than the density of water.
The situation of theoretical-model abduction is rather different from the situation of factual abductions: one does not face here the problem of a huge multitude of possible theoretical models or conjectures, since the given theory constrains the space of possible causes to a small class of basic parameters (or generalized forces) by which the theory models the domain of phenomena that it intends to explain. In the Archimedean case, the given theory presupposes that the ultimate causes can only be contact forces and gravitational forces – other ultimate causes such as intrinsic capacities of bodies
Theoretical-model abduction is the typical theoretical activity of normal science in the sense of Kuhn [7.39], that is, the activity of extending the application of a given theory core (or paradigm) to new cases, rather than changing a theory core or creating a new one. If the governing theory is classical physics, then examples of theoretical model abduction come in the hundreds, and physics text books are full of them. Examples are the theoretical models underlying:
1. The trajectories (paths) of rigid bodies in the constant gravitational field of the Earth (free fall, parabolic path of ballistic objects, gravitational pendulum, etc.)
2. The trajectories of cosmological objects in position-dependent gravitational fields (the elliptic orbits of planets, Kepler’s laws, the Moon’s orbit around the Earth, and the lunar tides, interplanet perturbations, etc.)
3. The behavior of solid, fluid or gaseous macroscopic objects viewed as systems of more-or-less coupled mechanical atoms (the modeling of pressure, friction, viscosity, the thermodynamic explanation of heat and temperature, etc.); and finally
4. The explanation of electromagnetic phenomena by incorporating electromagnetic forces into classical physics [7.40, Ch. 5.3].
While for all other kinds of abductions we can provide a general formal pattern and algorithm by which one can generate a most promising explanatory hypothesis, we cannot provide such a general pattern for theoretical model abduction because here all depends
Fig. 7.4 Theoretical conditions that allow the mechanical derivation of the law of buoyancy (figure labels: volume of supplanted water causes water level to rise, pushes body upwards; forces $f_W$ and $f_B$)
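The Archimedean model of buoyancy can be checked numerically. The following minimal Python sketch compares the weight of a fully submerged body, $f_B$, with the weight of the water it supplants, $f_W$, and confirms that the body sinks exactly if its density exceeds that of water. The function names and the numerical constants are illustrative assumptions, not part of the original text.

```python
G = 9.81            # gravitational acceleration in m/s^2 (assumed value)
RHO_WATER = 1000.0  # density of water in kg/m^3 (assumed value)


def weight_of_body(density, volume):
    """f_B: weight of a body with the given density (kg/m^3) and volume (m^3)."""
    return density * volume * G


def weight_of_supplanted_water(volume_submerged):
    """f_W: weight of the water supplanted by the submerged volume."""
    return RHO_WATER * volume_submerged * G


def sinks(density, volume=1.0):
    """A fully submerged body sinks if its weight exceeds the lifting force."""
    return weight_of_body(density, volume) > weight_of_supplanted_water(volume)


if __name__ == "__main__":
    # Illustrative densities in kg/m^3 (rough textbook values, assumed here).
    for name, rho in [("iron", 7874.0), ("granite", 2700.0), ("ice", 917.0), ("oak", 750.0)]:
        print(name, "sinks" if sinks(rho) else "floats")
```

Because the volume and the gravitational constant cancel on both sides of the comparison, the outcome depends on the density alone, which is the content of the derived law.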
160 Part B Theoretical and Cognitive Issues on Abduction and Scientific Inference
Theoretical-model abduction can also be found in higher and more special sciences than physics. In chemistry, the explanations of the atomic component ratios (the chemical gross formulae) by a three-dimensional molecular structure are the results of theoretical-model abductions; the given theory here is the periodic table plus Lewis’ octet rule for forming chemical bonds. A computational implementation is the automatic abduction system DENDRAL [7.42, pp. 234], which abducts the chemical structure of organic molecules, given their mass spectrum and their gross formula.
Theoretical model abductions also take place in evolutionary theory. The reconstruction of evolutionary trees of descendance from given phenotypic similarities is a typical abductive process. The basic evolution-theoretical premise here is that different biological species descend from common biological ancestors from which they have split apart by discriminative mutation and selection processes. The alternative abductive conjectures about trees of descendance can be evaluated by probability considerations. Assume three species $S_1$, $S_2$, and $S_3$, where both $S_1$ and $S_2$ but not $S_3$ have a new property F – in Sober’s example, $S_1$ = sparrows, $S_2$ = robins, $S_3$ = crocodiles, and F = having wings [7.43, pp. 174–176]. In this case, the tree of descendance $T_1$ where the common ancestor A first splits into $S_3$ and the common ancestor of $S_1$ and $S_2$, which has already F, requires only one mutation-driven change of non-F into F, while the alternative tree of descendance $T_2$ in which A first splits into $S_1$ and a common F-less ancestor of $S_2$ and $S_3$ requires two such mutations (Fig. 7.5). So probabilistically $T_1$ is favored as against $T_2$.
There are some well-known examples where closeness of species due to common descent does not go hand in hand with closeness in terms of phenotypic similarities. Examples of this sort are recognized because there are several independent kinds of evidence that the tree of descendance must simultaneously explain, in particular:
1. Phenotypic similarities
2. Molecular similarities
3. The fossil record [7.44, Ch. 17].
An example of qualitative-model abduction in the area of humanities is interpretation [7.31, Sect. 4.1]. The explananda of interpretations are the utterances, written text, or the behavior of given persons (speakers, authors, or agents). The abducted models are conjectures about the beliefs and intentions of the given persons. The general background theory is formed by certain parts of folk psychology, in particular the general premise of all rational explanations of actions, namely, that normally or ceteris paribus, persons act in a way that is suited to fulfill their goals given their beliefs about the given circumstances [7.45, Sect. 1]. More specific background assumptions are hermeneutic rationality presumptions [7.46], Grice’s maxims of communicative cooperation [7.47], and common contextual knowledge. The investigation of interpretation as an abductive process is also an important area in AI [7.48].
What all abduction schemata discussed so far have in common is that they are driven by known laws or theories, and hence, they work within a given conceptual space. In other words, the abduction schemata discussed so far cannot introduce new concepts. In the next section we turn to abduction schemata that can do this: Since their explanans postulate the existence of a new kind of property or relation, we call them second-order existential abductions.
Table 7.1
Abduction pattern of Newtonian particle mechanics:
Explanandum: A kinematical process involving (a) some moving particles whose position, velocity and acceleration at a variable time t is an empirical function of their initial conditions, and (b) certain objects defining constant boundary conditions (e.g., a rigid plane on which a ball is rolling, or a large object that exerts a gravitational force, or a spring with Hooke force, etc.)
==============================================================================
Generate the abducted conjecture as follows:
(i) Specify for each particle its mass and all non-negligible forces acting on it in dependence on the boundary conditions and on the particle’s position at the given time.
(ii) Insert these specifications into Newton’s second axiom (which says that for each particle x and time t, the sum of all forces on x at t equals the mass of x times the acceleration of x at t).
(iii) Try to solve the resulting system of differential equations.
(iv) Check whether the resulting time-dependent trajectories fit the empirical function mentioned in the explanandum; if yes, the conjecture is preliminarily confirmed; if no, then search for (perturbing) boundary conditions and/or forces that may have been overlooked.
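Steps (i) to (iv) of the abduction pattern of Table 7.1 can be illustrated with a small numerical sketch. The Python code below is only a one-dimensional illustration under stated assumptions: the conjectured forces are passed in as position-dependent functions, Newton’s second axiom is integrated numerically with a velocity-Verlet scheme (a choice made here, not prescribed by the text), and the result is compared with an empirical free-fall function. All names, the tolerance, and the sample values are hypothetical.

```python
def trajectory(mass, forces, x0, v0, dt, steps):
    """Steps (ii)-(iii): integrate m * a = sum of forces (velocity Verlet, 1-D)."""
    def accel(x):
        return sum(f(x) for f in forces) / mass

    xs = [x0]
    x, v, a = x0, v0, accel(x0)
    for _ in range(steps):
        x = x + v * dt + 0.5 * a * dt * dt
        a_new = accel(x)
        v = v + 0.5 * (a + a_new) * dt
        a = a_new
        xs.append(x)
    return xs


def fits(xs, empirical, dt, tol=1e-6):
    """Step (iv): does the computed trajectory fit the empirical function?"""
    return all(abs(x - empirical(i * dt)) <= tol for i, x in enumerate(xs))


if __name__ == "__main__":
    g, m, h0 = 9.81, 2.0, 10.0          # assumed sample values

    def observed(t):                     # stand-in for the empirical function
        return h0 - 0.5 * g * t * t

    # Step (i): conjecture that gravity is the only non-negligible force.
    with_gravity = trajectory(m, [lambda x: -m * g], h0, 0.0, 0.01, 100)
    print(fits(with_gravity, observed, 0.01))   # conjecture fits the data

    # A conjecture that overlooks gravity fails step (iv).
    no_forces = trajectory(m, [], h0, 0.0, 0.01, 100)
    print(fits(no_forces, observed, 0.01))
```

When the check in step (iv) fails, the pattern directs the search toward forces or boundary conditions that were left out of the conjecture, as in the second call.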
7.6 Second-Order Existential Abduction
The explanandum of a second-order existential abduction consists of one or several general empirical phenomena, or laws. What one abducts is an at least partly new property or kind of concept governed by an at least partly new theoretical law. Depending on whether the concept is merely partly or completely new, the abduction is driven by extrapolation, analogy, or by pure unification. We discuss these kinds of abductions in the following Sects. 7.6.1–7.7.4.
7.6.1 Micro-Part Abduction
In this most harmless case of second-order existential abduction, one abducts a hypothesis about the unobservable micro-parts of observable objects that obey the same laws as the macroscopic objects, in order to explain various observed empirical phenomena. The prototypical example is the atomic hypothesis that was already conjectured in antiquity by Leucippus and Democritus and was used to explain such phenomena as the dissolution of sugar in water. These philosophers have abducted a new natural kind term: atoms, which are the smallest parts of all macroscopic bodies, being too small to be observable, but otherwise obeying the same mechanical laws as macroscopic bodies. So what one does here is to extrapolate from macroscopic concepts and laws to the microscopic domain – whence we may also speak here of extrapolative abduction. In the natural sciences after Newton, the atomic hypothesis turned out to have enormous explanatory power. For example, Dalton’s atomic hypothesis had successfully explained Avogadro’s observation that equal volumes of gases contain the same number of gas particles. Dalton also postulated that all substances are composed of molecules built up from certain atoms in certain integer-valued ratios, in order to explain the laws of constant proportions in chemical reactions [7.49, pp. 259]. The different states of aggregation of substances (solid, fluid, and gaseous) are explained by different kinds of intermolecular distances and interactions. We conclude our list of examples here, although many more applications of the atomic hypothesis could be mentioned.
Extrapolative micro-part abductions differ from analogical abductions insofar as the atoms are not merely viewed as analogical to mechanical particles; they are literally taken as tiny mechanical particles (though too small to be observable). Nevertheless one may view extrapolative abductions as a pre-stage of analogical abductions, which we are going to discuss now.
7.6.2 Analogical Abduction
Here one abducts a partially new concept together with partially new laws that connect this concept with given (empirical) concepts, in order to explain the given law-like phenomenon. The concept is only partly new because it is analogical to familiar concepts, and this is the way in which this concept was discovered. So analogical abduction is driven by analogy. We first consider Thagard’s [7.24] example of sound waves.
Background knowledge: Laws of propagation and reflection of water waves.
Phenomenon to be explained: Propagation and reflection of sound
==============================================================================
Abductive conjecture: Sound consists of atmospheric waves in analogy to water waves.
According to Thagard [7.24, p. 67] analogical abduction results from a conceptual combination: the already possessed concepts of wave and sound are combined into the combined concept of a sound wave. I think that this early analysis of Thagard [7.24] is too simple. In my view, the crucial process that is involved in analogical abduction is a conceptual abstraction based on an isomorphic or homomorphic mapping. What is abducted by this analogy is not only the combined concept of sound wave, but at the same time the theoretical concept of a wave in abstracto (the later paper of Holyoak and Thagard [7.50] supports this view).
A clear analysis of analogy based on conceptual abstraction has been given by Gentner [7.51]. According
to Gentner’s analysis, an analogy is a partial isomorphic mapping m between two relational structures, the source structure $(D, (F_i : 1 \le i \le m), (R_i : 1 \le i \le n))$ and the target structure $(D^*, (F^*_i : 1 \le i \le m^*), (R^*_i : 1 \le i \le n^*))$, where the $F_i$ are monadic predicates and the $R_i$ are relations. Gentner argues convincingly [7.51, p. 158] that an analogical mapping preserves only the relations of the two structures (at least many of them, including second-order relations such as being-a-cause-of), while monadic properties are not preserved. This is what distinguishes an analogy from a literal similarity. For example, our solar system is literally similar to the star system X12 in the Andromeda galaxy, inasmuch as the X12 central star is bright and yellow like our sun, and surrounded by planets that are similar to our planets. Thus, our sun and the X12 star have many (monadic) properties in common. On the other hand, an atom (according to the Rutherford theory) is merely analogical to our solar system: the positively charged nucleus is surrounded by electrons just as the sun is surrounded by planets, being governed by a structurally similar force law. But concerning its monadic properties, the atomic nucleus is very different from the sun and the electrons are very different from the planets. Formally, then, an analogical mapping m maps a subset $D'$ of $D$ bijectively into a subset $D^{*\prime}$ of $D^*$, and many (but not necessarily all) relations $R_i$, with $i \in I \subseteq \{1, \dots, n\}$, into corresponding relations $R^*_{m(i)}$, such that the following holds: For all $a, b \in D'$ and $R_i$ with $i \in I$, $a R_i b$ iff $m(a) R^*_{m(i)} m(b)$, where iff stands short for if and only if. In this sense, the Rutherford-analogy maps sun into nucleus, planet into electron, gravitational attraction into electrical attraction, surrounding into surrounding, etc. It follows from the existence of such a partial isomorphic mapping that, for every explanatory law L expressed in terms of mapping-preserved relations that hold in the $D'$-restricted source structure, its starred counterpart $L^*$ will hold in the $D^{*\prime}$-restricted target structure. In this way, explanations can be transferred from the source to the target structure.
Every partial isomorphism gives rise to a conceptual abstraction by putting together just those parts of both structures that are isomorphically mapped into each other: the resulting structure $(D', (R_i : i \in I))$, which is determined up to isomorphism, is interpreted in an abstract system-theoretic sense. In this way, the abstract model of a central force system arises, with a central body, peripheral bodies, a centripetal and a centrifugal force [7.51, p. 160]. So, finding an abductive analogy consists in finding the theoretically essential features of the source structure that can be generalized to other domains, and this goes hand-in-hand with forming the corresponding conceptual abstraction. In our example, the analogical transfer of water waves to sound waves can only work if the theoretically essential features of (water) waves have been identified, namely, that waves are produced by coupled oscillations. The abductive conjecture of sound waves also stipulates that sound consists of coupled oscillations of the molecules of the air. Only after the theoretical model of sound waves has been formed, does a theoretical explanation of the propagation and reflection of sound waves become possible.
7.6.3 Hypothetical (Common) Cause Abduction
This is the most fundamental kind of conceptually creative abduction. The explanandum consists either (a) in one phenomenon or (b) in several mutually intercorrelated phenomena (properties or regularities). One abductively conjectures in case (a) that the phenomenon is the effect of one single hypothetical (unobservable) cause, and in case (b) that the phenomena are effects of one common hypothetical (unobservable) cause. I will argue that only case (b) constitutes a scientifically worthwhile abduction, while (a) is a case of pure speculation. In both cases, the abductive conjecture postulates a new unobservable entity (property or kind) together with new laws connecting it with the observable properties, without drawing on analogies to concepts with which one is already familiar. This kind of abduction does not presuppose any background knowledge except knowledge about those phenomena that are in need of explanation. What drives hypothetical-cause abduction is the search for explanatory unification, usually in terms of hidden or common causes – but later on, we will also meet cases where the unifying parameters have a merely instrumentalistic interpretation. Hypothetical (common) cause abduction is such a large family of abduction patterns that we treat it separately in the next section.
7.7.1 Speculative Abduction Versus Causal Unification: A Demarcation Criterion
Ockham’s razor is a broadly accepted maxim among IBE-theorists: an explanation of observed phenomena should postulate as few new theoretical (unobservable) entities or properties as possible [7.53, pp. 97–100]. Upon closer inspection this maxim turns out to be a gradual optimization criterion, for an explanation is better, the fewer theoretical (hidden) entities that it postulates, and the more phenomena it explains. However, by introducing sufficiently many hidden variables one can explain anything one wants. Where is the borderline between reasonably many and too many hidden variables postulated for the purpose of explanation? Based on Schurz [7.1, p. 219], [7.54, pp. 246], [7.40, pp. 112] I suggest the following demarcation criterion.
DC (Demarcation Criterion) for 2nd-Order Abduction: The introduction of one new theoretical variable (entity or property) merely for the purpose of explaining one phenomenon is always speculative and post facto, i.e., has no independently testable empirical consequences. Only if the postulated theoretical variable explains many intercorrelated but analytically independent phenomena, and in this sense yields a causal or explanatory unification, is it a legitimate scientific abduction that is independently testable and, hence, worthy of further investigation.
Let us explain the criterion (DC) by way of examples. The simplest kind of speculative abduction explains every particular phenomenon P by a special power who (or which) has caused this phenomenon. In what follows, read $\psi(\varphi)$ as some power of kind $\psi$ intends that $\varphi$ happens, where the formula $\varphi$ may either express a singular fact or an empirical regularity or law that is frequently clothed in the form of an empirical disposition. Accordingly, we have two kinds of speculative abductions – see Table 7.2.
Speculative abductions have been performed by our human ancestors since the earliest times. All sorts of unexpected events can be pseudo-explained by speculative fact-abductions. They do not achieve unification, because for every event (E) a special hypothetical wish of God ($\psi(E)$) has to be postulated [7.55, p. 86]. For the same reason, such pseudo-explanations are entirely post hoc and don’t entail any empirical predictions by which they could be independently tested.
Speculative law-abductions were especially common in the Middle Ages: every special healing capacity of a certain plant (etc.) was attributed to a special power that God had implanted in nature for human benefit. The example of the virtus dormitiva was ironically employed by Molière, and famous philosophers have used it as a paradigm example of a pseudo-explanation [7.56, Book 5, Ch. 7, Sect. 2], [7.57, Ch. 6, Sect. 2]. Speculative law-abductions violate Ockham’s principle since we have already a sufficient cause for the disposition to make one sleepy, namely the natural kind opium, so that the postulated power amounts to a redundant multiplication of causes. More formally, the schema does not offer unification because for every elementary empirical law one has to introduce two elementary hypothetical laws to explain it [7.55, p. 87]. For the same reason, the abductive conjecture has no predictive power that goes beyond the predictive power of the explained law.
I do not want to diminish the value of cognitive speculation by this analysis. Humans have an inborn instinct to search for causes [7.58, Ch. 3], and cognitive speculations are the predecessor of scientific inquiry. However, it was pointed out in Sect. 7.1 that the best available explanations are often not good enough to count as rationally acceptable. The above speculative abduction patterns can be regarded as the idling of our inborn explanatory search activities when applied to events for which a proper explanation is out of reach.
In contrast to these empty causal speculations, scientific common cause abductions have usually led to
Table 7.2
Speculative Fact-Abduction:
Explanandum E: $Ca$
Example: John got a cold.
==============================================================================
Conjecture H: $\psi(Ca) \wedge \forall\varphi(\psi(\varphi) \to \varphi)$
Example: God wanted John to get a cold, and whatever God wants, happens.
Speculative Law-Abduction:
Explanandum E: $\forall x(Ox \to Dx)$
Example: Opium has the disposition to make people sleepy (after consuming it).
==============================================================================
Conjecture H: $\forall x(Ox \to \psi(Dx)) \wedge \forall x(\psi(Dx) \to Dx)$
Example: Opium has a special power (a virtus dormitiva) that causes its disposition to make people sleepy.
genuine theoretical progress. The leading principle of causal unification is the following.
(CC) Causal Connection Principle: If two properties or kinds of events are probabilistically dependent, then they are causally connected in the sense that either one is a cause of the other (or vice versa), or both are effects of a common cause (where X is a cause of Y iff a directed path of cause-effect relations leads from X to Y).
The causal connection principle (CC) does not entail that every phenomenon must have a sufficient cause – it merely says that all correlations result from causal connections. This principle has been empirically corroborated in almost every area of science, in the sense that conjectured common causes have been identified in later stages of inquiry; the only known exception is quantum mechanics. Here we treat (CC) not as a dogma, but as a meta-theoretical principle that guides our causal abductions. (CC) is a consequence of the more general causal Markov condition, which is the fundamental axiom of the theory of causal nets [7.59, pp. 396], [7.40, p. 16], [7.60, pp. 29]. Schurz and Gebharter [7.61, Sect. 2] demonstrate that the causal Markov condition can itself be justified via an explanatory abduction inasmuch as it yields the best and only plausible explanation of two (in)stability properties of statistical correlations: screening-off and linking-up.
The way that the causal connection principle leads to common-cause abduction is as follows: Whenever we encounter several intercorrelated phenomena, and – for some reason or other – we can exclude that one causes the other(s), then (CC) requires that these phenomena must have some (unobservable) common cause that simultaneously explains all of them. The most important scientific example of this sort is common cause abduction from correlated dispositions: since dispositions cannot cause other dispositions, their correlations must have a common intrinsic cause.
7.7.2 Strict Common-Cause Abduction from Correlated Dispositions and the Discovery of New Natural Kinds
In this section I analyze common-cause abduction in a deductivistic setting, which is appropriate when the domain is ruled by strict laws. Probabilistic generalizations are treated in Sect. 7.7.3. Recall the schema of speculative law-abduction, where one disposition D occurring in one (natural) kind F was pseudo-explained by a causal power $\psi(D)$. In this case of a single disposition, the postulate of a causal power $\psi(D)$ that mediates between F and D is an unnecessary multiplication of causes. But in the typical case of a scientifically productive common-cause abduction, we have several (natural) kinds $F_1, \dots, F_n$ all of which have a set of characteristic dispositions $D_1, \dots, D_m$ in common – with the result that all these dispositions are correlated. Assuming that it is excluded that one disposition can cause another one, then by principle (CC) these correlated dispositions must be the common effects of a certain intrinsic structure that is present in all of the kinds $F_1, \dots, F_n$ as their common cause. For example, the following dispositional properties are common to certain substances such as iron, copper, tin, etc. (Fig. 7.6): a characteristic glossing, smooth surface, characteristic hardness, elasticity, ductility, high conductivity of heat and of electricity. Already before the era of modern chemistry, craftsmen had abducted that there exists a characteristic intrinsic property of substances that is the common cause of all these (more-or-less strictly) correlated dispositions, and they called it metallic character Mx. To be sure, the natural kind term metal of premodern chemistry was theoretically hardly understood. But the introduction of a new (theoretical) natural kind term is the first step in the development of a new research program. The next step was then to construct a theoretical model of the postulated kind metal, by which one can give an explanation of how the structure of a metal can cause all these correlated
dispositions at once. In combination with atomic (and molecular) hypotheses the abducted natural kind terms of chemistry became enormously fruitful. In modern chemistry, the molecular microstructure of metals is modeled as an electron band of densely layered energy levels among which the electrons can easily shift around [7.62, pp. 708].
Fig. 7.6 Common-cause abduction of the chemical kind term metal (figure labels: electronic energy band model; metal; characteristic glossing, smooth surface, hardness, elasticity, ductility (at high temperatures), high conductivity of electricity, high conductivity of heat, solvability in acid, ...)
The structural pattern of the example in Fig. 7.6 can be formalized as in Tab. 7.3. The abductive conjecture H logically implies the explanandum E and yields a unification of $n \cdot m$ empirical (elementary) laws by $n + m$ theoretical (elementary) laws, which is a polynomial reduction of elementary laws. At the same time, H generates novel empirical consequences by which it can be independently tested, as follows. H does not only postulate the theoretical property $\psi x$ to be a merely sufficient cause of the dispositions; it also assumes that these dispositions are an empirical indicator of the theoretical property $\psi x$ [7.54, p. 111]. This indicator relation may either be reconstructed in a strict form (as indicated in brackets $[\leftrightarrow]$), or in a weaker probabilistic form. In either case, if we know for some new kind F that it possesses some of the dispositions, then the abducted common-cause hypothesis predicts that F will also possess all the other dispositions. This is a novel (qualitatively new) prediction. For example, we can predict solubility in acid for a new kind of metallic ore, even if this ore has never been put into acid before. Novel predictions are a characteristic virtue of genuine scientific theories that go beyond simple empirically inductive generalizations [7.54, p. 112].
Having described the basic principles and mechanisms of common-cause abduction, we must add four important clarifications:
1. Instrumentalist versus realist interpretations: In common-cause abduction, a new theoretical property is postulated or discovered that accomplishes a unified explanation of correlated empirical phenomena. Even if the theory-core describing this property is independently confirmed in later experiments, there is no guarantee that the hypothesized theoretical property has realistic reference. Also a purely instrumentalistic interpretation of a theoretical concept is possible, as a useful means of unifying empirical phenomena (which is defended, for example, by van Fraassen [7.63]). However, the more empirically successful a theory becomes, the more plausible it is to assume that the theoretical concept producing this success actually does refer to something real [7.64, Sect. 6.2].
2. Addendum on dispositions: I understand dispositions as conditional (or functional) properties: That an object x has a disposition D means that whenever certain initial conditions C are (or would be) satisfied for x, then a certain reaction R of x will (or would) take place. This understanding of dispositions is in accordance with the received view [7.65, p. 44], [7.66]. Dispositional properties are contrasted with categorical properties, which are not defined in terms of conditional effects, but in terms of occurrent intrinsic structures or states (in the sense of Earman [7.67, p. 94]). Dispositional properties can have categorical properties such as molecular structures as their causal basis, but they are not identical with them. Although the received view is not uncontroversial [7.68, 69], it is strongly supported by the situation that underlies common-cause abduction (Fig. 7.7): Here different dispositions have the same molecular structure as their common cause; so they cannot be identical with this molecular structure.
Prior et al. [7.66, p. 255] argue that since dispositions are functional properties, they can only be the effects of (suitable) categorical causes, but cannot themselves act as causes. If this argument is not convincing enough, here is a more detailed argument showing why, at least normally, dispositions cannot cause each other. Assume $D_1$ and $D_2$ are two correlated dispositions, each of them being equivalent with an empirical regularity of the above-mentioned form: $D_i x \leftrightarrow_{\mathrm{def}} \forall t(C_i xt \to R_i xt)$ (for $i \in \{1, 2\}$, t for the time variable). The implication expresses a cause-effect relation. Now, if $D_1$ caused $D_2$, the only possible causal reconstruction would be to assume that $C_1$ causes $C_2$, which causes
Table 7.3
Common-cause abduction (abducted theoretical concept: $\psi$).
Explanandum E: All kinds $F_1, \dots, F_n$ have the dispositions $D_1, \dots, D_m$ in common.
$\forall i \in \{1, \dots, n\}\ \forall j \in \{1, \dots, m\}: \forall x(F_i x \to D_j x)$.
==============================================================================
Abductive conjecture H: All $F_1$s, ..., $F_n$s have a common intrinsic and structural property $\psi$ that is a cause and an indicator of all the dispositions $D_1, \dots, D_m$.
$\forall i \in \{1, \dots, n\}: \forall x(F_i x \to \psi x) \wedge \forall j \in \{1, \dots, m\}: \forall x(\psi x \to[\leftrightarrow]\ D_j x)$.
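The unification achieved by the pattern of Table 7.3 and its novel predictions can be made concrete with a short sketch. The Python code below counts the elementary laws on both sides (n times m empirical laws versus n plus m theoretical laws) and, under the strict indicator reading, predicts the remaining dispositions of a new kind from any observed one. The function names and the list of dispositions (taken from the metal example) are illustrative assumptions.

```python
METAL_DISPOSITIONS = frozenset({
    "characteristic glossing", "smooth surface", "hardness", "elasticity",
    "ductility", "conductivity of electricity", "conductivity of heat",
    "solubility in acid",
})


def law_counts(n_kinds, m_dispositions):
    """Elementary laws needed without and with the common-cause conjecture H."""
    empirical = n_kinds * m_dispositions      # one law 'F_i x -> D_j x' per pair
    theoretical = n_kinds + m_dispositions    # 'F_i x -> psi x' plus 'psi x -> D_j x'
    return empirical, theoretical


def predict_remaining(observed, dispositions=METAL_DISPOSITIONS):
    """Strict indicator reading: any observed disposition indicates psi,
    and psi implies every disposition, so the unobserved ones are predicted."""
    observed = set(observed)
    if observed & dispositions:
        return set(dispositions) - observed
    return set()


if __name__ == "__main__":
    print(law_counts(3, len(METAL_DISPOSITIONS)))       # iron, copper, tin
    print(sorted(predict_remaining({"hardness", "conductivity of heat"})))
```

By contrast, the speculative law-abduction of Table 7.2 needs two hypothetical laws for each empirical law, so no such reduction and no such prediction is obtained.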
$R_2$, which in turn causes $R_1$ ($C_1 \to C_2 \to R_2 \to R_1$). But normally this is impossible, since the initial conditions are freely manipulable and, hence, causally independent. For example, my decision to irradiate a substance with light (to test for its glossing) can in no way cause my decision to heat it (to test for its ductility).
A final remark: When I speak of a molecular structure as being the cause of a disposition, I understand the notion of cause in a more general sense than the narrow notion of event causation. This extended usage of cause is reducible to the notion of event causation as follows: A disposition Dx, being defined as the conditional property $\forall t(Cxt \to Rxt)$, is caused by a categorical property $\psi x$ iff each manifestation of the disposition’s reaction, Rxt, is caused by $\psi x$ together with the initial conditions Cxt, or formally, iff $\forall x \forall t(\psi x \wedge Cxt \to Rxt)$.
3. The precise explication of causal unification – many effects explained by one or just a few causes – presupposes formal ways of counting elementary phenomena, expressed by elementary statements. There are some technical difficulties involved with this. Solutions to this problem have been proposed in Schurz [7.70], Gemes [7.71], and more recently in Schurz and Weingartner [7.72]. The following explication is sufficient for our purpose: We represent every given belief system (or set of statements) K by the set of all those elementary statements S of the underlying language that are relevant consequences of K, in the sense that no predicate in S is replaceable by another arbitrary predicate (of the same place-number), salva validitate of the entailment $K \Vdash S$. Here, a statement S is called elementary iff S is not logically equivalent to a nonredundant conjunction of statements $S_1 \wedge \dots \wedge S_n$ each of which is shorter than S. Borrowing the terminology from Gemes [7.71], we call the elementary relevant consequences of K K’s content elements. It can be shown that every set
on a typical common-cause abduction [7.49, pp. 196]. The fundamental common-cause abduction of Newtonian physics was the abduction of the gravitational force as the common cause of the disposition of bodies on the Earth to fall and the disposition of the planets to move around the Sun in elliptic orbits. Here, Newton’s qualitative stipulation of the gravitational force as the counterbalance of the centrifugal force that acts on the circulating planets was his abductive step, while his quantitative calculation of the mathematical form of the gravitational law was a deduction from Kepler’s third law plus this abductive conjecture [7.73, p. 203]. Another example is the abduction of the goal(s) of a person as the common cause of her behavior under various conditions. Prendinger and Ishizuka [7.25, p. 324] have utilized common-cause abduction in automated web usage mining to infer the interests of Internet users based on their browsing activities. Kornmesser [7.74] shows that common-cause abduction was the leading principle in the development of the principles and parameters approach in theories of generative grammar.
Common-cause abduction can also be applied to ordinary, nondispositional properties or (kinds of) events that are correlated. However, in this case one has first to consider more parsimonious causal explanations that do not postulate an unobservable common cause but stipulate one of these events or properties to be the cause of the others. For example, if the three kinds of events F, G, and H (for example, eating a certain poison, having difficulties in breathing, and finally dying) are strictly correlated and always occur in the form of a temporal chain, then the most parsimonious conjecture is that these event types form a causal chain. Only in the special case where two (or several) correlated event types, say F and G, are strongly correlated, but we know that there cannot be a direct causal mechanism that connects them, is a common-cause abduction the most plausible conjecture. An example is the correlation of lightning and thunder: we know by induction from observation
of sentences is classically equivalent with the set that light does not produce sound, and hence, we con-
of its content elements; so no information is lost by jecture that there must be a common cause of both of
this representation [7.72, Lemma 7.2]. But note that phenomena.
our analysis of common-cause abduction does not We finally discuss our demarcation criterion (DC)
depend on this particular representation method; in the light of Bayesian confirmation theory. Accord-
it merely depends on the assumption that a natural ing to criterion (DC), a speculative hypothetical cause
method of decomposing the classical consequence explanation can never be regarded as confirmed by
class of a belief system K into a nonredundant set the evidence; only a common-cause explanation can.
of smallest content elements exists. A Bayesian would probably object that our demarca-
tion between single- and common-cause abduction is
Many more examples of common-cause abduction just a matter of degree [7.75, p. 141]. Recall from
in the natural sciences can be given. For example, Sect. 7.1 that a given piece of evidence E raises the
Glauber’s discovery of the central chemical concepts of probability of every hypothesis H that increases E’s
acids, bases, and salts in the 17th century was based probability (since by Bayes’ theorem, P.HjE/=P.H/ D
Patterns of Abductive Inference 7.7 Hypothetical (Common) Cause Abduction Continued 167
P.EjH/=P.E//. So according to standard Bayesian no- ence (with the exception of Haig [7.78], who shares my
tions of confirmation [7.76] the evidence E: John got view of factor analysis). In this section I want to show
a cold does indeed confirm the speculative post facto that factor analysis is a certain generalization of hypo-
speculation Hspec : God wanted John to get a cold – thetical common-cause abduction, although sometimes
just to a minor degree in comparison to the scientific it may be better interpreted in a purely instrumentalistic
hypothesis Hsci : John was infected by a virus. So it way. For this purpose, I assume that scientific concepts
seems that our demarcation criterion is in conflict with are represented as statistical random variables X; Y; : : : ,
Bayesian confirmation theory. each of which can take several values xi , yj . (A ran-
Part B | 7.7
Schurz [7.77] suggests a way of embedding (DC) dom variable X W D ! IR assigns to each individual d of
into the Bayesian perspective. In all cases of post facto the domain D a real-valued number X.d/; a dichotomic
explanations, the hypothesis H results from fitting a la- property Fx is coded by a binary variable XF with val-
tent (unobserved) first- or second-order variable X to ues 1 and 0.) The variables are assumed to be at least
the observed evidence E. So H entails the more gen- interval-scaled, and the statistical relations between the
eral background hypothesis 9XH.X/ in which the latent variables are assumed to be monotonic – the linearity
variable X is unfitted and existentially generalized. In assumption of factor analysis yields good approxima-
our example, 9XHspec .X/ says: There exists a God who tions only if these conditions are satisfied.
wants some X, and whatever God wants, happens. Hspec Let us start from the example of the previous
results from 9XHspec .X/ by replacing X by John got section, where we have n empirically measurable
a cold and omitting the existential quantifier. Note that and highly intercorrelated variables X1 ; : : : ; Xn , i. e.,
9XHspec .X/ is a content element of Hspec that transcends cor.Xi ; Xj / Dhigh for all 1
i; j
n. An example would
(is not contained in) the evidence. be the scores of test persons on n different intelligence
Schurz [7.77] argues that the probability-raising of tests. We assume that none of the variables screens
Hspec by E is a case of pseudo-confirmation and not of off the correlations between any other pair of variables
genuine confirmation, because the probability increase (i. e., cor.Xi ; Xj jXr / ¤ 0 for all r ¤ i; j), which makes it
of Hspec by E does not spread to Hspec ’s evidence- plausible that these n variables have a common cause,
transcending content element 9XHspec .X/. This follows distinct from each of the variables – a theoretical fac-
from the fact that 9XHspec .X/ can be fitted to every tor, call it F. In our example, F would be the theoretical
possible piece of evidence whatsoever. Therefore the concept of intelligence. Computationally, the abductive
probability of 9XHspec .X/ remains as low as it was conjecture asserts that for each 1
i
n, Xi is approxi-
before conditionalization, for arbitrarily many pieces mated by a linear function fi of F, fi .F.x// D ai F.x/, for
of evidence, whence the posterior probability of Hspec given individuals x in the domain D (since we assume
(which entails 9Xspec H.X/) also remains low. the variables Xi to be z-standardized, the linear function
Also note that the scientific hypothesis Hsci contains fi has no additive term Cbi ). The true Xi -values are scat-
a latent variable X that has been fitted to the evidence tered around values predicted by this linear function,
E. In our example the unfitted hypothesis 9XHsci .X/ fi .F/, by a remaining random dispersion si ; the square
says that every disease is caused by some pathogenic s2i is the remainder variance. According to the stan-
agent X, which is fitted to John’s cold by replacing X dard linear regression technique, the optimally fitting
by a virus. In this case, however, Hsci implies further coefficients ai are computed so as to minimize this re-
empirical consequences E0 (e.g., that John’s immune mainder variance. Visually speaking, the Xi -values form
system will contain characteristic antibodies) by which a stretched cloud of points in an n-dimensional coordi-
it can independently tested. If such independent evi- nate system, and F is a straight line going through the
dence E0 raises the probability of Hsci as well, this is no middle of the cloud such that the squared normal devia-
longer the result of a post facto fitting, but an instance of tions of the points from the straight line are minimized.
genuine confirmation, because now this probability in- So far we have described the linear-regression
crease spreads to Hsci ’s evidence-transcending content statistics of the abduction of one factor or cause. In
element 9Xsci H.X/ [7.77, Sect. 4]. factor analysis one also takes into account that the mu-
tually intercorrelated variables may have not only one
7.7.3 Probabilistic Common-Cause but several common causes. For example, the variables
Abduction and Statistical Factor may divide into two subgroups with high correlations
Analysis within each subgroup, but low correlations between the
two subgroups. In such a case the reasonable abductive
Statistical factor analysis is an important branch of sta- conjecture is that there are two independent common
tistical methodology whose analysis (according to my causes F1 and F2 , each responsible for the variables
knowledge) has been neglected by philosophers of sci- in one of the two subgroups. In the general picture of
168 Part B Theoretical and Cognitive Issues on Abduction and Scientific Inference
factor analysis, there are n given empirical variables empirical variables by a small set of k theoretical vari-
Xi , which are explained by k < n theoretical factors (or ables, as explained in Sect. 7.7.1. While the amount of
common causes) Fj as follows (for the following [7.79, explained variance of the first factor is usually much
Ch. 3]; note that I can describe here only the most greater than 1, this amount becomes smaller and smaller
common method of factor analysis without discussing when one introduces more and more factors (in the triv-
subtle differences between different methods): ial limiting case k D n the amount of explained variance
becomes 100%). According to the Kaiser–Guttman cri-
X1 D a11 F1 C C a1k Fk C s1 terion one should introduce new factors only as long
Part B | 7.7
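The one-factor case described in this section can be illustrated numerically. The following is only a minimal sketch under stated assumptions, not the procedure of [7.79]: it assumes principal-component extraction from the correlation matrix (one common way of extracting factors), it uses simulated data with hypothetical loadings `a`, and it reads the Kaiser–Guttman criterion in its usual form, namely retaining those factors whose eigenvalue (explained variance, in units where each z-standardized variable has variance 1) exceeds 1. The function names `principal_factors` and `kaiser_guttman` are illustrative.

```python
import numpy as np


def principal_factors(X):
    """Eigenvalues (descending) and eigenvectors of the correlation matrix of X.

    X has one row per individual and one column per empirical variable X_i.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0)   # z-standardize each variable
    R = np.corrcoef(Z, rowvar=False)           # n x n correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)       # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]          # reorder to descending
    return eigvals[order], eigvecs[:, order]


def kaiser_guttman(eigvals):
    """Number of factors whose explained variance exceeds 1 (assumed reading)."""
    return int(np.sum(eigvals > 1.0))


# Simulated example: one latent common cause F drives five test scores.
rng = np.random.default_rng(0)
m = 2000                                       # number of individuals
F = rng.standard_normal(m)                     # latent factor values F(x)
a = np.array([0.9, 0.8, 0.8, 0.7, 0.7])        # hypothetical loadings a_i
noise = rng.standard_normal((m, a.size)) * np.sqrt(1 - a**2)  # dispersion s_i
X = F[:, None] * a + noise                     # X_i = a_i * F + s_i

eigvals, eigvecs = principal_factors(X)
k = kaiser_guttman(eigvals)

# The first eigenvector is the direction of the straight line through the
# point cloud that minimizes the squared normal deviations of the points.
first_axis = eigvecs[:, 0]
print("explained variance per factor:", np.round(eigvals, 2))
print("factors retained:", k)
```

With a single simulated common cause, only the first eigenvalue should exceed 1, so the criterion retains one factor; the eigenvalues always sum to the number of variables, which is why the explained variance reaches 100% in the limiting case $k = n$.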
relations have a common-cause explanation in terms of three-dimensional external objects by the laws of the perspectival projection. To be sure, these common-cause abductions are mainly unconscious and rely on inborn computations performed by the visual cortex of our brains. What we consciously experience are the abducted three-dimensional objects that make up the mind of the naive realist. However, certain situations – for example, the case of visual illusions caused by 3-D pictures – make it plain that what underlies our three-dimensional visual appearances is a complicated abductive computational process [7.80]. Moreover, since in our ordinary visual perceptions some objects partly conceal other objects that are behind them, our visual abductions always include the task of Gestalt complementation. Identification of three-dimensional objects based on two-dimensional projective images is an important abductive task in the AI field of visual object recognition [7.81, Ch. 24.4]. Scientifically advanced versions of visual abduction where one abducts the shape of entire objects from sparse fragments have been analyzed in the field of archeology [7.82].

The intersensory correlation between different sensory experiences, in particular between visual perceptions and tactile perceptions, is the second important basis for the unconscious abduction to an outer reality – in fact, these correlations seem even to be the major foundation of our naive belief in the outer reality. If you have a visual appearance of an object, but you are unsure whether it is a mere visual illusion or not, then you will probably go to the object and try to touch it – and if you can, then your realistic desires are satisfied. On the other hand, visual appearances that do not correspond to tactile ones, so-called ghosts, have frightened the naively realistic mind and occupied its fantasy since the earliest times.

This concludes my analysis of patterns of abduction. Instead of a conclusion, I refer to the classification of abduction patterns in Fig. 7.1, and to my main theses and results as explained in Sect. 7.1, which are densely supported by the details of my analysis. As a final conclusion, I propose the following: As Peirce once remarked [7.20, CP 6.500], the success of scientists at finding true hypotheses among myriads of possible hypotheses seems to be a sheer miracle. I think that this success becomes much less miraculous if one understands the strategic role of patterns of abduction. In the concluding section, I present applications of abductive reasoning in two neighboring fields: belief revision and instrumental or technological reasoning.
point of philosophy of science, Pagnucco's notion of abduction is too weak, since not just any sentence that logically entails $E$ constitutes a scientific explanation of $E$. Schurz [7.85] utilizes the abduction patterns classified in Fig. 7.1 to explicate corresponding operations of abductive belief expansion and revision. He defines abductive belief expansion as $K +\!+ E =_{\mathrm{def}} K \cup \{E\} \cup \mathrm{abd}(E, K)$, where the triple $(E, \mathrm{abd}(E, K), K)$ expresses one of the abduction situations classified in Fig. 7.1: $E$ is the evidence to be explained, $\mathrm{abd}(E, K)$ is the abductive conjecture, and $K$ is the given background belief system that drives the abduction [7.85, p. 93]. Next, Schurz [7.85, pp. 94–6] finds that an appropriate definition of abductive belief revision fails to meet the Levi identity for two reasons, which he calls the problems of old evidence and incremental belief revision. He defines abductive belief revision based on a suitable notion of the abductive revision of an explanatory hypothesis by contradicting evidence [7.85, p. 96]. Operations of intelligent abductive belief revision that are congenial in spirit have been implemented in computer programs by Bharathan and Josephson [7.87].

A related account of abductive belief revision has been developed by Cevolani [7.88]. Cevolani agrees with Schurz that the characterization of an explanation as any sentence that entails the explanandum is too weak. He suggests that the abductive hypothesis $S$ that in the given belief system $K$ explains $E$ should be such that the expected verisimilitude of the resulting belief system (obtained by abductive expansion, $K +\!+ E$, or by the corresponding abductive revision of $K$ by $E$) increases. Similar to Schurz, Cevolani observes that his so-defined notion of abductive belief revision fails to satisfy the Levi identity [7.88, p. 011].

Tomiyama et al. [7.89, 90] and Tuzet [7.91, p. 152] have proposed to extend the meaning of the notion of abduction so that it also includes technological or, more generally, any sort of instrumental reasoning.

In instrumental abduction, the proposition that the abductive hypothesis entails in the given background system is not yet true, but rather represents a goal, i. e., something that one wishes to realize, and the abductive hypothesis (conclusion) expresses a conjectured means to realize this goal. Referring to the patterns of abductions classified in a predecessor paper of Schurz [7.1], Tomiyama et al. [7.89] show how abductive reasoning can be applied to create the design of a refrigerator. For Tuzet [7.91, p. 152], the reasoning schema "We want E. If C then E. Therefore, we should try to bring about C" expresses the basic form of instrumental abduction, constrained by the restriction that $C$ should be practically appropriate: for example, $C$ must be practically realizable and must not have unwanted side-effects (etc.).

Generally speaking, instrumental reasoning often (though not always) proceeds by cognitive operations similar to those of abductive reasoning, the only difference being that the abductandum is not yet realized but expresses a goal. Therefore I propose to call this sort of reasoning instrumental abduction, provided one is aware that this notion extends the proper meaning of abduction to a different domain of application. (Note that Tuzet [7.91] speaks of epistemic versus projectual abduction; I prefer the designations explanatory versus instrumental abduction because they are more specific.) The distinction between explanatory and instrumental abduction is summarized in Fig. 7.7.

In Fig. 7.7 the superordinate concept that covers explanatory as well as instrumental abduction is called generalized abduction. What is common to both forms of abduction is a process that searches for missing premises $P$ for a given conclusion $C$. Gabbay and Woods [7.17, p. 191] and Woods [7.92, p. 153] go so far as to call every cognitive process an abduction so long as it generates a premise $P$ from which a given sentence $C$ can be derived or obtained in the given background system $K$ (thereby the authors generalize the notion of consequence to an arbitrary closure relation $R$). According to this view, a process that confirms $C$, or that predicts $C$ (via finding suitable premises), would also be called an abduction. From the viewpoint of philosophy of science, I am inclined to think that this generalization overstretches the notion of abduction. On the other hand, from a logical viewpoint it seems to make sense to call any process that searches for premises in order to infer a given inference goal an abduction in the logically generalized sense.

Generalized abduction:
C: Abductandum, conclusion to be inferred
K: System of background beliefs
==========================================
P: Abductans, abducted conjecture such that {P} ∪ K entails C

Acknowledgments. For valuable help I am indebted to Ilkka Niiniluoto, Theo Kuipers, Gerhard Brewka, Gustavo Cevolani, Lorenzo Magnani, Helmut Prendinger, Tetsuo Tomiyama, and Erik Olsson.
References
7.1 G. Schurz: Patterns of abduction, Synthese 164, 201–234 (2008)
7.2 C.S. Peirce: Deduction, induction, and hypothesis. In: Elements of Logic, Collected Papers of Charles Sanders Peirce, Vol. 2, ed. by C. Hartshorne, P. Weiss (Harvard Univ. Press, Cambridge 1932) (2.619–2.644)
7.3 C.S. Peirce: Lectures on pragmatism. In: Pragmatism and Pragmaticism, Collected Papers of Charles Sanders Peirce, Vol. 5, ed. by C. Hartshorne, P. Weiss (Harvard Univ. Press, Cambridge 1934) (5.14–5.212)
7.4 J. Pollock: Contemporary Theories of Knowledge (Rowman & Littlefield, Maryland 1986)
7.5 J. Earman: Bayes or Bust? (MIT, Cambridge 1992)
7.6 J. Ladyman: Understanding Philosophy of Science (Routledge, London 2002)
7.7 A. Aliseda: Abductive Reasoning (Springer, Dordrecht 2006)
7.8 L. Magnani: Abduction, Reason, and Science (Kluwer, Dordrecht 2001)
7.9 L. Magnani: Abductive Cognition. The Epistemological and Eco-Cognitive Dimensions of Hypothetical Reasoning (Springer, Heidelberg, Berlin 2009)
7.10 G.H. Harman: The inference to the best explanation, Philos. Rev. 74, 173–228 (1965)
7.11 P. Lipton: Inference to the Best Explanation (Routledge, London 1991)
7.12 I. Niiniluoto: Defending abduction, Proc. Phil. Sci., Vol. 66 (1999) pp. S436–S451
7.13 E. Barnes: Inference to the loveliest explanation, Synthese 103, 251–277 (1995)
7.14 J. Hintikka: What is abduction? The fundamental problem of contemporary epistemology, Trans. Charles Sanders Peirce Soc. 34(3), 503–533 (1998)
7.15 J. Hintikka, I. Halonen, A. Mutanen: Interrogative logic as a general theory of reasoning. In: Handbook of Practical Reasoning, ed. by R.H. Johnson, J. Woods (Kluwer, Dordrecht 2000)
7.16 N.R. Hanson: Is there a logic of discovery? In: Current Issues in the Philosophy of Science, ed. by H. Feigl, G. Maxwell (Holt Rinehart Winston, New York 1961) pp. 20–35
7.17 D. Gabbay, J. Woods: Advice on abductive logic, Logic J. IGPL 14(1), 189–220 (2006)
7.18 T. Day, H. Kincaid: Putting inference to the best explanation in its place, Synthese 98, 271–295 (1994)
7.19 C.S. Peirce: Scientific Metaphysics, Collected Papers of Charles Sanders Peirce, Vol. 6, ed. by C. Hartshorne, P. Weiss (Harvard Univ. Press, Cambridge 1935)
7.20 C.S. Peirce: Pragmatism and Pragmaticism, Collected Papers of Charles Sanders Peirce, Vol. 5, ed. by C. Hartshorne, P. Weiss (Harvard Univ. Press, Cambridge 1934)
7.21 R.M. Chisholm: Theory of Knowledge (Prentice Hall, Englewood Cliffs, N.J 1966)
7.22 T.A. Sebeok, J. Umiker-Sebeok: You Know My Method. A Juxtaposition of Charles S. Peirce and Sherlock Holmes (Gaslight, Bloomington/Ind. 1980)
7.23 R.A. Fumerton: Induction and reasoning to the best explanation, Phil. Sci. 47, 589–600 (1980)
7.24 P. Thagard: Computational Philosophy of Science (MIT, Cambridge 1988)
7.25 H. Prendinger, M. Ishizuka: A creative abduction approach to scientific and knowledge discovery, Knowl. Based Syst. 18, 321–326 (2005)
7.26 J. Josephson, S. Josephson (Eds.): Abductive Inference (Cambridge Univ. Press, New York 1994)
7.27 P. Flach, A. Kakas (Eds.): Abduction and Induction (Kluwer, Dordrecht 2000)
7.28 G. Paul: Approaches to abductive reasoning, Artif. Intell. Rev. 7, 109–152 (1993)
7.29 L. Console, D.T. Dupre, P. Torasso: On the relationship between abduction and deduction, J. Logic Comput. 1(5), 661–690 (1991)
7.30 I. Bratko: Prolog Programming for Artificial Intelligence (Addison-Wesley, Reading/Mass 1986)
7.31 D. Gabbay, J. Woods: The Reach of Abduction: Insight and Trial (A Practical Logic of Cognitive Systems), Vol. 2 (North-Holland, Amsterdam 2005)
7.32 A. Nepomuceno, F. Soler-Toscano, F. Velasquez-Quesada: An epistemic and dynamic approach to abductive reasoning: Selecting the best explanation, Logic J. IGPL 21(6), 962–979 (2013)
7.33 J. Meheus, D. Batens: A formal logic for abductive reasoning, Logic J. IGPL 14, 221–236 (2006)
7.34 A. Aliseda, L. Leonides: Hypotheses testing in adaptive logics: An application to medical diagnosis, Logic J. IGPL 21(6), 915–930 (2013)
7.35 D. Walton: Abductive Reasoning (Univ. of Alabama Press, Tuscaloosa 2004)
7.36 I. Halonen, J. Hintikka: Towards a theory of the process of explanation, Synthese 143(1/2), 5–61 (2005)
7.37 L. Magnani: Multimodal abduction, Logic J. IGPL 14(1), 107–136 (2006)
7.38 G. Schurz: Criteria of theoreticity: Bridging statement and non statement view, Erkenntnis 79(8), 1521–1545 (2014)
7.39 T.S. Kuhn: The Structure of Scientific Revolutions (Chicago Univ. Press, Chicago 1962)
7.40 G. Schurz: Philosophy of Science: A Unified Approach (Routledge, New York 2013)
7.41 P. Kitcher: Explanatory unification, Phil. Sci. 48, 507–531 (1981)
7.42 B. Buchanan, G.L. Sutherland, E.A. Feigenbaum: Heuristic dendral – A program for generating explanatory hypotheses in organic chemistry, Mach. Intell. 4, 209–254 (1969)
7.43 E. Sober: Philosophy of Biology (Westview, Boulder 1993)
7.44 M. Ridley: Evolution (Blackwell Scientific, Oxford 1993)
7.45 G. Schurz: What is normal? An evolution-theoretic foundation of normic laws, Phil. Sci. 28, 476–497 (2001)
7.46 D. Davidson: Inquiries into Truth and Interpretation (Oxford Univ. Press, Oxford 1984)
7.47 H.P. Grice: Logic and conversation. In: Syntax and Semantics, Vol. 3: Speech Acts, ed. by P. Cole, J. Morgan (Academic Press, New York 1975) pp. 41–58
7.48 J.R. Hobbs, M. Stickel, P. Martin, D. Edwarts: Interpretation as abduction, Artif. Intell. J. 63(1/2), 69–142 (1993)
7.49 P. Langley, H.A. Simon, G.L. Bradshaw, J.M. Zytkow: Scientific Discovery. Computational Explorations of the Creative Process (MIT Press, Cambridge 1987)
7.50 K. Holyoak, P. Thagard: Analogical mapping by constraint satisfaction, Cogn. Sci. 13, 295–355 (1989)
7.51 D. Gentner: Structure-mapping: A theoretical framework for analogy, Cogn. Sci. 7, 155–170 (1983)
7.52 W. Salmon: Scientific Explanation and the Causal Structure of the World (Princeton Univ. Press, Princeton 1984)
7.53 P.K. Moser: Knowledge and Evidence (Cambridge Univ. Press, Cambridge 1989)
7.54 G. Schurz: When empirical success implies theoretical reference, Br. J. Phil. Sci. 60(1), 101–133 (2009)
7.55 G.K.L. Schurz: Outline of a theory of scientific understanding, Synthese 101(1), 65–120 (1994)
7.56 J.St. Mill: System of Logic, 6th edn. (Parker Son Bourn, London 1865)
7.57 J. Ducasse: A Critical Examination of the Belief in a Life After Death (Charles Thomas, Springfield 1974)
7.58 D. Sperber, D. Premack, A. James Premack (Eds.): Causal Cognition – A Multidisciplinary Approach (Clarendon, Oxford 1995)
7.59 J. Pearl: Causality, 2nd edn. (Cambridge Univ. Press, Cambridge 2009)
7.60 P. Spirtes, C. Glymour, R. Scheines: Causation, Prediction, and Search, 2nd edn. (MIT Press, Cambridge 2000)
7.61 G. Schurz, A. Gebharter: Causality as a theoretical concept, Synthese 193(4), 1071–1103 (2014)
7.62 D.W. Oxtoby, H.P. Gillis, N.H. Nachtrieb: Principles of Modern Chemistry (Saunders College, Orlando 1999)
7.63 B. Van Fraassen: The Scientific Image (Clarendon, Oxford 1980)
7.64 T.A.F. Kuipers: From Instrumentalism to Constructive Realism (Kluwer, Dordrecht 2000)
7.65 A. Pap: Disposition concepts and extensional logic. In: Dispositions, ed. by R. Tuomela (Reidel, Dordrecht 1978) pp. 27–54
7.66 E.W. Prior, R. Pargetter, F. Jackson: Three theses about dispositions, Am. Philos. Q. 19, 1251–1257 (1982)
7.67 J. Earman: A Primer on Determinism (Reidel, Dordrecht 1986)
7.68 D.M. Armstrong: Dispositions as causes, Analysis 30, 23–26 (1969)
7.69 S. Mumford: Dispositions (Oxford Univ. Press, Oxford 1998)
7.70 G. Schurz: Relevant deduction, Erkenntnis 35, 391–437 (1991)
7.71 K. Gemes: Hypothetico-deductivism, content, and the natural axiomatization of theories, Phil. Sci. 54, 477–487 (1993)
7.72 G. Schurz, P. Weingartner: Zwart and Franssen's impossibility theorem, Synthese 172, 415–436 (2010)
7.73 C. Glymour: Theory and Evidence (Princeton Univ. Press, Princeton 1981)
7.74 S. Kornmesser: Model-based research programs, Conceptus 41(99/100), 135–187 (2014)
7.75 C. Howson, P. Urbach: Scientific Reasoning: The Bayesian Approach, 2nd edn. (Open Court, Chicago 1996)
7.76 B. Fitelson: The plurality of bayesian confirmation measures of confirmation, Proc. Phil. Sci., Vol. 66 (1999) pp. S362–S378
7.77 G. Schurz: Bayesian pseudo-confirmation, use-novelty, and genuine confirmation, Stud. Hist. Phil. Sci. 45, 87–96 (2014)
7.78 B. Haig: Exploratory factor analysis, theory generation, and scientific method, Multivar. Behav. Res. 40(3), 303–329 (2005)
7.79 P. Kline: An Easy Guide to Factor Analysis (Routledge, London 1994)
7.80 I. Rock: Perception (Scientific American Books, New York 1984)
7.81 S.J. Russell, P. Norvig: Artificial Intelligence (Prentice Hall, Englewood-Cliffs 1995)
7.82 C. Shelley: Visual abductive reasoning in archaeology, Phil. Sci. 63, 278–301 (1996)
7.83 C.E. Alchourrón, P. Gärdenfors, D. Makinson: On the logic of theory change, J. Symbolic Logic 50, 510–530 (1985)
7.84 P. Gärdenfors: Knowledge in Flux (MIT Press, Cambridge 1988)
7.85 G. Schurz: Abductive belief revision. In: Belief Revision Meets Philosophy of Science, ed. by E. Olsson, S. Enqvist (Springer, New York 2011) pp. 77–104
7.86 M. Pagnucco: The Role of Abductive Reasoning within the Process of Belief Revision, Dissertation (Univ. Sydney, Sydney 1996)
7.87 V. Bharathan, J.R. Josephson: Belief revision controlled by meta-abduction, Logic J. IGPL 14(1), 271–286 (2006)
7.88 G. Cevolani: Truth approximation via abductive belief change, Logic J. IGPL 21(6), 999–1016 (2013)
7.89 T. Tomiyama, H. Takeda, M. Yoshioka, Y. Shimomura: Abduction for creative design, Proc. ASME 2003 DETC (2003), paper no. DETC2003/DTM-48650, ASME (CD-ROM)
7.90 T. Tomiyama, P. Gu, Y. Jin, D. Lutters, Ch. Kind, F. Kimura: Design methodologies: Industrial and educational applications, CIRP Ann. – Manuf. Technol. 58(2), 543–565 (2009)
7.91 G. Tuzet: Projectual abduction, Logic J. IGPL 14(2), 151–160 (2006)
7.92 J. Woods: Cognitive economics and the logic of abduction, Rev. Symb. Logic 5(1), 148–161 (2012)
8. Forms of Abduction and an Inferential Taxonomy
Gerhard Minnameier
seriously Peirce's claim (1) that there are only three kinds of reasoning, that is, abduction, deduction, and induction, and (2) that these are mutually distinct. Therefore, the fundamental features of the three inferences are canvassed, in particular as regards inferential subprocesses and the validity of each kind of reasoning. It is also argued that forms of abduction have to be distinguished along two dimensions: one concerns levels of abstraction (from elementary embodied and perceptual levels to high-level scientific theorizing). The other concerns domains of reasoning such as explanatory, instrumental, and moral reasoning. Moreover, Peirce's notion of theorematic deduction is taken up and reconstructed as inverse deduction. Based on this, inverse abduction and inverse induction are introduced as complements of the ordinary forms. All in all, the contribution suggests a taxonomy of inferential reasoning, in which different forms of abduction (as well as deduction and induction) can be systematically accommodated. The chapter ends with a discussion on forms of abduction found in the current literature.

8.1.3 Abduction and Abstraction
8.2 The Logicality of Abduction, Deduction, and Induction
8.2.1 Inferential Subprocesses and Abduction as Inferential Reasoning
8.2.2 The Validity of Abduction, Deduction, and Induction
8.3 Inverse Inferences
8.3.1 Theorematic Deduction as Inverse Deduction
8.3.2 An Example for Theorematic Deduction
8.3.3 Inverse Abduction and Inverse Induction
8.4 Discussion of Two Important Distinctions Between Types of Abduction
8.4.1 Creative Versus Selective Abduction
8.4.2 Factual Versus Theoretical Abduction
8.4.3 Explanatory Versus Nonexplanatory Abduction
8.5 Conclusion
References
For Peirce, it is not the proverbial misfortune that comes in threes, but rather fortune, not least with respect to his inferential triad of abduction, deduction, and induction. On the one hand, they are thought to cover the whole process of scientific reasoning from problem statement to the final adoption of a hypothesis [8.1, CP 5.171 (1903)]. On the other hand, he claimed that there are but these three elementary types of inferences so that all kinds of reasoning must belong to either abduction, deduction, or induction [8.2, CP 8.209 (c. 1905)]. Moreover, and this may sound strange, he explains in the same place that even his earlier classification of inferences dating from 1867 can be understood in the same way, that is, in the sense of the mature Peirce's conception of the three inferences.

Peirce [8.2, CP 8.209 (c. 1905)]:

“I say that these three are the only elementary modes of reasoning there are. I am convinced of it both a priori and a posteriori. The a priori reasoning is contained in my paper in the Proceedings of the American Academy of Arts and Sciences for
Part B | 8
April 9, 1867. I will not repeat it. But I will mention that it turns in part upon the fact that induction is, as Aristotle says, the inference of the truth of the major premiss of a syllogism of which the minor premiss is made to be true and the conclusion is found to be true, while abduction is the inference of the truth of the minor premiss of a syllogism of which the major premiss is selected as known already to be true while the conclusion is found to be true. Abduction furnishes all our ideas concerning real things, beyond what are given in perception, but is mere conjecture, without probative force. Deduction is certain but relates only to ideal objects. Induction gives us the only approach to certainty concerning the real that we can have. In forty years diligent study of arguments, I have never found one which did not consist of those elements.”

This is puzzling, if one considers Peirce’s own discussion of his earlier conception in his later work where he states explicitly that [8.2, CP 8.221 (1910)]:

“in almost everything I printed before the beginning of this century I more or less mixed up Hypothesis and Induction (i.e., abduction and induction according to his later terminology, G.M.).”

Thus, if he is not contradicting himself, both statements must be true, however, each in a specific respect. This is one riddle I will try to solve in this chapter, but it is not the only one. I take it as one specific stumbling stone on the way to a full understanding of the very notion and logicality of abduction. In order to achieve a comprehensive account of abduction, however, it is also necessary to accommodate a whole host of different concepts of abduction that have been suggested in recent years. Magnani, for instance, not only distinguishes between creative and selective abduction [8.3], but also between sentential and model-based abduction, theoretical and manipulative abduction, explanatory and nonexplanatory abduction [8.4]. The latter distinction is drawn from Gabbay and Woods [8.5], who maintain that abduction be extended to cover not merely explanatory, but also nonexplanatory abduction, although they remain diffident, qualifying their differentiation “as a loose and contextually flexible distinction” [8.5, p. 115].

Another classification is proposed by Schurz [8.6] who distinguishes between factual abduction, law-abduction, theoretical-model-abduction, and second order existential-abduction, with the first and last being further divided into subclasses. Building on this classification and extending it, Hoffmann [8.7] produces a 3 × 5 matrix containing 15 types. Most importantly, he amends Schurz’s main categories by a form focusing on theoric transformations that generate or select a new system of representation. However, the idea of theoric transformations relates to Peirce’s distinction between theorematic and corollarial deduction, which raises the question of whether theoric transformations really belong to the realm of abductive reasoning (note that Hoffmann discusses Peirce’s analysis of Desargues’ theorem in [8.8, NEM III/2, 870–871 (1909)]). Here, another Peircean puzzle enters the scene, because he himself has claimed that theorematic deduction “is very plainly allied to retroduction (i.e., abduction, G.M.), from which it only differs as far as I now see in being indisputable” [8.9, MS 754 (1907)].

Thus, while there seem to be many different forms of abduction, it is unclear how many distinctive forms there really are. However, what is much more important is that the scientific community still seems to grapple with the very notion of abduction, that is, what are the central features of abduction as such or of its specific forms. Above, I started citing Peirce with his claim that there are only three basic and distinct kinds of inferences. However, apart from what has already been mentioned above, a persistent problem seems to be to distinguish between abduction and induction, inasmuch as inference to the best explanation (henceforth IBE) has to be understood as a form of induction in the Peircean sense. In [8.10], I have tried to disentangle abduction and IBE, and I have not been alone with this view [8.11]. However, Gabbay and Woods mention inference-to-the-best-explanation abductions [8.5, p. 44], and their schema for abduction [8.5, p. 47] seems to capture both abduction and IBE. Magnani [8.3, p. 19], [8.4, pp. 18–22] and Schurz [8.6, pp. 201–203] equally subsume IBE to abductive reasoning. I reckon that this has to do with similarities between their notion of selective abduction on the one hand and IBE on the other.

In my view, Peirce was right to claim that there are but three kinds of reasoning and that there are clear lines of demarcation between them. Accordingly, I think there is reason to tighten the Peirce-strings by integrating different forms of abduction (as well as deduction and induction) within a clear and coherent taxonomy.

In Sect. 8.1, I will first point out that abduction and IBE are distinct (Sect. 8.1.1), then show how abduction, deduction, and induction hang together to form a productive inferential cycle from a pragmatist point of view (Sect. 8.1.2), and finally explain how this productivity enables us to construct a hierarchy of conceptual levels (Sect. 8.1.3). Within this context, different forms of abductions can be distinguished in terms of the cognitive levels at which they are located, and in terms of whether new concepts are invented or existing ones applied.

In Sect. 8.2, I explicate the logicality of each of the three inferential types. This is done in two steps. First
Forms of Abduction and an Inferential Taxonomy 8.1 Abduction in the Overall Inferential Context 177
Part B | 8.1
(Sect. 8.2.1), the inferences will be analyzed in terms of three characteristic subprocesses that Peirce assumes for inferences in general, that is, (1) colligation, (2) observation, and (3) judgment ([8.12, CP 2.442–444 (c. 1893)], see also Kapitan [8.13, p. 479]). Next, the validity of each inference will be discussed in Sect. 8.2.2.

Based on this analysis, Peirce’s notion of theorematic reasoning is explored in Sect. 8.3. In Sect. 8.3.1, theorematic deduction is explicated as inverse deduction, leading from the result of corollarial deduction to the premise of corollarial deduction, that is, the theoretical point of view from which the result can be deduced. An instructive example is given in Sect. 8.3.2, and in Sect. 8.3.3 the idea of inverse inferences is extended to inverse abduction and induction. As a result, we end up with three ordinary and three inverse forms of pure inferential types (note that Peirce has also introduced analogy as a compound inference conjoining abduction and induction [8.14, CP 1.65]; see also [8.15] on this issue).

In Sect. 8.4, I will discuss three important distinctions among forms of abductive reasoning: creative versus selective abduction (Sect. 8.4.1), factual versus theoretical abduction (Sect. 8.4.2), and explanatory versus nonexplanatory abduction (Sect. 8.4.3). It turns out that abductions (and other inferences) are to be distinguished in terms of knowledge generation (creative) versus knowledge application (selective) and along two cognitive dimensions: one concerns levels of abstraction (from elementary embodied and perceptual levels to high-level scientific theorizing). The other concerns domains of reasoning such as explanatory, instrumental, and moral reasoning. In the concluding Sect. 8.5, the main results of my analysis are summarized and routes for further research indicated.

Although I consider my argumentation coherent and in line with Peirce, I do not claim to deliver an exegesis of what Peirce himself might have thought, especially since parts of my inferential taxonomy are clearly not contained in his works.
is the best explanation and ought to be adopted as true or likely to be true [8.18, CP 6.526–536 (1901)]. Within this elaboration, he is careful to make sure that the selective aspect of abduction is something different [8.18, CP 6.528 (1901)]:

“These distinctions (among forms of induction, G.M.) are perfectly clear in principle, which is all that is necessary, although it might sometimes be a nice question to say to which class a given inference belongs. It is to be remarked that, in pure abduction, it can never be justifiable to accept the hypothesis otherwise than as an interrogation. But as long as that condition is observed, no positive falsity is to be feared; and therefore the whole question

have to be considered anymore (however, this implies that by the same token all possible alternatives must, in fact, be refuted). Or if it is to be conceived as logical, then the hypothesis to be rejected at this stage has either to be conceived as abductively (here: explanatorily) invalid (see Sect. 8.2.2), or the rejection has to follow from an inductive evaluation of the competing hypotheses. From such an inductive evaluation it might follow that the hypothesis currently countenanced is well on the way of being proved in the above-quoted sense, in that it is better than a number of other hypotheses, although further testing or further reflection about novel approaches seems appropriate.

Anyhow, it has to be admitted that Peirce is impre-
tures. Steps 5 through 7 ought, to my mind, to be attributed to induction, in this case a rather tentative induction that Peirce has labeled “abductory induction” [8.18, CP 6.526 (c. 1901)], because it only qualifies a hypothesis H as better than possible alternative hypotheses, but not in the sense of a full-fledged inductive judgment to the truth of H. Further conditions S1, ..., Sn may exist in the form of background knowledge pertaining to Peirce’s criteria of simplicity or breadth (see above) or additional empirical evidence in favor of H. However, these further pieces of information clearly go beyond the abductive task; they may be produced by deductive reasoning about what certain hypotheses imply (because how do S1, ..., Sn become conscious?), and are finally considered in inductive reasoning. Therefore, I propose to repatriate steps 5 through 7 to the realm of induction, and to take very seriously the following statement [8.2, CP 8.218 (c. 1901)]:

“Nothing has so much contributed to present chaotic or erroneous ideas of the logic of science as failure to distinguish the essentially different characters of different elements of scientific reasoning; and one of the worst of these confusions, as well as one of the commonest, consists in regarding abduction and induction taken together (often mixed also with deduction) as a simple argument. Abduction and induction have, to be sure, this common feature, that both lead to the acceptance of a hypothesis because observed facts are such as would necessarily or probably result as consequences of that hypothesis. But for all that, they are the opposite poles of reason [...].”

Recently, McKaughan [8.20], Campos [8.22], and Mackonis [8.23] have argued in favor of a wide notion of IBE, including abduction, although they endorse the sharp distinction others and myself have made. However, in the light of the subtle, but nonetheless important, distinctions I have tried to highlight in this section, I think there is not much use fitting it all in one global concept of IBE.

8.1.2 The Dynamical Interaction of Abduction, Deduction, and Induction

By the end of the nineteenth century, Peirce rejected his original syllogistic approach and said “I was too much taken up in considering syllogistic forms [...], which I made more fundamental than they really are” [8.12, CP 2.102 (1902)]. However, even more to the point, Peirce realized that induction “never can originate any idea whatever. Nor can deduction. All the ideas of science come to it by the way of Abduction” [8.1, CP 5.145 (1903)]. The crucial point here is that induction can only operate with concepts that are already at hand. On top of this, even simple regularities like ∀x(Fx → Gx) do not suggest themselves, but have to be considered by an active mind, before they can be tested and eventually accepted or rejected (see Sect. 8.1.3, relating to Carnap’s disposition predicates). This is why the mature Peirce suggests that abduction is the process by which new concepts, laws, and theories are first conceived, before they are investigated further by deductive and inductive processes [8.1, CP 5.171 (1903)]:

“Abduction is the process of forming an explanatory hypothesis. It is the only logical operation which introduces any new idea; for induction does nothing but determine a value, and deduction merely evolves the necessary consequences of a pure hypothesis. Deduction proves that something must be; Induction shows that something actually is operative; Abduction merely suggests that something may be. Its only justification is that from its suggestion deduction can draw a prediction which can be tested by induction, and that, if we are ever to learn anything or to understand phenomena at all, it must be by abduction that this is to be brought about.”

Abduction is most important in our overall reasoning, because without it we could not possibly acquire any idea of the world, not even elementary perceptions of objects, let alone scientific theories. Hence, “no new truth can come from induction or from deduction” [8.2, CP 8.219 (c. 1901)]. Whereas abduction is very powerful in terms of the generation of fruitful new ideas, however, it is very weak in terms of empirical validity, as Peirce often stresses. He even says that his discovering the true nature of abduction so late was “owing to the extreme weakness of this kind of inference” [8.12, CP 2.102 (1902)]. Empirical validity is gained by deducing necessary consequences from the abduced hypotheses, especially predictions that can be tested empirically, and the inductive evaluation of the experimental results (or other suitable evidence). Figure 8.1 illustrates the dynamical interaction of the three inferential types.

So far, the role of abduction and deduction seems self-evident. However, an explanation should be given for the role of induction in this triad, in particular why it points back to where abduction starts in Fig. 8.1. After all, induction is typically understood as the inference to the truth (or falsity) of the theory in question, and as a consequence it should point back to the theory itself, as, for example, in Magnani’s ST-Model [8.4, p. 16]; see also [8.3, p. 23].

However, induction in the Peircean sense is tied back to his pragmatism, which, again, rests on the logic
of abduction (cf. also [8.15, pp. 207–212]). Peirce [8.1, CP 5.196 (1903)]:

Fig. 8.1 The dynamical interaction of abduction, deduction, and induction (diagram labels: Theory, Facts (t0), Consequences t1, 2, 3, ...; arrows: Abduction, Deduction, Induction)

man has some belief at the outset. This belief is, as to its principal constituent, a habit of expectation. Some experience which this habit leads him to expect turns out differently; and the emotion of surprise suddenly appears.”

Thus, when an accepted theory is subsequently applied to relevant cases, it is not only being applied, but also reassessed over and over. In this very sense, knowledge acquisition and knowledge application are fundamentally tied together and follow the same inferential principles. That is, every application of previously acquired knowledge has to be understood as:

1. Abducing from a certain situational configuration to
building fully explicit. I think it would be a fruitful endeavor to reconstruct how conceptual (or theoretical) levels are built onto one another by successive abductions. For instance, when a simple disposition is discovered (that sugar dissolves in water), this constitutes an empirical law, which is itself a concept of a regularity in nature. It can also be used to explain why someone else does not see the sugar in her drink. We could say that it is very well in there, but cannot be seen, because it has dissolved.

What this simple example shows is that even in simple fact-abduction, we do not just infer to the fact, but to the law from which it then follows that sugar might be in a liquid, even if no sugar can be seen. Thus, dispositional laws and simple action schemes (When the switch is pushed, the light will go on) constitute an elementary theory level, that is, regularities in terms of observation language. However, these regularities are themselves phenomena that one may wish to explain, especially when one starts wondering how the switch is causally connected with the light. At first, it was established that a natural regularity exists. Now, as the regularity is established as a matter of fact, it becomes the object of theoretical reflection and represents the fact to be explained.

This is what Hintikka highlights when he discusses the difference between abduction and IBE. He says that [8.32, p. 509]:

“when a dependence law telling us how the observed variable depends on the controlled one the law does not explain the result of the experiment. It is the result of the experiment, nature’s answer to the experimental investigator’s question.”

Ernan McMullin has made the same point concerning the role of laws in explanation: “Laws are the explananda; they are the questions, not the answers” [8.35, p. 90]. And he continues [8.35, p. 91]:

“To explain a law, one does not simply have recourse to a higher law from which the original law can be deduced. One calls instead upon a theory, using this term in a specific and restricted sense. Taking the observed regularity as effect, one seeks by abduction a causal hypothesis which will explain the regularity. To explain why a particular sort of thing acts in a particular way, one postulates an underlying structure of entities, processes, relationships, which would account for such a regularity. What is ampliative about this, what enables one to speak of this as a strong form of understanding, is that if successful, it opens up a domain that was previously unknown, or less known.”

I consider this a strong and important point (see also [8.36, 37] on explanatory hierarchies and explanatory coherence). Laws, in this view, are not the solutions (the explanations) but the problems (the facts to be explained). However, I would not go so far as to deny laws, even simple empirical laws, any explanatory function. It just depends on the point of view and the theoretical level, which is needed and appropriate to solve a particular explanatory problem. If one is looking for causal relationships between events, one is in fact searching for law-like explanations. And this not only applies to children in their early cognitive development. Most adults are content with knowing what keys to press in order to use certain functions of a software; in such cases the question is how a certain result is brought about, and the explanation consists in functional relations between the keys or menu options and the results visible on the screen. The same applies to cookbooks and manuals for technical appliances in which it is explained how things work or why something I tried did not work. I assume that, for example, Schurz’s account of explanation as unification applies not only to scientific theories, but also to such simple forms of explanation [8.10, 38].

Thus, there seems to be an order of theory-levels or levels of abstraction, where the higher ones explain the lower ones, and where abduction is the process that takes the reasoner from a lower level to a higher one. Such a hierarchy of levels may also be the clue to understanding how (intuitive) cognition works below explicit sentential reasoning and how the latter comes about in ontogenetic development.

Peirce famously argued that perceptual judgments are no abductions. However, he seems to have been too strict or narrow-minded in this context (see also [8.3, pp. 42–43], [8.4, pp. 268–276]). While he clearly admits that perceptual judgment “is plainly nothing but the extremest case of Abductive Judgements” [8.1, CP 5.185 (1903)] and that “abductive inference shades into perceptual judgment without any sharp line of demarcation between them” [8.1, CP 5.181 (1903)], he maintains that they are nonetheless distinct, because unlike abductive inferences, perceptual judgments were “absolutely beyond criticism” [8.1, CP 5.181 (1903)]. Peirce points out repeatedly that abduction as an inference requires control and that this is missing in perceptual judgment [8.1, CP 5.157, 181, 183, 194 (1903)]. He therefore holds that perceptual judgment is the “starting point or first premiss of all critical and controlled thinking” [8.1, CP 5.181 (1903)], hence something on which abduction is based, but which does not belong to abduction itself.

However, Peirce fails to consider two aspects of perceptual judgments: first, they might be conceivable as
established facts in the sense of a complete triad of abduction, deduction, and induction. This would explain why we are (normally) certain of our perceptions. Second, and more importantly, Peirce fails to consider that abductions are well-controlled, not by conscious thought, but by action. Perceptions can be understood as habits of action, that is, of categorization and behavior in accordance with what we perceive. And finally, the abductive or conjectural part of this process is that with every new perception the individual literally makes sense of what enters into the sensory system. Sometimes these creations are fallacious or even foolish, but this puts them fully in line with abduction in general. At least, this is what I suggest at this point, and it

Peirce explains [8.1, CP 5.534 (c. 1905)]:

“Even in this burlesque instance, this operation of hypostatic abstraction is not quite utterly futile. For it does say that there is some peculiarity in the opium to which the sleep must be due; and this is not suggested in merely saying that opium puts people to sleep.”

Elsewhere, he discusses the same idea, but speaks of subjectal abstraction as opposed to precisive abstraction (see also [8.42]).

Peirce [8.8, NEM III/2, p. 917 (1904)]:

“There are two entirely different things that are often confused from no cause that I can see except
made by Piaget and Garcia [8.43]). To date, research on cognitive architectures has primarily focused on the lower end of the cognitive hierarchy, i.e., how relatively simple conceptual and action schemata are built and grounded in the brain’s modal systems for perception, emotions, and actions [8.41, 44, 45]. However, since abduction is the process that leads to successively more abstract cognitions in the sense of hierarchical complexity, there is a promising route for further research and a systematic differentiation of types of abductions according to the cognitive levels to which they apply (as for moral cognition see [8.46] as an example).

Part B | 8.2

As already mentioned above, Peirce regarded abduction as an extremely weak kind of inference. This raises the question of whether it is an inference at all. On top of this, he says that abduction is “nothing but guessing” [8.19, CP 7.219 (1901)] and its results merely “the spontaneous conjectures of instinctive reason” [8.18, CP 6.475 (1908)]. However, abduction is also said to “cover all the operations by which theories and conceptions are engendered” [8.1, CP 5.590 (1903)], and since it takes us to novel concepts and theories, he cannot mean guesses in the ordinary sense of picking out something at random from a range of existing objects of choice. However, the question remains whether abduction is an inference or merely an instinct. In a way, both seem to be true [8.47], but for the present purpose it suffices to stress that abduction has an inferential aspect [8.32, 47, 48]. So, let us try to track this inferential aspect of abduction.

In this respect it may be instructive to consider Peirce’s thoughts on inference in general. On his view, all inferences are mental acts of reasoning and as such describe a process with a definite beginning and a definite end. Any inference begins with a question that requires an answer in the form of the respective conclusion. Abduction aims at possible explanations, deduction at necessary consequences following from certain premises, and induction aims at determining whether to accept or reject a hypothesis. Whatever the inference, however, the process of answering these questions contains three distinctive steps, which Peirce calls colligation, observation, and judgment.

Peirce [8.12, CP 2.442 (c. 1893)]:

“The first step of inference usually consists in bringing together certain propositions which we believe to be true, but which, supposing the inference to be a new one, we have hitherto not considered together, or not as united in the same way. This step is called colligation.”

so as to produce a new icon. [...] It thus appears that all knowledge comes to us by observation. A part is forced upon us from without and seems to result from Nature’s mind; a part comes from the depths of the mind as seen from within [...].”

Peirce [8.12, CP 2.444]:

“A few mental experiments – or even a single one [...] – satisfy the mind that the one icon would at all times involve the other, that is, suggest it in a special way [...] Hence the mind is not only led from believing the premiss to judge the conclusion true, but it further attaches to this judgment another – that every proposition like the premiss, that is having an icon like it, would involve, and compel acceptance of, a proposition related to it as the conclusion then drawn is related to that premiss.”

He concludes that “[t]he three steps of inference are, then, colligation, observation, and the judgment that what we observe in the colligated data follows a rule” [8.12, CP 2.444]. The step of colligation is consistently used and explained and thus seems to be rather clear [8.1, CP 5.163 (1903)], [8.1, CP 5.579 (1898)]. However, Peirce is less precise about the other two. In particular, his differentiation, in this context, between a plan and the steps of reasoning may cause some confusion [8.1, CP 5.158–166 (1903)]. As for the plan he says that [8.1, CP 5.162 (1903)]:

“we construct an icon of our hypothetical state of things and proceed to observe it. This observation leads us to suspect that something is true, which we may or may not be able to formulate with precision, and we proceed to inquire whether it is true or not.”

Thus, we observe what is colligated in the premise in order to produce a result. Even though this observation may be guided by strategies and other background knowledge, the result will first come about in a spontaneous act as the reasoner becomes conscious of it.
When discussing observation in the context of abduction, he goes on to a general description of observation that brings out this main feature very plainly [8.1, CP 5.581 (1898)]:

“And then comes an Observation. Not, however, an External observation of the objects as in Induction, nor yet an observation made upon the parts of a diagram, as in Deduction; but for all that just as truly an observation. For what is observation? What is experience? It is the enforced element in the history of our lives. It is that which we are constrained to be conscious of by an occult force residing in an object which we contemplate. The act of observation is the deliberate yielding of ourselves to that force ma-

Any abductively observed result that does not meet these criteria will have to be rejected. If the criteria are met, however, the hypothesis will have to be accepted as a valid abductive conclusion (see also my reflections in Sect. 8.1.1). From this point of view, we can now understand Peirce’s famous statement of the abductive inference [8.1, CP 5.189 (1903)]:

“The surprising fact, C, is observed;
But if A were true, C would be a matter of course,
Hence, there is reason to suspect that A is true.”

This only relates the final subprocess of abduction, the judgmental part. However, it is not to be confused with abduction as an inferential cognitive process as
Part B | 8.3
(ii) S, knowing how to use ˚ and knowing how to use with the same propositional intent, as a result of undergoing R or having prior knowledge that K entertains the proposition ˚ with that propositional intent as being factually true or false
(iii) ˚ is factually true
(iv) there exists a conjunction C of partial world state descriptions and probability spaces such that C & ˚ (C & R & K & ˚) & ˚ (C & ˚) & R & ˙ (R & ˚)
(v) as a result of undergoing R or K, S believes that ˚ [8.53, p. 405].

As a result, induction can be conceived in terms of an elaborate eliminative inductivism in the sense of Earman [8.54]. A theory is to be adopted, if all that has been observed so far supports it and that no alternative hypothesis is conceivable (at the current state of knowledge).

The results of my analysis are condensed in Fig. 8.2 (which is reduced to the essential features). Note that the diagram shows steps in the inferential processes. They are not to be misread as syllogisms.

explain C, that is, render the previously surprising phenomenon causally possible. As one hits on an idea, one has to make sure that H would really explain C. This is the abductive judgment H.

This result gained from abduction is then used as input for the following deduction, together with suitable premises P available from background knowledge. These are observed so as to generate necessary consequences, in particular empirical hypotheses E. The judgment ((H ∧ P) → E) states that E follows with necessity from (H ∧ P).

Again, the deductive conclusion is input into induction, where it is colligated with the actual experiment, which are then observed. This observation is more than just recording what happens; in fact, such recording would have to be understood as the main part of the inductive colligation. Observation in the context of induction means to look at these results (maybe at the time when they are actually produced) under the aspect of whether they confirm the tested hypothesis and disconfirm its rivals. If the final outcome is positive, H is accepted as causally necessary, hence H.
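The sequence of abductive, deductive, and inductive judgments described here can be illustrated with a small toy program. This is only a sketch under strong simplifying assumptions and not a formalization found in the chapter: hypotheses are modeled as simple functions from a test setting to an expected outcome, and the names `Hypothesis`, `abduce`, `deduce`, `induce`, as well as all data, are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Hypothesis:
    """A candidate explanation H; `predict` maps a test setting to an expected outcome."""
    name: str
    predict: Callable[[int], int]


def abduce(fact: Tuple[int, int], candidates: List[Hypothesis]) -> List[Hypothesis]:
    """Abductive judgment: keep each H under which the surprising fact C would be a matter of course."""
    setting, outcome = fact
    return [h for h in candidates if h.predict(setting) == outcome]


def deduce(h: Hypothesis, settings: List[int]) -> Dict[int, int]:
    """Deduction: from H and the premises P (here, the test settings) derive the consequences E."""
    return {s: h.predict(s) for s in settings}


def induce(h: Hypothesis, rivals: List[Hypothesis], results: Dict[int, int]) -> bool:
    """Induction: accept H only if every result confirms it and every rival is disconfirmed."""
    confirmed = all(h.predict(s) == o for s, o in results.items())
    rivals_refuted = all(
        any(r.predict(s) != o for s, o in results.items()) for r in rivals
    )
    return confirmed and rivals_refuted


# Hypothetical toy data: the surprising fact C is "setting 2 yields outcome 4".
candidates = [
    Hypothesis("double", lambda x: 2 * x),
    Hypothesis("square", lambda x: x * x),
    Hypothesis("plus_two", lambda x: x + 2),
    Hypothesis("triple", lambda x: 3 * x),
]
plausible = abduce((2, 4), candidates)      # hypotheses that would explain C
square = plausible[1]
predictions = deduce(square, [3, 4])        # testable consequences E of "square"
results = {3: 9, 4: 16}                     # made-up experimental outcomes
accepted = induce(square, [h for h in plausible if h is not square], results)
print([h.name for h in plausible], predictions, accepted)
```

In this sketch abduction only yields hypotheses that might be true, deduction adds nothing beyond what the chosen hypothesis already implies, and only the inductive step, which here also requires the rivals to fail, licenses acceptance. The elimination of rivals loosely mirrors the eliminative reading of induction mentioned in the text.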
orematic deduction and abduction and comes to the following result: “It is one thing to prove a theorem and another to formulate it” and continues that “it would make sense to describe the first task as theorematic deduction and the second task as abduction” [8.48, p. 294].

In view of this puzzlement concerning the proper understanding of theorematic deduction and its relation to abduction, my suggestion is not to subdivide theorematic deduction into abductive and deductive aspects, but to reconstruct theorematic deduction as a form of reasoning of its own, albeit similar to abduction in an important respect. As for the similarity between abduction and theorematic deduction, Peirce himself remarks

ial deduction to the premises from which the result can be deductively derived. The similarity with abduction results from the fact that theorematic deduction takes the reasoner to a theoretical point of view, which is the point in the above diagram on inferential reasoning (Fig. 8.1) where abduction would take her. Thus, abduction and theorematic deduction both aim at the same point (Fig. 8.3).

Within this frame of reference, it also becomes clear why Peirce thinks that theorematic deduction is ampliative. He just did not call it ampliative deduction, because he feared that this labeling would have been considered as unacceptable [8.55, NEM IV, 1 (1901)]:

“It now appears that there are two kinds of deductive
image of the premiss in order from the result of such experiment to make corollarial deductions to the truth of the conclusion.”
Hoffmann, too, stresses that theorematic reasoning (he uses the notion of theoric transformations) essentially consists in “looking at facts from a novel point of view” – a phrase taken from [8.60, MS 318] ([8.7, p. 581], [8.48, p. 291], [8.4, p. 181]). And the fact that attaining this novel point of view is first of all the result of observation that subsequently has to be subjected to a corollarial deduction as judgment within theorematic deduction may also explain the following passage. Peirce [8.61, NEM III, p. 869 (1909)]:
“To the Diagram of the truth of the Premisses something else has to be added, which is usually a mere May-be, and then the conclusion appears. I call this Theorematic reasoning because all the most important theorems are of this nature.”
Ketner [8.57, p. 408] refers to this passage to underpin his view that theorematic deduction is a kind of abduction. However, on my account theorematic deduction is a May-be, firstly, in the sense of introducing a theoretical point of view, and secondly, because it is spontaneously generated by observation and still has to be submitted to judgment. This is my reconstruction of theorematic deduction as inverse deduction. Further refinements might be necessary, in particular analyzing the variants that Levy discusses in [8.58, pp. 98–103]. However, this must be left to a separate analysis. Here, I prefer to provide an instructive example and extend the idea of inverse inferences to include inverse abduction and inverse induction.
8.3.2 An Example for Theorematic Deduction
In addition to Peirce’s examples like Desargues’ theorem (discussed in [8.7, pp. 581–584]), I suggest Leonhard Euler’s solution of the Königsberg bridge problem as a case in point. In Euler’s time, the river Pregel formed the topological shape shown in Fig. 8.4. The question is whether it is possible to pass all seven bridges on a walk while passing each bridge only once.
To solve this problem, Euler used a graph in which the state sequence is shown as transitions from region to region. Figure 8.5 shows what this looks like, if one starts in region A and passes the first five bridges in numerical order. Accordingly, the number of regions in this diagram will always be N + 1, where N is the number of all bridges. Moreover, with the five bridges connected to region A, this region is mentioned three times. A so-called uneven region, that is, one with an uneven number n of bridges, will always appear (n + 1)/2 times in the graph, independently from whether one starts in this very region or in another region. This is different for even regions. If we only consider regions A and B, there are only two bridges. If one starts in A, A is mentioned twice and B only once. If one starts in B, it is the other way round. In general, the region is mentioned n/2 times if one starts outside this region, and n/2 + 1 times if one starts from within.
Fig. 8.5 Graph of the state sequence in seven bridges problem: A B A C A D
However, all regions in the seven bridges problem are uneven so that the solution is rather simple. A walk on which one passes each bridge only once encompasses seven transitions between eight states. However, each region must appear (n + 1)/2 times in the diagram, which means three times for region A and two for regions B through D, that is, nine altogether. Hence, the desired walk is impossible.
This example shows that from an abstract topological point of view it is possible to formulate principles from which the impossibility of the specified walk can be deduced. The diagram in Fig. 8.4, together with the question, represents the colligated premise, which
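The counting argument of the seven bridges example can be checked mechanically. The following is only a minimal sketch: the bridge layout (two bridges A–B, two A–C, one each A–D, B–D, C–D) is assumed from the standard Königsberg graph and merely matches the counts given in the text (five bridges at A, three at each other region); the function names are hypothetical, and connectivity of the graph is taken for granted.

```python
from collections import Counter

# Assumed layout: region A touches five bridges, B, C and D three each.
BRIDGES = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
           ("A", "D"), ("B", "D"), ("C", "D")]


def required_mentions(bridges, start=None):
    """Return how often each region must appear in the state sequence."""
    degree = Counter()
    for u, v in bridges:
        degree[u] += 1
        degree[v] += 1
    mentions = {}
    for region, n in degree.items():
        if n % 2 == 1:
            # Uneven region: (n + 1) / 2 mentions, wherever one starts.
            mentions[region] = (n + 1) // 2
        else:
            # Even region: n / 2 mentions, plus one if the walk starts there.
            mentions[region] = n // 2 + (1 if region == start else 0)
    return mentions


def walk_is_possible(bridges, start=None):
    """A walk over N bridges shows exactly N + 1 states."""
    needed = sum(required_mentions(bridges, start).values())
    return needed == len(bridges) + 1


print(required_mentions(BRIDGES))   # A: 3, B: 2, C: 2, D: 2 (nine in total)
print(walk_is_possible(BRIDGES))    # False: nine mentions needed, eight available
```

For the two-region case discussed in the text, `required_mentions([("A", "B"), ("A", "B")], start="A")` yields two mentions for A and one for B, so that walk is possible.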
188 Part B Theoretical and Cognitive Issues on Abduction and Scientific Inference
Fig. 8.7a,b The cobweb model: (a) successive adjustment of supply and demand; (b) cyclical fluctuations of supply and demand
Based on the reconstruction of theorematic deduction as an inverse deduction, it follows naturally that there could be two other forms of reasoning: inverse abduction and inverse induction (Fig. 8.6). Moreover, since inverse (theorematic) deduction is similar to abduction in that it aims at the same point in Fig. 8.6, inverse abduction should be similar to induction, and inverse induction should be similar to deduction.
Inverse abduction starts from some theory or abstract concept and searches for examples, that is, possible instantiations. For instance, the economist Nicholas Kaldor [8.62] suggested the cobweb model, which explains how supply (S) and demand (D) develop if time lags are assumed for the reaction of the supply side to a change in demand and vice versa (see Fig. 8.7). If the supply curve is steeper than the demand curve, prices (P) and quantities (Q) will gradually converge to the equilibrium. However, if the slope is the same, supply and demand will fluctuate cyclically.
If it is asked what would be a case in point of such a persistently fluctuating supply and demand, this would require what I call inverse abduction. The theoretical model has to be understood on the abstract level, but it is unclear whether there is a concrete case at all to it. An example would be the so-called pork cycle that was observed in the 1920s in the United States and in Europe. Kaldor’s theoretical model provides a possible explanation for such phenomena, but in this case the argument runs in the opposite direction, from the theory to the case. The similarity to induction consists in the fact that inverse abduction projects a possible explanation onto a case (and all other relevant cases one might think of or encounter). The difference is that it is only a provisional projection (if the theory is true), whereas in induction factual truth in the pragmatist sense is established.
The difference can also be stated in this way: inverse abduction starts from the colligation of a theoretical model of some sort, which has a meaning but as yet no reference. This theoretical model is observed in order to be able to project it onto some case to which it refers (here is the similarity to induction). Finally, it has to be judged (abductively) whether the case that suggested itself can really be subsumed to the theoretical model.
Other examples for inverse abduction are riddles that we give children to solve, once they can use concepts independently from concrete references, for example, What has a face and two hands, but no arms or legs? A clock. The task is to find something concrete that satisfies this abstract definition. Again, the definition is first colligated, then observed in order to project it onto some concrete object, and the final part consists in the judgment as to whether the definition really applies to the object and whether this inference is thus abductively valid (in this case, as a possible circumscription of a clock).
Turning to inverse induction, this inference starts from the purported truth (or falsity) of a theory and tries to infer back to a crucial experiment that determines whether the theory would have to be accepted or rejected. This form of reasoning typically applies when two competing approaches stand against each other, in particular when both are well-confirmed
[Figure residue: diagram labels “Theory”, “Inverse abduction”, “Inverse deduction”]
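The two regimes of the cobweb model mentioned in the text (convergence versus persistent cycles) can be illustrated with a small simulation. This is a sketch under stated assumptions, not Kaldor’s own formulation: demand and supply are taken to be linear, D = a - b·P(t) and S = c + d·P(t-1), with supply reacting to the previous period’s price; all parameter names are hypothetical. In a diagram with P on the vertical axis, a supply curve that is steeper than the demand curve corresponds to d < b, and equal slopes correspond to d = b.

```python
def cobweb_prices(a, b, c, d, p0, steps):
    """Iterate market clearing: a - b * p_t = c + d * p_(t-1)."""
    prices = [p0]
    for _ in range(steps):
        prices.append((a - c - d * prices[-1]) / b)
    return prices


def equilibrium_price(a, b, c, d):
    """Price at which demand equals supply without any lag."""
    return (a - c) / (b + d)


# Supply steeper than demand (d < b): deviations shrink by d / b per step.
converging = cobweb_prices(a=10, b=2, c=2, d=1, p0=4, steps=40)

# Equal slopes (d == b): the price keeps alternating between two values.
cycling = cobweb_prices(a=10, b=1, c=2, d=1, p0=5, steps=6)

print(converging[-1], equilibrium_price(10, 2, 2, 1))
print(cycling)
```

In the second case the deviation from equilibrium changes sign each period without shrinking, which is the kind of persistent fluctuation for which the text then seeks a concrete instance such as the pork cycle.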
Part B | 8.4
established the validity of quantum theory and refuted Einstein in his attempt to save classical physics.
Inverse induction is similar to deduction because it essentially involves deductive steps to derive the decisive experimental conditions. However, as opposed to Peirce’s theorematic deduction, it does not prove anything, and unlike corollarial deduction it does not just derive what follows from a certain theory, but starts from competing theories and the intention to determine which one is true and which one false. From this colligated premise, the theories are observed in order to find the decisive experimental conditions, and the final judgment does not concern the deductive validity but whether the test would really be decisive.
Another, much simpler, example is the so-called Wason selection task [8.63], one of the most investigated experimental paradigms. This task consists in determining which two of four cards one has to turn over in order to know whether a certain rule is true or false. There are cards with a yellow or a red back, and on their front-sides they have a number, even or odd. Now, there are four cards showing (1) a 3, (2) an 8, (3) a red back, and (4) a yellow back. The rule says that if a card shows an even number on one face, then its opposite face is red. So, which two cards have to be turned over to see whether this rule is violated? The solution is the two cards showing the 8 and the yellow back.
Not even 10% get this right (at least in this rather formal context). The reason may be that they fail to see that they need to use modus ponens (even number → red back) and modus tollens (¬(red back) → ¬(even number)). However, the task is not just to use modus ponens and modus tollens correctly, and therefore it is not just about deduction, as is usually thought. Rather, the most important part of this reasoning task is to find out (through observation) that these deductive rules allow you to determine which two cards have to be turned over. Moreover, strictly speaking, there are also two competing hypotheses involved: the rule and its negation.
To sum up, all inverse inferences contain elements of their predecessors in the ordinary order, and these elements are important in the observational subprocesses. Here, inverse abduction relates to induction, inverse deduction to abduction, and inverse induction to deduction. However, the final judgments are abductive in inverse abduction, deductive in inverse deduction, and inductive in inverse induction.
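The solution of the selection task described above can be reproduced by asking, for each card, whether any possible hidden face could falsify the rule. The sketch below is only illustrative: the card encoding and the function names are hypothetical, and the numbers 3 and 8 are used as stand-ins for “odd” and “even” hidden faces.

```python
def violates(face_a, face_b):
    """The rule is broken by an even number paired with a non-red back."""
    card = dict([face_a, face_b])
    return card["number"] % 2 == 0 and card["back"] != "red"


def must_turn(visible):
    """A card must be turned if some hidden face could falsify the rule."""
    kind, _ = visible
    if kind == "number":
        hidden_options = [("back", "red"), ("back", "yellow")]
    else:
        # One odd and one even representative for the hidden number.
        hidden_options = [("number", 3), ("number", 8)]
    return any(violates(visible, hidden) for hidden in hidden_options)


CARDS = [("number", 3), ("number", 8), ("back", "red"), ("back", "yellow")]
to_turn = [card for card in CARDS if must_turn(card)]
print(to_turn)   # the 8 (modus ponens) and the yellow back (modus tollens)
```

The check for the 8 corresponds to the modus ponens direction and the check for the yellow back to the modus tollens direction; the 3 and the red back can never falsify the rule, whatever is on their other side.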
tive judgment by which stupid ideas are sorted out. The hypotheses among which to select have already passed this test. However, elsewhere Peirce makes clear that “the whole question of what one out of a number of possible hypotheses ought to be entertained becomes purely a question of economy” [8.18, CP 6.528 (1901)]. Hence, this aspect of selection concerns abduction only from a practical point of view, not from a logical one, as I have argued above in Sect. 8.1.1.
Turning to Magnani and Schurz, the latter writes [8.6, p. 202]:
“Following Magnani (2001, p. 20) I call abductions which introduce new concepts or models creative, in contrast to selective abductions whose task is to
3. And, if there are more than one abductively valid ideas, ranked in order of a priori plausibility, however, only for economical reasons.
To be sure, the latter aspect is clearly the least central one, since it is merely of practical importance. And it should be noted that Magnani does not attribute it to selective abduction when he writes: “Once hypotheses have been selected, they need to be ranked [...] so as to plan the evaluation phase by first testing a certain preferred hypothesis” [8.3, p. 73]. As also Peirce warns in [8.18, CP 6.525, see above], it should by no means be confused with inductive reasoning.
This reconstruction of selective abduction as the ab-
are two possible causes, r and s. However, the question is whether I abduce to r and s or to r → w and s → w, respectively. In my view, both are true in a certain way, which becomes clear if we distinguish inferential subprocesses.
Of course, as we look out of the window and wonder about w (colligation), either r or s or both spring to our minds (observation). However, since we are looking for an explanation of w, we are not interested in r or s as such, but whether w because of r (r → w) or whether w because of s (s → w). In other words, the law must be implicit in observing the fact, because the fact only makes sense as part of the law. What’s more, a spontaneous idea is no valid abduction (not yet). In order to abduce that r or that s we have to perform a judgment (explicitly or implicitly) of the type of Schurz’s schema. Thus, Schurz’s schema fleshes out the abductive judgment in the case of factual abduction. And even though r or s may be our spontaneous ideas they are engendered not as such, but as the antecedents of r → w and s → w, respectively.
This may all appear self-evident. However, since factual abduction is basically abduction to known laws and theories (rather than to facts pure and simple), we can unify Schurz’s subforms of factual abduction, namely observable-fact abduction, first-order existential abduction, and unobservable-fact abduction [8.6, pp. 27–210]. Moreover, it reveals that Schurz’s distinction between factual abduction, on the one hand, and law abduction, on the other hand, does not refer to entirely different forms of abductive inference. The only difference is that law abduction relates to the creative abduction of new laws, whereas factual abduction relates to selective abduction as the abductive step of the application of known laws. Schurz sanctions this view when he writes [8.6, p. 207]:
“In the setting of factual abduction, the problem consists in the combinatorial explosion of the search space of possible causes in the presence of a rich background store of laws but in the absence of a rich factual knowledge. Thus, factual abductions are primarily selective in the sense of Magnani.”
However, I see yet another problem with this description. It assumes that there is a multitude of possible hypotheses from which one or a few plausible ones have to be chosen. In the very same sense he explains that [8.6, p. 204]:
“in abduction problems we are often confronted with thousands of possible explanatory conjectures (or conclusions) – everyone in the village might be the murderer.”
To my mind, this misrepresents (factual) abduction. For on the one hand, if we take each of the village’s inhabitants as a hypothetical candidate for the murderer, and intend to boil down their number by some kind of inference, this would have to be induction. On the other hand, if the problem really is to reduce the search space, then we are not dealing with a multitude of conjectures as abductive solutions to some abductive problem (finding the murderer), but we are dealing with a problem. The fact that there is a multitude of possibilities changes the situation. The task is not simply to select one of those hypotheses, but to come up with a theory that explains the murder and identifies particular individuals as suspects.
The deeper truth is that instead of merely selecting we move to a higher level of reasoning, just in the sense that I have described in Sect. 8.1.3. The very first level, in the example of the murderer, is that one understands that the very concept of a murder implies that the victim has been killed by someone. Given that there are certain objective restrictions, not every human being can possibly have committed the crime, but just the set of the villagers. The next step is to move to the level of narratives in the sense of a coherent description of what might have happened. However, there might be still too many possibilities, or also none. Yet another step could consist in applying theoretical knowledge as professional profilers do.
As already expounded in Sect. 8.1.3, my suggestion is to reconstruct different forms of abduction in the dimension of theoretical abstraction. Since factual abduction comes out as applied law or theory abduction, there is no fundamental difference between factual and theoretical abduction. However, what should be distinguished systematically are cognitive levels in reasoning, down from elementary cognitive levels captured by forms like visual (or iconic) and manipulative abduction [8.3, 4], and up to high-level abductions like theoretical model abduction, common cause abduction [8.6], or trans-paradigmatic abduction [8.65]. Magnani, Schurz, Hoffmann, and others have done pioneering work explicating abductive inferences at both ends of concrete versus abstract cognition, a dimension which I prefer to call hierarchical complexity. However, the precise structures of hierarchical complexity have yet to be revealed (cf. Sect. 8.1.3, above).
One also has to be careful to distinguish forms that do not fit entirely in this order. This seems to apply, for example, to Schurz’s notions of (extrapolational) micropart abduction and analogical abduction [8.6, pp. 216–219]. The former consists, for example, in extrapolating from the behavior of observable macroobjects to assume that unobservable microparts like atoms behave in the same (or a similar) way. However, this is
equivalent to an analogical inference from macro to micro, and as such both do not indicate a certain level of abstraction or complexity, but, following also Peirce, are to be reconstructed as compound inferences (including an abductive and an inductive step to hit the abductive target), as I have tried to reveal in [8.15]. Moreover, Schurz’s concept of hypothetical (common) cause abduction [8.6, pp. 219–222], where he draws on the dormitive virtue example (see Sect. 8.1.3), is, to my mind, no valid form of abduction, since this kind of reasoning establishes a problem (Why does opium put people to sleep? or What does its dormitive virtue consist of?), not the solution. It yields the premise of an abductive inference, but not more.
planatory, but they can serve to fulfil some other kind of purpose.
As an example, Gabbay and Woods discuss Newton’s action-at-a-distance theorem, which was never conceived of as an explanatory hypothesis by Newton, since he thought that such an action was causally impossible [8.5, p. 116]. From that point of view, it is clear that in the explanatory context an action at a distance poses a problem, that is, that of explaining gravitation, not a solution (like Schurz’s hypothetical cause abduction discussed above). However, Gabbay and Woods point out that [8.5, pp. 118–119]:
“[t]he action-at-a-distance equation serves Newton’s theory in a wholly instrumental sense. It al-
problems and further background knowledge, the functioning of machines and appliances can be deductively derived, and the machines or prototypes so constructed are then evaluated in terms of effectiveness and efficiency.
Hence, there seems to exist (at least) a second kind of cognitive architecture parallel to the explanatory architecture (and, accordingly, Magnani [8.4, p. 71] is right to claim that Gabbay and Woods’ [8.5] notion of instrumental abduction is orthogonal to the forms he distinguishes). On the one hand, explanatory concepts and theories aim at true accounts, and truth is the evaluative criterion for induction. On the other hand, there are technological theories, which are inductively evaluated in terms of effectiveness and – as far as economic aspects are concerned – efficiency.
However, technological theories seem to be just one domain of reasoning among others to complement explanatory reasoning. At least moral concepts and ethical theories could be a third domain [8.26, pp. 90–101], [8.30], and they are evaluated neither in terms of truth nor effectiveness, but in terms of justice. I can only allude to these domains, here, and a separate paper will be necessary to expound these ideas. However, what seems obvious is that there are distinct realms of abduction and of reasoning in general.
Part B | 8
8.5 Conclusion
To sum up, I have argued (as Peirce did) that there are precisely three basic kinds of inferences: abduction, deduction, and induction. I have distinguished three inferential subprocesses and introduced three inverse types of inference, based on the analysis of inferential subprocesses. My claim is that all kinds of real reasoning ought to be reducible to one of these three basic forms, their inverse forms, or a particular subprocess within one inferential type. However, I also mentioned analogical reasoning as a special compound form of inferential reasoning and referred the reader to my [8.15].
Moreover, I have tried to point out that apart from these fundamental kinds of reasoning, inferences can be distinguished along two dimensions. One is the dimension of hierarchical complexity so that concepts and theories are built upon one another across cognitive levels, from elementary perception and action to high-level scientific theories. The other dimension, discussed in the previous section, is that of domains. By a domain I do not mean, in this context, issues of content to which one and the same theory is applied, but domains of reasoning. In this respect I distinguished explanatory, technological, and moral/ethical concepts and theories.
This framework opens up a taxonomical system that might be able to accommodate the various forms of reasoning in general, and of abduction in particular, that have been suggested so far. I have discussed a few of them, but by far not all. However, my hope is that this taxonomy allows us to account for a multitude of varieties of abduction, deduction, and induction, while recognizing them in their particular place and function in an overall system, and helps us to a distinctive understanding of similarities and differences between these individual forms.
References
8.1 C.S. Peirce: Pragmatism and Pragmaticism, Collected Papers of Charles Sanders Peirce, Vol. 5, ed. by C. Hartshorne, P. Weiss (Harvard Univ. Press, Cambridge 1934)
8.2 C.S. Peirce: Reviews, Correspondence, and Bibliography, Collected Papers of Charles Sanders Peirce, Vol. 8, ed. by A.W. Burks (Harvard Univ. Press, Cambridge 1958)
8.3 L. Magnani: Abduction, Reason, and Science: Processes of Discovery and Explanation (New York, Kluwer 2001)
8.4 L. Magnani: Abductive Cognition: The Epistemological and Eco-Cognitive Dimensions of Hypothetical Reasoning (Springer, Berlin 2009)
8.5 D.M. Gabbay, J. Woods: The Reach of Abduction – Insight and Trial, A Practical Logic of Cognitive Systems Ser., Vol. 2 (Elsevier, Amsterdam 2005)
8.6 G. Schurz: Patterns of abduction, Synthese 164, 201–234 (2008)
8.7 M.H.G. Hoffmann: Theoric transformations and a new classification of abductive inferences, Trans. C. S. Peirce Soc. 46, 570–590 (2010)
8.8 C.S. Peirce: Mathematical Miscellanea, The New Elements of Mathematics by Charles S. Peirce, Vol. III/2, ed. by C. Eisele (Mouton, The Hague, 1976)
8.9 C.S. Peirce: MS 754 (1907). In: Annotated Catalogue of the Papers of Charles S. Peirce, ed. by R.S. Robin (Univ. Massachusetts Press, Amherst 1967), available online (3 July 2016): http://www.iupui.edu/~peirce/robin/robin.htm
8.10 G. Minnameier: Peirce-Suit of truth: Why inference to the best explanation and abduction ought not to be confused, Erkenntnis 60, 75–105 (2004)
8.11 S. Paavola: Hansonian and Harmanian abduction as models of discovery, Int. Stud. Philos. Sci. 20, 93–108 (2006)
8.12 C.S. Peirce: Elements of Logic, Collected Papers of Charles Sanders Peirce, Vol. 2, ed. by C. Hartshorne, P. Weiss (Harvard Univ. Press, Cambridge 1932)
8.13 T. Kapitan: Peirce and the structure of abductive inference. In: Studies in the Logic of Charles Sanders Peirce, ed. by N. Houser, D.D. Roberts, J.V. Evra (Indiana Univ. Press, Bloomington 1997) pp. 477–496
8.14 C.S. Peirce: Principles of Philosophy, Collected Papers of Charles Sanders Peirce, Vol. 1, ed. by C. Hartshorne, P. Weiss (Harvard Univ. Press, Cambridge 1931)
8.15 G. Minnameier: Abduction, induction, and analogy – On the compound character of analogical inferences. In: Model-Based Reasoning in Science and Technology: Abduction, Logic, and Computational Discovery, ed. by W. Carnielli, L. Magnani, C. Pizzi (Springer, Heidelberg 2010) pp. 107–119
8.16 S. Psillos: An explorer upon untrodden ground: Peirce on abduction. In: Handbook of the History of Logic, Vol. 10, ed. by D.M. Gabbay, S. Hartmann, J. Woods (Elsevier, Amsterdam 2011) pp. 117–151
8.17 J. Hintikka: What is abduction? The fundamental problem of contemporary epistemology, Trans. C. S. Peirce Soc. 34, 503–533 (1998)
8.18 C.S. Peirce: Scientific Metaphysics, Collected Papers of Charles Sanders Peirce, Vol. 6, ed. by C. Hartshorne, P. Weiss (Harvard Univ. Press, Cambridge 1935)
8.19 C.S. Peirce: Science and Philosophy, Collected Papers of Charles Sanders Peirce, Vol. 7, ed. by A.W. Burks (Harvard Univ. Press, Cambridge 1958)
8.20 D.J. McKaughan: From ugly duckling to swan: C. S. Peirce, abduction, and the pursuit of scientific theories, Trans. C. S. Peirce Soc. 44, 446–468 (2008)
8.21 A. Aliseda: Abductive Reasoning – Logical Investigations into Discovery and Explanation (Springer, Dordrecht 2006)
8.22 D.G. Campos: On the distinction between Peirce’s abduction and Lipton’s inference to the best explanation, Synthese 180, 419–442 (2011)
8.23 A. Mackonis: Inference to the best explanation, coherence and other explanatory virtues, Synthese 190, 975–995 (2013)
8.24 C. Hookway: The Pragmatic Maxim: Essays on Peirce and Pragmatism (Oxford Univ. Press, Oxford 2012)
8.25 C. Hookway: Truth, reality, and convergence. In: The Cambridge Companion to Peirce, ed. by C. Misak (Cambridge Univ. Press, Cambridge 2004) pp. 127–149
8.26 C. Misak: Truth, Politics, Morality: Pragmatism and Deliberation (Routledge, London 2000)
8.27 I. Levi: Beware of syllogism: Statistical reasoning and conjecturing according to Peirce. In: The Cambridge Companion to Peirce, ed. by C. Misak (Cambridge Univ. Press, Cambridge 2004) pp. 257–286
8.28 G. Minnameier: What’s wrong with it? – Kinds and inferential mechanics of reasoning errors. In: Learning from Errors, ed. by J. Seifried, E. Wuttke (Verlag Barbara Budrich, Opladen 2012) pp. 13–29
8.29 G. Minnameier: Deontic and responsibility judgments: An inferential analysis. In: Handbook of Moral Motivation: Theories, Models, Applications, ed. by F. Oser, K. Heinrichs, T. Lovat (Sense, Rotterdam 2013) pp. 69–82
8.30 G. Minnameier: A cognitive approach to the ‘Happy Victimiser’, J. Moral Educ. 41, 491–508 (2012)
8.31 C.S. Peirce: MS 692 (1901). In: Annotated Catalogue of the Papers of Charles S. Peirce, ed. by R.S. Robin (Univ. Massachusetts Press, Amherst 1967), available online (3 July 2016): http://www.iupui.edu/~peirce/robin/robin.htm
8.32 N.R. Hanson: Patterns of Discovery (Univ. of Cambridge Press, Cambridge 1958)
8.33 R. Carnap: Testability and meaning, Philos. Sci. 3, 419–471 (1936)
8.34 R. Carnap: Testability and meaning, Philos. Sci. 4, 1–40 (1937)
8.35 E. McMullin: The Inference that Makes Science (Marquette Univ. Press, Milwaukee 1992)
8.36 P. Thagard: Coherence, truth, and the development of scientific knowledge, Philos. Sci. 74, 28–47 (2007)
8.37 T.A.F. Kuipers: Laws, theories, and research programs. In: Handbook of the Philosophy of Science: General Philosophy of Science – Focal Issues, ed. by T.A.F. Kuipers (Elsevier, Amsterdam 2007) pp. 1–95
8.38 G. Schurz: Explanation as unification, Synthese 120, 95–114 (1999)
8.39 W. Park: How to learn abduction from animals? – From Avicenna to Magnani. In: Model-Based Reasoning in Science and Technology – Theoretical and Cognitive Issues, ed. by L. Magnani (Springer, Berlin 2014) pp. 207–220
8.40 C. El Khachab: The logical goodness of abduction in C.S. Peirce’s thought, Trans. C. S. Peirce Soc. 49, 157–177 (2013)
8.41 L.W. Barsalou: Grounded cognition, Annu. Rev. Psychol. 59, 617–645 (2008)
8.42 J.J. Zeman: Peirce on abstraction. In: The Relevance of Charles Peirce, ed. by E. Freeman (The Hegeler Institute, La Salle, IL 1983) pp. 293–311
8.43 J. Piaget, R. Garcia: Psychogenesis and the History of Science (Columbia Univ. Press, New York 1989)
8.44 L.W. Barsalou: The human conceptual system. In: The Cambridge Handbook of Psycholinguistics, ed. by M. Spivey, K. McRae, M. Joanisse (Cambridge Univ. Press, New York 2012) pp. 239–258
8.45 P. Thagard: Cognitive architectures. In: The Cambridge Handbook of Cognitive Science, ed. by K. Frankish, W. Ramsay (Cambridge Univ. Press, Cambridge 2012) pp. 50–70
8.46 G. Minnameier: A new stairway to moral heaven – A systematic reconstruction of stages of moral thinking based on a Piagetian logic of cognitive development, J. Moral Educ. 30, 317–337 (2001)
8.47 S. Paavola: Peircean abduction: Instinct or inference, Semiotica 153, 131–154 (2005)
8.48 M.H.G. Hoffmann: Problems with Peirce’s concept of abduction, Found. Sci. 4, 271–305 (1999)
8.49 F. Poggiani: What makes a reasoning sound? C.S. Peirce’s normative foundation of logic, Trans. C. S. Peirce Soc. 48, 31–50 (2012)
8.50 K.T. Fann: Peirce’s Theory of Abduction (Martinus Nijhoff, The Hague 1970)
Forms of Abduction and an Inferential Taxonomy References 195
8.51 T. Kapitan: Peirce and the autonomy of abductive reasoning, Erkenntnis 37, 1–26 (1992)
8.52 J. Hintikka: Socratic Epistemology: Explorations of Knowledge-Seeking by Questioning (Cambridge Univ. Press, Cambridge 2007)
8.53 F. Suppe: Science without induction. In: The Cosmos of Science: Essays of Exploration, ed. by J. Earman, J.D. Norton (Univ. Pittsburgh Press, Pittsburgh 1997) pp. 386–429
8.54 J. Earman: Bayes or Bust? A Critical Examination of Bayesian Confirmation Theory (MIT Press, Cambridge 1992)
8.55 C.S. Peirce: Mathematical Philosophy, The New Elements of Mathematics by Charles S. Peirce, Vol. IV, ed. by C. Eisele (Mouton, The Hague, 1976)
8.56 J. Hintikka: C. S. Peirce’s first real discovery and its contemporary relevance, Monist 63, 304–315 (1980)
8.57 K.L. Ketner: How Hintikka misunderstood Peirce’s account of theorematic reasoning, Trans. C. S. Peirce Soc. 21, 407–418 (1985)
8.58 S.H. Levy: Peirce’s theoremic/corollarial distinction and the interconnections between mathematics and logic. In: Studies in the Logic of Charles Sanders Peirce, ed. by N. Houser, D.D. Roberts, J.V. Evra (Indiana Univ. Press, Bloomington 1997) pp. 85–110
8.59 C.S. Peirce: The Simplest Mathematics, Collected Papers of Charles Sanders Peirce, Vol. 4, ed. by C. Hartshorne, P. Weiss (Harvard Univ. Press, Cambridge 1933)
8.60 C.S. Peirce: MS 318 (1907). In: Annotated Catalogue of the Papers of Charles S. Peirce, ed. by R.S. Robin (Univ. Massachusetts Press, Amherst 1967), available online (3 July 2016): http://www.iupui.edu/~peirce/robin/robin.htm
8.61 C.S. Peirce: Mathematical Miscellanea, The New Elements of Mathematics by Charles S. Peirce, Vol. III/1, ed. by C. Eisele (Mouton, The Hague, 1976)
8.62 N. Kaldor: A classificatory note on the determination of equilibrium, Rev. Econ. Stud. 1, 122–136 (1934)
8.63 P.C. Wason: Self-contradictions. In: Thinking: Readings in Cognitive Science, ed. by P.N. Johnson-Laird, P.C. Wason (Cambridge Univ. Press, Cambridge 1977) pp. 114–128
8.64 G. Minnameier: Abduction, selection, and selective abduction. In: Model-Based Reasoning in Science and Technology: Logical, Epistemological and Cognitive Issues, ed. by L. Magnani, C. Casadio (Springer, Berlin, Heidelberg 2016)
8.65 F.V. Hendricks, J. Faye: Abducting explanation. In: Model-Based Reasoning in Scientific Discovery, ed. by L. Magnani, N.J. Nersessian, P. Thagard (Kluwer, New York 1999) pp. 271–294
8.66 R.B. Cattell: The Description and Measurement of Personality (World Book, New York 1946)
9. Magnani’s Manipulative Abduction
Woosuk Park
Part B | 9.1
(Sect. 9.1), I shall discuss how and why Magnani counts diagrammatic reasoning in geometry as the prime example of manipulative abduction (Sect. 9.2). Though we can witness an increasing interest in the role of abduction and manipulation in what Peirce calls theorematic reasoning, Magnani is unique in equating theorematic reasoning itself as abduction. Then, I shall discuss what he counts as some common characteristics of manipulative abductions (Sect. 9.3), and how and why Magnani views manipulative abduction as a form of practical reasoning (Sect. 9.4). Ultimately, I shall argue that it is manipulative abduction that enables Magnani to extend abduction to all directions to develop the eco-cognitive model of abduction. For this purpose, fallacies and animal abduction will be used as examples (Sect. 9.5).
9.4 Manipulative Abduction as a Form of Practical Reasoning
9.5 The Ubiquity of Manipulative Abduction
9.5.1 Manipulative Abduction in Fallacies
9.5.2 Manipulative Abduction in Animals
9.6 Concluding Remarks
References
introduces the concept of manipulative abduction as follows [9.1, p. 12] (cf. [9.2, pp. 15, 16, 43]):

“The concept of manipulative abduction captures a large part of scientific thinking where the role of action is central, and where the features of this action are implicit and hard to be elicited: Action can provide otherwise unavailable information that enables the agent to solve problems by starting and by performing a suitable abductive process of generation or selection of hypotheses.”

For my present purpose, the following text is more informative [9.1, p. 39] (cf. [9.2, p. 53]):

“Manipulative abduction [9.2] – contrasted with theoretical abduction – happens when we are thinking through doing and not only, in a pragmatic sense, about doing. [. . . ] Manipulative abduction

As is clear from this quote, Magnani contrasts manipulative abduction with theoretical abduction. Further, Magnani’s writings also stress that manipulative abduction occurs by taking advantage of those model-based (e.g., iconic) aspects that are embedded in external models [9.1, p. 58]:

“We have seen that manipulative abduction is a kind of abduction, usually model based and so intrinsically iconic, that exploits external models endowed with delegated (and often implicit) cognitive and semiotic roles and attributes.”

This line of thought clearly indicates a possibility that Magnani’s multiple distinctions of abduction may work in such a way that each distinction represents a different dimension in our understanding of abduction. What I have in mind might be presented crudely as in the cubic model of Magnani’s classification of abduction (Fig. 9.1).
Part B | 9.2

Marietti cites the following texts as “the classic description given by Peirce of the two types of deduction” [9.7, CP 2.267]:

“A corollarial deduction is one that represents the conditions of the conclusion in a diagram and finds from the observation of this diagram, as it is, the truth of the conclusion. A Theorematic Deduction is one which, having represented the conditions of the conclusion in a diagram, performs an ingenious experiment upon the diagram, and by the observation of the diagram, so modified, ascertains the truth of the conclusion.”

Marietti [9.7, p. 120, NEM 4:42]:

“What I call the theorematic reasoning of mathematics consists in so introducing a foreign idea, using it, and finally deducing a conclusion from which it is eliminated.”

CP 4.233 seems, however, to be the most extensive text pertinent to Peirce’s corollarial/theorematic distinction in a broader perspective. I discussed this text rather extensively in [9.8].

In their treatment of these typical Peircean texts, many commentators have recently discussed diagrammatic reasoning in connection with abduction. For example, Hoffmann [9.4, p. 411] writes (cf. [9.9, p. 337], [9.10, p. 69], [9.11, p. 465] and [9.4, pp. 292–293]):

“The creativity of theorematic reasoning and the role of observation in it support the interpretation that there must be, for Peirce, a connection between this form of deductive reasoning and abduction. Both, at least, seem to fulfill the same task. What theorematic deduction is for mathematics, abduction seems to be for scientific discoveries in general. Thus, Ketner (1985) maintained ‘that production of experiments within theorematic reasoning, on Peirce’s view, is done through abduction’.”

It is interesting to note that, while Ketner clearly invokes abduction in the “experiments within theorematic reasoning”, Hoffmann himself is merely making an analogy between theorematic deduction in mathematics and abduction in science. Campos [9.12] takes a position somewhat similar to Hoffmann’s [9.12, p. 135]:

“In mathematical reasoning, the imagination creates experimental diagrams that function as signs that are then perceived, interpreted, judged, often transformed, re-imagined, re-interpreted, and so on, in a continuous process. Experimental hypotheses are imaginative suggestions that become subject to logical scrutiny as possible keys to the solution of a theorematic deduction. Once conceived, the experimental diagrams act like objects for observation; they now resist the mind, as it were, and must be evaluated as solutions to the mathematical problem. In this respect, this process is akin to abduction in the natural sciences.”

Marietti [9.7] seems to go one step further in this regard, for she explicitly mentions the necessity of abduction in theorematic demonstration in mathematics [9.7, p. 124]:

“In my view, it is possible to identify an operation in thought – it takes place essentially above the diagram and involves precisely the perceptual relations organized by it – which forms the core of the abductive inference that makes the demonstration synthetic, theorematic, creative. This is the point: There has to be an abductive inference in every informative demonstration given that a connecting thread running through Peirce’s thoughts on the logic of science is that ‘[a]ll the ideas of science come to it by the way of abduction’ (CP 5.145). In a theorematic demonstration – that is in a demonstration introducing new knowledge into our mathematical system – it is necessary to carry out an abductive passage.”

However, Marietti leaves it unclear how it is possible to have “synthetic, theorematic, creative” mathematical demonstration. So, one might say that Marietti is here detecting the indispensability of abduction in theorematic reasoning.

Stjernfelt [9.13] is clearly much more informative as to how abduction takes place in theorematic reasoning [9.13, p. 276]:

“An important issue here – both related to the addition of new elements or foreign ideas and to the experiment aspects – is the relation between theorematic reasoning and abduction. A finished piece of theorematic reasoning, of course, is deductive – the conclusion follows with necessity from the premises. But in the course of conducting the experiment, an abductive phase appears when investigating which experimental procedure, among many, to follow; which new elements or foreign ideas to introduce. This may require repeated, trial-and-error abductive guessing, until the final structure of the proof is found – maybe after years or centuries. Exactly the fact that neither premises nor theorems need to contain any mentioning of the experiment or the introduction of new elements makes the abductive character of experimentation clear. Of course, once the right step has been found, abductive searching may cease and the deductive character of the final proof stands out.”
200 Part B Theoretical and Cognitive Issues on Abduction and Scientific Inference
Here, Stjernfelt makes it exactly clear when and where abduction intervenes in theorematic reasoning. When conducting the experiments with diagrams, “an abductive phase appears”. He is also quite explicit about what is abduced at that phase: i.e., “which experimental procedure to follow”, and “which new elements or foreign ideas to introduce”. His view seems attractive in that it at least rather persuasively appeases the apparent conflict between theorematic deduction and abduction. Though “a finished piece of theorematic reasoning, of course, is deductive”, an abductive phase appears in doing experiments with diagrams. At the same time, he also explains why it has been difficult to notice the abductive phase in theorematic reasoning, for “once the right step has been found, abductive searching may cease and the deductive character of the final proof stands out”. He is perceptive enough to point out that [9.13, p. 276]:

“Exactly the fact that neither premises nor theorems need to contain any mentioning of the experiment or the introduction of new elements makes the abductive character of experimentation clear.”

that by observing the effects of such manipulation we find properties not to be otherwise discerned. In such manipulation, we are guided by precious discoveries which are embodied in general formulae. These are the patterns which we have the right to imitate in our procedure, and are the icons par excellence of algebra [Emphasis is mine].”

Among the commentators of Peirce, Marietti seems to be the one who highlights manipulation on diagrams in more detail. She is rather explicit in presenting manipulation on diagrams as the core of mathematical proofs, and thereby mathematics itself [9.16, p. 166]:

“Manipulation and observation of diagrammatic signs characterize Peirce’s idea of mathematical reasoning. The more that such reasoning leads to relevant conclusions, the more manipulation and observation play a key role in it.”

Marietti [9.7, p. 112]:

“It is well known that Peirce conceived mathematics
matic deduction, in which it is not only a question of observing the relations that emerge but also to carry out some operations on this single model, on the individual token. What is to be done is to manipulate this sign, to implement a series of ingenious experiments – hence creative and not mechanical – aimed at finding the correct modification so as to cause to emerge the new relations that we then observe. Such concrete work presupposes the concrete quality of a material sign.”

Marietti further hints at what is involved in manipulation on individual diagrams. Sun-Joo Shin seems to be another important contributor to the research on Peircean diagrammatic reasoning, who emphasizes the aspect of individuality in diagrams [9.17, 18]. [9.16, p. 153]:

“The second step of the demonstration introduces that spatiotemporal level which is indispensable in view of the manipulation of the diagram. The ekthesis consists in the individualization of the initial proposition, the protasis, which is expressed in general terms.”

Marietti [9.16, p. 154]:

“Demonstrating means manipulating individual diagrams. In order to experiment on a diagram, as we said above, we must face a single instance of it. Deduction cannot work on general signs alone.”

Marietti [9.16, p. 155]:

“The individual diagrammatic sign really acts upon us. We manipulate it, and it reacts on us concretely showing some new relations that impose themselves upon our understanding, escaping any doubt.”

In sum, some commentators of Peirce’s distinction between corollarial and theorematic reasoning indeed detect abductive and manipulative aspects of diagrammatic reasoning in geometry. None of them, however, invokes manipulative abduction in diagrammatic reasoning.

9.2.2 Magnani on Manipulative Abduction in Diagrammatic Reasoning

Magnani already dealt with the role of model based and manipulative abductions in geometrical reasoning in several places [9.1, 2, 19, 20]. Since there have been many attempts to understand Peirce’s philosophy of mathematics focusing on his distinction between corollarial and theorematic reasoning, as we saw earlier, Magnani’s results in model based and manipulative abduction can be easily combined with previous results in diagrammatic reasoning in geometry. For example, Peircean corollarial reasoning would be model based, theoretical, and visual abduction. On the other hand, Peircean theorematic reasoning would be model based, manipulative, and visual abduction [9.1, pp. 117–118, 176–178].

In assimilating Magnani’s views of manipulative abduction to Peirce’s theory of diagrammatic reasoning in geometry, the most difficult point would be that Peircean corollarial and theorematic reasonings are usually counted as deductions. Magnani claims that “theorematic deduction can be easily interpreted in terms of manipulative abduction” [9.1, p. 178]. However, it is not clear what he has in mind. Probably, further hints can be secured from the following quote from Magnani [9.1, p. 181]:

“As I have already indicated Peirce further distinguished a corollarial and a theoric part within theorematic reasoning, and connected theoric aspects to abduction [Hoffmann, 1999, p. 293]: Thêoric reasoning [. . . ] is very plainly allied to what is normally called abduction [Peirce, 1966, 754, ISP, p. 8].”

As Hoffmann points out, however, there can be different possible interpretations of what Peirce says: “either he identified abduction/retroduction and theoric reasoning here or he claimed that there is abduction in mathematics beyond theoric deduction” [9.4, p. 293].

Though the question is extremely interesting and significant, it is not my present concern to answer whether we can safely understand Peirce’s theoric reasoning as abduction, as Magnani claims. What is at stake is rather how and why Magnani views diagrammatic reasoning in geometry as the prime example of manipulative abduction. If so, by assuming that we now understand the basics of Magnani’s notion of manipulative abduction, we need to raise the following further questions. Why does he appeal to diagrammatic reasoning in geometry whenever he has to give an example of manipulative abduction? What exactly does Magnani mean by manipulative abduction in geometrical reasoning? What exactly do mathematicians do when they experiment on diagrams? What kind of things could be manipulated in mathematicians’ manipulative abduction? What is it to do manipulative abduction?

Magnani uses Fig. 9.2 as an example of cognitive manipulating in diagrammatic demonstration. According to him, this example, taken from the field of elementary geometry, shows how [9.1, p. 176]:

“a simple manipulation of the triangle in Fig. 9.2a gives rise to an external configuration – Fig. 9.2b – that carries relevant semiotic information about the
which it is hard or impossible to extract new meaningful features of an object, elects or creates an action that structures the environment in such a way that it gives new information which would be otherwise unavailable and which is used specifically to infer explanatory hypotheses.”

The reason why diagrammatic reasoning in geometry is instrumental for Magnani to introduce manipulative abduction is not hard to understand. From his point of view, it is important to first note that “mathematical diagrams play various roles in a typical abductive way”. Secondly, they are external representations that provide both explanatory and nonexplanatory abductive results [9.1, pp. 118–119]. The first point can be elaborated by the following [9.1, p. 118]:

“Following the approach in cognitive science related to the studies in distributed cognition, I contend that in the construction of mathematical concepts many external representations are exploited, both in terms of diagrams and of symbols. I have been interested in my research in diagrams which play an optical role – microscopes (that look at the infinitesimally small details), telescopes (that look at infinity), windows (that look at a particular situation), a mirror role (to externalize rough mental models), and an unveiling role (to help create new and interesting mathematical concepts, theories, and structures).”

Also, the second point is further explained by Magnani as follows.

nation able to help the understanding of concepts difficult to grasp or that appear obscure and/or epistemologically unjustified. I will present in the following section some mirror diagrams which provided new mental representations of the concept of parallel lines.

They help abductively create new previously unknown concepts that are nonexplanatory, as illustrated in the case of the discovery of the non-Euclidean geometry.

As we can infer from the passage just quoted, the discovery of non-Euclidean geometry provides Magnani an ideal springboard to elaborate his views on manipulative abduction in diagrammatic reasoning. In fact, Magnani [9.1] uses Lobachevsky’s discovery of non-Euclidean geometry as an example in which manipulative abduction played a crucial role. After briefly narrating what happened to the parallel postulate of Euclidean geometry throughout history, he explains Lobachevsky’s strategy to face the problem situation as follows [9.1, p. 123]:

“Lobachevsky’s strategy for resolving the anomaly of the fifth postulate was first of all to manipulate the symbols, second to rebuild the principles, and then to derive new proofs and provide a new mathematical apparatus; of course his analysis depended on some of the previous mathematical attempts to demonstrate the fifth postulate. The failure of the demonstrations – of the fifth postulate from the other four – that was present to the attention of Lobachevsky, lead him to believe that the difficulties that had to be overcome were due to causes
Magnani’s Manipulative Abduction 9.3 When Does Manipulative Abduction Take Place? 203
Part B | 9.3
According to Magnani, we can detect the anomaly of the fifth postulate from Fig. 9.3, for, unlike the other four postulates, “we cannot verify empirically whether two lines meet, since we can draw only segments, not lines” [9.1, p. 121].

In contrast to the diagram of Fig. 9.3, as Magnani explains, the diagram of Fig. 9.4 introduces a new definition of parallelism [9.1, p. 13]:

is prevalently a kind of manipulative and model-based abduction” [9.1, p. 127]. As Magnani aptly points out, “Lobachevsky’s target is to perform a geometrical abductive process able to create the new and very abstract concept of non-Euclidean parallel lines”, and what is remarkable is that “the whole epistemic process is mediated by interesting manipulations of external mirror diagrams” [9.1, p. 123].
Part B | 9.4

However, it would be prudent not to conclude so fast, for it is not clear which comes first. In a lengthy paragraph just quoted selectively, Magnani uses the well-known history of electromagnetism in the nineteenth century in order to make sense of many requirements for manipulative abduction. Magnani uses Oersted’s report of his experiment about electromagnetism as an example of 1, for it described some anomalous aspects. Also, he uses Davy’s setup of using an artificial tower of needles as an example of 3. Magnani seems indebted here to Gooding’s views of the roles of construals in science [9.1, pp. 48–51] and [9.21, pp. 29–69]. But, in order for the episodes in that history to be examples of manipulative abduction, should there not first be a pre-established set of requirements for manipulative abduction? Magnani seems delighted to find that geometrical constructions satisfy all these requirements. But does he not first discover manipulative abduction in geometrical constructions, and then sort out the commonalities of manipulative abductions, and not the other way around? There is some suspicion as to possible circularity in Magnani’s way of thinking. Furthermore, not to mention the fact that it is by no means clear what relations hold between these requirements, there is room for doubt whether each of the requirements is evidently an individually necessary condition for manipulative abduction.

Some such worries make us wonder whether Magnani simply enumerates some interesting traits that appear in cases of manipulative abduction. Magnani seems still to be collecting interesting cases that deserve to be characterized as manipulative abductions, and identifying their interesting traits. In other words, he has just drawn our attention to when and by what manipulative abductions occur. What is interesting is that, if we are on the right track, such a search to find cases of manipulative abduction and their typical characteristics might result in a disjunctive property, whose extension could be rather huge. Though it might be interesting and meaningful to pursue such a property, it is definitely not necessary and sufficient for manipulative abduction.

Contrary to what one might believe Magnani is doing, he may be after an entirely different target. Roughly speaking, Magnani seems to aim at demonstrating the ubiquity or pervasiveness of manipulative abduction. What I have in mind could become clearer by “the example discussed above”. It is rather impressive that Magnani is able to uncover all the various iconic roles in geometrical diagrams, such as optical, mirror, and unveiling roles. These different iconic roles are related to different types of representations. Further, there is no end to the synthesis or multiplication of these representations, for [9.1, p. 49]:

“[t]he various procedures for manipulating objects, instruments and experiences will be in their turn reinterpreted in terms of procedures for manipulating concepts, models, propositions, and formalisms”
There are several interesting points to note regarding Magnani’s new perspective of manipulative abduction as a form of practical reasoning. For convenience’s sake, let us distinguish between scientific contexts and the contexts of ordinary life. As is clear from our discussion earlier, manipulative abduction plays important roles in scientific contexts. But with this new perspective, we may deepen our understanding of some of the characteristics of manipulative abduction in science. According to Magnani, for example, the first three among the new characteristics of manipulative abduction are also found in geometrical constructions. It may not be irrelevant to invoke, in this regard, John Woods’ apt characterization of Magnani [9.22, p. 240]:

“At the centre of Magnani’s investigations is the reasoning of the practical agent, of the individual agent operating on the ground, that is, in the circumstances of real life. In all its contexts, from the most abstractly mathematical to the most muckilly empirical, Magnani emphasizes the cognitive nature of abduction.”

If Woods is right, then it is not a small matter that manipulative abduction can be interpreted as a form of practical reasoning. Magnani is shifting our focus from more theoretical and abstract aspects of science to more practical and experimental aspects of science by his emphasis on manipulative abduction. Now, if we turn to the world of everyday reasoning with the new perspective of manipulative abduction as a form of practical reasoning, we may realize how indispensable manipulative abduction is for individual agents in virtually every moment and situation in real life. Again, Woods’ succinct summary of pages 363–384 of Magnani [9.1] is to the point [9.22]:

“In an original thrust, he identifies the practical agent as a cognitive system whose resources are comparatively scant and who sets his cognitive targets with due regard (and respect) for these resource-limitations.”

Magnani’s second move is also impressive. In addition to the common characteristics of manipulative abduction discussed earlier, Magnani identifies some other common characteristics of manipulative abduction from the perspective of manipulative abduction as a form of practical reasoning [9.1, pp. 51–52]:

“5. Action elaborates a simplification of the reasoning task and a redistribution of effort across time [9.23], when we need to manipulate concrete things in order to understand structures which are otherwise too abstract [9.24], or when we are in presence of redundant and unmanageable information;
6. Action can be useful in presence of incomplete or inconsistent information – not only from the perceptual point of view – or of a diminished capacity to act upon the world: it is used to get more data to restore coherence and to improve deficient knowledge;
7. Action enables us to build external artifactual models of task mechanisms instead of the corresponding internal ones, that are adequate to adapt the environment to the agent’s needs: Experimental manipulations exploit artificial apparatus to free new possible stable and repeatable sources of information about hidden knowledge and constraints.
8. Action as a control of sense data illustrates how we can change the position of our body (and/or of the external objects) and how to exploit various kinds of prostheses (Galileo’s telescope, technological instruments and interfaces) to get various new kinds of stimulation: action provides some tactile and visual information (e.g., in surgery), otherwise unavailable.”

As individual agents with scant resources to manage and survive in complicated and unfriendly environments, we can understand without much difficulty what Magnani is talking about in this quote. Furthermore, as I shall show by some examples in the next section, Magnani’s recent research, including not only those works that directly concern abduction (such as Magnani [9.1]) but also virtually all other works apparently dealing with other subject matters (such as Magnani [9.25–27]), can be interpreted as examining the roles of manipulative abduction in all the different areas. Before turning to examples, let us briefly examine what new aspects of manipulative abduction are introduced by characteristics 5 through 8 in addition to characteristics 1 through 4. The most salient point would be that the action-based character of these characteristics is emphasized by Magnani. In some sense, there seems to be a one–one correspondence between the new characteristics 5 through 8 and the old characteristics 1 through 4. For example, in both 3 and 7, artificial apparatus is invoked. The only difference is that in the latter the externality of artifactual models is emphasized. Likewise, in both 4 and 8, some contingent ways of acting are invoked. The only difference is that, unlike epistemic acting in the former (e.g., looking from a different perspective), real acting is done in the latter (e.g., control of sense-data by changing the position of the body). In other words, Magnani seems to be insinuating that, when we interpret manipulative abduction as a form of practical reasoning, thereby exploiting some action, we can find the new characteristics that were in some sense already pregnant in the old characteristics.
The following quote from Magnani [9.1] seems to support my understanding perfectly by clearly combining his two moves [9.1, p. 397]:

“Human beings spontaneously (and also animals, like already Peirce maintained) perform more or less rudimentary abductive and inductive reasoning. Starting from the low-level inferential performances of the kid’s hasty generalization that is a strategic success and a cognitive failure human beings arrive to the externalization of theoretical inductive and abductive agents as ideal agents, logical and computational. It is in this way that merely successful strategies are replaced with successful strategies that also tell the more precise truth about things. Human informal nondemonstrative inferential processes of abduction (and of induction) are more and more externalized and objectified: These external representations can be usefully rerepresented in our brains (if this is useful and possible), and they can originate new improved organic (mentally internal) ways of inferring or suitably exploited in a hybrid manipulative interplay, as I have said above.”
ful as the distinction between scientific and everyday context. As he wants to find manipulative abduction in everyday as well as scientific contexts, he also wants to find it in prelinguistic as well as linguistic agents. How does Magnani expand manipulative abduction in science to manipulative abduction in everyday context? How does Magnani expand manipulative abduction in linguistic agents to manipulative abduction in prelinguistic agents? A nice strategic point to get an overview of both lines of expanding the scope of manipulative abduction may be secured in Chapter 7 of Magnani [9.1] entitled “Abduction in human and logical agents: Hasty generalizers, hybrid abducers, fallacies”. On the one hand, this chapter provides us with a clear example of nonexplanatory abduction, thereby representing the shift from the scientific to the practical. On the other hand, it presents us with a foil for the search for manipulative abduction in nonhuman, prelinguistic agents. In both respects, interestingly, Magnani seems to be strongly influenced by and responding to Gabbay and Woods [9.28] and Woods [9.29].

9.5.1 Manipulative Abduction in Fallacies

As pointed out by Park [9.30], we can witness the recent surge of interest in classifying different patterns or types of abduction. Many philosophers, including Thagard, Magnani, Gabbay and Woods, Schurz, and Hoffmann, have suggested their own classifications emphasizing different aspects of abduction [9.1, 2, 28, 31–33]. Such a development is remarkable, in view of the fact that until quite recently the focus of the research on Peircean abduction was basically to identify its log-

and Hoffmann [9.33], attempts at classifying abduction are still largely focused on the problem of classifying explanatory abduction in science. Not to mention nonexplanatory abduction in science, such as instrumental abduction, Gabbay and Woods [9.28] cover abduction in nonscientific context, such as legal abduction. Magnani welcomes Gabbay and Woods’ distinction between explanatory and nonexplanatory abduction as follows [9.1, p. 71]:

“In my previous book on abduction [9.2] I made some examples of abductive reasoning that basically are nonexplanatory and/or instrumentalist without clearly acknowledging it. The contribution of Gabbay and Woods to the analysis of abduction has the logical and epistemological merit of having clarified these basic aspects of abduction, until now disregarded in the literature. Their distinction between explanatory, nonexplanatory and instrumental abduction is orthogonal to mine in terms of the theoretical and manipulative (including the subclasses of sentential and model based) and further allows us to explore fundamental features of abductive cognition.”

Magnani is also strongly influenced by Woods’ extensive and revolutionary study of fallacies. Above all, Magnani is fully sympathetic with Woods’ project of the naturalization of logic, the official core topic of which is the one of logical fallacies [9.35, p. 20]. What is needed here is just to understand how Magnani adopts and appropriates Woods’ views of fallacies for his own eco-cognitive project. Woods contends, and Magnani confirms, that fallacy has been counted as “a mistake in
Magnani’s Manipulative Abduction 9.5 The Ubiquity of Manipulative Abduction 207
reasoning, a mistake which occurs with some frequency in real arguments and which is characteristically deceptive” [9.1, p. 404] and [9.29]. However, Magnani points out that [9.1, pp. 404–405]:

“when they are used by actual reasoners, beings like us, that is in an eco-logical and not merely logical – ideal and abstract – way, they are no longer necessarily fallacies.”

Magnani agrees with Woods’ conviction that from Aristotle onward logic has irremediably mismanaged the fallacies project. And he concurs with Woods’ belief that naturalization of logic is appropriate to the task of “an account of fallacious reasoning – and of its detection, avoidance, and repair” [9.35, 36]. What Woods calls the EAUI-conception of fallacies is the traditional view that “fallacies are Errors of Reasoning, Attractive, Universal, and Incorrigible” [9.36, p. 135], [9.35, p. 21]. Now, Magnani reports Woods’ views of fallacies as follows [9.35, p. 22]:

“According to Woods’ last and more recent observations the traditional fallacies – hasty generalization included – do not really instantiate the traditional concept of fallacy (the EAUI-conception). In this perspective it is not that it is sometimes strategically justified to commit fallacies (a perfectly sound principle, by the way), but rather that in the case of the Gang of Eighteen traditional fallacies they simply are not fallacies. The distinction is subtle, and I can add that I agree with it in the following sense: The traditional conception of fallacies adopts – so to say – an aristocratic (ideal) perspective on human thinking that disregards its profound eco-cognitive character. Errors, in an eco-cognitive perspective, certainly are not the exclusive fruit of the so-called fallacies, and in this wide sense, a fallacy is an error – in Woods’ words – ‘that virtually everyone is disposed to commit with a frequency that, while comparatively low, is nontrivially greater than the frequency of their errors in general’.”

By the term Gang of Eighteen, Woods refers to the following typical fallacies [9.36, p. 5]:

“ad baculum, ad hominem, ad populum, ad verecundiam, ad ignorantiam, ad misericordiam, affirming the consequent, denying the antecedent, begging the question, gambler’s fallacy, post hoc, ergo propter hoc, composition and division (of which secundum quid is a special case), faulty analogy, and ignoratio elenchi (of which straw man is a special case).”

Magnani’s eco-cognitive perspective draws a rather sharp distinction between strategic and cognitive rationality, and he explicitly claims that “many of the traditional fallacies – hasty generalization for example – call for an equivocal treatment” [9.35, p. 21] and [9.1]. What he means by “an equivocal treatment” is that the so-called fallacies [9.1]:

“are sometimes cognitive mistakes and strategic successes, and in at least some of those cases, it is more rational to proceed strategically, even at the cost of cognitive error.”

Magnani also claims that his general agreement with Woods’ views of fallacies can be further strongly motivated by his emphasis on what he calls the general military nature of language, that is:

1. Human language possesses a pregnance-mirroring function.
2. In this sense we can say that vocal and written language is a tool exactly like a knife.
3. The so-called fallacies are certainly linked to that efficacious military intelligence, which relates to the problem of the role of language in the so-called coalition enforcement, which characterizes all the various kinds of groups and collectives of humans [9.35, p. 22].

Indeed Magnani contends that in this perspective language is basically rooted in a kind of military intelligence, a term coined by the mathematician René Thom [9.37], the creator of the so-called catastrophe theory. See Chap. 8 of Magnani [9.1] for more in-depth study of military intelligence and the notion of coalition enforcement.

Certainly most people would believe that communication is the primary function of language. Also, when broadly understood, communication might include the manipulation of other human beings by language. One possible danger is that whenever we talk about the communicative function as the primary function of language, we tend to ignore or neglect the manipulative function of language. In clever and sometimes malicious uses of fallacies to manipulate other human beings, military intelligence is definitely involved. In a word, we may say that manipulative abduction is crucial in understanding the role of fallacies in military intelligence.

9.5.2 Manipulative Abduction in Animals

I emphasized the central importance of animal abduction in Magnani’s thought in a series of papers [9.30, 38, 39]. This section draws extensively from these, especially [9.38]. Unlike these previous articles, this time I want to highlight the role of manipulative abduction in animal cognition. One of the most pressing issues in
208 Part B Theoretical and Cognitive Issues on Abduction and Scientific Inference
understanding abduction is whether it is an instinct or an inference. Many commentators find it paradoxical “that new ideas and hypotheses are products of an instinct (or an insight), and products of an inference at the same time” [9.40, p. 131]. Paavola refers to [9.41–46]. As Paavola points out, we seem to face a dilemma: “If abduction relies on instinct, it is not a form of reasoning, and if it is a form of reasoning, it does not rely on instinct” [9.40, p. 131]. Fortunately, Lorenzo Magnani’s recent discussion of animal abduction sheds light on both the instinctual and the inferential character of Peircean abduction (Magnani [9.1, especially Chapter 5], “Animal abduction: From mindless organisms to artifactual mediators”, which was originally published in Magnani and Li [9.25]). Contrary to many commentators, who find conflicts between abduction as instinct and abduction as inference, he claims that they simply co-exist.

In order to overcome the conflict between abduction as an instinct and abduction as an inference, it is not enough to draw attention to some relevant texts from Peirce and to provide insightful interpretation of them. Magnani needs to indicate exactly where he is going beyond Peirce, thereby pointing out wherein lies the limitation of Peirce’s views on abduction. It is of course an important matter for Magnani himself whether he is going beyond Peirce or not [9.1, p. 221]. Magnani finds such a clear example in Peirce’s different treatments of practical reasoning and scientific thinking [9.1, pp. 278–279]:

“Elsewhere Peirce seems to maintain that instinct is not really relevant in scientific reasoning but that it is typical of just the reasoning of practical men about every day affairs. So as to say, we can perform instinctive abduction (that is not controlled, not reasoned) in practical reasoning, but this is not typical of scientific thinking.”

Here Magnani quotes extensively from Peirce’s Carnegie application of 1902 (MS L75) (cf. Arisbe Website [9.47]). We should note that Magnani is fully aware of the fact that we can find many instances where Peirce allowed abductive instinct to humans even in scientific reasoning. For example, hypothesis selection is a largely instinctual endowment of human beings which Peirce thinks is given by God or related to a kind of Galilean lume naturale [9.1, p. 277] and [9.6, CP 7.220]:

“It is a primary hypothesis underlying all abduction that the human mind is akin to the truth in the sense that in a finite number of guesses it will light upon the correct hypothesis.”

Magnani counts commentators like [9.4, 40, 48] as maintaining that “instinct [...] does not operate at the level of conscious inferences like for example in the case of scientific reasoning” [9.1, p. 279]. And he implicitly blames their assumption of instinct “as a kind of mysterious, not analyzed, guessing power” for such a claim [9.1]. Indeed Magnani distances himself from those commentators and Peirce himself as follows [9.1]:

“I think a better interpretation is the following that I am proposing here: Certainly instinct, which I consider a simple and not a mysterious endowment of human beings, is at the basis of both practical and scientific reasoning, in turn instinct shows the obvious origin of both in natural evolution.”

But on what ground does Magnani claim superiority of his interpretation? How could he be so sure that instinct is at the basis of both practical and scientific reasoning? Even though Magnani does not formulate an argument that proves his claim once and for all, there seem to be enough clues for fathoming his mind. In addition to his reliance on the naturalistic ground for abductive instinct in humans, Magnani is also attracted to the so-called synechism of Peirce. Further, he seems encouraged by two intriguing points from Peirce: (1) that “thought is not necessarily connected with brain”, and (2) that “instincts themselves can undergo modifications through evolution” [9.1, p. 278]. For the first point, Magnani actually quotes from Peirce [9.14, CP 4.551]:

“Thought is not necessarily connected with brain. It appears in the work of bees, of crystals, and throughout the purely physical world; and one can no more deny that it is really there, than that the colours, the shapes, etc. of objects are really there.”

On the other hand, for the second point, he again quotes from Peirce: [instincts are] “inherited habits, or in a more accurate language, inherited dispositions” [9.1, p. 278] and [9.5, CP 2.170].

In other words, Magnani seems to assimilate abduction as an instinct and abduction as an inference from both directions. This interpretation of Magnani’s strategy seems to be supported strongly by his explicit announcement [9.1, p. 267]:

“I can conclude that instinct versus inference represents a conflict we can overcome simply by observing that the work of abduction is partly explicable as a biological phenomenon and partly as a more or less logical operation related to plastic cognitive endowments of all organisms.”

To those who would allow abductive instinct to nonhuman animals but not to humans, he tries to emphasize the instinctual elements in human abductive reasoning. On the other hand, to those who would allow
abduction as inference to humans but not to nonhuman animals, he suggests broadening the concept of inference, and thereby that of thinking. For the former project, Magnani cites hypothesis generation in scientific reasoning as weighty evidence for abductive instinct in humans: From this Peircean perspective, hypothesis generation is a largely instinctual and nonlinguistic endowment of human beings and, of course, also of animals. It is clear that for Peirce abduction is rooted in the instinct and that many basically instinctually rooted cognitive performances, like emotions, provide examples of abduction available to both human and nonhuman animals [9.1, p. 286]. Here, of course, Magnani’s claim about hypothesis generation as instinctual must still be controversial. Someone may object that it should be able to work out what might explain a phenomenon. For further discussion of this complicated issue, please see [9.1, pp. 18–19]. In this regard, Magnani distinguishes between “(1) abduction that only generates plausible hypothesis (selective or creative)” and “(2) abduction considered as inference to the best explanation, that also evaluates hypotheses by induction” [9.1, p. 18, Magnani’s emphasis]. And he makes it explicit that the first meaning of abduction is what he accepts in his epistemological model. Though inconclusive, Magnani’s claim about hypothesis generation as instinctual is more defensible under the first meaning of abduction. Even after having noted these supportive points, however, it is still unclear how abduction could be rooted in instinctually rooted cognitive performances like emotion.

As for the latter project, Magnani wants to secure the inferential character of animal abduction from sign activity and semiotic processes found in nonhuman animals. He frequently appeals to Peirce [9.49, CP 5.283]:

“all thinking is in signs, and signs can be icons, indices or symbols. Moreover, all inferences are a form of sign activity, where the word sign includes feeling, image, conception, and other representation.”

Here is a lengthy quote from Magnani that makes this point crystal clear [9.1, p. 288] and [9.5, 14, CP 5.283]:

“Many forms of thinking, such as imagistic, emphatic, trial and error, and analogical reasoning, and cognitive activities performed through complex bodily skills, appear to be basically model based and manipulative. They are usually described in terms of living beings that adjust themselves to the environment rather than in terms of beings that acquire information from the environment. In this sense these kinds of thinking would produce responses that do not seem to involve sentential aspects but rather merely noninferential ways of cognition. If we adopt the semiotic perspective above, which does not reduce the term inference to its sentential level, but which includes the whole arena of sign activity – in the light of Peircean tradition – these kinds of thinking promptly appear full, inferential forms of thought. Let me recall that Peirce stated that all thinking is in signs, and signs can be icons, indices, or symbols, and, moreover, all inference is a form of sign activity, where the word sign includes feeling, image, conception, and other representation.”

Magnani is well aware of the fact that animals have been widely considered as mindless organisms for a long time. So, based on the cornerstone laid by Peirce, this semiotic perspective needs further extension. But how is that possible? According to Magnani, it is possible thanks to the recent results in cognitive science and ethology about animals, and of developmental psychology and cognitive archeology about humans and infants [9.1, p. 283]:

“Philosophy itself has for a long time disregarded the ways of thinking and knowing of animals, traditionally considered mindless organisms. Peircean insight regarding the role of abduction in animals was a good starting point, but only more recent results in the fields of cognitive science and ethology about animals, and of developmental psychology and cognitive archeology about humans and infants, have provided the actual intellectual awareness of the importance of the comparative studies.”

Magnani points out not only that inferences are not necessarily structured like a language, but also that there are animal-like aspects in human thinking and feeling [9.1, p. 283]:

“Sometimes philosophy has anthropocentrically condemned itself to partial results when reflecting upon human cognition because it lacked in appreciation of the more animal-like aspects of thinking and feeling, which are certainly in operation and are greatly important in human behavior.”

Encouraged by the discovery of “the ways of thinking in which the sign activity is of a nonlinguistic sort” [9.1, p. 189] in lower animals, Magnani claims that “a higher degree of abductive abilities has to be acknowledged” to them [9.1, pp. 290, 291]:

“Chicken form separate representations faced with different events and they are affected by prior experiences (of food, for example). They are mainly
due to internally developed plastic capabilities to react to the environment, and can be thought of as the fruit of learning. In general this plasticity is often accompanied by the suitable reification of external artificial pseudo representations (for example landmarks, alarm calls, urine marks and roars, etc.) which artificially modify the environment, and/or by the referral to externalities already endowed with delegated cognitive values, made by the animals themselves or provided by humans.”

In fact, Magnani goes even farther in his ascription of pseudo thought to nonhuman animals in his discussion of affordances, multimodal abduction, cognitive niches, and animal artifactual mediators. This is exactly where we can find what Magnani believes to be clear evidence that manipulative abduction plays a crucial role in animal sign activities.

As Magnani points out, we are now in a much better position than Peirce to understand the way of thinking and knowing of animals, thanks to [9.1, p. 283]:

“more recent results in the fields of cognitive science and ethology about animals, and of developmental psychology and cognitive archeology about humans and infants.”

But Magnani reminds us of the fact that Darwin already paved a way toward the appreciation of cognitive faculties of animals [9.1, pp. 284–285]:

“It is important to note that Darwin also paid great attention to those external structures built by worms and engineered for utility, comfort, and security. I will describe later on in this chapter the cognitive role of artifacts in both human and nonhuman animals. Artifacts can be illustrated as cognitive mediators [9.2] which are the building blocks that bring into existence what it is now called a cognitive niche: Darwin maintains that ‘We thus see that burrows are not mere excavations, but may rather be compared with tunnels lined with cement’ [9.50, p. 112]. Like humans, worms build external artifacts endowed with precise roles and functions, which strongly affect their lives in various ways, and of course their opportunity to know the environment.”

I would like to discuss a strong or active sense of learning abduction from animals. I interpret Magnani’s ideas on perceiving affordance in human and nonhuman animals as an answer to the problem of how to learn abduction from animals in this sense. As far as the problem of perceiving affordances is concerned, we do not have to confess our inferiority to nonhuman animals. It is we humans who have perceived affordances in some highly creative ways. However, we cannot easily claim our superiority over nonhuman animals either. It is roaches, not humans, that turn out to demonstrate better ability for survival, which may imply superiority in perceiving affordances. In a word, I think we may safely and more profitably forget the issue of inferiority or superiority. Let it suffice to say that we humans, unlike nonhuman animals, seem to have a unique abductive instinct displayed by our perceiving affordances.

Magnani would be happy with my interpretation, for he himself claims that “cognitive niche construction can be considered as one of the most distinctive traits of human cognition” [9.2, p. 331]. According to Magnani, both human and nonhuman animals are chance seekers, and thereby ecological engineers. They “do not simply live their environment, but actively shape and change it looking for suitable chances” [9.1, p. 319]. Further, “in doing so, they construct cognitive niches” [9.1]. Then, in chance-seeking ecological engineering in general, and in cognitive niche construction in particular, what exactly differentiates humans from nonhuman animals?

In order to answer this question, we need to understand in what respects Magnani extends or goes beyond Gibson’s notion of affordance. In principle, it should not be too difficult, because Magnani himself indicates explicitly or implicitly some such respects of his own innovation. Magnani takes Gibson’s notion of affordance “as what the environment offers, provides, or furnishes” as his point of departure. He also notes that Gibson’s further definitions of affordance as [9.1, p. 333]:

“1. Opportunities for action.
2. The values and meanings of things which can be directly perceived.
3. Ecological facts.
4. Implying the mutuality of perceiver and environment.”

may contribute to avoiding possible misunderstanding. Given this Gibsonian ecological perspective, Magnani appropriates some further extensions or modifications by recent scholars in order to establish his own extended framework for the notion of affordance. It is simply beyond my ability to do justice to all elements of Magnani’s extended framework for affordances. Let me just note one issue in which Magnani shows enormous interest: Gibsonian direct perception.

Magnani takes Donald Norman’s ambitious project of reconciling constructivist and ecological approaches to perception seriously [9.1, 6.4.3, p. 343]. Above all, Magnani notes that Norman “modifies the original Gibsonian notion of affordance also involving
mental/internal processing” [9.1, p. 337] based on a text, where Norman writes [9.51, p. 14]:

“I believe that affordances result from the mental interpretation of things, based on our past knowledge and experience applied to our perception of the things about us.”

If Norman is right, we may safely infer, as Magnani does, that pace Gibson, “affordances depend on the organism’s experience, learning, and full cognitive abilities” [9.1, p. 337]. Both Norman and Magnani support these ideas with a formidable array of recent results in cognitive experimental psychology and neuroscience [9.1, p. 341].

Now, given this extended framework, which extends and modifies some aspects of the original Gibsonian notion of affordances, what exactly is Magnani’s contribution? In some sense, this is an unnecessary, even stupid, question, for everybody already knows the correct answer. By his expertise on abduction, and in particular his Peircean thesis of perception as abduction, Magnani contributes enormously to deepening our understanding of some truly big issues, such as how to reconcile constructivist and ecological theories of perception. So, my question aspires to understand more specifically how the Peirce–Magnani view of perception as abduction contributes in that regard. Let us suppose that the original Gibsonian notion of affordance has been extended and modified à la Norman. Would Magnani claim that such an extension or modification is impossible without abductive activities of organisms? Or, would he claim that such an extension or modification is still incomplete without abduction?

Be that as it may, the big picture Magnani presents is this [9.1, p. 348]:

“Organisms have at their disposal a standard endowment of affordances (for instance through their hardwired sensory system), but at the same time they can extend and modify the scope of what can afford them through the suitable cognitive abductive skills.”

If we probe the question as to what exactly is involved in organisms’ employment of cognitive abductive skills, Magnani would respond roughly along the following lines [9.1, p. 346]:

“in sum organism already have affordances available because of their instinctive gifts, but also they can dynamically abductively extract natural affordances through affecting and modifying perception (which becomes semiencapsulated). Finally, organisms can also create affordances by building artifacts and cognitive niches.”

There are several points that become clear from this quote, I think. First, in addition to the original Gibsonian framework for affordances, there is room for organisms to participate in perceiving affordances (in the broad sense). Secondly, abductive skills are performed by organisms in perceiving affordances. Thirdly, in such abductively perceiving affordances, perception and action are inseparably intertwined. Finally, organisms can even create affordances by abduction. Except for the first point, I think, all these seem to be due to Magnani.

At the beginning of this section, I introduced Paavola’s dilemma: “If abduction relies on instinct, it is not a form of reasoning, and if it is a form of reasoning, it does not rely on instinct”. Though I basically welcomed Magnani’s way out of this dilemma, namely that they simply co-exist, it was not clear exactly what that solution means. After having improved our understanding of manipulative abduction in animals, now we may understand it better. Magnani claims that, from a semiotic point of view, the idea that there is a conflict between views of abduction in terms of heuristic strategies or in terms of instinct (insight, perception) [9.4, 40, 52] appears old-fashioned. And he elaborates his claim that the two aspects simply coexist by adding that this is so at the level of the real organic agent (emphasis is mine). Depending upon the cognitive/semiotic perspective we adopt, he claims [9.1, pp. 281–282]:

“1. We can see it as a practical agent that mainly takes advantage of its implicit endowments in terms of guessing right, wired by evolution, where of course instinct or hardwired programs are central.
2. We can see it as the user of explicit and more or less abstract semiotic devices internally stored or externally available – or hybrid – where heuristic plastic strategies (in some organism they are conscious) exploiting relevance and plausibility criteria – various and contextual – for guessing hypotheses are exploited.”

Now we may underwrite the fact that the two aspects simply co-exist at the level of real organic agents, who are manipulative abducers, for it would be hard to find a better example of manipulative abduction than creating affordances. In other words, manipulative abduction is the key factor in Magnani’s thought about animal abduction.
References
9.1 L. Magnani: Abductive Cognition. The Epistemological and Eco-Cognitive Dimensions of Hypothetical Reasoning (Springer, Berlin 2009)
9.2 L. Magnani: Abduction, Reason, and Science: Processes of Discovery and Explanation (Kluwer, New York 2001)
9.3 C.S. Peirce: The New Elements of Mathematics, Vol. 4, ed. by C. Eisele (Humanities Press, New York 1976)
9.4 M.H.G. Hoffmann: Problems with Peirce’s concept of abduction, Found. Sci. 4(3), 271–305 (1999)
9.5 C.S. Peirce: Elements of Logic, Collected Papers of Charles Sanders Peirce, Vol. 2 (Harvard Univ. Press, Cambridge 1932), ed. by C. Hartshorne, P. Weiss
9.6 C.S. Peirce: Science and Philosophy, Collected Papers of Charles Sanders Peirce, Vol. 7 (Harvard Univ. Press, Cambridge 1958), ed. by A.W. Burks
9.7 S. Marietti: Semiotics and deduction: Perceptual representations of mathematical processes. In: Semiotics and Philosophy in Charles Sanders Peirce, ed. by R. Fabbrichesi, S. Marietti (Cambridge Scholars Press, Cambridge 2006) pp. 112–127
9.8 W. Park: On classifying abduction, J. Appl. Logic 13(3), 215–238 (2015)
9.9 C. Eisele: Mathematical methodology in the thought of C.S. Peirce, His. Math. 9, 333–341 (1982)
9.10 R.A. Tursman: Peirce’s Theory of Scientific Discovery. A System of Logic Conceived as Semiotic (Indiana Univ. Press, Bloomington 1987)
9.11 E.J. Crombie: What is deduction? In: Studies in the Logic of Charles Sanders Peirce, ed. by N. Houser, D.D. Roberts, J. Van Evra (Indiana Univ. Press, Bloomington and Indianapolis 1997) pp. 460–476
9.12 D.G. Campos: The Interpretation and hypothesis-making in mathematics: A Peircean account. In: New Essays on Peirce’s Mathematical Philosophy, ed. by M.E. Moore (Open Court, Chicago and La Salle 2010) pp. 123–145
9.13 F. Stjernfelt: Natural Propositions: The Actuality of Peirce’s Doctrine of Dicisigns (Docent, Boston 2014)
9.14 C.S. Peirce: The Simplest Mathematics, Collected Papers of Charles Sanders Peirce, Vol. 4 (Harvard Univ. Press, Cambridge 1933), ed. by C. Hartshorne, P. Weiss
9.15 C.S. Peirce: Exact Logic (Published Papers), Collected Papers of Charles Sanders Peirce, Vol. 3 (Harvard Univ. Press, Cambridge 1933), ed. by C. Hartshorne, P. Weiss
9.16 S. Marietti: Observing signs. In: New Essays on Peirce’s Mathematical Philosophy, ed. by M.E. Moore (Open Court, Chicago and La Salle 2010) pp. 147–167
9.17 S.-J. Shin: Peirce’s two ways of abstraction. In: New Essays on Peirce’s Mathematical Philosophy, ed. by M.E. Moore (Open Court, Chicago and La Salle 2010) pp. 41–58
9.18 S.-J. Shin: The forgotten individual: Diagrammatic reasoning in mathematics, Synthese 186, 149–168 (2012)
9.19 L. Magnani: Thinking through drawing: Diagram constructions as epistemic mediators in geometrical discovery, Knowl. Eng. Rev. 28(3), 303–326 (2013)
9.20 L. Magnani, R. Dossena: Perceiving the infinite and the infinitesimal world: Unveiling and optical diagrams in mathematics, Found. Sci. 10, 7–23 (2005)
9.21 D. Gooding: Experiment and the Making of Meaning: Human Agency in Scientific Observation and Experiment (Kluwer, Dordrecht 1990)
9.22 J. Woods: Recent developments in abductive logic, Stud. History Phil. Sci. 42(1), 240–244 (2011)
9.23 E. Hutchins: Cognition in the Wild (MIT Press, Cambridge 1995)
9.24 Piaget: Adaption and Intelligence (Univ. Chicago Press, Chicago 1974)
9.25 L. Magnani: Animal abduction: From mindless organisms to artifactual mediators. In: Model-Based Reasoning in Science, Technology, and Medicine, ed. by L. Magnani, P. Li (Springer, Berlin 2007) pp. 3–37
9.26 L. Magnani: Morality in a Technological World: Knowledge as Duty (Cambridge Univ. Press, Cambridge 2007)
9.27 L. Magnani: Understanding Violence. The Intertwining of Morality, Religion and Violence: A Philosophical Stance (Springer, Berlin 2011)
9.28 D. Gabbay, J. Woods: The Reach of Abduction: Insight and Trial, A Practical Logic of Cognitive Systems, Vol. 2 (North-Holland, Amsterdam 2005)
9.29 J. Woods: Errors of Reasoning: Naturalizing the Logic of Inference (College Publications, London 2013)
9.30 W. Park: How to learn abduction from animals? From Avicenna to Magnani. In: Model-Based Reasoning in Science and Technology Theoretical and Cognitive Issues, ed. by L. Magnani (Springer, Heidelberg/Berlin 2014) pp. 53–74
9.31 P. Thagard: Computational Philosophy of Science (MIT Press, Cambridge 1988)
9.32 G. Schurz: Patterns of abduction, Synthese 164, 201–234 (2008)
9.33 M.H.G. Hoffmann: Theoric transformations and a new classification of abductive inferences, Trans. C.S. Peirce Soc. 46(4), 570–590 (2011)
9.34 T. Kapitan: Peirce and the structure of abductive inference. In: Studies in the Logic of Charles Sanders Peirce, ed. by N. Houser, D.D. Roberts, J. Van Evra (Indiana Univ. Press, Bloomington Indianapolis 1997) pp. 477–496
9.35 L. Magnani: Naturalizing logic: Errors of reasoning vindicated: Logic reapproaches cognitive science, J. Appl. Logic 13, 13–36 (2015)
9.36 J. Woods: Errors of Reasoning: Naturalizing the Logic of Inference (College Publications, London 2013)
9.37 R. Thom: Esquisse d’une Sémiophysique. In: Semio Physics: A Sketch (Addison Wesley, Redwood City 1990) Inter Editions, Paris (1988), Translated by V. Meyer
9.38 W. Park: Abduction and estimation in animals, Found. Sci. 17, 321–337 (2012)
9.39 W. Park: On animal cognition: Before and after the beast-machine controversy. In: Philosophy and Cognitive Science. Western and Eastern Studies, ed. by L. Magnani, P. Li (Springer, Heidelberg/Berlin 2012) pp. 53–74, Sapere 2
9.40 S. Paavola: Peircean abduction: Instinct or inference?, Semiotica 153(1–4), 131–154 (2005)
9.41 K.T. Fann: Peirce’s Theory of Abduction (Martinus Nijhoff, Hague 1970)
9.42 D.R. Anderson: Creativity and the Philosophy of C.S. Peirce (Martinus Nijhoff Publishers, Dordrecht 1987)
9.43 R.J. Roth: Anderson on Peirce’s concept of abduction: Further reflections, Trans. C.S. Peirce Soc. 24(1), 131–139 (1988)
9.44 B. Brogaard: Peirce on abduction and rational control, Trans. C.S. Peirce Soc. 35(1), 129–155 (1999)
9.45 R.B. Burton: The problem of control in abduction, Trans. C.S. Peirce Soc. 36(1), 149–156 (2000)
9.46 H.G. Frankfurt: Peirce’s account of inquiry, J. Philos. 55, 588–592 (1985)
9.47 C.S. Peirce: Carnegie application of 1902 (MS L75), http://members.door.net/arisbe/ (1902)
9.48 N. Rescher: Peirce’s Philosophy of Science (Univ. Notre Dame Press, Notre Dame 1978)
9.49 C.S. Peirce: Pragmatism and Pragmaticism, Collected Papers of Charles Sanders Peirce, Vol. 5 (Harvard Univ. Press, Cambridge 1934), ed. by C. Hartshorne, P. Weiss
9.50 C. Darwin: The Formation of Vegetable Mould, Through the Action of Worms With Observations on Their Habits (Univ. Chicago Press, Chicago 1985), originally published in 1881
9.51 D.A. Norman: The Psychology of Everyday Things (Basic Books, New York 1988)
9.52 S. Paavola: Abduction through grammar, critic and methodeutic, Trans. C.S. Peirce Soc. 40(2), 245–270 (2004)
Part C
The Logic of Hypothetical Reasoning, Abduction, and Models
Ed. by Atocha Aliseda
This section, The Logic of Hypothetical Reasoning, Abduction, and Models, reviews some formal models for scientific inquiry. Scientific inquiry is a human enterprise whose great success cannot be denied. It has been our intellectual instrument for achieving great endeavors, such as landing on the moon, the possibility of internet communication, the discovery of infections and the invention of vaccines, and many more. The inferential processes involved in scientific inquiry are an essential aspect to analyze when carrying out an enterprise like the one in this Springer Handbook of Model-Based Science. However, the field and the material are so vast that it would be impossible to review everything.

Therefore, the main concern within scientific reasoning will be ampliative reasoning, those inferential processes in which the conclusion expands the given information. This kind of reasoning manifests itself in inferences such as induction and abduction, and it is opposed to deduction, in which conclusions are certain but add nothing new to the given. A salient aspect of ampliative reasoning is the tentative epistemic status of the conclusions produced, something which makes them defeasible. That is, given additional information, it may no longer be warranted to draw a previously valid conclusion.

More particularly, the focus in this section will be precisely on the tentative status of the conclusion produced, on its being hypothetical. A hypothetical statement is, at the very best, potential knowledge. It is neither true nor false, but holds a hypothetical epistemic status, one that may be settled later as true (when the hypothesis is corroborated) or as false (when it is falsified). Hypothetical reasoning is understood here as a type of reasoning for explanations.

One case of hypothetical reasoning is enumerative induction, also known as inductive generalization, in which the inferential process at stake obtains a universal statement (all ravens are black) from a set of individual ones (the first raven is black, ..., the n-th raven is black). A generalization from instances is a case of ampliative reasoning because it expands what is stated in the instances by advancing a defeasible prediction (the next raven will be black). That is, the generalization may fail when a further instance falsifies the conclusion. As is well known, inductive generalization is based on the assumption of the uniformity of nature; the world is uniform and, therefore, it seems safe to draw generalizations out of instances, although they may fail at some point.

Another case of hypothetical reasoning that shall be reviewed in depth is that of abduction. Broadly speaking, abduction is a reasoning process from a single observation to (plausible) explanations. This characterization, which largely follows the original formulation of Charles Peirce (to be described in Chap. 10), still leaves ample room for several interpretations. To begin with, when talking about abduction, or any inferential process for that matter, we may refer to a finished product, in this case the abductive explanation, or to an activity, the abductive process. These two are closely related, for the abductive process produces an abductive explanation, but they are not the same. Moreover, for a given fact to be explained, there are often several abductive explanations to choose from, but only one that counts as the best one. Thus, abduction is connected to both hypothesis construction and hypothesis selection. Some authors consider these processes as two separate steps, construction dealing with the generation of plausible explanations according to some criteria of what counts as such, and selection with applying some preference criteria to select the best one among the plausible ones. Another issue to be settled in regard to abduction is the distinction from its closest neighbor, induction.

In this very broad map of hypothetical reasoning, two approaches to abduction are salient: one interprets it as argument, the other as inference to the best explanation (each case, in turn, may be seen either as a product or as a process). This is a familiar distinction in the philosophy of science, where abduction is closely connected with issues of scientific explanation. In a more in-depth view of logical abduction, three characterizations may be identified, namely as logical inference, as a computational process, and as a process for epistemic change. Each one of these views highlights one relevant aspect of abductive reasoning: its logical structure (under an interpretation as a product), its underlying computational mechanism (under an interpretation as a process), and its role in the dynamics of belief revision. Indeed, there are several ways to characterize this reasoning type, and it may be more appropriate to characterize abductive patterns rather than trying to define it as a single concept. These approaches and logical characterizations of abduction will be spelled out in detail in the introductory chapter, The Logic of Abduction: An Introduction, together with an attempt to provide a proper distinction between induction and abduction.

In Chap. 11 by Mathieu Beirlaen, Qualitative Inductive Generalization and Confirmation, the author offers a number of adaptive logics for inductive generalization, each of which is analyzed as a criterion of confirmation and confronted with Hempel’s satisfaction criterion and the hypothetico-deductive model
of confirmation. The adaptive criteria proposed in this paper offer an interesting alternative perspective on (qualitative) confirmation theory in the philosophy of science.

Adaptive logics are a relatively new proof-theoretical framework designed to model ampliative reasoning and dynamic information. For the case of inductive generalization, the defeasibility of the conclusion is dealt with in two respects. On the one hand, each of these proposed logics uses a criterion to assert a generalization as a statement within the proof. On the other hand, each of these logics implements a strategy, a specific way by which a generalization is refuted and, therefore, marked in the proof, so that it is no longer considered as part of the derivation (until it is unmarked due to new information).

In Chap. 12 by Tjerk Gauderis, Modeling Hypothetical Reasoning by Formal Logics, the author offers an interesting discussion in regard to the feasibility of the project of modeling hypothetical reasoning by means of formal logics, exploring the assumptions one has to hold in order to accept or reject this endeavor. The author then puts forward four patterns of hypothetical reasoning, showing that not a single one can be easily modeled by formal means. Abduction of a singular fact is the one pattern that has received most attention in the logical literature; the author gives a review and a detailed description of two adaptive logics devised for this particular pattern, showing that although it is the simplest pattern of all four, there are already some challenges to model it formally.

These two chapters share the logical framework of adaptive logics as a formal model for scientific inquiry, one for inductive generalization, the other one for single fact abduction. Of the four patterns of hypothetical reasoning put forward by Gauderis, the second one, abduction of a generalization, is indeed a case of inductive generalization.

According to the previously mentioned classification of hypothetical reasoning, these two chapters fall into the argumentative approach of the type of reasoning modeled (inductive or abductive). However, the chapter by Gauderis highlights a distinction between practical abduction and theoretical abduction, which, to a certain extent, corresponds to the argumentative versus inference to the best explanation dichotomy of abduction found in the philosophical literature.

The next three chapters of this section belong to the epistemic approach to abduction, taking an agent-oriented stance in which the agent’s perspective is at the center of the formal modeling. However, each of these chapters is actually a combination of at least two abductive characterizations.

In Chap. 13 by Angel Nepomuceno Fernández, Fernando Soler Toscano and Fernando R. Velázquez Quesada, Abductive Reasoning in Dynamic Epistemic Logic, the authors rely on the dynamic epistemic logic framework, which is largely based on a semantic perspective of modal logic and is an ideal tool to represent an agent’s state of knowledge (and belief) together with the dynamics of epistemic change. Operations to upgrade, update and revise plausibility models are put forward to dynamically change both the content and the ordering of these models. Original characterizations of what is an abductive problem (solution) are put forward, not with respect to a background theory, as is the case in the classical approach to abduction (see introductory chapter), but rather with respect to an agent’s information at a given state. Moreover, plausibility models provide an ordering among epistemic possibilities and accordingly, the best abductive explanation turns out to be the most plausible one. This chapter exhibits the inference to the best explanation approach and a combination of the inferential and epistemic characterizations of abduction.

In Chap. 14 by Cristina Barés Gómez and Matthieu Fontaine, Argumentation and Abduction in Dialogical Logic, the authors offer an interesting discussion in favor of a reconciliation between argumentation theory and formal logic; one in which their selected logical framework, dialogical logic, is the formal model for scientific inquiry. More particularly, reasoning is modeled via a dialectical interaction in a game-like scenario between the proponent of a thesis and an opponent to it. The authors endorse the view of Dov Gabbay and John Woods, according to which abduction is a response to an ignorance problem. An agent has an ignorance problem with respect to a cognitive target when she lacks the knowledge to attain such a target, and abduction is but one type of solution to this kind of problem. The authors of this chapter propose an extension of the dialogical framework to account for abduction and, in order to do so, put forward the notion of a concession problem. The chapter follows the argumentative approach to abduction, but extends this view with a dialectical interaction and combines aspects of the inferential and epistemic characterizations.

In the last chapter of this section, Chap. 15 by Juliana Bueno Soler, Walter Carnielli, Marcelo Coniglio, and Abilio Rodrigues, Formal (In)consistency, Abduction and Modalities, the authors take a broader view of
scientific inquiry and deal with the problem of inconsistent information, as when there is conflicting evidence for a fact to be explained. In accordance with the focus on hypothetical reasoning taken in this section, the onset of conflicting information, as presented in this chapter, is just another case of a tentative conclusion, one which is taken in a sense weaker than true, with a provisional status and pending further investigation. The authors of this chapter are interested in reviewing the case when no obvious explanation is at hand, especially when contradictory information is involved, and a meaningful explanation can still be constructed (one that cannot be produced in a classical setting). They develop their own formal framework, based on the classical tableaux systems. With respect to abduction, the authors apply a paraconsistent logic to deal with it, and even go further to draw connections between modalities and consistency. Their approach is focused on the process of hypothesis generation, making it attractive for computational implementation (not developed in the paper) and identified by the authors themselves as a case of creative abduction, one which contrasts with explicative abduction, according to the distinction put forward by Lorenzo Magnani. The chapter mainly follows the argumentative approach to abduction and is a combination of the computational and epistemic characterizations.

By virtue of their being formal models of scientific reasoning, the chapters to follow are technical; each one of them offers an original contribution to the field, but at the same time, they all provide the intuition and rationale behind the notions presented. The diversity of formal frameworks displayed in this section shows the wide variety of formal tools for modeling hypothetical reasoning. These tools have proved useful to philosophers, logicians, and computer scientists alike and may also be so to anyone who would like to make use of the potential of formal tools to model scientific inquiry at large. This section, The Logic of Hypothetical Reasoning, Abduction, and Models, offers a thorough introduction to, and an overview of, some formal models for hypothetical reasoning found in the philosophy of science and logical literature.
10. The Logic of Abduction: An Introduction
Atocha Aliseda
... philosophy of science, induction is understood in the broad sense of any kind of inference that expands knowledge in the face of uncertainty [10.2].

Since the time of John Stuart Mill (1806–1873), the technical name given to all kinds of nondeductive reasoning has been induction, but several methods for discovery and demonstration of causal relationships [10.3] were recognized. These included generalizing from a sample to a general property, and reasoning from data to a causal hypothesis (the latter further divided into methods of agreement, difference, residues, and concomitant variation). A more refined and modern terminology is enumerative induction and explanatory induction, of which inductive generalization, inductive projection, statistical syllogism, and concept formation are some instances.

Another term for nondeductive reasoning is statistical reasoning, introducing a probabilistic flavor, in which explanations are not certain but only probable. Statistical reasoning exhibits the same diversity as abduction. First of all, just as the latter is strongly identified with backward deduction (as will be shown later in this chapter), the former finds its reverse notion in probability (for those readers interested in quantitative approaches, the problem in probability is: Given a stochastic model, what can we say about the outcomes? The problem in statistics is the reverse: Given a set of outcomes, what can we say about the model?). Both abduction and statistical reasoning are closely linked with notions like confirmation (the testing of hypotheses) and likelihood (a measure for alternative hypotheses). The former will be reviewed later in this part of the handbook (Chap. 11).

On the other hand, some authors put forward abduction as the main category and take induction as one of its instances. Abduction as inference to the best explanation (IBE) is considered by Harman [10.4] as the basic form of nondeductive inference, which includes (enumerative) induction as a special case (this approach will be presented later in this chapter).

This confusion in terminology returns in artificial intelligence (AI). Induction is used for the process of learning from examples, but also for creating a theory to explain the observed facts [10.5], thus making abduction an instance of induction. Abduction is usually restricted to producing abductive explanations in the form of facts (predicates of some sort, as those used in computational implementations of abduction, to be introduced later). When explanations are rules, it is then regarded as part of induction. Indeed, the relationship between abduction and induction has been a distinguished topic of several workshops in AI mainstream conferences (European Conference on Artificial Intelligence (ECAI) and International Joint Conference on Artificial Intelligence (IJCAI)) as well as that of edited books [10.6].

With the sole purpose of providing a methodological distinction between abduction and induction, in this chapter, abduction will be understood as reasoning from a single observation to its explanations, and induction as enumerative induction, a kind of reasoning from samples to general statements. Given these tentative characterizations, those aspects that distinguish them will be highlighted. While induction explains a set of observations, abduction explains a single one. Induction makes a prediction for further observations; abduction does not (directly) account for later observations. While induction needs no background theory per se, abduction relies on a background theory to construct and test its abductive explanations.

As for their similarities, induction and abduction are both ampliative and defeasible modes of reasoning. More precisely, they are nonmonotonic types of inference. A consequence relation $\Rightarrow$ is labeled as nonmonotonic whenever $T \Rightarrow b$ does not guarantee $T, a \Rightarrow b$. That is, the addition of a new premise ($a$) may invalidate a previously valid argument. In the terminology of philosophers of science, nonmonotonic inferences are not erosion proof [10.7]. Moreover, qua direction, both run in the opposite direction to standard deduction; they both run from evidence to explanation and the status of the produced explanation is hypothetical.

To clear up terminological conflicts, one might want to coin new terminology altogether. Some may argue for a new term of explanatory reasoning as done in [10.8] or, even better, hypothetical reasoning, trying to describe its fundamental aspects without having to decide if they are instances of either abduction or induction. In this broader perspective, it is also possible to capture explanation for more than one instance or for generalizations and introduce further fine-structure. Indeed, a classification in terms of patterns of hypothetical reasoning may be very appropriate, as will be found later on in this part of the handbook (Chap. 12; a key reference in the literature on patterns of abduction is [10.9]).

10.1.2 The Founding Father: C.S. Peirce

The literature on abduction is so vast that it is impossible to undertake a complete survey here. But any history of abduction cannot fail to mention the founding father: Charles Sanders Peirce (1839–1914).

Peirce is the founder of American pragmatism and the first philosopher to give abduction a logical form. However, his notion of abduction is a difficult one to unravel. On the one hand, it is entangled with many other aspects of his philosophy, and on the other hand, several different conceptions of abduction evolved in his thought. The notions of logical inference and of validity that Peirce puts forward go beyond our present understanding of what logic is about. They are linked to his epistemology, a dynamic view of thought as logical inquiry, and correspond to a deep philosophical concern, that of studying the nature of synthetic reasoning. In what follows, a few general aspects of his later theory of abduction will be pointed out, to later concentrate on some of its more logical aspects (for a more elaborate analysis of the evolution of Peirce’s abduction as well as its connection to his epistemology, see [10.10–12]).

For Peirce, three aspects determine whether a hypothesis is promising: it must be explanatory, testable, and economic. A hypothesis is an explanation if it accounts for the facts. Its status is that of a suggestion until it is verified, which explains the need for the second criterion.

Finally, the motivation for the economic criterion is twofold: a response to the practical problem of having innumerable explanatory hypotheses to test, as well as the need for a criterion to select the best explanation among the testable ones.

For the explanatory aspect, Peirce gave the following often-quoted logical formulation [10.13, CP 5.189]:

“The surprising fact, C, is observed.
But if A were true, C would be a matter of course.
Hence, there is reason to suspect that A is true.”

This formulation has played a fundamental role in Peirce scholarship, and it has been the point of departure of many classic studies on abductive reasoning in all fields that make up the cognitive sciences, mainly those in which the approach is argumentative-based. Nevertheless, these accounts have paid little attention to the elements of this formulation and practically none to what Peirce said elsewhere in his writings. This situation may be due to the fact that his philosophy is very complex and not easy to implement in the computational realm. The notions of logical inference and of validity that Peirce puts forward go beyond logical formulations, but at the same time some of his ideas find a natural place in recent proposals, such as that found in theories of belief revision (to be reviewed later in this chapter).

The approach to abductive reasoning in this handbook part reflects this Peircean diversity in part, taking abduction as a style of logical reasoning that occurs at different levels and contexts and comes in several degrees.

10.1.3 The Cognitive Sciences

Research on abduction in AI dates back to the 1970s [10.14], but it is only fairly recently that it has attracted great interest, in areas like logic programming, knowledge assimilation, and diagnosis, to name a few. Some publications, collective and individual alike, are found in [10.15–19]. In all these places, the discussion about the different aspects of abduction has been conceptually challenging but also shows a (terminological) confusion with its close neighbor, induction (similar to what has already been pointed out previously). Abduction has also been a distinguished topic of model-based reasoning (MBR) conferences (those linked to the editorial project of this handbook).

The importance of abduction has been recognized by leading researchers in nearly all fields that make up the cognitive sciences: philosophy, computer science, cognitive psychology, and linguistics. For Jaakko Hintikka, abduction is the fundamental problem of contemporary epistemology, in which abductive inferences must be construed as answers to the inquirer’s explicit or (usually) tacit question put to some definite source of answers (information) [10.11, p.519]. For Herbert Simon, the nature of the retroductive process (Peirce’s original term for abduction) is the main subject of the theory of problem solving in both its positive and normative versions [10.20, p.151]. For Paul Thagard, several kinds of abduction play a key role as heuristic strategies in the program PI (processes of induction), a working system devoted to explaining, in computational terms, some of the main problems in philosophy of science, such as scientific discovery, explanation, and evaluation [10.2]. Finally, for Noam Chomsky, abduction plays a key role in language acquisition, for the child abduces the rules of grammar guided by her innate knowledge of language universals [10.21].

10.1.4 Some Examples

There are a variety of approaches that claim to capture the true nature of the notion of abduction. One reason for this diversity lies in the fact that abductive reasoning occurs in a multitude of contexts, ranging from the simplest selection of already existing hypotheses in a context of common sense reasoning to the generation of new concepts in science. Here are some examples illustrating this variety (examples are taken from [10.8]):

1. Common sense: Explaining observations with simple facts. All you know is that the lawn gets wet either when it rains, or when the sprinklers are on. You wake up in the morning and notice that the lawn
is wet. Therefore you hypothesize that it rained during the night or that the sprinklers had been on.
2. Common sense: When something does not work. You come into your house late at night, and notice that the light in your room, which is always left on, is off. It has been raining very heavily, and so you think some power line went down, but the lights in the rest of the house work fine. Then, you wonder if you left both heaters on, something which usually causes the breakers to cut off, so you check them: but they are OK. Finally, a simpler explanation crosses your mind. Maybe the light bulb of your lamp, which you last saw working well, is worn out and needs replacing.

These examples belong to a practical setting found in our day-to-day common sense reasoning. The first one is the paradigmatic example of abduction in AI and will be analyzed in full detail later in this part of the handbook, in Chap. 15. An extension of the second example will be presented in Chap. 13.

Some other instances of abductive reasoning, in this case oriented to model the cognitive competence of health practitioners and of working scientists, are the following:

3. Statistical reasoning: Medical diagnosis. Jane Jones recovered quite rapidly from a streptococcus infection after she was given a dose of penicillin. Almost all streptococcus infections clear up quickly upon administration of penicillin, unless they are penicillin resistant, in which case the probability of quick recovery is rather small. The doctor knew that Jane’s infection is of the penicillin-resistant type, and is puzzled by her recovery. Jane Jones then confesses that her grandmother had given her Belladonna, a homeopathic medicine that stimulates the immune system by strengthening the physiological resources of the patient to fight infectious diseases. (This is an adaptation of Hempel’s illustration of his inductive-statistical model of explanation as shown in [10.7]. The part about homeopathy is taken from [10.8].)
4. Scientific reasoning: Kepler’s discovery. It has been claimed that Johannes Kepler’s great discovery that the orbit of the planets is elliptical rather than circular was a prime piece of abductive reasoning [10.13, CP 2.623]. What initially led to this discovery was his observation that the longitudes of Mars did not fit circular orbits, but before even dreaming that the best explanation involved ellipses instead of circles, he tried several other forms. Moreover, Kepler had to make some other assumptions about the planetary system, without which his discovery does not work. His heliocentric view allowed him to think that the sun, so near to the center of the planetary system, and so large, must somehow cause the planets to move as they do. In addition to this strong conjecture, he also had to generalize his findings for Mars to all planets, by assuming that the same physical conditions obtained throughout the solar system.

The third example is concerned with the construction of a diagnosis: based on a series of observations (symptoms and signs) and on causal relations linking those observations to pathologies, health professionals build their diagnoses in order to determine an illness. But abduction also occurs in theoretical scientific contexts, such as the one described in the fourth example, in which anomalous observations give rise to new ideas that force a revision of the knowledge found in existing theories.
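The first common sense example can be read as a small computation: given what is known about how the lawn gets wet, collect the facts whose truth would make the observation follow. The following is a minimal sketch under stated assumptions; the encoding of the rules as (cause, effect) pairs and the name `candidate_explanations` are hypothetical choices made for this illustration and are not taken from the chapter.

```python
# Illustrative sketch only: rules are (cause, effect) pairs, an assumed encoding.
RULES = [("rain", "lawn_wet"), ("sprinklers_on", "lawn_wet")]

def candidate_explanations(rules, observation):
    """Return every cause that, by some rule, would make the observation follow."""
    return sorted({cause for cause, effect in rules if effect == observation})

# The wet lawn admits two abductive explanations in the form of simple facts.
print(candidate_explanations(RULES, "lawn_wet"))  # ['rain', 'sprinklers_on']
```

The sketch only constructs candidates; it says nothing about which of them is the better explanation.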
However intuitive, this interpretation certainly captures neither the fact that C is surprising nor the additional criteria Peirce proposed. Moreover, the interpretation of the second premise should not be committed to material implication (for a causal interpretation of this conditional, see [10.31]). But other interpretations are possible; any nonstandard form of logical entailment or even a computational process in which A is the input and C the output, are all feasible interpretations for if A were true, C would be a matter of course.

The additional Peircean requirements of testability and economy are not recognized as such in AI, but to some extent are nevertheless incorporated. The latter criterion is implemented as a further selection process to produce the best explanation, since there might be several formulae that satisfy the above formulation but are nevertheless inappropriate as explanations. Testability as understood by Peirce is an extra-logical empirical criterion.

10.2.2 Inference to the Best Explanation

Abduction as IBE was proposed by Gilbert Harman as the basic form of nondeductive inference, one which included enumerative induction as one of its instances. According to him [10.4, p.89]:

“Uses of the inference to the best explanation are manifold. When a detective puts the evidence together and decides that it must have been the butler, he is reasoning that no other explanation which accounts for all the facts is plausible enough or simple enough to be accepted.”

This idea may be put into an argumentative form as follows [10.32]:

“D is a collection of data (facts, observations, givens)
H explains D (would, if true, explain D)
No other hypothesis explains D as well as H does
Therefore, H is probably correct”

Given a fact to be explained, there are often several possible abductive explanations, but (hopefully) only one that counts as the best one. Pending subsequent testing, in the previous common sense example of light failure (2) several abductive explanations account for the unexpected darkness of the room (power line down, breakers cut off, bulb worn out). But only one may be considered as best explaining the event, namely the true one, the one that really happened. But other preference criteria may be appropriate, too, especially when there is no direct test available.

Under this approach, abduction may be regarded as a single process by which a single best explanation is constructed. The focus is on finding selection criteria that allow a hypothesis to be characterized as the best one (a key reference for the interpretation of abduction as inference to the best explanation is [10.33]).

Thus, abduction is connected to both hypothesis construction and hypothesis selection. Some authors consider these processes as two separate steps: construction dealing with what counts as a possible abductive explanation, and selection with applying some preference criterion over possible abductive explanations to select the best one.

As it turns out, the notion of a best abductive explanation necessarily involves contextual aspects, varying from application to application. There is at least a new parameter of preference ranking here. There exists both a philosophical tradition on the logic of preference, and logical systems in AI for handling preferences that may be used to single out best explanations [10.34, 35]. A proposal to tackle this approach in the framework of dynamic epistemic logic, in which (an extension of) the light failure example is analyzed in full detail, will be presented later in this handbook part (Chap. 13).

10.2.3 A Taxonomy

What has been presented so far may be summarized as follows. Abduction is a process whose products are specific abductive explanations, with a certain inferential structure, making an (abductive) explanatory argument. As for the logical form of abduction, referring to the inference corresponding to the abductive process that takes a background theory ($\Theta$) and a given observation ($\varphi$) as inputs, and produces an abductive explanation ($\alpha$) as its output, the proposal here is that at a very general level, the logical structure of abduction may be viewed as a threefold relation

$\Theta, \varphi \Rightarrow \alpha$.

Other parameters are possible here, such as a preference ranking, but these would rather concern the further selection process. This characterization aims to capture the direction (from evidence to abductive explanation) of this type of reasoning. In the end, however, the goal is to characterize an (abductive) explanatory argument, in its deductive forward fashion, that is, an inference from theory ($\Theta$) and abductive explanation ($\alpha$) to evidence ($\varphi$) as follows

$\Theta, \alpha \Rightarrow \varphi$.
Against this background, three main parameters that determine types of explanatory arguments are put forward:

1. An inferential parameter ($\Rightarrow$) sets some suitable logical relationship among explanans, background theory, and explanandum.
2. Next, abductive triggers determine what kind of abductive process is to be performed: $\varphi$ may be a novel phenomenon, or it may be in conflict with the theory $\Theta$.
3. Finally, abductive outcomes ($\alpha$) are the various products (abductive explanations) of an abductive process: facts, rules, or even new theories.

observation instead ($\Theta \Rightarrow \neg\varphi$)? Thus, two triggers for abduction are identified: novelty and anomaly.

Definition 10.1 (Abductive Novelty: $\Theta \not\Rightarrow \varphi$, $\Theta \not\Rightarrow \neg\varphi$)
$\varphi$ is novel. It cannot be explained ($\Theta \not\Rightarrow \varphi$), but it is consistent with the theory ($\Theta \not\Rightarrow \neg\varphi$).

Definition 10.2 (Abductive Anomaly: $\Theta \not\Rightarrow \varphi$, $\Theta \Rightarrow \neg\varphi$)
$\varphi$ is anomalous. The theory explains rather its negation ($\Theta \Rightarrow \neg\varphi$).
10.3 Three Characterizations

This section offers a description of three logical characterizations of abduction found in the literature. Nowadays, there are plenty of logic-based papers on abduction, most of which fit into one of the following three characterizations:

1. Abduction as logical inference
2. Abduction as computation
3. Abduction as epistemic change.

While the first one aims at characterizing abduction as backward deduction plus additional conditions and (generally) has classical logical consequence as an underlying inference, the second one focuses on providing specific algorithms that produce abductive explanations, and it is therefore as varied as computational platforms allow for. The last one puts forward abductive operations to revise, expand and update and has close links both with theories of belief revision and of update in AI. These approaches may be labeled as the classical characterizations of logical abduction. These characterizations are classical in at least two respects. In the first place, they are classical because they emerged originally as logical models for abduction and capture at least one relevant aspect of abductive reasoning: its logical structure, its underlying computational mechanism, and its role in the dynamics of belief revision. In another respect, they are classical – in so far as presented in this chapter – because they exhibit a very classical way of doing logic, computation, or formalizations of belief revision theories.

Interesting proposals in all three characterizations and their combinations are to be found in special issues of the Logic Journal of the IGPL (most notably, in special issues on abduction, such as [10.36–38]). Some of the chapters to follow are good examples of these combinations as well.

10.3.1 Inferential

The classical characterization of abduction as logical inference is mainly a deductive classical logical account in which a background theory ($\Theta$), together with an abductive explanation ($\alpha$), constitutes the explanans and entails the explanandum ($\varphi$). It puts forward the following logical schema.

Given a theory $\Theta$ (a set of formulae) and a formula $\varphi$ (an atomic formula), $\alpha$ is an abductive explanation if:

1. $\Theta \cup \alpha \models \varphi$
2. $\alpha$ is consistent with $\Theta$
3. $\alpha \not\models \varphi$
4. $\alpha$ is minimal.

When $\models$ is interpreted as classical logical consequence, conditions 1 and 2 go hand in hand and are clearly mandatory. The first one dictates the entailment condition while the second one requires the abductive explanation to be consistent with the background theory, for the principle of explosion is valid in classical logic (Chap. 15). As for condition 3, it is necessary in order to avoid self-explanations or, more generally, explanations that are independent from the background theory. Condition 4 aims at capturing either a criterion of best explanation by which minimal may be interpreted as selecting the weakest explanation (e.g., not equal to $\Theta \to \varphi$) or a preferred explanation (which requires a predefined preference ordering). Additionally, there is usually another requirement restricting the logical vocabulary as well as the syntactic form of the explanation $\alpha$, such as being an atomic formula from the vocabulary in the logical language. This schema suggests a classification into abductive inferential styles (as done in [10.8]), one in which plain, consistent, explanatory, minimal, and preferential abductive styles correspond to the above conditions.

A (pre)condition to the above schema – not always made explicit – is that the theory entails neither the explanandum nor its negation ($\Theta \not\models \varphi$ and $\Theta \not\models \neg\varphi$). In many proposals, this condition constitutes an abductive problem (such as in Chap. 15). Given the previous taxonomy for abduction by which two abductive triggers were distinguished, the following definition of an abductive problem is put forward, one in which novel and anomalous problems are distinguished.

Definition 10.3 (Abductive problem)
Let $\Theta$ and $\varphi$ be a theory and a formula, respectively, in some language $L$. Let $\models$ be a consequence relation on $L$:

• The pair $(\Theta, \varphi)$ constitutes a (novel) abductive problem when neither $\varphi$ nor $\neg\varphi$ are consequences of $\Theta$. That is, when

$$\Theta \not\models \varphi \quad\text{and}\quad \Theta \not\models \neg\varphi .$$

• The pair $(\Theta, \varphi)$ constitutes an anomalous abductive problem when $\varphi$ is not a consequence of $\Theta$, but $\neg\varphi$ is. That is, when

$$\Theta \not\models \varphi \quad\text{and}\quad \Theta \models \neg\varphi .$$

It is typically assumed that the theory $\Theta$ is a set of formulas closed under logical consequence, and that $\models$ is a truth-preserving consequence relation.
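As a rough illustration of the logical schema and of Definition 10.3, the following Python sketch checks conditions 1–3 and classifies abductive problems by brute-force truth tables over a tiny propositional language. Everything in it is an assumption made for the example: the atoms rain, sprinkler and wet, the toy theory, and the helper names (entails, classify_problem, is_explanation) do not come from the chapter. Formulas are represented as Python functions on valuations, and condition 4 (minimality) is left out because it depends on a chosen ordering.

```python
from itertools import product

# Hypothetical propositional vocabulary for the illustration.
ATOMS = ["rain", "sprinkler", "wet"]


def valuations():
    """Enumerate every truth assignment to ATOMS."""
    for bits in product([False, True], repeat=len(ATOMS)):
        yield dict(zip(ATOMS, bits))


def satisfiable(formulas):
    """True if some valuation makes all formulas true."""
    return any(all(f(v) for f in formulas) for v in valuations())


def entails(premises, conclusion):
    """Classical consequence: premises plus the negated conclusion are unsatisfiable."""
    return not satisfiable(list(premises) + [lambda v: not conclusion(v)])


def classify_problem(theory, phi):
    """Definition 10.3: novel if neither phi nor its negation follows,
    anomalous if the negation follows."""
    if entails(theory, phi):
        return "no problem"
    if entails(theory, lambda v: not phi(v)):
        return "anomalous"
    return "novel"


def is_explanation(theory, alpha, phi):
    """Conditions 1-3 of the schema (minimality is not checked)."""
    return (
        entails(list(theory) + [alpha], phi)        # 1. theory + alpha entails phi
        and satisfiable(list(theory) + [alpha])     # 2. alpha is consistent with the theory
        and not entails([alpha], phi)               # 3. alpha alone does not entail phi
    )


# Toy theory: rain -> wet, sprinkler -> wet.
theory = [
    lambda v: (not v["rain"]) or v["wet"],
    lambda v: (not v["sprinkler"]) or v["wet"],
]
wet = lambda v: v["wet"]
rain = lambda v: v["rain"]

print(classify_problem(theory, wet))
print(is_explanation(theory, rain, wet))
print(is_explanation(theory, wet, wet))
```

With this toy theory, wet is a novel problem, rain passes conditions 1–3, and wet itself is rejected by condition 3 as a self-explanation. Adding the negation of wet to the theory turns the same observation into an anomalous problem.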
Given this definition, a (novel) abductive solution would then be any formula $\alpha$ which follows the logical schema above. It remains to be seen, however, what would be an anomalous abductive solution for an anomalous abductive problem. To this end, several proposals exist in the literature (e.g., [10.39]), many of which acknowledge the possibility of theory revision, for some formula may be retracted from $\Theta$ in order to maintain the consistency of the revised theory (in Chap. 13 a proposal is made in this direction in the framework of dynamic epistemic theories, one in which a three-way distinction of abductive problems is offered).

Going back to the logical schema above, it should be stressed that some authors are not committed to classical logical entailment but rely instead on some other form of nonclassical consequence, which often means that the conditions that make up the logical schema need not be imposed explicitly or are replaced instead by weaker ones. For example, in Chap. 15, in which abduction is modeled in paraconsistent logics, the consistency condition is replaced by a weaker one stating nontriviality ($\Theta$ and $\alpha$ are nontrivial when there exists a $B$ such that $\Theta, \alpha \not\models B$).

To be sure, while there are clear-cut characterizations of what is an abductive problem (solution), there are several logical ways to make it precise. It should be clear by now that there is a significant way in which abduction is interpreted as a logical inference in its own right. This characterization does integrate the view of an otherwise varied group of scholars; John Woods has labeled it as the AKM model, according to the initial letters of its proponents' surnames [10.40, p. 305]:

“Thus, for example, for ‘A’ we have Aliseda [10.8]; for ‘K’ we have Kowalski [10.41], Kuipers [10.25], Kakas et al. [10.26] and Flach and Kakas [10.6]; and for ‘M’ there is Magnani [10.19] and Meheus et al. [10.30]. Needless to say, there are legions of AKM-proponents whose surnames are ungraced by any of these letters in first positions.”

10.3.2 Computational

Of the many computational approaches to abduction, abductive logic programming (ALP) is the one selected to be described in detail (see [10.26, 42–44] for an overview of the field). Logic programming works mostly within first-order logic, and it consists of logic programs, queries, and an underlying inferential mechanism known as resolution. Abduction emerges naturally in logic programming as a repair mechanism, that is, its result is an extension of a logic program with the facts needed for a query to succeed (a query does not succeed when it is not derivable from the program). In actual ALP, for these facts to be counted as abductions, they have to belong to a predefined set of abducibles, and to be verified by additional conditions (so-called integrity constraints), in order to prevent a combinatorial explosion of possible explanations.

Therefore, logic programming does not use blind deduction; different control mechanisms for proof search determine how queries are processed and this is crucial to the efficiency of the enterprise. Hence, different control policies will vary in the abductions produced, their form and the order in which they appear.

In what follows, it will be illustrated how an ALP system works, using a notation closer to logic than to logic programs. An abductive logic program has three components:

• A set of rules. Each rule has a head (a predicate) and a body (a set of predicates). A way to prove the head is to prove all predicates of its body. Rules without body are facts and are assumed to be true. Here is a very simple logic program

$$\text{eats}(X, Y) \leftarrow \text{vegetarian}(X), \text{vegetable}(Y) \tag{10.1}$$
$$\text{eats}(X, Y) \leftarrow \text{carnivore}(X), \text{meat}(Y) \tag{10.2}$$
$$\text{vegetarian}(\text{rabbit}) . \tag{10.3}$$

There are two rules (10.1) and (10.2) for the predicate eats(X, Y). The first rule, for example, states that a way to prove that X eats Y is to prove that X is vegetarian and Y is a vegetable. The fact (10.3) states that the rabbit is vegetarian.

• A set of abducible predicates. A common restriction is that abducible predicates cannot be in the head of any rule. In the example above, the following may be considered as abducible predicates: vegetarian(X), carnivore(X), vegetable(X) and meat(X).

• A set of integrity constraints. These are rules with $\bot$ (falsehood) as head. To be satisfied, they require that not all literals in the body are simultaneously true. In the example above, the following constraint may be added in order to avoid that an animal is considered both a vegetarian and a carnivore

$$\bot \leftarrow \text{vegetarian}(X), \text{carnivore}(X) . \tag{10.4}$$

An abductive problem in ALP manifests itself when there is a query (an instance of some predicate) that cannot succeed with the rules of the logic program
(the query is not derivable from the program). In the example above, the query eats(rabbit, banana) is not successful with the program (10.1)–(10.3). An abductive solution in ALP is then a set of facts that, together with the program, entail the original query. In the example above, there are two possible sets of these:

• {vegetable(banana)}
• {carnivore(rabbit), meat(banana)}.

But only the first one satisfies integrity constraint (10.4), because assuming carnivore(rabbit), together with vegetarian(rabbit), contradicts (10.4).

More technically, in ALP, standard selective linear definite resolution [10.45] is extended to build the abductive solutions and check the integrity constraints. Some modified resolution procedures and ALP systems have been developed since the pioneering work of Bob Kowalski [10.41], but the usual procedure remains the same: look for abductive solutions at the dead ends of the proofs. That is, when a resolution proof cannot be completed because there is no clause to prove some query, if the predicate of that query is abducible, then it is incorporated into the abductive solution. Other procedures use well-founded semantics and stable models. The overview of ALP is left here.

As already mentioned, there are indeed many computational approaches to abduction. To end this section, here are some final words in regard to the logical framework of semantic tableaux, which – when properly extended – allows for a combination of both approaches to abduction previously reviewed, that of abduction as a logical inference and abduction as computation.

As is well known in the logical literature, semantic tableaux is a refutation method to test the validity of formulae. Roughly speaking, it works as follows (see [10.46] for an overview of the field):

“To test if a formula $\varphi$ follows from a set of premises $\Theta$, a tableau tree for the sentences in $\Theta \cup \{\neg\varphi\}$ is constructed, denoted by $T(\Theta \cup \{\neg\varphi\})$. The tableau itself is a binary tree built from its initial set of sentences by using rules for each of the logical connectives that specify the ways in which the tree branches.
If the tableau closes (every branch contains an atomic formula and its negation), the initial set is unsatisfiable and the entailment $\Theta \models \varphi$ holds. Otherwise, if the resulting tableau has open branches, the formula $\varphi$ is not a valid consequence of $\Theta$.”

Within this framework, abduction comes into play as an extension of the constructed tableau. When there are open branches, which indicates the condition for a novel abductive problem in this framework, the generation of abductive solutions consists in producing those formulae which close the open branches (in a consistent way). Abduction in semantic tableaux offers a way of implementing computationally the AKM model (for more details on abduction in semantic tableaux, see [10.27] for its original proposal and [10.8] for a further development). Moreover, abduction in semantic tableaux for a paraconsistent logical theory is to be found in Chap. 15, in which a detailed description of this framework will be offered.

10.3.3 Epistemic Change

Abduction has also been characterized as a process for epistemic change, and in this respect an obvious related territory is theories of belief revision in AI (see [10.8, 29, 47, 48] for an introduction and a more detailed account on this topic). These theories describe how to incorporate a new piece of information into a database, a scientific theory, or a set of common sense beliefs. More precisely, given a consistent theory $\Theta$ closed under logical consequence, called the belief state, and a sentence $\varphi$, the incoming belief, there are three epistemic attitudes for $\Theta$ with respect to $\varphi$:

1. $\varphi$ is accepted ($\varphi \in \Theta$)
2. $\varphi$ is rejected ($\neg\varphi \in \Theta$)
3. $\varphi$ is undetermined ($\varphi \notin \Theta$, $\neg\varphi \notin \Theta$).

Given these attitudes, the following operations characterize the kind of belief change $\varphi$ brings into $\Theta$, thereby effecting an epistemic change in the agent’s currently held beliefs:

• Expansion. A new sentence is added to $\Theta$ regardless of the consequences of the larger set to be formed. The belief system that results from expanding $\Theta$ by a sentence $\varphi$ together with the logical consequences is denoted by $\Theta + \varphi$.
• Contraction. Some sentence in $\Theta$ is deleted without any addition of new facts. In order to guarantee the deductive closure of the resulting system, some other sentences of $\Theta$ may be given up. The result of contracting $\Theta$ with respect to sentence $\varphi$ is denoted by $\Theta - \varphi$.
• Revision. A new sentence that is (typically) inconsistent with a belief system $\Theta$ is added, but in order that the resulting belief system be consistent, some of the old sentences in $\Theta$ are deleted. The result of revising $\Theta$ by a sentence $\varphi$ is denoted by $\Theta * \varphi$.

Of these operations, revision is the most complex one. Indeed the three belief change operations can be reduced to two of them, since revision and contraction may be defined in terms of each other. In particular, revision here is defined as a composition of contraction and expansion: First contract those beliefs of $\Theta$ that are
in conflict with $\varphi$, and then expand the modified theory with sentence $\varphi$ (known as Levi’s identity). While expansion can be uniquely and easily defined ($\Theta + \varphi = \{\alpha \mid \Theta \cup \{\varphi\} \vdash \alpha\}$), this is not so with contraction or revision, as several formulas can be retracted to achieve the desired effect. Therefore, additional criteria must be incorporated in order to fix which formula to retract. Here, the general intuition is that changes on the theory should be kept minimal, in some sense of informational economy (one way of dealing with this issue is based on the notion of entrenchment, a preferential ordering which lines up the formulas in a belief state according to their importance [10.49]. Thus, those formulas that are the least entrenched should be retracted first).

Moreover, epistemic theories in this tradition observe certain integrity constraints (such as those previously shown for ALP), which concern the theory’s preservation of consistency, its deductive closure and two criteria for the retraction of beliefs: The loss of information should be kept minimal and the less entrenched beliefs should be removed first. These are the very basics of the AGM approach (this acronym stands for the initial letters of its three proponents: Alchourrón, Gärdenfors, and Makinson, authors of the seminal paper which started this tradition [10.50]).

Abduction may be seen as an epistemic process for belief revision. In this context, an incoming sentence $\varphi$ is not necessarily an observation, but rather a belief for which an explanation is sought. The previously defined abductive novelty and abductive anomaly correspond, respectively, to the epistemic attitudes of undetermination and rejection (provided that $\Rightarrow$ is $\vdash$ and $\Theta$ is closed under logical consequence). Both a novel phenomenon and an anomalous one induce a change in the original theory. The latter calls for a revision and the former for expansion. So, the basic operations for abduction are expansion and revision. Therefore, two epistemic attitudes and changes in them are reflected in an abductive model.

Here, then, are two abductive operations for epistemic change (as proposed in [10.8]):

• Abductive expansion. Given an abductive novelty $\varphi$, a consistent explanation $\alpha$ for $\varphi$ is computed in such a way that $\Theta, \alpha \Rightarrow \varphi$, and then added to $\Theta$.
• Abductive revision. Given an abductive anomaly $\varphi$, a consistent explanation $\alpha$ is computed as follows: The theory $\Theta$ is revised into $\Theta'$ so that it does not explain $\neg\varphi$. That is, $\Theta' \not\Rightarrow \neg\varphi$, where $\Theta' = \Theta - (\beta_1, \ldots, \beta_l)$ (in many cases, several formulas and not just one must be removed from the theory. The reason is that sets of formulas which entail (explain) $\neg\varphi$ should be removed. Example: Given $\Theta = \{\alpha \to \beta, \alpha, \beta\}$ and $\varphi = \neg\beta$, in order to make $\Theta, \neg\beta$ consistent, one needs to remove either $\{\beta, \alpha\}$ or $\{\beta, \alpha \to \beta\}$). Once $\Theta'$ is obtained, a consistent explanation $\alpha$ is calculated in such a way that $\Theta', \alpha \Rightarrow \varphi$ and then added to $\Theta'$. Thus, the process of revision involves both contraction and expansion.

Some of the previous examples of abduction given in Sect. 10.1.4 may be described as expansions (1–3), where the background theory gets expanded to account for a new fact. Another one of them (4) is clearly a case calling for theory revision, in which the theory needs to be revised in order to account for an anomaly, such as those found in practical settings like diagnostic reasoning [10.15, 39]. Belief revision theories provide an explicit calculus of modification for both cases and, applied to abduction, operations for abductive expansion and abductive revision are defined as well.

Note, however, that in this approach changes occur only in the theory, as the situation or world to be modeled is supposed to be static; only new information is coming in. Another important type of epistemic change studied in AI is that of update, the process of keeping beliefs up-to-date as the world changes. A recent proposal in this direction in connection to abduction is to be found later in this part of the handbook, in Chap. 13.
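The contraction step in the abductive revision example can be made concrete with a small Python sketch. It is a simplification under stated assumptions, not the AGM construction itself: it works on a finite belief base rather than a deductively closed theory, ignores entrenchment, and uses brute-force truth tables. The atoms a and b stand for the example's $\alpha$ and $\beta$, and the helper names (models, entails, remainders, revise) are hypothetical. The function remainders lists the maximal subsets of the base that no longer entail a given formula, and revise applies Levi's identity by adding the incoming belief to each of them.

```python
from itertools import combinations, product

# Atoms a and b stand for the chapter's alpha and beta (naming is an assumption).
ATOMS = ["a", "b"]


def models(beliefs):
    """Yield every valuation satisfying all (name, formula) pairs."""
    for bits in product([False, True], repeat=len(ATOMS)):
        v = dict(zip(ATOMS, bits))
        if all(f(v) for _, f in beliefs):
            yield v


def entails(beliefs, goal):
    """Classical consequence checked over all models of the beliefs."""
    return all(goal(v) for v in models(beliefs))


def consistent(beliefs):
    """True if the beliefs have at least one model."""
    return next(models(beliefs), None) is not None


def remainders(base, goal):
    """Maximal subsets of the base that do not entail goal."""
    found = []
    # Larger subsets first, so any smaller candidate already covered is skipped.
    for size in range(len(base), -1, -1):
        for subset in combinations(base, size):
            if entails(subset, goal):
                continue
            names = {n for n, _ in subset}
            if not any(names < {n for n, _ in kept} for kept in found):
                found.append(list(subset))
    return found


def revise(base, new_belief, negation):
    """Levi's identity: contract by the negation, then expand by the new belief."""
    return [kept + [new_belief] for kept in remainders(base, negation)]


# Theta = {a -> b, a, b}; the incoming anomaly is phi = not b.
theory = [
    ("a -> b", lambda v: (not v["a"]) or v["b"]),
    ("a", lambda v: v["a"]),
    ("b", lambda v: v["b"]),
]
b = lambda v: v["b"]
not_b = ("not b", lambda v: not v["b"])

for kept in remainders(theory, b):
    print("keep:", [n for n, _ in kept])
for revised in revise(theory, not_b, b):
    print("revised:", [n for n, _ in revised], consistent(revised))
```

The two remainders keep only a -> b or only a, which corresponds to removing $\{\beta, \alpha\}$ or $\{\beta, \alpha \to \beta\}$ as stated in the example. Choosing between them is exactly where a preference criterion such as entrenchment would be needed.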
10.4 Conclusions
This chapter covered, on the one hand, a brief historical overview of the logic of abduction, from its origins in Charles Peirce’s view to its privileged place in the cognitive sciences. In the context of philosophy of science, logical abduction is relevant in regard to issues of scientific explanation. In a broader context, one including computer science, abduction is immersed in the heuristics of inferential mechanisms. On the other hand, this chapter offered an in-depth overview of logical abduction. A distinction between approaches in which abduction takes the form of an argument and those in which it manifests itself as an inference to the best explanation is a useful one in philosophy, and was described in Sect. 10.2.

Section 10.3 described three classical characterizations of abduction, namely as a logical inference, as a computational process and as a process for epistemic change. These characterizations have been the dominant ones in the logical literature and are still a point of reference for new proposals which go beyond the boundaries of the classical way.

The history of the logic of abduction and of its formal modeling is still in the making. The chapters to follow offer an overview of formal tools to model logical and computational abduction.

Acknowledgments. Research for this article was supported by the research project Logics of Discovery, Heuristics, and Creativity in the Sciences (PAPIIT, IN400514) granted by UNAM.
References
10.1 B. Russell: The Art of Philosophizing and Other Essays (Adams, Totowa, Littlefield 1974)
10.2 P. Thagard: Computational Philosophy of Science (MIT, Cambridge 1988)
10.3 J.S. Mill: A system of logic. In: The Collected Works of John Stuart Mill, ed. by J.M. Robson (Routledge and Kegan Paul, London 1958), New York, Harper and brothers
10.4 G. Harman: The inference to the best explanation, Philos. Rev. 74(1), 88–95 (1965)
10.5 E. Shapiro: Inductive inference of theories from facts. In: Computational Logic: Essays in Honor of Alan Robinson, ed. by J.L. Lassez, G. Plotkin (MIT, Cambridge 1991)
10.6 P. Flach, A. Kakas (Eds.): Abduction and Induction. Essays on Their Relation and Integration (Kluwer Academic, Dordrecht 2000)
10.7 W. Salmon: Scientific explanation. In: Introduction to the Philosophy of Science, Vol. 1–6, ed. by W. Salmon, J. Earman, C. Glymour, J. Lennox, K. Schaffner, W.C. Salmon, J.D. Norton, J.E. McGuire, P. Machamer, J.G. Lennox (Prentice Hall, New York 1992)
10.8 A. Aliseda: Abductive Reasoning. Logical Investigation into Discovery and Explanation, Vol. 330 (Springer, Dordrecht 2006)
10.9 G. Schurz: Patterns of abduction, Synthese 164, 201–234 (2008)
10.10 D. Anderson: The Evolution of Peirce’s Concept of Abduction, Trans. Charles S. Peirce Society, Vol. 22 (Indiana Univ. Press, Bloomington 1986) pp. 145–164
10.11 J. Hintikka: What is abduction? The fundamental problem of contemporary epistemology, Trans. Charles S. Peirce Soc. 34(3), 503–533 (1998)
10.12 A. Aliseda: Abduction as epistemic change: A Peircean model in artificial intelligence. In: Abduction and Induction, ed. by P. Flach, A. Kakas (Kluwer Academic, Dordrecht 2000) pp. 45–58
10.13 C.S. Peirce: 1867–1913. Collected Papers of Charles Sanders Peirce, Vols. 1–6, ed. by C. Hartshorne, P. Weiss (Harvard Univ. Press, Cambridge 1934)
10.14 H.E. Pople: On the Mechanization of Abductive Logic (Morgan Kaufmann, San Francisco 1973) pp. 147–152
10.15 Y. Peng, J.A. Reggia: Abductive Inference Models for Diagnostic Problem-Solving, Symbolic Computation: Artificial Intelligence (Springer, New York 1990)
10.16 G. Paul: Approaches to abductive reasoning: An overview, Artif. Intell. Rev. 7(2), 109–152 (1993)
10.17 K. Konolige: Abductive theories in artificial intelligence. In: Principles of Knowledge Representation, ed. by G. Brewka (CSLI Publications, Stanford 1996)
10.18 P. Paul: AI approaches to abduction. In: Abductive Reasoning and Uncertainty Management Systems, Handbook of Defeasible Reasoning and Uncertainty Management Systems, Vol. 4, ed. by D. Gabbay, R. Kruse (Kluwer Academic, Dordrecht 2000) pp. 35–98
10.19 L. Magnani: Abduction, Reason and Science: Processes of Discovery and Explanation (Kluwer/Plenum, New York 2001)
10.20 H. Simon: Models of Discovery (Reidel, Holland 1977)
10.21 N. Chomsky: Language and Mind. Enlarged edition (Harcourt Brace Jovanovich, New York 1972)
10.22 C. Hempel: Aspects of scientific explanation. In: Aspects of Scientific Explanation and Other Essays in the Philosophy of Science, ed. by C. Hempel (The Free Press, New York 1965)
10.23 I. Niiniluoto: Statistical explanation reconsidered, Synthese 48(3), 437–472 (1981)
10.24 I. Niiniluoto: Hempel’s theory of statistical explanation. In: Science, Explanation, and Rationality: The Philosophy of Carl G. Hempel, ed. by J.H. Fetzer (Oxford Univ. Press, Oxford 2000) pp. 138–163
10.25 T. Kuipers: Abduction aiming at empirical progress or even truth approximation leading to a challenge for computational modelling, Found. Sci. 4(3), 307–323 (1999)
10.26 A.C. Kakas, R.A. Kowalski, F. Toni: Abductive logic programming, J. Logic Comput. 2(6), 719–770 (1992)
10.27 M.C. Mayer, F. Pirri: First order abduction via tableau and sequent calculi, Logic J. IGPL 1(1), 99–117 (1993)
10.28 D. Makinson: General patterns in nonmonotonic reasoning. In: Handbook of Logic in Artificial Intelligence and Logic Programming, Nonmonotonic Reasoning and Uncertain Reasoning, Vol. 3, ed. by C.J. Hogger, D.M. Gabbay, J.A. Robinson (Oxford Science Publications, Clarendon, Oxford 1994) pp. 35–110
10.29 C. Boutilier, V. Becher: Abduction as belief revision, Artif. Intell. 77(1), 43–94 (1995)
10.30 J. Meheus, L. Verhoeven, M. Van Dyck, D. Provijn: Ampliative adaptive logics and the foundation of logic-based approaches to abduction. In: Logical and Computational Aspects of Model-Based Reasoning, ed. by L. Magnani, N.J. Nersessian, C. Pizzi (Kluwer Academic, Dordrecht 2002) pp. 39–71
10.31 M. Beirlaen, A. Aliseda: A conditional logic for abduction, Synthese 191(15), 3733–3758 (2014)
10.32 J.R. Josephson: Smart inductive generalizations are abductions. In: Abduction and Induction, ed. by P. Flach, A. Kakas (Kluwer Academic, Dordrecht 2000) pp. 31–44
10.33 I. Douven: Abduction. In: The Stanford Encyclopedia of Philosophy, Spring 2011 edn., ed. by E. Zalta (2011), http://plato.stanford.edu/archives/spr2011/entries/abduction/, Date of last access: June 8th 2016
10.34 Y. Shoham: Reasoning About Change. Time and Causation from the Standpoint of Artificial Intelligence (MIT, Cambridge 1988)
10.35 D. Dubois, H. Prade: Possibilistic logic, preferential models, non-monotonicity and related issues, Proc. 12th Int. Joint Conf. on Artificial Intelligence (Morgan Kaufmann, Burlington 1991) pp. 419–424
10.36 L. Magnani (Ed.): Special issue: Abduction, practical reasoning, and creative inferences in science, Logic J. IGPL 14(2) (2006)
10.37 L. Magnani, W. Carnielli, C. Pizzi (Eds.): Special issue: Formal representations in model-based reasoning and abduction, Logic J. IGPL 20(2) (2012)
10.38 L. Magnani (Ed.): Special issue: Formal representations in model-based reasoning and abduction, Logic J. IGPL 21(6) (2013)
10.39 A. Aliseda, L. Leonides: Hypotheses testing in adaptive logics: An application to medical diagnosis, Logic J. IGPL 21(6), 915–930 (2013)
10.40 J. Woods: Ignorance and semantic tableaux: Aliseda on abduction, Theoria 22(3), 305–318 (2007)
10.41 R.A. Kowalski: Logic for Problem Solving (Elsevier, New York 1979)
10.42 J.W. Lloyd: Foundations of Logic Programming, 2nd edn. (Springer, Berlin, Heidelberg 1987)
10.43 A.C. Kakas, R.A. Kowalski, F. Toni: The role of abduction in logic programming. In: Handbook of Logic in Artificial Intelligence and Logic Programming, Vol. 5, ed. by C.J. Hogger, D.M. Gabbay, J.A. Robinson (Clarendon, Oxford 1998) pp. 235–324
10.44 M. Denecker, A.C. Kakas: Abduction in logic programming. In: Computational Logic: Logic Programming and Beyond, ed. by A.C. Kakas, F. Sadri (Springer, Berlin, Heidelberg 2002) pp. 402–436
10.45 R. Kowalski, D. Kuehner: Linear resolution with selection function, Artif. Intell. 2(3–4), 227–260 (1971)
10.46 R. Hähnle, M. D’Agostino, D.M. Gabbay, J. Posegga (Eds.): Handbook of Tableau Methods (Kluwer Academic, Dordrecht 1999)
10.47 P. Gärdenfors: Belief revision: An introduction. In: Belief Revision, Cambridge Tracts in Theoretical Computer Science, ed. by P. Gärdenfors (Cambridge Univ. Press, Cambridge 1992) pp. 1–28
10.48 P. Gärdenfors, H. Rott: Belief revision. In: Handbook of Logic in Artificial Intelligence and Logic Programming, Oxford Science Publications, Vol. 4, ed. by C.J. Hogger, D.M. Gabbay, J.A. Robinson (Clarendon, Oxford 1995) pp. 35–132
10.49 P. Gärdenfors: Knowledge in Flux: Modeling the Dynamics of Epistemic States (MIT, Cambridge 1988)
10.50 C.E. Alchourrón, P. Gärdenfors, D. Makinson: On the logic of theory change: Partial meet contraction and revision functions, J. Symb. Logic 50(2), 510–530 (1985)
11. Qualitative Inductive Generalization and Confirmation
Mathieu Beirlaen

Inductive generalization is a defeasible type of inference which we use to reason from the particular to the universal. First, a number of systems are presented that provide different ways of implementing this inference pattern within first-order logic. These systems are defined within the adaptive logics framework for modeling defeasible reasoning. Next, the logics are re-interpreted as criteria of confirmation. It is argued that they withstand the comparison with two qualitative theories of confirmation, Hempel’s satisfaction criterion and hypothetico-deductive confirmation.

11.1 Adaptive Logics for Inductive Generalization
11.2 A First Logic for Inductive Generalization
11.2.1 General Characterization of the Standard Format
11.2.2 Proof Theory
11.2.3 Minimal Abnormality
11.3 More Adaptive Logics for Inductive Generalization
11.4 Qualitative Inductive Generalization and Confirmation
11.4.1 I-Confirmation and Hempel’s Adequacy Conditions
11.4.2 I-Confirmation and the Hypothetico-Deductive Model
11.4.3 Interdependent Abnormalities and Heuristic Guidance
11.5 Conclusions
11.A Appendix: Blocking the Raven Paradox?
References

Logics of induction are tools for evaluating the strength of arguments which are not deductively valid. There are many kinds of argument the conclusion of which is not guaranteed to follow from its premises, and there are many ways to evaluate the strength of such arguments. This chapter focusses on one particular kind of non-deductive argument, and on one particular method of implementation. The type of argument under consideration here is that of inductive generalization, as when we reason from the particular to the universal. A number of logics are discussed which permit us, given a set of objects sharing or not sharing a number of properties, to infer generalizations of the form All x are P, or All x with property P share property Q. Inductive generalization is a common practice which has proven its use in scientific endeavor. For instance, given the fact that the relatively few electrons measured so far carry a charge of $1.6 \times 10^{-19}$ Coulombs, we believe that all electrons have this charge [11.1].
of the premises. This is the idea behind the logic LI. A more economical approach is to permit the introduction of a generalization on the condition that at least one instance of it is present. This is the rationale behind a second logic, IL. In an IL-proof a generalization All P are Q can be introduced only if the premise set contains at least one object which is either not-P or Q. More economical still is the rationale behind a third logic, G, which aims to capture the requirement of knowing at least one positive instance of a generalization before introducing it in a proof. That is, in a G-proof a generalization All P are Q can be introduced if the premise set contains at least one object which is both P and Q.

The second dimension along which different consequence relations are generated concerns the specific mechanism used for retracting generalizations introduced in adaptive proofs. It is often not sufficient to demand retraction just in case a generalization is falsified by the premises. For instance, if the consequence sets of our logics are to be closed under classical logic, jointly incompatible generalizations should not be derivable, even though none of them is falsified by our premise set. Within the adaptive logics framework, various strategies are available for retracting conditional moves in an adaptive proof. Two such strategies are presented in this chapter: the reliability strategy and the minimal abnormality strategy.

Combining both dimensions, a family of six adaptive logics for inductive generalization is obtained (it contains the systems LI, IL, and G, each of which can be defined using either the reliability or the minimal abnormality strategy). These logics have all been presented elsewhere (for LI, see [11.2–4]; for IL and G, see [11.5]). The original contribution of this chapter consists in a study comparing these systems to some existing qualitative criteria of confirmation. There is an overlap between the fields of inductive logic and confirmation theory. In 1943 already, Hempel noted that the development of a logical theory of confirmation might be regarded as a contribution to the field of inductive logic [11.6, p. 123]. In Sect. 11.4 the logics from Sect. 11.2 and 11.3 are re-interpreted as qualitative criteria of confirmation, and are related to other qualitative models of confirmation: Hempel's satisfaction criterion (Sect. 11.4.1) and the hypothetico-deductive model (Sect. 11.4.2). Section 11.4 ends with some remarks on the heuristic guidance that adaptive logics for inductive generalization can provide in the derivation and subsequent confirmation of additional generalizations (Sect. 11.4.3).

The following notational conventions are used throughout the chapter. The formal language used is that of first-order logic without identity. A primitive functional formula of rank 1 is an open formula that does not contain any logical symbols ($\exists, \forall, \neg, \vee, \wedge, \supset, \equiv$), sentential letters, or individual constants, and that contains only predicate letters of rank 1. The set of functional atoms of rank 1, denoted $\mathcal{A}^{f1}$, comprises the primitive functional formulas of rank 1 and their negations. A generalization is the universal closure of a disjunction of members of $\mathcal{A}^{f1}$. That is, the set of generalizations in this technical sense is the set $\{\forall(A_1 \vee \ldots \vee A_n) \mid A_1, \ldots, A_n \in \mathcal{A}^{f1};\ n \geq 1\}$, where $\forall$ denotes the universal closure of the subsequent formula. Occasionally the term generalization is also used for formulas equivalent to a member of this set, e.g., $\forall x(Px \supset Qx)$. It is easily checked that generalizations $\forall(A_1 \vee \ldots \vee A_n)$ can be rewritten as formulas of the general form $\forall((B_1 \wedge \ldots \wedge B_j) \supset (C_1 \vee \ldots \vee C_k))$, and vice versa, where all $B_i$ and $C_j$ belong to $\mathcal{A}^{f1}$.
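The technical notion of a generalization can be made concrete with a small sketch. The encoding below is an illustrative assumption, not notation from the chapter: a functional atom is a pair of a predicate letter and a sign, and a generalization is the set of its disjuncts. The helper names (`functional_atoms`, `generalizations`, `as_implication`) are hypothetical, and disjunctions containing both an atom and its negation are skipped because they are CL-theorems.

```python
from itertools import combinations

# Assumed encoding: a functional atom of rank 1 is (predicate_letter, is_positive).

def functional_atoms(predicates):
    """Primitive functional formulas of rank 1 together with their negations."""
    return [(p, sign) for p in predicates for sign in (True, False)]

def generalizations(predicates, max_len=None):
    """Enumerate non-tautological generalizations as frozensets of disjuncts."""
    atoms = functional_atoms(predicates)
    max_len = max_len or len(predicates)
    result = []
    for n in range(1, max_len + 1):
        for combo in combinations(atoms, n):
            letters = [p for p, _ in combo]
            if len(set(letters)) == len(letters):  # no pair A, not-A
                result.append(frozenset(combo))
    return result

def show_atom(atom):
    pred, positive = atom
    return f"{pred}x" if positive else f"¬{pred}x"

def as_disjunction(gen):
    """Render a generalization as the universal closure of a disjunction."""
    return "∀x(" + " ∨ ".join(show_atom(a) for a in sorted(gen)) + ")"

def as_implication(gen, antecedent_preds):
    """Rewrite the disjunction as (B1 ∧ ...) ⊃ (C1 ∨ ...).

    Disjuncts over the chosen predicate letters move to the antecedent
    with their sign flipped; the remaining disjuncts form the consequent.
    """
    moved = sorted((p, not s) for p, s in gen if p in antecedent_preds)
    kept = sorted(a for a in gen if a[0] not in antecedent_preds)
    if not moved or not kept:
        raise ValueError("need at least one atom on each side")
    left = " ∧ ".join(show_atom(a) for a in moved)
    right = " ∨ ".join(show_atom(a) for a in kept)
    return f"∀x(({left}) ⊃ ({right}))"

all_p_are_q = frozenset({("P", False), ("Q", True)})
print(as_disjunction(all_p_are_q))         # ∀x(¬Px ∨ Qx)
print(as_implication(all_p_are_q, {"P"}))  # ∀x((Px) ⊃ (Qx))
```

Over two predicate letters the enumeration yields eight non-tautological generalizations, which include the six generalizations over P and Q used as running examples in the proofs of this chapter.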
that the first-order fragment of Classical Logic (CL) meets this requirement, as we work almost exclusively with CL as a LLL. The lower limit logic of LI is CL.

Typically, an AL enables one to derive, for most premise sets, some extra consequences on top of those that are LLL-derivable. These supplementary consequences are obtained by interpreting a premise set as normally as possible, or, equivalently, by supposing abnormalities to be false unless and until proven otherwise. What it means to interpret a premise set as normally as possible is disambiguated by the strategy, element (iii).

The normality assumption made by the logics to be defined in this chapter amounts to supposing that the world is in some sense uniform. Normal situations are those in which it is safe to derive generalizations. Abnormal situations are those in which generalizations are falsified. In fact, the set of LI-abnormalities, denoted $\Omega_{\mathrm{LI}}$, is just the set of falsified generalizations (the definitions are those from [11.5]; in [11.10, Sect. 4.2.2] it is shown that the same logic is obtained if $\Omega_{\mathrm{LI}}$ is defined as the set of formulas of the form $\neg\forall x A(x)$, where $A$ contains no quantifiers, free variables, or constants)

\[
\Omega_{\mathrm{LI}} =_{\mathrm{df}} \{\neg\forall(A_1 \vee \ldots \vee A_n) \mid A_1, \ldots, A_n \in \mathcal{A}^{f1};\ n \geq 1\} . \tag{11.1}
\]

In adaptive proofs, it is possible to make conditional inferences assuming that one or more abnormalities are false. Whether or not such assumptions can be upheld in the continuation of the proof is determined by the adaptive strategy. The SF incorporates two adaptive strategies, the reliability strategy and the minimal abnormality strategy. In the generic proof theory of the SF, adaptive strategies come with a marking definition, which takes care of the withdrawal of certain conditional inferences in dynamic proofs. It will be easier to explain the intuitions behind these strategies after defining the generic proof theory for ALs. For now, just note that in the remainder LI is ambiguous between LIr and LIm, where the subscripts r and m denote the reliability strategy, respectively the minimal abnormality strategy. Analogously for the other logics defined below.

11.2.2 Proof Theory

Adaptive proofs are dynamic in the sense that lines derived at a certain stage of a proof may be withdrawn at a later stage. Moreover, lines withdrawn at a certain stage can become derivable again at an even later stage, and so on. (A stage of a proof is a sequence of lines and a proof is a sequence of stages. Every proof starts off with stage 1. Adding a line to a proof by applying one of the rules of inference brings the proof to its next stage, which is the sequence of all lines written so far.)

A line in an adaptive proof consists of four elements: a line number, a formula, a justification and a condition. For instance, a line

\[
j \qquad A \qquad i_1, \ldots, i_n;\ R \qquad \Delta
\]

reads: at line $j$, the formula $A$ is derived from lines $i_1$–$i_n$ by rule $R$ on the condition $\Delta$. The fourth element, the condition, is what permits the dynamics. Intuitively, the condition of a line in a proof corresponds to an assumption made at that line. In the example above, $A$ was derived on the assumption that the formulas in $\Delta$ are false. If, later on in the proof, it turns out that this assumption was too bold, the line in question is withdrawn from the proof by a marking mechanism corresponding to an adaptive strategy. Importantly, only members of the set of abnormalities are allowed as elements of the condition of a line in an adaptive proof. Thus, assumptions always correspond to the falsity of one or more abnormalities, or, equivalently, to the truth of one or more generalizations.

Before explaining how the marking mechanism works, the generic inference rules of the SF must be introduced. There are three of them: a premise introduction rule (Prem), an unconditional rule (RU), and a conditional rule (RC). For adaptive logics with CL as their LLL, they are defined as follows

\[
\text{Prem} \quad \text{If } A \in \Gamma: \qquad
\begin{array}{ll}
\vdots & \vdots \\ \hline
A & \emptyset
\end{array}
\]

\[
\text{RU} \quad \text{If } A_1, \ldots, A_n \vdash_{\mathrm{CL}} B: \qquad
\begin{array}{ll}
A_1 & \Delta_1 \\
\vdots & \vdots \\
A_n & \Delta_n \\ \hline
B & \Delta_1 \cup \ldots \cup \Delta_n
\end{array}
\]

\[
\text{RC} \quad \text{If } A_1, \ldots, A_n \vdash_{\mathrm{CL}} B \vee \mathit{Dab}(\Theta): \qquad
\begin{array}{ll}
A_1 & \Delta_1 \\
\vdots & \vdots \\
A_n & \Delta_n \\ \hline
B & \Delta_1 \cup \ldots \cup \Delta_n \cup \Theta
\end{array}
\]

Where $\Gamma$ is the premise set, Prem permits the introduction of premises on the empty condition at any time in the proof. Remember that conditions, at the intuitive level, correspond to assumptions, so Prem stipulates that premises can be introduced at any time without making any further assumptions.
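The bookkeeping behind the four-element proof lines and the three generic rules can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not an implementation from the chapter: formulas are plain strings, the class and method names (`Line`, `Proof`, `prem`, `ru`, `rc`) are hypothetical, and the CL-derivability side condition of RU and RC is not checked, so the caller is trusted to apply the rules only where they are sound. What the sketch does show is how conditions are carried over and extended.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Line:
    number: int
    formula: str
    justification: str
    condition: frozenset  # set of abnormalities assumed to be false

class Proof:
    def __init__(self):
        self.lines = []

    def _add(self, formula, justification, condition):
        line = Line(len(self.lines) + 1, formula, justification, frozenset(condition))
        self.lines.append(line)  # adding a line moves the proof to its next stage
        return line

    def _carried(self, from_lines):
        """Union of the conditions of the lines an inference relies on."""
        carried = frozenset()
        for i in from_lines:
            carried |= self.lines[i - 1].condition
        return carried

    def prem(self, formula):
        # Prem: premises enter on the empty condition.
        return self._add(formula, "Prem", frozenset())

    def ru(self, formula, from_lines):
        # RU: conditions are carried over unchanged (CL-derivability NOT checked here).
        just = ", ".join(map(str, from_lines)) + "; RU"
        return self._add(formula, just, self._carried(from_lines))

    def rc(self, formula, from_lines, abnormalities):
        # RC: carried conditions plus the abnormalities assumed to be false.
        refs = ", ".join(map(str, from_lines))
        just = f"{refs}; RC" if refs else "RC"
        return self._add(formula, just, self._carried(from_lines) | frozenset(abnormalities))

proof = Proof()
proof.prem("Pa ∧ Qa")
proof.prem("Pb")
proof.prem("¬Qc")
proof.ru("Pa", [1])
proof.ru("Qa", [1])
line6 = proof.rc("∀x(Px ∨ Qx)", [], {"¬∀x(Px ∨ Qx)"})
# Hypothetical extra step, not taken from the chapter: an RU inference
# from a conditional line inherits that line's condition.
line7 = proof.ru("Pc ∨ Qc", [6])
```

Keeping conditions as immutable sets makes the inheritance of assumptions explicit: any line that depends on a conditional line depends on the same normality assumptions, which is what later allows a marking definition to withdraw whole chains of inferences at once.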
Since ALs strengthen their LLL, one or more rules are needed to incorporate LLL-inferences in AL-proofs. In the proof theory of the SF, this is taken care of by the generic rule RU. This rule stipulates that whenever $B$ is a CL-consequence of $A_1, \ldots, A_n$, and all of $A_1, \ldots, A_n$ have been derived in a proof, then $B$ is derivable, provided that the conditions attached to the lines at which $A_1, \ldots, A_n$ were derived are carried over. Intuitively, if $A_1, \ldots, A_n$ are derivable assuming that the members of $\Delta_1, \ldots, \Delta_n$ are false, and if $B$ is a CL-consequence of $A_1, \ldots, A_n$, then $B$ is derivable, still assuming that all members of $\Delta_1, \ldots, \Delta_n$ are false.

Before turning to RC, here is an example illustrating the use of the rules Prem and RU. Let $\Gamma_1 = \{Pa \wedge Qa, Pb, \neg Qc\}$. Suppose we start an LI-proof for $\Gamma_1$ as follows

\[
\begin{array}{rlll}
1 & Pa \wedge Qa & \text{Prem} & \emptyset \\
2 & Pb & \text{Prem} & \emptyset \\
3 & \neg Qc & \text{Prem} & \emptyset \\
4 & Pa & 1;\ \text{RU} & \emptyset \\
5 & Qa & 1;\ \text{RU} & \emptyset
\end{array}
\]

Let $\Theta$ be a finite set of LI-abnormalities, that is, $\Theta \subseteq \Omega_{\mathrm{LI}}$. Then $\mathit{Dab}(\Theta)$ refers to the classical disjunction of the members of $\Theta$ (Dab abbreviates disjunction of abnormalities; in the remainder, such disjunctions are sometimes referred to as Dab-formulas). RC stipulates that, whenever $B$ is CL-derivable from $A_1, \ldots, A_n$ in disjunction with one or more abnormalities, then $B$ can be inferred assuming that these abnormalities are false, i.e., we can derive $B$ and add the abnormalities in question to the condition set, together with assumptions made at the lines at which $A_1, \ldots, A_n$ were derived. For instance, (11.2) is CL-valid

\[
\forall x(Px \vee Qx) \vee \neg\forall x(Px \vee Qx) \tag{11.2}
\]

Note that the second disjunct of (11.2) is a member of $\Omega_{\mathrm{LI}}$. In the context of inductive generalization the assumption that the world is as normal as possible corresponds to an assumption about the uniformity of the world. In adaptive proofs, such assumptions are made explicit by applications of the conditional rule. Concretely, if a formula like (11.2) is derived in an LI-proof, RC can be used to derive the first disjunct on the condition that the second disjunct is false. In fact, since (11.2) is a CL-theorem, the generalization $\forall x(Px \vee Qx)$ can be introduced right away, taking its negation to be false (lines 1–5 are not repeated)

\[
\begin{array}{rlll}
6 & \forall x(Px \vee Qx) & \text{RC} & \{\neg\forall x(Px \vee Qx)\}
\end{array}
\]

In a similar fashion, RC can be used to derive other generalizations

\[
\begin{array}{rlll}
7 & \forall x Px & \text{RC} & \{\neg\forall x Px\} \\
8 & \forall x Qx & \text{RC} & \{\neg\forall x Qx\} \\
9 & \forall x(\neg Px \vee Qx) & \text{RC} & \{\neg\forall x(\neg Px \vee Qx)\} \\
10 & \forall x(Px \vee \neg Qx) & \text{RC} & \{\neg\forall x(Px \vee \neg Qx)\} \\
11 & \forall x(\neg Px \vee \neg Qx) & \text{RC} & \{\neg\forall x(\neg Px \vee \neg Qx)\}
\end{array}
\]

Each generalization is derivable assuming that its corresponding condition is false. However, some of these assumptions clearly cannot be upheld. We know, for instance, that the generalizations derived at lines 8 and 11 are falsified by the premises at lines 3 and 1 respectively. So we need a way of distinguishing between good and bad inferred generalizations. This is where the adaptive strategy comes in. Since distinguishing good from bad generalizations can be done in different ways, there are different strategies available to us for making the distinction hard. First, the reliability strategy and its corresponding marking definition are introduced. The latter definition takes care of the retraction of bad generalizations.

Marking definitions proceed in terms of the minimal inferred Dab-formulas derived at a stage of a proof. A Dab-formula that is derived at a proof stage by RU at a line with condition $\emptyset$ is called an inferred Dab-formula of the proof stage.

Definition 11.1 Minimal inferred Dab-formula
$\mathit{Dab}(\Delta)$ is a minimal inferred Dab-formula at stage $s$ of a proof iff $\mathit{Dab}(\Delta)$ is an inferred Dab-formula at stage $s$ and there is no $\Delta' \subset \Delta$ such that $\mathit{Dab}(\Delta')$ is an inferred Dab-formula at stage $s$.

Where $\mathit{Dab}(\Delta_1), \ldots, \mathit{Dab}(\Delta_n)$ are the minimal inferred Dab-formulas derived at stage $s$, $U_s(\Gamma) = \Delta_1 \cup \ldots \cup \Delta_n$ is the set of formulas that are unreliable at stage $s$.

Definition 11.2 Marking for reliability
Where $\Delta$ is the condition of line $i$, line $i$ is marked at stage $s$ iff $\Delta \cap U_s(\Gamma) \neq \emptyset$.

To illustrate the marking mechanism, consider the following extension of the LIr-proof for $\Gamma_1$ (marked lines are indicated by a $\checkmark$-sign; lines 1–5 are not repeated in the proof)

\[
\begin{array}{rlll}
6 & \forall x(Px \vee Qx) & \text{RC} & \{\neg\forall x(Px \vee Qx)\}\ \checkmark \\
7 & \forall x Px & \text{RC} & \{\neg\forall x Px\}\ \checkmark \\
8 & \forall x Qx & \text{RC} & \{\neg\forall x Qx\}\ \checkmark
\end{array}
\]
\[
\begin{array}{rlll}
9 & \forall x(\neg Px \vee Qx) & \text{RC} & \{\neg\forall x(\neg Px \vee Qx)\}\ \checkmark \\
10 & \forall x(Px \vee \neg Qx) & \text{RC} & \{\neg\forall x(Px \vee \neg Qx)\} \\
11 & \forall x(\neg Px \vee \neg Qx) & \text{RC} & \{\neg\forall x(\neg Px \vee \neg Qx)\}\ \checkmark \\
12 & \neg\forall x Qx & 3;\ \text{RU} & \emptyset \\
13 & \neg\forall x(\neg Px \vee \neg Qx) & 1;\ \text{RU} & \emptyset \\
14 & \neg\forall x Px \vee \neg\forall x(\neg Px \vee Qx) & 3;\ \text{RU} & \emptyset \\
15 & \neg\forall x(Px \vee Qx) \vee \neg\forall x(\neg Px \vee Qx) & 3;\ \text{RU} & \emptyset
\end{array}
\]

As remarked above, the generalizations derived at lines 8 and 11 are falsified by the premises, so it makes good sense to mark them and thereby consider them not derived anymore. As soon as we derive the negations of these generalizations (lines 12 and 13) Definition 11.2 takes care that lines 8 and 11 are marked. The generalizations derived at lines 6, 7, and 9 are not falsified by the data, yet they are marked according to Definition 11.2, due to the derivability of the minimal inferred Dab-disjunctions at lines 14 and 15. We know, for instance, that the generalizations derived at lines 7 and 9 cannot be upheld together: at line 14 we inferred that they are jointly incompatible in view of the premises. Definition 11.2 takes care that both lines 7 and 9 are marked at stage 15, since

\[
U_{15}(\Gamma_1) = \{\neg\forall x Px,\ \neg\forall x Qx,\ \neg\forall x(Px \vee Qx),\ \neg\forall x(\neg Px \vee Qx),\ \neg\forall x(\neg Px \vee \neg Qx)\} . \tag{11.3}
\]

The only inferred generalization left unmarked at stage 15 is $\forall x(Px \vee \neg Qx)$, derived at line 10.

Due to the dynamics of adaptive proofs, we cannot just take a formula to be an AL-consequence of some premise set $\Gamma$ once we derived it at some stage on an unmarked line in a proof for $\Gamma$, for it may be that there are extensions of the proof in which the line in question gets marked. Likewise, we need to take into account the fact that lines marked at a stage of a proof may become unmarked at a later stage. This is taken care of by using the concept of final derivability:

Definition 11.3 Final derivability
$A$ is finally derived from $\Gamma$ at line $i$ of a finite proof stage $s$ iff (i) $A$ is the second element of line $i$, (ii) line $i$ is not marked at stage $s$, and (iii) every extension of the proof in which line $i$ is marked may be further extended in such a way that line $i$ is unmarked.

Definition 11.4 Logical consequence for LIr
$\Gamma \vdash_{\mathrm{LIr}} A$ ($A$ is finally LIr-derivable from $\Gamma$) iff $A$ is finally derived at a line of an LIr-proof from $\Gamma$.

Given the premise set $\Gamma_1$, there are no extensions of the proof above in which any of the marked lines become unmarked, nor are there extensions in which line 10 is marked and cannot be unmarked again in a further extension of the proof. Hence, by Definitions 11.3 and 11.4

\begin{align}
\Gamma_1 &\nvdash_{\mathrm{LIr}} \forall x Px , \tag{11.4} \\
\Gamma_1 &\nvdash_{\mathrm{LIr}} \forall x Qx , \tag{11.5} \\
\Gamma_1 &\nvdash_{\mathrm{LIr}} \forall x(Px \vee Qx) , \tag{11.6} \\
\Gamma_1 &\vdash_{\mathrm{LIr}} \forall x(Px \vee \neg Qx) , \tag{11.7} \\
\Gamma_1 &\nvdash_{\mathrm{LIr}} \forall x(\neg Px \vee Qx) , \tag{11.8} \\
\Gamma_1 &\nvdash_{\mathrm{LIr}} \forall x(\neg Px \vee \neg Qx) . \tag{11.9}
\end{align}

The logic LIr is non-monotonic: adding new premises may block the derivation of generalizations that were finally derivable from the original premise set. For instance, suppose that we add the premise $\neg Pd \wedge Qd$ to $\Gamma_1$. Since the extra premise provides a counter-instance to the generalization $\forall x(Px \vee \neg Qx)$, the latter should no longer be LIr-derivable from the new premise set. The following proof illustrates that this is indeed the case

\[
\begin{array}{rlll}
1 & Pa \wedge Qa & \text{Prem} & \emptyset \\
2 & Pb & \text{Prem} & \emptyset \\
3 & \neg Qc & \text{Prem} & \emptyset \\
4 & \neg Pd \wedge Qd & \text{Prem} & \emptyset \\
5 & \forall x(Px \vee Qx) & \text{RC} & \{\neg\forall x(Px \vee Qx)\}\ \checkmark \\
6 & \forall x Px & \text{RC} & \{\neg\forall x Px\}\ \checkmark \\
7 & \forall x Qx & \text{RC} & \{\neg\forall x Qx\}\ \checkmark \\
8 & \forall x(\neg Px \vee Qx) & \text{RC} & \{\neg\forall x(\neg Px \vee Qx)\}\ \checkmark \\
\vdots & & & \\
12 & \neg\forall x Qx & 3;\ \text{RU} & \emptyset \\
13 & \neg\forall x(\neg Px \vee \neg Qx) & 1;\ \text{RU} & \emptyset \\
14 & \neg\forall x(Px \vee Qx) \vee \neg\forall x(\neg Px \vee Qx) & 3;\ \text{RU} & \emptyset \\
15 & \neg\forall x(Px \vee \neg Qx) & 4;\ \text{RU} & \emptyset
\end{array}
\]

Line 9 is marked in view of the Dab-formula derived at line 15. There is no way to extend this proof in such a way that the line in question gets unmarked. Hence, $\Gamma_1 \cup \{\neg Pd \wedge Qd\} \nvdash_{\mathrm{LIr}} \forall x(Px \vee \neg Qx)$. In fact, no nontautological generalizations whatsoever are LIr-derivable from the extended premise set $\Gamma_1 \cup \{\neg Pd \wedge Qd\}$.

11.2.3 Minimal Abnormality

Different interpretations of the same set of data may lead to different views concerning which generalizations should or should not be derivable. Each such view may be driven by its own rationale, and choosing one such rationale over the other is not a matter of pure logic. For that reason, different strategies are available to adaptive logicians, each interpreting a set of data in their own sensible way, depending on the context. The reliability strategy was defined already. The minimal abnormality strategy is slightly less skeptical. Consequently, for some premise sets, generalizations may be LIm-derivable, but not LIr-derivable.

Like reliability, the minimal abnormality strategy comes with its marking definition. Let a choice set of $\Sigma = \{\Delta_1, \Delta_2, \ldots\}$ be a set that contains one element out of each member of $\Sigma$. A minimal choice set of $\Sigma$ is a choice set of $\Sigma$ of which no proper subset is a choice set of $\Sigma$. Where $\mathit{Dab}(\Delta_1), \mathit{Dab}(\Delta_2), \ldots$ are the minimal inferred Dab-formulas derived from a premise set $\Gamma$ at stage $s$ of a proof, $\Phi_s(\Gamma)$ is the set of minimal choice sets of $\{\Delta_1, \Delta_2, \ldots\}$.

Definition 11.5 Marking for minimal abnormality
Where $A$ is the formula and $\Delta$ the condition of line $i$, line $i$ is marked at stage $s$ iff (i) there is no $\varphi \in \Phi_s(\Gamma)$ such that $\varphi \cap \Delta = \emptyset$, or (ii) for some $\varphi \in \Phi_s(\Gamma)$, there is no line at which $A$ is derived on a condition $\Theta$ for which $\varphi \cap \Theta = \emptyset$.

\[
\begin{array}{rlll}
2 & \neg Rb \wedge (\neg Pb \vee \neg Qb) & \text{Prem} & \emptyset \\
3 & \neg Pc \wedge \neg Qc \wedge Rc & \text{Prem} & \emptyset \\
4 & \forall x(Px \vee Qx) & \text{RC} & \{\neg\forall x(Px \vee Qx)\}\ \checkmark \\
5 & \forall x(Px \vee Rx) & \text{RC} & \{\neg\forall x(Px \vee Rx)\}\ \checkmark \\
6 & \forall x(\neg Px \vee Rx) & \text{RC} & \{\neg\forall x(\neg Px \vee Rx)\}\ \checkmark \\
7 & \neg\forall x(Px \vee Qx) & 3;\ \text{RU} & \emptyset \\
8 & \neg\forall x(Px \vee Rx) \vee \neg\forall x(\neg Px \vee Rx) & 2;\ \text{RU} & \emptyset \\
9 & \forall x(Px \vee Rx) \vee \forall x(\neg Px \vee Rx) & 5;\ \text{RU} & \{\neg\forall x(Px \vee Rx)\} \\
10 & \forall x(Px \vee Rx) \vee \forall x(\neg Px \vee Rx) & 6;\ \text{RU} & \{\neg\forall x(\neg Px \vee Rx)\}
\end{array}
\]

To see what is happening in this proof, we need to understand the markings. Note that there are two minimal choice sets at stage 10

\[
\Phi_{10}(\Gamma_2) = \{\{\neg\forall x(Px \vee Qx), \neg\forall x(Px \vee Rx)\},\ \{\neg\forall x(Px \vee Qx), \neg\forall x(\neg Px \vee Rx)\}\} . \tag{11.10}
\]

Line 4 is marked in view of clause (i) in Definition 11.5, since its condition intersects with each minimal choice set in $\Phi_{10}(\Gamma_2)$. Lines 5 and 6 are marked in view of clause (ii) in Definition 11.5. For the minimal choice set $\{\neg\forall x(Px \vee Qx), \neg\forall x(Px \vee Rx)\}$, there is no line at which $\forall x(Px \vee Rx)$ was derived on a condition that does not intersect with this set. Hence line 5 is marked. Analogously, line 6 is marked because, for the minimal choice set $\{\neg\forall x(Px \vee Qx), \neg\forall x(\neg Px \vee Rx)\}$, there is no line at which $\forall x(\neg Px \vee Rx)$ was derived on a condition that does not intersect with this set.

Things change, however, when we turn to lines 9 and 10. In these cases, none of clauses (i) or (ii) of Definition 11.5 apply: for each of these lines, there is a minimal choice set in $\Phi_{10}(\Gamma_2)$ which does not intersect with the line's condition; and for each of the sets in $\Phi_{10}(\Gamma_2)$, we have derived the formula $\forall x(Px \vee Rx) \vee \forall x(\neg Px \vee Rx)$ on a condition that does not intersect with it. Hence, these lines remain unmarked at stage 10 of the proof.

Things would have been different if we made use of the reliability strategy, since

\[
U_{10}(\Gamma_2) = \{\neg\forall x(Px \vee Qx),\ \neg\forall x(Px \vee Rx),\ \neg\forall x(\neg Px \vee Rx)\} . \tag{11.11}
\]

In view of $U_{10}(\Gamma_2)$ and Definition 11.2, all of lines 4–6 and 9–10 would be marked if the above proof were a LIr-proof.

As with the reliability strategy, logical consequence for the minimal abnormality strategy is defined in terms of final derivability (Definition 11.3). A consequence relation for LIm is defined simply by replacing all occurrences of LIr in Definition 11.4 with LIm. Although the proof above can be extended in many interesting ways, showing the (non-)derivability of many more generalizations than those currently occurring in the proof, nothing will change in terms of final derivability with respect to the formulas derived at stage 10

\begin{align}
\Gamma_2 &\nvdash_{\mathrm{LIm}} \forall x(Px \vee Qx) , \tag{11.12} \\
\Gamma_2 &\nvdash_{\mathrm{LIm}} \forall x(Px \vee Rx) , \tag{11.13} \\
\Gamma_2 &\nvdash_{\mathrm{LIm}} \forall x(Px \vee \neg Rx) , \tag{11.14} \\
\Gamma_2 &\vdash_{\mathrm{LIm}} \forall x(Px \vee Rx) \vee \forall x(\neg Px \vee Rx) , \tag{11.15} \\
\Gamma_2 &\nvdash_{\mathrm{LIr}} \forall x(Px \vee Qx) , \tag{11.16} \\
\Gamma_2 &\nvdash_{\mathrm{LIr}} \forall x(Px \vee Rx) , \tag{11.17} \\
\Gamma_2 &\nvdash_{\mathrm{LIr}} \forall x(Px \vee \neg Rx) , \tag{11.18} \\
\Gamma_2 &\nvdash_{\mathrm{LIr}} \forall x(Px \vee Rx) \vee \forall x(\neg Px \vee Rx) . \tag{11.19}
\end{align}

At the beginning of Sect. 11.2.3 it was mentioned that the rationale underlying the reliability strategy is slightly more skeptical than that underlying the minimal abnormality strategy. The point is illustrated by the proof for $\Gamma_2$. As we saw, the formula $\forall x(Px \vee Rx) \vee \forall x(\neg Px \vee Rx)$ is LIm-derivable from $\Gamma_2$, but not LIr-derivable from $\Gamma_2$.
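The difference between the two marking definitions can be checked mechanically on the stage-10 proof discussed above. The following sketch assumes a simplified encoding that is not part of the chapter: abnormalities and formulas are strings, each conditional line is a pair of a formula and its condition, and the inferred Dab-formulas of the stage are passed in as sets of abnormalities. All function names are hypothetical. The sketch only evaluates marking at one given stage; it does not search proof extensions, so it says nothing about final derivability by itself.

```python
from itertools import product

def minimal_sets(sets):
    """Members of `sets` that have no proper subset in `sets`."""
    sets = {frozenset(s) for s in sets}
    return {s for s in sets if not any(t < s for t in sets)}

def unreliable(dab_sets):
    """U_s: union of the minimal inferred Dab-sets (reliability strategy)."""
    return frozenset().union(*minimal_sets(dab_sets))

def minimal_choice_sets(dab_sets):
    """Phi_s: minimal sets with one element from every minimal Dab-set."""
    picks = {frozenset(p) for p in product(*minimal_sets(dab_sets))}
    return minimal_sets(picks)

def marked_reliability(lines, dab_sets):
    """Line numbers whose condition intersects U_s."""
    u = unreliable(dab_sets)
    return {n for n, (_, cond) in lines.items() if cond & u}

def marked_minimal_abnormality(lines, dab_sets):
    """Line numbers marked by clause (i) or clause (ii)."""
    phi = minimal_choice_sets(dab_sets)
    marked = set()
    for n, (formula, cond) in lines.items():
        # Clause (i): every minimal choice set intersects the condition.
        clause_i = all(choice & cond for choice in phi)
        # Clause (ii): for some minimal choice set, the formula is nowhere
        # derived on a condition disjoint from that set.
        clause_ii = any(
            not any(f == formula and not (choice & c) for f, c in lines.values())
            for choice in phi
        )
        if clause_i or clause_ii:
            marked.add(n)
    return marked

# Stage 10 of the proof from the second premise set in the text.
A = "¬∀x(Px ∨ Qx)"
B = "¬∀x(Px ∨ Rx)"
C = "¬∀x(¬Px ∨ Rx)"
F = "∀x(Px ∨ Rx) ∨ ∀x(¬Px ∨ Rx)"
lines_stage10 = {
    4: ("∀x(Px ∨ Qx)", frozenset({A})),
    5: ("∀x(Px ∨ Rx)", frozenset({B})),
    6: ("∀x(¬Px ∨ Rx)", frozenset({C})),
    9: (F, frozenset({B})),
    10: (F, frozenset({C})),
}
dabs_stage10 = [{A}, {B, C}]  # Dab-formulas inferred at lines 7 and 8

print(sorted(marked_minimal_abnormality(lines_stage10, dabs_stage10)))  # [4, 5, 6]
print(sorted(marked_reliability(lines_stage10, dabs_stage10)))          # [4, 5, 6, 9, 10]
```

Under minimal abnormality the disjunction derived at lines 9 and 10 survives because each minimal choice set leaves one of its two derivations untouched, whereas reliability pools all abnormalities that occur in a minimal Dab-formula and marks both lines.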
now have an instance of $\forall x Qx$, so we can conditionally infer the latter generalization, taking over the condition of line 4. Importantly, not a single disjunction of members of $\Omega_{\mathrm{IL}}$ is CL-derivable from $\Gamma_3$. This means that there is no way to mark any of lines 3–5 in any extension of this proof, independently of which strategy we use.

Consequence relations for ILr and ILm are again definable in terms of final derivability (Definition 11.3). All we need to do is replace all occurrences of LIr in Definition 11.4 with ILr, respectively ILm. Hence

\begin{align}
\Gamma_3 &\vdash_{\mathrm{IL}} \forall x Px , \tag{11.21} \\
\Gamma_3 &\vdash_{\mathrm{IL}} \forall x Qx . \tag{11.22}
\end{align}

Compare the IL-proof above with the following LI-proof from $\Gamma_3$

\[
\begin{array}{rlll}
1 & Pa & \text{Prem} & \emptyset \\
2 & \neg Pb \vee Qb & \text{Prem} & \emptyset \\
3 & \forall x Px & \text{RC} & \{\neg\forall x Px\}\ \checkmark \\
4 & Qb & 2, 3;\ \text{RU} & \{\neg\forall x Px\}\ \checkmark \\
5 & \forall x Qx & \text{RC} & \{\neg\forall x Qx\}\ \checkmark \\
6 & \neg\forall x Px \vee \neg\forall x \neg Qx & 1, 2;\ \text{RU} & \emptyset \\
7 & \neg\forall x Qx \vee \neg\forall x(\neg Px \vee \neg Qx) & 1, 2;\ \text{RU} & \emptyset
\end{array}
\]

Independently of the adaptive strategy used (reliability or minimal abnormality), there are no extensions of this LI-proof in which any of lines 3–5 become unmarked. Therefore again

\begin{align}
\Gamma_3 &\nvdash_{\mathrm{LI}} \forall x Px , \tag{11.23} \\
\Gamma_3 &\nvdash_{\mathrm{LI}} \forall x Qx . \tag{11.24}
\end{align}

The premise set $\Gamma_3$ not only serves to show that IL is not strictly weaker than LI in terms of derivable generalizations. It also illustrates that, although in an IL-proof we generalize on the basis of instances, such an instance need not always be CL-derivable from the premise set. In the proof from $\Gamma_3$, we derived the generalization $\forall x Qx$ even though no instance of this generalization is CL-derivable from $\Gamma_3$. Instead, we first derived $\forall x Px$ (of which $\Gamma_3$ does provide us with an instance), and then used this generalization to infer an instance of $\forall x Qx$. This is perfectly in line with the intuition behind IL: If deriving a generalization on the basis of an instance leads us to more instances of other generalizations, then, assuming the world to be as uniform as possible, we take the world to be uniform with respect to these other generalizations as well.

When discussing inductive generalization, confirmation theorists often use the more fine-grained distinction between mere instances of a generalization, positive instances, and negative instances. For example, given a generalization $\forall x(Px \supset Qx)$, any $a$ such that $Pa \supset Qa$ is an instance of $\forall x(Px \supset Qx)$; any $a$ such that $Pa \wedge Qa$ is a positive instance of $\forall x(Px \supset Qx)$; and any $a$ such that $Pa \wedge \neg Qa$ is a negative instance of $\forall x(Px \supset Qx)$. Instead of requiring a mere instance before introducing a generalization, some confirmation theorists have suggested the stronger requirement for a positive instance, that is, a negative instance of the contrary generalization (Sect. 11.4.3). According to this idea, interpreting the world as uniform as possible amounts to generalizing whenever a positive instance is available to us. Abnormal situations, then, are those in which both a positive and a negative instance of a generalization are available to us. There is a corresponding variant of IL that hard-codes this idea in its set of abnormalities: the logic G from [11.5]. The latter is defined by the lower limit logic CL, the set of abnormalities $\Omega_{\mathrm{G}}$ and either the reliability strategy (Gr) or the minimal abnormality strategy (Gm).

\[
\Omega_{\mathrm{G}} =_{\mathrm{df}} \{\exists(A_1 \wedge \ldots \wedge A_n \wedge A_0) \wedge \exists(A_1 \wedge \ldots \wedge A_n \wedge \neg A_0) \mid A_0, A_1, \ldots, A_n \in \mathcal{A}^{f1};\ n \geq 0\} . \tag{11.25}
\]

In proofs to follow $\exists(A_1 \wedge \ldots \wedge A_n \wedge A_0) \wedge \exists(A_1 \wedge \ldots \wedge A_n \wedge \neg A_0)$ is abbreviated as $A_1 \wedge \ldots \wedge A_n \wedge \pm A_0$ (where $A_0, A_1, \ldots, A_n \in \mathcal{A}^{f1}$). As an illustration of the workings of G, consider the following G-proof from $\Gamma_4 = \{Pa \wedge Qa, \neg Qb, \neg Pc\}$

\[
\begin{array}{rlll}
1 & Pa \wedge Qa & \text{Prem} & \emptyset \\
2 & \neg Qb & \text{Prem} & \emptyset \\
3 & \neg Pc & \text{Prem} & \emptyset \\
4 & \forall x(Px \supset Qx) & 1;\ \text{RC} & \{Px \wedge \pm Qx\}
\end{array}
\]
\[
\begin{array}{rlll}
8 & \exists x Qx \wedge \exists x \neg Qx & 1, 2;\ \text{RU} & \emptyset
\end{array}
\]

The formulas derived at lines 4–6 are finally G-derivable in the proof. Since G-consequence too is defined in terms of final derivability, it follows, independently of the strategy used, that

\begin{align}
\Gamma_4 &\vdash_{\mathrm{G}} \forall x(Px \supset Qx) , \tag{11.26} \\
\Gamma_4 &\vdash_{\mathrm{G}} \forall x(Qx \supset Px) , \tag{11.27} \\
\Gamma_4 &\vdash_{\mathrm{G}} \forall x(Px \equiv Qx) . \tag{11.28}
\end{align}

Now consider the following IL-proof from $\Gamma_4$ (where $A_1, \ldots, A_n \in \mathcal{A}^{f1}$, $!(A_1 \vee \ldots \vee A_n)$ abbreviates $\exists(A_1 \vee \ldots \vee A_n) \wedge \exists\neg(A_1 \vee \ldots \vee A_n)$)

\[
\begin{array}{rlll}
1 & Pa \wedge Qa & \text{Prem} & \emptyset \\
2 & \neg Qb & \text{Prem} & \emptyset \\
3 & \neg Pc & \text{Prem} & \emptyset \\
4 & \forall x(Px \supset Qx) & 1;\ \text{RC} & \{!(\neg Px \vee Qx)\}\ \checkmark \\
5 & \forall x(Qx \supset Px) & 1;\ \text{RC} & \{!(\neg Qx \vee Px)\}\ \checkmark \\
6 & \forall x(Px \equiv Qx) & 4, 5;\ \text{RU} & \{!(\neg Px \vee Qx),\ !(\neg Qx \vee Px)\}\ \checkmark \\
7 & !Px & 1, 3;\ \text{RU} & \emptyset \\
8 & !Qx & 1, 2;\ \text{RU} & \emptyset \\
9 & !(Px \vee Qx) \vee {!(\neg Px \vee Qx)} & 1, 2;\ \text{RU} & \emptyset \\
10 & !(\neg Qx \vee Px) \vee {!(Px \vee Qx)} & 1, 3;\ \text{RU} & \emptyset \\
11 & !(\neg Px \vee \neg Qx) & 1, 2;\ \text{RU} & \emptyset
\end{array}
\]

The minimal inferred Dab-formulas inferred at lines 7–11 will remain minimal in any extension of this proof

(11.31)

Two more remarks are in order. First, the example above suggests that G is in general stronger than IL. This is correct for the minimal abnormality strategy, but false for the reliability strategy. An illustration is provided by the premise set $\Gamma_5 = \{Pa, Qb, Rb, Qc, \neg Rc\}$. The generalization $\forall x(\neg Px \supset Qx)$ cannot be inferred on the condition $\neg Px \wedge \pm Qx$, since we lack a positive instance. It can be inferred on the conditions $\pm Qx$ or $\pm Px$ in view of $\forall x Qx \vdash_{\mathrm{CL}} \forall x(\neg Px \supset Qx)$ and $\forall x Px \vdash_{\mathrm{CL}} \forall x(\neg Px \supset Qx)$, but none of these conditions are reliable in view of the derivability of minimal Dab-formulas like $\pm Px \vee (Px \wedge \pm Rx)$ and $\pm Qx \vee (Qx \wedge \pm Px) \vee (Px \wedge \pm Rx)$.

The situation is different in an ILr-proof, where deriving $\forall x(\neg Px \supset Qx)$ on the condition $!(Px \vee Qx)$ in a proof from $\Gamma_5$ is both possible and final. That is, for every derivable Dab-formula in which $!(Px \vee Qx)$ occurs, we can derive a shorter (minimal) disjunction of abnormalities in which it no longer occurs. Summing up

\begin{align}
\Gamma_5 &\nvdash_{\mathrm{Gr}} \forall x(\neg Px \supset Qx) , \tag{11.32} \\
\Gamma_5 &\vdash_{\mathrm{ILr}} \forall x(\neg Px \supset Qx) . \tag{11.33}
\end{align}

The second remark is that the requirement for a positive instance before generalizing in a G-proof is still insufficient to guarantee that for every G-derivable generalization a positive instance is CL-derivable from the premises. The following proof from $Pa$ illustrates the point

\[
\begin{array}{rlll}
1 & Pa & \text{Prem} & \emptyset \\
2 & \forall x Px & 1;\ \text{RC} & \{\pm Px\} \\
3 & \forall x(Qx \supset Px) & 2;\ \text{RU} & \{\pm Px\}
\end{array}
\]

Independently of the strategy used, no means are available to mark line 3, hence $Pa \vdash_{\mathrm{G}} \forall x(Qx \supset Px)$, even though no positive instance of $\forall x(Qx \supset Px)$ is available. More on this point below (see the discussion on Hempel's raven paradox in Sect. 11.4.1 and in the Appendix).

A total of six logics have been presented so far: the logics LIr, LIm, ILr, ILm, Gr, and Gm. Each of these systems interprets the claim that the world is uniform in a slightly different way, leading to slightly different logics. Importantly, there is no Carnapian embarrassment of riches here: each of the systems has a clear intuition behind it.

The systems presented here can be combined so as to implement Popper's suggestion that more general hypotheses should be given precedence over less general ones [11.11]. For instance, if two generalizations $\forall x(Px \supset Qx)$ and $\forall x((Rx \wedge Sx) \supset Tx)$ are jointly incompatible with the premises, a combined system gives precedence to the more general hypothesis and delivers only $\forall x(Px \supset Qx)$ as a consequence. There are various ways to hard-code this idea, resulting in various new combined adaptive logics for inductive generalization, each slightly different from the others. These combinations are not fully spelled out here. For a brief synopsis, see [11.5, Sect. 5].
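The different evidential requirements that LI, IL, and G place on the direct introduction of a generalization of the form All P are Q can be sketched as follows. The sketch rests on simplifying assumptions that are not made in the chapter: the premises are reduced to a table of literals known for each individual constant (the dictionary `facts`), only literals listed explicitly are consulted rather than everything that is CL-derivable, and the function names are hypothetical. It covers the introduction step only. Retraction by marking is a separate matter, and, as the proofs above show, a generalization may also become derivable indirectly from other generalizations.

```python
def holds(facts, obj, literal):
    """True or False if the literal's value is known for obj, otherwise None."""
    pred, positive = literal
    value = facts.get(obj, {}).get(pred)
    return None if value is None else (value == positive)

def negate(literal):
    pred, positive = literal
    return (pred, not positive)

def is_instance(facts, obj, antecedent, consequent):
    """obj is known to falsify the antecedent or to satisfy the consequent."""
    return (holds(facts, obj, negate(antecedent)) is True
            or holds(facts, obj, consequent) is True)

def is_positive_instance(facts, obj, antecedent, consequent):
    """obj is known to satisfy both antecedent and consequent."""
    return (holds(facts, obj, antecedent) is True
            and holds(facts, obj, consequent) is True)

def is_negative_instance(facts, obj, antecedent, consequent):
    """obj is known to satisfy the antecedent and falsify the consequent."""
    return (holds(facts, obj, antecedent) is True
            and holds(facts, obj, consequent) is False)

def introduction_requirements(facts, antecedent, consequent):
    """Which logics allow introducing the generalization directly from the data."""
    objs = list(facts)
    return {
        "LI": True,  # LI places no evidential requirement on introduction
        "IL": any(is_instance(facts, o, antecedent, consequent) for o in objs),
        "G": any(is_positive_instance(facts, o, antecedent, consequent) for o in objs),
    }

# Premise set {Pa, Qb, Rb, Qc, ¬Rc} and the generalization ∀x(¬Px ⊃ Qx).
gamma5 = {"a": {"P": True}, "b": {"Q": True, "R": True}, "c": {"Q": True, "R": False}}
print(introduction_requirements(gamma5, ("P", False), ("Q", True)))

# Premise set {Pa ∧ Qa, ¬Qb, ¬Pc} and the generalization ∀x(Px ⊃ Qx).
gamma4 = {"a": {"P": True, "Q": True}, "b": {"Q": False}, "c": {"P": False}}
print(introduction_requirements(gamma4, ("P", True), ("Q", True)))
```

For the first premise set the sketch reports an instance but no positive instance, in line with the remark that the generalization cannot be introduced in a G-proof on the condition requiring a positive instance. Treating unknown values as `None` instead of `False` keeps the open-world character of the premise sets visible.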
Part C | 11.4
is relevant to a formula B iff there is some model M can, however, be adjusted so as to take into account
of A such that: if M 0 differs from M only in the value background knowledge [11.19, 20].) For problems re-
assigned to B, M 0 is not a model of A. The domain lated to auxiliary hypotheses, see also Sect. 11.4.2.
of a formula A is the set of individual constants that For now, it suffices to note that the criteria from Def-
occur in the atomic formulas that are relevant for A. inition 11.6 do not face this problem, as quantified
The development of a universally quantified formula formulas are perfectly allowed to occur in premise sets.
A for another formula B is the restriction of A to the For instance, the set fPa; Qa; Pb; Qb; 8x.Px
Rx/g I-
domain of B, that is, the truth value of A is evaluated confirms the hypothesis 8x.Rx
Qx/
Part C | 11.4
with respect to the domain of B. For instance, the do-
main of Pa ^ .Pb _ Qc/ is fa; b; cg whereas the domain Pa; Qa; Pb; Qb; 8x.Px
Rx/ `I 8x.Rx
Qx/ :
of Pa ^ Qa is fag; and the development of 8x.Px
Qx/ (11.35)
for Pa ^ :Qb is .Pa
Qa/ ^ .Pb
Qb/.
It seems, then, that I-confirmation is not too restric-
Definition 11.7 Hempel’s satisfaction criterion tive a criterion for confirmation. However, there are two
An observation report E directly confirms a hypothesis senses in which I-confirmation, like Hempelian confir-
H if E entails the development of H for E. mation, can be said to be too liberal. The first has to
An observation report E confirms a hypothesis H if H do with Goodman’s well-known new riddle of induc-
is entailed by a class of sentences each of which is di- tion [11.21]. The family of adaptive logics for inductive
rectly confirmed by E. generalization makes no distinction between regulari-
An observation report E disconfirms a hypothesis H if ties that are projectible and regularities that are not.
it confirms the denial of H. Using Goodman’s famous example, let an emerald be
An observation report E is neutral with respect to a hy- grue if it is green before January 1st 2020, and blue
pothesis H if E neither confirms nor disconfirms H. thereafter. Then the fact that all hitherto observed emer-
alds are grue confirms the hypothesis that all emeralds
There are two reasons for arguing that Hempel’s satis- are grue. The latter regularity is not projectible into the
faction criterion is too restrictive, and two reasons for future, as we do not seriously believe that in 2020 we
arguing that it is too liberal. Each of these is discussed in turn. First, in order for the evidence to confirm a hypothesis H according to Hempel's criterion, all objects in the development of H must be known to be instances of H. This is a very strong requirement. I-confirmation is different in this respect. For instance,

$$Pa, Qa, \neg Pb, \neg Qb, Pc \vdash_{\mathbf{I}} \forall x(Px \supset Qx) . \tag{11.34}$$

In (11.34) it is unknown whether c instantiates the hypothesis $\forall x(Px \supset Qx)$, since the premises do not tell us whether $Pc \supset Qc$. The development of $\forall x(Px \supset Qx)$ entails $Pc \supset Qc$, whereas the premise set of (11.34) does not. So the hypothesis $\forall x(Px \supset Qx)$ is not directly confirmed by these premises according to the satisfaction criterion, nor is it entailed by one or more sentences which are directly confirmed by them. Therefore the satisfaction criterion judges the premises to be neutral with respect to the hypothesis $\forall x(Px \supset Qx)$, whereas (11.34) illustrates that $\forall x(Px \supset Qx)$ is I-confirmed by these premises.

Second, given the law $\forall x(Px \supset Rx)$, the report $\{Pa, Qa, Pb, Qb\}$ does not confirm the hypothesis $\forall x(Rx \supset Qx)$ according to Hempel's original formulation of the satisfaction criterion. The reason is that auxiliary hypotheses like $\forall x(Px \supset Rx)$ contain quantifiers and therefore cannot be elements of observation reports. (The original formulation of Hempel's criterion

will start observing blue emeralds. Nonetheless, it is perfectly fine to define a predicate denoting the property of being grue, just as it is perfectly fine to define a predicate denoting the property of being green. Yet the hypothesis all emeralds are green is projectible, whereas all emeralds are grue is not.

The problem of formulating precise rules for determining which regularities are projectible and which are not is difficult and important, but it is an epistemological problem that cannot be solved by purely logical means. Consequently, it falls outside the scope of this article. See [11.21] for Goodman's formulation and proposed solution of the problem, and [11.22] for a collection of essays on the projectibility of regularities.

Finally, one may argue that I-confirmation is too liberal on the basis of Hempel's own raven paradox. Where Ra abbreviates that a is a raven, and Ba abbreviates that a is black, a non-black non-raven I-confirms the hypothesis that all ravens are black

$$\neg Ba, \neg Ra \vdash_{\mathbf{I}} \forall x(Rx \supset Bx) . \tag{11.36}$$

Even the logic G does not block this inference. The reason is that we are given a positive instance of the generalization $\forall x(\neg Bx \supset \neg Rx)$, so we can derive this generalization on the condition $\exists x(\neg Bx \wedge \neg Rx) \wedge \exists x(\neg Bx \wedge Rx)$. As the generalization $\forall x(\neg Bx \supset \neg Rx)$
242 Part C The Logic of Hypothetical Reasoning, Abduction, and Models
is G-derivable from the premises, so is the logically equivalent hypothesis that all ravens are black, $\forall x(Rx \supset Bx)$ (remember that G, like all logics defined in the previous section, is closed under CL).

Hempel's own reaction to the raven paradox was to bite the bullet and accept its conclusion [11.23]. According to Hempel, a non-black non-raven indeed confirms the raven hypothesis in case we did not know

the empty set. Then, according to Hempel's entailment condition, H is confirmed by E, since $E \vdash H$. Not so according to HD confirmation, for condition (ii) of Definition 11.8 is violated ($H \nvdash E$) [11.24]. The same example illustrates how HD confirmation violates the following condition, which holds for the satisfaction criterion in view of Definition 11.7 [11.25]:
Part C | 11.4
Definition 11.8. In general, this inclusion is an advantage, since evidence often does not (dis)confirm a hypothesis simpliciter. Rather, evidence (dis)confirms hypotheses with respect to a set of auxiliary (background) assumptions or theories. The vocabulary of a theory often extends beyond what is directly observable. Notwithstanding Hempel's conviction to the contrary, nowadays philosophers largely agree that the use of purely theoretical terms is both intelligible and necessary in science [11.26]. Making the confirmation relation relative to a set of auxiliaries allows for the inclusion of bridging principles connecting observation terms with theoretical terms, permitting purely theoretical hypotheses to be confirmed by pure observation statements [11.27]. However, making confirmation relative to background assumptions makes HD vulnerable to a type of objection often traced back to Duhem [11.28] and Quine [11.29]. Suppose that a hypothesis H entails an observation E relative to a set of background assumptions, and that E is found to be false. Then either (a) H is false or (b) a member of that set is false. But the evidence does not tell us which of (a) or (b) is the case, so we always have the option to retain H and blame some auxiliary hypothesis in the background information. More generally, one may object that what gets (dis)confirmed by observations is not a hypothesis taken by itself, but the conjunction of a hypothesis and a set of background assumptions or theories.

With Elliott Sober, we can counter such holistic objections by pointing to the different epistemic status of hypotheses under test and auxiliary hypotheses (or hypotheses used in a test). Auxiliaries are independently testable, and when used in an experiment we already have good reasons to think of these hypotheses as true. Moreover, they are epistemically independent of the test outcome. So if a hypothesis is disconfirmed by the HD criterion, we can, in the vast majority of cases, maintain that it is the hypothesis we need to retract, and not one of the background assumptions [11.30].

A parallel point can be made concerning I-confirmation. Here too, we can add to the premises a set of auxiliary or background assumptions. And here too, we can use Sober's defence against objections from evidential holism. A nice feature of I-confirmation is that in adaptive proofs the weaker epistemic status of hypotheses inferred from an observation report in conjunction with a set of auxiliaries is reflected by their non-empty condition. Whereas auxiliaries are introduced as premises on the empty condition, inductively generated hypotheses are derived conditionally and may be retracted at a later stage of the proof. For a more fine-grained treatment of background information in adaptive logics for inductive generalization, see [11.5, Sect. 6].

The third objection against HD confirmation dates back to Hempel's [11.17], in which he argued that a variant of HD confirmation (which he calls the prediction criterion of confirmation) is circular. The problem is that in HD confirmation the hypothesis to be confirmed functions as a premise from which we derive the evidence, and that it is unclear where this premise comes from. The hypothesis is not generated, but given in advance, so HD confirmation presupposes the prior attainment, by inductive reasoning, of a hypothesis. This inductive move, Hempel argues, already presupposes the idea of confirmation, making the HD account circular.

The weak step in Hempel's argument consists in his assumption that the inductive jump to the original attainment of a hypothesis already presupposes the confirmation of this hypothesis. In testing or generating a hypothesis we need not yet believe or accept it. Typically, belief and acceptance come only after confirming the hypothesis. Indeed, in probabilistic notions of confirmation the idea is often exactly this: confirming a hypothesis amounts to increasing our degree of belief in it. Hempel's circularity objection, it seems, confuses hypothesis generation and hypothesis confirmation.

Hempel's circularity objection does not undermine HD confirmation, but it points to the wider scope of the adaptive account as compared to HD confirmation. In an I-proof, the conditional rule allows us to generate hypotheses. Hypotheses are not given in advance but are computable by the logic itself. Moreover, a clear distinction can be made between hypothesis generation and hypothesis confirmation. Hypotheses generated in an I-proof may be derivable at some stage of the proof, but the central question is whether they can be retained, that is, whether they are finally derivable. I-confirmation, then, amounts to final derivability in an I-proof, whereas the inductive step of hypothesis generation is represented by retractable applications of RC.

11.4.3 Interdependent Abnormalities and Heuristic Guidance

For any of the adaptive logics for inductive generalization defined in this chapter, at most one positive instance is needed to try and derive and, subsequently, confirm a generalization for a given set of premises. This is a feature that I-confirmation shares with the other qualitative criteria of confirmation. As a simple illustration, note that an observation report consisting of a single observation Pa confirms the hypothesis $\forall x Px$ according to all qualitative criteria discussed in this chapter. Proponents of quantitative approaches to confirmation may object that this is insufficient; that a stronger criterion is needed which requires more than one instance for a hypothesis to be confirmed. Against
this view, one can uphold that confirmation is mainly falsification-driven. Rather than confirming hypotheses by heaping up positive instances, we try and test them by searching for negative instances. In the remainder of this section, it is argued by means of a number of examples that I-confirmation is sufficiently selective as a criterion for confirming generated hypotheses. The examples moreover allow for the illustration of an additional feature of I-confirmation: its use as a heuristic guide for provoking further tests in generating and confirming additional hypotheses.

Simple examples like the one given in the previous paragraph may suggest that, in the absence of falsifying instances, a single instance usually suffices to I-confirm a hypothesis. This is far from the truth. Consider the simple premise set $\Gamma_6 = \{\neg Pa \vee Qa, \neg Qb, Pc\}$. This premise set contains instances of all of the generalizations $\forall x Px$, $\forall x \neg Qx$, and $\forall x(Px \supset Qx)$. Not a single one of these is IL-confirmed, however, due to the derivability of the following disjunctions of abnormalities

$$!Px \vee {!Qx} , \tag{11.37}$$
$$!Px \vee {!(\neg Px \vee Qx)} , \tag{11.38}$$
$$!(Px \vee Qx) \vee {!(\neg Px \vee Qx)} , \tag{11.39}$$
$$!Qx \vee {!(\neg Px \vee Qx)} , \tag{11.40}$$
$$!(\neg Px \vee Qx) \vee {!(\neg Px \vee \neg Qx)} . \tag{11.41}$$

Note that $\Gamma_6$ contains positive instances of both $\forall x Px$ and $\forall x \neg Qx$, so not even a positive instance suffices for a generalization to be finally IL-derivable in the absence of falsifying instances. The same is true if we switch from IL to G. None of $\forall x Px$, $\forall x \neg Qx$, or $\forall x(Px \supset Qx)$ is G-confirmed, due to the derivability of the following disjunctions of abnormalities

$$\pm Px \vee \pm Qx , \tag{11.42}$$
$$\pm Px \vee (Px \wedge \pm Qx) , \tag{11.43}$$
$$\pm Qx \vee (Qx \wedge \pm Px) . \tag{11.44}$$

The reason for the non-confirmation of generalizations like $\forall x Px$, $\forall x \neg Qx$, or $\forall x(Px \supset Qx)$ in this example has to do with the dependencies that exist between abnormalities. Even if a generalization is not falsified by the data, it is often the case that this generalization is not compatible with a different generalization left unfalsified by the data. As a further illustration, consider the premise set $\Gamma_7 = \{\neg Ra, \neg Ba, Rb\}$. Again, although no falsifying instance is present, the generalization $\forall x(Rx \supset Bx)$ is not IL-derivable. The reason is the derivability of the following minimal disjunction of abnormalities

$$!(\neg Rx \vee Bx) \vee {!(\neg Rx \vee \neg Bx)} . \tag{11.45}$$

Examples like these illustrate that I-confirmation is not too liberal a criterion of confirmation. They also serve to illustrate a different point. Minimal Dab-formulas like (11.45) evoke questions. Which of the two abnormalities is the case? For this particular premise set, establishing which of Bb or $\neg Bb$ is the case would settle the matter. For if Bb were the case, then the second disjunct of (11.45) would be derivable, and (11.45) would no longer be minimal. Consequently, the abnormality $\exists x(\neg Rx \vee Bx) \wedge \exists x \neg(\neg Rx \vee Bx)$ would no longer be part of a minimal disjunction of abnormalities, and the generalization $\forall x(Rx \supset Bx)$ would become finally derivable. Analogously, if $\neg Bb$ were the case, then the first disjunct of (11.45) would become derivable, and, by the same reasoning, the generalization $\forall x(Rx \supset \neg Bx)$ would become finally derivable. Thus

$$\Gamma_7 \cup \{Bb\} \vdash_{\mathbf{IL}} \forall x(Rx \supset Bx) , \tag{11.46}$$
$$\Gamma_7 \cup \{\neg Bb\} \vdash_{\mathbf{IL}} \forall x(Rx \supset \neg Bx) . \tag{11.47}$$

Two more comments are in order here. First, this example illustrates that confirming a hypothesis often involves the disconfirmation of the contrary hypothesis. We saw that if we use Hempel's criterion a non-black non-raven confirms the raven hypothesis. But as Goodman pointed out "the prospects for indoor ornithology vanish when we notice that under these same conditions, the contrary hypothesis that no ravens are black is equally well confirmed" [11.21, p. 71]. Thus, according to Goodman, confirming the raven hypothesis $\forall x(Rx \supset Bx)$ requires disconfirming its contrary $\forall x(Rx \supset \neg Bx)$. This is exactly what happens in the example: in order to IL-derive $\forall x(Rx \supset Bx)$, a falsifying instance for its contrary is needed, as (11.46) illustrates.

Goodman's suggestion that the confirmation of a hypothesis requires the falsification/disconfirmation of its contrary was picked up by Israel Scheffler, who developed it further in his [11.31]. Note that falsifying the contrary of the raven hypothesis amounts to finding a positive instance of the raven hypothesis. Thus, in demanding a positive instance before permitting generalization in a G-proof, the latter system goes further than IL in implementing Goodman's idea. As we saw, however, not even G goes all the way: a generalization may be G-derivable even in the absence of a positive instance.

Second, if empirical (observational or experimental) means are available to answer questions like $?\{Bb, \neg Bb\}$ in the foregoing example, these questions may be called tests [11.2]. Adaptive logics for inductive generalization provide heuristic guidance in the sense that interdependencies between abnormalities evoke such tests. Importantly, further tests may lead to
the derivability of new generalizations. In the example, deciding the question $?\{Bb, \neg Bb\}$ in favor of Bb leads to the confirmation of $\forall x(Rx \supset Bx)$ and to the disconfirmation of $\forall x(Rx \supset \neg Bx)$, while deciding it in favor of $\neg Bb$ leads to the confirmation of $\forall x(Rx \supset \neg Bx)$ and to the disconfirmation of $\forall x(Rx \supset Bx)$. This is an important practical advantage of I-confirmation over other qualitative criteria: adaptive logics for inductive generalization evoke tests for increasing the number of confirmed generalizations.

The illustrations so far may suggest that this heuristic guidance provided by I-confirmation only applies to hypotheses that are logically related or closely connected, like the raven hypothesis and its contrary. But the point is more general, as the following example illustrates.

Consider the premise set

By the same reasoning as in the previous illustration, $\Gamma_8$ evokes the question $?\{Qc, \neg Qc\}$. If this question is a test (if it can be answered by empirical means), the answer will confirm one of the generalizations $\forall x(Px \supset Qx)$ and $\forall x(Rx \supset \neg Qx)$, and will disconfirm the other generalization [11.2].

The example generalizes. In LI and G too, the derivability of $\forall x(Px \supset Qx)$ and $\forall x(Rx \supset \neg Qx)$ is blocked due to the CL-derivability of the LI-minimal Dab-formula (11.49), respectively the G-minimal Dab-formula (11.50)

$$\neg \forall x(Px \supset Qx) \vee \neg \forall x(Rx \supset \neg Qx) , \tag{11.49}$$
$$(Px \wedge \pm Qx) \vee (Rx \wedge \pm Qx) . \tag{11.50}$$

Here too, deciding the question $?\{Qc, \neg Qc\}$ resolves the matter. Thus, where $\mathbf{I} \in \{\mathbf{LI}, \mathbf{IL}, \mathbf{G}\}$
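The dependency between the raven hypothesis and its contrary for the premise set {¬Ra, ¬Ba, Rb} can be checked by brute force. The following is only an illustrative sketch: it enumerates classical interpretations over the two named individuals a and b (a closed-domain assumption made here for simplicity) and tests compatibility, which is weaker than final derivability in an adaptive proof. All function and variable names are hypothetical.

```python
from itertools import product

INDIVIDUALS = ("a", "b")   # assumption: the domain is just the named objects
PREDICATES = ("R", "B")    # R = is a raven, B = is black


def models(premises):
    """Yield every interpretation over INDIVIDUALS that satisfies all premises."""
    atoms = [(p, i) for p in PREDICATES for i in INDIVIDUALS]
    for values in product((False, True), repeat=len(atoms)):
        m = dict(zip(atoms, values))
        if all(premise(m) for premise in premises):
            yield m


def all_ravens_black(m):
    """forall x (Rx implies Bx), read as material implication."""
    return all((not m["R", i]) or m["B", i] for i in INDIVIDUALS)


def no_ravens_black(m):
    """forall x (Rx implies not Bx), the contrary hypothesis."""
    return all((not m["R", i]) or (not m["B", i]) for i in INDIVIDUALS)


def compatible(premises, *hypotheses):
    """True if some model of the premises satisfies all hypotheses together."""
    return any(all(h(m) for h in hypotheses) for m in models(premises))


def entails(premises, hypothesis):
    """True if every model of the premises satisfies the hypothesis."""
    return all(hypothesis(m) for m in models(premises))


# Premise set {not Ra, not Ba, Rb}
GAMMA_7 = [
    lambda m: not m["R", "a"],
    lambda m: not m["B", "a"],
    lambda m: m["R", "b"],
]

print(compatible(GAMMA_7, all_ravens_black))                   # True
print(compatible(GAMMA_7, no_ravens_black))                    # True
print(compatible(GAMMA_7, all_ravens_black, no_ravens_black))  # False
```

Each generalization is unfalsified on its own, but the two cannot be upheld together, which mirrors the role of the minimal disjunction of abnormalities. Adding the test outcome Bb as a further premise leaves only interpretations in which the raven hypothesis holds.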
11.5 Conclusions

A number of adaptive logics for inductive generalization were presented each of which, it was argued, can be re-interpreted as a criterion of confirmation. The logics in question can be classified along two dimensions. The first dimension concerns when it is permitted to introduce a generalization in an adaptive proof. The logic LI permits the free introduction of generalizations. IL and G require instances of a generalization before introducing it in a proof. Interestingly, these stronger requirements do not result in stronger logics.

The second dimension along which the logics defined in this chapter can be classified concerns their adaptive strategy. Here, no surprises arise. A logic defined using the reliability strategy is in general weaker than its counterpart logic defined using the minimal abnormality strategy (this was shown to be the case for all adaptive logics defined within the standard format [11.7, Theorem 11]).

When re-interpreted as criteria of confirmation, the logics defined here withstand the comparison with their main rivals, i.e., Hempel's satisfaction criterion and the hypothetico-deductive model of confirmation. In conclusion, the adaptive confirmation criteria defined in Sect. 11.4 offer an interesting alternative perspective on (qualitative) confirmation theory.
35].

A similar strategy could be adopted with respect to I-confirmation and the raven paradox. In this appendix, an alternative adaptive logic of induction, IC, is defined, as is a corresponding criterion of confirmation which is slightly less permissive than the criteria from Sect. 11.4. IC makes use of a non-classical conditional resembling a number of conditionals originally defined in order to avoid the so-called paradoxes of material implication. First, an extension of CL is introduced, including this new conditional connective. Next, the adaptive logic IC is defined.

The new conditional, $\to$, is fully characterized by the following rules and axiom schemas

$$\frac{A,\ (A \to B)}{B} , \tag{MP}$$
$$\frac{A \equiv B}{(A \to C) \equiv (B \to C)} , \tag{RCEA}$$
$$\frac{A \equiv B}{(C \to A) \equiv (C \to B)} , \tag{RCEC}$$
$$(A \to (B \wedge C)) \equiv ((A \to B) \wedge (A \to C)) , \tag{D$\wedge$}$$
$$((A \vee B) \to C) \equiv ((A \to C) \wedge (B \to C)) . \tag{D$\vee$}$$

((RCEA), (RCEC), and (D∧) fully characterize the conditional of Chellas's logic CR from [11.36]. The latter was also used for capturing explanatory conditionals in [11.37]. See also [11.38, Chap. 5] for some closely related conditional logics, including an extension of Chellas's systems that validates (MP).)

Let $\mathbf{CL}^{\to}$ be the logic resulting from adding $\to$ to the language of CL, and from adding (MP)-(D∨) to the list of rules and axioms of CL. Note that the conditional $\to$ is strictly stronger than $\supset$

$$(A \to B) \supset (A \supset B) . \tag{11.57}$$

(By (MP), $A, (A \to B) \vdash_{\mathbf{CL}^{\to}} B$. By the deduction theorem for $\supset$, $A \to B \vdash_{\mathbf{CL}^{\to}} A \supset B$. By the deduction theorem again, $\vdash_{\mathbf{CL}^{\to}} (A \to B) \supset (A \supset B)$.)

In view of this bridging principle between both conditionals it is easily seen that counter-instances to a formula of the form $\forall x(A(x) \supset B(x))$ form counter-instances to $\forall x(A(x) \to B(x))$, and falsify the latter formula as well. For instance, if $Pa \wedge \neg Qa$, then, by CL, $\neg \forall x(Px \supset Qx)$, and, by (11.57), $\neg \forall x(Px \to Qx)$.

$$A_0, A_1, \ldots, A_n \in \mathcal{A}^{f1};\ n \geq 0\} ; \tag{11.58}$$

and the adaptive strategy reliability ($\mathbf{IC}^r$) or minimal abnormality ($\mathbf{IC}^m$). IC is defined within the SF. All rules and definitions for its proof theory are as for the other logics defined in this chapter, except that in the definition of RU and RC, CL is replaced with $\mathbf{CL}^{\to}$.

The following proof illustrates how formulas are derived conditionally in IC

1   $\neg Ra$   Prem   $\emptyset$
2   $\neg Ba$   Prem   $\emptyset$
3   $\forall x(\neg Bx \to \neg Rx)$   1, 2; RC   $\{\exists x(\neg Bx \wedge \neg Rx) \wedge \neg \forall x(\neg Bx \to \neg Rx)\}$

Given only the premises $\neg Ra$ and $\neg Ba$, there is no possible extension of this proof in which line 3 gets marked. Hence

$$\neg Ra, \neg Ba \vdash_{\mathbf{IC}} \forall x(\neg Bx \to \neg Rx) . \tag{11.59}$$

However, contraposition is invalid for the new conditional $\to$, hence we cannot derive the raven hypothesis from the formula derived at line 3. Note also that, in view of (11.60), we cannot use the conditional rule RC to derive $\forall x(Rx \to Bx)$ on the condition $\{\exists x(Rx \wedge Bx) \wedge \neg \forall x(Rx \to Bx)\}$ in an IC-proof, since

$$\neg Ra, \neg Ba \nvdash_{\mathbf{CL}^{\to}} \forall x(Rx \to Bx) \vee (\exists x(Rx \wedge Bx) \wedge \neg \forall x(Rx \to Bx)) . \tag{11.60}$$

Therefore

$$\neg Ra, \neg Ba \nvdash_{\mathbf{IC}} \forall x(Rx \to Bx) . \tag{11.61}$$

Thus, if conditional statements of the form for all x, if A(x) then B(x) are taken to be IC-confirmed only if the conditional in question is an arrow ($\to$) instead of a material implication, then the raven paradox, in its original formulation, is blocked.

An additional property of IC is that strengthening the antecedent fails for $\to$. In Sect. 11.3, for instance, we saw that

$$Pa \vdash_{\mathbf{G}} \forall x(Qx \supset Px) . \tag{11.62}$$
$$Pa \vdash_{\mathbf{IC}} \forall x(Qx \supset Px) . \tag{11.64}$$

However, since $\forall x Px \nvdash_{\mathbf{CL}^{\to}} \forall x(Qx \to Px)$, and since we do not have any further means to conditionally derive the formula $\forall x(Qx \to Px)$ in an IC-proof

$$Pa \nvdash_{\mathbf{IC}} \forall x(Qx \to Px) . \tag{11.65}$$

Originally, the logics in the G-family were constructed as logics requiring a positive instance before we are allowed to apply RC. This is reflected in the definition of the set of G-abnormalities. In order to derive a formula like $\forall x(Px \supset Qx)$ on its corresponding condition, a positive instance, e.g., $Pa \wedge Qa$, is needed. Examples like (11.36) and (11.62) show, however, that such a positive instance is not always required in order to G-derive a generalization. The logic IC, it seems, does much better in this respect. However, it still does not fully live up to the requirement for a positive instance before generalizing, as the following IC-proof from $\Gamma_9 = \{\neg Ra \wedge \neg Ba, Rb, Bc\}$ illustrates (where $A_0, A_1, \ldots, A_n \in \mathcal{A}^{f1}$, $((A_1 \wedge \ldots \wedge A_n) \to A_0)$ abbreviates $\exists(A_1 \wedge \ldots \wedge A_n \wedge A_0) \wedge \neg \forall((A_1 \wedge \ldots \wedge A_n) \to A_0)$).

1   $\neg Ra \wedge \neg Ba$   Prem   $\emptyset$
2   $Rb$   Prem   $\emptyset$
3   $Bc$   Prem   $\emptyset$

The key step in this proof is the derivation of Bb at line 5, which together with Rb provides us with a positive instance of the raven hypothesis. Bb is derivable from lines 2 and 4 in view of CL and (11.57). Except for the formulas $\exists x Rx \wedge \exists x \neg Rx$ and $\exists x Bx \wedge \exists x \neg Bx$, no minimal Dab-formulas are $\mathbf{CL}^{\to}$-derivable from $\Gamma_9$. Therefore

$$\Gamma_9 \vdash_{\mathbf{IC}} \forall x(Rx \to Bx) . \tag{11.66}$$

As (11.61) illustrates, the logic IC avoids the raven paradox in its original formulation. A possible drawback of IC is that it does not fully meet the demand for a positive instance when confirming a hypothesis (Sect. 11.4.3). It is left open whether it is possible and desirable to further extend IC so as to fully meet this demand.

Acknowledgments. The author is greatly indebted to Atocha Aliseda, Cristina Barés-Gómez, Diderik Batens, Matthieu Fontaine, Jan Sprenger, and Frederik Van De Putte for insightful and valuable comments on previous drafts of this chapter. Research for this article was supported by the Programa de Becas Posdoctorales de la Coordinación de Humanidades of the National Autonomous University of Mexico (UNAM), by the project Logics of discovery, heuristics and creativity in the sciences (PAPIIT, IN400514-3) granted by the UNAM, and by a Sofja Kovalevskaja award of the Alexander von Humboldt-Foundation, funded by the German Ministry for Education and Research.
References

11.1 J. Norton: A little survey of induction. In: Scientific Evidence, ed. by P. Achinstein (Johns Hopkins Univ. Press, Baltimore 2005) pp. 9–34
11.2 D. Batens: The basic inductive schema, inductive truisms, and the research-guiding capacities of the logic of inductive generalization, Logique et Analyse 185-188, appeared 2005, 53–84 (2004)
11.3 D. Batens: On a logic of induction, Log. Philos. Sci. 4(1), 3–32 (2006)
11.4 D. Batens, L. Haesaert: On classical adaptive logics of induction, Logique et Analyse 173-175, appeared 2003, 255–290 (2001)
11.5 D. Batens: Logics for qualitative inductive generalization, Studia Logica 97, 61–80 (2011)
11.6 C.G. Hempel: A purely syntactical definition of confirmation, J. Symb. Log. 8(4), 122–143 (1943)
11.7 D. Batens: A universal logic approach to adaptive logics, Logica Universalis 1, 221–242 (2007)
11.8 D. Batens: Tutorial on inconsistency-adaptive logics. In: New Directions in Paraconsistent Logic: 5th WCP, Kolkata, India, February 2014, Springer Proceedings in Mathematics and Statistics, Vol. 152, ed. by J.-Y. Beziau, M. Chakraborty, S. Dutta (Springer India, New Delhi 2015) pp. 3–38
11.9 D. Batens: Towards a dialogic interpretation of dynamic proofs. In: Dialogues, Logics and Other Strange Things. Essays in Honour of Shahid Rahman, ed. by C. Dégremont, L. Keiff, H. Rückert (College Publications, London 2009) pp. 27–51
11.10 F. Van De Putte, C. Straßer: Adaptive logics: A parametric approach, Log. J. IGPL 22(6), 905–932 (2014)
11.23 C.G. Hempel: Studies in the logic of confirmation I, Mind 54(213), 1–26 (1945)
11.24 J. Sprenger: Hempel and the paradoxes of confirmation. In: Handbook of the History of Logic, Vol. 10, ed. by D. Gabbay, S. Hartmann, J. Woods (Elsevier, Amsterdam 2011) pp. 231–260
11.25 V. Crupi: Confirmation. In: The Stanford Encyclopedia of Philosophy, Spring 2014 edn., ed. by Edward N. Zalta, http://plato.stanford.edu/archives/
Tjerk Gauderis
12. Modeling Hypothetical Reasoning by Formal Logics

formally, a more fine-grained classification of reasoning patterns should be in order. After such a classification is provided in Sect. 12.3, a formal framework that has proven successful to capture some of these patterns is described (Sects. 12.4 and 12.6) and some of the specific problems for this procedure are discussed (Sect. 12.5). The chapter concludes by presenting two logics for hypothetical reasoning in an informal way (Sects. 12.7 and 12.8) such that the nontechnically skilled reader can get a flavor of how formal methods can be used to describe hypothetical reasoning.

12.1 The Feasibility of the Project
12.2 Advantages and Drawbacks
12.3 Four Patterns of Hypothetical Reasoning
12.4 Abductive Reasoning and Adaptive Logics
12.5 The Problem of Multiple Explanatory Hypotheses
12.6 The Standard Format of Adaptive Logics
12.6.1 Dynamic Proof Theory
12.7.5 Avoiding Random Hypotheses
12.8 MLA^s_s: A Logic for Theoretical Singular Fact Abduction
12.8.1 Formal Language Schema
12.8.2 Lower Limit Logic
12.8.3 Intended Interpretation of the Modal Operators
12.8.4 Set of Abnormalities
12.8.5 First Proposal Ω1
12.8.6 Simple Strategy
12.8.7 Contradictory Hypotheses
12.8.8 Predictions and Evidence
12.8.9 Contradictions
12.8.10 Tautologies
12.8.11 Second Proposal Ω2
12.8.12 Most Parsimonious Explanantia
12.8.13 Notation
12.8.14 Final Proposal Ω
12.9 Conclusions
12.A Appendix: Formal Presentations of the Logics LA^r_s and MLA^s_s
12.A.1 Proof Theory
12.A.2 Semantics
References
This argument, however, is a straw man. Nobody would argue for the claim that hypothetical reasoning can be modeled by means of formal logics along these lines. What is argued in this chapter and the various sources it cites is the more modest claim that certain aspects and forms of hypothetical reasoning can be modeled with the aid of formal systems that are specifically suited for this task.

There are three important ways in which this modest claim differs from the straw man that is attacked by the creativity excludes logic argument:

1. Abductive reasoning is not a monolithic concept: it does not consist of a single method or procedure, but consists of many different patterns; formal

The purpose here, however, is to model or explicate human reasoning patterns. As these patterns are fallible, leading to conclusions that are not necessarily true even if the premises are assumed to be true, it should be possible to revoke previously derived results; hence, the use of nonmonotonic logics. Also, because there are many patterns of human reasoning, it is natural to conceive of a plenitude of logics in order to describe them.

Let this be explained a bit more formally. A logic can be considered as a function from the power set of the sentences of a language to itself. So, given a language $\mathcal{L}$ and the set $\mathcal{W}$ of its well-formed formulas

$$\mathbf{L} \colon \wp(\mathcal{W}) \to \wp(\mathcal{W}) . \tag{12.1}$$
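The view of a logic as a function from sets of formulas to sets of formulas can be made concrete with a toy consequence operator. The sketch below is purely illustrative and is not one of the logics discussed in this chapter: it assumes a hypothetical mini-language in which a formula is either an atom (a string) or a pair (p, q) read as "p implies q", and it closes a premise set under modus ponens only.

```python
from typing import Callable, FrozenSet, Tuple, Union

# Hypothetical mini-language: an atom "p", or a pair ("p", "q") for "p implies q".
Formula = Union[str, Tuple[str, str]]

# A logic maps a set of formulas to a set of formulas.
Logic = Callable[[FrozenSet[Formula]], FrozenSet[Formula]]


def mp_closure(premises: FrozenSet[Formula]) -> FrozenSet[Formula]:
    """Return the premises together with every atom reachable by modus ponens."""
    derived = set(premises)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot, since new atoms are added during the pass.
        for formula in list(derived):
            if (
                isinstance(formula, tuple)
                and formula[0] in derived
                and formula[1] not in derived
            ):
                derived.add(formula[1])
                changed = True
    return frozenset(derived)


toy_logic: Logic = mp_closure

gamma = frozenset({"p", ("p", "q"), ("q", "r")})
print(sorted(f for f in toy_logic(gamma) if isinstance(f, str)))  # ['p', 'q', 'r']
```

This toy operator is monotonic: enlarging the premise set never removes a consequence. The nonmonotonic logics at issue in the text differ precisely in that adding premises may force the withdrawal of earlier conclusions.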
First, the focus in the adaptive logic program is, in contrast with other approaches to nonmonotonic reasoning, on proof theory. For these logics, a dynamic proof style has been defined in order to mimic to a certain extent actual human reasoning patterns. More particularly, these dynamic proofs display the two forms of revoking previously derived results that can also be found in human reasoning: revoking old conclusions on closer consideration of the available evidence (internal dynamics) and revoking them in light of new information (external dynamics). One should not be misled, however, by this idea of dynamic proofs into thinking that the consequence set of adaptive logics for a certain premise set depends on the proof. Adaptive logics are proper proof-invariant logics that assign to each premise set $\Gamma$ exactly one consequence set $Cn_{\mathbf{L}}(\Gamma)$.

Second, over the years, a solid meta-theory has been built for this framework, which guarantees that if an adaptive logic is created according to certain standards (the so-called standard format), many important metatheoretical properties are generically proven. This creates an opportunity for projects such as this to focus almost exclusively on the application of these formal methods without having to worry too much about proving their metatheoretical characteristics.

Finally, as the framework is presented as a unified framework for nonmonotonic logics, it has been applied in many different contexts. Over the years, adaptive logics have been devised for, apart from abduction, paraconsistent reasoning, induction, argumentation, deontic reasoning, etc. Most of these applications have been studied at the Centre for Logic and Philosophy of Science (Ghent University). At this center's website, many references can be found to papers in various contexts. The reference works mentioned earlier, [12.3] and [12.5], also give a good overview of the various applications.
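The idea that a conclusion in a dynamic proof can be revoked and later reinstated can be sketched in a few lines. The sketch assumes the usual reliability-style marking: a line derived on a condition (a set of abnormalities assumed to be false) is marked whenever that condition overlaps with the abnormalities occurring in the minimal disjunctions of abnormalities derived so far. The class and function names, and the placeholder abnormalities `ab1` and `ab2`, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Line:
    formula: str
    condition: frozenset  # abnormalities assumed false when deriving the formula


def minimal(dabs):
    """Keep only disjunctions of abnormalities with no proper subset among them."""
    return [d for d in dabs if not any(other < d for other in dabs)]


def unreliable(dabs):
    """Union of all abnormalities occurring in a minimal disjunction."""
    result = set()
    for d in minimal(dabs):
        result |= d
    return result


def marked(line, dabs):
    """A line is marked if its condition contains an unreliable abnormality."""
    return bool(line.condition & unreliable(dabs))


hypothesis = Line("generalization", frozenset({"ab1"}))

stage1 = []                                    # nothing derived yet
stage2 = [frozenset({"ab1", "ab2"})]           # disjunction ab1 or ab2 derived
stage3 = stage2 + [frozenset({"ab2"})]         # ab2 alone derived later

print(marked(hypothesis, stage1))  # False
print(marked(hypothesis, stage2))  # True
print(marked(hypothesis, stage3))  # False
```

At the second stage the hypothesis is withdrawn because its condition is implicated in a minimal disjunction; at the third stage the shorter disjunction makes the longer one non-minimal, so the hypothesis is reinstated.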
such reasoning processes are always, to a great extent, idealized.

Natural languages are also immensely more complex than any formal language can aspire to be. Therefore, models of human reasoning are unavoidably simplifications. Furthermore, as formal logics state everything explicitly, any modeler of human reasoning has to simplify deliberately the actual cases, only to achieve a certain degree of comprehensibility.

Altogether, it is clear that formal models of human reasoning processes are, in fact, only models: They contain abstractions, simulations, simplifications, and idealizations. And although these techniques are the key characteristics of models, such as those used in science, it is not always easy to evade the criticism that formal logics can only handle toy examples.

Third, certain patterns of creative hypothesis formation, that is, those that introduce the hypothetical existence of new concepts, cannot be modeled by first-order logics. They seem to require at least the use of second-order logics, and this is something of which the adaptive logics framework is, at present, not capable.

Fourth, as one is here purely concerned with hypothesis formation and not with hypothesis selection, formal methods will generate sets of possible hypotheses that may grow exponentially in relation to the growth of the agent's background knowledge. It is clear that this also poses a limit to the application of these methods to real-world problems.

Finally, one might question the normativity of this project (and more generally of the adaptive logics program). By aiming to describe actual human reasoning processes, this branch of logics appears to put a descriptive ideal first, which contrasts sharply with the strongly normative ideals in the field of logic in general. The standard answer to this question is that adaptive logics attempt to provide both: On the one hand, they aim to describe actual reasoning patterns; on the other, once these patterns are identified, they aim to prescribe how these patterns should be rationally applied. Yet, this does not answer how the trade-off between these two goals of description and normativity should be conceived. Is it better to have a large set of logics that is able to describe virtually any pattern actually found in human reasoning, or should one keep this set trimmed and qualify most actual human reasoning as failing to accord with the highest normative standards? Therefore, it remains a legitimate criticism that the goals of description and prescription cannot be so easily joined: how their trade-off should be dealt with needs further theoretical underpinning.
tions and knowledge that provides the explanatory link between hypothesis and observations (see [12.16] for an elaborate discussion of the role of an explanatory framework in logical approaches to explanation).

In line with the Fregean tradition, factual statements are considered as statements of a concept with regard to one or more objects (or a logical combination of such statements). For instance, the statement there was a civil war in France in 1789 can be analyzed as the concept a country in civil war applied to or with regard to the object France in 1789. A fact is a true factual statement. As such, concepts can also be considered as the class of all objects (or tuples of objects) for which the concept with regard to that object (or tuple of objects) is a fact. An observed fact is a factual statement describing an agent’s observation that she considers to be true. This can be broadly conceived to include also, for instance, a graph or a table of measurements in a paper. Together, the observed facts form the trigger for the agent.

In this semiformal description of these patterns, that p should be considered as a hypothesis is expressed by using a formulation of the form it might be that p; beliefs and observed facts can be expressed simply by stating their content. Concepts such as a country in civil war or a bipedal hominid are denoted by uppercase letters (typically F for observed, factual concepts and E for explanatory concepts) and objects such as France in 1789 or Lucy by lowercase letters such as x or y. A finite set or list of (related) objects or concepts can then be expressed, for example, by x1, …, xn or F1, …, Fn where generally n ≥ 1 (hence, including the possibility of a single object or concept; the other case is indicated by n ≥ 2). Finally, that a concept applies to certain objects will be indicated by the phrase with regard to:

1. Abduction of a singular fact

   (OBS) F with regard to x1, …, xn (n ≥ 1)
   (BBK) E with regard to x1, …, xn explains F with regard to those objects in a certain explanatory framework EF.
   (HYP) It might be that E with regard to x1, …, xn

   Some examples of this pattern, which has also been called simple abduction by Thagard [12.13], factual abduction by Schurz [12.14], and selective fact abduction by Hoffmann [12.15], are as follows:

   The inference that (HYP) the hominid who has been dubbed Lucy (x1) might have been bipedal (E) from (OBS) observing the particular structure of her pelvis and knee bones (F) and (BBK) knowledge about how the structure of pelvis and knee bones relates to the locomotion of animals (EF).

   The inference that (HYP) two particles (x1 and x2) might have opposite electric charges (E), from (OBS) observing their attraction (F) and (BBK) knowledge of the Coulomb force (EF).

2. Abduction of a Generalization

   (OBS) F with regard to all observed objects of class D
   (BBK) E with regard to some objects explains F with regard to those objects in a certain explanatory framework EF.
   (HYP) It might be that E with regard to all existing objects of class D

   Some examples of this pattern, which has also been called rule abduction [12.13], law abduction [12.14], and selective law abduction [12.15], are as follows:

   The inference that (HYP) all hominids of the last three million years (D) might have been bipedal (E), from (OBS) observing the similar structure of the pelvis and knee bones (F) of all observed hominid skeletons dated to be younger than three million years (D) and (BBK) knowledge about how the structure of pelvis and knee bones relates to the locomotion of animals (EF).

   The inference that (HYP) all emitted radiation from a particular chemical element (D) might be electrically neutral (E), from (OBS) observing in all experiments conducted so far that radiation emitted by this element (D) continues in a straight path in an external magnetic field perpendicular to the stream of radiation (F) and (BBK) knowledge of the Lorentz force and Newton’s second law (EF).

3. Existential abduction, or the abduction of the existence of unknown objects from a particular class

   (OBS) F with regard to x1, …, xn (n ≥ 1)
   (BBK) The existence of objects y1, …, ym (m ≥ 1) of class E would explain F with regard to x1, …, xn in a certain explanatory framework EF.
   (HYP) It might be that there exist objects y1, …, ym of class E

   Some examples of this pattern, which was already called existential abduction by Thagard [12.13],
   and has also been called first-order existential abduction [12.14] and selective type abduction [12.15], are as follows:

   The inference that (HYP) a hominid (y1) of the genus Australopithecus (E) might have lived in this area, from (OBS) observing a set of vulcanized foot imprints (x1, …, xn of class F) and (BBK) the belief that these foot imprints are of an Australopithecus (EF).

   The inference that (HYP) there might be other charged particles (y1, …, ym of class E) in the chamber, from (OBS) observing deflections in the path (F) of a charged particle (x1) in a chamber without external electric or magnetic fields and (BBK) the knowledge of the Coulomb and Lorentz forces and Newton’s second law (EF).

4. Conceptual abduction or the abduction of a new concept

   (OBS) F1, …, Fm (m ≥ 2) with regard to each of x1, …, xn (n ≥ 2)
   (BBK) No known concept explains why F1, …, Fm with regard to each of x1, …, xn
   (HYP) It might be that there is a similarity between the x1, …, xn, which can be labeled with a new concept E that explains why F1, …, Fm with regard to each of x1, …, xn in a certain explanatory framework EF.

   It was Schurz [12.14] who pointed out that this pattern is rational and useful for science only if the observation concerns several objects each individually having the same or similar properties, so that some form of conceptual unification is obtained. Otherwise, for each fact, it could be suggested that there exists an ad hoc power that explains (only) this single fact.

   Some examples of this pattern, which largely coincides with the various types of second-order abduction, as Schurz [12.14] suggests and several types of creative abduction conceived by Hoffmann [12.15], are as follows:

   The inference that (HYP) there might be a new species of hominids (E), from (OBS) observing various hominid fossils (x1, …, xn) that are similar in many ways (F1, …, Fm) and (BBK) believing that these fossils cannot be classified in the current taxonomy of hominids (EF).

   The inference that (HYP) there might exist a new type of interaction (E), from (OBS) observing similar interactive behavior (F1, …, Fm) between certain types of particles (x1, …, xn) in similar experiments and (BBK) believing that this behavior cannot be explained by the already known interactions, properties of the involved particles and properties of the experimental setup (EF).

Using the terminology of Magnani [12.18] and following the distinction of Schurz [12.14], the first two patterns, the abduction of a singular fact and that of a generalization, can be considered as instances of selective abduction, as the agent selects an appropriate hypothesis in her background knowledge, while the latter two, existential abduction and conceptual abduction, can be called creative abduction, as the agent creates a new hypothetical concept or object. It has to be added that Hoffmann [12.15] would dispute this distinction, as he sees the third pattern (existential abduction) in the first place as the selection of an already known type (e.g., the genus Australopithecus), and not so much as the creation of a new token (someone of this genus of which his/her existence is now hypothesized).

As stated before, this list is not exhaustive. Further patterns have been identified, such as the abduction of a new perspective [12.15], for example, suggesting that a problem might have a geometrical solution instead of an algebraic one; analogical abduction [12.13], for example, explaining similar properties of water and light, by hypothesizing that light could also be wave-like; or theoretical model abduction [12.14], for example, explaining some observation by suggesting suitable initial conditions given some governing principles or laws. Some have even considered visual abduction, the inference from the observation itself to a statement describing this observation, as a separate pattern [12.19]. For some of these patterns (or instances of them), it is possible to argue that they are a special case of one of the patterns mentioned earlier. For instance, the suggestion of the wave nature of light can also be seen as an instance of conceptual abduction, in which the (mathematical) concept wave behavior is constructed to explain the similar properties of water and light; yet, it is true that the analogical nature of this inference makes it a special subpattern with interesting properties in itself. This is also how Schurz [12.14] presents it: In his classification, analogical abduction is one of the types of second-order existential abduction he conceives of.

Perhaps more important to note is that these patterns are not mutually exclusive given a particular instance of abductive reasoning. For instance, the inference that leads to the explanation of why a particular piece of iron is rusted can be described both as singular fact abduction (this piece of iron underwent a reaction
with oxygen) and as existential abduction (there were oxygen atoms present with which this piece of iron reacted). But in essence it describes the same explanation for the same explanandum.

Also, combinations occur. For instance, if a new particle is hypothesized as an explanation for an experimental anomaly (such as, for instance, Wolfgang Pauli’s suggestion of the neutrino in the case of the anomalous β spectrum [12.20]), then this is both an instance of existential abduction – there is a not yet observed particle that causes the observed phenomenon – and an instance of conceptual abduction – these hypothesized particles are of a new kind of combination, which coincides with Hoffmann’s [12.15] pattern of creative fact abduction. Yet in the mind of the scientist, this process of hypothesis formation might have occurred in a single reasoning step.

One should not, however, be too worried about these issues, if it is remembered that these patterns are categories for linguistic descriptions of actual reasoning processes. Any actual instance of hypothesis formation can be described in several ways by means of natural language, and some of these expressions can be formally analyzed in more than one way. Therefore, one should not focus too much on the exact classification of particular instances of hypothesis formation. Yet, this does not render meaningless the project of explicating various patterns of hypothesis formation. The goal of this project is to provide normative guidance for future hypothesis formation. If particular problems or observations can be looked at from different perspectives and, therefore, expressed in various ways, it is only beneficial for an agent to have multiple patterns of hypothesis formation at her disposal.
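
As a rough illustration of the first two patterns (abduction of a singular fact and of a generalization), the schemas can be read as simple lookups over background knowledge. The sketch below is only one possible encoding, not part of the chapter: the names `Explains`, `abduce_singular_fact` and `abduce_generalization`, and the use of plain strings for concepts and objects, are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Explains:
    """Background knowledge (BBK): concept E explains concept F in framework EF."""
    explanans: str     # E
    explanandum: str   # F
    framework: str     # EF


def abduce_singular_fact(observations, background):
    """Pattern 1: from (OBS) F(x) and (BBK) 'E explains F', form (HYP) 'it might be that E(x)'."""
    hypotheses = []
    for concept, obj in sorted(observations):
        for rule in background:
            if rule.explanandum == concept:
                hypotheses.append(f"it might be that {rule.explanans} with regard to {obj}")
    return hypotheses


def abduce_generalization(observations, background, class_name, observed_members):
    """Pattern 2: if F holds for all observed objects of class D, hypothesize E for all of D."""
    hypotheses = []
    for rule in background:
        if observed_members and all(
            (rule.explanandum, member) in observations for member in observed_members
        ):
            hypotheses.append(
                f"it might be that {rule.explanans} with regard to all {class_name}"
            )
    return hypotheses


# Illustrative data modeled on the hominid examples in the text.
F = "bipedal-type pelvis and knee bones"
background = [Explains("bipedal", F, "anatomy of locomotion")]
observations = {(F, "Lucy"), (F, "skeleton 2")}

singular = abduce_singular_fact({(F, "Lucy")}, background)
general = abduce_generalization(
    observations, background, "hominids of the last three million years", ["Lucy", "skeleton 2"]
)
```

The functions only generate candidate hypotheses; as the text stresses, nothing here selects among them or checks whether a hypothesis should later be withdrawn.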
in [12.24, pp. 224–225] and first used in [12.27]) and modeled by the logic LA^r_s ([12.29], Sect. 12.7), is suitable for modeling situations in which one has to act on the basis of the conclusions before having the chance to find out which hypothesis actually is the case. A good example is how people react to unexpected behavior. If someone suddenly starts to shout, people will typically react in a hesitant way, taking into account that either they themselves are somehow at fault or that the shouting person is just frustrated or crazy and acting inappropriately.

Second, someone with a theoretical perspective (for instance, a scientist or a detective) is interested in finding out which of the various hypotheses is the actual explanation. Therefore, it is important that s/he can abduce the individual hypotheses Qa and Ra in order to examine them further one by one. Early work on these kinds of logics has been done in [12.27, 28]. Yet, these logics have a quite complex proof theory. This is because, on the one hand, one has to be able to derive Qa and Ra separately, but on the other, one has to prevent

ex falso quodlibet argument possible. Briefly, such an argument goes as follows: suppose one’s premise set contains both the statements p and ¬p. Then, by means of addition, one can first derive p ∨ q for any random q. (Informally, as one already knows that p is true, any statement of the form p or … will also be true.) But, as ¬p also holds, one can derive q from this disjunction by means of a disjunctive syllogism (the logical rule that if you know that one side of a disjunction is false, the other side has to be true to make the disjunction true).

The logic MLA^s_s [12.30] presented in this overview (Sect. 12.8) solves this problem by adding modalities to the language and deriving the hypotheses ◇Qa and ◇Ra instead of Qa and Ra. By conceiving of hypotheses as logical possibilities, the conjunction problem is automatically solved because ◇Qa ∧ ◇Ra does not imply ◇(Qa ∧ Ra) in any standard modal logic. This approach also nicely coincides with the common idea that hypotheses are possibilities. These features make the logic MLA^s_s very suitable for the modeling of actual theoretical abductive reasoning processes.
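
The two claims above can be checked mechanically on small cases. The following sketch, which is only an illustration and not taken from the chapter, verifies the two steps of the ex falso quodlibet argument by brute-force truth tables and then evaluates a three-world Kripke model in which ◇Qa and ◇Ra hold while ◇(Qa ∧ Ra) does not. The helper names `entails` and `possibly` and the particular model are assumptions chosen for the example.

```python
from itertools import product


def entails(premises, conclusion, atoms):
    """Classical propositional entailment, checked over all truth assignments."""
    for values in product([False, True], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if all(p(v) for p in premises) and not conclusion(v):
            return False
    return True


ATOMS = ["p", "q"]

# Step 1 (addition): from p, derive p or q.
addition = entails([lambda v: v["p"]], lambda v: v["p"] or v["q"], ATOMS)

# Step 2 (disjunctive syllogism): from p or q and not p, derive q.
syllogism = entails(
    [lambda v: v["p"] or v["q"], lambda v: not v["p"]], lambda v: v["q"], ATOMS
)

# Together: an inconsistent premise set {p, not p} entails an arbitrary q.
explosion = entails([lambda v: v["p"], lambda v: not v["p"]], lambda v: v["q"], ATOMS)

# A small Kripke model: w0 sees w1 (where Qa holds) and w2 (where Ra holds).
valuation = {"w0": set(), "w1": {"Qa"}, "w2": {"Ra"}}
access = {"w0": {"w1", "w2"}, "w1": {"w1"}, "w2": {"w2"}}


def possibly(world, formula):
    """Diamond: the formula holds in at least one accessible world."""
    return any(formula(valuation[u]) for u in access[world])


dia_Qa = possibly("w0", lambda facts: "Qa" in facts)
dia_Ra = possibly("w0", lambda facts: "Ra" in facts)
dia_conjunction = possibly("w0", lambda facts: "Qa" in facts and "Ra" in facts)
```

Since no single accessible world satisfies both Qa and Ra, the model is a countermodel to the inference from ◇Qa ∧ ◇Ra to ◇(Qa ∧ Ra), which is why the modal reading keeps the two hypotheses apart.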
Definition 12.1
An adaptive logic in the standard format is defined by a triple:

The lower limit logic LLL specifies the stable part of the adaptive logic. Its rules are unconditionally valid in the adaptive logic, and anything that follows from the premises by LLL will never be revoked. Apart from that, it is also possible in an adaptive logic to derive defeasible consequences. These are obtained by assuming that the elements of the set of abnormalities are as much as possible false. The adaptive strategy is needed to specify what "as much as possible" means. This will become clearer further on.

Strictly speaking, the standard format for adaptive logics requires that a lower limit logic contains, in addition to the LLL operators, also the operators of CL. However, these operators have merely a technical role (in the generic meta-theory for adaptive logics) and are not used in the applications presented here. Therefore, given the introductory nature of this section, this will not be explained in further detail. In the logics presented in this chapter, the condition is implicitly assumed to be satisfied.

12.6.1 Dynamic Proof Theory

As stated before, a key advantage of adaptive logics is their dynamic proof theory which models human reasoning. This dynamics is possible because a line in an adaptive proof has – along with a line number, a formula and a justification – a fourth element, that is, the condition. A condition is a finite subset of the set of abnormalities and specifies which abnormalities need to be assumed to be false for the formula on that line to be derivable.

The inference rules in an adaptive logic reduce to three generic rules. Where Γ is the set of premises, Δ is a finite subset of the set of abnormalities Ω and Dab(Δ) the (classical) disjunction of the abnormalities in Δ, and where

\[ A \quad \Delta \tag{12.3} \]

indicates that A occurs in the proof on the condition Δ, the inference rules are given by the generic rules

PREM   If A ∈ Γ:
          ⋮     ⋮
          A     ∅

RU     If A1, …, An ⊢LLL B:
          A1    Δ1
          ⋮     ⋮
          An    Δn
          B     Δ1 ∪ … ∪ Δn

RC     If A1, …, An ⊢LLL B ∨ Dab(Θ):
          A1    Δ1
          ⋮     ⋮
          An    Δn
          B     Δ1 ∪ … ∪ Δn ∪ Θ

The premise rule PREM states that a premise may be introduced at any line of a proof on the empty condition. The unconditional inference rule RU states that, if A1, …, An ⊢LLL B and A1, …, An occur in the proof on the conditions Δ1, …, Δn, one may add B on the condition Δ1 ∪ … ∪ Δn. The strength of an adaptive logic comes from the third rule, the conditional inference rule RC, which works analogously to RU, but introduces new conditions. So, it allows one to take defeasible steps based on the assumption that the abnormalities are false (this rule also makes clear that any adaptive proof can be transformed into a Fitch-style proof in the LLL by writing down for each line the disjunction of the formula and all of the abnormalities in the condition). Several examples of how these rules are employed will follow.

The only thing that is still needed is a criterion that defines when a line of the proof is considered to be defeated. At first sight, it seems straightforward to mark lines of which one of the elements of the condition is unconditionally derived from the premises, which means that it is derived on the empty condition (defeated lines in a proof are marked instead of deleted, because, in general, it is possible that they may later become unmarked in an extension of the proof). But this strategy, called the simple strategy, usually has a serious flaw. If it is possible to derive unconditionally a disjunction of abnormalities Dab(Δ) that is minimal, that is, if there is no Δ′ ⊂ Δ such that Dab(Δ′) can be unconditionally derived, the simple strategy would ignore this information. This is problematic, however, because at least one of the disjuncts of the ignored disjunction has to be true. Therefore, one can use the simple strategy only in cases where

\[ \Gamma \vdash_{\mathbf{LLL}} \mathit{Dab}(\Delta) \ \text{ only if there is an } A \in \Delta \text{ such that } \Gamma \vdash_{\mathbf{LLL}} A \tag{12.4} \]

with Dab(Δ) being any disjunction of abnormalities out of Ω. This condition will be met for the logic MLA^s_s (Sect. 12.8); this logic will, hence, employ the simple strategy.

The majority of logics, however, do not meet this criterion and for those logics, more advanced strategies have been developed. The best known of these are reliability and minimal abnormality. The logic LA^r_s uses the reliability strategy. This strategy, which will be explained and illustrated in the following, prescribes marking any line for which one of the elements of the condition is unconditionally derived as a disjunct of a minimal disjunction of abnormalities.

At this point, all elements are introduced to explain the naming of the two logics that will be presented in this chapter: As might be expected, LA and MLA stand for logic for abduction and modal logic for abduction and the superscripts r and s stand for the adaptive strategies reliability and simple strategy, respectively. The subscript s originally denoted that the logic was formulated in the standard format for adaptive logics, but in [12.21], it is argued that it is more useful to interpret this s as indicating that they are logics for singular fact abduction. After all, most adaptive logics are nowadays formulated in the standard format anyhow, and this allows us to contrast these logics with the logic LA^r_∀, which is a logic for abduction of generalizations [12.16, 21].
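
The bookkeeping behind the three generic rules and the simple strategy can be sketched in a few lines. The code below is a hypothetical illustration only: the class names `Line` and `AdaptiveProof` are invented for the example, formulas are plain strings, and the sketch does not check derivability in the lower limit logic, so the user is assumed to apply the rules only where LLL licenses them.

```python
from dataclasses import dataclass


@dataclass
class Line:
    number: int
    formula: str
    justification: str
    condition: frozenset = frozenset()
    marked: bool = False


class AdaptiveProof:
    """Tracks conditions and marks only; LLL-derivability is NOT verified."""

    def __init__(self):
        self.lines = []

    def _add(self, formula, justification, condition):
        line = Line(len(self.lines) + 1, formula, justification, frozenset(condition))
        self.lines.append(line)
        return line

    def prem(self, formula):
        # PREM: a premise is introduced on the empty condition.
        return self._add(formula, "-; PREM", ())

    def ru(self, formula, *sources):
        # RU: the new line carries the union of the conditions of its sources.
        condition = frozenset().union(*(s.condition for s in sources))
        refs = ", ".join(str(s.number) for s in sources) or "-"
        return self._add(formula, f"{refs}; RU", condition)

    def rc(self, formula, abnormalities, *sources):
        # RC: as RU, but the abnormalities assumed to be false are added.
        condition = frozenset(abnormalities).union(*(s.condition for s in sources))
        refs = ", ".join(str(s.number) for s in sources)
        return self._add(formula, f"{refs}; RC", condition)

    def mark_simple(self):
        # Simple strategy: mark a line if an element of its condition has
        # been derived on the empty condition somewhere in the proof.
        unconditional = {line.formula for line in self.lines if not line.condition}
        for line in self.lines:
            line.marked = any(ab in unconditional for ab in line.condition)


ABNORMALITY = "∀x(Px ⊃ Qx) ∧ (Qa ∧ ¬Pa)"

proof = AdaptiveProof()
law = proof.prem("∀x(Px ⊃ Qx)")
fact = proof.prem("Qa")
disjunction = proof.ru(f"Pa ∨ ({ABNORMALITY})", law, fact)
hypothesis = proof.rc("Pa", {ABNORMALITY}, disjunction)
counter = proof.prem("¬Pa")
defeater = proof.ru(ABNORMALITY, law, fact, counter)
proof.mark_simple()
```

Marks are recomputed from scratch on every call to `mark_simple`, which mirrors the remark that defeated lines are marked rather than deleted because they may become unmarked in an extension of the proof.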
In this section, the reader is introduced to the logic LA^r_s [12.29] in an informal manner. This will allow the reader to gain a better understanding of the framework of adaptive logics and the functioning of its dynamic proof theory. In the next section, the same approach will be used for the logic MLA^s_s. The formal definitions of both logics will be presented in the appendix for those who are interested.

In order to model abductive reasoning processes of singular facts, the logic LA^r_s (as will the logic MLA^s_s) contains, in addition to deductive inference steps, defeasible reasoning steps based on an argumentation schema known as affirming the consequent (combined with Universal Instantiation)

The choice for a predicate logic is motivated by the fact that a material implication is used to model the relation between explanans and explanandum. As it is well known that B ⊢CL A ⊃ B, a propositional logic would allow one to derive anything as a hypothesis. In the predicative case, the use of the universal quantifier can avoid this. This can be seen if we compare ⊢CL B(β) ⊃ (A(β) ⊃ B(β)) with ⊬CL B(β) ⊃ (∀α)(A(α) ⊃ B(α)) (see [12.7] for a propositional logic for abduction that solves this problem in another way).

Let the list of desiderata for this logic first be overviewed. This is important because in specifying the set of abnormalities and the strategy, one has to check whether they allow one to model practical abductive reasoning according to one’s expectations. Apart from the fact that by means of this logic one should be able to derive hypotheses according to the schema of affirming the consequent, one has to make sure that one cannot derive – as a side effect – random hypotheses which are not related to the explanandum. Finally, as has been pointed out in the introduction, it is a nice feature of adaptive logics that they enable one to integrate defeasible and deductive steps.

12.7.1 Lower Limit Logic

The lower limit logic of LA^r_s is classical first-order logic CL. This means that the deductive inferences of this logic are the reasoning steps modeled by classical logic. Also, as this logic is an extension of classical logic, any classical consequence of a premise set will also be a consequence of the premise set according to this logic.

If one takes (here and in further definitions) the metavariables A and B to represent (well-formed) formulas, α a variable and β a constant of the language L in which the logic is defined, we can define the set of abnormalities of the logic LA^r_s as

\[
\begin{aligned}
\Omega = \{\, & (\forall\alpha(A(\alpha) \supset B(\alpha)) \land (B(\beta) \land \neg A(\beta))) \mid \\
& \text{No predicate occurring in } B \text{ occurs in } A \,\}
\end{aligned}
\tag{12.6}
\]

The first line is the logical form of the abnormality; the second line in the definition is to prevent self-explanatory hypotheses. To understand the functioning of this logical form, consider the following example proof starting from the premise set {Qa, ∀x(Px ⊃ Qx), ∀x(Px ⊃ Rx)}. The official layout of this chapter in two columns forces one to split each line of the proof over
two lines and write the condition of the line on a second line, starting with an → arrow

1   ∀x(Px ⊃ Qx)                           -; PREM
    → ∅
2   Qa                                    -; PREM
    → ∅
3   Pa ∨ ¬Pa                              -; RU
    → ∅
4   Pa ∨ (∀x(Px ⊃ Qx) ∧ (Qa ∧ ¬Pa))       1, 2, 3; RU
    → ∅
5   Pa                                    4; RC
    → {∀x(Px ⊃ Qx) ∧ (Qa ∧ ¬Pa)}

From this premise set, one would like to be able to form the hypothesis Pa. One obtains this hypothesis as follows. One starts by writing two premises on the first two lines and a tautology on the third line (none of these lines depends on earlier lines, as indicated by the dash). These three lines allow one then to derive the disjunction on line 4 by means of the unconditional inference rule RU. This disjunction has the exact form that allows one now to derive conditionally the hypothesis Pa from it by applying the rule RC.

From this hypothesis, one can reason further in a deductive way by applying, for example, modus ponens (note that the result of this inference has also a nonempty condition)

⋮
5   Pa                                    4; RC
    → {∀x(Px ⊃ Qx) ∧ (Qa ∧ ¬Pa)}
6   ∀x(Px ⊃ Rx)                           -; PREM
    → ∅
7   Ra                                    5, 6; RU
    → {∀x(Px ⊃ Qx) ∧ (Qa ∧ ¬Pa)}

the point where they have arrived.

⋮
5   Pa                                    4; RC
    → {∀x(Px ⊃ Qx) ∧ (Qa ∧ ¬Pa)}
6   ∀x(Px ⊃ Rx)                           -; PREM
    → ∅
7   Ra                                    5, 6; RU
    → {∀x(Px ⊃ Qx) ∧ (Qa ∧ ¬Pa)}
8   ¬Pa                                   -; PREM
    → ∅
9   ∀x(Px ⊃ Qx) ∧ (Qa ∧ ¬Pa)              1, 2, 8; RU
    → ∅

This new premise makes it possible to derive unconditionally on line 9 the condition of the hypothesis Pa. At this point, it is clear that one should not trust anymore the hypothesis formed on line 5, which one indicates by marking this line with a checkmark, indicating that one lost one’s confidence in this formula once one wrote down line 9. As the formula Ra is arrived at by reasoning further upon the hypothesis Pa, it has (at least) the same condition, and is, hence, at this point also marked.

In summary, each time one defeasibly derives a hypothesis, one has to state explicitly the condition the (suspected) truth of which would defeat the hypothesis. Therefore, one can assume the hypothesis to be true as long as one can assume the condition to be false; but as soon as one has evidence that the condition might be true, one should withdraw the hypothesis.

12.7.3 Reliability Strategy

In the previous example, one withdrew the hypothesis because its condition was explicitly derived. However, have a look at the following example proof from the premise set {Qa, Ra, ∀x(Px ⊃ Qx), ∀x(¬Px ⊃ Rx)}
6   ¬Pa                                   2, 4; RC   ✓7
    → {∀x(¬Px ⊃ Rx) ∧ (Ra ∧ Pa)}
7   (∀x(Px ⊃ Qx) ∧ (Qa ∧ ¬Pa))
    ∨ (∀x(¬Px ⊃ Rx) ∧ (Ra ∧ Pa))          1–4; RU
    → ∅

This marking strategy is called the reliability strategy and it orders one to mark lines for which an element of the condition has been unconditionally derived as a disjunct of a minimal disjunction of abnormalities (or in short, a minimal Dab formula). It is important to note that (1) the disjunction should only hold disjuncts that have the form of an abnormality (otherwise, a defeating disjunction could be constructed for every hypothesis) and (2) that this disjunction should be minimal (as disjunctions can always be extended by applications of the addition rule). To clarify this last point: Suppose one was able to derive the condition of line 5 by itself, then the disjunction on line 7 would not be minimal anymore and there would be no reason anymore to mark line 6.

12.7.4 Practical Abduction

The logic LA^r_s is a logic for practical abduction (Sect. 12.5). This means that it solves the problem of multiple explanatory hypotheses by only allowing the disjunction of the various hypotheses to be derived. Consider the following example from the premise set {Ra, ∀x(Px ⊃ Rx), ∀x(Qx ⊃ Rx)}

1   ∀x(Px ⊃ Rx)                           -; PREM
    → ∅
2   ∀x(Qx ⊃ Rx)                           -; PREM
    → ∅
3   Ra                                    -; PREM
    → ∅
9   Pa ∨ Qa                               3, 8; RC
    → {∀x((Px ∨ Qx) ⊃ Rx) ∧ (Ra ∧ ¬(Pa ∨ Qa))}

Because of the fact that the minimal Dab formulas on lines 6 and 7 could be derived from the premises, the individual hypotheses Pa and Qa have to be withdrawn; yet, the condition of their disjunction on line 9 is not part of a minimal Dab formula from these premises. This shows that this logic only allows one to derive a disjunction in the case of multiple explanatory hypotheses, and none of the individual disjuncts.

12.7.5 Avoiding Random Hypotheses

Another important feature of a logic for abduction is that it prevents one from deriving random hypotheses. The three most common ways to introduce random hypotheses are:

1. By deriving an explanation for a tautology, for example, deriving Xa from the theorems Pa ∨ ¬Pa and ∀x(Xx ⊃ (Px ∨ ¬Px))
2. By deriving contradictions as explanations, which leads to logical explosion, for example, deriving Xa ∧ ¬Xa from Pa and the theorem ∀x((Xx ∧ ¬Xx) ⊃ Px)
3. By deriving hypotheses that are not the most parsimonious ones, for example, deriving Pa ∧ Xa from Qa and ∀x(Px ⊃ Qx) (and its consequence ∀x((Px ∧ Xx) ⊃ Qx)).

The logic LA^r_s prevents these three ways by mechanisms similar to the one that blocks individual hypotheses, as illustrated above. Elaborate examples for each of these three ways can be found in [12.29].
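
The marking rule of the reliability strategy (Sect. 12.7.3) can be stated compactly in code. The sketch below is an illustration under stated assumptions, not the chapter's formal definition: abnormalities are opaque strings, each Dab formula derived on the empty condition is given as a set of its disjuncts, the function names are invented, and line 5 of the example is assumed to carry the hypothesis Pa on the first abnormality.

```python
def minimal_dab(dab_formulas):
    """Keep only the Dab formulas of which no proper subset has also been derived."""
    sets = [frozenset(d) for d in dab_formulas]
    return [d for d in sets if not any(other < d for other in sets)]


def unreliable(dab_formulas):
    """Abnormalities occurring as a disjunct of some minimal Dab formula."""
    result = set()
    for dab in minimal_dab(dab_formulas):
        result |= dab
    return result


def marked_lines(conditions, dab_formulas):
    """Reliability strategy: mark a line if its condition contains an unreliable abnormality."""
    suspect = unreliable(dab_formulas)
    return {number for number, condition in conditions.items() if set(condition) & suspect}


AB_P = "∀x(Px ⊃ Qx) ∧ (Qa ∧ ¬Pa)"    # assumed condition of line 5 (hypothesis Pa)
AB_NOT_P = "∀x(¬Px ⊃ Rx) ∧ (Ra ∧ Pa)"  # condition of line 6 (hypothesis ¬Pa)
conditions = {5: {AB_P}, 6: {AB_NOT_P}}

# Line 7 derives the disjunction of both abnormalities on the empty condition.
after_line_7 = marked_lines(conditions, [{AB_P, AB_NOT_P}])

# If the condition of line 5 were derivable by itself, the disjunction would
# no longer be minimal and line 6 would not be marked.
if_line_5_condition_derived = marked_lines(conditions, [{AB_P, AB_NOT_P}, {AB_P}])
```

The second call reproduces the clarification in the text: once the shorter Dab formula is available, only the line depending on that abnormality stays marked.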
the problem of multiple explanatory hypotheses in a different manner. Specific for this logic (as this logic is aimed at modeling the reasoning of, e.g., scientists or detectives [12.33]) is the desideratum that it handles contradictory hypotheses, predictions, and counterevidence in a natural way.

12.8.1 Formal Language Schema

As this logic is a modal logic, the language of this logic is an extension of the language of the classical logic CL. Let the standard predicative language of the classical logic be denoted with L. C, V, F, and W will further be used to refer, respectively, to the sets of individual constants, individual variables, all (well-formed) formulas of L, and the closed (well-formed) formulas of L.

L_M, the language of the logic MLA^s_s, is L extended with the modal operator □. W_M, the set of closed formulas of L_M, is the smallest set that satisfies the following conditions:

1. If A ∈ W, then A, □A ∈ W_M
2. If A ∈ W_M, then ¬A ∈ W_M
3. If A, B ∈ W_M, then A ∧ B, A ∨ B, A ⊃ B, A ≡ B ∈ W_M.

It is important to notice that there are no occurrences of modal operators within the scope of another modal operator or a quantifier. The set W^Γ, the subset of W_M, the elements of which can act as premises in the logic, is further defined as

\[ \mathcal{W}^{\Gamma} = \{\, \Box A \mid A \in \mathcal{W} \,\} . \tag{12.7} \]

It is easily seen that

\[ \mathcal{W}^{\Gamma} \subset \mathcal{W}_M . \tag{12.8} \]

This logic is one of the weakest normal modal logics that exist and is obtained by adding the D-axiom to the axiomatization of the better known minimal normal modal logic K.

The semantics for this logic can be expressed by a standard possible world Kripke semantics where the accessibility relation R between possible worlds is serial, that is, for every world w in the model, there is at least one world w′ in the model such that Rww′.

12.8.3 Intended Interpretation of the Modal Operators

As indicated above, explanatory hypotheses – the results of abductive inferences – will be represented by formulas of the form ◇A (A ∈ W). Formulas of the form □B are used to represent explananda, other observational data, and relevant background knowledge. Otherwise, this information would not be able to revoke derived hypotheses (for instance, ¬A and ◇A are not contradictory, whereas □¬A and ◇A are). The reason D is chosen instead of K is that it is assumed that the explananda and background information are together consistent. This assumption is modeled by the D-axiom (for instance, the premise set {□¬Pa, □(∀x)Px} is a set modeling an inconsistent set of background knowledge and observations, but in the logic K, this set would not be considered inconsistent, because not everything can be derived from this set by Ex Falso Quodlibet. To be able to do this, the D-axiom is needed.)

12.8.4 Set of Abnormalities

Since the final form of the abnormalities is quite complex – although the idea behind it is straightforward – two more basic proposals that are constitutive for the final form will first be considered and it will be shown why they are insufficient. Obviously, only closed well-formed formulas can be an element of any set of
hypothesis will be defeated if one shows explicitly that the hypothesis cannot be the case.

12.8.6 Simple Strategy

For this logic, the simple strategy can be used, which means, as stated before, that one has to mark lines for which one of the elements of the condition is unconditionally derived. It can easily be seen that the condition for the use of the simple strategy, that is,

\[ \Gamma \vdash_{\mathbf{LLL}} \mathit{Dab}(\Delta) \ \text{ only if there is an } A \in \Delta \text{ such that } \Gamma \vdash_{\mathbf{LLL}} A , \tag{12.14} \]

is fulfilled here. Since all premises have the form □A, the only option to derive a disjunction of abnormalities would be to apply addition, that is, to derive □(A ∨ B) from □A (or □B), because it is well known that □(A ∨ B) ⊬ □A ∨ □B in any standard modal logic (it is also possible to derive a disjunction from the premises by means of the K-axiom. For instance, □(A ⊃ B) ⊢ ¬□A ∨ □B, but the first disjunct will always be equivalent to a possibility (◇¬A) and can, hence, not be an abnormality).

12.8.7 Contradictory Hypotheses

As a first example of the functioning of this logic, consider the following example starting from the premise set {□Qa, □Ra, □∀x(Px ⊃ Qx), □∀x(¬Px ⊃ Rx)}. As the reader is by now probably accustomed to the functioning of the abnormalities, it is also shown how this logic is able to handle contradictory hypotheses without causing explosion.

1   □∀x(Px ⊃ Qx)                          -; PREM
    → ∅
2   □∀x(¬Px ⊃ Rx)                         -; PREM
    → ∅

the conditions on lines 5–7 are not unconditionally derivable from the premise set. It is also interesting to note that, because of the properties of the lower limit D, it is not possible to derive from these premises that ◇(Pa ∧ ¬Pa). The conjunction of two hypotheses is never considered as a hypothesis itself, unless there is further background information that links these two hypotheses in some way.

12.8.8 Predictions and Evidence

To show that this logic handles predictions and (counter) evidence for these predictions in a natural way, let the premise set be extended with the additional implication □∀x(Px ⊃ Sx)

8   □∀x(Px ⊃ Sx)                          -; PREM
    → ∅
9   ◇Sa                                   5, 8; RU
    → {□(∀x(Px ⊃ Qx) ∧ (Qa ∧ ¬Pa))}

With this extra implication, the prediction ◇Sa can be derived. As long as one has no further information about this prediction (for instance, by observation), it remains a hypothesis derived on the same condition as ◇Pa. If one were to test this prediction, one would have two possibilities. On the one hand, if the prediction turns out to be false, the premise □¬Sa could be added to the premise set

⋮
5   ◇Pa                                   1, 3; RC   ✓12
    → {□(∀x(Px ⊃ Qx) ∧ (Qa ∧ ¬Pa))}
⋮
9   ◇Sa                                   5, 8; RU   ✓12
    → {□(∀x(Px ⊃ Qx) ∧ (Qa ∧ ¬Pa))}
Modeling Hypothetical Reasoning by Formal Logics 12.8 MLAss : A Logic for Theoretical Singular Fact Abduction 263
but do not prove it, while false predictions directly falsify the hypothesis, one can say that this logic handles predictions in a Popperian way, although in using this vocabulary, the reader has to be reminded that $\mathbf{MLA}^{s}_{s}$ is a logic for modeling abduction and handling explanatory hypotheses, not a formal methodology of science. This logic has nothing to say about the confirmation of theories for which Popper actually employed the concepts of corroboration and falsification [12.34].

12.8.9 Contradictions

One of the three ways a logic of abduction could generate random hypotheses as a side effect is by allowing for the abduction of contradictions. How this is possible and how the logic prevents this are illustrated in the following proof from the premise set $\{\Box Qa\}$

1   $\Box Qa$   -; PREM
      $\emptyset$
2   $\Box\forall x((Xx \wedge \neg Xx) \supset Qx)$   -; RU
      $\emptyset$
3   $\Diamond(Xa \wedge \neg Xa)$   1,2; RC   ✓4
      $\{\Box(\forall x((Xx \wedge \neg Xx) \supset Qx) \wedge (Qa \wedge \neg(Xa \wedge \neg Xa)))\}$
4   $\Box(\forall x((Xx \wedge \neg Xx) \supset Qx) \wedge (Qa \wedge (\neg Xa \vee Xa)))$   1; RU
      $\emptyset$

12.8.10 Tautologies

Still, there are other ways to derive random hypotheses that are not prevented by the first proposal for the set of abnormalities $\Omega_1$. For instance, $\Omega_1$ does not prevent that random hypotheses can be derived from a tautology, as illustrated by the following example. As it is impossible in the following proof from the premise set $\emptyset$ to unconditionally derive the abnormality in the condition of line 3 from the premises, the formula of line […]

No hypothesis can be abduced from a tautology if the abnormalities have the following form

$$\Omega_2 = \{\Box(\forall\alpha(A(\alpha) \supset B(\alpha)) \wedge (B(\beta) \wedge \neg A(\beta))) \vee \Box\forall\alpha B(\alpha) \mid \text{no predicate occurring in } B \text{ occurs in } A\} \qquad (12.15)$$

It is clear that one can keep using the simple strategy with this new set of abnormalities. It is also easily seen that all of the advantages and examples described above still hold. Each time one can derive an abnormality of $\Omega_1$, one can derive the corresponding abnormality of $\Omega_2$ by a simple application of the addition rule. Finally, the problem raised by tautologies, as illustrated in the previous example, is solved in an elegant way, because the form of abnormalities makes sure that the abnormality will always be a theorem in case the explanandum is a theorem. So, nothing can be abduced from tautologies.

12.8.12 Most Parsimonious Explanantia

Still, there is a third way to derive random hypotheses that cannot be prevented by $\Omega_2$. Consider, for instance, the following proof from the premise set $\{\Box Ra, \Box\forall x(Px \supset Rx)\}$

1   $\Box Ra$   -; PREM
      $\emptyset$
2   $\Box\forall x(Px \supset Rx)$   -; PREM
      $\emptyset$
3   $\Box\forall x((Px \wedge Xx) \supset Rx)$   2; RU
      $\emptyset$
4   $\Diamond(Pa \wedge Xa)$   1,3; RC
      $\{\Box(\forall x((Px \wedge Xx) \supset Rx) \wedge (Ra \wedge \neg(Pa \wedge Xa))) \vee \Box\forall x Rx\}$
$$A^{PCN}(\alpha) = (Q_1\gamma_1) \ldots (Q_m\gamma_m)(A_1(\alpha) \wedge \ldots \wedge A_n(\alpha)) \quad \text{and} \quad \vdash A^{PCN}(\alpha) \equiv A(\alpha) \qquad (12.16)$$

with $m \geq 0$, $n \geq 1$, $Q_i \in \{\forall, \exists\}$ for $i \leq m$, $\gamma_i \in V$ for $i \leq m$, $\alpha \in V$ and $A_i(\alpha)$ disjunctions of literals in $F$ for $i \leq n$.

Then, the new notation $A_i^{-1}(\alpha)$ ($1 \leq i \leq n$) can be introduced so that there is a way to take out one of the conjuncts of a formula in the PCN form. In cases where the conjunction consists of only one conjunct (and, obviously, no more parsimonious explanation is possible), the substitution with a random tautology will make sure that the condition for parsimony, added in the next set of abnormalities, is satisfied trivially

if $n > 1$:

$$A_i^{-1}(\alpha) =_{df} (Q_1\gamma_1) \ldots (Q_m\gamma_m)(A_1(\alpha) \wedge \ldots \wedge A_{i-1}(\alpha) \wedge A_{i+1}(\alpha) \wedge \ldots \wedge A_n(\alpha)) \qquad (12.17)$$

with $A_j$ ($1 \leq j \leq n$) the $j$th conjunct of $A^{PCN}(\alpha)$

if $n = 1$:

$$A_1^{-1}(\alpha) =_{df} \top \qquad (12.18)$$

with $\top$ any tautology of CL.

[…]

For the same reasons as stated in the description of $\Omega_2$, one can keep using the simple strategy, and all of the advantages and examples described above will still hold.

Let us have a look at how this final set of abnormalities solves the previous problem. As the condition is fully written out, one can easily see that the third disjunct $\Box\forall x(Px \supset Rx)$ is actually a premise, and that, hence, the abnormality on line 4 is unconditionally derivable

1   $\Box Ra$   -; PREM
      $\emptyset$
2   $\Box\forall x(Px \supset Rx)$   -; PREM
      $\emptyset$
3   $\Box\forall x((Px \wedge Qx) \supset Rx)$   2; RU
      $\emptyset$
4   $\Diamond(Pa \wedge Qa)$   1,3; RC   ✓5
      $\{\Box(\forall x((Px \wedge Qx) \supset Rx) \wedge (Ra \wedge \neg(Pa \wedge Qa))) \vee \Box\forall x Rx \vee \Box\forall x(Px \supset Rx) \vee \Box\forall x(Qx \supset Rx)\}$
5   $\Box(\forall x((Px \wedge Qx) \supset Rx) \wedge (Ra \wedge \neg(Pa \wedge Qa))) \vee \Box\forall x Rx \vee \Box\forall x(Px \supset Rx) \vee \Box\forall x(Qx \supset Rx)$   2; RU
      $\emptyset$

This concludes the informal presentation of this logic, which, in its final form, meets all desiderata put up front.
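The parsimony idea behind this last set of abnormalities can be illustrated with a small sketch. It is a propositional simplification, not the modal first-order logic itself: formulas are modeled as Python predicates over truth assignments, and the names `entails` and `redundant_conjuncts` are hypothetical helpers introduced only for this illustration. A conjunct of a hypothesis counts as redundant when the remaining conjuncts, together with the background theory, already yield the explanandum.

```python
from itertools import product

def entails(premises, conclusion, atoms):
    """Classical propositional entailment, checked by truth-table enumeration."""
    for values in product([False, True], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if all(p(v) for p in premises) and not conclusion(v):
            return False
    return True

def redundant_conjuncts(theory, conjuncts, explanandum, atoms):
    """Names of conjuncts that can be dropped while the rest still explains."""
    redundant = []
    for name in conjuncts:
        rest = [f for other, f in conjuncts.items() if other != name]
        # Assumption of this sketch: a one-conjunct hypothesis is simply
        # skipped, whereas the text handles n = 1 by substituting a tautology.
        if rest and entails(theory + rest, explanandum, atoms):
            redundant.append(name)
    return redundant

atoms = ["P", "Q", "R"]
theory = [lambda v: (not v["P"]) or v["R"]]          # P implies R
explanandum = lambda v: v["R"]                       # R is to be explained
hypothesis = {"P": lambda v: v["P"], "Q": lambda v: v["Q"]}

print(redundant_conjuncts(theory, hypothesis, explanandum, atoms))
```

In this toy setting the conjunct `Q` is reported as redundant, which mirrors why the hypothesis about $Pa \wedge Qa$ gets marked in the proof above: the more parsimonious $Pa$ already does the explanatory work.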
Modeling Hypothetical Reasoning by Formal Logics 12.A Appendix: Formal Presentations of the Logics LArs and MLAss 265
12.9 Conclusions
There is quite some ground covered in this chapter, the main purpose of which was to show in a direct yet nuanced fashion the feasibility and the limits of modeling hypothetical reasoning by means of formal logics. It started with an argument for this claim in a general way, showing which assumptions one has to accept or reject to take this view. Insofar as the feasibility of this project was argued for, attention was also drawn to certain limits, pitfalls, and disadvantages of it. This discussion was then expanded by identifying four main abduction patterns, which showed that no pattern of hypothetical reasoning can be easily modeled.

In the second part of this chapter, gears were shifted and a glimpse was shown of what is already possible today with current logical techniques, by explaining in detail two logics originating in the adaptive logics framework: $\mathbf{LA}^{r}_{s}$ for practical fact abduction and $\mathbf{MLA}^{s}_{s}$ for theoretical singular fact abduction. The purpose of including the full details of these logics is threefold: First, it shows the reader how certain steps, which are admittedly modest, can be taken in the project of formally modeling hypothetical reasoning. At the same time, the reader is introduced to the unificational framework of adaptive logics that shows promise to take some further steps along the road. Finally, it also shows that the use of formal models draws attention to various issues about these reasoning patterns which were previously left unattended, for example, the difference between practical and theoretical abduction or the importance of avoiding random hypotheses by restricting the use of tautologies and contradictions.

However, if one looks at the prospect of modeling abductive reasoning by means of formal (adaptive) logics, one has to conclude that so far only the tip of the iceberg has been scratched. At present, apart from a single exception, logics have been devised only for singular fact abduction, which is, in fact, the easiest of the various patterns of abduction. Yet, the complications that already arise on this level warn dreamers that the road ahead will be steep and arduous.
Definition 12.2 (Minimal Dab formula at stage s)
A Dab formula $Dab(\Delta)$ ($Dab(\Delta)$ is the (classical) disjunction of the abnormalities in a finite subset $\Delta$ of the set of abnormalities $\Omega$) is a minimal Dab formula at stage $s$ if and only if $Dab(\Delta)$ is derived on the empty condition at stage $s$, and there is no $\Delta' \subset \Delta$ for which $Dab(\Delta')$ is derived on the empty condition at stage $s$.

Definition 12.3 (Set of unreliable formulas $U_s(\Gamma)$ at stage s)
The set of unreliable formulas $U_s(\Gamma)$ at stage $s$ is the union of all $\Delta$ for which $Dab(\Delta)$ is a minimal Dab formula at stage $s$.

Definition 12.4 (Marking for the reliability strategy)
Line $i$ with condition $\Delta$ is marked for the reliability strategy at stage $s$ of a proof if and only if $\Delta \cap U_s(\Gamma) \neq \emptyset$.

Definition 12.5 (Marking for the simple strategy)
Line $i$ with condition $\Delta$ is marked for the simple strategy at stage $s$ of a proof, if stage $s$ contains a line of which $A \in \Delta$ is the formula and $\emptyset$ the condition.

Definition 12.6 (Derivation of a formula at stage s)
A formula $A$ is derived from $\Gamma$ at stage $s$ of a proof if and only if $A$ is the formula of a line that is unmarked at stage $s$.

Definition 12.7 (Final derivation of a formula at stage s)
A formula $A$ is finally derived from $\Gamma$ at stage $s$ of a proof if and only if $A$ is derived at line $i$, line $i$ is not marked at stage $s$ and every extension of the proof in which $i$ is marked may be further extended in such a way that line $i$ is unmarked.

Using the simple strategy, it is not possible that a marked line becomes unmarked at a later stage of a proof. Therefore, the final criterion reduces for this strategy to the requirement that the line remains unmarked in every extension of the proof.

Definition 12.9 (Final derivability for $\mathbf{MLA}^{s}_{s}$)
For all $\Gamma \subseteq W$: $\Gamma \vdash_{\mathbf{MLA}^{s}_{s}} A$ ($A \in Cn_{\mathbf{MLA}^{s}_{s}}(\Gamma)$) if and only if $A$ is finally derived in an $\mathbf{MLA}^{s}_{s}$-proof from $\Gamma$.

12.A.2 Semantics

The semantics of an adaptive logic is obtained by a selection on the models of the lower limit logic. For a more elaborate discussion of the following definitions, the reader is referred to the original articles and the aforementioned theoretical overviews of adaptive logics.

Definition 12.10
A CL-model $M$ of the premise set $\Gamma$ is reliable if and only if $\{A \in \Omega \mid M \models A\} \subseteq \Delta_1 \cup \Delta_2 \cup \ldots$ with $\{Dab(\Delta_1), Dab(\Delta_2), \ldots\}$ the set of minimal Dab-consequences of $\Gamma$.

Definition 12.11
A D-model $M$ of the premise set $\Gamma$ is simply all right if and only if $\{A \in \Omega \mid M \models A\} = \{A \in \Omega \mid \Gamma \vdash_{\mathbf{D}} A\}$.

Definition 12.12 (Semantic consequence of $\mathbf{LA}^{r}_{s}$)
For all $\Gamma \subseteq W$: $\Gamma \models_{\mathbf{LA}^{r}_{s}} A$ if and only if $A$ is verified by all reliable models of $\Gamma$.

Definition 12.13 (Semantic consequence of $\mathbf{MLA}^{s}_{s}$)
For all $\Gamma \subseteq W$: $\Gamma \models_{\mathbf{MLA}^{s}_{s}} A$ if and only if $A$ is verified by all simply all right models of $\Gamma$.

The fact that these two logics are in a standard format warrants that the following theorems hold.

Theorem 12.1 (Soundness and completeness of $\mathbf{LA}^{r}_{s}$)
$\Gamma \vdash_{\mathbf{LA}^{r}_{s}} A$ if and only if $\Gamma \models_{\mathbf{LA}^{r}_{s}} A$.

Theorem 12.2 (Soundness and completeness of $\mathbf{MLA}^{s}_{s}$)
$\Gamma \vdash_{\mathbf{MLA}^{s}_{s}} A$ if and only if $\Gamma \models_{\mathbf{MLA}^{s}_{s}} A$.
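The marking definition for the simple strategy lends itself to a short sketch. The following is an illustration under simplifying assumptions, not the adaptive-logic proof machinery itself: formulas and abnormalities are plain strings compared for identity, and the names `Line`, `mark_simple`, `nec` and `poss` are hypothetical choices made for this example. The sample proof loosely follows the contradiction example of Sect. 12.8.9, with line 4 written in exactly the string form used in the condition of line 3.

```python
from dataclasses import dataclass

@dataclass
class Line:
    number: int
    formula: str
    condition: frozenset = frozenset()  # abnormalities the line relies on
    marked: bool = False

def mark_simple(proof):
    """Mark lines whose condition contains an unconditionally derived abnormality."""
    # Formulas of lines that were derived on the empty condition.
    unconditional = {line.formula for line in proof if not line.condition}
    for line in proof:
        if line.condition & unconditional:
            line.marked = True  # under the simple strategy a mark is never undone
    return proof

# Hypothetical string encoding of the abnormality in the condition of line 3.
AB = "nec(all x((Xx & ~Xx) -> Qx) & (Qa & ~(Xa & ~Xa)))"

proof = [
    Line(1, "nec(Qa)"),
    Line(2, "nec(all x((Xx & ~Xx) -> Qx))"),
    Line(3, "poss(Xa & ~Xa)", frozenset({AB})),
    Line(4, AB),
]

mark_simple(proof)
print([line.number for line in proof if line.marked])
```

Only the conditional line 3 ends up marked, so the contradictory hypothesis is withdrawn while the premises and the derived abnormality remain derived at this stage.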
References
12.1 N. Rescher: Hypothetical Reasoning (North-Holland, Amsterdam 1964)
12.2 R. Koons: Defeasible reasoning. In: The Stanford Encyclopedia of Philosophy, ed. by E. Zalta (Stanford Univ., Stanford 2014), Spring 2014 edn.
12.3 C. Straßer: Adaptive Logics for Defeasible Reasoning: Applications in Argumentation, Normative Reasoning and Default Reasoning (Springer, Dordrecht 2013)
12.4 D. Batens: A universal logic approach to adaptive logics, Logica Universalis 1, 221–242 (2007)
12.5 D. Batens: Adaptive Logics and Dynamic Proofs. Mastering the Dynamics of Reasoning with Special Attention to Handling Inconsistency (Ghent Univ., Ghent 2010), http://logica.ugent.be/adlog/book.html
12.6 P. Paul: AI approaches to abduction. In: Abductive Reasoning and Uncertainty Management Systems, Handbook of Defeasible Reasoning and Uncertainty Management Systems, Vol. 4, ed. by D. Gabbay, R. Kruse (Kluwer Acad., Dordrecht 2000) pp. 35–98
12.7 M. Beirlaen, A. Aliseda: A conditional logic for abduction, Synthese 191(15), 3733–3758 (2014)
12.8 N.R. Hanson: Patterns of Discovery: An Inquiry into the Conceptual Foundations of Science (Cambridge Univ. Press, Cambridge 1958)
12.9 N.R. Hanson: Is there a logic of scientific discovery? In: Current Issues in the Philosophy of Science, ed. by H. Feigl, G. Maxwell (Holt, Rinehart and Winston, New York 1961) pp. 20–35
12.10 G. Harman: The inference to the best explanation, Philosophical Rev. 74(1), 88–95 (1965)
12.11 T. Nickles: Introductory essay: Scientific discovery and the future of philosophy of science. In: Scientific Discovery, Logic and Rationality, ed. by T. Nickles (Reidel, Dordrecht 1980) pp. 1–59
12.12 H. Simon: Does scientific discovery have a logic?, Philosophy Sci. 40, 471–480 (1973)
12.13 P. Thagard: Computational Philosophy of Science (MIT, Cambridge 1988)
12.14 G. Schurz: Patterns of abduction, Synthese 164, 201–234 (2008)
12.15 M. Hoffmann: Theoric transformations and a new classification of abductive inferences, Trans. Charles S. Peirce Soc. 46(4), 570–590 (2010)
12.16 T. Gauderis, F. Van De Putte: Abduction of generalizations, Theoria 27(3), 345–363 (2012)
12.17 A. Aliseda: Abductive Reasoning. Logical Investigation into Discovery and Explanation, Synthese Library, Vol. 330 (Springer, Dordrecht 2006)
12.18 L. Magnani: Abduction, Reason and Science: Processes of Discovery and Explanation (Kluwer/Plenum, New York 2001)
12.19 P. Thagard, C. Shelley: Abductive reasoning: Logic, visual thinking, and coherence. In: Logic and Scientific Methods, Vol. 259, ed. by M.L. Dalla Chiara, K. Doets, D. Mundici, J. van Benthem (Kluwer Acad., Dordrecht 1997) pp. 413–427
12.20 T. Gauderis: To envision a new particle or change an existing law? Hypothesis formation and anomaly resolution for the curious case of the β decay spectrum, Stud. Hist. Philos. Mod. Phys. 45(1), 27–45 (2014)
12.21 T. Gauderis: Patterns of Hypothesis Formation: At the Crossroads of Philosophy of Science, Logic, Epistemology, Artificial Intelligence and Physics, Ph.D. Thesis (Ghent Univ., Ghent 2013)
12.22 J. Meheus, L. Verhoeven, M. Van Dyck, D. Provijn: Ampliative adaptive logics and the foundation of logic-based approaches to abduction. In: Logical and Computational Aspects of Model-Based Reasoning, ed. by L. Magnani, N.J. Nersessian, C. Pizzi (Kluwer Academic, Dordrecht 2002) pp. 39–71
12.23 D. Batens, J. Meheus, D. Provijn, L. Verhoeven: Some adaptive logics for diagnosis, Log. Log. Philos. 11/12, 39–65 (2003)
12.24 J. Meheus, D. Batens: A formal logic for abductive reasoning, Log. J. IGPL 14, 221–236 (2006)
12.25 J. Meheus: Adaptive logics for abduction and the explication of explanation-seeking processes. In: Abduction and the Process of Scientific Discovery, ed. by O. Pombo, A. Gerne (Centro de Filosofia das Ciencias, Lisboa 2007) pp. 97–119
12.26 J. Meheus, D. Provijn: Abduction through semantic tableaux versus abduction through goal-directed proofs, Theoria 22(3), 295–304 (2007)
12.27 H. Lycke: The Adaptive logics approach to abduction. In: Logic Philosophy and History of Science in Belgium, Proc. Young Res. Days, ed. by E. Weber, T. Libert, P. Marage, G. Vanpaemel (KVAB, Brussels 2009) pp. 35–41
12.28 H. Lycke: A formal explication of the search for explanations: The adaptive logics approach to abductive reasoning, Log. J. IGPL 20(2), 497–516 (2012)
12.29 J. Meheus: A formal logic for the abduction of singular hypotheses. In: Explanation, Prediction, and Confirmation, ed. by D. Dieks, W. Gonzalez, S. Hartmann, T. Uebel, M. Weber (Springer, Dordrecht 2011) pp. 93–108
12.30 T. Gauderis: Modelling abduction in science by means of a modal adaptive logic, Found. Sci. 18(4), 611–624 (2013)
12.31 D. Provijn: The generation of abductive explanations from inconsistent theories, Log. J. IGPL 20(2), 400–416 (2012)
12.32 T. Gauderis: An adaptive logic based approach to abduction in AI, Proc. 9th Int. Workshop Nonmonotonic Reasoning, Action Change (NARC), Barcelona, ed. by S. Sardina, S. Vassos (2011) pp. 1–6, Online Publication, http://ijcai-11.iiia.csic.es/files/proceedings/W4-%20NRAC11-Proceedings.pdf
12.33 T. Gauderis: The problem of multiple explanatory hypotheses, future directions for logic, Proc. PhDs in Logic III, Brussels, 2011, ed. by L. Demey, J. Devuyst (College Publications, London 2012) pp. 45–54
12.34 K. Popper: The Logic of Scientific Discovery (Routledge, London 1959)
13. Abductive Reasoning in Dynamic Epistemic Logic
[…]tion representing the application of an abductive step is introduced, and an illustrative example is provided. A number of the most interesting properties of abductive reasoning (those highlighted by Peirce) are shown to be better modeled within this approach.

13.1 Classical Abduction
13.2 A Dynamic Epistemic Perspective
13.2.1 What Is an Abductive Problem?
13.2.2 What Is an Abductive Solution?
13.2.3 How is the Best Explanation Selected?
13.2.4 How is the Best Explanation Incorporated Into the Agent's Information?
13.5.1 Ordering Explanations
13.6 Integrating the Best Solution
13.6.1 Abduction in a Picture, Once Again
13.6.2 Further Classification
13.6.3 Properties in a Picture
13.7 Working with the Explanations
13.7.1 A Modality
13.8 A Brief Exploration to Nonideal Agents
13.8.1 Considering Inference
13.8.2 Different Reasoning Abilities
13.9 Conclusions
References

Within logic, abductive reasoning has been studied mainly from a purely syntactic perspective. Definitions of an abductive problem and its solution(s) are given in terms of a theory and a formula, and therefore most of the formal logical work on the subject has focused on:

1. Discussing what a theory and a formula should satisfy in order to constitute an abductive problem, and what a formula should satisfy in order to be an abductive solution [13.1]; see also Chap. 10
2. Proposing algorithms to find abductive solutions [13.2–6]
3. Analyzing the structural properties of abductive consequence relations [13.7–9].

In all these studies, which follow the so-called Aliseda–Kakas/Kowalski–Magnani/Meheus (AKM)-schema of abduction (Chap. 10), explanationism and consequentialism are considered, but the epistemic character of abductive reasoning seems to have been pushed into the background. Such character is considered crucial in this chapter, as will be discussed.

This chapter's main proposal is an epistemic and dynamic approach to abductive reasoning. The proposal is close to the ideas of [13.10–13] in that it stresses the key role that agents play within the abductive reasoning scenario; after all, at the heart, abduction deals with agents and their (individual or collective) information. In this sense, this collaboration is closer to the Gabbay–Woods (GW)-schema [13.14, 15], see also Chap. 14, which is based on the concept of ignorance problem that arises when a cognitive agent has a cognitive target that cannot be attained from what she currently knows, and thus highlights the distinctive epistemic feature of abduction that is key to this chapter's considerations. Even so, this presentation goes one step further, as it fully adopts a dynamic perspective by making explicit the actions involved in the abductive process; after all, abduction studies the way agents react epistemically (as individuals or groupwise) to new observations.

More precisely, this proposal argues (Sect. 13.2) that abductive reasoning can be better understood as a process that involves an agent's information. To this end, it presents definitions of an abductive problem and an abductive solution in terms of an agent's knowledge and her beliefs as well as a subjective criterion for selecting the agent's best explanation, and outlines a policy through which the chosen abductive solution can be integrated into the agent's information. Then, the discussed ideas and definitions are formalized using tools from dynamic epistemic logic (DEL). This choice is not accidental: classical epistemic logic (EL [13.16, 17]) with its possible worlds semantic model is a powerful framework that allows representing an agent's knowledge and beliefs not only about propositional facts but also about her own information. Its dynamic extension, DEL [13.18, 19], allows the representation of diverse epistemic actions (as diverse forms of announcements and different policies for belief revision) that make such information change. Section 13.3 introduces the needed tools, and then the ideas and definitions discussed in Sect. 13.2 are formalized in Sects. 13.4, 13.5, 13.6, and 13.7. The chapter closes with a brief exploration (Sect. 13.8) of the epistemic and dynamic aspects of abductive reasoning that are brought to light when nonideal agents are considered.

Abductive reasoning. The concept of abductive reasoning has been discussed in various fields, and this has led to different ideas of what abduction should consist of (see [13.20], among others). For example, while certain authors claim that there is an abductive problem only when neither the observed $\chi$ nor its negation follows from a theory [13.2], others say that there is also an abductive problem when, though $\chi$ does not follow, its negation does [13.1], a situation that has been typically called a belief revision problem. There are also several opinions of what an abductive solution is. Most of the work on strategies for finding abductive solutions focuses on formulas that are already part of the system (the aforementioned [13.2–6]), while some others take a broader view, allowing not only changes in the underlying logical consequence relation [13.21] but also the creation and modification of concepts [13.22].

The present proposal focuses on a simple account: Abductive reasoning will be understood as a reasoning process that goes from a single unjustified fact to its abductive explanations, where an explanation is a formula of the system that satisfies certain properties. Still, similar epistemic and dynamic approaches can be made to other interpretations of abduction, as those that involve the creation of new concepts or changes in awareness [13.23, 24].

Abductive reasoning in dynamic epistemic logic. This contribution is a revised version of a proposal whose different parts have been presented in diverse venues. While Sects. 13.2, 13.4, and 13.6 are based on [13.25], Sects. 13.5 and 13.7 are based on [13.26] and Sect. 13.8 is based on [13.27].
13.1 Classical Abduction

When formalized within logical frameworks, the key concepts in abductive reasoning have traditionally taken the following form (Chap. 10). First, it is said that an abductive problem arises when there is a formula that does not follow from the current theory.

Definition 13.1 Abductive problem
Let $\Phi$ and $\chi$ be a theory and a formula, respectively, in some language $\mathcal{L}$. Let $\vdash$ be a consequence relation on $\mathcal{L}$:

- The pair $(\Phi, \chi)$ constitutes a (novel) abductive problem when neither $\chi$ nor $\neg\chi$ are consequences of $\Phi$, that is, when $\Phi \nvdash \chi$ and $\Phi \nvdash \neg\chi$.
- The pair $(\Phi, \chi)$ constitutes an anomalous abductive problem when, though $\chi$ is not a consequence of $\Phi$, $\neg\chi$ is, that is, when $\Phi \nvdash \chi$ and $\Phi \vdash \neg\chi$.

It is typically assumed that the theory $\Phi$ is a set of formulas closed under logical consequence, and that $\vdash$ is a truth-preserving consequence relation.

Consider a novel abductive problem. The observation of a $\chi$ about which the theory $\Phi$ does not have any opinion shows that $\Phi$ is incomplete. Further information that completes $\Phi$ making $\chi$ a consequence of it solves the problem, as now the theory is strong enough to explain $\chi$. Consider now an anomalous abductive problem. The observation of a $\chi$ whose negation is entailed by the theory shows that the theory contains a mistake. Now two steps are needed. First, perform a theory revision that stops $\neg\chi$ from being a consequence of $\Phi$; this turns the anomalous problem into a novel one, and now the search for further information that completes the theory, making $\chi$ a consequence of it, can be performed. Here are the formal definitions.

Definition 13.2 Abductive solution

- Given a novel abductive problem $(\Phi, \chi)$, the formula $\eta$ is said to be an abductive solution when $\Phi \cup \{\eta\} \vdash \chi$.
- Given an anomalous abductive problem $(\Phi, \chi)$, the formula $\eta$ is an abductive solution when it is possible to perform a theory revision to get a novel problem $(\Phi', \chi)$ for which $\eta$ is a solution.

This definition of an abductive solution is often considered as too weak: $\eta$ can take many trivial forms, as anything that contradicts $\Phi$ (then everything, including $\chi$, follows from $\Phi \cup \{\eta\}$) and even $\chi$ itself (clearly, $\Phi \cup \{\chi\} \vdash \chi$). Further conditions can be imposed in order to define more satisfactory solutions; here are some of them [13.1] (Chap. 10).

Definition 13.3 Classification of abductive solutions
Let $(\Phi, \chi)$ be an abductive problem. An abductive solution $\eta$ is

- consistent iff $\Phi, \eta \nvdash \bot$
- explanatory iff $\eta \nvdash \chi$
- minimal iff for every other solution $\zeta$, $\eta \vdash \zeta$ implies $\zeta \vdash \eta$

The consistency requirement discards solutions that are inconsistent with the theory, something a reasonable explanation should not do. In a similar way, the explanatory requirement discards those explanations that would justify the problem by themselves, since it is preferred that the explanation only complements the current theory. Finally, the minimality requirement works as Occam's razor, looking for the simplest explanation: A solution is minimal when it is in fact logically equivalent to any other solution it implies. For further details on these definitions, the reader is referred to Chap. 10.
[…]serves that the lawn is wet; Holmes observes that Mr. Wilson's right cuff is very shiny). Thus, even though abductive reasoning has been linked to scientific theories (as interpreted in philosophy of science), in its most basic forms it deals with an agent's (or a set of agents') information. Second, abductive reasoning implies a change in the agent's information (Mary assumes that the electricity supply has failed; Karen assumes it has rained; Holmes assumes Mr. Wilson has done a lot of writing lately), and thus it is essential to distinguish the different stages during the abductive process: the stage before the observation, the stage after the observation has raised the abductive problem (and thus the one when the agent starts looking for an explanation), and the stage in which the explanation that has been chosen is incorporated into the agent's information. This describes, of course, a dynamic process.

There is a final issue that is crucial for an epistemic approach to abductive reasoning. From this contribution's perspective, abductive reasoning involves not one epistemic attitude (as is typically assumed in most approaches) but rather (at least) two: that of those propositions about which the agent has full certainty; and that of those propositions that she considers very likely but she still cannot be certain about. The reason is that an agent typically tries to explain facts she has come to know due to some observation, but the chosen solution, being a hypothesis that might be dropped in the light of further observations, should not attain the full certainty status. The use of different epistemic notions also gives more flexibility to deal with a wider variety of abductive problems and abductive solutions, […]

[…] is the action that triggers the abductive problem, that is, the action that turns a formula $\chi$ into an abductive problem.

For the former concept, a formula is typically said to be an abductive problem when it is surprising. There are different ways to define a surprising observation of $\chi$ (some of them in a DEL setting [13.30]). Most of the approaches that define this notion in terms of what the agent knows (believes) understand a surprise as something that does not follow from such knowledge (beliefs). In other words, it is said that a given $\chi$ is surprising whenever the agent does not know (believe) it, or, more radically, whenever the agent knows (believes) $\neg\chi$.

Now, note how in the context of abductive reasoning it is not reasonable to define a surprising observation in terms of what the agent knows (believes) after such epistemic action. The reason is that, after observing $\chi$, an agent would typically come to know (believe) it. Thus, if the mentioned definitions are followed focusing on the agent's information after the observation, no $\chi$ would be surprising and there would be no abductive problems at all! It is more reasonable to define a surprising observation not in terms of what the agent knows (believes) as a result of the observation, but rather in terms of what she knew (believed) before it. More precisely, it will be said that a known (believed) $\chi$ is surprising with respect to an agent whenever she could not have come to know (believe) it.

Of course, the meaning of the sentence the agent could have come to know (believe) $\chi$ still needs to be clarified. This is a crucial notion, as it will indicate not only when a formula $\chi$ is an abductive problem (the
agent could not have come to know (believe) $\chi$), but also what a formula $\eta$ needs in order to be an abductive solution (with the help of $\eta$, the agent could have come to know (believe) $\chi$). Here the ability to come to know (believe) a given formula will be understood as the ability to infer it, and the simplest way to state this idea is the following: An agent could have come to know (believe) $\chi$ if and only if there is an implication $\eta \rightarrow \chi$ such that the agent knew both the implication and its antecedent. Other formulations that do not use the material implication $\rightarrow$ are also possible (e.g., the agent may know both $\neg\eta \vee \chi$ and $\eta$ to come to know $\chi$), but in the semantic model this contribution uses (Sect. 13.3), they are logically equivalent to the proposed one.

With respect to the action that triggers an abductive problem $\chi$, this action is typically assumed to be the observation of $\chi$ itself. Here a more general idea will be considered: The action that triggers the abductive problem $\chi$ will be simply the observation of some formula $\psi$. Thus, though $\psi$ should indeed be related to $\chi$ (after all, $\chi$ is an abductive problem because the agent comes to know $\chi$ by observing $\psi$), the agent will not be restricted to looking for explanations of the formula that has been observed: She will also be able to look for explanations of any formula $\chi$ she has come to know (believe) through the observation but could not have come to know (believe) by herself before. Note how other actions are also reasonable, as the agent might want to explain a belief she attained after a belief revision (Sect. 13.4.1).

Here is the intuitive definition of an abductive problem in full detail:

“Let $s_1$ represent the epistemic state of an agent, and let $s_2$ be the epistemic state that results from the agent observing some given $\psi$. A formula $\chi$ constitutes an abductive problem for the agent at $s_2$ whenever $\chi$ is known and there is no implication $\eta \rightarrow \chi$ such that the agent knew both the implication and its antecedent at $s_1$.”

It is important to emphasize how an abductive problem has been defined with respect to an agent and stage (i.e., some epistemic situation). Thus, whether a formula is an abductive problem depends on the formula but also on the information of that given agent at that given stage. The definition is given purely in terms of the agent's knowledge, but it can also be given purely in terms of her beliefs, or even in terms of both, as will be seen later.

The presented definition could seem very restrictive. Even if the reader agrees with the basic idea ($\chi$ is an abductive problem for a given agent whenever she knows $\chi$ but she could not have come to know (believe) it), she/he does not need to agree with the way key parts of it are understood. Nevertheless, as stated in the introduction, this contribution does not intend to provide a full account of the many different understandings of what abductive reasoning does. Rather, its aim is to show how an epistemic and dynamic perspective can shed a new light on the way abductive reasoning is understood, even when assuming its simplest interpretation.

13.2.2 What Is an Abductive Solution?

In this proposal's setting, an abductive solution for a given $\chi$ will be defined in terms of what the agent could have been able to infer before the observation that raised the problem. As mentioned before, it will be said that $\eta$ is a solution for the abductive problem $\chi$ when the agent could have come to know (believe) $\chi$ with the help of $\eta$. In this simple case in which the ability to come to know (believe) a given formula is understood as the ability to infer the formula by means of a simple modus ponens step, the following definition is obtained:

“A formula $\eta$ constitutes an abductive solution for the abductive problem $\chi$ at some given state $s_2$ if the agent knew $\eta \rightarrow \chi$ at the previous state $s_1$. Thus, the set of solutions for an abductive problem $\chi$ is the set of antecedents of implications which have $\chi$ as consequent and were known before the observation that triggered the abductive problem.”

Note how abductive solutions are looked for not when the agent has come to know (believe) $\chi$, but rather at the stage immediately before it. Thus, $\eta$ is a solution when, had it been known (believed) before, it would have allowed the agent to come to know (believe) (to predict/expect) $\chi$.

13.2.3 How is the Best Explanation Selected?

Although there are several notions of explanation for modeling the behavior of why-questions in scientific contexts (e.g., the law model, the statistical relevance model, or the genetic model), most of these consider a consequence (entailment) relation; explanation and consequence go typically hand in hand. However, finding suitable and reasonable criteria for selecting the best explanation has constituted a fundamental problem in abductive reasoning [13.31–33], and in fact many authors consider it to be the heart of the subject. Many approaches are based on logical criteria, but beyond requisites to avoid triviality and certain restrictions to the syntactic form, the definition of suitable criteria is still an open problem. Some approaches have suggested the use of contextual aspects, such as an ordering among formulas or among full theories. In particular,
for the latter, a typical option is the use of preferential models based on qualitative properties that are beyond the pure causal or deductive relationship between the abductive problem and its abductive solution. However, these preference criteria are seen as an external device which works on top of the deductive part of the explanatory mechanism, and as such they have been criticized because they seem to fall outside the logical framework.

Approaching abductive reasoning from an epistemic point of view provides a different perspective. It has been discussed already how the explanation an agent will choose for a given abductive problem does not depend on how the problematic formula could have been predicted, but rather on how the agent could have predicted it. In general, different agents have different information, and thus they might disagree in what each one calls the best explanation (and even in what each one calls explanation at all). This suggests that, instead of looking for criteria to select the best explanation, the goal should be a criterion to select the agent’s best explanation. Now, once the agent has a set of formulas that explain the abductive problem from her point of view, how can she choose the best? This proposal’s answer makes use of the fact that the considered agents have not only knowledge but also beliefs: Among all these explanations, some are more plausible than others from her point of view. These are precisely the ones the agent will choose when trying to explain a surprising observation: The best explanation can be defined in terms of a preference ordering among the agent’s epistemic possibilities. It could be argued that this criterion is not logical in the classic sense because it is not based exclusively on the deductive relationship between the observed fact and different ways in which it could have been derived. Nevertheless, it is logical in a broader sense since it does depend on the agent’s information: her knowledge and, crucially, her beliefs.

13.2.4 How is the Best Explanation Incorporated Into the Agent’s Information?

Once the best explanation has been selected, it has to be incorporated into the agent’s information. One of the features that distinguish abductive reasoning from deductive reasoning is its nonmonotonic nature: The chosen explanation does not need to be true, and in fact can be discarded in the light of further information. This indicates that an abductive solution cannot be assimilated as knowledge. Nevertheless, an epistemic agent has not only this hard form of information which is not subject to modification; she also has a soft form that can be revised as many times as needed: beliefs. Therefore, once the best abductive solution $\eta$ has been chosen, the agent’s information can be changed, leading her to believe that $\eta$ is the case.

13.2.5 Abduction in a Picture

It is interesting to notice how the stated definitions of abductive problem and abductive solution rely on some form of counterfactivity, as in Peirce’s original formulation (and also as discussed in [13.15]): A given $\eta$ is a solution of a problem $\chi$ if it would have allowed the agent to predict $\chi$. This can be better described with the following diagram.

[Diagram: upper path $s_1 \to s_2 \to s_3$, labeled “Coming to know $\chi$” and “Accepting $\eta$”; lower path $s_1 \to s'_2 \to s'_3$, labeled “Incorporating $\eta$” and “Inferring $\chi$”.]

The upper path is the real one: By means of an observation, the agent goes from the epistemic state $s_1$ to the epistemic state $s_2$ in which she knows $\chi$, and by accepting the abductive solution $\eta$ she goes further to $s_3$. The existence of this path, the fact that $\chi$ is an abductive problem and $\eta$ is one of its abductive solutions, indicates that, at $s_1$, the lower path would have been possible: Incorporating $\eta$ to the agent’s information would have taken her to an epistemic state $s'_2$ where she would have been able to infer $\chi$. Of course, $s'_3$ is not identical to $s_3$: In $s'_3$ both $\eta$ and $\chi$ are equally reliable because the second is inferred from the first, but in $s_3$, $\eta$ is less reliable than $\chi$ since although the second is obtained via an observation, the first is just a hypothesis that is subject to revision in the light of further information.
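The intuitive definitions of abductive problem and abductive solution given in this section can be illustrated with a short Python sketch. It is only a minimal illustration under strong assumptions: an epistemic state is reduced to a set of known atoms plus a set of known implications, "coming to know" means a single modus ponens step, and the atom names (`rain`, `sprinkler`, `wet`) and function names are hypothetical.

```python
# Minimal sketch (not the chapter's formal semantics): an epistemic state is a
# pair (known atoms, known implications); inference is one modus ponens step.

def can_infer(atoms, implications, goal):
    """True if some known implication eta -> goal has a known antecedent eta."""
    return any(cons == goal and ante in atoms for ante, cons in implications)

def is_abductive_problem(s1, s2, chi):
    """chi is known at s2 but could not be inferred at the previous state s1."""
    atoms1, imps1 = s1
    atoms2, _ = s2
    return chi in atoms2 and not can_infer(atoms1, imps1, chi)

def abductive_solutions(s1, chi):
    """Antecedents of the implications known at s1 whose consequent is chi."""
    _, imps1 = s1
    return {ante for ante, cons in imps1 if cons == chi}

# Hypothetical example: before the observation the agent knows two implications.
implications = {("rain", "wet"), ("sprinkler", "wet")}
s1 = (set(), implications)       # state before the observation
s2 = ({"wet"}, implications)     # state after observing "wet"

problem = is_abductive_problem(s1, s2, "wet")
solutions = abductive_solutions(s1, "wet")
print(problem, sorted(solutions))
```

If the agent had already known `rain` at the first state, `wet` would have been predictable and therefore would not count as an abductive problem.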
there is a world at least as plausible (as the current one) where $\varphi$ holds, and those of the form $\langle\sim\rangle\varphi$ are read as there is a world epistemically indistinguishable (from the current one) where $\varphi$ holds. Other Boolean connectives ($\land$, $\to$, $\leftrightarrow$) as well as the universal modalities, $[\leq]$ and $[\sim]$, are defined as usual ($[\leq]\varphi := \lnot\langle\leq\rangle\lnot\varphi$ and $[\sim]\varphi := \lnot\langle\sim\rangle\lnot\varphi$ for the latter).

The modalities $\langle\leq\rangle$ and $\langle\sim\rangle$, respectively, make it possible to define the notions of belief and knowledge within $\mathcal{L}$. The language’s semantic model, a plausibility model, is defined as follows.

Definition 13.5 Plausibility model
Let $P$ be a set of atomic propositions. A plausibility [...]

confused with the equal plausibility relation, denoted by $\simeq$, and defined as the intersection of $\leq$ and $\geq$, that is, $\simeq \;:=\; \leq \cap \geq$. For further details and discussion on these models, their requirements and their properties, the reader is referred to [13.34, 35].

Example 13.1
The following diagram represents a plausibility model $M$ based on the atomic propositions $P := \{l, e, b\}$. Circles represent possible worlds (named $w_1$ up to $w_5$), and each one of them includes exactly the atomic propositions that are true at that world (e.g., at $w_2$, the atomic propositions $l$ and $e$ are true, but $b$ is false). Arrows represent the plausibility relation, with transitive arcs omitted (so $w_4 \leq w_5 \leq w_2 \leq w_1 \leq w_3$, but also $w_4$ [...]

are interpreted as usual. For the remaining cases,

$$(M, w) \Vdash \langle\leq\rangle\varphi \quad\text{iff}\quad \exists u \in W \text{ s.t. } w \leq u \text{ and } (M, u) \Vdash \varphi$$
$$(M, w) \Vdash \langle\sim\rangle\varphi \quad\text{iff}\quad \exists u \in W \text{ s.t. } w \sim u \text{ and } (M, u) \Vdash \varphi .$$

epistemically indistinguishable world) nor believed (it is not true in the most plausible worlds) at $w_2$. Still, it is true in some epistemic possibilities from $w_2$ (e.g., $w_5$); hence, $\langle\sim\rangle b$ (i.e., $\hat{K}b$) holds at $w_2$: At that world, the agent considers $b$ possible.
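The reading of knowledge, epistemic possibility, and belief used in the example above can be checked mechanically. The following Python sketch encodes the model of Example 13.1 as a chain of worlds. It is an illustration, not the chapter's formal definition: the valuation of $w_3$ is not stated in the text and is assumed here to be empty, formulas are restricted to predicates over worlds, and all function names are hypothetical.

```python
# Sketch of Example 13.1 as a single-agent plausibility model.
# ASSUMPTION: the valuation of w3 (no atom true) is not given in the text.
VAL = {
    "w1": {"l"},
    "w2": {"l", "e"},
    "w3": set(),             # assumed
    "w4": {"l", "b"},
    "w5": {"l", "e", "b"},
}
CHAIN = ["w4", "w5", "w2", "w1", "w3"]   # w4 <= w5 <= w2 <= w1 <= w3

# Reflexive-transitive closure of the chain: (w, u) means w <= u,
# that is, u is at least as plausible as w.
LEQ = {(CHAIN[i], CHAIN[j])
       for i in range(len(CHAIN)) for j in range(i, len(CHAIN))}

def cell(leq, w):
    """Worlds comparable with w (the relation ~, the union of <= and >=)."""
    return {u for a, u in leq if a == w} | {a for a, u in leq if u == w}

def knows(leq, w, prop):
    """K: prop holds in every epistemically indistinguishable world."""
    return all(prop(u) for u in cell(leq, w))

def considers_possible(leq, w, prop):
    """K-hat: prop holds in some epistemically indistinguishable world."""
    return any(prop(u) for u in cell(leq, w))

def believes(leq, w, prop):
    """B: prop holds in the most plausible indistinguishable worlds."""
    worlds = cell(leq, w)
    best = {u for u in worlds if all((v, u) in leq for v in worlds)}
    return all(prop(u) for u in best)

b = lambda u: "b" in VAL[u]
not_l = lambda u: "l" not in VAL[u]

print(knows(LEQ, "w2", b), believes(LEQ, "w2", b), considers_possible(LEQ, "w2", b))
print(believes(LEQ, "w2", not_l))
```

Under the stated assumption about $w_3$, the sketch agrees with the text: at $w_2$ the formula $b$ is neither known nor believed, yet it is considered possible.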
will survive the operation) and, after the update, $\varphi$ is the case. The universal modality $[\psi!]$ is defined as the modal dual of $\langle\psi!\rangle$, that is, $[\psi!]\varphi := \lnot\langle\psi!\rangle\lnot\varphi$.

In addition to being the most natural operation over Kripke-like models, an update also has a straightforward epistemic interpretation: it works as an act of public announcement [13.38, 39] or, as it will be called here, an act of observation. When the agent observes a given $\psi$, she can discard those epistemically possible worlds that fail to satisfy this formula, thereby obtaining a model with only worlds that satisfied $\psi$ before the operation. More details on this operation and its modalities (including an axiom system) can be found in the papers [13.38, 39] or in the textbooks [13.18, 19].

Example 13.3
Consider the model $M$ in Example 13.1 again. Suppose the agent observes $l$; this can be modeled as an update with $l$, which yields the following model $M_{l!}$:

[Diagram: $w_4$ $\{l, b\}$ $\leq$ $w_5$ $\{l, e, b\}$ $\leq$ $w_2$ $\{l, e\}$ $\leq$ $w_1$ $\{l\}$]

The most plausible world in $M$ has been discarded in $M_{l!}$. As explained in Example 13.2, the agent believes $\lnot l$ in $M$, but after the observation this is not the case anymore: $\lnot l$ does not hold in the unique most plausible world of the new model $M_{l!}$. In fact, $\lnot l$ does not hold in any epistemically possible world, and thus after the observation the agent knows $l$; in symbols

$$(M_{l!}, w_2) \Vdash K\,l, \quad\text{that is,}\quad (M, w_2) \Vdash [l!]\,K\,l .$$

Upgrade, Also Known as Belief Revision
Another natural operation over plausibility-like models is the rearrangement of worlds within an epistemic partition. Of course, there are several ways in which a new order can be defined. The following rearrangement, taken from [13.40], is one of the many possibilities.

Definition 13.8 Upgrade operation
Let the tuple $M = \langle W, \leq, V\rangle$ be a plausibility model and let $\psi$ be a formula in $\mathcal{L}$. The upgrade operation produces the plausibility model $M_{\psi\Uparrow} = \langle W, \leq', V\rangle$, which differs from $M$ just in the plausibility order, given now by

$$\leq' \;:=\; \{(w,u) \mid w \leq u \text{ and } (M,u) \Vdash \psi\} \;\cup\; \{(w,u) \mid w \leq u \text{ and } (M,w) \Vdash \lnot\psi\} \;\cup\; \{(w,u) \mid w \sim u,\ (M,w) \Vdash \lnot\psi \text{ and } (M,u) \Vdash \psi\} .$$

The new plausibility relation states that after an upgrade with $\psi$, all $\psi$-worlds become more plausible than all $\lnot\psi$-worlds, and within the two zones the old ordering remains [13.40]. More precisely, a world $u$ will be at least as plausible as a world $w$, $w \leq' u$, if and only if they already are of that order and $u$ satisfies $\psi$, or they already are of that order and $w$ satisfies $\lnot\psi$, or they are comparable, $w$ satisfies $\lnot\psi$ and $u$ satisfies $\psi$. This operation preserves the properties of the plausibility relation and hence preserves plausibility models, as shown in [13.35].

In order to describe effects of this operation within the language, an existential modality $\langle\psi\Uparrow\rangle$ is introduced for every formula $\psi$,

$$(M, w) \Vdash \langle\psi\Uparrow\rangle\varphi \quad\text{iff}\quad (M_{\psi\Uparrow}, w) \Vdash \varphi .$$

In words, an upgrade formula $\langle\psi\Uparrow\rangle\varphi$ holds at $(M, w)$ if and only if $\varphi$ is the case after an upgrade with $\psi$. The universal modality $[\psi\Uparrow]$ is defined as the modal dual of $\langle\psi\Uparrow\rangle$, as in the update case.

This operation also has a very natural epistemic interpretation. The plausibility relation defines the agent’s beliefs, and hence any changes in the relation can be interpreted as changes in the agent’s beliefs [13.34, 40, 41]. In particular, an act of revising beliefs after a reliable and yet fallible source has suggested $\psi$ can be represented by an operation that puts $\psi$-worlds at the top of the plausibility order. Moreover, each one of the different methods to obtain a relation with former $\psi$-worlds at the top can be seen as a different policy for revising beliefs. Details on the operation and its modalities (including an axiom system) can be found in the papers [13.34, 40] or in the textbook [13.19].

Example 13.4
Consider the model $M_{l!}$ in Example 13.3, that is, the model that results from the agent observing $l$ at the initial model $M$ in Example 13.1. Suppose the agent performs a belief revision toward $b$; this can be modeled as an upgrade with $b$, which yields the following model $(M_{l!})_{b\Uparrow}$:

[Diagram: $w_2$ $\{l, e\}$ $\leq$ $w_1$ $\{l\}$ $\leq$ $w_4$ $\{l, b\}$ $\leq$ $w_5$ $\{l, e, b\}$]

The ordering of the worlds has changed, making those worlds that satisfy $b$ ($w_4$ and $w_5$) more plausible than
those that do not ($w_2$ and $w_1$), keeping the old ordering within these two zones ($w_5$ strictly above $w_4$ and $w_1$ strictly above $w_2$). In $M_{l!}$ the agent believed $\lnot b \land \lnot e$, as such formula was the case in the model’s unique most plausible world $w_1$, but this is not the case anymore in $(M_{l!})_{b\Uparrow}$: The unique most plausible world, $w_5$, satisfies $b \land e$, and thus the formula is part of the agent’s beliefs. In symbols,

$$((M_{l!})_{b\Uparrow}, w_2) \Vdash B(b \land e), \quad\text{that is,}\quad (M_{l!}, w_2) \Vdash [b\Uparrow]\,B(b \land e) .$$
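The update and upgrade operations, and the two examples that apply them, can be replayed with a small Python sketch. It is a simplified illustration rather than a general implementation: formulas are restricted to propositional ones (given as predicates over a world's set of true atoms), there is a single epistemic partition, the valuation of $w_3$ is an assumption (the text does not state it), and the function names are hypothetical.

```python
# Sketch of update (observation) and upgrade (belief revision) for
# propositional formulas, given as predicates over a world's set of atoms.
# ASSUMPTION: w3's valuation is not stated in the text; it is taken to be empty.
VAL = {"w1": {"l"}, "w2": {"l", "e"}, "w3": set(),
       "w4": {"l", "b"}, "w5": {"l", "e", "b"}}
CHAIN = ["w4", "w5", "w2", "w1", "w3"]   # least to most plausible
LEQ = {(CHAIN[i], CHAIN[j])
       for i in range(len(CHAIN)) for j in range(i, len(CHAIN))}

def update(val, leq, psi):
    """Keep only the worlds satisfying psi, restricting the order accordingly."""
    keep = {w for w in val if psi(val[w])}
    return ({w: val[w] for w in keep},
            {(w, u) for w, u in leq if w in keep and u in keep})

def upgrade(val, leq, psi):
    """Upgrade of Definition 13.8: psi-worlds move above the other worlds,
    and each of the two zones keeps its old order."""
    sim = leq | {(u, w) for w, u in leq}          # comparability relation ~
    new = {(w, u) for w, u in leq if psi(val[u])}
    new |= {(w, u) for w, u in leq if not psi(val[w])}
    new |= {(w, u) for w, u in sim if not psi(val[w]) and psi(val[u])}
    return val, new

def ranking(val, leq):
    """Worlds from least to most plausible (assumes a total order, no ties)."""
    return sorted(val, key=lambda u: sum((v, u) in leq for v in val))

l = lambda atoms: "l" in atoms
b = lambda atoms: "b" in atoms

val_l, leq_l = update(VAL, LEQ, l)           # Example 13.3: observing l
val_lb, leq_lb = upgrade(val_l, leq_l, b)    # Example 13.4: revising toward b

print(ranking(val_l, leq_l))
print(ranking(val_lb, leq_lb))
```

The two rankings reproduce the orderings described in Examples 13.3 and 13.4: after the update the most plausible world is $w_1$, and after the upgrade it is $w_5$, which satisfies $b \land e$.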
results from observing a given $\psi$ at $(M, w)$.
A formula $\chi$ is an abductive problem at $(M_{\psi!}, w)$ if and only if it is known at such stage but it was not known before, that is, if and only if

$$(M_{\psi!}, w) \Vdash K\chi \quad\text{and}\quad (M, w) \Vdash \lnot K\chi .$$

Equivalently, a formula $\chi$ can become an abductive problem at $(M, w)$ if and only if it is not known at such stage but will be known after observing $\psi$, that is, if and only if

$$(M, w) \Vdash \lnot K\chi \land [\psi!]K\chi .$$

Note again how the definition of an abductive problem is relative to an agent’s information at some given stage (the one represented by the pointed model).

There are two points worth emphasizing. First, note again how the definition distinguishes between the formula that becomes the abductive problem, $\chi$, and the formula whose observation triggers the abductive problem, $\psi$. Although these two formulas are typically understood to be the same ($\chi$ becomes an abductive problem after being observed), the choice in this contribution is to distinguish between them. One reason for this is technical: Here the idea is that the agent will look for explanations of formulas that she could not have known before the observation but knows afterward. However, stating this as the agent knows $\chi$ after observing it is restrictive in the DEL setting as not every formula satisfies this condition. This is because the underlying EL framework is powerful enough to talk [...] and the one that becomes an abductive problem are unrelated: In order for the agent to know $\chi$ after observing $\psi$, she must have known $\psi \to [\psi!]\chi$ before the action. This is nothing but the reduction axiom for the knowledge modality in public announcement logic

$$[\psi!]K\chi \leftrightarrow (\psi \to K(\psi \to [\psi!]\chi)) .$$

Second, the requirements Definition 13.9 asks for $\chi$ to be an abductive problem are not exactly the ones stated in the previous section: The sentence there is no implication $\eta \to \chi$ such that, before $\chi$ became an abductive problem, the agent knew both the implication and its antecedent has been replaced by the agent did not know $\chi$ before $\chi$ became an abductive problem. The reason is that, in DEL, the agent’s knowledge and beliefs are closed under logical consequence (still, small variations of the EL framework allow the representation of nonideal agents and their abductive reasoning; see Sect. 13.8), and in such setting the two statements are equivalent: If there is an $\eta$ such that the agent knew $\eta \to \chi$ and $\eta$ before $\chi$ became an abductive problem, then clearly she knew $\chi$ too, and if she knew $\chi$, then there was an $\eta$ such that $\eta \to \chi$ and $\eta$ were both known, namely $\chi$ itself. In fact, the restatement of the requirement emphasizes that it is the observation of $\psi$ that causes the agent to know $\chi$ and hence creates the abductive problem.

It is worthwhile to highlight how, although the definition of an abductive problem was given in terms of the agent’s knowledge, it can also be given in terms of her beliefs: It also makes sense for her to look for explanations of what she has come to believe!
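The two equivalent formulations of an abductive problem given above can be compared in a small Python sketch. It works under simplifying assumptions that are not part of the chapter: formulas are propositional (predicates over a world's set of atoms), there is a single epistemic partition so that knowledge is truth in every world, the observed formula is true at the actual world, the worlds are those of Example 13.1 with an assumed empty valuation for $w_3$, and the function names are hypothetical.

```python
# Sketch of the definition of an abductive problem for propositional formulas
# and a single epistemic partition (knowledge = truth in every world).
# ASSUMPTION: worlds follow Example 13.1, with w3's valuation taken to be empty.
WORLDS = {"w1": {"l"}, "w2": {"l", "e"}, "w3": set(),
          "w4": {"l", "b"}, "w5": {"l", "e", "b"}}

def knows(worlds, prop):
    return all(prop(atoms) for atoms in worlds.values())

def observe(worlds, psi):
    """Update: discard the worlds where psi fails."""
    return {w: atoms for w, atoms in worlds.items() if psi(atoms)}

def is_abductive_problem(worlds, psi, chi):
    """chi is known after observing psi, but was not known before."""
    return knows(observe(worlds, psi), chi) and not knows(worlds, chi)

def can_become_problem(worlds, psi, chi):
    """Equivalent form evaluated before the observation.
    For propositional chi, [psi!]chi reduces to psi -> chi, so the reduction
    axiom turns [psi!]K chi into psi -> K(psi -> chi); psi is assumed true at
    the evaluation world because it is actually observed."""
    return (not knows(worlds, chi)
            and knows(worlds, lambda a: (not psi(a)) or chi(a)))

l = lambda atoms: "l" in atoms
e = lambda atoms: "e" in atoms
taut = lambda atoms: True

print(is_abductive_problem(WORLDS, l, l))      # observing l, explaining l
print(is_abductive_problem(WORLDS, e, l))      # observing e, explaining l
print(is_abductive_problem(WORLDS, l, taut))   # something already known
```

The second call illustrates why the observed formula and the problematic formula are kept apart: observing $e$ makes $l$ known, so $l$ becomes an abductive problem although it was not the formula observed.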
before the action.

13.4.2 Classifying Problems

As mentioned, some approaches classify an abductive problem $\chi$ according to whether $\chi$ or $\lnot\chi$ follows from the theory: If neither $\chi$ nor $\lnot\chi$ follows, then $\chi$ is called a novel abductive problem; if $\chi$ does not follow but $\lnot\chi$ does, then $\chi$ is called an anomalous abductive problem. Given the requirement the agent did not know $\chi$ before $\chi$ became an abductive problem ($\lnot K\chi$) in Definition 13.9, one could suggest the agent knew $\lnot\chi$ ($K\lnot\chi$) as an alternative, but since the definition also asks for $\chi$ to be known after the observation in order to be an abductive problem, such suggestion turns out to be too strong for propositional formulas: If $\lnot\chi$ is propositional and the agent knows it at some stage, then every epistemic possibility satisfies $\lnot\chi$. Thus, since no epistemic action can change the (propositional) formula’s truth value, the only way for the agent to know $\chi$ afterward is for the action to eliminate every epistemic possibility, making $K\varphi$ true for every formula $\varphi$ and thus turning the agent inconsistent. But even though it is not possible to classify abductive problems in terms of the knowledge the agent had about the formula before the observation, it is still possible (and more reasonable) to classify them by using weaker notions, such as beliefs. Here is one possibility.

Definition 13.10 Expected, novel and anomalous problems
Suppose $\chi$ is an abductive problem at $(M_{\psi!}, w)$. Then $\chi$ is said to be: [...]

Definition 13.11 Abductive solution
Let $(M, w)$ be a pointed plausibility model, and consider $(M_{\psi!}, w)$, the pointed plausibility model that results from observing $\psi$ at $(M, w)$.
If at $(M_{\psi!}, w)$ the formula $\chi$ is an abductive problem, then $\eta$ is an abductive solution if and only if the agent knew that $\eta$ implied $\chi$ before the observation, that is, if and only if

$$(M, w) \Vdash K(\eta \to \chi) .$$

Equivalently, if at $(M, w)$ the formula $\chi$ can become an abductive problem, then $\eta$ will be an abductive solution if and only if the agent knows that $\eta$ implies $\chi$, that is, if and only if

$$(M, w) \Vdash K(\eta \to \chi) .$$

Just as in the case of abductive problem, it is also possible to define an abductive solution in terms of weaker notions such as beliefs. For example, while a very strict agent would accept $\eta$ as explanation only when $\eta \to \chi$ was known, a less strict agent could accept it when such implication was only believed.

It is worth emphasizing that, in the stated definition, a solution for a problem $\chi$ (at some $M_{\psi!}$) is a formula $\eta$ such that $\eta \to \chi$ is known not when the abductive problem has arisen (at $M_{\psi!}$) but rather at the stage immediately before (at $M$). This is because an explanation is a piece of information that would have allowed the agent to predict the surprising observation. In fact, if
an abductive solution for a problem $\chi$ were defined as a formula $\eta$ such that $\eta \to \chi$ is known once $\chi$ is an abductive problem (at $M_{\psi!}$), then every formula $\varphi$ would be a solution since (at $M_{\psi!}$) $K\chi$ would be the case (because $\chi$ is an abductive problem) and hence so would be $K(\varphi \to \chi)$ for every formula $\varphi$.

Observe also how, again in the stated definition, if $\eta$ is a solution for the abductive problem $\chi$ (at some $M_{\psi!}$), then $\eta$ could not be known before the observation that triggered the problem (at $M$). Otherwise, both $K(\eta \to \chi)$ and $K\eta$ would be the case at such stage ($M$) and hence, by the closure under logical consequence of knowledge in EL, so would be $K\chi$, contradicting the fact that $\chi$ is an abductive problem.

Proposition 13.1
Let $\chi$ be an abductive problem and $\eta$ be one of its abductive solutions, both at $(M_{\psi!}, w)$. Then, $(M, w) \Vdash \lnot K\eta$.

13.4.4 Classifying Solutions

It is common in the literature to classify abductive solutions according to their properties (Chap. 10). For example (Definitions 13.2 and 13.3; again, see Chap. 10), given a surprising observation $\chi$, an abductive solution $\eta$ is said to be:

• Plain when it is a solution
• Consistent when it does not contradict the agent’s information
• Explanatory when it does not explain $\chi$ by itself.

Similar properties can be described in the present setting. To begin with, the plain property simply states that $\eta$ is an abductive solution; a definition that has been already provided (Definition 13.11).

For the consistency property, the intuitive idea is for the solution to be compatible with the agent’s information. To this end, consider the following definition.

Definition 13.12 Consistent solution
Let $\chi$ be an abductive problem and $\eta$ be one of its abductive solutions, both at $(M_{\psi!}, w)$. It is said that $\eta$ is a consistent solution if and only if the agent considers it possible at $(M_{\psi!}, w)$, that is, if and only if

$$(M_{\psi!}, w) \Vdash \hat{K}\eta .$$

Thus, a solution is consistent when it is epistemically possible. Note how this requirement is given in terms of the agent’s information after the epistemic action that triggered the abductive problem, and not before it. In fact, there are formulas that, in a given situation, are solutions according to the stated definition, and yet not epistemically possible once the abductive problem has been raised.

Fact 13.1
Not every abductive solution is consistent.

Proof: Let $\eta$ and $\chi$ be propositional formulas, and take a model $M$ in which the agent considers at least one $(\lnot\eta \land \lnot\chi)$-world to be epistemically possible, with the rest of the epistemic possibilities being $(\lnot\eta \land \chi)$-worlds. After observing $\chi$, $\lnot\chi$-worlds will be discarded and there will be only $(\lnot\eta \land \chi)$-worlds left, thus making $\chi$ itself an abductive problem (it is not known at $M$ but it will be known at $M_{\chi!}$) and $\eta$ an abductive solution (every epistemic possibility at $M$ satisfies $\eta \to \chi$, so the agent knows this implication). Nevertheless, there are no $\eta$-worlds at $M_{\chi!}$, and therefore $\hat{K}\eta$ is false at such stage.

The explanatory property is interesting. The idea in the classic setting is to avoid solutions that imply the problematic $\chi$ per se, such as $\chi$ itself or any formula logically equivalent to it. In the current epistemic setting, this idea can be understood in a different way: A solution $\eta$ is explanatory when the acceptance of $\eta$ (which, as discussed, will be modeled via belief revision; see Sect. 13.6) changes the agent’s information, that is, when the agent’s information is different from $(M_{\psi!}, w)$ to $((M_{\psi!})_{\eta\Uparrow}, w)$ (the model that results after integrating the solution $\eta$). This assertion could be formalized by stating that the agent’s information is the same in two pointed models if and only if the agent has the same knowledge in both, but this would be insufficient: The model operation representing an act of belief revision (the upgrade of Definition 13.8) is devised to change only the agent’s beliefs (although certain knowledge, such as knowledge about beliefs, might also change). A second attempt would be to state that the agent’s information is the same in two pointed models if and only if they coincide in the agent’s knowledge and beliefs, but the mentioned operation can change a model without changing the agent’s beliefs.

Within the current modal epistemic logic framework, a more natural way of specifying the idea of an agent having the same information in two models is via the notion of bisimulation.

Definition 13.13 Bisimulation
Let $P$ be a set of atomic propositions and let $M = \langle W, \leq, V\rangle$ and $M' = \langle W', \leq', V'\rangle$ be two plausibility models based on this set. A nonempty relation $Z \subseteq (W \times W')$ is called a bisimulation between $M$ and $M'$ (notation: $M \,\underline{\leftrightarrow}_Z\, M'$) if and only if, for every $(w, w') \in Z$:
• $V(w) = V'(w')$, that is, $w$ and $w'$ satisfy the same atomic propositions
• If there is a $u \in W$ such that $w \leq u$, then there is a $u' \in W'$ such that $w' \leq' u'$ and $Zuu'$
• If there is a $u' \in W'$ such that $w' \leq' u'$, then there is a $u \in W$ such that $w \leq u$ and $Zuu'$.

Two models $M$ and $M'$ are bisimilar (notation: $M \,\underline{\leftrightarrow}\, M'$) when there is a bisimulation between them, and two pointed models $(M, w)$ and $(M', w')$ are bisimilar (notation: $(M, w) \,\underline{\leftrightarrow}\, (M', w')$) when there is a bisimulation between $M$ and $M'$ containing the pair $(w, w')$.

This notion is significant because, under image-finiteness (a plausibility model is image-finite if and only if every world can $\leq$-see only a finite number of worlds), it characterizes modal equivalence, that is, it characterizes models that satisfy exactly the same formulas in the modal language.

Theorem 13.1
Let $P$ be a set of atomic propositions and let $M = \langle W, \leq, V\rangle$ and $M' = \langle W', \leq', V'\rangle$ be two image-finite plausibility models. Then $(M, w) \,\underline{\leftrightarrow}\, (M', w')$ if and only if, for every formula $\varphi \in \mathcal{L}$, $(M, w) \Vdash \varphi$ iff $(M', w') \Vdash \varphi$.

Now it is possible to state a formal definition of what it means for a solution to be explanatory.

Definition 13.14 Explanatory solution
Let $\chi$ be an abductive problem and $\eta$ be one of its abductive solutions, both at $(M_{\psi!}, w)$. It is said that $\eta$ is an explanatory solution if and only if its acceptance changes the agent’s information, that is, if and only if there is no bisimulation between $(M_{\psi!}, w)$ and $((M_{\psi!})_{\eta\Uparrow}, w)$.

This definition, devised in order to avoid solutions that explain the abductive problem per se, has pleasant side effects. In the abductive reasoning literature, a solution is called trivial when it is logically equivalent to the abductive problem $\chi$ (i.e., when it is not explanatory) or when it is a contradiction (to the agent’s knowledge, or a logical contradiction). Under the given definition, every trivial solution is not explanatory: Accepting any such solution $\eta$ will not change the agent’s information. The reason is that, in both cases, the upgrade operation will not make any change in the model: In the first case because, after the observation, the agent knows the abductive problem formula, and hence every epistemically possible world satisfies it (as well as every formula logically equivalent to the problem); in the second case because no epistemically possible world satisfies it. In this way, this framework characterizes trivial solutions not in terms of their form, as is typically done, but rather in terms of their effect: Accepting them will not give the agent any new information.

In particular, this shows how the act of incorporating a contradictory explanation will not make the agent collapse and turn into someone that knows and believes everything, as happens in traditional approaches; thus, a logic of formal inconsistency (e.g., [13.43]; see also Chap. 15) is not strictly necessary. This is a consequence of two simple but powerful ideas:

1. Distinguishing an agent’s different epistemic attitudes
2. Assimilating an abductive solution not as knowledge, but rather as a belief.
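The construction used in the proof of Fact 13.1 and the behavior of a non-explanatory solution can be replayed in a short Python sketch. It is an illustration under stated assumptions: formulas are propositional predicates over a world's set of atoms, the two-world model and the atom names `eta` and `chi` are hypothetical, and the final check only uses a sufficient condition for bisimilarity (an upgrade that leaves the model unchanged yields an identical, hence bisimilar, model) instead of a general bisimulation test.

```python
# Sketch of Fact 13.1 and of a non-explanatory solution.
# ASSUMPTION: the atoms "eta" and "chi" and the two-world model are illustrative.
VAL = {"u1": set(),          # a (not eta, not chi)-world
       "u2": {"chi"}}        # a (not eta, chi)-world
LEQ = {("u1", "u1"), ("u1", "u2"), ("u2", "u2")}   # (w, u) means w <= u

def knows(val, prop):
    return all(prop(atoms) for atoms in val.values())

def considers_possible(val, prop):
    return any(prop(atoms) for atoms in val.values())

def observe(val, leq, psi):
    """Update: keep only the worlds satisfying psi."""
    keep = {w for w in val if psi(val[w])}
    return ({w: val[w] for w in keep},
            {(w, u) for w, u in leq if w in keep and u in keep})

def upgrade(val, leq, psi):
    """Upgrade of Definition 13.8, for a propositional psi."""
    sim = leq | {(u, w) for w, u in leq}
    new = {(w, u) for w, u in leq if psi(val[u])}
    new |= {(w, u) for w, u in leq if not psi(val[w])}
    new |= {(w, u) for w, u in sim if not psi(val[w]) and psi(val[u])}
    return val, new

eta = lambda atoms: "eta" in atoms
chi = lambda atoms: "chi" in atoms
eta_implies_chi = lambda atoms: (not eta(atoms)) or chi(atoms)

val_obs, leq_obs = observe(VAL, LEQ, chi)                    # observing chi itself

is_problem = knows(val_obs, chi) and not knows(VAL, chi)     # Definition 13.9
is_solution = knows(VAL, eta_implies_chi)                    # Definition 13.11
is_consistent = considers_possible(val_obs, eta)             # Definition 13.12

# Accepting eta: if the upgrade changes nothing, the models are identical and
# therefore bisimilar, so eta is not an explanatory solution.
_, leq_acc = upgrade(val_obs, leq_obs, eta)
model_unchanged = leq_acc == leq_obs

print(is_problem, is_solution, is_consistent, model_unchanged)
```

In this model $\eta$ is a solution only vacuously (no world satisfies it), which is exactly why it is neither consistent nor explanatory.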
start with a set of abducible predicates and select explanations built only from ground atoms using them (see Chap. 10 for more details on abductive logic programming).

There are also approaches that use logical criteria, but beyond the already mentioned requisites to avoid triviality, the definition of suitable criteria is still an open problem. One of the most pursued ideas is that of minimality, a concept that can be understood syntactically (e.g., [13.3] and [13.5] look for literals), semantically (a minimal explanation is equivalent to any other explanation it implies [13.1]), with respect to the set of possible explanations (the best explanation is the weakest, i.e., the one that is implied by the rest of them), and even with respect to the current information (the best explanation is the one that disrupts the current information the least).

In fact, most logical criteria are based on restrictions on the logical form of the solutions but, as mentioned in [13.1], finer criteria to select between two equally valid solutions require contextual aspects. With this idea in mind some approaches have proposed to use an ordering among formulas [13.10, 50, 51] or among full theories (i.e., possible worlds [13.52, 53]). In particular, for the latter, a common option is the use of preferential models (e.g., [13.54]) in which preferential criteria for selecting the best explanation are regarded as qualitative properties that are beyond the pure causal or deductive relationship between the abductive problem and its abductive solution. But these preference criteria are normally treated as an external device, which works on top of the logical or deductive part of the explanatory mechanism, and thus they have been criticized because they seem to fall outside a logical framework.

The epistemic approach of this proposal provides an interesting alternative. The concepts of an abductive problem and an abductive solution have been defined in terms of the agent’s epistemic attitudes, so it is natural to use such attitudes as a criterion for selecting the best explanation. Consider, for instance, the following elaboration of an example presented in Chap. 10.

“Mary and Gaby arrive late to Mary’s apartment; the light switch is pressed but the light does not turn on. Knowing that the apartment is old, Mary assumes a failure in the electric line as the explanation for the light not turning on. Gaby, on the other hand, does not have any information about the apartment, so she explains the light not turning on by assuming that the bulb is burned out.”

After pressing the switch, both Mary and Gaby observe that the light does not turn on. There are several explanations for this: It is possible that the electric line failed, as Mary assumed, but it can also be the case that the bulb is burned out, as Gaby thinks, and it is even possible that the switch is faulty. Then, why do they choose a different explanation? The reason is that, though they both observe that the light does not turn on, they have different background information: Mary knows that the apartment is old, and hence she considers a failure in the electric line more likely than any other explanation, but Gaby does not have that piece of information, so for her a burned out bulb explains the lack of light better.

The example shows that, even when facing the same surprising observation (the light does not turn on), agents with different knowledge and beliefs may choose a different best explanation: While Mary assumes that the electric line has failed, Gaby thinks that the bulb is burned out. Both explanations are equally logical since either a failure on the electric line or else a burned out bulb is enough to explain why the light does not turn on. What makes Mary choose the first and Gaby the second is that they have different knowledge and different beliefs. This suggests, first, that instead of looking for criteria to select the best explanation, the goal should be a criterion to select the agent’s best explanation.

But there is more. The explanation an agent will choose for a given abductive problem depends not only on how the problematic formula could have been predicted, but also on what the agent herself knows and what she considers more likely to be the case. It could be argued that this criterion is not logical in the classical sense because it is not based exclusively on the deductive relationship between the observed fact and the different ways in which it could have been derived. Nevertheless, it is logical in a broader sense since it does depend on the agent’s information: her knowledge and her beliefs. In particular, in the plausibility models framework, the agent’s knowledge and beliefs are defined in terms of a plausibility relation among epistemic possibilities, so it is natural to use precisely this relation as a criterion for selecting each agent’s best explanation(s).

This section presents a straightforward use of this idea. It discusses how the plausibility order among epistemic possibilities can be lifted to a plausibility order among formulas, thus providing a natural criterion to select the agent’s best explanation. A generalization of this idea that works instead with all explanations will be discussed later (Sect. 13.7).

13.5.1 Ordering Explanations

A plausibility model provides an ordering among possible worlds. This order can be lifted to get an ordering among sets of worlds, that is, an ordering among formulas of the language (with each formula seen as the set
of those worlds that make it true). The different ways in which such ordering can be defined have been studied in preference logic (see [13.55–57] or, for a more detailed exposition [13.58, Chap. 3.3]); this section recalls the main ideas, showing how they can be applied to the task of selecting the best explanation in abductive reasoning.

In general, an ordering among objects can be lifted to an ordering among sets of such objects in different ways. For example, one can say that the lifted ordering puts the set of objects satisfying the property $\psi$ (the set of $\psi$-objects) over the set of objects satisfying the property $\varphi$ (the set of $\varphi$-objects) when there is a $\psi$-object that the original ordering among objects places above some $\varphi$-object (a $\exists\exists$ preference of $\psi$ over $\varphi$; see below). But one can be more drastic and say that the set of $\psi$-objects is above the set of $\varphi$-ones when the original ordering places every $\psi$-object above every $\varphi$-one (a $\forall\forall$ preference of $\psi$ over $\varphi$). This quantification combination gives rise to the following possibilities:

$\varphi \leq_{\exists\exists} \psi$ iff there is a $\varphi$-object $w$ and there is a $\psi$-object $u$ such that $w \leq u$
$\varphi \leq_{\forall\exists} \psi$ iff for every $\varphi$-object $w$ there is a $\psi$-object $u$ such that $w \leq u$
$\varphi \leq_{\forall\forall} \psi$ iff $w \leq u$ for every $\varphi$-object $w$ and every $\psi$-object $u$
$\varphi \leq_{\exists\forall} \psi$ iff there is a $\varphi$-object $w$ such that $w \leq u$ for every $\psi$-object $u$

The first two orderings can be defined within the language $\mathcal{L}$:

$$\varphi \leq_{\exists\exists} \psi := \langle\sim\rangle(\varphi \wedge \langle\leq\rangle\psi)$$
$$\varphi \leq_{\forall\exists} \psi := [\sim](\varphi \to \langle\leq\rangle\psi)$$

The first formula indicates that there is a $\psi$-world that is at least as plausible as a $\varphi$-one, $\varphi \leq_{\exists\exists} \psi$, exactly when there is an epistemic possibility that satisfies $\varphi$ and that can see an at least as plausible $\psi$-world. The second one only changes the first quantification (turning, accordingly, the conjunction into an implication): For every $\varphi$-world there is a $\psi$-world that is at least as plausible.

The last two orderings are not immediate. Given the formulas for the previous two orderings, one could propose $[\sim](\varphi \to [\leq]\psi)$ for the $\forall\forall$ case, but this formula is not correct: It states that every world that is at least as plausible as any $\varphi$-world satisfies $\psi$, but it does not guarantee that every $\psi$-world is indeed above every $\varphi$-world:

1. There might be a $\psi$-world incomparable to some $\varphi$-one, and even if all worlds are comparable
2. There might be a $\psi$-world strictly below a $\varphi$-one ($<$, the strict version of $\leq$, is defined as $w < u$ if and only if $w \leq u$ and not $u \leq w$).

The plausibility order is locally connected (i. e., inside each epistemic partition, every world is comparable to each other) so (1) cannot occur. Thus, a formula defining $\leq_{\forall\forall}$ only needs to guarantee that no $\psi$-world is strictly below a $\varphi$-one; in other words, it needs to express that, given any $\psi$-world, every world that is strictly more plausible satisfies $\neg\varphi$. Such a formula can be easily stated in a language that extends $\mathcal{L}$ with a standard modality for the relation $<$:

$$\varphi \leq_{\forall\forall} \psi := [\sim](\psi \to [<]\neg\varphi)$$

Finally, the $\exists\forall$ ordering presents a similar situation. Following the first two cases one could propose $\langle\sim\rangle(\varphi \wedge [<]\psi)$, but such a formula is not appropriate, even in the current full-comparability case: It holds even when there are $\psi$-worlds below the chosen $\varphi$-one. In order to guarantee the existence of a $\varphi$-world that is at most as plausible as every $\psi$-world, the formula should state that every world that is strictly less plausible than the $\varphi$-world satisfies $\neg\psi$. Extending the language again, this time with a modality for $>$, makes such a formula straightforward:

$$\varphi \leq_{\exists\forall} \psi := \langle\sim\rangle(\varphi \wedge [>]\neg\psi)$$

All in all, the important fact is that among these four orderings on sets of worlds (i. e., formulas), two are definable within $\mathcal{L}$ and the other two only need simple extensions. This shows how the plausibility order among worlds that defines the agent’s knowledge and beliefs (Sect. 13.3.1) also defines plausibility orderings among formulas (sets of worlds), and hence provides a criterion for selecting the best abductive solution for a given agent. It will now be shown how this criterion can be used, and how it leads to situations in which agents with different knowledge and beliefs choose different best explanations.

Example 13.5
Recall Mary and Gaby’s example. Both observe that after pressing the switch the light does not turn on, but each one of them chooses a different explanation: While Mary assumes that the electric line failed, Gaby thinks that the bulb is burned out. As it has been argued, the reason why they choose different explanations is that they have different knowledge and beliefs. Here is a formalization of the situation.

The following plausibility models show Mary and Gaby’s knowledge and beliefs before pressing the
switch. They involve three atomic propositions: $l$ standing for lack of light, $e$ standing for a failure in the electric line and $b$ standing for a burned out bulb. Again, each possible world has indicated within it exactly those atomic propositions that are true in it, and the arrows represent the plausibility relation (transitive arrows are omitted).

[Figure: plausibility model Mary, with worlds $w_5$ {l, e, b}, $w_4$ {l, b}, $w_2$ {l, e}, $w_1$ {l} and $w_3$ {}.]

[...] fact, the only difference in the models is the plausibility order between worlds $w_2$ and $w_4$. Mary knows that the apartment is old so she considers a failure on the line ($e$) more likely than a burned out bulb ($b$), and hence the situation where the electric line fails but the bulb is not burned out ($w_2$) is more likely than its opposite ($w_4$). Gaby, on the other hand, does not know anything about the apartment, and hence for her a burned out bulb with a working electric line ($w_4$) is more plausible than a working bulb and a failing electric line ($w_2$). It is also assumed that, for both of them, the most likely possibility is the one in which everything works correctly ($w_1$) and the least plausible case is the one in which everything fails ($w_5$).

After they both observe that pressing the switch does not turn on the light, the unique world where $l$ is not the case, $w_3$, is eliminated, thus producing the following models.

[Figure: model $\mathit{Mary}_{l!}$, with worlds listed from least to most plausible: $w_5$ {l, e, b}, $w_4$ {l, b}, $w_2$ {l, e}, $w_1$ {l}.]
[Figure: model $\mathit{Gaby}_{l!}$, with worlds listed from least to most plausible: $w_5$ {l, e, b}, $w_2$ {l, e}, $w_4$ {l, b}, $w_1$ {l}.]

[...] implications that have $l$ as a consequent and that were known before the observation. So, how can each girl choose her own best explanation? For Mary, the unique ordering that puts $b$ above $e$ is the weakest one, $\exists\exists$ (there is a $b$-world, $w_4$, at least as plausible as an $e$-one, $w_5$). Nevertheless, from her point of view, $e$ is above $b$ not only in the weak $\exists\exists$ way ($w_2$ is at least as plausible as $w_4$) but also in the stronger $\forall\exists$ way (every $b$-world has an $e$-world that is at least as plausible as it). Thus, one can say that $e$ is a more plausible explanation from Mary’s perspective. In Gaby’s case something analogous happens: $b$ is above $e$ not only in the weak $\exists\exists$ way ($w_4$ is at least as plausible as $w_2$) but also in the strong $\forall\exists$ way. Hence, it can be said that, for Gaby, $b$ is the best explanation.
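The comparison made in Example 13.5 can be checked mechanically. The following is a minimal sketch, not part of the original chapter: it assumes that each model after the observation is a total order encoded as a list of world names from least to most plausible, that formulas are single atoms, and that the helper names (`worlds`, `leq`, `lift_ee`, `lift_ae`, `lift_aa`, `lift_ea`) are purely illustrative.

```python
# Sketch: lifting a plausibility order on worlds to orderings on formulas.
# Assumptions (not from the chapter): a model is a list of world names from
# least to most plausible, and a formula is a single atomic proposition.

VALUATION = {
    "w1": {"l"},
    "w2": {"l", "e"},
    "w4": {"l", "b"},
    "w5": {"l", "e", "b"},
}

# Models after the observation of l (Example 13.5), least plausible first.
MARY = ["w5", "w4", "w2", "w1"]
GABY = ["w5", "w2", "w4", "w1"]


def worlds(order, atom):
    """Worlds of the model in which the atom is true."""
    return [w for w in order if atom in VALUATION[w]]


def leq(order, w, u):
    """w <= u: world u is at least as plausible as world w."""
    return order.index(w) <= order.index(u)


def lift_ee(order, phi, psi):
    """Exists-exists: some psi-world is at least as plausible as some phi-world."""
    return any(leq(order, w, u)
               for w in worlds(order, phi) for u in worlds(order, psi))


def lift_ae(order, phi, psi):
    """Forall-exists: every phi-world has a psi-world at least as plausible."""
    return all(any(leq(order, w, u) for u in worlds(order, psi))
               for w in worlds(order, phi))


def lift_aa(order, phi, psi):
    """Forall-forall: every psi-world is at least as plausible as every phi-world."""
    return all(leq(order, w, u)
               for w in worlds(order, phi) for u in worlds(order, psi))


def lift_ea(order, phi, psi):
    """Exists-forall: some phi-world is at most as plausible as every psi-world."""
    return any(all(leq(order, w, u) for u in worlds(order, psi))
               for w in worlds(order, phi))


if __name__ == "__main__":
    for name, model in (("Mary", MARY), ("Gaby", GABY)):
        print(name,
              "| e above b (forall-exists):", lift_ae(model, "b", "e"),
              "| b above e (forall-exists):", lift_ae(model, "e", "b"))
```

Under these assumptions the sketch agrees with the example: in Mary’s model $e$ is above $b$ in the $\forall\exists$ sense but not conversely, and in Gaby’s model the roles of $e$ and $b$ are swapped.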
Equivalently, $\langle\mathrm{Abd}_{\eta}^{\chi}\rangle\varphi$’s semantic interpretation can be defined as

$$(M_{\psi!}, w) \Vdash \langle\mathrm{Abd}_{\eta}^{\chi}\rangle\varphi \quad\text{iff}\quad (M, w) \Vdash \neg K\chi \wedge K(\eta \to \chi) \wedge [\psi!](K\chi \wedge [\eta\Uparrow]\varphi) .$$

The definition states that $\langle\mathrm{Abd}_{\eta}^{\chi}\rangle\varphi$ is true at $(M_{\psi!}, w)$ if and only if:

1. $\chi$ is an abductive problem at $(M_{\psi!}, w)$
2. $\eta$ is an abductive solution also at $(M_{\psi!}, w)$
3. An upgrade (Definition 13.8) with $\eta$ will make $\varphi$ true.

The last part makes precise the idea of how an agent should incorporate the selected explanation: It cannot be incorporated as knowledge, but it can be incorporated as a belief.

Example 13.6
Returning to Example 13.5, once Mary and Gaby have selected their respective best explanation, they can perform an abductive step. In Mary’s case, worlds satisfying $e$ ($w_5$ and $w_2$) will become more plausible than worlds that do not satisfy it ($w_4$ and $w_1$); in Gaby’s case, worlds satisfying $b$ ($w_5$ and $w_4$) will become more plausible than worlds that do not satisfy it ($w_2$ and $w_1$). Applying these upgrades to the models $\mathit{Mary}_{l!}$ and $\mathit{Gaby}_{l!}$ produces the following models.

[Figure: model $(\mathit{Mary}_{l!})_{e\Uparrow}$, with worlds listed from least to most plausible: $w_4$ {l, b}, $w_1$ {l}, $w_5$ {l, e, b}, $w_2$ {l, e}.]

[...]

13.6.1 Abduction in a Picture, Once Again

The definitions that have been provided allow more precision in the diagram of abductive reasoning presented in Sect. 13.2.5. Here is the updated version for the case in which the definitions are given just in terms of the agent’s knowledge. Note how the inferring step has been dropped, as it is not needed in an omniscient setting such as DEL. Again, circles represent the agent’s epistemic states (i. e., full plausibility models) and arrows are labeled with the operations that modify the agent’s information.

[Diagram: epistemic states $s_1$, $s_2$, $s_3$ and $s'_2$, annotated with the formulas $K\chi$, $K(\eta \to \chi)$ and $K\eta$; arrows $s_1 \to s_2$ labeled $\psi!$, $s_2 \to s_3$ labeled $\mathrm{Abd}_{\eta}^{\chi}$, and $s_1 \to s'_2$ labeled $\eta!$.]

Again, the upper path represents what really happened. After observing $\psi$, the agent reaches the epistemic state $s_2$ in which she knows $\chi$. But before the observation, at $s_1$, she did not know $\chi$, and thus this formula is an abductive problem at $s_2$. Observe how $\eta \to \chi$ was known at $s_1$: hence, $\eta$ is an abductive solution at $s_2$ and the agent can perform an abductive step with it to reach state $s_3$. This abductive solution would have helped the agent to infer (and hence to come to know) $\chi$, and the lower path represents this alternative situation. In general, it cannot be guaranteed that the agent would have known $\chi$ (or even $\eta$) at state $s'_2$: these formulas could have had epistemic modalities, and hence the observation could have changed their truth value. However, if both formulas are propositional, $K\eta$ and $K\chi$ hold at $s'_2$.
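The upgrades of Example 13.6 can be reproduced with a few lines of code. This is only a sketch under stated assumptions: a model is again a list of worlds from least to most plausible (so the agent’s beliefs are read off the last world), and the names `upgrade` and `believed_atoms` are illustrative rather than taken from the chapter.

```python
# Sketch of the upgrade used in the abductive step (Example 13.6).
# Assumption: a model is a list of world names, least plausible first,
# so the unique most plausible world is the last element.

VALUATION = {
    "w1": {"l"},
    "w2": {"l", "e"},
    "w4": {"l", "b"},
    "w5": {"l", "e", "b"},
}

MARY_L = ["w5", "w4", "w2", "w1"]  # Mary after observing l
GABY_L = ["w5", "w2", "w4", "w1"]  # Gaby after observing l


def upgrade(order, holds):
    """Put the worlds satisfying `holds` above the others.

    Inside each of the two zones the old ordering is kept.
    """
    lower = [w for w in order if not holds(w)]
    upper = [w for w in order if holds(w)]
    return lower + upper


def believed_atoms(order):
    """Atoms true in the most plausible world of the model."""
    return VALUATION[order[-1]]


mary_e = upgrade(MARY_L, lambda w: "e" in VALUATION[w])
gaby_b = upgrade(GABY_L, lambda w: "b" in VALUATION[w])

if __name__ == "__main__":
    print("Mary:", mary_e, "believes", sorted(believed_atoms(mary_e)))
    print("Gaby:", gaby_b, "believes", sorted(believed_atoms(gaby_b)))
```

With this encoding Mary’s most plausible world becomes $w_2$, where $e$ holds, and Gaby’s becomes $w_4$, where $b$ holds, so each explanation ends up believed rather than known.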
13.6.2 Further Classification
Definition 13.16 Adequate solution and successful solution
Let the formula $\eta$ be an abductive solution for the abductive problem $\chi$ at $(M_{\psi!}, w)$. Then:

$\eta$ is an adequate solution if and only if the agent still knows $\eta \to \chi$ at $(M_{\psi!}, w)$, that is, if and only if
$$(M_{\psi!}, w) \Vdash K(\eta \to \chi) .$$
$\eta$ is a successful solution if and only if it is believed after the abductive step, that is, if and only if
$$(M_{\psi!}, w) \Vdash \langle\mathrm{Abd}_{\eta}^{\chi}\rangle B\eta .$$

Here is a result about the adequacy property.

Proposition 13.2
Every abductive solution is adequate.
Proof: More precisely, suppose that at $(M_{\psi!}, w)$ the formula $\chi$ is an abductive problem and $\eta$ is one of its abductive solutions. Since $\chi$ is an abductive problem, $(M_{\psi!}, w) \Vdash K\chi$ and hence $(M_{\psi!}, w) \Vdash K(\eta \to$ [...]

[...]

Observe how $q$ is an abductive problem at $M_{q!}$ since it is not known at $M$ (there is an epistemically possible world where $q$ fails, namely, $w_3$) but it is known at $M_{q!}$. Observe also how $p \wedge \neg Bp$ is an abductive solution since $K((p \wedge \neg Bp) \to q)$ holds at $M$ (it is true at $w_1$ and $w_2$ because $q$ is true in those worlds, and also true at $w_3$ because $p \wedge \neg Bp$ fails in this world). Furthermore, $p \wedge \neg Bp$ is a consistent solution since it is epistemically possible in $M_{q!}$ ($p$ and $\neg Bp$ are both true at $w_1$, the latter because there is a most plausible world, $w_2$, where $p$ is not the case, and hence the agent does not believe $p$). Nevertheless, after an upgrade with $p \wedge \neg Bp$ this very formula is not believed. It fails at the unique most plausible world $w_1$ because $\neg Bp$ fails at it: the most plausible world ($w_1$ itself) satisfies $p$ and hence the agent now believes $p$, that is, $Bp$ is the case.

Nevertheless, if a propositional solution is also consistent, then it is successful.

Proposition 13.3
Suppose that at $(M_{\psi!}, w)$ the formula $\eta$ is an abductive [...]
Neutral when $(M_{\psi!}, w) \Vdash \neg B\eta \wedge \neg B\neg\eta$.
Strongly explanatory when $(M_{\psi!}, w) \Vdash B\neg\eta$.

Again, there are more possibilities if further epistemic attitudes are considered.

13.6.3 Properties in a Picture

Consider an anomalous abductive problem $\chi$ (i. e., $B\neg\chi$ holds at $s_1$) whose abductive solution $\eta$ is consistent ($\widehat{K}\eta$ holds at $s_2$) and successful ($B\eta$ holds at $s_3$), recalling also that every solution is adequate (so $K(\eta \to \chi)$ holds at $s_2$). This extends the diagram of Sect. 13.6.1 in the following way.

[Diagram: the states $s_1$, $s_2$, $s_3$ and $s'_2$ of Sect. 13.6.1, with arrows $\psi!$, $\mathrm{Abd}_{\eta}^{\chi}$ and $\eta!$, now annotated with the formulas $K\chi$, $K\eta$, $K(\eta \to \chi)$, $B\chi$ and $B\eta$.]

Moreover, consider the case in which both $\eta$ and $\chi$ are propositional, the typical case in abductive reasoning in which the agent looks for explanations of facts, and not of her own (or, in a multiagent setting, of other agents’) epistemic state. First, in such case, $\eta$ should be an epistemic possibility not only at $s_2$ but also at $s_1$. But not only that; it is possible now to state the effects of the abductive step at $s_2$ (the agent will believe $\eta$ and will still know $\eta \to \chi$) and of the hypothetical announcement of $\eta$ at $s_1$ (she would have known both $\eta$ and $\chi$, and she would have still known $\eta \to \chi$). Therefore,

[Diagram: the same states and arrows, annotated with the formulas $K\chi$, $K\eta$, $K(\eta \to \chi)$, $B\chi$ and $B\eta$ for the propositional case.]

This diagram beautifully illustrates what lies behind this proposal’s understanding of abductive reasoning. In the propositional case, if $\eta$ is a consistent and successful abductive solution for the abductive problem $\chi$, then, after abductive reasoning, the agent will know $\chi$ and will believe $\eta$. In fact, when the observed formula is actually the same that becomes an abductive problem, the epistemic effect of abductive reasoning, from knowledge to beliefs, can be described with the following validity [13.59],

$$K(\eta \to \chi) \to [\chi!](K\chi \to \langle\mathrm{Abd}_{\eta}^{\chi}\rangle B\eta) .$$

What makes $\eta$ a reasonable solution is the existence of an alternative reality in which she observed $\eta$ and, thanks to that, came to know $\chi$. Similar diagrams can be obtained for the cases in which the definitions of an abductive problem and an abductive solution are given in terms of epistemic attitudes other than knowledge.
Example 13.7
After their respective abductive steps (models $(\mathit{Mary}_{l!})_{e\Uparrow}$ and $(\mathit{Gaby}_{l!})_{b\Uparrow}$ of Example 13.6), Mary and Gaby take a closer look at the bulb and observe that it is not burned out ($\neg b$). Semantically this is simply an observation operation that eliminates $w_4$ and $w_5$, exactly those epistemic possibilities where the bulb is burned out (i. e., where $b$ holds). The resulting models [...]

[Figure: model $((\mathit{Gaby}_{l!})_{b\Uparrow})_{\neg b!}$, with worlds listed from least to most plausible: $w_2$ {l, e}, $w_1$ {l}.]

This observation does not affect Mary’s explanation: She still believes that the electric line has failed ($e$ is true in her unique most plausible world $w_2$). But Gaby’s
case is different: She does not have an explanation for $l$ anymore. Although she knows it ($Kl$ holds at the model on the bottom, that is, $l$ is true in every epistemic possibility), she neither knows nor believes the antecedent of a known implication with $l$ as a consequent (besides, of course, the trivial ones); she needs to perform a further abductive step in order to explain it.

There is, nevertheless, a way to avoid the extra abductive reasoning step. Recall that after applying the defined upgrade operation (Definition 13.8), all the worlds satisfying the given formula become more plausible than the ones that do not satisfy it, and within the two zones the old ordering remains. If the lifted worlds are not those that satisfy the agent’s most plausible explanation but rather those that satisfy at least one of her explanations, the resulting model will have two layers: the lower one with worlds that do not satisfy any explanation, and the upper one with worlds that satisfy at least one. But inside the upper layer the old ordering will remain. In other words, the most plausible worlds [...]

[...] $\mathcal{L}$ defines an existential modality of the form $\langle\mathrm{Abd}^{\chi}\rangle\varphi$, read as “the agent can perform a complete abductive step for $\chi$ after which $\varphi$ is the case”, and whose semantic interpretation is as follows:

$(M_{\psi!}, w) \Vdash \langle\mathrm{Abd}^{\chi}\rangle\varphi$ iff
(1) $(M_{\psi!}, w) \Vdash K\chi$ and $(M, w) \Vdash \neg K\chi$,
(2) $((M_{\psi!})_{(\bigvee\Sigma_{\chi})\Uparrow}, w) \Vdash \varphi$,

where $\Sigma_{\chi}$ is the set of abductive solutions for $\chi$, that is,

$$\Sigma_{\chi} := \{\eta \mid (M, w) \Vdash K(\eta \to \chi)\} .$$

Equivalently, $\langle\mathrm{Abd}^{\chi}\rangle\varphi$’s semantic interpretation can be defined as

$$(M_{\psi!}, w) \Vdash \langle\mathrm{Abd}^{\chi}\rangle\varphi \quad\text{iff}\quad (M, w) \Vdash \neg K\chi \wedge [\psi!](K\chi \wedge [(\textstyle\bigvee\Sigma_{\chi})\Uparrow]\varphi) .$$
Example 13.8
Let us go back to Mary and Gaby’s example all the way to the stage after which they have observed that the light does not turn on (models $\mathit{Mary}_{l!}$ and $\mathit{Gaby}_{l!}$ of Example 13.5, repeated here).

[Figure: model $\mathit{Mary}_{l!}$, with worlds listed from least to most plausible: $w_5$ {l, e, b}, $w_4$ {l, b}, $w_2$ {l, e}, $w_1$ {l}.]
[Figure: model $\mathit{Gaby}_{l!}$, with worlds listed from least to most plausible: $w_5$ {l, e, b}, $w_2$ {l, e}, $w_4$ {l, b}, $w_1$ {l}.]

[...] upgrade with $e \vee b$. This produces the following models.

[Figure: model $(\mathit{Mary}_{l!})_{(e \vee b)\Uparrow}$, with worlds listed from least to most plausible: $w_1$ {l}, $w_5$ {l, e, b}, $w_4$ {l, b}, $w_2$ {l, e}.]
[Figure: model $(\mathit{Gaby}_{l!})_{(e \vee b)\Uparrow}$, with worlds labeled, from least to most plausible: {l}, {l, e, b}, {l, e}, {l, b}.]

[...] has failed ($e$ is the case in the most plausible world at the model on the top), Gaby believes that the bulb is burned out ($b$ holds in the most plausible world at the model on the bottom).

So far the result of the upgrade is, with respect to Mary and Gaby’s beliefs, exactly the same as with the previous proposal where only worlds that satisfy the most plausible explanation are upgraded (in both cases, $w_2$ and $w_4$ are Mary’s and Gaby’s most plausible worlds, respectively). But note what happens now when they both observe that the bulb is in fact not burned out ($\neg b$): Such action produces the following situation.

[Figure: model $((\mathit{Mary}_{l!})_{(e \vee b)\Uparrow})_{\neg b!}$, with worlds listed from least to most plausible: $w_1$ {l}, $w_2$ {l, e}.]

Again, the observation does not affect Mary’s explanation ($e$ still holds in the most plausible world at model on the top), but it does change Gaby’s since her previous explanation $b$ is not possible anymore. The difference is that now she does not need to perform an extra abductive step because she has already another explanation: She now believes that the electric line has failed ($e$ holds in the most plausible world at model on the bottom).
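The contrast between Examples 13.7 and 13.8 can be replayed step by step. The sketch below is illustrative only: it assumes models are lists of worlds from least to most plausible, encodes the upgrade and the observation as list operations, and uses hypothetical helper names (`upgrade`, `observe`, `run`); it is not the chapter’s formal semantics.

```python
# Sketch: upgrading with the best explanation versus with the disjunction
# of all explanations, followed by the observation of "not b".
# Assumption: a model is a list of world names, least plausible first.

VALUATION = {
    "w1": {"l"},
    "w2": {"l", "e"},
    "w4": {"l", "b"},
    "w5": {"l", "e", "b"},
}

MARY = ["w5", "w4", "w2", "w1"]  # Mary after observing l
GABY = ["w5", "w2", "w4", "w1"]  # Gaby after observing l


def e(w):
    return "e" in VALUATION[w]


def b(w):
    return "b" in VALUATION[w]


def e_or_b(w):
    return e(w) or b(w)


def not_b(w):
    return not b(w)


def upgrade(order, holds):
    """Worlds satisfying `holds` move above the rest; zones keep their order."""
    return [w for w in order if not holds(w)] + [w for w in order if holds(w)]


def observe(order, holds):
    """Observation: discard the worlds where the observed formula fails."""
    return [w for w in order if holds(w)]


def run(model, explanation):
    """Abductive step with `explanation`, then the observation of not b."""
    return observe(upgrade(model, explanation), not_b)


if __name__ == "__main__":
    for name, model, best in (("Mary", MARY, e), ("Gaby", GABY, b)):
        single = run(model, best)      # Example 13.7: best explanation only
        every = run(model, e_or_b)     # Example 13.8: all explanations
        print(name, "best only:", single, "believes e:", e(single[-1]))
        print(name, "all:      ", every, "believes e:", e(every[-1]))
```

In this encoding Gaby is left without an explanation when only $b$ was upgraded (her most plausible world is $w_1$), whereas after the upgrade with $e \vee b$ her most plausible world is $w_2$, so she already believes $e$; Mary believes $e$ in both runs.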
13.8.1 Considering Inference

Suppose Karl is in his dining room and sees smoke coming out of the kitchen. This seems unjustified at first, but then he realizes that the chicken he placed on the fire has been there for a long time. Initially Karl did not have any explanation for the smoke, but after a moment he realized that such an event was actually not surprising at all.

This case is different from the discussed ones because Karl is not an ideal agent: He does not have at hand all logical consequences of his information, and therefore he did not realize that the information he had before seeing the smoke was enough to predict it (i. e., to infer that there would be smoke). Described in more technical terms, seeing the smoke raised an abductive problem for Karl, but such a problem arose because he did not have, at the time of the observation, all the logical consequences of the information he actually had (otherwise there would have been no abductive problem at all). Accordingly, in such a case the abductive solution is not necessarily a piece of information that would have allowed Karl to predict the smoke; it might be a simple inference step that made explicit what was only implicit before.

This shows not only how agents whose information is not closed under logical consequence can face at least a new kind of abductive problem, but also how such problems give rise to a different kind of solutions.

13.8.2 Different Reasoning Abilities

In the previous example, the abductive solution was a simple inference step because Karl had the needed reasoning tools to infer “there is smoke in the kitchen” from “the chicken has been on the fire for a long time”. But what if that was not the case? That is, what if, besides not having at hand all the logical consequences of his information, Karl did not have the required reasoning tools to infer some of them?

In such a new situation, Karl faces again an abductive problem, but this time of a different nature. The surprising observation could have been predicted in the sense that it is a logical consequence of Karl’s information “the chicken has been on the fire for a long time”, just as in the initial version of this example. The difference is that such an observation is not something that Karl could have predicted by himself: He did not have the needed tools. One can say that, even though “there is smoke in the kitchen” is objectively derivable from the initial information, it is not subjectively derivable in the sense that Karl could not have done it. To put it in other words, besides not having at hand all the logical consequences of his actual information, Karl might not even be able to reach them.

Accordingly, the simple inference step of before cannot be a solution to the problem now, as Karl does not have the needed tools to perform it. One possible solution is, as in the traditional case, a piece of information that would have allowed Karl to predict the smoke from some other previously known fact, but a more interesting one is some reasoning tool that would have helped him to predict the smoke from the known fact “the chicken has been on the fire for a long time”.

New cases arise when further kinds of agents are considered. A systematic study of such cases can be found in [13.61].
13.9 Conclusions
This chapter has proposed an epistemic and dynamic approach to abductive reasoning, understanding this form of reasoning as a process that:

1. Is triggered by an epistemic action through which the agent comes to know or believe a certain $\chi$ that otherwise she would not have been able to know or believe
2. Looks for explanations for $\chi$ in the set of formulas that could have helped the agent to come to know or believe $\chi$
3. Incorporates the chosen explanation as a part of the agent’s beliefs.

Besides providing formal definitions of what an abductive problem and an abductive solution are in terms of an agent’s knowledge and beliefs, the present proposal has discussed:

1. A classification of abductive problems in terms of both how convinced the agent is of the problematic formula after the observation (she knows it, or just believes it) and how plausible the formula was before the epistemic action that triggered the problem
2. A classification of abductive solutions based not only on their deductive relation with the abductive problem or their syntactic form, but also in terms of both their plausibility before the problem was raised and the way they will affect the agent’s information once they are incorporated
3. A new perspective that looks not for the best explanation but rather for the agent’s best explanation,
and the possibility to carry out this search in terms of which explanations are more likely from the agent’s point of view, that is, in terms of the agent’s beliefs
4. The possibility of integrating the chosen solution into the agent’s information as part of her beliefs, which allows one not only to identify trivial solutions because of their effect rather than their form, but also to revise and eventually discard solutions that become obsolete in the light of further information.

Crucial for all these contributions has been the use of plausibility models and, in general, the DEL guidelines, which put emphasis on the representation of both epistemic attitudes and the actions that affect them.

It is worthwhile to compare, albeit briefly, the present proposal to other epistemic approaches to abductive reasoning. Besides immediate differences in the respective semantic models (while other approaches follow the Alchourrón–Gärdenfors–Makinson (AGM) belief revision, using a set of formulas for representing the agent’s information, here possible worlds are used), there are two main points that distinguish the presented ideas from other proposals. First, here several epistemic attitudes are taken into account, thus making a clear difference between what the agent holds with full certainty (knowledge) and what she considers very likely but still cannot guarantee (beliefs); this allows one to distinguish between the certainty of both the previous information and the surprising observation, and the mere plausibility of the chosen solution (recall the validity $K(\eta \to \chi) \to [\chi!](K\chi \to \langle\mathrm{Abd}_{\eta}^{\chi}\rangle B\eta)$, briefly discussed at the end of Sect. 13.6). Second, this approach goes one step further by making explicit the different stages of the abductive process, thus making also explicit the epistemic actions involved. This highlights the importance of actions such as belief revision, commonly understood in epistemic approaches to abduction as the one triggered by the abductive problem [13.12, 62], and also such as observation, understood here as the one that triggers the abductive process.

This chapter presents only the first steps toward a proper study of abductive reasoning from an epistemic and dynamic perspective, and several of the current proposals can be refined. For example, the specific definition of an abductive problem (Definition 13.9) relies on the fact that, within the DEL framework, agents are logically omniscient. As it has been hinted at in Sect. 13.8, in a nonomniscient DEL setting [13.35, 63] the ideas discussed in Sect. 13.2 would produce a different formal definition (which, incidentally, would allow one to classify abductive problems and abductive solutions according to some derivability criteria). Moreover, it would be possible to analyze the full abductive picture presented in Sect. 13.2.1, which requires inference steps in the alternative reality path. These extensions are relevant: They would allow a better understanding of the abductive process as performed by real agents.

But it is also possible to do more than just follow the traditional research lines in abductive reasoning, and here are two interesting possibilities (whose development exceeds the limits of this chapter). First, the DEL framework allows multiagent scenarios in which abductive problems would arise in the context of a community of agents. In such a setting, in addition to the public observation and revision used here, actions that affect the knowledge and beliefs of different agents in different ways are possible. For example, an agent may be privately informed about $\chi$: If this raises an abductive problem for her and another agent has private information about $\eta \to \chi$, they can interact to obtain the abductive solution $\eta$. Second, the DEL framework deals with higher-order knowledge, thus allowing one to study cases in which an agent, instead of looking for an explanation of a fact, looks for an explanation of her own epistemic state. Interestingly, explanations might involve epistemic actions as well as the lack of them.

According to those considerations, this logical approach takes into account the dynamic aspects of logical information processing, and one of them is abductive inference, one of the most important forms of inference in scientific practices. The aforementioned multiagent scenarios allow one to model concrete practices, particularly those that develop a methodology based on observation, verification, and systematic formulation of provisional hypotheses, such as in empirical sciences, social sciences, and clinical diagnosis. The epistemological repercussions of this DEL approach are given by the conceptual resources that it offers, useful to model several aspects of explanatory processes. While known theories of belief revision ultimately say nothing about the context of discovery, DEL extends the accessibility of this context to rational epistemological and logical analysis, beyond the classical logical treatment of abduction. From the perspective of game theoretic semantics, for example, it is now easier to determine which rules are strategic and which are operational when abductive steps are taken. But applications should also be considered to tackle certain philosophical problems. For example, abductive scenarios within multiagent settings can be used to study the implications of different forms of communication within scientific communities.

Acknowledgments. The first author acknowledges the support of the project Logics of discovery, heuristics and creativity in the sciences (PAPIIT, IN400514-3), granted by the National Autonomous University of Mexico (UNAM).
References
13.1 A. Aliseda: Abductive Reasoning. Logical Investigations into Discovery and Explanation, Synthese Library, Vol. 330 (Springer, Dordrecht 2006)
13.2 A.C. Kakas, R.A. Kowalski, F. Toni: Abductive logic programming, J. Logic Comput. 2(6), 719–770 (1992)
13.3 M.C. Mayer, F. Pirri: First order abduction via tableau and sequent calculi, Log. J. IGPL 1(1), 99–117 (1993)
13.4 M.C. Mayer, F. Pirri: Propositional abduction in modal logic, Log. J. IGPL 3(6), 907–919 (1995)
13.5 A.L. Reyes-Cabello, A. Aliseda, Á. Nepomuceno-Fernández: Towards abductive reasoning in first-order logic, Log. J. IGPL 14(2), 287–304 (2006)
13.6 S. Klarman, U. Endriss, S. Schlobach: ABox abduction in the description logic ALC, J. Autom. Reason. 46, 43–80 (2011)
13.7 J. Lobo, C. Uzcátegui: Abductive consequence relations, Artif. Intell. 89(1/2), 149–171 (1997)
13.8 A. Aliseda: Mathematical reasoning vs. abductive reasoning: A structural approach, Synthese 134(1/2), 25–44 (2003)
13.9 B. Walliser, D. Zwirn, H. Zwirn: Abductive logics in a belief revision framework, J. Log. Lang. Info. 14(1), 87–117 (2004)
13.10 H.J. Levesque: A knowledge-level account of abduction, Proc. 11th Intl. Joint Conf. on Artif. Intell., ed. by N.S. Sridharan (Morgan Kaufmann, Burlington 1989), pp. 1061–1067, Detroit 1989
13.11 C. Boutilier, V. Becher: Abduction as belief revision, Artif. Intell. 77(1), 43–94 (1995)
13.12 A. Aliseda: Abduction as epistemic change: A Peircean model in artificial intelligence. In: Abduction and Induction: Essays on Their Relation and Integration, Applied Logic, ed. by P.A. Flach, A.C. Kakas (Kluwer, Dordrecht 2000) pp. 45–58
13.13 L. Magnani: Abductive Cognition: The Epistemological and Eco-Cognitive Dimensions of Hypothetical Reasoning, Cognitive Systems Monographs, Vol. 3 (Springer, Heidelberg 2009)
13.14 D. Gabbay, J. Woods (Eds.): The Reach of Abduction: Insight and Trial, A Practical Logic of Cognitive Systems, Vol. 2 (Elsevier, Amsterdam 2005)
13.15 J. Woods: Cognitive economics and the logic of abduction, Rev. Symb. Log. 5(1), 148–161 (2012)
13.16 J. Hintikka: Knowledge and Belief: An Introduction to the Logic of the Two Notions (Cornell Univ. Press, Ithaca 1962)
13.17 R. Fagin, J.Y. Halpern, Y. Moses, M.Y. Vardi: Reasoning About Knowledge (MIT Press, Cambridge 1995)
13.18 H. van Ditmarsch, W. van der Hoek, B. Kooi: Dynamic Epistemic Logic, Synthese Library, Vol. 337 (Springer, Dordrecht 2007)
13.19 J. van Benthem: Logical Dynamics of Information and Interaction (Cambridge Univ. Press, Cambridge 2011)
13.20 P.A. Flach, A.C. Kakas: Abduction and Induction: Essays on their Relation and Integration, Applied Logic (Kluwer, Dordrecht 2000)
13.21 F. Soler-Toscano, D. Fernández-Duque, Á. Nepomuceno-Fernández: A modal framework for modeling abductive reasoning, Log. J. IGPL 20(2), 438–444 (2012)
13.22 M.E. Quilici-Gonzalez, W.F.G. Haselager: Creativity: Surprise and abductive reasoning, Semiotica 153(1–4), 325–342 (2005)
13.23 J. van Benthem, F.R. Velázquez-Quesada: The dynamics of awareness, Synthese (Knowl., Rationality and Action) 177, 5–27 (2010)
13.24 B. Hill: Awareness dynamics, J. Phil. Log. 39(2), 113–137 (2010)
13.25 F.R. Velázquez-Quesada, F. Soler-Toscano, Á. Nepomuceno-Fernández: An epistemic and dynamic approach to abductive reasoning: Abductive problem and abductive solution, J. Appl. Log. 11(4), 505–522 (2013)
13.26 Á. Nepomuceno-Fernández, F. Soler-Toscano, F.R. Velázquez-Quesada: An epistemic and dynamic approach to abductive reasoning: Selecting the best explanation, Log. J. IGPL 21(6), 943–961 (2013)
13.27 F. Soler-Toscano, F.R. Velázquez-Quesada: A dynamic-epistemic approach to abductive reasoning. In: Logic of Knowledge. Theory and Applications, Dialogues and the Games of Logic. A Philosophical Perspective, Vol. 3, ed. by C. Barés Gómez, S. Magnier, F.J. Salguero (College Publications, London 2012) pp. 47–78
13.28 C.S. Peirce: The Essential Peirce. Selected Philosophical Writings (1893–1913), Vol. 2 (Indiana Univ., Bloomington, Indianapolis 1998), ed. by N. Houser
13.29 C.S. Peirce: The Essential Peirce. Selected Philosophical Writings (1867–1893), Vol. 1 (Indiana Univ., Bloomington, Indianapolis 1992), ed. by N. Houser, C. Kloesel
13.30 E. Lorini, C. Castelfranchi: The cognitive structure of surprise: Looking for basic principles, Topoi 26(1), 133–149 (2007)
13.31 G.H. Harman: The inference to the best explanation, Phil. Rev. 74(1), 88–95 (1965)
13.32 P. Lipton: Inference to the Best Explanation (Routledge, London, New York 2004)
13.33 J. Hintikka: What is abduction? The fundamental problem of contemporary epistemology, Trans. C.S. Peirce Soc. 34(3), 503–533 (1998)
13.34 A. Baltag, S. Smets: A qualitative theory of dynamic interactive belief revision. In: Logic and the Foundations of Game and Decision Theory (LOFT7), Texts in Logic and Games, Vol. 3, ed. by G. Bonanno, W. van der Hoek, M. Wooldridge (Amsterdam Univ. Press, Amsterdam 2008) pp. 13–60
13.35 F.R. Velázquez-Quesada: Dynamic epistemic logic for implicit and explicit beliefs, J. Log. Lang. Info. 23(2), 107–140 (2014)
13.36 C. Boutilier: Unifying default reasoning and belief revision in a modal framework, Artif. Intell. 68(1), 33–85 (1994)
Abductive Reasoning in Dynamic Epistemic Logic References 293
13.37 R. Stalnaker: On logics of knowledge and belief, 13.51 P. Gärdenfors, D. Makinson: Nonmonotonic infer-
Phil. Stud. 128(1), 169–199 (2006) ence based on expectations, Artif. Intell. 65(2),
13.38 J.A. Plaza: Logics of public communications, Proc. 197–245 (1994)
4th Intl. Symp. Methodol. Intell. Sys., ed. by 13.52 R. Pino-Pérez, C. Uzcátegui: Jumping to explana-
M.L. Emrich, M.S. Pfeifer, M. Hadzikadic, Z.W. Ras tions versus jumping to conclusions, Artif. Intell.
(North-Holland, Amsterdam 1989) pp. 201–216 111(1/2), 131–169 (1999)
13.39 J. Gerbrandy, W. Groeneveld: Reasoning about in- 13.53 R. Pino-Pérez, C. Uzcátegui: Preferences and expla-
formation change, J. Log, Lang. Info. 6(2), 147–196 nations, Artif. Intell. 149(1), 1–30 (2003)
(1997) 13.54 D. Makinson: Bridges between classical and non-
13.40 J. van Benthem: Dynamic logic for belief revision, monotonic logic, Log. J. IGPL 11(1), 69–96 (2003)
J. Appl. Non-Class. Log. 17(2), 129–155 (2007) 13.55 J. van Benthem, S. van Otterloo, O. Roy: Preference
13.41 H. van Ditmarsch: Prolegomena to dynamic logic logic, conditionals and solution concepts in games.
for belief revision, Synthese 147(2), 229–275 (2005) In: Modality Matters: Twenty-Five Essays in Honour
13.42 A. Baltag, S. Smets: Learning by questions and an- of Krister Segerberg, Uppsala Philosophical Stud-
swers from belief-revision cycles to doxastic fixed ies, ed. by H. Lagerlund, S. Lindström, R. Sliwinski
points. In: Logic, Language, Information and Com- (Univ. Uppsala, Upsala 2006) pp. 61–76
putation, ed. by H. Ono, M. Kanazawa, R. de 13.56 P. Girard: Modal Logic for Belief and Preference
Queiroz (Springer, Berlin, Heidelberg 2009) pp. 124– Change, Ph.D. Thesis (Stanford Univ., Stanford
139 2008)
13.43 W.A. Carnielli: Surviving abduction, Log. J. IGPL 13.57 J. van Benthem, P. Girard, O. Roy: Everything else
14(2), 237–256 (2006) being equal: A modal logic for ceteris paribus pref-
13.44 J. Pearl: Probabilistic Reasoning in Intelligent Sys- erences, J. Phil. Log. 38(1), 83–125 (2009)
tems – Networks of Plausible Inference (Morgan 13.58 F. Liu: Reasoning about Preference Dynamics, Syn-
Kaufmann, San Francisco 1989) these Library, Vol. 354 (Springer, Heidelberg 2011)
Part C | 13
13.45 D. Poole: Probabilistic horn abduction and 13.59 F.R. Velázquez-Quesada: Reasoning processes as
bayesian networks, Artif. Intell. 64(1), 81–129 (1993) epistemic dynamics, Axiomathes 25(1), 41–60 (2015)
13.46 D. Dubois, A. Gilio, G. Kern-Isberner: Probabilistic 13.60 F. Soler-Toscano, F.R. Velázquez-Quesada: Gener-
abduction without priors, Intl. J. Approx. Reason. ation and selection of abductive explanations for
47(3), 333–351 (2008) non-omniscient agents, J. Log, Lang. Info. 23(2),
13.47 M. Denecker, D. De Schreye: SLDNFA: An abduc- 141–168 (2014)
tive procedure for normal abductive programs, 13.61 F. Soler-Toscano, F.R. Velázquez-Quesada: Abduc-
Proc. Intl. Joint Conf. Symp. Log. Program., ed. by tion for (non-omniscient) agents, Workshop Proc.
K.R. Apt (MIT Press, Washington 1992) pp. 686–700 MALLOW 2010, Vol. 627, ed. by O. Boissier, A. El Fal-
13.48 A.C. Kakas, P. Mancarella: Generalized stable mod- lah Seghrouchni, S. Hassas, N. Maudet (CEUR, Lyon
els: A semantics for abduction, Proc. 9th Eur. Conf. 2010), www.ceur-ws.org/vol-627/lrba_4.pdf
Artif. Intell. ECAI ’90, ed. by L.C. Aiello (Pitman, 13.62 J. van Benthem: Abduction at the interface of logic
Stockholm 1990) pp. 385–391 and philosophy of science, Theoria 22(3), 271–273
13.49 F. Lin, J.-H. You: Abduction in logic program- (2009)
ming: A new definition and an abductive proce- 13.63 F.R. Velázquez-Quesada: Explicit and implicit
dure based on rewriting, Proc. 17th Int. Joint Conf. knowledge in neighbourhood models. In: Logic,
Artif. Intell., IJCAI, ed. by B. Nebel (Morgan Kauf- Rationality, and Interaction – Proc. 4th Int.
mann, Seattle 2001) pp. 655–666 Workshop LORI 2013, Hangzhou, Lecture Notes in
13.50 M.C. Mayer, F. Pirri: Abduction is not deduction-in- Computer Science, Vol. 8196, ed. by D. Grossi,
reverse, Log. J. IGPL 4(1), 95–108 (1996) O. Roy, H. Huang (Springer, Berlin, Heidelberg 2013)
pp. 239–252
295
14. Argumentation and Abduction in Dialogical Logic
Part C | 14.1
duction as a case of nondeductive reasoning. By relying on some relevant ideas of the Gabbay–Woods (GW) schema of abduction and Aliseda’s approach, a new dialogical explanation of abduction in terms of concession-problem is proposed. This notion of concession problem will be defined thereafter. With respect to the topics of the model-based sciences, the question of the specificity of the speech act by means of which a hypothesis is conjectured is set more specifically.

What Kind of Speech Act?
14.7 Conclusions
References
the dialogical framework in which the proof is conceived in terms of a dialectical process. The specificity of deductive and abductive reasonings is clarified by identifying different kinds of speech acts specific to each of these forms of reasoning. The aim is to show that abductive dialogues involve specific speech acts, namely certain conjectural claims that differ from the usual assertions and questions of deductive dialogues. A more exhaustive study of commitment and its role in the definition of different kinds of speech acts in dialogical interaction can be found in Walton and Krabbe [14.3]. However, this study focuses here on some aspects of commitment related to assertions and questions in deductive dialogues and considers how to extend the picture to abductive dialogues. The context of this study is first explained. In order to defend a practical logic to study the fallacies, Woods [14.4] identifies what he calls third-way reasoning, which operates beyond the usual standards of deduction and induction. According to Woods, logicians have missed the target concerning the study of fallacies because they have failed to invoke the right standards of reasoning. The mistake is linked to an ostracism with respect to the human being when the task should be to describe reasoning. Indeed, in most logical studies of reasoning, the human being has simply been left out of the story! In Woods’ own words, “there are no people in the models of mainstream mathematical logic” [14.4,

sis of what has been called an agent-centered logic. The position defended in this chapter, which is perhaps stronger than that of Woods, is that focusing on a consequence-having relation is also a mistake with respect to deductive reasoning. Reasoning, in general, must be studied in a general framework in which particular attention is paid to the action of the agents and their commitments.

More precisely, it is argued that deductive as well as nondeductive reasoning should be understood within argumentative practices, taking into account the interaction between agents. This can be achieved by means of dialogical logic, a semantics based on argumentative practices and presented as a game between a proponent of a thesis and an opponent to this thesis. More precisely, dialogical logic is grounded on speech acts and commitments related to these speech acts. That is, a dialogue is a sequence of speech acts, questions and assertions, in order to justify or challenge an initial thesis. Moreover, utterances are not free of further justifications: When we utter something, we are committed to providing justification of what we are saying. This is the basis of the rules which say how to challenge and how to defend an utterance. Deductive validity is thus conceived in terms of a strategy by means of which a proponent of a thesis defends her initial claim against every attack of her opponent.

However, this is just deduction! Is it possible to
alogical perspective in terms of concession problem. A concession problem is overcome by a conjecture on the basis of which the dialogue is continued. In contrast with the usual deductive dialogues, such a conjecture is settled in a new kind of move allowed by an additional rule. The difficulty is thus to specify which kind of speech act is at stake while performing such conjectural moves. Indeed, under the view endorsed in this chapter, conjectural moves are performed by means of speech acts which are neither the assertions nor the questions of usual deductive dialogues.

Reasons why Toulmin argues in favor of a radical separation between formal logic and argumentation are given in the first section. Although it is true that some aspects of argumentation, such as the role of the agent, the dynamics of the contexts, and defeasibility, are to be taken into account, this is not a reason to conclude that formal logic and argumentation should be studied separately. First, it is not true that those aspects are completely missing in formal logic. It is shown in the rest of this section that numerous formal logics deal with these aspects, although they have yet to be brought all together. Second, in this contribution, it is argued that even deduction is to be understood within argumentative practices. Hence, the dialogical framework is introduced in the third section, which returns to the key concept of commitment. It is also shown how dialogical logic makes it possible to grasp the central role of the agent as well as the dynamics of the contexts in terms of a pluralist attitude. After abductive reasoning has been presented in the fourth section, the scene for a dialogical understanding of abduction is set in the fifth section. Not all the details of dialogical pluralism, dynamics of contexts, and dialogical defeasibility can be given here. However, the relevant related works on each of these points will be systematically mentioned.
Part C | 14.2
of the agents, and so on. Some of the most virulent of these theoreticians were perhaps Toulmin [14.1] (see also [14.6] for recent studies about the Toulmin Model) and Perelman and Olbrechts-Tyteca [14.2]. This chapter focuses on Toulmin, who defined a model of argumentation based on the analysis of microarguments. This model will be called the Toulmin Model of argumentation, whose general idea is that some data leads to the claim (or conclusion). The data is supported by a war-

someone to believe something. An agent puts forward an argument in order to defend a thesis and the inferences are defeasible, that is, they might be rebutted when new information is encountered. Schematically, the Toulmin Model may be represented as in Fig. 14.1.

This schema represents the process that consists in defending a claim against a challenger. First, the agent asserts a Claim (C) and then defends this claim by appealing to relevant available facts, the so-called Data
(D). Next, the challenger may ask for the bearing of the data and this is exactly what is called the Warrant (W). The warrant influences the degree of force on the conclusion it justifies and this is signaled by qualifying the conclusion with the Qualifier (Q): necessarily, probably, or presumably. The qualifier presumably renders the argument defeasible, and the condition of Rebuttal (R) should be specified. The process ends in a question that consists in asking what is thought about the general acceptability of the argument: what Toulmin calls Backing (B). In different fields, warrant and backing might be of different kinds.

According to the Toulmin Model, an argument is assumed to be used by a practical agent. Inferences are not conceived in terms of the relationship between propositions independent from any act. And the act of inferring is linked with the agent who expresses a claim, by means of which a commitment to a thesis is in fact expressed. The underlying methodological thesis is that the study of reasoning must be related to real-life reasoning. This kind of reasoning is never perfect (as in an ideal model of formal logic) because we never have all the information needed to defend a claim and we might always find a rebuttal that changes it. Hence, the right standard of a good argument cannot be the deductive standard of validity. An argument succeeds or fails only in relation to an agent’s target. Toulmin’s schema enriches the traditional premises–conclusion relationship of the deductive reasoning model of arguments by distinguishing additional elements, such as warrant, backing, and rebuttal. It is an interesting fact that the Toulmin Model and argumentation theory call up not only the matter of the burden of proof, but also the matter of the burden of questioning, which is of importance for the beginning of the process. A consequence of this action- and agent-centered analysis is that an account of the defeasibility of reasoning is now required. The fact that none of these features appeared in formal logic constituted the core of Toulmin’s criticism, which led him to consider argumentation theory and formal logic as radically different disciplines.

There is nothing really controversial in Toulmin’s criticisms of formal logic or in his model of argumentation. Nevertheless, following van Benthem [14.7], in this chapter he is believed to be wrong in pronouncing the divorce of argumentation theory and formal logic. Indeed, it might be true that classical formal logic is insufficient to deal with reasoning as a human activity. Classical formal logic is not the only way to do logic, however. Although Toulmin’s work has the virtue of emphasizing the role of the human being, the defeasible feature of everyday life reasoning, and the dynamics of argumentative contexts, it is worth noting that those features were not completely lacking in formal logic.

With respect to the agent, intuitionism initiated by the Dutch mathematician Brouwer [14.8] is motivated by the need to take the importance of the agent into account. More recently, agent-centered dialogical logic initiated by Lorenzen [14.9, 10] conceives the notion of proof itself in terms of interactions between agents. It remains true that further efforts are still required to deal with nondeductive reasoning. Before pronouncing the divorce between logic and argumentation theory, it should be recognized that many logicians had already widened the range of argumentative schemas in formal logic, by adding the agent, rethinking the premises–conclusion relation, and defining several kinds of consequence relations.

Certain cognate aspects of reasoning must be grasped. For example, inference is a process which involves a flow of information, changes of belief, knowledge or even desires. Logicians have to take the “dynamic turn”, in the words of Gochet [14.11]. That is why the agent has sometimes been introduced explicitly in the object language in order to express intentional relations by means of specific operators. The enterprise does not always head in the same direction as an agent-centered analysis (in fact it almost never goes in that direction) but the enterprise does provide new tools on how to implement the agent in the study of reasoning. Hintikka’s explicit epistemic logic [14.12], and more recently Priest’s intentional logic [14.13], among others, define useful tools to describe the intentional states of an agent. In addition, dynamic approaches, such as the AGM-Belief Revision Theory [14.14] (and see Chap. 10), are meant to give an account of how to incorporate new pieces of beliefs into an agent’s belief set, conceived as a set of sentences. In the same spirit, dynamic studies coming from natural language semantics [14.15–18] and dynamic epistemic logic [14.17–19] add operators to deal with the flow of information and the transmission of information between groups of agents. The study of dynamic inferences is not restricted to model theory and to the change in information. From a pluralistic point of view, a change of logic might occur with respect to a given context of argumentation. For example, dialogical logic is a pluralist enterprise in which the context of argumentation is defined by means of rules governing the general organization of a dialectical game (more details on this point below). Although this fails to provide Toulmin with an answer to each criticism he addresses to formal logic, it does reveal how formal studies are sufficiently rich to consider the possibility of a more practical logic in which reasoning is conceived as a human activity.

Another aspect of argumentation stressed by Toulmin is the imperfect feature of human reasoning, which
Argumentation and Abduction in Dialogical Logic 14.3 Logic and Argumentation: A Reconciliation 299
he deals with by means of the notion of rebuttal. Thinking of reasoning as defeasible means that an agent never draws conclusions definitively, that is, whatever she infers from a given base of information might be revised when faced with new information. In other words, the conclusions drawn by an agent might be defeated. It is worth noting that defeasibility does not need to be studied in the context of nonmonotonic logics. Although nonmonotonic reasoning is defeasible, the converse does not hold. Interesting cases of defeasibility come from the context [14.20]. What characterizes defeasible reasoning is the possibility to defeat, or to change, a previously drawn consequence. Again, this feature cannot be claimed to be completely missing in formal logic. Indeed, defeasible reasoning has been studied from various perspectives [14.21]. As already mentioned, one well-known approach is the epistemic approach, such as that in the context of belief revision theory. The formal epistemology of Pollock [14.22], who differentiates between fundamental knowledge and inferred knowledge, provides another example. In this theory, inferred knowledge is precisely knowledge which might be defeated. Another approach is centered on the notion of logical consequence, that is, dealing with defeasibility in the context of nonmonotonic logics (Chap. 10). Some of the most important proposals are Default Logic by Reiter [14.23] and Circumscription by McCarthy [14.24]. In both of these frameworks, the conclusion follows defeasibly or nonmonotonically from a set of premises, just in the case that it will hold in almost all models that verify the premises. (For a relevant survey, see for example [14.25–27]. See also the third-way reasoning in [14.4].) It is also important to mention Batens’ adaptive logic [14.28], a formal logic in which the application of inference rules may be subject to conditions with respect to the context of the proof (e.g., in the context of contradictory premises, disjunctive syllogism might be rejected).

The three main aspects of the Toulmin Model of argumentation that have been highlighted are the central role of the agent, the dynamics related to the action and changes of contexts, and defeasibility. In what follows, a reconciliation of formal logic and argumentation will be argued for, and deduction will also be defined in argumentative practices. Note that it is not the purpose of this chapter to deal exhaustively with all the relevant aspects of argumentation. Indeed, not every facet of dialogical pluralism (although the general principles are explained and relevant related works are mentioned) or of defeasible reasoning can be presented here. The designation defeasible reasoning gathers together aspects of default logic, nonmonotonic logics, truth maintenance systems, defeasible inheritance logics, autoepistemic logics, circumscription logics, logic programming systems, preferential reasoning logics, abductive logics, theory revision logics, belief change logics, and so on. In fact, all of this relates to what Woods calls third-way reasoning [14.4]. Various systematic approaches to defeasible argumentation that make use of formal tools originating from computational sciences and artificial intelligence can be found in [14.29].

The main thesis of this contribution is that a unified study of reasoning may be achieved by focusing on the key notion of commitment in argumentative interaction. Indeed, this notion forms the basis for a distinction between various kinds of speech acts that are significant for the specification of different kinds of reasonings, such as deduction and abduction.
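
The components of the Toulmin Model discussed in this section (Claim, Data, Warrant, Qualifier, Rebuttal, Backing) and the defeasibility they bring with them can be illustrated by a small data structure. The following is only a minimal sketch: the class name `ToulminArgument`, the method `holds`, the bird example, and the choice to treat every qualifier other than "necessarily" as defeasible are assumptions made for illustration, not part of Toulmin's own presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToulminArgument:
    """Hypothetical container for the six elements of the Toulmin Model."""
    claim: str            # C: what the agent asserts
    data: frozenset       # D: available facts appealed to in defence of C
    warrant: str          # W: the bearing of the data on the claim
    qualifier: str        # Q: "necessarily", "probably" or "presumably"
    rebuttals: frozenset  # R: conditions under which the claim is defeated
    backing: str          # B: general acceptability of the argument

    def holds(self, information):
        """Return True if the claim stands given the information at hand."""
        info = set(information)
        if not self.data <= info:
            return False            # the data appealed to are not available
        if self.qualifier == "necessarily":
            return True             # assumption: treated as non-defeasible
        return not (self.rebuttals & info)  # defeated by any known rebuttal


# Illustrative values only.
tweety = ToulminArgument(
    claim="Tweety flies",
    data=frozenset({"Tweety is a bird"}),
    warrant="birds generally fly",
    qualifier="presumably",
    rebuttals=frozenset({"Tweety is a penguin"}),
    backing="observed regularities about birds",
)

print(tweety.holds({"Tweety is a bird"}))                         # True
print(tweety.holds({"Tweety is a bird", "Tweety is a penguin"}))  # False
```

The second call shows the defeasible character of the inference: the same claim, supported by the same data, is withdrawn once new information matching a rebuttal condition is encountered.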
nor a proof-theoretic semantics, and is grounded in the argumentative practices. It is a semantics based on the “meaning is use” of Wittgenstein [14.38, p. 43] and the description of specific language games governed by the rules defined below. Although it was first developed to deal with intuitionist logic, it has since then taken a pluralist turn. Indeed, different kinds of rules enable a sharp distinction between different semantic levels and this enables the definition of a wider range of logics in a unified framework.

Roughly speaking, dialogical logic is a framework in which the proof process is conceived as a dialectical game between two players: the Proponent of a thesis and the Opponent. The Proponent utters an initial thesis and tries to defend it against challenges performed by the Opponent, who criticizes the thesis. The two players make moves alternately. Those moves consist of specific speech acts by means of which they perform challenges and defences. A thesis is valid if and only if the Proponent is able to defend it against every attack of the Opponent. In order to criticize an assertion of her argumentation partner, a move in which a formula has been uttered has to be challenged with respect to its main connective. Such sequences of utterances, challenges, and defences are regulated by the particle rules by means of which the local meaning of the logical constant is given. In addition, structural rules give the general organization of the dialogue and determine

defined independently of the identity of P and O (hence they are defined making use of player variables X and Y). It is fundamental that when agents perform utterances, they are committed to justify their claims. This commitment is essential in the characterization of different kinds of speech acts and in giving the meaning of what is said.

The language used to define the rules of dialogical logic is defined as follows. Let L be the language of standard propositional logic:

• Two labels, O and P, stand for the players of the game: the Opponent and the Proponent, respectively.
• To define particle rules, variables X and Y are required, with $X \neq Y$, that hold for players (regardless of their identity with O or P).
• Force symbols, ! and ?, are used to specify the kind of speech act at stake: ! for declarative utterances, and ? for interrogative utterances.
• The conjunction can be indexed yielding $\wedge_i$, where $i \in \{1, 2\}$, such that $\wedge_1$ stands for the first conjunct, and $\wedge_2$ the second.
• $r := n$ indicates the rank chosen by the player at the beginning of a dialogue, as pointed out by the rule [SR0]. For example, $n := 1$ means that the rank is 1. (The notion of rank is explained and defined in Sect. 14.3.3)
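
The notation above, together with the particle rules and the structural rules [SR0] to [SR3] stated in Sect. 14.3.3, can be turned into a small game search. The following is a minimal sketch under stated assumptions, not the chapter's own formalization: the tuple encoding of formulas and moves, the function names (`attacks`, `answers`, `legal_moves`, `p_wins`, `valid`), and the fixed ranks (1 for O, 2 for P) are all illustrative choices. The thesis is assumed to be a complex formula, as [SR0] requires.

```python
from functools import lru_cache

# Formulas: atoms are strings, compound formulas are tagged tuples.
def Not(a): return ("not", a)
def And(a, b): return ("and", a, b)
def Or(a, b): return ("or", a, b)
def Imp(a, b): return ("imp", a, b)

RANK = {"O": 1, "P": 2}  # assumed repetition ranks, fixed in advance

def is_atom(f):
    return isinstance(f, str)

def attacks(f):
    """Particle rules: (question, asserted formula) pairs challenging f."""
    if f[0] == "and":
        return [("?1", None), ("?2", None)]  # challenger picks the conjunct
    if f[0] == "or":
        return [("?v", None)]                # defender will pick the disjunct
    return [("!", f[1])]                     # "imp"/"not": challenger asserts f[1]

def answers(f, question):
    """Particle rules: formulas that answer a challenge on f."""
    if f[0] == "and":
        return [f[1] if question == "?1" else f[2]]
    if f[0] == "or":
        return [f[1], f[2]]
    if f[0] == "imp":
        return [f[2]]
    return []                                # no defence for a negation

def formal_ok(history, player, f):
    """[SR2]: P may assert an atom only after O has asserted it."""
    if player == "O" or f is None or not is_atom(f):
        return True
    return any(m[0] == "O" and m[3] == f for m in history)

def legal_moves(history, player, intuitionistic):
    """Moves are tuples (player, kind, question, formula, target index)."""
    other = "O" if player == "P" else "P"
    moves = []
    for i, m in enumerate(history):          # challenges on complex formulas
        if m[0] != other or m[3] is None or is_atom(m[3]):
            continue
        used = sum(1 for x in history
                   if x[0] == player and x[1] == "attack" and x[4] == i)
        if used < RANK[player]:
            moves += [(player, "attack", q, f, i)
                      for q, f in attacks(m[3]) if formal_ok(history, player, f)]
    pending = [i for i, m in enumerate(history)
               if m[0] == other and m[1] == "attack"]
    if intuitionistic:                       # [SR1-i]: last open challenge only
        answered = {x[4] for x in history if x[1] == "defence"}
        pending = [i for i in pending if i not in answered][-1:]
    for i in pending:                        # defences
        used = sum(1 for x in history if x[1] == "defence" and x[4] == i)
        if used < RANK[player]:
            attacked = history[history[i][4]][3]
            moves += [(player, "defence", None, f, i)
                      for f in answers(attacked, history[i][2])
                      if formal_ok(history, player, f)]
    return moves

@lru_cache(maxsize=None)
def p_wins(history, player, intuitionistic):
    """True if P can force being the last player to move ([SR3])."""
    nxt = "O" if player == "P" else "P"
    outcomes = (p_wins(history + (m,), nxt, intuitionistic)
                for m in legal_moves(history, player, intuitionistic))
    return any(outcomes) if player == "P" else all(outcomes)

def valid(thesis, intuitionistic=False):
    """[SR0]: P asserts the (complex) thesis, then O moves first."""
    start = (("P", "thesis", None, thesis, None),)
    return p_wins(start, "O", intuitionistic)

dne = Imp(Not(Not("p")), "p")
print(valid(dne), valid(dne, intuitionistic=True))  # True False
```

The search treats validity as the existence of a winning strategy for P: P needs some move that wins, whereas every move of O must lead to a win for P. With the classical gameplay rule the double negation elimination principle is won by P, while with the intuitionistic rule it is lost, which matches the two gameplays discussed for Table 14.2.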
They are independent of any specific context of argumentation. They are the same no matter the presupposed logic and are applied in the same way by both players. The formulation of particle rules is symmetric.

Symmetry is an essential feature of dialogical particle rules and this is the reason why dialogical logic is immune to trivializing connectives such as Prior’s tonk [14.39], even if there is no reference to any model or to any truth condition. Rahman et al. [14.40] and Rahman [14.41] show that defining a rule for a tonk operator would lead to a formulation of particle rule which is not symmetric. This would involve player-dependent rules, which is not possible in dialogical logic because, at the local level, the identity of the players has not yet been defined. As rightly stressed by Clerbout [14.33], it does not even make sense to talk of Opponent and Proponent at the local level. Indeed, the identity of the players is defined at the level of structural rules, when it is said, for example, that the Proponent is the player who utters the initial thesis.

Note how commitment is essential to the meaning of an assertion. An agent, on uttering a conjunction, is committed to give a justification for both of the conjuncts. Hence the challenger has the choice of which conjunct is to be defended. That is, if X utters $\varphi \wedge \psi$, Y challenges this move by asking either $?\wedge_1$ (the first conjunct) or $?\wedge_2$ (the second conjunct). In the case of a disjunction, it is the defender (X) who chooses. Indeed, an agent uttering a disjunction is committed to give a justification for (at least) one of the disjuncts, that is, Y asks $?\vee$ and X chooses to answer either $\varphi$ or $\psi$.

Note that a challenge on a negation cannot be answered. The challenge consists in a switch in the burden of the proof: If a player X utters a formula $\neg\varphi$, a player Y challenges that formula by uttering $\varphi$ and has to defend it thereafter. For the conditional, Y takes the burden of the proof of the antecedent. It might be said that when an agent X utters a conditional $\varphi \to \psi$, then X is committed to justifying $\psi$ with the proviso that the argumentation partner Y concedes $\varphi$.

14.3.3 Structural Rules

Now structural rules are needed in order to define the general organization of a dialogue by explaining how to apply the particle rules, that is, how to start a dialogue, who has to play, when, who wins, and so on. The global level of meaning is defined by these rules, that is, a level of meaning that arises from the application of the particle rules in specific contexts of argumentation.

[SR0] [Starting Rule]
Let $\varphi$ be a complex formula. Every dialogical game $D(\varphi)$ starts with the assertion of $\varphi$ by P ($\varphi$ is called the initial thesis). O and P then choose a positive integer called repetition rank.

[SR1-c] [Classical Gameplay Rule]
After the ranks have been chosen, moves are alternately performed by O and P and every move is either a challenge or a defence. Let n be the repetition rank of a player X: When it is X’s turn to play, X can challenge a preceding utterance or defend herself against a preceding challenge at most n times by the application of particle rules.

[SR1-i] [Intuitionistic Gameplay Rule]
After the ranks have been chosen, moves are alternately performed by O and P and every move is either a challenge or a defence. Let n be the repetition rank of a player X: When it is X’s turn to play, X can challenge a preceding utterance or defend herself against the last challenge which has not yet been defended, at most n times by the application of particle rules.

[SR2] [Formal Rule]
P is not allowed to utter an atomic formula unless O uttered the same atomic formula before. Atomic formulae cannot be challenged.

[SR3] [Winning Rule]
A player X wins the game if and only if the game is finished and X made the last move. It is said that a game is finished if and only if there are no more moves allowed according to the particle rules.

The first rule [SR0] sets the identity of the players by claiming that the Proponent is the one who utters the initial thesis and introduces asymmetry. Once the initial thesis is uttered, the players have to choose a rank of repetition. That rank of repetition prevents them from infinitely repeating the same moves. In fact, ranks indicate how many times a player can challenge or defend a formula. For example, if a player chooses rank 1, then this player is allowed to challenge a formula at most once. Ranks are used to ensure that every game ends after a finite number of moves. Rules [SR1-c] and [SR1-i] regulate the gameplay and distinguish classical from intuitionistic games. Note that a game is never played with both of them. The classical rule [SR1-c] does not impose any restriction with respect to the defences. While playing with the intuitionistic rule [SR1-i], it is forbidden to defend the same move twice or to give a defence against a challenge that is not the last one. This is related to the intuitionistic requirement of having a direct justification for the uttered formula.

The formal rule, [SR2], might be understood as a rule that prevents the Proponent from making any supposition which might be used to win. Without that rule, dialogues would be trivial and the Proponent would always be in a situation to win. Finally, the winning rule, [SR3], gives the conditions of victory.

Table 14.2 Dialogue 1

|   | O            |   |   | P                  |   |
|---|--------------|---|---|--------------------|---|
|   |              |   |   | $\neg\neg p \to p$ | 0 |
| 1 | $r := 1$     |   |   | $r := 2$           | 2 |
| 3 | $\neg\neg p$ | 0 |   | $p$                | 6 |
|   |              |   | 3 | $\neg p$           | 4 |
| 5 | $p$          | 4 |   |                    |   |

14.3.4 Winning Strategy and Validity

Hitherto, nothing has been said about the notion of validity. In dialogical logic, validity is not defined in terms
of truth preservation but rather in terms of winning
strategy. It is said that a player has a winning strategy played with classical rule [SR1-c]. If it had been played
if and only if she is able to win regardless of the moves with the intuitionistic rule [SR1-i], P would have lost. P
and the choices made by her argumentation partners. could not have performed move 6 because the last chal-
This leads to the strategic level which is not involved at lenge of O is 5, not 3 ([SR2]). Thus the dialogue would
the level of particle and structural rules. Indeed, noth- have ended at move 5 with a victory by O.
ing in those rules indicate how to play strategically and These two different possible gameplays illustrate
in no way do they indicate how to win; neither do they the difference between classical and intuitionistic nega-
prevent anybody from playing badly. Note then that it tion. Quine’s claim “change of logic, change of sub-
is not one play of the game which is to be taken into ject” [14.43, pp. 80–94] must be thought otherwise.
account to determinate the validity of a formula: The Indeed, the dialogical setting displays that negation has
validity of a formula is determined by the existence of the same local meaning in every logic, and its global
a winning strategy. meaning is changing according to its use in different
Now, it is reasonable to ask whether a generally contexts of argumentation. Both the semantic levels are
good strategy exists. First a comment about the choice significant in fully defining the meaning of an expres-
of rank. As explained by Clerbout [14.33, 42], it is sion.
sufficient to consider the case in which the Opponent Beyond the classical and intuitionistic logics, the
chooses rank 1 and the Proponent rank 2 in order to sharp distinction between the particle rules and the
obtain a significant range of winning strategies to deal structural rules allows a development of dialogical logic
with deductive validity. Second, trained dialogicians as a pluralistic tool. The pluralistic aspect of dialogical
know in fact that the best way to play is always to let logic allows us to deal with various kinds of argu-
Part C | 14.3
the Opponent choose first when it is possible and there- mentation contexts and their dynamics, the importance
after to repeat the same choices. This is the well-known of which has been stressed by argumentation theo-
copy-cat strategy based on a clever use of the formal reticians. Indeed, more expressive languages may be
rule. introduced by means of the introduction of new sym-
An illustration of a dialogue is given in Table 14.2 bols, the (local) meaning of which will be given by
by taking the elimination of double negation principle a particle rule. A language may be used in different con-
¬¬p → p as an example. In Table 14.2, the moves of texts of argumentation, with various underlying logics.
the players are written down in the column O for the Dialogically, this means that a language may be used
O-moves, and in the column P for the P-moves. The in different kinds of games distinguished by their struc-
number of a move is indicated in the outer column tural rules.
whereas those of the challenges moves are indicated in As stated earlier, it is not the purpose of this contri-
the inner columns. The game runs by applying the clas- bution to present all the varieties of dialogical logics
sical rule [SR1-c]. which nevertheless should be taken into account in
At move 0, P states the initial thesis. At move 1, order to deal with the contextual aspect of argumen-
O chooses rank 1 and P chooses rank 2. At move 3, tation. More details on first-order dialogical logic are
O challenges the initial thesis uttering the antecedent to be found in Clerbout [14.42]. With different struc-
of the conditional, namely ¬¬p. P cannot answer im- tural rules it is also possible to define a dialogical free
mediately by giving the consequent p because P cannot logics as in Rahman et al. [14.44], Fontaine and Red-
utter an atomic formula. Therefore, at move 4, P chal- mond’s paper in [14.45] and an application to the logic
lenges the double negation ¬¬p by uttering ¬p. No of fiction is to be found in Fontaine [14.46]. For the in-
defence is allowed and O has to counter-attack by utter- troduction of modal operators (and explicit contexts of
ing p. P uses that concession to answer to the attack 3 argumentation) and their use in different modal frames,
at move 6. Again, P wins. However, this game has been see [14.47] and [14.31].
Argumentation and Abduction in Dialogical Logic 14.4 Beyond Deductive Inference: Abduction 303
Part C | 14.4
approach in the context of dialogical logic, and more precisely in the context of the dialogical approach to belief revision of Fiutek [14.51]. In a similar way, Nepomuceno et al. [14.52] (see also Chap. 13 by Nepomuceno et al.) define abduction in the context of dynamic epistemic logic and its public announcement operator. This might be dialogically understood on the basis of Magnier [14.53]. However, this is not the path followed in this contribution, because the agent would be introduced into the language and abduction would still be understood in terms of consequence-having relation, despite some kind of interaction in a dialogical reconstruction. Moreover, an epistemic understanding of abduction would lead to consider hypothetical abductive solutions as new pieces of knowledge; something that is not defended in this chapter, as clarified in the following.

Essentially, the challenge consists in explaining what is specific to abduction in a dialogue. As shown below, while studying abduction, the concepts of abductive problem and abductive solution are fundamental (Chap. 10). In order to define dialogues based on these concepts, a new kind of move performed by means of a specific type of speech act is needed. Therefore, the problem is to clarify this type of speech act and the rules which govern it. Again, the key question

the conclusion is to be understood as an ignorance-preserving relation.

Abduction is an inference triggered in response to an ignorance problem, in particular, there is an ignorance problem when, with respect to a (surprising) fact or state of affairs, there is a question (a problem), Q, we cannot answer with our present knowledge. We assume that there is a sentence α such that if we knew it, it would help us to answer Q. With respect to such a Q, three situations are possible:

• Subduance, that is, new knowledge removes ignorance (e.g., by discovering an empirical explanation)
• Surrender, that is, we give up and do not look for an answer
• Abduction, that is, we set a hypothesis as a basis of new actions.

Abduction is thus an inference by means of which we do not solve the ignorance problem, but we overcome it in a certain way by setting a hypothesis. This hypothesis can then be released in further reasoning, something which allows for specific kinds of actions. In Woods’ words, abduction “is a response that offers the agent a reasoned basis for new action in the presence of that ignorance” [14.4, p. 368]. Therefore, what
304 Part C The Logic of Hypothetical Reasoning, Abduction, and Models
must be grasped here is that the conclusion of an abduction is not (necessarily) a true sentence or a new piece of knowledge; it is a hypothesis that can be used in further reasoning. The ignorance contained at the level of the premises is inherited by the conclusion. What is specific in the relation between premises and conclusions here is not a gain of knowledge, but rather an ignorance-preserving relation.

For reasons of clarity, the GW schema is formally presented following Woods’ latest version in [14.4, p. 369]. Let T be an agent’s epistemic state at a specific time, K the agent’s knowledge base at that time, K* an immediate successor base of K, R an attainment relation for T (that is, R(K, T) means that the knowledge-base K is sufficient to reach the target T), ⇝ a symbol denoting the subjunctive conditional connective, for which no particular formal interpretation is assumed, and K(H) the revision of K upon the addition of H. C(H) denotes the conjecture of H and H^c its activation. Let T!Q(α) denote the setting of T as an epistemic target with respect to an unanswered question Q to which, if known, α would be the answer. According to the GW schema, the general structure of abduction is as follows:

1. T!Q(α)
2. ¬(R(K, T)) [fact]
3. ¬(R(K*, T)) [fact]
4. H ∉ K [fact]
5. H ∉ K* [fact]
6. ¬R(H, T) [fact]
7. ¬R(K(H), T) [fact]
8. H ⇝ R(K(H), T) [fact]
9. H meets further conditions S1, ..., Sn [fact]
10. Therefore, C(H) [subconclusion, 1-7]
11. Therefore, H^c [conclusion, 1-8]

The aim, here, is to characterize what is specific to abductive inference, by taking into account what triggers such an inference, and to describe the subsequent process. At the beginning, a cognitive target T!Q(α) is set (1): something we aim to reach in response to an ignorance problem. The ignorance problem triggers an abduction because it is a cognitive irritant, that is, it places us in an unpleasant situation of lack of knowledge which can be overcome by action and reasoning.

Step (2), ¬(R(K, T)), says that the current knowledge is insufficient to attain the cognitive target. This is essential if we face an ignorance problem. Step (3), ¬(R(K*, T)), says that there is no immediate successor of K by means of which the target would be attained. This is a crucial step. If there were such a K*, we would just extend our knowledge by adding new information and would refrain from triggering anything such as an abduction. This would be subduance, that is, new knowledge would remove the initial ignorance.

If there is no K or K* relating to the cognitive target, a hypothesis H is sought by the agent in order to set a plausible solution to the ignorance problem. Such a hypothesis is not knowledge, it is a hypothesis. This is represented in steps (4) and (5). Since it is only a hypothesis, it cannot relate to the cognitive target either, because it is not a solution. Even combined with the knowledge set, the cognitive target is not attained. This is expressed in steps (6) and (7).

What is the purpose of the hypothesis H if it does not solve the problem? In step (8), it is settled as a hypothesis that subjunctively relates to the cognitive target in combination with our knowledge base. What does this mean that it subjunctively relates to the cognitive target? This is how Gabbay and Woods understand Peirce’s “hence” in the schema laid down by Peirce [14.55, 5.189] (see also Chap. 10 for the original formulation). It means that it is not a true sentence, it is not a piece of knowledge either, but if it were, it would give an acceptable solution to the cognitive problem. As in step (9), some additional conditions should be added for the acceptability of H.

Having set hypothesis H as a subjunctive solution of the cognitive problem, abduction first consists in concluding that we are right in conjecturing that hypothesis. This is the first subconclusion at step (10). C(H) means that the hypothesis H is conjectured. It is important to notice here that abduction does not end at step (10). Indeed, by taking seriously the fact that abduction is triggered by a cognitive problem, we trigger an abduction not to conjecture a hypothesis, but in order to find a possibility of further actions despite the lack of knowledge. Therefore, the abduction should not end before step (11), that is, when the conjecture is released and when the hypothesis is used in further reasoning as a basis for new action. H^c represents the hypothesis released in a further reasoning, that is, in a reasoning in which we act on the hypothesis H, and the superscript c indicates the conjectural origin of the hypothesis. Following Woods [14.4, p. 371] an inference that ends at step (10) will be called a partial abduction, and an inference continuing with step (11) a full abduction.

For the purpose of clarity, in step (10) we face two possibilities. First, we do not test the hypothesis but we use it in a further reasoning (as in step (11)). This is precisely what is called full abduction. Second, we test the hypothesis, by empirical methods, for example. This presents us with three possibilities. First, the hypothesis is confirmed and we obtain a new piece of knowledge; this would lead to a situation similar to the K* situation above. In this case, no full abduction is triggered, that is, we do not act on the hypothesis in an ignorance-preserving way. In fact, we would end with
to be disclosed is now that there is a faster way to go to Universidad which cannot be explained on the basis of the information contained on the incomplete map.

With respect to the previously detailed GW schema, step (1) T!Q(α) is such that Q is the question of knowing how their workmate might have arrived so early. The cognitive target T would be a situation in which an α is known such that α would be the answer to that question. With respect to step (2), their knowledge base is insufficient to answer the question because their map does not show any other way to reach Universidad (¬R(K, T)). In step (3), they receive no further knowledge (e.g., an updated map) to answer the question (¬R(K*, T)). There are three possibilities: First, they do not care and follow the same trip as the day before (surrender). Second, they search for more information and obtain an updated map in which the line between Mixcoac and Universidad appears (subduance). Note that in this last case, no abduction is triggered; a new piece of information is added to the knowledge base (such that the new knowledge-base K* explains why the workmate went faster the day before, that is, R(K*, T)). Third, they perform an abduction. That is, they conjecture the existence of the line and, therefore, they can leave half an hour later the following day. The exis-

because even combined with their knowledge-base K, it does not relate to the cognitive target (¬R(K(H), T)). Step (8) is crucial because H only subjunctively relates to the cognitive target, that is, the effective existence of another line might be such that, when added to the knowledge-base K, it would allow the cognitive target T to be reached. However, H is only a hypothesis and without further information, it does not constitute an α answer to Q that would relate to the cognitive target T. If H meets further conditions (S1, ..., Sn) it might be considered as a good or plausible explanation, perhaps the best one, as expressed in step (9), and that hypothesis would be conjectured as in step (10) (C(H)). The following day, Sahid and Ángel stop at Mixcoac as if they knew the existence of this line, but in fact they do not. That is, in step (11), they release the conjectured hypothesis (H^c) and act upon it despite their persisting ignorance with respect to the genuine explanation of the initial problem.

The fact that an epistemic view of the abductive inference thus described would not grasp the specificity of abduction has to be emphasized. Indeed, the new hypothesis is not to be considered as a new piece of knowledge or belief. It might be accepted as an abductive conclusion and as a good explanation without being believed or accepted as the good explanation. What is characteristic of an abduction is the conjectural aspect of its conclusion and the activation of the hypothesis in further reasoning. What is essential to an abduction is that the cognitive target is not attained by a definitive solution of the initial problem at the level of the conclusion.
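
The steps of the GW schema can be traced in a small Python sketch. It is only an illustration under strong assumptions that are not part of the chapter: knowledge bases are modelled as sets of sentences, the attainment relation R holds when every sentence the target needs is known, and the subjunctive conditional of step (8) is read as "R would hold if H were known", although the schema itself assumes no particular formal interpretation of it. The names `Base`, `attains` and `gw_schema`, and the sentences used for the map example, are hypothetical. Surrender is not modelled, since it is a decision of the agents rather than a property of their knowledge.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Base:
    """A knowledge base: what is known, plus what is merely conjectured."""
    known: frozenset
    conjectured: frozenset = frozenset()


def attains(base, target):
    """Toy attainment relation R(K, T): only *known* sentences count."""
    return target <= base.known


def gw_schema(K, K_star, H, target, conditions_met=True, release=False):
    """Classify a situation along the lines of steps (2)-(11)."""
    if attains(K, target):
        return "no ignorance problem"        # step (2) fails
    if attains(K_star, target):
        return "subduance"                   # step (3) fails
    with_h_conjectured = Base(K.known, K.conjectured | {H})  # K(H)
    if_h_were_known = Base(K.known | {H})    # assumed reading of step (8)
    steps = [
        H not in K.known,                                        # (4)
        H not in K_star.known,                                   # (5)
        not attains(Base(frozenset(), frozenset({H})), target),  # (6)
        not attains(with_h_conjectured, target),                 # (7)
        attains(if_h_were_known, target),                        # (8)
        conditions_met,                                          # (9)
    ]
    if not all(steps):
        return "no abduction"
    # Step (10) alone is a partial abduction; releasing H is step (11).
    return "full abduction" if release else "partial abduction"


# Hypothetical encoding of the map example.
line = "a line links Mixcoac and Universidad"
old_map = Base(frozenset({"the map shows no such line"}))
updated_map = Base(old_map.known | {line})
target = frozenset({line})

print(gw_schema(old_map, old_map, line, target))                # partial abduction
print(gw_schema(old_map, old_map, line, target, release=True))  # full abduction
print(gw_schema(old_map, updated_map, line, target))            # subduance
```

The point of keeping `known` and `conjectured` apart is that conjecturing H never changes what R sees, which is one simple way to picture the ignorance-preserving character of the inference.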
Part C | 14.5
uation, P has to be allowed to claim that she is facing a device that allows switches of the burden of proof, in
an abductive problem. In fact, as it is clear in the di- addition to the particle rule for the F operator. This pre-
alogue (Table 14.3), an abductive problem is triggered supposes a generalization of the formal rule by means
by a concession problem: P cannot explain her thesis on of a structural rule which says that the player who
the basis of the concessions. plays formally (i. e., the player who cannot introduce
The notion of abductive problem is dialogically atomic formula) is the player who challenges an F
defined following Aliseda’s [14.50, p. 47] definitions operator (or the player who defends an AS-move). In
of abductive novelty and abductive anomaly (see also other words, the argumentation partner who challenges
Chap. 10): P will now be allowed to claim I am fac- a formula such as F ˚ will have to take the burden
ing an abductive novelty or I am facing an abductive of the proof by defending ˚ under the formal restric-
anomaly, as in the rules [SR-AN] and [SR-AA] below. tion.
In the first case, P is committed to show that neither ˚
[SR2.1] [Formal Restriction]
nor :˚ is entailed by . In other words, P has no win-
Let dn be a section of a dialogue (main dialogue or
ning strategy for ˚ nor for :˚, given . In the second
subsection): If X plays under formal restriction in
case, P is committed to show that ˚ is not entailed by
dn , then X is not allowed to utter an atomic formula
while :˚ is. That is, P has a winning strategy for :˚
unless Y uttered the same atomic formula before in
but not for ˚ given . Here, the technical difficulty is
the same section dn .
that the Proponent would need a losing strategy in order
to justify she is facing an abductive novelty or an abduc- The rule governing the application of the formal
tive anomaly. How strange such a game would be! restriction is now defined as follows:
Table 14.4 Particle rule for F operator
Assertion Challenge Defence
XŠ F ˚ d1 Y‹F d1:i XŠ :. ! ˚ / d1:i
Y opens a subdialogue d1:i .
[SR2.2] [Application of the Formal Restriction] at move 7 by opening a subdialogue in which she now
The application of the formal restriction is regulated plays formally. P answers by saying :. ! .C ^:E//,
by the following conditions: where is the same set of initial concessions as be-
fore, and the dialogue runs as usual. It is easy to verify
1. In the main dialogue d1 , if X D P, then X plays
that O will lose (for the same reasons P lost in Dia-
formally.
logue 2, Table 14.3). In the same way, O will lose even
2. If X opens a subdialogue d1:i by challenging an F
if she challenges the second conjunct in move 4, and
operator, then X plays formally.
P will have justified that she was facing an abductive
3. If X opens a subdialogue d1:i by challenging an
novelty.
AS-move, then Y plays formally.
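
As a rough illustration of how [SR2.1] and [SR2.2] interact, the following Python sketch tracks, for one section of a dialogue, which player is under the formal restriction and which atomic formulas each player has uttered there. The class `Section`, the function `open_section` and the string labels for the kinds of section are hypothetical names introduced for this sketch; they are not part of the dialogical framework itself.

```python
from dataclasses import dataclass, field


def other(player):
    """Return the argumentation partner of `player` ("P" or "O")."""
    return "O" if player == "P" else "P"


@dataclass
class Section:
    """One section of a dialogue: the main dialogue or a subdialogue."""
    formal_player: str  # the player under the formal restriction here
    atoms: dict = field(default_factory=lambda: {"P": set(), "O": set()})

    def may_utter(self, player, atom):
        # [SR2.1]: the formal player may utter an atomic formula only if
        # the other player uttered it before in the same section.
        if player != self.formal_player:
            return True
        return atom in self.atoms[other(player)]

    def utter(self, player, atom):
        if not self.may_utter(player, atom):
            raise ValueError(f"{player} may not utter {atom!r} here")
        self.atoms[player].add(atom)


def open_section(kind, opener="P"):
    """[SR2.2]: decide who plays formally in a new section."""
    if kind == "main":
        return Section("P")            # condition 1
    if kind == "F-challenge":
        return Section(opener)         # condition 2: the challenger
    if kind == "AS-challenge":
        return Section(other(opener))  # condition 3: the defender
    raise ValueError(f"unknown kind of section: {kind}")


main = open_section("main")
print(main.may_utter("P", "p"))   # False: O has not uttered p yet
main.utter("O", "p")
print(main.may_utter("P", "p"))   # True: P can now copy O's atom
```

Because the atoms are recorded per section, a concession made in the main dialogue does not automatically license the formal player in a subdialogue, which matches the wording "in the same section" of [SR2.1].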
Now, structural rules that allow the Proponent to 14.5.2 Guessing
claim she is facing an abductive problem are added.
She has the choice between the two kinds of abductive After having shown she was facing an abductive prob-
problems previously defined: lem, the Proponent has to guess what is missing in
order to solve it. The Proponent has to be allowed
Even if, from a dialogical viewpoint, it is not looked conditions of use are not the same as the usual as-
for any true formula, what is an abductive solution may sertions of the standard dialogical logic because it is
be defined following a similar process. Indeed, after subject to further justification, no matter whether it is an
having shown that she was facing an abductive prob- atomic or a complex formula: Would it be a new kind
lem, the Proponent should be allowed to put forward of utterance?
a hypothesis. What the Proponent has to look for is
a formula, not conceded by the Opponent, that enables 14.5.3 Committing
her to win the dialogue previously lost. Therefore, a rule
that allows the Proponent to conjecture the hypothesis The dialogues defined here only describe a partial
of an explanation called abductive solution is added: abduction, that is, an abductive problem is set and
a plausible answer is guessed. However, in order to
[SR-SA] [Abductive Solution Rule]
characterize a full abduction, it should be explained
When the Proponent has won the subdialogue trig-
how the conjecture might be released in a further dia-
gered by the challenge of the F operator, whether
logue and how the players might act upon it. As already
it be novelty or anomaly, the Opponent is allowed
explained, Gabbay and Woods characterize abduction
to ask her ‹AS (i. e., she claims do you have an ab-
as an ignorance-preserving inference. It has been shown
ductive solution to propose?). If so, the Proponent
that abductive dialogues are not-conceded-preserving:
answers AS W ˛ (i. e., she claims ˛ is my abductive
The explicative conjecture remains not-conceded and
solution).
the Proponent only gives a subjunctive explanation for
What does it mean that ˛ is an abductive solution the surprising fact. The difficulty at this point involves
for the Proponent, and why is that abductive solution the clarification of the commitment carried by such con-
the conjecture of a hypothesis? In fact, this move con- jectural moves, which are rather different from the usual
sists in claiming that there is a plausible explanation assertions.
to the surprising fact ˚ given . This specific move, Even if the question of the commitment of the
AS W ˛, is the move that forces to reconsider dialogi- conjectural move is very complex (it might even vary
cal games to fit in with abductive reasoning. Indeed, according to the argumentation contexts), a rule to deal
it may consist of the utterance by the Proponent of an with the consequence requirement of the type of abduc-
atomic formula not previously conceded by the Oppo- tion called “plain abduction” by Aliseda [14.50, Part II]
nent. Nevertheless, the introduction of this new piece can be defined. In a dialogue, this consists in adding
of information is to be understood as a subjunctive ex- the possibility of a challenge on the AS-move, called an
planation. That is, the Proponent introduces ˛ as she AS-challenge. The Opponent makes the request to jus-
would say if you had conceded me ˛, I would have tify that it is sufficient to consider the conjunction of
been able to explain ˚. In no way is ˛ introduced as and ˛ to derive ˚ by means of the rule in Table 14.6.
an O-concession to be incorporated into or into a 0 , Under this rule, the challenger opens a subdialogue
a successor of containing the initial concessions and in which the defender will have to defend the condi-
the other concessions made during the dialogue. ˛ is tion . ^ ˛/ ! ˚. The act of Y opening a subdialogue
a new formula that may be used in further reasoning, means that X will play under formal restriction. The for-
but only temporarily, and that temporary nature re- mal restriction is applied in accordance with the rules
quires further justification. Indeed, as a hypothesis, α [SR2.1] and [SR2.2] given earlier. More kinds of such
is defeasible, that is, it is a conclusion faute de mieux challenges should be defined to complete the picture.
guessed by the Proponent. If it is shown later that this is Adding the explanatory character of α might also be
not a good explanation or if a counter-example is en- required. Thus, the possibility to choose another attack
countered, then α will be defeated and removed. Its against an AS-move is offered to Y (Table 14.7).
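
The requirements discussed here (abductive novelty, abductive anomaly, the consequence requirement behind the AS-challenge, and consistency) can be checked mechanically in a simplified setting. The Python sketch below assumes classical propositional logic and decides entailment by truth tables, whereas the chapter defines these notions through winning strategies in dialogues that may follow other structural rules. Formulas are represented as functions from valuations to truth values, and all names (`entails`, `classify_problem`, `meets_consequence_requirement`, `is_consistent`) as well as the rain example are hypothetical.

```python
from itertools import product


def entails(premises, conclusion, atoms):
    """Classical entailment by truth tables (an assumption of this sketch)."""
    for values in product([False, True], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if all(p(v) for p in premises) and not conclusion(v):
            return False
    return True


def classify_problem(concessions, phi, atoms):
    """Novelty: neither phi nor its negation follows; anomaly: the negation does."""
    def not_phi(v):
        return not phi(v)
    if entails(concessions, phi, atoms):
        return "no abductive problem"
    if entails(concessions, not_phi, atoms):
        return "abductive anomaly"
    return "abductive novelty"


def meets_consequence_requirement(concessions, alpha, phi, atoms):
    """The concessions together with alpha suffice to derive phi."""
    return entails(concessions + [alpha], phi, atoms)


def is_consistent(concessions, alpha, atoms):
    """The concessions together with alpha do not entail falsum."""
    def falsum(v):
        return False
    return not entails(concessions + [alpha], falsum, atoms)


# Hypothetical example: O concedes "rain implies wet"; the surprising fact is "wet".
atoms = ["rain", "wet"]
concessions = [lambda v: (not v["rain"]) or v["wet"]]
phi = lambda v: v["wet"]
alpha = lambda v: v["rain"]

print(classify_problem(concessions, phi, atoms))                      # abductive novelty
print(meets_consequence_requirement(concessions, alpha, phi, atoms))  # True
print(is_consistent(concessions, alpha, atoms))                       # True
```

Further requirements such as minimality or explanatory character would need additional checks of the same kind; they are left out here.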
Other requirements, such as consistency (; ˛ ² another thesis as if it had been conceded, should be
?), minimality and so on (Chap. 10), might be added developed. Indeed, full abductive dialogue should be
in the same way. It would also be possible to rely on not-conceded-preserving, that is, the agents act upon
these rules in order to deal with the defeasibility of AS- the hypothesis although it has not been conceded by
moves. Indeed, if a player is not able to answer the the Opponent. This is the dialogical understanding of
AS-challenges performed by her argumentation part- ignorance-preservation in the GW model of abduction
ner, then her conjectural move should be removed and defended in this contribution. In the GW schema, it was
considered as null. In the same way, if some counter- said that neither R.K.H/; T/ nor R.K .H/; T/ were the
examples or a better explanation are found, the AS- case. Here, this parallels the fact that P does not actually
moves should also be cancelled. However, defeasibility attain the target. P only encounters something similar to
is a very wide topic and cannot be dealt with in detail a subjunctive winning strategy, a strategy which would
in this chapter. A nonmonotonic account of abduction lead to the victory if O had conceded ˛; similarly in
that makes use of adaptive logic is given by Meheus and the GW model, it is only a subjunctive attainment rela-
Batens [14.59] and Beirlaen and Aliseda [14.60] (see tion expressed by H R.K.H/; T/. Now, the challenge
also Chap. 12 by Gauderis). For a dialogical study of faced in order to complete the picture and to define
defeasible reasoning, see the work of Nzokou [14.61]. the conditions of use of a hypothetical explanation ˛
For a nonmonotonic treatment of inconsistencies in the in a full abduction, consists in providing an in-depth
context of an adaptive dialogical logic, see Rahman and analysis of the commitment carried by such a conjec-
Van Bendegem [14.62] . ture. This relates to the following question: What kind
What has been characterized in this section is only of speech act is at stake when a hypothesis is conjec-
partial abduction. In order to attain a full abduction, the tured? Without a precise answer to this question, no
framework over which dialogues are obtained and in precise rule of victory for abductive dialogues can be
which the hypothesis ˛ is released in the defence of yet formulated.
14.6 Hypothesis: What Kind of Speech Act?

In the previous section, a new kind of move specific to abductive dialogues, the so-called AS-move, by means of which a hypothetical abductive solution is conjectured, has been introduced. Such a move is considered as a subjunctive move, that is, a move stated hypothetically with an assumption such as "if you had conceded me α, I would have been able to justify Φ". The conditions under which it is possible to conjecture an abductive solution and how such a hypothetical abductive solution might be challenged have been clarified. However, by means of what kind of speech act is an AS-move performed? What kind of speech act is the conjecture of a hypothetical abductive solution if α can be used in the defence of another thesis (in a full abduction)?

An epistemic explanation might have seemed attractive, relying for example on the notion of subjunctive knowledge defined by Rückert [14.63]. Subjunctive knowledge is defined in a modal frame as the knowledge people of another world would have about the actual world. Abduction might thus be thought of in terms of subjunctive epistemic change, namely if some people of another world had the knowledge of what is expressed by the hypothesis, they would be able to explain a surprising fact in the actual world. This would smartly explain the subjunctive status of the explanatory relation conjectured in a hypothetical abductive solution. However, it would have ended up in an account explicitly involving the epistemic states of the agents instead of taking into account their actions. Moreover, such an account would yield an excessively strong commitment on the part of the agent with respect to the belief or the knowledge of the truth of the hypothetical abductive solution. However, as explained earlier, this is not necessary. An abductive solution can be conjectured as being plausible without any commitment to the belief of the truth of what is expressed.

This last point brings back the problem of the status of an AS-move. Is it an assertive speech act? How could it be? An assertive speech act is usually characterized by the commitment (of the speaker) to its truth. In his theory of speech acts, Searle [14.64, p. 12] defines the class of assertive speech acts as follows:

“The point or purpose of the members of the assertive class is to commit the speaker (in varying degrees) to something being the case, to the truth of the expressed proposition.”

Although, in dialogical logic, the commitment to the truth is irrelevant in the characterization of an assertion, assertion can be thought of in terms of commitment to justify what is said (by defending it against further challenges or by relying on the concessions of the Opponent). What about the AS-move? It is conjectured and might be released in another dialogue without being conceded by the Opponent or fully justified by the Proponent. Therefore, is the conjecture of a hypothesis an assertive utterance in the dialogical sense of the term? In Searle’s terms, is conjecturing an assertive act? It seems that it cannot be. Answering these questions is crucial if the aim is to succeed in introducing the AS-move defeasibly and to release the conjectures in further reasoning in the same way as in the GW schema of abduction in a dialogical framework.

If the speech act, by means of which a hypothetical abductive solution is conjectured, is not an assertion, would it be a commissive speech act? Beyond the question of the commitment to the truth or to belief, or even to the acceptance of what is uttered, an abductive solution commits the speaker to a subsequent series of actions. First, the speaker is committed to answer the AS-challenges. Second, the use of the hypothesis in a full abduction without knowing whether it is true or not, might be seen as a peculiar kind of commitment. Does such a peculiar commitment relate to what Searle has called the commissive speech acts? More precisely, Searle defines the commissives as “those illocutionary acts whose point is to commit the speaker (again in varying degree) to some future course of action” [14.64, p. 14]. In the dialogical approach, which has been outlined in the previous section, the underlying idea is that the Proponent conjectures a hypothetical abductive solution which is such that, if it had been conceded, it would have explained the surprising fact. However, this should not be the end of the story because the aim would be to release the hypothesis in a further reasoning: in another dialogue in which the Proponent defends another thesis by acting on the hypothesis at stake. That is why the commitment carried out by the speech act, by means of which an AS-move is performed, indicates that it could be understood in terms of a commissive speech act. In addition to further justification, it also commits the agent to further dialectical actions. Nevertheless the commissives are usually speech acts in which the agents commit themselves to an action over which they have full control. That is to say, the commissive speech act commits to something that depends only on the agents, as it happens in the case of promises and oaths. However, the agent who performs an abduction does not have full control of the explanatory force of an abductive solution. Indeed, while in the first case the failure of the promise is dependent upon the agent herself, the failure of an abductive explanation includes a wider range of factors which do not exclusively depend on the agent's activity. So, it does not seem that the speech act by means of which an AS-move is performed is a commissive speech act.

If it consists in neither an assertive nor a commissive act, would a conjectural move be a fictional speech act? Indeed, according to Searle [14.65], fictional discourse is not composed of genuine assertions but instead of pretended assertions. The point is that in fiction, even if the author is not committed to the truth of what she says, she does not have the intention to lie. Therefore, the author does not tell the truth but neither is the author lying. The author tells a story doing as if she were asserting. When a player performs an AS-move and uses it in a further reasoning, she does as if it were conceded. She does not have to believe what she says but neither is she trying to mislead the interlocutor. However, beyond the fact that Searle’s theory of fictionality is not shared by this contribution, it is thought that abduction has a practical dimension, which is not necessary to the fictional discourse. Hence, in this chapter, it is not believed that the hypothetical speech act should be explained in terms of fictional discourse. Moreover, what is to be explained while studying fiction is its double aspect, the fact that while we know it is not true we react to such a discourse without experiencing any kind of cognitive dissonance. Also there is no such tension to be explained in the conjecture of a hypothetical abductive solution. (For more details on these points, see [14.66–68].)

An alternative, though tentative, solution would be to reconsider the taxonomy of speech acts. For example, Bach [14.69] defines the wider category of constatives, which the conjecturing act would be part of. Other inspirations might be found in the work of Barés Gómez [14.70] who distinguishes between different kinds of assertions in natural language (asseverative paradigm, negative paradigm, and evidentiality) by making use of Dynamic Epistemic Logic and by focusing on the transmission of information. These different kinds of assertions might also be understood as different types (talking thus about hypothetical judgement); the recent work of Rahman and Clerbout [14.71] on Constructive Type Theory in the context of dialogical logic follows in this respect. The question is left as a challenge for further investigations. Is a hypothetical speech act a particular kind of assertive or commissive act? Is it a mix of both? Is it a completely new kind of speech act?
14.7 Conclusions
In this chapter, it is first advocated for a reconcilia- nature of such a hypothetical speech act is being faced,
tion of argumentation theory and formal logic in an which leads to the key question of commitment. What
agent-centered theory of reasoning, that is, a theory in are we committed to when we conjecture a hypotheti-
which inferences are studied in terms of human activ- cal explanation of a surprising fact and when we release
ities. More precisely, the dialogical approach to logic, such a hypothesis in further reasoning? A definite an-
in which reasoning is studied through a dialectical in- swer to this question is let for further investigations.
teraction between the Proponent of a thesis and the
Opponent of it, is defended. In this context, the ne- Acknowledgments. Research for this chapter was
cessity of taking into account, not only the actions of supported by the project Logics of discovery, heuris-
the agents, but also the importance of the notion of tics and creativity in the sciences (PAPIIT, IN400514-3)
commitment is stressed. Beginning with deductive di- granted by the National Autonomous University of
alogues, the picture has been extended to abduction, Mexico (UNAM) and by the project Interpretaciones
which is considered as a case of nondeductive reason- alternativas de lógicas no clásicas, IALNoC (P10-
ing. HUM-5844) granted by Junta de Andalucía (Conse-
The starting point to deal with abduction is the jería de Innovación, Ciencia y Empresas). Matthieu
agent-centered analysis of the GW model. While Gab- Fontaine is greatly indebted to the Dirección General de
bay and Woods identify abduction as an ignorance- Asuntos del Personal Académico (UNAM) and to the
preserving inference triggered by an ignorance prob- Programa de Becas Posdoctorales de la Coordinación
lem, abductive dialogues have been defined here as de Humanidades (UNAM). We thank Atocha Aliseda
not-conceded-preserving dialogues triggered by a con- and Mathieu Beirlaen for their comments. We are also
cession problem. The specificity of abductive dialogues thankful to Shahid Rahman for fruitful discussions on
has been identified at the level of the so-called AS- these topics (some of his arguments were detailed in his
moves by means of which hypothetical abductive so- work What Is Wrong about Pereleman-Toulmin’s Oppo-
lutions are conjectured. To allow such moves, new sition between Legal Reasoning and Logic?, JURILOG
rules have been put forward. The challenge for dialo- conference, Lille, 2014 and some ideas were suggested
gicians now consists in exploring the release of such during our talk Transmission de l’information dans les
Part C | 14
hypotheses in further dialogues in which they remain pratiques argumentatives. Evidentialité dans une sé-
not-conceded. However, the difficulty of defining the mantique dialogique, Ve SPS, Lille, 2014).
References
14.1 S.E. Toulmin: The Uses of Argument (Cambridge Univ. Press, Cambridge 2003)
14.2 C. Perelman, L. Olbrechts-Tyteca: Traité de l'Argumentation: La Nouvelle Rhétorique (Presses Univ. de France, Paris 1958), in French
14.3 D. Walton, E. Krabbe: Commitment in Dialogue. Basic Concepts of Interpersonal Reasoning (SUNY Press, Albany 1995)
14.4 J. Woods: Errors of Reasoning. Naturalizing the Logic of Inference (College Publications, London 2013)
14.5 D. Gabbay, J. Woods: Advice on abductive logic, Log. J. IGPL 14(2), 189–219 (2006)
14.6 D. Hitchcock, B. Verheij (Eds.): Arguing on the Toulmin Model. New Essays in Argument Analysis and Evaluation (Springer, Dordrecht 2006)
14.7 J.F.A.K. van Benthem: One logician's perspective on argumentation, COGENCY 1(2), 13–25 (2009)
14.8 L.E.J. Brouwer: Intuitionisme en formalisme (Noordhoff, Groningen 1912), in Dutch
14.9 P. Lorenzen: Einführung in die Operative Logik und Mathematik (Springer, Berlin 1955), in German
14.10 P. Lorenzen, K. Lorenz: Dialogische Logik (Wissenschaftliche Buchgesellschaft, Darmstadt 1978), in German
14.11 P. Gochet: The dynamic turn in twentieth century logic, Synthese 130(2), 175–184 (2002)
14.12 J. Hintikka: Knowledge and Belief. An Introduction to the Logic of the Two Notions (Cornell Univ. Press, Ithaca 1962)
14.13 G. Priest: Towards Non-Being: the Semantics and Metaphysics of Intentionality (Oxford Univ. Press, Oxford 2005)
14.14 C.E. Alchourrón, P. Gärdenfors, D. Makinson: On the logic of theory change: Partial meet contraction and revision functions, J. Symb. Log. 50(2), 510–530 (1985)
14.15 J. Groenendijk, M. Stokhof: Dynamic predicate logic, Linguist. Philos. 14(1), 39–100 (1991)
14.16 A. Baltag, H. van Ditmarsch, L.S. Moss: Epistemic logic and information update. In: Handbook on the Philosophy of Information, ed. by P. Adriaans, J.F.A.K. van Benthem (Elsevier, Amsterdam 2008)
14.17 H. van Ditmarsch, W. van der Hoek, B. Kooi: Dynamic Epistemic Logic (Springer, Dordrecht 2008)
14.18 J.F.A.K. van Benthem: Logical Dynamics of Information and Interaction (Cambridge Univ. Press, Cambridge 2014)
14.19 R. Fagin, J.Y. Halpern, Y. Moses, M.Y. Vardi: Reasoning About Knowledge (Bradford, The MIT Press, Cambridge 2004)
14.20 R. Morado: Problemas filosóficos de la lógica no-monótona. In: Filosofía de la Lógica: Enciclopedia Iberoamericana de Filosofía, Vol. 27, ed. by R. Orayen, A. Moretti (Trotta, Madrid 2005), in Spanish
14.21 R. Koons: Defeasible Reasoning. The Stanford Encyclopedia of Philosophy, ed. by E.N. Zalta, Spring Ed., http://plato.stanford.edu/archives/spr2013/entries/reasoning-defeasible (2013)
14.22 J.L. Pollock: Defeasible reasoning, Cogn. Sci. 11, 481–518 (1987)
14.23 R. Reiter: A logic for default reasoning, Artif. Intell. 13, 81–137 (1980)
14.24 J. McCarthy: Circumscription – A form of non-monotonic reasoning, Artif. Intell. 13(1-2), 27–39 (1980)
14.25 M.L. Ginsberg (Ed.): Readings in Non-Monotonic Reasoning (Morgan Kaufmann, Los Altos 1987)
14.26 V.W. Marek, M. Truszczynski: Nonmonotonic Logic. Context-Dependent Reasoning (Springer, New York 1993)
14.27 J.F. Horty: Reasons as Default (Oxford Univ. Press, Oxford 2012)
14.28 D. Batens: A universal logic approach to adaptive logics, Log. Univers. 1, 221–242 (2007)
14.29 H. Prakken, G. Vreeswijk: Logics for defeasible argumentation. In: Handbook of Philosophical Logic, 2nd edn., Vol. 4, ed. by D. Gabbay, F. Guenthner (Kluwer Academic, Dordrecht 2002) pp. 219–318
14.30 S. Rahman: Über Dialogue, Protologische Kategorien und andere Seltenheiten (Peter Lang, Bern 1993), in German
14.31 S. Rahman, L. Keiff: On how to be a dialogician. In: Logic, Thought and Action, Logic, Epistemology and the Unity of Science, Vol. 2, ed. by D. Vanderveken (Springer, Dordrecht 2005) pp. 359–408
14.32 M. Fontaine, J. Redmond: Logique dialogique. Une Introduction (College Publications, London 2008), in French
14.33 N. Clerbout: La sémantique dialogique: Notions fondamentales et éléments de metathéorie (College Publications, London 2014), in French
14.34 E.M. Barth, E. Krabbe: From Axiom to Dialogue: A Philosophical study of Logic and Argumentation (de Gruyter, Berlin 1982)
14.35 E. Krabbe: Formal systems of dialogue rules, Synthese 63, 295–328 (1985)
14.36 F.H. van Eemeren, R. Grootendorst: A Systematic Theory of Argumentation. The Pragma-Dialectical Approach (Cambridge Univ. Press, Cambridge 2004)
14.37 F.H. van Eemeren, P. Houtlosser, A.F. Snoeck Henkemans: Argumentative Indicators in Discourse. A Pragma-Dialectical Study (Springer, Dordrecht 2007)
14.38 L. Wittgenstein: Philosophical Investigations (Blackwell, Oxford 1953), ed. by G. Anscombe, R. Rhees
14.39 A. Prior: The runabout inference-ticket, Analysis 21(2), 38–39 (1960)
14.40 S. Rahman, N. Clerbout, L. Keiff: On dialogues and natural deduction. In: Acts of Knowledge: History, Philosophy and Logic. Essays Dedicated to Göran Sundholm, ed. by G. Primiero, S. Rahman (College Publications, London 2009)
14.41 S. Rahman: Negation in the logic of first degree entailment and tonk: A dialogical study. In: The Realism-Antirealism Debate in the Age of Alternative Logics, ed. by S. Rahman, G. Primiero, M. Marion (Springer, Dordrecht 2010) pp. 175–201
14.42 N. Clerbout: First-order dialogical games and tableaux, J. Philos. Log. 43(4), 785–801 (2014)
14.43 W.V. Quine: Philosophy of Logic, 2nd edn. (Harvard Univ. Press, Cambridge 1986)
14.44 S. Rahman, M. Rückert, H. Fischmann: On dialogues and ontology. The dialogical approach to free logic, Log. Anal. 160, 357–374 (1997)
14.45 C. Barés Gómez, S. Magnier, F. Salguero (Eds.): Logic of Knowledge. Theory and Applications (College Publications, London 2012)
14.46 M. Fontaine, S. Rahman: Fiction, creation and fictionality – An overview, Methodos (2010), doi:10.4000/methodos.2343
14.47 S. Rahman, H. Rückert: Dialogische modallogik (für T, B, S4, und S5), Log. Anal. 167/168, 243–282 (2001)
14.48 L. Magnani: Logic and abduction: Cognitive externalizations in demonstrative environments, Theoria 60, 275–284 (2007)
14.49 R. Kowalski: Logic without model theory. In: What is a Logical System?, Studies in Logic and Computation, ed. by D. Gabbay (Oxford Univ. Press, Oxford 1994) pp. 73–106
14.50 A. Aliseda: Abductive reasoning. Logical Investigations into Discovery and Explanation (Springer, Dordrecht 2006)
14.51 V. Fiutek: Playing with Knowledge and Belief, Ph.D. (Universiteit van Amsterdam, Amsterdam 2013)
14.52 F.R. Velázquez-Quesada, F. Soler-Toscano, Á. Nepomuceno-Fernández: An epistemic and dynamic approach to abductive reasoning: Abductive problem and abductive solution, J. Appl. Log. 11(4), 505–522 (2013)
14.53 S. Magnier: Approche dialogique de la dynamique épistémique et de la condition juridique (College Publications, London 2013), in French
14.54 D. Gabbay, J. Woods: The Reach of Abduction. Insight and Trial (Elsevier, Amsterdam 2005)
14.55 C.S. Peirce: Collected Papers of Charles Sanders Peirce, ed. by P. Hartshorne, P. Weiss, A. Burks (Harvard Univ. Press, Cambridge 1931–1958)
14.56 L. Keiff: Le Pluralisme Dialogique. Approches dynamiques de l'argumentation formelle, Ph.D. Thesis (Université Lille 3, Lille 2007), in French
14.57 S. Rahman, T. Tulenheimo: From games to dialogues and back. In: Games: Unifying Logic, Language and Philosophy, Vol. 15, ed. by O. Majer, A.-V. Pietarinen, T. Tulenheimo (Springer, Dordrecht 2009) pp. 153–208
14.58 S. Rahman, H. Rückert: Dialogical connexive logic, Synthese 127, 105–139 (2001)
14.59 J. Meheus, D. Batens: A formal logic for abductive reasoning, Log. J. IGPL 14, 221–236 (2006)
14.60 M. Beirlaen, A. Aliseda: A conditional logic for abduction, Synthese 191(15), 3733–3758 (2014)
14.61 G. Nzokou: Logique de l'argumentation dans les traditions orales africaines. Proverbes, Connaissance et Inférences non-monotoniques (College Publications, London 2013), in French
14.62 S. Rahman, J.-P. Van Bendegem: The dialogical dynamics of adaptive paraconsistency. In: Paraconsistency: The Logical Way to the Inconsistent, Lecture Notes in Pure and Applied Mathematics, Vol. 228, ed. by W. Carnielli, M.E. Coniglio, I.M. Loffredo D'Ottaviano (Dekker, New York 2001) pp. 295–322
14.63 H. Rückert: A solution to Fitch's paradox of knowability. In: Logic, Epistemology and the Unity of Science, Logic, Epistemology and the Unity of Science, ed. by S. Rahman, J. Symons, D. Gabbay, J.P. Van Bendegem (Springer, Dordrecht 2004)
14.64 J.R. Searle: Expression and Meaning. Studies in the Theory of Speech Acts (Cambridge Univ. Press, Cambridge 1979)
14.65 J.R. Searle: The logical status of fictional discourse, New Lit. Hist. 6(2), 319–332 (1975)
14.66 J. Woods: Fictions and their logics. In: Handbook of Philosophy of Science, Vol. 5, ed. by D. Gabbay, P. Thagard, J. Woods (Elsevier, Amsterdam 2007) pp. 1061–1126
14.67 J. Woods, J. Isenberg: Psychologizing the semantics of fiction, Methodos (2010), doi:10.4000/methodos.2387
14.68 M. Fontaine: Argumentation et engagement ontologique. Être, c'est être choisi (College Publications, London 2013), in French
14.69 K. Bach: Speech acts. In: Routledge Encyclopedia of Philosophy (Routledge, London 1998)
14.70 C. Barés Gómez: Lógica dinámica epistémica para la evidencialidad negativa, Vol. 5 (College Publications, London 2013), in Spanish
14.71 S. Rahman, N. Clerbout: Constructive type theory and the dialogical approach to meaning, the Baltic International Yearbook of Cognition, Log. Commun. 8, 1–72 (2013)
15. Formal (In)consistency, Abduction and Modalities
15.1 Paraconsistency
Paraconsistency is the study of logical systems in which the presence of a contradiction does not imply triviality, that is, logical systems with a nonexplosive negation $\neg$ such that a pair of propositions $A$ and $\neg A$ does not (always) trivialize the system. In paraconsistent logics the principle of explosion does not hold:

$A, \neg A \nvdash B$ . (15.1)

But what would be the reason for devising a paraconsistent logic? Or more precisely, if avoiding contradictions is a fundamental criterion of thought and reason, what is the point of a formal system that tolerates contradictions? It will be argued here that to have available a logical formalism capable of dealing with contradictions does not imply any sympathy with the thesis that there are true contradictions, nor that reality is, in some sense, contradictory.

It is a fact that contradictions appear in a number of real-life contexts of reasoning. From databases to scientific theories, we often have to deal with contradictory information. There are several scientific theories, however successful in their areas of knowledge, that yield contradictions, either by themselves or when combined with other efficacious and compelling theories [15.1, Chap. 5]. The presence of contradictions is not a sufficient condition for discarding interesting theories. In order to deal rationally with contradictions, explosion cannot be valid without restrictions, since triviality, that is, a circumstance such that everything holds, is obviously unacceptable. Given that, in classical logic, explosion is a valid principle of inference, the underlying logic of a contradictory context of reasoning cannot be classical.

Indeed, the occurrence of contradictions in both scientific theories and everyday contexts of reasoning is being increasingly recognized. Notice that, as a general rule, these theories have been successful in describing and predicting a wide range of phenomena. The realist (and naive) assumption that scientific theories provide correct descriptions of reality would unavoidably imply that there are ontological contradictions, but this would be a careless and hasty conclusion, since these contradictions are better taken as provisional [15.2, p. 2]. If contradictions are provisional, they should not be taken as true contradictions. From a strictly logical point of view, the problem is how to formulate an account of logical consequence capable of identifying, in contradictory contexts, the inferences that are allowed, distinguishing them from those that must be blocked. It is clear that such an account of logical consequence must be paraconsistent.

The question about the nature of contradictions accepted by paraconsistent logics is where a great deal of the debate on the philosophical significance of paraconsistency has been concentrated. In philosophical terminology, we say that something is ontological when it has to do with reality, with the world in the widest sense, and that something is epistemological when it has to do with knowledge and the process of its acquisition. A central question for paraconsistency is the following: Are the contradictions that paraconsistent logic deals with ontological or epistemological? Do contradictions have to do with reality proper? Or do contradictions have to do with knowledge and thought?

Contradictions of the latter kind, called here epistemological contradictions, have their origin in our cognitive apparatus, in the failure of measuring instruments, in the interactions of these instruments with phenomena, in operations of thought, or even in simple mistakes that in principle could be corrected later on.
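The claim that explosion is classically valid can be checked mechanically. The following is a minimal sketch, assuming ordinary two-valued truth tables; the helper name `classically_entails` and the encoding of formulas as Python functions are illustrative choices, not part of the chapter.

```python
from itertools import product

def classically_entails(premises, conclusion, atoms):
    """True if every classical valuation satisfying all premises
    also satisfies the conclusion."""
    for values in product([False, True], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if all(p(v) for p in premises) and not conclusion(v):
            return False  # found a countermodel
    return True

# Formulas encoded as functions from a valuation to a truth value.
A = lambda v: v["A"]
not_A = lambda v: not v["A"]  # classical negation always flips the value
B = lambda v: v["B"]

# No classical valuation makes both A and not-A true, so the entailment
# holds vacuously: this is the principle of explosion.
print(classically_entails([A, not_A], B, ["A", "B"]))
# By contrast, A alone does not entail an unrelated B.
print(classically_entails([A], B, ["A", "B"]))
```

The check succeeds only because classical negation rules out any valuation in which both premises hold; a paraconsistent semantics has to admit such valuations in order to invalidate explosion.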
order to give a counterexample to the principle of explosion, we need a weaker negation and a semantics in which there is a model $M$ such that $A$ and $\neg A$ hold in $M$.

In classical logic the values 0 and 1 are understood respectively as false and true, but in nonclassical logics this does not need to be the case. It is not necessary that a paraconsistent logic takes a pair of formulas $A$ and $\neg A$ as both true. The semantic value 1 attributed to a formula $A$ may be read as "A is taken to be true", "A is possibly true", "A is probably true", or perhaps better as "there is some evidence that A is true", in the sense of there being reasons for believing that $A$ is true. Thus, the attribution of the value 1 to a pair of propositions $A$ and $\neg A$ does not need to be understood as if both propositions are true in the sense that there is something in the world that makes them true. Rather, it is better to consider that $A$ and $\neg A$ are both being taken in a sense weaker than true, perhaps waiting for further investigations that will decide the issue and discard one of them.

Logics of formal inconsistency (LFIs) are a family of paraconsistent logics that have resources to express the notion of consistency inside the object language by means of a sentential unary connective: $\circ A$ means (informally) that $A$ is consistent. As in any other paraconsistent logic, explosion does not hold in LFIs. But this is handled in a way that allows distinguishing contradictions that can be accepted from those that cannot. In LFIs, negation is explosive only with respect to consistent formulas:

$A, \neg A \nvdash_{LFI} B$, while $\circ A, A, \neg A \vdash_{LFI} B$ . (15.4)

An LFI is thus a logic that separates the propositions for which explosion holds from those for which it does not hold. The former are marked with $\circ$. For this reason, LFIs are called gently explosive.

In the $C_n$ hierarchy, introduced by da Costa in [15.3], the so-called well-behavedness of a formula $A$, in the sense that it is not the case that $A$ and $\neg A$ hold, is also expressed inside the object language. However, in $C_1$, $A^{\circ}$ is an abbreviation of $\neg(A \wedge \neg A)$, which makes the well-behavedness of a proposition $A$ equivalent to saying that $A$ is noncontradictory. A full hierarchy of calculi $C_n$, for $n$ natural, is defined and studied in [15.3].

The first step in paraconsistency is the distinction between triviality and contradictoriness. But there is a second step, namely, the distinction between consistency and noncontradictoriness. In LFIs the consistency connective $\circ$ is not only primitive, but it is also not always logically equivalent to noncontradiction. This is the most distinguishing feature of the logics of formal inconsistency. Once we break up the equivalence between $\circ A$ and $\neg(A \wedge \neg A)$, some very interesting developments become available. Indeed, $\circ A$ may express notions different from consistency as freedom from contradiction.

The circumstance in which both $A$ and $\neg A$ receive the value 1 may be understood as the presence of simultaneous but nonconclusive evidence that $A$ is true and that $\neg A$ is true. Evidence for $A$, in the sense proposed here, consists of reasons to believe $A$. One may be justified in believing that $A$ is true inasmuch as one has evidence available that $A$ is true. But of course it may be that there are also reasons for believing $\neg A$, and in this case the evidence is not conclusive.

Suppose that, according to some empirically testable criteria, an atomic proposition $A$ is true if and only if a condition $c$ is fulfilled and, on the other hand, there is also a condition $d$, independent of $c$, such that obtaining $d$ implies the truth of $\neg A$. In some critical circumstances, it may happen that both criteria $c$ and $d$ are obtained [15.4, pp. 9–10]. Although $c$ and $d$ have been conceived initially as criteria of truth, it seems far more reasonable at this point not to draw the conclusion that $A$ and $\neg A$ are both true. It is better to be more careful and to take the contradiction as a provisional state, a kind of excess of information that should, at least in principle, be eliminated by means of further investigation. The criteria $c$ and $d$ provide reasons for believing (i.e., provide evidence) that $A$ and $\neg A$ are true, but do not establish conclusively that both are true. Thus, a counterexample for explosion is straightforward: there may be nonconclusive evidence for both $A$ and $\neg A$, but no evidence for some $B$.

This intuitive interpretation for the paraconsistent negation justifies the invalidity of explosion. However, it is not possible yet to express that some proposition is true, because the notion of evidence is weaker than truth. With the help of the consistency operator this problem can be solved. The following intuitive meaning for the consistency operator is proposed: $\circ A$ means informally that the truth value of $A$ has been conclusively established. Now one has resources to express not only that there is evidence that $A$ is true but also that $A$ has been established (by whatever means) as true: $\circ A \wedge A$. Notice that how the truth or falsity of a proposition is established is not a concern of logic. The establishment of the truth of a given proposition $A$ comes from outside the formal system.

A very good example of a provisional contradiction in physics, better understood in terms of conflicting evidence rather than truth, is the problem faced by Einstein just before he formulated the special theory of relativity. It is well known that there was an incompatibility between classical Newtonian mechanics and Maxwell's theory of the electromagnetic field. This is a typical case of two (supposedly) noncontradictory theories that yield contradictory results.

A friendly presentation of the problem may be found in Einstein [15.5] and Feynman et al. [15.6]. Briefly, with respect to the same hypothetical situation, with $c$ being the velocity of light in vacuum and $w$ the velocity of light in a particular circumstance [15.5, Sects. 6 and 7], Newtonian mechanics and Maxwell's theory yield $\neg(w = c)$ and $w = c$ respectively. So, combining the two theories yields a contradiction, and if the underlying logic is classical, triviality follows.

In such a scenario, two contradictory propositions hold in the sense that both may be proven from theories that were supposed to be correct. This fact may be represented by the attribution of the semantic value 1 to both $\neg(w = c)$ and $w = c$. But clearly, the meaning of this should not be that both are true; actually, we know it is not the case, and nobody has ever supposed that it could be the case. The meaning of the simultaneous attribution of the value 1, as we suggest, is that at that time there was evidence for both in the sense, mentioned above, of some reasons for believing that both were true, because there was evidence that the results yielded by both classical mechanics and the theory of the electromagnetic field were true. The contradiction has been solved (roughly speaking) in the following way: As velocity grows, time slows down and space shortens. So, the relation between space and time that gives velocity remains the same, because both have decreased (for details, see [15.6, Sect. 15]). This is an example of what we call epistemic contradictions. We want to call attention to the fact that the general logical framework Einstein was working in was not classical. He had two

i. Permits us to define classical negation, and thus can be seen as an extension of classical logic
ii. Permits recovering classical consequence by means of a derivability adjustment theorem (DAT)
iii. Distinguishes the consistency of a formula $A$ from the noncontradiction of $A$, i.e., $\circ A$ and $\neg(A \wedge \neg A)$ are not equivalent
iv. Is gently explosive in the sense that it tolerates some pairs of formulas $A$ and $\neg A$, while it is explosive with respect to others; and
v. Has a sound and complete bivalued semantics.

The Syntax of mbC
Let $L_1$ be a language with a denumerable set of sentential letters $\{p_1, p_2, p_3, \ldots\}$, the set of connectives $\{\circ, \neg, \wedge, \vee, \rightarrow\}$, and parentheses. The consistency operator $\circ$ is a primitive symbol and $\neg$ is a nonexplosive negation. The set of formulas of $L_1$ is obtained recursively in the usual way, and Roman capitals stand for metavariables for formulas of $L_1$.

The logic mbC is defined over the language $L_1$ by the following Hilbert system:

Axiom-schemas:
Ax.1. $A \rightarrow (B \rightarrow A)$ ;
Ax.2. $(A \rightarrow (B \rightarrow C)) \rightarrow ((A \rightarrow B) \rightarrow (A \rightarrow C))$ ;
Ax.3. $A \rightarrow (B \rightarrow (A \wedge B))$ ;
Ax.4. $(A \wedge B) \rightarrow A$ ;
Ax.5. $(A \wedge B) \rightarrow B$ ;
Ax.6. $A \rightarrow (A \vee B)$ ;
Ax.7. $B \rightarrow (A \vee B)$ ;
Ax.8. $(A \rightarrow C) \rightarrow ((B \rightarrow C) \rightarrow ((A \vee B) \rightarrow C))$ ;
Ax.9. $A \vee (A \rightarrow B)$ ;
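The recursive definition of the formulas of L1 and the schematic character of the axioms can be illustrated with a small data structure. This is only a sketch under stated assumptions: the class names `Atom`, `Unary`, `Binary` and the helpers `show`, `imp`, `ax1`, `ax9` are hypothetical names introduced here, and only Ax.1 and Ax.9 are instantiated.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Atom:
    name: str            # a sentential letter such as "p1"

@dataclass(frozen=True)
class Unary:
    op: str              # "∘" (consistency) or "¬" (nonexplosive negation)
    arg: object

@dataclass(frozen=True)
class Binary:
    op: str              # "∧", "∨" or "→"
    left: object
    right: object

def show(f):
    """Render a formula, fully parenthesizing binary connectives."""
    if isinstance(f, Atom):
        return f.name
    if isinstance(f, Unary):
        return f.op + show(f.arg)
    return "(" + show(f.left) + " " + f.op + " " + show(f.right) + ")"

def imp(a, b):
    return Binary("→", a, b)

def ax1(a, b):
    """Instance of Ax.1: A → (B → A)."""
    return imp(a, imp(b, a))

def ax9(a, b):
    """Instance of Ax.9: A ∨ (A → B)."""
    return Binary("∨", a, imp(a, b))

p1, p2 = Atom("p1"), Atom("p2")
# Metavariables range over arbitrary formulas, including ones built with ¬ and ∘.
print(show(ax1(p1, Unary("¬", p2))))
print(show(ax9(p1, Unary("∘", p2))))
```

Because the schemas take arbitrary formulas as arguments, the same axiom yields instances with or without the connectives ∘ and ¬, which is what "Roman capitals stand for metavariables" amounts to in practice.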
P2. Monotonicity: if $\Gamma \vdash_{mbC} B$, then $\Gamma, A \vdash_{mbC} B$, for any $A$;
P3. Cut: if $\Delta \vdash_{mbC} A$ and $\Gamma, A \vdash_{mbC} B$, then $\Delta, \Gamma \vdash_{mbC} B$;
P4. Deduction theorem: if $\Gamma, A \vdash_{mbC} B$, then $\Gamma \vdash_{mbC} A \rightarrow B$;
P5. Compactness: if $\Gamma \vdash_{mbC} A$, then there is $\Delta \subseteq \Gamma$, $\Delta$ finite, such that $\Delta \vdash_{mbC} A$.

Proof: The properties P1, P2, P3 and P5 come directly from the definition of $\Gamma \vdash_{mbC} A$. The deduction theorem comes from axioms Ax. 1 and Ax. 2 plus modus ponens.

Since the properties P1, P2 and P3 hold, mbC is thus a standard logic [15.7, p. 6]. Due to the axiom bc1, mbC is gently explosive, that is,

For some $A$ and $B$:
$A, \neg A \nvdash_{mbC} B$ ,
$\circ A, A \nvdash_{mbC} B$ ,
$\circ A, \neg A \nvdash_{mbC} B$ ,

while for every $A$ and $B$: $\circ A, A, \neg A \vdash_{mbC} B$ .

Thus, the formal system is able to distinguish the contradictions that do not lead to explosion from those that do. The axiom bc1 is also called the gentle explosion law, because it is explosive only with respect to formulas marked with $\circ$.

Classical logic may be recovered in mbC in two ways: by defining a negation that has the properties of the classical negation and by means of a derivability adjustment theorem (DAT).

Fact 15.1
Classical negation is definable in mbC.

Proof: We define $\bot \stackrel{\mathrm{def}}{=} \circ A \wedge A \wedge \neg A$ and $\sim A \stackrel{\mathrm{def}}{=} A \rightarrow \bot$. Now, we get explosion, $A \rightarrow (\sim A \rightarrow B)$, as a theorem, in a few steps from bc1. From the axiom 9, excluded middle is obtained, $A \vee \sim A$. Classical propositional logic (CPL) is obtained by axioms 1–8 plus explosion, excluded middle and modus ponens.

The general purpose of a derivability adjustment theorem is to establish a relationship between two logics, L1 and L2, in the sense of restoring inferences that are lacking in one of them. The basic idea is that some information has to be added to the premises to restore the inferences that are lacking. DATs are especially interesting because they show what is needed to restore the classical consequence in a paraconsistent scenario.

For the sake of precisely stating the DAT between mbC and CPL, we need to take into account the difference between the respective languages. The first step is to translate one language into another. Let $L_2$ be a language with the set of connectives $\{\sim, \vee, \wedge, \rightarrow\}$. Instead of a paraconsistent negation $\neg$, $L_2$ has classical negation $\sim$.

Fact 15.2
Let $t$ be a mapping that replaces $\sim$ by $\neg$. Then, the following holds:
For all $\Gamma$ and for all $B$, $\Gamma \cup \{B\} \subseteq L_2$, there is a $\Delta$, $\Delta \subseteq L_1$, such that $\Gamma \vdash_{CPL} B$ iff $t[\Gamma], \circ\Delta \vdash_{mbC} t[B]$, where $\circ\Delta = \{\circ A : A \in \Delta\}$.

Proof: From left to right, suppose there is a derivation $D$ of $\Gamma \vdash_{CPL} B$ (in the language $L_2$ of CPL). If we simply change the classical negations $\sim$ to $\neg$, such a derivation does not hold in mbC. We need to be concerned only with occurrences of explosion. The relevant point is that some information must be available in order to reconstruct classical reasoning. An occurrence of a line:

1. $A \rightarrow (\sim A \rightarrow B)$

in the derivation $D$ has to be substituted by the following lines, obtaining a derivation $D'$:

2. $\circ A$
3. $\circ A \rightarrow (A \rightarrow (\neg A \rightarrow B))$
4. $A \rightarrow (\neg A \rightarrow B)$.

From right to left, suppose there is a derivation $D'$ of $t[\Gamma], \circ\Delta \vdash_{mbC} t[B]$. We get a derivation $D$ of $\Gamma \vdash_{CPL} B$ just by deleting the occurrences of $\circ$ and changing $\neg$ to $\sim$.

The reader should notice the difference between restoring classical consequence by means of a definition of a classical negation inside mbC and by means of a DAT. In the latter case, the central issue is the information that has to be available to restore classical reasoning. In each occurrence of classical explosion, $A \rightarrow (\sim A \rightarrow B)$, the information needed from the viewpoint of mbC is the consistency of $A$, represented by $\circ A$.

A Semantics for mbC
The sentential connectives of classical logic are truth functional. This means that the truth-value of a molecular formula is functionally determined by its structure and by the truth-values of its components, which reduce to the truth-values of the atomic formulas. Truth functionality as a property of the semantics of certain logics is a mathematical rendering of the principle of compositionality, which says that the meaning of a complex expression is functionally determined by the meanings of its constituent expressions and the rules used
320 Part C The Logic of Hypothetical Reasoning, Abduction, and Models
dence for A or for :A when there is no evidence at all. undeterminedness. mbCD is correct and complete with
This happens, for instance, in a criminal investigation respect to a bivalued semantics defined by clauses (i) to
in which one begins by considering everyone (in some (iii) of Def. 15.1 plus the clause (vi) above.
group of people) not guilty until proof to the contrary Although mbCD is able to express also the absence
(see Example 15.5 in Sect. 15.3.1). In fact, any context of evidence, the negation still may be improved to bet-
of reasoning such that a final decision must be made in ter represent the deductive behavior of the notion of
a finite amount of time demands that A or :A, or even preservation of evidence. The logic LET K (the logic of
both, have to be in some sense accepted. Whether the evidence and truth based on CPLC) is thus obtained by
excluded middle should be valid from the start, or be recovered once some information has been added may be seen as a methodological decision that depends on the reasoning scenario we want to represent.

15.2.2 A Logic of Evidence and Truth

The logic mbC may be slightly modified to be able to express a scenario such that no evidence is available. The duality between the principles of explosion and excluded middle that corresponds to a duality between paraconsistency and paracompleteness has been mentioned in Sect. 15.2. Now, in a way analogous to that by which we recover the explosion with respect to a formula A, in a paracomplete logic, the validity of the excluded middle with respect to a formula A may be recovered by means of the following axiom

Ax. bd1. ∘A → (A ∨ ¬A).

A semantic clause for the axioms bc1 and bd1 is defined as follows:

(vi) if v(∘A) = 1, then [v(A) = 1 iff v(¬A) = 0].

If the excluded middle holds for A, we say that A is determined. bd stands for basic property of determinedness. A system in which both bd1 and bc1 hold is thus paracomplete and paraconsistent. It is better to call ∘, in this context, not a consistency operator but rather a classicality operator, since ∘A recovers classical truth conditions with respect to A. But ∘A still may be informally understood as meaning that the truth value of A has been conclusively established. In fact, the basic idea of restricting the validity of the principle of explosion may be generalized. The validity of some inference rule (or axiom) may be restricted in such a way that some logical property does not hold unless some information is added to the system. In particular, the excluded middle may be restricted in a way analogous to the restriction imposed on the explosion.

Now, with bd1 and bc1, we have the resources to express the following situations, besides nonconclusive evidence: no evidence at all, v(A) = v(¬A) = 0, and conflicting evidence, v(A) = v(¬A) = 1. The system obtained by adding the axioms bd1 and bc1 to CPL+ is called mbCD, a minimal logic of inconsistency and

adding to mbCD the axioms 11 to 14 below

Ax. 11. A ↔ ¬¬A;
Ax. 12. ¬(A ∧ B) ↔ (¬A ∨ ¬B);
Ax. 13. ¬(A ∨ B) ↔ (¬A ∧ ¬B);
Ax. 14. ¬(A → B) ↔ (A ∧ ¬B).

The axioms above fit the intuitive meaning of the simultaneous attribution of the value 0 or the value 1 to a pair of propositions A and ¬A as absence and presence of evidence respectively. Let us consider axiom 12. It is reasonable to conclude that if there is some evidence that a conjunction is false, that same evidence must be evidence that one of the conjuncts is false. On the other hand, if there is some evidence that A is false, that same evidence must be evidence that A ∧ B is false, for any B. Analogous reasoning applies for disjunction and implication.

A bivalued complete and correct semantics for LET_K is obtained by adding to the semantics of mbCD the following clauses

(vii) v(A) = 1 iff v(¬¬A) = 1;
(viii) v(¬(A ∧ B)) = 1 iff v(¬A) = 1 or v(¬B) = 1;
(ix) v(¬(A ∨ B)) = 1 iff v(¬A) = 1 and v(¬B) = 1;
(x) v(¬(A → B)) = 1 iff v(A) = 1 and v(¬B) = 1.

The logic LET_K can be proven without much trouble to be sound and complete with respect to the semantics above. The proof needs only to drop the clauses related to Ax. 10 and to extend the soundness and completeness proofs for mbC [15.7, pp. 38ff] to the new axioms and semantic clauses, which can be done without difficulties. In LET_K, a DAT holds as in mbC and a classical negation is definable in the same way as in mbC, thus LET_K may be also seen as an extension of propositional classical logic. It is worth noting that according to the intuitive interpretation proposed, LET_K like mbC does not tolerate true contradictions: indeed, a true contradiction yields triviality, as in classical logic. If A is simultaneously true and false, this is expressed by (∘A ∧ A) ∧ (∘A ∧ ¬A); that, in its turn, is equivalent to ∘A ∧ A ∧ ¬A; but the latter is nothing but a bottom particle ⊥, i.e., a formula that alone implies triviality: for any B, ⊥ ⊢ B.
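The effect of clause (vi) on the possible evidence scenarios can be made concrete with a small enumeration. The following is only an illustrative sketch: the function names and the wording of the scenario labels are assumptions introduced here, and only clause (vi) is modelled, not the full semantics of mbCD or LET_K.

```python
from itertools import product

def admissible(a, not_a, circ_a):
    """Clause (vi): if v(∘A) = 1, then v(A) = 1 iff v(¬A) = 0."""
    return circ_a == 0 or ((a == 1) == (not_a == 0))

def scenario(a, not_a, circ_a):
    """Informal reading of a triple (v(A), v(¬A), v(∘A)); labels are illustrative."""
    if circ_a == 1:
        return "conclusive: A is true" if a == 1 else "conclusive: A is false"
    if a == 0 and not_a == 0:
        return "no evidence at all"
    if a == 1 and not_a == 1:
        return "conflicting evidence"
    if a == 1:
        return "nonconclusive evidence for A"
    return "nonconclusive evidence against A"

# All triples that clause (vi) allows, with their informal reading.
table = {t: scenario(*t) for t in product((0, 1), repeat=3) if admissible(*t)}

for triple, label in sorted(table.items()):
    print(triple, label)
```

Of the eight conceivable triples, clause (vi) rules out exactly the two in which ∘A holds while A and ¬A receive the same value, which is the sense in which ∘A recovers classical truth conditions for A.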
15.3 Abduction

The problem of abduction has been formulated by Peirce as the process of forming hypotheses with explanatory purposes. It is, thus, a kind of reversed explanation [15.10, CP 7.202]:

“Accepting the conclusion that an explanation is needed when facts contrary to what we should expect emerge, it follows that the explanation must be such a proposition as would lead to the prediction of the observed facts, either as necessary consequences or at least as very probable under the circumstances. A hypothesis then, has to be adopted, which is likely in itself, and renders the facts likely. This step of adopting a hypothesis as being suggested by the facts, is what I call abduction.”

The basic idea may be expressed as follows: When some fact is discovered that is not explained by the available theory (i.e., is not a consequence of the available theory), a set of new premises is added as a hypothetical solution to the problem. However, the act of adding something before using it as an explanation poses a second problem: how is it possible to generate an abductive explanans? First, we have to recognize that characterizing the concept of explanation is one of the greatest challenges in the philosophy of science. This problem is even harder in logic and mathematics, where explanations are sometimes confused with proofs [15.11].

Although we are not suggesting that explaining can be reduced to deducing, it is certainly acceptable that the idea of explanation in deductive sciences includes the query for missing hypotheses; it is in this context that the general abductive process can be formulated as the process of generating new hypotheses within arbitrary deductive systems, and afterwards using them in deductive terms. The former task (generating new hypotheses) is referred to here as creative abduction, while the latter (using such new hypotheses) as explicative abduction. The term explicative is here understood under the following proviso: A missing link in a deduction certainly does not exhaust the need for an explanation, but does constitute the first necessary step towards explaining a surprising (i.e., not yet deducible) fact.

Two natural assumptions about explanations that can be posed are the following: First, there can be various explanations for the same surprising fact, and second, there can be explanations of various degrees for the same surprising fact. For example, searching for the ultimate scientific explanation as to why the grass of your garden is wet in the morning and discovering that the sprinkler was left on all night may be two different things. Both explain the fact, but responding to different needs. The question is analogous to the one in automatic theorem proving: finding any proof is one thing, while finding a philosophically interesting proof is another. In the same manner as automatic theorem proving is satisfied with the first level of proofs, so automatic abduction will be satisfied with a first-level explanation. We are mainly interested here in abductions with no obvious explanations, particularly those in which contradictions may be involved.

Let ⊢ be a deductive relation; if Γ ⊬ A, the creative abductive step consists of finding an appropriate Δ so that Γ ∪ Δ ⊢ A. In this case, the discovered Δ performs the explicative abductive step. Obviously, there must be some constraints, otherwise Δ = {⊥} would be a trivial solution for the abduction problem in most deduction relations. Usually, if the underlying logic is explosive (e.g., classical or intuitionistic), another constraint is that Γ ⊬ ¬A, for otherwise any explicative Δ would be a trivial explanation. This restriction, however, will not be necessary in our case, as the following developments and the examples will make clear.

From the point of view of general argumentation (and not only deduction), abduction concerns the search for hypotheses or the search for explanatory instances that support reasoning. In this sense, it can be seen as a complement to argumentation, in the same manner that in the philosophy of science, the context of discovery is a complement to the context of justification. And moreover, further pursuing the analogy, the question of the logical possibility of creative abduction lies on the same side of the famous question of the logical possibility of scientific discovery.

A renewed interest in abduction acquired impetus due to the factual treatment of data and the question of virtual causality in the information age. The enormous amount of data stored on the World Wide Web and in complex systems, as well as the virtual relationship among such data, continuously demands new tools for automatic reasoning. These tools should incorporate general logical methods which are at the same time machine-understandable, and sufficiently close to human semantics as to perform sensible automated reasoning.

An example wherein abductive inference is highly relevant is the model-based diagnosis in engineering and AI (artificial intelligence). Suppose that a complex system, such as an aircraft, is being tested before a transatlantic flight. The electronic circuitry permits the testers to predict certain outputs based on specific input tests. If the instruments show something distinct from the expected, it is a task of model-based diagnosis to discover an explanation for the anomaly and use it to
separate the components responsible for the problem, instead of disassembling the whole aircraft.

Another example occurs in the process of updating in the so-called datalog databases. Suppose we have a logic program (see introductory chapter for a brief overview of abduction in logic programming) composed of the following clauses, where desc(x, y) means x is a descendent of y, parent(y, x) means y is a parent of x and β ← α means that the database contents plus α produce (or answer) β

desc(x, y) ← parent(y, x);
desc(x, y) ← parent(z, x), desc(z, y).

There is a subtle difference between inserting information in the database in an explicit versus implicit manner: information of the form y is a parent of x is a basic fact, and can be inserted explicitly, while information of the form x is a descendent of y is either factual knowledge or is a consequence of the machine reasoning (as simple as it can be). If one wishes to insert a piece of implicit information, it is necessary to modify the set of facts stored in the database in such a way that this information can be deduced: this is an example of creative abduction and of explicative abduction at the same time. For instance, if we have stored desc(Zeus, Uranus) and parent(Uranus, Cronus), for implicitly inserting desc(Aphrodite, Uranus) there are two different ways: we may either insert parent(Zeus, Aphrodite) or alternatively insert desc(Aphrodite, Cronus). These two alternative additions are examples of abductive explanations for desc(Aphrodite, Uranus). In fact, logic programming uses this abductive mechanism for answering queries, in the form: Is the fact desc(Aphrodite, Uranus) compatible with the program clauses and data? or Is there any x such that desc(x, Uranus)? The whole procedure is creative as much as it can be automatized. Therefore it is evident that a useful abductive mechanism for databases should be based on first-order logic, and not merely on propositional logic.

Moreover, abductive approaches are also used to integrate different ontologies and database schemes, or for integrating distinct data sources under the same ontology, as for example [15.12], where an abductive-based application for database integration is developed. Suppose that, while a query is being processed by a user, another data source had inserted desc(Uranus, Aphrodite) in our database, plus a constraint of the form: For no x and y, simultaneously desc(x, y) and desc(y, x) can be maintained in the database. If parent(Zeus, Aphrodite) had been inserted for one data source, the insertion of desc(Uranus, Aphrodite) by the second data source would cause a collapse in view of the constraint. What can be done? The alternative of deleting all data seems inconceivable, and the one of having all queries be answered positively (since a database established on classical logic grounds would deduce anything from a contradiction) is of course intolerable. Thus a legitimate logic to ground this process would have to be a first-order logic that sanctions useful reasoning in the presence of contradictions.

Proposals from this perspective have been investigated in [15.13]. A first-order LFI, QmbC, that is an extension of the logic mbC, has been presented and investigated in [15.14]. We will argue here that simple yet powerful techniques for automatic abduction can be usefully implemented by means of tableau proof-procedures for the logic mbC, which may be extended to QmbC.

Although a wide-scoped study of tableaux and abduction was offered by [15.15] in 1997, the quite natural idea of using the backward mechanism of tableaux for gaining automatic explanations occurred earlier: [15.16] in 1992 already proposed a fully detailed treatment for the question of completing a database in a way as to deduce (in classical propositional logic) a previously undeducible formula. In the same year a tableau proof system for da Costa’s logic C1 was proposed in [15.17], and several examples of using such tableau systems for automatically solving dilemmatic situations were extensively discussed. Even though neither of these references explicitly mentions the concept of abduction, these papers undoubtedly proposed ways to solve the abductive problem, for classical propositional logic and for the propositional paraconsistent logic C1 respectively. In [15.18], tableau systems for LFIs were proposed, and this logical formalism was used as a method to devise database repairs in [15.13]. Such methods are based upon many-valued semantics, or upon bivalued semantics.

The question of abduction thus involves two independent, but complementary problems:

1. Finding a method to automatically perform abduction (and, if possible, to automatically generate abductive data), and
2. Doing this within a robust reasoning environment, in such a way as to keep running and providing reasonable output even in the presence of the possible contradictions that this search would engender.

Any contradictions found in the process of producing lucid outputs are a condemnation of the whole process if the underlying logic is classical, so, the abductive experience can sometimes appear to be lethal. We argue here, however, that very simple and natural logical models can be designed for dealing with abduction, by means of defining them in terms of refutation procedures based on LFIs.

It is well accepted that abduction does not go in the forward direction of deduction. It is not difficult to accept, either, that abduction cannot coincide with any backward form of classical deduction, but it does not follow that another form of backward deduction would not work. In this sense, some LFIs are good candidates. Let us take as example mbC: it does not prove anything that classical logic would not prove; it tolerates contradictions, but, nonetheless, it can encode the whole of classical reasoning. Backwards proof procedures for LFIs indeed constitute a suitable approach for abduction, and we intend to show how this approach can be programmed and treated on a natural basis departing from a very simple formalism.

We have already seen here that it is typical that cognitive situations can enter into a (presumably temporary) contradictory state. Of course, in a situation where we have serious theories competing around a contradiction, there is little sense in rejecting one of them a priori just to save the principle of explosion. It seems to be out of question that it is more convenient to tame the logic, rather than to sacrifice a precious (and possibly correct) theory.

This is not only the case for scientific theories. A single digit in a database can of course be too valuable to be just thrown away, and it is already widely recognized that no automated reasoning is possible without means of controlling logical explosion. What is yet not clear is whether the act of guessing is involved in the discovery context of abduction, and, furthermore, whether under such conditions it can be the subject of logic. Although it seems that Peirce maintained the negative, we argue that in many interesting cases the process of guessing can be solved semi-automatically by means of careful manipulation of the concept of consistency, viewed as a primitive notion independent from the concept of contradiction, as shown in Sect. 15.2.1 above. In this way we can obtain a reasonably efficient and conceptually simple method for discovering new logical hypotheses that will serve as explanans for a given explanandum.

15.3.1 mbC-Tableaux

The beginnings of automatic heuristics by means of paraconsistent tableaux can be traced back to [15.17], although the logic used there is da Costa’s C1. As argued in the preceding sections, an important LFI is the logic mbC. A relevant feature of mbC is that it can be defined by means of refutative tableau-type proof procedures. Such backward proof procedures are very convenient for formalizing abductive routines. The basic idea is that the open branches may be seen as a heuristic device that helps indicate the formulas that would close the tableau, and these formulas are then taken as the explicative hypotheses.

We present below a definition of the notion of an abductive explanation:

Definition 15.2
Let Γ, Δ be finite sets of sentences and A be a sentence in the language of a given logic L. Γ and A form an abductive problem and Δ is an abductive explanation for the abductive problem if:

1. (Abductive problem): The context Γ is not sufficient to entail A, that is, Γ ⊬ A
2. (Abductive solution): The enriched context Γ plus Δ is sufficient to entail A, that is, Γ, Δ ⊢ A
3. (Nontriviality of solution): The enriched context Γ plus Δ is nontrivial, that is, there exists B such that Γ, Δ ⊬ B
4. (Vocabulary restriction of solution): Var(Δ) ⊆ Var(Γ) ∪ Var(A)
5. (Minimality of solution): by lack of any other criteria, a mathematically minimal Δ is a good explanation (in the sense, for example, that it is composed of a set with minimal cardinality and with formulas of minimal length).

While conditions (1) and (2) just define what is an abductive problem and what is a solution, conditions (3), (4) and (5) impose restrictions for a solution to be considered relevant: condition (3) avoids, for instance, that Δ be taken as the collection of all formulas, or as a single bottom particle (which would entail any other formulas). Since the compactness theorem holds for mbC, Γ and Δ can always be taken as finite sets.

Below, we present a tableau system for mbC, based on the bivaluation semantics presented in Definition 15.1 [15.7, p. 48]. We use 0 and 1 as syntactic labels to represent the semantic values 0 and 1.

R1    0(¬X)
      ─────────────
      1(X)

R2    1(∘X)
      ─────────────
      0(X) | 0(¬X)

R3    ─────────────
      1(X) | 0(X)

R4    1(X1 ∧ X2)
      ─────────────
      1(X1), 1(X2)

R5    0(X1 ∧ X2)
      ─────────────
      0(X1) | 0(X2)
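Conditions (1) to (5) of Definition 15.2 can also be explored semantically, independently of the tableau rules. The sketch below is not the tableau procedure of this section: it is a brute-force search over bivaluations, and it assumes the usual mbC clauses of Definition 15.1 (classical clauses for ∧, ∨, →; v(X) = 0 implies v(¬X) = 1; v(∘X) = 1 implies v(X) = 0 or v(¬X) = 0), which are not reproduced here. All function names are hypothetical. Two further simplifications are assumptions of this sketch only: candidate hypotheses are restricted to p, ¬p and ∘p for variables p of the problem, and the goal itself is excluded as a hypothesis.

```python
from itertools import combinations, product

# Formulas are nested tuples; these constructors are illustrative helpers.
def Var(n): return ("var", n)
def Not(x): return ("not", x)
def Circ(x): return ("circ", x)          # consistency operator ∘
def And(x, y): return ("and", x, y)
def Or(x, y): return ("or", x, y)
def Imp(x, y): return ("imp", x, y)

def closure(formulas):
    """Subformula closure, adding ¬X whenever ∘X occurs (its clause mentions ¬X)."""
    seen, todo = set(), list(formulas)
    while todo:
        f = todo.pop()
        if f in seen:
            continue
        seen.add(f)
        if f[0] == "circ":
            todo.append(Not(f[1]))
        if f[0] != "var":
            todo.extend(f[1:])
    return sorted(seen, key=repr)

def respects_mbc(v):
    """Check the assumed mbC bivaluation clauses on a finite assignment v."""
    for f, val in v.items():
        op = f[0]
        if op == "and" and val != min(v[f[1]], v[f[2]]): return False
        if op == "or" and val != max(v[f[1]], v[f[2]]): return False
        if op == "imp" and val != max(1 - v[f[1]], v[f[2]]): return False
        if op == "not" and v[f[1]] == 0 and val == 0: return False
        if op == "circ" and val == 1 and v[f[1]] == 1 and v[Not(f[1])] == 1:
            return False
    return True

def models(premises, extra=()):
    """Yield every admissible assignment that gives value 1 to all premises."""
    subs = closure(list(premises) + list(extra))
    for bits in product((0, 1), repeat=len(subs)):
        v = dict(zip(subs, bits))
        if respects_mbc(v) and all(v[p] == 1 for p in premises):
            yield v

def entails(premises, goal):
    return all(v[goal] == 1 for v in models(premises, [goal]))

def nontrivial(premises):
    return any(True for _ in models(premises))

def variables(f):
    return {f[1]} if f[0] == "var" else set().union(*(variables(g) for g in f[1:]))

def abduce(gamma, goal, max_size=2):
    """Smallest-cardinality sets Δ meeting conditions (1)-(4); None if Γ already entails the goal."""
    if entails(gamma, goal):
        return None
    names = sorted(set().union(*(variables(f) for f in list(gamma) + [goal])))
    candidates = [h for n in names
                  for h in (Var(n), Not(Var(n)), Circ(Var(n))) if h != goal]
    for size in range(1, max_size + 1):
        found = [d for d in combinations(candidates, size)
                 if entails(list(gamma) + list(d), goal)
                 and nontrivial(list(gamma) + list(d))]
        if found:
            return found
    return []

A, B = Var("A"), Var("B")
gamma = [Imp(A, B), Imp(A, Not(B))]
print(abduce(gamma, Not(A)))
```

For the illustrative context Γ = {A → B, A → ¬B} and goal ¬A, the search returns the single hypothesis ∘B: adding the information that B is consistent is what licenses ¬A. Exhaustive enumeration is exponential in the number of subformulas, so this is useful only for checking small examples by hand.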
off. In any case, some choices may be necessary in order to implement a preference policy for ranking multiple explanations – facts may have precedence over hypothetical explanations, and likelihood may be used to classify explanations. Although this is an important part of the whole question that will affect the usefulness of the automatic explanations produced, it is not part of the abduction problem as originally posed. It is worth noting that the position we are holding here does not require a nonmonotonic logic. The logic used here to produce the abductive output, mbC, is monotonic. Nonmonotonic reasoning, if necessary, may be used in further steps.

Example 15.1
A case where mbC-tableaux and classical tableaux give the same result:
Let Γ = {A → C, B → C}; of course Γ ⊬ C. Running an mbC-tableau for 1(Γ) ∪ {0(C)} produces an open branch containing 0(A) and 0(B). Clearly, this branch

but classically it would not be a solution, since the set of premises contain a formula ¬A. But A is indeed a solution, that also indicates the proposition ¬A is not well established as true – that is, is not consistent. In fact, in mbC, A, ¬A ⊢ ¬∘A. For this reason, this explanation does not violate Definition 15.2. Notice that this scenario cannot be represented by a classical tableau.

Example 15.4 Explanations that Avoid Hasty Conclusions
We know that taking certain drugs has beneficial consequences for health, but also the same drugs, under certain conditions, will produce undesirable effects on our health. Represent this situation as A → B and A → ¬B. Under classical reasoning (using classical tableaux, or any other classical inference mechanism) an immediate conclusion would be ¬A, that is, we should not take these drugs. However, the negative effects could be explained by inappropriate doses, or by different
health conditions in different people, and so on. Using mbC-tableaux, however, this case turns out to be an interesting abduction problem, since in mbC A → B, A → ¬B ⊬ ¬A, as can be checked by the reader, by consulting the semantics given in Sect. 15.2.1, A Semantics for mbC. We are thus invited to look for an abductive explanation: this explanation, automatically produced by the mbC-tableau, is that the drug is to be banned only if the contradictory effects are undeniable, that is, if ∘B. Indeed, in mbC A → B, A → ¬B, ∘B ⊢ ¬A. Hence, Δ = {∘B} is an explanation: the resulting mbC-tableau is closed.

Example 15.5 Whodunit?
A diamond was stolen in a hotel room and only two people had entered the room on two different days, Bob and Alice. Since there is only nonconclusive evidence against them and the standard of a proof in a criminal trial must be so strong that there should be no shadow of doubt, the police initially consider that they are not guilty, but certainly one of them is guilty, that is, the evidence basis contains Γ = {¬A, ¬B, A ∨ B} where A and B stand, respectively, for Alice is guilty and Bob is guilty. At this point, Γ ⊬ A and Γ ⊬ B, so we have two abductive problems. Now, by running the respective mbC-tableaux, we easily see that either ∘A (meaning that the initial supposition about Alice’s innocence was indeed consistent) or ∘B (meaning, alternatively, that the initial supposition about Bob’s innocence was indeed consistent) would decide the question. Indeed, in mbC,

A ∨ B, ¬A, ¬B, ∘A ⊢ B and
A ∨ B, ¬A, ¬B, ∘B ⊢ A

is not only quite natural, but expected in real applications. Although there are some technical complications, from the tableau-proof-theoretical standpoint, all the grounding constructions are already at our disposal: the logic QmbC, first-order extension of the propositional mbC, has been studied in detail in [15.14]. We recall the main ideas about QmbC and show how the underlying tableau procedure can be used in abductive problems. Let Σ be the language of mbC enriched with ∀ and ∃, and Var be a set of variables. The formulas of QmbC are defined as usual in first-order logics; all the familiar syntactic notions of free and bound variables, closed formulas (sentences), substitution etc., are defined as usual. From the semantic side, sentences of QmbC are interpreted by adding the following to the semantics of mbC:

i. v(∃xA) = 1 iff v(A[x/t]) = 1 for some term t in L
ii. v(∀xA) = 1 iff v(A[x/t]) = 1 for all t in L
iii. If A is a variant of B, then v(A) = v(B)

We say that A is a variant of B (and vice versa) if A can be obtained from B by means of addition or deletion of void quantifiers, or by renaming bound variables. It is a theorem of classical first-order logic that if A and B are variants of each other, then A and B are logically equivalent. However, in QmbC the clause (iii) above has to be postulated to solve some technical problems that will not be considered in detail here, but the reader can find in [15.14]. From a syntactical perspective, which is what interests us here for the sake of abduction, a tableau system for QmbC is obtained by adding to the tableau rules of mbC the following rules for the quantifiers

R10    1(∀xA)
plications, because QmbC-tableaux, as much as their classical counterparts, are not a decision procedure for QmbC-validity – indeed, QmbC is undecidable. Let us see an example below.

Example 15.6
Consider the following set of propositions: Γ = {∀x(Cx → Bx), ∀x(Gx → Bx), ¬Ca}. Here, Γ ⊬ Ba. Running an mbC-tableau for 1(Γ) ∪ {0(Ba)} produces an open branch containing 0(Ca), 1(¬Ca) and 0(Ga). Classically, the only candidate to be an abductive explanation is Ga. But from the point of view of QmbC, we obtained two possible explanations, since 1(Ca) also closes that branch. In the latter case, a further conclusion is that Ca is not consistent, i.e., is not well established as a true proposition. Thus, this explanation does not violate Definition 15.2.

To the extent that LFIs permit fine control of reasoning in the presence of inconclusively established hypotheses, particularly under contradictions, the mechanism presented here is thus capable of proposing solutions for an extensive class of abductive problems. As we have seen, the mbC-tableaux increase the range of options provided by classical reasoning. The issues discussed here have much in common with belief revision, default reasoning, the closed-world assumption, and negation as failure of logic programming, as well as databases with evolutionary constraints, thus making our proposal valuable for several applications. Abduction, however, can also be regarded, from a much more abstract standpoint, as a companion for argumentation (see chapter by Barés and Fontaine, this section, for a proposal in this direction). From this perspective, any attempt to make abduction somewhat closer to deduction is welcome.

15.4 Modality

Modal logics and paraconsistent logics are cousins. In 1948, while attempting to answer a question posed by J. Łukasiewicz, S. Jaśkowski presented the first formal system for a paraconsistent logic with his discussive logic. Interestingly enough, his logic was framed in terms of modalities, and later on it was proven to be a particular case of the family of LFIs [15.7]. However, it was only in 1986 that the first modal paraconsistent system was proposed in [15.20], with the aim of dealing with deontic paradoxes. That system was a modal extension of da Costa’s paraconsistent logic C1. This approach has been extended by means of deontic modalities combined with LFIs, as developed in [15.21, 22].

Paraconsistent negation can be regarded as a kind of modal operator, considering the fact that the classical negation for possibility (and, a fortiori, for necessity) has a paraconsistent behavior. Namely, the operator ¬, defined as

¬A =def ◊∼A,

is a paraconsistent negation where, as usual, ∼ denotes the classical negation. This relationship has been studied in [15.23], both with respect to the standard modal logic S5 and to four-valued modal logics [15.24]. It is worth noting that the fact that ◊∼A defines a paraconsistent negation was already observed in 1987 in [15.25], when a Kripke-style semantics was proposed for Sette’s logic P1 based on Kripke frames for modal logic T.

One of the interests in paraconsistent modal logics is the potential of dealing with, or even avoiding, some modal paradoxes. Moral dilemmas are a typical situation in which paraconsistent modal logics provide a tool for handling contradictions without triviality. Let us take as an example the well-known dilemma, posed by [15.26], of the man in occupied France who, on the one hand, wants to fight the Nazis but, on the other, must take care of his mother. He believes that each alternative is a moral obligation, but doing one implies not doing the other. Let A and B be, respectively, fight the Nazis and take care of his mother. Let the symbols O and P mean, respectively, it is obligatory that and it is permitted that (as usual □ means necessity and ◊ means possibility). From the premises

OA, OB, ∼◊(A ∧ B),

plus the following principles of deontic logic

□(A → B) → (OA → OB),
OA → ∼O∼A,

and given that in classical modal logic ∼◊(A ∧ B) is equivalent to □(A → ∼B), a contradiction may be obtained in a few steps. On the other hand, a paraconsistent modal logic may handle the contradiction without triviality.

Another example is Urmson’s paradox. In this case, the modal paradox is just avoided. Consider the following proposition

(X) It is optional that you attend my talk or not; but your choice is not indifferent.
It is clear that the notions optional (Opt) and indifferent (Ind) must be distinct in (X). Again, let P and O mean permitted and obligatory. It is natural in modal logic to formalize Opt and Ind as

Opt(A) =def PA ∧ P∼A,
Ind(A) =def ∼OA ∧ ∼O∼A.

In classical modal logic a contradiction occurs because ∼ is a classical negation and OA is equivalent to ∼P∼A. Hence, it is easy to see that Opt and Ind are equivalent. So, attending the talk is simultaneously optional and not optional. On the other hand, if a nonexplosive negation ¬ is available, it can be used to express the notions Opt and Ind. In this way, no paradox occurs because OA and ¬P¬A are no longer equivalent.

Paraconsistent deontic logics have also been studied in the literature for quite some time [15.20], and deontic counterparts of LFIs, the logics of deontic (in)consistency (LDIs), have been introduced in [15.21]. These logics are shown to be able to handle deontic paradoxes, such as the well-known Chisholm’s paradox. Since contradictory obligations do not trivialize such LDIs, several paradoxes involving conflicting obligations are dissolved [15.22].

It is important to note, however, that the potential of combining paraconsistency and modalities extends far beyond deontic issues. Not only can some problems, described in [15.27], be thought in paraconsistent terms, but also certain problems and paradoxes in epistemic and doxastic logics gain new insight when regarded from a paraconsistent perspective.

A detailed investigation of the relationship between LFIs and their modal versions is carried out in [15.28], where the so-called anodic systems (purely positive modal systems) introduced in [15.29] are extended by adding certain paraconsistent axioms based on LFIs, defining a class of modal systems called cathodic systems (modal systems involving degrees of negations). For an explanation of the terms anodic and cathodic see [15.29]. A semantic interpretation of cathodic systems is given in [15.28], where it is shown that the cathodic systems can be semantically characterized in two different ways: by means of Kripke-style semantics and by means of modal possible-translations semantics.

In the following sections we start by presenting a positive (i.e., anodic) modal system, that can be enhanced with degrees of negation, as shown in [15.7], so as to obtain a family of cathodic systems. We start with the anodic modal system K□◊, a negationless fragment of the well-known modal system K.

The first paraconsistent modal system we shall consider is mbC□, which will be obtained as an extension of the anodic modal system K□◊. Then, we show how paraconsistent modal logics that are fragments of the familiar systems T, S4 and S5 may be obtained, as extensions of mbC□. Correct and complete semantics are presented for all of these systems.

15.4.1 The Anodic System K□◊

In this section, a purely positive bimodal system K□◊ will be defined in a negationless language that is both an extension of CPL+ and a fragment of K. The only modal axioms are positive versions of the distribution axiom (K) (namely, (K), (K1), (K2) and (K3)) together with the usual necessitation rule (Nec).

The language L2 of K□◊ has the following set of connectives: {∨, ∧, →, □, ◊}. Notice that both modal operators are needed as primitive because one cannot be defined in terms of the other, given that no negation is available. The set of formulas of K□◊ is obtained as typically done in modal logic. The formulas of K□◊ are represented by Roman capital letters, and sets of formulas are represented by uppercase Greek letters Γ, Δ etc. The definition of a derivation of A from a set Γ of premises (Γ ⊢K□◊ A) is the usual one: a finite sequence of formulas B1 … Bn such that A is Bn and each Bi, 1 ≤ i ≤ n, is an axiom, a formula that belongs to Γ, or a result of an inference rule. A theorem is a formula derived from the empty set of premises. When there is no risk of ambiguity, we write just ⊢ instead of ⊢K□◊.

Definition 15.3
The anodic modal system K□◊ is defined by adding to CPL+ the following modal axiom-schemas and modal rule:

(K) □(A → B) → (□A → □B)
(K1) □(A → B) → (◊A → ◊B)
(K2) ◊(A ∨ B) → ◊A ∨ ◊B
(K3) (◊A → □B) → □(A → B)
(Nec) ⊢ A implies ⊢ □A

A modal system is called normal if it contains the distribution axiom (K) and the necessitation rule (Nec) among its axioms and rules, and minimal if it has only (K) as a modal axiom and (Nec) as a modal rule. K□◊ is minimal and normal. In addition, it is not difficult to see that K□◊ is a fragment of the system K, since the axioms (K1)-(K3) can be derived in the system K, as the reader can verify as an exercise (remember that K is obtained by adding to CPL+ the axiom K and the necessitation rule). Like mbC, K□◊ satisfies the properties of reflexivity, monotonicity, cut and compactness. Besides, the deduction theorem is also valid.
Formal (In)consistency, Abduction and Modalities 15.4 Modality 329
Part C | 15.4
satisfying:
˙A D A
Def
(15.5)
(i) v .p; w / D 1 or v .p; w / D 0
(ii) v .A ! B; w / D 1 iff v .A; w / D 0 or v .B; w / D 1 Hence, the axioms (K1)–(K3) are innocuous in mbC ,
(iii)v .A ^ B; w / D 1 iff v .A; w / D 1 and v .B; w / D 1 since they can be easily proven as theorems, as the
(iv) v .A _ B; w / D 1 iff v .A; w / D 1 or v .B; w / D 1 reader can verify.
(v) v .A; w / D 1 iff v .A; w 0 / D 1, for all w 0 2 W
such that w R w 0
A Semantics for mbC
(vi) v .˙A; w / D 1 iff v .A; w 0 / D 1, for some w 0 2 W
such that w R˙ w 0 .
Definition 15.6
A frame is a relational structure F D hW; Ri, where
A sentence A is said to be satisfied in a model M, W ¤ ¿ is a universe and R is a binary relation on W
if there is a w 2 W such that v .A; w / D 1 (notation: (notice that now we need only one relation that covers
M; w A). A sentence A is said to be valid in a model both ˙ and ).
M, if v .A; w / D 1 for all w 2 W (notation: M A).
A sentence A is said to be valid in a frame F, if A is A bivalued relational model for the cathodic system
valid in all models M based on F (notation: F A). mbC is defined as follows:
A special class of frames F is the collection of
frames that satisfy some condition imposed on the re- Definition 15.7
lation R. Examples are the special class of reflexive A bivalued relational model M for mbC is a pair
$\langle \mathcal{F}, v \rangle$ where $\mathcal{F}$ is a frame and $v : Var \times W \to \{0, 1\}$ is a bivalued modal assignment satisfying the conditions:
(i) $v(p, w) = 1$ or $v(p, w) = 0$;
(ii) $v(A \to B, w) = 1$ iff $v(A, w) = 0$ or $v(B, w) = 1$
(iii) $v(A \land B, w) = 1$ iff $v(A, w) = 1$ and $v(B, w) = 1$
(iv) $v(A \lor B, w) = 1$ iff $v(A, w) = 1$ or $v(B, w) = 1$
(v) $v(A, w) = 0$ implies $v(\neg A, w) = 1$
(vi) $v(\Box A, w) = 1$ iff $\forall w' (w R w')$, $v(A, w') = 1$
(vii) $v(\Diamond A, w) = 1$ iff $\exists w' (w R w')$, $v(A, w') = 1$
(viii) $v(\circ A, w) = 1$ implies $v(A, w) = 0$ or $v(\neg A, w) = 0$.

The notion of validity in a frame is defined as usual.

15.4.3 Extensions of mbC

From mbC, stronger systems may be defined. The well-known modal systems D, T, S4, B and S5 are obtained by adding one or more of the following axioms to system K:

(D) $\Box A \to \Diamond A$;
(T) $\Box A \to A$;
(4) $\Box A \to \Box\Box A$;
(B) $A \to \Box\Diamond A$;
(5) $\Diamond A \to \Box\Diamond A$.

Now, axiomatic systems for the modal logics listed below are obtained as follows

Serial iff for every $w \in W$ there is some $w' \in W$ such that $w R w'$.

The modal logics $K$, $K^{\Box\Diamond}$ and mbC have no condition on frames; that is, they have no condition on the relation $R$ of accessibility between worlds. Starting from $K$, the systems D, T, S4, B and S5 are obtained by imposing the following conditions on frames:

D: serial;
T: reflexive;
B: reflexive and symmetric;
S4: reflexive and transitive;
S5: reflexive, symmetric and transitive.

Now, mbC may be extended, obtaining paraconsistent modal systems that are both extensions and fragments of each of the systems above.

mbCD $\overset{\text{def}}{=}$ mbC + (D);
mbCT $\overset{\text{def}}{=}$ mbC + (T);
mbCB $\overset{\text{def}}{=}$ mbC + (T) + (B);
mbCS4 $\overset{\text{def}}{=}$ mbC + (T) + (4);
mbCS5 $\overset{\text{def}}{=}$ mbC + (T) + (5).

Kripke-style semantics for the above paraconsistent modal systems may be obtained by just adding clauses corresponding to the respective restrictions on frames to the semantics for mbC.
in philosophical issues in modal logic, as well as the possibility of enhancing it in order to fit some contexts of reasoning or philosophical problems related to modalities in general are topics that deserve further attention.

Although a large class of anodic and cathodic multimodal logics can be shown to be complete with respect to Kripke frames, an interesting point about anodic and cathodic modal logics is that some incomplete systems can be found in both families. Bueno-Soler in [15.31] shows that some classes of cathodic multimodal paraconsistent logics (that is, logics endowed with weak forms of negation) are incompletable with respect to Kripke semantics. The meaning of this kind of incompleteness is also discussed in [15.31], but, surprisingly, the phenomenon of modal incompleteness is also found among purely positive (multi)modal logics: Bueno-Soler in [15.30] obtains some examples of Kripke-incompletable purely positive modal logics, demonstrating that modal incompleteness is a result of the interaction of modalities, independent of negation. The incompleteness in the case of cathodic modal logics, however, does not obtain with respect to possible-translations semantics, marking a distinction between this kind of semantics and Kripke semantics for modalities.
perspective for this kind of logics.

Let us begin by recalling the valuation semantics for mbC introduced above (Definition 15.1). The clauses for the binary connectives $\to$, $\lor$ and $\land$ are the same as for classical logic. However, as we have seen, the clauses for the paraconsistent negation $\neg$ and the consistency operator $\circ$ give the following (nondeterministic) quasi-matrices

A    ¬A   ∘A
1    1    0    v1
1    0    1    v2
1    0    0    v3
0    1    1    v4
0    1    0    v5

Accordingly, there are five possible valuations for mbC concerning propositions $A$, $\neg A$ and $\circ A$, namely $v_1$ to $v_5$, such that $v_1(A) = v_1(\neg A) = 1$ and $v_1(\circ A) = 0$; $v_2(A) = v_2(\circ A) = 1$ and $v_2(\neg A) = 0$, and so on. Ob-

Fidel claims that, by considering $n$-tuples of a given class of algebras (for some $n \geq 2$), it is possible to analyze the structure of the algebraic models of other nonclassical logics. Besides twist structures for Nelson's logic (where $n = 2$), he introduces in [15.36, Chap. 4] a new semantics for the logic of Ockham algebras $P_{3,1}$ by considering triples of elements of distributive lattices. Any triple $(a, b, c)$ is such that $a$ represents the value of a formula $A$, while $b$ and $c$ represent the values of $\neg A$ and $\neg\neg A$, respectively.

These ideas from Fidel inspired [15.37], wherein the notions of snapshots and swap structures for mbC (and some LFIs extending it) were introduced.

Definition 15.8
Let $\mathcal{A} = \langle A, \land, \lor, \to, 0, 1 \rangle$ be a Boolean algebra, and let $B_{\mathcal{A}} = \{(a, b, c) \in A \times A \times A : a \lor b = 1 \text{ and } a \land b \land c = 0\}$. A swap structure for mbC over $\mathcal{A}$ is any multialgebra $\mathcal{B} = \langle B, \land, \lor, \to, \neg, \circ \rangle$ such that $B \subseteq B_{\mathcal{A}}$ and where the multivalued operations satisfy the following
conditions, for every $(a, b, c)$ and $(a', b', c')$ in $B$, and for each $\# \in \{\land, \lor, \to\}$:

(i) $(a, b, c) \# (a', b', c') \overset{\text{def}}{=} \{(a'', b'', c'') \in B : a'' = a \# a'\}$;
(ii) $\neg(a, b, c) \overset{\text{def}}{=} \{(a'', b'', c'') \in B : a'' = b\}$;
(iii) $\circ(a, b, c) \overset{\text{def}}{=} \{(a'', b'', c'') \in B : a'' = c\}$.

The elements of a given swap structure are called snapshots. Intuitively, a snapshot $x = (a, b, c)$ simultaneously keeps track of the value $a$ of a given formula $A$, the value $b$ of $\neg A$, and the value $c$ of $\circ A$. Because of this, $a \lor b = 1$ by the principle of excluded middle, and $a \land b \land c = 0$ by the gentle explosion law, which are both valid in mbC.

The binary operations (cf. clause (i)) fix only the first coordinate of the output, computed from the first coordinates of the given snapshots: the second and third coordinates of the output do not depend on the given data. This reflects the fact that the binary connectives are classical, but the truth values of $\neg(A \# B)$ and $\circ(A \# B)$ are unrelated to the truth values of $A$, $B$, $\neg A$, $\neg B$, $\circ A$ and $\circ B$. With respect to the unary connectives (cf. clauses (ii) and (iii)), the negation $\neg x$ of a snapshot $x$ is the set of snapshots with the second coordinate of $x$ in the first place. Accordingly, the consistency $\circ x$ of $x$ has the third coordinate of $x$ in the first place. This reflects the intuitive meaning of the components of a snapshot, as mentioned above. As in the case of (i), the second and third coordinates of $\neg x$ are independent from $x$, since $\neg\neg A$ and $\circ\neg A$ are independent from $A$, $\neg A$ and $\circ A$. The same observation holds for $\circ x$.

Swap structures are multialgebras defined over suitable subsets of $A^3$, for any Boolean algebra $\mathcal{A}$. Hence they naturally determine a family of nondeterministic matrices in the sense of Avron and Lev (see below) which semantically characterize mbC and other LFIs.

Any swap structure $\mathcal{B}$ for mbC determines a nondeterministic matrix $\mathcal{M}(\mathcal{B}) \overset{\text{def}}{=} \langle \mathcal{B}, D_{\mathcal{B}} \rangle$ such that $D_{\mathcal{B}} = \{x \in B : x_1 = 1\}$ is the set of designated truth values (here, $x_1$ denotes the first coordinate of the snapshot $x$). Let $\mathcal{K}$ be the class of such nondeterministic matrices, and define the semantic consequence with respect to swap structures for mbC as follows: $\Gamma \models_{\mathcal{K}} A$ iff $\Gamma \models_{\mathcal{M}(\mathcal{B})} A$, for every swap structure $\mathcal{B}$ for mbC (the consequence relation on each nondeterministic matrix is defined by means of valuations in the sense of Avron and Lev, see below). Then, the following result can be proven [15.37]:

Theorem 15.1 Soundness and Completeness
Let $\Gamma \cup \{A\}$ be a set of formulas in mbC. Then, $\Gamma \vdash_{mbC} A$ if and only if $\Gamma \models_{\mathcal{K}} A$.

The last result can be strongly improved: Let $\mathcal{A}_2$ be the two-element Boolean algebra, and let $\mathcal{K}_2$ be the nondeterministic matrix $\mathcal{M}(\mathcal{B})$ induced by the unique swap structure $\mathcal{B}$ for mbC over $\mathcal{A}_2$ with domain $B_{\mathcal{A}_2}$. Observe that $B_{\mathcal{A}_2}$ coincides with the set $B_5 = \{t, T, t_0, F, f_0\}$ mentioned at the beginning of this section (with the notation introduced therein). Then, the following result can be proven [15.37]:

Theorem 15.2 Soundness and Completeness with Respect to $\mathcal{A}_2$
Let $\Gamma \cup \{A\}$ be a set of formulas in mbC. Then, $\Gamma \vdash_{mbC} A$ if and only if $\Gamma \models_{\mathcal{K}_2} A$.

Theorem 15.2 is nothing more than the semantic characterization of mbC by means of Nmatrices obtained in [15.33]. The notion of nondeterministic matrices (or Nmatrices) was proposed by Avron and Lev in [15.38], and afterwards studied by Avron and his collaborators. Basically, an Nmatrix is a logical matrix $\mathcal{M} = \langle \mathcal{A}, D \rangle$ such that each operation in the algebra $\mathcal{A}$ is multivalued, that is, if $c_{\mathcal{M}}$ is an $n$-ary operator of $\mathcal{A}$ (interpreting a connective $c$) and $(a_1, \ldots, a_n) \in A^n$ (where $A$ is the domain of the multialgebra) then $c_{\mathcal{M}}(a_1, \ldots, a_n)$ is a finite, nonempty subset of $A$. The valuations are mappings $v$ assigning to each formula of the logic being interpreted by $\mathcal{M}$ an element of $A$ in a coherent way, as follows: $v(c(A_1, \ldots, A_n)) \in c_{\mathcal{M}}(v(A_1), \ldots, v(A_n))$. The semantic consequence is defined as in the case of standard logical matrices, namely: $A$ follows from a set of formulas $\Gamma$ if, for every valuation $v$, $v(A) \in D$ (the set of designated truth values) whenever $v(\gamma) \in D$ for every $\gamma$ in $\Gamma$.

As observed above, the Nmatrices for mbC considered by Avron were defined by means of a notion equivalent to snapshots in $B_5$, with the same motivations described here. Swap structures propose a generalization of this approach to arbitrary Boolean algebras (instead of taking the two-valued Boolean algebra). From this, some interesting perspectives to study LFIs from the point of view of multialgebras arise, by adapting appropriate techniques from algebraic logic.

Finally, the possible-translations semantics paradigm will be briefly surveyed here. The possible-translations semantics (PTS) were introduced by Carnielli in 1990 [15.39] as an attempt to offer a more palatable interpretation, from the philosophical point of view, for some nonclassical logics, and especially for paraconsistent logics. PTSs are based on the notion of translations between logics (that is, mappings that preserve consequence relations). Using an analogy with natural languages, translations can be thought of as different world views, and the concept of PTSs is a way
of interpreting a given logic $L$ as the combination of all possible world views, represented by an appropriate set of translations of the formulas of $L$ into a class of logics with known consequence relation. By choosing an adequate collection of such translations, the object logic $L$ acquires a semantic meaning throughout the logics into which it is translated (the so-called traducts). When the translations and traducts are decidable, PTSs offer a decision procedure.

In formal terms, a possible-translations semantics for a logic $L$ is a pair of families $\langle \{L_i\}_{i \in I}, \{f_i\}_{i \in I} \rangle$ such that each $L_i$ is a logic and each $f_i$ is a mapping from $L$ to $L_i$ such that $\Gamma \vdash_L A$ if and only if $f_i[\Gamma] \vdash_{L_i} f_i(A)$, for every $i \in I$.

In [15.40] it was shown that possible-translations semantics are able to express Nmatrices, which implies that the latter is a particular case of the former. On the other hand, from the viewpoint of the consequence relation, and leaving aside algebraic aspects, swap-structure semantics is nothing more than a semantics defined by a family of nondeterministic matrices. Therefore, it can be seen as a particular case of possible-translations semantics, as pointed out above.

From the previous considerations, it can be seen that possible-translations semantics is a semantic tool having a wide scope of applications and a high degree of generality. On the other hand, Nmatrices and swap structures offer semantic interpretations with a more algebraic-oriented perspective, thus being more intuitive. In particular, swap structures offer not only a promising way to study mbC and other LFIs from the point of view of algebraic logic, but also a new angle for understanding why such logics matter for the analysis of reasoning.
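To make Definition 15.8 concrete, the snapshots and multioperations of the swap structure over the two-element Boolean algebra can be enumerated directly. The sketch below is illustrative only: it assumes the full domain $B_{\mathcal{A}_2}$, encodes the Boolean values as the integers 0 and 1, and uses function names chosen here rather than taken from [15.37].

```python
from itertools import product

# Operations of the two-element Boolean algebra on {0, 1}.
def join(a, b):
    return a | b

def meet(a, b):
    return a & b

def imp(a, b):
    return (1 - a) | b

# Snapshots (a, b, c): a ∨ b = 1 (excluded middle) and a ∧ b ∧ c = 0 (gentle explosion).
SNAPSHOTS = [
    (a, b, c)
    for a, b, c in product((0, 1), repeat=3)
    if join(a, b) == 1 and meet(meet(a, b), c) == 0
]

def binary(op, x, y):
    """Clause (i): only the first coordinate of the result is determined."""
    target = op(x[0], y[0])
    return {z for z in SNAPSHOTS if z[0] == target}

def neg(x):
    """Clause (ii): results carry the second coordinate of x in first place."""
    return {z for z in SNAPSHOTS if z[0] == x[1]}

def circ(x):
    """Clause (iii): results carry the third coordinate of x in first place."""
    return {z for z in SNAPSHOTS if z[0] == x[2]}

# Designated values: snapshots whose first coordinate is 1.
DESIGNATED = {z for z in SNAPSHOTS if z[0] == 1}

def explosion_counterexample():
    """Look for values of A, ¬A, B with A and ¬A designated but B not."""
    for v_a, v_b in product(SNAPSHOTS, repeat=2):
        for v_not_a in neg(v_a):
            if v_a in DESIGNATED and v_not_a in DESIGNATED and v_b not in DESIGNATED:
                return v_a, v_not_a, v_b
    return None

def gentle_explosion_holds():
    """No snapshot lets A, ¬A and ∘A all take a designated value."""
    return not any(
        v_a in DESIGNATED and (neg(v_a) & DESIGNATED) and (circ(v_a) & DESIGNATED)
        for v_a in SNAPSHOTS
    )
```

The enumeration yields exactly five snapshots, one for each of the valuations $v_1$ to $v_5$ of the quasi-matrices. The search in `explosion_counterexample` finds a witness built on the snapshot (1, 1, 0), showing that $A$ and $\neg A$ together do not entail an arbitrary $B$ in this Nmatrix, while `gentle_explosion_holds` confirms that adding $\circ A$ rules every such witness out.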
15.6 Conclusions
Classical logic is a very powerful tool for modeling both informal and scientific reasoning. However, the fact that it is not able to handle contradictions is an important constraint. Although paraconsistent logics have been gaining an increasingly relevant place in contemporary philosophical debate, there is still some resistance against recognizing their philosophical significance, and it is likely that this reluctance is primarily related to the awkwardness of the claim that there are meaningful contradictory propositions about reality which are true. An epistemological interpretation of the acceptance of contradictions by paraconsistent logics has been presented and defended here. It has also been argued that the logics of formal inconsistency are capable of expressing contradictions as conflicting evidence, a notion weaker than truth that occurs in several contexts of informal reasoning and scientific research. The unary operator $\circ$ initially had the purpose of representing the metatheoretical notion of consistency within the object language. But the idea has been further developed in such a way that it may receive alternative meanings. Actually, any logical property may have its validity restricted to a group of propositions, depending on the context of reasoning one wants to represent. This has been done in the logic mbCD and its extension $LET_K$, which restrict not only explosion but also excluded middle.

In Sect. 15.3 we have argued that the logic mbC, and its first-order extension QmbC, are naturally connected to the general question of finding solutions for (respectively, sentential and quantified) problems of abduction. Although the use of tableaux in abductive contexts is not a novelty, paraconsistent tableaux do represent an advance, due to the problem of the contradictory character of certain abductive solutions. The subject of abduction is currently investigated, and employed, in several fields such as artificial intelligence research, formal systems of law and norms, diagnostic expert systems and databases. In such cases, theory and practice rely more and more on the paraconsistency paradigm. In this way, logic-based abduction, regarded from a paraconsistent perspective, acquires special interest. Taking into account, for instance, that probabilistic abductive reasoning as a form of decision making is already extensively used in areas with a higher degree of uncertainty, such as medical diagnoses and pharmaceutical tests, further investigation on paraconsistent probabilistic methods combined with tableaux would open a new research window, adding interest to the type of approach here expounded.

With respect to the paraconsistent modal logics surveyed in Sect. 15.4, it is worth noting that the epistemological interpretation of contradictions in reasoning scenarios can be naturally combined with modal logics. The informal interpretation here suggested, according to which a pair of contradictory propositions $A$ and $\neg A$ means that there is conflicting evidence about $A$, and that $\circ A$ means that the truth value of $A$ has been conclusively established, can be combined with modalities with interesting new aspects. For example, $\circ\Box A$ may be understood as meaning that the question about whether or not $A$ is necessary has been conclusively established, and $\Box\circ A$ may be understood as meaning that necessarily the truth value of $A$ will be conclusively established, thus assigning a realistic determinism to the proposition $A$. The mere possibility of performing this separation by means of cathodic and anodic modalities already seems to offer new logical perspectives to some modal dilemmas, although this is not the appropriate place to develop such analysis. In all cases, the ideas here surveyed certainly offer new tools for philosophers and logicians.

Finally, in Sect. 15.5, some alternative semantics were discussed for mbC (and for nonclassical logics in general). The fact that mbC lies outside the scope of the traditional algebraic methods has furthered the development of new kinds of semantics. Three paradigms were briefly surveyed, and their interrelations were discussed: swap structures, nondeterministic matrices and possible-translations semantics. These paradigms are bounded nondeterministic in nature, in the sense that the semantic procedure can produce more than one output for a given input, but only within a previously determined set of values. This strongly suggests that bounded nondeterminism is a suitable approach when studying mbC as well as other logics of the same kind. Besides characterizing mbC, the formal properties of such semantics deserve future studies. In particular, the algebraic properties of swap structures constitute a stimulating topic of research.
References
15.1 N.C.A. da Costa, S. French: Science and Partial Truth: A Unitary Approach to Models and Scientific Reasoning (Oxford Univ. Press, Oxford 2003)
15.2 T. Nickles: From Copernicus to Ptolemy: Inconsistency and method. In: Inconsistency in Science, ed. by J. Meheus (Springer, Dordrecht 2002)
15.3 N.C.A. da Costa: Sistemas formais inconsistentes (Inconsistent formal systems) (Editora UFPR, Curitiba 1993), in Portuguese
15.4 N.C.A. da Costa: The philosophical import of paraconsistent logic, J. Non-Class. Log. 1, 1–19 (1982)
15.5 A. Einstein: Relativity: The Special and General Theory (Emporium Books, Belrose 2013)
15.6 R.P. Feynman, R.B. Leighton, M. Sands: The Feynman Lectures on Physics, Vol. 1 (Basic Books, New York 2010)
15.7 W.A. Carnielli, M.E. Coniglio, J. Marcos: Logics of formal inconsistency. In: Handbook of Philosophical Logic, Vol. 14, ed. by D.M. Gabbay, F. Guenthner (Springer, Dordrecht 2007)
15.8 K. Gödel: Zum intuitionistischen Aussagenkalkül, Anz. Akad. Wiss. Wien 69, 65–66 (1932)
15.9 N.C.A. da Costa, E.H. Alves: A semantical analysis of the calculi Cn, Notre Dame J. Formal Log. 18, 621–630 (1977)
15.10 C.S. Peirce: Collected Papers (Harvard Univ. Press, Cambridge 1931)
15.11 P. Mancosu: Mathematical explanation: Problems and prospects, Topoi 20(1), 97–117 (2001)
15.12 O. Arieli, M. Denecker, B. van Nuffelen, M. Bruynooghe: Coherent integration of databases by abductive logic programming, J. Artif. Intell. Res. 21, 245–286 (2004)
15.13 W.A. Carnielli, S. De Amo, J. Marcos: A logical framework for integrating inconsistent information in multiple databases, Lect. Notes Comput. Sci. 2284, 67–84 (2002)
15.14 W.A. Carnielli, M.E. Coniglio, R. Podiacki, T. Rodrigues: On the way to a wider model theory: Completeness theorems for first-order logics of formal inconsistency, Rev. Symb. Log. 7(63), 548–578 (2014)
15.15 A. Aliseda: Seeking Explanations: Abduction in Logic, Philosophy of Science and Artificial Intelligence, Ph.D. Thesis (Stanford University, Stanford 1997)
15.16 M.E. Coniglio: Obtención de respuestas en bases de conocimiento a partir de completaciones, Query procedures in knowledge bases by means of completions, Proc. 21th JAIIO Argent. Symp. Inf. Operat. Res. (SADIO) (1992), in Spanish
15.17 W.A. Carnielli, M. Lima-Marques: Reasoning under inconsistent knowledge, J. Appl. Non-Class. Log. 2(1), 49–79 (1992)
15.18 W.A. Carnielli, J. Marcos: Tableau systems for logics of formal inconsistency, Proc. Int. Conf. Artif. Intell. (IC-AI 01), Vol. II (2001) pp. 848–852
15.19 C. Caleiro, W.A. Carnielli, M.E. Coniglio, J. Marcos: Dyadic semantics for many-valued logics, Draft http://sqig.math.ist.utl.pt/pub/caleiroc/03-CCCM-dyadic2.pdf
15.20 N.C.A. da Costa, W.A. Carnielli: On paraconsistent deontic logic, Philosophia 16(3/4), 293–305 (1986)
15.21 M.E. Coniglio: Logics of deontic inconsistency, Revista Brasileira de Filosofia 233, 162–186 (2009)
15.22 M.E. Coniglio, N.M. Peron: A paraconsistentist approach to Chisholm's paradox, Principia 13(3), 299–326 (2009)
15.23 J.-Y. Béziau: S5 is a paraconsistent logic and so is first-order classical logic, Log. Stud. 9, 301–309 (2002)
15.24 J.-Y. Béziau: Paraconsistent logic from a modal viewpoint, J. Appl. Log. 3, 7–14 (2005)
15.25 A.L. de Araújo, E.H. Alves, J.A.D. Guerzoni: Some relations between modal and paraconsistent logic, J. Non-Class. Log. 4(2), 33–44 (1987)
15.26 J.P. Sartre: Existentialism is a Humanism (Yale Univ. Press, New Haven 2007)
15.27 J. Hansen, G. Pigozzi, L. van der Torre: Ten philosophical problems in deontic logic, Dagstuhl Seminar Proceeding 07122 (2007), http://drops.dagstuhl.de/opus/volltexte/2007/941/
15.28 J. Bueno-Soler: Two semantical approaches to paraconsistent modalities, Log. Univ. 4(1), 137–160 (2010)
15.29 J. Bueno-Soler: Multimodalidades anódicas e catódicas: A negação controlada em lógicas multimodais e seu poder expressivo (Anodic and Cathodic Multimodalities: Controlled Negation in Multimodal Logics and Their Expressive Power), Ph.D. Thesis (IFCH-Unicamp, Campinas 2009), in Portuguese
15.30 J. Bueno-Soler: Completeness and incompleteness for anodic modal logics, J. Appl. Non-Class. Log. 19, 291–310 (2009)
15.31 J. Bueno-Soler: Multimodal incompleteness under weak negations, Log. Univ. 7, 21–31 (2013)
15.32 W.J. Blok, D. Pigozzi: Algebraizable Logics, Memoirs of the American Mathematical Society Ser., Vol. 77 (396) (American Mathematical Society, Providence 1989)
15.33 A. Avron: Non-deterministic matrices and modular semantics of rules. In: Logica Universalis, ed. by J.-Y. Béziau (Birkhäuser Verlag, Basel 2005) pp. 149–167
15.34 M. Fidel: An algebraic study of a propositional system of Nelson. In: Mathematical Logic. Proceedings of the First Brazilian Conference on Mathematical Logic, Campinas 1977, Lecture Notes in Pure and Applied Mathematics Ser., Vol. 39, ed. by A.I. Arruda, N.C.A. da Costa, R. Chuaqui (Marcel Dekker, New York 1978) pp. 99–117
15.35 D. Vakarelov: Notes on N-lattices and constructive logic with strong negation, Stud. Log. 36(1/2), 109–125 (1977)
15.36 M.M. Fidel: Nuevos enfoques en Lógica Algebraica (New Approaches to Algebraic Logic), Ph.D. Thesis (Universidad Nacional del Sur, Bahía Blanca 2003), in Spanish
15.37 W.A. Carnielli, M.E. Coniglio: Swap structures for LFIs, CLE e-Prints 14(1), 1–39 (2014)
15.38 A. Avron, I. Lev: Canonical propositional Gentzen-type systems, Lect. Notes Artif. Intell. 2083, 529–544 (2001)
15.39 W.A. Carnielli: Many-valued logics and plausible reasoning, Proc. 20th Int. Symp. Multiple-Valued Log. (The IEEE Computer Society Press, Los Alamitos 1990) pp. 328–335
15.40 W.A. Carnielli, M.E. Coniglio: Splitting logics. In: We Will Show Them! Essays in Honour of Dov Gabbay, Vol. 1, ed. by S. Artemov, H. Barringer, A.A. Garcez (College Publications, London 2005)
Part D
Model-Based Reasoning in Science and the History of Science
Ed. by Nora Alejandrina Schwartz
The chapters contained in Part D, Model-Based Reasoning in Science and the History of Science, provide conceptual tools that allow us to understand model-based reasoning in current science and the history of science, and akin notions such as similar-system-based inferences. On the one hand, they give analytic frames (cognitive, historical, and methodological ones; Chaps. 16–19) to help us understand that kind of scientific reasoning in any domain or across the social sciences. On the other hand, they address specific forms of model-based reasoning: a paradigm of model-based diagnostic reasoning, supported by a formal theory of diagnostic reasoning (Chap. 20) and a review of the debate on thought experiment (Chap. 21).

Chapter 16 proposes an interpretative frame based on cognitive science to understand the effects mathematical representations may have on scientists' model-based reasoning, specifically on that of physicists. This frame is constituted by the concept of model-based reasoning, the concept of metaphorical processes founded on embodied cognition and on more basic conceptual spaces, and the concept of long-term working memory. Ryan Tweney defends this proposal on the basis of three claims: (a) that mathematical representations used in physics exemplify model-based reasoning, (b) that the working of such models depends on acquired metaphors and conceptual blends, and (c) that the acquisition of these metaphorical grounds can be explained by developing long-term working memory. He illustrates the argument that the above-mentioned cognitive schemes can be understood as the basis of mathematical representations in physics, developing the analysis of one part of J.C. Maxwell's field theory of electromagnetism.

Chapter 17 seeks to clarify and underline the merits of the cognitive-historical approach elaborated by Nancy Nersessian, an environmental perspective within cognitive studies of science, with which a central aspect of model-based reasoning in science is treated: a process to solve representational problems generating historically creative ideas. In order to achieve this goal, Nora A. Schwartz introduces the main problems and solutions provided using the cognitive-historical method. Accordingly, the chapter consists of three parts: questions about the creation of scientific concepts, epistemic virtues of the cognitive-historical analysis, and a hypothesis about the creation of scientific concepts. The first one is focused on the nature of the cognitive processes involved in the creation of ideas and the search for an account of their effectiveness in achieving successful results. The second one exhibits the epistemic virtues of the historical and cognitive dimensions of the method. The last one introduces the dynamic hypothesis of cognitive processes implicated in scientific change and also develops the argument that model-based reasoning is effective in creating new candidate representations because it facilitates changes of constraints.

Chapter 18 helps to improve the understanding and appreciation of the notion of physically similar systems in the philosophy of science. Susan Sterrett characterizes this concept as it is understood currently, based on the article by Edgar Buckingham On Physically Similar Systems: Illustrations of the Use of Dimensional Equations. Then she draws a path from the earliest precursors of the concept in the Renaissance to its plain articulation in the twentieth century, bringing out, on the one hand, the key idea of function, which was developed in the eighteenth century, and, on the other hand, the idea of a coherent system of units, which was developed in the late nineteenth century. Also, Sterrett discusses the role that the notion has in reasoning and drawing inferences: the concept of similar systems has been useful in developing methods to draw inferences about values of specific quantities in a system, based on observations in other systems. Sterrett emphasizes that the success of this approach in physical chemistry promoted the extension of a similar-system approach to electromagnetic theory and gas kinetic theory.

Chapter 19 focuses on the distinctive features of the method of hypothetical modeling in social sciences by treating it as one style of reasoning: abstract or theoretical model-based reasoning. With this goal, Caterina Marchionni, Alessandra Basso, and Chiara Lisciandra compare this method with other styles of reasoning employed in social science: experiments and computer simulations. Differences between hypothetical modeling and experiments are found, and the consequences they have for making inferences about the world are explored. Considered closely, computer simulations are also viewed as a different style of reasoning from that of analytical models, in that they are particularly apt for dealing with complex systems. Also, the legitimacy of hypothetical modeling as a way of learning about social scientific phenomena is examined. Recognizing how little philosophical agreement there is on the issue, the chapter rebuilds the discussion by organizing the different perspectives around the function of models that is taken as primary.

Chapter 20 presents model-based diagnostic reasoning, understood as a paradigm of diagnostic inference aimed at giving rational explanations of some faulty behavior of the system under discussion. The main idea of this paradigm is the comparison between the behavior of the observed system and the one which can be predicted using knowledge about the system model. The model-based diagnostic approach is placed within knowledge engineering methods from the artificial intelligence domain and is based on the formal theory of diagnostic reasoning by R. Reiter. Antoni Ligeza and Bartlomiej Górny illustrate the method in detail with applications; particularly they mention the dynamic system of three tanks.

Chapter 21 is structured in five parts. In the first one, Margherita Archangeli introduces in a historical context a sample of examples of thought experimentation with the purpose of giving a precise idea of the issues under discussion. In the second part, she refers to the more relevant steps in the history of thought experiments: the beginning, when the term was coined, and the two phases into which the debate can be divided, the classic one and the contemporary one. In the last three sections, Archangeli tackles the following questions: what is a thought experiment? What is the function of thought experiments? How do they achieve their function? She reviews what has been said about thought experiment definitions placed within an experimental domain and within a theoretical domain and, finally, she refers to the main features that should help us to identify thought experiments. Further, she deals with the central epistemological questions treated in the literature on thought experiments: what kind of knowledge do thought experiments produce? To what extent are thought experiments a reliable source of information? What role do thought experiments play in the processes of rational choice? Finally, she reviews what has been said about the cognitive underpinnings of thought experimentation and focuses on the role of imagination in thought experiments.
16. Metaphor and Model-Based Reasoning in Mathematical Physics
Ryan D. Tweney
Mathematics is central in science; it is frequently used as the basis for calculation, as a means of derivation of new expressions, and, the focus of this chapter, as a means of representation. Oddly, however, there are few attempts to deal with the power of mathematics as a representational medium in science, in spite of extensive work on the psychological and cognitive underpinnings of scientific thought in general [16.1].

To clarify what is meant by representation, consider the following. Isaac Newton, in his Principia Mathematica [16.2] formulated a law of universal gravitation which is usually today expressed with the following
342 Part D Model-Based Reasoning in Science and the History of Science
repulsive force between two magnets or two electric charges. The action at a distance account was challenged by Michael Faraday, who instead argued that electric and magnetic forces depended upon lines of force, the first true field theory in physics. By the end of his life, Faraday believed he had demonstrated the physical reality of the lines as immaterial but real centers of power [16.8].
Faraday was well known mainly for his many experimental researches and discoveries, but his theoretical account had almost no adherents, except the young Maxwell. For Maxwell, Faraday's account was a seminal one, and he set about translating it into mathematical expressions. Eventually, he was able to show that the prevalent action at a distance theories of electromagnetic effects were less tenable than a true field theory (although this account also was slow to gain acceptance, as Hunt has shown [16.9]).
In this chapter, using a part of Maxwell's account, it will be shown how cognitive science can provide an analytic framework for an understanding of the role of mathematics in physics. Maxwell's reformulation of classical physical ideas can thus be understood in cognitive terms, using recent formulations of model-based reasoning in science, and recent analyses of the underlying metaphoric bases of mathematics. The argument is based on three claims:
1. That mathematical representations can serve in model-based reasoning, and
2. That an understanding of how they are used requires attention to the embodied metaphoric understandings of the expressions. The metaphoric bases are in turn
3. Dependent upon automated cognitive processes related to the employment of long-term working memory (LTWM).
In this way, the external representation in the form of a mathematical expression is coordinated with an internal representation.
One terminological point is needed. In distinguishing between metaphors and analogies, an unconventional division between two terms often seen as interchangeable is used. In the present usage (following [16.10]), metaphor is used to signal a taken-for-granted, tacit comparison. I use analogy to signal a comparison between a source and a target that must be explicitly argued. In the particular case of Maxwell's physics, there have been many studies of his use of analogy in this sense, but little about his use of metaphor.
16.1.2 Metaphoric Processes
that my approach differs from accounts that regard
tent knowledge, as Chi et al. [16.31] have shown (see also [16.32]). The professor subjects have developed such LTWM retrieval structures centered around the basic principles of physics, while the graduate students are still acquiring them and are more dependent upon surface-level cues.
observed: Each wire was cutting all the lines of force but the generated currents were in opposite directions, thus cancelling each other. This was the result he was after: the lines of magnetic force ran through the magnet, out at one end, curved around through space, and re-entered at the other end of the magnet. Magnetic lines of force are closed curves.

16.2.3 Maxwell: Magnetic Lines Within a Magnet

Maxwell's Treatise is divided into four parts, with the fourth part developing the final form of his theory of electromagnetism and the third presenting his account of magnetism. The first chapter of the third part considered the magnetic potential at any point outside of a nearby magnet, showing that the force on a unit magnetic pole is equal to the gradient of the potential ($\nabla V$, where $V$ is a scalar function), that is, to the rate of change of the potential in the direction of greatest change. In Chapter II, Maxwell considered the forces within a magnet. In contrast to Faraday, however, he did not here conduct experiments, nor replicate Faraday's (although they are cited). Instead, he conducted a series of thought experiments.
He began by imagining a cylindrical hollow cavity within a bar magnet (Fig. 16.2). Taking its length as $2b$ and its radius as $a$, he then imagined a unit magnetic pole centered within the cavity. Such a pole is an imaginary object, since magnetic monopoles do not exist (that is, if you break a magnet into two pieces, each piece will have a North and South pole, breaking them again, each piece will have two poles, and so on). Still, were such a thing to exist, it is possible to represent the forces it would experience. There are two sources; first the forces due to magnetic induction from the ends of the cavity. Since the field lines are parallel to the walls of the cylinder, the walls play no role, only the circular ends are involved. Second, there are forces due to the potential field within the cavity. That is, there is an overall field because the cavity is within a magnet and a specific field due to the surface distribution of magnetism on the ends of the cylindrical cavity. Note that the forces due to the circular surfaces are of opposite polarity to the ends of the magnet.

Fig. 16.2 Maxwell's thought experiment: a bar magnet with a cavity inside. The cavity is cylindrical, of length $2b$ and with faces of radius $a$. Note that the polarity of the faces is the reverse of the polarity of the nearest end of the magnet (figure labels: magnet ends N and S, cavity faces (S) and (N), pole position P)

Maxwell first considered the field due to the surface distribution on the cylinder ends, claiming that the forces on the monopole are equal and in the same direction (because the monopole will be attracted by one surface and repelled by the other). This force will be

$$R = 4\pi I \left(1 - \frac{b}{\sqrt{a^2 + b^2}}\right), \tag{16.3}$$

where $R$ is the force and $I$ is the intensity of magnetization. Because the dimensions of the cavity are involved, the force is dependent upon the shape of the cavity. Interestingly, Maxwell does not show how this equation is obtained, taking for granted that the reader will know how to do this (while not lengthy, I will not carry this out; see [16.6, the comment on 396.2] and the discussion that follows).
With this in hand, Maxwell now asked the reader to consider two cases. In the first, imagine that $a$ is very small, that is, shrink the diameter of the cylinder cavity. From (16.3), note that $R$ will approach 0 as $a$ approaches 0. In the second case, let the cylinder shrink in length. As $b$ approaches 0, then $R$ approaches $4\pi I$. This means that, in the first case, a long and thin cylinder, the force will simply be that due to the overall field; it will be the gradient of the potential. Maxwell calls this magnetic force within the magnet and symbolized it as a vector, $\mathbf{H}$ (here using bold-face, to indicate a vector). In the second case, which becomes a flat disk as the cylinder length shrinks, the force is dependent on $R$ and is compounded of $4\pi I$ and $\mathbf{H}$. He symbolizes this new quantity as $\mathbf{B}$ and calls it the magnetic induction. The two terms are related by a simple equation, via the overall intensity of magnetization, $I$, which, written as a vector, is

$$\mathbf{B} = \mathbf{H} + 4\pi \mathbf{I}. \tag{16.4}$$

Note from (16.4) that the distinction between $\mathbf{B}$ and $\mathbf{H}$ will hold only within a magnet; in the absence of a surrounding magnet, that is, when $\mathbf{I} = 0$, the two are identical (Fig. 16.3).
Maxwell used the relation between $\mathbf{B}$ and $\mathbf{H}$ to clarify a paradox in Faraday's notion of lines of force. The paradox arose because the directions of $\mathbf{B}$ and of $\mathbf{H}$ differ. That is, the magnetic force due to $\mathbf{H}$ always goes from the North Pole of the magnet to the South Pole, both inside and outside the magnet! As a result, they
ple, consider (16.3) from the previous section

$$R = 4\pi I \left(1 - \frac{b}{\sqrt{a^2 + b^2}}\right).$$

As noted earlier, Maxwell did not provide a derivation of this result, instead assuming that his readers would be able to recognize it. To show its metaphoric nature, first consider the term $a^2 + b^2$. From Fig. 16.2, it is apparent that this is related, via the Pythagorean theorem, to the length of the hypotenuse of the triangle with sides $a$ and $b$. If we take the square root and call this $r$, then we can simplify (16.3)

$$R = 4\pi I \left(1 - \frac{b}{r}\right). \tag{16.5a}$$

This, in turn, becomes

$$R = 4\pi I - 4\pi I \, \frac{b}{r}. \tag{16.5b}$$

Now suppose that $a$ shrinks (Maxwell's first case). Then $b/r$ goes to 1 and $R$ goes to 0. And if $b$ shrinks (Maxwell's second case), then $b/r$ goes to 0 and $R$ goes to $4\pi I$.
The attentive reader can now see how the metaphoric underpinnings worked in the discussion of this equation. For, in fact, what has been asked of you to do is what Maxwell (with, to be sure, more extensive metaphors assumed) asked of his readers! That is, we drew upon your knowledge of the Pythagorean theorem and upon your metaphoric sense of what happens when geometric terms like $a$ and $b$ change. Further, how the sense of algebraic equations can be modified, as in going from (16.3) to (16.5a) and (16.5b), was also involved. These did not need to be specifically argued because, as Lakoff and Núñez [16.21] argued, these have been acquired on the basis of long practice: they are conceptual blends with metaphoric groundings. On my account, they are not analogies, because the links between source and target are implicit and assumed to be known among his readers. This is why Maxwell does not explicate (16.3).
However, (16.3) is not yet fully explicated for our purposes. Where does the $4\pi I$ come from? In the previous section, Maxwell had considered the force on a small magnet due to the distribution of a surface of magnetic matter (like the imagined cavity and the magnetic monopole, this is another convenient fiction). That discussion, in turn, relied upon results achieved in the first volume of the Treatise, in which he showed that the surface distribution of an electric charge on a conductor exerted a force near to the conductor equal to $4\pi\sigma$, where $\sigma$ is the surface distribution of charge. In the present case, $I$ is equivalent to the charge in the earlier case. In particular, both charge and magnetic entities exert force according to an inverse square law, that is, inversely as the square of the distance. Thus, $4\pi I$, unlike the other part of (16.3), is an analogy, albeit itself grounded in the mathematics of earlier parts of the book [16.7, Vol. 2, p. 5]:

"Since the expression of the law of force between given quantities of Magnetism has exactly the same mathematical form as the law of force between quantities of Electricity of equal numerical value, much of the mathematical treatment of magnetism must be similar to that of electricity."

Maxwell is able to carry over the expression for the magnetic surface density from the equivalent expression for electric surface density: he does not need to repeat the derivation (which is also built on metaphoric grounds and hence can be taken as given), he only needs to have shown the analogy.
We can again obtain an informal understanding by asking where the multiplier $4\pi$ comes from. Note first that the monopole at point P is subjected to an attractive force from one face of the cylindrical cavity and a repelling force from the other face. Both forces are in the same direction, so any one face is contributing $2\pi$ to the result. But $2\pi$ is the circumference of a circle of radius 1. Here, it appears as if Maxwell is relying upon a previous result from the first volume of the Treatise, namely Stokes's theorem, which states that the surface integral of the curl of a vector field over a surface is equal to the line integral of that field around the curve bounding the surface. Explaining this would go beyond the scope of this chapter, but it implies in the case of the circular face of the cavity that the force due to the face can be construed as either based on the density of magnetization of the surface or, equivalently, as based on a circulation around the closed curve (the circle) that bounds it. Thus, $2\pi$ emerges!
Note that for Maxwell's readers Stokes's theorem would have been assumed knowledge (it is explained in a preliminary chapter [16.7, Vol. 1, p. 29]). For the present purpose, however, it is enough to catch some glimpse of how the factor emerges; in the following chapter, Maxwell uses Stokes's theorem to make a more explicitly physical representation. There, he shows that a magnetic shell (a surface bounded by a closed curve) can be represented equivalently by an electric current in a conductor that follows the same closed curve.
One final point: Maxwell's Treatise is notable in part for its use of vectors as representational entities. In the selection here, these appear as $\mathbf{H}$, $\mathbf{B}$, and $\mathbf{I}$. We have previously discussed the metaphoric basis of vector representations [16.10]. For the present case, it needs only to be noted that vectors are quantities that represent both magnitude and direction. They can be grounded on elementary notions of muscular force and direction, and can then be conceptually blended with other mathematical concepts. Throughout the Treatise, Maxwell uses them (and the vector calculus) as part of his overall representation of fields (as in Fig. 16.3). The introduction of such vector analysis was an important milestone in mathematical physics generally, one that continues to be used today [16.5, 45].
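The two limiting cases of (16.3), and the claim that each circular face contributes half of the result, can be checked numerically. The sketch below is an illustration added for the reader, not Maxwell's or the chapter's own derivation. It assumes the standard inverse-square treatment of a uniformly covered disk acting on a unit pole on its axis, in the unit convention that yields the factor $4\pi$; the function names `cavity_force` and `disk_axial_force` and the sample dimensions are hypothetical.

```python
import math


def cavity_force(a, b, intensity=1.0):
    """Closed form of (16.3): R = 4*pi*I*(1 - b/sqrt(a^2 + b^2))."""
    r = math.hypot(a, b)  # hypotenuse of the triangle with sides a and b
    return 4.0 * math.pi * intensity * (1.0 - b / r)


def disk_axial_force(a, b, density=1.0, steps=20_000):
    """Axial force on a unit pole at distance b from a uniform disk of radius a.

    Assumption: each ring of radius s and width ds carries 2*pi*s*ds*density
    and acts by an inverse-square law; only the axial component survives.
    The integral is approximated with the midpoint rule.
    """
    ds = a / steps
    total = 0.0
    for k in range(steps):
        s = (k + 0.5) * ds
        total += 2.0 * math.pi * density * s * b / (s * s + b * b) ** 1.5 * ds
    return total


if __name__ == "__main__":
    a, b = 1.0, 1.0  # illustrative cavity dimensions, not taken from the source
    print("closed form (16.3):        ", cavity_force(a, b))
    print("two faces, summed numerically:", 2.0 * disk_axial_force(a, b))
    print("a -> 0 (long thin cylinder): ", cavity_force(1e-6, 1.0))
    print("b -> 0 (flat disk):          ", cavity_force(1.0, 1e-9), "vs", 4.0 * math.pi)
```

Under these assumptions, doubling the single-face integral reproduces the closed form, which is one way to see informally why each face contributes $2\pi I$ in the flat-disk limit.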
thought experiment he presents is fundamentally anchored in the reader's ability to follow the claims made via the construction of a model and via the implementation of the mathematical representations involved. Note that they lead up to the expression of an identity, not an equation in the usual sense. That is, (16.4), $\mathbf{B} = \mathbf{H} + 4\pi \mathbf{I}$, is presented, not because it has calculational uses but because it shows the reader the relationships among key terms and because, by using vector notation for the first time in the section, it reiterates the directional character of the lines of induction, of force, and of magnetic intensity. It is important because of its representational character.
As noted earlier Faraday represented magnetic lines of force experimentally, by constructing apparatus that enabled the detection of the lines within a magnet. Maxwell achieved the same thing using a thought experiment, a move that allowed him to distinguish between $\mathbf{H}$ and $\mathbf{B}$, thus identifying $\mathbf{B}$ as the physically significant quantity. The two approaches complement each other in an interesting fashion. Thus, Faraday's science is replete with hand-eye-mind representations [16.13]; for him, the lines of force were physically real to the extent that he could observe their effects and manipulate their character. For Maxwell, the observation and manipulation were based, not on experiment directly, but on the expression of a mental model and its extension via the metaphoric underpinnings of the mathematical representations. It, too, had a hand-eye-mind character.
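The directional content of the identity (16.4) can be made concrete with a small sketch. It is an illustration only: the component values are hypothetical, chosen so that $\mathbf{H}$ inside the magnet points against the magnetization $\mathbf{I}$, and the helper names `magnetic_induction` and `dot` are not from the source.

```python
import math


def magnetic_induction(h, i):
    """Component-wise B = H + 4*pi*I, as in (16.4)."""
    return tuple(hc + 4.0 * math.pi * ic for hc, ic in zip(h, i))


def dot(u, v):
    """Scalar product; its sign tells whether two vectors point the same way."""
    return sum(x * y for x, y in zip(u, v))


# Hypothetical values inside the magnet: magnetization along +z,
# magnetic force H along -z (from the North end toward the South end).
I_inside = (0.0, 0.0, 1.0)
H_inside = (0.0, 0.0, -2.0)
B_inside = magnetic_induction(H_inside, I_inside)

# Outside the magnet there is no magnetization, so B and H coincide.
H_outside = (0.0, 0.0, 0.5)
B_outside = magnetic_induction(H_outside, (0.0, 0.0, 0.0))

if __name__ == "__main__":
    print("inside: B =", B_inside, " B.H =", dot(B_inside, H_inside))
    print("outside: B == H ?", B_outside == H_outside)
```

With these assumed numbers the scalar product of $\mathbf{B}$ and $\mathbf{H}$ is negative inside the magnet, so the two point in opposite directions there, while setting $\mathbf{I} = 0$ makes them identical, as the text notes for (16.4).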
16.5 Conclusions
Ultimately, then, this is the true fashion in which Faraday and Maxwell can be seen as similar: both were doing science in a style dependent upon a fundamental embodiment of the conceptual representations they created. For both, this was, in fact, a conscious goal. When Faraday is seeking the physical reality of his lines of force, he is doing just what Maxwell was doing in identifying the vector $\mathbf{B}$ as the physically significant quantity. That they followed different pathways, that Faraday's was experimental and Maxwell's mathematical, is not, in the end, the most important aspect for an understanding of their creative achievements.
Beyond these two cases, however, there is a more general point to be made. That model-based reasoning is ubiquitous in science should be clear to the reader of this volume. What case studies of the type offered here can provide is a method of discovery of the finer points with which such reasoning is carried out. While not every scientist will resemble either Faraday or Maxwell in the way in which they employ such reasoning, still, the nuances may be quite general across cases. In particular, the importance of distinguishing between those aspects of model-based reasoning that are tacit (and hence unargued, what we have referred to as metaphorical in nature) versus those that are explicit (i.e., analogical in nature) is central to any understanding of scientific thinking. That is why determining the role of expertise and of long-term working memory is so helpful in understanding the particularities of a case, any case.
The case studies also reflect a challenge to the common view that science deals with increasingly abstract entities, particularly in situations in which mathematical representation is involved. In fact, however, the presence of arcane symbols and equations does not mean that, for the scientist, these are necessarily abstract, however they appear to the uninitiated. In the present chapter, we have tried to show how both Faraday and Maxwell were anchored in quite concrete representations of their respective models of the electromagnetic field within a magnet. Those representations depended for their utility on the highly skilled and easily accessible expertise that each investigator possessed. The presence of such expertise is the necessary cognitive grounding of creative achievement in science.

Acknowledgments. Thanks are due especially to Howard Fisher, who has saved me from many errors and is not responsible for remaining ones! I have benefitted greatly from discussions of Maxwell with John Clement, Howard Fisher, Frank James, Nancy Nersessian, and Thomas Simpson. The chapter's ultimate origin stems from discussions with the late David Gooding and with Elke Kurz-Milcke. The proximate origin is a paper given at MBR012 in Sestri Levante, Italy, in June, 2012; I am grateful for the questions and comments of the other participants and to Lorenzo Magnani for his support. Matt Lira and Frank James provided helpful comments on an early draft, for which I am grateful. An earlier version was published in L. Magnani (Ed.): Model Based Reasoning in Science and Technology (Springer, Berlin 2014) pp. 395–414.
References
16.1 M.E. Gorman, R.D. Tweney, D.C. Gooding, A.P. Kincannon (Eds.): Scientific and Technological Thinking (Lawrence Erlbaum, Mahwah 2005)
16.2 I. Newton: The Principia: Mathematical Principles of Natural Philosophy (Univ. California Press, Berkeley 1999), transl. by I.B. Cohen, A. Whitman, originally published 1687
16.3 J.L. Lagrange: Analytical Mechanics (Kluwer, Boston 1997), transl. and ed. by A.C. Boissonnade, V.N. Vagliente, originally published 1788
16.4 I. Grattan-Guinness: The Fontana History of the Mathematical Sciences: The Rainbow of Mathematics (Fontana Press, London 1997)
16.5 E. Garber: The Language of Physics: The Calculus and the Development of Theoretical Physics in Europe, 1750–1914 (Birkhäuser, Boston 1999)
16.6 H. Fisher: Maxwell's Treatise on Electricity and Magnetism: The Central Argument (Green Lion, Santa Fe 2014)
16.7 J.C. Maxwell: A Treatise on Electricity and Magnetism, 3rd edn. (Clarendon Press, Oxford 1891), 2 Volumes, revised by J.J. Thompson, originally published 1873
16.8 D.C. Gooding: Final steps to the field theory: Faraday's study of magnetic phenomena, Hist. Stud. Phys. Sci. 11, 231–275 (1981)
16.9 B.R. Hunt: The Maxwellians (Cornell Univ. Press, Ithaca 2005)
16.10 R.D. Tweney: On the unreasonable reasonableness of mathematical physics: A cognitive view. In: Psychology of Science: Implicit and Explicit Processes, ed. by R.W. Proctor, E.J. Capaldi (Oxford Univ. Press, Oxford 2012) pp. 406–435
16.11 L. Magnani: Abduction, Reason, and Science: Processes of Discovery and Explanation (Kluwer/Plenum, New York 2001)
16.12 J. Cat: Into the 'regions of physical and metaphysical chaos': Maxwell's scientific metaphysics and natural philosophy of action (agency, determinacy and necessity from theology, moral philosophy and history to mathematics, theory and experiment), Stud. Hist. Philos. Sci. Part A 43, 91–104 (2011)
16.13 D. Gooding: Experiment and the Making of Meaning: Human Agency in Scientific Observation and Experiment (Kluwer, Dordrecht 1990)
16.14 N.J. Nersessian: Creating Scientific Concepts (MIT Press, Cambridge, MA 2008)
16.15 T.K. Simpson: Figures of Thought: A Literary Appreciation of Maxwell's Treatise on Electricity and Magnetism (Green Lion Press, Santa Fe 2005)
16.16 T.K. Simpson: Maxwell's Mathematical Rhetoric: Rethinking the Treatise on Electricity and Magnetism (Green Lion Press, Santa Fe 2010)
16.17 P.N. Johnson-Laird: Mental models in cognitive science, Cogn. Sci. 4, 71–115 (1980)
16.18 K. Forbus: Reasoning about space and motion. In: Mental Models, ed. by D. Gentner, A. Stevens (Lawrence Erlbaum, Hillsdale 1983) pp. 53–74
16.19 J. Clement: Creative Model Construction in Scientists and Students: Imagery, Analogy, and Mental Simulation (Springer, Dordrecht 2008)
16.20 G. Lakoff, M. Johnson: Philosophy in the Flesh: The Embodied Mind and its Challenge to Modern Thought (Basic Books, New York 1999)
16.21 G. Lakoff, R.E. Núñez: Where Mathematics Comes From: How the Embodied Mind Brings Mathematics into Being (Basic Books, New York 2000)
16.22 M. Turner: Cognitive Dimensions of Social Science (Oxford Univ. Press, Oxford 2001)
16.23 R.E. Núñez: Creating mathematical infinites: Metaphor, blending, and the beauty of transfinite cardinals, J. Pragmat. 37, 1717–1741 (2005)
16.24 G.L. Murphy: On metaphoric representation, Cognit. 60, 173–204 (1996)
16.25 G.L. Murphy: Reasons to doubt the present evidence for metaphoric representation, Cognit. 62, 99–108 (1997)
16.26 D.A. Weiskopf: Embodied cognition and linguistic comprehension, Stud. Hist. Philos. Sci. 41, 294–304 (2010)
16.27 R.W. Gibbs Jr.: Why many concepts are metaphorical, Cognit. 61, 309–319 (1996)
16.28 D. Gentner, M. Jeziorski: The shift from metaphor to analogy in Western science. In: Metaphor and Thought, ed. by A. Ortony (Cambridge Univ. Press, Cambridge 1993) pp. 447–480
16.29 K.A. Ericsson, W. Kintsch: Long-term working memory, Psychol. Rev. 102, 211–245 (1995)
16.30 E. Kurz-Milcke: The authority of representations. In: Experts in Science and Society, ed. by E. Kurz-Milcke, G. Gigerenzer (Kluwer/Plenum, New York 2004) pp. 281–302
16.31 M.T.H. Chi, P.J. Feltovich, R. Glaser: Categorization and representation of physics problems by experts and novices, Cogn. Sci. 5, 121–152 (1981)
16.32 J.H. Larkin, J. McDermott, D.P. Simon, H.A. Simon: Models of competence in solving physics problems, Cogn. Sci. 4, 317–345 (1980)
16.33 F.A.J.L. James: Michael Faraday: A Very Short Introduction (Oxford Univ. Press, Oxford 2010)
16.34 C.W.F. Everitt: James Clerk Maxwell: Physicist and Natural Philosopher (Charles Scribner's Sons, New York 1975)
16.35 M. Faraday: On the Physical Character of the Lines of Magnetic Force. In: Experimental Researches in Electricity, Vol. 3, ed. by M. Faraday (Taylor Francis, London 1855) pp. 407–437 (originally published 1852)
16.36 R.D. Tweney: Inventing the field: Michael Faraday and the creative 'engineering' of electromagnetic field theory. In: Inventive minds: Creativity in Technology, ed. by R.J. Weber, D.N. Perkins (Oxford Univ. Press, Oxford 1992) pp. 31–47
16.37 M. Faraday: Experimental researches in electricity, Nineteenth series. On the magnetization of light and the illumination of magnetic lines of force. In: Experimental Researches in Electricity, Vol. 3, ed. by M. Faraday (Taylor Francis, London 1855) pp. 1–26 (originally published 1846)
16.38 W. Thomson (Lord Kelvin): On the uniform motion of heat in homogeneous solid bodies, and its connexion with the mathematical theory of electricity. In: Reprint of Papers on Electrostatics and Magnetism, ed. by Sir W. Thomson (Macmillan Co., London 1872) pp. 1–14 (originally published 1842)
16.39 W. Thomson (Lord Kelvin): On the mathematical theory of electricity in equilibrium. I. On the elementary laws of statical electricity. In: Reprint of Papers on Electrostatics and Magnetism, ed. by Sir W. Thomson (Macmillan Co., London 1872) pp. 15–37 (originally published 1845)
16.40 N. Nersessian: Faraday to Einstein: Constructing Meaning in Scientific Theories (Nijhoff, Dordrecht 1984)
16.41 D.M. Siegel: Innovation in Maxwell's Electromagnetic Theory: Molecular Vortices, Displacement Current, and Light (Cambridge Univ. Press, Cambridge 1991)
16.42 M. Faraday: Experimental researches in electricity, Twenty-eighth series. On lines of magnetic force: Their definite character; and their distribution within a magnet and through space. In: Experimental Researches in Electricity, Vol. 3, ed. by M. Faraday (Taylor Francis, London 1855) pp. 328–370 (originally published 1851)
16.43 W. Thomson (Lord Kelvin): A mathematical theory of magnetism. In: Reprint of Papers on Electrostatics and Magnetism, ed. by Sir W. Thomson (Macmillan Co., London 1872) pp. 340–425 (originally published 1849)
16.44 C. Smith, M.N. Wise: Energy and Empire: A Biographical Study of Lord Kelvin (Cambridge Univ. Press, Cambridge 1989)
16.45 M.J. Crowe: A History of Vector Analysis (Univ. Notre Dame Press, South Bend 1967)
16.46 R.D. Tweney: Representing the electromagnetic field: How Maxwell's mathematics empowered Faraday's field theory, Sci. Educ. 20(7/8), 687–700 (2011)
16.47 P.M. Harman: The Natural Philosophy of James Clerk Maxwell (Cambridge Univ. Press, Cambridge 1998)
16.48 A. Warwick: Masters of Theory: Cambridge and the Rise of Mathematical Physics (Univ. Chicago Press, Chicago 2003)
16.49 I. Lakatos: Proofs and Refutations (Cambridge Univ. Press, Cambridge 1976) (originally published 1963–1964)
16.50 J. Cat: On understanding: Maxwell on the methods of illustration and scientific metaphor, Stud. Hist. Philos. Modern Phys. 32, 395–441 (2001)
16.51 M. Bradie: Models and metaphors in science: The metaphorical turn, Protosociol. 12, 305–318 (1998)
16.52 D. Gentner, B. Bowdle: Metaphor as structure-mapping. In: The Cambridge Handbook of Metaphor and Thought, ed. by R.W. Gibbs Jr. (Cambridge Univ. Press, Cambridge 2008) pp. 109–128
17. Nancy Nersessian's Cognitive-Historical Approach

Nancy Nersessian raises questions about the creation of scientific concepts and proposes answers to them based on the cognitive-historical approach. These problems are mainly about the nature of the cognitive processes involved in the generation of ideas fundamentally new in human history and the efficacy of those mechanisms in achieving successful results. In this chapter, I intend to show the epistemic virtues that make this method a useful tool for establishing the dynamic hypothesis about the creation of knowledge in science. I also point out that, compared to other methods of cognitive studies on the creation of scientific knowledge (ethnography, in vivo observation, and laboratory experiments), the cognitive-historical approach turns out to be primary. I analyze Nersessian's idea that scientists often employ model-based reasoning, in an iterative way, in order to solve representational problems in the target domain. Additionally, I examine her claim that model-based reasoning facilitates conceptual change. This hypothesis involves a representation of concepts illustrated by the dynamic frames theory about concepts.

17.1 Questions About the Creation of Scientific Concepts
17.1.1 The Problem of Conceptual Change
17.1.2 The Naturalistic Approach to Science: Revision of the Problem
17.1.3 The Naturalistic Recasting
17.2 The Epistemic Virtues of Cognitive Historical Analysis
17.2.1 The Cognitive–Historical Approach
17.2.2 Epistemic Virtues and Dimensions of this Approach
17.2.3 Cognitive Methods to Investigate Conceptual Innovation
17.3 Hypothesis About the Creation of Scientific Concepts
17.3.1 Dynamic Hypothesis
17.3.2 The Power of Model-Based Reasoning
17.4 Conclusions
References

Nancy Nersessian has studied the creation of scientific concepts from a naturalized perspective in the philosophy of science: the cognitive-historical approach. Why did she do this? What properties does this method possess that justify such employment? The main purpose of this chapter is to clarify the way in which she understands this method and to underline its merits. In order to achieve this goal, I will first introduce the main problems she addresses with this method and then present the solutions to these questions that she has been able to provide using it.
This chapter consists of three parts. The first, Sect. 17.1 Questions About the Creation of Scientific Concepts, highlights that, within Nersessian's recasting of the problem of conceptual change, there lies a fundamental question related to the creation of scientific concepts, and that yet another question arises from this one. The basic question is about the nature of cognitive processes implied in the generation of ideas fundamentally new in human history, which leads to the assessment of their effectiveness in achieving successful results. First, I will introduce Nersessian's review of the way in which logical positivism and the historicist philosophy of science have framed the problem of conceptual change, as well as her critical evaluation of this matter.
The second part, Sect. 17.2 The Epistemic Virtues of Cognitive Historical Analysis, deals with Nersessian's conception of the cognitive-historical method, emphasizing those qualities that make it a useful tool for answering the open questions about the creation of scientific concepts. In addition, it is pointed out here that, compared to other approaches to the creation of scientific knowledge, the cognitive-historical method can be considered primary, for it is the one that establishes the generative mechanisms of creative concepts in a historical sense.
The last part, Sect. 17.3 Hypothesis About the Creation of Scientific Concepts, has two sections. The first one introduces the dynamic hypothesis proposed by Nersessian to give a solution to the problem of the nature of practices that generate new scientific concepts. Mainly, it will be observed that, through her cognitive-historical investigations, Nersessian confirms that scientists often employ model-based reasoning in an iterative way until they solve representational problems in the target domain. The second section presents the general conception of the meaning or the representation of concepts that Nersessian proposed in the first place so as to understand the change of conceptual structures, and the one that she has more recently suggested, in order to explain the effectiveness of model-based reasoning for creating new concepts.
called the justification context. This means that they tended to make rational reconstructions of science, in particular artificial maps of the logical relationships between concepts [17.3, p. x]. They analyzed scientific concepts as linguistic structures and considered that logical and conceptual studies were enough to understand the meaning of scientific theories, without studying real scientific activity [17.4, p. 4]. With respect to the second reason why she thinks that the treatment of scientific change has been unsatisfactory, Nersessian highlights the fact that, though Kuhn and Feyerabend resorted to scientific knowledge in order to understand conceptual change, the Gestalt psychology available then did not give them suitable tools for that [17.1, p. 6]. She points out that the perceptual metaphor of Gestalt change, which Kuhn took from Gestalt psychology, had an adverse effect [17.1, p. 6].
“[. . . ] By emphasizing the endpoints of a conceptual change (e.g., Newtonian mechanics and relativistic mechanics) [. . . ], the change of Gestalt was made to appear artificially abrupt and discontinuous.”
Moreover, Nersessian wrote that, as a philosopher of science, Kuhn neglected the processual aspect of conceptual change [17.1, p. 7].
“Significantly, although Kuhn does talk about discovery as an extended process [17.5, pp. 45ff] and, in his role as historian of science, has provided detailed examinations of such processes, in his role as philosopher of science he identifies conceptual change with the last act when the pieces fall together [17.6].”
Regarding the discussion of the problem of conceptual change in the second half of the twentieth century, Arabatzis and Kindi [17.4], Andersen and Nersessian [17.7], and Thagard [17.8] can be consulted.
17.1.2 The Naturalistic Approach to Science: Revision of the Problem
During the 1980s and 1990s, Nersessian, Paul Thagard and Hanne Andersen, and Peter Barker and Xiang Chen concurred in revising the Kuhnian idea of a radical and sudden conceptual change in science, and they realized that something like that would be, at the most, exceptional. Thus, Thagard proposed that to think of changes as gestaltic ones “[. . . ] makes it hard to see how conceptual change can take place” [17.8, p. 49]; see also [17.9]. The conviction that a naturalist scientific approach would enable facing the problem in a suitable way began to impose itself. That is why several scholars of science further developed such an approach, which had been started by the historicist philosophers. From this developed a methodological form of naturalism, which defends the need to appeal to science in order to understand science [17.10, Introduction]. Methodological naturalism is one of the various kinds of naturalistic philosophy that have been proposed. Nersessian’s approach, like Ronald Giere’s, belongs to this position. This position is usually attributed to Kuhn in The Structure of Scientific Revolutions [17.5], but it was pioneered by Ludwik Fleck in Genesis and Development of a Scientific Fact [17.11]. Within the domain of philosophy of science, Fleck insisted on studying the practice of science instead of propounding rational reconstructions of the logic of investigation. Therefore, the real subject of science is a matter of concern for methodological naturalism [17.10, p. 3]. Ronald Giere characterized the stance as “[. . . ] the vision that all human activities should be understood entirely as natural phenomena, as are activities of chemicals or animals” [17.12, p. 8]. As for the naturalistic studies of science, he describes them as the perspective of “[. . . ] using science in the attempt to understand science itself” [17.13, p. 145]. Nersessian bases her own research on science on a naturalistic approach holding that, to understand scientific knowledge, philosophical theories need to have the best scientific information available on the human subject and about the practices for constructing knowledge used by scientists; she also holds the view that empirical methods are admissible in developing and testing philosophical hypotheses [17.14, pp. 4–5].
17.1.3 The Naturalistic Recasting
With the adoption of the naturalistic approach to science, the presentation of the problem of conceptual change is modified. The contrived logical reconstructions of science are replaced by the study of effective scientific practices aiming at explaining the continuous and noncumulative character of conceptual change. Kuhn conceived normal scientific cognition in terms of practices of solving puzzles guided by solutions to exemplar problems [17.15, Chap. 6]. The post-Kuhnian group of cognitive orientation maintains that interest in scientific activity, focusing, specifically, on the practices of formation and changing of scientific concepts. In general, scientific practices can be understood as procedures carried out by scientific agents. For example, David Gooding characterizes the notion of procedure implied in such practices as “[. . . ] a sequence of acts or operations whose inferential structure is undecided” [17.16, p. 8]. An overview of the philosophical discussions related to scientific practices can be found in the work of Joseph Rouse [17.17]. Although
science, also participate in the studies of science and technology (STS), together with the social–cultural programs. Some of the methods of the cognitive studies of science use the traditional view of cognitive science. This means that they assume that cognition is a processing of symbols that occurs within the individual minds of humans. Therefore, the cultural and social dimensions of scientific practice are not an integral part of their analysis. On the other hand, other approaches of cognitive studies, the environmental ones, recognize that the material, cultural, and social environments in which science is practiced are crucial to understanding scientific cognition. Nersessian argues that, in order to give accounts which capture the fusion of the cultural-cognitive-social dimensions in the practices producing scientific and engineering knowledge, it is convenient to use the view of environmental perspectives in cognitive science. Nersessian [17.19] provides an overview of the lines of environmental research that were delineated between the 1980s and the year 2000. The environmental paradigm emphasizes that sociocultural and bodily factors have a substantial role in cognitive processes [17.19, 20]. It should be pointed out, however, that researchers in cognitive science who study science from the environmental perspective resist the view of those who think that all the aspects relevant to science can be explained in terms of sociocultural factors. Nersessian considers that this view is a form of reductionism which is manifested, for example, in the declaration of a 10-year moratorium on cognitive science studies, which was initiated by Bruno Latour and Stephen Woolgar in 1986 [17.21].
In short, the cognitive-historical method is an environmental approach inscribed within the cognitive studies of science with which Nersessian interprets the problem of conceptual change. The new version of the problem of conceptual change contains a basic question related to the creation of scientific concepts: Which situated cognitive processes, that is, processes integrated within their environment, do scientists develop in order to come to articulate new concepts from vague notions? As will be treated in Sect. 17.2, this issue has led to investigations that indicate the sought-after processes are model-based reasoning. This conclusion, in turn, impelled the posing of new questions. One of them stands out for its relevance and its potential to determine why this kind of reasoning affords an appropriate medium to generate new scientific concepts: “What features of model-based-reasoning make it a particularly effective means of conceptual innovation?” [17.14, p. 186].
term cognitive history: “Recently a research frontier I call cognitive history has emerged within the history of science and is finding its place in this confederation (Cognitive science)” [17.24, p. 194]. However, this kind of mixed approach was used by other authors, at least since the 1970s. Tweney [17.25] presented an overview of its versions that includes the contributions published principally since the 1980s. He restricts his attention to
The way in which Nersessian understands the cognitive-historical method constitutes a kind of philosophical analysis that integrates several contributions from philosophy, history, and psychology. This is not a purely formal analysis like the ones exalted by logical positivism and questioned by the philosophy of science of the mid-twentieth century. It is, by contrast, an interpretation of historical scientific practices in terms
time. Some examples of practices approached in this way are [17.24, p. 194]:
“[. . . ] devising and executing real-world and thought experiments, constructing arguments, inventing and using mathematical tools, creating conceptual innovations, devising means of communicating ideas and practices, and training practitioners.”
This temporal feature of the method is valuable because it makes it possible to obtain knowledge on conceptual change, a scientific phenomenon that is difficult to capture because it is exceptional and usually involves long lapses of time. The historical component of the analysis does not constitute a historical narrative, but a detailed investigation of microstructures and microprocesses, more specifically, of representational practices and practices of problem solving [17.1, 14, 24]. On the other hand, the historical dimension of the method is a contextual perspective, that is, it takes into account the community where the scientific practices have been carried out and the cultural resources implied therein. This point of view has the virtue of satisfying the objective of preserving the essence of the phenomenon under investigation pursued by the ecological approach in psychological investigation. One of the ways to access long-term scientific activities is through historical records. Among the sources are: diaries, laboratory notebooks, publications, correspondence, experimental equipment, drawings, diagrams, lecture notes, and texts. In the case that the investigated procedures are long term and extend into the present, data about them can be obtained in other ways, for example, using field observation and other ethnographic methods (see the next section). Here the main sources of information are the cognitive tools employed in scientific activities and the artifacts produced by them. Consequently, another benefit of the cognitive-historical method is that the information about the practices embedded in conceptual change is not restricted to scientists’ verbal accounts.
The cognitive dimension of the method is inscribed in the tradition of psychological epistemology, including works by Locke, Hume, and Quine [17.1, p. 5]. It refers to the employment of cognitive sciences for understanding the scientific practices involved in the creation and change of concepts. It postulates that results, interpretations, and relevant debates on cognitive science would help to understand such scientific practices [17.14, p. 6]. As has been previously mentioned, the cognitive approach adopted by Nersessian is the environmental one, and this implies considering scientific and engineering thought as a complex system that comprises material, cultural, and social aspects [17.19]. The continuum hypothesis justifies the claim that the achievements of cognitive sciences can be employed to understand scientific practices. This hypothesis refers to the human cognitive capacities and mechanisms of those who make science. It maintains that they are basically the same as those of ordinary humans. Therefore, to a great extent, what scientists do, and the constraints they experience, derive from their human cognitive condition [17.1]:
“The underlying presupposition is that the problem-solving strategies scientists have invented and the representational practices they have developed over the course of the history of science are very sophisticated and refined outgrowths of ordinary reasoning and representational processes.”
The continuum hypothesis does not negate the fact that there are great differences between scientific and ordinary cognition. Indeed, scientists have a vast knowledge of a specific domain, have methodological training, and have learned to metacognitively reflect on and refine the use of cognitive capacities that give them the ability to reason scientifically in carrying out the necessary cognitive functions [17.37, p. 2]. Nersessian defends this hypothesis by pointing out that it is not speculative, it is not an a priori conjecture, but that it is, on the contrary, a claim based on psychological facts. She also suggests that the understanding of the investigated scientific practices consists in attributing to them a sense which transcends the specific characteristics of the case. This appears to derive from the fact that this sense would be similar to, if not the same as, the one that makes ordinary cognitive practices generalizable. Therefore, the possibility of making general descriptions about the nature and processes of scientific activities is opened. In fact, Nersessian affirms that such regularities can be abstracted from the thick descriptions of the particular case ([17.38, Chap. 1]; [17.14, p. 9]). Thick description is a concept from qualitative investigation. It is thought to have been introduced for the first time by Gilbert Ryle. For Ryle [17.39], the thick description implied assigning intentionality to one’s behavior. The thick description interprets behavior within the context and assigns thought and intentionality to the observed behavior. So, for Ryle the thick description implies understanding and absorbing the context of the situation or the behavior. It also signifies assigning current and future intentionality to behavior. Geertz borrows this philosophical term from Ryle in order to describe ethnographic work [17.40]. In this way, by abstracting regularities about scientific activities, the problem that historicist philosophers have to face is solved: the one of how to go from a case study to a more general conclusion, avoiding the risks of making a hasty generalization [17.1, p. 35]. The cognitive-historical approach, therefore, is satisfactory for understanding the nature of scientific practices and for going beyond conclusions established for particular cases.
17.2.3 Cognitive Methods to Investigate Conceptual Innovation
Nersessian asserts that the primary method to investigate practices of conceptual innovation in science is cognitive-historical analysis (Fig. 17.2). She agrees with other cognitive researchers of science that none of the approaches in use is sufficient by itself to understand those practices and considers, like them, that those methods must be complemented in order to understand the complexity of scientific cognition. For example, Simon and Klahr assert that [17.41, p. 531]:
“[. . . ] the fundamental thesis in this article is that the findings from these diverse approaches, when considered in combination, can advance our understanding of the discovery process more than any single approach;”
a similar position is taken by Thagard [17.18, Chap. 1]. However, at the same time, she considers that not all the employed methodologies are necessarily equal. There is a central feature in cognitive-historical analysis that makes Nersessian consider it fundamental and that distinguishes it from the other methodologies: It enables acquisition of knowledge about the mechanisms that have had a historical impact in science. This quality will be properly understood after examining the other approaches and highlighting in them a characteristic they all share, and which contrasts with the one just mentioned as distinctive of cognitive-historical analysis.
Ethnographic studies are based on field work and, in particular, studies about science are usually based on research made in scientific laboratories [17.42]. Ethnography of science started at the beginning of the 1980s with the sociology of scientific knowledge (SSK) considered as part of social studies of science and technology (STS) – with the work of Latour and Woolgar [17.21] and Knorr-Cetina [17.43], among others. But ethnography of science is also counted among the cognitive studies pursued to investigate cognition and its context in mutual relation ([17.44–46]; [17.19, p. 38]). Thus, Nersessian points out that both the way in which scientists understand the problems and the tools they use to solve them depend on a sociocultural context [17.14, pp. 7, 9]. Specific ethnographic methods are qualitative: They include field observation, collection of artifacts, and field interviews. Each of them is useful to investigate the heuristic of discovery that scientists follow in their daily scientific practice, as they carry it out, that is, in real time and, therefore, to study scientists’ cognitive mechanisms when creating new concepts.
Fig. 17.2 Cognitive approaches of the creation of scientific knowledge (cognitive studies of the creation of scientific knowledge: cognitive-historical analysis, ethnography, observation in vivo, laboratory studies)
In vivo observation is a method suggested by Dunbar [17.47, 48]. It arises from the premise that scientists forget many of the important thought processes they use, and that, consequently, there is no record of them in their notes or laboratory notebooks. So he argues that it is necessary to investigate living scientists in order to have information about these processes, that is, by observing them while they perform their work. During observational studies, important activities at the laboratory are recorded, and then the data are codified and interpreted within the frame of psychological constructs.
Laboratory experiments are studies of people’s problem-solving processes in artificial situations, with the purpose of isolating one or more relevant aspects of science from the real world. In general, they are performed in a psychology laboratory. The experimentation on discovery processes is carried out employing nonscientists as experimental subjects. The principal roles for this experimentation are two: being tools for testing hypotheses already posed, and being exploratory tools. In this latter case, the experiments are conducted to make certain phenomena appear [17.41, pp. 526–527]. Klahr maintains that the experimentation on the discovery processes enables a detailed analysis of the processes of solving problems and that they can be entirely identified and recorded [17.49].
Ethnography, in vivo observation, and laboratory experiments have in common that they all provide information about the activities performed at the scientists’ own laboratories, or at laboratories of cognitive sciences during normally brief sessions: hours or days. However, the ethnographic studies mixed with the cognitive-historical method can be extended for years, each of them providing relevant information. An example
Nancy Nersessian’s Cognitive-Historical Approach 17.3 Hypothesis About the Creation of Scientific Concepts 363
of this is Nersessian’s and her collaborators’ own research in biomedical engineering laboratories. In this case [17.44, p. 2],
“Ethnographic studies tell us what BME practitioners do in the context of their research and cognitive-historical studies provide insight into how and why these cognitive practices have evolved.”
What is more important is that these approaches are characterized by revealing the creative mechanisms that underlie the practices performed when learning, taking ownership of, and employing existing concepts. In other words, they inform about the scientists’ cognitive mechanisms that are psychologically creative, that is, mechanisms that produce a conceptual novelty for them.
Unlike these approaches, the cognitive-historical method is the only one that can provide information about cognitive practices that are extended in time, thus revealing the mechanisms generating creative concepts in a historical sense, that is, concepts that were previously nonexistent and that have, consequently, historical impact [17.50]. Nersessian adopts Margaret Boden’s distinction between psychologically creative ideas and, on the other hand, historically creative ideas. While the first ones “[. . . ] are surprising, or even fundamentally novel, with respect to the individual mind which had the idea [. . . ],” the historically creative ideas have not been previously thought by anyone [17.50, p. 43]. This is the reason why Nersessian considers the cognitive-historical method necessary to understand conceptual innovation. Besides, she has in mind that historically creative mechanisms are also psychologically creative, for they generate not only novelties for humanity, but also novelties for the scientists themselves. For this reason, she considers that ethnography, in vivo observation, and laboratory experimentation are necessary, although in a secondary way, for studying conceptual change. In fact, as these three approaches provide knowledge about scientists’ psychologically creative mechanisms, they can improve the cognitive interpretations of historical episodes in scientific change provided by the cognitive-historical analysis [17.22].
An objection that might be made to the employment of the cognitive-historical method to investigate scientific practices of the creation of concepts is related to a question that any proposal to naturalize the philosophy of science should answer: whether it is legitimate to use scientific knowledge in order to develop a theory of scientific knowledge. A general argument contrary to such a proposal comes from traditional philosophy of science. It states that the use of scientific methods to investigate science is necessarily circular, that it supposes a petitio principii, or leads to a regression [17.51, p. 333]. Nersessian defends the use of the cognitive-historical method to develop a theory of the production of scientific knowledge against the argument of circularity. She adopts a point of view similar to the one of the philosophers who practice a naturalized antifoundationalist epistemology, such as Ronald Giere, who maintains that [17.12, p. 11]
“The [Cartesian] program of trying to justify science without appeal to any even minimally scientific premises has been going on without conspicuous success for 300 years. One begins to suspect the lack of success is due to the impossibility of the task.”
The circularity implied in the naturalized conception of science, therefore, seems to be inherent to humans and insurmountable for them. But this does not mean that such circularity is vicious. Nersessian supports a virtuous circularity, which could be obtained by putting cognitive and historical interpretations in a state of reflective equilibrium. As has been pointed out earlier, she considers that this reflexivity is a particularity of the cognitive-historical analysis [17.1, p. 7]. Therefore, the studies on scientific cognition provide feedback for the field of cognitive science, thus forming the basis for additional cognitive investigation [17.14, p. 7].
throughout the history of science, in a way similar to that in which kinematics describes movement in physics.
In Sect. 17.3.1, I will expatiate on the dynamic hypothesis proposed by Nersessian to solve the problem of the nature of the practices that create new scientific concepts from the cognitive-historical perspective. In Sect. 17.3.2, I will cite how she understands the power of modeling to generate scientific concepts and refer to her proposal on how to consider the representation of concepts in elaborating this position.
17.3.1 Dynamic Hypothesis
The cognitive-historical analysis proceeds by bootstrapping, that is, it enables the establishment of hypotheses about conceptual structures and cognitive processes implied in historical cases of scientific investigation, which, in turn, serve as support to make new historical studies and cognitive interpretations that provide additional knowledge. Let us see how Nersessian employs it when she poses the dynamic hypothesis to solve the problem regarding the nature of the cognitive processes that scientists elaborate in order to articulate new concepts.
The historical studies of several episodes that have led to a conceptual change in science provide information which, according to Nersessian, supports the following conclusion: Conceptual innovation does not occur suddenly, but results from extended problem-solving processes [17.52, pp. 13–14]:
“[. . . ] If one examines their deeds [of scientists] – their papers, diaries, letters, notebooks – these records support a quite different interpretation in most cases. As I have been arguing for some years, conceptual change results from extended problem-solving processes.”
This way of interpreting historical information fits with the fact that Nersessian includes herself within an epistemological tradition constituted by Dewey, Mead, and Popper, by which science is seen precisely as a problem-solving process. However, in spite of her closeness to these authors, she assumes some distance from them, for their view has a limited range, that is, it does not include the scientific phenomenon of conceptual change [17.1, p. 12]. Nersessian articulates the basic interpretative scheme of the cases of conceptual change as problem-solving processes in psychological terms and, in its elaboration, she uses some concepts that come from Gestalt theory and also from cognitive psychology. A brief reference to them will help to clarify her interpretation.
R. E. Mayer reminds us that, in the psychological literature, a problem consists of a given state, a state of destination, and a set of operators. The problem occurs, according to Gestalt theory, when a situation is in one state, the solver wants it to be in another state, and there are obstacles that prevent a fluid transition from one state to the other. In addition, Mayer asserts that, from a cognitive perspective, the solution of problems is “[. . . ] directed, cognitive processing aimed at finding a way to achieve a goal” and that two phases can often be distinguished in it: the representation of the problem and the solution of the problem. The represented problem may be of various kinds; one of them, of particular relevance for our explanation, is that of representational problems. There are usually two different ways of finding and carrying out the solution of problems:
1. The sudden appearance of the solution (insight leap), which occurs immediately after a sudden and more suitable restructuring of the representation of the problem – this kind of solution often comes along with the aha! or Eureka experience, a subjective feeling of surprise.
2. The process of step-by-step solution – also called the analytic method – which consists of finding a strategy and executing a sequence of actions in order to generate a solution to the problem [17.53, pp. 112–113].
Weisberg categorizes step-by-step problem solving as an analytic method: “[. . . ] we can categorize the various modes of solving problems that are based on degrees of specificity of knowledge about a problem as analytic methods” [17.54, p. 282].
By means of these psychological conceptual tools, Nersessian interprets that, basically, the kind of problem that occurs in conceptual change is a representational one, and that the solution implied is a step-by-step one. This means that, on the one hand, this problem consists of a given situation in which a certain phenomenon escapes understanding and the solver does not know how to obtain the new conceptual resources to understand it [17.14, p. xii]; and that, on the other hand, the solution to the representational problem of conceptual change is reached through a heuristic strategy that, as will be immediately apparent, consists of a bootstrapping cycle of modeling, understood as a kind of creative reasoning [17.14, p. 184].
Let us go back to the bootstrapping employment of Nersessian’s cognitive-historical method. She realizes that the historical studies of specific cases of conceptual change show, in a ubiquitous way, that scientists use analogy, visual representation, and thought experiments to solve representational problems. She points
out that these three practices have in common that they are ways of modeling, that is, of the construction and manipulation of models. For example, when examining the development of the current concept of the electromagnetic field, she indicates that Faraday articulated the notion of “continuous, progressive transmission of the action” with the help of the concrete visual image of induction “cutting” the lines of force, and with the help of vague analogies between the electric and magnetic actions and known progressive phenomena [17.3, pp. 144–145]. Again, Nersessian interprets historical data with cognitive tools and considers that the modeling practices, with which new concepts are constructed, are kinds of reasoning. In doing this, she uses a broad notion of reasoning that comes from cognitive psychology, particularly from Johnson-Laird’s semantic conception of reasoning.
Unlike the traditional philosophical notion of reasoning, which only comprises deductive and inductive arguments, Johnson-Laird’s conception enables including kinds of creative reasoning. According to this, much of human reasoning is done through mental modeling. When referring to deductive reasoning, Johnson-Laird et al. assert that [17.55, p. 3]:
“On the other side, there are those, such as ourselves (see also Johnson-Laird and Byrne [17.56]) who claim that it is a semantic process that depends on mental models akin to the models that logicians invoke in formulating the semantics of their calculi.”
Following Johnson-Laird, Nersessian maintains that humans retrieve or construct models through which they make inferences about a target problem. The represented structure is supposed to contain parts that can possess an analogous model that also is made of parts. The nature of mental models is, in Peirce’s words, iconic.
The idea that modeling is a kind of reasoning may be applied both to scientific contexts and to ordinary contexts. As she focused on scientific reasoning tasks, Nersessian finds that she needs to extend Johnson-Laird’s conception by widening the domain of mental models. She understands that models are interpretations intended to satisfy the salient constraints of a physical system, process, phenomenon, or situation. This implies that mental models not only comprehend the structural analogues of what is modelled, that is, models which embody representations of spatial and temporal relations and causal structures, but that they also include functional analogues which are also dynamic in nature. Through her conception of mental modeling, she does not intend to participate in the numerous debates that have been sparked by this notion. That is the reason why she calls her own hypothesis minimalist [17.52, p. 12]:
“To carry out an analysis of model-based reasoning in conceptual change requires only that we adopt a minimalist version of a mental modeling hypothesis: that in certain problem solving tasks humans reason by constructing an internal model of the situations, events and processes that in dynamic cases provide the basis for simulative reasoning.”
A property of scientific model-based reasoning is that it does not guarantee the production of a solution, that is, it is not an algorithmic procedure and, because of this, Nersessian understands it as heuristic [17.57, pp. 325–326]. The difficulty in producing solutions would be that the models used for reasoning may be unsatisfactory, that is, they may not embody the relevant constraints of the target situation, and not so much that the reasoning may be incorrect [17.52, p. 14].
“In the case of science where the situations are more removed from experience and the assumptions more imbued with theoretical assumptions, there is less assurance that a reasoning process, even if correct, will yield success. In the evaluating process, a major criterion for success remains the goodness of fit to the phenomena, but success can also include such factors as enabling the construction of a viable mathematical representation.”
Let us continue with the iterative application of the cognitive-historical method. According to the historical studies of scientific practices, being able to have a model that satisfies the constraints of the target problem frequently involves a cycle of construction, manipulation, evaluation, and adaptation of intermediate models. This is a bootstrapping process. This means that each intermediate model that is constructed achieves a higher satisfaction of the constraints of the target domain, and contributes to constructing the subsequent model. Intermediate models are hybrid, that is, they embody not only the constraints of the target domain, but also those of the respective source domains [17.52, p. 21]. Through the process of construction of satisfactory models, the solution to the representational problems implied in conceptual change cases is achieved. Nersessian compares this kind of extended-in-time process with organic phenomena wherein a perfect innovation emerges [17.14, p. ix]:
“Rather, such conceptual innovation, like perfect orchids and flavorful grapes, emerges from lengthy,
366 Part D Model-Based Reasoning in Science and the History of Science
organic processes, and requires a combination of inherited and environmental conditions to bud and bloom and reach full development.”

An important feature of the bootstrapping process of creating a model with such reasoning is that it implies selectivity. By means of abstraction and evaluation processes, the irrelevant features are left aside, and attention is focused on the relevant ones according to the problem-solving context. These historical findings differ from those established by the current cognitive theories of analogy. According to them, the models to reason with are already provided by the target source [17.1, p. 20]. Therefore, in order to understand the iterative character of the construction of analogue models in science, it is necessary to modify those cognitive theories. In this way, the reflexive nature of the cognitive-historical method becomes evident. Nersessian deeply analyzes a case that exemplifies the idea that there are modeling processes that generate conceptual innovation. Those are the modeling processes that led Maxwell to make the first derivation of field equations for electromagnetic phenomena. Figure 17.3 shows the contribution of the target, source, and model constraints to these processes of reasoning [17.14, Chap. 2].
The thesis of scientific cognition as model-based reasoning has been developing over more than the last thirty years, due not only to Nersessian’s work, but also to the contribution of authors such as Ronald Giere, Lorenzo Magnani, and Paul Thagard. L. Magnani enriched Nersessian’s analysis of model-based reasoning with the reference to the problem of abduction in creative reasoning, also taking advantage of the recent cognitive research on distributed cognition [17.58]. It is fitting to point out the fact that N. Nersessian, since 1998, in collaboration with L. Magnani and P. Thagard, created and promoted the MBR Conferences on Model-Based Reasoning, realizing its seventh convocation in 2015. One of the problems that the thesis of scientific cognition as model-based reasoning presents is that the notion of mental modeling implies that of representation, and the latter has been questioned within the cognitive sciences by the dissenters of the dogma of cognitivism. This doctrine of cognitivism is integrated within the representational and the computational theories of the mind [17.59]. As I have mentioned before, Nersessian adopts a moderate environmental perspective that places her on those dissenters’ side. From this environmental approach, she defends the idea of mental modeling, but conceives of it as a procedure carried out
Fig. 17.3 Maxwell’s modeling process (after [17.14, p. 57]). Labels recoverable from the diagram: initial constraints; target; sources (fluid mechanics, machine mechanics); Models 1–3 (1861-2), each with construct, simulate, map/evaluate, and abstract steps and derived constraints; general dynamical model (1864).
Nancy Nersessian’s Cognitive-Historical Approach 17.3 Hypothesis About the Creation of Scientific Concepts 367
by a cognitive system constituted by internal representations, frequently coupled with resources from the real world [17.60]. Moreover, together with Lisa Osbeck, she suggests a conception of representations organized in models as practices. These representational practices may be interpreted as distributed, that is, they can be expanded through internal–external traditional domains. These representations are [17.61, Introduction]:

“[. . . ] created and used in the cooperative practices of persons as they engage with natural objects, manufactured devices, and traditions, as they seek to understand and solve new problems.”

In their description of the distribution of representation in scientific cultures, Nersessian and Osbeck employ a language with which they try to convey the co-constitutive nature of culture and cognition, that is, the relation between these two domains in a unique system. In it, the concepts of cognitive partnering, internal–external representational coupling, and enactment are central [17.59, 61].

17.3.2 The Power of Model-Based Reasoning

Summing up what was written in the previous part about the hypotheses that Nersessian managed to establish in relation to the creation of scientific concepts, let us say that they explain that this scientific practice is based on mechanisms consisting of a modeling iteration. The modeling series ends with the construction of a satisfactory model from which to draw inferences about a target problem. Once she has reached this conclusion about the process of the creation of scientific concepts, Nersessian poses another question. She tries to explain the efficiency of model-based reasoning for creating scientific concepts and, with this in mind, she decides to stipulate a particular concept of concept.
During the 1980s, Nersessian dedicated her studies primarily to the issue of how to represent concepts in order to understand that conceptual change in science is continuous, gradual, and noncumulative. In this way, she came to describe the representation of a type of concept as “a set of family resemblances exhibited in its ‘meaning schema’.” But more recently, in [17.14], mentioning the state of the art of the concept representation issue, she stated that there is no agreement either in cognitive science or in philosophy of science about how to conceive of a concept. As it is necessary to have some conception about it in order to explain the power of model-based reasoning in the creation of scientific concepts, she propounds one that does not require an answer to other problems under debate. In what follows, I will review the way in which Nersessian treated this issue in order to show the relation it has with her conception of the role of models in creative reasoning.
Nersessian considers the meta-theoretical problem of the representation of concepts as a central theme of the kinematics of conceptual change. This area of the theory of conceptual change aims at determining the form of conceptual change, that is, the differences existing between conceptual structures as time goes by, both between conceptual systems and between individual concepts. This task presupposes that a particular representation of concepts is available. On the one hand, the kinds of change that took place between different conceptualizations, the first conceptual structures just mentioned, refer to changes in the form of the organization of the concepts that integrate them. A conceptual system can be analyzed as a network of nodes, where each node corresponds to a concept and each line within the network corresponds to a link between concepts. Within conceptual networks, concepts are organized through links such as kind, property, and relation. These links can be characterized as connections which indicate that a concept is a kind of another concept, that an object has a property, and which express relations, respectively [17.8, pp. 30–31]. Accordingly, the restructuring of the conceptual systems supposes changes related to the concepts that integrate them, and these changes have an impact on the other concepts. Examples are changes of hierarchy, changes from properties to relations, and the addition and suppression of concepts. These changes are coordinated, that is, since concepts are interlinked, changes related to a concept have an impact on other concepts.
Nersessian recounts a relevant case of change of conception in the history of science, the one that refers to the representation of movement. She analyzes various phases of this change – the medieval philosophers’, Galileo’s, and Newton’s conceptions of movement – pointing out how some concepts attained a new organization, for instance, the concepts of movement, vacuum, and space, and how the new concept of the force of gravity was built. Let us consider a representative sample of the way in which she carries out her examination. She indicates that: movement changes in hierarchy – within the medieval conceptual structure it is a kind of process, while, within the Galilean conception, it is a state; gravitas changes from a property to a relationship – within the medieval mechanics heaviness is a property of bodies, while, on the other hand, within Newtonian physics it is a force which acts on the bodies and, as such, a relationship between them; within the Galilean conception, the medieval conception of the distinction of natural/violent movement is abandoned;
and, finally, within the Newtonian conceptual structure, the principle of inertia is added to the theory of mechanics.
Nersessian adopted a system of representation of knowledge, conceptual maps, in order to facilitate the analysis of the changes happening in the various phases. These maps contain conceptual nodes and links between them. For instance, in [17.62, pp. 171–173], she drew three conceptual maps, reproduced in Figs. 17.4–17.6, which represent salient parts of the conceptual structures of the medieval, Galilean, and Newtonian theories of movement. It should be mentioned that Nersessian has made valuable contributions regarding the applicability of knowledge about the way in which conceptual systems in science change to teaching and learning in the field of science [17.62, p. 166].

Fig. 17.4 Partial conceptual structure of the medieval theory of motion (after [17.62, p. 171])
Fig. 17.5 Partial conceptual structure of the Galilean theory of motion (after [17.62])
Fig. 17.6 Partial conceptual structure of the Newtonian theory of motion (after [17.62])

On the other hand, the difference in form that occurs throughout time (in this case not related to complete conceptions, but to an individual concept) refers to the change in meaning between its instances. This kind of change can be exemplified with the concept of an electromagnetic field. Nersessian concluded that this concept has undergone three phases. The first one, which she calls heuristic guide, encompasses the contributions of Faraday and the first two papers by Maxwell: On Faraday’s Lines of Force and On Physical Lines of Force; the second one, which she calls elaborative, comprises the subsequent contributions of Maxwell and those of Lorentz; and the third one, which she calls philosophical, that is, a critical reflection on its foundations, encompasses the contributions of Einstein.
To determine the change of an individual concept supposes that a general conception of meaning is available that justifies the existence of an identifiable line of descent among the instances of a concept. In this way, the meta-theoretical problem of establishing the representation or meaning of a concept emerges. Nersessian’s early proposal for a general conception of meaning [17.3, Chap. 7] is the following: All the instances of an individual concept fulfill an explanatory/descriptive role in scientific theories. The meaning of a concept or scheme of meaning can be understood as a two-dimensional array based on those roles: the dimension that contains a summary of the features of each instance, which can be subsumed under the following factors: thing, function, structure, and causal power; and the dimension that contains the development of these features of each instance with the passage of time, which enables one to identify a line of descent between those instances.
In other words, the instances of a concept can be represented in both a synchronic and a diachronic way. Nersessian refers to the synchronic representation of the instances of a concept as a vector composed of their salient features. On the other hand, she describes the diachronic representation of the cases of an individual concept as a vector expanded to an array. Within the array, it is shown how each of the components of the concept’s meaning changes over time [17.36, p. 166]. From the representation of the instances can arise the representation of a type of concept, which, as I anticipated, is “a set of family resemblances exhibited in its ‘meaning schema’.” The representation of a type of concept is an adaptation of the notion of the prototype of a concept. Nersessian writes [17.63, p. 161]:

“I have adapted a prototype notion of a concept, associated with the work of Eleanor Rosch, to develop a schema representation of a scientific concept as an overlapping set of features.”

The notion of prototype was elaborated on the basis of the empirical research about concepts carried out by Rosch and her collaborators beginning in 1970. Nersessian found this cognitive view appealing because it enables one to represent the development, the continuity, and the change of concepts in general and of scientific ones in particular [17.36, p. 168]. In fact, the notion of prototype makes it possible to establish a familiarity relationship between the earlier and the later forms of a concept. The probabilistic or prototype theories about concepts propose that human beings represent a concept by a prototypical example, which is the typical representation of a concept. A prototype includes a list of the features that most probably describe the exemplars of the concept. Some instances of a given concept are better examples than others, depending on the degree of similarity of the object in question to other instances of the concept or to the prototypical instance. That is why there is a reference to the graduated structures of the concepts. Rosch follows Wittgenstein, as, in her conception, a concept is represented by a set of family resemblances among the instances placed in the category [17.64]; [17.65, pp. 151–166]. Continuing with the analysis of the example of the concept of electromagnetic field, in the three phases of its development, the concept fulfills the role of describing the transmission of electric and magnetic forces and the role of explaining how such continuous and progressive action is possible. On this basis, Nersessian managed to reconstruct the scheme of meaning of the concept, as summarized in Table 17.1. Here it can be observed that each instance of the concept is linked to the next through chains of reasoning connections (COR) (Nersessian borrows this notion from Dudley Shapere [17.66]).

Table 17.1 (rows are linked by chain-of-reasoning connections, COR)
[. . . ] material medium (aether) | actions (now including light) | optical effects, radiant heat, etc.
    COR
State of immobile aether (nonmechanical) | Same | Same plus Lorentz force | Same
    COR
State of space | Same | Same but relativistic interpretation | Same
COR – chain-of-reasoning connection

In later analyses dealing with the creation of scientific concepts in [17.14], Nersessian discussed briefly the state of the art with respect to the representation of concepts. She wrote that there is no agreement about that, thus [17.14, p. 187]:

“For the present analysis, the format issue can be bypassed by stipulating only that whatever the format of a concept, concepts specify constraints for generating members of a class of models.”

Nersessian’s stipulation that concepts specify constraints seems to come from the frame theory about concepts. According to this, the frame of a concept is a theoretical representation that organizes all the possible information related to a given concept within a speech community. There are various versions of this conception and one of them is Lawrence Barsalou’s dynamic frames approach. Nersessian judges that this frame perspective has been used successfully in many analyses of the change of taxonomic concepts, but she points out that “[. . . ] the case of science is complicated by the existence of many nontaxonomic concepts, such as ‘force’ and ‘mass’” [17.67, p. 183]. The distinction of taxonomic and nontaxonomic concepts corresponds to the Kuhnian classification between basic and theoretical concepts [17.68] and, later, to the classification between normic and nomic concepts [17.69]. While taxonomic, basic or normic concepts are learned by pointing out many of their instances, theoretical or nomic concepts are learned by pointing out complex problem situations to which a law is applied [17.70]. Nersessian considers that Barsalou’s approach helps to illustrate precisely the idea that concepts specify constraints [17.70]. According to Barsalou, a frame is a set of attributes with a multiplicity of values, integrated by structural connections. In general, these attributes hold relationships with each other that are given through the majority of the exemplars of a concept. Barsalou calls these relationships structural invariants. Furthermore, the values of frame attributes are linked with each other through relationships of dependency. These relations are constraints. “Instead, values constrain each other in powerful and complex manners” [17.71, p. 37]; [17.72].
Nersessian uses the case expounded by Hanne Andersen et al. [17.9, Chap. 4] – where, precisely, they employ the notion of dynamic frame – in order to illustrate the idea that concepts specify constraints. The case refers to the representation of the concept of bird. This one appears in various ways in the successive taxonomies of Ray, Sundevall, and Gadow. In each one, the corresponding frame of the concept of bird reflects a different set of attributes and different constraints between their values. Figures 17.7–17.9 illustrate the difference of conceptual representation in those three ornithological taxonomies.

Fig. 17.7 Partial dynamic frame of Ray’s concept of bird (1678) (after [17.9, p. 73]). Superordinate concept: bird. Attributes and values: beak (round, pointed), foot (webbed, clawed). Subordinate concepts: water bird, land bird. Sub-subordinate concepts: swan, goose, duck, heron, stork, chicken, turkey, quail.
Fig. 17.8 Partial dynamic frame of Sundevall’s concept of bird (after [17.9]). Labels recoverable from the diagram: wing (5th secondary: absent, present), leg (skinned, scutate), foot (webbed, clawed), a further value labeled coarse; Grallatores (heron, screamer, stork) and Gallinae (chicken, turkey, quail).

Based on her idea that concepts specify constraints, Nersessian describes the concept formation and change as a process of generating new constraints or modifying the existing ones. This construct paved the way to answering philosophical questions such as: How is it that model-based reasoning generates new conceptual representations? And how do models figure in this reasoning and facilitate the reasoning about phenomena?
Model-based reasoning is effective to create new candidate representations because it facilitates the changes of constraints. By means of the processes of abstraction and integration of constraints from multiple domains in a hybrid model, new combinations of constraints can emerge, and these ones may fit structures and behaviors not represented previously. When scientific change is produced, concepts without precedent in the history of science emerge [17.14, Chap. 6].
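
The idea that concepts specify constraints, and that concept change amounts to generating or modifying constraints, can be made concrete with a toy frame. The following Python sketch is only an illustration under stated assumptions: the class `Frame`, its method `instances`, and the rule `beak_matches_foot` are hypothetical names, and the pairing of round beaks with webbed feet is assumed for the example rather than taken from [17.9]. It enumerates the attribute-value combinations that a frame admits, then shows how dropping a constraint admits combinations not represented before.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, Iterator, Tuple

Assignment = Dict[str, str]


@dataclass
class Frame:
    """Toy dynamic frame: attributes, their values, and constraints on values."""
    name: str
    attributes: Dict[str, Tuple[str, ...]]
    constraints: Tuple[Callable[[Assignment], bool], ...] = ()

    def instances(self) -> Iterator[Assignment]:
        """Yield every attribute-value combination that satisfies all constraints."""
        names = list(self.attributes)
        for values in product(*(self.attributes[n] for n in names)):
            candidate = dict(zip(names, values))
            if all(rule(candidate) for rule in self.constraints):
                yield candidate


def beak_matches_foot(a: Assignment) -> bool:
    # Assumed constraint, for illustration only: a round beak goes with a
    # webbed foot, and a pointed beak goes with a clawed foot.
    return (a["beak"] == "round") == (a["foot"] == "webbed")


attributes = {"beak": ("round", "pointed"), "foot": ("webbed", "clawed")}

# Frame with the value constraint in place.
constrained = Frame("bird", attributes, (beak_matches_foot,))
# Same attributes with the constraint removed: a change of constraints.
relaxed = Frame("bird (constraint dropped)", attributes)

for frame in (constrained, relaxed):
    print(frame.name, list(frame.instances()))
```

With the constraint in place, only two of the four attribute-value combinations are generated; dropping it admits the other two, which is the sense in which a change of constraints can fit structures not represented previously.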
17.4 Conclusions
From a cognitive-historical approach, Nersessian poses problems relevant to the creation of scientific concepts, thus introducing within the philosophy of science an issue that was traditionally considered not pertinent. Among them, two questions stand out:
1. A fundamental one asks which are the cognitive processes integrated within the environment that scientists develop in forming concepts without precedent in history.
2. The other, derived from the previous one, refers to the reasons that make those mechanisms efficient means to generate new scientific concepts.
As a result, the adoption of a cognitive-historical perspective creates a formulation of the problem that can produce a satisfactory answer. Throughout the chapter, it has been demonstrated that cognitive-historical analysis gathers together features that make it a very valuable tool for that: It makes it possible to obtain information about innovative historical scientific practices; it enables one to study cognitive processes and structures implied in those practices; it aims to investigate creative processes within their own context to avoid distortion; it retrieves data from various kinds of sources; it establishes general conclusions about creative scientific practices; and, finally, it is the method – within cognitive studies – that gives information about historically innovative processes that evolve over long periods of time.
It is possible to appreciate the fertility of the approach through the examination of certain hypotheses about the creation of scientific concepts that Nersessian succeeded in establishing. She obtains information about historically creative scientific activities and interprets that information in terms of the cognitive sciences. So, on the one hand, she establishes that, through a cycle of bootstrapping modelings, scientists solve representational problems; those modeling processes are genuinely kinds of creative reasoning about a target problem; and model-based reasoning is a heuristic strategy. On the other hand, based on a conception of concepts that stipulates that these specify constraints, she states that the concept formation and change are processes of constraint generation or modification, and she defends the hypothesis that model-based reasoning is effective for creating new candidate representations because it facilitates the change of constraints.

Acknowledgments. I wish to thank Alicia E. Gianella for the support she has always given me, and for her inexhaustible willingness to listen to and discuss my embryonic ideas. I am also extremely grateful for the encouragement Nancy Nersessian gave me to continue with my research, when I had the opportunity to meet her at the Conference Logic, Reasoning and Rationality held in September 2010 at the Universiteit Gent, Belgium. I highly appreciate the valuable comments of Carlos Oller on a first draft of this chapter, as well as Sandra Meta’s help with the final version in English. Finally, I wish to specially mention Lorenzo Magnani for the confidence he granted me, and the C.I.E.C.E. (Centro de Investigación en Epistemología de las Ciencias Económicas de la Facultad de Ciencias Económicas – Universidad de Buenos Aires), which has given me a space for productive intellectual exchange.
References
17.1 N.J. Nersessian: Cognitive models of science. In: How Do Scientists Think? Capturing the Dynamics of Conceptual Change in Science, ed. by R. Giere (Minnesota Press, Minneapolis 1992)
17.2 H. Andersen, K. Hepburn: Scientific change. In: Internet Encyclopedia of Philosophy, ed. by J. Fieser, B. Dowden, http://www.iep.utm.edu/s-change (2011)
17.3 N.J. Nersessian: Faraday to Einstein: Constructing Meaning in Scientific Theories (Kluwer, Dordrecht, Boston, London 1984)
17.4 T. Arabatzis, V. Kindi: The problem of conceptual change in the philosophy and history of science. In: International Handbook of Research on Conceptual Change, ed. by S. Vosniadou (Routledge, New York 2008) pp. 345–373
17.5 T.S. Kuhn: The Structure of Scientific Revolutions (Chicago Univ. Press, Chicago 1965)
17.6 T.S. Kuhn: What are scientific revolutions? In: The Probabilistic Revolution, Ideas in History, Vol. I, ed. by L. Kruger, L.J. Daston, M. Heidelberger (MIT Press, Cambridge 1987) pp. 7–22
17.7 N. Nersessian, H. Andersen: Conceptual change and incommensurability: A cognitive-historical view. In: Danish Yearbook of Philosophy 32, ed. by F. Collin (Museum Tusculanum Press, Copenhagen 1997)
17.8 P. Thagard: Conceptual Revolutions (Princeton Univ. Press, Princeton 1992)
17.9 H. Andersen, P. Barker, X. Chen: The Cognitive Structure of Scientific Revolutions (Cambridge Univ. Press, Cambridge 2006)
17.10 M. Milkowski, K. Talmont-Kaminski (Eds.): Regarding the Mind, Naturally: Naturalist Approaches to the Sciences of the Mental (Cambridge Scholars Publishing, Newcastle upon Tyne 2013)
17.11 L. Fleck: Genesis and Development of a Scientific Fact (Benno Schwabe, Basel 1935)
17.12 R.N. Giere: Explaining Science: A Cognitive Approach (Univ. Chicago Press, Chicago 1998)
17.13 R. Giere: The cognitive study of science. In: The Process of Science, ed. by N.J. Nersessian (Martinus Nijhoff, Dordrecht, Boston, Lancaster 1987)
17.14 N. Nersessian: Creating Scientific Concepts (MIT Press, Cambridge, London 2008)
17.15 T. Nickles: Normal science: From logic to case-based. In: Thomas Kuhn, ed. by T. Nickles (Cambridge Univ. Press, Cambridge 2003)
17.16 D.W. Gooding: Experiment and the Making of Meaning (Springer, New York 1990)
17.17 J. Rouse: Understanding scientific practices. Cultural studies of science as a philosophical program. In: The Science Studies Reader, ed. by M. Biagioli (Routledge, New York 1999)
17.18 P. Thagard: The Cognitive Science of Science: Explanation, Discovery, and Conceptual Change (MIT Press, Cambridge 2012)
17.19 N. Nersessian: Interpreting scientific and engineering practices: Integrating the cognitive, social, and cultural dimensions. In: Scientific and Technological Thinking, ed. by M. Gorman, R. Tweney, D. Gooding, A. Kincannon (Erlbaum, New Jersey 2005)
17.20 M. Gorman, A. Kincannon, M.M. Mehalik: Spherical horses and shared toothbrushes: Lessons learned from a workshop on scientific and technological thinking. In: Discovery Science, ed. by K.P. Jantke, A. Shinoara (Springer, Berlin, Heidelberg, New York 2001)
17.21 B. Latour, S. Woolgar: Laboratory Life: The Construction of Scientific Facts (Princeton Univ., Princeton 1979)
17.22 N.J. Nersessian: Conceptual change: Creativity, cognition, and culture. In: Models of Discovery and Creativity, ed. by J. Meheus, T. Nickles (Springer, Dordrecht, Heidelberg, London, New York 2009)
17.23 R. Giere: Cognitive approaches to science. In: A Companion to the Philosophy of Science, ed. by W.H. Newton-Smith (Blackwell, Oxford 2000)
17.24 N.J. Nersessian: Opening the black box: Cognitive science and history of science, OSIRIS 10, 194–211 (1995)
17.25 R. Tweney: Cognitive-historical approaches to the understanding of science. In: Handbook of the Psychology of Science, ed. by G.J. Feist, M.E. Gorman (Springer, New York 2013)
17.26 D. Kulkarni, H.A. Simon: The processes of scientific discovery: The strategy of experimentation, Cognitive Science, 125, 139–176 (1988)
17.27 J. Schrager, P. Langley (Eds.): Computational models of scientific discovery and theory formation, D. Kulkarni and H. A. Simon, “Experimentation in machine discovery” (Morgan Kaufmann, San Mateo 1990)
17.28 M. De Mey: The Cognitive Paradigm (D. Reidel, Dordrecht 1982)
17.29 H.E. Gruber: Darwin on Man: A Psychological Study of Scientific Creativity (Dutton, New York 1974)
17.30 A.I. Miller: Imagery in Scientific Thought: Creating 20th Century Physics (Birkhauser, Boston 1984)
17.31 R. Tweney: Psychology of science and metascience. In: A Framework for the Cognitive Psychology of Science, ed. by B. Gholson, A. Houts, R.M. Neimeyer, W. Shadish (Cambridge Univ. Press, Cambridge 1989)
17.32 R. Tweney: Discovering discovery: How Faraday found the first metallic colloid, Perspectives on Science 14(1), 97–121 (2006)
17.33 E. Cavicchi: Experimenting with magnetism: Ways of learning of Joann and Faraday, American Journal of Physics 65, 867–882 (1997)
17.34 E. Cavicchi: Experiences with the magnetism of conducting loops: Historical instruments, experimental replications, and productive confusions, American Journal of Physics 71(2), 156–167 (2003)
17.35 U. Feest, F. Steinle (Eds.): Scientific Concepts and Investigative Practice, Vol. 3 (de Gruyter, Berlin, Boston 2012)
17.36 N.J. Nersessian: A cognitive-historical approach to meaning in scientific theories. In: The Process of Science, ed. by N.J. Nersessian (Martinus Nijhoff, Dordrecht, Boston, Lancaster 1987)
17.37 D.C. Minnen, N.J. Nersessian: Exploring science: The cognition and development of discovery processes, Am. Psychol. Assoc. 48(3), 360–363 (2003)
17.38 C. Geertz: The Interpretation of Cultures: Selected Essays (Basic Books, New York 1973)
17.39 G. Ryle: Collected Papers: Volume II: Collected Essays (1929–1968) (Hutchinson, London 1971)
17.40 J.G. Ponterotto: Brief note on the origins, evolution, and meaning of the qualitative research concept Thick Description, Qual. Rep. 11, 538–549 (2006)
17.41 D. Klahr, H.A. Simon: Studies of scientific discovery: Complementary approaches and convergent findings, Psychol. Bull. 125, 524–543 (1999)
17.42 D.J. Hess: Ethnography and the development of science and technology studies. In: Sage Handbook of Ethnography, ed. by P. Atkinson, A. Coffey, S. Delamont, J. Lofland, L. Lofland (SAGE, Thousand Oaks 2001)
17.43 K. Knorr-Cetina: The Manufacture of Knowledge (Pergamon, New York 1981)
17.44 N.J. Nersessian, W.C. Newstetter, E. Kurz-Milcke, J. Davies: A mixed-method approach to studying distributed cognition in evolving environments, Proc. ICLS Conf. (ICLS, Seattle 2002)
17.45 M. MacLeod, N.J. Nersessian: Coupling simulation and experiment: The bimodal strategy in integrative systems biology, Stud. Hist. Philos. Sci. C 44, 572–584 (2013)
17.46 M. MacLeod, N.J. Nersessian: The creative industry of integrative systems biology, Mind Soc. 12, 35–48 (2013)
17.47 R.J. Sternberger, L. Davidson, K. Dunbar (Eds.): Mechanisms of Insight, How Scientist Really Rea-
Nancy Nersessian’s Cognitive-Historical Approach References 375
Part D | 17
velopment of Discovery Processes (MIT Press, Cam- 17.64 E. Rosch, C.B. Mervis: Family resemblances: Studies
bridge 2010) in the internal structure of categories, Cogn. Psy-
17.50 M.A. Boden: The Creative Mind: Myths and Mecha- chol. 7, 573–605 (1975)
nisms (Routledge, London 2004) 17.65 M. Chapman, R.A. Dixon: Meaning and the Growth
17.51 R.N. Giere: Philosophy of science naturalized, Phi- of Understanding: Wittgenstein’s Significance for
los. Sci. 52, 331–356 (1985) Developmental Psychology (Springer, Berlin, Hei-
17.52 N.J. Nersessian: Model based reasoning in concep- delberg 1987)
tual change. In: Model-Based Reasoning in Scien- 17.66 D. Shapere: Meaning and scientific change. In:
tific Discovery, ed. by L. Magnani, N.J. Nersessian, Mind and Cosmos, ed. by R.G. Colodny (Univ. of
P. Thagard (Kluwer Academic, New York 1999) Pittsburgh Press, Pittsburgh 1966) pp. 41–85
17.53 R.E. Mayer: Problem solving and reasoning. In: 17.67 N.J. Nersessian: Kuhn, conceptual change, and
Learning and Cognition in Education, ed. by cognitive science. In: Thomas Kuhn, ed. by T. Nick-
V.G. Aukrust (Elsevier, Oxford 2011) les (Cambridge Univ. Press, Cambridge 2003)
17.54 R.W. Weisberg: Creativity. Understanding Innova- 17.68 T.S. Kuhn: Metaphor in science. In: Metaphor and
tion in Problem Solving, Science, Invention and the Thought, ed. by A. Ortony (Cambridge Univ. Press,
Arts (Wiley, New Jersey 2006) Cambridge 1979)
17.55 P.N. Johnson-Laird, V. Girotto, P. Legrenzi: Mental 17.69 T.S. Kuhn: Afterwords. In: World Changes, ed. by
models: A gentle guide for outsiders, Sistemi Intel- P. Horwich (MIT Press, Cambridge 1993) pp. 311–342
ligenti 9, 63–83 (1998) 17.70 H. Andersen, N.J. Nersessian: Nomic concepts,
17.56 P.N. Johnson-Laird, R.M. Byrne: Deduction frames, and conceptual change, Philosophy of Sci-
(Lawrence Erlbaum, Hilsdale 1991) ence 67, 224–241 (2000)
17.57 L.R. Novick, M. Bassok: Problem solving. In: The 17.71 L.W. Barsalou: Frames, concepts, and conceptual
Cambridge Handbook of Thinking and Reasoning, fields. In: Frames, Fields, and Contrasts: New Es-
ed. by K. Holyoak, B. Morrison (Cambridge Univ. says in Semantic and Lexical Organization, ed. by
Press, Cambridge 2005) A. Lehrer, E.F. Kittay (Routledge, New York London
17.58 L. Magnani: Abductive Cognition. The Epistemolog- 2009)
ical and Eco-Cognitive Dimensions of Hypothetical 17.72 L. Barsalou, C. Hale: Components of concep-
Reasoning (Springer, Berlin, Heidelberg 2009) tual representation. From feature lists to recursive
17.59 L.M. Osbeck: Transformations in cognitive science: frames. In: Categories and Concepts: Theoretical
Implications and issues posed, J. Theor. Philos. Views and Inductive Data Analysis, ed. by I. Van
Psychol. 29, 16–33 (2009) Mechelen, J. Hampton, R. Michalski, P. Theuns
17.60 L.M. Osbeck, K.R. Malone, N.J. Nersessian: Dis- (Academic Press, Waltham 1993), http://philpapers.
senters in the sanctuary evolving frameworks in org/rec/VANCAC-5
mainstream cognitive science, Theory Psychol. 17,
377
18. Physically Similar Systems – A History of the Concept
Susan G. Sterrett
which it developed. The concept was used in the nineteenth century in various fields of engineering (Froude, Bertrand, Reech), theoretical physics (van der Waals, Onnes, Lorentz, Maxwell, Boltzmann), and theoretical and experimental hydrodynamics (Stokes, Helmholtz, Reynolds, Prandtl, Rayleigh). In 1914, it was articulated in terms of ideas developed in the eighteenth century and used in nineteenth century mathematics and mechanics: equations, functions, and dimensional analysis. The terminology physically similar systems was proposed for this new characterization of similar systems by the physicist Edgar Buckingham. Related work by Vaschy, Bertrand, and Riabouchinsky had appeared by then. The concept is very powerful in studying physical phenomena both theoretically and experimentally. As it is not currently a part of the core curricula of science, technology, engineering, and mathematics (STEM) disciplines or philosophy of science, it is not as well known as it ought to be.

18.3 Late Nineteenth and Early Twentieth Century
18.3.1 Engineering and Similarity Laws
18.3.2 Similar Systems in Theoretical Physics: Lorentz, Boltzmann, van der Waals, and Onnes
18.3.3 Similar Systems in Theoretical Physics
18.4 1914: The Year of Physically Similar Systems
18.4.1 Overview of Relevant Events of the Year 1914
18.4.2 Stanton and Pannell
18.4.3 Buckingham and Tolman
18.4.4 Precursors of the Pi-Theorem in Buckingham’s 1914 Papers
18.5 Physically Similar Systems: The Path in Retrospect
References
The concept of similar systems is one of the most powerful concepts in the natural sciences, yet one of the most neglected concepts in philosophy of science today. The concept of similar systems was developed specifically for physics, and its use in biology has generally been in terms of plant and animal physiology; hence, the term physically similar systems is often used. It remains an open research question whether, and how, the concept of similar systems might be applied to sciences other than physics, such as ecology, economics, and anthropology.

This chapter is devoted to providing a history of the concept of physically similar systems. It also aims, in doing so, to increase the understanding and appreciation of the concept of similar systems in philosophy. In addition to being neglected in philosophy of science, the concept of similar systems is also often not fully understood even when it is mentioned.

The concept of similar systems has been useful in developing methods for drawing inferences about the values of specific quantities in one system from observations on another system. Some know of the concept only in this derivative way, via applications to specific questions in physics, biology, or engineering.

The fact that it has such useful applications has sometimes led to an underappreciation of the fundamental nature, immense power, and broad scope of the concept. Yet, its utility in practical matters of determin-
follow the path up to the twentieth century characterization of it. This history of the concept, though admittedly not exhaustively complete, should help clarify its role in reasoning and drawing inferences.

Fig. 18.1 Newton seems to have been the first to use the term similar systems in his Principia Mathematica, but Galileo seems to have employed a closely kin idea in his reasoning in Two New Sciences
Fig. 18.2 This timeline (not to scale) illustrates that the concept of similar systems is credited to Renaissance era thinkers
Galileo and Newton, and was revived in the second half of the nineteenth century, when it was extended to chemistry,
electromagnetic theory, heat, and thermodynamics
[Figure 18.3, timeline entries: Riabouchinsky, Methode des variables de dimension zero (in French); new English translation of Galileo’s Two New Sciences makes it available in English (February); J. Thomson’s Comparison of similar structures... is republished in a collection of his works in English; Stanton & Pannell, Similarity of motion... (in English, January); Rayleigh’s Fluid motions (in English, in several venues, March and June); Tolman’s The principle of similitude (in English, April); Buckingham, Interpret. of model expts. (May) and Physically similar systems (June); Buckingham, On physically similar systems (in English, October)]
Fig. 18.3 This timeline (not to scale) shows there was a lot of discussion about and interest in issues regarding similarity
in 1914 and the years immediately preceding. In 1914 the term physically similar systems comes into use
“the number of fundamental units required in an absolute system for measuring the n kinds of quantity, and n, the kinds of quantity [involved in the relation].”

The function ψ is not defined in this form of the equation, but that is perfectly fine; we still consider it an equation – it’s just an equation in which the form of the function is not specified. The equation states, basically, that such a function relating the Π’s and r’s does exist, and the conclusion is that this equation, the Reduced Relation Equation of 1914, is another form of the original physical equation, that is, that any physical equation can be reduced to this form. Next follows a short section illustrating how this conclusion can be applied to the same example given earlier in the paper to determine the relationships between some specific quantities in an elegant and particularly useful way. All this is done prior to, and independently of, defining the notion of physically similar systems.

It is in the section entitled “Physically Similar Systems,” the sixth section of the paper, that the notion of similar systems is first presented. Referring to the equation in his paper shown above, which I have called the Reduced Relation Equation of 1914, Buckingham writes that “we may develop from it the notion of similar systems;” he develops it as follows [18.2, p. 353]:

“Let S be a physical system, and let a relation subsist among a number of quantities Q which pertain to S. Let us imagine S to be transformed into another system S′ so that S′ corresponds to S as regards the essential quantities. There is no point of the transformation at which we can suppose that the quantities cease to be dependent on one another: hence we must suppose that some relation will subsist among the quantities Q′ in S′ which correspond to the quantities Q in S. If this relation in S′ is of the same form as the relation in S and is describable by the same equation, the two systems are physically similar as regards this relation.”

This is the notion of physically similar systems still in use today. It was first articulated in 1914 by the physicist Edgar Buckingham. But it did not arise from Buckingham’s cogitations out of the blue. For its precursors, we have to go back to the Renaissance.
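The definition quoted above can be restated compactly in symbols. The following is only a sketch: the function symbol ψ, the tilde used to mark quantities of the transformed system S′, and the solvability assumption are notation and assumptions introduced here for illustration, not Buckingham's own formulation.

```latex
% Sketch only: \psi, the tildes, and the index k are illustrative notation.
% The same (unspecified) function \psi is assumed to govern both systems.
\begin{align}
  S  &:\quad \psi(\Pi_1, \Pi_2, \ldots, \Pi_i, r', r'', \ldots) = 0, \\
  S' &:\quad \psi(\tilde{\Pi}_1, \tilde{\Pi}_2, \ldots, \tilde{\Pi}_i, r', r'', \ldots) = 0 .
\end{align}
% Assuming the relation can be solved uniquely for \Pi_1, agreement of the
% remaining dimensionless arguments forces agreement of \Pi_1 as well:
\[
  \tilde{\Pi}_k = \Pi_k \quad (k = 2, \ldots, i)
  \quad\Longrightarrow\quad
  \tilde{\Pi}_1 = \Pi_1 .
\]
```

Read this way, an observation of one dimensionless quantity in S licenses an inference about its homologue in S′, without the form of ψ ever being specified.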
chanical structures of this kind (of five dimensions) similar unless their homologous linear dimensions as well as the times and the masses bore to one another the same ratio.”

I gather that what Mach is saying is that the notion of similar in use at the time he is writing is the notion of geometrical similarity, in which there is a kind of shrinking or enlarging of every linear quantity of each dimension by the same ratio (for geometrical similarity, there would usually not be more than three dimensions). That is, I believe he means that, if we are talking about a three-dimensional machine, similarity amounts to shrinking or enlarging quantities of each linear dimension by the same ratio while keeping the machine and all its parts exactly the same shape, that is, while preserving every ratio of linear quantities within the same machine. Now, of course, areas and volumes will bear a different ratio to their homologues than quantities of the linear dimensions do (e.g., if the ratio is 1:3 for the linear dimension, it will be 1:9 for an area and 1:27 for a volume), but the similarity can be defined in terms of the linear dimensions alone. That is how geometrical similarity works. Mach says, I think, that a strict application of the notion of geometric similarity would require that the ratio between a quantity and its homologous quantity be the same for all five of the dimensions that Newton mentions for his case, and that the situation imagined in Newton’s proposition does not satisfy that constraint.

However – and this is what is significant and interesting – Mach does not say that Newton is wrong here; rather, what he says is that what Newton was doing is better understood in Mach’s day in terms of affine transformations [18.4, p. 204]:

“The structures might more appropriately be termed affined to one another.
We shall retain, however, the name phoronomically [kinematically] similar structures, and in the consideration that is to follow leave the masses entirely out of account.”

It is clear that Newton was interested in more than this, that he wanted to employ the notion of similar systems to reason about forces, too; in fact, he does so in the remarks that follow the quote above ([18.3, pp. 327–328], [18.5, pp. 766–768]). However, in leaving the masses out of the account, Mach picks out from Newton’s work what he wishes to endorse, and shows how the points he endorses ought to be understood in the terminology of the nineteenth century. Mach shows how to understand phoronomically (kinematically) similar structures for the topic of oscillation he has been discussing [18.4]:

“In two such similar motions, then,
let the homologous paths be $s$ and $\alpha s$,
the homologous times be $t$ and $\beta t$;
whence the homologous velocities are $v = s/t$ and $\gamma v = \frac{\alpha}{\beta}\,\frac{s}{t}$,
the homologous accelerations $\varphi = 2s/t^2$ and $\varepsilon\varphi = \frac{\alpha}{\beta^2}\,\frac{2s}{t^2}$.
Now all oscillations which a body performs under the conditions above set forth with any two different amplitudes 1 and $\alpha$, will be readily recognized as similar motions.”

Thus, in spite of noting that similar generally means geometrically similar at the time he was writing, Mach indulges Newton in the use of the adjective similar to indicate phoronomically (kinematically) similar structures, which are, properly speaking (in the terminology of Mach’s day), not related by similarity but by affinity (that is, by affine transformations). After showing how elegantly theorems about centripetal motion can be obtained by such means, he remarks [18.4, p. 205]:

“It is a pity that investigations of this kind respecting mechanical and phoronomical affinity are not more extensively cultivated, since they promise the most beautiful and most elucidative extensions of insight imaginable.”

Thus, Mach sees the great power of the notion of similar systems. In terms of clarification of the notion itself, though, which is the topic of this article, Mach’s attention in his critique of Newton is on the similar in similar systems; he does not here discuss criteria for something counting as a system.

Newton is recognized for the concept today, as he has been throughout all of the nineteenth and twentieth centuries. In their Similarity of Motion in Relation to the Surface Friction of Fluids paper in early 1914, Stanton and Pannell credit George Greenhill with pointing out that the idea that relations “applicable to all fluids and conditions of flow” existed was “foreshadowed by Newton in Proposition 32, Book II of the Principia” [18.6, p. 199]. Zahm’s 1929 report Theories of Flow Similitude [18.7] also credits Newton for a method of “dynamically similar systems,” citing Newton’s Propositions 32 and 33. Newton is likewise credited in many more recent works, including [18.8, p. 86ff], [18.9, pp. 39–41], and [18.5, p. 766].

18.2.2 Galileo

Although Newton seems to have been the first to use the term similar systems, Galileo’s reasoning certainly used a notion of similar systems akin to, if not prescient of, Newton’s in discussing not only the motions of the bob
of a pendulum, but also the more complicated behavior of machines and structures with mass; this is especially clear in his Dialogues Concerning Two New Sciences. Galileo’s dialogue begins with Salviati (usually taken to be the voice of Galileo), recounting numerous examples of a large structure that has the same proportions and ratios as a smaller structure but that is not proportionately strong. In these opening pages of the dialogue, Salviati explains to a puzzled Sagredo that “if a scantling can bear the weight of ten scantlings, a [geometrically] similar beam will by no means be able to bear the weight of ten like beams” [18.10, m.p. 52–53]. The phenomenon of the effect of size on the function of machines of similar design holds among natural as well as artificial forms, Salviati explains: “just as smaller animals are proportionately stronger or more robust than larger ones, so smaller plants will sustain themselves better” [18.10, m.p. 52–53].

Perhaps the most well known of Salviati’s illustrations is about giants [18.10, m.p. 52–53]:

“I think you both know that if an oak were two hundred feet high, it could not support branches spread out similarly to those of an oak of average size. Only by a miracle could nature form a horse the size of twenty horses, or a giant ten times the height of a man – unless she greatly altered the proportions of the members, especially those of the skeleton, thickening the bones far beyond their ordinary symmetry.”

Although Galileo’s work opens with the wise participant in the dialogue reminding the others of the reasons for the lack of giant versions of naturally occurring life forms, it soon proceeds to the case of a valid use of a small (artificial) machine to infer the behavior of a large (artificial) machine. But the basis for the similarity is not merely geometric similarity. Later in this same work of Galileo’s, Sagredo makes use of Salviati’s statement that the “times of oscillation” of bodies [18.10, m.p. 139] suspended by threads of different lengths “are as the square roots of the string lengths; or we should say that the lengths are as the doubled ratios, or squares, of the times.” From this, Sagredo uses one physical pendulum to infer the length of another physical pendulum [18.10, m.p. 140]:

“Then, if I understood correctly, I can easily know the length of a string hanging from any great height, even though the upper attachment is out of my sight, and I see only the lower end. For if I attach a heavy weight to the string down here, and set it in oscillation back and forth; and if a companion counts a number of its vibrations made by another moveable hung to a thread exactly one braccio in length, I can find the length of the string from the numbers of vibrations of these two pendulums during the same period of time.”

The reasoning that Sagredo uses to infer the length of one pendulum (the larger) from another (the smaller) is based upon the constancy of the value of a certain ratio involving the length and the frequency of a pendulum’s oscillations. What Sagredo derives from the constancy of that ratio for all pendulums is a law of correspondence telling him how to find the corresponding length in the large pendulum from the length of the small (or vice versa) and the number of oscillations of the two pendulums observed during the same time period. (The time period itself during which the oscillations are observed is not needed; what is needed is only the (square of the) ratio of the number of oscillations of the two pendulums.) He works out an example [18.10, p. 140]:

“[. . . ] let us assume that in the time my friend has counted twenty vibrations of the long string, I have counted two hundred forty of my thread, which is one braccio long. Then after squaring the numbers 20 and 240, giving 400 and 57 600, I shall say that the long string contains 57 600 of those units [misure] of which my thread contains 400; and since my thread is a single braccio, I divide 57 600 by 400 and get 144, so 144 braccia is the length of the string.”

Salviati (the voice of Galileo) responds approvingly to Sagredo’s claim that this method will yield the length of the string: “Nor will you be in error by a span, especially if you take a large number of vibrations.” This is reasoning much like Newton’s use of similar systems, in that one pendulum is regarded as being similar to another pendulum, so that the period of oscillation and length of one of the pendulums is homologous to the period of oscillation and length of the other.

Of course, Galileo’s reasoning here is not presented as a general method, as it is specific to pendulums, whereas Newton’s notion of similar systems is. Nor do we find in Galileo’s discussion here any explicit criteria for something being a machine that could serve to delineate the sorts of things on which this kind of reasoning could be used. However, Galileo’s discussion does make clear that the two quantities that are considered homologous – the time of vibration and the length of the pendulum string – are fixed features of a pendulum, in contrast to other quantities such as the amplitude of the oscillations, or the weight of the bob [18.10, p. 141]:

“Take in hand any string you like, to which a weight is attached, and try the best you can to increase or
diminish the frequency of its vibrations; this will be a mere waste of effort. On the other hand, we confer motion on any pendulum, by merely blowing on it [. . . ] This motion may be made quite large [. . . ] yet it will take place only in accord with the time appropriate to its oscillations.”

Thus, each of the two quantities – length of the string, time of vibration – of a given pendulum determines the other. The point germane to the topic of the history of similar systems, though, is this: every pendulum is related to every other pendulum by a law of correspondence. The law of correspondence relates each of these two quantities in one pendulum to its homologue in another pendulum. I think that we can see this as akin to how Newton conceived similar systems to be related: by a law of correspondence between quantities in one system and their homologous quantities in the similar system. Only the length of the string and the time of vibration show up as homologous properties in comparison of the two pendulums. Thus, Galileo makes a point of distinguishing between quantities that characterize a given pendulum (length of string; time of oscillation) and quantities that do not (amplitude of oscillation; weight of bob), in addition to making the point about how some behaviors of all pendulums are related to each other by a law of correspondence.

Because the point is so often missed, it may be helpful to state it in a slightly different way. Clearly, Galileo sees that the quantities that characterize a pendulum’s behavior are related to each other in a fixed (though nonlinear) relation, as evidenced by his remarks about the time of oscillation of a pendulum being determined by the length of its string. Yet, rather than illustrating that one can use this relation to figure out the value of one quantity associated with a certain pendulum by measuring another quantity associated with that same pendulum, what Galileo is doing here is using a completely different method of inference: establishing a law of correspondence between two different pendulums. Then, from an observation of one quantity obtained experimentally on another pendulum chosen or constructed for the purpose, the law of correspondence he has established is invoked to infer the value of the homologous quantity in the pendulum. (In the passage from Galileo quoted above, the method was used to infer the length of one pendulum from the length of another pendulum.) It is the articulation of this method that justifies including Galileo along with Newton in a history of the concept of physically similar systems [18.11].
equation using an unknown function

$$\psi(\Pi_1, \Pi_2, \ldots, \Pi_i, r', r'', \ldots) = 0,$$

that is, the equation I have called the Reduced Relation Equation of 1914.

Buckingham did his doctoral work at the very end of the nineteenth century. Where were people employing or talking about the notion of similar systems during the late nineteenth century? By then, some notion of similar systems was known in theoretical physics, where it was occasionally explicitly discussed using the term similar systems, as well as in many branches of engineering, where it was involved, albeit sometimes implicitly or obliquely, in experimental investigations. Then, too, there were activities and investigations that did not fit neatly into one or the other of these categories, or straddled them. How did various thinkers producing these works think about and express the concepts associated with mechanical similarity and similar systems?

18.3.1 Engineering and Similarity Laws

Similar Structures
In engineering and science of the nineteenth century, the main notion invoked when reasoning with similar machines or systems was that of a similarity law or a similarity principle. James Thomson (1822–1892) (brother of William Thomson, Lord Kelvin (1824–1907)) gave an influential paper in 1875 entitled Comparison of Similar Structures as to Elasticity, Strength, and Stability [18.13] that tried to identify and lay out the methodology involved in the engineering design of structures such as bridges and buildings, but he used some other interesting examples such as obelisks and umbrellas, too. Thomson’s examples are often about how to vary some quantity such that two structures of different sizes are similar in one of these respects I refer to as behavioral: that is, elasticity, strength, or stability. Thomson’s paper was built upon and expanded in 1899 (by Barr [18.14]) and again in 1913 (by Torrance [18.15]).

The principle James Thomson identified was meant to be general. Yet, there were still different kinds of comparisons. In his 1875 paper, which became more widely available when his collected works were published in 1912, Thomson distinguished between two kinds of comparisons of similar structures, which, he said, were “very distinct, and which stand remarkably in contrast each with the other.” One kind of comparison of similar structures is “in respect to their elasticity and strength for resisting bending, or damage, or breakage by similarly applied systems of forces.” The other, contrasting kind was “comparisons of similar structures as to their stability, when that is mainly or essentially due to their gravity [weight] or, as we may say, to the downward force which they receive from gravitation” [18.16, p. 362].

Thomson offered a “comprehensive but simple and easily intelligible principle” for the first kind of comparison: Similar structures, if strained similarly within limits of elasticity from their forms when free from applied forces, must have their systems of applied forces, similar in arrangement and of amounts, at homologous places, proportional to the squares of their homologous linear dimensions. His reasoning in establishing this principle is a deductive argument special to solid mechanics, the mechanics of deformable bodies. To establish this we have only to build up, in imagination, both structures out of similar small elements or blocks, alike strained, with the same intensity and direction of stress in each new pair of homologous elements built into the pair of objects [18.16, pp. 362–363]. These small elements or blocks are imagined to be so small in relation to the overall body that the stresses in them can be considered homogeneous throughout the element or block. This is how the principle is derived, but the point of emphasis for both scientific understanding and engineering practice was that “similar structures of different dimensions must not be similarly loaded [. . . ] if they are to be stressed with equal severity.” In saying that the structures must not be similarly loaded, he draws attention to the part of the principle that says that the loads in the two similar structures must vary by the squares of their linear dimensions, rather than by the simple multiplicative factor that the linear dimensions do.

This was commonly what was meant at the time by a similarity principle or, sometimes, similarity law or law of similarity. Each one covered a certain class of cases. The point of the principle was usually to state how one variable – for example, density, stiffness – was to be varied as another, such as length, was varied. One form such reasoning took was to show how the ratio of variables of one type varied as a ratio of another type of variable did: for instance [18.17, p. 136],

“If the scale ratio for any two orifices, i.e., the ratio of any two corresponding linear dimensions, is S, the ratio of the areas of corresponding elements of the orifices will be S², while if similarly situated with respect to the water surface, their depths are proportional to S.”

However, sometimes the similarity law or principle for a certain kind of behavior was stated simply as a ratio, the implication being that that ratio was invariant for similar systems; setting the ratio equal to 1 and rearranging terms yielded the relations between
Physically Similar Systems – A History of the Concept 18.3 Late Nineteenth and Early Twentieth Century 385
quantities that must be maintained in order to achieve the construction of an experimental water tank to carry
the similarity of that type. out the experiments he proposed. His methods for ex-
trapolating from smaller, scale models of ships in his
Similar Interactions: A Law of Comparison water tank to the full size ship were vindicated when
for Model Ships the Admiralty conducted full-scale tests on the HMS
One of the most well-known engineering advances em- Greyhound and Froude was able to compare the mea-
ploying similarity and, implicitly, the notion of similar surements taken on the full size Greyhound with those
systems, was William Froude’s (1810–1879) solution he had taken on his 1=16 model of the HMS Greyhound
of significant, urgent, and previously unsolved prob- in his experimental tank. His Law of Comparison was
lems in ship design for the British Admiralty ([18.18, soon adopted for all further ship design not only by the
p. 279], [18.5, 19, 20]). In the design of ships for sta- British Admiralty, but also by the US Navy, which con-
bility and speed, not only does gravitational force enter structed the Experimental Model Basin in Washington,
into the consideration of a structure’s behavior, but the DC in the 1890s. The Experimental Model Basin was
ship’s interaction with the water in which it is sitting or constructed under the leadership of David Watson Tay-
Part D | 18.3
moving must also be considered. lor. Hagler [18.20] provides a good discussion of David
Froude’s reasoning about the stability of ships in- Watson Taylor’s writings on ship design; Taylor shows
volved examining the motion of a pendulum in a resis- how the methodology used by the US in almost all its
tive fluid [18.21, pp. 5ff, 15ff, 61]: the same question naval design work in the first half of the twentieth cen-
Newton addressed when he presented the proposition tury is ultimately traceable to this work Froude did in
in which he introduced the idea of similar systems. the nineteenth century.
Schaffer points out that, although the statement does Froude similarity was developed specifically for the
not appear in the final version of the Principia, New- purpose of using model experimentation for ship de-
ton had written that “if various shapes of ships were sign. As with the similarity laws in mechanics, Froude
constructed as little models and compared with each similarity can be expressed in terms of a ratio, the
other, one could test cheaply which was best for nav- Froude number, which is a dimensionless parame-
igation” [18.22, p. 90]. ter. Though no notion of similar systems is defined,
Unlike Newton, Froude does not seem to analyze a nascent notion of similar systems was involved in
the notion of similar systems in thinking about a pendu- practice, since the similarity of situations is established
lum in a resistive medium. However, the idea of relating when the Froude numbers for each of the two situa-
quantities in one physical situation to those in another is tions are equal. One formulation of the Froude number
predominant in Froude’s work; it is, in fact, the topic of is $v/\sqrt{gL}$, where v is a velocity, L is a length, and
his main contributions to the problem of the efficient g is the gravitational acceleration. The application of
design of large ships driven by propellers. As Zwart Froude similarity requires expertise; which velocity and
has pointed out [18.5], the naval architect John Scott characteristic length are relevant depends on the phe-
Russell had already constructed and tested many small nomenon being investigated. We can see from the form
models, but his experience had convinced him that of the Froude dimensionless ratio, however, that quan-
the little models, though they had provided him with tities do not all scale linearly, much less by the same
much pleasure, could provide no help in determining linear factor. Another point of note is that, as Froude
how large ships behaved. The exchange between Rus- similarity compares homologous forces as well as ho-
sell and Froude following Froude’s reading of his 1874 mologous motions, it is a kind of dynamic similarity,
paper was recorded in a transcript and so is available not merely a kinematic similarity.
today [18.23], showing that the problem of how to ex-
trapolate observations on the behavior of small models Bertrand and Reech: The French Connection
of ships when placed in water to the behavior of full size Between Newton and Froude
ships was considered unsolved when Froude took it on Many have pointed out that Froude took over results
([18.5, p. 15], [18.20, pp. 128–130], [18.19, 23]). Ha- due to others, naming in particular French engineering
gler also notes that Froude’s confidence that the smaller professor Ferdinand Reech and French mathematician
model ships (some of which were over 20 feet long) Joseph Bertrand, both of whom wrote on similarity
could be used to infer the behavior of larger full-scale methods in mechanics ([18.24, p. 141ff], [18.25, p.
ships was based in part on Rankine’s investigations 381], [18.26, p. 15], and [18.18, p. 279]). The extent
on streamlines. Froude explicitly discusses Rankine’s to which this is true has been debated [18.24], but none
work in his 1869 “The State of Existing Knowledge deny that Froude holds a unique place as an experimen-
on the Stability, Propulsion and Seagoing Qualities of talist whose accomplishments advanced both the field
Ships” [18.20]. He convinced the Admiralty to fund of hydraulics and the industry of marine architecture.
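The Froude condition described above can be illustrated with a short numerical sketch. It assumes the formulation $v/\sqrt{gL}$ of the Froude number. The function names, the 50 m hull length, and the 6 m/s speed are illustrative assumptions rather than values from Froude's trials; only the 1/16 scale factor is taken from the text.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (standard value, assumed)

def froude_number(v, length, g=G):
    """Dimensionless ratio v / sqrt(g * L)."""
    return v / math.sqrt(g * length)

def corresponding_speed(v_full, scale):
    """Model speed that gives the same Froude number as the full-size ship.

    scale is model length divided by full-size length (for example 1/16).
    """
    return v_full * math.sqrt(scale)

# Hypothetical full-size ship; only the 1/16 scale comes from the text.
ship_length = 50.0   # m (assumed)
ship_speed = 6.0     # m/s (assumed)
scale = 1.0 / 16.0

model_length = ship_length * scale
model_speed = corresponding_speed(ship_speed, scale)

print(model_speed)  # 1.5: a quarter of the ship's speed, not a sixteenth
print(froude_number(ship_speed, ship_length))
print(froude_number(model_speed, model_length))
```

Lengths scale by 1/16 while corresponding speeds scale by 1/4, which is one concrete sense in which quantities under Froude similarity do not all scale by the same linear factor.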
386 Part D Model-Based Reasoning in Science and the History of Science
Ferdinand Reech (1805–1884) publishing in 1852 on ing the theory. He explains how the notion of similar
topics he had lectured about much earlier, explicitly systems, though it may look rather limited, is in fact
followed Newton’s approach, discussing and deriving sometimes indispensable, that is, for problems not sus-
principles about how to relate observations of velocities ceptible to a mathematical solution [18.28, p. 131]:
and motions of one ship to other ships of different sizes.
Like Newton, he considered bodies and forces on them, “It is true that only proportional results can be de-
though he employed the term similar system in his dis- duced from [the principle]; and that, consequently,
cussions when deriving laws of comparison [18.27]. It it will only serve to solve a question, when an-
is Joseph Bertrand who seems to have taken a concep- other of an analogous nature and of an equivalent
tual step beyond Newton, though he heaps quite a great analytical difficulty shall have been solved. It may,
deal of credit for his work upon Newton, as though he however, be of great utility to determine in certain
is doing little more than showing the consequences of cases the analogy which exists between the move-
Newton’s theorems about similar systems. ments of the two systems, even supposing each of
Joseph Bertrand (1822–1900) produced many text- them not to be susceptible of strict theoretical deter-
has already proved of great value in molecular the- these numbers; some are constrained by laws of mo-
ory,” as it had allowed Kamerlingh Onnes “to give tion, but others are not. This leaves him free to “imagine
a theoretical demonstration of van der Waals’s law of a large variety of systems S′, similar to S, and which
corresponding states” [18.29]. The experimental confir- must be deemed possible as far as our equations of mo-
mation of that law, Lorentz wrote, “has taught us that tion are concerned” [18.29, p. 445].
a large number of really existing bodies may, to a cer- Lorentz uses the notion of similar systems to ex-
tain approximation be regarded as similar.” plore the constraints on theory, as opposed to using
Lorentz had already developed a notion of corre- theory to state how one can construct a system S′ to
sponding states for use in electrodynamics by 1900. be similar to a certain system S, in order to make infer-
The context in which he made the observation above, ences about one of the systems based upon observations
though, was his paper The Theory of Radiation and about the other. This seems a different use of the notion
the Second Law of Thermodynamics, in which he was than Galileo or Newton made of it; it also allows the
concerned with the question of the similarity in the contemplation of unprecedented kinds of similarity. It
structure of different bodies that would be mandated by may, Lorentz realizes, even give rise to systems of a dif-
thermodynamics [18.29, p. 440]. It would take us too ferent ontological status; he explains why that, too, may
far afield to explain everything that Lorentz was trying be useful [18.29, pp. 447–448]:
to do in this paper; here we restrict our discussion to
“It might be argued that two bodies existing in na-
what concept of similar systems Lorentz employed or
ture will hardly ever be similar in the sense we have
seems to have had in mind.
given to the word, and that therefore, if S corre-
Lorentz’ idea of similar systems involves starting
sponds to a real system, this will not be the case with
with one system and then constructing a second one
S′. But this seems to be no objection. Suppose, we
from the first. Lorentz writes of “comparing two sys-
have formed an image of a class of phenomena, with
tems”; what he says is that the systems he compares
a view to certain laws that have been derived from
are: “[. . . ] in a wide sense of the word, similar, that
observation or from general principles. If, then, we
is, such that, for every kind of geometrical or physical
wish to know, which of the features of our picture
quantity involved, there is a fixed ratio between its cor-
are essential and which not, i. e., which of them are
responding values in the two systems, [. . . ]” [18.29].
necessary for the agreement with the laws in ques-
It is not clear on what basis he justifies being able to
tion we have only to seek in how far these latter will
say that “We shall begin by supposing that, in passing
still hold after different modifications of the image;
from one system to another, the dimensions, masses and
it will not at all be necessary that every image which
molecular forces may be arbitrarily modified,” as this
agrees in its essential characteristics with the one we
seems to require a certain kind of independence among
have first formed corresponds to a natural object.”
the things being modified. He argues that “if the sec-
ond system, as compared with the original one, is to Thus, Lorentz’s exploratory use of similar systems
satisfy Boltzmann’s and Wien’s laws,” that “we shall in fields beyond mechanics was motivated by the ex-
find that the charges of the electrons must remain unal- ample of van der Waals’ and Onnes’ highly successful
tered.” results using mechanical similarity to derive new theo-
He first describes a certain system S that includes retical results.
a “ponderable body” enclosed in a space. Some of the
features of S are delineated (he ascribes an “irregular Van der Waals and Onnes
molecular motion” and the “power of acting on one In his 1881 General Theory of Liquids, Onnes ar-
another with certain molecular forces” to the particles gued that van der Waals’ “law of corresponding states”,
making up the body, for instance, and adds that some which had just been published the previous year, could
are electrically charged) but other features are not (there be derived from scaling arguments, in conjunction with
may be other (molecular) forces of another kind, acting assumptions about how molecules behaved. Van der
on the electron) [18.29, p. 443]. The description of the Waals was impressed with the paper, and a long friend-
“really existing” system S is meant to pick out some- ship between the two ensued (Fig. 18.4). Van der Waals
thing that actually exists, in contrast to the system S′, was awarded the Nobel Prize in Physics in 1910 for
which “perhaps will be only an imaginary one” [18.29, “The equation of state for gases and liquids” [18.30],
p. 444]. To complete the description of the state of S′, and Onnes was awarded it in 1913 [18.31], for “Inves-
“we indicate, for each of the physical quantities in- tigations into the properties of substances at low tem-
volved, the number by which we must multiply its value peratures, which have led, amongst other things, to the
in S, in order to obtain its value in S′ at corresponding preparation of liquid helium.” In his lecture delivered for
points and times.” He then explores the constraints on the occasion, Onnes highlighted the connection between
[Figure: plot against reduced pressure $P_R$ (horizontal axis from 0 to 7); legend entries include propane, n-butane, carbon dioxide, and water]
Onnes used this insight about corresponding states particles in balance, as a function of the five param-
to set up an experimental apparatus to liquefy helium, eters. He solves this problem by deriving a set of
which has an extremely low critical temperature. What scaling relations for M, A, v, u, and p, which pertain
is so exciting about his story is that he had to rely on the if the units of length, mass, and time are changed.”
law of corresponding states to estimate the critical tem-
perature so that he would know where to look – that is, Onnes provides a criterion for corresponding states
so that he would know what conditions to create in or- based on these scaling relations, along with assump-
der for helium to liquefy. What is especially relevant to tions about what the molecular-sized objects are like.
the history of the notion of physically similar systems is Levelt Sengers remarks [18.33, p. 30]:
that he did more than just use van der Waals’ law of cor-
responding states. He also gave a foundation for it that “Two fluids are in corresponding states if, by proper
was independent of the exact form of van der Waals’ scaling of length, time and mass for each fluid, they
equation and did not depend on results in statistical me- can be brought into the same “state of motion.” It is
chanics. Instead, he used mechanical similarity [18.33, not clearly stated what he means by this, but he must
p. 30]: have had in mind an exact mapping of the molecular
motion in one system onto that of another system if
“Kamerlingh Onnes’s (1881) purpose is to demon- the systems are in corresponding states.”
strate that the principle of corresponding states can
be derived on the basis of what he calls the prin- Levelt Sengers illustrates what being in the same
ciple of similarity of motion, which he ascribes “state of motion” means “in modern terms” [18.33, p.
to Newton. He assumes, with van der Waals, that 30]:
the molecules are elastic bodies of constant size,
“[. . . ] suppose a movie is made of the molecular
which are subjected to attractive forces only when
motions in one fluid. Then, after setting the initial
in the boundary layer near a wall, since the attrac-
positions and speed of the molecules, choosing the
tive forces in the interior of the volume are assumed
temperature and volume of a second fluid appropri-
to balance each other [. . . ] He realizes this can be
ately, and adjusting the film speed, a movie of the
valid only if there is a large number of molecules
molecular motion in a second fluid can be made to
within the range of attraction [. . . Onnes] consid-
be an exact replica of that in the first fluid.”
ered a state in which N molecules occupy a volume
v, and all have the same speed u (no Maxwellian Appeal to such imagined visual images is very
distribution!). The problem is to express the external much in keeping with nineteenth century science, and
pressure p, required to keep the system of moving one can see here an attempt to generalize Newton’s use
of similar systems in the Principia to thermodynam- molecules) having the same energy but different ini-
ics. Onnes used the principle of corresponding states tial conditions.”
for more than visualizing, though, and, even, for more
than theorizing; he used it to show how one could make Stephen G. Brush, also citing Boltzmann’s 1884 and
a prediction about one fluid from knowledge about an- 1887 papers, remarks that [18.36, pp. 75–76]:
other. Wisniak explains [18.34, p. 569]
“There has been considerable confusion about what
“Kamerlingh Onnes proposed to use the law of Maxwell and Boltzmann really meant by ergodic
corresponding states to examine the possibility of systems. It appears that they did not have in mind
cooling hydrogen further by its own expansion. He completely deterministic mechanical systems fol-
then used this law to predict from the known expe- lowing a single trajectory unaffected by external
rience with oxygen what was to be expected from conditions; [. . . ]
the apparatus for the cooling of hydrogen: [quoting In fact, when Boltzmann first introduced the
Onnes:] But let us return to the thermodynamically words Ergoden and ergodische, he used them not
corresponding substances. If two such substances for single systems but for the collections of sim-
are brought in corresponding engines and if these ilar systems with the same energy but different
engines are set in motion with corresponding veloc- conditions. In the papers published in 1884 and
ities, then they will run correspondingly as long as 1887, Boltzmann was continuing his earlier analy-
there is given off a corresponding quantity of heat sis of mechanical analogies for the Second Law of
in the corresponding times by the walls of the ma- Thermodynamics, and also developing what is now
chine.” (following J. Willard Gibbs) known as ensemble
theory. Here again, Boltzmann was following a trail
Thus, Onnes has introduced not just correspond-
blazed by Maxwell, who had introduced the ensem-
ing motions and times, as in mechanical similarity, but
ble concept in his 1879 paper. But while Maxwell
also corresponding quantities of heat. Wisniak contin-
never got past the restriction that all systems in the
ues [18.34, p. 569]
ensemble must have the same energy, Boltzmann
“He [Onnes] then introduced the notion of ther- suggested more general possibilities, and Gibbs ul-
modynamically corresponding operations to argue timately showed that it is most useful to consider
that if then in a model, working with oxygen, af- ensembles in which not only the energy but also the
ter a given time a given volume of liquid oxygen number of particles can have any value, with a spec-
is found, there will be obtained in the correspond- ified probability.”
ing hydrogen apparatus after the corresponding time
What these commentators on Boltzmann are re-
a corresponding volume of liquid hydrogen.”
ferring to in mentioning the influence of Maxwell are
By model here, Onnes clearly means physical Maxwell’s remarks in his On Boltzmann’s Theorem on
model, and the model includes the contained gases the average distribution of energy in a system of mate-
such as oxygen and hydrogen. The model is an actual rial points [18.37]. There, Maxwell wrote, speaking of
physical model: a physical setup, an actual, physical the case “in which the system is supposed to be con-
machine. By the end of the nineteenth century, the tained within a fixed vessel” [18.37, pp. 715ff]:
physics of machines included the thermodynamics of
machines. And, as in Newton and Galileo’s day, one “I have found it convenient, instead of consider-
could talk both about imagined similar systems, and ing one system of material particles, to consider
about actual similar machines. a large number of systems similar to each other in
all respects except in the initial circumstances of the
Maxwell and Boltzmann motion, which are supposed to vary from system to
As several scholars have noted, Ludwig Boltzmann system, the total energy being the same in all. In the
(1844–1906) mentioned similar systems in his inves- statistical investigation of the motion, we confine
tigations into the theory of gases, too. It has been noted our attention to the number of these systems which
that, in his 1884 and 1887 papers, Boltzmann [18.35, at a given time are in a phase such that the variables
pp. 56–57]: which define it lie within given limits. [Emphasis in
italic added.]
“tried to deepen the foundation of the new theory If the number of systems which are in a given
[that was to become known as statistical mechanics] phase (defined with respect to configuration and ve-
by introducing the concept of Ergoden – meaning locity) does not vary with the time, the distribution
a collection (ensemble) of similar systems (of gas of the systems is said to be steady.”
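Maxwell's description of an ensemble, and his criterion for a steady distribution, can be illustrated with a toy computation. The sketch below is not Maxwell's or Boltzmann's own construction: it assumes, purely for illustration, that each "system" is a one-dimensional harmonic oscillator of unit mass and unit frequency. All systems share the same total energy and differ only in their initial phase, and the code counts how many lie in a fixed region of phase space at several times. The function names and the chosen cell are hypothetical.

```python
import math

def make_ensemble(n_systems, energy):
    """Oscillators (unit mass, unit frequency) that share one energy
    and differ only in their initial phase angle."""
    amplitude = math.sqrt(2.0 * energy)
    return [(amplitude, 2.0 * math.pi * k / n_systems) for k in range(n_systems)]

def phase_point(system, t):
    """Position q and momentum p of one system at time t."""
    amplitude, angle0 = system
    angle = angle0 + t
    return amplitude * math.cos(angle), -amplitude * math.sin(angle)

def count_in_cell(ensemble, t, q_lo, q_hi):
    """Number of systems whose phase lies in the cell q_lo <= q <= q_hi, p >= 0."""
    count = 0
    for system in ensemble:
        q, p = phase_point(system, t)
        if q_lo <= q <= q_hi and p >= 0.0:
            count += 1
    return count

ensemble = make_ensemble(n_systems=1000, energy=1.0)
counts = [count_in_cell(ensemble, t, 0.2, 0.6) for t in (0.0, 0.7, 1.9, 3.3)]
print(counts)  # the counts agree to within one system at every sampled time
```

Because the initial phases are spread evenly, the number of systems in the cell stays essentially constant as each system moves, which is the sense in which Maxwell calls such a distribution steady.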
Physically Similar Systems – A History of the Concept 18.3 Late Nineteenth and Early Twentieth Century 391
It is not clear how the use of the notion of similar to other laws. Hence a flying-machine, which when
systems here, i. e., in forming ensembles in thermo- made on a small scale is able to support its own
dynamics in order to study their behavior statistically, weight, loses its power when its dimensions are in-
might be related to either Newton’s notion of similar creased. The theory, initiated by Sir Isaac Newton,
systems or the notion involved in the principle of corre- of the dependence of various effects on the linear di-
sponding states. It is certainly a use of similar systems mensions, is treated in the article Units, Dimensions
that is very different from using one system experimen- Of.”
tally to infer the values of quantities in another. So, if,
as Brush’s comment implies, Boltzmann was thinking The use of a flying machine to illustrate the point
of more general kinds of similar systems, it seems he was not incidental; in his On Aeronautics, Boltzmann
was no longer restricting the notion of similar systems urged research into solving the problem of flight, and
to systems that are behaviorally similar to each other expressed his opinion that experimentation with kites
with respect to motions, and he was not restricting its was the appropriate approach. The complexities of air-
use to the use of one system or machine to infer the flow over an airplane wing, he said, were too difficult
behavior of another. to study using hydrodynamics [18.39, p. 256]. Yet,
Yet Boltzmann’s departure from Newton’s use of the basis for extrapolating from experiments on a kite
the term similar systems was almost certainly not a mat- or flying machine from one observed situation to an-
ter of confusion on Boltzmann’s part about the notion other, unobserved, situation (even with a machine of
in the sense Newton had used it, for Boltzmann’s en- the same size) owes something to hydrodynamics. The
cyclopedia entry on models [18.38] shows that he was dimensionless parameters yielding the appropriate cor-
well aware of, and respected the distinctive nature of, respondences between homologous quantities for kites
the use of experimental models of machines, in which and flying machines were provided by Helmholtz’s in-
one machine is specially constructed in order to infer novative use of the equations of hydrodynamics.
the behavior of another. Boltzmann, in fact, associates
the latter kind of model with Newton’s insights. 18.3.3 Similar Systems in Theoretical
On the approach in which physical models con- Physics
structed with our own hands are actually a continua-
tion and integration of our process of thought, Boltz- Stokes and Helmholtz
mann says in that encyclopedia article (Model) [18.38]: Hermann von Helmholtz (1821–1894), like Ludwig
Boltzmann and so many other physicists of the nine-
“physical theory is merely a mental construction of teenth century, contributed to the scientific literature on
mechanical models, the working of which we make research into flight. Some of these contributions took
plain to ourselves by the analogy of mechanisms we the form of investigations concerning the earth’s at-
hold in our hands.” mosphere. Six of the 20 papers in the important and
selective 1891 anthology The Mechanics of the Earth’s
In contrast, Boltzmann explicitly described experi- Atmosphere: A Collection of Translations by Cleve-
mental models as of a different sort than the kind with land Abbe [18.40] were by Helmholtz; one of these
which he was comparing mental models, and explained was his 1873 On a Theorem Relative to Movements
why they must be distinguished [18.38]: That Are Geometrically Similar in Fluid Bodies, To-
gether with an Application to the Problem of Steering
“A distinction must be observed between the mod- Balloons [18.41, 42]. It is the only one of Helmholtz’s
els which have been described and those experimen- papers in that volume that explicitly addresses an appli-
tal models which present on a small scale a machine cation to the problem of flight. What is relevant to the
that is subsequently to be completed on a larger, history of the concept of similar systems is the kind of
so as to afford a trial of its capabilities. Here it reasoning he uses in the paper.
must be noted that a mere alteration in dimensions Helmholtz’s starting point is “the hydrodynamic
is often sufficient to cause a material alteration in equations” which, he argues, can be considered “the
the action, since the various capabilities depend in exact expression of the laws controlling the motions
various ways on the linear dimensions. Thus the of fluids” ([18.41, p. 67], [18.42]). What about the
weight varies as the cube of the linear dimensions, well-known contradictions between observations and
the surface of any single part and the phenomena the consequences of the equations? Those, he argues,
that depend on such surfaces are proportionate to are only apparent contradictions, which disappear once
the square, while other effects – such as friction, ex- the phenomenon of “surfaces of separation” is no longer
pansion and conduction of heat, etc., vary according neglected [18.41, p. 67]; his On Discontinuous Motions
in Liquids [18.43, 44], also included in the same collec- a very fundamental problem to finding a neat solution,
tion of translations, aims to establish their existence. too, for [18.41, p. 67]
The Discontinuous Motions paper [18.43] is an ex-
“The discontinuous surfaces are extremely variable,
traordinarily interesting contribution to the methods of
since they possess a sort of unstable equilibrium,
reasoning by analogy between fluid currents, electrical
and with every disturbance in the whirl they strive
currents, and heat currents. For, the paper begins by
to unroll themselves; this circumstance makes their
pointing out that “the partial differential equations for
theoretical treatment very difficult.”
the interior of an incompressible fluid that is not sub-
ject to friction and whose particles have no motion of Theory being of very little use in prediction
rotation” are precisely the same as the partial differ- here, [18.41, p. 68]
ential equations for “stationary currents of electricity
“we are thrown almost entirely back upon experi-
or heat in conductors of uniform conductivity” [18.43,
mental trials, [. . . ] as to the result of new modi-
p. 58]. Yet, he notes, even for the same configurations
fications of our hydraulic machines, aqueducts, or
and boundary conditions, the behavior of these differ-
propelling apparatus.”
Stokes had employed “similar systems” [18.45, pp. 16–17]:

“Consider any number of similar systems, composed of similar solids, oscillating in a similar
manner in different fluids or in the same fluid. Let a, a′, a″ . . . be homologous lines in the
different systems; T, T′, T″ . . . corresponding times, such for example as the times of
oscillation from rest to rest. Let x, y, z be measured from similarly situated origins, and in
corresponding directions, and t from corresponding epochs, such for example as the
commencements of oscillations when the systems are beginning to move from a given side of the
mean position.”

Then, Stokes says that the form of the equations shows that the equations being satisfied for
one system will be satisfied for all the systems, if certain relations between the quantities
in those equations are met, which he lays out. He adds the condition needed in order for the
systems to be dynamically similar; then, if we “compare similarly situated points,” the motions
in the systems will also be similar, and the “resultants [of pressure of the fluids on the
solids] in two similar systems are to one another” in a certain ratio that he shows how to
obtain. Stokes does not end there; the paper contains further discussion about establishing
similarity between the two systems, having to do with how the fluids are confined. This much
about Stokes should give a general idea of how he conceived of and used the notion of “similar
systems” [18.45, p. 19].
Helmholtz’s approach probably owes much to Stokes; David Cahan’s study Helmholtz and the
British Scientific Elite: From Force Conservation to Energy Conservation identifies Stokes as
one of the British elites with whom Helmholtz built a relationship during the 1850s and 1860s
[18.46]. Helmholtz does refer to Stokes, to be sure, but there is also something creative in
what he does in his own paper. Helmholtz turns the idea of how the Eulerian equations for flow
are related to similar systems around, so that he sees how one might, in principle at least,
use the equations in conjunction with model experiments on ships to inform us about how to
predict and direct the motions of balloons (dirigibles).
The discussion and derivation of the conclusions Helmholtz reached for all the cases he
considered in his 1873 paper [18.41] are too long to summarize here, but a few points can be
mentioned:

1. Helmholtz’s strategy is to consider two given fluids and use the hydrodynamic equations to
infer the way or ways in which their quantities must be related. For the first fluid, the
directions of its coordinate axes are designated x, y, and z; the components of velocity
associated with them are designated u, v, and w. The time t, fluid density $\varepsilon$,
pressure p, and coefficient of friction k (viscosity) are also named, which allows him to
construct the equations of motion of the first fluid in the Eulerian form. The second fluid is
then given the designations U, V, W for the components of velocity (in coordinate axes X, Y, Z),
P for the pressure, E for the fluid density, and K for the viscosity constant. Three additional
constants q, r, and n are named, so that the quantities in the second fluid can then be related
to the designated quantities in the first fluid such that the quantities in the second fluid
will also satisfy the equations of motion that were constructed for the first fluid. For
example, the densities of the two fluids are related by $E = r\varepsilon$; their coefficients
of friction are related by $K = qk$; and the velocity components, by $U = nu$, $V = nv$, and
$W = nw$. Then, the pressures must be related by $P = n^2 r p + \text{constant}$, and the times
in the two fluids must be related by $T = qt/n^2$. Putting the terms for the quantities of the
second fluid expressed in terms of the quantities of the first fluid into the equations of
motion for the first fluid shows that they satisfy those equations.
2. The nature of the two fluids determines how their densities and coefficients of friction
are related to each other, so two of the three constants, q and r, are determined. Helmholtz
then considers various kinds of cases (e.g., compressible versus incompressible, cohesive
versus noncohesive (liquid versus gaseous fluids), certain boundary conditions, whether
friction can be neglected), and what they permit to be inferred about the third undetermined
constant n. The paper contains a variety of interesting remarks, some of great practical
significance, about how other quantities of the two fluids (e.g., velocity of sound) must be
related to each other.
3. When Helmholtz comes to addressing the practical problem mentioned in the title: “driving
balloons forward relative to the surrounding air,” he uses not two masses of air in which two
different air balloons are situated, but rather: for the second fluid, a mass of air in which
an air balloon is situated, and, for the first fluid, a mass of water in which a ship is
situated. He writes: “our propositions allow us to compare this problem [driving balloons
forward relative to the surrounding air] with the other one that is practically executed in
many forms, namely, to drive a ship forward in water by means of oar-like or screw-like means
of motion [. . . ] we must [. . . ] imagine to ourselves a ship driven along under the surface.
Such a balloon which presents
a surface above and below that is congruent with specification of what quantities need to be considered
the submerged surface of an ordinary ship scarcely in the analysis.
differs in its powers of motion from an ordinary Yet, Helmholtz is careful not to overreach concern-
ship” [18.41, p. 73]. Then, letting “the small letters ing what can be deduced from the form of an equation;
of the two above given systems of hydrodynamic as he points out in his Discontinuous Motions pa-
equations refer to water and the large letters to per [18.43] when investigating the example of fluid
the air,” he examines the practical conditions under being “torn asunder”: just because a certain situation
which he can “apply the transference from ship to is governed by an equation of the same form as an-
balloon with complete consideration of the peculiar- other equation governing a different situation, does not
ities of air and water.” in itself guarantee that the two situations will exhibit
analogous behavior – even when the configuration and
Helmholtz’s discussion contains many subtle points boundary conditions are also analogous. It is for the
concerning what would need to be considered if ac- confluence of all these points that I consider Helmholtz’
tually building the kind of ship needed to model an 1873 paper [18.41] such a major contribution to the his-
air balloon. As he indicates, the practical considera- tory of the concept of similar systems.
tions involved in applying the method are not trivial
and can sometimes even be prohibitive; nevertheless, Reynolds
the point is that the approach he outlines permits one Osborne Reynolds’ (1842–1912) work and influence on
to make a proper analysis of any such comparison, similarity was immense, but it was by no means his only
or “transference” using the hydrodynamic equations, major achievement [18.47]. Unless one has invested the
and can sometimes yield a solution when the hydrody- time required to read a significant part of his work, any
namic equations are insoluble [18.41, p. 74]. Evidence evaluation of his achievements and influence will sound
of the influence and significance of this particular pa- like hyperbole. I mention here only his most significant
per of Helmholtz’s into the twentieth century appears contribution relevant to the history of the concept of
in Zahm’s Theories of Flow Similitude [18.7]. Zahm similar systems.
identifies three methods, one with Isaac Newton, one The decisive difference Reynolds made in the no-
with Stokes and Helmholtz, and one with Rayleigh. tion of similar systems was to show that it applied
The sole paper by Helmholtz cited there is this paper beyond well-behaved regimes. In fact, he showed, it
of 1873 [18.41]. applied during the transition between well-behaved
The significance to the history of physically simi- regimes and chaotic ones. And, not only that, but that
lar systems is that Helmholtz’s account of his method the critical point of transition between well-behaved
involves a differential equation, that the equation is (laminar flow) and chaotic (turbulent flow) regimes
so central to the account, and that how it is involved could be characterized, and characterized by a param-
is stated so clearly. What is not stated very clearly is eter that was independent of the fluid. Stokes put it well
whatever it is that plays the role of system; sometimes in the statement he made in his role as President of the
Helmholtz seems to be saying that the transference is Royal Society on the occasion of presenting a Royal
from one mass of fluid to another; other times, that Medal to Reynolds on November 30, 1888 [18.45, p.
it is between the objects situated within the fluid. If 234]:
we denote whatever ought to play that role by the
term system, though, we would say that, in Helmholtz’s “In an important paper published in the Philosophi-
analysis, the hydrodynamic equations are not only the cal Transactions for 1883, [Osborne Reynolds] has
core of the criterion for allowing the “transference” given an account of an investigation, both theoret-
of results [18.41, p. 74] observed in one situation to ical and experimental, of the circumstances which
another, but they indirectly give a criterion for, and determine whether the motion of water shall be
thus specify, what a system is, that is, what the sim- direct or sinuous, or, in other words, regular and sta-
ilarity in similar systems is between. If we use the ble, or else eddying and unstable. The dimensions
term system this way, then it is implicit in Helmholtz’s of the terms in the equations of motion of a fluid
account that a system is the mass and its configu- when viscosity is taken into account involve, as had
ration (including anything situated within the mass), been pointed out, the conditions of dynamical sim-
with boundary conditions, to which the partial differ- ilarity in geometrically similar systems in which
ential equation applies. We might also take note of the the motion is regular; but when the motion be-
fact that what the equation applies to is in equilibrium comes eddying it seemed no longer to be amenable
(though not necessarily static equilibrium). The gov- to mathematical treatment. But Professor Reynolds
erning differential equations are important, too, in the has shown that the same conditions of similarity
Physically Similar Systems – A History of the Concept 18.3 Late Nineteenth and Early Twentieth Century 395
hold good, as to the average effect, even when the motion is of the eddying kind; and moreover that if in one system the motion is on the border between steady and eddying, in another system it will also be on the border, provided the system satisfies the above conditions of dynamical as well as geometrical similarity.”

Stokes does not here use the term similar systems, but that is what he means in using the grammatical construction: “if in one system [. . . ], in another system it will also [. . . ], provided the system satisfies the above conditions of dynamical as well as geometrical similarity.” What this means is that there are some (experimentally determined) functions of a certain (dimensionless) parameter that describe the behavior of fluids, whatever the fluid. The parameter is not a single measured quantity such as distance, velocity, or viscosity; rather, it is a ratio involving a number of quantities (e.g., density, velocity, characteristic length, and viscosity). The ratio is without units, as it is dimensionless. Reynolds is often cited for coming up with the criterion of dynamical similarity, but obviously, the idea predated his work, as Stokes’ statement recognizes. Rather, what Reynolds did that was so decisive for the future of hydrodynamics (and aerodynamics) was, as he explained in a letter to Stokes, that there was a critical value (or values) for “what may be called the parameter of dynamical similarity [the dimensionless parameter mentioned earlier, which is now known as Reynolds number]” [18.48, p. 233].

In the excerpt from his statement quoted above, Stokes puts his finger on why what Reynolds did was so significant in terms of a fundamental understanding of fluid behavior, but Reynolds’ 1883 paper also had practical significance for research in the field as well. Stokes continued [18.49, p. 234]:

“This is a matter of great practical importance, because the resistance to the flow of water in channels and conduits usually depends mainly on the formation of eddies; and though we cannot determine mathematically the actual resistance, yet the application of the above proposition leads to a formula for the flow, in which there is a most material reduction in the number of constants for the determination of which we are obliged to have recourse to experiment.”

It is not surprising that interest in applying the methods of similar systems grew in the subsequent years.

Prandtl

Prandtl’s work in experimental hydrodynamics and aerodynamics is singularly prominent in work done in the field in Germany in the twentieth century. Ludwig Prandtl (1875–1953) was an ex-engineer-turned-professor in the Polytechnic at Hanover conducting research on air flow when he presented a paper at the Third International Congress of Mathematicians in 1904: Motion of Fluids with Very Little Viscosity [18.50]. It did not make much of a splash – except with Felix Klein, then a prominent mathematician at the University of Göttingen. In his paper, Prandtl laid out a plan to treat flow around bodies. What he proposed was that the problem be analyzed into several distinct questions [18.50]:

1. What happened at the boundary of the skin that formed against the body, and what happened on each side of it, that is
2. What happened in the fluid on the side of the boundary that was within the skin, and
3. What happened in the fluid on the other side of the boundary, within the main fluid stream.

Prandtl showed that, in the mainstream, the mathematical solutions that were obtained by neglecting viscosity could be applied to even these real fluids. In the part of the flow under the skin formed around the body, however, viscosity did have to be taken into account. And, crucially, what happened in the mainstream – the formation of vortices – set conditions for what happened on the other side of the boundary, via setting boundary conditions at the interface between the two layers. Klein saw the potential of Prandtl’s approach and brought him to a post in Göttingen right away [18.11].

In Göttingen, Prandtl then made use of the knowledge that had been developed about hydrodynamical similarity, using a water tank for some of his most famous experiments. Rather than towing an object in the water, though, Prandtl used a water wheel to move the fluid in the water tank, much like fans were being used to push air through wind tunnels which by then were replacing the whirling arm or moving railcar apparatuses used earlier in aerodynamical research. Prandtl’s results for airfoils were based on hydrodynamical similarity and, hence, on the concept of dynamically similar systems. His approach went beyond that, too, including fundamental questions he addressed by combining mathematical solutions and experimental results in an uncommon kind of synthesis. William Lanchester in England also employed dynamic similarity and authored significant works about his theoretical and experimental research in aerodynamics; his visit to Prandtl in 1908 may have contributed somewhat to Prandtl developing these ideas, since Prandtl was in a position to understand Lanchester’s work, and appreciate its significance [18.11].
he introduces the topic by first citing Lanchester for one application of the principle of dynamical similarity, then noting his own communications of “a somewhat more general statement which may be found to possess advantages.” The next year, 1910–1911, the committee’s annual report included two papers on dynamical similarity, one of them by Rayleigh, under the General Questions in Aerodynamics section of the report [18.52]. In 1911–1912, the annual report mentions plans for experiments on an airship to determine its resistance “by towing tests in the William Froude National Tank” [18.53]. Under a section, The Law of Dynamical Similarity and the Use of Models in Aeronautics, the report notes its significance to all their research [18.52]:

“The theory relating to dynamical similarity explained by Lord Rayleigh and Mr. Lanchester in the first of the Annual Reports of the Committee is of fundamental importance in all applications of the method of models to the determination of the forces acting on bodies moving in air or in water.”

The next year, the annual report noted that [18.54]:

“Much evidence has now been accumulated in favour of the truth of the law of dynamical similarity to which attention was drawn by Lord Rayleigh and Mr. Lanchester in the first Report of this Committee.”

In June of 1914, the journal Nature featured a kind of survey paper, Fluid Motions, based on “a discourse delivered at the Royal Institution on March 20” by Rayleigh [18.55]. Here, we see Rayleigh actively campaigning for wider appreciation and use of the principle, which he credits Stokes with having “laid down in all its completeness.” We know that Stokes explicitly used the notion of similar systems in developing and explaining the use of the principle, so it is fair to say that Rayleigh means his discussion and use of it to be consistent with Stokes’ notion of similar systems.

that Reynolds also investigated cases where viscosity was the “leading consideration,” as Rayleigh put it, in remarking that “It appears that in the extreme cases, when viscosity can be neglected and again when it is paramount, we are able to give a pretty good account of what passes, it is in the intermediate region, where both inertia and viscosity are of influence, that the difficulty is the greatest” [18.55]. This is the lead-in to his advocacy for the law of dynamic similarity: “But even here we are not wholly without guidance.” What is this guidance? He continues [18.55, p. 364]:

“There is a general law, called the law of dynamical similarity, which is often of great service. In the past this law has been unaccountably neglected, and not only in the present field. It allows us to infer what will happen upon one scale of operations from what has been observed at another.”

Rayleigh also notes: “But the principle is at least equally important in effecting a comparison between different fluids. If we know what happens on a certain scale and at a certain velocity in water, [emphasis in the original] we can infer what will happen in air on any other scale, provided the velocity is chosen suitably.” This is, of course, the point Helmholtz had made in 1873. Rayleigh notes that the point applies only in the range where the velocities are small in comparison to the velocity of sound [18.55].

Rayleigh gives an example of a use of the principle which permits one observation or experiment to be regarded as representative of a whole class of actual cases: that is, the class of all the other cases to which it is similar, even though the cases may have very different values of measurable quantities such as velocity. The important fact about the situation is expressed by the formula for the dimensionless parameter, which picks out the cases to which it is similar [18.55, p. 364]:

“It appears that similar motions may take place provided a certain condition be satisfied, viz. that the
Physically Similar Systems – A History of the Concept 18.4 1914: The Year of Physically Similar Systems 397
product of the linear dimension and the velocity, divided by the kinematic viscosity of the fluid, remain unchanged.”

Put more specifically, the important feature of a particular situation is the value of this dimensionless parameter; what Rayleigh is saying is that, even in cases of a different fluid, so long as this dimensionless product is the same (and, of course, that one is in the applicable velocity range for which it was derived), the motions will be similar.

One might think that, by 1914, when the use of wind tunnels had become recognized as essential to practical aeronautical research, this principle would have become accepted and would no longer be in question, at least among aeronautical researchers. But if Rayleigh’s estimation of the state of the profession is correct, apart from Lanchester’s work, this was not so, even as late as March of 1914; he says that:

“although the principle of similarity is well established on the theoretical side and has met with some confirmation in experiment, there has been much hesitation in applying it, [. . . ]”

He especially mentions problems in its acceptance in aeronautics due to skepticism that viscosity, which is extremely small in air, should be considered an important parameter:

“In order to remove these doubts it is very desirable to experiment with different viscosities, but this is not easy to do on a moderately large scale, as in the wind channels used for aeronautical purposes.”

Rayleigh tries to persuade the reader of the significance of the effects of viscosity on the velocity of fluid flow by relating some experiments he performed with a cleverly designed apparatus in his laboratory. The apparatus consisted of two bottles containing fluid at different heights, connected by a tube with a constriction, through which fluid flowed due to the difference in “head”, or height of fluid, in the two bottles [18.55, p. 364]. The tube with the constriction contained fittings that allow the measurement of pressure head at the constriction, and on either side of it. To investigate the effects of viscosity, Rayleigh varied the temperature of the fluid, which changes the fluid viscosity, and he observed how the velocity of the fluid flowing between the two bottles was affected. The kind of relationship he establishes and uses is of the form Galileo employed in reasoning from one pendulum to another. In other words, he worked in terms of ratios (ratios of velocities, ratios of viscosities, ratios of heads), and he employed the fact that some ratios are the square root of others [18.55]. He took the experimental results he reported in this 1914 paper to conclusively settle the question of the relevance of viscosity to fluid motions. This is an example of the kind of exploratory work that can be involved in order to answer one of the questions needed in order to use the principle of similarity properly: What quantities are relevant to the behavior of interest (in the range of interest)? Although the researcher’s experience and judgment are involved, sometimes new experiments should be, and are, conceived and carried out to help determine the question.

Rayleigh delivered this “Discourse” in early 1914 [18.55]. 1914 was a very special year for the concept of similar systems, and deserves a section all its own.
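Rayleigh’s condition, that the product of the linear dimension and the velocity divided by the kinematic viscosity remain unchanged, can be illustrated numerically. The following is only a minimal sketch: the function names are hypothetical, and the kinematic viscosities are assumed round values for water and air near room temperature, not figures taken from Rayleigh’s paper. It also ignores Rayleigh’s caveat that velocities must be small compared with the velocity of sound.

```python
# Assumed round values near room temperature (illustrative, not from the source).
NU_WATER = 1.0e-6   # kinematic viscosity of water, m^2/s
NU_AIR = 1.5e-5     # kinematic viscosity of air, m^2/s


def similarity_parameter(length, velocity, kinematic_viscosity):
    """Rayleigh's dimensionless product: (linear dimension * velocity) / kinematic viscosity."""
    return length * velocity / kinematic_viscosity


def matching_velocity(length_a, velocity_a, nu_a, length_b, nu_b):
    """Velocity in case B that gives the same dimensionless product as case A."""
    return similarity_parameter(length_a, velocity_a, nu_a) * nu_b / length_b


# Case A: a 0.1 m body moving through water at 2 m/s.
param_water = similarity_parameter(0.1, 2.0, NU_WATER)

# Case B: a geometrically similar 1.0 m body in air; find the velocity
# at which the dimensionless product is unchanged.
v_air = matching_velocity(0.1, 2.0, NU_WATER, 1.0, NU_AIR)
param_air = similarity_parameter(1.0, v_air, NU_AIR)

print(f"water case: parameter = {param_water:.3g}")
print(f"air case:   velocity = {v_air:.3g} m/s, parameter = {param_air:.3g}")
```

Under these assumed values the two cases differ in fluid, size, and velocity, yet share one value of the dimensionless product, which is the sense in which one observation stands for a whole class of similar cases.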
well-defined meaning in the theories of mechanical similarity and dynamical similarity, may have caused, or at least contributed to, confusion about the concepts of similar system and similarity as they are used in connection with mechanical and dynamical similarity. As we shall see, confusions about these concepts came to a head in 1914; perhaps it is no coincidence that at least one source of the confusion was a proposal by someone known for his work in statistical thermodynamics.

18.4.1 Overview of Relevant Events of the Year 1914

In the part of 1914 leading up to Buckingham’s landmark paper in October 1914 [18.2] that developed the notion of physically similar systems, hardly a month went by without some major work concerning similarity and similar systems appearing (Fig. 18.3):

• In January 1914, Stanton and Pannell published a major compendium of work [18.6] done at Britain’s National Physical Laboratory over the previous four years, Investigation into Similarity of Motions.
• In February 1914, a much anticipated English translation of Galileo’s Two New Sciences [18.10] was published.
• In March 1914, Rayleigh delivered his lecture Fluid Motions [18.55] at the Royal Institution (March 20, 1914).
• In April 1914, Richard Chace Tolman’s The Principle of Similitude appeared in Physical Review [18.58], and Rayleigh’s Fluid Motions [18.55] was published in the periodical Engineering, 97 (April 8, 1914).
• In May 1914, Buckingham gave a paper on The Interpretation of Model Experiments to the Washington Academy of Sciences [18.59].
• In June 1914, Rayleigh’s review article Fluid Motions was published in Nature [18.55].
• In July 1914, Buckingham’s Physically Similar Systems was published in Journal of the Washington Academy of Sciences [18.1].
• In October 1914, Buckingham’s Physically Similar Systems: Illustrations of the Use of Dimensional Equations [18.2] was published.

And sometime during 1914, Philipp Forchheimer’s Hydraulik [18.60] was published, which contains a section on The Law of Similarity (Das Ähnlichkeitgesetz). Hydraulik became a highly regarded compendium and reference work on hydraulics for many decades afterward. In the concluding paragraph of the section on the law of similarity, Forchheimer writes that every hydraulic equation that fulfills the law of similarity can be expressed in the form of an equation consisting of an unidentified function F of three dimensionless ratios set equal to an unidentified constant. He indicates that the law of similarity is shown to be merely a special case of the general law according to which all the terms of any of the equations of importance in mechanics need to be of equal dimension, inasmuch as the law of similarity treats one body as a prototype, and the others as copies of it.

18.4.2 Stanton and Pannell

In January of 1914, Stanton and Pannell read their paper Similarity of Motion in Relation to the Surface Friction of Fluids [18.6] to the Royal Society of London. Stanton was superintendent of Britain’s National Physical Laboratory (NPL) Engineering Department. The paper was a compendium of the work done there on similarity, and had been submitted to the Society in December 1913. It begins with references to Helmholtz’s and Stokes’ work using equations for non-ideal fluid flow, refers to Newton’s Principia on similar motions, and uses Rayleigh’s equation for fluid resistance. It explains that Stanton and Pannell’s work involves investigating “the conditions under which similar motions can be produced under practical conditions.” The work had been carried out due in part to interest in the possibilities of using small-scale models in wind tunnels for engineering research. With one exception, they began, the experimental study of similar motions of fluids was very recent [18.6, p. 200]:

“Apart from the researches on similarity of motion of fluids, which have been in progress in the Aeronautical Department of the National Physical Laboratory during the last four years, the only previous experimental investigation on the subject, as far as the authors are aware, has been that of Osborne Reynolds [. . . ].”

Stanton and Pannell cite several of Reynolds’ major discoveries:

1. that there is a critical point at which fluid flow suddenly changed from “lamellar motion” to “eddying motion” [18.6, p. 200]
2. that the critical velocity is directly proportional to the kinematical viscosity of the water and inversely proportional to the diameter of the tube, and
3. that for geometrically similar tubes, the dimensionless product (critical velocity) × (diameter) / (kinematic viscosity of water) is constant.

Stanton and Pannell also noted a complication: surface roughness needed to be taken into account; this is
a matter of geometry on a much smaller scale making a difference. However, the overall approach of the use of dimensionless parameters to establish similar situations was still seen to be valid, as indicated by their extensive experiments [18.6, p. 201]:

“From the foregoing it appears that similarity of motion in fluids at constant values of the variable $vd/\nu$ [velocity × diameter / kinematic viscosity of water] will exist, provided the surfaces relative to which the fluids move are geometrically similar, which similarity, as Lord RAYLEIGH pointed out, must extend to those irregularities in the surfaces which constitute roughness. In view of the practical value of the ability to apply this principle to the prediction of the resistance of aircraft from experiments on models, experimental investigation of the conditions under which similar motions can be produced under practical conditions becomes of considerable importance, [. . . ] By the use of colouring matter to reveal the eddy systems at the back of similar inclined plates in streams of air and water, photographs of the systems existing in the two fluids when the value of $vd/\nu$ was the same for each, have been obtained, and their comparison has revealed a remarkable similarity in the motions.”

In referring to the dimensionless parameter $vd/\nu$ as a variable, what Stanton and Pannell meant was that their equation for the resistance R includes a function of this dimensionless parameter, that is, resistance R = (density) × (velocity)² × (some function of $vd/\nu$). As they put it, $R = \rho v^2 F(vd/\nu)$, where $F(vd/\nu)$ indicates some unspecified function of $vd/\nu$. Hence, $vd/\nu$ is a variable in the sense that the relation for resistance includes an unspecified function of $vd/\nu$. It is also a variable in a more practical sense: it can be physically manipulated.

Stanton and Pannell presented this relation as a consequence of the principle of dynamical similarity (in conjunction with assumptions about what “the resistance of bodies immersed in fluids moving relatively to them” depends on). Evidently, it was Rayleigh who suggested the generalization; they cite Rayleigh’s contribution in the Report to the Advisory Committee for Aeronautics, 1909–1910 [18.51, p. 38]. Rayleigh had there spoken of the possibility of taking a more general approach than current researchers were taking in applying the “principle of dynamical similarity.”

In presenting the results they obtained at the National Physical Laboratory in the paper, it is noteworthy that the results are presented in graphs where one of the variables plotted is the term $R/(\rho v^2)$, which is just another expression for the unspecified function, and is dimensionless. What this implies is that the laboratory experiments are not conceived of in terms of the values of individual measurable quantities such as velocity but in terms of the value of a dimensionless parameter.

Rayleigh, too, presented a kind of survey paper in early 1914, as mentioned above. In that March 1914 paper [18.55], Rayleigh noted that the principle of dynamical similarity “allows us to infer what will happen upon one scale of operations from what has been observed at another.” That is, one use of the principle is to use an observation or experiment as representative of a whole class of actual cases: all the other cases to which it is similar, even though the cases may have very different values of measurable individual quantities such as velocity. The important fact of the situation is the dimensionless parameter just mentioned [18.55]:

“It appears that similar motions may take place provided a certain condition be satisfied, viz. that the product of the linear dimension and the velocity, divided by the kinematic viscosity of the fluid, remain unchanged.”

A consequence of this fact is that, even in cases of a different fluid, so long as this dimensionless product is the same, the motions will be similar: no mention of the fluid! Not only is this striking claim correct, but it is responsible for a particularly useful application of Stanton and Pannell’s work, of which they were well aware: tests done on water can be used to infer behavior about systems where the fluid is air. Not because air and water are similar – the relevant fluid properties are very different, in fact – but because the dimensionless parameter relating a number of the features of the fluid and of the situation is the same. Air and water are about as different as can be [18.6, p. 202]:

“The fluids used in the majority of the experiments have been air and water. The physical properties of these are so widely different that observations on others are hardly necessary [. . . ]”

Just as the theorem of corresponding states in physical chemistry allowed the construction of a function such that the values for many different kinds of fluids all fell on the same line, so here, too: that the function of the variable $vd/\nu$ is the same for air, water, and oil is experimentally illustrated by Fig. 18.6 from the paper.

18.4.3 Buckingham and Tolman

Buckingham’s Background in 1914

Edgar Buckingham (1867–1940) was a physicist who had been working at the National Bureau of Standards in Washington, DC, since 1906. He had little previous experience or background in aeronautics when he began working on issues related to aeronautical research. His
involvement arose as a consequence of efforts afoot to establish a government agency devoted to aeronautical research in the United States, modeled on the British Advisory Committee for Aeronautics; one spot was allocated for a physicist from the National Bureau of Standards [18.61]. How did it end up that it was Buckingham, then, who authored the paper that has become such a landmark in hydrodynamics and aerodynamics? In a letter to Rayleigh in 1915, Buckingham explained the origins of his 1914 paper On Physically Similar Systems: Illustrations of the Use of Dimensional Equations [18.62]:

“Some three or four years ago, having occasion to occupy myself with practical hydro- and aerodynamics, I at once found that I needed to know more about the method [of dimensions] in order to use it with confidence for my own purposes. Since you and the few others who have made much use of the method of dimensions have generally referred to it somewhat casually as to a subject with which everyone was familiar, I supposed that the hiatus in my education would be easily filled.”

But it was not [18.62]:

“[. . . ] upon looking through your collected papers, the Sound [probably a reference to Rayleigh’s Theory of Sound], Stokes’s papers, and a few standard books such as Thompson and Tait [Principles of Mechanics] and Routh’s Rigid Dynamics I was amazed at my failure to find any simple but comprehensive exposition of the method which could be used as a textbook. [. . . ] Each one of your numerous applications of the method seemed perfectly clear, and yet their simplicity gave them the appearance of magic and made the general principle rather elusive.”

It is noteworthy that Buckingham mentions looking at the main mechanics textbooks used in Britain, rather than engineering texts. Approaching aerodynamics from the point of view of a physicist was consistent with the kind of community in which Buckingham worked and had been educated. He had earned an undergraduate degree in physics at Harvard University (graduating in 1887) and a doctorate in physics from Leipzig in 1894. Descriptions of him as an engineer or physicist-engineer as mentioned in Maila Walter’s book [18.8] are somewhat misleading. After a few years as a physics professor, Buckingham worked as a physicist at US government agencies; first at the USDA Bureau of Soils (where he did very original theoretical work, applying energy methods), and then at the National Bureau of Standards [18.11]. Involving physicists on aerodynamical research planning made sense, and it also helped cultivate a more prestigious image of a research institution concerned with aerodynamics in 1914. Buckingham seemed aware of this, as evidenced by his remark to Rayleigh about the latter’s Nature article on the principle of dynamical similarity; he wrote Rayleigh that [18.62]:

“a note, such as the one in Nature of March 18th, which has your authority behind it, has an effect far more important in the present state of affairs than any detailed exposition of the subject, however good, because physicists will be sure to read it.”

[Figure: Moody diagram. Friction factor f plotted against Reynolds number Re (10³ to 10⁸), with curves for relative roughness k/d. Surviving caption text: “[. . . ] of Reynolds number remains the fundamental approach even today. The chart above, a Moody diagram, illustrates that the fluid behavior [. . . ]”]
One of Buckingham’s special areas of expertise within physics was thermodynamics. He did not view thermodynamics as merely a subspecialty in physics, though, but rather as an enlightened view of science in which thermodynamics encompassed all of classical mechanics. In his 1900 book, Outline of a Theory of Thermodynamics, Buckingham had written [18.63, p. 16]:

“Thermodynamics [. . . ] aims at the study of all the properties or qualities of material systems, and of all the forms of energy which they possess. It must, therefore, be held, in a general sense, to include pure dynamics, which is then to be looked upon as the thermodynamics of systems of which a number of nonmechanical properties are considered invariable. For thermodynamics, in this larger sense, the more appropriate name energetics is often used, the word thermodynamics being reserved to designate the treatment of problems which are directly concerned with temperature and heat.”

“The speaker began by deducing a general theorem regarding the form which physical equations must have in order to satisfy the requirement of dimensional homogeneity.”

Dimensional homogeneity is an exceedingly general requirement of an equation; if the terms in an equation have any units (as equations in physics do), the equation is not really considered an equation if it does not meet the requirement of dimensional homogeneity. Thus, this deduction is of something very fundamental in physics; it is about the logic of equations. The account continues [18.59]:

“The theorem may be stated as follows: If a relation subsists among a number of physical quantities, and if we form all the possible independent dimensionless products of powers of those quantities, any equation which describes the relation is reducible to the statement that some unknown function of these dimensionless products, taken as independent arguments, must vanish.”
Buckingham’s approach toward formalizing
physics in his 1900 book on the foundations of The antecedent of the theorem is extremely gen-
thermodynamics had been to make the formalism he eral: “If a relation subsists among a number of physical
proposed as flexible as possible, and to build as few quantities [. . . ];” what is striking is that the antecedent
assumptions into it as possible. In generalizing the of the theorem is not a requirement that the relation
existing science of dynamics, he chose to regard as mentioned be known, only that it exist. The theorem was
variable certain properties that are often considered described as a “general summary of the requirement
invariable in dynamics. As Buckingham obtained his of dimensional homogeneity.” The report on Bucking-
doctorate in Leipzig under Wilhelm Ostwald, a friend ham’s talk added that the method of determining the
of Boltzmann who was often engaged with him in number and forms of the independent dimensionless
discussions and debates about foundational issues in products was explained. There is no mention of simi-
science, Buckingham was familiar with debates in lar systems in the journal’s account of this May 1914
philosophy of science [18.11]. Buckingham developed talk, but it does add that the theorem “may be looked at
(if he had not already had) a penchant for asking from various standpoints and utilized for various pur-
foundational questions, too; in his new role of advisor poses,” and that “several illustrative examples” were
on research into aeronautics, he set for himself the task given showing the “practical operation of the theo-
of discerning the foundations of the methods he saw rem” [18.59].
being used in aeronautical research. In July 1914, the academy’s journal featured a short,
six-page paper by Buckingham. The topic identified
Buckingham’s Papers at the Washington was more general than model experiments, and this
Academy of Sciences in 1914 time it did mention similar systems; in fact, the paper
By the middle of 1914, Buckingham had figured out was titled Physically Similar Systems. That Bucking-
some things about the foundations of the methods used ham meant the July paper to be seen as a generalization
in aerodynamical research. As his note to Rayleigh indi- of the earlier paper on the interpretation of model ex-
cates, he had been concentrating on understanding how periments was indicated in the closing sentence of the
“the method of dimensions,” or dimensional analysis, paper [18.2, p. 353]:
was employed in aerodynamical and hydrodynamical
research. On May 23, 1914, he presented a paper enti- “A particular form of this theorem, known as the
tled The interpretation of experiments on models to the principle of dynamical similarity is in familiar use
Washington Academy of Sciences in Washington, DC, for the interpretation of experiments on mechani-
of which he was a member; 27 people were present, and cal models; but the theorem is equally applicable to
four discussed the paper afterward [18.59]. The account problems in heat and electromagnetism” (emphasis
published in the academy’s journal stated that: added).”
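As a concrete illustration of the theorem reported from the May 1914 talk, consider a standard textbook case rather than one of Buckingham’s own examples: the drag force $F$ on a body of characteristic length $l$ moving at speed $v$ through a fluid of density $\rho$ and viscosity $\mu$. The choice of these five quantities, the mass-length-time base, and the symbol $\psi$ for the unknown function are assumptions made here for the sketch.

```latex
% Assumed complete relation among five quantities (illustrative, not from the source):
f(F, \rho, v, l, \mu) = 0
% Dimensions in terms of mass M, length L, time T:
[F] = M L T^{-2}, \quad [\rho] = M L^{-3}, \quad [v] = L T^{-1}, \quad [l] = L, \quad [\mu] = M L^{-1} T^{-1}
% Five quantities, three base kinds: 5 - 3 = 2 independent dimensionless products, e.g.
\Pi_1 = \frac{F}{\rho v^{2} l^{2}}, \qquad \Pi_2 = \frac{\rho v l}{\mu}
% The relation is then reducible to an unknown function of these products that vanishes:
\psi(\Pi_1, \Pi_2) = 0
```

Provided the reduced relation can be solved for $\Pi_1$, a model and a full-size body that share the same value of $\Pi_2$ must also share the same value of $\Pi_1$, which is the sense in which such a theorem bears on the interpretation of experiments on models.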
402 Part D Model-Based Reasoning in Science and the History of Science
Like the May 1914 talk, the short July 1914 paper is notable for the generality of its approach. It did not imply that there were any set fundamental quantities, nor how many there were. It did not talk about physics, even. It spoke of quantities, relations between quantities, and equations. It is spare and elegant. It begins [18.1]:

“Let n physical quantities, Q, of n different kinds, be so related that the value of any one is fixed by the others. If no further quantity is involved in the phenomenon characterized by the relation, the relation is complete and may be described by an equation of the form $\sum M Q_1^{b_1} Q_2^{b_2} Q_3^{b_3} \cdots Q_n^{b_n} = 0$, in which the coefficients M are dimensionless or pure numbers”

Buckingham spoke of an undetermined function whose arguments were dimensionless parameters and he spoke of varying the quantities (Qs and Ps above) in ways that “are not entirely arbitrary but subjected to the $(n - k - 1)$ conditions that [certain] dimensionless $\Pi_i$’s remain constant” [18.1].

Putting it in other terms, Buckingham characterized systems as similar in terms of a (nonunique) set of invariants. His emphasis is on the principle of dimensional homogeneity, which is really about the logic of the equations of physics. The concept of similar systems arises from reflecting on how the principle of dimensional homogeneity might actually be put to use, what it might allow one to infer. After the paper’s open-
(so far as he was aware) had not yet been articulated by others who had employed the method. He would later write to Rayleigh about these first papers on the method [18.62]:

“I had therefore [. . . ] to write an elementary textbook on the subject for my own information. My object has been to reduce the method to a mere algebraic routine of general applicability, making it clear that Physics came in only at the start in deciding what variables should be considered, and that the rest was a necessary consequence of the physical knowledge used at the beginning; thus distinguishing sharply between what was assumed, either hypothetically or from observation, and what was mere logic and therefore certain.

The resulting exposition is naturally, in its general form, very cumbersome in appearance, and a large number of problems can be handled vastly more simply without dragging in so much mathematical machinery.”

His exposition treats of a system S characterized very abstractly: “The quantities involved in a physical relation pertain to some particular physical system which may usually be treated as of very limited extent” [18.1, p. 352]. The system constructed to be similar to it, likewise, is described very formally [18.1, p. 352]:

“Let S′ be a second system into which S would be transformed if all quantities of each kind Q involved in [the equation expressing the physical relation pertaining to the system] were changed in some arbitrary ratio, so that the r’s for all quantities of these kinds remained constant, while the particular quantities $Q_1, Q_2, \ldots, Q_k$ changed in k independent ratios.”

After completing the specification of the constraints on how the quantities change in concert with each other so that S′ also satisfies the relation: “Two systems S and S′ which are related in the manner just described are similar as regards the physical relation in question.” [18.1, p. 352]

The exposition may have been cumbersome, but the point is elegant and spare: the constraints that must be satisfied in constructing the system S′ are just these: to keep the value of the dimensionless parameters that appear in the general form of the equation – the arguments of the function – the same in S′ as in S. So, what is crucial is to identify a set of dimensionless parameters that can serve as the arguments of the undetermined function $\psi$. For Buckingham, unlike for some predecessors writing about similar systems or dynamic similarity, the method underlying the construction of physically similar systems is not a method peculiar to mechanics; it applies to any equation describing a complete relation that holds between quantities.

Richard Chace Tolman’s Principle of Similitude
Meanwhile, another physicist in the United States was publishing on similitude, too, though with considerably less rigor. Richard Chace Tolman (1881–1948) was an assistant professor of the relatively new field of physical chemistry at the University of California when Onnes won the Nobel Prize for his work in physical chemistry on the liquefaction of helium; Onnes delivered his Nobel Prize Lecture in December 1913 [18.31, 64]. As noted above, Onnes had aimed to “demonstrate that the principle of corresponding states can be derived on the basis of what he calls the principle of similarity of motion, which he ascribes to Newton” [18.32].

Tolman published The Principle of Similitude in the March 1914 Physical Review, in which he proposed the following [18.58, p. 244]:

“The fundamental entities out of which the physical universe is constructed are of such a nature that from them a miniature universe could be constructed exactly similar in every respect to the present universe.”

Tolman then (he claimed) showed that he could derive a variety of laws, including the ideal gas law, from the principle of similitude he had proposed, proceeding in somewhat the same way as Onnes had proceeded in showing that the principle of corresponding states was a consequence of mechanical similarity. Tolman seemed to appeal to a criterion that the two universes should be observationally equivalent [18.58, p. 245]:

“[. . . ] let us consider two observers, O and O′, provided with instruments for making physical measurements. O is provided with ordinary meter sticks, clocks and other measuring apparatus of the kind and size which we now possess, and makes measurements in our present physical universe. O′, however, is provided with a shorter meter stick, and corresponding altered clocks and other apparatus so that he could make measurements in the miniature universe of which we have spoken, and in accordance with our postulate obtain exactly the same numerical results in all his experiments as does O in the analogous measurements made in the real universe.”

He brings up some other considerations, some from physics (Coulomb’s law), some from the theory of dimensions, and then tries to show how various physical relations, such as the ideal gas law, can be deduced from simple physical assumptions and his proposed principle
of similitude. For relations involving gravitation, however, a contradiction arises; his response is to use the contradiction as motivation to propose a new criterion for an acceptable theory of gravitation. He concludes that his proposed principle is a new relativity principle: the “principle of the relativity of size” [18.58, p. 255].

Tolman believes that, in his paper, he has laid out transformation equations that specify the changes that have to be made in lengths, masses, time intervals, energy quantities, etc., in order to construct a miniature world such that [18.58, p. 255]:

“If, now, throughout the universe a simultaneous change in all physical magnitudes of just the nature required by these transformation equations should suddenly occur, it is evident that to any observer the universe would appear entirely unchanged. The length of any physical object would still appear to him as before, since his meter sticks would all be changed in the same ratio as the dimensions of the object, and similar considerations would apply to intervals of time, etc. From this point of view we can see that it is meaningless to speak of the absolute length of an object, all we can talk about are the relative lengths of objects, the relative duration of lengths of time, etc. The principle of similitude is thus identical with the principle of the relativity of size.”

Tolman’s suggestion differs from the concept of similar systems mentioned so far, though the difference may not be obvious. Others working on similar systems where quantities or paths were homologous between similar systems noted that there were limits of applicability; they recognized the fact that there are ranges in which size matters (e.g., surface tension matters disproportionately at small scales [18.21]; the restriction in Helmholtz’ 1873 paper that velocities must be small with respect to the velocity of sound [18.41], Reynolds’ recognition of the role of “mean range” of molecules in transpiration [18.11]). Helmholtz even explicitly discussed the practical difficulties of constructing models of a different size than the configuration modeled, raising the question of whether in some cases it may not be possible to do so [18.41]. Tolman not only does not recognize such limits; he suggests making the denial that they exist a principle of physics. It seems pretty clear that Tolman is here modeling his exposition on Einstein’s 1905 paper on the special theory of relativity. Tolman proposes that the relativity of size be regarded along the lines of the relativity of motion: in his paper on special relativity, Einstein had considered it a principle that observers cannot tell one state of unaccelerated motion from another; Tolman proposes to do the same for the statement that observers not be able to distinguish an appropriately constructed model universe from the actual one [18.58], if inhabiting it as an appropriately transformed being and using appropriately constructed or transformed instruments.

There is a confusion in Tolman’s reasoning. While it is quite natural to say that a desirable principle of nature, and a desirable constraint on measuring systems, is that it should not matter to the project of pursuing truth that one observer in the actual world is using one system of measurement and another observer in the actual world is using another system of measurement, Tolman seems here to be confusing that requirement with a requirement that miniature universes constructed from the materials of the actual universe be indistinguishable from the actual, full size, universe by the miniature observers inhabiting those miniature universes.

Buckingham’s Physical Review Paper and Reply to Richard Chace Tolman
It’s rather obvious that the notion of similar systems – one system being transformed into another system S′ in such a way that it “corresponds” to S (“as regards the essential quantities”) – is relevant to evaluating the claim Tolman made in his 1914 Principle of Similitude paper [18.58] that the universe could be transformed overnight into an observationally indistinguishable miniature universe. The notion of similar systems is also relevant to Stanton and Pannell’s Similarity of Motion paper [18.6], in that it is a more general treatment of the methodology of model testing (“the principle of dynamical similarity” [18.6, p. 201]) given there. In the next paper that Buckingham wrote on the topic [18.2], in addition to presenting the generalized treatment found in the July 1914 version of Physically Similar Systems, he addressed both these related topics on which major papers had appeared in the earlier part of the year: experimental models and Tolman’s claims about the possibility of an observationally indistinguishable miniature universe. The October 1914 Physical Review featured Buckingham’s On Physically Similar Systems: Illustrations of the Use of Dimensional Equations; his manuscript is dated June 18th of that year [18.2].

In his 1914 Physical Review paper [18.2], Buckingham says that his purpose in presenting how the notion of physically similar systems can be developed from the principle of dimensional homogeneity in that paper was to provide the background against which to respond to Tolman’s proposed “principle of similitude” [18.2, p. 356]. He makes several points relevant to addressing Tolman’s proposal for a new principle in physics in
developing “the notion of physical similarity” and “the notion of physically similar systems”:

1. It is only “the phenomenon characterized by the relation [expressed by the equation whose existence was assumed at the start]” that “occurs in a similar manner” in both systems: “we say that the bodies or systems are similar with respect to this phenomenon (emphasis added).” Buckingham specifically points out that systems that are “said to be dynamically similar” might not be similar “as regards some other dynamical relation”; two dynamically similar systems might not “behave similarly in some different sort of experiment.”
2. There is a more general conception of similarity than dynamical similarity, and it too “follows directly from the dimensional reasoning, based on the principle of homogeneity.”
3. Tolman’s proposed Principle of Similitude is not clearly stated, but inasmuch as Buckingham understands it, it seems to him “merely a particular case” of the theorem Buckingham presents in the paper. Buckingham reasons as follows: The way Tolman proceeds is to select four specific independent kinds of quantity (length, speed, quantity of electricity, electrostatic force), subject these four kinds of quantity to four arbitrary conditions, and then find the conditions that some other kinds of quantities are subject to “in passing from the actual universe to a miniature universe that is physically similar to it” [18.2, p. 356]. I take Buckingham’s point to be that, inasmuch as what Tolman is concluding is correct, it can be concluded using the principle of dimensional homogeneity without the aid of the “new” principle that Tolman proposed in his March 1914 Physical Review paper.

Having already remarked that the notion of similar systems used in constructing and using a model propeller is generalizable beyond mechanics, he then goes on to show how the principle involved in doing so – the “method of dimensions” [18.65, p. 696] – applies in problems ranging from electrodynamics (energy density of a field, the relation between mass and radius of an electron, radiation from an accelerated electron) to thermal transmission, and, finally, at a higher level, to the kind of bird’s-eye view question to which his interest tended to migrate: “the relation of the law of gravitation to our ordinary system of mechanical units.”

The question he asks about the role of the law of gravitation in determining units of measure is a bit different. It is about the number of fundamental units, and the question Buckingham asks can be put in terms of similar systems: if it is, in fact, true that in mechanics three fundamental units suffice to describe mechanical phenomena (more if thermal and electromagnetic phenomena are to be described), then it would be correct to conclude that [18.2, pp. 372–373]:

“[A] purely mechanical system may be kept similar to itself when any three independent kinds of mechanical quantity pertaining to it are varied in arbitrary ratios, by simultaneously changing the remaining kinds of quantity in ratios specified by [the constraint of dimensional homogeneity . . . ] For instance, we derive a unit of force from independent units of mass, length, and time, by using these units in a certain way which is fixed by definition, and we thereby determine a definite force which is reproducible and may be used as a unit. Now by Newton’s law of gravitation it is, in principle, possible to derive one of the three fundamental units of mechanics from the other two.”

Buckingham then describes a laboratory experiment from which a unit of time can be derived from units of mass and length – if one assumes Newton’s law of gravitation to hold. To be clear: Buckingham is granting that people have sometimes reduced the number of fundamental units to two, such as when a unit of time is derived from units for mass and length, when working on specific problems. What he is concerned to show is that, in order to do so, they have had to use assumptions about the law of gravitation. He is not unaware that the current state of physics indicates Newton’s law of gravitation is not the final word, and is pointing out the role that a law of gravitation plays in such reductions of the number of fundamental units to two. Put in terms of similar systems, the question is: How many degrees of freedom do we have in constructing a system S′ that is similar to S? How many quantities can be varied in an arbitrary ratio when we transform S into S′, a system that is physically similar to it?

Buckingham points out that, even in the domain of mechanics, it depends on what phenomenon the relation between quantities characterizes. As he emphasized, the notion of physical similarity and physically similar systems involves only similarity with respect to a specified relation. (Recall that the analysis started with the quantities involved in a given equation, where that equation describes a relation that relates a certain number of kinds of quantities such that any one was determined by all the others, and the relation characterized a phenomenon of interest.) In developing a general methodology, Buckingham had considered all possible relations that could exist among the given kinds of quantities. In the most general case, the law of gravitation is a constraint on how quantities are related. Recognizing this additional constraint reduces the
number of independent quantities by one. However, he explains, such generality is not always required in practice [18.2, p. 374]:

“But if for ‘all possible relations’ we substitute ‘all relations that do not involve the law of gravitation,’ we may ignore the law and proceed as if it were non-existent.”

This can actually be done in many cases, he says, since [18.2, p. 374]:

“in practice, physicists are seldom concerned with the law of gravitation: for all our ordinary physi-

However, for precise geodesy and astronomy, one needs to be explicit about the law of gravitation.

Buckingham’s answer to the question Tolman’s paper raises about the possibility of constructing observationally indistinguishable miniature universes, thus, bifurcates into two cases, depending on whether or not the phenomenon that we are interested in observing in the miniature universe is influenced by the law of gravitation. If not, then it might not be impossible to construct a miniature universe, as Tolman suggests, that will be similar to the universe (as regards that phenomenon). On the other hand, if the phenomenon is influenced by the law of gravitation, more things must be taken into account:

“the gravitational forces in the miniature universe must bear to the corresponding gravitational forces in the actual universe a ratio fixed by the law of gravitation.”

He points out that the effect of the law of gravitation on the phenomena of interest shows up in the process of constructing similar systems. If we erroneously try to independently choose three units rather than letting the third be determined by the first two fundamental units chosen, we run into trouble because the measured values for corresponding speeds and forces will not correspond to the values in the actual universe – unless, that is, the third unit is allowed to be fixed by the law of gravitation in terms of the first two.

The points about physically similar systems, systems of units, and the law of gravitation seem to be questions in the logic of physics. Yet, the main claim of Buckingham’s papers on physically similar systems can actually be stated in terms of a theorem about the symbolism of relations between physical quantities.

This is seen in the “convenient summary” with which he concludes the paper [18.2, p. 376]:

“A convenient summary of the general consequence of the principle of dimensional homogeneity consists in the statement that any equation which describes completely a relation subsisting among a number of physical quantities of an equal or smaller number of different kinds, is reducible to the form $\psi(\Pi_1, \Pi_2, \ldots, \Pi_i, \text{etc.}) = 0$ in which the $\Pi$’s are all the independent dimensionless products of the form $Q_1^{x} Q_2^{y} \ldots$, etc. that can be made by using the symbols of all the quantities Q.”

The equation $\psi(\Pi_1, \Pi_2, \ldots, \Pi_i, \text{etc.}) = 0$ in the

18.4.4 Precursors of the Pi-Theorem in Buckingham’s 1914 Papers
This chapter is devoted to the history of the notion of physically similar systems. Buckingham’s 1914 papers are considered a landmark in the development of our current notion of physically similar systems, due to the articulation of what a physically similar system is and how it is related to the symbolism used to express relations in physics. First, Buckingham showed that The Reduced Relation Equation of 1914 followed from the principle of the homogeneity of a physical equation. Then, he showed how the notion of physically similar systems could be developed from it.

However, since Buckingham’s name has since become attached to the so-called pi-theorem, and the full contents of his 1914 papers are often ignored, being inaccurately viewed as doing little more than presenting the pi-theorem, I want to emphasize that what has become known as the pi-theorem itself is not actually due to Buckingham. There were, in fact, many precursors who proved the same result, with varying levels of generality.

Vaschy and Bertrand
The pi-theorem is referred to in France as the Vaschy–Buckingham Pi-Theorem. In 1892, Vaschy (1857–1899) published Sur les lois de similitude en physique [18.66, 67], in which he stated the result about the number of parameters required to state a given relationship that is often attributed to Buckingham. However, unlike Buckingham, Vaschy did not mention dimensions or dimensional equations. He spoke of quantities and units, and did so as though they were the same sort of thing, though he did speak of some units
as fundamental and others as derived. More precisely, Vaschy’s theorem is [18.67]:

“Let $a_1, a_2, a_3, \ldots, a_n$ be physical quantities, of which the first p are distinct fundamental units and the last $(n - p)$ are derived from the p fundamental units (e.g., $a_1$ could be a length, $a_2$ a mass, $a_3$ a time, and the $(n - 3)$ other quantities would be forces, velocities, etc.; then $p = 3$). If between these n quantities there exists a relation $F(a_1, a_2, a_3, \ldots, a_n) = 0$, which remains the same whatever the arbitrary magnitudes of the fundamental units, this relationship can be transformed in another relationship between at most $(n - p)$ parameters, that is $f(x_1, x_2, x_3, \ldots, x_{n-p}) = 0$, the parameters $x_1, x_2, x_3, \ldots, x_{n-p}$ being monomial functions of $a_1, a_2, a_3, \ldots, a_n$.”

The parameters $x_1, x_2, x_3, \ldots, x_{n-p}$ play the same role as the dimensionless $\Pi$’s in Buckingham’s theorem. Vaschy then shows how to obtain reduced relations for the pendulum and for a telegraph cable. What is notable is that he produces a pair of ratios, not just one ratio, in each case, and he expresses the result as an unknown function of these parameters ($x_i$’s) set equal to zero. He does not use the terminology of systems, but he is interested in laws of similitude (in the sense of the similarity laws of Sect. 18.3.1) that can be derived from them, citing one by W. Thomson (Lord Kelvin) in the case of the telegraph line. The conditions of Vaschy’s theorem are not exactly the same as in Buckingham’s theorem, but Vaschy does emphasize that his reasoning does not assume any particular system of units, and he does derive the key move to the Reduced Relation Equation of 1914. The case is strong for crediting Vaschy’s paper with containing the pi-theorem.

Some have also argued that Joseph Bertrand provided an even earlier, though less general, proof of the pi-theorem in 1878, in Sur l’homogénéité dans les formules de physique [18.66, p. 209]. This is the same Joseph Bertrand (1822–1900) cited above for the much earlier 1847 work drawing attention to the principle of similitude, in which he mentioned “an infinite number of possible systems, which may be regarded as similar to” a given system, and provided a new basis for Newton’s theorem of similarity using a result by Cauchy involving the principle of virtual velocities.

These two works by Bertrand [18.28, 68] thirty years apart reflect an important late nineteenth century development that permitted using a logical principle about the equations of physics, that is, the homogeneity of equations of physics, rather than a principle of physics itself. This late nineteenth-century development was the idea of coherence as a constraint on a system of units; the idea, that is, of a coherent system of units. Coherence of a system of units, and its importance in connecting dimensional analysis and similarity, is discussed in [18.69].

Riabouchinsky
Sometime after 1914, Buckingham became aware that Dimitri Riabouchinsky (1882–1962) had also proved a mathematical theorem about the number of dimensionless parameters needed to express a given physical relation, using the methods of dimensional analysis, in 1911 [18.65]. Riabouchinsky (spelled Riabouchinski in Buckingham’s papers), was a scientist who had provided the private funding for the Aerodynamic Institute of Koutchino associated with the University of Moscow, which had a wind tunnel; hence, Riabouchinsky was, like Buckingham, faced with the problem of understanding how to interpret model experiments. After becoming aware of Riabouchinsky’s proof, Buckingham credited him prominently for the proof in his writings. In a paper in 1921, discussing the desire that had arisen for a more systematic procedure for obtaining the results that Rayleigh and others had obtained using dimensional methods, he wrote [18.70, p. 696]:

“Such a routine procedure is provided by formulating the requirement of dimensional homogeneity as a general algebraic theorem, which was first published by Riabouchinski (sic), and which will be referred to as the $\Pi$ theorem.”

Buckingham speculated that he might have seen a notice of Riabouchinski’s result in one of the Annual Reports of the British Advisory Committee on Aeronautics [18.71], and that [18.70, p. 696n]:

“Guided [. . . ] by the hint contained in this abstract, the present writer came upon substantially the same theorem, [. . . ] The theorem does not differ materially from Riabouchinski’s, except in that he confined his attention to mechanical quantities.”
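The counting claim shared by the statements of Vaschy, Riabouchinsky, and Buckingham discussed above can be checked mechanically. The sketch below is a modern linear-algebra reading, not a reconstruction of any of their proofs: it assumes each quantity is described by its exponents in mass, length, and time (an assumed choice of base kinds) and looks for exponent vectors whose product of powers is dimensionless. The names `nullspace` and `dimensionless_products`, and the pendulum and drag inputs, are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def nullspace(matrix):
    """Return a basis of the rational null space of `matrix` (a list of rows)."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0])
    pivots = []  # pivots[i] is the pivot column of reduced row i
    r = 0
    for c in range(n_cols):
        # Find a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        pivot_value = rows[r][c]
        rows[r] = [x / pivot_value for x in rows[r]]
        # Clear column c in every other row (reduced row echelon form).
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    free_columns = [c for c in range(n_cols) if c not in pivots]
    basis = []
    for free in free_columns:
        vec = [Fraction(0)] * n_cols
        vec[free] = Fraction(1)
        for row_index, pivot_col in enumerate(pivots):
            vec[pivot_col] = -rows[row_index][free]
        basis.append(vec)
    return basis


def dimensionless_products(quantities):
    """Map each independent dimensionless product to its exponents per quantity.

    `quantities` maps a name to its (mass, length, time) exponents.
    """
    names = list(quantities)
    n_base = len(next(iter(quantities.values())))
    # Rows are base kinds, columns are quantities.
    matrix = [[quantities[name][b] for name in names] for b in range(n_base)]
    return [dict(zip(names, vec)) for vec in nullspace(matrix)]


# Illustrative inputs (assumptions, not taken from the source):
PENDULUM = {"t": (0, 0, 1), "l": (0, 1, 0), "m": (1, 0, 0), "g": (0, 1, -2)}
DRAG = {"F": (1, 1, -2), "rho": (1, -3, 0), "v": (0, 1, -1),
        "l": (0, 1, 0), "mu": (1, -1, -1)}

for label, system in (("pendulum", PENDULUM), ("drag", DRAG)):
    products = dimensionless_products(system)
    print(label, "independent dimensionless products:", len(products))
```

For the pendulum quantities (period t, length l, mass m, gravitational acceleration g) the sketch finds a single product with exponents t²·l⁻¹·g, so four quantities and three base kinds leave 4 − 3 = 1 parameter. For the drag quantities it finds two products; they need not be the conventional ones, since the set of independent dimensionless products is not unique.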
teenth century.

Brian Hepburn identifies Leonhard Euler as a key eighteenth century figure linking Newton’s age and ours, and has argued that the concept of a function was crucial to the development of what we now know as Newtonian mechanics. Whereas Newton’s mechanics “dictated how motions are generated in time by forces” and “would treat of the actual process of moving bodies,” Hepburn says, for Euler, in contrast, “the central object of investigation in mechanics is the [mathematical] function” [18.12]. He points out that equilibrium relations are the most important among relations, and hence that “sets of quantities” characterized “states” – I would amend this to “states of a system.” The notion of a function allowed the concept of a system to be expressed in terms of the interrelatedness of some quantities – if one quantity changed, any of the others in the system might be affected, too. The notation of a function set to 0, that is, $f(x_1, x_2, \ldots, x_n) = 0$, can be used to express this interrelatedness. The notion of equilibrium and an equation of state, which are expressible by the functional notation, are important in this newer notion of a system; what this new notion of system eventually replaced was the notion of a system as a configuration of particles and/or bodies. The notion of a similarity law likewise progressed from simply a single ratio to express an invariant relation, to a function with multiple arguments, each of which was a dimensionless ratio.

When Bertrand invoked the principle of virtual velocities in 1847 [18.25, p. 380] to derive the principle of mechanical similitude, he was using the notion of a function, but he was still using considerations and principles of mechanics. By 1878, he could take a much more general approach, using a principle that was a constraint on the equation expressing relations between the physical quantities, rather than the system of bodies and particles itself. Independently, many others could do so, too: Vaschy in France and Riabouchinsky in Russia, and they were not the only ones. In physical chem-

by then. That Buckingham was the one to write what has become the landmark paper articulating the notion of physically similar systems, which he developed from the Reduced Relation Equation of 1914 in the Π-theorem, then, appears to be a matter of timing, at least in part: when he was suddenly asked to devote time to the question of the value of model experiments using wind tunnels, it was the early twentieth century, when the notion of a system was readily expressible by the notation for a function, when coherent systems of units in every part of physics was something that could be assumed, and someone with a doctorate in physics would have a facility with formal methods applied to equations.

Around the same time, or shortly thereafter, D’Arcy Wentworth Thompson wrote his classic work, On Growth and Form [18.72], on the mathematicization of biology. In that work, he carried the use of similitude in physics over into biology and he, too, explicitly cites Newton (for his use of similitude), as well as Galileo (for his discussion of scaling and similitude), Boltzmann, Helmholtz and numerous publications on aerial flight. A detailed discussion of D’Arcy Thompson on similitude may be found in Chap. 6 (The Physics of Miniature Worlds) of Wittgenstein Flies a Kite [18.11, pp. 117–130].

How do things stand today, in the early twenty-first century? Certainly, there are pockets in many disciplines – physics, hydrodynamics, aerodynamics, the geological and other sciences, hydrology, mechanics, biology, and more – where researchers recognize the value of thinking in terms of physically similar systems. However, it is not really a staple of the basic curriculum. Few philosophers of science understand the concept or why it is significant. This article is offered to help improve at least the latter situation.

Acknowledgments. Work on this paper was supported in part by a Visiting Fellowship at the Center for Philosophy of Science at the University of Pittsburgh in
Physically Similar Systems – A History of the Concept References 409
2010, during which my project was the history of the concept of physically similar systems. This paper also incorporates some earlier work published in Chaps. 6 (The Physics of Miniature Worlds) and 7 (Models of Wings and Models of the World) of Wittgenstein Flies a Kite: A Story of Models of Wings and Models of the World. Thanks also to Brian Hepburn and George Smith for conversations about Newton’s use of similar systems, and to Jasmin Ozel for translating parts of Forchheimer’s Hydraulik.
References
18.1 E. Buckingham: Physically similar systems, J. Wash. Acad. Sci. 93, 347–353 (1914)
18.2 E. Buckingham: On physically similar systems: Illustrations of the use of dimensional equations, Phys. Rev. 4, 345–376 (1914)
18.3 I. Newton, A. Motte, F. Cajori, R.T. Crawford: Sir Isaac Newton’s Mathematical Principles of Natural Philosophy and His System of the World (Univ. California Press, Berkeley 1946)
18.4 E. Mach: The Science of Mechanics: A Critical and Historical Account of Its Development, 6th edn. (Open Court Pub. Co, La Salle, Ill 1960), transl. by T.J. McCormack. New introduction by K. Menger with revisions through the ninth German edition
18.5 S.D. Zwart: Scale modelling in engineering: Froude’s case. In: Philosophy of Technology and Engineering Sciences, Vol. 9, ed. by A.W.M. Meijers (North Holland/Elsevier, Amsterdam 2009) pp. 759–798
18.6 T.E. Stanton, J.R. Pannell: Similarity of motion in relation to the surface friction of fluids, Philos. Trans. R. Soc. A 214, 199–224 (1914)
18.7 A.F. Zahm: Theories of Flow Similitude, Report No. 287 (National Advisory Committee for Aeronautics, Washington DC 1928)
18.8 M.L. Walter: Science and Cultural Crisis: An intellectual biography of Percy Williams Bridgman (1882–1961) (Stanford Univ. Press, Stanford 1990)
18.9 E.T. Layton: Escape from the Jail of Shape: Dimensionality and engineering science. In: Technological Development and Science in the Industrial Age: New Perspectives on the Science-Technology Relationship, ed. by P. Kroes, M. Bakker (Kluwer, Dordrecht, Boston 1992)
18.10 Galilei Galileo, S. Drake (Transl.): Two New Sciences: Including Centers of Gravity and Force of Percussion, 2nd edn. (Wall Emerson, Toronto 2000)
18.11 S.G. Sterrett: Wittgenstein Flies a Kite: A Story of Models of Wings and Models of the World (Pi Press/Penguin, New York 2006)
18.12 B.S. Hepburn: Equilibrium and Explanation in 18th Century Mechanics, Ph.D. Thesis (University of Pittsburgh, Pittsburgh 2007)
18.13 J. Thomson: Comparison of similar structures as to elasticity, strength, and stability, Trans. Inst. Eng. Shipbuild. Scotl. 54, 361 (1875)
18.14 A. Barr: Comparisons of similar structures and machines, Trans. Inst. Eng. Shipbuild. Scotl. 42, 322–360 (1899)
18.15 R.P. Torrance: Use of models in engineering design, Engineering News, 18 December (1913)
18.16 J. Thomson: Comparison of similar structures as to elasticity, strength and stability. In: Collected Papers in Physics and Engineering, ed. by J. Larmor, J. C. Thomson (Cambridge Univ. Press, Cambridge 1912) pp. 361–372
18.17 A.H. Gibson: Hydraulics and its Applications (D. Van Nostrand Company, New York 1908)
18.18 O. Darrigol: Worlds of Flow: A History of Hydrodynamics from the Bernoullis to Prandtl (Oxford Univ. Press, Oxford, New York 2005)
18.19 R.H.M. Robinson: Experimental model basin – I, Sci. Am. Suppl. 66, 37–38 (1908)
18.20 G. Hagler: Modeling Ships and Space Craft: The Science and Art of Mastering the Oceans and Sky (Springer, New York 2013)
18.21 W. Froude: On the Rolling of Ships (Parker, Son, and Bourn, London 1862)
18.22 S. Schaffer: Fish and Ships: Models in the age of reason. In: Models: The Third Dimension of Science, ed. by S. de Chadarevian, N. Hopwood (Stanford Univ. Press, Stanford 2004)
18.23 W. Froude: On Experiments with HMS Greyhound, Trans. R. Inst. Nav. Archit. 15, 36–73 (1874)
18.24 W. Denny: Mr Mansel’s and the late Mr Froude’s Methods of analysing the results of progressive speed trials, Trans. Inst. Eng. Shipbuild. Scotl. 28, 1–8 (1885)
18.25 F. Cajori: A History of Mathematics (Macmillan, New York 1894)
18.26 R. Ettema: Hydraulic Modeling Concepts and Practice (American Society of Civil Engineers, Reston 2000)
18.27 F. Reech: Cours de Mécanique d’Après la Nature Généralement Flexible et Élastique des Corps (Carilian-Gœury et Vor. Dalmont, Paris 1852), in French
18.28 M.J. Bertrand: On the relative proportions of machinery, considered with regard to their powers of working, Newton’s Lond. J. Arts Sci. 31, 129–131 (1847)
18.29 H.A. Lorentz: The theory of radiation and the second law of thermodynamics, KNAW Proceedings, Vol. 3 (Huygens Institute, Royal Netherlands Academy of Arts and Sciences, Amsterdam 1901) pp. 436–450
18.30 The Nobel Prize in Physics 1910, Nobelprize.org, http://www.nobelprize.org/nobel_prizes/physics/laureates/1910/
18.31 The Nobel Prize in Physics 1913, Nobelprize.org, http://www.nobelprize.org/nobel_prizes/physics/laureates/1913/
18.32 H.K. Onnes: Investigations into the Properties of Substances at Low Temperatures, Which have led, Amongst Other Things, to the Preparation of Liquid Helium, Nobel Lectures, Physics 1901–1921 (Elsevier Publishing Company, Amsterdam 1967)
18.33 J.M.H. Levelt Sengers: How Fluids Unmix: Discoveries by the School of Van der Waals and Kamerlingh Onnes (Koninklijke Nederlandse Akademie van Wetenschappen, Amsterdam 2002)
18.34 J. Wisniak: Heike Kamerlingh – The virial equation of state, Indian J. Chem. Technol. 10, 564–572 (2003)
18.35 J. Mehra, H. Rechenberg: The Historical Development of Quantum Theory, Vol. 5 (Springer, New York 1987)
18.36 S.G. Brush: Ludwig Boltzmann and the Foundations of Natural Science. In: Ludwig Boltzmann (1844–1906): Zum Hundertsten Todestag, ed. by I.M. Fasol-Boltzmann, G.L. Fasol (Springer, Vienna 2006)
18.37 J.C. Maxwell: On Boltzmann’s Theorem on the average distribution of energy in a system of material points. In: The Scientific Papers of James Clerk Maxwell, Vol. 2, ed. by W.D. Niven (Cambridge Univ. Press, Cambridge 1890)
18.38 L. Boltzmann: Model. In: Encyclopedia Britannica, Vol. 30, 10th edn., ed. by EDITOR (“The Times” Printing House, London 1902) pp. 788–791
18.39 L. Boltzmann: On aeronautics. In: Wittgenstein Flies a Kite: A Story of Models of Wings and Models of the World, ed. by S.G. Sterrett (Pi Press/Penguin, New York 2005/6) Transl. I. Pollman, M. Mertens
18.40 C. Abbe: Mechanics of the Earth’s Atmosphere: A Collection of Translations, Smithsonian Miscellaneous Collections Ser., Vol. 843 (The Smithsonian Institution, Washington DC 1891) original in German: H. von Helmholtz
18.41 H. von Helmholtz: On a theorem relative to movements that are geometrically similar in fluid bodies, together with an application to the problem of steering balloons. In: Mechanics of the Earth’s Atmosphere: A Collection of Translations, by Cleveland Abbe. Smithsonian Miscellaneous Collections, Vol. 843 (The Smithsonian Institution, Washington DC 1891) pp. 67–77
18.42 H. von Helmholtz: Über ein Theorem, geometrisch ähnliche Bewegungen flüssiger Körper betreffend, nebst Anwendung auf das Problem, Luftballons zu lenken, Monatsber. Kgl. Preuß. Akad. Wiss. 1873, 501–514 (1873)
18.43 H. von Helmholtz: On discontinuous motions in liquids. In: Mechanics of the Earth’s Atmosphere: A Collection of Translations, Smithsonian Miscellaneous Collections, Vol. 843, ed. by C. Abbe (The Smithsonian Institution, Washington DC 1891) pp. 58–66
18.44 H. von Helmholtz: Über discontinuirliche Flüssigkeits-Bewegungen, Monatsber. Kgl. Preuß. Akad. Wiss. 1868, 215–228 (1868)
18.45 G.G. Stokes: On the effect of the internal friction of fluids on the motion of pendulums, Trans. Camb. Philos. Soc. 9, 8 (1850)
18.46 D. Cahan: Helmholtz and the British Scientific Elite: From force conservation to energy conservation, Notes Rec. R. Soc. 66, 55–68 (2011)
18.47 D.M. McDowell, J.D. Jackson: Osborne Reynolds and Engineering Science Today (Manchester Univ. Press, Manchester 1970)
18.48 O. Reynolds: Letter to George Stokes, April 25, 1883. In: Memoir and Scientific Correspondence of the Late Sir George Gabriel Stokes, Bart, Vol. 1 (Cambridge Univ. Press, Cambridge 1907) p. 233, Selected and arranged by Joseph Larmor
18.49 G.G. Stokes, J. Larmor (Eds.): Memoir and Scientific Correspondence of the Late Sir George Gabriel Stokes, Bart, Vol. 1 (Cambridge Univ. Press, Cambridge 1907), Selected and arranged by Joseph Larmor
18.50 L. Prandtl: Motion of fluids with very little viscosity, English transl. In: Early Developments of Modern Aerodynamics, ed. by J.A.K. Ackroyd, B.P. Axcell, A.I. Ruban (Butterworth-Heinemann, Oxford 2001)
18.51 J.W. Strutt (Baron Rayleigh): Note as to the application of the principle of dynamical similarity. In: Report of the Advisory Committee for Aeronautics 1909–1910 (London 1910)
18.52 Report of the Advisory Committee for Aeronautics 1910–1911 (British Advisory Committee for Aeronautics, London 1911)
18.53 Report of the Advisory Committee for Aeronautics 1911–1912 (British Advisory Committee for Aeronautics, London 1912)
18.54 Report of the Advisory Committee for Aeronautics 1912–1913 (British Advisory Committee for Aeronautics, London 1913)
18.55 J.W. Strutt (Baron Rayleigh): Fluid motions, lecture delivered at the Royal Institute, March 20, Engineering 97, 442–443 (1914)
18.56 O. Reynolds: An experimental investigation of the circumstances which determine whether the motion of water shall be direct or sinuous and the law of resistance in parallel channels, Philos. Trans. R. Soc. 174, 935–982 (1883)
18.57 J.D. van der Waals: Coefficients of expansion and compression in corresponding states, Amst. Ak. Vh. 20, 32–43 (1880)
18.58 R.C. Tolman: The principle of similitude, Phys. Rev. 3, 244 (1914)
18.59 E. Buckingham: The interpretation of experiments on models, J. Wash. Acad. Sci. 93, 336 (1914)
18.60 P. Forchheimer: Hydraulik (Teubner, Leipzig, Berlin 1914), in German
18.61 J.R. Chambers: Cave of the Winds: The Remarkable History of the Langley Full-Scale Wind Tunnel (NASA, Washington 2014)
18.62 E. Buckingham: Letter to Lord Rayleigh (John William Strutt) dated November 13, 1915, handwritten on official National Bureau of Standards stationery
18.63 E. Buckingham: An Outline of a Theory of Thermodynamics (Macmillan, New York 1900)
18.64 J.G. Kirkwood, O.R. Wulf, P.S. Epstein: Richard Chace Tolman: Biographical Memoir (National Academy of Sciences, Washington DC 1952)
19. Hypothetical Models in Social Science
that attempt to isolate, theoretically, the working of causal mechanisms or capacities from disturbing factors. However, unlike experiments, hypothetical models need to deal with the epistemic uncertainty due to the inevitable presence of unrealistic assumptions introduced for purposes of analytical tractability. Computer simulations have been claimed to be able to overcome some of the strictures of analytical tractability. Still they differ from hypothetical models in how they derive conclusions and in the kind of understanding they provide.

The inevitable presence of unrealistic assumptions makes the legitimacy of the use of hypothetical modeling to learn about the world a particularly pressing problem in the social sciences. A review of the contemporary philosophical debate shows that there is still little agreement on what social scientific models are and what they are for. This suggests that there might not be a single answer to the question of what is the epistemic value of hypothetical models in the social sciences.

19.4.2 Isolation of Causal Mechanisms or Capacities  424
19.4.3 Learning About Possibilities  425
19.4.4 Inferential Aids  426
19.4.5 Models as Blueprints for the Design of Socio-Economic Mechanisms  427
19.4.6 Where Do We Go From Here?  428
19.5 Conclusions  428
19.A Appendix: J.H. von Thünen’s Model of Agricultural Land Use in the Isolated State  429
19.B Appendix: T. Schelling’s Agent-Based Model of Segregation in Metropolitan Areas  430
References  431
enth: The laboratory style, which lies between methods (2) and (3) in that it relies on “built apparatus” to produce phenomena about which hypothetical models may be true or false [19.2]. Each style of reasoning introduces new kinds of objects, new evidence and new ways of being a candidate for truth. For our purposes the relevant features of a style of reasoning are its stability across disciplinary contexts and its autonomy, in other words, once a style of reasoning becomes established, it determines its own criteria of what counts as good reasoning [19.2].

Hypothetical modeling refers to that scientific strategy in which the known properties of an artifact are put to use in order to elucidate the unknown properties of a natural phenomenon [19.1, p. 1087]. Mary Morgan [19.3] deploys the concept of style of reasoning to characterize the practice of theoretical modeling in economics and traces the history of how modeling gradually became the prevalent style of reasoning in economics, achieving its present features around the second half of the last century. It is these features that Morgan’s work, and ours too, endeavors to describe. Here, however, we are concerned with hypothetical modeling as it takes place in the social sciences at large. By treating this method as one style, we seek to highlight its distinctive characteristics, which cut across disciplines.

Unlike in economics, in other social sciences such as sociology and political science, until a few decades ago the use of hypothetical models was limited to relatively narrow areas of inquiry. Well known, contentious attempts at introducing economic-style, rational-choice models in sociology and political science sparked accusations of economics imperialism [19.4]. More generally, formal modeling was often perceived in opposition to the qualitative leanings of many social scientists. Critics complained that mathematical models could not capture the complexity of social and economic phenomena, which are often hard to quantify and measure and do not obey the kinds of exceptionless laws that were believed to characterize the natural sciences. Disciplinary resistance to the method of hypothetical modeling, however, is not at odds with the stability characteristic of styles of reasoning: It is the deployment of a style in a new field and domain of inquiry that is contested, but its features, those that make it a style, might remain untouched. Moreover, the critical attitude toward the method of hypothetical modeling is now changing, at least in some social sciences. For instance, Clarke and Primo [19.5, p. 1] claim that, “[m]odels have come to be the dominant feature of modern political science and can be found in every corner of the field”. Edling [19.6, p. 197], writes that “since mathematical sociology was firmly established in the 1960s, it has grown tremendously”.

The transfer of models and modeling techniques across disciplinary boundaries is contributing to the establishment of shared modeling standards. Recent fields such as network theory and agent-based modeling are united by common modeling tools rather than by a set of principles or subject matter. These tools are then being modified and adapted in house, as it were, to satisfy the specific epistemic and nonepistemic needs of each field. For example, the use of network theory in sociology looks rather different from the use of network theory in economics and this is in part due to their being embedded in different disciplinary cultures [19.7, 8]. Thus, it is possible to talk of field-specific modeling practices to emphasize what is distinctive about a specific discipline and to talk of style of reasoning to underline the distinctiveness of model-based reasoning vis-à-vis other scientific styles of reasoning, such as the laboratory style. Which aspect is emphasized depends on the purposes of one’s enquiry.

Here we are interested in the commonalities: Treating hypothetical modeling as a style of reasoning encourages us to look at its characteristic features vis-à-vis other styles of reasoning employed in social science. Philosophers of science sometimes talk of model-based social science, a label that captures the same kind of scientific activity. Here is how Peter Godfrey-Smith characterizes it [19.9, p. 726]:

“What is most distinctive of model-based science is a strategy of indirect representation of the world [. . . ] The modeler’s strategy is to gain understanding of a complex real-world system via an understanding of a simpler, hypothetical system that resembles it in relevant respects.”

The terms hypothetical modeling and model-based science both refer to the scientific activity of understanding phenomena by building hypothetical systems, which at once are much simpler than the phenomenon under investigation and hopefully resemble it in some respect. The modeler studies these simpler, hypothetical systems in order to gain insights into the more complex phenomena they represent. These hypothetical systems can be of different kinds: They can be concrete objects such as the scale models of engineers, or they can be the set of mathematical equations very familiar to both physicists and economists. In this chapter our main focus is on models that are abstract in the sense that they do not exist as physical objects to be manipulated by the modeler. They are theoretical rather than empirical models. Empirical models are built for testing and measuring relationships between variables and are based on empirical data; they do not describe hypothetical systems. In this sense they are better thought of as belonging to the statistical style of reasoning.
Hypothetical Models in Social Science 19.1 Hypothetical Modeling as a Style of Reasoning 415
The attribute theoretical should not be taken to suggest that theoretical models are always instantiations of general theories. Theoretical principles might be only one of the several ingredients that go into the construction of theoretical models [19.10]. Ideally, theoretical (or hypothetical) modeling and empirical modeling are tightly connected. The results of theoretical models are translated into empirical models and thereby subject to testing. In many cases, however, the evaluation of theoretical models proceeds without the results being directly confronted with data. This is how the historian and methodologist of economics Roger Backhouse explains the relationship between theoretical and empirical models [19.11, p. 138]:

“Empirical work starts with a set of economic relationships, formulated in such a way that they can be confronted with data using formal, statistical techniques. These relationships may be the theoretical results [. . . ], but typically they will be different. The reason for this is the requirement that they can be confronted with data: They must refer to variables on which statistical data exist, or for which proxies can be found; functional forms must be precisely specified and amenable to statistical implementation.”

Backhouse’s account is about economics, but it can be generalized to other social scientific contexts in which theoretical results cannot be directly confronted with empirical data. In such cases, theoretical and empirical modeling might be only loosely connected with one another, for example, by way of the theoretical results informing empirical modeling and vice versa, as depicted in Fig. 19.1.

[Figure residue, likely Fig. 19.1; caption not recoverable. Surviving labels: Theoretical modelling; Assumptions; Mathematical, logical, computational techniques; Theoretical results; Evaluation.]

But which theoretical results are taken seriously enough to inform further empirical investigations? The evaluation of hypothetical models often relies on other criteria such as credibility, insightfulness, explanatoriness or other modeling desiderata. This is one of the aspects that make hypothetical modeling partly autonomous from other styles of reasoning: The evaluation of hypothetical models and their results is based on criteria largely internal to the style, criteria that have developed together with the stabilization of the style.

In contemporary social science the diffusion of hypothetical modeling to tackle social scientific questions is taking place in parallel with an increasing reliance on computer simulations as well as on laboratory and field experimentation. Recall that Hacking takes the laboratory style to lie between the use of experiments and hypothetical models. The laboratory style differs from experimentation simpliciter in that it creates artificial environments about which the hypothetical models can be true or false. In the social sciences where laboratory experiments were believed to be virtually impossible, modeling was considered to be an attempt – to some fruitful, to others idle – to achieve theoretically what it is impossible to implement in the laboratory; even now, when both laboratory and field experimentation have become well-established practices in many social sciences, many questions of interest still cannot be studied experimentally. It is perhaps not surprising that models and experiments have been compared in order to understand their characteristic features as well as their common characteristics.

The discussion in Sect. 19.2 addresses the question of whether models are relevantly similar to experiments and, if not, where the deep differences lie. In Sect. 19.3, we examine the issue of whether computer simulations belong to the style of hypothetical modeling. In general terms, the question is whether or not computer simulation and analytical modeling are different ways of studying hypothetical systems. These comparisons allow us to highlight some of the main features of hypothetical modeling in the social sciences. Finally, we discuss the legitimacy of hypothetical modeling as a way of learning about social scientific phenomena. The debate reconstructed in Sect. 19.4 attempts to understand how hypothetical modeling deals with the specificities of the social sciences and to examine the conditions of its legitimacy. As will become clear, there is little agreement on the nature and function of theoretical models in social science. Perhaps this is not so much a sign of slow progress in understanding models as it is an indication that a substantive as well as a general account of the nature and function of models is not possible [19.12].

Most of the philosophical literature that we consider throughout the chapter is about economic modeling. This is because, in the social sciences, economic modeling is where hypothetical modeling has been most prevalent and hence has received the most philosophical attention. However, many of the insights from this literature apply across the social sciences. Not only because rational-choice models are widely employed in social sciences other than economics such as sociology and political science, but also because the indirect representation of phenomena through idealized models crosses disciplinary boundaries. If and when a given issue is peculiar to economics and cannot be generalized, it will be pointed out.
tion has for long been considered one of the features that sets the social sciences apart from the natural sciences. It was widely believed, among both scientists and philosophers, that experiments were the exclusive purview of the natural sciences: Many if not all questions of interest in the social sciences were thought not to be amenable to experimental investigation, owing to the difficulty of designing experimental settings capable of reproducing and confining the phenomena of interest. The broad scale of many social phenomena and the inevitable presence of disturbing factors of very different kinds (e.g., history, cultural background, value judgments, etc.) were seen as insurmountable obstacles to controlled experimentation. Obvious ethical issues also contributed to limit the range of feasible scientific experiments and continue to do so. Since the second half of the last century, however, the use of experimentation in social science has grown remarkably, both in the laboratory and in the field, thanks to technological and methodological developments, which allow control of many of the disturbing factors that were previously considered impossible to control for ([19.13], see also [19.14] for economics and [19.15] for political science).

The method of hypothetical modeling has been seen as an alternative to experimentation when experimentation is difficult or unfeasible [19.16, 17]. Both models and experiments are interpreted as devices for surrogate reasoning, which are examined to draw inferences about the target phenomena they aim to represent [19.18–20]. In the philosophical literature models and experiments have been compared in terms of the functions they play in scientific inquiry and in terms of their use for drawing inferences about the world. The

are at the center of this comparison, which deals with functional, methodological and epistemic aspects of the two styles of reasoning.

Models and experiments can be seen as playing a similar function, i.e., that of isolating the phenomenon of interest from the interference of disturbing factors [19.16–19, 21, 22]. This analogy concerns the ideal model and the ideal experiment – or, as Cartwright calls it, the “Galilean experiment” – in which only the factor of interest is allowed to vary, leaving everything else constant. Ideally, experiments proceed by removing or controlling all potentially disturbing factors so as to allow scientists to create the conditions under which causal relations can be observed in isolation. An influential account of models holds that, like experiments, models aim at studying specific aspects of the phenomenon of interest (such as causal relations, capacities or mechanisms) in isolation from the interference of disturbing factors [19.3, 16, 17, 21, 23, 24]. On this account, the isolative function of the ideal model is akin to that of the ideal experiment. To emphasize this similarity, it has been suggested that models are theoretical experiments aimed at creating, theoretically, the kind of controlled conditions typical of the laboratory [19.18].

As an illustrative case, consider von Thünen’s well-known model of localization, which is described in the Appendix (Sect. 19.A). The model could be interpreted as the result of a process of isolation that zooms in on the relationship between spatial distance and land use. This is done by means of assumptions that neutralize the effect of other factors by assuming them to be constant, absent or negligible [19.23]. In the model the pattern of agricultural production around the city
Hypothetical Models in Social Science 19.2 Models Versus Experiments: Representation, Isolation and Resemblance 417
depends exclusively on transportation costs, because all other factors that can influence land use, such as the presence of rivers or mountains, the different fertility of the soil and the presence of transport routes, are assumed away. Most of the assumptions listed in Sect. 19.A can be seen as having this function.

The investigations of this analogy by Morgan [19.19, 22] and Mäki [19.18] emphasize that the way in which models and experiments fulfill their isolating function is different. Experimenters control for disturbing factors by designing the experimental set-up so that disturbing factors are prevented from interfering with the mechanism of interest; this can be done both by physical interventions in the experimental environment and by choices regarding the procedure [19.22]. A social scientific experiment, for instance, might take place in a laboratory in which the subjects interact with the experimenter only via computer terminals, so as to minimize the possibility that the experimenter’s expectations influence the subjects’ behavior. Alternatively, in order to control for the effect on subjects’ choices of the language employed to describe a number of alternatives, the experimenter might design the procedure so that the subjects, instead of all being presented with exactly the same description of the alternatives, are randomly assigned different descriptions, thereby ensuring that language will not have a systematic effect on aggregate choice behavior [19.15].

Hypothetical models try to achieve the same kind of control by means of theoretical assumptions, as in the case of von Thünen’s model. The isolation achieved by assuming away all disturbing factors is only a theoretical exploration that does not provide a definitive answer to the question of what would happen in the real world in such a situation. It is open to debate what this difference amounts to and what its consequences are, in particular when it comes to using this style of reasoning for drawing inferences about the world. For Morgan, as we will see shortly, this dissimilarity is grounded in the different materials of which models and experiments are made, and this, in turn, has major epistemic consequences. In Mäki’s account, however, the consequences of this dissimilarity are limited to the degree and strength of isolation that the two styles are able to provide and do not result in major differences concerning their use for making inferences about the world.

For Mäki, models are able to display a higher degree of control than experiments can do [19.18]. This is because experiments can isolate only to the extent to which it is practically feasible, and hence some interferences are left uncontrolled for or only weakly controlled for. Models, instead, can provide tighter isolation, because they are not subjected to these practical constraints and can simply assume all interferences away. One implication of this is that the use of experiments to test models is bound to be imprecise.

In Mäki’s account, however, the difference in the degree of control between models and experiments does not compromise the identification of further analogies. In particular, Mäki claims that “many theoretical models are (thought) experiments, and many ordinary experiments are (material) models”, because both fall under the general concept of scientific representation [19.18, p. 303]. In Mäki’s interpretation, something qualifies as a representation if (a) it is used as a representative of its target and (b) it resembles the target in some respects and to certain degrees. Hypothetical models, as we characterize them, are substitute systems that are examined in order to gather information about what they represent. Similarly, experiments are pieces of the world created in the laboratory that can be seen as material models, which again are not examined for their own sake, but rather for their value in gathering information about the world outside the laboratory. Models and experiments qua substitute representations face the question of whether it is legitimate to draw inferences from the representational tool to the real world. This can be thought of as an instance of the general problem of extrapolation, which concerns the generalization of the results obtained in a model or in an experimental situation to the world. For Mäki, in both cases the key to these problems rests on whether the world created in the laboratory or in the model resembles reality in the relevant respects and to a sufficient degree. In our example from von Thünen’s work, the problem amounts to establishing whether the model is relevantly similar to reality so as to justify the claim that the distance from the market actually affects the distribution of economic activities as described in the model. In a laboratory experiment on individual choice behavior, the question arises as to whether the kind of task subjects are asked to perform and the artificial environment of their interaction are relevantly similar to situations that occur in the wild so as to warrant inferences from the experiment to the world. In Mäki’s account, since models and experiments raise similar issues of resemblance, the ability to draw inferences about the world does not hinge on their respective features. Although models can sometimes be made closer to reality, the answer to the problem of extrapolation ultimately depends on what it means for a model to resemble a real situation, and on the ability to identify the relevant aspects and the sufficient degrees of resemblance [19.25]. Therefore, a margin of ambiguity in the notion of sufficient relevant resemblance remains.

Morgan takes a different stance on the problem of extrapolation [19.19, 22]. She emphasizes that models and experiments are similar in the way in which they are
418 Part D Model-Based Reasoning in Science and the History of Science
used and manipulated, but argues that the distinct features that characterize these instruments provide them with different epistemic powers for investigating the world. Models and experiments are manipulated by introducing controlled variation of some of their aspects in order to check whether the results are affected by this change. These manipulations allow scientists to investigate the consequences of variations in the initial conditions on the result and/or the conditions under which a particular result of interest can be obtained. Von Thünen, for instance, after having illustrated his model as described in Sect. 19.A, continued his investigation by introducing the presence of a river and a smaller neighboring town, thereby altering two of the initial assumptions. Changing the model in this way enables investigation of how the newly introduced assumptions affect the pattern of land use in comparison to the initial scenario: The river facilitates the transportation of goods to the market and thus distorts the concentric pattern, whereas the presence of a smaller town generates its own concentric regions of land use. In Morgan’s terms this can be described as a case of “experiment on models” [19.22]. The manipulation of models can be interpreted as a kind of experiment in which the scientists “interrogate” the model by modifying some of the initial assumptions. These questions can be motivated by theoretical issues, by the aim to explain or predict real-world situations or by the policy agenda the model is meant to guide. For instance, von Thünen’s manipulations of the basic model can be seen as being prompted by questions such as “what happens to the pattern of concentric rings if there is a river and/or a neighboring town?”

Yet when it comes to connecting the answers obtained in this process to the real world, models and experiments are substantially different for Morgan. She argues that experiments have greater epistemic power than models, because the experimental inferential leap is smaller than in the case of models. This dissimilarity is based on the different materials of which each is made: Experiments are concrete investigations, which deal directly with the world they are meant to study, whereas models are abstract and idealized. Economic experiments, for instance, deal directly with real people’s behavior, however constrained the behavior is by the experimental design. By contrast, models are abstract entities, which are made of different stuff than the reality they represent. For Morgan, the “materiality” of experiments can make inferences from experimental results to the world both easy and strong, because they are grounded on the material uniformity between the system on which the manipulation is conducted and the world about which the inference is made [19.19]. More precisely, Morgan maintains that, insofar as experiments share the same ontology with their target (which is not obvious), we are more justified in claiming to learn something about the world from experiments than from models. Inference from the model to the world is much more difficult because the materials are not the same as the world’s.

Moreover, because of their materiality, experiments can create new phenomena that might be different or even contrary to theoretical expectations. When an unexpected experimental result is sufficiently stable across replications and manipulations of the experimental design, it qualifies as a new phenomenon, which requires theoretical explanation. For instance, behavioral regularities robustly observed in several economic experiments, such as co-operation in prisoner’s dilemma games, can be thought of as constituting new phenomena with respect to the expectations of rational choice theory. Such experimental phenomena have now become the target of sustained theoretical efforts to account for them. Van Fraassen makes a similar point, arguing that scientific instruments can be viewed not only as windows that allow us to see what happens in the world, but also as machines that create new genuine phenomena that would not occur in the wild and that theory needs to explain [19.26].

In Morgan’s account, the creation of new experimental phenomena is only possible in the laboratory and does not belong to the style of hypothetical modeling [19.19]. Only real flesh and blood experimental subjects have the freedom to behave in other ways than expected. Experiments must allow a certain degree of freedom because, if the subjects’ behavior was fully determined, then the experiment would have no genuine potential to confirm or refute a theory. Therefore, according to Morgan, experiments have the potential to surprise and confound theory: They can illuminate unexpected or hidden consequences of a theory, but their results can also be in conflict with theoretical expectations. On the other hand, models cannot confound because the behavior of agents is pre-determined by the modeler’s assumptions [19.21]. In other words, the agents in the model lack the “potential for independent action”, which is what confers greater epistemic power on experiments [19.19].

Morgan’s ideas, however plausible, can be challenged. Parker rightly observes that it is not always the case that inferences are better justified when the representative tool and the target are made of uniform materials [19.27]. What is crucial is not the material, but the presence of relevant similarities, which can be material but also formal (in this respect, Parker’s point is similar to Mäki’s). The justification of inferences about the world depends on having good reasons to think that the relevant similarities are in place. Having experiment and target made of the same material does not guarantee that all relevant similarities are in place, because it is possible that “same-stuff representations” fail to be relevantly similar to their target systems. Nevertheless, Parker agrees with Morgan that the material uniformity between experiments and the world can provide some epistemic advantage: Experimental and target systems that are made of the same stuff will often be similar in many relevant respects. In other words, being made of the same material does not guarantee that relevant similarities are in place, but it does make it more likely. As a consequence, inferences from experiments are more likely to be reliable than inferences from models.

A further challenge that models have to face arises from the fact that not all assumptions in a model can be thought to function as isolations. Assumptions introduced to make it possible, or easier, to handle the model analytically might impose constraints that are too tight to allow relevant similarities between the models and the represented aspects of the world. This is a point Cartwright makes specifically with regard to economic models: Although many economic models aim at mimicking Galilean experiments, they also include a number of assumptions that do not play the role of isolation, but are rather introduced for the purpose of mathematical tractability [19.21, 28]. If the models’ conclusions are due to assumptions that are completely unrelated to the world, then it is not clear what the model can tell us about the world.

Derivational robustness analysis has been proposed as a remedy for the problem of over-constraining assumptions [19.29, 30]. For Kuorikoski et al., derivational robustness analysis refers to the collective practice of building similar models of the same phenomenon that differ only in a few assumptions. The analysis of these groups of similar models can help to identify which assumptions are necessary for deriving a certain result: Results that are robust across a number of models are dependent on the shared, rather than on the differing, assumptions. Now if the over-constraining assumptions (or more generally the assumptions known to be unrealistic) are found to be unnecessary for deriving the result of interest, then there are reasons to think that this result is based primarily on the shared assumptions, which are hoped to be realistic. Hence, according to Kuorikoski et al., even though robustness analysis is not a procedure of empirical confirmation, it can increase modelers’ confidence about their inferences from hypothetical models. Note that although Kuorikoski et al. are mainly concerned with analytical models, robustness analysis has also been claimed to be a crucial strategy in correcting for various sources of error that might affect the results obtained by computer simulations, as we will see in Sect. 19.3.

Odenbaugh and Alexandrova raise the valid objection that, although in principle robustness analysis might work, in practice it does not provide a defense of hypothetical models with over-constraining assumptions [19.31]. If, for example, in economic modeling, some of the core rational-choice axioms are never modified to check their effects on the modeling results and if these axioms are in fact wildly unrealistic, then robustness analysis turns out to be of limited use. Hence, although in principle hypothetical modeling might be aimed at isolating a mechanism of interest, there remains the problem that many assumptions do not have this isolating function. This situation can jeopardize the resemblance between the models and the represented target, which would warrant the inferences from the model to the world. We will return to these issues in Sect. 19.4.

In conclusion, both models and experiments can be considered as representations that are examined in order to draw inferences about what is represented. They differ in the kinds of representation involved: Models are abstract indirect representations, whereas experiments are concrete direct representations made of the same material as the target. It has been claimed that this has implications both for the kind and the degree of control that these devices are able to provide and for the way in which conclusions are drawn from them. On the one hand, theoretical models enable the investigation of phenomena that are difficult or impossible to reproduce in the laboratory; yet on the other hand, only experiments seem to have the genuine potential to bring to light new phenomena that require theoretical explanation (Table 19.1).
Table 19.1 Models, experiments, simulations: A comparative perspective (after [19.22, p. 49])

| | Ideal model | Ideal laboratory experiment | Ideal simulation |
|---|---|---|---|
| Kind of representation | Indirect: Different material | Direct: Same material | Indirect: Different material |
| Isolation and control | Assumed theoretical isolation | Experimental material isolation | Assumed theoretical isolation |
| Advantages | Theoretical exploration in which experiments are difficult or unfeasible | Discovery of phenomena for science to explain | Representation of complex and/or dynamic problems and other problems that are not solvable analytically |
| Challenges | Tractability | Material and ethical constraints | Transparency |
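
As a rough illustration of what Morgan calls an “experiment on models”, the following Python sketch varies one assumption of a von Thünen-style model and checks how the land-use pattern responds. It is not taken from Sect. 19.A: the location-rent formula is the common textbook formalization (yield times the margin left after production and freight costs), the two crops and all numbers are hypothetical, and the river is represented only as a factor that halves freight costs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Crop:
    name: str
    yield_per_ha: float  # units of product per hectare (hypothetical)
    price: float         # market price per unit (hypothetical)
    cost: float          # production cost per unit (hypothetical)
    freight: float       # transport cost per unit per km (hypothetical)


def location_rent(crop, distance_km, freight_factor=1.0):
    """Textbook location rent: yield * (price - cost - freight * distance)."""
    margin = crop.price - crop.cost - crop.freight * freight_factor * distance_km
    return crop.yield_per_ha * margin


def land_use(crops, distance_km, freight_factor=1.0):
    """Assign land to the crop with the highest rent, or leave it unused."""
    best = max(crops, key=lambda c: location_rent(c, distance_km, freight_factor))
    if location_rent(best, distance_km, freight_factor) <= 0:
        return "unused"
    return best.name


def ring_pattern(crops, max_km, freight_factor=1.0):
    """Land use at each whole kilometre from the market, from 0 to max_km."""
    return [land_use(crops, d, freight_factor) for d in range(max_km + 1)]


# Hypothetical crops: vegetables earn more per hectare but are costly to ship.
CROPS = [
    Crop("vegetables", yield_per_ha=10, price=10, cost=4, freight=0.5),
    Crop("grain", yield_per_ha=4, price=8, cost=3, freight=0.1),
]

# Baseline scenario: only distance from the market matters.
baseline = ring_pattern(CROPS, 40)

# Manipulated scenario: a river is assumed to halve freight costs.
along_river = ring_pattern(CROPS, 40, freight_factor=0.5)

print("vegetable ring, baseline (km):", baseline.count("vegetables"))
print("vegetable ring, along river (km):", along_river.count("vegetables"))
```

Under these assumed numbers the inner ring stretches further out along the river, which mirrors the qualitative claim that cheaper transport distorts the concentric pattern. What the sketch cannot settle is the extrapolation question discussed above, namely whether the assumed rent function resembles real land markets in the relevant respects.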
placing analytical models. In the literature some authors have emphasized the elements of continuity between the two methods; others have highlighted the differences. For instance, according to Guala [19.20] and Morgan [19.22], models and simulations are akin to each other in the way they are used to learn about the world and for the functions they fulfill. Others have argued that computer simulations open up novel methodological questions that did not arise in dealing with analytical models [19.34]. Below, we will explore both the similarities and the differences between these methods, with special focus on the features of computer simulation that have been debated in relation to their adoption in economics and the social sciences.

To see how models and simulations connect, consider how computer simulation originally entered the field of the social sciences. One pioneer in the study of social phenomena with the aid of the computer has been the political scientist Robert Axelrod. In 1980, Axelrod launched a competition between experts in game theory from different fields [19.35, 36]. The challenge was to come up with a strategy for an iterated prisoner’s dilemma game to be played in a computer tournament. Axelrod paired strategies – fifteen in all – and had the participants play for two hundred rounds in an all-play-all tournament. At the end of the tournament, the winning strategy turned out to be one of the simplest and most ancient strategies of human co-operation, tit for tat. The strategy is to co-operate in the first round of the game and then replicate the opponent’s moves, i.e., to co-operate in case of co-operation and defect as soon as the other player defects. The strategy is successful insofar as it reaps the benefits of co-operation and does

strategic interactions rather than on the quality of the strategies per se.

This point can be illustrated by an example. Imagine for a moment that each of the procedures sent by each game theorist to Axelrod represents the way in which the theorist would have played the game in real life. The number of possible combinations quickly becomes unmanageable (in the first tournament, which was repeated 5 times, there was a total of 240 000 choices). By means of computer simulation, it is possible to study which strategies survive, which become extinct and which co-exist. Through computer simulation, agents can be represented more realistically than before, for example, as individuals with bounded rationality and with learning and memory constraints.

Precisely because simulations study how social phenomena emerge and evolve through the interactions of single individuals and their environment, it has been claimed that they represent an invaluable tool in social science [19.37]. This is because the way the simulations analyze the occurrences of social phenomena reproduces the dynamics in which such phenomena occur in the social world. They provide bottom-up, mechanistic explanations [19.38, 39]. Through computer simulation, the role of individual, structural and institutional variables can be represented in a particularly realistic fashion, which has been claimed to help capture the complexity of their interdependence. Note, however, that the enthusiasm with which simulations have been welcomed in some fields, such as analytical sociology, has not been unanimously shared. In fact, most notably in economics, simulations have been viewed with suspicion and their adoption not always encouraged. In order
Hypothetical Models in Social Science 19.3 Models and Simulations: Complexity, Tractability and Transparency 421
to understand the diverging attitudes of economists and other social scientists, in what follows we will address the following questions:

1. What are the features that characterize computer simulation?
2. Do such features relate to a cluster of properties that distinguish computer simulation from other styles of reasoning such as analytical models?
3. Is economists’ preference for analytical modeling over computer simulation justified on epistemic grounds?

Firstly, consider that even if we talk about computer simulations as distinct from analytical models, computer simulations are ultimately based on models. The way in which the two methods differ is that computer simulations obtain their solutions by means of a program that runs on a computer, whereas the solutions of analytical models can be obtained without the aid of a computer. This is simply because simulated models take into account a higher number of variables and consider nonlinear relationships, which are easier to explore with the computer. Broadly speaking, computer simulation refers to the entire process of formulating a model, transforming it into an algorithm that runs on a computer, calculating the output and analyzing the results [19.34, 40–42]. Moreover, the contrast between analytical models and computer simulations should not convey the idea that computers do not proceed analytically. Obviously they do (at least if we narrow our focus to models in the social sciences that do not require numerical analysis); the difference is that when computer simulations analyze complex systems, they usually proceed by averaging over a sufficiently high sample of cases rather than in the manner of mathematical proofs (more on this below).

Secondly, note that computer simulation is a coarse-grained label, which generalizes different ways in which simulations can be developed. Different taxonomies have been proposed in the literature [19.34, 43]. A preliminary, common distinction is between agent-based models (ABM) and equation-based models. The former proceed by implementing local rules, such as a decision rule in a sociological model; the latter, by translating equation-based models, such as partial or ordinary differential equations in physics, into a computer program. The boundaries between disciplines, however, are not strict. Agent-based models are frequently used in areas other than sociology, including fields that were previously dominated by analytical approaches, such as population ecology and theoretical physics.

A less common but still well-known interpretation defines computer simulation as a subset of a more general class of simulations that deploy computers for their ends [19.44]. In this view, a physical model of a target system – e.g., a scale model such as those used in structural engineering – counts as a simulation that adopts a specific means of representation, i.e., a physical model rather than a computer model. An extensive body of literature examines the similarities between computer simulations and experiments and highlights the fact that simulations are closer to the style of experimental analysis than to the style of analytical modeling. For instance, according to Morrison, in their relations to models, simulations are akin to experiments [19.45]. According to Parker, however, simulations lie between models and experiments in that they display features of both experimentation and modeling [19.27] (for a review of the literature on the experimental interpretation of computer simulations, see e.g., [19.34]).

Taxonomical differences aside, in the most basic sense defined above, computer simulations are simply a tool in the hands of the scientists. They help to achieve the model’s results in a manner similar to the way in which a calculator helps perform difficult mathematical operations. Moreover, simulations enable the modeler to represent the target system with a greater level of detail than is usually found in analytical models and to spell out in a particularly precise way the assumptions behind the working hypothesis [19.34, 40, 41].

To illustrate, let us compare in more detail the differences between two mathematical approaches to a study of the same phenomenon. The Lotka–Volterra model is a model in population ecology that has also had applications in the social sciences for the study of organization-environment relations [19.46]. The analytical version of the Lotka–Volterra model is a highly abstract representation of the ecological (organizational) system under study, which omits features such as the environment in which the species live (the market), a realistic level of satiation (competition), lifetime (supply) and so on. When the same problem is addressed by means of an agent-based computer simulation model, a particularly detailed representation of the system of interest is provided. Hence, it is claimed, not only can computer simulation help to avoid common errors in intuition, it might also reveal a system’s relevant aspects that had been underestimated or disregarded. Furthermore, computer simulations have heuristic functions: They trigger our intuitions and can be helpful in exploring new hypotheses; not least, they enable us to visualize the results of a problem in particularly efficacious ways. A more detailed and concrete example of how computer simulation proceeds is given in Sect. 19.B.

Given the features discussed above, it would be natural to expect computer simulation to be called upon to
complement analytical models. In 1982 Richard Feynman was already justifying the adoption of simulations in physics on the following grounds [19.47, p. 468]:

“The present theory of physics [...] allows space to go down into infinitesimal distances, wavelengths to get infinitely great, terms to be summed in infinite order, and so forth. With computers we might change the idea that space is continuous to the idea that space perhaps is a simple lattice and everything is discrete (so that we can put it into a finite number of digits) and that time jumps discontinuously.”

Physical theories trade formal rigor for unrealistic assumptions, such as continuous space and infinite wavelengths. As Feynman suggests, computer simulations can help physicists reduce the constraints of mathematical tractability in favor of descriptive accuracy. Although similar considerations also apply to the social sciences at large, in economics the endorsement of computer simulation has been slower than in physics and other areas of science – with a few exceptions, such as the one we saw above from evolutionary game theory.

Why then is there such an uneven reception of simulation in the social sciences? Why have economists, who have the most established tradition of modeling, been reluctant to embrace simulation? Lehtinen and Kuorikoski investigate the reason for economists’ preference for analytical models [19.42]. The authors claim that this tendency might even slow down progress in the subject and lead economists to dismiss results that would be reasonable to accept. According to Lehtinen and Kuorikoski, it is because simulations do not provide the kind of understanding that is perceived as legitimate in the economic community that this methodology is more often considered as a secondary option at best. Two key assets of economic theory, namely rationality and equilibrium analysis, have a marginal role in agent-based modeling, and these are aspects that economists do not seem willing to dispense with. This also partly explains why in other social sciences, where there is no strong commitment to a unified theoretical corpus, agent-based models are increasingly used. As we have seen, agent-based models allow a greater degree of flexibility in the behavioral rules ascribed to the agents. For many social scientists, this represents an asset rather than a liability. In economics, however, this flexibility is often considered a problem in that the choice of behavioral assumptions is ad hoc rather than guided and constrained by a unified theory (see [19.7] for a discussion of this issue).

In the paper Robust simulations, Ryan Muldoon addresses another source of concern related to the adoption of computer simulation in science, mainly involving the problem of verifying results [19.48]. To adopt a term used in the literature, simulations are said not to be transparent. While it is possible to check each step in the derivation of an analytical model, the same does not apply to simulation. Errors might be concealed within the particular machine used to run the simulation or within the particular programming language or within the algorithm itself. According to Muldoon, the best strategy for increasing confidence in the results is to show that simulations provide robust results, i.e., results that are invariant to changes in the hardware, the programming language and the algorithm. Depending on the degree of confirmation a scientist needs to achieve, a robustness test investigates the source of possible mistakes similar to the way in which experimental scientists test their experimental results.

Probably for a combination of the reasons given above, recourse to computer simulation in economics has been legitimized mainly when models become too complex to be analytically solvable or when the volume of data collected is such that only high-powered computers can process them. But what precisely does it mean for a problem to be intractable, and how do computer simulations deal with that? An example that illustrates this issue with particular clarity is Schelling’s model of racial segregation (see Sect. 19.B). Schelling’s model explains the emergence of ethnic clusters in different metropolitan areas as a consequence of the preference of individuals for having a few neighbors of their own ethnic group. Agents have different information about their neighborhood, and, at any point in time, they can decide to move to another neighborhood that better suits their preferences. Agents move randomly in space, and when they move, they tend to generate further movements of those individuals whose neighborhood has now changed. The chain of possible effects triggered by each agent’s decision makes segregation processes particularly difficult to formalize analytically.

Note that the issue of tractability does not concern only the probabilistic nature of the problem. Analytical methods can in fact be used to calculate the development of a probabilistic system without the need to resort to simulation. In the case of Schelling’s model it is because agents have different information and because neighborhoods overlap with one another that analytical treatments are usually excluded. One way to proceed analytically would be, for instance, to assume that the entire city is a unique neighborhood. At that point, all agents would share the same information and the problem of overlapping neighborhoods would be solved. However, no one would move anywhere simply because there would be no neighborhood to go to. In this situation computer simulations can remedy the lack of analytical solutions by providing an approximation of
Hypothetical Models in Social Science 19.4 Epistemology of Models 423
the process under investigation. At each step of the simulation the state of the system probabilistically depends on its configuration in the previous round. When the results aggregate we can observe whether an underlying dynamic emerges. Furthermore, every time we rewind the tape and run the simulation again, we can observe whether the macrophenomenon is stable despite the contingencies that characterize each particular stack of simulations. Finally, since the simulation environment is significantly flexible, we can consider a variety of factors and their impact, such as agents with different utility functions, or cities with different network structures [19.49].
Notice, however, that there is no reason why a way could not eventually be found to develop analytical solutions for Schelling’s model. In fact, in evolutionary game theory, progress has been made in developing analytical solutions with Schelling’s model, which rely on stochastic processes [19.50]. More generally, it is often the case that a certain problem does not have an analytical solution until a scientist finds one. When we look at the conditions that make a problem mathematically tractable or intractable we find that there are no neat boundaries between the two. Something that has been mathematically intractable up until today might become tractable tomorrow, thanks to progress in the discipline. But there are no neat criteria that define the tractability/intractability of a problem. Hence, the economists’ claim that simulations should be limited to intractable problems, and that scientists should not leap to simulations when an alternative is possible, appears unjustified after all. According to a less narrow perspective, the adoption of computer simulation might be welcomed even in cases in which an analytical solution is possible, but which is particularly demanding to find, time consuming and expensive with respect to research costs.
To conclude, this section opened with a number of questions on the nature of computer simulations and their relation to analytical models. As we have seen, the answer to the question of whether computer simulations constitute a different style of reasoning from that of analytical models depends on the level of analysis we consider. The differences between analytical and simulated models appear clearer when we look more closely at the two methods: Computer simulations are particularly apt to deal with complex systems, even though they do so at the cost of dispensing with analytical solutions. At a more general level, however, the two practices can be seen to be similar: They both concern the formulation of models and their manipulation for the achievement of results (Table 19.1). Computer simulation should not be taken as a remedy for the problems that affect analytical models, such as whether and under what conditions we are justified in transferring their results to real-world phenomena [19.51]. In these respects, computer simulations deal with issues similar to those dealt with using analytical models, if not with more complicated ones. This, however, does not constitute a reason to avoid their adoption. Rather it indicates that scientists’ efforts are needed to meet the challenges that this new methodological tool offers to actual scientific practice.
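The simulation procedure described in this section (a state that depends probabilistically on the previous round, aggregate results, and repeated runs to check stability) can be illustrated with a minimal sketch of a Schelling-style model. Everything specific here is an assumption made for illustration: the grid size, the density, the 30% similarity threshold, the wrap-around neighbourhoods, the random relocation rule, and the function name run_schelling. It is not the implementation used in the works cited above.

```python
import random


def run_schelling(size=20, density=0.8, wanted=0.3, max_rounds=200, seed=0):
    """Run one simulation; return (initial, final) mean share of like neighbours."""
    rng = random.Random(seed)
    cells = [(x, y) for x in range(size) for y in range(size)]
    rng.shuffle(cells)
    n_agents = int(density * len(cells))
    if not 0 < n_agents < len(cells):
        raise ValueError("density must leave at least one agent and one empty cell")
    # Two agent types (0 and 1) placed on random cells; other cells stay empty.
    grid = {cell: i % 2 for i, cell in enumerate(cells[:n_agents])}

    def like_share(cell, kind):
        """Share of occupied neighbouring cells holding an agent of the same type."""
        x, y = cell
        same = total = 0
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                if dx == 0 and dy == 0:
                    continue
                other = grid.get(((x + dx) % size, (y + dy) % size))
                if other is not None:
                    total += 1
                    same += other == kind
        return same / total if total else 1.0

    def mean_share():
        return sum(like_share(c, k) for c, k in grid.items()) / len(grid)

    initial = mean_share()
    for _ in range(max_rounds):
        # The next state depends only on the current configuration plus chance.
        unhappy = [c for c, k in grid.items() if like_share(c, k) < wanted]
        if not unhappy:
            break
        rng.shuffle(unhappy)
        for cell in unhappy:
            empty = [c for c in cells if c not in grid]
            grid[rng.choice(empty)] = grid.pop(cell)
    return initial, mean_share()


# "Rewinding the tape": repeat the run with different random seeds.
results = [run_schelling(seed=s) for s in range(5)]
```

Comparing the initial and final values across seeds is one simple way to check whether the macro-level pattern is stable despite the contingencies of each run; changing `wanted` or the relocation rule corresponds to exploring agents with different preferences.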
tions, and it is unclear how accurate predictions or true explanations can be derived from models that are partly false. No doubt, false assumptions are also employed in the natural sciences, and the role of scientific idealizations is likewise central to the philosophical debate about modeling in the natural sciences. The presence of false assumptions, however, has been regarded as being a particularly acute problem in the social sciences. It has been argued that whereas in the physical sciences it is possible to test idealized models by recreating the same conditions in the laboratory, in the social sciences this can rarely be done. Moreover, unlike the natural sciences, the social sciences typically lack general theoretical principles (or laws) that indicate how deviations from the model’s assumptions will affect the result in the real world. For instance, suppose that our model of a falling object assumes that there is no air resistance. The effect of air resistance on the acceleration of a feather falling on the floor can be calculated with the appropriate formula of classical mechanics. In social science, there are only few, if any, general principles
falsity of its assumptions does not matter. This position is well exemplified by Friedman’s position advocated in his famous essay The methodology of positive economics [19.52, p. 14]:
“In so far as a theory can be said to have assumptions at all, and in so far as their realism can be judged independently of the validity of predictions, the relation between the significance of a theory and the realism of its assumptions is almost the opposite of that suggested by the view under criticism. Truly important and significant hypotheses will be found to have assumptions that are wildly inaccurate descriptive representations of reality, and, in general, the more significant the theory, the more unrealistic the assumptions (in this sense).”
The Friedmanian version of instrumentalism has been very popular among economists; the result has been that the impression that abstract modeling was somehow at odds with a commitment to realism has been fostered. Independently of general philosophical
centric pattern. Mäki’s account can be seen as a direct response to some criticisms of unrealistic models in social science, because it points out that the mere presence of false assumptions does not in itself prevent the possibility that the model is true about important aspects of the target system. Hence, unrealistic models should not be dismissed out of hand, but evaluated on a case-by-case basis.
According to Cartwright, models can be seen as isolating causal capacities. Many false assumptions are introduced with the purpose of building a hypothetical situation in which those capacities would act undisturbed from the effects of disturbing factors. Unlike Mäki, however, she maintains that abstract models, that is, models in which the operation of a capacity is examined by abstracting away from concrete situations or system-specific details, cannot be meaningfully interpreted as providing explanations of real-world phenomena. For such models to be used to understand real-world phenomena, they have to undergo a process of concretization: The factors that can potentially affect the operation of the isolated capacity in the concrete situation of interest should be reintroduced in the model.
Cartwright has been rather skeptical that many models in economics and in the social sciences more generally have the right features to be used for understanding the social world – for two main reasons [19.21]. First, socio-economic phenomena are often brought about by many causes, which do not combine vectorially making it hard if not impossible to predict their net effect when they interact. Moreover, these causes often do not have capacities stable
few theoretical and empirical principles on which to rely for the derivation of conclusions about phenomena of interest. Therefore, deriving conclusions from these few theoretical and empirical principles requires a wide range of assumptions that do not serve the purpose of isolating capacities, but are instead needed to lend structure to the models while at the same time allowing the models to be (mathematically) tractable. Cartwright’s concern is that the models’ conclusions might be artifacts of these assumptions rather than genuine effects of the capacity in isolation. According to Cartwright, these assumptions are at risk of overly constraining the models, whose results, as a consequence, would often be artifacts of such over-constraining assumptions rather than genuine effects of the capacity in isolation. The scarcity of well-established theoretical principles is a problem that economics shares with the other social sciences. Nevertheless, one of the characteristics peculiar to economics is its strong commitment to a small set of axioms. Cartwright’s worry might be less of a problem in fields that are ready to use a wide range of behavioral assumptions as well as to rely on agent-based modeling. This flexibility, however, might come at a cost: One often-heard criticism is that there is a flavor of the ad hoc in the way agent-based simulations are used in social science [19.7, 12].

19.4.3 Learning About Possibilities

Thus far, we have been taking for granted that models have specific real-world targets that they are taken to represent. Some social scientific models, however, do not seem to represent any specific target, and thus
426 Part D Model-Based Reasoning in Science and the History of Science
they lack a representational link to the real world. Such target-less models deserve a discussion about whether they offer opportunities for learning about the world and, if they do, what kind of learning [19.44, 58–61]. Let us go back to Schelling’s model, which, according to Grüne-Yanoff, does not try to represent any particular city or any type of city, thereby making the issue of model-target similarity seem meaningless [19.59]. The only bit of the model that is informed by the real world is the assumption that people have preferences for not being in a minority. According to Grüne-Yanoff, evaluating the model as a representation by inquiring about its similarity to a city, or to cities in general, would force us to conclude that it is defective. The model, however, still offers opportunities to learn about the world, because it teaches us that, contrary to prior belief, residential segregation need not be brought about by racist preferences. By describing how it is possible for the phenomenon of segregation to come about rather than how the phenomenon has actually occurred, the model gives us a how-possibly explanation as opposed to a how-actually explanation [19.62]. For Grüne-Yanoff, learning from a model amounts to a justifiable change in one’s confidence in one or more necessity or impossibility hypotheses: We learn from Schelling’s model insofar as it justifiably changes “one’s confidence in hypotheses about racist preferences being a necessary cause of segregation” [19.60, p. 7]. Grüne-Yanoff’s approach makes sense of why social scientists often talk about models as indirectly providing insights about the world rather than offering specific hypotheses about real systems, and of why in some cases little effort is spent in applying models to those systems. However, the concept of learning, which is supposed to replace that of understanding or explanation, seems to be a heuristic rather than an epistemic one. Unless criteria are laid out to specify how the model justifies one’s change in confidence, learning becomes a rather subjective affair [19.64, 65].

19.4.4 Inferential Aids

According to an inferentialist approach to models, the question of how models can teach us about a target system is ill posed, because models are tools to help scientists’ inferential processes rather than autonomous entities capable of delivering information, learning or explanation [19.66–68]. Models are not abstract entities which social scientists manipulate to learn about something else; in fact, the very idea of manipulating an abstract entity sounds rather suspect. According to the inferentialists, models are tools that aid inferences from a set of assumptions to a conclusion; they help “to derive some conclusions about the empirical system, starting from information extracted from this same system” [19.67, p. 103; emphasis in the original]. As depicted in Fig. 19.3, the modeling activity (on the right-hand side) aids inferences from premises to conclusions about the phenomenon (on the left-hand side). Denotation, demonstration and interpretation refer to three stages of the modeling activity: First, aspects of the phenomenon are denoted by specific elements of the model, results are then derived within the model, and finally the results are interpreted again in terms of the phenomenon [19.63]. The special features of models as scientific tools are those that make them useful for inference-making by expanding our limited cognitive abilities. According to the inferentialist approach, the relationship between (good) representation and (correct) inference is inverted: It is its usefulness for drawing correct inferences that makes a model a good representation, not vice versa. In other words, while for
Fig. 19.3 The inferentialist view of models (after [19.63, p. 328]). Panel labels: Target (phenomenon) and Representation (model). Elements: premises, physical (or other type of) operation, conclusion; denotation, demonstration, interpretation.
representationalism a model helps draw correct inferences about its target only if it is a good representation, for inferentialism, if by using a model we are able to draw correct inferences about a target, then we can say that the model represents its target well enough. Saying this does not add much since there is nothing of substance in the relation of representation itself.
The difference between the two accounts can be further appreciated by comparing Fig. 19.2 and Fig. 19.3. Whereas in the inference described in Fig. 19.2 it is the symmetric relation of resemblance that makes a model a representation of the target phenomenon, in the case described in Fig. 19.3, it is first the activity of denotation and then that of interpretation that translate the inferences reached through the model into conclusions about a phenomenon. The goodness of a model, in turn, depends on the number and variety of successful inferences that it enables and on the ease with which it allows the user to draw those inferences. The inferentialist approach, however, raises the questions of how to establish whether the inferences are correct, and, if they are, what explains their success. Different answers are possible. De Donato and Zamora Bonilla maintain that what matters is success in prediction and intervention, but this answer casts doubt on a large portion of social scientific modeling: As mentioned above, few such models yield successful predictions and they rarely function as blueprints for interventions (however, see Sect. 19.4.5 for a counter-argument). In contrast, Kuorikoski and Ylikoski claim that explanatory success is itself explained by the model having captured some or another portion of the causal structure responsible for the phenomenon, bringing them close to the isolationist position.

19.4.5 Models as Blueprints for the Design of Socio-Economic Mechanisms

Philosophers of social science have not only been interested in the fit between hypothetical models and the phenomena they seek to represent, but also in the inverse relation, namely, in the role of hypothetical models in the design of socio-economic mechanisms. An example of successful institutional design guided by theoretical models – and of the broader phenomenon of economic engineering – is the auction mechanism, which a group of economists was asked to draw up for the efficient distribution of radio-electronic frequencies by the Federal Communications Commission (FCC). Guala regards the FCC example as an instance in which, rather than the theoretical model trying to represent a real-world phenomenon, the real world is molded so as to resemble the model as closely as possible [19.69, p. 456; emphasis in the original]:
“According to a widely shared view of scientific knowledge, the main task of the theorist is to explain spontaneously occurring and experimental processes, by designing an appropriate model [. . . ] The FCC case belongs to an altogether different kind of scientific activity, proceeding in the opposite direction, from models to mechanisms and processes.”
Obviously, the process from the model to the design of the mechanism is not a straightforward one. In the FCC auction mechanism, for example, there was no simple way of implementing existing and highly abstract auction models because the real-world situation had specificities that needed to be taken into account. This required tinkering with models as well as probing them experimentally in a back-and-forth process that led to the design of the mechanism that was eventually implemented. Interestingly, it is precisely the case of the FCC auction mechanism that Alexandrova uses to argue that models do not provide explanations that are directly applicable to real-world situations [19.70].
According to Alexandrova, the theoretical models were only indirectly implicated in this successful case of economic engineering; rather it is the experimental efforts that should be credited with the achievement. By choosing this example, she seems to suggest that there is nothing relevantly different between using models to explain the working of existing institutions and using them to design new institutional arrangements; in both cases, the models merely provide templates for the formulation of causal hypotheses. In particular, Alexandrova proposes the view that models are “open formulae” taking the following form: “In a situation of type x with some characteristics that may include $C_1, \dots, C_n$, a certain feature F causes a certain behavior B”, where x is a variable that needs to be specified, F and B refer to cause and effect and the $C_i$ refers to the conditions under which F causes B. It is only when x is specified that the open formula becomes a causal claim [19.70, 71].
The reason is that often it is either impossible to know whether the modeling assumptions are satisfied in a particular context of application, or it is known that they cannot be satisfied at all. To illustrate, let us return to von Thünen’s model. While the open formula would be something like in a situation such as type x, transportation costs T cause the location of economic activities according to pattern L, a causal hypothesis would say that in a situation in which the transportation costs depend only on the kind of good to be transported and on the distance from the market ($c_i = f(d)$), the costs cause a concentric distribution of economic activities around the market (plus other conditions). This
hypothesis needs to be subjected to empirical or experimental testing. The characterization of the kind of situation in which F causes B need not correspond to the assumptions of the model, however. So the causal hypothesis will not specify that the town has no spatial dimension (assumption (5) in Sect. 19.A), as we already know that this assumption cannot be satisfied by any real-world system.
Whether or not it is in fact the experimental efforts that ultimately identified the successful auction mechanism, the FCC case points to the possibility that highly abstract models, which are not representations of any real-world phenomenon to begin with, can nevertheless be used as guides to the design of successful institutional arrangements. As Guala suggests, such uses of models distinguish the engineering ambitions of the social sciences – an aspect of model-based social science connected with wider debates about the allegedly unique capacity of the social sciences to influence their object of study [19.69].

19.4.6 Where Do We Go From Here?

The question of when it is legitimate to use theoretical models for epistemic and practical purposes is not yet settled. It appears to be rather uncontroversial that different kinds of models are suitable for aiding different kinds of inferences: Some models can be used to make inferences about specific targets (the pattern of concentric rings around a particular city) or about generic targets (the localization of agricultural activity), while other models do not have a target at all. Whereas in the latter case the inferences might be of the how-possibly kind, in the former cases, the model might be said to be explanatory or to help with explanatory inferences, provided other conditions are also met. The challenge is to identify such conditions. Although, according to a very general principle, for purposes of explanation and possibly reliable prediction and control, a model should somehow capture the relevant features of the phenomenon it targets, different bits of the model need to be empirically confirmed, depending on the kind of inferences at stake. This is where model manipulation in the form of robustness analysis becomes important: The manipulation of modeling assumptions helps to identify which components of the model are crucial for obtaining the result we are interested in. It is these components that need to be empirically valid, not the whole. This seems to hold also when the model’s crucial assumptions are satisfied, not because they accurately describe relevant features of their target, but because the conditions for those assumptions to hold true have been created by design. Although the real world, however engineered, will rarely, if ever, approximate the model in every detail, at least in some successful instances it can be molded so that the relevant features – those that drive the modeling result – are in place.

19.5 Conclusions
The social sciences are now undergoing significant methodological change. Experimentation both in the laboratory and in the field has become an important addition to social scientists’ toolkit, not only for theory testing, but also for theory formation and policy design. There are also important new developments related to the availability of large databases, as well as means to analyze them that were previously unavailable. In all this, not only has modeling become more widespread, but also new modeling techniques such as agent-based simulation are making headway. It has even been suggested that computational science may entail a new reorganization of the sciences around computational templates that cut across the natural and social sciences [19.72]. The place of modeling within the arrays of styles that characterize the social sciences is significantly changing, and it remains to be seen how the different styles will interact to produce scientific knowledge about social and economic phenomena.

19.A Appendix: J.H. von Thünen’s Model of Agricultural Land Use in the Isolated State

3. The plain is completely cut off from the outside world
4. Climate and fertility are uniform across space
5. The town is located centrally and has no spatial dimension
6. All markets and industrial activities take place in the town
7. Production costs are constant across space
8. Transportation costs are directly proportional to the distance, the weight and the perishability of the goods
9. Selling prices are fixed and the demand is unlimited
10. Farmers have complete information and they act to maximize their revenue.

Under these assumptions, a pattern of concentric rings around the town emerges. Dairying and intensive farming (vegetables and fruit) occupy the ring closest to the town, because these products are perishable and incur the highest transportation costs. Timber and firewood are located in the second ring, because wood is heavy and hence difficult and costly to transport. The third ring consists of extensive farming of crops, such as grain for bread, that are more durable than fruit and less heavy than wood. On the outermost ring stock farming and cattle ranching take place, because animals can walk to the city to be sold at the market and thus have low transportation costs.

ing C.
The slope of each revenue curve depends on transportation cost and distance td.
The descending curves in Fig. 19.4 represent the revenue of each production depending on its distance from the town; e.g., at distance a it becomes more profitable to produce product B.

Fig. 19.4 The production revenue and the land use in von Thünen’s model (after [19.75, p. 76]). Upper panel: revenue curves rA, rB, rC against distance from the town, with distances a, b, c marked; lower panel: land-use rings A, B, C around the town.
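The way the ring pattern follows from descending revenue curves can be sketched numerically. The sketch assumes a linear revenue curve for each product, r_i(d) = m_i − t_i · d, where m_i is the revenue at the town and t_i the transport cost per unit of distance. This functional form, the three products A, B, C, and all numbers are hypothetical choices for illustration, not values taken from von Thünen or from Fig. 19.4.

```python
def revenue(product, distance):
    """Linear revenue curve: revenue at the town minus transport cost over distance."""
    margin, transport_rate = product
    return margin - transport_rate * distance


def land_use(products, distances):
    """For each distance, return the most profitable product (None if all lose money)."""
    pattern = []
    for d in distances:
        best = max(products, key=lambda name: revenue(products[name], d))
        pattern.append(best if revenue(products[best], d) >= 0 else None)
    return pattern


# Hypothetical (margin, transport_rate) pairs: A is most valuable at the town
# but most costly to transport, C is the cheapest to transport.
PRODUCTS = {"A": (100.0, 10.0), "B": (70.0, 4.0), "C": (40.0, 1.0)}

rings = land_use(PRODUCTS, [0, 2, 6, 9, 12, 30, 45])
```

With these numbers the curves for A and B cross at distance 5 and those for B and C at distance 10, so farmers who maximize revenue produce A nearest the town, then B, then C, and nothing beyond the point where even C no longer pays.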
globals [
  percent-similar  ;; on the average, what percent of a turtle's neighbours
                   ;; are the same color as that turtle?
  percent-unhappy  ;; what percent of the turtles are unhappy?
]

turtles-own [
  happy?           ;; for each turtle, indicates whether at least %-similar-wanted percent of
                   ;; that turtles' neighbours are the same color as the turtle
  similar-nearby   ;; how many neighbouring patches have a turtle with my color?
  other-nearby     ;; how many have a turtle of another color?
  total-nearby     ;; sum of previous two variables
]

to setup
  clear-all
  if number > count patches
    [ user-message (word "This pond only has room for " count patches " turtles.")
      stop ]
References
19.1 A.C. Crombie: Styles of Scientific Thinking in the European Tradition (Gerald Duckworth, London 1995)
19.2 I. Hacking: The disunities of the sciences. In: The Disunity of Science: Boundaries, Context and Power, ed. by P. Galison, D. Stump (Stanford Univ. Press, Palo Alto 1996) pp. 37–74
19.3 M.S. Morgan: The World in the Model: How Economists Work and Think (Cambridge Univ. Press, Cambridge 2012)
19.4 U. Mäki: Symposium on explanations and social ontology 2: Explanatory ecumenism and economics imperialism, Economics Phil. 18, 237–259 (2002)
19.5 K.A. Clarke, D.M. Primo: A Model Discipline. Political Science and the Logic of Representation (Oxford Univ. Press, Oxford 2012)
19.6 C.R. Edling: Mathematics in sociology, Annu. Rev. Sociol. 28, 197–220 (2002)
19.7 C. Marchionni: Playing with networks: How economists explain, Eur. J. Philos. Sci. 3(3), 331–352 (2013)
19.8 J. Kuorikoski, C. Marchionni: Unification and mechanistic detail as drivers of model construction: Models of networks in economics and sociology, Stud. Hist. Philos. Sci. 48, 97–104 (2014)
19.9 P. Godfrey-Smith: The strategy of model-based science, Biol. Philos. 21, 725–740 (2006)
19.10 M. Boumans: How Economists Model the World into Numbers (Routledge, London 2005)
19.11 R. Backhouse: Representation in economics. In: Measurement in Economics: A Hand Book, ed. by M. Boumans (Elsevier, Amsterdam 2007) pp. 135–152
19.12 J. Kuorikoski, C. Marchionni: Broadening the perspective: Epistemic, social, and historical aspects of scientific modeling, Perspect. Sci. (2015), doi:10.1162/POSC_e_00179
19.13 U. Webster, J. Sell (Eds.): Laboratory Experiments in the Social Sciences (Elsevier, Amsterdam 2007)
19.14 F. Guala: The Methodology of Experimental Economics (Cambridge Univ. Press, Cambridge 2005)
19.15 R.B. Morton, K.C. Williams: Experiment in Political Science and the Study of Causality: From Nature to the Lab (Cambridge Univ. Press, Cambridge 2010)
19.16 U. Mäki: On the method of isolation in economics. In: Idealization IV: Intelligibility in Science, ed. by C. Dilworth (Rodopi, Amsterdam 1992) pp. 317–351
19.17 N. Cartwright: Nature’s Capacities and Their Measurement (Clarendon, New York 1989)
19.18 U. Mäki: Models are experiments, experiments are models, J. Econ. Methodol. 12, 303–315 (2005)
19.19 M.S. Morgan: Experiments versus models: New phenomena, inference and surprise, J. Econ. Methodol. 12, 317–329 (2005)
19.20 F. Guala: Models, simulations, and experiments. In: Model-Based Reasoning: Science, Technology, Values, ed. by L. Magnani, N. Nersessian (Kluwer Academic/Plenum, New York 2002) pp. 59–74
19.21 N. Cartwright: The vanity of rigour in economics: Theoretical models and Galileian experiments. In: The Experiment in the History of Economics, ed. by P. Fontaine, R. Leonard (Routledge, London 1999) pp. 135–153
19.22 M.S. Morgan: Model experiments and models in experiments. In: Model-Based Reasoning: Science, Technology, Values, ed. by L. Magnani, N. Nersessian (Kluwer Academic/Plenum, New York 2002) pp. 41–58
19.23 U. Mäki: Models and the locus of their truth, Synthese 180(1), 47–63 (2011)
19.24 M.S. Morgan: Models, stories and the economic world, J. Econ. Methodol. 8(3), 361–384 (2001)
19.25 U. Mäki: MISSing the world. Models as isolations and credible surrogate systems, Erkenntnis 70(1), 29–43 (2009)
19.26 B.C. Van Fraassen: Scientific Representation: Paradoxes of Perspective (Oxford Univ. Press, Oxford 2008)
19.27 W.S. Parker: Does matter really matter? Computer simulations, experiments, and materiality, Synthese 169, 483–496 (2009)
19.28 N. Cartwright: Hunting Causes and Using Them (Cambridge Univ. Press, Cambridge 2007)
19.29 J. Kuorikoski, A. Lehtinen, C. Marchionni: Economic modelling as robustness analysis, Br. J. Philos. Sci. 61(3), 541–567 (2010)
19.30 J. Kuorikoski, A. Lehtinen, C. Marchionni: Robustness analysis disclaimer: Please read the manual before use!, Biol. Philos. 27(6), 891–902 (2012)
19.31 J. Odenbaugh, A. Alexandrova: Buyer beware: Robustness analyses in economics and biology, Biol. Philos. 26(5), 757–771 (2011)
19.42 A. Lehtinen, J. Kuorikoski: Computing the perfect model – Why do economists shun simulation?, Philos. Sci. 74, 304–329 (2007)
19.43 N. Gilbert, K.G. Troitzsch: Simulation for the Social Scientist (Open Univ. Press, Buckingham 1999)
19.44 M. Weisberg: Simulation and Similarity: Using Models to Understand the World (Oxford Univ. Press, Oxford 2013)
19.45 M. Morrison: Models, measurement and computer simulation: The changing face of experimentation, Philos. Stud. 143(1), 33–57 (2009)
19.46 M.T. Hannan, J. Freeman: The population ecology of organizations, Am. J. Sociol. 82(5), 929–964 (1977)
19.47 R. Feynman: Simulating physics with computers, Int. J. Theor. Phys. 21, 467–488 (1982)
19.48 R. Muldoon: Robust simulations, Philos. Sci. 74(5), 873–883 (2007)
19.49 E. Bruch, R. Mare: Neighborhood choice and neighborhood change, Am. J. Sociol. 112, 667–709 (2006)
19.50 J. Zhang: A dynamic model of residential segregation, J. Math. Sociol. 28, 147–170 (2004)
19.51 R. Frigg, J. Reiss: The philosophy of simulation: Hot new issues or same old stew, Synthese 169, 593–613

20. Model-Based Diagnosis
Antoni Ligęza, Bartłomiej Górny

20.9.2 Elimination of Spurious Diagnoses
20.9.3 Deduction for Enhanced Diagnoses
20.9.4 Analysis of Diagnoses
20.10 Dynamic Systems Diagnosis: The Three-Tank Case
20.11 Incremental Diagnosis
20.12 Practical Example and Tools
20.13 Concluding Remarks
References

system behavior with the use of causal graphs is put forward. Then, a systematic method for discovering all the so-called conflict sets (disjunctive conceptual faults) is described. Such conflict sets describe sets of elements in such a manner that in order to explain the observed misbehavior at least one of them must be faulty. By selecting and removing such elements from all conflict sets – for each conflict set one such element – the proper candidate diagnoses are generated. An example of the application of the proposed methods to the three-tank dynamic system is presented and some bases for on-line generation of diagnoses for dynamic systems are outlined, together with some theorems. The chapter offers easy, self-contained introductory material on modern model-based diagnosis, covering static and dynamic systems.
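The step from conflict sets to candidate diagnoses described above (one element chosen from each conflict set) can be sketched as a brute-force computation of minimal hitting sets. This is only an illustrative sketch under stated assumptions: the function name candidate_diagnoses, the component names, and the two example conflict sets are hypothetical, and the exhaustive enumeration is not the systematic method developed in the chapter.

```python
from itertools import product


def candidate_diagnoses(conflict_sets):
    """Return the minimal sets of components that intersect every conflict set."""
    # Choose one element from each conflict set, in every possible way.
    candidates = {frozenset(choice) for choice in product(*conflict_sets)}
    # Keep only candidates that do not strictly contain another candidate.
    minimal = [c for c in candidates if not any(other < c for other in candidates)]
    return sorted(minimal, key=lambda s: (len(s), sorted(s)))


# Hypothetical conflict sets: in each one, at least one component must be faulty.
CONFLICTS = [{"A1", "M1", "M2"}, {"A1", "A2", "M1", "M3"}]

diagnoses = candidate_diagnoses(CONFLICTS)
```

Each resulting set names components whose joint failure would account for every conflict; single-element sets correspond to single-fault diagnoses, larger sets to multiple-fault ones. The enumeration grows exponentially with the number of conflict sets, so it is suitable only for small examples.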
Diagnostic reasoning can be considered as a cover name for man and machine inference activities aimed at discovering what is wrong when some systems do not work as expected [20.1–3]. In fact, it constitutes a set of mutually complementary paradigms of inference with the ultimate goal to produce a set of rational explanations of the observed misbehavior of some systems under consideration. Diagnostic reasoning often combines causal reasoning with domain experience including statistical and expert knowledge. Hence, diagnostic reasoning makes use of the so-called shallow knowledge and, in the case of model-based diagnosis, the deep knowledge.
Shallow knowledge refers to empirical knowledge based on numerous observations and experience, typical for practitioners, usually encoded with rules and taking the form of the so-called expert system [20.4]. The diagnosed system is usually considered as a black box where only inputs and outputs are known, and the expert system covers the operational knowledge about its behavior, properties, and possible failures in the form of if–then rules. The development of the rules is based on experience documented with accumulated records.
On the other hand, deep knowledge refers to the contents of the box (the term white box is sometimes used) and mathematical models of system behavior. Hence, deep knowledge is the knowledge of the internal structure, components, and their interactions. It allows us to simulate the behavior of the system for any admissible input conditions.
Diagnosis is usually carried out by domain experts, and expert knowledge rarely lends itself to neat formalization. Nevertheless, some of the most successful approaches mimic diagnostic reasoning by the combination of abduction/deduction including causal inference, model-based reasoning (MBR), and case-based reasoning (CBR). Let us briefly explain the meaning of these terms in practice.
Abduction [20.5, 6] consists in looking for valid explanations for observed effects. In general, it is not a valid inference rule. The result of abductive inference is a set of hypotheses explaining what is observed; the hypotheses must be consistent with the knowledge at hand.
Deduction [20.6–8] consists in looking for consequences of some assumptions and initial knowledge at hand. In general, it is a valid inference rule. The result of deductive inference is a set of facts true under the assumptions and background knowledge.
Model-based reasoning [20.1, 2] refers to reasoning about system behavior and properties on the basis of its mathematical model. Note that no practical experience or observations are necessary. Complete information on the system can be gathered during a single session of model investigation.
Case-based reasoning (CBR) [20.9–11] stands in contrast to the use of deep knowledge and MBR. It consists in gathering a number of cases (in the case of diagnosis – failure descriptions and fault identifications) to be stored and used as patterns for solving new problems. Case-based reasoning looks for an identical case which occurred in the past, or a similar one; in the latter case, reasoning by analogy can be used.
The process of fault diagnosis of technical systems typically requires the use of different methods of knowledge representation and inference paradigms. The most common scenario of such a process consists of the detection of the faulty behavior of the system, classification of this behavior, search for and determination of potential causes of the observed misbehavior (that is, generation of potential diagnoses), verification of those hypothetical diagnoses, selection of the correct one, and finally a repair phase.
There exist a number of approaches and diagnostic procedures having their origin in very different branches of science, such as mechanical engineering, electrical engineering, electronics, automatic control, or computer science. In the diagnosis of complex dynamical systems, approaches from automatic control play an important role ([20.12–14]; a good state of the art can be found in the handbook [20.3]). The point of view of computer science, and especially artificial intelligence (AI), is presented, for example, in [20.1, 2, 15]. A good comparative analysis of some selected approaches was presented in [20.16] and in [20.17]. A recent, comprehensive in-depth study aimed at comparison of approaches emerging from AI and from classical automatic control is presented in [20.18].
This chapter is devoted to the presentation of some selected approaches originating from AI, located in the area of model-based reasoning (or MBR) and based on consistency-based reasoning [20.7, 19–22]. The presentation is based on the authors’ experience and some former publications including [20.23–29]. Many concepts and results were prepared during the work on [20.30].
20.1 A Basic Model for Diagnosis
be stated (detected) through the observation of one or more symptoms of failure. Assume that there are m such symptoms to be considered and that their evaluation is also binary. Let M denote the set of such symptoms, where $M = \{m_1, m_2, \ldots, m_m\}$. The detection of a failure consists in the detection of the occurrence of at least one symptom $m_i \in M$. In general, some subset $M^+ \subseteq M$ of the symptoms can be observed to be true in the case of a failure. The goal of the diagnostic process is the generation of a diagnosis, that is, any set $D^+ \subseteq D$ that, taking into account the domain expert knowledge and the system model, explains the observed misbehavior.
Let KB denote the knowledge base – in our case, the model of the system behavior. Furthermore, let $D^-$ denote the components that are assumed to work correctly, $D^+ \cup D^- = D$, and let $M^-$ denote the failure manifestations that are absent, $M^+ \cup M^- = M$. More formally, any valid candidate diagnosis $D^+$ must satisfy the following conditions
$D^+ \cup D^- \cup KB \models M^+ \cup M^-$ , (20.1)
and
$D^+ \cup D^- \cup KB \cup M^+ \cup M^- \not\models \bot$ . (20.2)
$R_C \subseteq 2^D \times 2^M$ ; (20.3)
that is, any failure being defined as a subset of D is assigned one or more sets of possible failure symptoms; in certain particular cases, the failure – although it occurred – may also be unobservable.
Such an approach, however, is nondeterministic: a single failure may be assigned several different sets of symptoms of the observed misbehavior. Therefore, it is frequently assumed that the causal dependency $R_C$ is of functional nature, that is, $R_C$ is, in fact, the following function
$R_C : 2^D \to 2^M$ . (20.4)
In this approach, any failure causes some unique and well-defined set of symptoms to occur. In this case, the task of building a diagnostic system consists in finding the inverse function, that is, the so-called diagnostic function f, where $f = R_C^{-1}$.
Unfortunately, in the case of many realistic systems, the function $R_C$ is not a one-to-one mapping, so the inverse mapping does not exist in the form of a function.
Consider a simple example of such a system; let it be the set of n bulbs connected serially (e.g., a set for a Christmas tree). An elementary diagnosis $d_i$ is equivalent to the ith bulb being blown. However, the set of manifestations of the failure $M = \{m_1\}$ is a single-element set, where $m_1$ indicates that the bulbs are not switched on. Even when the analysis is restricted to single-element elementary diagnoses, there exist n potentially equivalent diagnoses; each of them causes the same result with $m_1$ being true. If multielement diagnoses are admitted, then there exist $(2^n - 1)$ potential diagnoses.
In practice, the development of a diagnostic system consists in finding the inverse relation $R_C^{-1}$, and more precisely it searches for this inverse relation during the diagnostic process. In many practical diagnostic systems, the diagnostic process is interactive, and additional tests and measurements can be undertaken in order to restrict the area of search. In the case of the serially connected bulbs, such an approach may consist in the examination of certain bulbs or rather groups of them (an optimal strategy is that of dividing the circuit into two equal parts).
Note that complex diagnostic systems use a variety of technologies to deal with complexity; these include hierarchical strategies for the identification of a faulty subsystem, interactive diagnostic procedures with the use of supplementary tests and observations aimed at restricting the search area, the use of accessible statistical data in order to establish the most probable diagnoses, and the application of heuristic methods in diagnosis. One of the basic and frequently applied heuristics is considering only elementary diagnoses. Another, more advanced one consists in considering only minimal diagnoses.
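The serial-bulb example can be checked by brute force. The following is a minimal sketch, not part of the original chapter: the function names `symptoms` and `candidate_diagnoses` and the labels `d1 ... dn` are illustrative assumptions. It models $R_C$ as a function from sets of blown bulbs to sets of symptoms and enumerates its preimage for the observation $\{m_1\}$, which is exactly why no inverse function exists.

```python
from itertools import combinations


def symptoms(blown: frozenset) -> frozenset:
    """Assumed causal function R_C for serially connected bulbs:
    any nonempty set of blown bulbs yields the single symptom m1."""
    return frozenset({"m1"}) if blown else frozenset()


def candidate_diagnoses(n: int, observed: frozenset) -> list:
    """Enumerate every subset of the n bulbs whose predicted symptoms
    equal the observed ones (the preimage of `observed` under R_C)."""
    bulbs = [f"d{i}" for i in range(1, n + 1)]
    subsets = (
        frozenset(combo)
        for size in range(n + 1)
        for combo in combinations(bulbs, size)
    )
    return [s for s in subsets if symptoms(s) == observed]


if __name__ == "__main__":
    n = 4
    candidates = candidate_diagnoses(n, frozenset({"m1"}))
    singles = [s for s in candidates if len(s) == 1]
    # For n = 4 there are 2**4 - 1 = 15 candidates, 4 of them single-element.
    print(len(candidates), len(singles))
```

Because all these candidates predict the same symptom set, additional tests (for example, checking half of the chain) are needed to discriminate among them, as the text notes.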
20.2 A Review and Taxonomy of Knowledge Engineering Methods for Diagnosis
lar, from those concerning knowledge representation methods and automated inference. These methods are good examples of practical applications of AI techniques. They are mostly based on the algebraic, logical, graphical, and rule-based knowledge representation and automated inference methods [20.2, 28].
A characteristic feature of KE methods is that they use mostly the symbolic representation of the domain and expert knowledge as well as automated inference paradigms for knowledge processing. They can also make use of numerical data and models (if accessible) as well as uncertain, incomplete, fuzzy, or qualitative knowledge. A common denominator and core for all the methods is constituted by mathematical logic.
The key issue of KE is knowledge representation and knowledge processing; some other typical activities include knowledge acquisition, coding and decoding, analysis, and verification [20.4, 6]. Because of the specific character of KE methods, originating mostly from symbolic methods for knowledge manipulation in AI, the taxonomy of such methods is different from that of the methods developed in the automatic control area [20.3] and it constitutes some extension and complement. This extension is oriented toward taking into account specific aspects of KE methods, while the taxonomy takes into account both the applied tools and the philoso-
statistics, and expert systems)
• Applied knowledge representation methods (algebraic, numerical, logical, graphical)
• Applied inference methods (abduction, deduction, search)
• Inference control mechanism (systematic search, heuristic search).
The diagnostic knowledge can, in fact, be of two different origins. First, it can be the so-called shallow knowledge, having its source in input–output observations and experience. Such knowledge is also called expert knowledge in case it is appropriately significant, and frequently its acquisition consists in interviewing some domain experts. In this case, the knowledge of the system model (the structure, principles of work, mathematical models) is not required. The specification of such knowledge may take the external form of a set of observations of faults and the diagnoses assigned to them, a learning sequence of examples, or ready-to-use rules coming from an expert. The approaches based on the use of shallow knowledge are generally classified as expert systems methods.
Secondly, knowledge may originate from the analysis of mathematical models (the structure, equations, constraints) of the diagnosed system; this knowledge is referred to as the so-called deep knowledge. In case such knowledge is accessible, the diagnostic process can be performed with the use of the model of the system being analyzed, that is, the so-called model-based diagnosis is performed. The deep knowledge takes the form of a specific mathematical model (adapted for diagnostic purposes) and perhaps some heuristic or statistical characteristics useful to direct diagnostic reasoning.
Most frequently, the specification of deep knowledge includes definition of the internal structure and dependencies valid for the analyzed system in connection with the set of elements, the faults of which are subject to diagnostic activities, as well as the specification of the current state of the system (observations). The approaches based on the use of deep knowledge are classified as model-based approaches. Obviously, a wide spectrum of intermediate cases combining both of the above approaches in appropriate proportions is also possible.
Knowledge representation methods include mostly symbolic ones, such as facts and inference rules, logic-based methods, trees and graphs, semantic networks, frames, scenarios, and hybrid methods [20.3, 28]. Numerical data (if present) may be represented with the use of vectors, sequences, tables, etc. Mathematical models (e.g., in the form of functional equations, differential equations, constraints) may also be used in modeling and failure diagnosis.
Reasoning methods applied in diagnosis include logical inference (deduction, abduction, consistency-based reasoning, nonmonotonic reasoning, and induction) as well as methods originating in logical methods for knowledge processing in rule-based systems (forward chaining, backward chaining, bidirectional inference), pattern matching algorithms, search methods, case-based reasoning, and others. In the case of numerical data, various methods of learning systems, both parametric and structural ones, are also applied.
The control of diagnostic inference is mainly aimed at enhancing efficiency, so that all the diagnoses (in the case of complete search) or only the most probable ones (in the case of incomplete search) are generated as fast as possible, and so that the obtained diagnoses are ordered from the most likely ones to the most unlikely ones. The applied methods include blind search, ordered search, heuristic search, use of statistical information and methods, use of qualitative probabilities, use of supplementary tests in order to confirm or reject search alternatives, as well as hierarchical strategies.
The following taxonomy of diagnostic approaches is based on the KE point of view, and it takes into account mainly the type and way of diagnostic knowledge specification and the applied methods of knowledge representation.
20.2.2 Expert Methods
1. Methods based on the use of numerical data:
   • Pattern recognition methods in feature space
   • Classifiers using the technology of artificial neural networks
   • Simple rule-based classifiers, including fuzzy rule-based systems
   • Hybrid systems.
2. Methods using symbolic data and knowledge (classical knowledge engineering methods, simple algebraic formalisms, graphic- and logic-based methods, and ones based on domain expert knowledge):
   • Diagnostic tests
   • Fault dictionaries
   • Decision trees
   • Decision tables
   • Logic-based methods including rule-based systems and expert systems
   • Case-based systems.
20.2.3 Model-Based Methods
1. Consistency-based methods:
   • Consistency-based reasoning using purely logical models (Reiter’s theory [20.22])
   • Consistency-based reasoning using mathematical, causal models, and qualitative models.
2. Causal methods:
   • Diagnostic graphs and relations
   • Fault trees
   • Causal graphs (CGs)
   • Logical abductive reasoning
   • Logical CGs.
the one obtained with the use of the model; if this is the case, it may be assumed that the system works correctly.
In case some significant difference of the current behavior of the system from the one predicted with the use of the model can be observed, then it must be stated that the observed behavior is inconsistent with the model.
diagnosis $D^+$ can be obtained (or a set of alternative diagnoses). The diagnoses are represented with sets of system components which (potentially) are faulty, and such that assuming them faulty explains in a satisfactory way the observed misbehavior by regaining the consistency of the observed output with the output of the modified model.
20.4 A Motivation Example
Let us consider the multiplier–adder system, as introduced in [20.22]. The scheme of the system is presented in Fig. 20.2. The presented system is, in fact, a nontrivial demonstration and benchmark system being a combined multiplier–adder; it is widely explored in the domain literature [20.2, 28]; it is also used for the
Fig. 20.2 A simple multiplier–adder system
D = 3, and E = 3. It is easy to check that – if the system works correctly – the outputs should be F = 12 and G = 12. Since the current value of F is incorrect, namely F = 10, the system is faulty. At least one of its components must be faulty.
Now, let us ask which component (or components) is/are faulty. Note that the fault of a component (or multiple faults of several components) affects the value of variable F (it is smaller than expected), but simultaneously does not affect the value of variable G (12 is the correct value for the observed inputs).
The simplest approach may be based on the observation that the fault must be caused by an error of a component responsible for producing the value of F. In fact, one can expect a causal influence of the following type: a faulty component leads to a faulty value. In order to perform the search, a simple CG presented in Fig. 20.3 may be useful.
The bottom nodes of the graph correspond to faults of components. The three top-level nodes correspond to the observed values of output variables; in particular, the rightmost node marked with F G corresponds to the mutual relationship of signals F and G; in fact, for the observed inputs, the values should be equal, and, since they are not, one can also expect a double
DCF3 = {m1, a1, a2, m3} are the ones responsible for the symmetry of signals F and G. This is also visualized with the CG presented in Fig. 20.3.
Now, we are at the core of model-based diagnostic reasoning. Since the value of F is incorrect, at least one of the components of DCF1 must be faulty. In other words, the assumption that all the elements of DCF1 are correct is inconsistent with the observations. The set is called a conflict set [20.22] or a disjunctive conceptual fault (DCF) [20.29]. If considering single-element potential diagnoses, all three elements of DCF1 are candidate diagnoses. (In fact, m2 is not a valid candidate for a single-element diagnosis; it also influences the value of G, while no deviation of that value is observed.)
But there is another problem to be explained. The observed values of F and G considered together are inconsistent with the model. Note that if all the components were correct, then $Z = C \cdot E$ calculated by m3 must be equal to 6, and since G is observed to be 12, Y (calculated backward and under the assumption that a2 is correct) must also equal 6; hence, if m1 is correct, then X must be 6 as well, and if a1 is correct, F would be equal to 12. Since it is not the case, at least one of the used components must be faulty, that is, DCF3 is also a conflict set (a disjunctive conceptual fault) – at least one of its components must be faulty.
Now, in order to regain consistency of the observations with the system model, one must remove some of the assumptions that all the components are correct. We look only for minimal such explanations, since typically we prefer the simplest possible diagnoses. And there are four such explanations:
• {a1} is a single-element candidate diagnosis; it repairs both DCF1 and DCF3
• {m1} is another single-element candidate diagnosis; it repairs both DCF1 and DCF3
• {a2, m2} is a two-element candidate diagnosis; m2 repairs DCF1, while a2 repairs DCF3
• {m2, m3} is another two-element candidate diagnosis; m2 repairs DCF1, while m3 repairs DCF3.
It is straightforward to observe that in the case of the two-element diagnoses, the compensation phenomenon is observed.
The diagnoses – in principle – are obtained as the so-called hitting sets for all the conflict sets at hand. Such a hitting set takes one element from each conflict. Hence, the conflicts are removed (repaired), and the collection of components taken in this way forms a candidate diagnosis. In the case of the first two, the same element was taken from both conflict sets, so that single-element diagnoses are obtained.
In the following sections, the formal theory underlying model-based diagnosis is presented, and an approach to systematic generation of all conflict sets on the basis of the CG is put forward. Then the multiplier–adder example is revisited, and the formal analysis of it is presented.
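The reasoning above can be reproduced with a small consistency check. The sketch below is an illustration under stated assumptions, not the chapter's algorithm: the inputs A = 3, B = 2, C = 2 are assumed (only D = 3 and E = 3 survive in the text above; the assumed values are the ones that give the expected F = G = 12), a faulty component is modeled as imposing no constraint at all, and the internal variables X, Y, Z are searched by brute force over a small integer range. The names `CONSTRAINTS`, `consistent`, and `minimal_diagnoses` are hypothetical.

```python
from itertools import combinations, product

COMPONENTS = ["m1", "m2", "m3", "a1", "a2"]

# Observations; A, B, C are assumed values consistent with F = G = 12.
OBS = {"A": 3, "B": 2, "C": 2, "D": 3, "E": 3, "F": 10, "G": 12}

# Constraint that each component imposes when it works correctly.
CONSTRAINTS = {
    "m1": lambda v: v["X"] == v["A"] * v["C"],
    "m2": lambda v: v["Y"] == v["B"] * v["D"],
    "m3": lambda v: v["Z"] == v["C"] * v["E"],
    "a1": lambda v: v["F"] == v["X"] + v["Y"],
    "a2": lambda v: v["G"] == v["Y"] + v["Z"],
}


def consistent(faulty, obs=OBS, max_value=20):
    """True if some values of X, Y, Z satisfy the constraints of all
    components NOT assumed faulty, given the observations."""
    healthy = [c for c in COMPONENTS if c not in faulty]
    for x, y, z in product(range(max_value + 1), repeat=3):
        values = dict(obs, X=x, Y=y, Z=z)
        if all(CONSTRAINTS[c](values) for c in healthy):
            return True
    return False


def minimal_diagnoses(obs=OBS):
    """Enumerate fault assumptions by increasing size, keeping only those
    that restore consistency and contain no smaller diagnosis."""
    found = []
    for size in range(len(COMPONENTS) + 1):
        for cand in combinations(COMPONENTS, size):
            cand = frozenset(cand)
            if any(d.issubset(cand) for d in found):
                continue  # a superset of a known diagnosis is not minimal
            if consistent(cand, obs):
                found.append(cand)
    return found


if __name__ == "__main__":
    for diagnosis in minimal_diagnoses():
        print(sorted(diagnosis))
```

Under these assumptions the search yields the same four explanations as the list above: {a1}, {m1}, {a2, m2}, and {m2, m3}. In the two-element cases the compensation is visible in the values found, for example Y = 4 and Z = 8 for {m2, m3}.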
provements are proposed.
The idea of a conflict set (or just conflict for simplicity; recall also the name disjunctive conceptual fault introduced in Sect. 20.4) is of key importance for the theory of consistency-based diagnostic reasoning with use of the system model. A conflict set is any subset of the distinguished system elements, that is, COMP, such that all items belonging to such a set cannot be claimed to work correctly (i.e., at least one of them must be faulty) – it is just the assumption about their correct work which leads to inconsistency.
Assume that we consider a system specified as a pair (SD, COMP), where SD is the theory describing the work of the system (i.e., system description) and where $COMP = \{c_1, c_2, \ldots, c_n\}$ is the set of distinguished system elements. Any of these elements can become faulty, and the output of the diagnostic procedure is restricted to be a subset of the elements of COMP.
In diagnostic reasoning, it is assumed that the correct behavior of the system is fully described by the theory of SD. Assumptions about the correct work of system components take the form
$\neg AB(c_1) \wedge \neg AB(c_2) \wedge \cdots \wedge \neg AB(c_n)$ .
Hence, assuming that the observed behavior is described with the formulas of the set OBS and in the case
is inconsistent.
For intuition, a conflict set (under the given observations and system model) is a set of components, such that at least one of its elements must be faulty. Any conflict set represents, in fact, a disjunction of potential faults. A conflict set is minimal if none of its proper subsets is a conflict set. Note that if the analysis is restricted to minimal conflicts, then removing a single element from such a conflict set makes the set no longer a conflict. In other words, the system regains consistency.
Now, let us define an important concept, that is, a hitting set.
Definition 20.4
Let C be any family of sets. A hitting set for C is any set $H \subseteq \bigcup_{S \in C} S$, such that $H \cap S \neq \emptyset$ for any set $S \in C$. A hitting set is minimal if and only if none of its proper subsets is a hitting set for C.
For intuition, a hitting set is any set having a nonempty intersection with any conflict set; it is minimal if removing from it any single element violates the requirement of nonempty intersection with at least one conflict set.
Having defined the idea of a conflict set and a hitting set, we can present the basic theorem of Reiter’s theory [20.22]:
Theorem 20.1
$\Delta \subseteq COMP$ is a diagnosis for (SD, COMP, OBS) if and only if $\Delta$ is a minimal hitting set for the family of conflict sets for (SD, COMP, OBS).
Since any superset of a conflict set for (SD, COMP, OBS) is also a conflict set, it can be shown that H is a minimal hitting set for the conflict sets for (SD, COMP, OBS) if and only if H is a minimal hitting set for all minimal conflict sets defined for (SD, COMP, OBS). This observation (proved in [20.30]) leads to the following corollary, a fundamental result of Reiter’s theory:
Corollary 20.1
$\Delta \subseteq COMP$ is a diagnosis for (SD, COMP, OBS) iff $\Delta$ is a minimal hitting set for the collection of minimal conflict sets for (SD, COMP, OBS).
To summarize, the role of conflict sets in Reiter’s theory is that they provide specifications of components, such that for each conflict set, at least one element must be faulty. By restricting the analysis to minimal conflicts, one makes sure that removing any single element from such a set eliminates the conflict. Hence, the union of all such elements (i.e., a hitting set) allows for regaining global consistency; it then constitutes a (potential) diagnosis. Of course, for the considered system failure described with (SD, COMP, OBS), there can exist many diagnoses explaining the observed misbehavior.
In the case of the multiplier–adder system presented in Fig. 20.2, the following conflicts were found
{a1, m1, m2}, {a1, a2, m1, m3} .
On the basis of them, all the potential diagnoses can easily be found, that is,
D1 = {a1}, D2 = {m1}, D3 = {a2, m2}, D4 = {m2, m3} .
Let us notice that only minimal diagnoses are found (i.e., if some set is a diagnosis, then any of its supersets will not be generated as a diagnosis) and that Reiter’s theory allows for the generation of both single-element diagnoses (single faults) and multielement ones (multiple faults).
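The step from conflict sets to diagnoses can be sketched directly from Definition 20.4. The code below is a brute-force illustration, not Reiter's hitting-set tree algorithm; the function name `minimal_hitting_sets` is an assumption made for this example. It enumerates candidate sets by increasing size and keeps those that intersect every conflict and contain no smaller hitting set.

```python
from itertools import combinations


def minimal_hitting_sets(conflicts):
    """Return all minimal hitting sets of a family of sets.

    Brute-force enumeration by increasing size; exponential in the number
    of elements, so suitable only for small illustrative examples.
    """
    universe = sorted(set().union(*conflicts))
    found = []
    for size in range(len(universe) + 1):
        for cand in combinations(universe, size):
            cand = frozenset(cand)
            if any(h.issubset(cand) for h in found):
                continue  # contains a smaller hitting set, so not minimal
            if all(cand & conflict for conflict in conflicts):
                found.append(cand)
    return found


if __name__ == "__main__":
    # Conflicts found for the multiplier-adder system (from the text).
    conflicts = [
        frozenset({"a1", "m1", "m2"}),
        frozenset({"a1", "a2", "m1", "m3"}),
    ]
    for diagnosis in minimal_hitting_sets(conflicts):
        print(sorted(diagnosis))
```

For the two conflicts above this reproduces D1 to D4. Enumerating by increasing size is what guarantees minimality here: any candidate that contains an already accepted hitting set is skipped.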
does not mean that there exists also causal relationship, for example, the occurrence of d and m may be observed as some independent results of some other, external common cause.) This allows for the application of logical inference models for the simulation of systems behavior as well as for reasoning about possible causes of failure.
The second condition, that is, the one of precedence in time means that the cause must occur before its result, and that the result occurs after the occurrence of its cause. This implies some obvious consequences for modeling of the behavior of dynamic systems in the case of a failure.
The last condition means that there must exist a way for transferring the dependency (a signal channel enabling the flow of the physical signal); lack of such connection indicates that two symptoms are independent, that is, there is no cause–effect relationship among them. A more detailed analysis of theoretical foundations of the causal relationship phenomenon from the point of view of diagnosis can be found in [20.23, 25] or [20.3].
The above presented model of the causal relationship is, in fact, a simplest kind of the so-called strong causal relationship; by relaxing the condition of logical implication potential relationship can be obtained (in such a case occurrence of d may, but need not necessarily, mean the occurrence of m), including a causal relationship of probabilistic nature (characterized by some quantitative or qualitative probability). For simplicity, such extensions are not considered here. Another extension may consist in the causal relationship between several cause symptoms and several result symptoms described with some functional dependencies; as this case is important for technical diagnosis, it will be considered in brief.
Let V denote some set of symptoms
$V = \{v_1, v_2, \ldots, v_k\}$ .
The discussion here is restricted to logical symptoms taking the value of true or false. In some cases, it may be observed that there exists a causal relationship between the symptoms of V constituting a common cause for some symptom v and this symptom. In particular, the following two cases are of special interest
$v_1 \vee v_2 \vee \cdots \vee v_k \models v$ , (20.8)
and
$v_1 \wedge v_2 \wedge \cdots \wedge v_k \models v$ . (20.9)
In the first case, occurrence of at least one symptom from V causes the occurrence of v; it is said that there exists a disjunctive relationship and the symptom v is of OR type. By using an arrow to represent the causal relationship, a dependency of disjunctive type will be denoted as $v_1 \mid v_2 \mid \ldots \mid v_k \to v$.
In the latter case, it is said that the relationship is a conjunctive one – for the occurrence of v, it is necessary that all the symptoms of V must occur; it is said that the symptom v is of AND type. The conjunctive relationship is denoted as $[v_1, v_2, \ldots, v_k] \to v$.
Furthermore in some cases, it may happen that the occurrence of some symptom causes another symptom to disappear and vice versa; in such a case, it is said that the relationship is of NOT type, that is
$u \models \neg v$ and $\neg u \models v$ . (20.10)
In such a case, the causal relationship is denoted as u v.
The above-presented causal relationship applies to symptoms having the character of propositional logic variables, that is, to formulas taking the value true or false. Such symptoms, apart from denoting the occurrence of a discrete event (e.g., tank overflow, signal is on, etc.) may also denote that certain continuous variables take some predefined values or achieve certain levels, that is, de facto they can encode some formulas of the form $X = w$ or $X \in W$, where X is some process variable and w is its value, and W is some set (interval) of values. In such a case, qualitative reasoning and qualitative modeling of the causal relationship at the level of propositional logic only may turn out to be insufficient.
A more general notation for the representation of the causal relationship in case when values of certain variables influence the values taken by other variable may take the following form
$v_1, v_2, \ldots, v_k \to v$ , (20.11)
or in the form of an equation
$(v_1, v_2, \ldots, v_k) = v$ . (20.12)
Note that, in this case, it is important that variables $v_1, v_2, \ldots, v_k$ influence variable v, and the quantitative (or qualitative) characteristics of this influence are expressed with the appropriate equation. In practice, such characteristics can be expressed with a look-up table specifying the values of v for different combinations of values of the input variables.
Now, let us pass to more general case, that is, modeling the causal relationship among variables taking discrete, continuous, qualitative, or even symbolic values. This will be done with the use of causal graphs (CGs).
Consider two system variables, say X and Y. Then, if X influences Y, we speak about causal dependency.
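The OR-type and AND-type dependencies of (20.8) and (20.9) can be illustrated with a simple forward propagation over propositional symptoms. This is a minimal sketch under stated assumptions: the rule encoding as `(kind, causes, effect)` triples, the function name `propagate`, and the example symptoms `v1`, `v2`, `v3`, `v`, `w` are all hypothetical, and the NOT-type relationship of (20.10) is deliberately left out.

```python
def propagate(true_symptoms, rules):
    """Forward-propagate strong causal dependencies until a fixpoint.

    rules is a list of (kind, causes, effect) triples, where kind is
    "OR" (any cause suffices) or "AND" (all causes are required).
    """
    known = set(true_symptoms)
    changed = True
    while changed:
        changed = False
        for kind, causes, effect in rules:
            if kind == "OR":
                fired = any(c in known for c in causes)
            elif kind == "AND":
                fired = all(c in known for c in causes)
            else:
                raise ValueError(f"unsupported dependency type: {kind}")
            if fired and effect not in known:
                known.add(effect)
                changed = True
    return known


# Hypothetical example: v is of OR type, w is of AND type.
RULES = [
    ("OR", ["v1", "v2"], "v"),   # v1 | v2 -> v
    ("AND", ["v", "v3"], "w"),   # [v, v3] -> w
]

if __name__ == "__main__":
    print(sorted(propagate({"v2"}, RULES)))        # v follows, w does not
    print(sorted(propagate({"v1", "v3"}, RULES)))  # both v and w follow
```

Running the same rules backward, from an observed effect to the sets of causes that could produce it, is the abductive direction used in diagnosis.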
And let ($\to$) denote the existence of causal influence between two variables. Any such influence is assigned an expression of the form
$i = ([X_1, X_2, \ldots, X_k], f, Y, [c_1, c_2, \ldots, c_k], c_Y)$ (20.13)
where $X_1, X_2, \ldots, X_k$ are the input variables, f is a function defining the dependency in quantitative terms, Y is an output variable and $c_1, c_2, \ldots, c_k$ are the system components responsible for the correct work of the subsystems generating the output values; $c_Y$ is the component responsible for the value of the output variable Y.
value of it (observed or measured) must be different from the one predicted with the use of the model.
The conflict set will be composed of the components responsible for the correct value of the misbehaving variable.
So, in case the CG for the analyzed system is defined, the detection of all the conflicts requires the detection of all the misbehaving variables and next – search of the graph in order to find all the sets of components responsible for the observed misbehavior.
It seems helpful for the discussion to introduce the idea of a potential conflict structure (PCS) [20.24, 30].
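The graph search described above can be sketched as a backward traversal of the CG. The example below is illustrative only: the `Influence` class is a simplified, hypothetical stand-in for expression (20.13) (it keeps the inputs, the output, and a single set of responsible components, and omits the function f), and the multiplier–adder graph is encoded from the component roles described in Sect. 20.4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Influence:
    """Simplified causal influence: inputs -> output via some components."""
    inputs: tuple
    output: str
    components: frozenset


def supporting_components(variable, influences, measured):
    """Collect the components responsible for the value of `variable` by
    walking the causal graph backward through unmeasured variables."""
    by_output = {infl.output: infl for infl in influences}
    result, stack, seen = set(), [variable], set()
    while stack:
        var = stack.pop()
        if var in seen or var not in by_output:
            continue
        seen.add(var)
        infl = by_output[var]
        result |= infl.components
        # Measured inputs cut the search; hidden ones are expanded further.
        stack.extend(v for v in infl.inputs if v not in measured)
    return frozenset(result)


# Assumed encoding of the multiplier-adder causal graph.
MULTIPLIER_ADDER = [
    Influence(("A", "C"), "X", frozenset({"m1"})),
    Influence(("B", "D"), "Y", frozenset({"m2"})),
    Influence(("C", "E"), "Z", frozenset({"m3"})),
    Influence(("X", "Y"), "F", frozenset({"a1"})),
    Influence(("Y", "Z"), "G", frozenset({"a2"})),
]
MEASURED = {"A", "B", "C", "D", "E"}

if __name__ == "__main__":
    # Components behind a misbehaving F form a potential conflict.
    print(sorted(supporting_components("F", MULTIPLIER_ADDER, MEASURED)))
    print(sorted(supporting_components("G", MULTIPLIER_ADDER, MEASURED)))
```

For a misbehaving F the traversal returns {a1, m1, m2}, and for G it returns {a2, m2, m3}. These are potential conflicts only; whether each is a real conflict still has to be verified against the observed values.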
and allows for relatively efficient search for all potential minimal conflicts.
Similar concept named possible conflict has also been described then in [20.31, 32].
Next, as a result of computational verification of potential conflicts, those which are not real ones are eliminated.
Definition 20.6
A PCS structure defined for variable X on m hidden variables is any subgraph of the CG, such that:
• It contains exactly m hidden variables (including X).
• The values of all incorporated variables are measured or calculated (they are well defined).
• The value of variable X is double-defined (e.g., measured and calculated with the use of values of the other variables).
• In the considered PCS, all the values of the m variables are necessary for X in order to be double-defined.
A structure just defined allows for potential conflict generation. Some examples of PCS for m hidden variables are shown in Fig. 20.5.
In Fig. 20.6, we show how the number of conflicts and their structure changes with m = 0, 1, 2 for a simple CG of two hidden variables.
m = 0: no conflicts
m = 1: j = 2: {e, g}; j = 3: {a, b, f}
m = 2: j = 4: {f, c, d, g}, {f, c, d, e}; j = 5: {a, b, c, d, e}, {a, b, c, d, g}
Fig. 20.6 Example conflict structures; j is the number of links used (after [20.24])
In Fig. 20.7, some further examples for a simple CG with a single immeasurable variable are shown.
[Figure: causal graphs with nodes X, Y, Z, P, Q, R, [U] and components c1–c6. Potential conflicts listed: {c1, c2, c3, c4}, {c1, c2, c3, c5}, {c1, c2, c3, c6}, {c4, c5}, {c5, c6}, {c4, c6}; further sets shown: {c2, c3}, {c1, c3}, {c1, c2}.]
since if all of them were correct, then $Z = C \cdot E$ calculated by m3 must be equal to 6, and since $G$ is observed to be 12, $Y$ (calculated backward and under the assumption that a2 works correctly) must also equal 6; hence, if m1 is correct, then $X$ must be 6 as well, and if a1 is correct, $F$ would be equal to 12. Since it is not the case, at least one of the used components must be faulty. So we have the following rule

$$\text{rule2\_or}: \; m1 \lor m3 \lor a1 \lor a2 \rightarrow DCF_2 . \qquad (20.15)$$

The situation is illustrated in Fig. 20.10.

[Fig. 20.10 Second conflict detected for the multiplier–adder example]

Note that if $F$ were correct and $G$ were faulty, for example, $F = 12$ and $G = 10$, then another observed conflict would be $DCF_3 = \{m2, m3, a2\}$ and so we would have a third OR rule of the form

$$\text{rule3\_or}: \; m2 \lor m3 \lor a2 \rightarrow DCF_3 . \qquad (20.16)$$

Moreover, $DCF_2$, equivalent to a fault in $\{m1, m3, a1, a2\}$, would occur as well.
If both the outputs were incorrect (e.g., $F = 10$ and $G = 14$), then, in the general case, one can observe $DCF_1$, $DCF_2$, and $DCF_3$. Note, however, that whether $DCF_2$ is a valid conflict may depend on the observed outputs. For example, if $F = 10$ and $G = 10$ (both outputs are incorrect but equal), then the structure and equations describing the work of the system do not lead to a conceptual fault [20.16, 17].
Note that any DCF is modeled with some PCS. Depending on the current manifestations, a DCF can be observed (be active), that is, a real conflict exists, or it may be a potential conflict only (be inactive). For effective diagnosis, one needs only the specification of active DCFs.
The diagnoses are calculated as reduced elements of the Cartesian product of the conflict sets associated with the active DCFs. The reduction consists in the elimination of duplicates.

The OR matrix for the diagnosed system is presented in Table 20.1.
The AND matrix defining the relationship between the DCFs (active in the case of $F$ being incorrect and $G$ correct) and the manifestations is presented in Table 20.2.
In Table 20.2, $F^*$, $G^*$, etc., mean that the output is incorrect, while $F$, $G$, etc. denote the correct output observed at the variable.
In the analyzed case, that is $F$ being faulty and $G$ correct, the final diagnoses for the considered case are calculated as reduced elements of the Cartesian product of $DCF_1 = \{m1, m2, a1\}$ and $DCF_2 = \{m1, m3, a1, a2\}$. There are the following potential diagnoses: $D_1 = \{m1\}$, $D_2 = \{a1\}$, $D_3 = \{a2, m2\}$, and $D_4 = \{m2, m3\}$. They are all shown in Fig. 20.11.
The potentially possible final diagnoses in the general case are presented in Table 20.3.
The calculation of diagnoses can be easily interpreted by using AND/OR CGs [20.25, 28]. An appropriate AND/OR graph is presented in Fig. 20.12. The active links are represented with continuous lines, while the potential ones are represented with dashed lines. Active DCFs are marked with bold circles and the current diagnostic problem (manifestations) is also represented with a bold-line circle.
The final diagnoses are calculated as the minimal sets of the lowest level elements which are necessary to satisfy the currently observed set of manifestations. The intermediate nodes representing the DCFs are OR

Table 20.1 An OR binary diagnostic matrix for the adder system (the lower level)
DCF     m1   m2   m3   a1   a2
DCF1    1    1    0    1    0
DCF2    1    0    1    1    1
DCF3    0    1    1    0    1

Table 20.2 An AND binary diagnostic matrix for the adder system (the upper level)
M                       DCF1   DCF2   DCF3
F*, G,  (F ≠ G)         1      1      0
F,  G*, (F ≠ G)         0      1      1
F*, G*, (F = G)         1      0      1
F*, G*, (F ≠ G)         1      1      1

[Fig. 20.11 Generation of potential diagnoses]
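The calculation of diagnoses from active conflicts can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `minimal_hitting_sets` is hypothetical, and the reduction step is assumed to drop non-minimal candidates as well as duplicates, which is what yields the four diagnoses listed for the case of $F$ faulty and $G$ correct.

```python
from itertools import product


def minimal_hitting_sets(conflicts):
    """Return the minimal sets that intersect every conflict set."""
    # Picking one element from each conflict gives a candidate diagnosis
    # (an element of the Cartesian product); the set removes duplicates.
    candidates = {frozenset(choice) for choice in product(*conflicts)}
    # Reduction: drop every candidate that strictly contains another one.
    return {c for c in candidates if not any(o < c for o in candidates)}


# Conflict sets of the multiplier-adder example (Table 20.1).
DCF1 = {"m1", "m2", "a1"}
DCF2 = {"m1", "m3", "a1", "a2"}
DCF3 = {"m2", "m3", "a2"}

# F faulty, G correct: DCF1 and DCF2 are active.
diagnoses = minimal_hitting_sets([DCF1, DCF2])
for d in sorted(diagnoses, key=lambda s: (len(s), sorted(s))):
    print(sorted(d))
```

Calling the same function with `[DCF2, DCF3]` or `[DCF1, DCF3]` gives the diagnoses for the other observation patterns of the adder system.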
[Figure: AND/OR graph with an AND level, an OR level, and the components m1, m2, m3, a1, a2 at the lowest level]

Table 20.3 Final possible diagnoses
Observations            Diagnoses
F*, G,  (F ≠ G)         {a1}, {m1}, {a2, m2}, {m2, m3}
F,  G*, (F ≠ G)         {a2}, {m3}, {a1, m2}, {m1, m2}
F*, G*, (F = G)         {m2}, {a1, a2}, {a1, m3}, {a2, m1}, {m1, m3}
F*, G*, (F ≠ G)         {a1, a2}, {a1, m2},

The presented graphical interpretation can be considered as the effect of knowledge compilation for building an efficient diagnostic procedure. In fact, the graph covers all the possible potential failures. In order to build an automated diagnostic system, it would be enough to apply a simple, three-stage procedure:
• Decide which of the top-level nodes describe the current diagnostic situation.
• Find all the real conflicts corresponding to the
negative or only positive and (2) that the defined sign of deviation of a fault defines also the sign of deviation of the influenced manifestation.
For example, the voltage of a battery can only be normal (0, no fault) or low ($-$, below normal). The level of liquid in a tank can be normal (0), low ($-$), or high ($+$). The clock can be exact, but when faulty it can slow down ($-$) or advance ($+$).
The influence of a fault on a manifestation can be denoted using the sign. For example, low battery (battery_fault) causes low light (light_fault), that is, battery_fault($-$) $\rightarrow$ light_fault($-$).
Let $V$ denote a set of variables. For diagnostic purposes, we shall assume that $V = O \cup H \cup C$, and these sets are pairwise disjoint. $O$ is the set of observable (measurable) system variables, $H$ is the set of hidden variables, and $C$ is the set of diagnostic variables aimed at describing different faulty modes of diagnosed components.
In the case of the multiplier–adder example system, we have

$$O = \{A, B, C, D, E, F, G\}, \quad H = \{X, Y, Z\}, \quad \text{and} \quad C = \{m1, m2, m3, a1, a2\} .$$

Variables of $O$ and $H$ take the values of the real numbers (or integers) restricted to some reasonable intervals, while variables of $C$ are restricted to some of the possible modes of misbehavior. In our case, the values are restricted to $\{-, 0, +\}$ with the obvious meaning of lowering the output value, producing correct output, and producing the output higher than expected.
For enhancing diagnostic reasoning, however, we shall assume that all the variables can take three qualitative logical values $\{-, 0, +\}$. For any variable $V \in V$, we can interpret these values as follows:
• $V(0)$ – the proposition that the value of $V$ is correct holds
• $V(+)$ – the proposition that the value of $V$ is incorrect; deviation is positive holds
• $V(-)$ – the proposition that the value of $V$ is incorrect; deviation is negative holds.

In other words, the first statement can be interpreted as a kind of true, while the last two statements can be interpreted as two different types of negation.
We shall extend the knowledge about the models of the system over incorrect behavior. In order to do that we shall define some qualitative inference rules. Let $R$ be a set of rules defining all the accessible knowledge about the behavior of faulty components depending on the faulty mode. Note that, in fact, there are three generic forms of such rules, that is those:
• describing the faulty behavior of elements in the case of normal inputs
• describing the normal behavior in the case of deviated inputs
• describing faulty behavior in the case of deviated inputs.

Let $c$ denote a single component and $X$ a variable. By $c(v)$, where $v \in \{-, 0, +\}$, we shall denote the type of failure; if undefined, we shall write $c(?)$. For partial definition, we can use a set of values. The same applies to variables. Now, more detailed characteristics of the diagnoses can be found. Let us introduce a definition of qualitative diagnosis taking into account the deviation sign of a fault.

Definition 20.7
A qualitative diagnosis

$$D = \{d_1(\#), d_2(\#), \ldots, d_k(\#)\}$$

is a diagnosis fully explaining the observed misbehavior and covering the knowledge of the deviation sign for any fault (if accessible). Here, $\#$ is $+$ if the deviation sign is positive, $-$ if the deviation sign is negative, and $?$ if the deviation sign is unknown (any, undetermined).

In the following, three types of causal rules describing the qualitative behavior are discussed in detail.
A generic form of the first type of rules is as follows

$$c(v) \rightarrow Out(w)$$

where $c$ denotes one of the five components of the system, $v$ is one of the logical values defining the operating mode, $v \in \{-, 0, +\}$, and $Out$ is its output, $w \in \{-, 0, +\}$. For example, a faulty m1 lowering its output signal is described with the rule $c(-) \rightarrow Out(-)$ with the obvious meaning. In the case of the example system, we have as many as 10 such rules (two for each of the five components) defining particular lowering or increasing of the output values when in the faulty state.
A generic form of the second type of rules is as follows

$$In_1(v_1) \land In_2(v_2) \rightarrow Out(w)$$

where $In_1$ and $In_2$ denote the inputs of a component and $Out$ is its output, $v_1, v_2, w \in \{-, 0, +\}$.
A generic form of the third type of rules is as follows

$$In_1(v_1) \land In_2(v_2) \land c(v) \rightarrow Out(w)$$
where $In_1$ and $In_2$ denote the inputs of a component $c$ and $Out$ is its output, $v, v_1, v_2, w \in \{-, 0, +\}$.
The rules of the second type (normal behavior, deviated inputs) are summarized in Table 20.4.
For example, a component a1 showing a lower value of its output signal (but assumed to be correct) and taking one input signal lower than normal and one normal can produce a lower output value. Such a behavior is described with the rule

$$X(-) \land Y(0) \rightarrow F(-)$$

The rules of the third type (abnormal behavior, deviated inputs) are summarized in decision Table 20.5. For all other 10 combinations of input signals and component mode, the output is undefined. For example, a faulty component a2 increasing its output signal and taking one input signal higher than normal and one normal produces a higher output value. Such a behavior is described with the rule

$$Y(0) \land Z(+) \land a2(+) \rightarrow G(+)$$

The analysis of potential qualitative diagnoses is performed by the propagation of values over the CGs

20.9.4 Analysis of Diagnoses

Let us analyze, in turn, all the four potential diagnoses and their potential qualitative forms. The analysis is aimed at finding all admissible qualitative diagnoses.

Case of m1
There are two false values for m1, that is, $m1(-)$ and $m1(+)$.
Consider $m1(-)$ first. Using an appropriate deduction rule of the form

$$m1(-) \rightarrow X(-) ,$$

we have $X(-)$. Since a1 is correct, but one of its inputs is false (lower), we can use another rule of the form

$$X(-) \land Y(0) \rightarrow F(-)$$

and so we have $F(-)$. This is consistent with observations since $F = 10$, and the reference value was 12.
Now, consider $m1(+)$. Using an appropriate deduction rule of the form

$$m1(+) \rightarrow X(+) ,$$

we have $X(+)$. Since a1 is correct, but one of its inputs is false (upper), we can use another rule of the form
Conclusion: $F(+)$ is inconsistent with observations. Finally, diagnosis $\{a2(-), m2(+)\}$ is eliminated as inconsistent with observations.

Case: $\{a2(+), m2(-)\}$. Rule to be used

$$m2(-) \rightarrow Y(-)$$

Conclusion: $Y(-)$. Next rule to be applied

$$X(0) \land Y(-) \rightarrow F(-)$$

Conclusion: $F(-)$ is consistent with observations. We can proceed. Next rule to be applied

$$Y(-) \land Z(0) \land a2(+) \rightarrow G(?)$$

Conclusion: $F(-)$ is consistent with observations. We can proceed. Next rule to be applied

$$m3(+) \rightarrow Z(+)$$

Conclusion: $Z(+)$. Next rule to be applied

$$Y(-) \land Z(+) \rightarrow G(?)$$

Conclusion: $G(?)$ may be consistent with observations. Finally, diagnosis $\{m2(-), m3(+)\}$ may be considered admissible.

Case: $\{m2(+), m3(-)\}$. Rule to be used
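The rule-by-rule case analysis of the multiplier–adder example can be mechanized with a small sign-propagation sketch. It is an illustration under stated assumptions, not the chapter's algorithm: Tables 20.4 and 20.5 are not reproduced here, so the function `combine` assumes that deviations of the same sign reinforce each other and that opposite signs give an undetermined result `?`, which agrees with the rules quoted in the text. The names `STRUCTURE`, `propagate`, and `admissible` are hypothetical, and primary inputs are assumed correct.

```python
def combine(*signs):
    """Qualitative combination of deviation signs '-', '0', '+', '?'."""
    # Opposite deviations (or an unknown one) leave the result undetermined.
    if "?" in signs or ("+" in signs and "-" in signs):
        return "?"
    for s in ("+", "-"):
        if s in signs:
            return s
    return "0"


# Output variable -> (producing component, hidden input variables).
# Primary inputs A..E are assumed correct, so they are omitted.
STRUCTURE = {
    "X": ("m1", ()), "Y": ("m2", ()), "Z": ("m3", ()),
    "F": ("a1", ("X", "Y")), "G": ("a2", ("Y", "Z")),
}


def propagate(modes):
    """Propagate component modes (default '0', i.e. correct) to all variables."""
    values = {}
    for out in ("X", "Y", "Z", "F", "G"):
        comp, inputs = STRUCTURE[out]
        values[out] = combine(modes.get(comp, "0"), *(values[i] for i in inputs))
    return values


def admissible(modes, observed):
    """A qualitative diagnosis is kept if no prediction contradicts an observation."""
    predicted = propagate(modes)
    return all(predicted[var] in (sign, "?") for var, sign in observed.items())


# Observations of the running example: F too low (10 instead of 12), G correct.
observed = {"F": "-", "G": "0"}
for candidate in ({"m1": "-"}, {"m1": "+"}, {"a2": "-", "m2": "+"},
                  {"m2": "-", "m3": "+"}):
    print(candidate, admissible(candidate, observed))
```

With a faulty component mode, this combination rule leaves exactly 10 of the 18 input and mode combinations undetermined, which matches the count of undefined outputs mentioned for the third type of rules.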
[Fig. 20.13 Three-tank system]
[Fig. 20.15 The CG for the three-tank system]

analysis becomes a nontrivial task; this is the consequence of the fact that this time the system under analysis is a highly interconnected dynamic one, it is described with nonlinear equations and there exists strong feedback in the system.
The Matlab/Simulink model of this system is shown in Fig. 20.14. Using Matlab/Simulink, one can simulate the expected correct behavior of the system. If some calculated variables are different from the measured values, an inconsistency is observed and the diagnostic procedure should be activated.
The CG for the example system is shown in Fig. 20.15. The CG can be generated automatically from the Matlab/Simulink model of the system with conflict generator application developed for experimental use. The application is described in more detail in Sect. 20.12.
After defining which of variables are measured ones, the program can generate PCSs. All poten-

[Fig. 20.19 Conflict set {k12, z2, k23}]
[Fig. 20.20 Conflict set {k12, z2, z3, k3}]
simultaneously for some subsystems of one system; in such a case, it can be used for combining together the separately generated sets of diagnoses. It can also be used for incremental generation of diagnoses for one set of conflict sets, and especially when one generates conflicts and diagnoses simultaneously.
Before the theorem is formulated, it is necessary to put forward the following definition.

Definition 20.8
Let $A$ be a set of nonempty sets. The reduced set $\lfloor A \rfloor$ for $A$ is the set containing those elements from $A$ which are not supersets of other elements.

Example 20.1
Let $A = \{\{a, b\}, \{a, b, c\}\}$. We have $\lfloor A \rfloor = \{\{a, b\}\}$. Let $B = \{\{a\}, \{a, b\}, \{a, b, c\}\}$. Then $\lfloor B \rfloor = \{\{a\}\}$. Finally, let $C = \{\{a\}, \{a, b\}, \{a, b, c\}, \{b, c\}, \{d, e\}\}$. In this case, $\lfloor C \rfloor = \{\{a\}, \{b, c\}, \{d, e\}\}$.

Now, a special operator for combining diagnoses will be defined. The operator takes as its arguments the diagnoses for two different families of conflict sets.

$$D_1 \oplus D_2 = \begin{cases} \{D_1, D_2\} & D_1 \in H_2 \text{ and } D_2 \in H_1 \\ \{D_1\} & D_1 \in H_2 \text{ and } D_2 \notin H_1 \\ \{D_2\} & D_1 \notin H_2 \text{ and } D_2 \in H_1 \\ \{D_1 \cup D_2\} & D_1 \notin H_2 \text{ and } D_2 \notin H_1 \end{cases}$$

Note that the result of operation $\oplus$ is a family of sets, which may contain one or two sets, and each of these sets is a hitting set for $C_i$, $i = 1, 2$.

Example 20.2
Let us consider the following sets of conflict sets

$$C_1 = \{\{a, b, c\}, \{a, d\}\}$$
$$C_2 = \{\{a, c, d\}, \{b, e\}\} .$$

The sets of diagnoses that can be generated are, respectively,

$$D_1 = \{\{a\}, \{b, d\}, \{c, d\}\}, \text{ and}$$
$$D_2 = \{\{a, b\}, \{a, e\}, \{b, c\}, \{c, e\}, \{b, d\}, \{d, e\}\} .$$

We have

$$\{b, d\} \oplus \{a, b\} = \{\{b, d\}, \{a, b\}\} ,$$
$$\{a\} \oplus \{a, b\} = \{\{a, b\}\} ,$$
$$\{a\} \oplus \{b, c\} = \{\{a, b, c\}\} .$$
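The reduction of Definition 20.8 and the pairwise operator $\oplus$ can be checked against Examples 20.1 and 20.2 with a short Python sketch. It assumes that $H_i$ denotes the family of all hitting sets for $C_i$; the function names `reduce_family`, `hits_all`, and `combine_pair` are hypothetical and chosen for illustration only.

```python
def reduce_family(family):
    """Reduced set: keep the sets that are not supersets of another member."""
    family = {frozenset(s) for s in family}
    return {s for s in family if not any(other < s for other in family)}


def hits_all(candidate, conflicts):
    """True if the candidate shares at least one element with every conflict."""
    return all(candidate & conflict for conflict in conflicts)


def combine_pair(d1, d2, c1, c2):
    """Pairwise operator: d1 is a diagnosis for c1, d2 a diagnosis for c2."""
    d1, d2 = frozenset(d1), frozenset(d2)
    d1_in_h2 = hits_all(d1, c2)   # does d1 already explain the conflicts in c2?
    d2_in_h1 = hits_all(d2, c1)   # does d2 already explain the conflicts in c1?
    if d1_in_h2 and d2_in_h1:
        return {d1, d2}
    if d1_in_h2:
        return {d1}
    if d2_in_h1:
        return {d2}
    return {d1 | d2}


# Conflict sets of Example 20.2.
C1 = [frozenset("abc"), frozenset("ad")]
C2 = [frozenset("acd"), frozenset("be")]

print(reduce_family(["ab", "abc"]))
print(combine_pair("bd", "ab", C1, C2))
print(combine_pair("a", "ab", C1, C2))
print(combine_pair("a", "bc", C1, C2))
```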
Model-Based Diagnosis 20.11 Incremental Diagnosis 457
Now, let us define another operator that constitutes a kind of extension of the previous one.

Definition 20.10
Let $C_i$ denote sets of conflict sets, and $D_1 = \{D_1^1, D_1^2, \ldots, D_1^m\}$, $D_2 = \{D_2^1, D_2^2, \ldots, D_2^n\}$ be the sets of diagnoses calculable from $C_i$, $i = 1, 2$. We define operator $\oplus$ as follows

$$D_1 \oplus D_2 = \bigcup_{i=1,\,j=1}^{i=m,\,j=n} \{D_1^i \oplus D_2^j\}$$

In other words, by using $\oplus$, one makes a union of results of operations with $\oplus$ for each diagnosis from $D_1$ with each diagnosis from $D_2$.
Finally, the main theorem of this algebraic approach will be presented. The theorem is named a composition theorem since it allows for combining partial results (ones obtained separately or in turn) into the final set of diagnoses (proof in [20.30]):

Theorem 20.2 (Composition theorem)
Let $C_i$ denote sets of conflict sets and $D_i$ sets of diagnoses calculable from $C_i$, $i = 1, 2, 3$. If $C_3 = C_1 \cup C_2$, then $D_3 = \lfloor D_1 \oplus D_2 \rfloor$.

Example 20.3
Let

$$C_1 = \{\{a, b, c\}, \{a, d\}\}, \quad C_2 = \{\{a, c, d\}, \{b, e\}\} .$$

The sets of diagnoses are

$$D_1 = \{\{a\}, \{b, d\}, \{c, d\}\}, \text{ and}$$
$$D_2 = \{\{a, b\}, \{a, e\}, \{b, c\}, \{c, e\}, \{b, d\}, \{d, e\}\} .$$

Consider the combined set of conflict sets

$$C_3 = C_1 \cup C_2 = \{\{a, b, c\}, \{a, d\}, \{a, c, d\}, \{b, e\}\} .$$

The set of diagnoses in this case is as follows

$$D_3 = \{\{a, b\}, \{a, e\}, \{b, d\}, \{c, d, e\}\} .$$

Now, let us calculate the combination of diagnoses as

$$D_1 \oplus D_2 = \{\{a, b\}, \{a, e\}, \{a, b, c\}, \{a, c, e\}, \{b, d\}, \{a, d, e\}, \{b, c, d\}, \{c, d, e\}\} .$$

Let us reduce the combined set of diagnoses; then we obtain

$$\lfloor D_1 \oplus D_2 \rfloor = \{\{a, b\}, \{a, e\}, \{b, d\}, \{c, d, e\}\} .$$

It is easy to see that $D_3 = \lfloor D_1 \oplus D_2 \rfloor$.

The composition theorem allows for calculating diagnoses for the sum of two families of conflict sets in the case where there are known diagnoses for each of these sets of conflicts. One does not need to start the generation of diagnoses from the beginning, that is, without using the known diagnoses for each individual family of conflicts. The application of this theorem may, therefore, significantly increase the efficiency of a procedure for the calculation of diagnoses.
The composition theorem may be easily generalized to the following theorem (proof in [20.30]):

Theorem 20.3 (Generalized composition theorem)
Let $C_i$ denote sets of conflict sets and $D_i$ sets of diagnoses calculable from $C_i$, $i = 1, 2, \ldots, n, n + 1$. If

$$C_{n+1} = C_1 \cup C_2 \cup \cdots \cup C_n ,$$

then

$$D_{n+1} = \lfloor \lfloor \lfloor \lfloor D_1 \oplus D_2 \rfloor \oplus D_3 \rfloor \oplus \cdots \rfloor \oplus D_n \rfloor .$$
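The incremental use of the composition theorems can be sketched as a left fold over partial results. The code below is an illustration under assumptions, not the procedure from [20.30]: each partial result is represented as a hypothetical pair (conflict sets, diagnoses), a diagnosis is tested against the other family's conflicts directly, and the third family `C4` is an invented example added only to show the fold over more than two families.

```python
from functools import reduce
from itertools import product


def reduced(family):
    """Keep only the sets that are not supersets of another member."""
    return {s for s in family if not any(other < s for other in family)}


def hits_all(candidate, conflicts):
    return all(candidate & conflict for conflict in conflicts)


def compose(left, right):
    """Combine two partial results, each a pair (conflicts, diagnoses)."""
    (c1, d1s), (c2, d2s) = left, right
    combined = set()
    for d1, d2 in product(d1s, d2s):
        d1_ok = hits_all(d1, c2)      # d1 also hits every conflict in c2
        d2_ok = hits_all(d2, c1)      # d2 also hits every conflict in c1
        if d1_ok:
            combined.add(d1)
        if d2_ok:
            combined.add(d2)
        if not (d1_ok or d2_ok):
            combined.add(d1 | d2)
    return c1 + c2, reduced(combined)


def fs(*groups):
    """Helper: build a list of frozensets from strings of one-letter names."""
    return [frozenset(g) for g in groups]


# Data of Example 20.3.
C1, D1 = fs("abc", "ad"), set(fs("a", "bd", "cd"))
C2, D2 = fs("acd", "be"), set(fs("ab", "ae", "bc", "ce", "bd", "de"))
C3, D3 = compose((C1, D1), (C2, D2))
print(sorted("".join(sorted(d)) for d in D3))

# Hypothetical third family, folded in as in the generalized theorem.
C4, D4 = fs("ce"), set(fs("c", "e"))
C_all, D_all = reduce(compose, [(C1, D1), (C2, D2), (C4, D4)])
print(sorted("".join(sorted(d)) for d in D_all))
```

The first result reproduces $D_3$ from Example 20.3; the second shows that a newly detected conflict can be incorporated without recomputing the diagnoses from scratch.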
• Conflict generator
• Diagnostic module.

A conflict generator is an application implemented in C/C++. The main window of the system is shown in Fig. 20.22.
The conflict generator performs two main tasks, that is, the generation of CG from the model developed by using the Matlab/Simulink application and generation of all minimal, potential conflicts for such a graph. Con-

• A fault detector that has as input the values of measured variables of the diagnosed system and the corresponding values obtained from the model. The task of the fault detection subsystem is to detect inconsistency.
• A conflict verificator that takes potential conflicts as its input. Its task is to determine which of the conflicts are real ones.
• Diagnoses generator that calculates diagnoses in the form of minimal hitting sets for all real conflict sets. Its algorithm is based on Theorem 20.3.

[Fig. 20.23 Diagram of the diagnostic module. Blocks: diagnosed system (values of variables), model of the system in Matlab/Simulink (predicted values of variables), potential conflicts, diagnostic module]
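The first two stages of the diagnostic module can be illustrated with a rough Python sketch. Everything specific in it is an assumption made for illustration: the numeric values, the tolerance, the function names, and the pairing of each potential conflict with a checked variable (`L2`, `L3`) are hypothetical; only the two conflict sets themselves are taken from Figs. 20.19 and 20.20.

```python
def fault_detected(measured, predicted, tolerance):
    """Fault detector: report an inconsistency between plant and model."""
    return any(abs(measured[v] - predicted[v]) > tolerance for v in measured)


def real_conflicts(potential, measured, predicted, tolerance):
    """Conflict verifier: keep the potential conflicts whose check fails.

    `potential` is a list of (checked_variable, components) pairs; a conflict
    is treated as real when the measured and calculated values of the checked
    variable disagree by more than the tolerance.
    """
    return [
        set(components)
        for variable, components in potential
        if abs(measured[variable] - predicted[variable]) > tolerance
    ]


# Hypothetical tank levels (measured versus simulated), in metres.
measured = {"L2": 0.30, "L3": 0.10}
predicted = {"L2": 0.25, "L3": 0.10}

# Conflict sets from Figs. 20.19 and 20.20; the checked variables are assumed.
potential = [
    ("L2", {"k12", "z2", "k23"}),
    ("L3", {"k12", "z2", "z3", "k3"}),
]

if fault_detected(measured, predicted, tolerance=0.01):
    print(real_conflicts(potential, measured, predicted, tolerance=0.01))
```

The real conflicts found this way would then be passed to the diagnoses generator, which computes their minimal hitting sets.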
Model-Based Diagnosis 20.13 Concluding Remarks 459
the category of model-based reasoning since the analysis is performed by using the system model. Contrary to the so-called expert systems, especially ones using the so-called shallow knowledge of an expert (knowledge that results from experience and is useful for diagnosis), methods based on a model of the system do not require expert knowledge, experience, evidence, etc., but only the model of correct system behavior.
A classical example of the use of model-based diagnosis, the illustrative multiplier–adder system, is presented in detail. The methods used are based on consistency-based reasoning and abductive inference. Both paradigms use models of the system. A typical example of consistency-based reasoning is Reiter's theory. Abductive inference may use a model in the form of a CG. A detailed analysis of an example CG by abductive reasoning was shown. The produced results are consistency-based reasonings with the use of a system model.
The main idea for the search of conflicts which form disjunctive conceptual faults is the one of potential conflict structures; it was shown that PCS can be used to find all DCFs in an efficient way, and even to compile the diagnostic knowledge.
An elaborated analysis of qualitative diagnoses was presented in detail. Such an analysis yields more

An example of the application of the theory to a dynamic system of three tanks was also presented in detail. In the case of dynamic systems, the online generation of conflicts may be necessary. As new information becomes accessible, the DCFs can be recalculated, thanks to the provided theorems. A note on practical diagnostic experiments and tools was provided as well.
The presented groups of methods have well-defined theoretical foundations. However, for efficient application, they require adjustment to the specific type of the diagnosed system. Moreover, new approaches to problem statement and new tools may open new diagnostic possibilities; those include embedding the diagnostic process within the framework of constraint programming and compilation of diagnostic knowledge [20.33, 34]. These methods may serve as the core of an advanced diagnostic system, but in order to improve efficiency, they should be equipped with specific domain knowledge and heuristic knowledge. They may also be complementary to one another, and it seems reasonable to join expert knowledge based on experience with knowledge about the system model. The model should allow for the diagnosis of new, previously unknown failures, while the expert component should allow for improving reasoning efficiency.
References
20.1 R. Davis, W. Hamscher: Model-based reasoning: Troubleshooting. In: Readings in Model-Based Diagnosis, ed. by W. Hamscher, L. Console, J. de Kleer (Morgan Kaufmann, San Mateo 1992) pp. 3–24
20.2 W. Hamscher, L. Console, J. de Kleer (Eds.): Readings in Model-Based Diagnosis (Morgan Kaufmann, San Mateo 1992)
20.3 J. Korbicz, J.M. Kościelny, Z. Kowalczuk, W. Cholewa (Eds.): Fault Diagnosis. Models, Artificial Intelligence, Applications (Springer, Berlin 2004)
20.4 J. Liebowitz (Ed.): The Handbook of Applied Expert Systems (CRC, Boca Raton 1998)
20.5 D. Poole: Normality and faults in logic-based diagnosis, Proc. IJCAI-89, Detroit, ed. by N.S. Sridharan (Morgan Kaufmann, San Mateo 1989) pp. 1304–1310
20.6 A. Ligęza: Logical Foundations for Rule-Based Systems (Springer, Berlin, Heidelberg 2006)
20.7 M.R. Genesereth: The use of design descriptions in automated diagnosis, Artificial Intell. 24, 411–436 (1984)
20.8 P. Fuster, A. Ligęza, J.A. Martin: Abductive diagnostic procedure based on an and/or/not graph for expected behaviour: Application to a gas turbine, Proc. 10th Int. Congr. and Exhib. Cond. Monit. Diagn. Eng. Manag. (COMADEM), ed. by E. Jantunen, K. Holmberg, R.B.K. Rao (Valtion Teknillinen Tutkimuskeskus, Helsinki 1997) pp. 511–520
20.9 K.D. Althoff, E. Auriol, R. Barletta, M. Manago: A Review of Industrial Case-Based Tools, AI Intelligence Report (Oxford 1995)
20.10 C. Bach, D. Allemang: Case-based reasoning in diagnostic expert systems, Artificial Intell. Commun. 9(2), 49–52 (1996)
20.11 I. Watson: Applying Case-Based Reasoning: Techniques for Enterprise Systems (Morgan Kaufmann, San Francisco 1997)
20.12 P.M. Frank: Fault diagnosis in dynamic systems using analytical and knowledge-based redundancy – A survey and some new results, Automatica 26(3), 459–474 (1990)
20.13 P.M. Frank: Analytical and qualitative model-based fault diagnosis – A survey and some new results, Eur. J. Control. 2, 6–28 (1996)
20.14 R. Paton, P. Frank, R. Clark: Fault Diagnosis in Dynamic Systems. Theory and Applications (Prentice Hall, USA 1989)
20.15 S.G. Tzafestas (Ed.): Knowledge-Based System Diagnosis, Supervision and Control (Plenum, New York, London 1989)
20.16 M.O. Cordier, P. Dague, M. Dumas, F. Lévy, J. Moutmain, M. Staroswiecki, L. Travé-Massuyès: AI and automatic control approaches of model-based diagnosis: Links and underlying hypotheses, Proc. 4th IFAC Symp. Fault Detection, Superv. Saf. Technical Process., ed. by A.M. Edelmayer (IFAC, Budapest 2000) pp. 274–279
20.17 M.O. Cordier, P. Dague, M. Dumas, F. Lévy, J. Moutmain, M. Staroswiecki, L. Travé-Massuyès: A comparative analysis of AI and control theory approaches to model-based diagnosis, Proc. 14th Eur. Conf. Artificial Intell. ECAI'2000, ed. by W. Horn (IOS, Berlin 2000) pp. 136–140
20.18 L. Travé-Massuyès: Bridges between diagnosis theories from control and AI perspectives. In: Intelligent Systems in Technical and Medical Diagnosis, ed. by J. Korbicz, M. Kowal (Springer, Heidelberg 2014) pp. 3–28
20.19 J. de Kleer, A.K. Mackworth, R. Reiter: Characterizing diagnoses and systems, Artificial Intell. 56, 197–222 (1992)
20.20 R. Davis: Diagnostic reasoning based on structure and behavior, Artificial Intell. 24, 347–410 (1984)
20.21 J. de Kleer, B.C. Williams: Diagnosing multiple faults, Artificial Intell. 32, 97–130 (1987)
20.22 R. Reiter: A theory of diagnosis from first principles, Artificial Intell. 32, 57–95 (1987)
20.23 P. Fuster-Parra: A Model for Causal Diagnostic Reasoning. Extended Inference Modes and Efficiency Problems, Ph.D. Thesis (Univ. Balearic Islands, Palma de Mallorca 1996)
20.24 A. Ligęza: A Note on Systematic Conflict Generation in CA-EN-Type Causal Structures, LAAS Report No. 96317 (LAAS, Toulouse 1996)
20.25 A. Ligęza, P. Fuster-Parra: AND/OR/NOT causal graphs – A model for diagnostic reasoning, Appl. Math. Comput. Sci. 7(1), 185–203 (1997)
20.26 A. Ligęza, B. Górny: Systematic conflict generation in model-based diagnosis, Proc. 4th IFAC Symp. Fault Detection, Superv. Saf. Technical Process., Budapest, ed. by A.M. Edelmayer (IFAC, Budapest 2000) pp. 1103–1108
20.27 B. Górny, A. Ligęza: Model-based diagnosis of dynamic systems: Systematic conflict generation. In: Model-Based Reasoning, Scientific Discovery, Technological Innovations, Values, ed. by L. Magnani, N.J. Nersessian, C. Pizzi (Kluwer Academic, Dordrecht 2002) pp. 273–291
20.28 A. Ligęza: Selected methods of knowledge engineering in system diagnosis. In: Fault Diagnosis. Models, Artificial Intelligence, Applications, ed. by J. Korbicz, J.M. Kościelny, Z. Kowalczuk, W. Cholewa (Springer, Berlin 2004) pp. 633–668, Chap. 16
20.29 A. Ligęza, J.M. Kościelny: A new approach to multiple fault diagnosis. Combination of diagnostic matrices, graphs, algebraic and rule-based models. The case of two-layer models, Int. J. Appl. Math. Comput. Sci. 18(4), 465–476 (2008)
20.30 B. Górny: Consistency-Based Reasoning in Model-Based Diagnosis, Ph.D. Thesis (AGH, Kraków 2001)
20.31 B. Pulido, C.A. González: An alternative approach to dependency-recording engines in consistency-based diagnosis. In: Artificial Intelligence: Methodology, Systems, and Applications, Lecture Notes in Artificial Intelligence, Vol. 1904, ed. by S.A. Cerri, D. Dochev (Springer, Berlin, Heidelberg 2000) pp. 111–121
20.32 B. Pulido, C.A. González: Possible conflicts: A compilation technique for consistency-based diagnosis, IEEE Trans. Systems Man and Cybernetics 34(5), 2192–2206 (2004)
20.33 A. Ligęza: A constraint satisfaction framework for diagnostic problems. In: Diagnosis of Processes and Systems, ed. by Z. Kowalczuk (Pomeranian Science and Technology, Gdańsk 2009) pp. 255–262
20.34 A. Ligęza: Towards knowledge compilation for automated diagnosis: A qualitative, model-based approach with constraint programming. In: Advanced and Intelligent Computations in Diagnosis and Control, ed. by Z. Kowalczuk (Springer International, Switzerland 2016) pp. 355–367
21. Thought Experiments in Model-Based Reasoning
Margherita Arcangeli
function? (Sect. 21.5). These issues will lead us to tackle other important points, such as the relationship between real and thought experimentation, the differences between philosophical and scientific thought experimentation, and the role played by intuitions and imagination in thought experimentation.

21.5 How Do Thought Experiments Achieve Their Function?
21.5.1 A Cognitive Approach to Thought Experimentation
21.5.2 Imagination and Thought Experimentation
21.5.3 The Narrative Dimension of Thought Experimentation
References
21.1 Overview
Thought experiments (TEs) have a rather curious history. Although thought experimentation is as old as philosophy, both the introduction of the term and the philosophical interest in thought experiments as such have a much more recent history. Since the beginning of Western philosophy, many famous thought experiments have been proposed and discussed, including Plato's ring of Gyges [21.1, 358a–360d], Hilary Putnam's brain in the vat [21.2], John Locke's inverted spectrum [21.3, II, Ch. 32, §15], Galileo Galilei's thought experiment on free-fall [21.4], Étienne Bonnot de Condillac's statue [21.5], Immanuel Kant's on handedness [21.6], Charles Darwin's thought experiment on the evolution of the eye [21.7], Henri Poincaré's disk world [21.8], Albert Einstein's lift [21.9], Werner Heisenberg's γ-ray microscope [21.10], Tyler Burge's arthritis [21.11], and John Searle's Chinese room [21.12]. These examples are only a sample of a vast production that spans a huge amount of time and issues (for other surveys and in-depth analyses see contributions in Horowitz and Massey [21.13]; Casati et al. [21.14]; Ierodiakonou and Roux [21.15]; Frappier et al. [21.16], and in the special issues of Philosophica – 72, 2003 – and that of Perspectives on Science – 2/22, 2014). But what is a thought experiment? After a brief historical introduction (Sect. 21.2), this chapter focuses on philosophical answers to this question (Sect. 21.3) and on other issues concerning thought experimentation, namely on the issues about the function of thought experiments (Sect. 21.4) and how thought experiments achieve their function (Sect. 21.5). Before turning to these questions, it is worth having a precise idea of what is under discussion by looking at some concrete examples. The remainder of the section will be devoted to the introduction of six thought experiments that can be seen as a good sample of the most quoted and discussed thought experiments in the literature.

“bodies differing in heaviness [gravità] are moved in the same medium with unequal speeds, which maintain to one another the same ratio as their weights [gravità].”

Galileo thought that this supposition was false and maintained that a heavy cannon ball of 100 or more pounds will not anticipate a half-pound musket ball both dropped from a height of 200 arms. In order to make his point, Galileo put forward a thought experiment and observed that the Aristotelian idea leads to a contradiction. His thought experiment runs as follows (Fig. 21.1).
Imagine two bodies (e.g., stones) that have unequal weights, and so speeds (e.g., the heavy stone falls with a rate of 8 and the light stone with a rate of 4). Suppose that these bodies are linked together (e.g., with a weightless chain) and, then, that one drops them from a certain height (e.g., the top of the Tower of Pisa). The Aristotelian thesis would entail that the velocity of the composite body will have: a) an intermediate value between the two, since the lighter body delays the heavier, and b) a higher value than the two, since both bodies are lighter than their union.
Galileo's conclusion is that large and small bodies fall with the same speed [21.4, p. 65]. He stresses that the slight differences we experience are due to external factors, such as the air resistance. Arguably, “in the vacuum their velocities would be completely identical” [21.4, p. 73].
Interestingly, it turns out that this hypothesis holds also for bodies made of different materials, for example, a hammer and a feather, as the well-known recreation of the experiment by Apollo 15 astronaut David Scott
on the Moon showed. An Italian team of scientists has since conducted an atomistic version of Galileo's experiment, in which atoms of different weights fall in the vacuum at the same speed [21.17]. How could Galileo have seen so far using only his imagination?

21.1.2 Stevin's Chain Thought Experiment

In De statica, the fourth tome of Simon Stevin's Hypomnemata mathematica, Stevin was dealing with the force needed to keep an object on an inclined plane from sliding down, and he concluded that the force required is inversely proportional to the length of the plane. In order to demonstrate his result, he proposed the following thought experiment.
Imagine a triangular prism ABC (depicted in section in Fig. 21.2) whose base (AC) is horizontal and whose left side (AB) is twice the length of the right side (BC). A wreath of 14 balls of equal weight and size is draped over the prism, so that four balls are on AB (D, R, Q, P), two on BC (E, F), and the remaining eight beneath the base (G, H, I, K, L, M, N, O). The spheres are linked by a thread passing through their centers, so that they can move, but they must remain equally spaced. Moreover, the sides are frictionless, and S, T, and V are fixed points on which the thread can slide freely.
Stevin claimed that if D, R, Q, and P were not balanced by E and F, one of the two groups of spheres would pull the other. What would happen if this was so? Let us suppose that the block with eight balls (D, R, Q, P, O, N, M, and L) generates more force than the one with six balls (E, F, K, I, H, and G) and D, R, Q, and P slide down the left side. Thus, D will go down where O is and E, F, G, and H will take the place of P, Q, R, and D, while I and K will take the place of E and F. Nevertheless, as Stevin remarked, we are now facing the same setup as at the beginning, and for the same reason the four balls on the left side will slide down the left side and be replaced by four other balls, and so on and so forth. “These spheres will produce by themselves a continuum and eternal motion [continuum et aeternum motum], which is false” [21.18, p. 35]. Given the fact that a perpetual motion is absurd, we are led to conclude that the block with eight spheres and the one with six balance each other. Moreover, since the spheres beneath the base are symmetrically arranged and pull equally in both directions, we can imagine cutting the string at the two lower corners without disturbing the equilibrium. Hence, two spheres (E, F) offset four spheres (D, R, Q, and P). From that Stevin concluded that the force required to keep an object from sliding down on an inclined plane varies inversely with the length of the plane and derived his law of the inclined plane: The ratio of the force to the weight is equal to the ratio of the height to the length of the plane.

21.1.3 Newton's Bucket Thought Experiment

When we are sitting in a train and realize that the train on the next platform changes its spatial relationship with our train, without any other external cue we are not immediately able to say which train is really moving. However, when a motion changes in speed or direction (i.e., accelerates), it is directly detectable, apparently without reference to any other object. Are accelerated motions a special type of motion or even in such cases is there an implicit contrast between relative and absolute motion? In his Principia Mathematica Isaac Newton [21.19] put forward a thought experiment in order to ascertain whether absolute and relative mo-
tions differ as regards their effects in the context of
accelerated motions, more precisely whether only the
D T
R former would involve centrifugal force. Let me explain
E
Q B step by step the thought experiment.
P Imagine a bucket containing water, which hangs on
F
S V a rope that is twisted as much as possible, and then re-
A C
O G leased. We can distinguish four phases (Fig. 21.3): (a)
both the water and the bucket are stationary, (b) only the
N H bucket begins to move, (c) the water and the bucket are
M
both moving, (d) the bucket stops, but the water keeps
I
moving. According to Newton, the existence of relative
L K motion between the water contained in the bucket and
the bucket itself is not able to explain the changes in
Fig. 21.2 Stevin’s chain: ABC is the section of the prism, the surface shape of the water, when the water is ro-
whose basis (AC) is horizontal and the left side (AB) is tating. Indeed, in both the first and the third phases,
twice the length of the right side (BC). Fourteen balls (D– the water and the bucket are stationary relative to one
R) compose the chain draped over the prism. S, T and V another. Yet while in (a) the water’s surface is flat, in
are fixed points on which the chain can slide freely (c) it is concave. Likewise, in both the second and the
[…] as a thought experiment, Newton claimed to have performed it (clearly not in empty space!).

According to the standard interpretation of the bucket experiment (be it a real or a thought experiment), the Newtonian theory could explain the phenomenon: Absolute space is what discriminates between absolute and relative motions, that is, between real and illusory motions. Absolute space would be the system with respect to which it is possible to understand that in the first and in the second phase the water surface is flat, because the water does not really move, whereas it does in the third and the fourth phases, thus climbing the bucket’s wall.

21.1.4 Gettier’s Thought Experiment

In his article, Is Justified True Belief Knowledge?, Edmund Gettier [21.20] seriously undermined the classic definition of knowledge via thought experimentation. Almost from Plato to 1963, knowledge was considered […]

[…] field problem [21.22]. Briefly, this example sets up an imaginary scenario in which a farmer is worried about his cow Daisy and is relieved when he sees her black and white shape in the field. It happens that Daisy was safely in the field, but hidden in a hollow. What the farmer mistook for his cow was a large sheet of black and white paper. Like Smith, the farmer has a justified, true belief which seems not straightforwardly to count as knowledge.

21.1.5 Twin Earth

Among philosophers, a famous thought experiment is that of Twin Earth. Putnam [21.23, 24] proposed a thought experiment in order to show that psychological states conceived in a narrow (i.e., intensional) sense do not determine the references (i.e., extensions) of natural kind terms.

Suppose that somewhere in the Universe there is Twin Earth, which is a planet exactly like Earth, with the exception that the chemical composition of what is called water on Twin Earth is not H2O, but a very long and complicated formula that can be abbreviated as XYZ. Nevertheless, water and twin water (i.e., what is called water on Twin Earth) have the same visual appearance, flavor, odor, etc. Thus, if Earthlings ever visit Twin Earth, at first they will believe that the term water has the same meaning on both planets. But they will suitably revise their belief after discovering that water on Twin Earth refers to XYZ. And it would be the same for Twin Earthlings, if they ever visited Earth and discovered that water on Earth refers to H2O.

Oscar-te is the Doppelgänger on Twin Earth of Oscar, a typical inhabitant of Earth, and we can suppose that both persons are perfect duplicates. Going back to 1750 (about 50 years before the discovery of the chemical composition of water on Earth and, by hypothesis, also of what is called water on Twin Earth), neither Oscar nor Oscar-te had beliefs about the chemical elements of what they call water. Yet the term water referred to H2O on Earth and to XYZ on Twin Earth, respectively, in 1750 as much as nowadays (i.e., the extension of the term did not change). Although by hypothesis Oscar and Oscar-te had all the same beliefs, and enjoyed the exact same psychological states with respect to the word water, they did not mean the same thing by water, because each use referred to a different substance. Therefore, Putnam argues that meanings of words are not in our heads.

21.1.6 Mary the Super-Scientist

Frank Jackson [21.25] proposed a famous thought experiment (also known as the knowledge argument) aimed at showing that physicalism (i.e., the view that everything can be physically explained) cannot account for our knowledge about what it feels like to be in a certain mental state. The thought experiment runs as follows.

Suppose that Mary is a brilliant neuroscientist who knows all physical facts about chromatic vision. For instance, she knows precisely which combinations of wavelengths from a tomato stimulate the retina, and “exactly how this produces via the central nervous system the contraction of the vocal chords and expulsion of air from the lungs” [21.25, p. 130], which results in the utterance of the judgment the tomato is red. Although she possesses all physical information concerning what happens when, for example, we see the redness of a tomato and use terms like red, she has never had the experience of seeing any color, because she has spent all her life in a black and white room. Now imagine that Mary could see a tomato, because either she is released from her room or provided with a color television monitor. Would she learn something or not? The intuitive answer seems to be yes. But does she really learn a new fact? These questions have sparked a lively debate [21.26], which led Jackson himself to revise his position and claim that, after all, Mary is not discovering a new fact, but rather a new way to represent it.

21.2 Historical Background
[…] on thought experiments into two phases, which we could call classical and contemporary (see [21.27] for a more fine-grained division in four stages). Thomas S. Kuhn, in his 1964 paper, A Function for Thought Experiments, marks the division between the two phases. Kuhn has the merit of having highlighted an epistemological problem hitherto underestimated. His essay opened a new way of viewing thought experiments as puzzles, by asking: How might a mere thought experiment yield new knowledge, without the input of new data? It should be noted that Kuhn’s paper is mostly known in its 1977 version. This is the reason why some of his contemporaries (e.g., Carl Hempel) can be considered as belonging to the classical phase, even if their writings are later than Kuhn’s essay.

The contemporary phase can, therefore, be separated quite sharply from the classical phase along two criteria:

1. A greater awareness of the problematic aspect of thought experiments tied to their epistemic function […]

[…] experiments outside the realm of science (mainly physics), in particular to philosophical thought experimentation.

In this section, I shall review the most important steps of the history of thought experimentation. I begin with the origin of the term (Sect. 21.2.1). Then I briefly describe the two historical phases and finally introduce the main actors of each phase (Sect. 21.2.2 and Sect. 21.2.3, respectively).

21.2.1 The Rise of the Term

At the end of the nineteenth century, the Austrian physicist and philosopher Ernst Mach wrote a paper entitled Über Gedankenexperimente [21.28], which popularized the German term Gedankenexperiment (thought experiment) and sparked a methodological debate on thought experiments, specifically in the scientific domain – physics in particular.
However, the authorship of the German term goes to the Danish physicist Hans Ørsted [21.29] and, perhaps, even before him to the German physicist and aphorist Georg Christoph Lichtenberg, who wrote about experimenting with thoughts [21.30, 31]. Actually, the very starting point of the philosophical interest in thought experimenting can be found in Immanuel Kant’s philosophy [21.31], which is explicitly present in Ørsted’s work [21.32], but also inspired Lichtenberg and many other philosophers [21.33].

Despite the controversy about who coined the term (on the topic see, for example, [21.31–37]), what is striking is that thought experimentation has a much older history – in philosophy, as well as in natural sciences. According to Nicholas Rescher, Pre-Socratics “invented thought-experimentation as a cognitive procedure and [...] practiced it with great dedication and versatility” ([21.38, p. 31] – see also [21.39, 40], for a critique of Rescher’s view; on the topic of ancient thought experiments, see [21.41] and several contributions in [21.15]). Likewise, Imre Lakatos [21.42] located in Ancient Greece the beginning of thought experimentation, specifically in formal Euclidean mathematics [21.43]. Afterward the practice of thought experimentation continued (see [21.44–46] for discussions on thought experiments during the Middle Ages) and flourished, mainly thanks to exceptional thought experimenters, such as Galileo, René Descartes, Locke, Newton and Gottfried Leibniz (on thought experimentation between the Middle Ages and the introduction of the term, see contributions in Horowitz and Massey [21.13], Casati et al. [21.14], and Ierodiakonou and Roux [21.15]; on historical perspectives about Galileo’s thought experimentation, see also Prudovsky [21.47], Atkinson and […]

[…] experimentation, which seems to go beyond the scientific domain. Indeed, when he lists thought experimenters, he speaks of dreamers and novelists – whilst not of philosophers [21.28, p. 451 of the 1973 English translation]). He also wrote that “Experimenting in thought is extremely important for cognitive development” and that “the thought experiment not only is of importance in the field of physics, but on the contrary, in all fields of knowledge” [21.28, pp. 455–456]. Still he analyses only thought experiments in physics and mathematics.

Moreover, in Mach’s work we can find two traits typical of the classical phase: (i) it is not always clear how to distinguish between genuine thought experimenting and merely imagining real experiments (REs) [21.52, p. 74]; and (ii) Mach does not worry about the distinct sphere of autonomy of thought experiments as such, and ultimately he traces the latter back to real experiments. This is not intended to underestimate in any way the great significance of Mach’s analysis of thought experiments for the subsequent debate. Indeed, credit is due to his examination of the features proper to thought experiments (as well as to real experiments) and his inquiry into how they function, even from a psychological point of view (Sect. 21.5).

Alexius Meinong first noticed the quite broad notion of thought experiment introduced by Mach. In his discussion of thought experiment [21.53], he addresses the distinction between experiments carried out on thoughts (i.e., psychological experiments based on the subjects’ thoughts) and within thought (e.g., mathematical thought experiments). But he himself did not acknowledge that thought experiments should also be differentiated from imaginings about real experiments. The same can be said about another stakeholder of this […]
[…] scientific inquiry (only in an appendix, he considers their misuse – see [21.60]).

21.2.3 The Contemporary Phase

In his milestone 1964 paper, Kuhn raised three questions that set the agenda for the subsequent debate on thought experiments:

1. To what extent must the imagined situation be one that can be (or has been) found in nature, that is, what conditions of verisimilitude are thought experiments subject to?
2. How, “relying exclusively upon familiar data, can a thought experiment lead to new knowledge or to new understanding of nature?” [21.61, p. 241].
3. What, if any, kind of knowledge do thought experiments produce?

The authors of the classical phase have analyzed only thought experiments within the scientific domain, physics in particular. Although Kuhn is no exception, the many attempts that have been made to answer his questions have extended the enquiry to thought experiments in philosophy.

The contemporary phase is characterized by both a proliferation of works on thought experiments, due to the impact of Kuhn’s paper, and a very considerable production of thought experiments. Philosophy was the main protagonist of this new season of thought experimentation. Just to give some examples, between the 1970s and the 1980s: Judith Thomson [21.62] examined the concept of the right to life and how it differs from the concept of the right to what is needed to sustain life, through a bizarre kidnapping of a violinist by the Society of Music Lovers; Putnam wondered what would happen to the meaning of the word water in a twin Earth where what is called water has a different chemical composition from H2O (Sect. 21.1.5); Searle [21.12] attempted to refute the idea that the mind is a suitably programmed computer, by imagining himself in a room that receives and returns input in Chinese; Derek Parfit [21.63] puzzled over personal identity and tried to show that the concept of identity is less important than that of survival by imagining a person splitting like an amoeba (for concise but exhaustive descriptions of Thomson’s, Searle’s and Parfit’s thought experiments, see [21.64]). As hinted earlier (Sect. 21.2.1), there were philosophical thought experiments long before, but much of contemporary philosophy makes heavy use of thought experiments and it would be severely impoverished without them.

However, not all contemporary analyses of thought experiments deal with both scientific and philosophical examples (Sect. 21.4.3). As underlined by Rachel Cooper, although many authors have restricted their study to scientific thought experiments, a fine-grained theory of thought experimentation should cover both the sciences and the humanities [21.65–67].

Different answers to Kuhn’s questions have emerged in a stream of literature since the 1990s, with two polar positions: James R. Brown and John D. Norton. “The views of Brown and Norton represent the extremes of platonic rationalism and classic empiricism, respectively” [21.34, p. 69]. The best way to get into this still flourishing debate is by tackling three of the main issues which the contemporary phase has sought to clarify further: What is a thought experiment? What is the function of thought experiments? How do thought experiments achieve their function?
21.3 What Is a Thought Experiment?

[…] and, then, turn to the opposite end, namely the arguments given in favor of the theoretical nature of thought experimentation (Sect. 21.3.2). Finally, I shall dwell on the main features that should make thought experiments easily identifiable (Sect. 21.3.3).

21.3.1 Thought Experiments and the Experimental Realm

The relationship between thought and real experiments is a much discussed topic in the debate. The trend, as shown in Table 21.1, has been to underline a continuity between thought experiments and real experiments.

Table 21.1 Simplified overview of some of the positions in the debate on the continuity or discontinuity between thought experiments (TE) and real experiments (RE). Strong continuists are authors who explicitly and extensively talk about the common features between the two types of experimentation. Even though Paul Humphreys admits a parallelism between thought and real experiments, he is considered a strong discontinuist, because he sharply distinguishes between the theoretical realm of the former and the empirical realm of the latter [21.75, pp. 218–219]. In italics are highlighted philosophers who can be classified in the classical phase of the analysis of thought experiments. More will be said in the following about both Duhem and Marco Buzzoni

                                 Degree
Status                           Weak                              Strong
Continuity between TE and RE     Ørsted [21.29]                    Mach [21.28, 83]
                                 Popper [21.58]                    Koyré [21.60]
                                 Hempel [21.59]                    Sorensen [21.52]
                                 Kuhn [21.61]                      Nersessian [21.84]
                                 Brown [21.55, 70]                 Gooding [21.85]
                                 Szabó Gendler [21.81]             Häggqvist [21.86]
                                 Bokulich [21.56]                  Wilkes [21.87]
                                 Peijnenburg and Atkinson [21.79]  Bishop [21.88]
                                                                   Cohen [21.64]
                                                                   Buzzoni [21.57]
                                                                   Brendel [21.82]
Discontinuity between TE and RE  Duhem [21.54]                     Hull [21.89, 90]
                                                                   Norton [21.68]
                                                                   Humphreys [21.75]
                                                                   Hacking [21.74]

However, very often the experimental side of thought experimentation has not been evaluated per se: Thought experiments have been judged by the standards of real experiments, rather than on the basis of a broad definition of experiment that can include both types of experimentation. According to many scholars, thought experiments are not kinds of experiment, but tend to proceed as if they were.

The analysis of the experimental side of thought experiments seems to be influenced by a widespread preconception about the intrinsic epistemological superiority of real experiments. By following this preconception, we risk focusing on the features proper to true experiments (i.e., real experiments) that thought experiments lack. For instance, a typical plea for real experimentation would stress that, insofar as thought experimentation does not directly examine nature, it is less reliable and lacks justificatory power (Sect. 21.4.3). The upshot of this line of reasoning is that thought experiments should be employed only when real experiments are not available, otherwise they are useless.

Following Roy Sorensen [21.52], we might say that the problem is rooted in how the adjective thought should be interpreted in the expression thought experiment. A terminological attitude that can fall prey to the aforementioned preconception is to consider thought experiments as mere imaginary visualizations of experiments. In the works of many philosophers (particularly those belonging to the classical phase), the expression imaginary experiment is frequently used as a synonym for thought experiment. However, substituting thought with imaginary can be misleading (see also Krimsky [21.91], who claims that all imaginary experiments are thought experiments but not vice versa, and Wilkes [21.87]). The imaginary unit and the imaginary number for mathematicians, as well as the social imaginary and the child imaginary for psychologists, are not degraded entities. Still imaginary is commonly used as an adjective that falsifies or somehow discredits the phenomenon to which it refers. An imaginary friend, imaginary worlds, imaginary fears and beliefs are understood as fictional entities. The emphasis is on the negative aspects, on what they lack in order to be real friends, worlds, fears or beliefs.

It is no coincidence that Duhem called thought experiments expériences fictives (fictitious experiments), given his harsh critique of thought experiments used as if they were real experiments. Similarly, albeit not motivated by the same causticity, Hempel spoke of imaginative experiments and seemed to complain about the fact that thought experiments tend to be merely heuristic, instead of providing purported evidence to be further validated. Hempel had been influenced, more or less explicitly, by the neo-positivist distinction between context of discovery and context of justification ([21.55, p. 89]; for the relevant distinction, see Hans Reichenbach [21.92]). Contrary to real experimentation, thought experimentation would be confined to the domain of discovery, that is, the processes through which a hypothesis has been formulated, rather than how such a hypothesis could be controlled and confirmed. The dichotomy between the context of discovery and that of justification seems to have influenced much of the analysis on thought experiments belonging to the classical phase, but also some reflections of the contemporary phase, such as that of Norton (according to Brown [21.55]) and of David Hull (Sect. 21.4.3).

In his Plea for real examples [21.89, 90], Hull puts forward a critique of thought experimentation even harsher than that of Duhem. As suggested before (Sect. 21.2.2), Duhem’s criticism should not be seen as a total rejection of thought experimentation. Although Duhem is commonly presented in the literature as a detractor of the role of thought experiments in scientific practice (thus, as a discontinuist – see Table 21.1), his remarks should lead us to re-evaluate (but not dismiss) thought experimentation (also suggested in Daly [21.93, p. 114]). He criticized a naive view of real experimentation and put forward very innovative ideas about it (e.g., real experiments are not a matter of mere observation without theory; they are subject to underdetermination in the choice of a theory, that is, they cannot test isolated hypotheses). Duhem focused his attention on the negative aspects of thought experimentation that convey such a naive view without realizing that thought experimentation itself can be seen in a less simplistic way and be subject to the same conceptual revision he was advancing for real experimentation. Most likely Duhem is a continuist, who undermined a naive view of experimentation as a whole, including thought experimentation.

Hull is more likely to be considered a discontinuist. According to him, thought experiments are mostly useless, and real experiments should be preferred to them (on real examples that go beyond our imaginative ability, see [21.89, p. 312] and [21.90, p. 435]; on Hull’s view see also Sect. 21.4.3). In fact, Hull seems to admit that thought experiments can have scientific value, but only if they involve an imaginary situation which is as plausible and detailed as possible (on the importance of a detailed scenario see also Brendel [21.82] and Häggqvist [21.86], which ties in with issues discussed in Sects. 21.3.3 and 21.4.3). Moreover, he seems to take for granted that thought experiments must become, sooner or later, real experiments (on the issue about whether real experiments can resolve thought experiments, see Arthur [21.94]; Sect. 21.3.3). However, Hull is not willing to concede that thought experiments can be of value in all scientific fields, but only for those which are “well-articulated” ([21.90, p. 431], where he quotes other detractors of thought experimentation such as Wilkes [21.87] and Fodor [21.95]; other sceptics are in line with Hull’s view – for example, Feyerabend [21.96], Quine [21.97] and Thagard [21.98, 99]). For example, he argues that in biology thought experiments cannot play any role; rather they risk creating only confusion (for a critique of this position, see [21.100]; for an analysis of thought experiments in biology, see [21.101, 102]; see also [21.103], for a discussion on artificial life and thought experimentation in biology, and [21.104], for a recent discussion on the relationship between real and thought experimentation in biology).

Recalling what was said earlier with respect to how adjectives can transform the value of the noun they modify, it is interesting to notice that Hull mostly calls thought experiments fictitious examples, but when he emphasizes their positive aspect, he employs the expression hypothetical examples. Hypothetical does not convey a negative value as imaginary does. Still it is more cautious than thought. For example, a hypothetical buyer is not really a buyer; she may become a buyer, but at that point she will be a real buyer and no longer hypothetical. Thus, Hull’s view on thought experiments is nicely exemplified by his use of the adjective hypothetical as a synonym for thought.

It should be noted that positive continuist views also oversimplify thought experimentation when they anchor it to real experimentation. For instance, it has been stressed that Mach talks about thought experiments as if they always lose to real experiments [21.52, p. 74]. Mach acknowledged that the outcome of some thought experiments, such as Galileo’s thought experiment on falling bodies (Sect. 21.1.1), is strictly determined, so that the thought experimenter is led to consider superfluous “any further test by means of a physical experiment, whether rightly or wrongly” [21.28, p. 452]. However, he seems to complain about thought experimenters that avoid further real experimentation and take the result of thought experiments as conclusive. In this regard, Mach’s critique of Newton’s bucket thought experiment (Sect. 21.1.3) is a good example. According to Mach, Newton violated his rule of hypotheses non fingere (to feign no hypotheses), since he used the bucket thought experiment in order to show what was actually presupposed by the thought experiment itself (i.e., the existence of absolute space).

Similarly, even Koyré’s analysis [21.60] might seem driven by considering real experimentation as the benchmark: Thought experiments are positive, since they accentuate positive features of real experiments. In this respect, Koyré’s and Duhem’s approaches can even be seen as complementary, more than opposed.

More recent views have tried to investigate thought experimentation as a genuine experimental practice on a par with real experimentation. At least two analyses are worth mentioning. Sorensen argues that thought experiments are a limiting case of real experiments. Both types of experimentation have very similar purposes and also share methods for assessing such purposes (Sect. 21.3.3 and 21.4). Clearly, they differ insofar as
thought experimentation emphasizes the design aspect at the expense of the execution aspect (other scholars have followed Sorensen on this point – for example [21.82, 85, 105]). Sorensen goes further and argues that, although historically thought experiments have become autonomous, their origin has to be located in the mental component of real experimentation. They should be seen as the result of an evolutionary process: a “selective pressure” would have deprived real experimentation of the execution aspect, emphasizing the design aspect ([21.52, pp. 186 and 212], [21.106] for a critique of Sorensen’s evolutionary explanation; on Sorensen’s view, see also contributions in the special issue of Informal Logic [21.107–110]).

Marco Buzzoni argues that Sorensen underestimates “the technological-operational dimension of the scientific experiment” and supports a concept of real experiment as a mathematical function ([21.57, p. 175]; see also [21.36]). Buzzoni [21.36, 57, 111] develops a Kantian framework according to which from an empirical point of view (i.e., exactly with respect to the technological–operational dimension) real and thought experimentations coincide, but they are complementary from a transcendental point of view. Thus, one type of experimentation without the other is unproductive for scientific purposes (see [21.112, 113] for objections to Buzzoni’s account and Buzzoni’s reply in [21.114]). Many other scholars have underlined that thought experimentation shows an action-practical component (e.g., [21.85, 101]; see Sect. 21.3.2 and 21.5). Moreover, this component has been adduced as one of the main arguments against views that confine thought experiments to the theoretical realm.

[…] thought experiments are disguised arguments [21.68, 115–117]: a good thought experiment should be a sound argument that increases our knowledge. In other words, without epistemic loss a thought experiment can be reconstructed (translated, or reduced) into an argument – i.e., a list of propositions, of premises and assumptions, leading to a conclusion via (inductive or deductive) inferences. Thought experiments are often rhetorically embellished and frequently they do not make explicit all the assumptions on which they rely: these features conceal their argumentative nature. Along with this reconstruction thesis, Norton suggests an elimination thesis: Thought experimentation is a dispensable epistemic tool (see Szabó Gendler [21.76, 118] for a fine-grained analysis of Norton’s elimination thesis; see also Timothy Williamson’s view [21.119] – which seems in line with Norton’s view, except for the role granted to imagination – Sect. 21.5).

Humphreys claims that thought experiments “lie much closer to theory than to the world” [21.75, p. 218]. He admits that they can be assimilated to that type of real experiments that isolate “those features of the world that are represented in a theoretical model” and approximate “the idealizations that are employed therein” [21.75, p. 218]. But nowadays, according to him, this function is fulfilled by theories. In support of his argument, he compares thought experiments to computer simulations (or numerical experiments). Both methods involve refinements of theories, adjustments to conform conditions, parameters, approximations and idealizations to empirical data, and can deliberately alter parameters in order to produce laws different from those of our world.

Actually, the parallelism between thought and numerical […]
Part D | 21.3
biological case, and [21.127], for a provocative view Finally, Michael Bishop ([21.88]; see also [21.141])
according to which computer modeling will replace has offered a counterexample to Norton’s view: The
thought experimentation), the trading zone between same thought experiment can be reconstructed in two
thought experiments and numerical experiments has different arguments. Often this is the case when schol-
been sparsely considered by current works on either ars disagree about the upshot of a thought experiment;
thought or numerical experiments. These works have otherwise it would be impossible to compare their
primarily focused their attention on the links between views and determine who is right. The main exam-
these two scientific tools and real experiments (in-depth ple given by Bishop is the debate between Einstein
analysis can be found in [21.128–131]; in passing other and Niels Bohr on an Einsteinian thought experi-
authors have commented on the parallelism between ment, namely the clock-in-the-box thought experiment
thought experimentation and numerical experimentation (see the volume dedicated to Einstein of The Library
– for example, [21.36, 43, 52, 57, 64, 65, 73, 101, of Living Philosophers, edited by Paul Arthur Schilpp,
132]; see also the related topic of video games as ex- for a complete description of this thought experi-
ecutable thought experiments [21.133]). ment and also for Bohr’s objections and Einstein’s
Much criticism has been raised against theoreti- reply [21.142]).
cal views about thought experimentation, especially Nevertheless, one might think that Norton’s ar-
against Norton’s argument view. Although some have gumentative reconstruction thesis is valuable, while
found the latter too liberal (e.g., [21.134]), most maintaining that thought experiments are not arguments
philosophers have found it too restrictive and have and/or rejecting the elimination thesis. Once translated
offered four main objections. First, Norton’s transla- into arguments, thought experiments can make explicit
tion of thought experiments into arguments would lose their implicit assumptions. As pointed out by Richard
some important aspects proper to thought experimen- Arthur, “the reformulation of thought experiments as
tation (e.g., [21.71–73, 82, 101, 135–138]), such as its arguments is a vital part of the scientific process”
nonpropositional dimension or an action-practical component. ([21.136, p. 228]; see also [21.101]; see [21.143] for
These aspects should not be neglected, for a proposal which takes into account both the experimental
they play an epistemic role, rather than being merely and the argumentative sides of thought experimentation).
picturesque. Tamar Szabó Gendler [21.76, 118], for instance,
maintains that Galileo’s thought experiments
on falling bodies (Sect. 21.1.1) cannot be fully recon- 21.3.3 Thought Experiments
structed into arguments (see also [21.71, 72, 139] for and Their Features
other examples; see [21.140] for a reconstruction of
Galileo’s thought experiment within a nonclassical log- Despite the fact that there is not a unanimous defini-
ical framework). It has been pointed out that the same tion of thought experiment and different views push
holds for thought experiments that rely heavily on sen- thought experimentation toward either the empirical or
sory imagination or spatial reasoning (e.g., [21.65]). the theoretical realms, there are some features common
A second related objection concerns the cognitive to most thought experiments. It should be noted that,
underpinnings of thought experimentation. The same although discussions about these features often lead
conclusions can be drawn from a thought experiment to draw several parallelisms between thought and real
and from a logical argument, but constructing and per- experimentations, they are not committed to the exper-
forming the former are different from producing and imental nature of thought experimentation. After all it
carrying out the latter. This is so even if we take for might be profitable to study thought experiments as if
granted that all thought experiments are translatable they were experiments, even if they are not [21.52, 132].
into arguments and that such a translation procedure is In what follows I shall focus on three features common
epistemically advantageous (Sect. 21.5). to both thought and real experimentation:
Third, it has been pointed out that thought experi-
ments may feature in the argumentation steps, but this 1. The method of variation (i. e., isolation of variables,
does not mean that they are arguments. Likewise for manipulation and observation)
real experiments. Real experiments can play a role in 2. Fallibility and
or be rephrased as arguments, but typically they are not 3. Theoretical underdetermination.
considered as arguments and it is unlikely that someone
would claim that they are dispensable. We should not I shall then turn to the main feature proper to
confuse an experiment of whatever kind with its pub- thought experimentation only (i. e., the mental nature
lished description [21.86, 101]. of its laboratory) and some connected features.
474 Part D Model-Based Reasoning in Science and the History of Science
and Roux [21.144], who speak of beliefs) than natural results (e.g., the thought experimental debate between
circumstances and concrete entities. Einstein and Bohr, as well as that between Darwin
As far as the second step is concerned, one might and Fleeming Jenkin – as for the latter, see [21.100,
be uneasy with the fact that in thought experimenta- 101]).
tion an experimenter is not literally manipulating the Alisa Bokulich [21.56] has suggested that thought
variables in question. Despite philology, however, ma- experiments can show theoretical underdetermination
nipulating is not merely influencing manually. Things in that they cannot discriminate between different the-
can be rotated and moved also in our imagination. Ex- oretical frameworks (see also [21.86, particularly ch.
pert chess players or Rubik’s Cube solvers do perform 6], [21.152] and [21.153]). EPR, a thought experiment
this kind of mental manipulation [21.52, 145]. This by Einstein et al. [21.154], is an example of a thought
consideration leads us to the third step of the method experiment that can be rethought from the perspective
of variation, namely the observation of the interactions of different and incompatible theories [21.56, p. 299].
among the variables ([21.64, 73, 85, 146–148] have particularly Sidestepping the technical details of this thought experiment,
stressed both (ii) and (iii)). what is important to keep in mind is that its
Observation or visualization has seemed to many upshot is a dangerous correlation between two physi-
a necessary condition of thought experimentation, as cal quantities (position and momentum), which would
well as of real experimentation [21.36, 57, 71, 72, 85, undermine quantum mechanics (QM) (in its Copen-
147, 149, 150]. The problem is that it is not clear hagen version). Commonly EPR has been interpreted
whether observation means the same thing in both con- as a failed demonstration of the incompleteness of QM
texts. In thought experimentation, observation seems and as an argument in embryo form for a determinis-
Thought Experiments in Model-Based Reasoning 21.4 What Is the Function of Thought Experiments? 475
tic completion of QM, which is indeterministic. Indeed, Second, it has been argued that a genuine thought
precisely on the basis of EPR, David Bohm developed experiment should not require a concrete implemen-
the most famous deterministic version of QM. How- tation, which can even be impossible for practical
ever, per se EPR is not a crucial thought experiment that or ethical–political reasons. Sorensen [21.52], for in-
can help us to decide between QM and its determinis- stance, conceives thought experiments as experiments
tic rivals (EPR is widely discussed in the literature on in which the design aspect is accentuated at the expense
thought experiments – for example, see the debate be- of the execution aspect (Sect. 21.3.1). Furthermore,
tween Atkinson, [21.155], and Stöltzner, [21.43]). This he has identified three reasons (impossibility, unim-
feature of thought experiments undermines the idea that provableness, unaffordability) that explain why thought
they do not have a life of their own [21.74], that is, the experiments need not be concretely performed (see
ability to evolve and be adapted to different theories and the previous discussions on this issue in [21.54, 83]).
ends (also philosophical thought experiments seem to Sorensen’s view can be summed up by means of a spec-
have a life of their own – for example, see Twin Earth trum: There would be nonimplementation, on one ex-
chronicles in [21.156]). treme, due to the maximization of benefits and, on the
other, due to containment of losses.
The Laboratory of the Mind To put it in another way, even when possible, a real
The most striking feature of thought experiments is performance of a thought experiment would be irrele-
that they are conducted in the “laboratory of the vant to the purpose of the thought experiment [21.72].
mind” [21.55, 70]. Thought experimentation seems to This does not mean that thought experimentation can-
be grounded in imagination (Sect. 21.5). The fact that not lead to real experimentation. Arguably, a thought
we can experiment within our mind has its advantages. experiment can open new lines of inquiry, which can be
As remarked by Mach, “Our own ideas are more eas- explored by means of real experiments. Still the result-
ily and readily at our disposal than physical facts. We ing real experiment should not be seen as the realization
experiment with thought, so to say, at little expense” of the initial thought experiment.
([21.28, p. 452]; see also [21.52] for a discussion of the Some authors disagree and claim that at least some
advantages of thought experimentation over real exper- thought experiments can be concretely implemented
imentation). However, the fact that thought experiments and that, more generally, thought experimentation
are not in direct contact with natural phenomena and are should be resolved into real experimentation [21.78,
merely a product of our imagination has its shortcomings 79, 89, 90, 155, 158]. A classical example given in favor
(Sects. 21.3.1 and 21.4.3). of this view is Alain Aspect and colleagues’ real experiments,
Two other features are tied to the mental nature whose results have been published in a paper
of thought experimentation. First, thought experimenta- titled Experimental Realization of Einstein–Podolsky–
tion seems not able to give quantitative outcomes, since Rosen–Bohm Gedanken Experiment: A New Violation
it does not involve instrumental apparatus. However, at of Bell’s Inequality [21.159]. However, it is possible
Part D | 21.4
least scientific thought experiments can give quantitative to contest this interpretation of EPR. Despite part
results (e.g., Ronald Fisher’s thought experiments of the title of Aspect and colleagues’ paper, the real
that explained the influence of natural selection on sex experiment they conducted can be considered as an empirical
ratio – [21.52, p. 250]; Sect. 21.4.3). Still, the outcomes test of a hypothesis suggested by John Stewart
of real experimentation seem to be fixed and possible to Bell, who found it by studying Bohm’s version of EPR
be determined in a way that the outcomes of thought ([21.152]; see also [21.43, 160–162] for similar views
experimentation cannot ([21.157]; see also [21.52, p. on EPR).
247], on the unavoidable incompleteness of thought On a moderate view, some thought experiments can
experimentation, which ties with the issue about philo- be concretely performed, without denying a genuine
sophical thought experimentation heavily relying on status to thought experimentation (on this topic see re-
unclear background assumptions – Sect. 21.4.3). cent discussions in [21.16]; Sect. 21.3.1).
tent are thought experiments a reliable source of in- tency within Aristotle’s account of motion (due to the
formation? What role do thought experiments play Aristotelian hypothesis that speed is proportional to
in processes of rational choice? The last question is weight).
strongly connected to the issue about how to clas- Second, a thought experiment can show a problem
sify thought experiments according to their epistemic external to a given theoretical framework, that is, be-
functions. I shall begin by presenting some proposed tween the latter and other assumptions or theoretical
taxonomies (Sect. 21.4.1), then I shall turn to the ques- frameworks. Erwin Schrödinger’s cat thought experi-
tions about the type of knowledge (if any) produced by ment is such an example, since it underlined how QM
thought experimentation (Sect. 21.4.2) and about its sta- (in its Copenhagen interpretation) was in conflict with
tus (Sect. 21.4.3). our beliefs about the macroscopic level.
According to the Copenhagen interpretation of QM,
21.4.1 Sorting Thought Experiments a physical system can be in a very special state which
actually is a simultaneous superposition of different
It might be useful to have an efficient classificatory states. Once observed or measured, the physical system
scheme of immediate understanding, in order to put collapses into one of the superposed states. This
some order in the domain of thought experimentation physical phenomenon occurs only at the quantum or mi-
and to try to understand it better. However, this is not croscopic level, but the problem is precisely where to
an easy task. Thought experiments are employed in so draw the divide between the latter and the macroscopic
many disciplines. Moreover, their interpretation can de- level (i. e., the object of study of classical physics). As
pend on historical factors ([21.163]; see also [21.164]) pointed out by Schrödinger, macroscopic objects like
and on the intention of the thought experimenter, in- cats are not likely to be at the same time dead and alive
deed they can even be rethought for different purposes (Fig. 21.4).
([21.56]; Sect. 21.3.3). Following Brown and Fehige [21.27], a third sub-
Thought experiments can be classified along sev- category might be added, namely “counter thought
eral dimensions (e.g., by domains such as science vs experiments” [21.166] or “thought-experiment/anti-
philosophy, by type of reasoning such as inductive vs thought-experiment pairs” [21.115, 117]. Counter or
deductive). However, most taxonomies classify thought anti-thought experiments target thought experiments,
experiments according to their functions with respect to more than theoretical frameworks. Examples of this
a group of hypotheses or a theory (several taxonomies category are Lucretius’ thought experiment (originally
are put forward by different scholars in [21.13]). None introduced by the Pythagorean Archytas; see [21.115,
of these taxonomies seems to be definitive, but one has 167]), meant to undermine an Aristotelian thought ex-
become quite popular, namely the taxonomy proposed periment on finiteness of space, and Mach’s version
by Brown ([21.81]; see [21.165] for a critique of this of Newton’s bucket (Sect. 21.1.3) aimed at showing
taxonomy). that centrifugal forces are due to the rotatory motion
Brown first divides thought experiments into two relative to the terrestrial mass and other celestial bodies
general types, destructive and constructive. A thought ([21.168]; on this topic see, for example, [21.31,
experiment falling within the former category is “a picturesque 116, 169]). Popper stressed that counter-thought experiments
reductio ad absurdum” ([21.55, p. 34]; see run the risk of being unacceptable because they are unfair
also [21.70, p. 123]) devised in order to reject, or at with respect to the opponent’s position [21.58, p. 466].
least seriously undermine, some hypotheses or a the- This is the apologetic use condemned by Popper (on the
ory. Here Brown rejoins Popper’s taxonomy and his latter and Popper’s critical use, see [21.170]).
critical use of thought experiment [21.58], which in
turn is analogous to Hempel’s theoretical thought ex-
periments – although the latter category goes beyond
thought experiments against theories and encompasses
all thought experiments that explicitly make fruitful
predictions [21.59].
There are different ways of undermining a theory,
thus suggesting different subcategories of destructive
thought experiments. At least two subcategories can be
offered. First, a thought experiment can show a problem
internal to a given theoretical framework. This is the
case, for instance, in Galileo’s falling bodies thought
experiment (Sect. 21.1.1), since it shows an inconsis- Fig. 21.4 Schrödinger’s cat thought experiment
Constructive thought experiments aim to support tural thought experiment, since it advanced a problem
a theory or theoretical hypothesis, but they can do (i. e., things being equal from a relative point of view,
so in very different ways. Thus, Brown divides this there can be different effects on the surface of some
category into three further types, namely mediative, water contained in a rotating bucket) and its solution
conjectural, and direct. Mediative thought experiments (i. e., we should distinguish between relative and ab-
have a pedagogic or illustrative role (generally on the solute motions, where the latter refer to the absolute
pedagogical role played by thought experiments see, space). It should be noted that Mach would have dis-
for example, [21.148, 171–176]). Indeed, they help us agreed with such an interpretation of Newton’s bucket
to better understand the conclusions that can be drawn thought experiment, which would turn out to be a mediative
from a specific theory. Brown gives as an example more than a conjectural thought experiment. Accord-
James Clerk Maxwell’s demon thought experiment. ing to Mach [21.168], from the thought experiment,
According to the kinetic theory of Maxwell, there is we can reach the conclusion that absolute motions and
a probability, albeit very small, that heat moves from space do exist, only if we accept from the beginning
a cold body to a hot one. The second law of thermody- the existence of absolute space and the distinction be-
namics, however, implies the impossibility of such an tween absolute and relative motions. Moreover, if we
event. To show the logical possibility of violating clas- consider Newton’s bucket as a thought experiment run
sical thermodynamics, Maxwell proposed his thought against relativist theories of motion (such as Descartes’s
experiment of the demon [21.177]. and Leibniz’s ones), it can also be seen as a destructive
Imagine two interconnected boxes; one filled with thought experiment.
cold gas (C) and the other with hot gas (H). A very small Direct thought experiments establish new theories
door controlled by a demon is in between the two boxes starting with unproblematic phenomena. An example of
(Fig. 21.5). The demon lets fast molecules go from C this category is Stevin’s chain thought experiment, since
to H, and slow molecules go from H to C. In this way, it introduced Stevin’s law of the inclined plane – i. e., the ratio of the
while the average speed of the molecules in H would force to the weight is equal to the ratio of the height to
increase, the average speed of the molecules in C would the length of the plane (Sect. 21.1.2; this thought exper-
decrease. Since according to Maxwell’s theory, heat is iment can be seen also as destructive [21.153]).
nothing more than the average speed of the molecules, Finally, according to Brown, some thought experiments
the thought experiment shows the possibility of the flow are both destructive and direct-constructive;
of heat moving from a cold body to a hot body. these are platonic thought experiments [21.55, 70–72].
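The demon's sorting rule in the Maxwell passage can be illustrated with a toy simulation. This is only a minimal sketch under stated assumptions: the speed ranges, the threshold separating fast from slow molecules, the step count, and the function name `run_demon` are all hypothetical and not taken from Maxwell or Brown. Following the text's simplification, the "temperature" of each box is read off as the average speed of its molecules.

```python
import random


def run_demon(n_per_box=1000, steps=20000, seed=0):
    """Toy Maxwell's demon: molecules are plain speeds, sorted at a door."""
    rng = random.Random(seed)
    # Assumed starting distributions: box C is slower on average than box H.
    cold = [rng.uniform(0.0, 2.0) for _ in range(n_per_box)]
    hot = [rng.uniform(1.0, 3.0) for _ in range(n_per_box)]
    threshold = 1.5  # assumed cutoff the demon uses to call a molecule "fast"

    def mean(xs):
        return sum(xs) / len(xs)

    before = (mean(cold), mean(hot))
    for _ in range(steps):
        # A randomly picked molecule arrives at the door from one of the boxes.
        if rng.random() < 0.5 and len(cold) > 1:
            i = rng.randrange(len(cold))
            if cold[i] > threshold:   # fast molecule: let it pass from C to H
                hot.append(cold.pop(i))
        elif len(hot) > 1:
            i = rng.randrange(len(hot))
            if hot[i] < threshold:    # slow molecule: let it pass from H to C
                cold.append(hot.pop(i))
    after = (mean(cold), mean(hot))
    return before, after


if __name__ == "__main__":
    before, after = run_demon()
    print("average speed in C, H before:", before)
    print("average speed in C, H after: ", after)
```

With these assumed settings the average speed in C falls and the average speed in H rises, which is the direction of heat flow the thought experiment is meant to make conceivable. The sketch says nothing about the physical cost of the demon's sorting.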
Given their illustrative and expository role, Brown’s An example of this category is Galileo’s thought ex-
mediative thought experiments recall Popper’s heuristic periment on falling bodies (Sect. 21.1.1), since at the
use of thought experiment [21.58] – a category pro- same time it undermined the Aristotelian theory of
foundly similar to Hempel’s inductive thought experi- motion and put forward a new theoretical framework.
ments [21.59]. According to Brown, however, positive As Koyré [21.60] remarked, such a thought experi-
thought experiments can do more than merely illustrate ment seems an example of good physics made a priori.
a theory; they can help in constructing a theory. This This is precisely what characterises platonic thought
is precisely what both conjectural and direct thought experiments; they are vehicles of a priori knowledge
experiments aim to do. Contrary to mediative thought (Sect. 21.4.2).
experiments, they do not start from a specific theory, The following schema (Fig. 21.6) sums up Brown’s
but they end with one. What distinguishes conjectural taxonomy.
from direct thought experiments is that the former make up Although Brown applies his taxonomy only to scientific
conjectured phenomena and put forward theories in thought experiments, it might be extended to
order to explain them. Brown gives Newton’s bucket
experiment (Sect. 21.1.3) as an example of conjec-
Fig. 21.6 Brown’s taxonomy of thought experiments (TE): destructive, constructive, Platonic
encompass philosophical thought experiments too (as place in the imaginary scenario should be described.
suggested in [21.27]). For example, Twin Earth thought Finally, valuational thought experiments assess the ap-
experiment (Sect. 21.1.5) may count as internal- propriate, moral, or aesthetic, evaluation of the envis-
destructive, since it targets a hypothesis assumed by aged situation. However, this taxonomy can be reduced
internalist theories of meaning (i. e., identical psycho- to two broad categories. Gendler takes thought experi-
logical states imply identical references). The same ments as contemplations of imaginary cases that force
thought experiment can also be seen as a conjectural- us to account for the represented exceptional episodes
constructive one, since it highlights a problematic phe- and identifies two strategies for doing that, namely
nomenon (i. e., the situation in which identical psycho- exception-driven and norm-driven. On the one hand,
logical states imply different references) and suggests the exception helps us to establish the norms of a the-
a solution (i. e., meanings are constrained by natural ory (e.g., Galileo on falling bodies – Sect. 21.1.1) and,
kinds). The fact that a thought experiment can be dif- on the other hand, norms guide us in evaluating the
ficult to classify is specific neither to philosophical exception (e.g., the thought experiment of the ship of
thought experiments (see what was said earlier about Theseus, which questions criteria for identity). While
Newton’s bucket), nor to Brown’s taxonomy (as will factive thought experiments typically fall within the
become clearer in the following). This is nothing but former strategy, conceptual and valuational thought ex-
a symptom of how sorting thought experiments is diffi- periments fall within the latter.
cult in itself. Note that some authors might complain that all
Two more taxonomies are worth mentioning, which these taxonomies neglect to consider other specific
have sought to integrate also philosophical thought types of thought experiment. First, some thought ex-
experiments. Sorensen has proposed a taxonomy of periments seem neither to refute, nor to support a the-
thought experiments driven by the idea that thought ory, but rather to be part of the theory itself. These
experiments are “stylized” paradoxes [21.52, p. 165], thought experiments have been called functional, pre-
which serve as “alethic refuters” [21.52, p. 135]. In cisely because they have “a specific function within
other words, thought experiments can be seen as “expe- a theory” [21.180, p. 384]. In psychological testing, for
ditions to possible worlds,” whose mission is “to refute instance, thought experiments advancing brainwashing
a source statement that has an implication about the procedure made it possible to apply a frequentist conception
constituents of these worlds” [21.52, p. 135]. In analogy of probability [21.180, p. 384]. It has been pointed
with the two alethic modalities, Sorensen divides out that functional thought experiments can also be
thought experiments into two categories: refuters of found in modern physics [21.43].
possibility and refuters of necessity. Both argue against Second, some thought experiments can start by
a theory or theoretical framework: the former by point- clarifying conceptual issues and then turning to pro-
ing out inauthentic possibilities wrongly considered as vide the basis for normative judgments. This seems to
authentic, the latter by revealing neglected genuine pos- be the case with respect to the concept of money in
sibilities. Sorensen gives many examples, for instance, economics. Julian Reiss has labeled such thought exper-
while Gettier’s thought experiments (Sect. 21.1.4) and iments “genealogical” ([21.181]; see also [21.182] for
Maxwell’s demon are necessity refuters, Schrödinger’s parallelisms between thought experiments in physics
cat and Mark Johnston’s [21.178] thought experiment and in politics – on political thought experimentation,
against views on personal identity that make the lat- see also [21.183]).
ter dependent on future events are possibility refuters Third, by analyzing thought experimentation in
(see [21.86] for a critique of Sorensen’s logical regimentation, quantum gravity, Mark Shumelda [21.184] stresses that
which is at the bottom of his taxonomy; it might thought experiments can also be used in order to impose
be interesting to explore the link between Sorensen’s logical constraints on future scientific theories.
taxonomy and the one proposed in [21.179], pivoting
on the idea that while some thought experiments en- 21.4.2 Thought Experiments
large the domain of properties pertaining to the actual and Kinds of Knowledge
world, others restrict such a domain).
Gendler [21.118] proposes a tripartite taxonomy of A good thought experiment should be conducive to
thought experiments pivoting on three different ques- a new justified belief about the world or, at least,
tions that can arise from a thought experiment. First, our interpretation of the latter. For instance, thanks to
one may wonder what would happen if the imaginary Galileo’s thought experiment on free fall (Sect. 21.1.1),
scenario were to take place. This is what factive thought experiments we know that speed is not proportional to weight.
ask. Second, there are conceptual thought Hence, we have, on the one hand, reasons against the
experiments that pose the question of how what takes Aristotelian theory of motion and, on the other hand, ev-
idence in favor of Galileo’s theory, according to which the parallelism between visual and platonic percep-
speed is proportional to time. tion [21.116]. Norton highlights that we have good
Nevertheless, the fact that thought experimentation criteria for assessing the unreliability of the former, but
can produce knowledge is not a simple issue, and it the same does not hold for the latter, which relies on
turns out to be more complicated than in the case of both imagination and intuition [21.71, 72, 150]. Finally,
real experimentation. Leaving aside arguments to the since he argues that thought experiments can be re-
effect that thought experiments do not at all increase constructed into arguments without epistemic loss (see
our knowledge, disagreements arise when philosophers Sect. 21.3.2), Norton denies that thought experimenta-
want to specify the kind of knowledge that we gain tion has a distinguishing kind of epistemic force. Thus,
through thought experimentation. This is easily seen thought experiments can increase our knowledge, but
with the problem of informativeness raised by Kuhn only in the way logically sound arguments can do.
(Sect. 21.2.2): How can a thought experiment yield new Between these two extremes, many scholars agree
empirical knowledge without the input of new data? in thinking that new knowledge can be gained
This question has a paradoxical flavor due to the fact via thought experimentation. For example, Humph-
that only real experimentation is in direct contact with reys [21.75] maintains that thought experimentation
the world, from which it directly derives new materi- provides a better understanding of the conditions
als. By contrast, thought experimentation is bound to needed for a theoretical model to hold. By contrast,
use only old data, stored in the mind of the thought Gendler [21.118, 145] claims that via thought exper-
experimenter. How, therefore, can thought experiments iments, we can get either new justified beliefs about
provide us with new knowledge or understanding of contingent aspects of the natural world or new jus-
nature? And what kind of new knowledge would they tifications for old beliefs. Gendler’s reflections are
produce? clearly influenced by Kuhn, who already emphasized
Very different stances have emerged in the many at- the importance of thought experiments in conceptual re-
tempts to answer this epistemological question since the configuration. Moreover, like others (e.g., [21.73, 82]),
1990s. Two among them can be seen as the polar posi- she also follows some insights of Mach, who argued
tions of the relevant logical space: Brown and Norton, that thought experiments make explicit a kind of inar-
respectively, claiming that there is, and that there is not, ticulate knowledge, not yet organized in theoretical
new knowledge. frameworks, though stored in memory [21.28, 83]. In
Following Koyré, Brown identifies a set of a priori line with Mach’s view, it has been claimed that success-
thought experiments. These are, in Brown’s terminol- ful thought experiments transform ability knowledge
ogy, Platonic thought experiments (Sect. 21.4.1), since into propositional knowledge [21.185].
they are neither based on new empirical data, nor As far as the kind of knowledge is concerned, while
simply inferred from old ones [21.55, 70–72]. These some authors have maintained that thought experimen-
thought experiments are to be considered constitu- tation involves both a priori and a posteriori knowledge
tively a priori and a source of knowledge independent (e.g., [21.186]; see also [21.136] for an aprioristic ac-
of experience. How are we to explain, for exam- count of thought experiments), others have strongly
ple, the transition from the Aristotelian theory to the criticized any aprioristic account of thought experimen-
Galilean theory of motion? The right answer cannot tation. For instance, although Rodney Snooks [21.78]
be new sensory data, since none has been added. Ac- agrees with Brown in thinking that thought experiments
cording to Brown, it is not even possible either to are a direct vehicle for the laws of nature, he argues that
invoke any logical truth that allows us to infer that they do not give us access to a priori truths (see also
all bodies fall at the same speed, or to appeal to other Hopp [21.138], where a phenomenological approach
criteria, such as aesthetic ones (e.g., that of simplic- is defended according to which thought experimenta-
ity). Platonic thought experiments allow us to see the tion can lead us to intuit universals and relations among
laws of nature. Many authors have criticized Brown’s them).
aprioristic account, above all his risky extension of The debate has also tackled other kinds of knowledge,
Platonism from mathematics to physics [21.65, 75, for instance, both modal and counterfactual
136]. knowledge [21.119, 187].
According to Norton, pure thought is totally unable The issue about the kind of knowledge (e.g.,
to generate any kind of knowledge, except for logical new/old, a priori/a posteriori, universal/contingent, conceptual/empirical)
truths, and can only transform what the subject gained via thought experiments is
already possesses [21.115, p. 49]. Moreover, he crit- not the only thorny problem. Also, the issue about the
icizes a fundamental assumption in Brown’s account: status of such knowledge remains open.
480 Part D Model-Based Reasoning in Science and the History of Science
21.4.3 The Epistemological Status of Thought Experiments
Is the knowledge gained via thought experimentation valid or reliable? And more generally we may ask: Are thought experiments indispensable epistemic tools? These questions, on the one hand, take us back to the comparison between real and thought experimentation (Sect. 21.3.1) and, on the other hand, open the issue about thought experiments in philosophy. In what follows, I shall address these issues respectively and devote a final subsection to the topic of intuitions.
The Proper Functions of Thought Experimentation
As we have seen (Sect. 21.3.1), thought experimentation is often considered as being of a rank lower than real experimentation, as if they were competing strategies. That is, the two types of experimentation perform the same function and the real type is to be preferred when possible (Sect. 21.3.3). And indeed, real and thought experimentation seem to play very similar roles in the evaluation of theories: Both test hypotheses, help to refine theories and similarly may fail in achieving these goals. However, one might ask whether there is a functional difference between thought and real experiments.
Some authors argue that, in fact, contrary to real experiments, thought experiments cannot have a justificatory role, but only an illustrative or explanatory role [21.89, 90, 98, 99]. However, this position does not do justice to either type of experimentation. After all, even real experiments are not only means of theoretical justification. Moreover, reasoning along this line tends to focus only on the justificatory inadequacy and to see it as the major limit and deficiency of thought experimentation. Once again the running idea seems to be that thought experimentation has no role to play within the context of justification and should be confined to the context of discovery (see Sect. 21.3.1). Once again we run the risk of underestimating the peculiarities of both types of experimentation. What, then, is the proper function of thought experiments, which sets them apart from real ones and is also what motivates us to use them instead of the latter?
In the literature, answers given to this question are not crystal clear. Let us mention some of them. It has been emphasized that thought experimentation provides us with idealization and modeling of reality to a higher degree compared to real experiments (e.g., Koyré [21.60]; idealization can also be a source of unreliability, for a discussion on this topic, see Sorensen [21.188]). However, it is questionable whether this really answers the above question, or merely changes the focus to how thought experimentation functions (Sect. 21.5). Gendler [21.118] has proposed to see the functional difference in the type of results. Both thought experiments (at least scientific ones) and real experiments tell us about the real physical world, but via the former we obtain intuitions, whereas via the latter data (Sect. 21.3.3). A question arises: Do we really make use of thought experiments because we are in search of intuition rather than data?
Inspired by Kuhn, Bokulich [21.56] has suggested that thought experimentation tests the nonempirical virtues of theories, such as (internal or external) coherence, simplicity, and fruitfulness (for the notion of nonempirical virtues, see [21.189]; similar notions can be found in [21.190, 191]). On this view, Galileo’s thought experiments showed an incoherence internal to Aristotle’s theory of motion, stemming from an ambiguous use of the concepts of speed and weight. He also dared to go beyond the impasse, and to propose a new theoretical framework within which he could account for the phenomena. Bokulich’s conclusion on thought experimentation in physics finds a parallel in the work of James Lennox [21.101, 102] on Darwin’s thought experiments (see also the claim that thought experiments in science test how unified a theory is in [21.132]). Indeed, Lennox argues that thought experiments are functionally experiments, but we appeal to them under special conditions: “thought experiments are especially important when the issue at hand is the theory’s potential to explain as, and what, it claims it will” [21.101, p. 236]. It has been proposed to extend a similar approach to philosophical thought experiments [21.152].
It should be noted that most authors who challenge the epistemic validity of thought experimentation do not object to scientific thought experiments. George Bealer [21.192] has proposed to formalize this view terminologically. According to him, the expression thought experiment should refer only to those hypothetical situations designed to generate intuitions about the natural world, in other words to scientific thought experiments. Likewise, others, among the fiercest critics of the epistemological role of thought experiments, have always rescued scientific thought experiments, specifically within physics. For example, both Hull [21.90] and Snooks [21.78] limit the power of thought experiments to well-articulated scientific fields, that is, physics. Generally, there is a sharp scepticism about philosophical thought experiments.
The Status of Philosophical Thought Experimentation
Rachel B. Cooper [21.65] has pointed out that much of the analysis on thought experimentation is restricted to
Thought Experiments in Model-Based Reasoning 21.4 What Is the Function of Thought Experiments? 481
scientific thought experiments, probably due to a strategy of caution (Sect. 21.2.3). In the last decade, the number of analyses considering only philosophical thought experiments has also grown, mainly due to the debate about the role of intuition in philosophy (see below). Although Brown does not deal with examples of philosophical thought experimentation, he is aware of this shortcoming and wonders whether scientific and philosophical thought experiments can be accommodated by a single theory [21.55, pp. 28–31]. He also claims that perhaps this kind of conceptual analysis reveals a core common to philosophy and physics. If we were able to know more about such a common core, we would probably learn more about physics and philosophy, as well as thought experimentation (it is worth underlining that also in mathematics thought experimentation seems an important, and perhaps the only, form of experimentation; on the topic see, for example, [21.42], contributions in [21.13, 193–195]).
Cooper claims that often what distinguishes scientific from philosophical works are only the journals in which they are published. But papers on scientific thought experiments, such as Schrödinger’s cat (Sect. 21.4.1), can be found both in scientific and philosophical journals. Similarly, a philosophical thought experiment such as Searle’s Chinese room (Sect. 21.2.3) is tackled by philosophers, as well as by psychologists. “It is hard to distinguish science from philosophy and even harder to distinguish philosophical from scientific thought experiments” [21.65, p. 329]. Cooper suggests that a comprehensive analysis of thought experimentation cannot avoid giving an account of both philosophical and scientific thought experiments.
Besides Cooper, other scholars have tried, more or less extensively, to analyze both scientific and philosophical thought experiments, without raising a barrier between them [21.52, 64, 79, 82, 118, 141, 152, 161].
Arguably science and philosophy are intertwined, but one might think that the latter is more hostage to speculation and boundless imagination than the former. Precisely for this reason, Kathleen Wilkes maintains that a good thought experimenter should envisage a scenario not too far from reality and specify all conditions relevant to its understanding ([21.87, p. 9]; see also [21.28]). Thus, she considers thought experimentation fruitful only in the scientific domain, because the latter, contrary to the philosophical domain, cannot deviate too much from reality and must invoke a type of thought experimentation more akin to real experimentation (this point brings us back to the issue about the biased continuity between real and thought experimentation, see Sect. 21.3.1; see recent discussion in [21.196] about the idea that also thought experimentation in science is weakened by being dependent on imagination, Sect. 21.5). According to Wilkes, there are two conditions for any experimenter (thought or real): first, to aim at testing a theory by varying key parameters and maintaining constant other relevant parameters (Sect. 21.3.3) and, second, not to violate natural laws. This second condition should further distinguish scientific from philosophical thought experiments. On this condition, however, the philosophical thought experiment of the brain in the vat [21.2] would be acceptable, since it is not obviously nomologically impossible, whereas the thought experiment of Einstein chasing a light beam would not be (see [21.164] for an in-depth analysis of this thought experiment). Moreover, as rightly stressed by Brown [21.55, pp. 30–31]
“Too often thought experiments are used to find the laws of nature themselves; they are tools for unearthing the theoretically or nomologically possible. Stipulating the laws in advance and requiring thought experiments not to violate them would simply undermine their use as powerful tools for the investigation of nature”
(see also what will be said about Cooper on this point, Sect. 21.5.2).
Philosophical thought experiments are generally pictured by their detractors as fairy tales, which do not deserve to be taken seriously. The underlying idea seems to be that philosophy is too prone to conceptual ruminations, involving idealization and approximation, and based on a methodology less strict than the scientific one. Philosophical thought experimentation would be paradigmatic of these flaws. Following Hull’s discussions ([21.89, 90]; Sect. 21.3.1), which sum up criticisms against philosophical thought experiments very well, there are four negative aspects, which make philosophical thought experimentation less effective than scientific thought experimentation.
First, philosophical thought experiments lack well-defined theoretical frameworks. According to Hull [21.90, pp. 432, 434 and 438] this is the fundamental difference between philosophical and scientific thought experiments and, probably, the reason for the disparity (at the time of his writings) between many excellent analyses of scientific thought experiments and poor accounts of philosophical thought experiments. Hull thinks that thought experiments made within analytic philosophy well exemplify the lack of a theoretical framework that allows us to set up the imagined scenario. Provocatively, he writes: “If no such context exists, philosophers need to construct one. [...] If Jane Austen can do it, so can Hilary Putnam” [21.90, p. 434]. If a theoretical and technical background is missing, as much as one tries to refine the details of the given thought experiment, it will remain hopelessly incomplete (Sect. 21.3.3) and of poor cognitive value. Furthermore, without a reliable theoretical background, the usefulness of philosophical thought experimentation is also undermined, in so far as thought experimentation cannot exploit a fruitful interdependence between observations and theories [21.89, p. 311].
Second, philosophical thought experiments are used in order to justify or provide evidence in favor of theoretical hypotheses, though they should be used only for descriptive purposes ([21.89, pp. 315–316]; see also [21.90, p. 438 and p. 453]). According to Hull, the fact that philosophical thought experimentation relies more on common sense than on scientific data weakens its justificatory power and is also the reason why it cannot offer the same degree of technical specificity as scientific thought experimentation and real experimentation.
Third, Hull maintains that, contrary to real experimentation, thought experimentation requires a theory of conceivability as a vehicle for possibility. Thus, thought experimentation should adopt a strong standard of conceivability. Unfortunately, “Too often, the decisions that philosophers make rest heavily on intuitions about what sounds right” [21.90, p. 435]. In a nutshell, we settle for weak requirements for assessing the plausibility of the conclusions reached via thought experimentation (there is a vast philosophical literature on the link between conceivability and possibility; see, for example, contributions in [21.197], which also touch upon the issue about thought experimentation; see also [21.198]).
Finally, misleading intuitions seriously undermine the efficacy of thought experimentation. These intuitions are culturally variable, being dependent on our cultural beliefs. The latter can help us in exploring possible worlds, but can also be narrow-minded and inhibit innovation [21.90, p. 431 and p. 446]. It does emerge, from the catastrophic picture of the critics about philosophical thought experiments, that the latter are a source of, in Hull’s words, “conceptual morass” [21.89, p. 315], rather than a prelude to its remediation.
The lack of a strong standard of conceivability, the contextual vagueness, and the consequent paucity of interrelation between empirical and theoretical data are errors that can be traced to a single source: misleading intuitions. Hence, criticisms against philosophical thought experimentation can be reduced to two. Philosophical thought experimentation, on the one hand, relies on questionable intuitions and, on the other hand, purports to bring evidence in defence of a philosophical claim or theory. Without calling into question the plausibility of such a view, a further difficulty arises from the fact that the meaning of the term intuition is not crystal clear. There is not a consensus either on what intuitions are or on what we can reasonably expect from them. This is so not only in the debate on thought experimentation, but also in the specific debate about the nature of intuitions (a good starting point on the topic are the contributions in [21.199]; see also Cappelen [21.200], Chudnoff [21.201] and Booth and Rowbottom [21.202]).
Intuitions and Thought Experimentation
Intuitions seem to be an integral part in the processes of rational choice. Psychology and related disciplines have been investigating the formation and variation of our daily choices for a long time [21.203]. The picture that has emerged is that our decisions are highly sensitive to many elements which, at first sight, appear irrelevant, such as the framing of the context. Similar considerations seem to apply to intuitions as well. This line of research is a warning to a standard philosophical practice that uses intuitions generated in response to thought experiments as evidence in the assessment of a philosophical thesis (e.g., in ethics, see [21.204–206]). The lesson would be that a more rigorous program, whose goal is to observe responses obtained via thought experiments and to study the nature of the intuitions involved, is needed. A new philosophical movement known as Experimental Philosophy (or X-Phi) aims precisely at meeting this challenge, by making use of the critical methods proper to social experimental psychology (for an introduction to X-Phi, see [21.207] and contributions in [21.208]).
Note that the methodology of the experimental philosophers has been highly criticized (e.g., [21.209] and [21.210]; Williamson is also sceptical about the role of intuitions in thought experimentation [21.119, 211, 212], without being sceptical about thought experimentation itself like [21.213]). Leaving aside such critiques, X-Phi studies have shown that some philosophical thought experiments typically considered as universally acceptable (e.g., Gettier’s cases, Keith Lehrer’s Mr. Truetemp thought experiment [21.214], Putnam’s brain in a vat, Saul Kripke’s thought experiment on Gödel and Schmidt [21.215]) evoke variable intuitions both inter- and intra-subjectively [21.216, 217]. The fact that thought experiments produce unstable intuitions makes thought experimentation itself shaky. However, thought experimentation in the scientific domain is commonly considered efficient. Therefore, it is legitimate to ask whether only philosophical thought experiments evoke poor intuitions and, if not, whether scientific thought experiments have other resources in order to make their intuitions more useful.
Some contemporary philosophers have explicitly pointed out that all thought experiments, both philosophical and scientific, evoke and make use of intuitions. In Brown’s account, for example, the state of
seeing the laws of nature is interpreted in terms of having intuitions [21.55, 72]. On this view, in Galileo’s thought experiment (Sect. 21.1.1) the desired scientific conclusion comes as a result of our having the intuition that the two bodies fall at the same speed. However, it seems that in the scientific domain as well, thought experimentation can elicit misleading and unreliable intuitions (e.g., in the EPR thought experiment, which is generally interpreted as a failed thought experiment, Sect. 21.3.3). Still it is an open question whether and how these intuitions can be properly used in the scientific domain.
One way to answer this question is to argue that philosophical and scientific thought experiments do not involve the same type of intuitions. Bealer [21.192] seems to hold this view. He distinguishes between rational and physical intuitions. The former would be sui generis intellectual seemings and would arise when considering the (logical or metaphysical) possibility of an imagined scenario or the applicability of a given concept to such a scenario. The author gives as an example of this type of intuition Gettier’s cases (Sect. 21.1.4), which trigger two rational intuitions: A first intuition confirms that the case is possible, while a second intuition confirms that we cannot ascribe to the imagined subject a state of knowledge. Physical intuitions deal with what would happen if the given imagined scenario were actual, rather than with its plausibility. Newton’s rotating bucket thought experiment (Sect. 21.1.3) would exemplify this type of intuition, since in this case a physical intuition has to answer the question: “Would water creep up the side of the bucket (assuming that the physical laws remained unchanged)?” [21.192, p. 207]. According to Bealer, another feature distinguishes rational from physical intuitions: only the former present themselves as necessary. As Bealer would say, necessarily if a subject S intuits that the given imagined scenario is not a case of knowledge, it seems to S that the given imagined scenario is not a case of knowledge and also that necessarily the given imagined scenario is not a case of knowledge. By contrast, it does not seem that the water must crawl up the side of the bucket, though it is possible. Finally, Bealer claims that the expression thought experiment should be used to refer only to hypothetical situations that generate physical intuitions, i.e., to scientific thought experiments (mathematics and logic excluded).
Beyond the plausibility of a distinction between physical and rational intuitions drawn on the distinction between possibility and necessity, its relevance for a corresponding distinction between scientific and philosophical thought experiments is questionable, since the alleged self-evidence of intuitions produced by philosophical thought experimentation has been seriously challenged.
Daniel Dennett [21.218] defined thought experiments as intuition pumps. Generally, in the literature on thought experiments, this expression is interpreted in a negative sense. Indeed, Dennett does not consider (at least philosophical) thought experiments highly. However, the philosopher seems to acknowledge that thought experiments can be useful when he writes that [21.218, p. 18]
“Philosophy with intuition pumps is not science at all, but in its own informal way it is a valuable – even occasionally necessary – companion to science.”
Following Peter Swirski [21.219], it can be argued that the fact that thought experiments, both scientific and philosophical, are intuition pumps, and that these intuitions are unstable, is not negative per se. The epistemic force of a thought experiment seems precisely to arise from the fact that it depicts an exceptional case and forces us to account for the latter. Perhaps, the problem lies in an overestimation of what “can reasonably be expected of such experiments” [21.219, p. 105]. For example, it might be an exaggeration to consider thought experimentation a canonical procedure of justification [21.192], as if a single thought experiment could lead us to accept or to reject a theory. After all, even real experiments do not seem capable of doing so much.
It is also possible to argue that it is a mistake “to describe the sort of knowledge involved in these thought experiments as intuitions” [21.56, p. 300]. The idea would be that intuitions are a component in the cognitive process of thought experimenting, more than its upshot. It might even be argued that intuitions are dispensable in thought experimenting [21.31, 119, 211, 212, 220]. According to many authors [21.71, 72, 82, 137, 161, 204, 221], however, intuitions play a crucial role in thought experimenting. Through these intuitions, in conjunction with other components (e.g., theoretical assumptions, empirical data), thought experiments lead us to acquire knowledge. The power of thought experiments would rely on optimizing the combination between data, theories, and intuitions. It can be argued that scientific and philosophical thought experimentations are functionally similar: The latter have the same potential as the former to make data, theories, and intuitions interact [21.152].
Mason Myers [21.185] complained about both the lack of a deep investigation of the basis of thought experimental reasoning and the epistemic aspects of (philosophical) thought experiments. However, an epistemological approach to thought experiments is not cognitive, or not necessarily so. The difference between these two ways of studying thought experiments lies in the specific issues addressed, as well as in the fact that while the epistemological approach is generally normative, the cognitive approach is more descriptive. A fine-grained and comprehensive analysis of thought experiments should acknowledge the differences between these two approaches and pursue them together insofar as they are complementary [21.223].
In the literature, there have been several attempts to describe the stages of a thought experiment or how it works. For instance, Reiner and Gilbert [21.148] argue that there are six stages to thought experimenting:
itions in thought experimentation (Sect. 21.4.3). Jeanne Peijnenburg and David Atkinson claimed that, although there is not a unanimous definition of what thought experiments are, there is unanimity about what they should do, that is to give “a sudden and exhilarating insight” ([21.79, p. 306]; see also the definition in the Encyclopedia of Cognitive Science – [21.81]). Indeed, many philosophers in the debate on thought experiments have claimed that intuitions are an important component of the thought experimental process, as well as the type of its outcome. However, the precise role played by intuitions in thought experimentation is an open question. As previously stressed, while some authors have argued that intuitions cannot explain by themselves the epistemic role of thought experiments [21.56], others have based their scepticism about the thought experimental practice precisely on the fact that they mainly involve (deceptive) intuitions [21.79, 89, 90, 99].
Thought Experiments in Model-Based Reasoning 21.5 How Do Thought Experiments Achieve Their Function? 485
Despite this disagreement, almost all authors involved in the debate on thought experiments agree in considering thought experiments as epistemic tools, which involve imagination in order to provide insights on a certain hypothesis or theory (see the definition in the Encyclopedia of Cognitive Science [21.81]).
21.5.2 Imagination and Thought Experimentation
Mach [21.28, 83, 168] was the first to argue that imagination plays a pivotal role in thought experimentation. According to him, performing a thought experiment is to “combine circumstances” in imagination [21.28, p. 452]. Some passages of his writings have led authors to maintain that Mach conceived imagination as visualization. Among these authors is Gendler [21.145], who attributes to Mach the idea that it is visual imagery (i.e., visual imagination) that is primarily at work in thought experimenting (see also [21.52]; for a critique, see [21.223]). Gendler herself has tried to pursue Mach’s approach. She establishes a link between research of cognitive scientists and philosophers on visual imagery and the analysis of Stevin’s thought experiment on the inclined plane (Sect. 21.1.2), and finds that at least in some thought experiments the role of visual imagery is epistemically crucial (see also [21.105] on this point).
It is possible to consider the model-based approach, which calls on the literature of model-based reasoning in cognitive science, as belonging to the Machian tradition too, that is, as pursuing Mach’s aim of analyzing thought experiments with the help of a psychological theory (for a detailed analysis, see [21.86, Chap. 4]). Among the authors advocating this approach [21.85, 132, 225, 226], three are to be considered as its main developers, namely Miščević [21.137, 149], Nersessian [21.73, 84, 135, 227] and Cooper [21.65]. These authors agree in maintaining that in thought experimentation we gain new knowledge through manipulating a model. They have, however, advanced different theses pivoting on different notions of model.
Miščević and Nersessian appeal to the cognitive literature concerning mental modeling, and more specifically to the notion of the mental model proposed by Philip Johnson-Laird [21.228, 229]. In a nutshell, a mental model is a structure stored in short- or long-term memory and it is defined by cognitive scientists as a third type of mental representation, halfway between propositional and pictorial. Indeed, mental models are structurally analogous to that which they represent, but not all such models can be visualized. This is clearly seen in Nersessian’s account [21.86], whereas Miščević holds a more pictorialist view about mental models. Indeed, Miščević claims that the mental model is a “quasi-spatial picture” and has a “concrete and quasi-spatial character” [21.149, p. 220].
By contrast, Nersessian argues that the mental model manipulated in thought experimentation is neither a picture in the head nor a linguistic representation. Following Johnson-Laird, she maintains that it is rather a structural analog of the situation depicted in the thought experimental narrative [21.73, p. 297]. Nevertheless, Nersessian highlights the role of nonpropositional representations much more than Miščević: In her view the reasoning proper to thought experiments is entirely rather than partially nonpropositional. For that reason, Nersessian maintains that deductive and inductive inferences do not have a central part in thought experimentation.
Cooper disagrees with both Miščević and Nersessian and holds a much more liberal view. On the one hand, she maintains that the hypothesis of mental models is debatable as it is based on contestable empirical data. On the other hand, she argues that [21.65, p. 341]
“whether the thought experimenter reasons through the situation via manipulating a set of propositions, or a mental picture, or even plasticine characters makes no difference.”
For Cooper, a thought experimenter can manipulate physical models in addition to mental representations, and she can carry out deductive or inductive inferences, as well as diagrammatical ones. Another point of disagreement is that for Nersessian and Miščević, “models are restricted to simulating the way in which phenomena would unfold in the real world”. Cooper replies that “the thought experimenter may model a world in which some laws of nature are suspended or altered” ([21.65, p. 341]; see what has been said about Wilkes on this point, Sect. 21.4.3).
Two other views that fall within the mental-model approach are worth mentioning. First, by following Nersessian’s account, Michael Bishop [21.132] has put forward a very liberal view according to which mental models of actualized experiments and at least some computer simulations count as thought experiments (see what has been said about numerical experiments, Sect. 21.3.2). Second, David Gooding [21.85, 147] was in line with Nersessian’s approach too. However, he did not rely on the notion of a mental model and developed an embodied view on thought experiments where the bodily and visual components play a central role (see [21.230] for a more phenomenological and less naturalistic embodied approach on thought experimentation).
According to the Machian tradition, thought experiments are species of simulative-based reasoning. This
idea is implicit in Gendler’s analysis, but it is only in the model-based approach that it is made fully explicit and linked to the notion of a (physical or mental) model. John Zeimbekis [21.231] has criticized simulationist approaches to thought experiments by arguing that we should distinguish between two kinds of mental simulation: mental–mental simulation and mental–physical simulation (see also the similar distinction between recreative and icastic imagination drawn in [21.223]). He claims that while only the latter is captured by mental models, the former is a source of epistemic bias, at least for moral thought experiments. Zeimbekis grounds his argumentation on the literature in philosophy of mind about simulation theory. However, it is not entirely clear whether he is dealing with high-level or low-level mental simulations. While the former comes at the personal level and can be interpreted as conscious imagination, the latter comes at the subpersonal level and is realized by mirroring processes, for example, the activation of mirror neurons in the observation mode [21.232, 233].

Imagination is often cited by all these authors, but it is not crystal clear how imagination is defined and what the link is between mental or physical models and imagination. More generally, the role of imagination in thought experiments is a controversial topic (see [21.196]; on imagination and thought experiments see also [21.234, 235]). Indeed, some authors, like Gendler [21.145], give it a central role, while others, such as Norton [21.115], maintain that thought experimenters could and should do without it, imagination being here a source of error (see also [21.54]). Moreover, an additional complication arises once we acknowledge that accounts of imagination provided by the cognitive literature have pointed out that imagination comes in many varieties, for instance, sensory and nonsensory forms of imagination (e.g., [21.232]). A closer look at the expressions used in the literature to describe thought experimentation suggests that most authors think of imagination as the means by which the thought experimenter gains access to a scenario which is not directly accessible to her senses: in thought experimenting she quasi-observes. For instance, Brown [21.55, 70–72] speaks of seeing the laws of nature and he claims that the pictorial and sensory aspects are essential to thought experimentation (see also [21.136] on that point). Nersessian [21.73] stresses that when we perform a thought experiment, we feel ourselves to be observers. Martin Cohen takes the second rule of good thought experimenting to be that the thought experiment must be imaginable, that is, “the clearer the picture, the stronger the image, the better the experiment” [21.64, p. 106]. Gooding [21.85] claims that visualization is a necessary and sufficient condition for most if not all thought experiments. Norton [21.115] admits that thought experiments involve visualization, though he denies its epistemic role. Thought experimentation, thus, seems to involve a sensory (specifically, visual) variety of imagination. Although they are in the minority, other authors have suggested that nonsensory forms of imagination may be necessary to the thought experimenter, like supposition [21.236] or conceiving [21.237]. Indeed, Mach himself seems to have given imagination in all its forms a role in thought experimenting [21.223].

21.5.3 The Narrative Dimension of Thought Experimentation

The model-based approach has underlined a rather neglected aspect of thought experimentation, namely its narrative dimension [21.73, 85]. Thought experiments are extremely important because they are intentional products related to the sharing and the spreading of knowledge. Moreover, they are publicly presented to different audiences through narratives [21.219]. In disagreement with Norton, Nersessian has stressed that the aesthetic details in thought experimental narratives are not simply rhetorical, but “serve to reinforce crucial aspects of the [thought] experiment” ([21.73, p. 296]; she also sees a parallel between thought and real experiments, since the latter, when published, are also presented in a narrative form). Still, according to Lawrence Souder [21.238], even in Nersessian’s account the role played by the narrative aspect of thought experimentation is underestimated. The same holds for other views that deny thought experimentation a life of its own (the reference is mainly to [21.74], as we saw earlier in Sect. 21.3.3).

The narrative dimension of thought experimentation has led some authors to conclude that the reasoning underlying thought experimentation is closely related to the one used in the consumption of fiction [21.73, 149, 186, 234]. Some have proposed to consider thought experiments as a genre, like science fiction [21.239]. This view is in line with David Davies’s position. Indeed, he argues [21.240, 241] that both philosophical and scientific thought experimentation meet two necessary and sufficient conditions for the fictionality of a narrative. That is, first, they involve making believe, rather than believing, that the state of affairs described holds (this point brings us back to the issue about the role played by imagination in thought experimentation, since make-believing is a form of imagining, see [21.234, 242], Sect. 21.5.2); second, they involve a narrative constrained by some specific purpose “such as entertaining or perhaps instructing readers in certain specific ways” ([21.241]; in [21.240] he specifies that
the imaginary world should not be constrained by actual events).

If, on the one hand, we can put thought experimentation on the level of literary fiction, on the other hand, we can also do it the other way around and put the latter on the level of the former. Fictions themselves can be seen as thought experiments aiming at enriching the subject’s knowledge via journeys in more or less distant possible worlds [21.219, 243]. Precisely on this basis, both Carroll [21.244] and Elgin [21.245, 246] have tried to defend literary cognitivism, that is, the view according to which fictional narratives can be a source of knowledge or understanding of the real world (in [21.246], the author also argues that the process of exemplification is common to literary fiction, thought experimentation, and real experimentation). Some caveats to this move have been raised by Davies ([21.241]; see also [21.80]), in particular when applied to films [21.247].

Acknowledgments. I am very grateful to Marco Buzzoni, Jérôme Dokic, Yiftach Fehige, and Michael Stuart for helpful comments on earlier versions of this chapter. I would also like to thank the editor in charge of Part D, Nora Schwartz, for her support.
References
21.1 Plato: The Republic (Cosimo, New York 2008), translated by Benjamin Jowett
21.2 H. Putnam: Reason, Truth and History (Cambridge University Press, Cambridge 1981)
21.3 J. Locke: An Essay Concerning Human Understanding (Thomas Dring & Samuel Manship, London 1690/1694)
21.4 G. Galilei: Discorsi e Dimostrazioni Matematiche Intorno a Due Nuove Scienze (Louis Elsevier, Leida 1638)
21.5 E. Condillac: Traité des Sensations (Chez de Bure, London/Paris 1754)
21.6 I. Kant: Von dem ersten Grunde des Unterschieds der Gegenden im Raume. In: Vorkritische Schriften, ed. by A. Buchenau (Bruno Cassirer, Berlin 1768) pp. 375–383
21.7 C. Darwin: The Origin of Species by Means of Natural Selection, or the Preservation of Favoured Races in the Struggle for Life (John Murray, London 1859)
21.8 H. Poincaré: La Science et l’hypothèse (E. Flammarion, Paris 1908)
21.9 A. Einstein, L. Infeld: The Evolution of Physics. The Growth of Ideas from Early Concepts to Relativity and Quanta (Simon & Schuster, New York 1938)
21.10 W. Heisenberg: Physikalische Prinzipien der Quantentheorie (Hirzel, Leipzig 1930)
21.11 T. Burge: Individualism and the mental, Midwest Stud. Philos. 4, 73–121 (1979)
21.12 J. Searle: Minds, brains, and programs, Behav. Brain Sci. 3, 417–457 (1980)
21.13 T. Horowitz, G. Massey (Eds.): Thought Experiments in Science and Philosophy (Rowman & Littlefield, Lanham 1991)
21.14 R. Casati, A. Jacomuzzi, P. Kobau (Eds.): Esperimenti Mentali (Rosenberg & Sellier, Turin 2009)
21.15 K. Ierodiakonou, S. Roux (Eds.): Thought Experiments in Methodological and Historical Contexts (Brill, Leiden-Boston 2011)
21.16 M. Frappier, L. Meynell, J.R. Brown (Eds.): Thought Experiments in Philosophy, Science, and the Arts (Routledge, London, New York 2013)
21.17 M.G. Tarallo, T. Mazzoni, N. Poli, D.V. Sutyrin, X. Zhang, G.M. Tino: Test of Einstein Equivalence Principle for 0-spin and half-integer-spin atoms: Search for spin-gravity coupling effects, Phys. Rev. Lett. 113, 023005-1–023005-5 (2014)
21.18 Stevinus: De Staticae Elementis (Ioannis Patii, Leiden 1605/1608)
21.19 I. Newton: Philosophiae Naturalis Principia Mathematica (Joseph Streater, London 1687)
21.20 E. Gettier: Is justified true belief knowledge?, Analysis 23, 121–123 (1963)
21.21 B. Russell: Human Knowledge: Its Scope and Its Limits (Allen & Unwin, London 1948)
21.22 M. Cohen: 101 Philosophy Problems (Routledge, London, New York 1999)
21.23 H. Putnam: Meaning and Reference, J. Philos. 70, 699–711 (1973)
21.24 H. Putnam: Meaning of “Meaning”, Minnesota Stud. Philos. Sci. 7, 131–193 (1975)
21.25 F. Jackson: Epiphenomenal qualia, Philos. Q. 32, 27–36 (1982)
21.26 P. Ludlow, Y. Nagasawa, D. Stoljar (Eds.): There’s Something About Mary. Essays on Phenomenal Consciousness and Frank Jackson’s Knowledge Argument (MIT Press, Cambridge, London 2004)
21.27 J.R. Brown, Y. Fehige: Thought experiments. In: The Stanford Encyclopedia of Philosophy, ed. by E.N. Zalta, http://plato.stanford.edu/archives/fall2011/entries/thought-experiment/ (2014)
21.28 E. Mach: Über Gedankenexperimente, Z. Phys. Chem. Unterr. 10, 1–5 (1896), Translated by W.O. Price, S. Krimsky: On thought experiments, Philosophical Forum 4(3), 446–457 (1973)
21.29 H.C. Ørsted: Förste Indledning til den Almindelige Naturlære (J.S. Schultz, Copenhagen 1811)
21.30 C. Schildknecht: Philosophische Masken: Literarische Formen der Philosophie bei Platon, Descartes, Wolff und Lichtenberg (Metzler, Stuttgart 1990)
21.31 U. Kühne: Die Methode des Gedankenexperiments (Suhrkamp, Frankfurt 2005)
21.32 J. Witt-Hansen: H.C. Ørsted, Immanuel Kant, and the thought experiment, Danish Yearb. Philos. 13, 48–65 (1976)
21.33 Y. Fehige, M.T. Stuart: On the origins of the philosophy of thought experiments: The forerun,
Perspect. Sci. 22, 179–220 (2014)
21.34 A.S. Moue, K.A. Masavetas, H. Karayianni: Tracing the development of thought experiments in the philosophy of natural sciences, J. Gen. Philos. Sci. 37, 61–75 (2006)
21.35 D. Cohnitz: Ørsteds Gedankenexperiment: Eine Kantianische Fundierung der Infinitesimalrechnung? Ein Beitrag zur Begriffsgeschichte von “Gedankenexperiment” und zur Mathematikgeschichte des frühen 19. Jahrhunderts, Kant-Studien 99, 407–433 (2008)
21.36 M. Buzzoni: Thought Experiment in the Natural Sciences: An Operational and Reflexive-Transcendental Conception (Königshausen & Neumann, Würzburg 2008)
21.37 S. Roux: Introduction: The emergence of the notion of thought experiments. In: Thought Experiments in Methodological and Historical Contexts, ed. by K. Ierodiakonou, S. Roux (Brill, Leiden-Boston 2011) pp. 1–33
21.38 N. Rescher: Thought experimentation in presocratic philosophy. In: Thought Experiments in Science and Philosophy, ed. by T. Horowitz, G. Massey (Rowman & Littlefield, Lanham 1991) pp. 31–42
21.39 N. Rescher: What If?: Thought Experimentation in Philosophy (Transaction Publishers, New Brunswick 2005)
21.40 A. Irvine: Thought experiments in scientific reasoning. In: Thought Experiments in Science and Philosophy, ed. by T. Horowitz, G. Massey (Rowman & Littlefield, Lanham 1991) pp. 149–166
21.41 K. Ierodiakonou: Ancient thought experiments: A first approach, Ancient Philos. 25, 125–140 (2005)
21.42 I. Lakatos: Proofs and Refutations. The Logic of Mathematical Discovery (Cambridge University Press, Cambridge 1976)
21.43 M. Stöltzner: The dynamics of thought experiments – Comment to Atkinson. In: Observation and Experiment in the Natural and Social Sciences, ed. by M. Galavotti (Kluwer Academic Publishers, Dordrecht 2003) pp. 243–258
21.44 P. King: Mediaeval thought-experiments: The metamethodology of mediaeval science. In: Thought Experiments in Science and Philosophy, ed. by T. Horowitz, G. Massey (Rowman & Littlefield, Lanham 1991) pp. 43–64
21.45 D. Perler: Thought experiments: The methodological function of angels in late medieval epistemology. In: Angels in Medieval Philosophical Inquiry, ed. by I. Iribarren, M. Lenz (Ashgate, Aldershot 2008) pp. 143–153
21.46 C. Grellard: Thought experiments in late medieval debates on atomism. In: Thought Experiments in Methodological and Historical Contexts, ed. by K. Ierodiakonou, S. Roux (Brill, Leiden-Boston 2011) pp. 65–82
21.47 G. Prudovsky: The confirmation of the superposition principle: The role of a constructive thought experiment in Galileo’s Discorsi, Stud. Hist. Philos. Sci. 20, 453–468 (1989)
21.48 D. Atkinson, J. Peijnenburg: Galileo and prior philosophy, Stud. Hist. Philos. Sci. 35, 115–136 (2004)
21.49 P. Palmieri: “Spuntar lo scoglio più duro”: Did Galileo ever think the most beautiful thought experiment in the history of science?, Stud. Hist. Philos. Sci. 36, 305–322 (2005)
21.50 J. Daiber: Experimentalphysik des Geistes: Novalis und das Romantische Experiment (Vandenhoeck & Ruprecht, Göttingen 2001)
21.51 Y. Fehige: Poems of productive imagination: Thought experiments, theology, and science in Novalis, Neue Z. Syst. Theol. Religionsphilos. 55, 54–83 (2013)
21.52 R. Sorensen: Thought Experiments (Oxford University Press, Oxford 1992)
21.53 A. Meinong: Über die Stellung der Gegenstandstheorie im System der Wissenschaften (Voigtländer, Leipzig 1907)
21.54 P. Duhem: La Théorie Physique: Son Objet, sa Structure (Vrin, Paris 1914)
21.55 J.R. Brown: The Laboratory of the Mind: Thought Experiments in the Natural Sciences (Routledge, London 1991)
21.56 A. Bokulich: Rethinking thought experiments, Perspect. Sci. 9, 285–307 (2001)
21.57 M. Buzzoni: Esperimento ed Esperimento Mentale (FrancoAngeli, Milano 2004)
21.58 K. Popper: On the use and misuse of imaginary experiments, especially in quantum theory. In: The Logic of Scientific Discovery (Hutchinson, London 1959) pp. 442–456
21.59 C. Hempel: Typological methods in the natural and the social sciences. In: Aspects of Scientific Explanation and Other Essays in the Philosophy of Science (Free Press, New York 1965) pp. 155–171
21.60 A. Koyré: Galileo’s treatise de motu gravium: The use and the abuse of imaginary experiment, Rev. Hist. Sci. Paris. 13, 197–245 (1960)
21.61 T.S. Kuhn: A function for thought experiments. In: L’aventure de la Science, Mélanges Alexandre Koyré, ed. by I.B. Cohen, R. Taton (Hermann, Paris 1964) pp. 307–343
21.62 J.J. Thomson: A defense of abortion, Philos. Public Aff. 1, 47–66 (1971)
21.63 D. Parfit: Reasons and persons (Clarendon Press, Oxford 1984)
21.64 M. Cohen: Wittgenstein’s Beetle and Other Classic Thought Experiments (Blackwell, Oxford 2005)
21.65 R. Cooper: Thought Experiments, Metaphilosophy 36, 328–347 (2005)
21.66 G. Boniolo: On a unified theory of models and thought experiments in natural sciences, Int. Stud. Philos. Sci. 11, 121–142 (1997)
21.67 U. Gähde: Gedankenexperimente in Erkenntnistheorie und Physik: Strukturelle Parallelen. In: Rationalität, Realismus, Revision, ed. by J. Nida-Rümelin (de Gruyter, Berlin 2000) pp. 457–464
21.68 J. Norton: Thought experiments in Einstein’s work. In: Thought Experiments in Science and Philosophy, ed. by T. Horowitz, G. Massey (Rowman & Littlefield, Lanham 1991) pp. 129–148
21.69 R. Laymon: Thought experiments by Stevin, Mach and Gouy: Thought experiments as ideal limits and as semantic domains. In: Thought Experiments in Science and Philosophy, ed. by T. Horowitz, G. Massey (Rowman & Littlefield, Lanham 1991) pp. 167–192
21.70 J.R. Brown: Thought experiments: A Platonic account. In: Thought Experiments in Science and Philosophy, ed. by T. Horowitz, G. Massey (Rowman & Littlefield, Lanham 1991) pp. 119–128
21.71 J.R. Brown: Why thought experiments transcend experience. In: Contemporary Debates in the Philosophy of Science, ed. by C. Hitchcock (Blackwell, Oxford 2004) pp. 23–43
21.72 J.R. Brown: Peeking into Plato’s Heaven, Philos. Sci. 71, 1126–1138 (2004)
21.73 N.J. Nersessian: In the theoretician’s laboratory: Thought experimenting as mental modelling. In: PSA 1992, ed. by D. Hull, M. Forbes, K. Okruhlik (Philosophy of Science Association, East Lansing 1993) pp. 291–301
21.74 I. Hacking: Do thought experiments have a life of their own? Comments on James Brown, Nancy Nersessian and David Gooding. In: PSA 1992, ed. by D. Hull, M. Forbes, K. Okruhlik (Philosophy of Science Association, East Lansing 1993) pp. 302–308
21.75 P. Humphreys: Seven theses on thought experiments. In: Philosophical Problems of the Internal and External World: Essays on the Philosophy of Adolf Grünbaum, ed. by J. Earman, A. Janis, J. Massey, N. Rescher (University of Pittsburgh Press/Universitätsverlag Konstanz, Pittsburgh/Konstanz 1993) pp. 205–227
21.76 T. Gendler Szabó: Galileo and the indispensability of scientific thought experiment, Br. J. Philos. Sci. 49, 397–424 (1998)
21.77 E. Weber, T. De Mey: Explanation and thought experiments in history, Hist. Theory 42, 28–38 (2003)
21.78 R. Snooks: Another scientific practice separating chemistry from physics: Thought experiments, Found. Chem. 8, 255–270 (2006)
21.79 J. Peijnenburg, D. Atkinson: When are Thought Experiments Poor Ones?, J. Gen. Philos. Sci. 34, 305–322 (2003)
21.80 G. McComb: Thought experiment, definition, and literary fiction. In: Thought Experiments in Philosophy, Science, and the Arts, ed. by M. Frappier, L. Meynell, J.R. Brown (Routledge, London, New York 2013) pp. 207–222
21.81 T. Gendler Szabó: Thought experiment. In: Encyclopedia of Cognitive Science, ed. by L. Nadel (New York/London, Nature/Routledge 2002) pp. 388–394
21.82 E. Brendel: Intuition pumps and the proper use of thought experiments, Dialectica 58, 88–108 (2004)
21.83 E. Mach: Erkenntnis und Irrtum (Barili, Leipzig 1905)
21.84 N.J. Nersessian: How do scientists think? Capturing the dynamics of conceptual change in science. In: Cognitive Models of Science, ed. by R.N. Giere (University of Minnesota Press, Minneapolis 1992) pp. 3–44
21.85 D. Gooding: What is experimental about thought experiments? In: PSA 1992, ed. by D. Hull, M. Forbes, K. Okruhlik (Philosophy of Science Association, East Lansing 1993) pp. 280–290
21.86 S. Häggqvist: Thought Experiments in Philosophy (Almqvist & Wiksell, Stockholm 1996)
21.87 K. Wilkes: Real People: Personal Identity Without Thought Experiments (Clarendon Press, Oxford 1988)
21.88 M. Bishop: Why thought experiments are not arguments, Philos. Sci. 66, 534–541 (1999)
21.89 D. Hull: A function for actual examples in philosophy of science. In: What the Philosophy of Biology Is: Essays Dedicated to David Hull, ed. by M. Ruse (Kluwer Academic Publishers, Dordrecht 1989) pp. 309–321
21.90 D. Hull: That just don’t sound right: A plea for real examples. In: The Cosmos of Science: Essays of Exploration, ed. by J. Earman, J.D. Norton (University of Pittsburgh Press, Pittsburgh 1997) pp. 430–457
21.91 S. Krimsky: The Nature and Function of “Gedankenexperimente” in Physics, Ph.D. Thesis (University of Michigan, Ann Arbor 1970)
21.92 H. Reichenbach: Experience and Prediction. An Analysis of the Foundations and the Structure of Knowledge (University of Chicago Press, Chicago 1938)
21.93 C. Daly: An Introduction to Philosophical Methods (Broadview Press, Peterborough 2010)
21.94 R. Arthur: Can thought experiments be resolved by experiment? The case of Aristotle’s wheel. In: Thought Experiments in Philosophy, Science, and the Arts, ed. by M. Frappier, L. Meynell, J.R. Brown (Routledge, London, New York 2013) pp. 107–122
21.95 J. Fodor: On knowing what we would say, Philos. Rev. 73, 198–212 (1964)
21.96 P. Feyerabend: Against Method (Verso, London 1978)
21.97 W.V. Quine: Review of identity and individuation, J. Philos. 69, 488–497 (1972)
21.98 P. Thagard: The Brain and the Meaning of Life (Princeton University Press, Princeton 2010)
21.99 P. Thagard: Thought experiments considered harmful, Perspect. Sci. 22, 288–305 (2014)
21.100 M. Arcangeli: Il posto delle favole. In: Rivista di Estetica, s.i. Esperimenti mentali, Vol. 42, ed. by R. Casati, A. Jacomuzzi, P. Kobau (Rosenberg & Sellier, Turin 2009) pp. 3–19
21.101 J. Lennox: Darwinian thought experiments: A function for just-so stories. In: Thought Experiments in Science and Philosophy, ed. by T. Horowitz, G. Massey (Rowman & Littlefield 1991) pp. 223–245
21.102 J. Lennox: Darwin’s methodological evolution, J. Hist. Biol. 38, 85–99 (2005)
21.103 L.S. Swan: Synthesizing insight: Artificial life as thought experimentation in biology, Biol. Philos. 24, 687–701 (2009)
21.104 F. Doolittle: Craig Venter’s new life. The realization of some thought experiments in biological ontology. In: Thought Experiments in Philosophy, Science, and the Arts, ed. by M. Frappier, L. Meynell, J.R. Brown (Routledge, London, New York 2013) pp. 160–176
21.105 J. McAllister: Thought experiments and the belief in phenomena, PSA 2002, Philos. Sci. 71, 1164–1175 (2004)
21.106 J. Maffie: “Just-so” stories about “inner cognitive Africa”: Some doubts about Sorensen’s evolutionary epistemology of thought experiments, Biol. Philos. 12, 207–224 (1997)
21.107 R. Sorensen: Precis of thought experiments, Informal Log. 17(3), 385–387 (1995)
21.108 M. Bunzl: Bunzl on Sorensen’s thought experiments, Informal Log. 17(3), 389–393 (1995)
21.109 R. Feldman: Feldman on Sorensen’s thought experiments, Informal Log. 17(3), 394–398 (1995)
21.110 R. Sorensen: Sorensen’s reply to Bunzl and Feldman, Informal Log. 17(3), 399–405 (1995)
21.111 M. Buzzoni: Thought experiments from a Kantian point of view. In: Thought Experiments in Philosophy, Science, and the Arts, ed. by M. Frappier, L. Meynell, J.R. Brown (Routledge, London, New York 2013) pp. 90–106
21.112 Y. Fehige: Experiments of pure reason. Kantianism and thought experiments in science, Epistemologia Ital. J. Philos. Sci. 35, 141–160 (2012)
21.113 Y. Fehige: The relativized a priori and the laboratory of the mind: Towards a neo-Kantian account of thought experiments in science, Epistemologia Ital. J. Philos. Sci. 36, 55–63 (2013)
21.114 M. Buzzoni: On thought experiments and the Kantian a priori in the natural sciences: A reply to Yiftach J. H. Fehige, Epistemologia Ital. J. Philos. Sci. 36, 277–293 (2013)
21.115 J. Norton: Why thought experiments do not transcend empiricism. In: Contemporary Debates in the Philosophy of Science, ed. by C. Hitchcock (Blackwell, Oxford 2004) pp. 44–66
21.116 J. Norton: Are thought experiments just what you thought?, Canad. J. Philos. 26, 333–366 (1996)
21.117 J. Norton: On thought experiments: Is there more to the argument?, PSA 2002, Philos. Sci. 71, 1139–1151 (2004)
21.118 T. Gendler Szabó: Thought Experiment: On the Powers and Limits of Imaginary Cases (Garland Press, New York 2000)
21.119 T. Williamson: The Philosophy of Philosophy (Blackwell, Malden 2007)
21.120 N. Gilbert, K. Troitzsch: Simulation for the Social Scientist (Open University Press, Philadelphia 1999)
21.121 C. Beisbart, J. Norton: Why Monte Carlo simulations are inferences and not experiments, Int. Stud. Philos. Sci. 26, 403–422 (2012)
21.122 D. Dowling: Experimenting on theories, Sci. Context 12(2), 261–273 (1999)
21.123 A. Barberousse, S. Franceschelli, C. Imbert: Computer simulations as experiments, Synthese 169(3), 557–574 (2009)
21.124 E. Winsberg: A tale of two methods, Synthese 169(3), 575–592 (2009)
21.125 W. Parker: Does matter really matter? Computer simulations, experiments and materiality, Synthese 169(3), 483–496 (2009)
21.126 E.A. Di Paolo, J. Noble, S. Bullock: Simulation models as opaque thought experiments. In: Proceedings of the Seventh International Conference on Artificial Life, ed. by M.A. Bedau, J.S. McCaskill, N.H. Packard, S. Rasmussen (MIT Press, Cambridge 2000) pp. 497–506
21.127 S. Chandrasekharan, N.J. Nersessian, V. Subramanian: Computational modeling: Is this the end of thought experiments in science? In: Thought Experiments in Philosophy, Science, and the Arts, ed. by M. Frappier, L. Meynell, J.R. Brown (Routledge, London, New York 2013) pp. 239–260
21.128 F. Staudner: Virtuelle Erfahrung. Eine Untersuchung über den Erkenntniswert von Gedankenexperimenten und Computersimulationen in den Naturwissenschaften, Ph.D. Thesis (Friedrich Schiller Universität, Jena 1998)
21.129 M. Velasco: Experimentación y técnicas computacionales, Theoria 17, 317–331 (2002)
21.130 J. Lenhard: Epistemologie der Iteration: Gedankenexperimente und Simulationsexperimente, Dtsch. Z. Philos. 59, 131–154 (2011)
21.131 R. El Skaf, C. Imbert: Unfolding in the empirical sciences: Experiments, thought experiments and computer simulations, Synthese 190, 3451–3474 (2013)
21.132 M. Bishop: An epistemological role for thought experiments. In: Idealization in Contemporary Physics, ed. by N. Shanks (Rodopi, Amsterdam, Atlanta 1998) pp. 19–33
21.133 M. Schulzke: Simulating philosophy: Interpreting video games as executable thought experiments, Philos. Technol. 27, 251–265 (2014)
21.134 M. Bunzl: The logic of thought experiments, Synthese 106, 227–240 (1996)
21.135 N.J. Nersessian: Thought experiments as mental modelling: Empiricism without logic, Croat. J. Philos. 7(20), 125–161 (2007)
21.136 R. Arthur: On thought experiments as a priori science, Int. Stud. Philos. Sci. 13, 215–229 (1999)
21.137 N. Miščević: Modelling intuitions and thought experiments, Croat. J. Philos. 20, 181–214 (2007)
21.138 W. Hopp: Experiments in thought, Perspect. Sci. 22, 242–263 (2014)
21.139 J.R. Brown: Why empiricism won’t work. In: PSA 1992, ed. by D. Hull, M. Forbes, K. Okruhlik (Philosophy of Science Association, East Lansing 1993) pp. 271–279
21.140 R. Urbaniak: “Platonic” thought experiments: How on Earth?, Synthese 187, 731–752 (2012)
21.141 S. Häggqvist: A model for thought experiments, Canad. J. Philos. 39, 56–76 (2009)
21.142 P.A. Schilpp (Ed.): Albert Einstein: Philosopher-Scientist (Open Court, La Salle 1949)
21.143 T. De Mey: The dual nature view of thought experiments, Philosophica 72, 61–78 (2003)
21.144 J.Y. Goffi, S. Roux: On the very idea of a thought experiment. In: Thought Experiments in Methodological and Historical Contexts, ed. by K. Ierodiakonou, S. Roux (Brill, Leiden-Boston 2011) pp. 165–192
21.145 T. Gendler Szabó: Thought experiments rethought – and reperceived, Philos. Sci. 71, 1152–1164 (2004)
21.146 D. Gooding: Experiment and the Making of Meaning: Human Agency in Scientific Observation and Experiment (Kluwer Academic Publishers, Dordrecht, Boston 1990)
21.147 D. Gooding: The procedural turn; or, why do thought experiments work? In: Cognitive Models of Science, ed. by R.N. Giere (University of Minnesota Press, Minneapolis 1992) pp. 45–76
21.148 M. Reiner, J. Gilbert: Epistemological resources for thought experimentation in science learning, Int. J. Sci. Educ. 22(5), 489–506 (2000)
21.149 N. Miščević: Mental Models and Thought Experiments, Int. Stud. Philos. Sci. 6, 215–226 (1992)
21.150 J.R. Brown: What do we see in a thought experiment? In: Thought Experiments in Philosophy, Science, and the Arts, ed. by M. Frappier, L. Meynell, J.R. Brown (Routledge, London, New York 2013) pp. 53–68
21.151 A. Janis: Can thought experiments fail? In: Thought Experiments in Science and Philosophy, ed. by T. Horowitz, G. Massey (Rowman & Littlefield, Lanham 1991) pp. 113–118
21.152 M. Arcangeli: Poveri esperimenti mentali. In: Analisi. Annuario e Bollettino della Società Italiana di Filosofia Analitica SIFA, ed. by R. Davies (Mimesis, Milano/Udine 2011) pp. 277–290
21.153 D. Rowbottom: Intuitions in science: Thought experiments as argument pumps. In: Intuitions, ed. by A. Booth, D. Rowbottom (Oxford Univ. Press, Oxford 2014) pp. 119–134
21.154 A. Einstein, B. Podolsky, N. Rosen: Can quantum-mechanical description of physical reality be considered complete?, Phys. Rev. 47, 777–780 (1935)
21.155 D. Atkinson: Experiments and thought experiments in natural science. In: Observation and Experiment in the Natural and Social Sciences, ed. by M.C. Galavotti (Kluwer Academic Publishers, Dordrecht 2003) pp. 209–225
21.156 A. Pessin, S. Goldberg (Eds.): The Twin Earth Chronicles: Twenty Years of Reflection on Hilary Putnam’s “The Meaning of Meaning” (M. E. Sharpe, New York 1996)
21.157 G.N. Schlesinger: The power of thought experiments, Found. Phys. 26(4), 467–482 (1996)
21.158 G. Boniolo: On Scientific Representations From Kant to a New Philosophy of Science (Palgrave Macmillan, New York 2008)
21.159 A. Aspect, P. Grangier, G. Roger: Experimental realization of Einstein-Podolsky-Rosen-Bohm Gedankenexperiment: A new violation of Bell’s inequalities, Phys. Rev. Lett. 49(2), 91–94 (1982)
21.160 M. Rédei: Thinking about thought experiments in physics. Comment on ‘Experiments and thought experiments by David Atkinson’. In: Observation and Experiment in the Natural and Social Sciences, ed. by M. Galavotti (Kluwer Academic, Dordrecht 2003) pp. 237–241
21.161 D. Cohnitz: Poor thought experiments? A comment on Peijnenburg and Atkinson, J. Gen. Philos. Sci. 37, 373–392 (2006)
21.162 M. Dorato: Dalla freccia di Lucrezio all’ascensore di Einstein: Alcune considerazioni sul ruolo degli esperimenti mentali nella scienza. In: Rivista di Estetica, s.i. Esperimenti mentali, Vol. 42, ed. by R. Casati, A. Jacomuzzi, P. Kobau (Rosenberg & Sellier, Turin 2009) pp. 21–37
21.163 J. McAllister: The evidential significance of thought experiment in science, Stud. Hist. Philos. Sci. Part A 27, 233–250 (1996)
21.164 J. Norton: Chasing the light: Einstein’s most famous thought experiment. In: Thought Experiments in Philosophy, Science, and the Arts, ed. by M. Frappier, L. Meynell, J.R. Brown (Routledge, London, New York 2013) pp. 123–140
21.165 J. Norton: Seeing the laws of nature, Metascience 3, 33–38 (1993)
21.166 J.R. Brown: Counter thought experiments, R. Inst. Philos. Suppl. 61(82), 155–177 (2007)
21.167 K. Ierodiakonou: Remarks on the history of an ancient thought experiment. In: Thought Experiments in Methodological and Historical Contexts, ed. by K. Ierodiakonou, S. Roux (Brill, Leiden-Boston 2011) pp. 37–50
21.168 E. Mach: Die Mechanik in ihrer Entwicklung historisch-kritisch dargestellt (Brockhaus, Leipzig 1883)
21.169 S. Krimsky: The multiple-world thought experiment and absolute space, Noûs 6(3), 266–273 (1972)
21.170 S. Krimsky: The use and misuse of critical Gedankenexperimente, Z. Allg. Wissenschaftstheor. 4, 323–334 (1973)
21.171 H. Helm, J. Gilbert: Thought experiments and physics education – Part 1, Phys. Educ. 20, 124–131 (1985)
21.172 H. Helm, J. Gilbert, M.D. Watts: Thought experiments and physics education – Part 2, Phys. Educ. 20, 211–217 (1985)
21.173 S. Klassen: The science thought experiment: How might it be used profitably in the classroom?, Interchange 37, 77–96 (2006)
21.174 A. Velentzas, K. Halkia, C. Skordoulis: Thought experiments in the theory of relativity and in quantum mechanics: Their presence in textbooks and in popular science books, Sci. Educ. 16, 353–370 (2007)
21.175 M. Toscano: Thought experimentation and modelling in the science classroom, AARE Conf. Proc. (2007), TOS07411
21.176 R. Casati: Dov’è il Sole di notte? (Raffaello Cortina Editore, Milano 2013)
21.177 J.C. Maxwell: Theory of Heat (Longman, London 1871)
21.178 M. Johnston: Human beings, J. Philos. 84, 59–83 (1987)
21.179 R. Casati, J. Dokic: La Philosophie du Son (Chambon, Nîmes 1994)
Empiricism, ed. by P.M. Churchland, C.A. Hooker (University of Chicago Press, Chicago 1985) pp. 35–47
21.191 J.R. Griesemer, W.C. Wimsatt: Picturing Weismannism: A case study of conceptual evolution. In: What the Philosophy of Biology Is: Essays Dedicated to David Hull, ed. by M. Ruse (Kluwer Academic Publishers, Dordrecht 1989) pp. 75–137
21.192 G. Bealer: Intuition and the autonomy of philosophy. In: Rethinking Intuition. The Psychology of Intuition and Its Role in Philosophical Inquiry, ed. by M. DePaul, W. Ramsey (Rowman & Littlefield, Lanham 1998) pp. 201–239
21.193 K. Shrader-Frechette: Using a thought experiment to clarify a radiobiological controversy, Synthese 128, 319–342 (2001)
21.194 V. Giardino: Sperimentare con i Triangoli. In: Rivista di Estetica, s.i. Esperimenti mentali, Vol. 42, ed. by R. Casati, A. Jacomuzzi, P. Kobau (Rosenberg & Sellier, Turin 2009) pp. 39–54
21.195 M. Buzzoni: On mathematical thought experiments, Epistemol. Ital. J. Philos. Sci. 34, 61–88
proaches, Midwest Stud. Philos. 31, 128–159 (2007)
21.210 T. Williamson: Replies to Ichikawa, Martin and Weinberg, Philos. Stud. 145, 465–476 (2009)
21.211 T. Williamson: Philosophical ‘intuitions’ and scepticism about judgement, Dialectica 58, 109–153 (2004)
21.212 T. Williamson: Armchair philosophy, metaphysical modality and counterfactual thinking, Proc. Aristot. Soc. 105, 1–23 (2005)
21.213 E. Machery: Thought experiments and philosophical knowledge, Metaphilosophy 42(3), 191–214 (2011)
21.214 K. Lehrer: Theory of Knowledge (Westview Press, Boulder 1990)
21.215 S. Kripke: Naming and Necessity (Blackwell, Oxford 1980)
21.216 J. Weinberg, S. Nichols, S. Stich: Normativity and epistemic intuitions, Philos. Topics 29, 429–460 (2001)
21.217 E. Machery, R. Mallon, S. Nichols, S. Stich: Semantics, cross-cultural style, Cognition 92, B1–B12 (2004)
Thought Experiments in Model-Based Reasoning References 493
21.218 D. Dennett: Elbow Room: The Varieties of Free Will 21.232 A. Goldman: Simulating Minds: The Philosophy,
Worth Wanting (MIT Press, Cambridge 1984) Psychology, and Neuroscience of Mindreading
21.219 P. Swirski: Of Literature and Knowledge: Explo- (OUP, Oxford 2006)
rations in Narrative Thought Experiments, Evolu- 21.233 A. Goldman: Mirroring, mindreading, and sim-
tion and Game Theory (Routledge, London, New ulation. In: Mirror Neuron Systems: The Role
York 2007) of Mirroring Processes in Social Cognition, ed.
21.220 M.T. Stuart: Philosophical conceptual analysis as by J.A. Pineda (Humana Press, New York 2009)
an experimental method. In: Meaning, Frames pp. 311–330
and Conceptual Representation, ed. by T. Gamer- 21.234 L. Meynell: Imagination and insight: A new ac-
schlag, D. Gerland, R. Osswald, W. Petersen (Düs- count of the content of thought experiments,
seldorf University Press, Düsseldorf 2015) pp. 161– Synthese 191(17), 4149–4168 (2014)
186 21.235 M.T. Stuart: Imagination: A ‘Sine Qua Non’ of
21.221 T. Gendler Szabó: Philosophical thought exper- scientific understanding, Croat. J. Philos. (2015),
iments, intuitions and cognitive equilibrium, forthcoming
Midwest Stud. Philos. 31, 68–89 (2007) 21.236 K. Mulligan: La varietà e l’unità dell’imma-
21.222 M.T. Stuart: Cognitive science and thought exper- ginazione, Riv. Estet. 11, 53–67 (1999)
iments: A refutation of Paul Thagard’s skepticism, 21.237 M. Balcerack Jackson: On the epistemic value
Perspect. Sci. 22, 264–287 (2014) of imagining, supposing, and conceiving. In:
21.223 M. Arcangeli: Imagination in thought experi- Knowledge through Imagination, ed. by A. Kind,
mentation: Sketching a cognitive approach to P. Kung (Oxford Univ. Press, Oxford 2016) pp. 41–60
thought experiments. In: Model-Based Reason- 21.238 L. Souder: What are we to think about thought
ing in Science and Technology, ed. by L. Magnani, experiments?, Argumentation 17, 203–217 (2003)
W. Carnielli, C. Pizzi (Springer, Dordrecht 2010) 21.239 J. Weinberg: Configuring the cognitive imagina-
pp. 571–587 tion. In: New Waves in Aesthetics, ed. by K. Stock,
21.224 T. De Mey: Imagination’s grip on science, K. Thomson-Jones (Palgrave Macmillan, Hound-
Metaphilosophy 37, 222–239 (2006) mills, Basingstoke 2008) pp. 203–223
21.225 E. McMullin: Galilean Idealization, Stud. Hist. 21.240 D. Davies: Thought experiments and fictional nar-
Philos. Sci. 16, 247–273 (1985) ratives, Croat. J. Philos. 7, 29–45 (2007)
21.226 P. Palmieri: Mental models in Galileo’s early 21.241 D. Davies: Learning through fictional narratives in
mathematization of nature, Stud. Hist. Philos. Sci. art and science. In: Beyond Mimesis and Con-
34, 229–264 (2003) vention, ed. by R. Frigg, M.C. Hunter (Kluwer
21.227 N.J. Nersessian: Creating Scientific Concepts (MIT Academic Publishers, Dordrecht 2010) pp. 51–70
Press, Cambridge 2008) 21.242 K. Walton: Mimesis as Make-Believe: On the
21.228 P.N. Johnson-Laird: Mental Models: Toward Foundations of the Representational Arts (Har-
a Cognitive Science of Language, Inference and vard University Press, Harvard 1990)
Consciousness (Harvard Univ. Press, Cambridge 21.243 E.A. Davenport: Literature as thought experiment
1983) (On aiding and abetting the muse, Philos. Soc.
21.229 P.N. Johnson-Laird: The history of mental mod- Sci. 13(3), 279–306 (1983)
els. In: Psychology of Reasoning: Theoretical 21.244 N. Carroll: The wheel of virtue: art, literature, and
and Historical Perspectives, ed. by K. Manktelow, moral knowledge, J. Aesthet. Art Crit. 60(1), 3–26
Part D | 21
M.C. Chung (Psychology Press, New York 2004) (2002)
pp. 179–212 21.245 C.Z. Elgin: The laboratory of the mind. In: A Sense
21.230 Y. Fehige, H. Wiltsche: The body, thought experi- of the World: Essays on Fiction, Narrative, and
ments, and phenomenology. In: Thought Exper- Knowledge, ed. by W. Huerner, J. Gibson, L. Pocci
iments in Philosophy, Science, and the Arts, ed. (Routledge, London 2007) pp. 43–54
by M. Frappier, L. Meynell, J.R. Brown (Routledge, 21.246 C.Z. Elgin: Fiction as thought experiment, Per-
London, New York 2013) pp. 69–89 spect. Sci. 22, 221–241 (2014)
21.231 J. Zeimbekis: Thought experiments and mental 21.247 D. Davies: Can philosophical thought experiment
simulations. In: Thought Experiments in Method- be “screened”? In: Thought Experiments in Phi-
ological and Historical Contexts, ed. by K. Iero- losophy, Science, and the Arts, ed. by M. Frappier,
diakonou, S. Roux (Brill, Leiden-Boston 2011) L. Meynell, J.R. Brown (Routledge, London, New
pp. 193–215 York 2013) pp. 223–238
Part E
Models in Mathematics
Ed. by Albrecht Heeffer
The use of models in mathematics can broadly be divided into two categories. Most commonly, mathematical models are applied to the formal sciences and engineering, as well as to social sciences and other modes of quantitative reasoning. Secondly, models are also used within formal mathematics and in mathematical practice. This part of the book is mostly concerned with the latter use of models. In this sense, models are tools for reasoning in mathematics or teaching mathematics, and as a consequence also a means for understanding mathematical practice. The oldest and most common uses of models in mathematical practice are applications of extended cognition. Chinese counting rods, the Roman abacus, medieval jetons or reckoning counters, and modern-day computers are all material aids that allow us to delegate part of the cognitive load in performing complex calculations to the external environment. The operations that we carry out using these contrivances represent specific calculating procedures and in that sense they act as specific models of algorithms in arithmetic.

Diagrams are a more interesting application of extended cognition in mathematics. It is rather surprising that after more than 2000 years of practice with diagrams in mathematics, their precise meaning, use, and function in mathematical reasoning have become a subject of serious study only during the past 20 years. Chapter 22 by Valeria Giardino provides us with a comprehensive state of the art in recent research on mathematical diagrams. The new approach from the philosophy of mathematical practice – studying what mathematicians are actually doing when producing mathematics – has focused on the role of diagrams as a reasoning model rather than a visual representation of a mathematical object. In particular, the classic lettered diagram from Euclidean geometry has come under scrutiny, with the role of its constituent elements, their ontology, their epistemic functions, and their relation with the text and inherent ambiguities being inspected. It turns out that the diagram is not just a static object in a textbook, but as Giardino calls it, the diagram is kinaesthetic, as the text referring to it treats it as a constructed and manipulable object in which its inherent ambiguities become productive and open up modes of reasoning which are inhibited in more formal representations.

Chapter 23 is a contribution by John Mumma that builds further on research from the philosophy of mathematical practice on Euclidean diagrams, in particular the work of Ken Manders (2008). Mumma presents a formalization of the model-based reasoning involved in mathematical diagrams, not only accounting for the construction of the geometrical diagram but also its use in the demonstration.

A third application of extended cognition worth mentioning, while not treated as a dedicated subject below, is symbolic representation. It might be less obvious to view mathematical symbolism as a form of model-based reasoning, but recent research indicates that symbolism is not just a game of meaningless symbols and that modern symbolism, as it is taught and practiced today, relies on the visual processing of elements in a way that is similar to the interpretation of diagrams. Spatial organization, directionality, grouping, and mental operations like picking up and moving elements across symbolic expressions appear to be crucial for our understanding of mathematical symbolism (Heeffer 2014).

Chapter 24 is the reflection of a joint project by two philosophers of mathematical practice, Joachim Frans and Bart Van Kerkhove, and the mathematician Isar Goyvaerts. This contribution provides a more general account of model-based reasoning in mathematical practice, concentrating on the processes that are required to arrive at higher levels of abstraction in mathematics. Three examples from different mathematical disciplines show how models facilitate additional layers of abstraction. The first one is from Euclidean geometry and deals with the abstraction from concrete shapes and measures to mathematical objects to which deductive reasoning can be applied. The second one is from approximation theory, in which functions can be modeled by other, nicer and simpler functions. The third example is more technical and describes how the highest level of abstraction is achieved by the modeling of the inferences on certain algebraic objects by category theory.

Chapter 25 deals with abduction, which is well suited for modeling mathematical inferences within the context of discovery. Abduction is the response to observations or findings which appear surprising or anomalous, and consists in formulating hypotheses which allow us to evaluate their consequences. Abductive reasoning can thus lead to the emergence of new mathematical objects, ideas, or even theories. An earlier historical case study has shown how imaginary numbers appear from practice in Renaissance algebra (Heeffer 2007). By means of empirical case studies, Ferdie Rivera shows how abductive processes are prominent within students’ understanding of mathematics in the classroom, while previous studies have only focused on deductive or inductive reasoning. Four concrete suggestions are formulated to illustrate how abduction can be applied to mathematics education in a more systematic way.
22. Diagrammatic Reasoning in Mathematics
Valeria Giardino
facts? Are diagrams really part and parcel of the mathematical practice? And if so, what can be said about their features, use, and relations with other elements of the same practice? The objective of the present chapter is to introduce the most recent works on diagrammatic reasoning in mathematics and to review the answers that have been proposed so far for these questions. In this first section, the domain of inquiry – diagrammatic reasoning in mathematics – and the issues at stake in exploring it will be defined.

First of all, a clarification is needed on the meaning of the term diagram in diagrammatic reasoning, so as to avoid misinterpretations. Throughout the chapter – and possibly in contrast with other views – the term will be used in a very broad sense, that is, to include all cases of two-dimensional representations where their two-dimensionality is relevant for the way in which information is displayed and read off from them. This seemingly too vague definition is actually appropriate to refer to many different phenomena that are found in mathematics. Moreover, diagrams will be understood here as cognitive tools that are meant to spatially display information in order to improve memory and promote inference, and not necessarily to depict mathematical objects. This will have two consequences: first, the focus of the analysis will be on diagrams and not on visualizations; second, lengthy discussions about the implications for the ontology of mathematics will be avoided. For these issues, one can refer among others to Brown, who claims that diagrams are not really pictures but rather “windows to Plato’s heaven” [22.1, p. 40], or to Sherry, who argues that some particular uses of diagrams make a realist view problematic [22.2].

Diagrammatic reasoning is surely relevant for human reasoning in general. As has been pointed out, human reasoning is heterogeneous: humans happen to rely on many different sorts of instruments with the aim of externalizing thought, diagrams being among them [22.3]. A common saying is that we are halfway to finding a solution to a problem when we are able to draw the right diagram for it. Nonetheless, in relation to mathematics, it is necessary to distinguish between mere sketches and diagrams. Sketches are certainly widespread and useful for the mathematician to reason about a problem or to communicate with one’s peers. However, they will not be the topic of this chapter, which will be devoted to diagrams as parts of a system of representation. Such diagrams obey some (more or less explicit) rules and their manipulation is controlled by the particular practice, in terms that will be defined later.

Not surprisingly, most analyses of diagrammatic reasoning in mathematics have dealt with Euclidean geometry, where the recourse to diagrams is so natural and spontaneous that there is a tendency to take the presence and the effectiveness of diagrams for granted. Moreover, most diagrams in Euclidean geometry become part of our visual repertoire from a very early age at school. Think of the Pythagorean theorem and the impressive number of so-called visual proofs that have been given for it [22.4]. According to this theorem, the square of the hypotenuse ($c$) of a right triangle equals the sum of the squares of its other two sides ($a$ and $b$). In letters,

$$a^2 + b^2 = c^2. \tag{22.1}$$

One of the possible visualizations for the Pythagorean theorem is offered in Fig. 22.1.

In Fig. 22.1a, four identical right triangles have been arranged into two rectangles. To obtain a square of side $a + b$, these two rectangles are added to two squares: one of side $a$, and the other of side $b$. In Fig. 22.1b, the same four triangles have been rearranged inside the square of side $a + b$ and they now individuate another square of side $c$. By looking at the two diagrams together and by applying subtraction of the same objects – the four right triangles – to the same object – the square of side $a + b$ – the Pythagorean theorem is obtained.

Fig. 22.1a,b Pythagorean theorem

However, there are cases of diagrammatic reasoning that may be less obvious than in Euclidean geometry, for example, for statements about numerical properties. Consider the following geometric series

$$\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots = 1 \tag{22.2}$$

and its possible spatial arrangement in Fig. 22.2, in which each new rectangle or square drawn in the diagram – each new element added to the series – brings us closer to the square of area 1 (the example is taken from [22.1, pp. 36–38]).

Fig. 22.2 A geometric series

As Brown points out, this picture proof should be contrasted with a traditional proof using $\varepsilon$-$\delta$ techniques. In such a proof, we first have to note that an infinite series converges to the sum $S$ whenever the sequence of partial sums $\{s_n\}$ converges to $S$. In this case, we have

$$s_1 = \frac{1}{2},\quad s_2 = \frac{1}{2} + \frac{1}{4},\quad s_3 = \frac{1}{2} + \frac{1}{4} + \frac{1}{8},\quad s_n = \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots + \frac{1}{2^n}. \tag{22.3}$$

The values of these partial sums are

$$\frac{1}{2},\ \frac{3}{4},\ \frac{7}{8},\ \ldots,\ \frac{2^n - 1}{2^n}. \tag{22.4}$$

This infinite sequence has the limit 1, provided that for any number $\varepsilon > 0$, no matter how small, there is a number $N(\varepsilon)$, such that whenever $n > N$, the difference between the general term of the sequence, $\frac{2^n - 1}{2^n}$, and 1 is less than $\varepsilon$.

In symbols,

$$\lim_{n \to \infty} \frac{2^n - 1}{2^n} = 1 \iff (\forall \varepsilon)(\exists N)\; n > N \rightarrow \left| \frac{2^n - 1}{2^n} - 1 \right| < \varepsilon. \tag{22.5}$$

By applying some algebra, one obtains

$$\left| \frac{2^n - 1}{2^n} - 1 \right| < \varepsilon \iff \left| \frac{-1}{2^n} \right| < \varepsilon \iff 2^n > \frac{1}{\varepsilon} \iff \log_2 \frac{1}{\varepsilon} < n. \tag{22.6}$$

Let now $N(\varepsilon) = \log_2 \frac{1}{\varepsilon}$. As a consequence,

$$n > \log_2 \frac{1}{\varepsilon} \rightarrow \left| \frac{2^n - 1}{2^n} - 1 \right| < \varepsilon. \tag{22.7}$$

We have thus proven that the sum of the series is 1. Compare now the ease of forming the belief that the sum of the series is 1 by looking at the diagram in Fig. 22.2 with the resources required to prove the same result in a traditional way. The topic of the chapter will thus not only be Euclidean geometry. Other studies will be presented that analyze the usefulness of diagrammatic reasoning also in other branches of mathematics.

For the sake of completeness, there exists also very interesting work on ancient mathematics other than in Greece, involving, in some cases, also visual tools [22.5, 6]. Nonetheless, for reasons of space and given the specificity of the research, these works will not be among the subjects of the present chapter. It must also be noted that analogous considerations about the importance of diagrammatic reasoning in mathematics can be made about logic. Many scholars have discussed diagrammatic reasoning in logic, in an interdisciplinary fashion. Some studies have focused on the cognitive impact of diagrams in reasoning [22.7] and others on the importance of heterogeneous reasoning in logical proofs [22.8] and on the characteristics of nonsymbolic, in particular diagrammatic, systems of representation [22.9]. Very recently, and coherently with what will be later said about diagrammatic reasoning in mathematics, it was claimed that different forms of representation in logic are complementary to one another, and that future research should look into more accurate road maps among various kinds of representation so that the appropriate one may be chosen for any given purpose [22.10]. However, for reasons of space and despite the numerous parallels with the case of mathematics, the use of diagrammatic reasoning in logic will not be a topic of the present chapter.
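The traditional argument for the geometric series (22.2) can also be checked numerically. The following Python sketch is an illustration added here, not part of Brown's or the chapter's own presentation; the function names `partial_sum`, `threshold`, and `first_index_beyond` are hypothetical. It computes the partial sums exactly, confirms the closed form (22.4), and verifies statement (22.7) at the first index past $N(\varepsilon) = \log_2 \frac{1}{\varepsilon}$ for a few sample values of $\varepsilon$.

```python
import math
from fractions import Fraction


def partial_sum(n):
    """Exact n-th partial sum s_n = 1/2 + 1/4 + ... + 1/2^n, as in (22.3)."""
    return sum(Fraction(1, 2**k) for k in range(1, n + 1))


def threshold(eps):
    """N(eps) = log2(1/eps), the bound obtained in (22.6)."""
    return math.log2(1 / eps)


def first_index_beyond(eps):
    """Smallest integer n with n > N(eps) (hypothetical helper name)."""
    return math.floor(threshold(eps)) + 1


# The closed form (22.4): s_n = (2^n - 1) / 2^n.
for n in range(1, 11):
    assert partial_sum(n) == Fraction(2**n - 1, 2**n)

# Statement (22.7), checked at the first index past the threshold.
for eps in (0.1, 0.01, 1e-6):
    n = first_index_beyond(eps)
    gap = abs(partial_sum(n) - 1)  # equals 1 / 2^n
    assert gap < eps
    print(f"eps={eps}: n={n}, |s_n - 1|={float(gap):.3g}")
```

Exact rational arithmetic (`fractions.Fraction`) is used for the partial sums so that the comparison with the closed form is not affected by rounding. Such a check only samples finitely many cases, of course, and is no substitute for either the symbolic proof or the diagram.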
ular and that has become more and more precarious in recent years.

Famously, among others, Russell criticized Euclidean geometry for not being rigorous enough from a logical point of view [22.11, p. 404ff]. Consider the very first proposition of the Elements, which corresponds to the diagram in Fig. 22.3. The proposition invites the reader to construct an equilateral triangle from a segment AB by tracing two circles with centers A and B, respectively, and then connecting the extremes of the segment with the point that is created at the intersection of the two circles. According to Russell, “There is no evidence whatever that the circles which we are told to construct intersect, and if they do not, the whole proposition fails” [22.11, p. 404]. The proposition does, in fact, contain an implicit assumption based on the diagram – the assumption that the circles drawn in the proposition will actually meet. From Russell’s and analogous points of view, diagrams do not entirely belong to the formal or logical level, and therefore they should be considered as epistemically fragile. If this is assumed, then a proof is valid only when it is shown to be independent from the corresponding diagram or figure. In order to save Euclidean geometry from the potential fallacies derived from the appeal to diagrams, such as the one just shown, some assumptions, sometimes called Pasch axioms, were introduced. For example, it is necessary to assume that a line touching a triangle and passing inside it touches that triangle at two points, so as to avoid the reference to the corresponding diagram and make it a logical truth. By contrast, prior to the nineteenth century, such assumptions were generally taken to be “diagrammatically obvious” [22.12, p. 46].

There were historical reasons for this kind of scepticism in relation to the use of visual tools in mathematics. At the end of the nineteenth century, due to progress in disciplines such as analysis and algebra on the one hand, and the development of non-Euclidean geometries on the other, the request for a foundation of mathematics expressed a genuine mathematical need. Euclidean geometry was not the only logically possible geometry, and therefore it did not necessarily convey truth about the physical world: perception, motion, and superposition of figures had to be excluded as illegitimate procedures. In the course of the twentieth century, this search for certainty – as Giaquinto called it – became a sort of philosophical obsession [22.13]. Figures were considered as definitely unreliable, since they no longer represented our knowledge of physical space. Moreover, they give rise to errors. Famously, Klein presented a case of a diagram that is apparently correct, but which in fact induces one to draw the – false – conclusion that all triangles are isosceles triangles [22.14, p. 202]. Paradigmatic in this sense was the program of Hilbert, who attempted to rewrite geometry without any unarticulated assumptions [22.15]. For such post-nineteenth century philosophy of mathematics, a proof should be followed, not seen.

However, some studies based on the scrutiny of the practice of mathematics have recently challenged this standard point of view. As editors of a book on visualization, explanation, and reasoning styles in mathematics, Mancosu et al. explained in 2005 how it was necessary to extend the range of questions to raise about mathematics besides the ones coming from the traditional foundational programs. The focus should be turned toward the consideration of “what mathematicians are actually doing when they produce mathematics” [22.16, p. 1]:

“Questions concerning concept-formation, understanding, heuristics, changes in style of reasoning, the role of analogies and diagrams etc. have become the subject of intense interest. [...] How are mathematical objects and concepts generated? How does the process tie up with justification? What role do visual images and diagrams play in mathematical activity?”

This invitation to widen the topics of philosophical inquiry about mathematics has developed into a sort of movement, the so-called philosophy of mathematical practice, which also criticizes the “single-minded focus on the problem of access to mathematical objects that has reduced the epistemology of mathematics to a torso” [22.16, p. 1]. Epistemology of mathematics can venture beyond the present confines and address epistemological issues that have to do with [22.16, p. 1]

“[...] orthogonal to the problem of access to abstract objects.”

This approach would be more in line with what at least some of the very practitioners seem to think about the practice of mathematics. As Jones, a topologist and former Fields medallist, summarizes, it is quite usual among mathematicians to have very little understanding of its philosophical underpinnings; in his view, for a mathematician, it is actually not at all difficult to live with worries such as Russell’s paradox while having complete confidence in one’s mathematics [22.17].

In this perspective, the study of diagrammatic reasoning in mathematics thus resumes its philosophical interest, by taking into account the appropriate areas of mathematics. Before presenting the different analyses that have been provided about diagrammatic reasoning in mathematics, three features of diagrammatic reasoning that will characterize most of the studies reviewed should be pointed out. First, diagrammatic reasoning in mathematics is not only visual reasoning. In fact, in most cases, a diagram comes with a text, and, as a consequence, any analysis of diagrammatic reasoning cannot disregard the role of the text accompanying diagrams. In two very fascinating volumes, Nelsen collected a series of proofs, taken from the Mathematics Magazine, that he calls “without words” [22.18, 19]. Nonetheless, these proofs are not exactly “without words,” since to use a diagram is not only a matter of applying specific perceptual capacities but also of mastering the relevant background knowledge. In Nelsen’s proofs, diagrams refer to mathematical statements that can in some way be found in them. Diagrams and texts are, in fact, related: each practice will in turn define the terms of this relation. Second, there is another sense in which diagrammatic reasoning is not only visual. In most cases, diagrams are kinaesthetic objects, that is, they are intended to be changed and manipulated according to practice. A diagram can be conceived as an experimental ground, where mathematicians are qualified to apply epistemic actions, which are – following Kirsch and Maglio’s definition – “actions that are performed to uncover information that is hidden or hard to compute mentally” [22.20]. Third, as will be discussed in the Conclusions, the philosophical interest in studying diagrammatic reasoning is due to the cognitively hybrid status of diagrams. In fact, diagrams are certainly related to text, but at the same time, they are more than a mere visual translation of it; moreover, they are not only synoptic images, but also tools subject to manipulation; finally, they are not only part of the process of discovery, but in the appropriate context of use they are also able to constitute evidence for justification. The inquiry into diagrammatic reasoning in mathematics will in the end force us to blur the standard boundaries between the various elements of the mathematical practice.
ophy of mathematics gave foundations of logic for what was implicitly assumed in reference to a particular diagram. But what is implicit in a diagram? What cognitive abilities are needed to recognize this information and use it in a proof? Some proposals gave a Kantian reading of the spatial intuition that is involved in reasoning with a Euclidean diagram, as, for example, in the works of Shabel [22.22] and Norman [22.23]. According to these views, in Euclid’s time, spatial and visual intuition was considered as mathematically reliable, and tacit assumptions were warranted on the basis of spatial and visual information. Nonetheless, these works have a wider scope than that of the present chapter, that is, they aim to give evidence in favor of the plausibility of a Kantian philosophy of mathematics, or at least of part of it. For this reason, they will not be discussed here.

Despite the specificity of the Euclidean case, in the remainder of the chapter, it will become evident how some of the characteristics of diagrammatic reasoning in Euclidean geometry can also be adapted to other mathematical practices involving diagrams. As already mentioned, the literature about diagrammatic reasoning in ancient Greek geometry is vast. The studies presented here are among the most influential ones. For other works, one can refer to the bibliography at the end of the chapter and to the references given in the single studies.

22.3.1 The (Greek) Lettered Diagram

The first analysis that will be introduced is the original and fascinating contribution on the shaping of Greek deduction provided by Netz [22.12]. Netz’s aim is to reconstruct a cognitive history of the use of diagrams and text in Greek mathematics. According to his definition, cognitive history lies at the intersection of the history of science and cognitive science: it is analogous to the history of science, because it takes into account cultural artifacts, but it is also comparable to cognitive science because it approaches knowledge not through its specific propositional contents but by looking at its forms and practice. In Netz’s words, such an intersection is “an interesting but dangerous place to be in” [22.12, p. 7]. In fact, his worry is that historians might see his research as over-theoretical and too open to generalization, while cognitive scientists might consider it as too “impressionistic” [22.12, p. 7].

Netz’s idea, in line with the philosophical approach described in Sect. 22.1, is to look at specific practices and consider the influence that they might have (or might have had) on the cognitive possibilities of science. His case study is Greek geometry. Note that Netz’s analysis concerns Greek geometry in general and, differently from the studies that will be presented below, does not focus on Euclid only. He starts from the observation that despite the already discussed post-nineteenth century criticisms, when doing Euclidean geometry, one would find it difficult [22.12, p. 23]

“to unsee the diagram, to teach oneself to disregard it and to imagine that the only information there is is that supplied by the text. Visual information is itself compelling in an unobtrusive way.”

Euclidean diagrams seem to be part of the visual repertoire of shapes and figures that we are familiar with. If this is the case, then any analysis of Euclidean geometry must take this fact into account. One possible strategy would be to try to reconstruct the geometric practice of the time and focus on what Netz believes is the distinctive mark of Greek mathematics, something that has not been developed independently by any other culture: the lettered diagram.

Following Netz’s definition, the lettered diagram is a combination of distinct elements that taken together make it possible to generalize an argument that is given in a single diagram having specific geometrical properties. The lettered diagram can thus be considered at different levels. At the logical level, it is composed, as the name suggests, of a combination of the continuous – the diagram – and the discrete – the letters added to it. At the cognitive level, it is a mixture of the visual resources that are triggered by it, and the finite manageable models that the letters made accessible. By following Peirce’s distinction among icons, indices, and symbols [22.24], the lettered diagram associates, at the semiotic level, an icon – the diagram – with some indices – the letters. As will be shown in the next sections, Peirce’s distinction will be a reference also for other studies on diagrammatic reasoning in mathematics. It is interesting to point out from now that the Peircean terminology, despite being a common background for many of these authors, is applied in a variety of ways to different elements of diagrammatic reasoning in mathematics. The lettered diagram can be considered also from a historical point of view. Against this background, the same diagram is a combination of two elements. First, it refers to an art related to the construction of the diagram which, in Netz’s analysis, is most likely a banausic art, that is, a practical art serving utilitarian purposes only. Second, it exploits a form of very sophisticated reflexivity, which is related to the use of the letters. The lettered diagram is an effective geometric tool precisely because of the richness of these different aspects characterizing it. In a lettered diagram, we see how almost antagonistic elements are integrated, so as to make it the appropriate instrument to promote and justify deduction [22.12, p. 67].
Diagrammatic Reasoning in Mathematics 22.3 The Euclidean Diagram 505
In Netz’ reconstruction, Greek mathematics is constituted by a whole set of procedures for argumentation. These procedures are based on the diagram, which consequently serves as a source of evidence. Thanks to the procedure described in the text accompanying the lettered diagram in Fig. 22.3, one knows that the circles will actually meet at the intersection point. An interesting consequence of this reading is that the lettered diagram supplies a universe of discourse, without referring to any ontological principle. According to Netz, this would be a characteristic feature of Greek mathematics: the proof is done at an object level – the level of the lettered diagram – and no abstract objects corresponding to it need to be assumed. As he explains, in Greek practice [22.12, p. 57]:

“One went directly to diagrams, did the dirty work, and, when asked what the ontology behind it was, one mumbled something about the weather and went back to work. [. . . ] There is a certain single-mindedness about Greek mathematics, a deliberate choice to do mathematics and nothing else. That this was at all possible is partly explicable through the role of the diagram, which acted, effectively, as a substitute for ontology.”

This point on the ontology of the Euclidean diagram is not uncontroversial. Other studies dealing with the Euclidean practice consider it necessary to take into account the abstract objects to which, in a way yet to be defined, the diagrams seem to refer. For example, Azzouni conjectures that the Greek geometers had to posit an ontology of geometrical objects, even if, in his stipulationalist reading, this drive was not motivated by sensitivity to the presence of anything ontologically independent of us that mathematical terms refer to, but rather by geometers’ need to prove things in greater generality and to make applications easier [22.25]. We will see later how Panza introduces quasi-concrete geometrical objects (Sect. 22.3.4).

In this perspective, the paradox that Netz has to solve is how to explain that one proof – done by referring to a particular diagram, inevitably having specific properties – can be considered as a general result. In his interpretation, a proof in the Greek practice is an event occurring on a papyrus or in a given oral communication, and, despite this singularity, is something that is felt to be valid. Nonetheless, validity must be understood here in a different sense than the standard one. When looking at Greek mathematics, and contrary to the post-nineteenth century philosophy of mathematics, logic seems to collapse back into cognition.

In order to reply to this challenge, Netz first points out that generality in Greek mathematics exists only on a global plane: a theorem is proved having the global system of Greek mathematics as a background. Thanks to this feature, the proof can be considered as invariant under the variability of the single action of drawing one diagram on the papyrus or of presenting the particular proof orally. Therefore, in Greek mathematics, what counts is the repeatability of the proof rather than the generalizability of the result (for details, see [22.12, Chap. 5]). According to Netz, to understand Greek geometry, a change of mentality is required: while we are used to generalizing a particular result, Greek mathematicians were used to extending the particular proof to other proofs using other and different objects that are nonetheless characterized by the same invariant elements. A particular construction, given by the lettered diagrams – the diagram plus the text accompanying it – can be repeated, and this is considered as certain.

The lettered diagram was a very powerful tool, because it allowed Greek mathematicians to automatize and elide many of the general cognitive processes that are implied in doing geometry. This was connected to expertise: the more expert a mathematician was, the more immediately he became aware of relations of form and the more readily he read off information from the diagram. Interestingly enough, such a feature of the practice with the Greek diagrams seems to be found in other contemporary mathematical practices as well. As the topologist and former Fields medallist Thurston has proposed, mathematicians working in the same field and thus familiar with the same practice share the same “mental model” [22.26], which seems to refer precisely to the structure of the particular field and the amount of procedures that can be automatized or elided. To sum up, the diagram is a static object, but it becomes kinaesthetic thanks to the language that refers to it as a constructed and manipulable object: the proof is based on a practical invariance. In Netz’ careful analysis, this is the best solution to the problem of generality that could be afforded at the time, given the means of communication at hand. If this is true, then any reconstruction as formalization, such as the one proposed by Hilbert, would not be faithful to the Greek practice. Moreover, Netz argues that Greek mathematics did not deal with philosophical matters. In the sources, nothing like a developed theory supporting this solution can be found.

22.3.2 Exact and Co-Exact Properties

Netz’ approach is not the only one based on practical invariances. Consider Manders’ contribution in an article that has been – in Mancosu’s words – “an underground classic” [22.27, p. 14] and that was finally published in 2008 (in its original version, which dates back to 1995) [22.28]. In a later introductory paper, Manders
506 Part E Models in Mathematics
presents some of the philosophical issues that emerge from diagrammatic reasoning in geometry [22.29]. For him, Euclidean practice deserves philosophical attention, if only for the simple reason that it has been a stable and fruitful tool of investigation across diverse cultural contexts for over 2000 years. Up to the nineteenth century, no one would have denied that such a practice was rigorous; on the contrary, it was considered the most rigorous practice among the various human ways of knowing. Also in Manders’ view, the Euclidean practice is based on a distribution of labor between two artifact types – the diagram and the text sequence – that have to be considered together. Note that once again the notion of artifact comes onto the scene as referring to diagrams as well as to text, that is, natural language plus letters linking the text to the diagram. Humans, due to their limited cognitive capabilities, cannot control the production and the interpretation of a diagram so as to avoid any case of alternative responses to it. For this reason, the text is introduced with the aim of tracking equality information. As Manders explains, in practice, the diagram and the text share the responsibility of allowing the practitioners to respond to physical artifacts in a “stable and stably shared fashion” [22.28, p. 83].

In Manders’ reconstruction, proofs in traditional geometry have two parts: one verbal – the discursive text – and the other graphical – the diagram. The very objects of traditional geometry seem to arise in the diagram: in his words, “We enter a diagonal in a rectangle, and presto, two new triangles pop up” [22.28, p. 83]. The text ascribes some features to the diagram, and these features are called diagram attributions. Letters are introduced to facilitate cross-references between the text and the diagram – also Manders’ Euclidean diagram is lettered. Defining diagram attributions, Manders introduces a distinction between co-exact and exact features of the diagram that has become, as will be shown, very influential. A co-exact feature is a directly attributable feature of the diagram, which has certain perceptual cues that are fairly stable across a range of variations. Moreover, such a feature cannot be readily eliminated, thanks to what Manders calls diagram discipline, that is, the proper exercise of skill in producing diagrams that is required by the practice. To clarify, if one continuously varies the diagram in Fig. 22.3, its co-exact attributes will not be affected. Imagine deforming the two circles no matter how: this would not change the fact that there still is a point at which the two figures intersect. The distinction thus concerns the control that one can have on the diagram and on its possible continuous deformations. This would be in line with the basic general resource of traditional geometrical practice, that is, diagram discipline: the appearance of diagrams is controlled by standards for their proper production and refinement. Diagram discipline governs the possible constructions.

Consider the features of the diagram of a triangle. Such a diagram would have to be a nonempty region bounded by three visible curves, and these curves are straight lines. The first property is co-exact and the second is exact. Paradigmatic co-exact properties are thus features such as a region containing another – unaffected if the boundaries are shifted or deformed – or the existence of an intersection point such as the one required in Euclid I.1, as already discussed. By contrast, exact features are affected by deformation, except in some isolated cases. If one varies the diagram of the equilateral triangle, lines might no longer be straight or angles might lose their equality. In this framework, what is typically alleged as fallacy of diagram use rests on reading off from a diagram exact conditions of this kind – for example, that the lines in a triangle are not straight. However, the practice – the diagram discipline – never allows such a situation to happen. As already mentioned, practitioners created the resources to control the recourse to diagrams, so as to allow the resolution of disagreement among alternative judgements that are based on the appearance of diagrams, and therefore to limit the risk of disagreement for co-exact attributions. Things become trickier when it comes to exact properties, and this is the reason why the text comes in as support. In fact, since exact attributes are, by definition, unstable under the perturbation of a diagram, they have to be licensed beforehand by the discursive text. To go back to Euclid I.1, that the curves introduced in the course of the proof are circles is licensed, for example, by Postulate 3; furthermore, it is recorded in the discursive text that other subsequent exact attributions are to be licensed, such as the equality of radii (by Definition 15, again in the discursive text).

To sum up, for Manders, the diagram discipline is such that it is able to supervise the use of appropriate diagrams. In the remainder of the chapter, it will be shown how Manders’ ideas have influenced other research in diagrammatic reasoning also going beyond traditional Euclidean geometry.

22.3.3 Reasoning in the Diagram

Macbeth has proposed a reading of the Euclidean diagram that is in line with the ones that have just been presented [22.30]. For the purpose of the chapter, it is interesting to note that her aim in reconstructing the practice of Euclidean geometry is to see whether a clarification of the nature of this practice might ultimately tell us something about the nature of mathematical practice in general. She criticizes the interpretation of the
Elements as an axiomatic system and proposes to see it instead as a system of natural deduction. Common notions, postulates, and definitions are not to be understood as premises, but as rules or principles according to which to reason. Moreover, in her view, a diagram is not an instance of a geometrical figure, but an icon. Such a feature of the Euclidean diagram makes the demonstration in the Euclidean system general throughout.

In order to clarify such a claim, Macbeth introduces Grice’s distinction between natural and nonnatural meaning [22.31]. For Grice, natural meaning is exemplified by sentences such as “These spots mean measles.” By contrast, a sentence such as “Schnee means snow” expresses nonnatural meaning. Let us suppose then that a drawing is an instance of a geometrical figure, that is, a particular geometrical figure. If this is the case, it would have natural meaning and a semantic counterpart. For example, in Fig. 22.3, one sees a particular triangle ABC that is one instance of some sort of geometrical entity called an equilateral triangle. But let us instead hypothesize that the drawing has nonnatural meaning and therefore is not an instance of an equilateral triangle but is taken for an equilateral triangle. Then, the crucial step would be to recognize the intention that is behind the making of the drawing. This is the reason one can also draw an imprecise diagram – for example, drawing a circle that looks like an ovoid – as long as the intention – the one of drawing a circle – is clear. Such an intention is expressed throughout the course of the demonstration. Also, Azzouni has suggested that the proof-relevant properties are not the actual (physical) properties of singular diagrammatic figures, but conventionally stipulated ones, the recognition of which is mechanically executable [22.25].

To sum up, in Macbeth’s reconstruction, the Euclidean diagram has nonnatural meaning and is, by intention, general. Moreover, by following Peirce’s distinction again, it is an icon because it resembles what it signifies. However, resemblance here cannot be understood as resemblance in appearance. The Euclidean diagram resembles what it signifies by displaying the same relations of parts, that is, by being isomorphic to it. The circles in Fig. 22.3 are icons of a geometrical circle because there is a likeness in the relationship of the parts of the drawings. Specifically, the resemblance is in the relation of the points on the drawn circumference to the drawn center compared to the relation of the corresponding parts of the geometrical concept. Such a resemblance can be a feature of the diagram because the geometer means or intends to draw a circle, that is, to represent points on the circumference that are equidistant from the center. Given this intention, it is not important whether or not the figure is precise, that is, whether or not the points on the circumference in the drawn figure really look that way. There is a correspondence between the iconicity of the Euclidean diagram as introduced by Macbeth and co-exact properties in Manders’ terms. Also in Macbeth’s reading, the diagram is intended to show the relations that are constitutive of the various kinds of geometrical entities involved. As she summarizes, “A Euclidean diagram does not instantiate content but instead formulates it” [22.30, p. 250].

Finally, Macbeth aims to show that the chain of reasoning in Euclidean geometry involving diagrams is not diagram-based but diagrammatic. According to her terminology, reasoning is diagram-based when its moves are licensed or justified by the diagram; by contrast, it is diagrammatic when the mathematician is asked to reason in the diagram. Consider again Fig. 22.3. There is a sense in which this figure is analogous to the Wittgensteinian duck–rabbit picture, where one alternates between seeing it as the picture of a duck and seeing it as the picture of a rabbit. In a similar fashion, in order for the demonstration to go through, the mathematician has to alternate between seeing certain lines in the figure as icons of radii – and therefore equal in length – and as icons of the sides of a triangle – so as to draw the conclusion that the appropriately constructed triangle is in fact equilateral. The point then is that the physical marks on the page have the potential to be regarded in radically different ways. By pointing at such a feature of the Euclidean diagram, Macbeth aims to make sense of Manders’ view, saying that geometrical relations pop out of the diagram as lines are added to it. The mathematician uses the diagram to reason in it and to make new relations appear.

Moreover, according to Macbeth, the Euclidean diagram has three levels of articulation in the way it can be parsed by the geometer’s gaze. At a first level, there are the primitive parts: points, lines, angles, and areas. At the second level, there are geometrical objects that are intended to be represented in the diagram. At the third level, there is the whole diagram, which is not in itself a geometrical figure but, in some sense, contains the objects at the other levels. In the course of the demonstration, the diagram can thus be configured and reconfigured according to different intermediate wholes. Thanks to such a function of diagrams, significant and often surprising geometrical truths can be proved. In Macbeth’s account, the site of reasoning is the diagram, and not the accompanying text. Her conclusion is that Euclidean geometry is [22.30, p. 266]

“a mode of mathematical enquiry, a mathematical practice that uses diagrams to explore the myriad discoverable necessary relationships that obtain among geometrical concepts, from the most obvious to the very subtle.”
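
As an illustrative aside, the contrast between co-exact and exact attributions in the Euclid I.1 diagram (Fig. 22.3) can be sketched numerically. The sketch below is not taken from any of the studies discussed: the coordinates, the function names, and the choice to model an imprecise drawing simply as two circles with slightly wrong radii are all assumptions made for illustration. It suggests that the existence of an intersection point survives the perturbation (a co-exact feature), whereas the equality of the sides does not (an exact feature, which the text has to license).

```python
import math

def circle_intersections(c1, r1, c2, r2):
    """Return the intersection points of two circles (empty list if none)."""
    (x1, y1), (x2, y2) = c1, c2
    d = math.hypot(x2 - x1, y2 - y1)
    # No intersection: concentric, too far apart, or one circle inside the other.
    if d == 0 or d > r1 + r2 or d < abs(r1 - r2):
        return []
    a = (r1**2 - r2**2 + d**2) / (2 * d)    # distance from c1 to the chord
    h = math.sqrt(max(r1**2 - a**2, 0.0))   # half-length of the chord
    mx = x1 + a * (x2 - x1) / d
    my = y1 + a * (y2 - y1) / d
    ox = h * (y2 - y1) / d
    oy = h * (x2 - x1) / d
    return [(mx - ox, my + oy), (mx + ox, my - oy)]

def is_equilateral(a, b, c, tol=1e-9):
    """Exact attribution: all three sides have the same length (up to tol)."""
    sides = (math.dist(a, b), math.dist(b, c), math.dist(c, a))
    return max(sides) - min(sides) <= tol

# Hypothetical coordinates for the given segment AB.
A, B = (0.0, 0.0), (1.0, 0.0)
r = math.dist(A, B)

# Construction as stipulated by the text: both circles have radius AB.
exact = circle_intersections(A, r, B, r)
C = exact[0]

# "Imprecise drawing": radii are off by a few percent (assumed values).
sloppy = circle_intersections(A, 1.07 * r, B, 0.94 * r)
C_sloppy = sloppy[0]

print(len(exact), is_equilateral(A, B, C))                 # 2 True
print(len(sloppy), is_equilateral(A, B, C_sloppy))         # 2 False
```

In both runs the circles still meet in two points, but only the construction with stipulated equal radii yields an equilateral triangle, which is one way of picturing why equality information is tracked by the text rather than read off the drawing.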
Another more recent study has complemented Manders’ and Macbeth’s accounts by emphasizing even more strongly how the Euclidean diagram has a role of practical synthesis: to draw a figure means to balance multiple desiderata, making it possible to put together insight – that is timeless – and constructions – that are given in time [22.32]. We also mention here that Macbeth has applied similar arguments to the role of Frege’s Begriffsschrift as exhibiting the inferentially articulated contents of mathematical concepts [22.33]. Despite the interest of this account, for the reasons given in Sect. 22.1, we will not give here the details of such a study.

In Sect. 22.4, we will come back to the notion of iconicity and see how the productive ambiguity to which Macbeth alludes in talking about the parsing of the Euclidean diagram can also be found in other cases of diagrammatic reasoning.

22.3.4 Concrete Diagrams and Quasi-Concrete Geometrical Objects

Another view on the generality of the Euclidean diagrams has recently been proposed by Panza [22.34]. His aim is to analyze the role of diagrams in Euclid’s plane geometry, that is, the geometry as expounded by Euclid in the first six books of the Elements and in the Data, and as largely practiced up to the early modern age (see also [22.35]). In his view, Euclid’s propositions are general insofar as they assert that there are some admitted rules that have to be followed in constructing geometric objects. Once again, what matters for generality are construction procedures. These admitted rules allow the geometer to construct an object having certain properties and relations. To put it briefly, it would be impossible for one to follow the rules and end up constructing an object without the requested properties.

Panza argues that arguments in the Euclidean system are about geometrical objects: points, segments of straight lines, circles, plane angles, and polygons. Taking inspiration from Parsons [22.36], he defines such geometrical objects as quasi-concrete. Their quasi-concreteness depends precisely on the relation they have with the relevant diagrams, which are instead concrete objects: the Euclidean diagram is a configuration of points and lines, or better is what is common to equivalence classes of such configurations. Two claims describe the peculiarity of the relation between quasi-concrete geometrical objects and concrete diagrams. First, the identity conditions of the geometrical objects are provided by the identity conditions of the diagrams that represent them. In his definition, this is the global role of diagrams in Euclid’s arguments: a diagram is taken as a starting point of licensed procedures for drawing diagrams and a geometrical object can be given in the Euclidean system when a procedure is stipulated for drawing a diagram representing it. Second, the geometrical objects inherit some properties and relations from these diagrams. This is the local role of Euclidean diagrams. Such properties and relations are recognized because a diagram is compositional. So understood, a diagram is a configuration of concrete lines drawn on an appropriate flat material support. According to Panza, Euclid’s geometry is, therefore, neither an empirical theory nor a contentual one in Hilbert’s sense, that is, a theory of “extra-logical discrete objects, which exist intuitively as immediate experience before all thought” [22.37, p. 202]. In his view, differently from the approaches described so far, it is crucial to define an appropriate ontology for the Euclidean diagram. In fact, his objective is to argue against the view that arguments in Euclid’s geometry are not about singular objects, but rather about something like general schemas, or only about concepts. Such a view, according to which Euclidean geometry would deal with purely ideal objects, is often taken to be Platonic in spirit and is supposed to have been suggested by Proclus [22.38, 39]. Panza’s proposal is instead closer to an Aristotelian view that geometric objects result by abstraction from physical ones, but the author claims that it is not his intention to argue that Euclid was actually guided by an Aristotelian rather than a Platonic insight.

In the same spirit, Ferreirós also suggests that the objects of Greek geometry are taken to be the diagrams and other similarly shaped objects [22.21, Chap. 5]. Of course, the diagram in this context is not intended to refer to the physically drawn lines that are empirically given, but to the interpreted diagram, which is perceived by taking into account the idealizations and the exact conditions conveyed in the text and derived from the theoretical framework in the background. For the geometer, the figure one works with is not intended as an empirical token but as an ideal type. Nonetheless, it is crucial to remark that such an ideal type does not exist outside the mind of the geometer and becomes available only thanks to the diagram. Therefore, on the one hand, the object of geometry is the diagram, and, as a consequence, the diagram constitutes the object of geometry; on the other hand, the diagram has to be interpreted in order to make the object emerge, and accordingly it also represents the object of geometry. Moreover, quoting Aristotle, Ferreirós points out that Greek geometry remains a form of theoretical and not practical geometry, for the reason that its objects are conceived as immovable and separable, without this necessarily leading to the thesis that there exist immovable and separable entities [22.40].
22.4 The Productive Ambiguity of Diagrams
This brief section will be devoted to the discussion of the role of ambiguity in diagrammatic reasoning. Grosholz has devoted her work to developing a pragmatic approach to mathematical representations, by arguing that the appropriate epistemology for mathematics has to take into account the pragmatic as well as the syntactic and semantic features of the tools that are used in the practice of mathematics. The post-nineteenth century philosophy of mathematics wants all mathematics to be reduced to logic; by contrast, Grosholz claims that philosophy should account for all kinds of mathematical representations, since they are all means to convey mathematical information. Moreover, the powers and limits of each of them should be explored. One format might be chosen among the others for reasons of convenience, depending on the problem to solve in the context of a specific theory or in a particular historical moment. Even the analysis of the use of formal language can thus be framed in terms of its representational role in a historical context of problem-solving. As Grosholz explains [22.41, p. 258],

“Different modes of representation in mathematics bring out different aspects of the items they aim to explain and precipitate with differing degrees of success and accuracy.”

In such a picture, a central cognitive role is played in mathematics by a form of controlled and highly structured ambiguity that potentially involves all representations, and is particularly interesting in the case of diagrams. Grosholz as well adopts the general Peircean terminology and distinguishes between iconic and symbolic uses of the same representation. These two different uses make representations potentially ambiguous.

To clarify, consider as an example Galileo’s treatment of free fall and projectile motion in the third and fourth days of his Discourses and Mathematical Demonstrations Concerning Two New Sciences. Galileo draws a geometrical figure to prove that (Fig. 22.4),

“The spaces described by a body falling from rest with uniformly accelerated motion are to each other as the squares of the time-intervals employed in traversing these distances.”

Fig. 22.4 Galileo, Discorsi, third day, naturally accelerated motion, Theorem II, Proposition II

In the right-hand figure, the line HI stands for the spatial trajectory of the falling body, but is articulated into a sort of ruler, where the intervals representing distances traversed during equal stretches of time, HL, LM, MN, etc., are indicated in terms of unit intervals, which are represented by a short cross-bar, and in terms of intervals, whose lengths form the sequence of odd numbers (1, 3, 5, 7, ...), which are represented by a slightly longer cross-bar. The unit intervals are intended to be counted as well as measured. In the left-hand figure, AB represents time, divided into equal intervals AD, DE, EF, etc., with perpendicular instantaneous velocities raised upon it – EP, for example, represents the greatest velocity attained by the falling body in the time interval AE – generating a series of areas which are also a series of similar triangles. Thanks to an already proven result from Th. I, Prop. I, Galileo builds the first proposition, according to which the distance covered in time AD (or AE) is equal to the distance covered at speed 1/2 DO (or 1/2 EP) in time AD (or AE). Therefore, the two spaces that we are looking for are to each other as the distance covered at speed 1/2 DO in time AD and the distance covered at speed 1/2 EP in AE. Th. IV, Prop. IV tells us that “the spaces traversed by two particles in uniform motion bear to one another a ratio which is equal to the product of the ratio of the velocities by the ratio of the times”; in this case, given the similarity of the triangles ADO and AEP, AD and AE are to each other as 1/2 DO and 1/2 EP. Then, the proportion between the two velocities compounded with the time intervals is equal to the proportion of the time intervals compounded with the time intervals, and therefore $[V_1 : V_2]$ compounded with $[T_1 : T_2]$ equals $[T_1 : T_2]^2$. As a consequence, the spaces described by the falling body are proportional to the squares of the time intervals: $[D_1 : D_2] = [T_1 : T_2]^2$. Look now at the left-hand diagram. Consider the sums
forth. These sums represent distances and are proportional to the squares of the intervals. Therefore, the time elapsed is proportional to the final velocity and the distance fallen will be proportional to the square of the final velocity.

Galileo’s use of the diagram can be analyzed in relation to the different modes of representation that are employed to express his argument to prove the theorem. First, he refers to at least four modes of representation: proportions, geometrical figures, numbers, and natural language. Second, the same geometrical diagram serves as an icon and at the same time as a symbol. As an icon, it is configured in such a way that it can stand for a geometrical figure and exhibit patterns of relations among the data it contains. For example, when proportions are taken as finite, they are represented iconically. When the proportions are taken as infinitesimal (because one may take “any equal interval of time whatsoever” [22.41, p. 14]), the diagram is instead used as a symbol. In this case, the configuration of the diagram changes because it is now intended to represent dynamical, temporal processes. Therefore, despite the fact that an appropriate parsing of the diagram cannot represent iconically something that is dynamical or temporal, it can still do so symbolically. In Grosholz’ view, the distinction between iconic and symbolic use of a mode of representation sheds light on the importance of semantics in mathematics. In fact, for a mode of representation to be understood not only iconically but also symbolically, the reference to some background knowledge is necessary. The representation does not have to be understood in its literal configuration but from within a more elaborated context of use, which provides a new interpretation and a new meaning for it. Galileo’s diagram must thus be interpreted in two ways: intervals have to be seen as finite – so that Euclidean results can be applied – and also as infinitesimals – so as to represent accelerated motion. In the proof, errors are prevented by a careful use of ratios.

Compare this example with Macbeth’s discussion of Euclid I.1 in the previous section. Here as well, there is only one set of diagrams, but, in order for the demonstration to go through, it must be read and interpreted in different ways. However, Macbeth and Grosholz employ Peirce’s distinction in a different way. Macbeth relations, while Grosholz refers to two possible different uses – iconic or symbolic – of the same diagram. Moreover, Macbeth’s Gricean distinction between natural and nonnatural meaning does not coincide with the distinction between the literal and nonliteral – conventional – uses of the representation made here by Grosholz.

Grosholz’ approach is not limited to diagrams in mathematics, unless one wants to say that all mathematical representations are diagrammatic. In fact, in her view, another straightforward example of productive ambiguity is Gödel’s representation of well-formed formulas through natural numbers, whose efficacy stems from their unique prime decomposition. In her terminology, the peculiarity of Gödel’s proof of incompleteness is that the numbers in it must stand iconically for themselves – so as to allow the application of number-theoretic results – and symbolically for well-formed formulas – so as to allow transferring those results to the study of completeness and incompleteness of logical systems. Without going into details, it is sufficient to say that Grosholz points out that this particular case shows how much even logicians exploit the constitutive ambiguity of some of the representations they use. In her view, the recourse to ambiguous formats is, in fact, typical of mathematical reasoning in general, and this is precisely the feature of mathematics that has not been recognized by the standard post-nineteenth century approaches, which have focused on the possibility of providing a formal language that would avoid ambiguities. As Grosholz explains [22.41, p. 19]

“the symbolic language of logistics is allegedly an ideal mode of representation that makes all content explicit; it stands in isomorphic relation to the objects it describes, and that one-one correspondence insures that its definitions are ‘neither ambiguous nor empty’.”

In Grosholz’s view, ambiguity and iconicity then seem to be not only a mark of diagrams such as Galileo’s one, but also crucial features of mathematical representations – formulas not being an exception.

In the following sections, other examples of productive ambiguity and iconicity in contemporary mathematics will be given.
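
The double role that Grosholz attributes to numbers in Gödel’s proof can be made concrete with a small sketch of a prime-power encoding. This is only an illustration under stated assumptions: the six-symbol alphabet and its code numbers are hypothetical and are not Gödel’s actual assignment, and the function names are invented for this example. What the sketch relies on is just the fact mentioned in the text, namely that every natural number has a unique prime decomposition.

```python
# Hypothetical alphabet and code numbers (NOT Goedel's original assignment).
SYMBOLS = ["0", "s", "=", "(", ")", "+"]
CODE = {sym: i + 1 for i, sym in enumerate(SYMBOLS)}
DECODE = {code: sym for sym, code in CODE.items()}

def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for short formulas)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1

def godel_number(formula):
    """Encode symbol k of the formula as the exponent of the k-th prime."""
    g = 1
    for p, sym in zip(primes(), formula):
        g *= p ** CODE[sym]
    return g

def decode(g):
    """Recover the formula by factorising g (unique prime decomposition)."""
    out = []
    for p in primes():
        if g == 1:
            break
        exponent = 0
        while g % p == 0:
            g //= p
            exponent += 1
        if exponent not in DECODE:
            raise ValueError("not the code of a formula in this alphabet")
        out.append(DECODE[exponent])
    return "".join(out)

n = godel_number("0=0")   # 2**1 * 3**3 * 5**1
print(n)                  # 270
print(decode(n))          # 0=0
```

The integer 270 can be handled purely arithmetically, for instance by factorising it, which corresponds to what the text calls its iconic use; read through the code table, the same integer stands for the string `0=0`, which corresponds to its symbolic use.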
Carnot believes that, at least in the case of the theorem of Menelaus, the diagram must be considered as a configuration, appropriately chosen with the aim of finding the solution to the problem in question. As a consequence, the theorem no longer concerns a specific quadrilateral, but any intersection between a triangle and a straight line. Chemla claims that Carnot’s ideas were nonstandard at his time, because he introduced a way of processing information that relies on individuating what a general diagram is in opposition to a multitude of particular figures.

This section will be devoted to briefly introducing some works on diagrammatic reasoning in present-day mathematics. The studies have been divided into three categories: analysis, algebra, and topology. Unlike the Euclidean or number-theoretic cases, the examples taken from contemporary mathematics require much more technical machinery in order to be understood; that is, even to introduce the diagram, much mathematics is required. For reasons of space, it is therefore impossible to give here all the mathematical details, and I invite the reader to refer to the original papers.

22.5.1 Analysis

In two different articles, Carter analyzed a case study of diagrammatic reasoning in free probability theory, an area introduced by Voiculescu during the 1980s [22.43, 44]. The aim of free probability theory was to formulate a noncommutative analog to classical probability theory, with the hope that this would lead to new results in analysis. In particular, Carter discusses a section of a paper written by Haagerup and Thorbjørnsen [22.45], where a combinatorial expression for the expectation of the trace of the product of so-called Gaussian random matrices (GRMs) of the following form is found

$$E \circ \mathrm{Tr}_n\left[(B^{*}B)^{p}\right]. \tag{22.8}$$

The authors show that this expression depends on the following

$$E \circ \mathrm{Tr}_n\left[B_1^{*}B_{\pi(1)} \cdots B_p^{*}B_{\pi(p)}\right]. \tag{22.9}$$

The indices $\pi(i)$ are symbols denoting the values of a permutation $\pi$ on $\{1, 2, \ldots, p\}$. Therefore, the value of the expression depends on the existence and properties of the permutation that pairs the matrices off two by two. Diagrams can be introduced to represent the permutations, and this is a crucial move, since such diagrams make it possible to study permutations independently from the fact that they were set forth as indices of the GRM. Moreover, the recourse to diagrams makes it easier to evaluate the properties of the permutations. Once the relevant properties of the permutation are identified, thanks to the diagram, they can then be reintroduced into the original setting.

To give an idea of what the diagrams representing permutations look like, consider two examples of constructing the permutation $\hat{\pi}$. Let $p = 4$, so that $\pi\colon \{1, 2, 3, 4\} \to \{1, 2, 3, 4\}$. Instead of writing

$$B_1^{*}B_{\pi(1)}B_2^{*}B_{\pi(2)}B_3^{*}B_{\pi(3)}B_4^{*}B_{\pi(4)}, \tag{22.10}$$

we rewrite the expression in the following form

$$C_1 C_2 C_3 C_4 C_5 C_6 C_7 C_8. \tag{22.11}$$

Suppose then that $\pi(1) = 2$ and $\pi(3) = 4$, giving

$$B_1^{*}B_2 B_2^{*}B_1 B_3^{*}B_4 B_4^{*}B_3. \tag{22.12}$$

What the permutation $\hat{\pi}$ is supposed to do is to tell us which of the $C$s are identical, in terms of their indices. By comparing the two expressions, we see that $C_1 = C_4$, $C_2 = C_3$, $C_5 = C_8$, and $C_6 = C_7$. In terms of the permutation $\hat{\pi}$, this means that $\hat{\pi}(1) = 4$ and $\hat{\pi}(2) = 3$, and so on. Both permutations can be represented by the diagrams in Fig. 22.5.

Another example could be $\pi(1) = 3$ and $\pi(2) = 4$, giving

$$B_1^{*}B_3 B_2^{*}B_4 B_3^{*}B_1 B_4^{*}B_2. \tag{22.13}$$

By rewriting it in terms of $C_i$’s and comparing again, we obtain $C_1 = C_6$, $C_2 = C_5$, $C_3 = C_8$, and $C_4 = C_7$, as shown in Fig. 22.6.

Fig. 22.5 (a) $\pi$ is the permutation (12)(34); (b) the corresponding $\hat{\pi}$

First, diagrams would suggest definitions and proof strategies. In Carter’s example, the definitions of a pair of neighbors, or of a noncrossing and a crossing permutation as well as of cancellation of pairs – manipulations that are all clearly visible in the diagrams – are inspired by them. Moreover, as confirmed by the very authors of the study, also the formal version of at least a part of the proofs is inspired by the proof based on
diagrams. Second, diagrams function as frameworks in parts of proofs: Although they are not used directly to give rigorous proofs, they still play an essential role in the discovery and formulation of both mathematical theorems and proofs, and thus in the practice of the mathematical reasoning.

Fig. 22.6 (a) $\pi$ is the permutation (13)(24); (b) the corresponding $\hat{\pi}$

Carter’s idea is that certain properties of the diagrams correspond to formal definitions. In her case study, some diagrams are used to represent permutations and similar diagrams to represent equivalence classes. Diagrams thus make it possible to perform experiments on them; for example, the crossings identify the number of the equivalence classes and therefore the definition of a crossing is given an algebraic formulation. Likewise, the concepts of a neighboring pair and of removing pairs (from the diagram) are translated into an algebraic setting. To sum up, the relations used in the proof based on the diagram represent relations that also hold in the algebraic setting. As Carter explains, the notions of crossing and neighboring pairs are, in Manders-inspired terminology, examples of co-exact properties of the diagrams. In a Peircean semiotic perspective, the diagram in this case would again be iconic, and it is for this reason that one can translate the diagrammatic proof into an algebraic proof. In this example from contemporary mathematics, we are in a sense certainly far from the Euclidean diagram, but we still see that the proof includes an accompanying text; only when the appropriate text is added do text and diagram – taken together – constitute a proof. The text is also important to disambiguate diagrams that can be interpreted as representing different things (recall Manders’ view on the Euclidean diagram).

In a more recent article, Carter discusses at length her reference to Peirce’s terminology. Her reconstruction of Peirce’s discussion of the use of representation in mathematics is based on some of the most recent studies about Peirce’s mathematical philosophy [22.46]. Note that the central notion for Peirce is the one of sign, that is, in his words, “Something that stands for something else” [22.24, 2.228]. A sign can stand for something else not in virtue of some of its […] links the sign to the object. For Peirce, signs are then divided into three categories: icons, indices, and symbols; icons are signs in virtue of a relation of likeness with their objects, indices are actually connected to the objects they represent, and symbols represent an object because of a rule stipulating such a relation.

Central to Peirce’s conception of reasoning in mathematics is that all such reasoning is diagrammatic – and therefore iconic. Moreover, Peirce employs the term diagram in a much wider sense than usual. In his view, even spoken language can be diagrammatic. Consider a mathematical theorem that contains certain hypotheses. By fixing the reference with certain indices, it is possible to produce a diagram that displays the relations of these referents. In statements concerning basic geometry, the diagram could be a geometric diagram such as the Euclidean diagram. But in other parts of mathematics, it may take a different form. In Carter’s view, the diagrams in her case study are iconic because they display properties that can be used to formulate their algebraic analogs. Moreover, the role of indices – the numbers – in the diagram is to allow for reinserting the result into its original setting. Once such a framework is assumed, then diagrams as well as other kinds of representations used in mathematics become an interesting domain of research. As already discussed when presenting Grosholz’s work, the objects of inquiry extend from mathematical diagrams to mathematical signs – mathematical representations – in general, including, for example, also linear or two-dimensional notations. In the final section, we will say more about this issue. A further point made by Carter is that the introduced diagrams also enable us to break down proofs into manageable parts, and thus to focus on certain details of a proof. By using diagrams at a particular step of the proof, one needs only to focus on one component, thus getting rid of irrelevant information. In an unpublished paper, Manders makes a similar point by introducing the notions of responsiveness and indifference in order to address the topic of progress in mathematics [22.47]. In the following section, more details about this paper will be given.

It is interesting to note that Carter discusses a potential ambiguity of the term visualization, used as (i) representation, as in the example given, and as (ii) mental picture, helping the mathematician see that something is the case. In this second meaning, diagrams would be fruitful frameworks to trigger imagination. Carter’s claim is that there is not a sharp distinction to be drawn here between concrete pictures and mental ones, but quite the opposite: a material picture may trigger our imagination, producing a mental picture, and
vice versa a mental picture may be reproduced by a concrete drawing. We will come back also to this issue later.

22.5.2 Algebra

Another case study from contemporary mathematics is taken from a relatively recent mathematical subject: geometric group theory. Starikova has discussed how the representation of groups by using Cayley graphs made it possible to discover new geometric properties of groups [22.48, 49]. In this case study, groups are represented as graphs. Thanks to the consideration of the graphs as metric spaces, many geometric properties of groups are revealed. As a result, it is shown that many combinatorial problems can be solved through the application of geometry and topology to the graphs and by their means to groups.

The background behind Starikova’s work is the analysis proposed by Manders in the unpublished paper already mentioned in presenting Carter’s work [22.47]. In this paper, Manders elaborates more on his study on Euclidean diagrams, this time taking into account the contribution of Descartes’ Géométrie compared to Euclid’s plane geometry. He gives particular stress to the introduction of the algebraic notation. In fact, in mathematical reasoning, we often produce and respond to artifacts that can be of different kinds: natural language expressions, Euclidean diagrams, algebraic or logical formulas. In general, mathematical practice can be defined as the control of the selective responses to given information, where response is meant to be emphasizing some properties of an object while neglecting others. According to Manders, artifacts help to implement and control these selective responses, and therefore their analysis is crucial if the target is the practice of the mathematics in question. Moreover, selective responses are often applied from other domains. Think of the introduction of algebraic notation to apply fast algebraic algorithms. In Descartes’ geometry, geometric problems are solved through solving algebraic equations, which represent the geometric curves. Also here, the idea is that by using different representations of the same concepts, new properties might become noticeable. Starikova’s study would show a case where a change in representation is a valuable means of finding new properties: drawing the graphs for groups would help discovering new features characterizing them. In this perspective, mathematical problem-solving involves the creation of the right strategies of selection: at each stage of practice, some information is taken into account and some other information is disregarded. It is only by responding to some elements coming from the mathematical context and not paying attention to others that we can control each step of our reasoning. Of course, this control and coordination may have different levels of quality across practices. Manders’ conclusion is that mathematical progress is based on this coordinated and systematic use of responsiveness and indifference, and that such a coordination is implemented by the introduction and the use of the various representations. The role of the accompanying text is still crucial, since diagrams are produced according to the specifications in the text. Thanks to the text, the depicted relations become reproducible and therefore stable; diagram and text keep supporting each other.

To give the reader an idea of what a Cayley graph for a group looks like, we consider first the definition of a generating set. Let $G$ be a group. Then, a subset $S \subseteq G$ is called a generating set for the group $G$ if every element of $G$ can be expressed as a product of the elements of $S$ or the inverses of the elements of $S$. There may be several generating sets for the same group. The largest generating set is the set of all group elements. For example, the subsets $\{1\}$ and $\{2, 3\}$ generate the group $(\mathbb{Z}, +)$.

A group with a specified set of generators $S$ is called a generated group and is designated as $(G, S)$. If a group has a finite set of generators, it is called a finitely generated group. For example, the group $\mathbb{Z}$ is a finitely generated group, for it has a finite generating set, for example, $S = \{1\}$. The generated group $\mathbb{Z}$ with respect to the generating set $\{1\}$ is usually designated as $(\mathbb{Z}, \{1\})$. The group $(\mathbb{Q}, +)$ of rational numbers under addition cannot be finitely generated. Generators provide us with a compact representation of finitely generated groups, that is, a finite set of elements, which by the application of the group operation gives us the rest of the group.

We can now define a Cayley graph. Let $(G, S)$ be a finitely generated group. Then the Cayley graph $\Gamma(G, S)$ of a group $G$ with respect to the choice of $S$ is a directed colored graph, where vertices are identified with the elements of $G$ and the directed edges of a color $s$ connect all possible pairs of vertices $(x, sx)$, $x \in G$, $s \in S$.

In the following, we can see three examples of Cayley graphs: the Cayley graph for the first given example, $(\mathbb{Z}, \{1\})$, that is, an infinite chain (Fig. 22.7), another Cayley graph for the same group $\mathbb{Z}$ with generators $\{1, 2\}$, which can be depicted as an infinite ladder (Fig. 22.8), and finally the Cayley graph for the group $(\mathbb{Z}, \{2, 3\})$ (Fig. 22.9). By geometric properties of groups, Starikova intends the properties of groups that can be revealed by thinking of their corresponding Cayley graphs as metric spaces. In other words, the idea is to look at groups through their Cayley graphs and try to see new (geometric) properties of groups, and then
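The definition of a Cayley graph given above can be made concrete with a small sketch. The following Python fragment is only an illustration under stated assumptions: it restricts the infinite group $(\mathbb{Z}, +)$ to a finite window, and the function names `cayley_edges` and `word_distance` are hypothetical, not taken from Starikova's work. It lists the colored edges $(x, x + s)$ for a chosen generating set, and computes graph distances, which is one way of treating the graph as a metric space.

```python
from collections import deque


def cayley_edges(generators, lo, hi):
    """Directed, colored edges (x, x + s, s) of the Cayley graph of (Z, +),
    restricted to the finite window lo..hi (the real graph is infinite)."""
    edges = []
    for x in range(lo, hi + 1):
        for s in generators:
            if lo <= x + s <= hi:
                edges.append((x, x + s, s))
    return edges


def word_distance(generators, target, radius=50):
    """Length of a shortest path from 0 to target, moving along edges in
    either direction (generators and their inverses). Breadth-first search
    on the window [-radius, radius]; returns None if target is not reached."""
    steps = [step for g in generators for step in (g, -g)]
    dist = {0: 0}
    queue = deque([0])
    while queue:
        x = queue.popleft()
        if x == target:
            return dist[x]
        for step in steps:
            y = x + step
            if abs(y) <= radius and y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return None


# The chain for (Z, {1}) on the window 0..3:
chain = cayley_edges([1], 0, 3)
# The same group with generators {2, 3}: 1 is reached as 3 - 2.
d = word_distance([2, 3], 1)
```

The same group thus acquires different distances under different generating sets: with generators $\{1\}$ the element 5 lies five steps from 0, whereas with $\{2, 3\}$ it lies two steps away.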
our familiarity with manipulating them. Moreover, the meaning of a knot diagram is fixed by its context of use: diagrams are the results of the interpretation of a figure, depending on the moves that are allowed on them and at the same time on the space in which they are embedded. Once the appropriate moves are established, the ambient space is fixed, thus determining the different equivalence relations. The context of use does not have to be predefined, preserving this kind of ambiguity that is not “damaging” [22.9], but productive. Actually, the indetermination of meaning makes different interpretations co-habit, and, therefore, allows attending to various properties and moves.

The same authors have also analyzed the practice of proving in low-dimensional topology [22.51]. As a case study, they have taken a specific proof: Rolfsen’s demonstration of the equivalence of two presentations of the Poincaré homology sphere. This proof is taken from a popular graduate textbook: Knots and Links by Rolfsen [22.52]. The first presentation of Poincaré homology sphere is a Dehn surgery, while the second one is a Heegaard diagram (Fig. 22.11).

Fig. 22.11a,b The surgery code and the Heegaard diagram for the Poincaré homology sphere

Without going into the details, the aim of the authors is to use this case study to show that, analogously to knot theory, seeing in low-dimensional topology means imagining a series of possible manipulations on the representations that are used, and is, of course, modulated by expertise. Moreover, the actual practice of proving in low-dimensional topology cannot be reduced to formal statements without loss of intuition. Several examples of representationally heterogeneous reasoning – that is neither entirely propositional nor entirely visual – are given. Both the very representations introduced and the manipulations allowed on them – what the authors, following a terminology proposed by Larvor [22.53], call permissible actions – are epistemologically relevant, since they are integral parts both of the reasoning and the justification provided. To claim that inferences involving visual representations are permissible only within a specific practice is to consider them as context dependent. A consequence would be that it is no longer possible to establish general criteria for mathematical validity, since they can only be local. The picture of mathematics emerging from these kinds of studies is thus very different from the one proposed from the post-nineteenth century philosophy of mathematics.

A final remark about representations in topology concerns a point about their materiality, already raised by Carter in a different context. To avoid confusion, it is necessary to keep in mind the distinction between the material pictures and the imagination process, which, especially in the case of trained practitioners, tends to vanish. Actual topological pictures trigger imagination and help see modifications on them, but experts may not find it necessary to actually draw all the physical pictures. The same holds for algebra where experts skip transitions that nontrained practitioners cannot avoid writing down explicitly. This does not mean that experts do not need pictures to grasp the reasoning, but only that, thanks to training and thus to their familiarity with drawing and manipulating pictures, they are sometimes able to determine what these pictures would look like even without actually drawing them. More generally, for each subfield, it would be possible to define a set of background pictures that are common to all practitioners, which would determine what Thurston has called the mental model. To go back to Netz’ analysis of the Euclidean diagram, here as well diagrams allow for procedures to be automatized or elided.
The analysis and the definitions provided by Manders about Euclidean geometrical reasoning were used to establish a formalization for diagrams in line with what he calls the diagram discipline. Such a project has brought about the creation of two logical systems, E [22.54] and Eu [22.55, 56], thanks to the work of Avigad, Dean and Mumma. Both systems produce formal derivations that line up closely with Euclid’s proofs, in many cases following them step by step. (Another system that has been created to formalize Euclidean geometry is FG [22.57]. For details about FG and Eu and for a general discussion of the project in relation to model-based reasoning, see Chap. 23.) As summarized in a recent paper [22.58], the proof systems are designed to bring into sharp relief those attributes that are fundamental to Euclid’s reasoning as characterized by Manders in his distinction between exact and co-exact properties. Nonetheless, the distinction is made with respect to a more restricted domain.

The Euclidean diagram has some components, which can be simple objects, such as points, lines, segments, and circles, and more complex ones, such as angles, triangles, and quadrilaterals. These components are organized according to some relations, which are the diagram attributes. Exact relations are obtained between objects having the same kind of magnitude: for example, for any two angles, the magnitude of one can be greater than the magnitude of the other or the same. Co-exact relations are instead positional: for example, a point can lie inside a region, outside it, or on its boundary. Co-exact relations concerning one-dimensional objects exclusively, such as line segments or circles, are intersection and nonintersection, while those concerning regions, one-dimensional or two-dimensional, are containment, intersection, and disjointness. Take the diagram in Fig. 22.12, representing the endpoint A as lying inside the circle H (a co-exact property), along with a certain distance between the point A and the circle’s center B (an exact property). (Consider that in reproducing the diagram from Mumma’s original article, the co-exact features were not affected, while the exact ones probably were.) Following Manders, in a proof in Euclid’s system, premises and conclusions of diagrammatic inferences are composed of co-exact relations between geometric objects. In Fig. 22.13, an inference is shown (Fig. 22.13a) together with one of its possible associated diagrams (Fig. 22.13b).

Fig. 22.12 A Euclidean diagram depicting exact and co-exact relations

Fig. 22.13a,b An inference in Euclid’s system according to Manders’ reconstruction. (a) Premises: points A and B are on opposite sides of line l; points A and B are on line m. Conclusion: line m intersects line l. (b) One associated diagram

In order to develop a formal system for these inferences, the main tasks in developing the programs were two: first, to specify the formal elements representing co-exact relations; and second, to formulate the rules in terms of the elements whereby diagrammatic inferences can be represented in derivations. The main difference between Eu and E is how the first task is modulated. Eu possesses a diagrammatic symbol type intended to model what is perceived in concrete physical diagrams, while E models the information directly extracted from concrete physical diagrams by providing a list of primitive relations recording co-exact information among three object types: points, lines, and circles. In Fig. 22.14, the formalization in Eu of the inference in Fig. 22.13a is shown. In Fig. 22.15, the formalization of the same inference in E is shown, with the primitive on(A, l) meaning point A is on line l.

In addition, the formalizations do not only have formal elements corresponding to Euclidean diagrams, but also formal elements corresponding to the Euclidean text, so as to also record exact information. In order to give a proof, the two kinds of representations have to interact.
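To illustrate how an inference over primitive co-exact relations might be recorded, the following Python sketch encodes the inference of Fig. 22.13a as a single rule over a set of atomic facts. This is an assumption-laden toy, not the actual rule set or syntax of E: the tuple encoding, the relation names `not_same_side`, `on`, and `intersects`, and the function `derive_intersections` are all hypothetical names chosen for the example.

```python
def derive_intersections(facts):
    """Apply one illustrative rule to a set of atomic facts.

    Facts are tuples: ("on", point, line) or
    ("not_same_side", point_a, point_b, line).
    Rule: if A and B are not on the same side of line l, and both A and B
    lie on a different line m, then l and m intersect.
    Returns the set of newly derived ("intersects", l, m) facts.
    """
    derived = set()
    for fact in facts:
        if fact[0] != "not_same_side":
            continue
        _, a, b, l = fact
        # Every line m that A lies on is a candidate for the rule.
        for other in facts:
            if other[0] == "on" and other[1] == a:
                m = other[2]
                if m != l and ("on", b, m) in facts:
                    derived.add(("intersects", l, m))
    return derived


# Premises of the inference in Fig. 22.13a, in this toy encoding.
premises = {
    ("not_same_side", "A", "B", "l"),
    ("on", "A", "m"),
    ("on", "B", "m"),
}
conclusion = derive_intersections(premises)
```

Only positional (co-exact) information enters the rule; no lengths or angles are consulted, which is the point of the distinction drawn above.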
Not only formalizations of Euclidean geometry have been provided. Jamnik developed a semi-automatic proof system, called DIAMOND (Diagrammatic Reasoning and Deduction), to formalize and mechanize diagrammatic reasoning in mathematics, and in particular to prove theorems of arithmetic using diagrams [22.59]. Interestingly, Jamnik starts by recording a simple cognitive fact, that is that given some basic mathematical training and our familiarity with spatial manipulations – remember the study on knot theory – it suffices to look at the diagram representing a theorem to understand not only what particular theorem it represents, but also that it constitutes a proof for it. As a consequence, one arrives at the belief that the theorem is correct. From here, the question is: Is it possible to simulate and formalize this kind of diagrammatic reasoning on machines? In other words, is this an example of intuitive reasoning that is particular to humans and that machines are incapable of?

Fig. 22.14a–c The given inference in Eu

Fig. 22.15a–c The given inference in E. (a) A, B points; l line; Not same side (A, B, l). (b) on (A, m); on (B, m). (c) intersects (l, m)

The first part of Jamnik’s book provides a nice overview of the different diagrammatic reasoning systems that have been developed in the past century, such as, for example, Gelernter’s Geometry Machine [22.60] or Koedinger and Anderson’s Diagram Configuration Model [22.61]. For reasons of space, these systems will not be discussed here. In order to develop her proof system, she considers many different visual proofs in arithmetic and some of the analyses that have been given for them, by relying on the already mentioned collection edited by Nelsen [22.18, 19]. Such an analysis enables her to define a schematic proof as “a recursive function which outputs a proof of some proposition P(n) given some n as input” [22.59, p. 52].

Consider inductive theorems with a parameter, which, in Jamnik’s proposed taxonomy, are theorems where the diagram that is used to prove them represents one particular instance. An example of a theorem pertaining to this category is the sum of squares of Fibonacci numbers. According to this theorem, the sum of n squares of Fibonacci numbers equals the product of the n-th and (n + 1)-th Fibonacci numbers. In symbols,

$$\mathrm{Fib}(1)^2 + \mathrm{Fib}(2)^2 + \cdots + \mathrm{Fib}(n)^2 = \mathrm{Fib}(n)\,\mathrm{Fib}(n+1).$$

Consider now Fig. 22.16. By looking at the spatial arrangement of the dots, we first take the rectangle of length Fib(n + 1) and height Fib(n). Then, we split it in a square of magnitude Fib(n), that is, the smaller side of the rectangle. We continue decomposing the remaining rectangle in a similar fashion until it is exhausted, that is, for all n. The sides of the created squares represent the consecutive Fibonacci numbers, and the longer side of every new rectangle is equal to the sum of the sides of two consecutive squares, which is precisely how the Fibonacci numbers are defined. As noted by Jamnik, the proof can also be carried out inversely, that is, starting from a square of unit magnitude ($\mathrm{Fib}(1)^2$) and joining it on one of its sides with another square of unit magnitude ($\mathrm{Fib}(2)^2$): we have a rectangle. Then we can take the rectangle and join to it a square of the magnitude of its longer side, so as to create another rectangle. The procedure can be repeated for all n.

The schematic diagrammatic proof for this theorem would then be a sequence of steps that need to be performed on the diagram in Fig. 22.16:

1. Split a square from a rectangle. The square should be of a magnitude that is equal to the smaller side of a rectangle (note that aligning squares of Fibonacci numbers in this way is a method of generating Fibonacci numbers, that is, 1, 1, 1 + 1 = 2, 1 + 2 = 3, 2 + 3 = 5, etc.).
2. Repeat this step on the remaining rectangle until it is exhausted.

These steps are sufficient to transform a rectangle of magnitude Fib(n + 1) by Fib(n) to a representation of the right-hand side of the theorem, that is, n squares of magnitudes that are increasing Fibonacci numbers [22.59, p. 66].

A schematic proof is thus a schematic program which by instantiation at n gives a proof of every proposition P(n). The constructive ω-rule justifies that such a recursive program is indeed a proof of a proposition for all n. This rule is based on the ω-rule, that is, an infinitary logical rule that requires an infinite number of premises to be proved in order to conclude a universal statement. The uniformity of this procedure is captured in the recursive program, for example, proof(n). Jamnik’s attempt is thus to formalize and implement the idea that the generality of a proof is captured in a variable number of applications of geometrical operations on a diagram, and as a consequence to challenge the argument according to which human mathematical reasoning is fundamentally noncomputational, and, therefore, cannot be automatized. Details about DIAMOND’s functioning cannot be given here. We just point out that also in this case diagrammatic reasoning is interpreted as a series of operations on a particular diagram, which can be repeated on other diagrams displaying the same geometric features.
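The two-step procedure described above can be sketched as a short program. The Python fragment below is a minimal illustration, not DIAMOND’s implementation: the names `fib` and `split_squares` are hypothetical, and the diagram is reduced to the side lengths of a rectangle. It repeatedly cuts a square off the shorter side of a Fib(n + 1) by Fib(n) rectangle and records the side of each square.

```python
def fib(n):
    """n-th Fibonacci number with Fib(1) = Fib(2) = 1."""
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return a


def split_squares(n):
    """Decompose a Fib(n+1) x Fib(n) rectangle into squares.

    Step 1: cut off a square whose side equals the shorter side.
    Step 2: repeat on the remaining rectangle until it is exhausted.
    Returns the side lengths of the squares, in the order they are cut.
    """
    long_side, short_side = fib(n + 1), fib(n)
    squares = []
    while short_side > 0:
        squares.append(short_side)
        # What remains is a short_side x (long_side - short_side) rectangle.
        long_side, short_side = short_side, long_side - short_side
    return squares


squares = split_squares(4)          # rectangle 5 x 3 -> [3, 2, 1, 1]
area = sum(s * s for s in squares)  # 9 + 4 + 1 + 1 = 15 = Fib(4) * Fib(5)
```

For each n, the procedure yields n squares whose areas add up to the area of the rectangle, which is the instance P(n) of the theorem. Running it for any finite range of n checks instances only; the claim that the same procedure works for all n is what the constructive ω-rule is meant to justify.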
The main epistemological thesis of the book is that there is no reason to assume a uniform evaluation that would fit all cases of visual thinking in mathematics, since visual operations are diverse depending on the mathematical context. Moreover, in order to assess this thesis, we do not need to refer to advanced mathematics: basic mathematics is already enough to account for the process of mathematical discovery by an individual who reasons visually. In fact, only the final part of the book goes beyond very elementary mathematics.

It should be mentioned that also Giaquinto defends a neo-Kantian view according to which in geometry we can find cases of synthetic a priori knowledge, that is, cases that do not involve either analysis of meanings or deduction from definitions. In fact, he refers to the already mentioned study by Norman, which is neo-Kantian in spirit, as a strong case showing that following Euclid’s proof of the proposition that the internal angles of a triangle sum to two right angles requires visual thinking, and that visual thinking is not replaceable by nonvisual thinking [22.23]. Nonetheless, the focus in this section will be mostly on the last chapter of the book, where Giaquinto discusses how the traditional twofold division between algebraic thinking versus geometric thinking is not appropriate for accounting for mathematical reasoning. His conclusion, which can be borrowed also as a conclusion of the present chapter, is that there is a need for a much more comprehensive taxonomy for spatial reasoning in mathematics, which would include operations such as visualizing motion, noticing reflection symmetry, and shifting aspects. In fact, if one considers thinking in mathematics as a whole, then there arises a sense of dissatisfaction with any of the common binary distinctions that have been proposed between algebraic thinking on the one hand and geometric thinking on the other; the philosopher’s aim should be to move toward a much more discriminating taxonomy of kinds of mathematical reasoning.

Consider, for example, aspect shifting as precisely one form of mathematical thinking that seems to elude standard distinctions. Aspect shifting is the same cognitive ability that Macbeth describes in discussing the way in which the Greek geometer – and every one of us today who practices Euclidean geometry – reasoned in the Euclidean diagram. Take again the visual proof given in Sect. 22.1 for the Pythagorean theorem. As Giaquinto explains, it is possible to look at the square in Fig. 22.1b – that has letters in it, and, therefore, is a kind of lettered diagram – and see that the area of the larger square is equal to the area of the smaller square plus the area of the four right-angled triangles [22.62, pp. 240–241]. How do we acquire this belief? Giaquinto’s reply is that first we have to reason geometrically and shift between aspects, so as to recognize that the area of the square is both $(a + b)^2$ and $2ab + c^2$. From here, we then have to proceed algebraically as follows

$$a^2 + 2ab + b^2 = 2ab + c^2, \qquad a^2 + b^2 = c^2. \tag{22.16}$$

At this point, by looking back at the figure, we realize – geometrically again – that the smaller square is also the square of the hypotenuse of the right-angled triangle. Finally, from the formula, we conclude that the area of the square of the hypotenuse is equal to the sum of the squares of its other two sides. Then the question is: Is this argument as a whole to be considered as primarily algebraic or geometric? It seems that neither of these two categories would be fully appropriate to capture it.

This is an interesting point also relative to other kinds of mathematical reasoning by means of some particular representation. Consider a notation that is used in topology and take as an example the torus that can be defined as a square with its sides identified. In order to obtain the torus from a square, we identify all its four sides in pairs. The square in Fig. 22.17a has arrows in it indicating the gluings, that is, the identifications. First, we identify two sides in the same direction, so as to obtain the cylinder (Fig. 22.17b); then, we identify the other two, again in the same direction: in Fig. 22.17c, one can see the torus with two marked curves, where the gluings, that is, the identifications, were made.

Fig. 22.17a–c Constructing the torus

In discussing the role of notation in mathematics, Colyvan takes into consideration diagrams such as the one in Fig. 22.17a, and points out that this notation is “something of a halfway house between pure algebra and pure geometry” [22.65, p. 163]. In Colyvan’s view these diagrams are, on the one hand, a piece of notation, but, on the other, also an indication on how to construct the object in question. The first feature seems to belong to algebra, while the second to geometry. Moreover, note that if we identify two sides of the square
520 Part E Models in Mathematics
in the same direction and the other two in the opposite where geometric refers mostly to geometric construc-
Part E | 22.8
direction, we obtain the Klein bottle, which is a very tions as methods of geometry rather than algebra. But
peculiar object, since it is three-dimensional but needs eventually this geometric element was significantly ex-
four spatial dimensions for its construction, and, even panded and groups became geometric objects in virtue
more interestingly, has no inside or outside. The Klein of their revealed geometric properties. The introduction
bottle demonstrates how powerful such a notation is: it of graphs thus provided mathematicians with a power-
leads to objects that would be otherwise considered as ful instrument for facilitating their intuitive capacities
nonsense, and it also allows us to deduce their prop- and furthermore gave a good start for further intu-
erties. As Colyvan summarizes, “Whichever way you itions which finally lead to advanced conceptual links
look at it, we have a powerful piece of notation here that with geometry and the definition of a broader geomet-
does some genuine mathematical work for us” ([22.65, ric arsenal to algebra. Also in the knot theory example
p. 163], emphasis added). Diagrams, as well as other (Sect. 22.5.3), knot diagrams are shown to have at the
powerful notations, operate at our place. Moreover, at same time diagrammatic and symbolic elements, and,
least some of them seem to be some kinds of hybrid ob- as a consequence, their nature cannot be captured by
jects, trespassing boundaries. They are geometric and the traditional dichotomy between geometric and alge-
algebraic at the same time. braic reasoning. All this is to show that Giaquinto’s
Consider again the relations between the algebra of invitation to define a “more discriminating and more
combinatorial groups and their geometry (Sect. 22.5.2). comprehensive” [22.13, p. 260] taxonomy for mathe-
As Starikova tells us, first the combinatorial group the- matical thinking going beyond twofold divisions is still
ory was amplified with a geometric element – a graph – valid, and that more on this topic needs to be done.
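The claim that the gluing notation lets us deduce properties of the resulting object can be illustrated computationally. The following is a minimal sketch, not taken from the sources cited here: it assumes the arrows on the square are encoded as a boundary word, where a lowercase letter marks an edge whose arrow agrees with one walk around the boundary and the uppercase letter marks the same edge met against its arrow. Under that convention, two opposite sides identified in the same direction appear once lowercase and once uppercase. The function name `surface_from_word` is hypothetical.

```python
def surface_from_word(word):
    """Return (orientable, euler_characteristic) for a polygon glued in pairs.

    `word` lists the edge labels met while walking once around the boundary.
    Lowercase: the edge arrow agrees with the walk; uppercase: it opposes it.
    """
    n = len(word)
    parent = list(range(n))  # union-find over the n corners of the polygon

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    def union(u, v):
        parent[find(u)] = find(v)

    first_seen = {}   # letter -> (tail corner, head corner, agrees with walk?)
    orientable = True
    for i, symbol in enumerate(word):
        forward = symbol.islower()
        nxt = (i + 1) % n
        tail, head = (i, nxt) if forward else (nxt, i)
        letter = symbol.lower()
        if letter in first_seen:
            t0, h0, forward0 = first_seen[letter]
            union(tail, t0)   # glued edges share their tails ...
            union(head, h0)   # ... and their heads
            if forward == forward0:
                orientable = False  # same sign twice: a Moebius band is inside
        else:
            first_seen[letter] = (tail, head, forward)

    vertices = len({find(v) for v in range(n)})
    edges = n // 2
    faces = 1
    return orientable, vertices - edges + faces


print(surface_from_word("abAB"))  # both pairs in the same direction (torus)
print(surface_from_word("abaB"))  # one pair reversed (Klein bottle)
```

Both words give Euler characteristic 0, but only the first is orientable, which is one way of reading the remark that the Klein bottle has no inside or outside.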
22.8 Conclusions
The objective of the present chapter was to introduce the different studies that have recently been devoted to diagrammatic reasoning in mathematics. The first topic discussed was the role of diagrams in Euclidean and Greek geometry in general (Sect. 22.3); then, the productive ambiguity of diagrams was defined (Sect. 22.4) and case studies in contemporary mathematics were briefly reviewed (Sect. 22.5). It has been shown how some attempts have been made to automate diagrammatic reasoning in mathematics, in particular to formalize arguments in Euclidean geometry and proofs in number theory (Sect. 22.6); finally, it has been argued that attention to diagrammatic reasoning in mathematics can shed light on the fact that mathematics makes use of different kinds of representations that are so intertwined that it is difficult to draw sharp distinctions between the different subpractices and the corresponding reasoning (Sect. 22.7). We started from the study of diagrammatic reasoning and we arrived at the consideration of mathematical thinking as a whole, and of the role of notations and representations in it. Mathematicians use a vast range of cognitive tools to reason and communicate mathematical information; some of these tools are material, and, therefore, they can easily be shared, inspected, and reproduced. Specific representations are introduced in a specific practice and, once they enter into the set of the available tools, they may have an influence on the very same practice. This process plays a significant role in mathematics.

There is a last remark to make at the end of this survey, that is, that in diagrammatic reasoning, we have seen the continuity and the discreteness of space operating. Continuous was the space of the Euclidean diagram, discrete (at least in part) the space of the diagrams for Galileo’s theorem and for the sum of the Fibonacci numbers. Diagrammatic reasoning thus seems to have fundamentally a geometric nature, since it organizes space. Nonetheless, we have also shown that a diagram never comes alone, but always with some form of text giving indications for its construction or stipulating its correct interpretation. The relation between the diagram and text is defined each time by the specific practice. As a consequence, diagrams appear to be very interesting hybrid objects, whose nature cannot be totally captured by standard oppositions. They are cognitive tools available for thought, whose effectiveness depends on both our spatial and our linguistic cognitive nature.

Acknowledgments. My thanks go to the people quoted in the text, for their work and for the fruitful discussions in which I have taken part at recent conferences and workshops. I am particularly indebted to Mario Piazza and Silvia De Toffoli, with whom I have extensively reflected upon the topic of diagrammatic reasoning in mathematics. I am also grateful to Albrecht Heeffer, who gave me the occasion of working on this chapter, and to the Université de Lorraine and the Région Lorraine for having supported my research.
References
22.1 J.R. Brown: Philosophy of Mathematics: An Introduction to the World of Proofs and Pictures (Routledge, New York 1999)
22.2 D. Sherry: The role of diagrams in mathematical arguments, Found. Sci. 14, 59–74 (2009)
22.3 S.-J. Shin: Heterogeneous reasoning and its logic, Bull. Symb. Log. 10(1), 86–106 (2004)
22.4 E. Maor: The Pythagorean Theorem. A 4000-Year History (Princeton Univ. Press, Princeton 2007)
22.5 J. Høyrup: Tertium non datur: On reasoning styles in early mathematics. In: Visualization, Explanation and Reasoning Styles in Mathematics, Synthese Library, Vol. 327, ed. by P. Mancosu, K.F. Jørgensen, S.A. Pedersen (Springer, Dordrecht 2005) pp. 91–121
22.6 K. Chemla: The interplay between proof and algorithm in 3rd century China: The operation as prescription of computation and the operation as argument. In: Visualization, Explanation and Reasoning Styles in Mathematics, ed. by P. Mancosu, K.F. Jørgensen, S.A. Pedersen (Springer, Berlin 2005) pp. 123–145
22.7 K. Stenning, O. Lemon: Aligning logical and psychological perspectives on diagrammatic reasoning, Artif. Intell. Rev. 15, 29–62 (2001)
22.8 J. Barwise, J. Etchemendy: Visual information and valid reasoning. In: Logical Reasoning with Diagrams, ed. by G. Allwein, J. Barwise (Oxford Univ. Press, Oxford 1996) pp. 3–25
22.9 S.-J. Shin, O. Lemon, J. Mumma: Diagrams. In: The Stanford Encyclopedia of Philosophy, ed. by E. Zalta, Fall 2013 Edition, http://plato.stanford.edu/archives/fall2013/entries/diagrams/
22.10 S.-J. Shin: The mystery of deduction and diagrammatic aspects of representation, Rev. Philos. Psychol. 6, 49–67 (2015)
22.11 B. Russell: The Principles of Mathematics (W.W. Norton, London 1903/1937)
22.12 R. Netz: The Shaping of Deduction in Greek Mathematics: A Study of Cognitive History (Cambridge Univ. Press, Cambridge 1999)
22.13 M. Giaquinto: The Search for Certainty (Oxford Univ. Press, Oxford 2002)
22.14 F. Klein: Elementary Mathematics from an Advanced Standpoint (Dover, Mineola 2004), the first German edition is 1908
22.15 D. Hilbert: The Foundations of Geometry (K. Paul, Trench, Trübner, London 1899/1902)
22.16 P. Mancosu, K.F. Jørgensen, S.A. Pedersen (Eds.): Visualization, Explanation and Reasoning Styles in Mathematics (Springer, Berlin 2005)
22.17 V.F.R. Jones: A credo of sorts. In: Truth in Mathematics, ed. by H.G. Dales, G. Oliveri (Clarendon, Oxford 1998)
22.18 R. Nelsen: Proofs without Words II: More Exercises in Visual Thinking, Classroom Resource Materials (The Mathematical Association of America, Washington 2001)
22.19 R. Nelsen: Proofs without Words: Exercises in Visual Thinking, Classroom Resource Materials (The Mathematical Association of America, Washington 1997)
22.20 D. Kirsh, P. Maglio: On distinguishing epistemic from pragmatic action, Cogn. Sci. 18, 513–549 (1994)
22.21 J. Ferreiros: Mathematical Knowledge and the Interplay of Practices (Princeton Univ. Press, Princeton 2015)
22.22 L.A. Shabel: Mathematics in Kant’s Critical Philosophy: Reflections on Mathematical Practice (Routledge, New York 2003)
22.23 J. Norman: After Euclid (CSLI Publications, Univ. Chicago Press, Chicago 2006)
22.24 C.S. Peirce: Collected Papers (The Belknap Press of Harvard Univ. Press, Cambridge 1965)
22.25 J. Azzouni: Proof and ontology in Euclidean mathematics. In: New Trends in the History and Philosophy of Mathematics, ed. by T.H. Kjeldsen, S.A. Pedersen, L.M. Sonne-Hansen (Univ. Press of Southern Denmark, Odense, Denmark 2004) pp. 117–133
22.26 W.P. Thurston: On proof and progress in mathematics, Bull. Am. Math. Soc. 30(2), 161–177 (1994)
22.27 P. Mancosu (Ed.): The Philosophy of Mathematical Practice (Oxford Univ. Press, Oxford 2008)
22.28 K. Manders: The Euclidean diagram. In: The Philosophy of Mathematical Practice, ed. by P. Mancosu (Oxford Univ. Press, Oxford 2008) pp. 80–133
22.29 K. Manders: Diagram-based geometric practice. In: The Philosophy of Mathematical Practice, ed. by P. Mancosu (Oxford Univ. Press, Oxford 2008) pp. 65–79
22.30 D. Macbeth: Diagrammatic reasoning in Euclid’s elements. In: Philosophical Perspectives on Mathematical Practice, Vol. 12, ed. by B. Van Kerkhove, J. De Vuyst, J.P. Van Bendegem (College Publications, London 2010)
22.31 H.P. Grice: Meaning, Philos. Rev. 66, 377–388 (1957)
22.32 P. Catton, C. Montelle: To diagram, to demonstrate: To do, to see, and to judge in Greek geometry, Philos. Math. 20(1), 25–57 (2012)
22.33 D. Macbeth: Diagrammatic reasoning in Frege’s Begriffsschrift, Synthese 186, 289–314 (2012)
22.34 M. Panza: The twofold role of diagrams in Euclid’s plane geometry, Synthese 186(1), 55–102 (2012)
22.35 M. Panza: Rethinking geometrical exactness, Hist. Math. 38, 42–95 (2011)
22.36 C. Parsons: Mathematical Thought and Its Objects (Cambridge Univ. Press, Cambridge 2008)
22.37 P. Mancosu (Ed.): From Brouwer to Hilbert. The Debate on the Foundations of Mathematics in the 1920s (Oxford Univ. Press, Oxford 1998)
22.38 Proclus: In primum Euclidis Elementorum librum commentarii (B.G. Teubner, Leipzig 1873), ex recognitione G. Friedlein, in Latin
22.39 Proclus: A Commentary on the First Book of Euclid’s Elements (Princeton Univ. Press, Princeton 1992), Translated with introduction and notes by G.R. Morrow
22.40 Aristotle: Metaphysics, Book E, 1026a, 6–10
22.41 E. Grosholz: Representation and Productive Ambiguity in Mathematics and the Sciences (Oxford Univ. Press, Oxford 2007)
22.55 J. Mumma: Proofs, pictures, and Euclid, Synthese
23. Deduction, Diagrams and Model-Based Reasoning
John Mumma
A key piece of data in understanding mathematics from the perspective of model-based reasoning is the use of diagrams to discover and to convey mathematical concepts and proofs. A paradigmatic example of such use is found in the classical demonstrations of elementary Euclidean geometry. These are invariably presented with accompanying geometric diagrams. Great progress has been made recently with respect to the precise role the diagrams play in the demonstrations, so much so that diagrammatic formalizations of elementary Euclidean geometry have been developed. The purpose of this chapter is to introduce these formalizations to those who seek to understand mathematics from the perspective of model-based reasoning.

The formalizations are named FG and Eu. Both are based on insights articulated in Ken Manders’ seminal analysis of Euclid’s diagrammatic proofs. The chapter presents these insights, the challenges involved in realizing them in a formalization, and the way FG and Eu each meet these challenges. The chapter closes with a discussion of how the formalizations can each be thought to prespecify a species of model-based reasoning.

23.1 Euclid’s Systematic Use of Geometric Diagrams
23.2 Formalizing Euclid’s Diagrammatic Proof Method
23.2.1 The Formal System FG
23.2.2 The Formal System Eu
23.3 Formal Geometric Diagrams as Models
References
The formalization of mathematical knowledge has been a mainstay of the philosophy of mathematics since the end of the nineteenth century. The goals and assumptions characteristic of the enterprise are foundationalist. A piece of mathematics is formalized to obtain a clear view of it within the context of justification. The various lemmas, theorems and corollaries of the mathematics are translated into sentences of a fixed formal language, whereby it becomes possible to ascertain with precision the logical relationships of the lemmas, theorems and corollaries to one another and to a group of sentences distinguished as axioms. The end result is a picture of how the mathematics is, or at least can be, grounded on a collection of its basic truths.

Formalization would thus seem to be of little use to those who seek to understand mathematics from the perspective of model-based reasoning. The goal from this perspective is to obtain a clear view of the mathematics within the context of discovery. What is of interest is how the lemmas, theorems, and corollaries of the mathematics came to be known in the first place. A fundamental premise is that the process is a reasoning process, where the reasoning involved is different in kind from the strictly regulated procedures of inference prescribed by a formalization. Inference concerning some mathematical subject X is driven by the reasoner’s engagement with representations modeling X, rather than the logical form of sentences expressing claims about X. For a paradigmatic example of such an X consider elementary geometry. From the perspective of the tradition that investigates mathematical knowledge via formalization, what is fundamental are sentences expressing the axioms and theorems of elementary geometry in a fixed formal language. Geometric reasoning is depicted as a progression of sentences laid out along the rigid pathways defined by the formalization’s rules. From the perspective of model-based reasoning, what is fundamental are the diagrams that model the geometric situations that the axioms and theorems concern. Geometric reasoning is an open-ended process in which mind and diagram interact.

A curious recent development has been the appearance of formalizations advanced to show that diagrams of elementary geometry can be understood as part of the formal syntax of the subject’s proofs. These specifically are the proof systems FG [23.1] and Eu [23.2]. Since the target of the formalizations is the use of diagrams in proving geometric theorems, one may think that they provide a model-based reasoning picture of mathematical proof. At the same time, by the very fact that they are formalizations, one may take them to miss what is important about geometric diagrams from the perspective of model-based reasoning. Perhaps the formal objects identified as geometric diagrams within them are best understood as sentences formulated with an unconventional notation.

In this chapter I present FG and Eu with the aim of illuminating their potential relevance to researchers in model-based reasoning. For the presentation of a formal proof system closely related to Eu (termed E) see [23.3]. For a discussion of the utility of the analyses of Eu and E in understanding Euclidean diagrammatic reasoning from a cognitive perspective, see [23.4]. The formalizations are based on principles formulated in [23.5], Ken Manders’ seminal analysis of Euclid’s diagrammatic proof method in the Elements. In the chapter’s first section I present this analysis. In the second, I sketch FG and Eu as formal proof systems. Finally, in the chapter’s third section, I advance an interpretation of the systems where each characterizes a kind of model-based reasoning.
Let ABC be an isosceles triangle having the side AB equal to the side AC; and let the straight lines BD, CE be produced further in a straight line with AB, AC (postulate 2).

I say that the angle ∠ABC is equal to the angle ∠ACB, and the angle ∠CBD to the angle ∠BCE.

Let a point F be taken at random on BD; from AE the greater let AG be cut off equal to AF the less (proposition I, 3); and let the straight lines FC, GB be joined (postulate 1).

Then, since AF is equal to AG and AB to AC, the two sides FA, AC are equal to the two sides GA, AB, respectively; and they contain a common angle, the angle ∠FAG. Therefore, the base FC is equal to the base GB, and the triangle AFC is equal to the triangle AGB, and the remaining angles will be equal to the remaining angles respectively, namely those which the equal sides subtend, that is, the angle ∠ACF to the angle ∠ABG, and the angle ∠AFC to the angle ∠AGB (proposition I, 4).

And, since the whole AF is equal to the whole AG, and in these AB is equal to AC, the remainder BF is equal to the remainder CG.

But FC was also proved equal to GB; therefore the two sides BF, FC are equal to the two sides CG, GB respectively; and the angle ∠BFC is equal to the angle ∠CGB, while the base BC is common to them;

[...]

∠CBG is equal to the angle ∠BCF, the remaining angle ∠ABC is equal to the remaining angle ∠ACB; and they are at the base of the triangle ABC. But the angle ∠FBC was also proved equal to the angle ∠GCB; and they are under the base.

Two steps of the proof rely on the diagram. The first is the application of the equals-subtracted-from-equals rule (common notion 3 in the Elements) to infer the equality of lengths BF and CG, the second is the application of the same rule to infer the equality of angles ∠ABC and ∠ACB. A requirement for the correct application of the common notion is that certain co-exact containment relations hold. In order to apply equals-subtracted-from-equals in the last step for instance, angle ∠ABG is required to contain ∠ABC and ∠CBG, and angle ∠ACF is required to contain ∠ACB and ∠BCF. On Manders’ account the diagram of the proof licenses the inference that these co-exact conditions are satisfied.

Generally, the results of elementary geometry depend on nonmetric positional relations holding between the components of a configuration. A method for proving the results, then, must provide a means for recording such information about a configuration, and grounding inferences with respect to it. According to Manders’ account of Euclid’s method, diagrams fulfill this function, and do so in a mathematically legitimate way, i.e., they do not compromise the rigor of the method.
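The division of labor described above, with equalities supplied by the text and containment supplied by the diagram, can be made concrete with a toy rule. This is a minimal sketch under stated assumptions, not part of FG, Eu, or Manders' analysis: magnitudes are plain strings, and the hypothetical set `contains` stands in for the co-exact facts a reader takes from the diagram, as (whole, part, remainder) triples.

```python
def subtract_equals(whole_eq, part_eq, contains):
    """Apply 'equals subtracted from equals' only if the diagram licenses it.

    whole_eq, part_eq: pairs of magnitudes already proved equal in the text.
    contains: set of (whole, part, remainder) triples read off the diagram.
    Returns the pair of equal remainders, or None if containment is missing.
    """
    (w1, w2), (p1, p2) = whole_eq, part_eq
    r1 = next((r for (w, p, r) in contains if (w, p) == (w1, p1)), None)
    r2 = next((r for (w, p, r) in contains if (w, p) == (w2, p2)), None)
    if r1 is None or r2 is None:
        return None  # the co-exact precondition is not attested
    return (r1, r2)


# Containment facts taken from the diagram of proposition I, 5.
diagram = {
    ("AF", "AB", "BF"), ("AG", "AC", "CG"),            # segments
    ("ABG", "CBG", "ABC"), ("ACF", "BCF", "ACB"),      # angles
}

print(subtract_equals(("AF", "AG"), ("AB", "AC"), diagram))      # lengths
print(subtract_equals(("ABG", "ACF"), ("CBG", "BCF"), diagram))  # angles
print(subtract_equals(("AF", "AG"), ("AB", "AC"), set()))        # no diagram
```

With the diagram's containment facts the rule yields BF = CG and ∠ABC = ∠ACB; without them it yields nothing, which mirrors the point that the exact equalities alone do not justify the step.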
other figures which are not exact duplicates of the original. And so, for Euclid, consultation of the original diagram, with all its particular features, is somehow supposed to license a generalization. But Euclid leaves the process by which this is done obscure. And so we are left with some doubt as to whether the jump from the particular to the general is justified. Even before the nineteenth century, when the legitimacy of Euclid’s methods was taken for granted, philosophers recognized that there was something to be explained with this jump.

Manders’ exact/co-exact distinction provides the basis for a partial explanation. The co-exact properties of a diagram can be shared by all geometric configurations in the range of a proof, and so in such cases one is justified in reading off co-exact properties from the diagram. In a proof about triangles for instance, variation among the configurations in the range of the proof is variation of exact properties, e.g., the measure of the triangles’ angles or the ratios between their sides. They all share the same co-exact properties, i.e., they all consist of three bounded linear regions which together define an area.

This is not a full answer because Euclid’s proofs typically involve constructions on an initial configuration type. With the proof of proposition 5, for example, a construction on a triangle is specified. In such cases, a diagram may adequately represent the co-exact properties of an initial configuration. But the result of applying a proof’s construction to the diagram cannot be assumed to represent the co-exact properties of all configurations resulting from the construction. One does not need to consider complex geometric constructions to see this. Suppose for instance the initial configuration type of a proof is a triangle. Then the diagram

[diagram]

serves to represent the co-exact properties of this type. Suppose further that the first step of a proof’s construction is to drop the perpendicular from a vertex of the triangle to the line containing the side opposite the vertex. Then the result of carrying this step out on the diagram, i.e.,

[diagram]

[...]

construction step results in a perpendicular lying outside the triangle. For example, with the triangle

[diagram]

the result of applying the construction step is

[diagram]

And so, carrying out a Euclidean construction on a representative diagram can result in an unrepresentative diagram. If a formal system is to provide a compelling analysis of Euclid’s diagrammatic proofs it must account for this in carrying out task 3.

23.2.1 The Formal System FG

Task 1 in FG: FG Diagrams

The four fundamental syntactical notions of FG are frame, dot, solid segment, and dotted segment. Every FG diagram possesses a frame, characterized as “a rectangular box drawn in the plane” [23.1, p. 22]. Within it dots, solid segments, and dotted segments can lie. The dots of an FG diagram are point-like graphic objects. Solid segments and dotted segments are one-dimensional graphic objects that do not intersect any other objects of the diagram and terminate either in dots or the diagram’s frame. Solid segments serve to represent line segments, and dotted segments serve to represent arcs of circles. Accordingly, an FG diagram comes equipped with a partition on its set of solid segments and a partition on its set of dotted segments. The dlines of the diagram are the components of the former partition, and the dcircles of the diagram are the components of the latter. See Fig. 23.1 for an example of an FG diagram.

Fig. 23.1 Example FG diagram

Aside from the requirement that solid and dotted segments do not intersect anything (including themselves), there are no constraints imposed upon them. They are free to bend and curve any which way between the dots that bound them. Consequently, there is no upper bound on the number of times sets of such objects can intersect one another at dots within an FG frame. There are upper bounds, however, on the number of times Euclidean lines and circles can intersect one another in points. Two distinct lines, for instance, intersect in at most one point. Thus, so that dlines and dcircles intersect one another like Euclidean lines and circles, they are required to satisfy a variety of conditions. One of these conditions, for instance, ensures that two dlines do not intersect at more than one dot in an FG diagram. For the details see [23.1, Sect. 2.1]. As illustrated in the discussion of FG’s completion of task 3 below, such conditions play an essential role in FG’s formalization of Euclid’s proofs.

[...]

a dline, extending a dline, and adding a dcircle. And so for any Euclidean construction there is a parallel FG construction. Yet Euclidean constructions, in general, do not yield unique corresponding graph structures. Consequently, if we are given a Euclidean construction and produce an FG diagram D according to the parallel FG construction, we cannot assume that the corresponding graph structure of D is shared by all configurations produced by the construction. For a simple example, suppose that D0 is the diagram

[diagram]
ant co-exact relations of a construction is to produce all FG cases of the construction, and determine which co-exact relations are obtained in all cases. Such a method is satisfactory, of course, only if there is a procedure for producing all the FG cases of the construction. Miller has implemented such a procedure in a computer program CDEG, the general principles behind which he describes in section 3.5 of [23.1]. To understand how the procedure works, consider the step in the construction of proposition I, 5 in which the segment from F to C is added. The configuration on which the construction is performed can be represented by the following FG diagram (the hashmarks in the diagram represent equality of lengths according to the standard convention).

[diagram]

The parallel construction step in FG is to add a dline connecting the points representing F and C. The cases that result from this step are individuated by the ways the new dline can snake through the regions of the corresponding graph structure of D0. If the dline were conceived simply as a one-dimensional curve, some of these cases would be

[diagrams]

[...]

in more than one dot. This eliminates all but the first and last case

[diagrams]

Generally, FG cases can arise when a dline is added to a diagram, when a dcircle is added to a diagram, or when a dline is extended. For each possibility, the conditions on dlines and dcircles are such that the possible routes of the added element through the corresponding graph structure of the diagram are sufficiently restricted, i.e., the resulting FG cases are finite in number and can be systematically enumerated.

A side note: FG is a purely diagrammatic formal system. Thus, the techniques whereby one recognizes parts of an FG diagram (e.g., its dots) in terms of Euclid’s verbal presentation of a construction (e.g., the point F, the point G) are taken from the beginning to be external to it. As discussed in the next section, Eu is a heterogeneous system, i.e., it possesses both a diagrammatic and a sentential syntax. Sentential symbols label its diagrams, and formalize (to a certain extent) a means for relating sentential and diagrammatic representations. The labels also provide a means for classifying two diagrams as equivalent with respect to the geometric information they express. Nothing in the definition of FG diagrams prevents the development of a heterogeneous version of FG with these features.
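The way a condition on dlines prunes candidate cases can be sketched in a few lines. This is only an illustration under simplifying assumptions and is not Miller's CDEG procedure: each candidate case is reduced to a hypothetical mapping from dline names to the sets of dots lying on them, and the only condition checked is that two distinct dlines share at most one dot. The dot names X and Y and the case labels are invented for the example.

```python
from itertools import combinations


def violates_dline_condition(dlines):
    """True if some pair of distinct dlines shares more than one dot."""
    return any(len(a & b) > 1 for a, b in combinations(dlines.values(), 2))


def admissible_cases(cases):
    """Names of the candidate cases that satisfy the dline condition."""
    return [name for name, dlines in cases.items()
            if not violates_dline_condition(dlines)]


# Candidate ways of adding the dline FC to a diagram for proposition I, 5.
# F lies on the dline through A and B; G lies on the dline through A and C.
cases = {
    "direct": {
        "AB": {"A", "B", "F"}, "AC": {"A", "C", "G"}, "FC": {"F", "C"},
    },
    "recrosses AC": {   # FC meets AC at an extra dot X as well as at C
        "AB": {"A", "B", "F"}, "AC": {"A", "X", "C", "G"}, "FC": {"F", "X", "C"},
    },
    "recrosses AB": {   # FC meets AB at an extra dot Y as well as at F
        "AB": {"A", "B", "Y", "F"}, "AC": {"A", "C", "G"}, "FC": {"F", "Y", "C"},
    },
}

print(admissible_cases(cases))
```

A full enumeration would also have to generate the candidate routes through the regions of the graph structure and check the conditions on dcircles; the sketch only shows the filtering step.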
A linear element of an Eu diagram is a linear subset of its array elements, i.e., a subset of array elements whose coordinates satisfy a linear equation. The elements of a linear element can be further constrained by inequalities on its first or second coordinate. If there is one inequality to be satisfied the linear element is a ray; if there are two it is a segment. The linear element of the diagram below is a segment defined by the conditions: $y = x + 1$, $1 \le x \le 5$.

[diagram]

Finally, a circle of an Eu diagram is a subset of array elements that form the perimeter of a convex polygon.

The correspondence between the abstract, formal diagrams of Eu and the concrete diagrams they model is not one-to-one, but many-to-one. Specifically, a single concrete diagram is modeled in Eu by a set of Eu diagrams with the same syntactic type. What is and is not possible with the concrete diagram modeled is determined by all Eu diagrams with the concrete diagram’s syntactic type.

Roughly, having the same syntactic type means differing only with respect to the number of underlying array entries. Given a diagram δ we can increase the number of array entries it contains while leaving the relative position of its objects within the array fixed. Since the resulting Eu diagram has the same objects with the same relative positions, it is taken to model the same concrete diagram δ does.

The procedure of refinement addresses a worry one may have about the suitability of Eu’s diagrams. As discrete objects, they will fail in general to produce the intersection points that appear in Euclid’s diagrams. For instance, in the diagram

[diagram]

Given what this diagram is intended to represent, we ought to be able to produce an intersection point between the segment and the line. But the underlying array of the diagram is too coarse. An array entry does not exist where a point ought to be.

This can always be dealt with by refining an Eu diagram into one of the same syntactic type. The equation that characterizes a line (and the circumference of a circle) is linear, expressed in terms of the coordinates of the array entries. Since the arrays are discrete, the coefficients of the equation are always integers. Thus, the solution for two equations characterizing geometric elements of a diagram will always be rational.

This means that if two geometric elements ought to intersect but don’t in a diagram, we can always find a diagram of the same syntactic type where they do. It will just be the original diagram with a more refined underlying array. In particular, if the original diagram has dimension $n$ and the solution between the two equations is a rational with an $m$ in its denominator, the new diagram will have dimension $m(n-1)+1$.

For the diagram above, then, adding the desired intersection point is a two-step process. First, the diagram is refined to an equivalent diagram of dimension 7.

[diagram]

Then the intersection point is added.

[diagram]

Another natural worry has to do with the circles of Eu diagrams. The circles that appear in Euclid’s diagrams actually appear circular. The circles of Eu diagrams, however, are rectilinear. If Euclid exploits the circularity of his circles in his proofs, then the diagrams of Eu would fail to capture this aspect of Euclid’s mathematics. Euclid, however, never does this. All he seems to assume about circles is that they have an interior. Thus, with respect to the project of formalizing Euclid’s proofs, Eu circles suffice. If on the other hand the circular appearance of Euclid’s circles were deemed for other reasons to be important, the Eu syntax could be modified accordingly. Eu circles could be defined as regular polygons with, say, at least 1000 sides.

[...]

semantic equivalence is a relation between labeled Eu diagrams. A labeling of an Eu diagram assigns variables to the points, circles and end arrows of the diagram. If the same variables label the same object types in two diagrams, then the labeling induces a one-to-one correspondence between the objects of the two diagrams. This is a precondition of semantic equivalence. The two diagrams are then semantically equivalent if
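The arithmetic behind linear elements and refinement, as described earlier in this section, can be checked with a small sketch. It rests on assumptions that are not in the text: array entries are taken to be integer pairs (x, y) with coordinates from 0 to n - 1, lines are written as a*x + b*y = c with integer coefficients, and the function names and the two example lines are hypothetical, chosen so that a diagram of dimension 4 refines to dimension 7.

```python
from fractions import Fraction
from math import gcd


def segment(n, a, b, x_min, x_max):
    """Entries (x, y) of an n-by-n array with y = a*x + b and x_min <= x <= x_max."""
    return [(x, a * x + b) for x in range(n)
            if x_min <= x <= x_max and 0 <= a * x + b < n]


def intersection(line1, line2):
    """Exact solution of a1*x + b1*y = c1 and a2*x + b2*y = c2, or None if parallel."""
    (a1, b1, c1), (a2, b2, c2) = line1, line2
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None
    return (Fraction(c1 * b2 - c2 * b1, det), Fraction(a1 * c2 - a2 * c1, det))


def refined_dimension(n, point):
    """Smallest scale m making `point` an array entry, and the dimension m*(n-1)+1."""
    dx, dy = point[0].denominator, point[1].denominator
    m = dx * dy // gcd(dx, dy)
    return m, m * (n - 1) + 1


# The segment y = x + 1, 1 <= x <= 5, in an array of dimension 7.
print(segment(7, 1, 1, 1, 5))

# Two lines in an array of dimension 4 whose intersection falls between entries.
p = intersection((1, 1, 3), (1, -1, 0))   # x + y = 3 and x - y = 0
print(p, refined_dimension(4, p))
```

After scaling every coordinate by m, the rational intersection point lands on an array entry of the refined diagram, while the relative positions of the existing objects are unchanged.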
[Figure: Eu diagrams in panels a)–s), with points labeled A, B, C, E, F, G]
An Eu derivation thus splits into two stages, a construction stage and a demonstration stage. The construction stage is intended to correspond to the production of a geometric diagram as a concrete object, while the demonstration stage is intended to correspond, in part, to the reasoning carried out with the concrete object. (The demonstration stage also serves to record reasoning carried out with sentences representing relations between geometric magnitudes.) Both the construction and the demonstration stage in an Eu derivation can (and in most cases of interest do) contain many distinct Eu diagrams, even in the canonical case of a derivation that is intended to model a single-diagram proof. And so, how the distinct, abstract Eu diagrams of such a derivation are to be understood in relation to the concrete diagram of a single-diagram proof requires some explanation. The general idea is as follows: the different Eu diagrams of the construction correspond to different stages in the construction of a single concrete diagram D; each of the different Eu diagrams of the demonstration corresponds to the product of an act of attention directed at D.

Consider for instance the Eu diagrams in Figs. 23.3 and 23.4. These are the diagrams that appear in an Eu derivation modeling a single-diagram proof of proposition 5, book I of the Elements. The final Eu diagram of the construction sequence (Fig. 23.4) corresponds to the proof's concrete diagram D. The Eu diagrams preceding it correspond to stages in the construction of D. All the Eu diagrams of the demonstration sequence (Fig. 23.3) are subdiagrams of the final diagram of the construction sequence. They correspond to acts of attention whereby certain relationships present in D are verified to hold in general. The sequence thus represents a reasoning process verifying that the position of CB within angle ∠ACE and the position of BC within ∠ABG hold in general.

As it is only the demonstration sequence that is intended to correspond to a reasoning process, it is only the demonstration stage that is governed by the relation of semantic equivalence in Eu. Specifically, the rules that license the addition of a diagram to the demonstration sequence given previous diagrams in the sequence must preserve semantic equivalence.

[Figure: sequence of Eu diagrams, panels e) to g); only the point labels A to G survive extraction]
Now, at the heart of the proof of a statement of the above form within the semantic tableau setting is the proof of a conditional

$$\varphi_1(a_1, \dots, a_n) \rightarrow \varphi_2(a_1, \dots, a_n)$$

in which $a_1, \dots, a_n$ are understood as arbitrary, and the formulas $\varphi_1(a_1, \dots, a_n)$ and $\varphi_2(a_1, \dots, a_n)$ come to be linked via logical operations, axioms and previously proven theorems. With respect to proposition 5, the $a_1, \dots, a_n$ represent an instantiation of the theorem. Thus, according to the framework of semantic tableau, at the heart of the reasoning establishing proposition 5 is the consideration of representations understood to instantiate the theorem. The $a_1, \dots, a_n$ serve, in other words, to model the type of configuration the proposition concerns, and it is by interacting with this modeling that the proposition is established.

Hintikka's work leads us thus to the following abstract characterization of deduction from a model-based reasoning perspective. Deductive inference concerns a complex of interrelated objects. The complex is assumed to satisfy certain conditions $\varphi_1$, and is inferred to satisfy further conditions $\varphi_2$. The inference proceeds via consideration of a representation modeling the complex. An important aspect of deduction understood in this way, emphasized by both Hintikka and Magnani but passed over in the above discussion of proposition 5, is that the representation modeling the complex can be enriched. One need not restrict oneself to the objects the deduction explicitly concerns in constructing a model for it. One may add to the representation additional objects to facilitate the reasoning. With respect to a proof of elementary geometry, this simply amounts to performing a construction on the initial configuration of the proof.

Call this the model-based reasoning, or MBR, conception of deduction. The conception differs from the standardly accepted one in that what is front and center are representations of objects and their relations, rather than sentences asserting relations between objects. It is such representations that drive the deductive inference that any collection of objects satisfying conditions $\varphi_1$ also satisfy the conditions $\varphi_2$. To perform such an inference, one must recognize that the conditions $\varphi_1$ impose constraints upon a collection of objects with respect to the relations in $\varphi_2$. This act is accomplished by representing an instantiation of $\varphi_1$ and $\varphi_2$, augmented perhaps with additional objects. The representation serves to reveal the constraints the conditions $\varphi_1$ impose upon objects with respect to the relations in $\varphi_2$ directly.

How do representations of instantiations do this? It is not immediately clear from the general logical perspective Hintikka assumes. From this perspective, the only way to represent an instantiation is sententially – i.e., via predicates and singular terms. If the singular terms are understood simply to denote objects in the broadest logical sense, a listing of predicates that the singular terms satisfy reveals on its own only trivial constraints. Suppose we have a three-place predicate $B$ and singular terms $a_1, a_2, a_3$ and $a_4$. Then

$$B(a_1 a_2 a_3) \qquad B(a_2 a_3 a_4)$$

qualifies as a sentential representation of an instantiation. But the only constraint on $a_1, a_2, a_3$ and $a_4$ that the representation reveals, if our conception of the objects is the broadly logical one, is that the triples $\langle a_1, a_2, a_3 \rangle$ and $\langle a_2, a_3, a_4 \rangle$ must satisfy $B$. Aside from the two sentential expressions that negate $B(a_1 a_2 a_3)$ and $B(a_2 a_3 a_4)$ – i.e., $\neg B(a_1 a_2 a_3)$ and $\neg B(a_2 a_3 a_4)$ – we are free to add to the representation any sentential expression with the singular terms $a_1, a_2, a_3$ and $a_4$.

This observation shows, at the very least, that if the MBR conception of deduction is to be of any interest, the operative conception of object in a deductive inference has to be richer than the austere one furnished by logic. There has to be, in other words, background knowledge with respect to the objects and their combination in complexes – e.g., what relations can and cannot obtain among the objects of a complex, what additions can be made to a given complex, and so on. If this is accepted, the question then becomes: how does this background knowledge exert itself when a representation of an instantiation is considered in the course of a deduction?

Here is where proven theorems and/or axioms come into play in a semantic tableau formalization. At the initial stage, before any theorems are proven, all background knowledge about the objects under consideration is encoded in unproven axioms. We look to these for what can and cannot be done with representations of instantiations. These then allow us to use such representations to deduce nontrivial theorems via semantic tableau, which then can be used in future deductions. For an example of how this works, consider the three-place predicate $B$ again, and suppose that it denotes the relation of betweenness for points on a geometric line. Suppose further we are at a point where the basic fact about betweenness given by

$$\forall x, y, z, w \, [(B(xyz) \wedge B(yzw)) \rightarrow B(xyw)]$$

[…]

$$B(a_1 a_2 a_3) \qquad B(a_2 a_3 a_4)$$
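The role that the betweenness fact plays as background knowledge can be illustrated with a small sketch. It assumes, purely for illustration, that points on a geometric line are modeled as numbers and that $B(xyz)$ means that $y$ lies strictly between $x$ and $z$; the function names are hypothetical and not part of the chapter. Under that reading, the representation $B(a_1 a_2 a_3)$, $B(a_2 a_3 a_4)$ is no longer only trivially constrained: it forces $B(a_1 a_2 a_4)$.

```python
from itertools import product


def between(x, y, z):
    """B(xyz): y lies strictly between x and z (points modeled as numbers)."""
    return x < y < z or z < y < x


def axiom_holds(points):
    """Check B(xyz) and B(yzw) -> B(xyw) for every 4-tuple drawn from points."""
    return all(
        between(x, y, w)
        for x, y, z, w in product(points, repeat=4)
        if between(x, y, z) and between(y, z, w)
    )


def forced_relation(a1, a2, a3, a4):
    """Return (premises hold, B(a1 a2 a4) holds) for one instantiation."""
    premises = between(a1, a2, a3) and between(a2, a3, a4)
    return premises, between(a1, a2, a4)


if __name__ == "__main__":
    # Exhaustive check on a small sample of points; an illustration, not a proof.
    print(axiom_holds(range(6)))
    print(forced_relation(0, 1, 2, 3))
```

The exhaustive check covers only a finite sample, so it illustrates the constraint rather than establishing it in general.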
24. Model-Based Reasoning in Mathematical Practice

… accounts of scientific modeling, that these are also suitable for analyzing mathematical reasoning. In order to defend such a claim, we take a closer look at three specific cases from diverse mathematical subdisciplines, namely Euclidean geometry, approximation theory, and category theory. These examples also display various levels of abstraction, which makes it possible to show that the use of models occurs at different points in mathematical reasoning. Next, we reflect on how certain steps in our model-based approach could be achieved, connecting it with other philosophical reflections on the nature of mathematical reasoning. In the final part, we discuss a number of specific purposes for which mathematical models can be used in this context. The goal of this chapter is, accordingly, to show that embracing modeling processes as an important part of mathematical practice enables us to gain new insights in the nature of mathematical reasoning.

24.2.2 Second Example: From Approximation Theory
24.2.3 Third Example: From Category Theory
24.3 The Power of Heuristics and Plausible Reasoning
24.4 Mathematical Fruits of Model-Based Reasoning
24.5 Conclusion
24.A Appendix
References
In this chapter, we explore the significance of model-based reasoning for mathematical research. In Sect. 24.1, we start by outlining an account of the nature of scientific modeling, and how it could be applied to mathematics. This becomes clearer in Sect. 24.2, where this account will be briefly applied to three specific examples coming from different mathematical subdisciplines, and also exhibiting a different level of abstraction, namely Euclidean geometry, approximation theory, and category theory respectively. Section 24.3 reflects on how specific transitional steps in the model-based argument schemes presented are to be achieved, and more particularly on what are commonly called types of plausible reasoning that thus arguably play an important role in mathematical discovery. In Sect. 24.4, some of the alleged epistemic merits or purposes of model-based reasoning as presented in the context of mathematical practice are considered. Section 24.5 concludes the chapter.
24.1 Preliminaries
Aris [24.1] has proposed the following definition (as quoted in Davis and Hersh [24.2, p. 78]):

“A mathematical model is any complete and consistent set of mathematical equations which are designed to correspond to some other entity, its prototype. The prototype may be a physical, biological, social, psychological or conceptual entity, perhaps even another mathematical model.”

Davis and Hersh [24.2, pp. 78–79] have commented on this:

“One might substitute the word structure for equations [in the above quote], for one does not always work with a numerical model. Some of the purposes for which models are constructed are:

1. To obtain answers about what will happen in the physical world.
2. To influence further experimentation or observation.
3. To foster conceptual progress and understanding.
4. To assist the axiomatization of the physical situation.
5. To foster mathematics and the art of making mathematical models.”

… the Aris quote already are certain mathematical structures.

First of all, let us specify what can be meant by making abstraction from mathematical reality in our present context. We closely follow the treatment of this subject by Davis and Hersh [24.2, pp. 126–36]. The term abstraction is used in different but related senses in mathematics; Davis and Hersh distinguish abstraction as idealization and abstraction as extraction. The idealizations in this context proceed from the world of spatial experience to the mathematical world. Aristotle is referred to in this respect, pointing out that [24.2, p. 127]:

“the mathematician strips away everything that is
… the present discussion how the following observations are (to be) made. We indeed suppose that all steps in the cycle described can actually be carried out at some point. In Sect. 24.3, we shall explore the alleged importance of informal reasoning (including analogies and visualizations) in the process of inductive and deductive mathematical inference (see, however, also the other contributions in the current part of this volume).

Thus, in whichever way, inductive reasoning, by reading and taking in the given information described above, makes one realize that $a$ and $b$, $a'$ and $b'$ are legs of two right-angled triangles. Mathematical inference teaches us that for such triangles, the universal regularity called Pythagoras' theorem holds, which assures us that the square of the length of the hypotenuse equals the sum of the squares of the lengths of the legs. This means that by deduction, plugging in the values of $a$ and $b$ in this result, we find that

$$c^2 = a^2 + b^2 ;$$

hence $c = 5$, as lengths are positive real numbers. Similarly, plugging in $a'$ and $b'$ in Pythagoras' theorem, we find that $c' = \sqrt{13}$ (observing that 3.6 is relatively close to this). Schematically depicted, we arrive at the following more or less commutative diagram (by commutative diagram we mean that the diagram has the property that all directed paths with the same start and endpoint lead to the same result by following the arrows):

    Right-angled triangle(s)  --- Mathematical inference -->  Pythagoras' theorem
              ^                                                      |
          Induction                                              Deduction
              |                                                      v
    a = 4, b = 3; a′ = 3, b′ = 2  ------- Measuring ------->  c = 5; c′ = √13

Some brief remarks with respect to this diagram are appropriate. First of all, we added more or less when talking about the above diagram's being commutative. This is because we indeed are disregarding the issue of actually establishing the link between measuring the hypotenuses on the one hand, and theoretically deducing their real lengths from Pythagoras' theorem on the other. Given the discussion of ever perfectly measuring length, 5 as well as $\sqrt{13}$ in this case, one might indeed wonder if it could at all be feasible to render a down-to-earth diagram such as the one given (involving empirical verification) commutative. We do hold that, for both pedagogical and conceptual reasons, arriving at the measures of $c$ and $c'$ by following the induction–mathematical inference–deduction route of the diagram is far more satisfactory than just measuring their approximate values.

24.2.2 Second Example: From Approximation Theory

Our second case comes from the mathematical field called approximation theory, one of the central goals of which is to “represent an arbitrary function in terms of other functions which are nicer or simpler or both” (Hrushikesh and Devidas [24.4, p. 1]). This area of research is thus mostly concerned with how functions can be better approximated with easier functions, and with how the errors occurring in this process can be characterized. The point is that, in many cases, it is difficult or even impossible to extract exact analytical information from an arbitrary function $f$. In such cases, it is nevertheless useful and therefore important to be able to approximate $f$ with a simpler function. Intuitively speaking, in cases like these, mathematicians sometimes look for a function $g$, such that the relevant calculation can be performed on the function $g$ while $g$ is close enough to $f$ in the sense that the outcome of the calculation performed on $g$ gives us meaningful information about $f$.

Let us give a concrete and simple example to clarify what the role of approximation theory can be. The example is inspired by Christensen and Christensen [24.5]. Assume that we want to compute the following integral

$$\int_0^1 e^{-x^2/2} \, dx .$$

Now, a primitive function of the function

$$f(x) = e^{-x^2/2}$$

cannot be expressed as a combination (sum, composition, multiplication, quotient) of elementary functions (polynomials, trigonometric, logarithmic functions, and their inverses). So in order to obtain numerical values of the above integral, other means are called for. This is where approximation theory enters the picture. One of the goals is to search for a function $g$ for which (i) $\int_0^1 g(x)\,dx$ can be calculated, and (ii) $g(x)$ is close to $e^{-x^2/2}$ for $x \in [0, 1]$, in the sense that we can keep under control how much $\int_0^1 g(x)\,dx$ deviates from

$$\int_0^1 e^{-x^2/2} \, dx .$$
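Requirements (i) and (ii) can be made concrete with a minimal sketch. It assumes, as one possible choice not taken from the chapter, that $g$ is the Taylor polynomial of $e^{-x^2/2}$ of degree $2n$ around 0; the function names are hypothetical. Such a polynomial has an integral over $[0, 1]$ that can be computed exactly, and because the Taylor series is alternating with decreasing terms on $[0, 1]$, the first omitted term bounds the error uniformly by $1 / (2^{n+1} (n+1)!)$.

```python
import math
from fractions import Fraction


def taylor_integral(n):
    """Exact integral over [0, 1] of g_n(x) = sum_{k<=n} (-1)^k x^(2k) / (2^k k!)."""
    return sum(
        Fraction((-1) ** k, 2 ** k * math.factorial(k) * (2 * k + 1))
        for k in range(n + 1)
    )


def uniform_error_bound(n):
    """Bound on |exp(-x^2/2) - g_n(x)| for x in [0, 1] (first omitted term at x = 1)."""
    return Fraction(1, 2 ** (n + 1) * math.factorial(n + 1))


def reference_value():
    """Integral of exp(-x^2/2) over [0, 1], expressed through the error function."""
    return math.sqrt(math.pi / 2) * math.erf(1 / math.sqrt(2))


if __name__ == "__main__":
    for n in range(6):
        approx = float(taylor_integral(n))
        bound = float(uniform_error_bound(n))
        print(n, approx, bound, abs(approx - reference_value()) <= bound)
```

The reference value relies on the library's error function and is used here only to check that the deviation of the integral stays within the stated bound.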
A possible way of doing so is to find a positive integrable function $g$ for which, for some $\varepsilon > 0$,

$$\left| e^{-x^2/2} - g(x) \right| \le \varepsilon , \quad \forall x \in [0, 1] ;$$

i.e.,

$$-\varepsilon + g(x) \le e^{-x^2/2} \le \varepsilon + g(x) , \quad \forall x \in [0, 1] .$$

Consequently, one can obtain that

$$-\varepsilon + \int_0^1 g(x)\,dx \le \int_0^1 e^{-x^2/2}\,dx \le \varepsilon + \int_0^1 g(x)\,dx .$$

[…]

    Function g  ------------- inference ------------->  Value of g
        ^                                                   |
    Induction (transformation)                          Deduction
        |                                                   v
    Function f  -------- Impossible inference? ------->  Approximated value of f

24.2.3 Third Example: From Category Theory

Finally, we would also like to provide an illustration from higher level mathematical practice, sketching some of the features of research in category theory. In order to appreciate the results and techniques from this field, it might be helpful to recall some of the mathematical notions involved. Therefore, without going into full detail, we briefly develop some of those ideas, or at least an interpretation of them which is relevant to our setting, in the appendix.

Groups are algebraic objects intimately related to the notion of symmetry. Hopf algebras (over a field $k$) are – slightly more complicated – algebraic objects, group algebras being an important class of examples of such structure. As explained in the appendix, groups can be seen as Hopf monoids (HM) in the braided monoidal category of sets, denoted Sets. Similarly, Hopf algebras are simply Hopf monoids in Vect$_k$. A […] case you choose to plug in Vect$_k$ (actually, both Sets and Vect$_k$ satisfy the necessary conditions for FT to hold). It should be noted that in many cases, the categorical proof is inspired by a classical (often linear algebraic) proof of the statement for some particular algebraic object. Again, this whole practice can schematically be recapitulated in the following way:

    HM in BMN  ---------- Categorical inference --------->  FT in BMN
        ^                                                       |
    Induction                                               Deduction
        |                                                       v
    Groups/Hopf algebras  --- Algebraic inference --->  FT for groups/Hopf algebras
… substantiate why this is in fact possible, we nevertheless want to briefly explore this context of discovery-dimension here in somewhat more general terms.

An interesting starting point can be found in the work of Pólya. In Induction and analogy in mathematics, he states that all our knowledge outside mathematics and demonstrative logic consists of conjectures. It is certainly the case that some of these conjectures, such as those expressed in general laws of physical science, are highly reliable and commonly accepted. Other conjectures are neither reliable nor respectable.

The support for conjectures is obtained by plausible reasoning, while our mathematical knowledge is secured by demonstrative reasoning. Mathematical proofs are part of demonstrative reasoning, while the inductive, circumstantial, or statistical evidence for a scientist belongs to plausible reasoning. One of the main differences between these two kinds of reasoning is that demonstrative reasoning leads to safe and final knowledge that is beyond revision. Plausible reasoning, on the other hand, leads to provisional and controversial knowledge.

What Pólya argues for is that, while mathematics is regarded as a demonstrative science, plausible reasoning also plays an important role in mathematics. He clarifies this by referring to finished mathematics and mathematics in the making (by way of a very nice metaphor, Reuben Hersh later called this the distinction between the front and the back in mathematics, Hersh [24.8]). Finished mathematics appears to be purely demonstrative, consisting of only proofs. Yet mathematics in the making is similar to other human knowledge in the making. The following passage clarifies this (Pólya [24.9, p. 100]):

“You have to guess a mathematical theorem before you prove it, you have to guess the idea of the proof before you carry through the details. You have to combine observations and follow analogies, you have to try and try again. The result of the mathematician's creative work is demonstrative reasoning, a proof; but the proof is discovered by plausible reasoning, by guessing.”

From this observation, Pólya concludes that students of mathematics should learn both kinds of reasoning.

A very similar story is told by Lakatos in his landmark study Proofs and refutations (Lakatos [24.10]), where he identifies three rough stages in mathematical reasoning. First, mathematicians use induction (in the sense of generalization on the basis of particular instances) to discover conjectures worth trying to prove. Then they develop and criticize highly informal proofs of these conjectures. Only in a last phase of mathematical labor do they formalize these informal theories, establishing the deduction of the (by then) theorems by means of formal transformations on an axiomatic basis. Lakatos famously illustrates this practice with a (rationally) reconstructed history of how the proof of Euler's polyhedron formula $V - E + F = 2$ came about, $V$ being the number of vertices, $E$ the number of edges, and $F$ the number of faces of any given polyhedron.

Let us take a somewhat closer look at some of Lakatos' terminology, which he used to outline various methods by which mathematical discovery (and subsequent justification) can occur. These methods describe ways in which mathematical concepts, conjectures, and proofs gradually evolve through interaction between mathematicians. Central to these practices, so Lakatos claims, are counterexamples, and he discusses several ways in which mathematicians or students can react to these: by surrender, monster-barring, exception-barring, monster-adjusting, or lemma-incorporation. In what follows, we briefly sketch the essence.

First of all, surrender amounts to abandoning a conjecture in the light of a counterexample. This is however not done lightly, so other reactions are more frequent. Monster-barring, for instance, consists in ignoring or excluding an alleged counterexample. This implies that one has to show why it is not within the relevant concept definition. One can claim, for example, that a hollow cube, that is a cube with a cube-shaped hole in it, is not a counterexample to Euler's conjecture, by arguing that the hollow cube is not in fact a polyhedron, and thus cannot threaten the conjecture. This means that the concept polyhedron is under discussion, soliciting a further explication of its definition.

As for exception-barring, Lakatos argues that exceptions, rather than simply being problematic cases and thus dismissed as monsters, can lead to new knowledge. Two ways to deal with exceptions are discussed. One is piecemeal exclusion, for example, by excluding one type of polyhedron from the conjecture in order to set aside a whole class of counterexamples. The other is strategic withdrawal, which does not directly rely on counterexamples. Instead, positive examples of a conjecture are used in order to generalize to a class of objects, and consequently limit the domain of the conjecture to this class.

Yet another way of responding to counterexamples is termed monster-adjusting. It is intended to meet the possible criticism that both monster-barring and exception-barring do not take counterexamples seriously enough. Here, the mathematicians reinterpret the counterexamples so that they indeed fall within the scope of the original formulation of the conjecture, and thus show how the anomalies are in fact unproblematic.
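The hollow cube mentioned under monster-barring can be checked by direct counting. The sketch below is only an illustration: it assumes that the hollow cube is counted as an outer cube together with the surface of its cube-shaped cavity, so that vertices, edges, and faces simply double; the names `SOLIDS` and `euler_characteristic` are hypothetical.

```python
def euler_characteristic(vertices, edges, faces):
    """Return V - E + F for the given counts."""
    return vertices - edges + faces


# (V, E, F) for a few solids. The hollow cube entry assumes that the outer
# cube and the inner cavity each contribute 8 vertices, 12 edges, 6 faces.
SOLIDS = {
    "tetrahedron": (4, 6, 4),
    "cube": (8, 12, 6),
    "octahedron": (6, 12, 8),
    "hollow cube": (16, 24, 12),
}


def satisfies_euler(name):
    """True when the named solid satisfies V - E + F = 2."""
    return euler_characteristic(*SOLIDS[name]) == 2


if __name__ == "__main__":
    for name, counts in SOLIDS.items():
        print(name, euler_characteristic(*counts), satisfies_euler(name))
```

Under this counting the hollow cube yields 4 rather than 2, which is why one must either bar it as a monster or revise the conjecture.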
The method of lemma-incorporation differs from all the above methods in that it uses properties of the proof itself. The idea is to examine the proof in order to determine exactly which lemma has been refuted by the counterexample. The guilty lemma is then added as a condition to the conjecture, so that the conjecture is no longer refuted by the counterexample.

The most important or general method for Lakatos, as the title of his work suggests, is that of proofs and refutations, which in a certain sense amounts to a dialectic form of the method of lemma-incorporation. Lemma-incorporation enables one to make a distinction between global and local counterexamples. Global counterexamples refute the main conjecture, while local counterexamples are counterexamples to specific proof steps only. If a counterexample is both global and local, and thus constitutes a problem for both argument and conclusion, one should modify the conjecture by incorporating the problematic step as a condition. If the counterexample is not global but just local, which means the conclusion can still be correct while one of the reasons for believing it is flawed, one should leave the conjecture unchanged and modify the proof. Finally, if the counterexample is global but not local, which means that there is a problem with the conclusion without obvious problems for any of the reasoning steps, then one should look for a possible hidden assumption in one of the proof steps and modify the proof by making this assumption explicit.

In summary, Lakatos' proposed method consists in exploiting proof steps to suggest counterexamples. By looking for objects violating an argumentative step, one can identify possible candidates. Whenever a counterexample is actually found, one needs to determine its kind and modify the proof or conjecture accordingly. Note that other modes of model-based reasoning, such as several touched upon elsewhere in this collection (metaphorical, analogical, and/or visual reasoning), can very much be at play in these particular stages of mathematical inquiry. Indeed, are these not exemplary instances of heuristic or plausible reasoning in the context of discovery? We shall not further explore this issue ourselves, and after this interlude return to our central topic, in order to consider how and why the reasoning model introduced in Sect. 24.2 should work, that is, what its original (if perhaps not essential) contributions to mathematical research might be.
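The case distinction on global and local counterexamples described above amounts to a small decision table. The following sketch merely restates it; the function name and the wording of the returned strings are illustrative assumptions, not Lakatos' own formulation.

```python
def lakatos_response(is_global, is_local):
    """Map the kind of counterexample to the response sketched in the text.

    is_global: the counterexample refutes the main conjecture.
    is_local:  the counterexample refutes a specific proof step.
    """
    if is_global and is_local:
        return "modify the conjecture: incorporate the refuted step as a condition"
    if is_local:
        return "keep the conjecture: modify the proof"
    if is_global:
        return "modify the proof: make the hidden assumption explicit"
    return "no revision: neither conjecture nor proof step is refuted"


if __name__ == "__main__":
    for is_global in (True, False):
        for is_local in (True, False):
            print(is_global, is_local, lakatos_response(is_global, is_local))
```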
… developed above on the basis of the diagram we had first introduced to visualize the process of model-based reasoning in mathematical practice. Mathematicians do not simply use abstraction in order to find a result at that particular level. For after obtaining results there, they often translate them back to the original target system, and consequently, via an abstraction detour, obtain their answers about this more basic mathematical domain.

Indeed, in our example of Euclidean geometry, one theoretically predicts the (approximated) length of the triangles in the real world or one verifies the theoretical result in the real world. Category theory tells us a similar story. Suppose one has some algebraic structure $A$ of which one can prove that it is a (Hopf) monoid in a certain (braided) monoidal category $\widetilde{\mathcal{C}}$. At the level of monoidal category theory, some general theorem $T(-)$ that holds for all (Hopf) monoids in any (braided) monoidal category exists (sometimes assuming some small extra conditions on $\widetilde{\mathcal{C}}$), such as the Fundamental theorem for instance. If you then plug in the category $\widetilde{\mathcal{C}}$ you get a version of the theorem $T(A)$ for the algebraic structure $A$ you started from. Sometimes this theorem $T(A)$ is known and has been shown to be true before. Then we can speak of algebraic world verification in some sense. This verification is not always a priori possible in every field of mathematical research. For instance, in the example of approximation theory, we are given answers to questions that would be unsolvable in certain cases. However, it is always the case that the information gained within the model can be and often is translated back to the mathematical structure it represents.

Does this also imply (purpose 2) that modeling allows for further experimentation or observation in mathematics? To answer this, one may first have to wonder what a mathematical experiment is or can be. First of all, the experimental mood of the mathematician might be referred to, as a way of personal exploration in the mathematical field. However, this will not do here. A genuine experiment should at least have an element of systematic data-generating or testing. Notice that a field called experimental mathematics does in fact exist, and also has its own journal of that name. Let us quote from its editorial policy statement [24.12]:

“While we value the theorem-proof method of exposition, and do not depart from the established view that a result can only become part of mathematical knowledge once it is supported by a logical proof, we consider it anomalous that an important component of the process of mathematical creation is hidden from public discussion. It is to our loss that most of us in the mathematical community are almost always unaware of how new results have been discovered. [...]

Experimental Mathematics was founded in the belief that theory and experiment feed on each other, and that the mathematical community stands to benefit from a more complete exposure to the experimental process. The early sharing of insights increases the possibility that they will lead to theorems [...] Even when the person who had the initial insight goes on to find a proof, a discussion of the heuristic process can be of help, or at least of interest, to other researchers. There is value not only in the discovery itself, but also in the road that leads to it. [...]

The word experimental is conceived broadly: Many mathematical experiments these days are carried out on computers, but others are still the result of pencil-and-paper work, and there are other experimental techniques, like building physical models.”

Obviously, we particularly have to pick up this last sentence here. Next to number crunching (checking as many cases as possible) or probabilistic reasoning techniques, which – whether or not aided by computers – have a distinct inductive and thus experimental ring to them, model building clearly also enters the picture.

This issue has been touched upon by Van Bendegem [24.13], characterizing an experiment as involving certain actions such as the manipulation of objects, setting up processes in the real world and observing possible outcomes of these processes. An example from mathematics that he discusses is the work of nineteenth-century Belgian physicist Plateau on minimal surface area problems. Plateau built several geometrical shapes of wire, and by dipping these into a soap solution he was able to investigate specific aspects of the minimum surface bounding various particular shapes. Here, we see how a physical experiment leads to relevant information about a mathematical problem. In such cases, we see how both the model and the physical prototype can influence further experiments and observations. On the one hand, the physical experiments help one to formulate some general principles about a connected mathematical domain. On the other hand, the mathematician will set up his experiment in such a way as to answer specific mathematical questions. However, such experiments are extremely rare in mathematical practice.

Another starting point can be the notion of a mathematical thought experiment, which Van Bendegem [24.14, pp. 9–10] characterizes as follows:

“If it is so that what mathematicians are searching for are proofs within the framework of a mathematical theory, then any consideration that (a) in the case where the proof is not yet available, can lead to an insight to what the proof could possibly look like, and, (b) in the case where the proof is available, can lead to a better understanding of that proof, can be considered to be a mathematical thought experiment.”

A specific example is a description of the octonions by means of (monoidal) category theory (rudimentary background information for this case is presented in the appendix). It was shown in Bulacu [24.15] that octonions are in fact a weak Hopf algebra in the (braided) monoidal category constructed in Albuquerque and Ma-

… two best-known and the most discussed approaches to intra-mathematical explanation.

Steiner [24.21] uses the concept of characterizing property to draw a distinction between explanatory and nonexplanatory proofs. A characterizing property is a property unique to a given entity or structure within a family or domain of such entities or structures. The concept of a family is left undefined. According to Steiner, an explanatory proof always makes reference to a characterizing property of an entity or structure mentioned in the theorem. Furthermore, it must be evident that the result depends on the property (if we substitute the entity for another entity in the family which
jid [24.16], revealing thus more details of their alge- does not have the property, the proof fails to go through)
Part E | 24.4
braic structure. So in this sense, the work executed and that by suitably deforming the proof while holding
by Albuquerque and Majid influenced further exper- the proof-idea constant, we can get a proof of a related
imentation/observation, leading eventually to a result theorem. Though many of Steiner’s concepts (family,
which might have not been easily deduced by algebraic deformation, proof-idea) remain vague, he discusses
world verification alone, that is without using the results several examples to clarify his account. He presents, for
from [24.16]. The similarity with approximation theory, example, a proof of the irrationality of the square root of
where the model provides information that remains hid- 2 as an explanatory proof since it depends on the unique
den in the target system, should be clear. Since models prime factorization of 2 and since similar proofs for the
can give us new information, this can lead to further ex- irrationality of the square roots of other numbers can be
perimenation or observation. given. Following this approach to explanations, models
The notion of understanding (purpose 3), in its can foster understanding if the model produces proofs
turn, is closely linked to that of explanation. Indeed, that depend on characterizing properties, or where it is
most of the traditional accounts of explanation state easier for the mathematicians to identify these charac-
that understanding is centrally involved in it. Achin- terizing properties.
stein [24.17, p. 16] writes that there is a “fundamen- Kitcher [24.22, p. 437] also argues that his account
tal relation between explanation and understanding”. covers mathematical explanations as well:
Kitcher [24.18] argues that a “a theory of explana-
tion shows us how scientific explanation advances our “The fact that the unification approach provides an
understanding” (p. 330). Woodward [24.19, p. 249] account of explanation, and explanatory asymme-
similarly says that any theory of explanation should tries, in mathematics stands to its credit.”
“identify the structural features of such explanation
which function so as to produce understanding in the Let us briefly go over the model of unification that
ordinary user”. Kitcher proposes. Take a consistent and deductively
Recently, we have seen an increasing interest in closed set K of beliefs. A systematization of K is any
the topic of mathematical explanation as well (See set of arguments that derive some sentences of K from
Mancosu [24.20] for a useful overview of the lit- other sentences of K. The explanatory story, called
erature.). Philosophical work on it can be divided E.K/, corresponds to the systematization with the high-
into two main strands, namely focussing on extra- est degree of unification. The degree of unification is
mathematical and intra-mathematical explanation, re- determined by the number of argument patterns, the
spectively. Extra-mathematical explanation is essen- stringency of patters and the set of consequences deriv-
tially about the role mathematics plays in the natural able. Finally, an argument pattern is an argument that
or social sciences, more precisely whether mathematics consists of schematic sentences, filling in instructions
is or can provide explanations for physical phenomena. and classification of the sentences. Following Kitcher,
When considering intra-mathematical explanation, on and contrary to Steiner, there are no criteria that help
the other hand, one looks into the role of explanation us to analyze the explanatory power of a singular proof.
within mathematics itself, for example by distinguish- Rather, explanation is presented as a value of a unified
ing between explanatory and nonexplanatory proofs. theory or systematization. Within this view, models can
The underlying idea is that all proofs tell us that a theo- foster understanding if the model shows how mathemat-
rem is true, but only some proofs go further and tell us ical results that were considered unrelated are in fact
why a theorem is true. Steiner and Kitcher provided the related. We can see, for example, how category the-
Model-Based Reasoning in Mathematical Practice 24.4 Mathematical Fruits of Model-Based Reasoning 545
ory can advance such understanding. Category theory 4. Proofs and refutations, or the combination of the
allows us to see the universal components of a family previous origins through the “various applications
of structures of a given kind, and show how structures of initial conjectures, deductive arguments, seman-
of different kinds are interrelated. Mathematical models tic considerations, and different kinds of refine-
can thus advance our understanding of mathematical re- ments” [24.25, p. 62].
sults. But there are often different models that address
the same mathematical result. Furthermore, the back- It should be rather easily appreciated that model-
ground knowledge and skills of a certain mathematician based reasoning as it has been proposed by us here has
will play a role in determining whether a model grants an obvious role to play in processes like these.
understanding for this mathematician. The explanatory The following example about structures called Hopf
value of a mathematical model is, in this sense, a con- algebroids may be a good illustration [24.26]:
textual notion.
Axiomatization (purpose 4) is undoubtedly another “A Hopf algebroid is a (possibly noncommuta-
important aspect of mathematical practice. How can tive) generalization of a structure which is dual to
Part E | 24.4
models assist it? A first observation is that axioma- a groupoid (equipped with atlas) in the sense of
tization can appear quite arbitrary. Although axioms space-algebra duality. This is the concept that gen-
were once seen as self-evident truths about the consti- eralizes Hopf algebras with their relation to groups
tution of the physical world, the emphasis nowadays from groups to groupoids.”
mostly seems to be on deducing as much as possible
from a minimum number of axioms, while the exact In Vercruysse [24.6], it is remarked that due to the
nature of these axioms is of secondary importance. Nev- asymetry in this notion, several different notions of
ertheless, several mathematicians have argued against Hopf algebroid were introduced in the literature. Some
this so-called arbitrariness (Weyl [24.23], pp. 523–524), of these were shown to be equivalent, although this was
(Nevanlinna [24.24, p. 457]): far from being trivial. The now seemingly overall ac-
cepted notion of a Hopf algebroid was introduced by
“One very conspicuous aspect of twentieth cen- Böhm and Szlachányi [24.27]. Lu [24.28] introduced
tury mathematics is the enormously increased role a nonsymmetric version over a noncommutative base
which the axiomatic approach plays. Whereas the ring, hereby being able to include quite some examples.
axiomatic method was formerly used merely for the The definition by Schauenburg [24.29] allows one to
purpose of elucidating the foundations on which we recover a version of FT (Sect. 24.2) in this noncom-
build, it has now become a tool for concrete mathe- mutative setting, amongst other things. Only recently,
matical research. [. . . ] [However] without inventing Bruguières et al. [24.30] provided an interpretation of
new constructive processes no mathematician will Schauenburg’s notion by means of so-called Hopf mon-
get very far. It is perhaps proper to say that the ads (Vercruysse [24.6, §5.2.2.1]):
strength of modern mathematics lies in the interac-
tion between axiomatics and construction. “It took quite a long time to establish the correct
The setting up of entirely arbitrary axiom sys- Hopf-algebraic notion over a noncommutative base.
tems as a starting point for logical research has The reasons for the difficulties are quite clear. First
never led to significant results. [. . . ] The aware- of all, if R is a noncommutative ring then the cate-
ness of this truth seems to have been dulled in the gory of right R-modules is no longer monoidal (in
last few decades, particularly among younger math- general). Therefore we have to look instead to the
ematicians.” category of R-bimodules, which is monoidal, but in
general still not braided. So Hopf monoids cannot
Schlimm [24.25] (Sect. 3) has identified four nonar- be computed inside this category. However, we can
bitrary sources of (new or adapted) axioms from within compute Hopf monads on this category (: : :) His-
mathematical practice: torically, Hopf algebroids were constructed first in
a more direct way, and the interpretation via Hopf
1. Reasoning from accepted theorems, that is back- monads is only very recent.”
ward so to say, by wondering what axioms would
be in need in order to substantiate current theories The latter approach shows that for certain applica-
2. Manipulation of existing axioms, as a way of tions, it is preferable to use Schauenburg’s definition
(game-like) exploration as being conceptually the most interesting one. The
3. Conceptual analysis of a mathematical domain, price to pay, however, is that it cannot include exam-
such as e.g., number or set theory; and ples that are included in the slightly weaker notions
of Böhm and Szlachányi [24.27] and Lu [24.28], the definition from the latter source in its turn not being adapted to prove categorically flavored theorems (such as FT). As it seems, all depends on what flavor one prefers.

Finally, that model-based reasoning should be able to foster mathematics and the art of making mathematical models (purpose 5), is of course self-evident in the context of mathematical inquiry itself, at least given all that has been elaborated above. The essence of mathematics resides in inventing methods, tools, strategies, and concepts for solving problems. That is the very answer to the question of why mathematicians prove theorems. From this view, Rav [24.31]

1. To remedy perceived gaps or deficiencies in earlier arguments
2. To employ reasoning that is simpler, or more perspicuous, than earlier proofs
3. To demonstrate the power of different methodologies
4. To provide a rational reconstruction (or justification) of historical practices
5. To extend a result, or to generalize it to other contexts
6. To discover a new route
7. Concerns for methodological purity
8. Role analogous to the role of confirmation in the natural sciences.
24.5 Conclusion
As announced at the outset of this chapter, we have been focusing here on the philosophical significance of one specific aspect of mathematical practice, namely model-based reasoning as a general methodological framework. By presenting three cases, taken from different mathematical subdisciplines and with varying levels of abstraction, we showed how mathematicians engage in model-based reasoning. We are well aware that the general account of such reasoning remains silent on how mathematicians go from one level to another level. Philosophers such as Pólya and Lakatos discuss the richness of different intellectual heuristics that mathematicians use, and further research into these processes is certainly welcome. Nevertheless, the discussion of the mathematical fruits of model-based reasoning should convince the reader of the significance of the general framework of model-based reasoning, as it shows us how mathematical modeling is linked with several specific purposes of mathematical practice such as experimentation, understanding, or axiomatization. Hence, future reflections on the specifications of model-based reasoning in mathematics can provide crucial insights into several interesting questions about mathematical practice.
24.A Appendix
In this appendix, we briefly recall some notions from (monoidal) category theory. Classical references for category theoretical notions and constructions are Borceux [24.33] and Mac Lane [24.34]. We start with some basic notions from set theory. Let us consider two nonempty sets A and B and a set-theoretical map (or function) f between them. This situation can be depicted as follows:

$$ A \xrightarrow{\;f\;} B $$

that is, this process f can be visualized by an arrow; the only requirement being that one must be able to tell for every element of the departure set A where it is going to. Now, let A, B, C be three sets and consider two
$$ A \xrightarrow{\;1_A\;} A $$

This function $1_A$ is called the identity function on A. It has the property that for any function $f\colon A \to B$, the following holds: $f \circ 1_A = 1_B \circ f = f$.

Let us now consider a more general scenario, not necessarily set-theoretic. Let A and B be objects:

$$ A \qquad B $$

and replace the set-theoretical notion of map by just an arrow between these objects:

$$ A \xrightarrow{\;f\;} B $$

We can now give an idea of the notion of category: Roughly speaking, a category $\mathcal{C}$ consists of objects and arrows (between objects) such that there is a composition $\circ$ for the arrows and an identity arrow $1_A$ for any object A of $\mathcal{C}$. These ingredients have to satisfy some conditions that mimic the associative behavior of composition of functions between sets and the above-mentioned property of the identity function on any set. In this sense, following Awodey [24.35], category theory might be called abstract function theory.

We give some basic examples of categories:

$\mathcal{C} = \mathsf{Sets}$
– objects: sets
– arrows: functions between sets

$\mathcal{C} = \mathsf{Vect}_k$ (k being a field)
– objects: k-vector spaces
– arrows: k-linear maps.

Now we would like to illustrate the adjective monoidal in the term monoidal category. Therefore, let us introduce monoids.

A monoid $(M, \cdot, 1_M)$ consists of a set M, a function $\cdot\colon M \times M \to M$ and an element $1_M \in M$ such that, for all $a, b, c \in M$:

$$ (a \cdot b) \cdot c = a \cdot (b \cdot c), \qquad 1_M \cdot a = a = a \cdot 1_M . $$

monoidal category is precisely the categorification of the definition of monoid (here categorification aims at the name for the process as it was coined by Crane and Yetter in [24.36]).

Here are some examples of monoidal categorical structures:

$(\mathcal{C}, \otimes, I) = (\mathsf{Sets}, \times, \{\ast\})$, where:
– $\times$ is the Cartesian product of sets.
– $\{\ast\}$ is any singleton.
We will briefly denote this monoidal category by $\mathsf{Sets}$.

$(\mathcal{C}, \otimes, I) = (\mathsf{Vect}_k, \otimes_k, k)$, where:
– $\otimes_k$ is the tensor product over k
– k is any field.
This example will be briefly denoted by $\mathsf{Vect}_k$.

Now that we have a vague idea of what it means to be a monoidal category, we wish to glance at certain objects in such categories in order to illustrate an example. More precisely, we start by considering monoids (sometimes called algebras in the literature) in a monoidal category $\mathcal{C}$. The idea is that these objects mimic the behavior of classical monoids (i.e., sets with an associative, unital binary operation), the language of monoidal categories offering a natural setting to do so; this is an instance of the so-called microcosm principle of Baez and Dolan [24.37], affirming that “certain algebraic structures can be defined in any category equipped with a categorified version of the same structure.”

A monoid in $\mathcal{C}$ is a triple $A = (A, m, \eta)$, where $A \in \mathcal{C}$ and $m\colon A \otimes A \to A$ and $\eta\colon I \to A$ are arrows in $\mathcal{C}$ (such that two diagrams – respectively mimicking the associativity and unitality condition – commute; we refer the reader to [24.6, Sect. 5.3.1] for instance).

Many algebraic structures can be seen as monoids in an appropriate monoidal category; we present some examples here, for details and more examples we refer the reader again to [24.6, Sect. 5.3.1] e.g.:
– Taking $\mathcal{C}$ to be the monoidal category $\mathsf{Sets}$, one can easily verify that, taking a monoid in $\mathcal{C}$, one recovers exactly the definition of a classical monoid, as one expects.
– Similarly, a monoid in $\mathsf{Vect}_k$ gives precisely the classical notion of (an associative, unital) k-algebra.
– A more surprising example is given by the octonions. The octonions $\mathbb{O}$ are a normed division algebra over the real numbers. There are only four such algebras, the other three being the real numbers, the complex numbers, and the quaternions. Although not as well known as the quaternions or the complex numbers, the octonions are related to a number of exceptional structures in mathematics, among them the exceptional Lie groups. For more details, we refer to the excellent paper by Baez on this subject [24.38]. One of the properties of the octonions is that they are nonassociative (that is, considered as monoid in $\mathsf{Vect}_k$). They can be seen, however, as an (associative) monoid in the monoidal category constructed by Albuquerque and Majid in [24.16].

In case a monoidal category $\mathcal{C}$ exhibits moreover a braided structure (whatever this means), we denote $\mathcal{C}$ equipped with this braided structure as $\widetilde{\mathcal{C}}$. In this case, one can not only consider monoids in $\widetilde{\mathcal{C}}$, one can impose more structure on the definition of monoid, obtaining such notions as Hopf monoid in $\widetilde{\mathcal{C}}$. To be a bit more precise, a Hopf monoid in $\widetilde{\mathcal{C}}$ is a bimonoid (which is a monoid also having a so-called comonoid structure, both structures being compatible), having an antipode. The reader is referred to [24.6, Sect. 5.3.2] for more details.

The categories $\mathsf{Sets}$ and $\mathsf{Vect}_k$ can be given a braided structure, which we denote by $\widetilde{\mathsf{Sets}}$ and $\widetilde{\mathsf{Vect}_k}$ respectively, such that – without going into the details – the notions of group and Hopf algebra can be recovered as being Hopf monoids in $\widetilde{\mathsf{Sets}}$ and $\widetilde{\mathsf{Vect}_k}$, respectively.

Acknowledgments. The authors are mentioned in alphabetical order. The first author is a doctoral research assistant of the Fund for Scientific Research – Flanders. The second author would like to thank Joost Vercruysse for fruitful discussion. The third author is indebted to research project SRP22 of Vrije Universiteit Brussel.
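As a small computational companion to the appendix, the remark that the octonions are nonassociative in the ordinary sense can be checked mechanically. The sketch below is only an illustration under stated assumptions: it builds the complex numbers, quaternions, and octonions by the standard Cayley–Dickson doubling with the product convention $(a,b)(c,d) = (ac - \bar{d}b,\ da + b\bar{c})$, which is not taken from this chapter, and it encodes elements as nested pairs of integers. All function names are hypothetical, and the sketch does not model the monoidal category of Albuquerque and Majid [24.16].

```python
from itertools import product

# Assumption: elements are nested pairs. A real number is an int, and each
# doubling step (reals -> complex -> quaternions -> octonions) wraps two
# elements of the previous algebra in a tuple.

def neg(x):
    return (neg(x[0]), neg(x[1])) if isinstance(x, tuple) else -x

def add(x, y):
    return (add(x[0], y[0]), add(x[1], y[1])) if isinstance(x, tuple) else x + y

def conj(x):
    # Conjugation of a pair: (a, b)* = (a*, -b); reals are self-conjugate.
    return (conj(x[0]), neg(x[1])) if isinstance(x, tuple) else x

def mul(x, y):
    # Cayley-Dickson product (assumed convention):
    # (a, b)(c, d) = (ac - d* b, da + b c*).
    if not isinstance(x, tuple):
        return x * y
    (a, b), (c, d) = x, y
    return (add(mul(a, c), neg(mul(conj(d), b))),
            add(mul(d, a), mul(b, conj(c))))

def nest(coeffs):
    # Turn a flat coefficient list of length 2**k into nested pairs.
    if len(coeffs) == 1:
        return coeffs[0]
    half = len(coeffs) // 2
    return (nest(coeffs[:half]), nest(coeffs[half:]))

def basis(dim):
    # Standard basis vectors e_0, ..., e_{dim-1}, with e_0 the unit.
    return [nest([1 if i == j else 0 for j in range(dim)]) for i in range(dim)]

def is_associative_on_basis(dim):
    # The product is trilinear, so checking all basis triples suffices.
    units = basis(dim)
    return all(mul(mul(x, y), z) == mul(x, mul(y, z))
               for x, y, z in product(units, repeat=3))

def is_unital(dim):
    # e_0 should act as a two-sided unit on every basis vector.
    units = basis(dim)
    one = units[0]
    return all(mul(one, x) == x == mul(x, one) for x in units)
```

Under these assumptions, `is_associative_on_basis(4)` holds for the quaternions while `is_associative_on_basis(8)` fails for the octonions, although `is_unital(8)` still holds. This matches the statement that the octonions are unital but not an associative monoid in $\mathsf{Vect}_k$.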
References
24.1 R. Aris: Mathematical Modelling Techniques (Pitman, San Francisco 1978)
24.2 P.J. Davis, R. Hersh: The Mathematical Experience (Penguin Books, London 1983)
24.3 G. de Vries: Slides from Workshop on Mathematical Modelling, June 2001 Mathematics Symposium: Focus on Applied and Pure Mathematics, Edmonton Regional Consortium (2001)
24.4 N.M. Hrushikesh, V.P. Devidas: Fundamentals of Approximation Theory (Narosa Publishing House, New Delhi 2000)
24.5 O. Christensen, K.L. Christensen: Approximation Theory: From Taylor Polynomials to Wavelets (Springer, New York 2005)
24.6 J. Vercruysse: Hopf algebras. Variant notions and reconstruction theorems. In: Quantum Physics and Linguistics: A Compositional, Diagrammatic Discourse, ed. by E. Grefenstette, C. Heunen, M. Sadrzadeh (Oxford Univ. Press, Oxford 2013) pp. 115–146
24.7 M. Takeuchi: Finite Hopf algebras in braided tensor categories, J. Pure Appl. Algebr. 138, 59–82 (1999)
24.8 R. Hersh: Mathematics has a front and a back, Synthese 88, 127–133 (1991)
24.9 G. Pólya: From the preface of induction and analogy in mathematics. In: New Directions in the Philosophy of Mathematics, Revised and expanded edn, ed. by T. Tymoczko (Princeton Univ. Press, Princeton 1998) pp. 99–101
24.10 I. Lakatos: Proofs and Refutations (Cambridge Univ. Press, Cambridge 1976)
24.11 C. Pincock: A revealing flaw in Colyvan’s indispensability argument, Philos. Sci. 71, 61–79 (2004)
24.12 D. Epstein, S. Levy, R. de la Llave: Statement of philosophy and publishing criteria, Exp. Math. 1, 1–3 (1992)
24.13 J.P. Van Bendegem: What, if anything, is an experiment in mathematics? In: Philosophy and the Many Faces of Science, ed. by D. Anapolitanos, A. Baltas, S. Tsinorema (Rowman and Littlefield, London 1998) pp. 172–182
24.14 J.P. Van Bendegem: Thought experiments in mathematics: Anything but proof, Philosophica 72, 9–33 (2003)
24.15 D. Bulacu: The weak braided Hopf algebra structure of some Cayley-Dickson algebras, J. Algebr. 322, 2404–2427 (2009)
24.16 H. Albuquerque, S. Majid: Quasialgebra structure of the octonions, J. Algebr. 220, 188–224 (1999)
24.17 P. Achinstein: The Nature of Explanation (Oxford Univ. Press, Oxford 1983)
24.18 P. Kitcher: Explanatory unification. In: The Philosophy of Science, ed. by R. Boyd, P. Gasper, J.D. Trout (MIT Press, Cambridge 1988) pp. 329–347
24.19 J. Woodward: A theory of singular causal explanation. In: Explanation, ed. by R. David-Hillel (Oxford Univ. Press, New York 1993) pp. 246–274
24.20 P. Mancosu: Mathematical explanation: Why it matters. In: The Philosophy of Mathematical Practice, ed. by P. Mancosu (Oxford Univ. Press, Oxford 2008) pp. 134–150
24.21 M. Steiner: Mathematical explanation, Philos. Stud. 34, 135–151 (1978)
24.22 P. Kitcher: Explanatory unification and the causal structure of the world. In: Scientific Explanation, ed. by P. Kitcher, W. Salmon (Univ. Minnesota Press, Minneapolis 1989) pp. 410–505
24.23 H. Weyl: A half-century of mathematics, Am. Math. Mon. 58, 523–533 (1951)
24.24 R. Nevanlinna: Reform in teaching mathematics, Am. Math. Mon. 73, 451–464 (1966)
24.25 D. Schlimm: Axioms in mathematical practice, Philos. Math. 21(1), 37–92 (2013)
24.26 Z. Škoda: Hopf Algebroid, http://ncatlab.org/nlab/show/Hopf+algebroid (2014)
24.27 G. Böhm, K. Szlachányi: Hopf algebroids with bijective antipodes: Axioms, integrals and duals, Commun. Algebr. 32, 4433–4464 (2004)
24.28 J.-H. Lu: Hopf algebroids and quantum groupoids, Int. J. Math. 7, 47–70 (1996)
24.29 P. Schauenburg: Bialgebras over noncommutative rings and a structure theorem for Hopf bimodules, Appl. Categ. Struct. 6, 193–222 (1998)
24.30 A. Bruguières, S. Lack, A. Virelizier: Hopf monads on monoidal categories, Adv. Math. 227, 745–800 (2011)
24.31 Y. Rav: Why do we prove theorems?, Philos. Math. 7, 5–41 (1999)
24.32 J. Dawson: Why do mathematicians re-prove theorems?, Philos. Math. 14, 269–286 (2006)
24.33 F. Borceux: Handbook of Categorical Algebra I. Encyclopedia of Mathematics and Its Applications, Vol. 50 (Cambridge Univ. Press, Cambridge 1994)
24.34 S. Mac Lane: Categories for the Working Mathematician, Graduate Texts in Mathematics Ser., Vol. 5, second edn (Springer, Berlin 1998)
24.35 S. Awodey: Category Theory, Oxford Logic Guides Ser., second edn (Oxford Univ. Press, Oxford 2010)
24.36 L. Crane, D.N. Yetter: Examples of categorification, Cah. Topol. Géom. Différ. Catég. 39, 325 (1998)
24.37 J.C. Baez, J. Dolan: Higher-dimensional algebra III. n-categories and the algebra of opetopes, Adv. Math. 135, 145–206 (1998)
24.38 J.C. Baez: The octonions, Bull. Am. Math. Soc. 39, 145–205 (2001)
Ferdinand Rivera
25. Abduction and the Emergence of Necessary
Mathematical Knowledge
implemented in more systematic terms. In this chapter four types of inferences that students develop in mathematical activity are presented and compared followed by a presentation of key findings from current research on abduction in mathematics and science education. The chapter closes with an exploration of ways in which students can effectively enact meaningful and purposeful abductive thinking processes through activities that enable them to focus on relational or orientation understandings. Four suggestions are provided, which convey the need for meaningful, structured, and productive abduction actions. Together the suggestions target central features in abductive cognition, that is, thinking, reasoning, processing, and disposition.

in Mathematical Contexts
25.4.1 Cultivate Abductively-Infused Guesses with Deduction
25.4.2 Support Logically-Good Abductive Reasoning
25.4.3 Foster the Development of Strategic Rules in Abductive Processing
25.4.4 Encourage an Abductive Knowledge-Seeking Disposition
References
p. 225]. He was pleasantly surprised about how easy it was to count “all the sixes” by “counting by fives and adding the ones left”, which generated in him an intense feeling of discovering something new through a guess that made sense and that he was able to verify to be correct. The following passage from El Khachab [25.2] provides another, and yet deeper, way of thinking about Ian’s experience. El Khachab foregrounds the significance of having a purpose as a way of motivating the emergence of new ideas, which is one way of explaining how learners sometimes find themselves being carried away during the process of discovery. The second sentence in the passage articulates in very clear terms the primary purpose of abduction and its central and unique role in the establishment of new knowledge [25.2, p. 172]:

“Before asking where new ideas come from, we need to ask what new ideas are for, and knowing what they are for, we can attune their newness to their purpose. And their purpose is, in the case of abduction, to provide true explanations following experimental verification.”

Ian saw purpose in counting by five plus one that encouraged him to further pursue his new idea. After verifying that his strategy actually worked on the available cases, he then articulated an explanation that matched what he was thinking in his head. The nature of what counts as a true explanation in abduction is explored in some detail in the succeeding sections. For now, it makes sense to think of abductive explanations as modeling instances of “relational or orientational way of knowing”, which is a type of “embodied coping” that attends to [25.2, p. 172]

“the possible relations – what we might call the relational dimensions – that exist as a dynamical outcome of the interacting of objectively observable phenomena which are not in themselves objectively observable.”

Ian’s abductive thinking about counting by six is worth noting early in this chapter in light of recent findings on children’s algebraic thinking that show how many of them tend to use their knowledge of the multiplication table to help them generate and establish mathematical relationships and support their ability to construct explicit or function-based formulas involving linear patterns [25.3].

US eighth-grade student Dung’s figural processing of the two pattern generalization tasks shown in Figs. 25.1 and 25.2 illustrates another characterization of abductive thinking that “carries over a deeper similarity to a number of seemingly rather different situations” [25.1, p. 225]. Dung’s processing illustrates a kind of double description (i.e., in Bateson’s [25.5, p. 31] sense of “cases in which two or more information sources come together to give information of a sort different from what was in either source separately”) that is a necessary condition when students are engaged in mathematical thinking and learning. When Dung was presented with the ambiguous Fig. 25.1 task consisting of two beginning stages in a growing pattern, he constructed a growing sequence of L-shaped figures (Fig. 25.3). When he was asked to generate explicit rules for his pattern, he suggested $s = n + (n - 1)$ and $s = 2n - 1$. When he was asked to justify them, Dung saw the pattern stages in terms of groups of squares. In the case of his first rule, each stage in his growing pattern consisted of the union of two variable units having cardinalities $n$ and $(n - 1)$ corresponding to the column and row of squares, respectively (see Fig. 25.3 stage 3 for an illustration). In the case of his second rule, two composite sides of squares that had the same number of squares on each side overlapped along the corner square (see Fig. 25.3 stage 5 for an example).
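Dung’s two rules for the L-shaped pattern can be checked against a direct count of squares. The sketch below is only an illustration: it assumes, following the description above, that stage $n$ consists of a column of $n$ squares and a row of $n - 1$ further squares meeting at a corner, and the grid encoding and function names are hypothetical rather than taken from the study.

```python
def l_shape_cells(n):
    """Grid cells of the assumed stage-n figure: a column of n squares
    plus a row of n - 1 more squares attached at the corner."""
    column = {(0, y) for y in range(n)}
    row = {(x, 0) for x in range(1, n)}
    return column | row

def rule_column_plus_row(n):
    # First rule: a column of n squares and a row of n - 1 squares.
    return n + (n - 1)

def rule_two_sides_overlap(n):
    # Second rule: two sides of n squares each, sharing the corner square.
    return 2 * n - 1

for n in range(1, 6):
    print(n, len(l_shape_cells(n)), rule_column_plus_row(n), rule_two_sides_overlap(n))
```

Both rules describe the same count through different groupings of the figure, which is the sense in which the two descriptions combine into one structural reading of the pattern.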
Fig. 25.2 Square array pattern (after [25.4]); Stage 1 and Stage 2 are shown. A. Find a direct formula for the total number of sticks at any stage in the pattern. Justify your formula. B. Find a direct formula for the total number of points at any stage in the pattern. Justify your formula.
For Dung, seeing pattern stages in terms of groups enabled him to justify his explicit rules, which became his abductive resource for constructing and justifying an explicit rule for the square array pattern shown in Fig. 25.2. Dung initially decomposed each pattern stage into separate rows of squares and separate smaller squares per row (Fig. 25.4). Using stage 4, he parsed the whole figure into four disjoint rows and counted the number of sticks per row. In counting the number of sticks per row, he saw four disjoint squares for a total of $4 \times 4 = 16$ sticks and then subtracted the three overlapping vertical sticks. He then counted the total number of horizontal and vertical sticks counting repetitions and obtained $(4 \times 4 - 3) \times 4 = 52$. In his written work, he immediately resorted to the use of a variable $n$ to convey that he was thinking in general terms, which explains the expression $(4n - (n - 1)) \times n$. Since he also saw that the four disjoint rows had overlapping sides (i.e., the interior horizontal sticks), he then took away three $(= 4 - 1)$ groups of such four horizontal sticks from 52. That concrete step allowed him to complete his explicit rule for the pattern, that is, $s = (4n - (n - 1))n - (n - 1)n$, which he then simplified to $s = 2n^2 + 2n$. Dung’s multiplicative thinking ability became his abductive – that is, double descriptive – abstracting resource that enabled him to infer deeper similarity among, and thus generalize to, different kinds of patterns.

In this chapter, we explore the relationship between abductive action and the emergence of necessary mathematical knowledge. The prevailing epistemological perspective on mathematical knowledge values the central role of induction and deduction in the development of necessary mathematical knowledge with a rather taken-for-granted view of abduction that in the past has been characterized as the creative, wild, and messy space of theory generation or construction. However, recent empirical studies on abduction and mathematical knowledge construction have begun to explore ways in which abduction could be implemented in more systematic terms beyond a way of reasoning by detectives from observations to explanations [25.6, p. 24] and merely “studying facts and devising a theory to ex-
Take away
3 groups of
overlapping
[4n – (n – 1)]n (n – 1)n
horizontal
adjacent sides
of 4 sticks
4 rows of 4n – (n – 1)
[4(4) – 3]
sticks 4 groups of 3 overlapping vertical
4 sticks adjacent sides
A. Find a direct formula for the total number of sticks at any stage in the pattern. Justify your
formula.
Fig. 25.4 Dung’s construc-
tion and justification of his
formula for the Fig. 25.2
pattern (after [25.4])
Abduction and the Emergence of Necessary Mathematical Knowledge 25.2 Inference Types 555
plain them” because “its only justification is that if we from current research on abduction in mathematics
are ever to understand things at all, it just be in that and science education, which should provide the nec-
way” [25.7, p. 40]. For instance, Mason et al. [25.8] essary context for understanding the ideas we pursue
associate abductive processing with the construction in the succeeding section. In Sect. 25.4 we explore
of structural generalizations, while Pedemonte [25.9] ways in which students can effectively enact meaning-
situates abduction within a cognitive unity thesis that ful and purposeful abductive thinking processes and
sees it as being prior and necessary to induction and other [25.1, p. 224]
ultimately deduction. Recent investigations in science
“kinds of preparing activities in mathematical
and science education that pursue an abductive frame-
learning contexts that will enable learners to be-
work also underscore the central role of abduction in
come self-consciously engaged in, can get them
inference systems that model everyday phenomena. For
ready to notice, immediately and spontaneously, the
instance, Addis and Gooding propose the iterative cycle
kinds of events relevant to their acquiring such re-
of “abduction (generation) ! deduction (prediction)
lational or orientation understandings – where, by
! induction (validation) ! abduction” in modeling
being ready to do something means what we often
the “scientific process of interpreting new or surpris-
talk of as being in possessions of a habit, an instinct,
ing findings by generating a hypothesis whose conse-
an inclination, etc.”
quences are then evaluated empirically” [25.10, p. 38].
Another instance involves Magnani’s [25.11] formu- Central to such processes and activities involves
lation of actual computational models in which case orchestrating effective tasks and other learning con-
abduction is seen as central to the development of cre- texts that will engage all students in abductive thinking,
ative reasoning in scientific discoveries and can thus be which will go a long way in supporting growth in
Part E | 25.2
used to generate rational models. necessary mathematical knowledge and excellence in
In Sect. 25.2, we provide a characterization of the reasoning that is strategic and has “logical virtue (i. e.,
four types of inferences that students develop in mathe- avoiding logical fallacies and learning what is and what
matical activity. In Sect. 25.3 we note two key findings is not admissible and valid)” [25.12, p. 269].
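Dung’s stick-counting rule for the square array pattern can be checked mechanically. The sketch below is illustrative only: it assumes that stage n of the Fig. 25.2 pattern is an n × n array of unit squares built from sticks, and the function names are hypothetical. It compares his row-by-row count, his simplified rule $s = 2n^2 + 2n$, and an independent enumeration of the distinct sticks.

```python
def sticks_by_rows(n: int) -> int:
    """Count sticks following the row-by-row reasoning attributed to Dung."""
    # Each row: n squares of 4 sticks, less the n - 1 shared vertical sticks.
    sticks_per_row = 4 * n - (n - 1)
    # n disjoint rows, so interior horizontal sticks are counted twice here.
    with_repeats = sticks_per_row * n
    # Take away n - 1 groups of n overlapping horizontal sticks.
    shared_horizontal = (n - 1) * n
    return with_repeats - shared_horizontal


def sticks_closed_form(n: int) -> int:
    """The simplified explicit rule s = 2n^2 + 2n."""
    return 2 * n**2 + 2 * n


def sticks_by_enumeration(n: int) -> int:
    """Independent check: collect the distinct edges of every unit square."""
    edges = set()
    for r in range(n):
        for c in range(n):
            edges.add(((r, c), (r, c + 1)))          # top side
            edges.add(((r + 1, c), (r + 1, c + 1)))  # bottom side
            edges.add(((r, c), (r + 1, c)))          # left side
            edges.add(((r, c + 1), (r + 1, c + 1)))  # right side
    return len(edges)


if __name__ == "__main__":
    for stage in range(1, 6):
        print(stage, sticks_by_rows(stage), sticks_closed_form(stage),
              sticks_by_enumeration(stage))
```

Agreement over a finite range of stages is an inductive check in the chapter’s sense; it supports the rule but is not a proof of it.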
closure established (apply) outcomes will behave are valid deductions) based on an Mathematical induction
deduction in the same manner established (e.g., demonstration of
to future as a result of a valid deduction a valid deductive claim)
outcomes deductive hypothesis
are occurrences or instantiations of the stipulated law. When the first three statements above are switched in two different ways, we obtain the canonical structures for abduction and induction, which are ampliative because the conclusions “amplify or go beyond the information incorporated in the premises” [25.11, p. 511] and invalid (i. e., not necessary) from a deductive point of view. In a deductive closure, an established deduction becomes the cause or hypothesis that is then applied to future outcomes, which are effects. Figure 25.5 visually captures the fundamental differences among the four inferential types.

Fig. 25.5 Differences among the four inferential types
Deduction: L and C → R
Abduction: R and L → C
Induction: C and R → L
Deductive closure: L&C → R, then → O

From a logicopsychological perspective, students need to learn to anticipate inferences that are sensible and valid in any mathematical activity. Peirce [25.15, p. 449], of course, reminds us that context matters despite our naturally drawn disposition toward “perpetually making deductions”. As an aside, kindergarten students (ages five to six years) in the absence of formal learning experiences appear to consider deductive inferences as being more certain than inductive ones and other guesses [25.16].

Students also need to understand the limitations of each inferential process. For Polya [25.17], deduction exemplifies demonstrative reasoning, which is the basis of the “security of our mathematical knowledge” [25.17, p. v] since it is “safe, beyond controversy, and final”. Abduction and induction exemplify plausible reasoning, which “supports our conjectures” and could be “hazardous, controversial, and provisional” [25.17]. Despite such constraints, however, Peirce and Polya seem to share the view that abduction, induction, and deduction are epistemologically necessary. According to Polya [25.17], while “anything new that we learn about the world involves plausible reasoning”, demonstrative reasoning uses “rigid standards that are codified and clarified by logic” [25.17, p. v]. Polya’s perspectives are narrowly confined to how we come to understand and explain the nature of mathematical objects, unlike Peirce who formulates his view by drawing on his understanding of the nature of scientific practice. “All ideas of science come to it by way of abduction”, Peirce writes, which is the fundamental source of the emergence of ideas and “consists in studying facts and devising a theory to explain them” [25.7, p. 90].

In the next three subsections below, we discuss additional characteristics of each inferential type.
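The way the three basic inference types permute the same statements (law, case, result) can be written down as a small lookup. This is only an illustrative sketch: the dictionary, the function names, and the reading of L, C, and R in Fig. 25.5 as law, case, and result are assumptions made for the example, not notation taken from the chapter.

```python
# Hypothetical encoding of the canonical forms: (premises, conclusion).
INFERENCE_FORMS = {
    "deduction": (("law", "case"), "result"),
    "abduction": (("result", "law"), "case"),
    "induction": (("case", "result"), "law"),
}


def is_ampliative(kind: str) -> bool:
    """Abduction and induction go beyond their premises; deduction does not."""
    if kind not in INFERENCE_FORMS:
        raise KeyError(f"unknown inference type: {kind}")
    return kind != "deduction"


def describe(kind: str) -> str:
    """Render one inference form as 'premise and premise -> conclusion'."""
    premises, conclusion = INFERENCE_FORMS[kind]
    return f"{kind}: {' and '.join(premises)} -> {conclusion}"


if __name__ == "__main__":
    for kind in INFERENCE_FORMS:
        label = "ampliative" if is_ampliative(kind) else "necessary"
        print(describe(kind), f"({label})")
```

Deductive closure is left out of the lookup on purpose, since it reuses an established deduction as a hypothesis for future outcomes rather than rearranging the three statements.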
Iconic-based inferences also provide another possible source of abduction [25.19]. Icons, unlike percepts, are pure possible forms of the objects they represent or resemble. Iconic-based abductions employ the following abductive process [25.19, p. 306]:

P1
→ An iconic relationship between P1 and P2
H1
P1 and P2 are similar (iconically)
⇒ Maybe H1 (or something that is similar to H1).

Abduction also involves “the problem of logical goodness, i. e., how ideas fulfill their logical purpose in the world” [25.2, pp. 159, 162]. El Khachab [25.2] uses the example of global warming to show how different stakeholders tend to model different kinds of goodness based on their purpose. Following Peirce, he notes that “the purpose of abduction is to provide hypotheses which, when subjected to experimental verification, will provide true explanations” [25.2, p. 162]. True explanations refer to “sustainable belief-habits, that is, as recurring settlements of belief about the world which rely on experientially or experimentally verifiable statements” [25.2, p. 163].

We note the following four important points below about abduction.

First, Tschaepe [25.20] underscores the significance of guessing in abduction, that is [25.20, p. 117],

“guessing is the initial deliberate originary activity of creating, selecting, or dismissing potential solu-

Second, Thagard [25.22] makes sense in saying that an abductive process involves developing and entertaining inferences toward a law that will be tested via induction, which will then produce inferences about a case. For Eco [25.23], however [25.23, p. 203],

“the real problem is not whether to find first the Case or the Law, but rather how to figure out both the Law and the Case at the same time, since they are inversely related, tied together by a sort of chiasmus.”

Third, while the original meaning of abduction based on Peirce’s work refers to inferences that yield plausible or explanatory hypotheses, Josephson and Josephson’s [25.24] additional condition of inferences that yield the best explanation revises the structure of the original meaning of abduction in the following manner:

Case: D is a collection of data (facts, observations, givens).
Law: H explains D (would, if true, explain D).
Strong Claim: No other hypothesis can explain D as well as H does.
Result: H is probably true.

Paavola [25.25] notes that while the original and revised versions of abductions share the concern toward generating explanations, they are different in several ways. The original version addresses issues related to the processes of discovery and the construction of plausible hypotheses, while the revised version models a nondeductive form of reasoning (except induction) that eventually establishes the true explanation. Across
the differences, it is instructive to keep in mind both Adler’s “simple, conservative, unifying, and yields the most understanding” conditions for constructing strong abductions [25.26, p. 19] and El Khachab’s logical goodness conditions that characterize good abductions. That is, they [25.2, p. 164]

“(1) need to be clear, i. e., they need to have distinguishable practical effects; (2) they need to explain available facts; and (3) they need to be liable to future experimental verification.”

Fourth, it is important to emphasize that abductions provide explanations or justifications that do not prove. Instead, they provide explanations or justifications that primarily assign causal responsibility in Josephson’s [25.27, p. 7] sense below.

“Explanations give causes. Explaining something, whether that something is particular or general, gives something else upon which the first thing depends for its existence, or for being the way that it is. [. . . ] It is common in science for an empirical generalization, an observed generality, to be explained by reference to underlying structure and mechanisms.”

25.2.2 Induction

Unlike abduction, induction tests a preliminary or an ongoing abduction in order to support a most reasonable law and thus develop a generalization that would link and unite the known and projected cases in a meaningful way. By testing an abductive claim over several cases, induction determines whether the claim is right or wrong. So defined, a correct induction does not produce a new concept that explains (i. e., an explanatory theory), which is the primary purpose of abductive processing. Instead, it seeks to show that once the premises hold (i. e., the case/s and the result/s), then the relevant conclusions (i. e., the law) must be true by enumeration (number of observed cases), analogy (i. e., structural or relational similarity of features among cases), or scientific analysis (through actual or mental experiments) [25.28] and thus reflect causal relationships that are expressed in the form of (categorical inductive or universally quantified) generalizations [25.11]. In the case of enumeration, in particular, the goal is not to establish an exhaustive count leading to a precise numerical value, but it is about “producing a certain psychological impression [. . . ] brought about through the laws of association, and creating an expectation of a continuous repetition of the experience” [25.28, p. 184]. In all three contexts of inductive justification, inductive inferences do not necessarily yield true generalizations. However, “in the long run they approximate to the truth” [25.29, p. 207].

Four important points are worth noting about the relationship between abduction and induction, as follows:

First, El Khachab points out how both abduction and induction appear to be “unclear” about their “practical effects which are essentially similar” [25.2, p. 166]. However, they are different in terms of “degree”, that is [25.2, p. 166],

“an induction is an inference to a rule; an abduction is an inference to a rule about an occurrence, or in Peirce’s own words, an induction from qualities [. . . ] Induction is a method of experimental verification leading to the establishment of truth in its long-term application.”

Second, abduction is not a requirement for induction. That is, there can be an abduction without induction (i. e., abductive generalizations). Some geometry theorems, for example, do not need inductive verification. In some cases, abduction is framed as conjectures that are used to further explain the development of schemes ([25.30] in the case of fractions). However, it is useful to note the insights of Pedemonte [25.9] and Prusak et al. [25.31] about the necessity of a structural continuity between an abduction argument process and its corresponding justification in the form of a logical proof. That is, a productive abductive process in whatever modal form (visual, verbal) should simultaneously convey the steps in a deductive proof.

Even in the most naïve and complex cases of inductions (e.g., number patterns with no meaningful context other than the appearance of behaving like objects in some sequence), learners initially tend to produce an abductive claim as a practical embodied coping strategy, that is, as a way of imposing some order or structure that may or may not prove to make sense in the long haul. Euler’s numerical-driven generalization of the infinite series $\sum_{n=1}^{\infty} \frac{1}{n^2}$ is a good example. He initially established an analogical relationship between two different types of equations (i. e., a polynomial P of degree n having n distinct nonzero roots and a trigonometric equation that can be transformed algebraically into something like P but with an infinite number of terms). Euler’s abductive claim had him hypothesizing an anticipated solution drawn from similarities between the forms of the two equations. Upon inductively verifying that the initial four terms of the two equations were indeed the same, Euler concluded that [25.17, pp. 17–22]

$$\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6} .$$
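A numerical check in the spirit of Euler’s inductive verification can be sketched as follows. It is an illustration only, with a hypothetical function name, and it compares partial sums of the series with π²/6 rather than reproducing Euler’s own term-by-term comparison of the two equations.

```python
import math


def basel_partial_sum(terms: int) -> float:
    """Sum 1/n^2 for n = 1 .. terms."""
    return sum(1.0 / (n * n) for n in range(1, terms + 1))


if __name__ == "__main__":
    target = math.pi ** 2 / 6
    for terms in (10, 100, 1000, 10000):
        partial = basel_partial_sum(terms)
        # The remaining gap shrinks as more terms are added.
        print(f"{terms:>6} terms: sum = {partial:.6f}, gap = {target - partial:.6f}")
```

The shrinking gap lends the claim plausibility in the sense discussed above, yet no finite number of terms establishes it with necessity.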
Third, another consequence of the preceding discussion involves the so-called inductive leap, which involves establishing a generalization from concrete instances to a conclusion that seems to contain more than the instances themselves. On the basis of the characterizations we have assigned to abduction and induction, such a leap is no longer an issue since the leap itself is settled by abduction. Hence, what criticisms cite as a “hazardous inductive leap” in relation to erroneous patterning questions such as the one shown in Fig. 25.6 is more appropriately and fundamentally a problem of abduction.

Fourth, neither abduction nor induction can settle the issue of reasonableness of context. For example, the patterning situation in Fig. 25.7 can have a stipulated abduction and an inductively verified set of outcomes based on an interpreted explicit formula. However, as Parker and Baldridge [25.32] have noted, “there is no reason why the rainfall will continue to be given by that expression, or any expression”, which implies that the “question cannot be answered” [25.32, p. 90].

25.2.3 Deduction and Deductive Closure

While abduction and induction provide support in constructing or producing a theory, both deduction and deductive closure aim to exhibit necessity. Pace Smith [25.33]: “(R)epeated co-instantiation via induction is not the same as inferential necessity” [25.33, p. 5]. A valid deduction demonstrates a logical implication, that is, it shows how a law and a case as premises or hypotheses together imply a necessary result, conclusion, or consequence. It is a “self-contained process” because the validation process relies on “the existence of well-defined sets” and preserves an already established law, thus, “freeing us from the vagaries and changeability of an external world” [25.10, p. 37].

Deductive closure emerges in students’ mathematical thinking and reasoning in at least two ways depending on grade-level expectations, as follows. Among elementary and middle school students, once they (implicitly) form a deduction, they tend to provide an empirical (numerical or visual) structural argument rather than a formal deductive proof as a form of explanation or justification. For example, Cherrie’s algebraic generalization relative to the pattern in Fig. 25.3 could be expressed in deductive form. When she began to correctly apply her result to any stage in her pattern beyond the known ones, her reasoning entered the deductive closure phase.

Among high school students and older adults, once they formulate a deduction, they tend to provide any of the following types of justification that overlap in some situations: an empirical structural argument; a logical deductive proof; or a mathematical induction proof. Figure 25.8 illustrates how a group of 34 US Algebra 1 middle school students (mean age of 13 years) empirically justified the fact that $-a \times b = -(a \times b)$ by demonstrating a numerical argument following a statement-to-reason template [25.34, pp. 126–130]. Note that when the numbers in the empirical argument shown in Fig. 25.8 are replaced with variables, the argument transforms into a logical deductive proof in which case the steps follow a logical “recycling process” (Duval, quoted in Pedemonte [25.9, p. 24]), that is, the conclusion of a foregoing step becomes the premise of a succeeding step from beginning to end. Deductive closure for these students occurred when they began to obtain products of integers (and, much later, rational numbers) involving negative factors without providing a justification.

Figure 25.9 shows a mathematical inductive proof of a classic theorem involving the sum of the interior angles in an n-sided convex polygon that has been drawn from Pedemonte’s [25.9] work with 102 Grade 13 students (ages 16–17 years) in France and in Italy. The “multimodal argumentative process of proof” [25.31, p. 35] evolved as a result of a structural continuity between a combined abductive-inductive action that was performed on a dynamic geometry tool, which focused on a perceived relationship between the process of constructing nonoverlapping triangles in a polygon and the effects on the resulting interior angle sums, and the accompanying steps that reflected the structure of a mathematical induction proof.

Fig. 25.6 An example of an erroneous generalization problem
A certain pattern begins with 1, 2, 4. If the pattern continues, what is the next number?
A. 1
B. 2
C. 7
D. 8

Fig. 25.7 An example of a patterning task with an erroneous context (after [25.32])
It started to rain. Every hour Sarah checked her rain gauge. She recorded the total rainfall in a table. How much rain would have fallen after h hours?
Hours   Rainfall
1       0.5 in
2       1 in
3       1.5 in
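Why the Fig. 25.6 question is a problem of abduction rather than induction can be made concrete with two candidate rules. Both rules below are assumptions chosen for illustration (the task itself stipulates no rule): each reproduces the given terms 1, 2, 4, yet they disagree on the next term, matching options D and C respectively.

```python
def doubling_rule(n: int) -> int:
    """Hypothesis 1: each term doubles the previous one (1, 2, 4, 8, ...)."""
    return 2 ** (n - 1)


def growing_difference_rule(n: int) -> int:
    """Hypothesis 2: differences grow by one each step (1, 2, 4, 7, ...)."""
    return 1 + n * (n - 1) // 2


if __name__ == "__main__":
    given = [1, 2, 4]
    for rule in (doubling_rule, growing_difference_rule):
        fitted = [rule(n) for n in (1, 2, 3)]
        # Both rules pass the inductive check on the known terms.
        print(rule.__name__, "fits given terms:", fitted == given,
              "| next term:", rule(4))
```

Checking more of the given terms cannot separate the two rules; only a stipulated or justified abduction about the structure of the pattern can.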
Fig. 25.9 A mathematical inductive proof for the sum of the interior angles in an n-sided convex polygon (after [25.8, pp. 37–38])

The work shown in Fig. 25.10 was also drawn from the same sample of students that participated in Pedemonte’s [25.9] study. Unlike Fig. 25.9, the analysis that the students exhibited in Fig. 25.10 shows a structural discontinuity between a combined abductive-inductive action, which primarily focused on the results or outcomes in a table of values, and steps that might have produced either a valid empirical justification or a logical mathematical induction proof. Deductive closure for these students occurred when they began to obtain the interior angle sum measures of any convex polygon beyond the typical ones.
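For reference, the structure of a mathematical induction proof for the interior angle sum can be sketched as below. This is a standard textbook argument and not a transcription of the students’ work in Fig. 25.9; the notation S(n) for the interior angle sum of a convex n-sided polygon is an assumption made for the sketch. The inductive step relies on the fact that cutting off one vertex of a convex (n + 1)-sided polygon along a diagonal leaves a convex n-sided polygon and one triangle.

```latex
\begin{align*}
\text{Base case } (n = 3):\quad & S(3) = 180^\circ = 180^\circ (3 - 2). \\
\text{Hypothesis:}\quad & S(n) = 180^\circ (n - 2) \text{ for some } n \ge 3. \\
\text{Step:}\quad & S(n + 1) = S(n) + 180^\circ
                   = 180^\circ (n - 2) + 180^\circ
                   = 180^\circ \bigl((n + 1) - 2\bigr).
\end{align*}
```

The step links the geometric action (adding one nonoverlapping triangle) to the algebraic form of the rule, which is the kind of structural continuity the passage above describes.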
Fig. 25.10 Example of an erroneous mathematical inductive argument for the sum of the interior angles in an n-sided convex polygon (after [25.8, p. 36]). The students’ table of values, dialogue, and written steps read:
3   180°
4   360°   180° × 2
5   540°   180° × 3
6   720°   180° × 4
29. A: So the rule is probably 180 × (n–2) for an n-sided polygon
30. L: Yes... n is the number of sides
Step
Hp: 180°(n – 2)
Ts: 180°(n – 1)
S(n) = 180°(n – 2) = 180n – 360
S(n + 1) = 180°(n + 1) – 360 = 180n + 180 – 360 = n + 1 – 2 = n – 1 Th
We have proved the thesis by a mathematical induction

dealing with abduction in mathematical and scientific thinking and learning yields two interesting findings, as follows.

25.3.1 Different Kinds of Abduction

Drawing on Eco’s [25.23] work, Pedemonte and Reid [25.36] provided instances in which traditional 15–17-year-old Grades 12 and 13 students in France and Italy modeled overcoded, undercoded, and creative abductions in the context of proving statements in mathematics. For Pedemonte and Reid, abduction comes before deduction. Some students in their study generated overcoded abductions, which involve using a single rule to generate a case, while others produced undercoded abductions, which involve choosing from among several different rules to establish a case. Overcoded and undercoded abductions for Magnani [25.11] exemplify instances of selective abductions because the basic task involves selecting one rule that would make sense, which, hopefully, would also yield the best explanation. Medical diagnosis, for instance, employs selective abductions [25.11]. In cases when no such rules exist, students who develop new rules of their own yield what Eco [25.23] refers to as creative abductions, which also account for “the growth of scientific knowledge” [25.11, p. 511]. Pedemonte and Reid have noted that students are usually able to construct a deductive proof in cases involving overcoded abductions due to the limited number of possible sets of rules to choose from. Furthermore, they tend to experience considerable difficulties in cases that involve undercoded and creative abductions since they have to deal with thus confusing, and creating disorder” in their processing [25.36, p. 302]. An additional dilemma that students have with creative abductions is the need to justify them prior to using them as rules in a proof process. “Consequently”, Pedemonte and Reid write [25.36, p. 302],

“it seems that there is not a simple link between the use of abduction in argumentation and constructing a deductive proof. Both the claim that abduction is an obstacle to proof and the claim that abduction is a support, if considered in a general sense, are oversimplifications. Some kinds of abductions, in some context may make the elements required for the deductions used in a proof more accessible. Some are probably less dangerous to use and can make the construction of a proof easier to get to because they could make easier to find and to select the theorem and the theory necessary to produce a proof. However, other kinds of abductions present genuine obstacles to constructing the proof. This suggests that teaching approaches that involve students conjecturing in a problem solving process prior to proving have potential, but great care must be taken that the abductions expected of the students do not become obstacles to their later proving.”

Aside from selective and creative abductions, Magnani [25.11] pointed out the significance of theoretical and manipulative abductions in other aspects of everyday and scientific work that involve creative processing. Theoretical abductions involve the use of logical, verbal or symbolic, and model-based (e.g., diagrams and pictures) processing in reasoning. While valuable, they are unable to account for other possible types of explanations (e.g., statistical reasoning, which is probabilistic; sufficient explanations; high-level kinds and types of creative and model-based abductions; etc.). Manipulative abductions emerge in cases that involve “thinking and discovering through doing”, where actions are pivotal in enabling learners to model and develop insights simultaneously leading to the construction of creative or selective abductions. They operate beyond the usual purpose of experiments and create “extra-theoretical behaviors” that [25.11, p. 517]

“create communicable accounts of new experiences in order to integrate them into previously existing systems of experimental and linguistic (theoretical) practices. The existence of this kind of extra-theoretical cognitive behavior is also testified by the many everyday situations in which humans are perfectly able to perform very efficacious (and habitual) tasks without the immediate possibility of realizing their conceptual explanation.”

Typical accounts of conceptual change processes in science tend to highlight theoretical abductions, however [25.11, p. 519],

“a large part of these processes are instead due to practical and external manipulations of some kind, prerequisite to the subsequent work of theoretical arrangement and knowledge creation.”

Manipulative abductions may also emerge in learning situations that provide “conceptual and theoretical details to already automatized manipulative executions” in which case either teacher or learner [25.11, p. 519]

“does not discover anything new from the point of view of the objective knowledge about the involved skill, however, we can say that his conceptual awareness is new from the local perspective of his individuality.”

For example, Rivera [25.37] provides a narrative account of US third-grader Mark’s evolving understanding of the long division algorithm involving multidigit whole numbers by a single-digit whole number. Mark’s initial visual representation processing of (sharing-partitive) division (Fig. 25.11) employed the use of place value-driven squares, sticks, and circles. In the case of the division task 126 ÷ 6, when he could not divide a single (hundreds) box into six (equal) groups, he recorded it as a 0. He then ungrouped the box into ten sticks, regrouped the sticks together, divided the sticks into six groups, recorded accordingly, and so on until he completed the division process for all subcollections. His numerical recording in Fig. 25.11 also captured every step in his sequence of visual actions. Results of consistent visual processing enabled him to shift his attention away from the visual form and toward the rule for division, which was accompanied by two remarkable changes in his numerical processing. In Fig. 25.12, he performed division on each digit in the dividend from left to right with the superscripts indicating partial remainders that had to be ungrouped and regrouped. In Fig. 25.13, he made another subtle creative revision that remained consistent with his earlier work and experiences. When he was asked to explain his division method, Mark claimed that “it’s like how we do adding and subtracting with regrouping, we’re just doing it with division”. Mark’s manipulative abductive processing for division involving whole numbers necessitated a dynamic experience in which “a first rough and concrete experience” [25.11, p. 519] of the process enabled him to eventually develop a version of the long division process that “unfolded in real time” via thinking through doing.

Fig. 25.11 Mark’s initial visual processing of 126 ÷ 6 (place value chart with columns Thousand, Hundred, Tens, Ones, and a “Check your answer” box)

25.3.2 Abduction in Mathematical Relationships

A study by Arzarello and Sabena [25.38] illustrates the important role of abduction in constructing mathematical relationships involving different signs. Signs pertain to the triad of signifier, signified, and an individual learner’s mental construct that makes the linking between signifier and signified possible. Arzarello and Sabena underscore their students’ use of semiotic and theoretic control when they argued and proved statements in mathematics. Semiotic control involves choosing and implementing particular semiotic resources (e.g., graphs, tables, equations, etc.) when they manipulate and interpret signs (i. e., type-1 semiotic action), while theoretic control involves choosing and implementing appropriate theories (e.g., Euclidean theorems) or parts of those theories and related conceptions when they “elaborate an argument or a proof” (i. e., type-3 semiotic action; [25.38, p. 191]). Between type-1 and
type-3 semiotic action is a type-2 semiotic action that involves using abduction to identify relationships between signs and assessing the arguments. Based on their qualitative work with Grade 9 students, such [25.38, p. 202]

“relationships between signs are examined and checked with redundant local arguments, and (economic, explanatory, and testable) hypotheses are detected and made explicit by means of abductions.”

Furthermore, they note how [25.38, p. 204]:

“abduction has an important role at this point. There is an evolution from a phase where the attention is mainly on the given signs, towards a phase where the logical-theoretical organization of the argument becomes the center of the activities and evolves from abductive to deductive and more formal structures. [. . . ] Such an evolution implies a passage from actions of type 1 to actions of type 2 and then 3, and a shift of control by the student, i. e., passing from actions guided by semiotic control to actions guided by theoretical control. [. . . ] Passing from type 1- to type 3-semiotic actions means an evolution from the data to the truth because of theoretical reasons. It is exactly this distinction that makes the difference between [. . . ] a substantial argument and an analytical argument, which is a mathematical proof.”

Arzarello and Sabena’s study foregrounds the role of abduction in inferential processing and documents how a shift from abduction to deduction is likely to occur when students’ mathematical thinking shifts in focus from the semiotic to theoretical, respectively. Studies by Pedemonte and colleagues [25.9, 36, 39] and Boero and colleagues [25.40, 41] also note the same findings in both algebra and geometry contexts. Across such studies we note how abduction is conceptualized in terms of its complex relationships with induction and deduction. Other studies do not deliberately focus on such shifts and relationships, making it difficult for students to see the value of engaging in abductive processing in the first place. For example, Watson and Shipman [25.42] documented the classroom event that happened in a Year 9 class of 13–14 year-old students in the UK that investigated the following task: Find a way to multiply pairs of numbers of the form $a + \sqrt{b}$ that results in integer products. While the emphasis of their study focused on learning through exemplification by using special examples to help students develop meaningful plausible structures, it seems that the abductive process for them became a matter of conjecturing relationships based on their experiences with their constructed examples. But certainly there is more to abductive processing than merely generating conjectures, as follows.

Several studies have suggested inferential model systems that show relationships between and among abduction, induction, and deduction. Addis and Gooding [25.10], for example, illustrate how the iterative cycle of

abduction (generation) → deduction (prediction) → induction (validation) → abduction

might work in the formation of consensus from beliefs. Radford’s [25.43] architecture of algebraic pattern generalizations emphasizes a tight link between abduction and deduction, that is, hypothetico-deduction, in the following manner:

abduction (from particulars $p_1, p_2, \ldots, p_k$ to noticing a commonality C)
→ transforming the abduction (from noticing C to making C a hypothesis)
→ deduction (from hypothesis C to producing the expression of $p_n$).

alizing, the middle school student groups skipped the induction phase and instead exhibited the following structure:

abduction and deduction → deductive closure.

Abduction in this phase was combined with deduction and thus became structured and inferential as a consequence of their ability to express generalizations
and logical as a consequence of knowing the problem context and being “compounds of deductions from general rules” (i.e., hypothetico-deductivist) that individual knowers are already familiar with (Peirce, quoted in [25.20, p. 119]). Tschaepe writes, “(w)e guess in an attempt to address the surprising phenomenon that has led to doubt; it is our inchoate attempt to provide an explanation” [25.20, p. 118]. Viewed in this sense [25.20, p. 122],

“[a]bduction is a logical operation, and guess is logical insofar as it is a type of reasoning by which an explanation of a surprising phenomenon is first created, selected, or dismissed [...] Guessing is the creative component of abductive inference in which a new idea is first suggested through reasoning.”

25.4.2 Support Logically-Good Abductive Reasoning

Students will benefit from knowing how to develop abductions that are logically good, that is, they are: clear (i.e., can be confirmed or disconfirmed); can explain the facts; are capable of being tested and verified; and can lead to true explanations that establish “sustainable belief-habits” [25.2, p. 163]. Such explanations may be new and may emerge from guesses and instincts, but, Khachab writes [25.2, pp. 171–172],

“logical goodness is the reason for abduction, under its diverse meanings. No matter how abduction actually generates new ideas – whether it is abductive inference, strategic inference, instinctive insight, etc. – its purpose is, ultimately, to provide true explanatory hypotheses for inquiry. And, in this regard, new hypotheses should always be evaluated in reference to their goodness.”

25.4.3 Foster the Development of Strategic Rules in Abductive Processing

Paavola [25.12] distinguishes between definitory and strategic rules. While definitory rules focus on logic and logical relationships, strategic rules pertain to “goal-directed activity, where the ability to anticipate things, and to assess or choose between different possibilities, are important” [25.12, p. 270]. Thus, abductive strategies produce justifications for given explanatory hypotheses, including justifications for “why there cannot be any further explanation” [25.12, p. 271]. Hence, all generated abductive inferences conveyed in the form of discoveries provide an analysis or explanation of the underlying conceptual issues and are not merely reflective of mechanical recipes or algorithms for generating ideas and discoveries. Furthermore, the analysis or explanation should present “a viable way of solving a particular problem and that it works more generally (and not only in relationship to one, particular anomalous phenomenon)” [25.12, p. 273] and fit the “constraints and clues that are involved in the problem situation in question” [25.12, p. 274].

25.4.4 Encourage an Abductive Knowledge-Seeking Disposition

Sintonen’s [25.45] interrogative model of inquiry, which employs an explicit logic of questions, demonstrates the significance of using certain strategic principles and why-questions as starting points in abductive processing. Questions as well as answers drive discoveries and the scientific process. Questions, especially, “pick out something salient that requires special attention, and that it also gives heuristic power and guidance in the search for answers” [25.45, p. 250]. Furthermore, [25.45, p. 263],

“principal questions are often explanation-seeking in nature and arise when an agent tries to fit new phenomena to his or her already existing knowledge. Advancement of inquiry can be captured by examining a chain of questions generated. By finding answers to subordinate questions, an agent approaches step by step toward answering the big initial question, and thus changes his or her epistemic situation.”

Students will benefit from situations and circumstances that engage them in a knowledge-seeking game in which they “subject a source of information [...] to a series of strategically organized questions. This Sherlock Holmes method therefore is at the heart of abductive reasoning” [25.45, p. 254]. Furthermore, the interrogative model allows conclusions (i.e., answers) to emerge. “For abductive tasks”, Sintonen writes [25.45, p. 256],

“the goal must be understanding and not just knowledge. A rational inquirer who wants to know why and not only that something is the case must, after hearing the answer, be in the position to say Now I know (or rather understand) why the (singular or general) fact obtains. Obviously this condition is fulfilled only if she or he knows enough of the background to be able to insert the offered piece of information into a coherent explanatory account.”
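As a purely illustrative aside, the abduction → deduction → induction cycle and the pattern-generalization schema discussed earlier in this chapter can be sketched in a few lines of Python. The function names, the restriction to linear patterns, and the sample terms are assumptions made for this sketch; they are not taken from Addis and Gooding [25.10] or Radford [25.43], and the sketch is not a model of students' actual reasoning.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical representation of a hypothesis: p_n = a + (n - 1) * d
Rule = Tuple[int, int]  # (first term a, common difference d)


def abduce_linear_rule(terms: List[int]) -> Optional[Rule]:
    """Abduction: from particulars p_1..p_k, notice a commonality C
    (a constant difference) and turn it into a hypothesis.
    Returns None when no constant difference is noticed."""
    if len(terms) < 2:
        return None
    differences = {later - earlier for earlier, later in zip(terms, terms[1:])}
    if len(differences) != 1:
        return None
    return terms[0], differences.pop()


def deduce_term(rule: Rule, n: int) -> int:
    """Deduction: from the hypothesis, produce the expression of p_n."""
    first, step = rule
    return first + (n - 1) * step


def validate(rule: Rule, new_cases: Dict[int, int]) -> bool:
    """Induction (validation): check the hypothesis against further cases."""
    return all(deduce_term(rule, n) == value for n, value in new_cases.items())


observed = [3, 5, 7, 9]                        # p_1..p_4 of an assumed pattern
rule = abduce_linear_rule(observed)            # hypothesis (a=3, d=2)
prediction = deduce_term(rule, 10)             # 3 + 9 * 2 = 21
still_holds = validate(rule, {5: 11, 6: 13})   # True for these extra cases
```

If a later case contradicted the rule, `validate` would return `False` and the cycle would return to abduction with the enlarged set of particulars.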
Acknowledgments. The research that is reported in this chapter has been funded by the National Science Foundation under Grant Number DRL 0448649. All the views and opinions expressed in this report are solely the author’s responsibility and do not necessarily reflect the views of the foundation.
References
25.1 J. Shotter: Bateson, double description, Todes, and embodiment: Preparing activities and their relation to abduction, J. Theory Soc. Behav. 39(2), 219–245 (2009)
25.2 C. El Khachab: The logical goodness of abduction in C. S. Peirce’s thought, Trans. Charles S. Peirce Soc. 49(2), 157–177 (2013)
25.3 F. Rivera: Changing the face of arithmetic: Teaching children algebra, Teach. Child. Math. 12(6), 306–311 (2006)
25.4 F. Rivera: Visual templates in pattern generalization activity, Educ. Stud. Math. 73, 297–328 (2010)
25.5 G. Bateson: Mind and Nature: A Necessary Unity (Fontana/Collins, London 1979)
25.6 D. Holton, K. Stacey, G. FitzSimons: Reasoning: A dog’s tale, Aust. Math. Teach. 68(3), 22–26 (2012)
25.7 C. Peirce: Collected Papers of Charles Sanders Peirce, Vol. 5 (Harvard Univ. Press, Cambridge 1934)
25.8 J. Mason, M. Stephens, A. Watson: Appreciating mathematical structures for all, Math. Educ. Res. J. 21(2), 10–32 (2009)
25.9 B. Pedemonte: How can the relationship between argumentation and proof be analyzed?, Educ. Stud. Math. 66, 23–41 (2008)
25.10 T. Addis, D. Gooding: Simulation methods for an abductive system in science, Found. Sci. 13, 37–52 (2008)
25.11 L. Magnani: Conjectures and manipulations: Computational modeling and the extra-theoretical dimension of scientific discovery, Minds Mach. 14, 507–537 (2004)
25.12 S. Paavola: Abduction as a logic and methodology of discovery: The importance of strategies, Found. Sci. 9, 267–283 (2004)
25.13 A. Heeffer: Learning concepts through the history of mathematics: The case of symbolic algebra. In: Philosophical Dimensions in Mathematics Education, ed. by K. Francois, J.P. Van Bendegem (Springer, Dordrecht 2010) pp. 83–103
25.14 U. Goswami: Inductive and deductive reasoning. In: The Wiley-Blackwell Handbook of Childhood Cognitive Development, ed. by U. Goswami (Wiley-Blackwell, Malden 2011) pp. 399–419
25.15 C. Peirce: Collected Papers of Charles Sanders Peirce, Vol. 1/2 (Belknap Press of Harvard Univ. Press, Cambridge 1960)
25.16 B. Pillow, R. Pearson, M. Hecht, A. Bremer: Children’s and adults’ judgments of the certainty of deductive inference, inductive inferences, and guesses, J. Genet. Epistemol. 171(3), 203–217 (2010)
25.17 G. Polya: Induction and Analogy in Mathematics, Mathematics and Plausible Reasoning, Vol. 1 (Princeton Univ. Press, Princeton 1973)
25.18 B. Haig: Precis of “an abductive theory of scientific method”, J. Clin. Psychol. 64(9), 1019–1022 (2008)
25.19 S. Paavola: Diagrams, iconicity, and abductive discovery, Semiotica 186(1/4), 297–314 (2011)
25.20 M. Tschaepe: Guessing and abduction, Trans. Charles S. Peirce Soc. 50(1), 115–138 (2014)
25.21 G.-J. Kruijff: Peirce’s late theory of abduction: A comprehensive account, Semiotica 153(1/4), 431–454 (2005)
25.22 P. Thagard: Semiosis and hypothetic inference in C. S. Peirce, Versus Quaderni Di Studi Semiotici 19/20, 163–172 (1978)
25.23 U. Eco: Horns, hooves, insteps: Some hypotheses on three types of abduction. In: The Sign of Three: Dupin, Holmes, Peirce, ed. by U. Eco, T. Sebeok (Indiana Univ. Press, Bloomington 1983) pp. 198–220
25.24 J. Josephson, S. Josephson: Abductive Inference: Computation, Philosophy, Technology (Cambridge Univ. Press, New York 1994)
25.25 S. Paavola: Hansonian and Harmanian abduction as models of discovery, Int. Stud. Philos. Sci. 20(1), 93–108 (2006)
25.26 J. Adler: Introduction: Philosophical foundations. In: Reasoning: Studies of Human Inference and Its Foundations, ed. by J. Adler, L. Rips (Cambridge Univ. Press, Cambridge 2008) pp. 1–34
25.27 J. Josephson: Smart inductive generalizations are abductions. In: Abduction and Induction: Essays on Their Relation and Integration, ed. by P. Flach, A. Kakas (Kluwer, Dordrecht 2000) pp. 31–44
25.28 J. Hibben: Logic: Deductive and Inductive (Charles Scribner’s Sons, New York 1905)
25.29 C. Peirce: Grounds of validity of the laws of logic: Further consequences of four incapacities, J. Specul. Philos. 2, 193–208 (1869)
25.30 A. Norton: Josh’s operational conjectures: Abductions of a splitting operation and the construction of new fractional schemes, J. Res. Math. Educ. 39(4), 401–430 (2008)
25.31 N. Prusak, R. Hershkowitz, B. Schwarz: From visual reasoning to logical necessity through argumentative design, Educ. Stud. Math. 74, 185–205 (2012)
25.32 T. Parker, S. Baldridge: Elementary Mathematics for Teachers (Sefton-Ash Publishing, Okemos 2004)
25.33 L. Smith: Reasoning by Mathematical Induction in Children’s Arithmetic (Elsevier Science Ltd., Oxford 2002)
25.34 F. Rivera: Toward a Visually-Oriented School Mathematics Curriculum: Research, Theory, Practice, and Issues (Springer, New York 2011)
25.35 F. Arzarello: The proof in the 20th century. In: Theorems in Schools: From History, Epistemology, and Cognition in Classroom Practices, ed. by P. Boero (Sense Publishers, Rotterdam 2006) pp. 43–64
25.36 B. Pedemonte, D. Reid: The role of abduction in proving processes, Educ. Stud. Math. 76, 281–303 (2011)
25.37 F. Rivera: From math drawings to algorithms: Emergence of whole number operations in children, ZDM 46(1), 59–77 (2014)
25.38 F. Arzarello, C. Sabena: Semiotic and theoretic control in argumentation and proof activities, Educ. Stud. Math. 77, 189–206 (2011)
25.39 M. Martinez, B. Pedemonte: Relationship between inductive arithmetic argumentation and deductive algebraic proof, Educ. Stud. Math. 86, 125–149 (2014)
25.40 P. Boero, R. Garuti, M. Mariotti: Some dynamic mental processes underlying producing and proving conjectures, Proc. 20th Conf. Int. Group Psychol. Math. Educ., Vol. 2, ed. by L. Puig, A. Gutierrez (IGPME, Valencia 1996) pp. 121–128
25.41 P. Boero, N. Douek, F. Morselli, B. Pedemonte: Argumentation and proof: A contribution to theoretical perspectives and their classroom implementation, Proc. 34th Conf. Int. Group Psychol. Math. Educ., Vol. 1, ed. by M. Pinto, T. Kawasaki (IGPME, Belo Horizonte 2010) pp. 179–204
25.42 A. Watson, S. Shipman: Using learner generated examples to introduce new concepts, Educ. Stud. Math. 69, 97–109 (2008)
25.43 L. Radford: Iconicity and contraction: A semiotic investigation of forms of algebraic generalizations of patterns in different contexts, ZDM 40, 83–96 (2008)
25.44 F. Rivera: Teaching and Learning Patterns in School Mathematics: Psychological and Pedagogical Considerations (Springer, New York 2013)
25.45 M. Sintonen: Reasoning to hypotheses: Where do questions come?, Found. Sci. 9, 249–266 (2004)
Part F Model-Based Reasoning in Cognitive Science
Ed. by Athanassios Raftopoulos
Model-based reasoning refers to the kinds of inferences performed on the basis of a knowledge-context that guides them. This context constitutes a model of a domain of reality, that is, a representation, approximate and simplified to various degrees, of the factors that underlie, and the interrelations that govern, the behavior of the entities in this domain. Model-based reasoning is ubiquitous in the human (and not only the human) brain. Various studies have shown that most likely we do not draw inferences by applying some abstract, formal rules; instead inference rules are applied within concrete-knowledge contexts that determine which rules should be used and when.

Model-based reasoning is not limited to the cognitive functions of the brain but it is likely that it extends to perceptual functions that retrieve information from the environment. Chapter 26 defends the view that the processes of visual perception constitute a case of model-based reasoning. It discusses, first, the problem of whether vision involves model-based inferences and, if so, what kind. Secondly, it discusses the problem of the nature of the context that guides visual inferences. It finally addresses the broader problem of the relation between visual processing and thinking; various modes of inferences, the most predominant conceptions about visual perception, the stages of visual processing, the problem of the cognitive penetrability of perception, and the logical status of the processes involved in all stages of visual processing are discussed and assessed.

Reasoning is usually considered to consist of actions that occur exclusively within the brain of the agent that reasons. Reasoning is a mental activity in which various inference rules are applied to mentally represented sentences. This is not always true, however. On many occasions agents use external representations to enhance their inferential capabilities by overcoming limitations in working memory capacities, by simplifying the problem space, etc. In the second chapter, Bechtel argues that humans often reason by constructing, manipulating, and responding to external representations, whether the reasoning be deductive, abductive, or inductive. These representations are not only linguistic expressions (symbols on a piece of paper, for instance) but also include diagrams. Although diagrams are used in everyday reasoning, they are particularly important in science; diagrams, for example, figure in the processes through which scientists analyze data and construct their explanations. In Chap. 27 Bechtel discusses what is known about how people, including scientists, reason with diagrams.

Another consequence of the assumption that reasoning takes place within the human mind is that reasoning depends only on the mental properties of the reasoning agent and is independent of the agent’s body. More generally, cognition is limited within the mind of the cognitive agent. This assumption has been recently challenged on many grounds and the argument has been made that reasoning is embodied in that it constitutively and not merely causally involves the body of the reasoning agent. In this vein, Chap. 28 focuses on the role of the concept of mental imagery as a fundamental cognitive capability that enhances the performance of cognitive robots. The authors discuss the embodied imagery mechanisms applied to build artificial cognitive models of motor imagery and mental simulation to control complex behaviors of humanoid platforms that represent the artificial body.

If reasoning is model based, the reasoning agent draws from a variety of sources in order to choose the more salient and useful rules in a particular problem context, to choose the information that will be brought to bear on the problem at hand, to determine how to update her knowledge basis in view of the outputs of the rule, etc. The complexity of the task paves the way for dynamic approaches to cognition, as they are better suited to handle complexities of this magnitude and explain animals’ intelligent behavior more adequately. Dynamical models of cognition put emphasis on time and complexity, both of which relate context to behavior. In Chap. 29, Metzger argues that temporal processes allow memory, feedback, the effects of nonlinear recursion, and the generation of expectation to be brought to bear on cognitive activities, whereas complexity allows stable patterns of coordination to emerge from the interaction of sub-processes. Metzger reviews several models of cognition, and their dynamical features. She focuses on the manner in which each model deals with time and complexity, thought, and action.

Dynamical models could also be used to model the ways that humans continuously adapt their behavior to changes in their environment, and the way their cognitive abilities continuously develop over time. In Chap. 30, P. van Geert, R. den Hartigh, and R. Cox argue that an important question for psychologists in this direction has been the discovery of the (cognitive) mechanism that underlies the control of human behavior in real time, as well as the process of cognitive development in the long term. Their chapter discusses two kinds of general approaches, namely, the reductionist approach and the complex dynamic systems (CDS) approach. The reductionist approach, on the one hand, assumes that separate components, such as brain areas or cognitive processing mechanisms, are the main determinants of behavior and development, by processing (and responding to) specific environmental inputs. The CDS approach, on the other hand, assumes that cognition, and, hence, the control of behavior and development, are distributed over the brain, the body, and the environment, all three continuously interacting over time.

Thus, dynamic system theory proposes ways in which embodied cognition (that is, the view that cognitive processes constitutively involve the body) and extended cognition (that is, the view that cognitive processes constitutively involve the environment and, in this sense, are extended to the world breaking the boundaries of the mind/brain) could be brought together with the received view that cognition is restricted to what happens within the boundaries of the brain, and provide a more adequate account of animal cognition. To substantiate this claim, the authors compare the two approaches with respect to their assumptions, research strategies, and analyses. Furthermore, they discuss the extent to which current research data in the cognitive domain can be explained by the two different approaches. They conclude that the CDS approach provides the most plausible approach to cognition.

In Chap. 31, Waskan argues that model-based reasoning in science is often carried out in an attempt to understand the kinds of mechanical interactions that might give rise to particular occurrences. Scientists do that by constructing and using mental models that are like scale models in crucial respects. Behavioral evidence points to the existence of these mental models, but the neural plausibility of this hypothesis is still questioned. Waskan provides an overview of the psychological literature on mental models of mechanisms, focusing on the problem of how representations that share the distinctive features of scale models might be realized by neural machinations. He argues that lessons brought together from the computational simulation of mechanisms and from neurological research on mental maps in rats could be applied to explain how neurophysiological processes might realize mental models.
26. Vision, Thinking, and Model-Based Inferences
Athanassios Raftopoulos
Helmholtz [26.1] famously maintained that perception is a form of inference; the brain uses probabilistic knowledge-driven inferences to induce the causes of the sensory input from this input, that is, to extract from the bodily effects of the light emanating from the objects in a visual scene as it impinges on our transducers the various aspects of the world that cause the input. The brain integrates computationally the retinal properties of the image of an object with other relevant sources of information to determine the object’s intrinsic properties. Rock [26.2] claimed that the perceptual system combines inferential information to form the percept. From visual angle and distance information, for example, the perceptual system infers and perceives size. This inference may be automatic and outside the authority of the viewer who does not have control over it, but is an inference nevertheless.

Similarly, Spelke [26.3] suggests “perceiving objects may be more akin to thinking about the physical world than to sensing the immediate environment”. The reason is that the perceptual system, to solve the underdetermination problem of both the distal object from the retinal image and of the percept from the retinal image, employs a set of object principles (the Spelke principles) that reflect the geometry and the physics of our environment. Since the principles can be thought of as some form of knowledge about the world, perception engages in inferential processes from some pieces of worldly knowledge and visual information to the percept, that is, the object of our ordinary visual encounters with the world.

Recently Clark [26.4] argued that:

“To perceive the world just is to use what you know to explain away the sensory signal across multiple spatial and temporal scales. The process of perception is thus inseparable from rational (broadly Bayesian) processes of belief fixation [...] As thought, sensing, and movement here unfold, we discover no stable or well-specified interface or interfaces between cognition and perception. Believing and perceiving, although conceptually distinct, emerge as deeply mechanically intertwined.”

The aim of this conglomeration of faculties that constitute perception is, therefore, to enable perceivers to respond, modify their responses, and eventually adapt their responses as they interact with the environment so as to tune themselves to the environment in such a way that this interaction be successful; success in such an endeavor relies on inferring correctly (or nearly so) the nature of the source of the incoming signal from the signal itself.

In all these views, the visual system constructs the percept in the way thinking constructs new thoughts on the basis of thoughts that are already entertained. In this sense, vision is a cognitive, that is, thought-involving, process.

If perception is to be thought of as some sort of thinking, its processes must necessarily, first, include transformations of states that are expressed in symbolic or propositional form, and, second, these transformations must be inferences from some states that function as premises to a state that is the conclusion of the inference. That is to say, visual processes must be inferences or arguments, exactly like the processes of rational belief formation. These two conditions follow directly from the claim that perception is some sort of thinking, since the characteristic trait of thinking is drawing inferences (whether deductive, abductive, or inductive) operating on symbolic forms by means of inference rules that are represented in the system, although thinking is not reduced to drawing inferences this way. In view of these considerations, the principles guiding the transformations of perceptual states, that is, the principles (such as Spelke’s principles) acting as the inference rules in perceptual inferences, must be expressed in the system and, specifically, must be represented in a symbolic form. Whenever the system needs some of the principles to draw an inference, it simply activates and uses them. In addition, the premises and the conclusion of a visual argument must be represented in the viewer in a propositional-like, symbolic form.

If these conditions are met, perception involves discursive inferences, that is, drawing propositions or conclusions from other propositions acting as premises by applying (explicitly or implicitly) inferential rules that are also represented in the system. Clark’s view quoted above seems to echo this thesis in so far as Clark conceives the processes of visual perception as a rational process of belief fixation. It follows that the inferences used in perception are no different from the inferences used in thought. That is, they are discursive inferences.

A short digression is needed here, however, lest we attribute to Clark intentions that he may not have. The previous analysis assumes the standard view of the brain as a physical machine that processes symbols in a purely formal or syntactic way on the basis of the physical properties of the symbols; the brain performs digital computations. These symbols have meaning, of course, and so do the transformations of these symbols, but the processes in the brain are independent of any meaning. To put it differently, the brain is a syntactic machine that processes symbols that have meaning. The standard view can be modified by adding the thesis that digital computations are not merely formal syntactic manipulations but also involve semantics, that is, the contents of the states that participate in computations
are causally relevant in the production of the computations’ outputs [26.5].

Although this is the standard, algorithmic, view of cognition, it is by no means unequivocally endorsed. There is another, competing view of cognition, according to which the brain is not a syntactic machine that processes symbols through algorithms. The brain represents information in a nonsymbolic, analogue-like form, as activation patterns across a number of units. Furthermore, the processes in the brain do not assume the form of algorithmic but of algebraic transformations; this is the connectionist view of cognition, of which Clark is a staunch proponent. This is not the place to expand and explain connectionism, but I wish to stress that in this view of cognition, the brain does not use discursive inferences at all, although some of its behavior certainly simulates the usage of discursive inferences. If this is so, Clark’s thesis that perception is inseparable from the rational processes of belief fixation does not commit him to the view that perception employs discursive inferences, for the simple reason that thinking itself does not implicate such inferences.

Furthermore, given the propositional or symbolic format in which the states of the visual system must be represented if vision is akin to thinking, the contents of these states, that is, the information carried by the states, consist of concepts that roughly correspond to the symbols implicated; it is conceptual content. If vision is some sort of thinking, therefore, its contents must be conceptual contents. This means one of two things. Either the visual circuits store conceptual information that they use to process the incoming information, or they receive from the inception of their function such information from the cognitive areas of the brain while they are processing the information impinging on the retina. Spelke’s principles that guide visual processing and render the percept possible are examples of conceptual content.

It should be noted that discursive inferences are distinguished from inferences as understood by vision scientists according to whom any transformation of signals carrying information according to some rule is an inference [26.6]:

“Every system that makes an estimate about unobserved variables based on observed variables performs inference [...] We refer to such inference problems that involve choosing between distinct and mutually exclusive causal structures as causal inference.”

One could claim, therefore, that although inferences, in this liberal sense, occur in the brain during visual perception, they are not like the inferences used in thought. One might even go further than that and claim that these inferences, or rather state transformations, do not involve representational states at all [26.7]. Although the percept is certainly a representational state, the processes that lead to its formation are not representations. It follows that visual perception is not a cognitive process, if cognitive is taken to entail the use of mental representations; “a system is cognitive because it issues mental representations” [26.7].

In this chapter, I examine vision and its processes and discuss the relation of vision with thinking. I do not have the space here to discuss the problem of whether visual processes involve representations. I proceed by assuming that they do although, first, as I will argue, the state transformations do not presuppose the application of inference rules that are represented in the system, and, second, not all visual states are representational.

In Sect. 26.1, in view of the close relationship between thinking and inference, I chart and briefly discuss inference and its modes, namely, deduction, induction, and abduction or inference to the best explanation.

In Sect. 26.2, I sketch an overview of the main conceptions concerning vision, to wit constructivism, direct or ecological theory of vision, and the more recent proposals that view vision as inseparable from action.

In Sect. 26.3, I present the two stages of which visual perception consists, namely early vision and late vision.

In Sect. 26.4, I discuss the problem of the cognitive penetrability (CP) of perception, because if vision is akin to thinking, visual processes necessarily involve concepts and are thus cognitively penetrated. If it turns out that some stage of vision is cognitively impenetrable (CI) and conceptually encapsulated, the status of the logical characterization of the visual processes of that stage remains open since, being nonconceptual in nature, they cannot be discursive inferences. I am going to argue that a stage of vision, early vision, is CI and has nonconceptual content. This content is probably iconic, analogue-like and not symbolic. By not being symbolic, the contents of the states of early vision cannot be transformed to some other contents by means of discursive inferences in so far as the latter operate on symbolic forms. The second main visual stage, namely late vision, is CP and implicates concepts. I also address in this section two problems with my claim that early vision is conceptually encapsulated. The first is raised by the existence of some general regularities that seem to guide the functioning of the perceptual system, of which the Spelke principles are a subset, and which operate at all levels of visual processing. The problem is, first, whether the existence of such principles entails that at least some part of the information
processed in early vision is inherently conceptual, and, second, whether the existence of such principles entails that vision in general is theory-laden. The second concerns the effects of perceptual learning, since one might argue that through perceptual learning some concepts are embedded in the perceptual circuits of early vision. If either of these two is correct, the states of early vision have conceptual contents and thus the processes of early vision may involve discursive inferences, rendering early vision akin to thought and belief formation. I argue, however, that neither the principles nor the effects of perceptual learning entail that early vision has conceptual content.

In Sect. 26.5, I examine the logical status of the processes of early and late vision and argue that the processes of early vision are abductive nondiscursive inferences that do not involve any concepts, while the processes of late vision, despite the fact that they are abductive inferences guided by concepts, are not discursive inferences either. I argue that the abductive inferences involved in visual perception are not sentential inferences but, instead, rely on pattern-matching mechanisms that explore both iconic, analogue-like information and symbolic information. In this sense, visual abduction could be construed as consisting of a series of model-based inferences.
Part F | 26.2
might argue that all inductions are abductions or inferences to the best explanation [26.11]. Most authors, however, think that abduction is a subspecies of induction since it bears the basic marks of induction as it is ampliative and does not preserve truth. However, it is more specific than induction since it aims exclusively to pinpoint the cause or causes for some phenomena, that is, it aims to yield an explanation of a set of phenomena. Not all inductions are focused towards this aim. Often a good induction leads to a generalization that subsumes a set of phenomena under the heading of a generalization, which, however, does not explain the phenomena. Consider the following induction.
Bird $\alpha$ is a crow and is black $(Ca \mathbin{\&} Ba)$
Bird $\beta$ is a crow and is black $(Cb \mathbin{\&} Bb)$
$\vdots$
Bird $\kappa$ is a crow and is black $(Ck \mathbin{\&} Bk)$
Therefore (inductively)
All crows are probably black $((x)(Cx \rightarrow Bx))$
Under certain conditions this is a good induction in which from the colors of specific specimens of crows one infers the color of all crows. This is hardly a good explanation though. A good explanation seeks to explain, that is, make us or the scientific community understand why crows $\alpha$ and $\beta$ are black. The generalization All crows are probably black fails to accomplish this since all that it does is gather together all instances of black crows in a generalization. Moreover, a good explanation of a set of phenomena is expected to have a wider range than these specific phenomena in the sense that it can be used as a springboard to explain a wider class of phenomena. In our case, a good explanation of why crows $\alpha$ and $\beta$ are black should certainly involve genetics. Such an account not only would provide understanding of the correlation of crows with the color black, but it could also be used to explain the colors of other species. Now, it is widely agreed that the discovery of the relevant laws of genetics would fall within the purview of abduction. To put this point differently, all abductions are inductive inferences but not all inductions are abductions.
When I examine in Sects. 26.3 and 26.5 the visual processes in some detail, I shall adduce more evidence supporting the claim that visual processing is an abductive inference.
processes of vision that consist of the application of transformational rules that take as input representation $r_1$ at time $t_1$ and output representation $r_{t+1}$ at time $t_2$. These rules could be construed as abductive inferences since the brain is called upon to fill in the gaps in the information contained in the retinal image in order to construct a representation of the distal object that is the most likely candidate for being the object that could have produced the retinal image. It could be argued, hence, that the brain guesses which object is the best fit to explain the retinal image.
Since visual perception consists of a series of constructions of visual representations, vision is a constructive process. Let us call this construal of visual perception constructivism. According to one of the most influential visual scientists who espouse constructivism, Marr [26.12], there are three levels of representation. The initial level of representation involves Marr's primal sketch, which consists of the raw primal sketch and the full primal sketch. The raw primal sketch provides information about the edges and blobs present in a scene, their location and their orientation; this information is gathered by locating and coding individual intensity changes. Grouping procedures applied to the edge fragments formed in the raw primal sketch yield the full primal sketch, in which larger structures with boundaries and regions are recovered. Through the primal sketch, contours and textures in an image are captured. The primal sketch can be thought of as a description of the image of a scene but not as a description of the real scene. This latter involves the relative distances of the objects and their motions. This information is provided by the viewer-centered representation, which is Marr's 2½-D sketch. At this level information about the distance and layout of each surface is computed using various depth cues and by means of analysis of motion and of shading. This information describes only the parts of the object that are visible to the viewer and thus is relative to the viewer.
The computations leading to the formation of the 2½-D sketch are determined by three factors:
1. The input to the visual system, that is, the optical array
2. The physiological mechanisms involved in vision, and the computations they allow, and
3. Certain principles that restrict and guide the computation.
These principles are constraints that the system must satisfy in processing the input. These constraints are needed because perception is underdetermined by any particular retinal image; the same retinal image could lead to distinct perceptions. Thus, unless the observer makes some assumptions about the physical world that gives rise to the particular retinal image, perception is not feasible.
It is important at this juncture to stress that according to Marr, all the processes that lead to the formation of the 2½-D sketch are data-driven; they are driven solely by the input.
One of the aims of vision is the recognition of objects. This requires the matching of the shape of a structure with a particular object, a matching that requires an object-centered representation. This is Marr's three-dimensional (3-D) model. The recovery of the objects present in a scene cannot be purely data-driven, since what is regarded as an object depends on the subsequent usage of the information, and thus is task-dependent and cognitively penetrable. Most computational theories of vision [26.12, 13] hold that object recognition is based on part decomposition, which is the first stage in forming a structural description of an object. It is doubtful, however, whether this decomposition can be determined by general principles reflecting the structure of the world alone, since the process appears to depend upon knowledge of specific objects [26.14]. Object recognition, which is a top-down process and requires knowledge about specific objects, is accomplished by high-end vision. The construction of the percept, which is the end product of visual perception, therefore requires the synergy of both top-down and bottom-up transfer of information between the visual circuits and the cognitive centers of the brain. Object recognition requires matching the internal representation of an object stored in memory against the representation of an object generated from the image. In Marr's model of object recognition the 3-D model provides the representation extracted from the image that will be matched against the stored structural descriptions of objects (perceptual classification). (It should be emphasized that these object recognition units are not necessarily semantic, since we may recognize an object that we had seen before, even though we have no idea of its name, of what it does and how it functions, that is, even if we have no semantic and lexical information about it. Ref. [26.15] introduces a distinction between the perceptual classification and semantic classification and naming. These processes are independent of one another.) See Appendix 26.B for an overview of constructivism.
Marr's and Biederman's hypothesis that object recognition occurs through part decomposition is based on the conception of three-dimensional objects as arrangements of some set of primitive 3-D shapes. According to Marr, these primitive 3-D shapes are generalized cylinders (Fig. 26.1) that are defined in terms of major axes and radii of objects.
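The underdetermination of perception by the retinal image, noted above, can be made concrete with a minimal sketch. It is an illustration only and not part of Marr's account: the function name `visual_angle` and the sizes and distances are hypothetical, and the sole assumption is the standard geometric relation that an object of size s viewed frontally at distance d subtends an angle of 2·atan(s/2d).

```python
import math


def visual_angle(size: float, distance: float) -> float:
    """Angle (in radians) subtended at the eye by an object of the given
    size viewed frontally at the given distance (illustrative helper)."""
    return 2.0 * math.atan(size / (2.0 * distance))


# Two hypothetical candidates for the distal object:
# a small object nearby and an object twice as large twice as far away.
near_small = visual_angle(size=1.0, distance=10.0)
far_large = visual_angle(size=2.0, distance=20.0)

# Both candidates project to the same angular extent, so the
# angular size alone cannot decide between them.
same_projection = math.isclose(near_small, far_large)
print(same_projection)  # True
```

An assumption about the world, for instance a typical size for the kind of object in question, is what would allow a system to choose between the two candidates.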
Vision, Thinking, and Model-Based Inferences 26.2 Theories of Vision 579
are the so-called geons (Fig. 26.2). All objects can be decomposed into a set of 36 specific geons related in various ways. The properties that identify geons and allow them to function as volumetric perceptual primitives are viewpoint invariant, that is, they do not change as the angle of view changes. As such, they are called nonaccidental features since they are features not only of the image but also of the worldly objects (that is, they are properties that exist in the environment outside the viewer) that do not depend on what the viewpoint may be accidentally. Examples of nonaccidental properties are parallel lines and collinearity. If an object has parallel lines, many rotations of this object yield an image in which these lines are still nearly parallel; that is to say, parallelism is a property that is rotation- or perspective-invariant.
Fig. 26.1 Marr's generalized cylinders (figure labels: Human, Arm, Forearm, Hand)
Fig. 26.2 Biederman's geons (panels: Geons, Objects)
Let me close the account of constructivism by reminding the reader that the theories of visual perception presented in this part of the chapter are some among the many different theoretical accounts of visual processing. The differences between the various theories notwithstanding, all constructivist theories share a common core, namely that visual perception involves state transformations in the course of which visual representations of increasing complexity are being gradually constructed by the visual system. The visual processes start from the meager information contained in the retinal image, which consists of local distributions of light intensities and wavelengths. These transformations can also be construed as computations in which the brain computes an output state given an input state. Many of these transformations (but not all of them) act on and therefore essentially involve mental representations that are within the brain of the viewer, and can be independent of any other activities on the part of the viewer. The transformations are made possible through the application of transformational rules, such as, for example, the rule that abrupt changes in light intensity signify the presence of edges, which is used by the perceptual system to construct the raw primal sketch. Such a rule takes as input states that carry information about various light intensities distributed in space and delivers states that carry information about edges. It follows that the transformations taking place in visual processing are information-processing operations. (I said that not all of the transformations operate on representations because many of these transformations operate on states that are not representational. It would require another chapter to discuss the conditions under which a state is representational or not and, of course, much depends on how one defines the term representation. I confine myself to pointing out that many of the earlier visual states are probably not representational because they do not meet the criteria that an adequate definition of representation posits, and to referring the reader to the discussion in Chap. 4. As we shall see in Sect. 26.4, one could claim that there is a sharp distinction between internal probabilistic dependencies between states that can be explained by internal causal connections between the circuits of the brain and those that cannot; only those that cannot be explained internally carry information about the external world and thus involve representational states.)
The fact that the visual brain transforms states to other states through the usage of some rules means that the function of the brain can be understood as a series of inferences from some states/premises to some other states/conclusions. In view of our discussion in the beginning of this section, as well as in the previous one, the inferences most likely are abductive in nature.
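The transformational rule mentioned in this section, that abrupt changes in light intensity signify the presence of edges, can be illustrated with a minimal sketch. It is a toy, one-dimensional stand-in and not Marr's actual algorithm: the function name `detect_edges`, the threshold, and the intensity values are all hypothetical. The input is a state carrying information about light intensities distributed in space; the output is a state carrying information about edges.

```python
from typing import List


def detect_edges(intensities: List[float], threshold: float) -> List[int]:
    """Return each index i at which the intensity jump between
    sample i and sample i + 1 exceeds the threshold."""
    edges = []
    for i in range(len(intensities) - 1):
        if abs(intensities[i + 1] - intensities[i]) > threshold:
            edges.append(i)
    return edges


# Hypothetical row of light intensities: dark, then bright, then dark again.
row = [10, 11, 10, 12, 80, 82, 81, 20, 19, 21]

print(detect_edges(row, threshold=30))  # [3, 6]
```

Small fluctuations are ignored and only the two abrupt changes are reported. The choice of threshold plays the role of a constraint built into the rule: it encodes an assumption about which intensity changes are likely to be caused by edges rather than by noise.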
26.2.2 Theory of Direct Vision or Ecological
of objects that remain invariant under motion. Note that
that, in view of the fact that the information contained in the visual input is richer than previously thought, this need is attenuated. Therefore, visual perception involves some sort of inferences.
Gibson's theory was called the theory of direct perception because it relinquished the need for internal information processing; instead, the viewers retrieve all the information they need to detect the environment directly from the environment without any internal processing of any sort mediating the process of information retrieval. If, however, some information processing over internal representations is needed as well, as a moderate form of Gibson's theory asserts, can the qualification direct be salvaged?
There is a sense in which it might. Suppose that direct is construed so as to emphasize not the lack of information processing operating on internal representations, but the fact that the information processing is entirely data-driven, that is, guided by environmental input and some principles that reflect regularities in the environment, and the whole process is not influenced by other internal nonvisual states of the viewer, such as the viewer's cognitive or emotional states. If this supposition is borne out, then visual perception is direct in the sense that the whole process is data-driven and, as such, the information processing used operates over information retrieved exclusively from the environment. Note that this presupposes that the principles guiding visual processing do not constitute some form of intervention on the part of the viewer whose contribution exceeds what is given in the environment.
This assumption is borne out if visual perception, or at least some stage of it, is purely data-driven, that is, cognitively and emotionally impenetrable. If cognitive states penetrate and thus influence perceptual processing, the viewer's cognitive states actively contribute to the formation of the percept and the visual processing does not retrieve information directly from the environment but only through some cognitive intervention; visual perception, in this case, is not direct. Norman [26.19] has argued along this line that the processing along the dorsal visual pathway that guides our on-line interactions with the environment, owing to the fact that when it operates immediately on the visual input it is entirely data-driven, is a visual function that conforms very closely to Gibson's direct theory. The ventral visual pathway, in contradistinction, which is responsible for object recognition and categorization, is clearly affected by cognition and, in this sense, is not a direct visual function. Since both visual pathways are found in the brain, the constructivist and the ecological theories of perception can be reconciled.
Even though it seems abundantly clear that visual perception requires a significant amount of information processing, and in this sense one of Gibson's main insights is considered to be wrong, several of Gibson's insights have been incorporated in the constructivist information-processing research program. For example, most, if not all, information-processing theories hold that most of the ambiguities that occur during the information processing of the retinal input cannot be resolved by that input alone and need top-down assistance only when information comes from a static monocular image. When additional information can be derived from stereopsis and motion of real scenes, then the information-processing program can resolve the ambiguity without the need of a top-down flow of information. If one takes into account the real input to human vision, which is binocular and dynamic, there are few ambiguities that cannot be resolved through a full consideration of the products of the early visual processing modules [26.20]. This shows that the dynamic and interactive character of vision solves several problems encountered within the information-processing research program.
Our discussion about direct vision revealed an aspect of visual processing that traditional constructivist theories did not initially consider, namely the interaction of perception and action. The next kind of theory of perceptual processing that we will examine views visual perception as inextricably linked with action and uses the most recent neuropsychological evidence, vision science research, and computer modeling both to substantiate this claim and to draw the details of how the active visual brain works in order to provide a fully fledged unifying model of perception and action. Although this model aims to cover all modalities, for the purpose of this chapter I will restrict the presentation and discussion to visual perception.
26.2.3 Predictive Visual Brain: Vision and Action
The basic tenet of the theory of ecological or direct vision is that all the information viewers need to recover the visual scene that causes the retinal image is already included in the incoming information in the optic array. Little or no information processing is required for the construction of the percept. The constructivist theories of visual perception, in contradistinction, underline the necessity of information processing and state transformations in the brain. The flow of information in the brain is bidirectional; both top-down and bottom-up signals are transmitted and the ensuing percept is the result of the synergy between top-down and bottom-up processing. This class of theories assumes that the representation constructed at some level is transmitted bottom-up to the neuronal assembly at the next
immediate level where it is further processed. Moreover, recurrent signals return top-down to earlier levels mainly to test hypotheses concerning aspects of the visual scene (recall that visual perception aims to recover the visual scene that causes the retinal image and does that by constructing increasingly more complex representations of the probable aspects of the visual scene at various spatial and temporal scales) until the percept is constructed.
Recent empirical findings and modeling shed light on the way the brain actually effectuates these processes. These details, as we shall see, entail certain deviations from the traditional constructivist image, which concern (a) the sort of information transmitted bottom-up; only prediction errors are transmitted to the next level, (b) the nature of the representations constructed; they are distributions of probabilities rather than having a unique value (note that this new approach emphasizes the indispensable role of representations in visual processing), and (c) the interaction between perception and cognition. This last trait is very important and has important repercussions for our discussion on the relation between visual processing and thinking.
According to this view of visual perception, brains are predictive machines [26.4]:
“They are bundles of cells that support perception and action by constantly attempting to match incoming sensory inputs with top-down expectations or predictions. This is achieved using a hierarchical generative model that aims to minimize prediction error within a bidirectional cascade of cortical processing.”
A hierarchical generative model as applied to visual processing is a model of perceptual processes according to which the brain uses top-down flow of information (enabled by top-down neural connections) in an attempt to generate a visual (meaning, in the brain) representation of the visual scene (in the environment) that causes the light pattern impinging on the transducers and the low-level visual responses to this light pattern. The brain attempts to recover gradually the causal matrix (the various aspects of a visual scene) that causes, and thus is responsible for, the retinal image seen as a data-structure (i.e., the sensory data). The brain does that by capturing the statistical structure of the sensory data, that is, by discovering the deep regularities underlying the retinal structure, on the very plausible assumption that the deep structure underneath the sensory data reflects, so to speak, the causal structure of the visual scene.
Hierarchical generative models attempt to achieve this by constructing, at each level, hypotheses about the probable cause of the information represented in the immediately previous level, and testing these hypotheses by matching their predictions with the actual sensory data at the preceding processing level. Suppose, for example, that a neuronal assembly at level l receives from level l-1 information concerning differences in light intensities. The higher level attempts to recover the probable edges that cause the variation in light intensity and forms a hypothesis involving such edges. Now, and this is the crucial part, if this hypothesis were correct, that is, if the edges as represented in the hypothesis were present in the environment, then a certain pattern of variation of light intensities at the appropriate local scale would have been present in the sensory data. This prediction is transmitted top-down to level l-1 and matched against the actual pattern of variations in light intensities. If there is a match (with an acceptable degree of error deviation due to the inherent noise of the signal, of course) no further action is needed since the perceptual system assumes that it has constructed the correct, at this spatial scale, representation of the relevant environmental input. If the match reveals a discrepancy, that is, if an error in the prediction is detected, this prediction error is transmitted bottom-up to level l so that a new hypothesis can be formulated and tested until, eventually, no unacceptable prediction error persists. If one thinks of the discovered error as a surprise for the system, the system strives to correct its hypotheses so that by making correct predictions, the testing of the hypotheses yields no surprises; this is a typical error-driven learning process where a system learns, i.e., constructs a correct representation, by gradually reducing error. The hierarchical generative models hence generate, in essence, low-level states (the predictions they make about the activities at the lower levels) from high-level causes (the hypotheses that would, if correct, explain the activity at the lower levels).
Bidirectional hierarchical structure allows the system to [26.4]:
“infer its own priors (the prior beliefs essential to the predicting routines) as it goes along. It does this by using its best current model at one level as the source of the priors for the level below, engaging in a process of iterative estimation that allows priors and models to coevolve across multiple linked layers of processing so as to account for the sensory data.”
To form hypotheses concerning the probable cause of the sensory data at a certain level, at a specific spatial and temporal scale, the neuronal assembly at the next level, say level l, uses information not only about the sensory data at the previous level (or, to be precise, information regarding its prediction error) that is transmitted bottom-up, but also higher-level information that
is transmitted to l either laterally, that is, from neuronal assemblies at the same level (neurons in V1 processing wavelengths inform other neurons in V1 processing shape information, for example), or top-down from levels higher in the hierarchy (neurons in V4, for instance, are informed about the color of incoming information from neurons in the inferotemporal cortex in the brain (IT) as a result of precueing, that is, when a viewer has been informed about the color of an object that will appear on a screen). This higher-level information may and usually does concern general aspects of the world (such as solid objects do not penetrate each other, or solid objects do not occupy exactly the same space at the same time, etc.), and may also reflect knowledge about specific objects learned through experience. All this lateral and top-down flow of information provides the context in which each neuronal assembly constructs the most probable hypothesis that would explain the sensory data at the lower level. Thus, context-sensitivity is a fundamental and pervasive trait of the processing of hierarchical predictive coding; the contextualized information significantly affects, and on occasions (as in hallucinations) may override, the information carried by the input.
The hierarchical predictive processing model can be naturally extended to include action and thus closely ties perception with action [26.4]. This is the action-oriented predictive processing. Action-oriented predictive processing extends the standard hierarchical predictive model by suggesting that motor intentions actively elicit, via the motor actions they induce, the ongoing streams of sensory data at various levels that our brains predict. In other words, once a prediction is made about the state in the world that causes the transduced information, the action-oriented predictive processes engage in a search in the environment of the appropriate worldly state. Suppose, for example, that owing to bad illumination conditions, a perceiver is unsure about the identity of an object in view. The perceiver's brain makes a prediction about the putative object that causes the sensory data the perceiver receives, and the perceiver moves around the object in order to acquire a better view that will confirm the prediction. By moving around, the perceiver's expectations about the proprioceptive consequences of moving and acting directly cause the moving and acting since where and when the perceiver moves is guided by the aim that the perceiver's action brings the object into a better view.
It is worth pausing at this point to discuss briefly the problem of the nature of the relation between visual perception and action and, specifically, motion. Is this relation constitutional, which means that if someone cannot or does not move they cannot visually perceive anything? This claim was initially made by Noe although, in view of vehement criticism, Noe has attempted to modify it without compromising the main tenets of his views [26.21]:
“When you experience an object as cubical merely on the basis of its aspect, you do so because you bring to bear, in this experience, your sensorimotor knowledge of the relation between changes in cube aspects and movement. To experience a figure as a cube, on the basis of how it looks, is to understand how it looks changes as you move (emphasis added).”
The sensorimotor knowledge consists of the expectations of how our perception of an object changes as we move around it, or as this object moves with respect to us. These expectations constitute a form of practical knowledge, a knowing how as opposed to a knowing that. Thus, to be able to experience visually an object, one needs to have the ability to move around the object and explore it. Visually experiencing the object literally consists of grasping the relevant sensorimotor contingencies, that is, the sensorimotor knowledge associated with this specific object. There are two ways to read this claim. According to the first reading, which Noe seems to espouse judging from the previously cited passage, to be able to visually perceive requires the actual exercise of the ability to probe the world. According to the second reading, visually perceiving an object only requires the ability to probe the world but not the actual exercise of this ability. The first reading entails immediately that prior to exercising this ability, one does not visually perceive the object. Since this is absurd, one has to concede that viewers do not need to exercise actually the ability to probe the environment; it suffices that they take recourse to their experience with similar objects and retrieve the requisite sensorimotor contingencies from experience. Even if one takes this line, however, the problem remains that at the time of a first encounter with an object to be able to see its, say, shape, one should be able to probe the object either by moving around the object, or by having the object move around them. Thus, when stationary viewers perceive a stationary novel object, lacking any knowledge of sensorimotor contingencies, they do not see its shape or its other properties.
It follows that infants, upon opening their eyes for the first time and facing the world, lacking any sensorimotor knowledge and not probing the environment, do not see anything. This claim flies in the face of a wealth of empirical evidence, which shows that there is something fundamentally wrong with equating visual perception with understanding sensorimotor contingencies and deploying the relevant practical knowledge. This entails, in turn, that the relation between visual
perception and action, no matter how important it is, discover the most probable hypothesis that explains
is not a constitutional relation; one gets to see the world away a set of data, it is most likely a Bayesian inference.
even if both they and the world are stationary, although It is very plausible, therefore, that the computational
it goes without saying that their experience will be re- framework of hierarchical predictive processing real-
stricted compared to other viewers who can probe the izes a Bayesian inferential strategy (see Appendix 26.C
environment. They could not visually experience, for for an analysis of Bayes’ theorem). Indeed, recent work
example, Marr’s 3-D sketch because they lack knowl- on Bayesian causal networks [26.22] presents the brain
edge of the unseen surfaces of objects. as a Bayesian net operating at various space and time
This unity between perception and action emerges scales.
most clearly in the context of active inference, where What Bayes’ theorem, on which this strategy is
the agent moves its sensors in ways that amount to ac- based, ensures is that a hypothesis is eventually selected
tively seeking or generating the sensory consequences that makes the best prediction about the sensory data
that their brains expect. “Perception, cognition, and ac- minimizing thereby the prediction error and thus best
tion work closely together to minimize sensory predic- explains them away; that is a hypothesis that by hav-
tion errors by selectively sampling, and actively sculpt- ing the highest posterior probability provides the best
ing, the stimulus array” [26.4]. Their synergy moves fit for the sensory data. The construction of this hypoth-
a perceiver in ways that fulfill a set of expectations esis crucially and necessarily involves the context, as
that constantly change in space and time. Accordingly, it is clearly expressed in Bayes’ equation in the form
perceptual inference is necessary to induce prior expec- of the prior probability for the hypothesis $P(A)$, whose
tations about how the sensorium unfolds and action is value depends on the context. That is to say, it is the
engaged to resample the world to fulfill these expecta- context that provides the initial plausibility of a hypoth-
tions. esis before the hypothesis is tested.
Since the construction of the representations of the This enables Clark [26.4] to claim that in the
putative causes of the sensory inputs is made possible framework of predictive brains that use hierarchical
through a synergy of bottom-up processing transmitting generative processing perception becomes theory-laden
the prediction errors and top-down processing transmit- in the specific sense that what viewers perceive de-
ting for testing the hypotheses concerning the probable pends crucially on the set of priors (that is, the hy-
causes of the input and in so far as the processes con- potheses that guide the predictions about the matrix
structing these hypotheses are informed by high-level of the sensory data at the lower processing levels,
knowledge of the sort discussed above, visual percep- which the hypothesis projects) that the brain brings
tion unifies cognition and thinking with sensation; these to bear in its attempt to predict the current sensory
two become intertwined. This means that perception in- data. This remark brings us back to the main theme
extricably involves thinking. Notice that this account of this chapter, namely, the relation between perceiving
of visual perception necessarily involves representa- and thinking. If thinking necessarily implicates discur-
tions; it requires that each level retain a representation sive inferences and deploying concepts, as it usually
of the data represented at this level so that the top- does, Clark’s claims entail that perception employs
down transmitted predictions of the hypotheses formed from its onset concepts and draws discursive inferences.
at subsequent higher levels be matched against the in- To assess this dual claim, we must examine the pro-
formation represented at the lower level in order for cesses of vision to determine first whether concepts
the hypothesis to be tested. It also requires the repre- are used and if the answer is affirmative the extent
sentation of the putative causes of the sensory data at to which they are being used, and second, whether
the preceding level; these are called the representation- the inferences that are undoubtedly used in percep-
units, which operate alongside the error units (the units that, with respect to this last problem, nowhere in his
that compute the error signal, that is, the discrepancy that, with respect to this last problem, nowhere in his
between prediction and actual data) in a hierarchical account does Clark suggest that the inferences must
generative system. be discursive. In fact, the sources he refers to, espe-
Furthermore, testing hypotheses and altering them cially those concerning connectionist neural networks,
as a result of any prediction errors until the prediction suggest that the inferences on which perception re-
error is minimized and thus until the most probable lies may take another form and need not necessarily
cause of the sensory data has been discovered, is an involve propositionally structured premises and conclu-
inference. Being a probabilistic inference that aims to sions.
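For reference, the form of Bayes' theorem that the discussion above relies on can be written out as follows. This is only a notational sketch: the chapter itself names just the prior P(A), so the symbol B for the sensory data and the index i over rival hypotheses are assumptions introduced here for illustration.

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)},
\qquad
P(B) = \sum_{i} P(B \mid A_i)\, P(A_i)
```

On this reading, the context enters through the prior P(A), the fit between a hypothesis's predictions and the sensory data enters through the likelihood P(B | A), and the hypothesis that is eventually selected is the one with the highest posterior P(A | B).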
I said above that we must examine the processes of used in a certain way by the organism; the organism
vision with a view to determine whether and, depend- just perceives the affordance, that is, the opportunity
ing on the answer to this question, the extent to which, of action on this specific object. Affordances have two
cognition penetrates visual perception in the sense that important properties. First, they are determined by the
perceptual processing uses conceptual information that functional form of an object, that is, a combination of
is either transmitted top-down to perceptual circuits, or the object’s visible properties should suffice to deter-
is inherently embedded in visual circuits. In the litera- mine whether this object has an affordance relative to
ture, visual processing is divided into two main stages, some viewer. Affordances are based on certain invariant
to wit, early vision and late vision. characteristics of the environment. Second, the affor-
dance is always relative to the viewing organism; this is
26.3.1 Early Vision a consequence of the fact that affordances provide or-
ganisms with the opportunity to interact with objects in
Early vision is a term used to denote the part of their environment. This interaction depends on the ob-
perceptual processing that is preattentive, where at- jects’ properties but it also depends on the needs and the
tention means top-down, cognitively driven attention. constitution of the organism. A fly, for instance, affords
Lamme [26.23, 24] argues for two kinds of process- eating to a frog but not to a human.)
ing that take place in the brain, the feedforward sweep At this level there are nonattentional selective mech-
(FFS) and recurrent processes (RP). In the FFS, the sig- anisms that prevent many stimuli from reaching aware-
nal is transmitted only from the lower (hierarchical) or ness, even when attended to. Such stimuli are the high
peripheral (structural) levels of the brain to the higher temporal and spatial frequencies, physical wavelength
or more central ones. There is no feedback; no sig- (instead of color), crowded or masked stimuli and so
nal can be transmitted top-down as in RP. Feedforward forth. FFS results in some initial feature detection. Then
connections in conjunction with lateral modulation and this information is fed forward to the extrastriate areas.
recurrent feedback that occurs and is restricted within When it reaches area V4 recurrent processing occurs.
the early perceptual areas (local recurrent processing – Horizontal and recurrent processing allows interaction
LRP) extract high-level information that is sufficient to between the distributed information along the visual
lead to some initial categorization of the visual scene stream. At this stage, features start to bind and an initial
and selective behavioral responses. coherent perceptual interpretation of the scene is pro-
When a visual scene is being presented, the feedfor- vided. Initially, RP is limited to within visual areas; it
ward sweep reaches V1 in about 40 ms. Multiple stimuli is local. At this level one can be phenomenally aware
are all represented at this stage. The unconscious FFS of the content of perceptual states. At these interme-
extracts high-level information that could lead to cate- diate levels there is already some competition between
gorization, and results in some initial feature detection. multiple stimuli, especially between close-by stimuli.
LRP produces further binding and segregation. The The receptive fields that get larger and larger going up-
representations formed at this stage are restricted to in- stream in the visual cortex cannot process all stimuli in
formation about spatiotemporal and surface properties full and crowding phenomena occur. Attentional selec-
(color, texture, orientation, motion, and perhaps to the tion intervenes to resolve this competition. Signals from
affordances of objects), in addition to the representa- higher cognitive centers and output areas intervene to
tions of objects as bounded, solid entities that persist in modulate processing; this is global RP and signifies the
space and time. (Affordances is the term Gibson [26.16] inception of late vision.
used to refer to the functional properties of objects (an Lamme [26.23, 24] discusses the nature of informa-
object affords eating to an organism, grasping to an or- tion that has achieved local recurrent embedding. He
ganism, etc.). Clark [26.25] defines affordance as “the suggests that local RP may be the neural correlate of
possibilities for use, intervention and action which the binding or perceptual organization. However, it is not
physical world offers a given agent and are determined clear whether at this preattentional stage the binding
by the fit between the agent’s physical structure, capac- problem has been solved. The binding of some features,
ities and skills and the action-related properties of the such as an object’s color and shape, may require attention, while
environment itself”. Affordances are directly perceiv- other feature combinations are detected preattentively.
able by an organism in the sense that an object does So, before attention has been allocated, the percept con-
not have to be classified as a member of a certain cate- sists of only tentatively but uniquely bound features that
gory in order for the organism to draw the conclusion, form the proto-objects [26.26]. Lamme [26.24] argues
or use the relevant knowledge, that this object can be that Marr’s 2½-D surface representation of objects and
their surface properties are extracted during the local With respect to the first point, there is actually no
RP stage. Other research [26.27] suggests that spatial real discrepancy. Recall that lateral and local recur-
relations are extracted at this recurrent stage. In addi- rent processes play a fundamental role in the formation
tion motion and size are represented in cortical areas in of the hypotheses that are constructed in early vision.
which local RP take place. Moreover, as we shall see in the next section, all visual
It should be added that Marr thought of the 2½-D processes including those of early vision, are restricted
sketch as the final product of a cognitively unaffected by certain principles, or better constraints, that reflect
stage of visual processing, since, as we have seen, the general regularities about the world and its geometry.
formation of the 3-D sketch relies on semantic, concep- Now, one could say that these constraints constitute
tual knowledge. If, as is usually thought, cognitive ef- a body of knowledge that informs early vision process-
fects on perception are mediated by cognitively-driven ing and affects early vision from within and not in
top-down attention, Lamme’s proposal that early vision a top-down manner, since as we saw there are no cog-
is not affected by this sort of attention echoes Marr’s nitive top-down effects in early vision. This as we shall
view that early vision is not affected by cognition and see, however, is misleading because these constraints do
is thus CI, a view shared by Pylyshyn [26.28]. not constitute some form of knowledge that by affect-
Current research (see [26.4] for a discussion) sheds ing early vision renders it theory-laden, as Clark claims.
light on the nature of inferences involved in the hy- Finally, early vision is also affected by associations of
pothesis testing implicated in early vision. Specifically, object properties that reflect statistical regularities in the
the top-down and lateral effects within early vision aim environment and are stored in the early visual circuits
to test hypotheses concerning the putative distal causes through perceptual learning. I argue in the next sec-
of the sensory data encoded in the lower neuronal as- tion that these associations do not constitute a body of
semblies in the visual processing hierarchy. This testing knowledge that affects early vision rendering it theory-
assumes the form of matching predictions made on the laden. The lateral and local recurrent processes, the
basis of this hypothesis about the sensory information constraints, and the associations built in the early vi-
that the lower levels should encode assuming that the sual circuits constitute a rich context that contributes
hypothesis is correct, with the current, actual sensory significantly to the formation of the working hypothe-
information encoded at the lower levels. Eventually, the ses that early vision neuronal assemblies construct to
hypothesis that best matches the sensory data is se- explain the sensory data at the lower processing lev-
lected and the whole process of hypothesis selection els. This context, however, does not involve any body
can be construed as an abductive inference or infer- of knowledge that renders perception theory-laden, as
ence to the best explanation, which could very well theories are traditionally understood.
be carried through by Bayesian nets. One should note As far as the second point is concerned, there is in-
that this account of early vision shows that the standard deed a discrepancy between the account of early vision
constructivist theories of visual processing can be rec- and Clark’s views. Early vision, by being CI and con-
onciled and greatly benefit from the recent conceptions ceptually encapsulated does not involve thinking and is
of the brain as a generative, predictive machine. radically different from thinking. In fact, as I argue in
There seems to be, however, a crucial discrepancy Sect. 26.5, not even late vision that involves concepts
between the account of early vision presented here and and is affected by the viewers’ knowledge about the
Clark’s account of generative hierarchical predictive world is like thinking.
models. It concerns the role of context, or previously
acquired knowledge, in the formation of the work- 26.3.2 Late Vision
ing hypotheses and its direct consequence that because
of this trait, visual perception and discursive thinking The conceptually modulated stage of visual process-
are inseparable. If early vision is restricted to pro- ing is called late vision. Starting at 150–200 ms, sig-
cesses occurring within the visual cortex and excludes nals from higher executive centers including mnemonic
any cognitive influences, then first, previous knowledge circuits intervene and modulate perceptual process-
seems to play no role in the formation of the working ing in the visual cortex and this signals the onset
hypotheses, and second, early vision does not involve of global recurrent processing (GRP). In 50 ms low
any thinking since the latter requires the participation of spatial frequency (LSF) information reaches the IT
the cognitive centers of the brain. Moreover, the repre- and in 100 ms high spatial frequency (HSF) infor-
sentations in early vision are analogue-like, iconic and mation reaches the same area. (LSF signals precede
not symbolic and this entails that early vision cannot be HSF signals. LSF information is transmitted through
some sort of discursive thinking since the latter operates fast magnocellular pathways, while HSF information
on symbolic forms. is transmitted through slower parvocellular pathways.)
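The hypothesis-testing loop described above for early vision (predictions derived from each candidate hypothesis are matched against the sensory data actually encoded at the lower level, and the best-matching hypothesis is selected) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and not part of the chapter: the function names, the two toy hypotheses "edge" and "shadow", the numeric values, and the Gaussian noise model that turns squared prediction error into a likelihood.

```python
import math


def posterior_over_hypotheses(priors, predictions, observed, noise_sd=1.0):
    """Return the normalized posterior probability of each hypothesis.

    priors      -- dict: hypothesis name -> prior probability (the context)
    predictions -- dict: hypothesis name -> predicted lower-level values
    observed    -- list of the values actually encoded at the lower level
    noise_sd    -- assumed standard deviation of Gaussian sensory noise
    """
    unnormalized = {}
    for name, prior in priors.items():
        # Prediction error: squared mismatch between prediction and data.
        error = sum((p - o) ** 2 for p, o in zip(predictions[name], observed))
        # Assumed Gaussian likelihood: smaller error -> higher likelihood.
        likelihood = math.exp(-error / (2 * noise_sd ** 2))
        unnormalized[name] = prior * likelihood
    total = sum(unnormalized.values())
    return {name: value / total for name, value in unnormalized.items()}


def select_hypothesis(posterior):
    """Pick the hypothesis with the highest posterior probability."""
    return max(posterior, key=posterior.get)


# Hypothetical toy data: two rival hypotheses about the cause of the input.
predictions = {"edge": [1.0, 0.0, 1.0], "shadow": [0.5, 0.5, 0.5]}
observed = [0.9, 0.1, 1.0]

flat_priors = {"edge": 0.5, "shadow": 0.5}
posterior = posterior_over_hypotheses(flat_priors, predictions, observed)
best = select_hypothesis(posterior)

# A strongly biased context can override a better fit to the data.
biased_priors = {"edge": 0.01, "shadow": 0.99}
biased_best = select_hypothesis(
    posterior_over_hypotheses(biased_priors, predictions, observed)
)
```

With flat priors the hypothesis with the smaller prediction error wins; with the biased priors the selection flips, which is one way of picturing the claim that the context supplies the initial plausibility of a hypothesis. The sketch is silent on whether such priors count as knowledge in the sense at issue in this chapter.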
Within 130 ms, parietal areas in the dorsal system but pattern activation subsystems in the IT cortex where the
also areas in the ventral pathway (IT cortex) semanti- image is matched against representations stored there,
cally process the LSF information and determine the and the compressed image representation of the ob-
gist of the scene based on stored knowledge that gener- ject is thereby activated. This representation (which is
ates predictions about the most likely interpretation of a hypothesis regarding the identity, that is, class mem-
the input. This information reenters the extrastriate vi- bership of an object) provides imagery feedback to
sual areas and modulates (at about 150 ms) perceptual the visual buffer where it is matched against the input
processing facilitating the analysis of HSF, for example image to test the hypothesis against the fine pictorial
by specifying certain cues in the image that might facil- details registered in the retinotopical areas of the visual
itate target identification [26.29–31]. Determining the buffer. If the match is satisfactory, the category pattern
gist may speed up the FFS of HSF by allowing faster activation subsystem sends the relevant pattern code to
processing of the pertinent cues, using top-down con- associative or WM, where the object is tentatively iden-
nections to preset neurons coding these cues at various tified with the help of information arriving at the WM
levels of the visual pathway [26.32]. through the dorsal system (information about size, lo-
At about 150 ms, specific hypotheses regarding the cation, and orientation). Occasionally the match in the
identity of the object(s) in the scene are formed using pattern activation subsystems is enough to select the
HSF information in the visual brain and information appropriate representation in WM. On other occasions,
from visual working memory (WM). The hypothesis is the input to the ventral system does not match well a vi-
tested against the detailed iconic information stored in sual memory in the pattern activation subsystems. Then,
early visual circuits including V1. This testing requires a hypothesis is formed in WM. This hypothesis is tested
that top-down signals reenter the early visual areas with the help of other subsystems (including cognitive
of the brain, and mainly V1. Indeed, evidence shows ones) that access representations of such objects and
that V1 is reentered by signals from higher cognitive highlight their more distinctive feature. The informa-
centers mediated by the effects of object- or feature- tion gathered shifts attention to a location in the image
centered attention at 235 ms post-stimulus [26.33, 34]. where an informative characteristic or an object’s dis-
This leads to the recognition of the object(s) in the tinctive feature can be found, and the pattern code for
visual scene. This occurs, as signaled by the P3 event- it is sent to the pattern activation subsystem and to the
related-potentials (ERP) waveform, at about 300 ms in visual buffer where a second cycle of matching com-
the IT cortex, whose neurons contribute to the integra- mences.
tion of LSF and HSF information. (The P3 waveform Thus, the processes of late vision rely on recurrent
is elicited about 250–600 ms and is generated in many interactions with areas outside the visual stream. This
areas in the brain and is associated with cognitive pro- set of interactions is called global recurrent processing
cessing and the subjects’ reports. P3 may signify the (GRP). In GRP, standing knowledge, i. e., information
consolidation of the representation of the object(s) in stored in the synaptic weights is activated and modu-
working memory.) lates visual processing that up to that point was concep-
A detailed analysis of the form that the hypothesis tually encapsulated. During GRP the conceptualization
testing might take is provided by Kosslyn [26.35]. Note of perception starts and the states formed have partly
that one need not subscribe to some of the assumptions conceptual and eventually propositional contents. Thus,
presupposed by Kosslyn’s account, but these disagree- late vision involves a synergy of perceptual bottom-up
ments do not undermine the framework. Suppose that processing and top-down processing, where knowledge
one sees an object. A retinotopic image is formed in the from past experiences guides the formation of hypothe-
visual buffer, which is a set of visual areas in the occip- ses about the identity of objects. This is the stage where
ital lobe that is organized retinotopically. An attentional the 3-D sketch (that is, the representation of an object
window selects the input from a contiguous set of points as a volumetric structure independently of the viewer’s
for detailed processing. This is allowed by the spatial perspective) is formed. This recovery cannot be purely
organization of the visual buffer. The information in- data-driven since what is regarded as an object depends
cluded in the attention window is sent to the dorsal and on the subsequent usage of the information and thus
ventral system where different features of the image are depends on the knowledge about objects. Seeing 3-D
processed. The ventral system retrieves the features of sketches is an instance of amodal completion, i. e., the
the object, whereas the dorsal system retrieves infor- representation of object parts that are not visible from
mation about the location, orientation, and size of the the viewer’s standpoint. (Amodal completion is the per-
object. Eventually, the shape, the color, and the texture ception of the whole of an object or surface when only
of the object are registered in anterior portions of the parts of it affect the sensory receptors. An object will
ventral pathway. This information is transmitted to the be perceived as a complete volumetric structure even if
only part of it, namely, its facing surface, projects to the levels in the visual hierarchy. The whole process fits the
retina and thus is viewed by the viewer; it is perceived scheme of an abductive inference or inference to the
as possessing internal volume and hidden rear surfaces best explanation that could be carried out by means of
despite the fact that only some of its surfaces are ex- Bayesian networks.
posed to view. Whether this perception involves visual There is a marked difference between the abductive
awareness, in which case the brain completes the miss- inferences involved in early vision and those involved
ing features through mental imagery, or visual under- in late vision; the latter but not the former are informed
standing only, which means that the hidden features are by knowledge properly speaking, that is, by information
not present in the phenomenology of the visual scene that is articulated in thought and thus contains concepts.
but are thought of, is a matter of debate.) In amodal This might tempt one to think that late vision may be
completion, one does not have a perceptual impression akin to thought and thus that there is a stage of visual
of the object’s hidden features since the perceptual sys- processing that has the most crucial traits of thinking,
tem does not fill in the missing features as happens in i. e., it involves discursive inferences justifying thus in
modal perception; the hidden features are not perceptu- part Clark’s, Spelke’s and others’ belief to that effect.
ally occurrent (see Appendix 26.D for a discussion of Against this, I am going to argue in Sect. 26.5, that
modal and amodal perception or completion). late vision despite its being informed by conceptually
One readily notices that Kosslyn’s account of hy- articulated knowledge, differs in significant ways from
pothesis testing naturally fits the schema of hierar- thinking, the most important difference being that late
chical generative predictive models as discussed in vision does not engage in discursive inferences.
Clark [26.4]. The main themes of this schema are I have claimed that late vision constructs gradu-
present in Kosslyn’s account. These are: the generation ally a representation that best matches the visual scene
of hypotheses at a higher level of visual processing, the through a set of processes that test a series of hypothe-
crucial role of context or previously acquired knowl- ses by matching these hypotheses against stored iconic
edge in the formation of these hypotheses, and the information. In other words, the output of late vision,
testing of these hypotheses through their predictions a recognitional belief, is the result of an abductive in-
against the rich iconic information stored in the lower ference.
learned and stored in the early visual circuits to fa- tion of the species. They allow us to lock onto medium
cilitate the processing of familiar input. Since these size lumps of matter, by providing the discriminatory
associations could arguably be construed as involving capacities necessary for the individuation and tracking
concepts, a claim could be made that early vision is af- of objects in a bottom-up way; they allow perception to
fected by concepts. generate perceptual states that present worldly objects
In what follows, I examine these two objections and as cohesive, bounded, solid, and spatiotemporally con-
argue that both sorts of phenomena do not signify the tinuous entities.
CP and theory-ladenness of perception. This is so be- The constraints are not available to introspection,
cause, first, they do not entail that there are any concepts function outside the realm of consciousness, and can-
embedded in early vision, and second, because it is not be attributed as acts to the perceiver. One does not
doubtful whether they contain any representations. This believe implicitly or explicitly that an object moves in
is also important for the wider claim that visual percep- continuous paths, that it persists in time, or that it is
tion is like thinking, since thinking necessarily involves rigid, though one uses this information to parse and
inferences driven by representations of both premises index the object. These constraints are not perceptu-
and the rules of inferences. If it turns out, as I argue ally salient but one must be sensitive to them if one
here, that the transformation rules that visual percep- is to be described as perceiving their world. The con-
tion employs to process its states are not represented straints constitute the modus operandi of the perceptual
anywhere in the system, this would severely undermine system and not a set of rules used by the percep-
the claim that perceptual inferences are the same as the tual system as premises in perceptual inferences even
inferences used in belief formation. though the modus operandi of the visual system con-
sists of operations determined by laws describable in
26.4.1 The Operational Constraints terms of computation principles. They are reflected in
in Visual Processing the functioning of the perceptual system and can be
used only by it. They are available only for visual pro-
There is extensive evidence that there is an impor- cessing, whereas theoretical constraints are available
tant body of information that affects perception not in for a wide range of cognitive applications. These con-
a top-down manner but from within and this might straints cannot be overridden since they are not under
be construed as evidence for the CP of visual percep- the perceiver’s control; one cannot decide to substi-
tion from its inception. The perceptual system does not tute them with another body of constraints even if one
function independently of any kind of internal restric- knows that they lead to errors.
tions. Visual processing at every level is constrained Being hardwired, the constraints are not even con-
by certain principles or rather operational constraints tentful states of the perceptual system. A state is formed
that modulate information processing. Such constraints through the spreading of activation and its modification
are needed because distal objects are underdetermined as it passes through the synapses. The hardwired con-
by the retinal image, and because the percept itself is straints specify the processing, i. e., the transformation
underdetermined by the retinal image. Unless the pro- from one state to another, but they are not the result
cessing of information in the perceptual system is con- of this processing. They are computational principles
strained by some assumptions about the physical world, that describe transitions between states in the perceptual
perception is not feasible. Most computational accounts system. Although the states that are produced by means
hold that these constraints substantiate some reliable of these mathematical transformations have contents,
generalities of the natural physical world as it relates to there is no reason to suppose that the principles that
the physical constitution and the needs of the perceiving specify the mathematical transformation operations are
agents. There is evidence that the physiological visual states of the system or contents of states in the system.
mechanisms reflect these constraints. Their physical If they are not states of the visual system, the princi-
making is such that they implement these constraints, ples that express them linguistically cannot be contents
which are thus hardwired in perceptual systems (see of any kind. Even though the perceptual system uses
Appendix 26.E for a list of some of these constraints). the operational constraints to represent some entity in
These are Raftopoulos’ [26.36] operational con- the world and thus operates in accord with the princi-
straints and Burge’s [26.37] formation principles. The ples reflected in the constraints (since the constraints are
operational constraints reflect higher-order physical hardwired in the perceptual system, physiological con-
regularities that govern the behavior of worldly objects ditions instantiate the constraints), the perceiver does
and the geometry of the environment and which have not represent these principles or the constraints in any
been incorporated in the perceptual system through form. By the same token, these principles cannot be
causal interaction with the environment over the evolu- thought of as implicating concepts, since concepts are
representational. For this reason, perceptual operations in perceptual processing notwithstanding, these con-
should not be construed as inference rules, although straints do not justify Cavanagh’s characterization of
they are describable as such, and they do not consti- visual perception as visual cognition, if cognition is
tute either a body of knowledge or some theory about thought of as involving discursive inferences.
the world.
Recent work on Bayesian causal networks [26.4] 26.4.2 Perceptual Learning
draws a picture of the brain as a Bayesian net oper-
ating at various space and time scales, and suggests Evidence from studies showing early object classifica-
that there is a sharp distinction between internal prob- tion effects suggests that to the extent that object classi-
abilistic dependencies that can be explained by internal fication presupposes object knowledge, this knowledge
causal connections and those that cannot. Only those affects early vision in a top-down manner rendering it
that cannot be explained internally carry information theory-laden. Moreover, even if one could show that
about the external world. Applying this to the case these effects do not entail the CP of early vision, one
of the neuronal mechanisms that implement the oper- could argue that since perceptual learning affects the
ational constraints at work in visual processing, one way one sees the world, some experiences are learned
could say that these mechanisms perform transforma- and form memories that are stored in visual memory
tions that depend entirely on the internal probabilistic and affect perceptual processing from its inception. Our
dependencies in the system as they are determined by experiences shape the way we see the world.
the hardwired circuitry that realizes the internal causal Indeed, visual memories affect perception. Famil-
connections and thus there is nothing representational iarity with objects or scenes that is built through
about them. repeated exposure to objects or scenes (sometimes
These considerations allow us to address Ca- one presentation is enough), or even repetition mem-
vanagh’s [26.38] claim that the processes that lead to ory [26.40] facilitate search, affect figure from ground
the formation of a conscious percept constitute visual segmentation, speed up object identification and image
cognition in virtue of their use of inferences. The con- classification, etc. [26.41–43].
struction of a percept is “the task of visual cognition Familiarity can affect visual processing in different
and, in almost all cases, each construct is a choice ways. It may facilitate object identification and catego-
among an infinity of possibilities, chosen based on likeli- rization, which are processes that take time since their
hood, bias, or a whim, but chosen by rejecting other final stage occurs between 300–600 ms after stimu-
valid competitors” [26.38]. This process is an inference lus onset as is evidenced by the P3 responses in the
in that “it is not a guess. It is a rule-based extension brain, but their earlier stage starts about 150 ms after
from partial data to the most appropriate solution”; in stimulus onset [26.44–46]. One notices that familiarity
the terminology of this chapter, the selection process is intervenes during the latest stage of visual processing
an abduction. (300360 ms). These effects involve the higher cogni-
According to Cavanagh [26.38], for inference to tive levels of the brain at which semantic information
take place the visual system should not rely to purely and processing, both being required for object identifi-
bottom-up analyses of the image that use only retinal cation and categorization, occur [26.30]. In this sense,
information, such as sequences of filters that under- these sort of familiarity effects do not threaten the CI of
lies facial recognition, or the cooperative networks that early vision, which has ended about 120 ms after stim-
converge on the best descriptions of surfaces and con- ulus onset.
tours. Instead, the visual system should use some object Familiarity, including repetition memory, also af-
knowledge, which is nonretinal, context-dependent in- fects object classification (whether an image portrays an
formation. By object knowledge Cavanagh means any animal or a face), a process that occurs in short latencies
sort of nonretinal information that may be needed for (95100 ms and 8595 ms respectively) [26.47–49].
the filling in that leads to the construction of the percept. These early effects may pose a threat to the CI of early
This knowledge consists of rules that guide or constrain vision since they cannot be considered post-sensory.
visual processing in order to solve the underdetermina- The threat would materialize should the classification
tion problem that I mentioned above; they provide the processes either require semantic information to in-
rule-based extension from partial data that constitutes tervene or require the representations of objects in
an inference. These rules do not influence visual pro- working memory to be activated, since that would, too,
cessing in a top-down way, since they reside within the mean conceptual involvement.
visual system; they are “from the side” [26.39]. Researchers however unanimously agree that the
The discussion concerning the nature of the op- early classification effects in the brain result from the
erational constraints suggests that, their crucial role FFS and do not involve top-down semantic information,
Vision, Thinking, and Model-Based Inferences 26.5 Late Vision, Inferences, and Thinking 591
Part F | 26.5
nor do they require the activation of object memories. The brain areas involved are low-level visual areas (including the FEF – frontal eye fields) from V1 to no higher than V4 [26.48] or perhaps a bit more upstream to posterior IT [26.42] and lateral occipital complex (LOC) [26.49].
The early effects of familiarity may be explained by invoking contextual associations (target-context spatial relationships) that are stored in early sensory areas to form unconscious perceptual memories [26.50] which, when activated from incoming signals that bear the same or similar target-context spatial relationships, modify the FFS of neural activity resulting in the facilitating effects mentioned above. Thus, what is involved in the phenomenon are certain associations built in the early visual system that once activated speed up the feedforward processing. This is a case of rigging-up the early visual processing; it is not a case of top-down cognitive effects on early visual processing.
The early effects may also be explained by appealing to configurations of properties of objects or scenes. Currently, neurophysiological research [26.40, 49], psychological research [26.42], and computational modeling [26.51] suggest that what is stored in early visual areas are implicit associations representing fragments of objects and shapes, or edge complexes, as opposed to whole objects and shapes. One of the reasons that have led researchers to argue that it is object and shape fragments that are used in rapid classifications instead of whole objects and shapes is the following: If these associations reflecting some sort of object recognition can affect figure-ground segmentation, as we have reasons to believe [26.42], then, in view of the fact that figure-ground segmentation occurs very early (80–100 ms) [26.52], these associations must be stored in early visual areas (up to V4, LO and posterior IT) and cannot be the representations stored in, say, anterior IT. The earlier visual areas store object and shape fragments and not holistic figures and shapes [26.40, 51].
The associations that are built in, through learning, in early visual circuits reflect in essence the statistical distribution of properties in environmental scenes [26.32, 53]. The statistical differences in physical properties of different subsets of images are detected very early by the visual system before any top-down semantic involvement, as is evidenced by the elicitation of an early deflection in the differential between animal-target and nontarget ERPs at about 98 ms (in the occipital lobe) and 120 ms (in the frontal lobe). The low-level cues could be retrieved very early in the visual system from a scene by analyzing the energy distribution across a set of orientation and spatial frequency-tuned channels [26.54]. This suggests that the rapid image classification may rely on low-level or intermediate-level cues [26.51] that act diagnostically, that is, they allow the visual system to predict the gist of the scene and classify images very fast. These cues may be provided by coarse visual information, say by low-level spatial frequency information, and thus the visual system does not have to rely on high-level fully integrated object representations in order to be able to rapidly classify visual scenes.
It follows that the classification of an object that occurs very early during the fast FFS at about 85–100 ms is due to associations regarding shape and object fragments stored in early visual areas and does not reflect any top-down cognitive effects on, that is, the CP of, early vision. Thus, early object classification is not a sign of the theory-ladenness of early vision, since the knowledge about the world does not affect it in a top-down manner.
To recapitulate the results of our discussion in this section, I have argued that neither the operational constraints operating in visual perception nor perceptual learning entails that concepts affect early vision. Moreover, they do not entail that visual processing in general is theory-laden because of the role of these constraints, since they are not representational elements and any theory constitutively implicates representational elements. On the other hand, both the operational constraints and the effects of perceptual learning provide the context in which early vision constructs its hypotheses, and part of the context in which late vision operates, the other part being the viewer’s knowledge of the world, which, as I have said, affects late vision but not early vision.
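The suggestion reported above, that low-level cues can be retrieved by analyzing the energy distribution across orientation and spatial frequency-tuned channels, can be illustrated with a toy computation. The Python sketch below is only an illustration under simplifying assumptions and is not the method of [26.54]: it takes a small square grey-level image, computes a naive discrete Fourier transform, and sums the spectral power into a few orientation-by-frequency bins. The function names (power_spectrum, channel_energies, dominant_channel), the number of bins, and the test grating are hypothetical choices made for the example.

```python
import cmath
import math


def power_spectrum(image):
    """Naive 2-D DFT power of a square grey-level image (list of lists)."""
    n = len(image)
    # Precompute the n complex roots of unity used by the transform.
    roots = [cmath.exp(-2j * math.pi * k / n) for k in range(n)]
    power = [[0.0] * n for _ in range(n)]
    for u in range(n):          # vertical frequency index
        for v in range(n):      # horizontal frequency index
            total = 0j
            for y in range(n):
                for x in range(n):
                    total += image[y][x] * roots[(u * y + v * x) % n]
            power[u][v] = abs(total) ** 2
    return power


def channel_energies(image, n_orient=4, n_freq=2):
    """Sum spectral power into n_orient x n_freq channels (assumed binning)."""
    n = len(image)
    power = power_spectrum(image)
    energies = [[0.0] * n_freq for _ in range(n_orient)]
    for u in range(n):
        for v in range(n):
            if u == 0 and v == 0:
                continue  # skip the DC term (mean luminance)
            fy = u if u <= n // 2 else u - n   # signed vertical frequency
            fx = v if v <= n // 2 else v - n   # signed horizontal frequency
            radius = math.hypot(fx, fy)
            angle = math.atan2(fy, fx) % math.pi   # orientation in [0, pi)
            o = int(angle / math.pi * n_orient + 0.5) % n_orient
            f = min(int(radius / (n / 2) * n_freq), n_freq - 1)
            energies[o][f] += power[u][v]
    return energies


def dominant_channel(energies):
    """Return the (orientation_bin, frequency_bin) holding the most energy."""
    return max(
        ((o, f) for o in range(len(energies)) for f in range(len(energies[0]))),
        key=lambda of: energies[of[0]][of[1]],
    )


if __name__ == "__main__":
    n = 16
    # Vertical stripes: luminance varies only along x (2 cycles per image).
    stripes = [[math.cos(2 * math.pi * 2 * x / n) for x in range(n)] for _ in range(n)]
    print(dominant_channel(channel_energies(stripes)))
```

The point of the sketch is only that a coarse energy profile of this kind is computed directly from the image by fixed arithmetic operations; nothing in it consults stored object representations, which is the sense in which such cues are said to be available before any top-down semantic involvement.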
is a result of an inference. These views belong to the belief-based account of amodal completion: the 3-D sketch is the result of beliefs abductively inferred from the object’s visible features and other background information from past experiences (see Appendix 26.D for an explanation of amodal and modal completion or perception).
The problem is whether the object identification that occurs in late vision (which, as we have seen, most likely constitutes in essence an abductive inference) and depends on concepts should be thought of as a purely visual process or as a case of discursive understanding involving discursive inferences. If late vision involves conceptual contents and if the role of concepts and stored knowledge consists in providing some initial interpretation of the visual scene and in forming hypotheses about the identity of objects that are tested against perceptual information, one is tempted to say that this stage relies on inferences and thus differs in essence from the purely perceptual processes of early vision. Perhaps it would be better to construe late vision as a discursive stage involving thoughts, in the way of epistemic seeing, where “seeing” is used in a metaphorical nonperceptual sense, as when one says of a friend whom one visited “I see he has left,” based on perceptual evidence [26.56]. It is also possible that Dretske [26.57, 58] thinks that seeing in the doxastic sense is not a visual but rather a discursive stage.
One might object, first, that abandoning this usage of “to see” violates ordinary usage. A fundamental ingredient of visual experience consists of meaningful 3-D solid objects. Adopting this proposal would mean that one should resist talking of seeing tigers and start talking about seeing viewer-centered visible surfaces. “By this criterion, much of the information we normally take to be visually conscious would not be, including the 3-D shape of objects as well as their categorical identity” [26.59].
More to the point, I think that one should not assume either that late vision involves abductive inferences construed as inferential discursive-state transformations that constitutively involve thoughts in the capacity of premises in inferences whose conclusion is a recognitional belief, or that late vision consists of discursively entertaining thoughts; if thinking is construed as constitutively implicating discursive argumentation, visual perception is different from thinking in some radical ways. The reason is twofold. First, seeing an object is not the result of a discursive inference, that is, a movement in thought from some premises to a conclusion, even though it involves concepts and intrastate transformations. Second, late vision is a stage in which conceptual modulation and perceptual processes form an inextricable link that differentiates late vision from discursive stages and renders it a different sort of set of processes from understanding, even though late vision involves implicit beliefs regarding objects that guide the formation of hypotheses concerning object identity, and an explicit belief of the form that O is F eventually arises in the final stages of late vision. Late vision has an irreducible visual ingredient that makes it different from discursive understanding.
Let me clarify two terminological issues. First, judgments are occurrent states, whereas beliefs are dispositional states. To judge that O is F is to predicate F-ness of O while endorsing the predication [26.60]. To believe that O is F is to be disposed to judge under the right circumstances that O is F. This is one sense in which beliefs are dispositional items. There is also a distinction between standing knowledge (information stored in long-term memory, LTM) and information that is activated in working memory (WM). The belief that O is F may be a piece of standing information in LTM, a memory about O, even though presently one does not have an occurrent thought about O. Beliefs need not be consciously or unconsciously apprehended, that is, activated in the mind, in order to be possessed by a subject, which means that beliefs are dispositional rather than occurrent items; this is a second sense in which beliefs are dispositional. When this information is activated, the thought that O is F emerges in WM; all thoughts are occurrent states.
It follows that a belief qua dispositional state may be either a piece of standing knowledge, in which case it is dispositional in the sense that when activated it becomes a thought, or a thought that awaits endorsement to become a judgment, in which case the belief is dispositional in the sense that it has the capacity to become a judgment. In the first case, beliefs differ from thoughts. In the second case, a belief is a thought held in WM, albeit one that has not yet been endorsed. In what follows, I assume that beliefs are either pieces of standing information or thoughts that have not been endorsed and thus are not judgments. Finally, by implicit belief I mean the belief held by a person who is not aware that she is having that belief.
As I said in the introduction, this chapter examines whether the abductive processes that take place in late vision should be construed as discursive inferences. Specifically, my claim is that the processes in late vision are not inferential processes where inference is understood as discursive, that is, as a process that involves drawing propositions or conclusions from other propositions that are represented in the system and act as premises, by applying (explicitly or implicitly) inferential rules that are also represented. As we saw, these inferences are distinguished from inferences as understood by vision scientists according to whom
any transformation of signals carrying information according to some rule is an inference.
26.5.1 Late Vision, Hypothesis Testing, and Inference
I think that the states of late vision are not inferences from premises that include the contents of early vision states, even though it is usual to find claims that one infers that a tiger, for example, is present from the perceptual information retrieved from a visual scene. An inference relates some propositions in the form of premises with some other proposition, the conclusion. However, the objects and properties as they are represented in early vision do not constitute contents in the form of propositions, since they are part of the nonpropositional, iconic nonconceptual content of perception. In late vision, the perceptual content is conceptualized, but the conceptualization is not a kind of inference but rather the application of stored concepts to some input that enters the cognitive centers of the brain and activates concepts by matching their content. Thus, even though the states in late vision are formed through the synergy of bottom-up visual information and top-down conceptual influences, they are not inferences from perceptual content.
Late vision involves hypotheses regarding the identity of objects and their testing against the sensory information stored in iconic memory. One might think that inferences are involved since testing hypotheses is an inferential process even though it is not an inference from perceptual content to a recognitional thought. It is, rather, an argument of the form: if A and B, then (conclusion) C, where A and B are background assumptions and the hypothesis regarding the identity of an object respectively, and C is the set of visual features that the object is likely to have. A consists of implicit beliefs about the features of the hypothesized visual object. If the predicted visual features of C match those that are stored in iconic memory in the visual areas, then the hypothesis about the identity of the object is likely correct. The process ends when the best possible fit is achieved. However, the test basis or evidence against which these hypotheses are tested for a match, that is, the iconic information stored in the sensory visual areas, is not a set of propositions but patterns of neuronal activations whose content is nonpropositional.
There is nothing inference-like in this matching. It is just a comparison between the activations of neuronal assemblies that encode the visual features in the scene and the activations of the neuronal assemblies that are activated top-down from the hypotheses. If the same assemblies are activated then there is a match. If they are not, the hypothesis fails to pass the test. This can be done through purely associational processes of the sort employed, say, in connectionist networks that process information according to rules and thus can be thought of as instantiating processing rules, without either representing these rules or operating on language-like symbolic representations. Such networks perform vector completion and function by satisfying soft constraints in order to produce the best output given the input into the system and the task at hand. Note that the algebraic and thus continuous nature of state transformations in neural networks, as opposed to the algorithmic discrete-like operations of classical AI (which assumes that the brain is a syntactic machine that processes discrete symbols according to rules that are also represented in the system), best suits the analogue nature of iconic representations.
In perceptual systems construed as neural networks, the fundamental representational unit is not some linguistic or linguistic-like entity but the activation pattern across a proprietary population of neurons. If one wishes to understand the workings of the visual brain, one should eschew sentences and propositions as bearers of representations and meanings and reconceptualize representations as activation patterns. This does not mean, of course, that the brain does not have symbolic representations but only that, first, these are a subset of the representations that the brain uses in its various functions, and, second and most importantly, the symbolic representations are constructed somehow out of the more fundamental context-dependent representations that the brain uses and are, consequently, a later construct, phylogenetically speaking. This has an important corollary for any theory of cognition that employs activation patterns as the fundamental units of representation, namely, that it must be able to explain the existence and usage of symbolic representations. This means also that the processing at work in the brain, that is, the transformation of the representational units to other representational units, is not exclusively the transformation of complex or simple symbols by means of a set of syntactic rules as in the algorithms that, according to the classical view, the brain is supposed to run. Instead, it can be the algebraic transformation of activation patterns (in essence the algebraic transformations from one multidimensional matrix or tensor to another). The transformation is effected by the synaptic connections among the neurons as the signal passes from one layer to another. These connections have weights that constitute a filter through which the signal is transformed as it passes through.
The above also explains the holistic nature of the abductive visual processes that classical cognitive theories (the family of theories that assume that the brain is a syntactic machine that processes symbols that are
594 Part F Model-Based Reasoning in Cognitive Science
constant, context independent, and freely repeatable elements) have failed to capture. It is interesting that if I am right, Fodor’s attempt to differentiate the perceptual systems from cognitive functions in order to protect the former from the abductive holistic reasoning implicated in the latter fails since late vision is abductive and holistic as well.
Since discursive inferences are carried out through rules that are represented in the system and operate on symbolic structures, the processing in a connectionist network does not involve discursive inferences, although it can be described in terms of inference making. Thus, even though seeing an object in late vision involves the application of concepts that unify the appearances of the object and of its features under some category, it is not an inferential process.
I have said that the noninferential process that results in the formation of a recognitional thought or belief can be recast in the form of an argument from some premise to a conclusion. However, this does not entail that the formation of the perceptual thought is a piece of reasoning, that is, a transition from a set of premises that act as a reason for holding the thought to the thought itself. Admittedly, the perceiver can be asked on what grounds she holds the thought that O is F, in which case she may reply “because I saw it” or “I saw that O is F.” However, this does not mean that the reason she cites as a justification of her thought is a premise from which she inferred the thought. The perceiver does not argue from her thought “I saw it to be thus and so” to the thought “It is thus and so.” She just forms the thought on the basis of the evidence included in her relevant perceptual state in the noninferential way I described above. What warrants the recognitional thought O is F is not the thought held by the perceiver that she sees O to be F but the perceptual state that presents to her the world as being such and such. “When one knows something to be so by virtue of seeing to be so, one’s warrant for believing it to be so is that one sees it to be so, not one’s believing that one sees it to be so” [26.57].
Spelke [26.3], who echoes Rock’s [26.2] views that the perceptual system combines inferential information to form the percept (for example, from visual angle and distance information, one infers and perceives size), argues that “perceiving objects may be more akin to thinking about the physical world than to sensing the immediate environment”. The reason is that the perceptual system, to solve the underdetermination problem of both the distal object from the retinal image and of the percept from the retinal image, employs a set of object principles that reflect the geometry and the physics of our environment. Since the contents of these principles consist of concepts, and thus the principles can be thought of as some form of knowledge about the world, perception engages in discursive, inferential processes. Against this, I argued above that the processes that constrain the operations of the visual system should not be construed as discursive inferences. They are hardwired in the perceptual circuits and are not represented in them. Thus, perceptual operations should not be construed as inference rules, although they are describable in terms of discursive inferential rules. It follows that the abduction that takes place in late vision is not an Aristotelian inference; it is better described by the ampliative vector completion of connectionism.
26.5.2 Late Vision and Discursive Understanding
Even if I am right that seeing in late vision is not the result of a discursive abductive inference but the result of a pattern-matching process that ensures the best fit with the available data, it is still arguable that late vision should be better construed as a stage of discursive understanding rather than as a visual stage. If object recognition involves forming a belief about class membership, even if the belief is not the result of an inference, why not say that recognizing an object is an experience-based belief that is a case of understanding rather than vision?
Late Vision Is more than Object Recognition
A first problem with this view is that late vision involves more than a recognitional belief. Suppose that S sees an animal and recognizes it as a tiger. In the parallel preattentive early vision, the proto-object that corresponds to the tiger is being represented amongst the other objects in the scene. After the proto-objects have been parsed, the object recognition system forms hypotheses regarding their identity. However, for the subject’s confidence to reach the threshold that will allow her to form beliefs about the identity of the objects and report it, these hypotheses must be tested [26.61].
For this to happen, the relevant sensory activations enter the parietal and temporal lobes, and the prefrontal cortex, where the neuronal assemblies encoding the information about the objects in the scene are activated and the relevant hypotheses are formed. To test these hypotheses, the visual system allocates resources to features and regions that would confirm or disconfirm the hypotheses. To accomplish this, activation spreads through top-down signals from the cognitive centers to the visual areas of the brain where the visual sensory memory and the fragile visual memory store the proto-objects extracted from the visual scene. This way, conceptual information about the tiger affects visual processing and after some hypothesis testing the animal is recognized as a tiger through the synergy of
visual circuits and WM. At this point the explicit belief O is F is formed. This occurs after 300 ms, when the viewer consolidates the object in WM and identifies it with enough confidence to report it, which means that beliefs are formed at the final phases of late vision.
However, semantic modulation of visual processing and the process of conceptualization that eventually leads to object recognition start at about 130–200 ms. There is thus a time gap between the onset of conceptualization and the recognition of an object, which is a prerequisite for the formation of an explicit recognitional belief. As Treisman and Kanwisher [26.62] observe, although the formation of hypotheses regarding the categorization of objects can occur within 130–200 ms after stimulus onset, it takes another 100 ms for subsequent processes to bring this information into awareness so that the perceiver could be aware of the presence of an object. To form the recognitional belief that O is F, one must be aware of the presence of an object token and first construct a coherent representation. This requires the enhancement, through attentional modulation, of the visual responses in early visual circuits that encode rich sensory information in order to integrate them into a coherent representation, which is why beliefs are delayed in time compared with the onset of conceptualization; not all of late vision involves explicit beliefs.
Late Vision as a Synergy of Bottom-Up and Top-Down Information Processing
A second reason why the beliefs formed in late vision are partly visual constructs and not pure thoughts is that the late stage of late vision in which explicit beliefs concerning object identity are formed constitutively involves visual circuits (that is, brain areas from LGN to IT in the ventral system). Pure thought, on the other hand, involves an amodal form of representation formed in higher centers of the brain, even though these amodal representations can trigger in a top-down manner the formation of mental images and can be triggered by sensory stimulation. The point is that amodal representations can be activated without a concomitant activation of the visual cortex. The representations in late vision, in contrast, are modal since they constitutively involve visual areas. Thus, what distinguishes late vision beliefs from pure thoughts is mostly the fact that the beliefs in late vision are formed through a synergy of bottom-up and top-down activation and their maintenance requires the active participation of the visual circuits. Pure thoughts can be activated and maintained in the absence of activation in visual circuits.
The constitutive reliance of late vision on the visual circuits suggests that late vision relies on the presence of the object of perception; it cannot cease to function as a perceptual demonstrative that refers to the object of perception, as this has been individuated through the processes of early vision. As such, late vision is constitutively context dependent since the demonstration of the perceptual particular is always context dependent. Thought, on the other hand, by its use of context-independent symbols, is free of the particular perceptual context. Even though recognitional beliefs in late vision and pure perceptual beliefs involve concepts, the concepts function differently in the two contexts [26.37]:
“Perceptual belief makes use of the singular and attributive elements in perception. In perceptual belief, pure attribution is separated from, and supplements, attributive guidance of contextually purported reference to particulars. Correct conceptualization of a perceptual attributive involves taking over the perceptual attributive’s range of applicability and making use of its (perceptual) mode of presentation.”
The singular and attributive elements in perception correspond to the perceived objects and their properties respectively. The attributive elements or properties guide the contextual reference to particulars or objects since the referent in a demonstrative perceptual reference is fixed through the properties of the referent as these properties are presented in perception.
Concepts enter the game in their capacity as pure attributions that make use of the perceptual mode of presentation. Burge’s claim that in perceptual beliefs pure attributions supplement attributions that are used for contextual reference to particulars may be read to mean that perceptual beliefs are hybrid states involving both visual elements (the contextual attributions used for determining reference to objects and their properties) and conceptualizations of these perceptual attributives in the form of pure attributions. In this case, the role of perceptual attributives is ineliminable. In late vision, unlike in pure beliefs, there can be no case of pure attribution, that is, of attribution of features in the absence of perceptually relevant particulars since the attributions are used to single out these particulars.
The inextricable link between thought and perception in late vision explains the essentially contextual [26.63, 64] character of beliefs in late vision. The proposition expressed by the belief cannot be detached from the perceptual context in which it is believed and cannot be reduced to another belief in which some third-person or objective content is substituted for the indexicals that figure in the thought (in the way one can substitute via Kaplan’s characters the indexical terms with their referents and get the objective truth-evaluable content of the belief); the belief is tied to an idiosyncratic
viewpoint by making use of the viewer’s physical presence and occupation of a certain location in space and time; the context in which the indexical thought is believed is essential to the information conveyed.
The discussion on late vision and the inferences it uses to construct the percept suggests that late vision, its conceptual nature notwithstanding, does not involve discursive inferences and in this sense is fundamentally different from thinking, if the latter is thought to constitutively implicate discursive inferences. Late vision employs abductive inferences, in that it constructs the representation that best fits the sensory image, but these inferences are not the result of the application of rules that are represented in the system. Even the operational constraints that restrict visual processing in late vision, and could be thought of as transformation rules that the system follows to make inferences, are not, as we have seen, propositional structures or even representations in the brain. The inferences involved are informed and guided by conceptual information in pattern-matching processes but fall short of being discursive inferences.
The fact that both conceptual and nonconceptual representations are in essence activation patterns allows us to understand how conceptual, symbolic information and nonconceptual iconic information could interact. The main difference between the two forms of representation is that the former are not homogeneous and have a syntactic structure that has a canonical decomposition, whereas the latter are homogeneous and lack a canonical decomposition. To appreciate the difference, think of it in the following way: the fact that a symbolic representation has a canonical decomposition means that not every subpart of the representation is a representation; only those subparts that satisfy the syntactic rules of the representational system are symbols or representations. The expression (p&q), for instance, is a symbol or a representation, but the expression (p(&q) is not. Any subpart of an image, on the other hand, is an image and thus a representation.
The output of late vision, namely the percept, enters the space of reasons and participates in discursive inferences and thus in thought.
To recapitulate, the main conclusion of this chap- Table 26.1 Visual perception and thinking
Part F | 26.A
ter is, first, that to the extent that thinking is associated Thinking
with the use of discursive inferences, perception dif- Perception Thinking narrow Thinking wide
fers radically from thinking. If the meaning of thinking Early vision No Yes/no concepts
is extended to comprise nondiscursive inferences, the Late vision No Yes/yes concepts
claim may be made that perception is thinking. In
this case, however, a distinction should be drawn be-
tween discursive thinking that characterizes cognition if thinking is conceived as necessarily implicating dis-
and nondiscursive thinking that characterizes percep- cursive inferences late vision is not akin to thinking,
tual processes. Second, if thinking also necessarily in- notwithstanding the conceptual involvement. If the con-
volves the deployment of concepts, then there is a stage cept of thinking is extended to include other sorts of
of visual processing, namely, early vision, which is not inferences, such as the model-based abductive infer-
akin to thinking since its contents are nonconceptual. ences discussed in this chapter, late vision could be
The other stage of visual processing, namely late vision, thought of as a sort of thinking, which, unlike early
uses conceptual information. Since, as I will argue, the vision, implicates concepts (see Table 26.1 for a tax-
processes of late vision are not discursive inferences, onomy).
preserve truth and is thus probabilistic in that the conclusion is tentative.

26.A.4 Differences Between the Modes of Inference

Induction versus Deduction
Induction is an ampliative inference, whereas deduction is not ampliative. This means that the information conveyed by the conclusion of an inductive argument goes beyond the information conveyed by the premises and, in this sense, the conclusion is not implicitly contained in the premises. In deduction, the conclusion is implicitly contained in the premises and the inference just makes it explicit. If all men are mortal and Socrates is a man, for example, the fact that Socrates is mortal is implicitly contained in these two propositions. What the deduction does is to render it explicit in the form of the conclusion. When we deduce that Socrates is mortal, our knowledge does not extend beyond that which we already knew; the deduction only makes it explicit. When, on the other hand, we inductively infer that all crows are probably black from the premise that all the specimens of crows that we have examined thus far are black, we extend the scope of our knowledge because the conclusion concerns all crows and not just the crows thus far examined.

The above discussion entails the main difference between deductive and inductive arguments. Deductive arguments are monotonic, while inductive arguments are not. This means that a valid deductive argument remains valid no matter how many premises we add to the argument. The reason is that the validity of the deductive argument presupposes that the conclusion is a logical consequence of its premises. This fact does not change by the addition of new premises, no matter what these premises stipulate, and thus the deductive argument remains valid. Things are radically different in induction. A new premise may change the conclusion even if the previous premises strongly supported the conclusion. For example, if we discover that crow C1 is white, this undermines the previously drawn and well-supported conclusion that all crows are black.

Both abduction and induction are tentative forms of inference in that they do not warrant the truth of their conclusion even if the premises are true. They are also both ampliative in that the conclusion introduces information that was not contained implicitly in the premises. As we have seen, in abduction one aims to explain or account for a set of data. Induction is a more general form of inference. When, for instance, one successfully tests a hypothesis by making predictions that are borne out, the predicted data provide inductive, but not abductive, support for the hypothesis. In general, the evaluation phase in hypothesis, or theory, construction is considered to be inductive. Conceiving the explanatory hypothesis, on the other hand, is an abductive process that may assume the form of a pure, educated guess that need not have involved any previous testing. In this case, the abductively produced hypothesis is not, a priori, the best explanation for the set of data that need explanation; this is one of the occasions in which abduction can be distinguished from the inference to the best explanation. However, it should be stressed, although I do not have the space to elaborate on this problem, that in realistic scientific practice abduction as theory construction could not be separated from the evaluative inductive phase, since the two are inextricably linked. This justifies the claim that abduction is an inference to the best explanation.

A further difference between abduction and induction is that even though both kinds of inference are ampliative, in abduction the conclusion may, and usually does, contain terms that do not figure in the premises. Almost all theoretical entities in science were conceived as a result of abduction. The nucleus of an atom, for example, was posited as a way of explaining the scattering of particles after the bombardment of atoms. Nowhere in the premises of the abductive argument was the notion of a nucleus present; the evidence consisted in measurements of the deviation of the pathways of particles from their predicted values after the bombardment. The conclusion all crows are probably black, on the other hand, contains only terms that are available in the premises.
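The contrast between monotonic deduction and nonmonotonic induction described above can be sketched in a few lines. This is only an illustration under simplifying assumptions: deductive validity is checked by brute-force truth tables over a handful of propositional atoms, and induction is reduced to enumerative generalization over observed crow colours. The names entails and all_crows_black are hypothetical.

```python
from itertools import product


def entails(premises, conclusion, atoms):
    """Classical validity by truth table: no valuation makes every
    premise true and the conclusion false."""
    for values in product([False, True], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if all(p(v) for p in premises) and not conclusion(v):
            return False
    return True


atoms = ["man", "mortal", "greek"]
premises = [
    lambda v: (not v["man"]) or v["mortal"],  # if Socrates is a man, he is mortal
    lambda v: v["man"],                       # Socrates is a man
]
conclusion = lambda v: v["mortal"]            # Socrates is mortal

valid_before = entails(premises, conclusion, atoms)
# Adding any further premise cannot invalidate a valid deductive argument.
valid_after = entails(premises + [lambda v: v["greek"]], conclusion, atoms)


def all_crows_black(observed_colours):
    """Tentative enumerative generalization from the crows examined so far."""
    return len(observed_colours) > 0 and all(c == "black" for c in observed_colours)


observations = ["black"] * 50
supported_before = all_crows_black(observations)
# One new premise (a white crow) overturns the inductive conclusion.
supported_after = all_crows_black(observations + ["white"])
```

In the deductive case the extra premise leaves validity untouched, while in the inductive case the single white crow withdraws the support that fifty black crows had provided.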
remain useful. According to this form of constructivism, vision consists of four stages, each of which outputs a different kind of visual representation:

1. The formation of the retinal image; the immediate stimulus for vision, that is, the first stimulus that directly affects the sensory organs (this is called the proximal stimulus), is the pair of two-dimensional (2-D) images projected from the environment to the eyes. This representation is based on a 2-D retinal organization. At this stage, the information impinging on the retina (which, as you may recall, concerns intensity of illumination and wavelengths, and which is captured by the retinal receptors) is organized so that all of the information about the spatial distribution of light (i.e., the light intensity falling on each retinal receptor) is recast in a reference frame that consists of square image elements (pixels), each indicating with a numerical value the light intensity falling on each receptor. Sometimes, the processes of this stage are called sensation.
2. The image-based stage; it includes operations that receive as input the retinal image (that is, the numerical array of values of light intensities in each pixel) and process it in order to detect local edges and lines, to link these edges and lines on a more global scale, to match up corresponding images in the two eyes, to define 2-D regions in the image, and to detect line terminations and blobs. This stage outputs 2-D surfaces at some particular slant that are located at some distance from the viewer in 3-D space.

In general, the image-based representation has the following properties: First, it receives as input and thus operates first on information about the 2-D structure of the retinal image rather than on information concerning the physical, distal, objects. Second, its geometry is inherently two-dimensional. Third, the image-based representation of the 2-D features is cast in a coordinate reference system that is defined with respect to the retina (as a result, the organization of the information is called retinotopic). This means that the axes of the reference system are aligned with the eye rather than the body or the environment. This stage is the first stage of perception proper.

3. The surface-based stage; in this stage, vision constructs representations of the intrinsic properties of surfaces in the environment that might have produced the features constructed in the image-based model. At this stage, and in contradistinction to the preceding stage, the information about the worldly surfaces is represented in three dimensions. Marr's two-and-a-half-dimensional (2.5-D) sketch is a typical example of a surface-based representation. Note that the surface-based representation of a visual scene does not contain information about all the surfaces that are present in the scene, but only those that are visible from the viewer's current viewpoint.

In general, the surface-based representation has the following properties: First, the elements that the surface-based stage outputs consist of the output of the image-based stage, that is, of 2-D surfaces at some particular slant that are located at some distance from the viewer in 3-D space. Second, these 2-D surfaces are represented within a 3-D spatial framework. Third, the aforementioned reference framework is defined in terms of the direction and distance of the surfaces from the observer's standpoint (it is egocentric).

4. The object-based stage; this is the stage in which the visual system constructs 3-D representations of objects that include at least some of the occluded surfaces of the objects, that is, the surfaces that are invisible from the standpoint of the viewer, such as the back parts of objects. In this sense, this is the stage in which explicit representations of whole objects in the environment are constructed. It goes without saying that in order for the visual system to achieve this aim, it must use information about the whole objects that viewers have stored from their previous visual encounters with objects of the same type. The viewer retrieves this information from memory and fills in with it the surface-based image constructed at the previous stage.

In general, the object-based representation has the following properties: First, this stage outputs volumetric representations of objects that may include information about unseen surfaces. Second, the space in which these objects are represented is three-dimensional. Third, the frame of reference in which the object-based representations are cast is defined in terms of the intrinsic structure of the objects and the visual scene (it is scene-based or allocentric).
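To make the first step of the image-based stage concrete, the following minimal sketch takes a retinal image as a numerical array of light intensities and marks local edges. It is an assumption-laden illustration, not the operator used in any of the cited models: it simply thresholds intensity differences between neighbouring pixels, and the name local_edges and the threshold value are hypothetical.

```python
def local_edges(image, threshold):
    """Mark pixel (r, c) as lying on an edge when the intensity jump to its
    right-hand or lower neighbour exceeds the threshold.
    The coordinates are those of the pixel array itself, i.e. retinotopic."""
    rows, cols = len(image), len(image[0])
    edges = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            right = abs(image[r][c + 1] - image[r][c]) if c + 1 < cols else 0
            down = abs(image[r + 1][c] - image[r][c]) if r + 1 < rows else 0
            edges[r][c] = max(right, down) > threshold
    return edges


# A dark region on the left and a bright region on the right.
retinal_image = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
edge_map = local_edges(retinal_image, threshold=50)
```

The resulting edge map flags the column where the dark region meets the bright one. Linking such local edges into lines and regions, and matching them across the two eyes, would be further operations of the same stage.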
Bayes' theorem is the following probabilistic formula (in its simple form, because there is another formulation when one considers two competing hypotheses), where $A$ is a hypothesis purporting to explain a set of data $B$:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)},$$

where $P(A)$ is the prior probability, that is, the initial degree of belief in $A$; $P(A \mid B)$ is the conditional probability of $A$ given $B$, or posterior probability, that is, the degree of belief in $A$ after taking $B$ into consideration; $P(B)$ is the probability of $B$; and $P(B \mid A)$ is the likelihood of $B$ given $A$, that is, the degree of belief that $B$ is true given that $A$ is true. The ratio $P(B \mid A)/P(B)$ represents the degree of support that $B$ provides for $A$.

Suppose that $B$ is the sensory information encoded by a neuronal assembly at level $l-1$, and $A$ is the hypothesis that the neuronal assembly at level $l$ posits as an explanation of $B$. Bayes' theorem tells us that the probability that $A$ is true, that is, the probability that level $l$ represents a true pattern in the environment given the sensory data $B$, depends first on the prior probability of hypothesis $A$, that is, the probability of $A$ before the predictions of $A$ are tested. This prior probability depends not only on the incoming signal to $l$ but also, and most crucially because many different causes could have produced the incoming signal, on contextual effects, because these are the factors that determine which is the most likely explanation of the data among the various possible alternative accounts.

The probability of $A$ also depends on $P(B \mid A)$, that is, the probability that $B$ is true given $A$. This reflects a significant epistemological insight, namely, that since a correct account of a set of data explains them, these data are a natural consequence of the explaining hypothesis, or naturally fit into the conceptual framework created by the hypothesis. The various gravity phenomena, for instance, become very plausible in view of the law of gravity; they are much less so if the hypothesis purporting to explain these same phenomena involves some accidents of nature, even if they are systematic. To put it the other way around, if gravity exists, then the probability that unsupported objects will fall down is greater than the probability of these objects falling down if some other hypothesis is postulated to explain the fall of unsupported objects.

The probability of the hypothesis $A$ depends inversely on the probability of the data $B$. Since probabilities take values between 0 and 1, the smaller the probability in the denominator, that is, the more surprising and thus improbable $B$ is, the greater the probability that $A$ is true given $B$. This part of the equation also reflects an important epistemological insight, namely that the more surprising a set of data is, the more likely it is that a hypothesis that successfully explains them is true. Finally, the ratio $P(B \mid A)/P(B)$ expresses the support $B$ provides to $A$ in the sense that the greater this ratio, the greater the probability that the hypothesis $A$ is true.
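The three dependencies just described (on the prior, on the likelihood, and inversely on the probability of the data) can be checked numerically. The sketch below is a minimal illustration; the function name posterior and all the probability values are made up for the example and are not taken from any study.

```python
def posterior(prior_a, likelihood_b_given_a, prob_b):
    """Bayes' theorem in its simple form: P(A|B) = P(B|A) * P(A) / P(B)."""
    if not 0.0 < prob_b <= 1.0:
        raise ValueError("P(B) must lie in (0, 1]")
    joint = likelihood_b_given_a * prior_a  # P(A and B)
    if joint > prob_b:
        raise ValueError("P(B) cannot be smaller than P(B|A) * P(A)")
    return joint / prob_b


# Hypothetical numbers: same prior and likelihood, two values of P(B).
prior = 0.2          # P(A): degree of belief in the hypothesis before the data
likelihood = 0.9     # P(B|A): how strongly the hypothesis predicts the data

expected_data = posterior(prior, likelihood, prob_b=0.6)    # unsurprising data
surprising_data = posterior(prior, likelihood, prob_b=0.3)  # surprising data

# Support that B lends to A, i.e. the ratio P(B|A) / P(B).
support_expected = likelihood / 0.6
support_surprising = likelihood / 0.3
```

With these illustrative values the posterior is roughly 0.3 when the data were to be expected anyway and roughly 0.6 when they were surprising, so halving $P(B)$ doubles both the support ratio and the posterior.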
vation of the visual cortex from the cognitive centers of the brain. In some of these cases, top-down processes activate the early visual areas and fill in the missing features that become phenomenologically present. In other cases of cognitively driven amodal completion, the viewer simply forms a pure thought concerning the hidden structure in the absence of any activation of the visual areas and thus in the absence of mental imagery.
William Bechtel
27. Diagrammatic Reasoning

Diagrams figure prominently in human reasoning, especially in science. Cognitive science research has provided important insights into the inferences afforded by diagrams and revealed differences in the reasoning made possible by physically instantiated diagrams and merely imagined ones. In scientific practice, diagrams figure prominently both in the way scientists reason about data and in how they conceptualize explanatory mechanisms.

To identify patterns in data, scientists often graph it. While some graph formats, such as line graphs, are used widely, scientists often develop specialized formats designed to reveal specific types of patterns and not infrequently employ multiple formats to present the same data, a practice illustrated with graph formats developed in circadian biology. Cognitive scientists have revealed the spatial reasoning and iterative search processes scientists deploy in understanding graphs.

In developing explanations, scientists commonly diagram mechanisms they take to be responsible for a phenomenon, a practice again illustrated with diagrams of circadian mechanisms. Cognitive science research has revealed how reasoners mentally animate such diagrams to understand how a mechanism generates a phenomenon.

27.1 Cognitive Affordances of Diagrams and Visual Images
27.2 Reasoning with Data Graphs
27.2.1 Data Graphs in Circadian Biology
27.2.2 Cognitive Science Research Relevant to Reasoning with Graphs
27.3 Reasoning with Mechanism Diagrams
27.3.1 Mechanism Diagrams in Circadian Biology
27.3.2 Cognitive Science Research Relevant to Reasoning with Mechanism Diagrams
27.4 Conclusions and Future Tasks
References
Human reasoning is often presented as a mental activity in which we apply inference rules to mentally represented sentences. In the nineteenth century, Boole presented the rules for natural deduction in logic as formalizing the rules of thought. Even as cognitive scientists moved beyond rules of logical inference as characterizing the operations of the mind, they tended to retain the idea that cognitive operations apply to representations that are encoded in the mind (e.g., in neural activity). But in fact humans often reason by constructing, manipulating, and responding to external representations, and this applies as well to deductive as to abductive and inductive reasoning. Moreover, these representations are not limited to those of language but include diagrams. While reliance on diagrams extends far beyond science, it is particularly important in science. Scientific papers and talks are replete with diagrams, and these are often the primary focus as scientists read papers and engage in further reasoning about them. Zacks et al. [27.1] determined that the number of graphs in scientific journals doubled between 1984 and 1994. One would expect that trend has continued. Although many journals now limit the number of figures that can appear in the published paper, they have increasingly allowed authors to post supplemental material, which often includes many additional diagrams.

Scientists clearly use diagrams to communicate their results to others. But there is also evidence that they make extensive use of these diagrams in their own thinking: in developing an understanding of the phenomenon to be explained and in advancing an explanation of it. Diagrams also figure prominently in the processes through which scientists analyze data. Since far less attention has been paid, both in philosophy of science and in the cognitive sciences, to how diagrams figure in reasoning activities, my objective in this chapter is to characterize what is known about how people, including scientists, reason with diagrams.

An important feature of diagrams is that they are processed by the visual system, which in primates is a very highly developed system for extracting and relating information received by the eyes (approximately one-third of the cerebral cortex is employed in visual processing). I begin in Sect. 27.1 by focusing on the distinctive potential of diagrams to support reasoning by enabling people to employ visual processing to detect specific patterns and organize together relevant pieces of information, and I examine the question of whether images constructed in one's imagination work equally well. In this chapter, I employ the term diagram in its inclusive sense, in which it involves marks arranged in a two or more dimensional layout where the marks are intended to stand for entities or activities or information extracted from them and the geometrical relations between the marks are intended to convey relations between the things represented. In Sects. 27.2 and 27.3, I will discuss separately two types of diagrams that I designate data graphs and mechanism diagrams. In each case I introduce the discussion with examples from one field of biological research, that on circadian rhythms, the endogenously generated oscillations with a period of approximately 24 h that are entrainable to the light-dark cycle of our planet and that regulate a wide range of physiological activities. I then draw upon cognitive science research relevant to understanding how people reason with each type of diagram and relate this to the diagrams used in the science.
[Fig. 27.2: pulley problem, shown as a diagram with weights W1 and W2 and as a list of sentential statements such as (pulley-system Ry Pc Rz), (hangs Pb from Rt), (hangs W2 from Rs), and (value W1 1)]
different demands on internal processes.) Zhang further claims that by limiting winning strategies
to lines, traditional tic-tac-toe reduces the cognitive demands, freeing up cognitive resources for
other activities.

In a provocative pioneering study addressing the question of why a diagram is (sometimes) worth
10 000 words, Larkin and Simon [27.6] also focused on how diagrammatic representations support
different cognitive operations than sentential representations. Like Zhang, they focused on
representations that were equivalent in the information they provided but turned out not to be
computationally equivalent in the sense that inferences that could be “drawn easily and quickly
from the information given explicitly in the one” could not be drawn easily and quickly from the
other. (Kulvicki [27.7] speaks in terms of information being extractable where there is a feature
of a representation that is responsible for it representing a given content and nothing more
specific than that. This helpfully focuses on the issue of how the representation is structured,
but does not draw out the equally important point that extracting information depends on the
cognitive processes that the cognizer employs.)

One of the problems Larkin and Simon investigate is the pulley problem shown in Fig. 27.2, where
the task is to find the ratio of weights at which the system is in equilibrium. They developed
a set of rules to solve the problem. The advantage of the pulley diagram on their analysis is that
it locates information needed to apply particular rules at nearby locations in the diagram so that
by directing attention to a location a person can secure the needed information. In the sentential
representation the information needed for applying rules was dispersed so that the reasoner would
need to conduct multiple searches. In a second example, involving a geometry proof, Larkin and
Simon show how a diagram reduces both the search and recognition demands, where recognition
utilizes the resources of the visual system to retrieve information. The authors also offer three
examples of diagrams used in economics and physics, graphs and vector diagrams, that employ not
actual space but dimensions mapped to space and argue that they too provide the benefits in search
and recognition.

Together, these two studies make clear that diagrams differ from other representations in terms of
the cognitive operations they elicit in problem-solving situations. Most generally, diagrams as
visual structures elicit pattern detection capacities whereas sentential representations require
linguistic processing. Larkin and Simon note that a common response to a complex sentential
description is to draw a diagram. An interesting question is whether comparable results can be
obtained by mentally imagining diagrams. Pioneering studies by Shepard [27.8, 9] and Kosslyn
et al. [27.10] demonstrated that people can rotate or move their attention across a mentally
encoded image. But quite surprisingly Chambers and Reisberg [27.11] found that this capacity is
severely limited. They presented Jastrow’s duck-rabbit (Fig. 27.3) to participants sufficiently
briefly that they could only form one interpretation of the figure. They then asked the
participants if they could find another interpretation while imaging the figure. None were able to
do so even when offered guidance. Yet, when they were allowed to draw a figure based on their
mental image, all participants readily discovered the alternative interpretation.

These findings inspired numerous other investigations into the human ability to work with mental
images whose results present a complex pattern. Reed and Johnsen [27.12] reached a similar
conclusion as Chambers and Reisberg when they asked participants to employ imagery to determine
whether a figure was contained in a figure they had previously studied. Yet when Finke
et al. [27.13] asked participants to construct in imagery complex images from components, they
performed well. Studies by Finke and Slayton [27.14] showed that many participants were able to
generate
creative images from simple shapes in imagery (the drawings the participants produced were
independently assessed for creativity). Anderson and Helstrup [27.15, 16] set out to explore
whether drawing enhanced performance on such tasks and their conclusions were largely negative:
participants produced more images, but the probability of generating ones judged creative was not
increased: “These results were contrary to the initial belief, shared by most experimenters and
subjects alike, that the use of pencil and paper to construct patterns should facilitate
performance.”

Fig. 27.3 [...] as a stimulus in Chambers and Reisberg’s experiments is shown on the left. The
other two versions were drawn by participants based on their own image interpreted as a rabbit (b)
and as a duck (c). From these they were readily able to discover the other interpretation,
something they could not do from their mental image alone. With permission from the American
Psychological Association (after [27.11])

Verstijnen et al. [27.17] explored whether the failure of drawing to improve performance might be
due [...] trained drawers create compound objects that involved restructuring the components
(e.g., changing proportions within the component). One conclusion suggested by these results is
that reasoning with diagrams may be a learned activity. Humans spend a great deal of time learning
to read and write, and even then further education is often required to extract information from
text and construct and evaluate linguistic arguments. Yet, perhaps because vision seems so
natural, we assume that diagrams are automatically interpretable and, except in curricula in
fields like design, we provide no systematic education in constructing and reasoning with
diagrams. Accordingly, it perhaps should not be a surprise that science educators have found that
students often ignore the diagrams in their textbooks [27.19].

One of the challenges in teaching students how to reason with diagrams is identifying what
cognitive operations people must perform with different types of diagrams. Cognitive scientists
have begun to identify some of these operations, and I will discuss some of these in the context
of data graphs and mechanism diagrams in the next two sections.
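
Larkin and Simon's contrast between representations that are informationally equivalent but not computationally equivalent can be made concrete with a toy sketch. The Python below is an illustration only, not their production-system model: the triples loosely echo the "hangs" statements of the pulley problem, and the function names (`find_in_list`, `index_by_element`, `find_in_index`) and the way search steps are counted are assumptions introduced here. The same facts are stored twice, once as a flat list that must be scanned (standing in for the sentential form) and once indexed by element (standing in for a diagram that gathers related information at one location).

```python
from collections import defaultdict

# Hypothetical encoding of pulley statements as (relation, subject, object) triples.
FACTS = [
    ("hangs", "Pb", "Rt"),
    ("hangs", "Rt", "c"),
    ("hangs", "Rx", "c"),
    ("hangs", "Rs", "Pc"),
    ("hangs", "W2", "Rs"),
    ("value", "W1", "1"),
]


def find_in_list(facts, element):
    """Sentential-style search: scan every statement and count how many are examined."""
    examined = 0
    hits = []
    for fact in facts:
        examined += 1
        if element in fact[1:]:
            hits.append(fact)
    return hits, examined


def index_by_element(facts):
    """Diagram-style organization: group the statements by the element they mention."""
    index = defaultdict(list)
    for fact in facts:
        for element in set(fact[1:]):
            index[element].append(fact)
    return index


def find_in_index(index, element):
    """Location-based lookup: go straight to the element; counted as one step."""
    return list(index.get(element, [])), 1


list_hits, list_steps = find_in_list(FACTS, "Rs")
index_hits, index_steps = find_in_index(index_by_element(FACTS), "Rs")
print(list_hits == index_hits, list_steps, index_steps)
```

Both searches return the same statements about `Rs`, so the two stores carry the same information, but the list scan examines every statement while the indexed lookup goes to one place. That difference in search effort, not in content, is the sense in which the two representations differ computationally.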
27.2 Reasoning with Data Graphs

[...] system readily identifies the oscillatory pattern, which we can then coordinate with the bar
at the bottom indicating periods of light and dark and the gray regions that redundantly indicate
periods of darkness. By visually investigating the graph, one can detect that body temperature
rises during the day and drops during the night, varying by about 2 °F over the course of a day.

A line graph makes clear that the value of a variable is oscillating and with what amplitude, but
it does not make obvious small changes in the period of activity. For this reason, circadian
researchers developed actograms, a version of a raster plot on which time of each day is
represented along a horizontal line and each occurrence of an activity (rotation of a running
wheel by a mouse) is registered as a hash mark. Subsequent days are shown on successive lines
placed below the previous one. Some actograms, such as the one shown in Fig. 27.4b, double plot
the data so that each successive [...] dark cycle, as indicated by the letters LD on the right
side, with the periods of light and dark indicated by the light-dark bar at the top. From day 15
to day 47, as indicated by the letters DD on the right side, the mouse was subjected to continuous
darkness. On day 37, the row indicated by the arrow, the animal received a 6 h pulse of light at
hour 16. It was returned to LD conditions on day 48, but returned to DD on day 67. The activity
records shown on the actogram exhibit a clear pattern. During both LD periods the activity of the
mouse was entrained to the pattern of light and dark so that the mouse was primarily active during
the early night, with a late bout of activity late in the night (mice are nocturnal animals). On
the other hand, during the DD periods the mouse began its activity somewhat earlier each day,
a phenomenon known as free running. The light pulse reset the onset time for activity on the
following day, after which the mouse continued to free run but from this new starting point. When
switched back to LD the mouse exhibited a major alteration in activity the next day, but it took
a couple more days to fully re-entrain to the LD pattern.

Fig. 27.4 (a) Line graph from [27.21] showing the circadian oscillation in body temperature for
one person across 48 h. (b) Example actogram showing times of running wheel activity of
a wild-type mouse, reprinted with permission from Elsevier (after [27.22])
[Panel (a) x-axis: Time (Clock hour); panel (b) x-axis: 0 to 48 h, with LD and DD intervals marked]

Data graphs are used not just to characterize phenomena but also to identify factors that may play
a role in explaining phenomena. Figures in biological papers often contain many panels, invoking
different representational formats, as part of the attempt to make visible relationships between
variables that are taken to be potentially explanatory. For example, Fig. 27.5, from [27.23],
employs photographs, line graphs, heat maps, and radial (Rayleigh) plots. To situate their
research: in the 1970s the suprachiasmatic nucleus (SCN), a small structure in the hypothalamus,
was implicated through a variety of techniques as the locus of circadian rhythms in mammals. Welsh
et al. [27.24] had demonstrated that while individual SCN neurons maintain rhythmicity when
dispersed in culture, they oscillate with varying periods and quickly become desynchronized.
Maywood et al.’s research targeted vasoactive intestinal polypeptide (VIP), which is released by
some SCN neurons, as the agent that maintains synchrony in the whole SCN or in slices from the
SCN. Accordingly, they compared SCN slices from mice in which one (identified as VIP2r+/−) or both
copies (VIP2r−/−) of the gene that codes for the VIP receptor are deleted. To render the
rhythmicity of individual
cells visible, the researchers inserted a gene coding for luciferase under control of the promoter
for a known clock gene, Per1, so as to produce luminescence whenever PER is synthesized. The
photographs in panels A are selections among the raw data. They make clear that VIP luminescence
in the slice is synchronized, occurring at hour 48 and 72. Panel E reveals the lack of synchrony
without the VIP receptor and panel F demonstrates that individual neurons are still oscillating
without VIP but that the three neurons indicated by green, blue, and red arrows exhibit
luminescence at different phases.

Fig. 27.5 [...] panels (a–d) that, when a receptor for VIP is present, oscillations of individual
neurons are synchronized but that this is lost without VIP, panels (e–i) (after [27.23]), with
permission from Elsevier
[Recoverable plot labels: Time (h); Relative bioluminescence; Vip2r –/–; P = 10^-8; P = 0.046]

Although the photographs are sufficient to show that VIP is potentially explanatory of synchronous
activity in the SCN, the researchers desired to characterize the relationship in more detail. They
began by quantifying the bioluminescence recorded at the locus of the cell in photographs at
different times. In panels B and G they displayed the results for five individual cells of each
type in line graphs. This makes it clear that while there is variation in amplitude, with VIP the
five cells are in phase with each other while without VIP they are not. Even with five cells,
though, it becomes difficult to decipher the pattern in a line graph. The raster plots in panels C
and H enable comparison of 25 cells, one on each line, with red indicating periods when
bioluminescence exceeds a threshold and green periods when it is below the threshold (such
displays using hot and cold colors are often called heat maps). The raster plot enables one to
compare the periodicity of individual cells more clearly, but with a loss of information about the
amplitude of the oscillation at different times. The Rayleigh plots shown in panels D and
I sacrifice even more information, focusing only on peak activity, but show that the peak phases
are highly clustered with VIP and widely distributed without. The blue arrow shows the aggregate
phase vector and indicates not only that it is oriented differently without VIP but also is
extremely short, indicative of little correlation between individual neurons.

27.2.2 Cognitive Science Research Relevant to Reasoning with Graphs

Having introduced examples of graphs used in one field of biology, I turn now to cognitive science
research that has attempted to identify aspects of the cognitive operations that figure in
reasoning with graphs. Pinker [27.25] provided the foundation for much subsequent research on how
people comprehend graphs. He differentiated the cognitive activities of creating a visual
description of a graph and applying an appropriate graph schema to it. He treats the construction
of a visual description as initially a bottom-up activity driven by the visual stimulus to which
gestalt principles such as proximity and good continuation, among other procedures, are invoked.
As explored by Zacks and Tversky [27.26], these principles differentially affect perception of bar
graphs and line graphs [27.26, p. 1073]:

“Bars are like containers or fences, which enclose one set of entities and separate them from
others. Lines are like paths or outstretched hands, which connect separate entities.”

The result, which has been documented in many studies, is that people are faster and more accurate
at reading individual data points from bar graphs than line graphs but detect trends more easily
in line graphs [27.27, 28]. For example, the bar graph in Fig. 27.6a makes it easy to read off
test scores at different noise levels and room temperatures, and to compare test scores at the two
temperatures. The line graph in Fig. 27.6b encodes the same data but the lines connecting the
values at the two noise levels make that comparison more apparent. Moreover, the line graph
suggests that there are intermediate values between the two explicitly plotted. The effect is
sufficiently strong that Zacks and Tversky found that when line graphs are used with categorical
variables, viewers often treat them as interval variables and make assertions such as “The more
male a person is, the taller he/she is” [27.26, p. 1076].

Fig. 27.6a–c A bar graph (a) and two line graphs (b,c), each showing the same data, but which
viewers typically interpret differently (after [27.29])
[x-axes: Noise level (dB) in (a) and (b); Room temperature (°F) in (c)]

The choice of what to present on the axes also affects the information people extract. Shah and
Carpenter [27.30] found that participants produce very different interpretations of the two graphs
on the right of Fig. 27.6, one representing noise and the other room temperature on the abscissa.
Thus, viewers of the graph in the center are more likely to notice the trend with increasing noise
levels whereas those viewing the graph on the right notice the trend with increasing temperature.
Further, when lines in graphs have reverse slopes, as in the rightmost graph, participants take
longer to process the graph. Moreover, this difference makes the third variable, noise level, more
salient since it identifies the difference responsible for the contrasting slopes.

The research reported so far focused on visual features of graphs, but one of the seminal findings
about the organization of the mammalian visual processing system is that it is differentiated into
two processing streams, one extracting information about the shape and identity of objects and one
extracting information about location and potential for action [27.31, 32]. Hegarty and
Kozhevnikov [27.33] proposed that the distinction between different processing pathways could help
explain apparently contradictory results other researchers had reached about whether skill in
visual imagery facilitates solving mathematics problems. They separately evaluated sixth-grade
boys in Dublin, Ireland, in terms of pictorial imagery (“constructing vivid and detailed images”)
and schematic imagery (“representing the spatial relationships between objects and imagining
spatial transformations”). They found that good pictorial imagery was actually associated with
poorer performance
in solving mathematical problems. The following was a typical problem: At each of the two ends of
a straight path, a man planted a tree and then every 5 meters along the path he planted another
tree. The length of the path is 15 meters. How many trees were planted?

In contrast, good spatial imagery was associated with better performance. In subsequent work,
Hegarty and her collaborators focused on kinetic problems involving graphs of motion and
demonstrated a similar effect of pictorial versus spatial visualization. Kozhevnikov
et al. [27.34] presented graphs such as that on the left in Fig. 27.7 to participants who, on
a variety of psychometric tests, scored high or low on spatial ability. Those who scored low
interpreted this graph pictorially as, for example, a car moving on a level surface, then going
down a hill, and then moving again along a level surface. None of these participants could provide
the correct interpretation of the graph as showing an object initially at rest, then moving at
a constant velocity, and finally again at rest. On the other hand, all participants who scored
high on spatial ability provided the correct interpretation. Subsequently, Kozhevnikov
et al. [27.35] examined the differences between professionals in the arts and the sciences with
respect to these graphs. They found that, except for participants who provided an irrelevant
interpretation by focusing on nonpictorial features of the graph, artists tended to provide
a literal pictorial interpretation of the path of movement whereas all scientists offered
a correct schematic interpretation (example responses are shown on the right of Fig. 27.8).

[Figure: two graph panels, x-axis Age (y)]

So far I have focused on viewing a graph and extracting information from it. But an important
feature of graphs in science such as those I presented in the earlier section is that they afford
multiple engagements in which a user visually scans different parts of the graph seeking answers
to different questions, some posed by information just encountered. Carpenter and Shah [27.36]
drew attention to this by observing that graph comprehension is an extended activity often
requiring half a minute, two orders of magnitude longer than the time required to recognize simple
patterns, including words and objects. In addition to detecting a pattern of data points along,
for example, a positively sloping line, the graph interpreter must relate these points to the
labels on the axes and what these represent and this is what requires processing time. Using eye
tracking while participants study graphs, Carpenter and Shah revealed that viewers initially carve
the graph into visual chunks and then cycle through focusing on different components: the pattern
of the lines, the labels on the axes, the legend, and the title of the graph (Fig. 27.7).
Similarly drawing attention to the prolonged engagement individuals often have with graphs,
Trickett and Trafton [27.37] employed verbal protocols as well as eye tracking to study what
people do when making inferences that go beyond what is explicitly represented in a given graph.
They found that participants often employ spatial manipulations such as mentally transforming an
object or extending it; they are not just passively viewing it.

Cognitive scientists have limited their focus to relatively simple graph forms such as line graphs
and have not investigated the larger range of formats we saw deployed in circadian research. Many
of the results, however, are applicable to these other graph formats. Gestalt principles such as
good continuation affect the patterns people see in actograms and raster plots (heat maps). In the
actogram in Fig. 27.4, one recognizes the phase locking of activity to the light-dark cycle and
daily phase advance when light cues are removed by implicitly (and sometimes explicitly) drawing
a line through the starting point for each day’s activity. Spatial processing is clearly important
not only with the photographs in Fig. 27.5 but also with the heat map and Rayleigh plot. A skilled
user of these graphs must recognize that space in the photographs corresponds to space on the
slice from the brain but that space in the heat map corresponds not to physical space but an
abstract space in which different cells are aligned. Finally, these diagrams are not designed to
convey information in one look but rather are objects that afford shifting one’s attention many
times to focus on different
information. With the Rayleigh plot, for example, one typically attends separately to the
dispersal of blue arrowheads reflecting peaks of individual cells and to the vector indicating the
population average. If eye tracking were performed, the pattern would likely resemble that
displayed in Fig. 27.7. With panels showing the same information in multiple formats, as in
Fig. 27.5, viewers are also likely to shift their focus between panels to see, for example, how
the times in the line graph correspond to those in the photograph or those in the heat map. One
limitation of the cognitive science studies is that the tasks participants were asked to perform
were usually quite limited (e.g., interpret the graph) whereas scientists often interact with
graphs over multiple engagements, constructing new queries on the basis of previous ones (e.g.,
probing an actogram to see if the behavior really does look rhythmic or not or exploring the
variability between cells revealed in a heat map). This is particularly evident when a researcher
pores over a graph after producing it to determine what it means or when, in a journal club
discussion, other researchers raise questions about specific features of a graph. Ultimately we
need to better understand how scientists pose and address such queries over time if we are to
understand the different roles graphs play in scientific reasoning.
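
Two of the graph formats discussed in this section, the double-plotted actogram and the Rayleigh plot, rest on simple transformations of event times. The sketch below is a hedged illustration in Python, not the procedure used in the cited studies: the function names `double_plot_rows` and `mean_phase_vector` are hypothetical, and the code assumes the common conventions that each actogram row spans 48 h starting at successive days and that the aggregate vector is the mean of unit vectors at each cell's peak phase.

```python
import math


def double_plot_rows(event_hours, n_days):
    """Arrange event times (hours since recording start) into double-plotted rows.

    Row d covers the 48 h beginning at day d, so every day after the first
    appears twice: on the right of one row and on the left of the next.
    """
    rows = []
    for day in range(n_days):
        start = 24 * day
        rows.append([t - start for t in event_hours if start <= t < start + 48])
    return rows


def mean_phase_vector(peak_hours, period=24.0):
    """Return (mean phase in hours, vector length between 0 and 1) for peak times.

    Each peak becomes a unit vector on a clock face; the average of those
    vectors is long when peaks cluster and short when they are scattered.
    """
    angles = [2 * math.pi * (t % period) / period for t in peak_hours]
    c = sum(math.cos(a) for a in angles) / len(angles)
    s = sum(math.sin(a) for a in angles) / len(angles)
    length = math.hypot(c, s)
    phase = (math.atan2(s, c) % (2 * math.pi)) * period / (2 * math.pi)
    return phase, length


rows = double_plot_rows([1, 25, 49], n_days=2)
clustered = mean_phase_vector([11, 12, 13])      # synchronized peaks
scattered = mean_phase_vector([0, 6, 12, 18])    # evenly dispersed peaks
print(rows, clustered, scattered)
```

In this sketch the clustered peaks yield a vector close to length 1 pointing at hour 12, while evenly dispersed peaks yield a vector of length near 0, which is the kind of contrast a short arrow in a Rayleigh plot is meant to convey.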
27.3 Reasoning with Mechanism Diagrams

[Fig. 27.9 diagram labels: Extracellular signals; Nucleus; Cytoplasm; cAMP; CREB/MAPK signalling pathway; BMAL1; CLOCK; REV-ERBA; PER; CRY; CSNK1E/D; FBXL3; β-TrCP; SCF; +Ub; E-box; LRE; Rev-erba; Per1/Per2; Cry1/Cry2; CCG; Clock output/rhythmic biological processes]
Fig. 27.9 Takahashi et al. mechanism diagram of the mammalian circadian clock involves genes and proteins within
individual cells. Reprinted with permission from Macmillan Publishers Ltd (after [27.38])
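
The negative feedback at the center of such a diagram, in which PER and CRY end up inhibiting their own expression, is the kind of loop that researchers explore with computational models when mental simulation runs out. The sketch below is a deliberately generic illustration in Python and not the model of any cited paper: a single variable `x` stands in for a repressor that suppresses its own production after a delay, and the equation, the parameter values (`beta`, `hill`, `delay`), and the function names are all assumptions chosen for illustration.

```python
def simulate_delayed_feedback(delay, t_end=200.0, dt=0.01, beta=2.0, hill=4, x0=0.5):
    """Euler integration of dx/dt = beta / (1 + x(t - delay)**hill) - x(t).

    Production is repressed by the level the variable had `delay` time units
    earlier; removal is proportional to the current level.
    """
    lag = int(round(delay / dt))
    n_steps = int(round(t_end / dt))
    xs = [x0] * (lag + 1)               # constant history before t = 0
    for _ in range(n_steps):
        delayed = xs[-1 - lag]          # level one delay ago
        current = xs[-1]
        xs.append(current + dt * (beta / (1.0 + delayed ** hill) - current))
    return xs[lag:]                     # trajectory from t = 0 onward


def late_amplitude(xs):
    """Peak-to-trough range over the second half of a trajectory."""
    tail = xs[len(xs) // 2:]
    return max(tail) - min(tail)


short_delay = late_amplitude(simulate_delayed_feedback(delay=0.5))
long_delay = late_amplitude(simulate_delayed_feedback(delay=3.0))
print(short_delay, long_delay)
```

With these hypothetical parameters a short delay lets the level settle to a steady value, whereas a longer delay keeps it oscillating. The point of the sketch is only that whether a feedback loop oscillates depends on quantitative details that are hard to read off a diagram or to animate mentally.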
[...] ends indicate inhibitory activity. When phosphates attach to molecules (as preparation for
nuclear transport or degradation), they are shown as white circles containing a P.

The diagram is clearly laid out spatially, but only some features of the diagram convey
information about spatial structures in the cell. The differentiation of the nucleus and cytoplasm
is intended to correspond to these regions in the cell and lines crossing the boundary between the
nucleus and cytoplasm represent transport between the two parts of the cell. Beyond that, however,
the distribution of shapes and arrows conveys no spatial information but only functional
differentiation. The most important operations shown in this diagram are the synthesis of REV-ERBA
and its subsequent transport into the nucleus to inhibit transcription of BMAL1 (shown as a loop
out from and back into the nucleus in the upper left) and the synthesis of PER and CRY, the
formation of a dimer, and the transport of the dimer into the nucleus to inhibit the ability of
BMAL1 and CLOCK to activate transcription of BMAL1, PER, and CRY (shown as a loop out from and
back into the nucleus in the center-left of the figure). (The other operations shown are those
involved in signaling from outside the cell that regulates the overall process, in the degradation
of PER and CRY, and in the expression of clock-controlled genes (CCGs) that constitute the output
of the clock.)

For someone acquainted with the types of parts shown and the operations in which they engage,
a diagram such as this provides a means of showing schematically how the various parts perform
operations that affect other parts. One is not intended to take in the whole diagram at once, but
to follow the operations from one part to another. To understand how the mechanism gives rise to
oscillatory activity, one can mentally simulate the operations of the mechanism by starting in the
middle with the Per and Cry genes. As they are expressed, more PER and CRY proteins are generated.
After the proteins form a dimer and are
transported into the nucleus, they inhibit the activity of the BMAL1:CLOCK dimer and thereby stop
their own expression. This reduction in expression results in reduction in their concentration and
reduced inhibitory activity, which allows expression to resume. This capacity for mental animation
is, however, limited, and to determine what the activity will be, especially when the other
components are included, researchers often turn to computational models, generating what
Abrahamsen and I [27.44] refer to as dynamic mechanistic explanations. Even here, though, diagrams
provide a reference point in the construction of equations describing operation of the various
parts [27.45].

Looking carefully at the lower right side of the figure, one will see two ovals with question
marks in them. This indicates that the researchers suspected that something unknown binds with CRY
before and potentially mediates its binding with FBXL3, which then results in its degradation.
Here it is the identity of an entity that is in doubt, but sometimes question marks are employed
to indicate uncertainty about the identity of an operation. In this case the diagram is from
a review paper and the question mark reflects uncertainty in the discipline. On occasions when
question marks appear in mechanism diagrams presented at the beginning of a research paper they
signal that the goal of the paper is to answer a question regarding the identity of a component or
its operation.

27.3.2 Cognitive Science Research Relevant to Reasoning with Mechanism Diagrams

Although cognitive scientists have not explicitly focused on mechanism diagrams that figure in
biology (but see Stieff et al. [27.46], which explores strategies used to transform diagrams of
molecular structure used in organic chemistry), research on simple mechanical systems such as
pulley systems (already the focus of Larkin and Simon’s research discussed above) has highlighted
one of the important cognitive activities people use with mechanism diagrams: mentally animating
the operation of a mechanism when trying to figure out how it will behave. Drawing on theorists in
the mental models tradition (see papers in [27.47] that explore how people answer problems by
constructing and running a mental model), Hegarty [27.48] investigated experimentally “to what
extent the mental processes involved in reasoning about a mechanical system are isomorphic to the
physical processes in the operation of the system.” She measured reaction times and eye movements
as participants answered questions about how various parts of a pulley system such as shown in
Fig. 27.10 would behave if the rope is pulled. From the fact that both error rates and reaction
times increased with the number of operations within the mechanism the participant had to animate
in order to answer the question, she inferred that people don’t simulate the whole machine
operating at once but rather animate individual parts in sequence. She provided further evidence
for this claim by tracking the movements of the participants’ eyes as they solved problems. In
a follow-up experiment, Hegarty compared performance when participants were asked to infer the
motion of a component from that of another component earlier in the causal chain or from that of
a component later in the chain. Participants made more errors and required more time when they had
to reason backward from events later in the chain, and still showed a preference to move their
eyes forward along the causal chain.

Fig. 27.10 Pulley problem (after [27.48]) used to study how people employ mental animation in
problem solving. With permission from the American Psychological Association

Schwartz and Black [27.49] provided further insights into how people simulate mechanisms by
attending to the gestures people make. In one task, shown in Fig. 27.11a, participants were asked
to determine in which direction the rightmost gear would turn given the clockwise turn of the
leftmost gear. They found that their participants would use their hands to indicate the direction
of movement of each successive gear. (In these studies the participants never saw the diagrams but
were provided with verbal descriptions of the configuration.) In this case, an alternative
strategy is available: apply a simple global rule such as the parity rule (if there are an odd
number of gears, the first and last will turn in the same direction) or the more local rule (if
two gears are touching, they will turn in opposite directions). Schwartz and Black found that as
people acquired the rule, their gestures declined. But when people lack such rules or find their
application uncertain, as in the gear problem in Fig. 27.11b, they again gesture. This use of
gesture indicates that whatever imagery people
employ to solve the task, it is coordinated with action. Accordingly, the researchers propose
a theory of simulated doing in which [27.50]:

“the representation of physical causality is fundamental. This is because doing requires taking
advantage of causal forces and constraints to manipulate the world. Our assumption is that people
need to have representations of how their embodied ideas will cause physical changes if they are
to achieve their goals.”

Animating a diagram, either mentally or with gesture, plays a crucial role in the cognitive
activity of understanding how a proposed mechanism could produce the phenomenon one is trying to
explain. But diagrams present not only a finished explanation of the phenomenon, they often figure
in the process of discovering mechanisms. Here what matters is the ability to create and alter the
glyphs and their arrangement. Tversky [27.42] suggests a helpful way to understand this activity:
view diagrams as the “permanent traces of gestures” in which “fleeting positions become places and
fleeting actions become marks and forms” [27.42, p. 500]. There is a rich literature showing how
gesture figures not only in communication but also in the development of one’s own
understanding [27.51]. Tversky focuses on the activity of drawing maps, highlighting such features
of the activity as selecting what features to include and idealizing angles to right angles. These
findings can be extended to mechanism diagrams, which constitute a map of the functional space of
the mechanism, situating its parts and operations. While Tversky speaks of diagrams as permanent
traces and there is a kind of permanence (or at least endurance) to diagrams produced on paper or
in computer files, they are also subject to revision: one can add glyphs for additional parts or
alter arrows to represent different ideas of how the operations of one part affect others. In the
design literature this is often referred to as sketching. Sketching mechanism diagrams can be
motivated by evidence, but they can also be pursued in a purely exploratory manner, enabling
reasoning about what would happen if a new connection were made or an existing one redirected.
Sketching possible mechanisms is a common activity of scientists, and by further investigating the
cognitive activities involved in this activity one can develop richer analyses of this important
type of scientific reasoning.

Fig. 27.11 (a) Five gears are arranged in a horizontal line. If you try to turn the gear on the
left clockwise, what will the gear on the far right do? (b) Five gears are arranged in a circle so
that each gear is touching two other gears. If you try to turn the gear on the top clockwise, what
will the gear just to its right do?
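
The contrast between animating gears one at a time and applying the parity rule can be written out as a short sketch. The Python below is illustrative only and is not taken from Schwartz and Black's studies: the function names are hypothetical, the gears are idealized, and the closed-ring check relies on the standard mechanical fact that meshed neighbours must counter-rotate, which the chapter itself does not state.

```python
OPPOSITE = {"clockwise": "counterclockwise", "counterclockwise": "clockwise"}


def last_gear_by_animation(n_gears, first="clockwise"):
    """Step along an open chain, flipping direction at each meshing pair (local rule)."""
    direction = first
    for _ in range(n_gears - 1):
        direction = OPPOSITE[direction]
    return direction


def last_gear_by_parity(n_gears, first="clockwise"):
    """Global shortcut: with an odd number of gears, first and last turn the same way."""
    return first if n_gears % 2 == 1 else OPPOSITE[first]


def closed_ring_can_turn(n_gears):
    """In a closed ring every gear meshes with two neighbours.

    Alternating directions around the ring is only consistent when the
    number of gears is even; an odd ring is assumed here to lock.
    """
    return n_gears >= 3 and n_gears % 2 == 0


print(last_gear_by_animation(5), last_gear_by_parity(5), closed_ring_can_turn(5))
```

Stepwise animation costs one operation per gear, while the parity rule answers in a single step. For the closed ring, the local rule applied around the circle runs into a contradiction when the count is odd, which may be one reason its application feels uncertain.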
Acknowledgments. Research for this chapter was supported by Grant 1127640 from the US National
Science Foundation and a Senior Visiting Fellowship in the Center for Philosophy of Science at the
University of Pittsburgh, which are gratefully acknowledged. I also thank fellow members of the
Working Group on Diagrams in Science at UCSD (Adele Abrahamsen, Daniel C. Burnston, and Benjamin
Sheredos) and John Norton and the Fellows at the Center for Philosophy of Science at the
University of Pittsburgh in Fall 2014 (Joshua Alexander, Karim Bschir, Ingo Brigandt, Sara Green,
Nicholaos Jones, Raphael Scholl, and Maria Serban) for insights and suggestions.
Part F | 27
References
27.1 J.M. Zacks, E. Levy, B. Tversky, D.J. Schiano: Graphs in print. In: Diagrammatic Representation and Reasoning, ed. by M. Anderson, B. Meyer, P. Olivier (Springer, London 2002) pp. 187–206
27.2 D.C. Marr: Vision: A Computational Investigation into the Human Representation and Processing of Visual Information (Freeman, San Francisco 1982)
27.3 J.J. Gibson: The Ecological Approach to Visual Perception (Houghton Mifflin, Boston 1979)
27.4 J. Zhang: The nature of external representations in problem solving, Cogn. Sci. 21, 179–217 (1997)
27.5 J. Zhang, D.A. Norman: Representations in distributed cognitive tasks, Cogn. Sci. 18, 87–122 (1994)
27.6 J.H. Larkin, H.A. Simon: Why a diagram is (sometimes) worth ten thousand words, Cogn. Sci. 11, 65–99 (1987)
27.7 J.V. Kulvicki: Images, 1st edn. (Routledge, New York 2013)
27.8 L.A. Cooper, R.N. Shepard: Chronometric studies of the rotation of mental images. In: Visual Information Processing Symposium on Cognition, 8th Carnegie Mellon University, ed. by W.G. Chase (Academic Press, New York 1973) pp. 75–175
27.9 R.N. Shepard, J. Metzler: Mental rotation of three-dimensional objects, Science 171, 701–703 (1971)
27.10 S.M. Kosslyn, T.M. Ball, B.J. Reiser: Visual images preserve metric spatial information – Evidence from studies of image scanning, J. Exp. Psychol. Human Percept. Perform. 4, 47–60 (1978)
27.11 D. Chambers, D. Reisberg: Can mental images be ambiguous?, J. Exp. Psychol. Human Percept. Perform. 11, 317–328 (1985)
27.12 S. Reed, J. Johnsen: Detection of parts in patterns and images, Memory Cogn. 3, 569–575 (1975)
27.13 R.A. Finke, S. Pinker, M.J. Farah: Reinterpreting visual patterns in mental imagery, Cogn. Sci. 13, 51–78 (1989)
27.14 R.A. Finke, K. Slayton: Explorations of creative visual synthesis in mental imagery, Memory Cogn. 16, 252–257 (1988)
27.15 R.E. Anderson, T. Helstrup: Multiple perspectives on discovery and creativity in mind and on paper. In: Imagery, Creativity, and Discovery: A Cognitive Perspective, Vol. 98, ed. by B. Roskos-Ewoldsen, M.J. Intons-Peterson, E.A. Rita (Elsevier, Amsterdam 1993) pp. 223–253
27.16 R.E. Anderson, T. Helstrup: Visual discovery in mind and on paper, Memory Cogn. 21, 283–293 (1993)
27.17 I.M. Verstijnen, C. van Leeuwen, R. Hamel, J.M. Hennessey: What imagery can’t do and why sketching might help, Empir. Stud. Arts 18, 167–182 (2000)
27.18 I.M. Verstijnen, C. van Leeuwen, G. Goldschmidt, R. Hamel, J.M. Hennessey: Creative discovery in imagery and perception: Combining is relatively easy, restructuring takes a sketch, Acta Psychol. 99, 177–200 (1998)
27.19 M.P. Cook: Students’ comprehension of science concepts depicted in textbook illustrations, Electron. J. Sci. Educ. 12(1), 1–14 (2008)
27.20 J. Bogen, J. Woodward: Saving the phenomena, Philos. Rev. 97, 303–352 (1988)
27.21 W.L. Koukkari, R.B. Southern: Introducing Biological Rhythms (Springer, New York 2006)
27.22 M.K. Bunger, L.D. Wilsbacher, S.M. Moran, C. Clendenin, L.A. Radcliffe, J.B. Hogenesch, M.C. Simon, J.S. Takahashi, C.A. Bradfield: Mop3 Is an essential component of the master circadian pacemaker in mammals, Cell 103, 1009–1017 (2000)
27.23 E.S. Maywood, A.B. Reddy, G.K.Y. Wong, J.S. O’Neill, J.A. O’Brien, D.G. McMahon, A.J. Harmar, H. Okamura, M.H. Hastings: Synchronization and maintenance of timekeeping in suprachiasmatic circadian clock cells by neuropeptidergic signaling, Curr. Biol. 16, 599–605 (2006)
27.24 D.K. Welsh, D.E. Logothetis, M. Meister, S.M. Reppert: Individual neurons dissociated from rat suprachiasmatic nucleus express independently phased circadian firing rhythms, Neuron 14, 697–706 (1995)
27.25 S. Pinker: A theory of graph comprehension. In: Artificial Intelligence and the Future of Testing, ed. by R. Freedle (Psychology Press, London 1990) pp. 73–126
27.26 J.M. Zacks, B. Tversky: Bars and lines: A study of graphic communication, Memory Cogn. 27, 1073–1079 (1999)
27.27 W.A. Simcox: A method for pragmatic communication in graphic displays, Human Factors 26, 483–487 (1984)
27.28 C.M. Carswell, C.D. Wickens: Information integration and the object display: An interaction of task demands and display superiority, Ergonomics 30, 511–527 (1987)
27.29 P. Shah, J. Hoeffner: Review of graph comprehension research: Implications for instruction, Educ. Psychol. Rev. 14, 47–69 (2002)
618 Part F Model-Based Reasoning in Cognitive Science
27.30 P. Shah, P.A. Carpenter: Conceptual limitations in comprehending line graphs, J. Exp. Psychol. General 124, 43–61 (1995)
27.31 L.G. Ungerleider, M. Mishkin: Two cortical visual systems. In: Analysis of Visual Behavior, ed. by D.J. Ingle, M.A. Goodale, R.J.W. Mansfield (MIT Press, Cambridge 1982) pp. 549–586
27.32 D.C. van Essen, J.L. Gallant: Neural mechanisms of form and motion processing in the primate visual system, Neuron 13, 1–10 (1994)
27.33 M. Hegarty, M. Kozhevnikov: Types of visual-spatial representations and mathematical problem solving, J. Educ. Psychol. 91, 684–689 (1999)
27.34 M. Kozhevnikov, M. Hegarty, R.E. Mayer: Revising the visualizer-verbalizer dimension: Evidence for two types of visualizers, Cogn. Instr. 20, 47–77 (2002)
27.35 M. Kozhevnikov, S.M. Kosslyn, J. Shephard: Spatial versus object visualizers: A new characterization of visual cognitive style, Memory Cogn. 33, 710–726 (2005)
27.36 P.A. Carpenter, P. Shah: A model of the perceptual and conceptual processes in graph comprehension, J. Exp. Psychol. Appl. 4, 75–100 (1998)
27.37 S. Trickett, J. Trafton: Toward a comprehensive model of graph comprehension: Making the case for spatial cognition. In: Diagrammatic Representation and Inference, Vol. 4045, ed. by D. Barker-Plummer, R. Cox, N. Swoboda (Springer, Berlin, Heidelberg 2006) pp. 286–300
27.38 J.S. Takahashi, H.-K. Hong, C.H. Ko, E.L. McDearmon: The genetics of mammalian circadian order and disorder: Implications for physiology and disease, Nat. Rev. Genet. 9, 764–775 (2008)
27.39 W. Bechtel, R.C. Richardson: Discovering Complexity: Decomposition and Localization as Strategies in Scientific Research (MIT Press, Cambridge 1993)
27.40 W. Bechtel, A. Abrahamsen: Explanation: A mechanist alternative, Stud. Hist. Philos. Biol. Biomed. Sci. 36, 421–441 (2005)
27.41 P. Machamer, L. Darden, C.F. Craver: Thinking about mechanisms, Philos. Sci. 67, 1–25 (2000)
27.42 B. Tversky: Visualizing thought, Top. Cogn. Sci. 3, 499–535 (2011)
27.43 W. Bechtel: Generalizing mechanistic explanations through graph-theoretic perspectives. In: Explanation in Biology. An Enquiry into the Diversity of Explanatory Patterns in the Life Sciences, ed. by P.-A. Braillard, C. Malaterre (Springer, Dordrecht 2015) pp. 199–225
27.44 W. Bechtel, A. Abrahamsen: Dynamic mechanistic explanation: Computational modeling of circadian rhythms as an exemplar for cognitive science, Stud. Hist. Philos. Sci. Part A 41, 321–333 (2010)
27.45 N. Jones, O. Wolkenhauer: Diagrams as locality aids for explanation and model construction in cell biology, Biol. Philos. 27, 705–721 (2012)
27.46 M. Stieff, M. Hegarty, B. Dixon: Alternative strategies for spatial reasoning with diagrams. In: Diagrammatic Representation and Inference, Vol. 6170, ed. by A. Goel, M. Jamnik, N.H. Narayanan (Springer, Berlin, Heidelberg 2010) pp. 115–127
27.47 D. Gentner, A.L. Stevens: Mental Models (Erlbaum, Hillsdale 1983)
27.48 M. Hegarty: Mental animation: Inferring motion from static displays of mechanical systems, J. Exp. Psychol. Learn. Memory Cogn. 18, 1084–1102 (1992)
27.49 D.L. Schwartz, J.B. Black: Shuttling between depictive models and abstract rules: Induction and fallback, Cogn. Sci. 20, 457–497 (1996)
27.50 D.L. Schwartz, T. Black: Inferences through imagined actions: Knowing by simulated doing, J. Exp. Psychol. Learn. Memory Cogn. 25, 116–136 (1999)
27.51 S. Goldin-Meadow, S.M. Wagner: How our hands help us learn, Trends Cogn. Sci. 9, 234–241 (2005)
28. Embodied Mental Imagery in Cognitive Robots
Part F | 28
artificial cognitive models of motor imagery and mental simulation to control complex behaviors of humanoid platforms, which represent the artificial body.

With the aim of providing a panorama of the research activity on the topic, first we give an introduction on the neuroscientific and psychological background of mental imagery in order to help the reader to contextualize the multidisciplinary environment in which we operate. Then, we review the work done in the field of artificial cognitive systems and robotics to mimic the process behind the human ability of creating mental images of events and experiences, and to use this process as a cognitive mechanism to improve the behavior of complex robots. Finally, we report the detail of three recent empirical studies in which mental imagery approaches were modelled through artificial neural networks (ANNs) to endow a cognitive robot with some human-like capabilities. These empirical studies exemplify how proprioceptive information can be used by mental imagery models to enhance the performance of the robot, giving evidence of the embodied cognition theories in the context of artificial cognitive systems.

28.3.1 The Humanoid Robotic Platform: The iCub
28.3.2 First Experimental Study: Motor Imagery for Performance Improvement
28.3.3 Second Experimental Study: Mental Training Evoked by Language
28.3.4 Third Experimental Study: Spatial Imagery
28.4 Conclusion
References
The work presented in this chapter takes inspiration from the human capability to build representations of the physical world in the mind. In particular, we studied motor imagery, which is considered a multimodal simulation that activates the same, or very similar, sensory and motor modalities that are activated when human beings interact with the environment in real time. This is strictly related to the embodied cognition hypothesis, which affirms that all aspects of human cognition are shaped by aspects of the body.

Similarly, when artificial intelligence was taking its first steps, Alan Turing argued that in order to think and speak a machine may need a human-like body and that the development of robot cognitive skills might be just as simple as teaching a child [28.1]. This is the motivation behind a strongly humanoid design of some of the recent and most advanced robotic platforms, for example, iCub [28.2], NAO [28.3], and Advanced Step in Innovative MObility (ASIMO) [28.4]. These platforms are equipped with sophisticated motors and sensors, which replicate animal or human sensorimotor input–output streams. The arrangement of sensors and actuators determines a highly redundant morphological structure of humanoid robots, which are traditionally difficult to control and, thus, require complex models implementing more sophisticated and efficient mechanisms that resemble human cognition [28.5].

In this multidisciplinary context, improving the skill of a robot in terms of motor control and navigation capabilities, especially in the case of a complex robot
with many degrees of freedom, is a timely and important issue in current robotics research. Among the many bio-inspired mechanisms and models already tested in the field of robot control and navigation, the use of mental imagery principles is of interest both for modeling mental imagery as a complex, goal-directed, and flexible motor planning strategy and for going further in the development of artificial cognitive systems capable of better interacting with the environment and refining their cognitive motor skills in an open-ended process.

28.1 Mental Imagery Research Background

recently, neuroscience [28.6]. Some of the main effects of mental practice on physical performance have been well established in experiments with humans, as in the fields of sports science, work psychology, and motor rehabilitation. Neuropsychological research has long highlighted the complexity of brain activation in the activity of imagination. Studies have shown that the brain areas involved in imagery, perception, and visual memory are partly shared and partly distinct [28.7, 8], while a clinical case reported in [28.9] showed that visuospatial perception and imagery can be functionally separated in brain activation. In [28.10], the authors proposed a revision of the constructs relevant to cognitive styles, placing them into a complex framework of heuristics regarding multiple levels of information processing, from the attentional and perceptual to metacognitive ones. These heuristics are grouped according to the type of regulatory function that they perform, from the automatic coding of data to the conscious use of cognitive resources. In this view, also at the level of cerebral area activation, the distinction between elaboration of object properties (like shape or color) and spatial relations is more representative of a different style in the use of mental images than the older verbal–visual dichotomy [28.11].

But only quite recently has a growing amount of evidence from empirical studies begun to demonstrate the relationship between bodily experiences and mental processes that actively involve body representations. This is also due to the fact that in the past, philosophical and scientific investigations of the topic primarily focused upon visual mental imagery. Contemporary imagery research has now broadly extended its scope to include every experience that resembles the experience of perceiving from any sensorial modality. From this perspective, understanding the role of the body in cognitive processes is extremely important, and psychological and neuroscience studies are especially relevant in this regard. Wilson [28.12] identified six claims in the current view of embodied cognition:

5. Cognition is for action
6. Offline cognition is bodily based.

Among those six claims, the last claim is particularly important. According to this claim, sensorimotor functions that evolved for action and perception are used during offline cognition that occurs when the perceiver represents social objects, situations, or events in times and places other than the ordinary ones. This principle is reinforced by the concept of embodied cognition [28.13, 14], which affirms that the nature of intelligence is largely determined by the form of the body. Indeed, the body and every physical experience made through the body shape the form of intelligence that can be observed in any autonomous system. This means that even if the mind does not directly interact with the environment, it is able to apply mechanisms of sensory processing and motor control by using some innate abilities such as memory (implicit, short, and long term), problem solving, and mental imagery. These capabilities have been well studied in psychology and neuroscience, but the debate is still open on the issue of mental imagery, where mental imagery is defined as a sensation activated without sensorial stimulation.

Much evidence from the empirical sciences has demonstrated the relationship between bodily experiences and mental processes that involve body representation. Neuropsychological research has demonstrated that the same brain areas are activated during seeing or recalling by images [28.15] and that areas controlling perception are needed also for maintaining mental images active in working memory. Therefore, mental imagery may be considered as a kind of biological simulation. In [28.16], the author observed that the primary motor cortex M1 is activated during the production of motor images as well as during the production of active movement. Similarly, experimental studies show that neural mechanisms underlying real-time visual perception and mental visualization are the same when a task is mentally recalled [28.17]. Nevertheless, the neural
mechanisms involved in the active elaboration of mental images might be different from those involved in passive elaborations [28.18]. These studies demonstrate the tight relationship between mental imagery and motor activities (i.e., how the image in mind can influence movements and motor skills).

Recent research, both in experimental as well as practical contexts, suggests that imagined and executed movement planning relies on internal models for action [28.19]. These representations are frequently associated with the notion of internal (forward) models and are hypothesized to be an integral part of action planning [28.20, 21]. Furthermore, in [28.22], the authors suggest that motor imagery may be a necessary prerequisite for motor planning. In [28.23], Jeannerod studied the role of motor imagery in action planning and proposed the so-called equivalence hypothesis – suggesting that motor simulation and motor control processes are functionally equivalent [28.24, 25]. These studies, together with many others (e.g., [28.26]), demonstrate how the images that we have in mind might influence movement execution and the acquisition and refinement of motor skills. For this reason, understanding the relationship that exists between mental imagery and motor activities has become a relevant topic in domains in which improving motor skills is crucial for obtaining better performance, such as in sport and rehabilitation. Therefore, it is also possible to exploit mental imagery for improving a human’s motor performance, and this is achieved thanks to special mental training techniques.

Mental training is widely used among professional athletes, and many researchers began to study its beneficial effects for the rehabilitation of patients, in particular after cerebral lesions. In [28.24], for example, the authors analyzed motor imagery during mental training procedures in patients and athletes. Their findings support the notion that mental training procedures can be applied as a therapeutic tool in rehabilitation and in applications for empowering standard training methodologies. Other studies have shown that motor skills can be improved through mental imagery techniques. Jeannerod and Decety [28.18] discuss how training based on mental simulation can influence motor performance in terms of muscular strength and reduction of variability. In [28.27], the authors show that imaginary fifth finger abductions led to an increased level of muscular strength. The authors note that the observed increment in muscle strength is not due to a gain in muscle mass. Rather, it is based on higher level changes in cortical maps, presumably resulting in a more efficient recruitment of motor units. These findings are in line with other studies, specifically focused on motor imagery, which show the enhancement of mobility range [28.28] or increased accuracy [28.29] after mental training. Interestingly, it should be noted that such effects operate both ways: mental imagery can influence motor performance and the extent of physical practice can change the areas activated by mental imagery [28.30]. As a result of these studies, new opportunities for the use of mental training have opened up in collateral fields, such as medical and orthopaedic–traumatologic rehabilitation. For instance, mental practice has been used to rehabilitate motor deficits in a variety of neurological disorders [28.31]. Mental training can be successfully applied in helping a person to regain lost movement patterns after joint operations or joint replacements and in neurological rehabilitation. Mental practice has also been used in combination with actual practice to rehabilitate motor deficits in a patient with subacute stroke [28.32], and several studies have also shown improvement in strength, function, and use of both upper and lower extremities in chronic stroke patients [28.33, 34].

In sport, beneficial effects of mental training for the performance enhancement of athletes are well established and several works are focused on this topic with tests, analyses, and new training principles [28.35–38]. In [28.39], for example, a cognitive-behavioral training program was implemented to improve the free-throw performance of college basketball players, finding improvements of over 50%. Furthermore, the trial in [28.40], where mental imagery is used to enhance the training phase of hockey athletes to score a goal, showed that imaginary practice allowed athletes to achieve better performance. Despite the fact that there is ample evidence that mental imagery, and in particular motor imagery, contributes to enhancing motor performance, the topic still attracts new research, such as [28.41], which investigated the effect of mental practice to improve game plans or strategies of play in open skills in a trial with 10 female pickers. Results of the trial support the assumption that motor imagery may lead to improved motor performance in open skills when compared to the no-practice condition. Another recent paper [28.42] demonstrated that sports experts showed more focused activation patterns in prefrontal areas while performing imagery tasks than novices.
28.2 Models and Approaches Based on Mental Imagery

nisms in order to enhance motor control in autonomous robots, and to develop autonomous systems that are capable of exploiting the characteristics of mental imagery training to better interact with the environment and refine their motor skills in an open-ended process [28.44]. Indeed, among the many hypotheses and models already tested in the field of cognitive systems and robotics, the use of mental imagery as a cognitive tool capable of enhancing robot behaviors is both innovative and well-grounded in experimental data at different levels.

A model-based learning approach for mobile robot navigation was presented in [28.45], where it is discussed how a behavior-based robot can construct a symbolic process that accounts for its deliberative thinking processes using internal models of the environment. The approach is based on a forward modeling scheme using recurrent neural learning, and results show that the robot is capable of learning grammatical structure hidden in the geometry of the workspace from the local sensory inputs through its navigational experiences.

An example of the essential role mental imagery can play in human–robot interaction was recognized by Roy et al. [28.46]. They presented a robot, called Ripley, which is able to translate spoken language into actions for object manipulation guided by visual and haptic perception. The robot maintained a dynamic mental model, a three-dimensional model of its immediate physical environment that it used to mediate perception, manipulation planning, and language. The contents of the robot’s mental model could be updated based on linguistic, visual, or haptic input. The mental model endowed Ripley with object permanence, remembering the position of objects when they were out of its sensory field.

Experiments on internal simulation of perception using ANN robot controllers are presented by Ziemke et al. [28.47]. The paper focuses on a series of experiments in which feedforward neural networks (FFNNs)

which is characterized by the coexistence of slow and fast dynamics in generating anticipatory behaviors. Through the iterative tutoring of the robot for multiple goal-directed actions, interesting developmental processes emerged. Behavior primitives in the earlier fast context network part were self-organizing, while they appeared to be sequenced in the later, slow context part. Also observed was that motor images were generated in the early stage of development.

The study presented in [28.49] shows how simulated robots evolved for the ability to display a context-dependent periodic behavior can spontaneously develop an internal model and rely on it to fulfil their task when sensory stimulation is temporarily unavailable. Results suggest that internal models might have arisen for behavioral reasons and subsequently been exapted for other cognitive functions. Moreover, the obtained results suggest that self-generated internal states need not match in detail the corresponding sensory states and might rather encode more abstract and motor-oriented information.

Fascinatingly, in [28.50], the authors explore the idea of dreams as a form of mental imagery and the possible role they might play in mental simulations and in the emergence and refinement of the ability to generate predictions on the possible outcomes of actions. In brief, what the authors propose is that robots might first need to possess some of the characteristics related to the ability to dream (particularly those found in infants and children) before they can acquire a robust ability to use mental imagery. This ability to dream, according to them, would assist robots in the generation of predictions of future sensory states and of situations in the world.

Internal simulations can help artificial agents to solve the stereo-matching problem, operating on the sensorimotor domain, with retinal images that mimic the cone distribution on the human retina [28.51]. This is accomplished by applying internal sensorimotor simulation and (subconscious) mental imagery to the
process of stereo matching. Such predictive matching is competitive with classical approaches from computer vision, and it has moreover the considerable advantage that it is fully adaptive and can cope with highly distorted images.

A computational model of mental simulation that includes biological aspects of brain circuits that appear to be involved in goal-directed navigation processes is presented in [28.52]. The model supports the view of the brain as a powerful anticipatory system, capable of generating and exploiting mental simulation for predicting and assessing future sensory motor events. The authors show how mental simulations can be used to evaluate future events in a navigation context, in order to support mechanisms of decision-making. The proposed mechanism is based on the assumption that choices about actions are made by simulating movements and their sensory effects using the same brain areas that are active during overt action execution.

An interpretation of mental imagery based on the context of homeostatic adaptation is presented in [28.53], where the internal dynamics of a highly complex self-organized system is loosely coupled with a sensory-motor dynamic guided by the environment. This original view is supported by the analysis of a neural network model that controls a simulated agent facing sensor shifts. The agent is able to perceive a light in the environment through some light sensors placed around its body and its task is that of approaching the light. When the sensors are swapped, the agent perceives the light in the opposite direction of its real position and the control system has to autonomously detect the shifting sensor and act accordingly. The authors speculate that mental imagery could be a viable way for creating self-organized internal dynamics that is loosely coupled with sensory motor dynamics. The loose coupling allows the creation of endogenous input stimulations, similar to real ones, that could allow the internal system to sustain its internal dynamics and, eventually, reshape such dynamics while modifying endogenous input stimulations.

Lallee and Dominey [28.54] suggest the idea that mental imagery can be seen as a way for an autonomous system to generate internal representations and exploit the convergence of different multimodal contingencies. That is, given a set of sensory-motor contingencies specific to many different modalities, learned by an autonomous agent in interaction with the environment, mental imagery constitutes the bridge toward even more complex multimodal convergence. The model proposed by the authors is based on the hierarchical organization of the cortex and it is based on a set of interconnected artificial neural networks that control the humanoid robot iCub in tasks that involve coordination between vision, hand-arm control, and language. The chapter also highlights interesting relations between the model and neurophysiological and neuropsychological findings that the model can account for.

An extension of the neurocomputational model TRoPICAL (two route, prefrontal instruction, competition of affordances, language simulation) is proposed in [28.55] to implement an embodied cognition approach to mental rotation processes, a classic task in mental imagery research. The extended model develops new features that allow it to implement mental simulation and sensory prediction, as well as enhancing the model’s capacity to encode somatosensorial information. The model, applied to a simulated humanoid robot (iCub) in a series of mental rotation tests, shows the ability to solve the mental rotation tasks in line with results coming from psychology research. The authors also claim the emergence of links between overt movements and mental rotations, suggesting that affordance and embodied processes play an important role in mental rotation capacities.

Starting from the fact that some evidence in experimental psychology has suggested that imagery ability is crucial for the correct understanding of social intention, an interesting study to investigate intention-from-movement understanding is presented in [28.56]. The authors’ aim is to show the importance of including the more cognitive aspects of social context for further development of the optimal theories of motor control, with positive effects on robot companions that afford true interaction with human users. In the paper, the authors present a simple but thoroughly executed experiment, first to confirm that the nature of the motor intention leads to early modulations of movement kinematics. Second, they tested whether humans use imagery to read an agent’s intention when observing the very first element of a complex action sequence.

A neural network model to produce an anticipatory behavior by means of a multimodal off-line Hebbian association is proposed in [28.57]. The model emulates a process of mental imagery, in which visual and tactile stimuli are associated during a long-term predictive simulation chain motivated by covert actions. This model was studied by means of two experiments with a physical Pioneer 3-DX robot that developed a mechanism to produce visually conditioned obstacle avoidance behavior.
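As a rough illustration of the Hebbian association idea mentioned for [28.57], the following Python sketch links a visual pattern to the tactile pattern that accompanied it, so that the visual input alone later evokes the expected touch. It is a toy under stated assumptions and not the model of [28.57]: the binary codes, the number of units, the plain outer-product rule, and the names hebbian_update and predict_tactile are all hypothetical.

```python
"""Toy Hebbian association between visual and tactile patterns (illustrative only)."""


def hebbian_update(weights, visual, tactile, rate=1.0):
    """Strengthen weights[i][j] when tactile unit i and visual unit j are co-active."""
    for i, t in enumerate(tactile):
        for j, v in enumerate(visual):
            weights[i][j] += rate * t * v
    return weights


def predict_tactile(weights, visual, threshold=0.5):
    """Recall the tactile pattern expected for a visual input, without any touch."""
    return [
        1 if sum(w * v for w, v in zip(row, visual)) > threshold else 0
        for row in weights
    ]


# Hypothetical binary codes: 4 visual units, 2 tactile (bumper) units.
OBSTACLE_LEFT = [1, 1, 0, 0]
OBSTACLE_RIGHT = [0, 0, 1, 1]
BUMP_LEFT = [1, 0]
BUMP_RIGHT = [0, 1]

weights = [[0.0] * 4 for _ in range(2)]

# "Experience": each visual pattern co-occurs with a collision on the same side.
hebbian_update(weights, OBSTACLE_LEFT, BUMP_LEFT)
hebbian_update(weights, OBSTACLE_RIGHT, BUMP_RIGHT)

# Anticipation: seeing the obstacle evokes the expected touch before contact.
print(predict_tactile(weights, OBSTACLE_LEFT))   # [1, 0]
print(predict_tactile(weights, OBSTACLE_RIGHT))  # [0, 1]
```

In a controller, the recalled tactile pattern could serve as an early warning that triggers avoidance before any real contact, which is the sense in which such an association supports anticipatory behavior.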
Part F | 28.3
28.3 Experiments
In this section, we present three experimental studies that exemplify the capabilities and the performance improvements achievable by an imagery-enabled robot. Results of experimental tests with the simulator of the iCub humanoid robot platform are presented as evidence of the opportunities given by the use of artificial mental imagery in cognitive artificial systems.
The first study [28.58] details a model of a controller, based on a dual network architecture, which allows the humanoid robot iCub to autonomously improve its sensorimotor skills. This is achieved by endowing the controller with a secondary neural system that, by exploiting the sensorimotor skills already acquired by the robot, is able to generate additional imaginary examples that the controller itself can use to improve its performance through simulated mental training.
The second study [28.59] builds on the previous study, showing that the robot could imagine or mentally recall and accurately execute movements learned in previous training phases, strictly on the basis of verbal commands. Further tests show that data obtained with imagination could be used to simulate mental training processes, such as those that have been employed with human subjects in sports training, in order to enhance precision in the performance of new tasks through the association of different verbal commands.
The third study [28.60] explored how spatial mental imagery practice in a training phase could increase accuracy in sports-related performance. The focus is on the capability to estimate, after a period of training with proprioceptive and visual stimuli, the position in a soccer field when the robot acquires the goal.
28.3.1 The Humanoid Robotic Platform: The iCub
The cognitive robotic platform used for the experiments presented here is the simulation of the iCub humanoid robot controlled by artificial neural networks. The iCub (Fig. 28.1) is an open-source humanoid robot platform designed to facilitate cognitive developmental robotics research, as detailed in [28.2]. In its current state the iCub platform is a child-like humanoid robot 1.05 m tall, with 53 degrees of freedom (DoF) distributed in the head, arms, hands, and legs. The implementation used for the experiments presented here is a simulation of the iCub humanoid robot (Fig. 28.1). The simulator, which was developed with the aim of accurately reproducing the physics and the dynamics of the physical iCub [28.61], allows the creation of realistic physical scenarios in which the robot can interact with a virtual environment. Physical constraints and interactions that occur between the environment and the robot are simulated using a software library that provides an accurate simulation of rigid body dynamics and collisions.
Fig. 28.1a,b The iCub humanoid robot platform: (a) The realistic simulator; (b) The real platform
28.3.2 First Experimental Study: Motor Imagery for Performance Improvement
The first experimental study explored the application of mental simulation to robot controllers, with the aim of mimicking mental training techniques to improve the motor performance of the robot. To this end, a model of a controller based on neural networks was designed to allow the iCub to autonomously improve its sensorimotor skills.
The experimental task is to throw a small cube of side 2 cm and weight 40 g as far as possible according to an externally given velocity for the movement. The task phases are shown in Fig. 28.2. The task is the realization of a ballistic action, involving the simultaneous movement of the right arm and of the torso. Ballistic movements can be defined as rapid movements initiated by muscular contraction and continued by momentum [28.62]. These movements are typical in sport actions, such as throwing and jumping (e.g., a soccer kick, a tennis serve, or a boxing punch). In this experiment, we focus on two main features that characterize a ballistic movement: (1) it is executed by the brain with a predefined order, which is fully programmed before the actual movement realization and (2) it is executed as a whole and will not be subject to interference or modification until its completion. This definition of ballistic movement implies that proprioceptive feedback is not needed to control the movement and that its development is only based on starting conditions [28.63]. It should be noted here that since ballistic movements are
Embodied Mental Imagery in Cognitive Robots 28.3 Experiments 625
by definition not affected by external interferences, the training can be performed without considering the surrounding environment, as well as vision and auditory information.
Fig. 28.2a–c Three phases of the movement: (a) Preparation: The object is grabbed and shoulder and wrist joints are positioned at 90°; (b) Acceleration: The shoulder joint accelerates until a given angular velocity is reached, while the wrist rotates down; (c) Release: The object is released and thrown away
To build the input–output training and testing sets, all values were normalized in the range [0, 1].
To control the robot, we designed a dual neural network architecture, which can operate to autonomously improve the robot's motor skills with techniques inspired by the ones that are employed with human subjects in sports training. This is achieved through two interconnected neural networks to control the robot: a FFNN that directly controls the robot's joints and
which feed directly to the hidden layer through a set of context units added to the input layer. At each iteration (epoch), the context units hold a copy of the previous values of the hidden and output units. This creates an internal memory of the network, which allows it to exhibit dynamic temporal behavior [28.64]. It should be noted that in preliminary experiments, this architecture showed better performance and improved stability with respect to classical architectures, for example, Jordan [28.65] or Elman [28.66]. The DRNN comprises 5 output neurons, 20 neurons in the hidden layer, and 31 neurons in the input layer (6 of them encode the proprioceptive inputs from the robot joints and 25 are the context units). Neuron activations are computed according to (28.1) and (28.2). The six proprioceptive inputs encode, respectively, the shoulder pitch angular velocity (constant during the movement), positions of shoulder pitch and hand wrist pitch (at time t), elapsed time, expected duration time (at time t), and the grab/release command (0 if the object is grabbed, 1 otherwise). The five outputs encode the predictions of shoulder and wrist positions at the next time step, the estimation of the elapsed time, the estimation of the movement duration, and the grab/release state. The DRNN retrieves information from joint encoders to predict the movement duration during the acceleration phase. The activation time step of the DRNN is 30 ms. The object is released when at time t_1 the predicted time to release, t_p, is lower than the next step time t_2.
A functional representation of the neural system that controls the robot is given in Fig. 28.3a: a three-layer FFNN, which implements the actual motor controller, and an RNN. The RNN models the motor imagery and is represented in detail in Fig. 28.3b. The choice of the FFNN architecture as a controller was
the output sequences produced by the network. The error function E is calculated as follows
$$E = \frac{1}{2} \sum_{i=1}^{p} \lVert y_i - t_i \rVert^{2} , \tag{28.3}$$
where p is the number of outputs, t_i is the desired activation value of the output unit i, and y_i is the actual activation of the same unit produced by the neural network, calculated using (28.2) and (28.3). During the training phase, synaptic weights at learning step n + 1 are updated using the error calculated at the previous learning step n, which in turn depend on the error E. Activations of hidden and output units y_i are calculated by
Fig. 28.3a,b Design of the cognitive architecture of the first experimental study: (a) The dual network architecture (FFNN + RNN). (b) Detail of RNN: Brown connections (recurrences and predicted output for the FFNN) are active only in imagery mode, meanwhile light grey links (external input–output from real world) are deactivated
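The recurrent architecture with context units, its imagery mode, and the error measure (28.3) can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the implementation used in the study: the logistic activation, the random weights, and all names (`ContextRNN`, `step`, `imagine`, `squared_error`) are hypothetical, as is the wiring that feeds the five predictions back into the last five proprioceptive inputs. Only the layer sizes (6 proprioceptive inputs plus 25 context units, 20 hidden units, 5 outputs) and the copy of the previous hidden and output values into the context units are taken from the text.

```python
import numpy as np

N_PROPRIO, N_HIDDEN, N_OUT = 6, 20, 5   # layer sizes quoted in the text
N_CONTEXT = N_HIDDEN + N_OUT            # 25 context units


def sigmoid(x):
    # Assumed activation; the chapter's own activation equations are not used here
    return 1.0 / (1.0 + np.exp(-x))


class ContextRNN:
    """Recurrent net whose context units copy the previous hidden and output values."""

    def __init__(self, seed=0):
        rng = np.random.default_rng(seed)
        # Untrained random weights, for illustration only
        self.w_in = rng.normal(0.0, 0.1, (N_HIDDEN, N_PROPRIO + N_CONTEXT))
        self.w_out = rng.normal(0.0, 0.1, (N_OUT, N_HIDDEN))
        self.context = np.zeros(N_CONTEXT)

    def reset(self):
        self.context = np.zeros(N_CONTEXT)

    def step(self, proprio):
        x = np.concatenate([proprio, self.context])      # 31 input units
        hidden = sigmoid(self.w_in @ x)
        out = sigmoid(self.w_out @ hidden)
        self.context = np.concatenate([hidden, out])     # memory for the next step
        return out

    def imagine(self, first_input, n_steps):
        """Imagery mode: predictions replace the external inputs at every step."""
        self.reset()
        proprio = np.asarray(first_input, dtype=float)
        trajectory = []
        for _ in range(n_steps):
            out = self.step(proprio)
            trajectory.append(out)
            # Assumed wiring: the velocity input stays constant, the five
            # predictions take the place of the other five inputs
            proprio = np.concatenate([proprio[:1], out])
        return np.array(trajectory)


def squared_error(y, t):
    """Error (28.3): half the summed squared difference between outputs and targets."""
    y, t = np.asarray(y, dtype=float), np.asarray(t, dtype=float)
    return 0.5 * np.sum((y - t) ** 2)


net = ContextRNN()
trajectory = net.imagine([0.5, 0.5, 0.5, 0.0, 0.5, 0.0], n_steps=25)
```

With trained weights, `trajectory` would hold one imagined 25-step sequence of the kind the RNN datasets contain; here it only demonstrates the data flow.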
output series. The learning and testing dataset for the FFNN comprises one pair of data: the angular velocity of the shoulder as input and the execution time, that is, duration, as desired output. The learning and testing datasets for the RNN comprise sequences of 25 elements collected using a time step of 0.03 s. All data in both learning and testing datasets are normalized in the range [0, 1]. Results are shown for the testing set only.
To test the mental training, we compared results on three different case studies:
1. Full range: For benchmarking purposes, it is the performance obtained by the FFNN when it is trained using the full range of examples (slow + fast)
2. Slow range only training: The performance obtained by the FFNN when it is trained using only the slow-range subset. This case stresses the generalization capability of the controller when it is tested with the fast-range subset
3. Slow range plus mental training: In this case the two architectures operate together as a single hierarchical architecture, in which first both nets are trained with the slow-range subset, then the RNN runs in mental imagery mode to build a new dataset of fast examples for the FFNN, which is incrementally trained this way.
As expected, the FFNN is the best controller for the task if the full range is given as training; thus, it is the ideal controller for the task (Table 28.1). But, not surprisingly, Table 28.2 shows that the FFNN is not able to generalize to the fast range when it is trained with the slow range only.
These results show that the generalization capability of the RNN helps to feed the FFNN with new data to cover the fast range, simulating mental training. In fact, the FFNN, trained only with the slow subset, is not able to foresee the trend of duration in the fast range; this implies that fast movements last longer than needed and, because the inclination angle is over 90°, the object falls backward (Fig. 28.4).
Fig. 28.4 Comparison of the distance reached by the object after throwing with the FFNN as controller and different training approaches. Negative values represent the objects falling backward
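The data flow of the third case (train on the slow range, imagine fast examples, retrain the controller on them) can be illustrated with a toy example. This is a sketch under strong assumptions, not the networks used in the study: a least-squares polynomial stands in for the FFNN, a hand-written step integrator stands in for the RNN running in imagery mode, and the sweep angle, the velocity ranges, and all function names are hypothetical. Only the 0.03 s time step comes from the text.

```python
import numpy as np

DT = 0.03      # s, time step quoted in the text
SWEEP = 60.0   # degrees travelled before release (hypothetical value)


def true_duration(velocity):
    """Ground truth, used only to score the toy controller."""
    return SWEEP / velocity


def imagined_duration(velocity):
    """Stand-in for imagery mode: step the joint forward in DT increments
    until the release angle is reached, without any real movement."""
    angle, steps = 0.0, 0
    while angle < SWEEP:
        angle += velocity * DT
        steps += 1
    return steps * DT


def fit_controller(velocities, durations, degree=2):
    """Stand-in for the FFNN: polynomial map from velocity to duration."""
    return np.polyfit(velocities, durations, degree)


def mean_abs_error(coeffs, velocities):
    predicted = np.polyval(coeffs, velocities)
    return float(np.mean(np.abs(predicted - true_duration(velocities))))


slow = np.linspace(100.0, 150.0, 8)   # deg/s, hypothetical slow range
fast = np.linspace(200.0, 400.0, 8)   # deg/s, hypothetical fast range

# Case 2: controller trained on the slow range only
slow_only = fit_controller(slow, true_duration(slow))

# Case 3: slow range plus imagined fast examples ("mental training")
imagined = np.array([imagined_duration(v) for v in fast])
augmented = fit_controller(np.concatenate([slow, fast]),
                           np.concatenate([true_duration(slow), imagined]))

err_slow_only = mean_abs_error(slow_only, fast)
err_mental = mean_abs_error(augmented, fast)
print(f"fast-range error, slow only: {err_slow_only:.3f} s")
print(f"fast-range error, with imagined examples: {err_mental:.3f} s")
```

In this toy setting the controller trained on the slow range extrapolates poorly to fast velocities, while adding imagined examples brings the fast-range error down, which mirrors the qualitative pattern described above.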
Table 28.1 Full-range training: comparison of average results of feedforward and recurrent artificial neural nets
Test range   Feedforward net                         Recurrent net
type         Duration         Release point          Duration         Release point
             s       Error%   Degree   Error%        s       Error%   Degree   Error%
Slow         0.472   1.75     30.718   6.46          0.482   3.60     33.345   11.96
Fast         0.202   0.87     31.976   6.34          0.194   5.67     28.088   22.18
Full         0.307   1.21     31.486   6.39          0.306   4.86     30.132   18.21
Table 28.2 FFNN: Comparison of average performance improvement with artificial mental training
Test range   Slow range only training                Slow range plus mental training
type         Duration         Release point          Duration         Release point
             s       Error%   Degree   Error%        s       Error%   Degree   Error%
Slow         0.474   1.38     30.603   7.88          0.471   1.74     30.774   8.18
Fast         0.247   26.92    64.950   111.72        0.188   7.12     20.901   35.89
Full         0.335   16.99    51.593   71.34         0.298   5.03     24.741   25.11
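The size of the improvement on the fast range can be read directly from Table 28.2. The short calculation below is only an arithmetic aid: the four error percentages are copied from the table, while the helper name `relative_reduction` is introduced here for illustration.

```python
# Fast-range error percentages copied from Table 28.2
slow_only = {"duration": 26.92, "release_point": 111.72}
mental_training = {"duration": 7.12, "release_point": 35.89}


def relative_reduction(before, after):
    """Fraction by which the error shrinks, relative to the starting error."""
    return (before - after) / before


for key in slow_only:
    r = relative_reduction(slow_only[key], mental_training[key])
    print(f"{key}: {slow_only[key]:.2f}% -> {mental_training[key]:.2f}% "
          f"({100 * r:.1f}% lower)")
```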
The FFNN's failure in predicting temporal dynamics is explained by the simplistic information used to train it, which seems not to be enough to reliably predict the duration time in a faster range never experienced before. In contrast, the greater amount of information that comes from proprioception, and the fact that the RNN has to integrate that information over time in order to perform the movement, make the RNN able to create a sort of internal model of the robot's body behavior. This allows the RNN to generalize better and, therefore, to guide the FFNN in enhancing its performance.
This interesting aspect of the RNN can be partially unveiled by analyzing the internal dynamics of the neural network, which can be done by reducing the complexity of the hidden neuron activations through a principal component analysis. Figure 28.5a, for example, presents the values of the first principal component at the last time step, that is, after the neural network has finished the throwing movement, for all the test cases, both slow and fast, showing that the internal representations are very similar, also in the case in which the RNN is trained with the slow range only. Interestingly, the series shown in Fig. 28.5a are highly correlated with the duration time, which is never explicitly given to the neural network. The correlation is 97.50% for the full-range training, 99.41% for slow only, and 99.37% for slow plus mental training. This result demonstrates that the RNN is able to extrapolate the duration time from the input sequences and to generalize when operating with new sequences never experienced before.
Similarly, Fig. 28.5b shows the first principal component of the RNN in relation to different angular velocities: slow being the slowest test velocity, medium the fastest within the slow range, and fast the fastest possible velocity tested in the experiment. As can be seen, the RNN is able to uncover the similarities in the temporal dynamics that link slow and fast cases. Hence, it is finally able to better approximate the correct trajectory of joint positions also in a situation not experienced before.
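The analysis of the hidden layer described above (first principal component of the hidden activations, then its correlation with the movement duration) can be sketched as follows. This is an illustrative sketch on synthetic data, not the analysis code of the study: the activation matrix, the durations, and the function name `first_principal_component` are hypothetical, and only the shape (18 test cases, 20 hidden units) follows the text and Fig. 28.5a.

```python
import numpy as np


def first_principal_component(activations):
    """Project each row (one test case) onto the first principal direction."""
    centered = activations - activations.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[0]


rng = np.random.default_rng(0)
durations = np.linspace(0.15, 0.60, 18)      # hypothetical durations, in s
mixing = rng.normal(size=20)                 # hypothetical hidden-unit loadings
hidden = np.outer(durations, mixing) + 0.01 * rng.normal(size=(18, 20))

pc1 = first_principal_component(hidden)
# The sign of a principal component is arbitrary, so compare magnitudes
correlation = abs(np.corrcoef(pc1, durations)[0, 1])
print(f"|correlation| between first component and duration: {correlation:.3f}")
```

Because the synthetic activations are built from the durations, the correlation is high by construction; with real hidden activations the same two steps would test whether the network encodes duration implicitly.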
Fig. 28.5 (a) Hidden units' activation analysis. Lines represent final values of the first principal component for all test cases (x-axis: test case; y-axis: component value; series: full range, plus mental training, slow only); (b) hidden units' activation analysis. Lines represent the values of first principal component for a slow velocity, a medium velocity and a fast velocity (x-axis: timestep; y-axis: component value)
28.3.3 Second Experimental Study: Mental Training Evoked by Language
In this experimental study, we dealt with motor imagery and how verbal instruction may evoke the ability to imagine movements, already seen before or new ones obtained by combination of past experiences. These imagined movements either replicate the expected new movement required by verbal commands or correspond in accuracy to those learned and executed during training phases. Motor imagery is defined as a dynamic state during which representations of a given motor act are internally rehearsed in working memory without any overt motor output [28.67].
This study extends the first experimental study presented above by focusing on the integration of auditory stimuli, in the form of verbal instructions, with the motor stimuli already experienced by the robot in past simulations. Simple verbal instructions are added to the training phase of the robot, in order to explore the impact that linguistic stimuli could have on its processes of mental imagery practice and subsequent motor execution and performance. In particular, we tested the ability of our model to use imagery to execute new orders, obtained by combining two single instructions. This study has been inspired by embodied language approaches, which are based on evidence that language comprehension is grounded in the same neural systems that are used to perceive, plan, and take action in the external world.
Figure 28.6 presents pictures of the action with the iCub simulator, which was commanded to execute the four throw tasks according to the verbal command issued
creates an internal state of the network, which allows it to exhibit dynamic temporal behavior. To model mental imagery the outputs related with the motor activities are redirected to corresponding inputs.
Similarly to the previous experimental study, after the learning phase in which real data collected during simulator execution was used to train the RNN and for comparison with imagined data, we tested the ability of the RNN architecture to model mental imagery. As before, this was achieved by adding other back connections from motor outputs to motor inputs; at the same time connections from/to joint encoders and motor controllers are deactivated. This setup is presented in Fig. 28.7, where red connections are the ones active only when the imagery mode is on, while green connections are deactivated, including the motor controller. Specific neurons, one for each verbal instruction, were included in the input layer of the RNN in order for it to take into account these commands, while the sensorimotor information is directed to the rest of the neurons in the input layer. The RNN architecture implemented, as presented in Fig. 28.7, has 4 output units, 20 units in the hidden layer, and 27 units in the input layer, 7 of
joints and 20 are the context units, that is, back links of the hidden units, which only copy the value from the output of the upper unit to the input of the lower unit. The learning algorithm and parameters are the same as the second experiment.
As proprioceptive motor information, we take into account just the following three joints: shoulder pitch, torso yaw, and hand wrist pitch. In addition, in order to model the throw of an object, the primitive action to grab/release was also considered in the motor information fed to the network. Visual information was not computed and speech input processing was based on standard speech recognition systems.
Using the iCub simulator, we performed two experiments:
The first experiment aimed to evaluate the ability of the RNN to model artificial mental imagery. It was divided into two phases: in the first phase the network was trained to predict its own subsequent sensorimotor state. The task was to throw in different directions (forward, left, right, back) a small object that was placed in the right hand of the robot,
which is able to grab and release it. To this end the RNN was trained using the proprioceptive information collected from the robot. The proprioceptive information consisted of sensorimotor data (i.e., joint positions) and of verbal commands given to the robot according to directions. In the second phase, we tested the ability of the RNN to model mental imagery providing only the auditory stimulus (i.e., the verbal commands) and requiring the network to obtain sensorimotor information from its own outputs.
The goal of the second experiment was to test the ability of the RNN to imagine how to accomplish a new task. In this case we had three phases: in the first phase (real training), the RNN was trained to throw front and just to move left and right (with no throw). In the second phase (imagined action), the RNN was asked to imagine its own subsequent sensorimotor state when the throw command is issued together with a side command (left or right). In the final phase (mental training), the input/output series obtained are used for an additional mental training of the RNN. After the first and third phases, experimental tests with the iCub simulator were made to measure the performance of the RNN in controlling the robot.
In this experiment, we tested the ability of the RNN to recreate its own subsequent sensorimotor state in absence of external stimuli. In Fig. 28.8, we present a comparison of training and imagined trajectories of learned movements according to the verbal command issued:
1. Fig. 28.8a shows results with the FRONT command
2. Fig. 28.8b with the BACK command
3. Fig. 28.8c with the RIGHT command
4. Fig. 28.8d with the LEFT command.
Imagined trajectories are accurate with respect to the ones used to train the robot; only in Fig. 28.8b do we notice a slight difference between imagined and training positions of the arm. This difference can be attributed to the fact that the BACK command is the only one that does not require the arm to stop early in throwing the object. In other words, the difference is related to the timing of the movement rather than to the accuracy. Results show that the RNN is able to recall the correct trajectories of the movement according to the verbal command issued. The trajectories are the sequence of joint positions adopted in the movements.
The second test was conducted to evaluate the ability of the RNN to build its own subsequent sensorimotor states when it is asked to accomplish new tasks not experienced before. In this case, the RNN was trained only to throw front and to move right and left (without throwing). To allow the RNN to generalize, training examples were created using an algorithm that randomly chose joint positions not involved in the movement, that is, when throwing, the torso joint had a fixed position that was randomly chosen. The same was true for arm joints when moving right and left.
Test cases, presented in Fig. 28.9, were composed using two commands (e.g., throw together with right or left). In our experiments, we tested two different approaches in the language processing. In this test two commands were computed at the same time, so that input neurons associated with throw and right (or left) were fully activated at the same time with value 1.
Results of the mental training experiment are presented in Fig. 28.10, which shows the error of torso and arm joint position with respect to the ideal ones. The before mental training column presents the results of tests made without additional training, the after mental training column shows results after the simulated mental training, and the imagined action only column refers to totally imagined data (i.e., when the RNN predicts its own subsequent input). Comparing results before and after mental training, an improvement in precision of dual command execution can be noticed; this can be attributed to the additional training that helps the RNN to operate in a condition not experienced before. It should also be noticed that the throw right task has worse performance compared to that of throw left with iCub simulations, but the same result is not achieved in imagined only action mode. This could be mainly explained by the influence of real proprioceptive information coming from robot joints that modifies the ideal behavior expected by the RNN, as evidenced by the comparison between imagined only and the real tests. In fact, we noticed that when a right command is issued the robot torso is initially moved to the left for a few time steps and then it turns right. Since the total time of the movement is due to the arm movement to throw, the initial error for the torso could not be recovered.
28.3.4 Third Experimental Study: Spatial Imagery
For this experiment the environment is a square portion of a soccer field, whose length and width are both 15 m. At one end is placed a goal 1.94 m wide, as can be seen in Fig. 28.11b, which is represented all in blue to contrast with the background and to be easily recognized. The robot can be positioned anywhere in this square and, as starting position, the ball is placed in front of its left foot (Fig. 28.11a).
The neural system that controls the robot is a fully connected RNN with 16 hidden units, 37 input units,
Fig. 28.8 First test: A comparison of training and imagined trajectories of learned movements (panels a–d; x-axis: time step; y-axis: position; series: training and imagined torso, arm, and hand)
controlling the robot, but used only to estimate the position coordinates. In a second trial the robot was fully moved by means of the neural network (autonomous condition), which directly controlled the joints (i.e., network outputs were sent to neck, torso, and the body rotation actuators) as well as the kick command (moves the leg down). The result of this preliminary study shows that the autonomous condition performs better (i.e., 5% lower average error) than the controlled one in terms of position estimation. For this reason, for the main experiment we employed the autonomous condition only, which is also more in line with the theoretical background presented so far.
The aim of the experiment was to test the generalization performance of the neural network and to evaluate the use of visual and proprioceptive information for the estimation of the robot position with respect to the goal. As testing phase, the robot was positioned in the same positions used for training (learning set) to verify the quality of the learning and, then, in eight new positions (testing set) that it had not experienced before, to evaluate the generalization capability of the model.
The structure of the experiment is divided into two phases: in the first phase the network is trained to predict its own subsequent sensorimotor state. In the second phase the network is tested on the robot, in interaction with the environment.
Table 28.3 shows the percentage error between imagined positions and real position. The error is evaluated as the percentage with respect to the real positions using the polar coordinates. All values of the imagined positions in Table 28.3 were obtained with autonomous control. Analyzing errors in testing position, we can see that the error is very high for position 8; this is because the robot fails to acquire the target after it makes a wrong move at the beginning and the goal goes out of sight. It should be said that the robot has not been trained to find the goal when it is not at least in part in its visual field. The same happened when the robot was in positions 2 and 7 of the learning set. The robot misses 50% of the scores, but it is worth mentioning that errors were mostly made when the goal was very distant (i.e., more than 8 m) and even a little error in the position leads the ball out of the goal.
Figures 28.13 and 28.14 graphically summarize the results showing the environment with the 8 real and imagined positions of the train set and the test set, respectively. As the figures show, overall the robot is able to estimate its position in the environment to a good extent.
Table 28.4 reports the distance, evaluated using Cartesian coordinates, between the real positions and the first and last estimated positions in the imagined series. This evaluation gives further evidence that the failures on some positions are due to the fact that the robot is not trained to find the target when it is out of sight. Indeed, the first imagined position is quite good, but after the wrong movement the robot is no longer able to see the goal and the visual input becomes all zeros; thus, it has no way to recover.
Figures 28.15 and 28.16 report the entire imagined path, along with markers for first and last imagined positions and the actual position. The imagined path is the fictitious path that is composed of all positions imagined according to the movements made. The imagined paths for positions with very high error are not depicted to avoid confusion. It can be noted that the accuracy of imagined positions gradually improves while the robot performs the movement to aim at the center of the goal and shoot. The average improvement is 0.43 m for the learning set and 0.56 m for the testing set. According to this result, the use of proprioceptive motor information, coming from the autonomous body movements, influences the robot imagination, and it often helps to better estimate its position in the field.
Table 28.3 Real versus imagined positions for learning and testing sets (normalized polar coordinates) and error percentages
     Learning                                   Testing
     Actual          Imagined                   Actual          Imagined
N    Angle  Radius   Angle  Radius   Error%     Angle  Radius   Angle  Radius   Error%
1    0.50   0.400    0.50   0.415    2.07       0.58   0.275    0.57   0.242    11.95
2    0.34   0.457    0.76   0.905    122.59     0.44   0.34     0.43   0.301    11.57
3    0.64   0.477    0.64   0.560    11.18      0.61   0.496    0.61   0.476    4.32
4    0.58   0.550    0.56   0.545    4.04       0.39   0.57     0.39   0.447    21.56
5    0.42   0.550    0.41   0.543    3.38       0.54   0.511    0.53   0.476    7.35
6    0.62   0.706    0.66   0.724    10.48      0.5    0.653    0.51   0.652    2.74
7    0.37   0.743    0.76   0.905    130.32     0.65   0.757    0.58   0.78     20.24
8    0.50   0.733    0.50   0.648    11.47      0.36   0.806    0.77   0.904    125.53
Average                              36.94      Average                         25.66
Excluding positions 2 & 7            7.1        Excluding position 8            11.39
Table 28.4 Distance (in meters) of imagined positions with respect to actual ones, with improvement from first to last estimate
      First estimate            Last estimate                     Improvement
      (static vision only)      (vision and motor information)    (with body movement)
N     Learning   Testing        Learning   Testing                Learning   Testing
1     0.48       0.78           0.23       0.49                   0.25       0.29
2     1.18       2.46           13.66      0.59                   -12.48     1.87
3     1.28       0.59           1.24       0.32                   0.04       0.27
4     0.28       2.46           0.45       1.84                   -0.17      0.62
5     0.64       0.86           0.14       0.56                   0.5        0.3
6     2.39       0.95           1.16       0.27                   1.23       0.68
7     1.84       2.22           14.52      2.3                    -12.68     -0.08
8     1.98       2.73           1.27       14.99                  0.71       -12.26
Avg   1.26       1.63           4.09       2.67                   0.43a      0.56a
a Average without positions 2 and 7 (learning) and 8 (testing); see text for details
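The improvement column and the footnoted averages of Table 28.4 follow from the first and last estimates by simple arithmetic, which the short check below reproduces. The distances are copied from the table; the function name `mean_improvement` is introduced here for illustration. Improvement is taken as first estimate minus last estimate, so a negative value means the estimate got worse during the movement.

```python
# (first estimate, last estimate) in meters, copied from Table 28.4
learning = [(0.48, 0.23), (1.18, 13.66), (1.28, 1.24), (0.28, 0.45),
            (0.64, 0.14), (2.39, 1.16), (1.84, 14.52), (1.98, 1.27)]
testing = [(0.78, 0.49), (2.46, 0.59), (0.59, 0.32), (2.46, 1.84),
           (0.86, 0.56), (0.95, 0.27), (2.22, 2.30), (2.73, 14.99)]


def mean_improvement(rows, excluded=()):
    """Average of (first - last) over the positions not in `excluded` (1-based)."""
    gains = [first - last
             for n, (first, last) in enumerate(rows, start=1)
             if n not in excluded]
    return sum(gains) / len(gains)


# Positions where the goal left the visual field are excluded, as in the footnote
avg_learning = mean_improvement(learning, excluded=(2, 7))
avg_testing = mean_improvement(testing, excluded=(8,))
print(f"learning: {avg_learning:.2f} m, testing: {avg_testing:.2f} m")
```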
Part F | 28.4
Fig. 28.13 The eight real and imagined positions in the field for the learning set (axes: endline, sideline; legend: goal, learning position, imagined estimate)
Fig. 28.14 The eight real and imagined positions in the field for the testing set (axes: endline, sideline; legend: goal, testing position, imagined estimate)
Fig. 28.15 Learning set: Imagined paths, with first and last position estimates, compared to real locations in the field (axes: endline, sideline; legend: real position, first imagined, last imagined)
Fig. 28.16 Testing set: Imagined paths, with first and last position estimates, compared to real locations in the field (axes: endline, sideline; legend: real position, first imagined, last imagined)
28.4 Conclusion
Despite the wide range of potential applications, the fast-growing field of cognitive robotics still poses several interesting challenges, both in terms of mechanics and autonomous control. Indeed, in the new humanoid platforms, sensor and actuator arrangements determine a highly redundant system, which is traditionally difficult to control, and hard-coded solutions often do not allow further improvement and flexibility of controllers. Letting those robots learn freely from their own experience is very often regarded as the only real solution that will allow the creation of flexible and autonomous controllers for humanoid robots in the future.
In this chapter, we presented the work done so far to explain the concept of motor imagery and mental simulation as a fundamental capability for cognitive models, based on artificial neural networks, which allow the humanoid robot iCub to autonomously improve its sensorimotor skills via simulated mental imagery mechanisms.
Three experimental studies with the iCub platform simulator were presented to show that the application of imagery-inspired mechanisms can significantly improve the cognitive behaviors of the robot, even in ranges not experienced before. The results presented, in conclusion, allow imagining the creation of novel algorithms and cognitive systems that implement even better and with more efficacy the concept of artificial mental training. Such a concept appears very useful in robotics, for at least two reasons: it helps to speed up the learning process in terms of time resources by reducing the number of real examples and real movements performed by the robot. Besides the time issue, the reduction of real examples is also beneficial in terms of costs, because it similarly reduces the probability of failures and damages to the robot while keeping the robot improving its performance through mental simulations. In the future, we speculate that imagery techniques might be applied in robotics not only for performance improvement, but also for the creation of safety algorithms capable of predicting dangerous joint positions and of stopping the robot's movements before that critical situation actually occurs.
From a technological point of view, this chapter aims to support the better understanding of mental imagery as a potential breakthrough for cognitive robot engineering principles. Such principles can be applied to go further in the development of artificial cognitive systems capable of better interacting with the environment and refining their cognitive motor skills in an open-ended process. These robots will be able to reason, behave, and interact in a human-like fashion, thanks to the integration of the capabilities to mentally represent the physical and social world, resemble experiences, and simulate actions. The imagery-enabled cognitive robotic agents will be able to handle and manipulate objects and tools autonomously, to cooperate and communicate with other robots and humans, and to adapt their abilities to changing internal, environmental, and social conditions.
Acknowledgments. This work was partially supported by UK EPSRC Project BABEL and the European Commission FP7 Projects: POETICON++ (ICT-288382) within the Cognitive Systems, Interaction, Robotics unit (FP7 ICT Challenge 2), ROBOT-ERA (ICT-288899) within the ICT for Health, Ageing Well, Inclusion and Governance unit (FP7 ICT Challenge 5).
provided a bridge between thought and action, a basis by which to characterize thought and action as inextricably combined. These models hold that action is a component of perception, or that thought and action are inseparable, or that thought and action act in concert, two sides of the same coin serving to reduce the uncertainty about the nature of events.

This chapter provides a review of several models of cognition in terms of their dynamical features, including models not generally included in the dynamical tradition, such as ART and ACT-R. It focuses on the manner in which each model treats time and complexity, thought, and action. It provides a glimpse into the methods of model development and analysis associated with the various approaches to modeling cognitive processes.

29.3.4 Adaptive Resonance Theory
29.3.5 Summary
29.4 Cognition and Action Intrinsically Linked
29.4.1 Methods
29.4.2 Embodied Cognition
29.4.3 Motor Theory
29.4.4 Simulation Theory
29.4.5 Free Energy Theory
29.4.6 Evolution of Cognitive Search
29.4.7 Summary
29.5 Conclusion
References
29.1 Dynamics
The properties of the mind are formulated and quantified by models of cognition. Examples of these will be summarized here from the dynamic point of view, which will bring in some of their points of contact with fields other than psychology, from philosophy to engineering. Models will be described in ways which highlight the manner in which they address dynamical features of mental activity.

The dynamic point of view involves two properties of cognitive models, one methodological, the other substantive. The methodological aspect centers on matters of time and complexity, the question being in what way the model incorporates concepts and methods which address temporal change as formulated in nonlinear dynamic systems theory and the science of complex systems. The substantive aspect involves the manner in
which the models address the relationship between the mental and the physical, cognition and action.

29.1.1 Time and Complexity

Philosophical properties of dynamical models have been considered in comparison to models which have been called computational. The former are distinguished by elements of timing and complexity, as might characterize the production and comprehension of speech. The latter are distinguished by stability, for which timing is arbitrary, as in associations among mental symbols or the application of rules of logic. Another way to express the difference is that for dynamical models of complex processes, several subprocesses act simultaneously and interactively, while for computational models, the subparts of a modeled process act as independent modules, each of which communicates with other modules by taking input prepared by a previously active module, processing it, and making the result available to the next module.

Similar distinctions have appeared in philosophical debates using related terminology, as in emergent versus reductionist, dynamical versus generative, and connectionist versus artificial intelligence. Advocates for dynamical models have argued that they are superior in the sense of being more complete, perhaps more general, than their computational counterparts, as in the following quote, which refers to the computational approach as Hobbesian [29.1]:

“[...] Every cognitive process unfolds in continuous time, and the fine temporal detail calls out for scientific accounting. Moreover, many cognitive structures are essentially temporal: like utterances, they exist only as change in time. Often, getting the timing right is critical to the success of cognitive performance; this is especially so when in direct interaction with surrounding events.
Hobbesian computational models have made a bet that cognitive phenomena can be described in a way that abstracts away from the full richness of real time, replacing it with discrete orderings over formal states.”

The emphasis on continuous time is a stringent requirement for research strategies in psychology. Incorporating time as an essential component of data is something easily done, but in many areas of psychology this means not continuous time, but time-sampling, in which the process in question is measured at intervals, resulting not in a continuous signal, but in discrete time, a time series of observations. Much of the terminology characteristic of research related to dynamical models arises from the requirement of describing and drawing conclusions from dynamical time series.

With dynamical time-series data, the researcher can apply graphical techniques to succinctly lay out the time-course or trajectory of the process, the appearance of both stable patterns, called attractors, and the paths, called transients, to and from attractors. The graphical techniques have mathematical counterparts, and additional mathematical techniques are available to quantify and summarize features of the time series. In this way the dynamics of the process, its rule of evolution, may be quantified, classified, and understood.

Accepting time and time series as a fundamental requirement of data emphasizes a focus not only on cognitive entities, symbols and rules of manipulation, but also on cognitive performances: perceiving, remembering, conversing. From this point of view, the scientific objective should be one of describing the time series of processes and correlates of cognitive behavior and discovering the rule of evolution by which the cognitive performance unfolds over time [29.2].

In an example from the production and comprehension of speech, the dynamical approach defines the basic unit of data to be a continuous linguistic signal, while the generative approach defines it as discrete phonetic segments. A summary of the implication of the two approaches, continuous-dynamical versus segmented-generative, is given in the following quotation [29.3]:

“[...] a fundamental mistake of the generative paradigm is its assumption that phonetic segments are formal symbol tokens. This assumption permitted the general assumption that language is a discrete formal system. This bias forced generative phonology to postulate a phonetic space that is closed and contains only stable symbolic objects. We show that theories of phonetics satisfying these constraints have little to no support from phonetic evidence and thus that the formal-language assumption is surely incorrect.”

There is an implication that dynamical models of cognition enjoy superior status, their associated laws being deeper and more generally applicable to behavior. That assertion has been philosophically evaluated by examining dynamical models considered as examples of greater and lesser laws. Greater laws are defined to be more widely applicable than lesser. Lesser laws might be descriptive accounts of particular mental dynamics, or they might rely on concepts for which temporal factors are negligible. Dynamical models of cognition were shown to exemplify both greater and lesser laws, leading to the conclusion that time and complexity alone are not sufficient to distinguish greater versus lesser laws [29.4].
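The time-series vocabulary used in this section (time-sampling, rule of evolution, transient, attractor) can be made concrete with a small sketch. The rule of evolution below, `x <- x + dt * (x - x**3)`, is a toy system chosen only for illustration and is not a model discussed in the chapter; the function names `evolve` and `transient_length`, the step size `dt`, and the tolerance `tol` are likewise assumptions. The sketch samples the process at discrete steps and then separates the early transient from the stretch in which the series has settled into one of its two attractors (near +1 or -1).

```python
def evolve(x0, steps=200, dt=0.1):
    """Sample a toy rule of evolution, x <- x + dt * (x - x**3), at discrete steps."""
    series = [x0]
    for _ in range(steps):
        x = series[-1]
        series.append(x + dt * (x - x ** 3))
    return series


def transient_length(series, tol=1e-6):
    """Index of the first sample from which all successive changes stay below tol.

    Returns len(series) if the series never settles within the samples given.
    """
    settled_from = len(series)
    # Walk backward from the end while the step-to-step change stays small.
    for i in range(len(series) - 1, 0, -1):
        if abs(series[i] - series[i - 1]) >= tol:
            break
        settled_from = i
    return settled_from


if __name__ == "__main__":
    for x0 in (0.2, -0.2):
        series = evolve(x0)
        n = transient_length(series)
        print(f"start {x0:+.1f}: transient of {n} samples, settles near {series[-1]:+.3f}")
```

In this toy case the attractor is a fixed point, so "settled" can be judged by step-to-step change alone; periodic or more complex attractors would need a different criterion.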
Dynamical Models of Cognition 29.2 Data-Oriented Models 641
The dynamical versus computational distinction has been characterized as competing answers to the mind-body problem. In this view, dynamical theories address mind-body as a single phenomenon with cognition and action being two faces of the same process. In contrast, computational models have the dualist view that mind and body are different entities; the mind manipulates information to formulate goals and plans for the body to execute. The following is an elaboration of the dynamical view [29.5]:

“All that we know we have constructed within ourselves from the unintelligible fragments of energy impacting our senses as we move our bodies through the world. This process of intention is transitive in the outward thrust of the body in search of desired future states; it is intransitive in the dynamic construction of predictions of the states in the sensory cortices by which we recognize success or failure in achievement. The process is phenomenologically experienced in the action-perception cycle. Enactment is through the serial creation of neurodynamic activity patterns in brains, by which the self of mind-brain-body comes to know the world first by shaping the self to an approximation of the sought-for input, and then by

In discussions of dynamical versus computational models of cognition, illustrations have been examples of action, for which timing of interacting subsystems is an essential feature, versus cognition, for which timing enters as a secondary or negligible feature. The contrast between dynamical and computational might be illustrated as the contrast between a musical performance and the music sheet, or between a conversation and a transcript, in each case the former being action, the latter cognition. Hence, there are two conflicting points of view on the relationship between action and cognition: they are either sequentially cooperating subsystems, or they are integrated, communicating subsystems.

This chapter examines several theories of cognition, focusing where possible on ways in which they characterize cognition and action and on ways in which they incorporate timing and complexity. The topic is divided into three major categories. The first concerns methods in data-oriented models, which make few general claims, but rather summarize a phenomenon or a special-purpose model. The second concerns general models of cognition for which cognition and action are treated as distinct and separate processes, that is, knowledge can be developed and transformed without reference to action. The third concerns models for which cognition and action are intrinsically linked.
models generate simple patterns from complex systems. Three of the many methods associated with dynamical modeling will be of interest here: quantifying attractors, concepts of potential, and scale invariance.

Quantifying Attractors
Attractors are persistent patterns which arise in a dynamical process. As an example, a whirlpool is an attractor in a flowing stream. When and how attractors are exhibited by a dynamical process often depends on the value of a quantity analogous to energy of the process. That quantity is represented by the control parameter. One of the ways to formalize a dynamical system is to identify the attractors, then specify the manner in which the control parameter governs the trajectory of the process into and out of attractors. In the example of the flowing stream, the control parameter could be the rate of flow: when the stream flows faster or slower, old whirlpools might disappear and new ones come into being. The idea is to incorporate the control parameter and nonlinear feedback into the rule of evolution.

Attractors are the most perceptible aspect of a dynamical process, primarily due to their duration. Other aspects of the process are fleeting and ephemeral, but attractors can last long enough to be noticed and named. Take, for example, the process underlying the reversals of an optical illusion in which the attractors are the two possible perceptions of a staircase, rising or descending. Most of the time the perception is of one or the other, each of which is easy to describe. The time of transition between attractors is rapid and the transient ephemeral. During the transient, the staircase is neither rising nor descending, and is not easily described or named. Transient and attractor intuitively may seem two different processes, yet the dynamics of the process in the visual system are presumed to be the same whether the perception is of the staircase rising, falling, or transient. It is that constant process which is captured in its rule of evolution. The assumption of dynamical systems analysis is that all phenomena arise from a rule of evolution which remains unchanged as the system evolves through attractors and transients. As the process continues through time, it visits the attractors, whose persistence constitutes the phenomenological experience of the process. While in an attractor, the system generates and maintains a pattern of increased predictability, equated with lower surprise, sometimes equated with emergent phenomena.

Potential
Dynamical processes may be drawn into attractors and may stay in or escape from an attractor based on energetic properties. Systematic energetic properties are measured by a potential function V, the basic parameter of which is the control parameter. Thus, the dynamics, motion, and phenomenology of the system are governed by the control parameter, via the potential function at any point in time. There is a component of optimization, that is, of the system having a quality of always moving toward a lower potential.

A method for quantifying the relation between the control parameter and the attractors of a dynamical process starts with a formula for potential V associated with each point in the space possibly lying on a trajectory of the system, that is, in a space defined by the axes which correspond to the system variables and the control parameter. The rule is that systems always move in a direction to reduce V. The attractors will therefore be located at points for which the potential is at a local or global minimum, which might be conceived, respectively, as a shallow or deep valley. When the system reaches a point for which V is a minimum, it tends to remain there in an attractor, to be moved out of that attractor only by random or systematic fluctuation, both of which might be increased by additional energy, hence the persistence of attractors. Some energy is required to keep the process in operation and drive it to its deepest attractor, but too much energy can make the system less predictable by causing it to jump in and out of attractors.

It is the essence of attractors that they persist over time and so they are perceived as patterns or entities. Local attractors last for a time comparatively shorter than global attractors. Thus, global attractors have the effect of staving off the effects of time. That is, the system remains patterned and therefore more predictable over an extended period. By comparison, transients happen in time whereas attractors happen over time.

Scale Invariance
Nonlinear dynamic systems analysis can often reveal that a process or object has the same form whether it is viewed overall or microscopically. When such a relationship occurs, it is described by the equivalent terms of fractal, self-similar, or scale invariant. Examples of scale invariance occur widely in nature, often being visual examples of trees or landscapes, but the concept of scale invariance also applies to the mathematics of the underlying processes, so a model without a visual counterpart may be described as scale invariant.

29.2.2 Example: Motor Coordination

A data-oriented method of complexity science, coupled oscillators, has been applied specifically to bilateral coordination of hand motion [29.9]. The analysis of the purely motor
phenomenon has served as inspiration for applying the same method of analysis to cognitive phenomena. In the hand-motion study, a participant with hands on the table is instructed to start moving both index fingers in a given pattern at a given rate, then, on cue, to speed up or slow down. The model relies on a fundamental variable $\phi$, the observed phase difference between the two fingers. When in phase (toward and away from each other), $\phi = 0$. When out of phase (both point the same direction), $\phi = \pi$. With the application of complexity theory and the theory of coupled oscillators, it was possible to derive specific predictions of amplitudes and frequencies critical for the system. That is, the theory predicted accurately for individuals at what frequency in-phase or out-of-phase patterns would appear, and additionally what changes in amplitude of finger motion occurred as the process moved between out-of-phase and in-phase. The fundamental equation used to quantify the locations and depths of the attractors of the system was derived and given as

$$V = -a \cos\phi - b \cos 2\phi , \tag{29.1}$$

where potential V is a function of the phase difference $\phi$ and constants a and b. The trajectory of coordination of hand motion occurs predominantly in a direction which minimizes V, with small fluctuations. The function V has two distinct minima, a shallow, local minimum and a deep, global minimum. The values of $\phi$ at which these minima occur can be found by taking the derivative of V with respect to $\phi$, setting it equal to zero, and solving for $\phi$. These are the values of $\phi$ associated with two attractors of the process. When in operation, the system generates only two attractors, a strong one at $\phi = 0$, associated with a global minimum of V, and a weak one at $\phi = \pi$, associated with a local minimum. Transition between phases is governed by the control parameter, the speed of oscillation, and shows unidirectional hysteresis. That is, it is more likely to fall into the global attractor than to escape it. While the process is in an attractor, it appears patterned, either in phase or out of phase. When the trajectory takes the system between attractors, it is in a transient and appears uncoordinated. For any participant, the precise frequencies at which the two attractors occur and the ease of moving between attractors are captured by the relationship of $|b|$ to $|a|$, a quantity which can independently be determined for each participant. The description so derived has been shown to be consistent with details of coordination of bilateral hand movement.

Although the application obviously concerns only coordinated motor systems, the method of coupled oscillators has come into play, at least metaphorically, in theories of infant cognitive development. The developmental applications rely on the emergence of patterns of phase coordination which arise naturally out of properties of motor structures. A pattern, however, gives the impression of independent existence. Some theories of cognitive development claim that some of the phenomena of the infant’s cognitive development, ideas about the permanence of objects, for example, may simply be emergent properties wholly a product of interaction of motor systems.

29.2.3 Example: Decision Under Risk

A second example of a data-oriented approach offers an integration of several theories and results in the field of decision under risk [29.10]. To summarize, using a modification of terminology: Established results begin with the concept of the subjective value (y) of a gamble, a quantity which can be determined from a person’s choices among gambles. Within a wide range, for a rational person, the subjective value of a gamble is equal to the objective expected value (x), negative for losses, positive for gains. It might be called Rational Value Theory (RVT). But much research has shown people often act as if y has been modified by additional subjective evaluations of winning and losing and other features of the gambling experience, yielding a measure of utility. When y equals utility for each value of x, it can be said that choice of gambles is governed by Expected Utility Theory (EUT). In a third theory, acknowledging that losses and gains are often relative, y of the gamble can be profoundly affected, even reversed, by the context in which the gamble is presented (framing). When y can be reversed from a preference to an aversion by, for example, changing the framing from a context of gain to a context of loss, it is an example of Prospect Theory (PT). Each theory yields a characteristic pattern in the graph of y against x. The patterns RVT, EUT, and PT can all be observed within an individual. Deviations from RVT have been associated with emotional involvement, with the least emotional involvement described by RVT, moderate by EUT, and high by PT, critical facts for dynamical systems analysis.

Dynamical systems analysis of the choices among gambles begins with the assumption that the three graphs of y against x constitute the observed attractors of the process of choice among gambles. For RVT, the graph would be a 45° line, y = x, through the origin. For EUT the graph is ogive shaped with varying degrees of steepness, often asymmetric. For PT, over the mid-range of x, the graph is S-shaped, also somewhat asymmetric. Taking just the upper and lower arms of the S, y is then a two-valued function of x, representing reversal of preferences due to framing. The goal
is to embed the three attractors on a surface in three take place within an agent. This suggests time-series
dimensions by introducing a third axis, the control pa- studies of a single agent with varying levels of emo-
rameter a. tional involvement over might profitably address dy-
The control parameter a is defined as an indicator namical movement among attractors. Such an approach
of amount of emotional involvement. It is included as might reveal individual differences, intermediate attrac-
a third dimension, placing the three y by x graphs on tors, a bifurcation point at which the curve becomes
a surface in three-dimensional space and in a specific 2-valued, and possibly hysteresis in the 2-valued range,
relation to each other. Graphs for RVT, EUT, and PT are thus posing challenges for continued development of
arranged, respectively, from linear to S-shape, at a = 0, theories of decision under risk.
a = moderate value, and a = high value. As is characteristic
in nonlinear dynamic systems analysis, the 29.2.4 Summary
control parameter is hypothesized to govern movement
between the attractors; the least emotional involvement The examples illustrate two of the many dynami-
associated with linear RVT, moderate involvement with cal approaches of data-oriented research. In the first,
ogival EUT, and high involvement with discontinu- the approach quantifies coordination of limb move-
ities of PT. When the three curves are arranged along ment, applying the model of coupled oscillators, well-
the third dimension, a, their graphs suggest the cross- understood in complexity science. The model enabled
sections of the folded surface of a cusp catastrophe, for predictions about both phenomenological and quanti-
which a standard quantification exists. tative details of coordination of hand movement, that
For processes the trajectories of which lie on a cusp- is, about the nature of attractors and transients. This
catastrophe surface, (29.2) applies. It gives the formula adds to theory by clearly delineating an approach to
Part F | 29.3
for finding the potential .V/ associated with any triple characterize the nature of emergent phenomena. The
of points .y; x; a/. To obtain the formula for the surface, second approach is to begin with known attractors, as
it is only necessary to find the values of .y; x; a/ which in the three styles of risk-taking characterizing subjec-
make V a minimum. This is a straightforward process tive value as a function of objective value, then propose
of taking the partial derivative of V with respect to y, to place the functions in a space, arranged along a di-
setting the derivative equal to zero and solving the result mension of a control parameter. The approach envisions
to express y as a function of x and a. The graph of the a single surface on which the three attractors lie. The
resulting function will be the cusp catastrophe surface resulting relationships add to theory by delineating the
upon which the graphs of the three theories can be fit forms of other possible attractors of the process and
processes which might go on within an individual. The
V D y4 C ay2 C xy : (29.2) first theoretical analysis concerns only action. In the
second, study concerns only cognition. In both types of
The value of this analysis by cusp catastrophe is not approaches, the goal is to end up with an understand-
only that it represents several types of betting choice on ing of the attractors the nature of the transients, and the
a single surface, describing all three with a single for- control parameter, using methods of complexity science
mula but also that it lays out a dynamics which might and nonlinear dynamic systems theory.
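The minimization step described for the cusp potential (29.2) can be sketched numerically. Setting the partial derivative of V with respect to y to zero gives 4y^3 + 2ay + x = 0, and the solutions with a positive second derivative are the minima, that is, the attractors. The following is a minimal illustration rather than the authors' procedure; the function names, the search interval, and the sample values of x and a are assumptions chosen for the example. With the signs as printed in (29.2), the two-valued region appears where the coefficient of y^2 is negative, so how emotional involvement would map onto that coefficient is left open here.

```python
def dV_dy(y, x, a):
    """First derivative of V = y**4 + a*y**2 + x*y with respect to y."""
    return 4.0 * y**3 + 2.0 * a * y + x


def d2V_dy2(y, a):
    """Second derivative of V with respect to y (positive at a minimum)."""
    return 12.0 * y**2 + 2.0 * a


def equilibria(x, a, lo=-5.0, hi=5.0, steps=2000):
    """Real solutions of dV/dy = 0 in [lo, hi), by grid scan plus bisection.

    The interval and grid size are illustrative assumptions.
    """
    roots = []
    width = (hi - lo) / steps
    y0, f0 = lo, dV_dy(lo, x, a)
    for i in range(1, steps + 1):
        y1 = lo + i * width
        f1 = dV_dy(y1, x, a)
        if f0 == 0.0:
            roots.append(y0)            # the grid point is itself a root
        elif f0 * f1 < 0.0:
            left, right = y0, y1        # a sign change brackets one root
            for _ in range(60):
                mid = 0.5 * (left + right)
                if dV_dy(left, x, a) * dV_dy(mid, x, a) <= 0.0:
                    right = mid
                else:
                    left = mid
            roots.append(0.5 * (left + right))
        y0, f0 = y1, f1
    return roots


def attractors(x, a):
    """Equilibria that are minima of V, i.e. the stable values of y."""
    return [y for y in equilibria(x, a) if d2V_dy2(y, a) > 0.0]


single = attractors(0.0, 1.0)    # one minimum: y is single-valued here
double = attractors(0.0, -2.0)   # two minima: y is two-valued here
print(single, double)
```

Sweeping x slowly up and then down while holding the y^2 coefficient negative, and always following the nearest minimum, would be one way to look for the hysteresis mentioned above.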
a familiar picture or a novel one and be instructed to respond familiar or new. The model characterizes the process as a sequence of re-perceptions of the picture or re-looks at it, each look providing more information to update positive and negative accumulators for familiarity. When the positive and negative accumulators for familiarity reach sufficiently small rates of change, a judgment is generated and a response is initiated. Although the model is not presented in this way, the rates of change of the accumulators are analogs of a potential function to be minimized by re-looks. When the potential function has reached a minimum, the cognitive system has reached an attractor, namely a persistent judgment of familiar or unfamiliar. Temporal features of accumulating and looking are hypothesized to affect response time.
In the matter of the relation between cognition and action, some parts of the model are undefined. The process of re-looking, for example, is not explicitly defined. It is not clear whether re-looking is a motor function or mental re-perception without motor involvement. It is not clear whether looking is at the service of the accumulators. It is possible that an accumulating module might send out a goal to a looking module, and the looking module then produce some sensations for the accumulating module to work on. Alternatively, it could be that looking and accumulating proceed in mental flux until the process reaches an attractor and initiates a response. The first would be more like a computational model, the second, dynamical.
There are additional questions about the relation of cognitive to motor. The ocular system is not the only motor function in the experimental setup. The agent must also record his judgment with a word or a press of a button. After the process has reached an attractor, presumably a command would be issued to a motor module for this purpose. So this model has the possibility of both an integrated motor process for looking and a separate motor module for executing the response. It illustrates concepts that come into play when analyzing modular models in terms of dynamic and computational features.
29.3.2 Adaptive Control of Thought – Rational
Adaptive Control of Thought – Rational (ACT-R) is a theory of cognition designed to incorporate what is known of brain function into an architecture which operates to solve problems using symbols and rules of deductive and inferential logic. Because its operations are based on symbols and rules, it can be classified as a computational model of cognition. It has focused on higher level cognition and problem solving rather than perception or action [29.12].
In any application, ACT-R produces a simulation of the operation of a system consisting of modules of a multifaceted brain, a perceptual motor system, a goal system, a declarative memory, and a procedural system. Each module produces information in a form useful to one or more other modules. This has some dynamical aspects since there is an empirically determined characteristic timing of the operations applied to separate modules. Although it deals with symbols and rules, timing for operations of a module is not necessarily arbitrary nor altogether ignored. This model has complexity of the mental system without the additional complexity of the motor system to which it issues goals [29.12]:
“[...] the critical cycle in ACT-R is one in which the buffers hold representations determined by the external world and internal modules, patterns in these buffers are recognized, a production fires, and the buffers are then updated for another cycle. The assumption in ACT-R is that this cycle takes about 50 ms to complete; this estimate of 50 ms as the minimum cycle time for cognition has emerged in a number of cognitive architectures [...]. Thus, a production rule in ACT-R corresponds to a specification of a cycle from the cortex, to the basal ganglia, and back again. The conditions of the production rule specify a pattern of activity in the buffers that the rule will match, and the action specifies changes to be made to buffers. The architecture assumes a mixture of parallel and serial processing.”
The feature of parallel processing does not imply unlimited capacity since there are two limited-capacity features built into the system. The first is the limitation on buffer contents. A buffer can hold only a single item from memory or perception. The second is a limitation on production rules: only a single one can be selected on each cycle.
ACT-R has been used along with brain imaging to identify certain brain structures with which aspects of cognition can be associated. Brain activity was imaged for participants while they learned a new artificial algebraic system, manipulated its equations, and keyed in answers, in an experiment which lasted over several days. Imaging yielded a measure of activity, the blood oxygenation level-dependent (BOLD) function in brain structures over the time course of the experiment, and related the measure to a theoretical account of steps in problem solution with the following results [29.12]:
“1. The motor area tracks onset of keying. Otherwise, the form of the BOLD function is not sensitive to cognitive complexity or practice.
2. The parietal area tracks transformations in the imagined equation. The form of the BOLD function is sensitive to cognitive complexity but not practice.
3. The prefrontal area tracks retrieval of algebraic facts. The form of the BOLD function is sensitive to cognitive complexity and decreases with practice.
4. The caudate tracks learning of new procedural skill. The BOLD function is not sensitive to cognitive complexity and disappears with practice.”
These four results illustrate the role of timing and complexity in ACT-R theory. Although the theory can be categorized as computational, it is clear from this example that the theory addresses the dynamics of brain processes which accompany learning procedural skills of varying difficulty. Although time comes in as an index of practice, measured in days, the focus is on the end-effect of practice, to identify which brain modules and transmitter substances might be involved in developing an understanding of the algebra and skill at executing a sequence of steps to solve an algebraic problem. The objective is to match brain function and modules with their theoretical counterparts. That is, the modules and information flow represented in the architecture of the theory are matched with activities in specific brain modules, allowing the function of the brain modules to be inferred and described in terms of the theory. So, too, with complexity, which is here not taken to refer to the complexity of complexity science, but rather to the difficulty of the artificial algebra and the problems given for solution, and is also a variable to be related to the function of brain modules. Thus, ACT-R has timing and complexity, but does not address the process using methods of complexity science or nonlinear dynamic systems theory.
29.3.3 Artificial Neural Networks Methods
Artificial neural networks (ANNs) are a large number of theories addressing many phenomena including phenomena of cognition. ANN theories share the fundamental component of the artificial neuron (N), an element with some similarity to physical neurons. Ns are usually lined up in layers. Artificial neurons in the lowest layer receive stimulation from an external source (input). The remaining layers receive weighted stimulation from other neurons, usually from the next lower layer. The weights reflect variations in the strength of the connection to one N from another. The firing pattern of the highest layer (output) is readable in meaningful terms by some other system. Intermediate layers are generally referred to as hidden, and their firing patterns, determined by the weights, are not necessarily easily interpreted.
Each N accumulates the weighted stimulation and transforms it according to a nonlinear function usually acting as a threshold. When the transformed value reaches the threshold, the N fires, usually stimulating at least one of its neighboring artificial neurons in the next higher level. The word "usually" appears often in the description of ANNs because the structure and operation of an ANN is subject to the ingenuity of its designer. The amount of stimulation received by one N from the firing of another is governed by their connection weight, which is taken to be a measure analogous to synaptic strength. From these few properties and their numerous variations, structures can be developed which when in operation simulate thought processes associated with many brain activities thought to underlie learning, concept formation, and rational thought.
For an example of the operation, input might be of a new pattern, expressed in ones and zeros. The weighted and reweighted elements of the pattern are passed through one or more layers of artificial neurons, each of which applies a nonlinear transformation. When it reaches the output layer, the output might be ones and zeros representing a category into which the ANN has determined the new pattern falls.
Whatever the network and task, connection weights which optimize performance must be found. This is usually accomplished by minimizing a cost function during a training procedure in which the ANN learns the optimal connection weights to perform well on a particular task. Considering only supervised pattern learning, any given input pattern can be associated with a desired output, a correct classification for example. With the goal of minimizing the error over all patterns, the weights over the entire network can systematically be adjusted in a direction which reduces the cost function, which will over several trials incrementally bring the cost function to a local or global minimum. Minimizing error is similar to minimizing potential V of (29.1) and (29.2). A typical cost function is the least-squares minimization given as

$$V = \sum_i (o_i - d_i)^2 , \tag{29.3}$$

where V is the cost, i indicates the ith output N, o is the output value, and d is the desired value. Minimizing the cost function is a process of reducing errors in classification by adjusting connection weights throughout the network, usually through a technique called backpropagation. Unlike the earlier examples, the local and
global minima for V are not usually completely known and the process of minimizing V can settle at either type of minimum, arriving at a local or global attractor for the set of weights. It is a process of parameter estimation which has some analogy to the characterization of the learning process. Normally the process of parameter fitting is not part of the cognitive model, but in the case of ANNs, as the parameters are being changed trial by trial, the process of parameter estimation mimics the learning process, that is, changes synaptic weights. Such changes in synaptic weights are thought to characterize learning and adaptation in biological systems.
One of the strengths of ANN models is that they can often easily incorporate substructures with known properties. ANNs are well suited to do real tasks such as those based on identification of a limited number of patterns. For example, two ANNs, each of which is itself a theory of adaptation, are the ANN model for Hebbian learning and the Hopfield network for learning to classify patterns without supervision. These are examples of the power of neural network modules which can be used in sections of an ANN model of cognitive and behavioral functions.
Hebbian Learning
In Hebbian learning, when a neuron N fires, all of its connections with other Ns in the layer below are affected, in particular, any N whose firing stimulated it. Each of these connections is increased in effectiveness, that is, will transmit a greater effect in the future. Hebbian learning is a dynamical model of neural modification during learning. In a Hebbian ANN, parts of the system during their normal operation create a useful learned configuration. It is a dynamical model of adaptive brain changes, which instantiates the Hebbian theory of neural correlates of learning.
Hopfield Network
An ANN can simulate other cognitive functions. For example, the Hopfield network can infer good patterns from samples without any supervision, that is, without feedback on correctness of its performance while learning to classify patterns during training. The Hopfield network can also remove noise from imperfect patterns, and is known as a technique for self-addressed memory.
29.3.4 Adaptive Resonance Theory
Adaptive Resonance Theory (ART) is a global theory of cognition using related models of brain and neuronal properties to assemble ANNs to simulate cognitive and other brain functions. Adaptive resonance is analogous to energy produced by a pattern match based on processes of stored patterns (top-down) and of input management (bottom-up). When a match is achieved the system sends the information to the next module, by ANN routes, for further processing. ART has by these means addressed and simulated the theoretical processes that underlie numerous results from the literature of experimental psychology and neuroscience.
The fundamental unit of ART incorporates the dynamics of adaptive resonance and forms the common basis for many related models. Models have been formulated with different objectives but all with ART in the acronym, indicating that adaptive resonance is a fundamental feature of every model so derived. Adaptive resonance is the fundamental theoretical process of cognition.
Adaptive resonance entails a temporal process that unfolds during the creation of a meaningful perception from an input. The input may be a pattern to be identified or categorized, or it may be part of a temporal sequence such as speech to be comprehended. Having received input, ART initiates a comparison process with the objective of maximizing a resonance function. In the comparison process, the bottom-up input signal interacts with top-down previously learned patterns, prototypes, and expectations. The interaction consists of repeated cycles of directing attention to combinations of bottom-up features deemed significant by a matching process and suppressing bottom-up features deemed irrelevant. In repeated cycles, the resonance function is optimized and the information can be passed on to the artificial neurons in the next module for further processing.
ART is currently silent when it comes to generating overt responses, although some indication has been given of a proposed approach to the problem. For ART, the perceptual system derives resonance from achieving a match. For the motor response, the system issues a goal and leaves it at that. The suggestion has been put forward that the motor response might be shaped by a complementary process, a sort of mirror image of resonance. In the proposed process, the complementary energy analog of resonance would be generated not by a match, but by a mismatch between actual and desired action, the mismatch indicating the desired goal has not yet been reached. Motor functions are addressed and characterized thus [29.13]:
“The START model proposes how adaptively timed inhibition of the hippocampal orienting system [...] and adaptively timed disinhibition of cerebellar nuclear cells [...] may be coordinated to enable motivated attention to be maintained on a goal while adaptively timed responses are released to obtain a valued goal. [...] Biological learning includes both perceptual/cognitive and spatial/motor processes. Accumulating experimental and theoretical evidence show that perceptual/cognitive and spatial/motor processes both need predictive mechanisms to control learning. Thus there is an intimate connection between learning and predictive dynamics in the brain. However, neural models of these processes have proposed, and many experiments have supported, the hypothesis that perceptual/cognitive and spatial/motor processes use different types of predictive mechanisms to regulate the learning that they carry out. [...] The need for different predictive mechanisms is clarified by accumulating theoretical and empirical evidence that brain specialization is governed by computationally complementary cortical processing streams that embody different predictive and learning mechanism.”
29.3.5 Summary
Both ACT-R and ART characterize dynamics of mental processes theoretically observable through brain imaging. Both models characterize cognition and performance as separate processes. Cognition may be required for a performance, but cognition and performance are two different processes. Behavior is taken as a window on mental processes, that is, at the completion of the cognitive process, a goal may be issued to a motor system. Behavior, then, indicates what goal was set. Separating cognition and action is, however, not a necessary feature of these types of models; an integration of the two is an explicit goal for both ACT-R and ART.
With regard to complexity, both ACT-R and ART have interrelated communicating subsystems, modules which act in cooperation, rather than in concert, that is, usually sequentially, not simultaneously. As such, they do not invite the techniques of complexity science. The fact that ACT-R specifically addresses symbols and rules of their manipulation does not prohibit application of nonlinear dynamic systems analysis, as a method of symbolic dynamics has been developed [29.14].
Not time, but timing is an essential feature of both models. That is, what is known from brain imaging about the active areas of the brain and about the times required for particular brain activities is incorporated into both models. These are reflected in temporal restrictions on sequencing and on when passing information between modules is permissible.
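The weight adjustment described for the least-squares cost (29.3) can be made concrete with a very small network. The sketch below is an illustration under stated assumptions, not a model from the chapter: it uses a single sigmoid output unit, so the gradient step is the one-layer special case of backpropagation (the delta rule), and the task (logical OR), learning rate, and number of epochs are arbitrary choices for the example.

```python
import math


def sigmoid(z):
    """Smooth threshold-like nonlinearity applied by the artificial neuron."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(weights, bias, pattern):
    """Output o of one unit: weighted stimulation passed through the sigmoid."""
    return sigmoid(sum(w * p for w, p in zip(weights, pattern)) + bias)


def cost(weights, bias, patterns, desired):
    """Least-squares cost V = sum over patterns of (o - d)**2."""
    return sum((forward(weights, bias, p) - d) ** 2
               for p, d in zip(patterns, desired))


def train(patterns, desired, rate=0.5, epochs=2000):
    """Adjust weights trial by trial in the direction that reduces V."""
    weights = [0.0] * len(patterns[0])
    bias = 0.0
    for _ in range(epochs):
        for p, d in zip(patterns, desired):
            o = forward(weights, bias, p)
            # dV/dw_j = 2 (o - d) * o (1 - o) * p_j for a sigmoid unit
            delta = 2.0 * (o - d) * o * (1.0 - o)
            weights = [w - rate * delta * pj for w, pj in zip(weights, p)]
            bias -= rate * delta
    return weights, bias


# Hypothetical supervised task: classify two-bit patterns by logical OR.
patterns = [(0, 0), (0, 1), (1, 0), (1, 1)]
desired = [0, 1, 1, 1]

initial_cost = cost([0.0, 0.0], 0.0, patterns, desired)
weights, bias = train(patterns, desired)
final_cost = cost(weights, bias, patterns, desired)
classified = [round(forward(weights, bias, p)) for p in patterns]
print(initial_cost, final_cost, classified)
```

Because this toy task is linearly separable, the descent has only one basin to settle in; the local minima discussed in the text arise in networks with hidden layers.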
be mutually exclusive and jointly exhaustive as far as beliefs are concerned.
The important aspect of mental models is that they make a specific probabilistic prediction for the very near future, that is, they attach a probability to every event which might possibly happen next. After an event occurs, each model is re-evaluated according to the support each receives from the evidence of the event. For example, consider an agent who has two models for a coin, that it is either fair or biased. The fair coin model, z_50, is expressed as P(H) = 0.5 and P(T) = 0.5. The biased coin model, z_80, is expressed as P(H) = 0.8 and P(T) = 0.2. Suppose before the coin is tossed he believes 60-40 that the coin is fair, that is, his prior amount of belief in each model is expressed in the probabilities P(z_50) = 0.6 and P(z_80) = 0.4. The two models and his relative degree of belief in each form the context for his interpretation of the subsequent event, which will be the outcome of a toss of the coin. For each model the agent generates a prediction and a probability for any other outcome: for z_50 he might predict heads, H, and note a 0.5 probability of error, namely T. For z_80 he will predict H, but with a 0.2 probability of error. The coin is tossed and comes up H. The question is how does this outcome affect his degree of belief in each model to yield revised or posterior probabilities? The answer is given by Bayes rule, in the odds form, applied to each model separately in (29.4)

$$\frac{P(z_{50} \mid x_H)}{P(z_{80} \mid x_H)} = \frac{P(x_H \mid z_{50})\, P(z_{50})}{P(x_H \mid z_{80})\, P(z_{80})} , \tag{29.4}$$

where x_H is the event of having H occur after predicting H, that is, having predicted H accurately. Although prediction from each model is accurate, that is, prediction error is zero for each, the probability of a zero error is different for each model. Evaluating (29.4) along with the condition that the models are jointly exhaustive yields the effect of the event on the belief in each of the two models, posterior beliefs of P(z_50 | x_H) = 0.48 and P(z_80 | x_H) = 0.52. Given the same prediction H, if the outcome of the coin toss had been T, the result would have been different, with P(x_T | z_50) = 0.5 and P(x_T | z_80) = 0.2. Then the result would have been the posterior beliefs of P(z_50 | x_T) = 0.79 and P(z_80 | x_T) = 0.21.
In the theory of multiprocess models of cognition, the models invoked by the agent arise from both internal sources and experience. The set of models together with their degrees of belief form the context by which the agent understands events. In any situation, the prediction error is used to revise the degree of belief in each model. The effect of repeated application of Bayes rule is to alter beliefs to reduce errors of prediction. In the dynamics of multiprocess models, the prediction error serves the same purpose V does for complex systems, that is, the system of beliefs forming the context of the experience evolves in such a way always to reduce errors of prediction by systematically bringing the successive priors closer to their respective posteriors. In this way, ongoing experience produces and refines the context of the experience.
Particle Filters
Particle filters provide a method for applying Bayes rule in an environment of constant flux where the agent has both beliefs about his actions as they affect his environment and beliefs derived from evidence about the state of the environment. By incorporating beliefs about the effects of action, new models may be introduced to the set, providing a flux of models to accommodate the flux of the environment.
Particle filters give a best-guess approximate solution to the problem of finding prior beliefs about otherwise unknown states of an agent moving in a changing world. The method uses such prior beliefs, a motion model, input to a map model, and current stimulation to generate updated beliefs about current state [29.16]. Its application can be illustrated with a simple example of an agent moving in the dark around the furniture in a familiar room.
The particles in question are simple hypotheses, such as a statement about location. One particle might claim "You are here at A," another "You are here at B," and so forth. Having started into the room from the threshold and taken three steps, he consults his motion model and believes he has arrived at one of three points, A, B, C, with respective probabilities 0.2, 0.3, 0.5. To further delineate his location, he reaches out and finds that he touches a table, providing evidence x. He then consults his mental model of a map of the familiar room and determines that the probabilities of touching the table from A, B, C are, respectively, 0.7, 0.5, 0.2. These yield posterior probabilities for A, B, C, given x, of 0.36, 0.38, 0.26. At this point, the filter reweights each particle according to the degree to which touching the table was a surprise for it. The weight for each particle is calculated as the ratio of its posterior according to the map to its prior according to the motion model. The particles are then re-sampled with replacement according to the weights to yield a new probability density function (pdf) for final belief of location. The posteriors for A, B, C, respectively, are 0.50, 0.36, 0.14. He can use these posteriors as priors in the motion model for his next steps, then once again reach out, then consult his map about the result. At this time it is quite likely a new particle D might come into the picture, justifying the re-weighting and
re-sampling of the posteriors for his next input to the motion model.
The particle filter is a method for continuously bootstrapping probabilities which cannot simply be dragged from one situation to another because the question and the environment are continually changing. The question is not simply "Where am I?", but rather "Now that I have taken three steps from one of three places I was at with varying probabilities, where am I?" It gives weight to the previous beliefs from models, but sheds them when new circumstances and other models come into play.
Although it is not stated explicitly in terms of prediction errors, by the procedure just described, particles with smaller prediction error become more likely to be sampled to estimate the pdf, while particles with greater errors tend to drop out. This keeps neighboring particles more influential and distant particles less. The dynamical feature of this model is that cognition changes always in a way to reduce a potential function; in this case, the quantity minimized is surprise, the discrepancy between the prior and posterior beliefs.
for the toy. If the infant is 10 months old he will reach for A not B, apparently making an error of cognition in the sense that the infant appears to believe that the toy hidden at B nevertheless will still appear at A. At 12 months, the infant will correctly reach to B, appearing to have reached a new concept that toys put someplace will not move from there on their own. This consistently reproducible error had previously been interpreted as a sign that between 10 and 12 months, the infant develops an understanding of object permanence, that is, develops a belief that the object will not magically jump from B to A. The dynamical model takes issue with this interpretation, proposing that the same intellectual factors and the same process enter into the two types of responses, but that the process is complex, made up of two motor subprocesses with different timing at different ages.
The first process is a motor memory of reaching for A, made strong by the initial repetitions. The second is memory for looking where the toy was recently hidden, B. For a younger infant, the memory of looking at B is
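The coin-toss update of (29.4) and the reweighting step in the dark-room particle example can both be checked with a few lines of arithmetic. The sketch below is only an illustration under stated assumptions: the helper name bayes_update, the fixed random seed, and the choice of 1000 resampled particles are hypothetical, while the priors and likelihoods are the ones given in the text.

```python
import random


def bayes_update(priors, likelihoods):
    """Posterior belief in each model: prior times likelihood, normalized."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]


# Coin example: models z_50 (fair) and z_80 (biased), priors 0.6 and 0.4.
coin_priors = [0.6, 0.4]
post_heads = bayes_update(coin_priors, [0.5, 0.8])  # likelihoods of H
post_tails = bayes_update(coin_priors, [0.5, 0.2])  # likelihoods of T

# Dark-room example: particles A, B, C.
motion_prior = [0.2, 0.3, 0.5]       # beliefs from the motion model
touch_likelihood = [0.7, 0.5, 0.2]   # P(touch table | location) from the map
map_posterior = bayes_update(motion_prior, touch_likelihood)

# Weight of each particle: posterior (map) divided by prior (motion model),
# then normalized so the weights sum to one.
raw = [post / prior for post, prior in zip(map_posterior, motion_prior)]
weights = [w / sum(raw) for w in raw]

# Resample with replacement according to the weights.
rng = random.Random(0)               # fixed seed, an arbitrary choice
resampled = rng.choices("ABC", weights=weights, k=1000)

print([round(p, 2) for p in post_heads])
print([round(p, 2) for p in post_tails])
print([round(p, 2) for p in map_posterior])
print([round(w, 2) for w in weights])
```

Rounded to two places, the computed values agree with the figures quoted in the text: 0.48 and 0.52 after heads, 0.79 and 0.21 after tails, map posteriors of 0.36, 0.38, 0.26, and resampling weights of 0.50, 0.36, 0.14.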
reasoning, mental processes which originally only ac- complexity in coupling the two are central features of
companied veridical perception and action, have been motor theories of perception.
adapted to operate off-line, without requiring input or
output, but nevertheless retain features characteristic of 29.4.4 Simulation Theory
perception and action [29.18]:
Simulation theory [29.20] holds, in the large, that men-
“Off-line aspects of embodied cognition, in con-
tal life, particularly imagination, consists of processes
trast, include any cognitive activities in which sen-
by which the brain synthesizes sensations and per-
sory and motor resources are brought to bear on
ceptions which do not arise from the external world
mental tasks whose referents are distant in time and
and actions which do not affect the external world. To
space or are altogether imaginary. These include
achieve this, the brain activates the same pathways used
symbolic off-loading, where external resources are
in veridical sensation, perception, and action, but inter-
used to assist in the mental representation and ma-
rupts their contact with the external world. In this way,
nipulation of things that are not present, as well
a replica, or simulation of external events and processes
as purely internal uses of sensori-motor representations, in the form of mental simulations. In these cases, rather than the mind operating to serve the body, we find the body (or its control systems) serving the mind.”

Almost in contrast to the off-line view is a theory of embodiment which holds that the mind extends into the physical world. For example, this theory asserts that mental activity includes actions such as using paper and pencil to solve an arithmetic problem or write a sentence. The rules of manipulation of symbols, and the symbols themselves, are extensions of the mind into the physical world. In this way, the theory asserts that the mind increases its capacity for memory and rule application [29.18].

29.4.3 Motor Theory

Motor theories of perception link action to cognition. Understanding or perceiving an observed action by others is accomplished by the covert production of that action. An early example of such a theory is the motor theory of speech perception. The motor theory holds that speech is perceived through the listener’s covert production of the same speech [29.19]:

“As for speech perception, there is now evidence that perceiving speech involves neural activity of the motor system. Two recent studies involving the use of transcranial magnetic stimulation of the motor cortex have demonstrated activation of speech-related muscles during the perception of speech.”

Concepts in the motor theory of speech perception have been extended to a more general theory, applying to any observed action by others. The general motor theory of perception has received much support from studies of neuronal activation and brain imaging, which show that perceiving an action involves the same neuronal, brain, and motor systems as producing the action. Complexity in production, complexity in perception, and

can be experienced without requiring the presence of their veridical counterparts. Simulation can be either of a stable perception, or of a dynamic unfolding of a process of simulated perception, action, and anticipation of the consequences of action.

Simulation theory requires that the brain contain structures which can accommodate perception, action, and anticipation in the absence of external input, as contrasted with veridical perception and action. The resulting motor stimulation is stopped short of execution, resulting in simulated behavior [29.20]:

“Saying that behaviour can be simulated here means nothing more than that the signal flow from the prefrontal cortex via the premotor areas may occur even if it is interrupted before it activates the primary motor cortex and results in overt behaviour. A simulated action is thus essentially a suppressed or unfinished action.”

Simulation theory explains imagination and anticipation, both arising from sensations in the absence of external input. It is characterized as a variety of perception involving the same parts of the brain that are active during veridical perception. In addition to being stimulated by input from the external world and transmitting the stimulation to higher parts of the brain, the parts of the brain that produce sensation can also be stimulated by a retrograde flow from higher parts to sensation. Such sensations may be distinguished from veridical sensations by the fact that they occur in an episode during which overt action is being suppressed.

Anticipation is characterized as the imagined consequences of suppressed action. From experience, the agent learns to anticipate consequences of action and can fold this information into the simulated experience. Anticipation thus consists of models of the effects of actions. It returns the effects of the imagined actions as imagined sensations. In accordance with simulation theory, after some amount of veridical experience, perception, action, and consequence, all
may occur without concurrent contact with the external world.

29.4.5 Free Energy Theory

kind of model, labeled $m$, which links an action to its consequences, changes in the external world.

It is the aim of action to alter sensations to more closely match beliefs associated with brain states, which it can accomplish most effectively by turning up evidence confirming the most likely model. Thus cognition and action are each part of comprehending the sensations which arise from experience, the former by identifying its causes, the latter by seeking confirmation of the most likely causes by manipulating the environment. Both cognition and action are governed by the same principle, the minimization of free energy.

The purpose of the following illustration is to explain how causes and models can be connected to free energy. Certain conditionals have been dropped in the interest of clarity and brevity. Thus, in (29.5)–(29.7) the symbol $z$ should be understood as a belief about the external causes of sensation given the brain state. There can be many $z$s, each with some initial degree of belief, expressed as $Q(z)$. The quantity $x$ is used here to represent external states or events given the sensations, the causes, and the model of action. These representations of $z$ and $x$ are similar but not identical to the models of Bayesian multiprocess theory and simulation theory. In particular, free energy theory explicitly expresses $x$ as the result of a filtering process, not the unobservable external process itself $(\varphi)$, but an estimate of it, instantiated in neural activity, in a time series of vectors of sensations.

In the simplified example, assume the agent observes an event, $x$, for which he has prior probabilities, $Q(z)$, for each $z$, yielding an approximation $Q$ to the optimum posterior probability $P(z \mid x)$. The goal is to come to a conclusion in which the discrepancy between $Q$ and $P$ is minimal. This can be accomplished by minimizing a measure of the discrepancy, the Kullback–Leibler (KL) divergence $D(Q \parallel P)$, expressed in (29.5)

changing the event $x$

$$F = D(Q \parallel P) - \ln(p(x)). \tag{29.7}$$

When the conditionals are included, Free Energy Theory expresses the $F$ of (29.7) as $F(s, \mu)$, where $s$ stands for sensory states and $\mu$ stands for brain states or, equivalently, neural processes representing links between $s$ and its causes $\varphi$. In addition, the set of models, $m$, links actions to changes in $s$ by their effect of altering the external world. There are two ways to minimize $F$, both central to free energy theory, the first by changing brain states $\mu$, the second by changing sensations $s$ through action on the environment. The role of changing brain states is one of coming to some degree of belief in new ideas of causal relationships between the external world and the accompanying sensations. In the case of action, free energy can be reduced most effectively by a particular sort of action, namely action which changes the environment to uncover evidence (event $y$) favoring the most likely cause of $s$ for a given brain state. The models $m$ determine which action $a$ will likely uncover event $y$ which has the minimizing effect, $F(z, y) < F(z, x)$. Free Energy Theory predicts the organism will act in a way to uncover evidence consistent with the most likely cause, because doing so most effectively minimizes $F$.

A model in $m$ might be of any variety, for example, a simulation or a simple probability. The only requirement is that it link actions to their predicted consequences. In this way, the action is conceived, not as carrying out orders to obtain certain goals, but rather as being a feature of the way the brain interacts with the physical
world to increase predictive accuracy and, equivalently, lessen surprise.

As an example of the role of action, suppose an actor cannot read a word written in poor handwriting. Assume the uncertainty is high because, considering only the word in isolation, there are several equally likely possibilities. The actor can take action to reduce the prediction error by altering the input, namely reading what he can of the words in the surrounding context. The context can alter the respective probabilities of the possibilities for the problem word. That is, action changes the sensory input. When reading the context, the action is quite directed, via the action model $m$. The actor is not flipping pages or scanning the room; he is acting in a way to confirm his most likely hypothesis, that this word is part of a meaningful sentence. His strongest hypothesis guides his action. Alternatively, he could instead, or in addition, have tried an approach not involving changing the environment. For example, he could have made guesses about handwriting quirks of the writer. This would introduce new mental models for the letters, thereby changing perception via changing brain state $\mu$. Either of these approaches can be called an effort to reduce free energy.

Free Energy Theory characterizes mental activity as a complex process, guided by optimization. In addition, free energy theory has been asserted to apply to mental activity on both large and small scales. The examples apply it to mental dynamics on the scale of human activity and perception, such as finding one’s way in the dark. In extension, the same concepts and formulas apply to brain activity on a neuronal scale, as revealed by brain imaging and single-neuron techniques. The methods of nonlinear dynamic systems analysis provide a formalization of such scale invariance.

29.4.6 Evolution of Cognitive Search

The evolutionary theory of cognitive search proposes that processes of attention and memory include search strategies analogous to behavioral strategies used in the search of physical space for objects of value. On an evolutionary scale, cognitive search has arisen directly from physical search [29.22], from behavioral to cognitive, physical to mental. According to the theory, analogous processes occur in cognitive and physical search. For example, both have strategies of remaining in one area (patch) until its resources have been fully exploited or otherwise depleted, only then exploring for a new patch to be exploited. Both also are associated with the same areas of brain activity and transmitter substances. Thus, the evolutionary theory of cognitive search asserts that important aspects of the relation between cognition and behavior have been developed over a long time scale. The relation between cognition and action is not one of behavior creating the illusion of cognitive entities, but rather one of brain functions which govern behavioral search acting as a paradigm for brain activity governing cognitive search.

29.4.7 Summary

Where cognition and action are intrinsically linked, theories range from denying the existence of a category of cognition separate from the complex coordination of behavioral subsystems, to a complex system of cognition and action combined, to a brain system created by analogy to a complex behavioral system. Complexity is a feature of theories that link cognition and action, almost by definition, since there are applications of complexity science to motor processes even when they are not linked to cognition. The time associated with coordination of subsystems and unfolding of processes also enters, requiring concepts of nonlinear dynamic systems theory.

For free energy theory, the goal does not direct the search, but the search policy inevitably uncovers things of value. For the theory of evolution of cognitive search, behavior and cognition are connected, but in a different way. Cognitive search, in particular, is seen as an internalized replica of behavioral search. There is no single way in which all theories of linkage claim cognitive and behavioral functions are linked, but there is great agreement among them that the linkage is there. The dynamical features of these models follow directly from that link.
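As a minimal numerical sketch of the free energy quantity in (29.7) from Sect. 29.4.5, the example below assumes a small discrete set of causes z and reads (29.7) as the KL divergence between the belief Q(z) and the posterior P(z | x), minus the log evidence ln p(x). The probabilities, the second event y, and all function names are hypothetical illustrations, not values or code from the chapter. It shows the two routes to a smaller F described in the text: revising the belief Q, and acting so that a more confirming event is observed.

```python
import math

def posterior_and_evidence(prior, likelihood):
    """Return (P(z | event), p(event)) for discrete causes z via Bayes' rule."""
    joint = [p * l for p, l in zip(prior, likelihood)]
    evidence = sum(joint)
    return [j / evidence for j in joint], evidence

def kl_divergence(q, p):
    """D(Q || P) = sum_z Q(z) ln(Q(z) / P(z)); terms with Q(z) = 0 contribute 0."""
    return sum(qz * math.log(qz / pz) for qz, pz in zip(q, p) if qz > 0)

def free_energy(q, prior, likelihood):
    """F = D(Q || P(z | event)) - ln p(event), the assumed reading of (29.7)."""
    post, evidence = posterior_and_evidence(prior, likelihood)
    return kl_divergence(q, post) - math.log(evidence)

# Hypothetical numbers: three candidate causes z of one observed event x.
prior = [0.5, 0.3, 0.2]            # initial degrees of belief Q(z)
likelihood_x = [0.1, 0.6, 0.3]     # assumed p(x | z) for each cause

# Route 1: change the belief (brain state) so that Q matches the posterior.
q_initial = prior
q_updated, evidence = posterior_and_evidence(prior, likelihood_x)
surprise = -math.log(evidence)
f_initial = free_energy(q_initial, prior, likelihood_x)
f_updated = free_energy(q_updated, prior, likelihood_x)

# Route 2: act so that a new event y is observed which favors the most
# likely cause (the second one here); assumed p(y | z) values.
likelihood_y = [0.1, 0.9, 0.3]
f_after_action = free_energy(q_updated, prior, likelihood_y)

print(f"F with initial belief:   {f_initial:.4f}")
print(f"F with updated belief:   {f_updated:.4f} (surprise {surprise:.4f})")
print(f"F after confirming act:  {f_after_action:.4f}")
```

Under these assumptions, updating the belief lowers F to the surprise of x, its lower bound for that event, and the confirming event y lowers it further, in the spirit of F(z, y) < F(z, x).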
29.5 Conclusion
Unequivocally, theories which combine mental and motor into a unified process are dynamical theories of cognition. Elements of the integrated process are clearly quantified by measures of complexity, real-time coordination, and time series data, which invites application of methods of complexity science and nonlinear dynamic systems theory, the methodology of dynamical models. From the sampling of cognitive models presented here, incorporating action appears to be a sufficient, but not a necessary condition for a cognitive model to be called dynamical.

Application of a dynamical model to a data set does not necessarily require a comprehensive theory. This is evident in the examples of data-based studies.
Breadth of theory is not a necessary starting point. What is required is instead a commitment to the ideas that can be generated by a coherent application of dynamical methodology.

One of the central features of dynamical methodology is optimization of an expression of potential. The system proceeds in a way that minimizes some expression of potential, such as formulated by (29.1), (29.2), (29.3), or (29.7). Such optimization can reveal how it is that the system settles into an attractor and how it moves from one attractor to another. Minimizing potential gives direction and predictability to complex processes. Identification and formalization of a potential $V$ of a dynamical system paves the way for understanding its stable states and transients.

Dynamical models may emphasize the role of optimization both as a method of determining important quantities in the theory and as a fundamental principle of cognitive processes. This is evident in Free Energy Theory, in which optimization is the organizing principle of thought and behavior. According to free energy theory, the optimizing process occurs in a scale-invariant way at every level from brain structures to the cognition–action complex. At every level, optimization in the form of minimization of free energy brings the organism to a state of decreased surprise.

Optimization of a potential function does not have the same overarching role in theories for which cognition and action appear as separate systems. In ART, maximizing adaptive resonance is an optimizing principle at one stage of processing, designed to direct top-down and bottom-up procedures to identify the input pattern or create a new pattern in a way that is useful for further processing. Adaptive resonance and the changes that take place during training of an ART network are examples of dynamic aspects of ART. These features are usually embedded in a network of modules to which the methodology associated with complexity is not explicitly applied.

Both theories, ART and ACT-R, address the interaction of the symbols, patterns, and rules of cognition and the relationships of these to brain function. Major objectives of ART and ACT-R are to explain how symbols and rules are manipulated and combined in the operations of pattern recognition, classification, and inference, both deductive and inductive. These operations are of central interest to both ART and ACT-R, especially along with their correlation with brain areas and functions as revealed primarily through brain imaging. When ART or ACT-R is analyzed in contrast to dynamical theories, it is symbols and their rules of manipulation that underlie the contrast.

Symbols and rules of their manipulation are the durable, stable products of mental activity. According to dynamical models of cognition, an alphabet, for example, is the product of dynamical forces of cognition and behavior. One theory of embodiment claims that symbols are physical extensions of mind. They exist as long as general usage and culture keep them far from equilibrium, the increased entropy of which would turn them to dust. Thus, symbols and rules have the features of attractors in a dynamical system. This is emphasized by the data-analytic method of symbolic dynamics. From this point of view, computational models address relationships among attractors, while dynamical models address the process that brings attractors into existence and governs the transients among them.

Philosophical questions arise from the dynamical point of view concerning the status of the contents and products of mind. What is the philosophical status of an internal model such as those of simulation theory or free energy theory? How are they the same as or different from scientific theories? If brains have brain-state models and action models, do they not also have symbols? How is a model or a brain state different from a symbol? What is the status of the durable external products of mental activity: alphabets, numerals, rules of inductive and deductive logic, art and engineering, novels and history books? Are these extensions of mind?

The benefits of a dynamical point of view to modeling cognition, in addition to its intuitive appeal, are the wealth of finely developed methods, some of which have been described here. Complexity science introduces new methods which make it possible to address questions which have previously been inaccessible for formalizing and quantifying mental processes. Nonlinear dynamic systems theory introduces systematic methods for characterizing the phenomena of nonlinear recursive processes using new concepts, such as formulation of trajectories and their features, including attractors and measures of predictability or dimension of the process. To preserve and further develop benefits such as these, the associated methodologies would be a valuable addition to the curriculum of psychology for the study of the mechanics and properties of mind.
References
29.1 T. van Gelder: The dynamical hypothesis in cognitive science, Behav. Brain Sci. 21, 615–665 (1998)
29.2 T. van Gelder, R. Port: It’s about time. In: Mind as Motion: Dynamics, Behavior, and Cognition, ed. by R. Port, T. van Gelder (MIT Press, Cambridge 1995) pp. 1–44
29.3 R.F. Port, A.P. Leary: Against formal phonology, Language 81(4), 927–964 (2005)
29.4 C. Zednik: The nature of dynamical explanation, Philos. Sci. 78(2), 238–263 (2011)
29.5 W.J. Freeman: Nonlinear brain dynamics and intention according to Aquinas, Mind Matter 6(2), 207–234 (2008)
29.6 J.A.S. Kelso: Dynamic Patterns: The Self-Organization of Brain and Behavior (MIT Press, Cambridge 1995)
29.7 S.J. Guastello, M. Koopmans, D. Pincus (Eds.): Chaos and Complexity in Psychology: Theory of Nonlinear Dynamical Systems (Cambridge Univ. Press, New York 2009)
29.8 S.J. Guastello, R.A.M. Gregson (Eds.): Nonlinear Dynamical Systems Analysis for the Behavioral Sciences Using Real Data (CRC, Boca Raton 2011)
29.9 H. Haken, J.A.S. Kelso, H. Bunz: A theoretical model of phase transitions in human hand movements, Biol. Cybern. 51, 347–356 (1985)
29.10 T.A. Oliva, S.R. McDade: Catastrophe model for the prospect-utility theory question, Nonlinear Dyn, Psychol. Life Sci. 12, 261–280 (2008)
29.11 G.E. Cox, R.M. Shiffrin: Criterion setting and the dynamics of recognition memory, Top. Cogn. Sci. 4, 135–150 (2012)
29.12 J.R. Anderson, D. Bothell, M.D. Byrne, S. Douglass, C. Lebiere, Y. Qin: An integrated theory of the mind, Psychol. Rev. 111(4), 1036–1060 (2004)
29.13 S. Grossberg: Adaptive resonance theory: How a brain learns to consciously attend, learn, and recognize a changing world, Neural Netw. 37, 1–47 (2013)
29.14 R.A.M. Gregson, S.J. Guastello: Introduction to nonlinear systems analysis. In: Nonlinear Dynamical Systems Analysis for the Behavioral Sciences Using Real Data, ed. by S.J. Guastello, R.A.M. Gregson (CRC, Boca Raton 2011) pp. 3–15
29.15 M.A. Metzger: Multiprocess models of cognitive and behavioral dynamics. In: Mind as Motion: Dynamics, Behavior, and Cognition, ed. by R. Port, T. van Gelder (MIT Press, Cambridge 1994) pp. 491–526
29.16 K. Hsiao, H. de Plinval-Salgues, J. Miller: Particle Filters and Their Applications (2005)
29.17 L.B. Smith, E. Thelen: Development as a dynamic system, Trends Cogn. Sci. 7(8), 343–348 (2003)
29.18 M. Wilson: Six views of embodied cognition, Psychon. Bull. Rev. 9(4), 625–636 (2002)
29.19 B. Galantucci, C.A. Fowler, M.T. Turvey: The motor theory of speech perception reviewed, Psychon. Bull. Rev. 13(3), 361–377 (2006)
29.20 G. Hesslow: The current state of simulation theory of cognition, Brain Res.: The Cogn. Neurosci, Thought 1428, 71–79 (2012)
29.21 K. Friston: The free-energy principle: A unified brain theory?, Nat. Rev. Neurosci. 11, 127–138 (2010)
29.22 T.D. Hills, R. Dukas: The evolution of cognitive search. In: Cognitive Search: Evolution, Algorithms, and the Brain, ed. by P. Todd, T. Hills, T. Robbins (MIT Press, Cambridge 2012) pp. 11–24
30. Complex versus Complicated Models of Cognition
Ruud J.R. Den Hartigh, Ralf F.A. Cox, Paul L.C. Van Geert
environment, which continuously interact over time. The aim of this chapter is to compare the two approaches in terms of their assumptions, research strategies, and analyses. Furthermore, we will discuss the extent to which current research data in the cognitive domain can be explained by the two different approaches. Based on this review, we conclude that the CDS approach, which assumes a complex rather than a complicated model of cognition, provides the most plausible approach to cognition.

30.4 Conclusion
References
At present, two classes of approaches are used to explain cognition. The first class proceeds from the idea that human behavior is controlled by separate cognitive (processing) components, which we refer to in this chapter as the reductionist approach. The second class assumes that cognition can be considered as a complex, dynamic set of components, and that human behavior is an emergent consequence. We shall refer to the latter as the complex dynamic systems (CDS) approach. The first part of this chapter starts with an overview of the two approaches and the explicit and implicit assumptions they make (Sect. 30.1). In Sect. 30.2, we discuss the research strategies and analyses applied by researchers proceeding from a reductionist or CDS approach. Then, in Sect. 30.3 we demonstrate the extent to which complicated (related to the reductionist approach) and complex models (related to the CDS approach) fit with research data on real-time cognitive performance and long-term cognitive development. Finally, in the concluding section, we discuss which kind of model seems to explain human cognition best (Sect. 30.4).
we act, what and how we (can) learn, and even in our emotions and personality [30.1]. Cognition, as an amalgam of many such distinct cognitive functions and subfunctions, is a mechanistic apparatus consisting of specialized modules linked together in a linear causal chain. This premise has directed the research attention to localizing these modules or components and the functions they perform. We shall refer to this approach, in which cognition and the explanation of human behavior are reduced to localized functions, as the reductionist approach.

Obviously, the environment also contributes to how we behave and learn. In the reductionist view, the role of the environment is rather dissociated, that is, it provides input to cognitive processes. More specifically, environmental cues are cognitively processed, after which the best subsequent action can be computed, and the situational input can be cognitively stored in order to respond optimally the next time a comparable situation is encountered [30.2–4].

In the past decades, several researchers have criticized the above-mentioned point of view, in particular that human cognition can be compared with a very complicated computational machine [30.1, 5–7]. The computational requirements to perform the most optimal actions would be too high to be feasible in the

proceeds from is that specific mechanisms are responsible for the way humans behave and learn. In this sense, some environmental stimulus is represented in the mind, and based on algorithms performed by the internal cognitive components the behavioral output is produced [30.3, 26–29]. For instance, in a sports context a football player perceives the positions of his teammates and the opponents, and cognitively computes the best next move [30.30]. Expert football players would better master this skill given their extensive knowledge base, or software, of previous encounters with different kinds of situations. This approach implicitly takes the computer as a metaphor in order to explain behavior, and typically conceives of the mind, or brain, as a central computing agent that encodes the environmental inputs and controls subsequent behavior (see the review of Van Gelder [30.7] for an extensive discussion of this view). This entails that the brain, comprising the different component processes, is considered as the central controller of human behavior.

The idea that the brain controls our behavior, and that the body and environment provide (only) input to the brain, was first challenged by Gibson’s ecological approach [30.31–33]. He proposed that action possibilities are not cognitively computed, but are directly guided by information from a structured environment
construction and application of static models, in which the levels of some set of dependent variable(s) $(y_i)$ are directly and uniquely related to (or caused by) the levels of some other set of independent variable(s) $(x_i)$

$$y_i = f(x_i). \tag{30.1}$$

In this functional description, any possible set of values of $x_i$ generates a corresponding value for the dependent variables $y_i$. In other words, if we know the values of $x_i$, we can predict the values of $y_i$. An implicit assumption here is that the operating causal variables remain stable for the duration of the behavior or psychological state they would cause.

Take, as an illustration, the development of a child’s lexicon. A typical static study would consist of assessing the maternal talk to children of different ages, for instance of 1, 2, and 3 years old. The size of the lexicon can then be predicted by explanatory variables such as age, maternal talk, or a combination of these two variables (Fig. 30.1). Note that, although age is in fact a continuously changing temporal parameter, it is used in a typically static way as a sequence of values (ages), similar to the way maternal talk is treated as a series of static values. In line with the reductionist view, the implicit assumption here is that the child’s cognitive

and so forth. Thus, generally speaking, the reductionist approach proceeds from the idea that the explanatory power should be increased by increasing the number of specific factors (i.e., components) involved and the links between them. As already mentioned, temporal change, represented by age in the example, is treated as a factor like any other factor in the model.

According to the CDS approach, however, the causal principle of behavioral or psychological change does not lie in the values of some variables or components at a certain moment in time. As noted earlier, cognition could be envisioned as a dynamic process, which entails that time is an essential aspect to take into account. More specifically, the change in the cognitive system is a function of its preceding state

$$y_{t+1} = f(y_t), \tag{30.2}$$

where $y_{t+1}$ corresponds to the state of the system at time $t + 1$, which is a function of state $y$ at the previous time point $t$. Hence, the CDS approach proposes models of change that involve recursive relationships ($y_t$ leads to $y_{t+1}$, which leads to $y_{t+2}$, and so forth) [30.44]. Returning to the explanation of a child’s lexicon, a simple explanation would be that learning new words at time $t$ depends (among possible other things) on the words the
is not possible nor feasible to reduce cognition to separate, fairly isolated components (recall the concept of self-organization again in Sect. 30.1.1).

Referring back to the example of football, according to the CDS approach the actions of an (attacking) player are emergent from the underlying self-organization dynamics (changing positions of players, the ball, etc.) [30.18, 40]. Of primary interest to researchers is therefore the unfolding of footballers’ actions in real time. A typical study would focus on the emergence of action patterns that are continuously shaped by the way the system components attune to each other, involving for instance the attackers’ and defenders’ relative distance to each other [30.49, 50] and to the goal [30.51], or more generally the size of the field [30.52]. As an illustration, Headrick and colleagues [30.51] revealed that the distance between the defender and the ball stabilized at higher values, indicating low risk-taking behavior of the defender, when the defender–attacker duel occurred close to the goal than when it occurred relatively far from the goal.

30.2.3 Analyses to Untangle Cognition Based on Complicated Models

Given the different assumptions of the reductionist and CDS approaches, not only the research strategy (Sect. 30.2.2), but also the applied analyses are different [30.53]. In the reductionist approach, the analysis is focused on finding the linear associations at the level of isolated variables. Researchers therefore typically apply the so-called control of variables strategy, which in standard accounts of the scientific method is seen as the quintessential way of explaining the nature of reality, namely to disentangle the variables and control the variables separately to see what changes in those variables actually do. The way one variable controls another variable is assumed to be a property that can be isolated from other properties and other variables. The reasoning is that the most general way in which a variable can control another one is the way in which a variable co-varies with another one over the entire population. Hence, the study of the way a variable controls another one is based on samples that are big enough to be representative of that population. This also points to the importance of the generalizability issue, which is understood as the degree to which the statement based on a sample is true of the population that the sample is intended to represent (see also Hasselman and colleagues [30.54] for a discussion of theorizing in cognitive science).

In the example of the relationship between maternal talk and a child’s lexicon (Sect. 30.1.2), a researcher may analyze the effect of the quantity and sophistication of maternal lexical input on the child’s lexicon [30.43, 44, 55]. The outcome is framed in terms of the variance in the child’s lexicon that can be explained by (co-varies with) the variance in the maternal input variables. In other words, the researcher attempts to find a linear relationship between the lexical input of the mother and a child’s lexicon (the output variable). The relationship between maternal input and a child’s lexicon, as it is found across a sample of mother–child dyads, is implicitly assumed to govern the process of language learning at the level of individual children [30.56].

30.2.4 Analyses to Capture the Complexity of Cognition

According to the CDS approach, the associations between variables as they are observed at the sample level cannot be used as valid approximations of the dynamic relations that govern the process. More specifically, if we assume that components change over time and influence each other reciprocally, which gives rise to (changing) patterns of behavior, analyzing associations between variables in large samples cannot tell us how the process actually works (cf. the ergodicity problem as described by Molenaar and colleagues [30.57, 58]). According to CDS theorists, if a researcher is interested in why and how actual change occurs, the process of interest should be studied over time [30.44, 53, 56, 59, 60]. Therefore, researchers often apply time series analyses, and they focus on particular signatures of the time series, as well as on the underlying dynamic rules that may explain the dynamics of the time series.

Van Geert and colleagues have conducted several studies on language development from a CDS perspective [30.21, 22, 61–63]. The authors consistently found discontinuities in individual children’s language development, which provided valuable information about lexical change. For example, Bassano and Van Geert [30.21] studied early language development among French children, and they showed that the discontinuities in the time series mark the transition from a one-word to a combinational mode, and from single combinations to more abstract syntactic modes of language. In CDS terms, the language modes can be considered as attractors, that is, states or patterns toward which the system tends to converge [30.36, 48, 64–66]. Thus, the increase in variation signals the transition to another attractor, and thereby to another milestone in children’s language development (see also the work of Van Dijk and Van Geert [30.62]).

Interestingly, whereas variation patterns carry highly valuable information about the cognitive process
according to the CDS approach, variation is typically considered as random error according to the reductionist approach. In classical repeated measures designs, for instance, variation around the (linear) tendency over time is considered as error variance. On the contrary, periods of variation have been consistently found to be markers of a transition stage to another attractor in a variety of cognition-related (dynamical) research, not only in studies on language development (see the example above), but also on cognitive reasoning [30.67, 68], perception [30.69], and motor control [30.70].

Finally, returning to the study of Bassano and Van Geert [30.21], they proposed a mathematical model, defined as a dynamic growth model in which the growth of the language modes, and how they mutually influence each other, could be reliably modeled for the individual children. This means that the authors were not concerned with providing a model of the average language development across the population of children, which is typical for the reductionist approach, and which would probably result in an unrealistic picture of what individual language development may look like (many children do not develop according to the statistically average child). Rather, Bassano and Van Geert proposed a dynamic model that could also be generalized to the individual. In other words, the authors provided insights into the lawful mechanisms, or CDS principles, underlying language development over time (for a comparable example of model building in mother–child linguistic interactions and the associated developmental process in the child, see the recent work of Van Dijk and colleagues [30.45]).
maternal smoking during pregnancy on children’s later academic achievements, a reductionist approach may provide the best fit, because it is desirable that the variables of interest are studied in isolation to determine the effect. Indeed, when this question addresses the population level it can, for instance, be used in campaigns and medical advice. In a typical study, Batstra and colleagues [30.71] adjusted for confounders such as socioeconomic status and pre- and perinatal complications, and across 1186 children they found that maternal smoking during pregnancy was independently related to the children’s arithmetic and spelling skills between the ages of 5.5 and 11 years. Note that this study was not focused on explaining cognition, but rather on one potential risk factor, that is, the distribution of maternal smoking across the population and its statistical association with a population-defined effect (distribution of arithmetic and spelling skills).

However, as discussed earlier, the reductionist approach is also widely used to provide an understanding of the (complicated) mechanism that drives cognitive performance in real time, as well as cognitive development across the life span. The extent to which the reductionist approach on the one hand, or the CDS approach on the other hand, is most applicable to cognition that depends on whether it is a complicated or

30.3.1 Explaining Real-Time Cognitive Performance

In a recent study, Den Hartigh and colleagues [30.72] were interested in the mechanism underlying the (cognitive) control of a motor (rowing) task. The authors let rowers perform a practice session on rowing ergometers, consisting of 550 strokes at the rowers’ preferred rhythm. A force sensor was attached to the handle of the ergometer, which measured the exerted force of the rowers at 100 Hz. Subsequently, the authors analyzed the time series of the durations from force peak to force peak (the force-peak intervals). With the reductionist approach in mind, one would expect that each new stroke is controlled by specific modules or component processes (e.g., central pattern generators [30.73]). This entails that each new stroke would be controlled independently of the previous stroke, and that the results should reveal interval series characterized by some average interval value with random variation around it (recall that variation is typically treated as random noise in the reductionist approach).

On the other hand, a CDS is characterized by an iterative process involving interactions between various component processes at different levels (e.g., in this case cell activity, muscle contractions, limb movements) and across multiple time scales (e.g., from a few seconds to several minutes of performance [30.74]). Such ongoing component interactions would cooperatively generate the rower’s performance, and are assumed to generate time series characterized by a structured pattern of variation, called pink (or 1/f) noise. More specifically, the coordination among interacting component processes across multiple time scales within the system, and between the system and its (task) environment, would result in small fluctuations on a short time scale (a few rowing strokes) that are nested in larger fluctuations across longer time scales (tens or hundreds of strokes). The temporal structure of variation can be quantified in terms of the fractal dimension (FD): an FD close to 1.5 corresponds to random (white) noise, and an FD close to 1.2 corresponds to pink noise [30.38].

Figure 30.3a provides a representative example of a time series of one of the rowers. Based on only visual inspection, one can observe that minor fluctuations are embedded in waves of larger fluctuations. In line with this (seemingly) structured pattern of variation, we found an FD of 1.22, which is close to pink noise, and a comparable pattern was found for the other rowers in the sample. These results strongly suggest that the performance of the rowers emerged from complexity, that is, the interaction between many components on various scales, as opposed to (just) the contribution of many components. Figure 30.3b displays the performance data of the same rower as the one in Figure 30.3a, but in this case the force-peak interval data are randomized. Hence, the average interval and the size of the variation (standard deviation) are exactly the same in the two graphs; only the temporal order is different. In line with the fact that this randomization made each next rowing stroke independent of the previous one(s), we found an FD close to 1.5, which reflects random noise.

[Fig. 30.3a: Peak intervals (normalized); plot not reproduced]

In line with the study of Den Hartigh and colleagues [30.72], the occurrence of pink noise in cognitive performance seems a universal phenomenon [30.39, 75]. Virtually any cognitive or motor performance in which time series of healthy individuals are analyzed reveals pink noise patterns, ranging from reaction times in psychological experiments and reading fluency, to stride intervals of human gait and rhythmical aiming tasks [30.14, 16, 34, 38, 74, 76–80]. These studies provide robust and converging evidence for the claims of the CDS approach, which makes it likely that real-time cognitive performance emerges from complexity, and cannot be reduced to separate, rather independently operating components that perform specific functions to control human behavior [30.72].
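As a rough illustration of the randomization logic described above, the following Python sketch generates a synthetic 1/f series (not the rowing data), shuffles it, and compares the two. It uses the slope of the log-log power spectrum as a stand-in for temporal structure (near -1 for pink noise, near 0 for white noise). This is an assumption made for illustration, since the FD estimator used in the study is not specified in this section; the function names and the seed are hypothetical.

```python
import numpy as np

def spectral_slope(x):
    """Slope of log10(power) against log10(frequency) for a 1-D series."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x))
    keep = freqs > 0  # drop the zero-frequency term before taking logs
    slope, _ = np.polyfit(np.log10(freqs[keep]), np.log10(power[keep]), 1)
    return slope

def synthetic_pink(n, rng):
    """Synthetic 1/f series: white noise reshaped in the frequency domain."""
    spectrum = np.fft.rfft(rng.standard_normal(n))
    freqs = np.fft.rfftfreq(n)
    scale = np.zeros_like(freqs)
    scale[1:] = freqs[1:] ** -0.5  # amplitude ~ f^(-1/2), so power ~ 1/f
    return np.fft.irfft(spectrum * scale, n)

rng = np.random.default_rng(0)             # hypothetical seed
intervals = synthetic_pink(550, rng)       # 550 values, mirroring 550 strokes
shuffled = rng.permutation(intervals)      # same values, random temporal order

print("slope, original order:", round(spectral_slope(intervals), 2))
print("slope, shuffled order:", round(spectral_slope(shuffled), 2))
print("same mean:", np.isclose(intervals.mean(), shuffled.mean()))
print("same SD:  ", np.isclose(intervals.std(), shuffled.std()))
```

Shuffling leaves the mean and standard deviation untouched and changes only the ordering, which is why any difference between the two slopes can be attributed to temporal structure alone.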
predicted by particular determining factors, often including the age of the child, in a linear fashion (see also Sect. 30.1.2).

Interestingly, the literature on human development, and more specifically cognitive development, hardly reveals linear patterns [30.81]. In the specific case of scientific talent development, some defining properties have been summarized by Simonton in one of his articles on talent [30.85]. An example of these properties is that a similar form of (scientific) talent may emerge at different ages. Another is that the level of talent is not necessarily monotonically rising or stable. It can change or even disappear during a person’s life span.

We will briefly show how to apply computer modeling to test whether a particular model would be able to generate valid predictions of scientific talent development, such as the properties mentioned above. In line with the CDS approach, we will demonstrate a model in which development is shaped by the ongoing interactions with other components, which also undergo change. The key mathematical principles of such a (relatively simple) dynamic systems model consist of the scientific ability (L) that changes over time (t) as a function of two kinds of resources. One remains relatively stable across time (K), for instance, the individual’s genetic endowment. The second type of resource (V) may change on the same time scale as the change of the scientific ability, and comprises components such as commitment and teacher support. These components may dynamically interact with the scientific ability component and with each other. The interaction between the components is governed by a number of parameters, including the degree to which an ability profits from the constant resources (r), the weight of the connection (s) with other components (i, j, etc.) and a general limiting factor (C) that keeps the growth within realistic maximum values

$$
\left\{
\begin{aligned}
\frac{\Delta L_A}{\Delta t} &= \left( r_{L_A} L_A \left( 1 - \frac{L_A}{K_{L_A}} \right) + \sum_{v=1}^{v=i} s_v L_A V_v \right) \left( 1 - \frac{L_A}{C_A} \right) \\
\frac{\Delta L_B}{\Delta t} &= \left( r_{L_B} L_B \left( 1 - \frac{L_B}{K_{L_B}} \right) + \sum_{v=1}^{v=j} s_v L_B V_v \right) \left( 1 - \frac{L_B}{C_B} \right) \\
&\;\;\vdots
\end{aligned}
\right\}
\tag{30.3}
$$

For simplicity, we simulated a system consisting of 10 components. Each simulation represents a particular individual trajectory, which is based on initial parameter values that were randomly drawn from symmetric distributions. Furthermore, the average degree of connectivity between the nodes is 25%, and the connections are randomly distributed over the nodes [30.86]. Figure 30.4 provides a graphical representation of a typical network of relationships specified by this kind of model. The nodes correspond to different variables that interact with the ability growth and with each other (think of the individual’s commitment and family support). The sizes of the nodes reflect the magnitudes of the variables. Furthermore, each directed arrow between two nodes represents a supportive (green) or competitive effect (brown) of one variable on another. The strength of the relationships between the variables is reflected in the thickness of the edges.
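To make the structure of such a network model concrete, here is a minimal Python sketch of a 10-component system with roughly 25% random connectivity, where every component follows the growth form of (30.3). All numerical values (growth rates, resource levels, connection weights, the limiting factor, step size, number of steps) and the function name `simulate_network` are assumptions chosen for illustration; they are not the parameter values used in the chapter or in [30.86].

```python
import numpy as np

def simulate_network(n=10, steps=2000, dt=0.1, connectivity=0.25, seed=0):
    """Euler simulation of n coupled growers; component 0 plays the ability L."""
    rng = np.random.default_rng(seed)
    # Parameters drawn from symmetric (normal) distributions, then clipped
    # to keep the toy system well behaved (assumed values).
    r = np.clip(rng.normal(0.10, 0.02, n), 0.0, 0.2)     # growth rates
    K = np.clip(rng.normal(1.00, 0.15, n), 0.4, None)    # stable resources
    C = 3.0                                              # general limiting factor
    x = np.clip(rng.normal(0.05, 0.01, n), 0.001, None)  # initial levels

    # S[a, v] is the weight of the effect of component v on component a:
    # positive = supportive, negative = competitive, zero = no connection.
    mask = rng.random((n, n)) < connectivity
    np.fill_diagonal(mask, False)
    S = np.clip(rng.normal(0.0, 0.02, (n, n)), -0.08, 0.08) * mask

    traj = np.empty((steps + 1, n))
    traj[0] = x
    for t in range(steps):
        own_growth = r * x * (1.0 - x / K)   # r L (1 - L/K)
        coupling = x * (S @ x)               # sum over v of s_v L V_v
        x = x + dt * (own_growth + coupling) * (1.0 - x / C)
        traj[t + 1] = x
    return traj, S

traj, S = simulate_network()
ability = traj[:, 0]
print("ability at start, midpoint, end:", ability[0], ability[1000], ability[-1])
print("share of possible connections present:", (S != 0).sum() / (10 * 9))
```

Rerunning with different seeds gives different individual trajectories from the same equations, which is the sense in which each simulation stands for one individual.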
observation is that the model reveals very different patterns of scientific ability development. Figure 30.5 provides two representative simulation examples. The black lines in the graphs correspond to the scientific ability, whereas the other lines correspond to the other variables. The figure shows that the ability of one individual develops in a step-wise fashion, and reaches a plateau during the second half of the life cycle (Fig. 30.5a). On the other hand, the ability of the second individual (Fig. 30.5b) starts with a relatively rapid increase, which levels off, and in the second half of this individual’s life span the ability declines. Together, Fig. 30.5a and b correspond to the typical properties of scientific talent development, namely that it can take different forms, that it is not a linear (monotonic) process, and that talent may diminish or disappear over time [30.85] (for more extensive demonstrations of dynamic systems modeling of cognitive development, see the work of Van Geert [30.23, 81, 82]).

A final observation in the literature is that exceptional abilities are rare [30.86–89]. Specific abilities, such as the scientific ability required to write papers for high-ranked journals, are in most cases only measurable by referring to the typical performances or products (i.e., the number of published articles in high-ranked journals). The actual productivity of scientists is extremely right skewed, with very few researchers having many high-impact publications and relatively many with one high-impact publication. In fact, the distribution is so right skewed that the log–log representation corresponds with a straight line [30.87, 90–92]. In combination with various product models discussed in the literature [30.87, 93], simulations of the CDS model reveal an extremely skewed distribution that is in accordance with the distribution of scientific productivity of scientists in various scientific domains (Fig. 30.6). The typical reductionist model would try to explain the product distributions on the basis of linear combinations of underlying predictor variables. However, such a model is unable to predict the typical and ubiquitous heavily skewed distribution of the products, in this case the publications [30.86].

Taken together, based on data on real-time cognitive performance and computer modeling of long-term cognitive development, researchers can choose the model that most likely underlies the empirically observed patterns. We have presented some examples of data that can be better explained by model predictions stemming from the CDS approach (a complex model) than from a reductionist approach (a complicated model).
30.4 Conclusion
Every human behaves and develops in a different way, and is embedded in a rich, constantly changing environment. This has made it challenging for scientists to explain cognitive development and the control of human behavior. In the past decades, human cognition has, on the one hand, been approached as localized in the brain and controlled by separate components, and, on the other hand, as a dynamic process consisting of nonlocalized interacting component processes. The first approach, the reductionist approach, assumes that research practice should be focused on finding the explanation of cognition in the specific functions of the components, whereas the second approach, the complex dynamic systems approach, assumes that we should focus on the underlying (complex) dynamic principles to understand cognition. In our belief, researchers often apply the approach that they and their close colleagues are most familiar and comfortable with. Often, this is the reductionist approach, which has been widely applied in social and behavioral sciences since the cognitive revolution in the 1950s, whereas the CDS approach has relatively slowly gained ground since the 1990s [30.7, 34, 46, 65, 81].

In this chapter, we started with an overview of some key differences between the approaches without taking a position on which of the two is the better one (Sects. 30.1 and 30.2). Subsequently, we discussed findings on real-time processes and long-term cognitive development (Sect. 30.3). First, we showed that cognitive performance measured in real time reveals a structured pattern of variation (pink noise), which is difficult to reconcile with the reductionist view according to which a pattern of random variation would be expected. On the other hand, it fits with the CDS approach that cognitive performance emerges from ongoing component interactions, resulting in a time series in which short-term adaptations are embedded in slower but larger changes (Sect. 30.3.1).

Second, we demonstrated predictions that were focused on long-term cognitive development (i.e., scientific ability development). The reductionist approach assumes that cognitive development is shaped by the addition of relevant explanatory components or variables (e.g., genetic endowment, commitment, and teacher support), whereas the CDS approach proceeds from the idea that cognitive development is shaped by the ongoing dynamic interaction between the relevant variables. We showed that some typical properties of cognitive development, scientific talent development in particular, are generated by a model that is based on CDS principles (Sect. 30.3.2).

The plausible predictions that followed from the CDS approach suggest that cognition can best be explained by a complex model. Therefore, in light of future model building, we hope that researchers who apply the reductionist approach will keep an open mind regarding the potential of the CDS approach to capture the full richness of cognition and behavior. At the same time, CDS theorists should continue exploring whether a reductionist explanation may also fit with obtained results on (time-serial) cognitive processes. By doing so, researchers will be in a better position to provide a model to unlock the mystery of the three pounds of matter between our ears, and, importantly, how this is situated in our bodies and the environment we interact with. Given the current state of knowledge, we should keep in mind that the answer to this mystery, and the model we need, may not be complicated, but complex.
References
30.1 M.J. Richardson, K.L. Marsh, R.C. Schmidt: Challenging the egocentric view of coordinated perceiving, acting, and knowing. In: The Mind in Context, ed. by L.F. Barrett, B. Mesquita, E. Smith (Guilford, New York 2010), Chap. 15
30.2 K.A. Ericsson, W. Kintsch: Long-term working memory, Psychol. Rev. 102, 211–245 (1995)
30.3 A.B. Markman, E. Dietrich: In defense of representation, Cogn. Psychol. 40, 138–171 (2000)
30.4 K. Yarrow, P. Brown, J.W. Krakauer: Inside the brain of an elite athlete: The neural processes that support high achievement in sports, Nat. Rev. Neurosci. 10, 585–596 (2009)
30.5 A. Chemero: Anti-representationalism and the dynamical stance, Philos. Sci. 67, 625–647 (2000)
30.6 A. Clark: An embodied cognitive science?, Trends Cogn. Sci. 3, 345–351 (1999)
30.7 T. Van Gelder: What might cognition be, if not computation?, J. Philos. 92, 345–381 (1995)
30.8 H.A. Simon: A behavioral model of rational choice, Q. J. Econ. 69, 99–118 (1955)
30.9 R.F.A. Cox, W. Smitsman: Action planning in young children’s tool use, Dev. Sci. 9, 628–641 (2006)
30.10 P. Fitzpatrick, R. Diorio, M.J. Richardson, R.C. Schmidt: Dynamical methods for evaluating the time-dependent unfolding of social coordination in children with autism, Front. Integr. Neurosci. 7 (2013), doi:10.3389/fnint.2013.00021
30.11 K.L. Marsh, R.W. Isenhower, M.J. Richardson, M. Helt, A.D. Verbalis, R.C. Schmidt, D. Fein: Autism and social disconnection in interpersonal rocking, Front. Integr. Neurosci. 7 (2013), doi:10.3389/fnint.2013.00004
30.12 E. Thelen, G. Schöner, C. Scheier, L.B. Smith: The dynamics of embodiment: A field theory of infant perseverative reaching, Behav. Brain Sci. 24, 1–34 (2001)
30.13 M. Varlet, L. Marin, S. Raffard, R.C. Schmidt, D. Capdevielle, J.P. Boulenger, J. Del-Monte, B.G. Bardy: Impairments of social motor coordination in schizophrenia, PLoS ONE (2012), doi:10.1371/journal.pone.0029772
30.14 M.L. Wijnants, F. Hasselman, R.F.A. Cox, A.M.T. Bosman, G. Van Orden: An interaction-dominant perspective on reading fluency and dyslexia, Ann. Dyslexia 62, 100–119 (2012)
30.15 R.F. Port, T. van Gelder: Mind as Motion: Explorations in the Dynamics of Cognition (MIT Press, Cambridge 1995)
30.16 G.C. Van Orden, J.G. Holden, M.T. Turvey: Human cognition and 1/f scaling, J. Exp. Psychol. Gen. 134, 117–123 (2005)
30.17 R.F.A. Cox, A.W. Smitsman: Special section: Towards an embodiment of goals, Theor. Psychol. 18, 317–339 (2008)
30.18 D. Araujo, K. Davids, R. Hristovski: The ecological dynamics of decision making in sport, Psychol. Sport Exerc. 7, 653–676 (2006)
30.19 R.J.R. Den Hartigh, S. Van Der Steen, M. De Meij, N. Van Yperen, C. Gernigon, P.L.C. Van Geert: Characterising expert representations during real-time action: A Skill Theory application to soccer, J. Cogn. Psychol. 26, 754–767 (2014)
30.20 A.M. Williams: Perceptual skill in soccer: Implications for talent identification and development, J. Sports Sci. 18, 737–750 (2000)
30.21 D. Bassano, P. Van Geert: Modeling continuity and discontinuity in utterance length: A quantitative approach to changes, transitions and intra-individual variability in early grammatical development, Dev. Sci. 10, 588–612 (2007)
30.22 M. Van Dijk, P. Van Geert: Wobbles, humps and sudden jumps: A case study of continuity, discontinuity and variability in early language development, Infant Child Dev. 16, 7–33 (2007)
30.23 P. Van Geert: A dynamic systems model of cognitive and language growth, Psychol. Rev. 98, 3–53 (1991)
30.24 G. Park, D. Lubinski, C.P. Benbow: Contrasting intellectual patterns predict creativity in the arts and sciences tracking intellectually precocious youth over 25 years, Psychol. Sci. 18, 948–952 (2007)
30.25 S. Van Der Steen, H. Steenbeek, M.W.G. Van Dijk, P. Van Geert: A process approach to children’s understanding of scientific concepts: A longitudinal case study, Learn. Indiv. Diff. 30, 84–91 (2014)
30.26 W.H. Dittrich: Seeing biological motion – Is there a role for cognitive strategies? In: Gesture-Based Communication in Human-Computer Interaction, Lecture Notes in Computer Science, Vol. 1739, ed. by A. Braffort, R. Gherbi, S. Gibet, D. Teil, J. Richardson (Springer, Berlin 1999) pp. 3–22
30.27 B. Hommel: The cognitive representation of action: Automatic integration of perceived action effects, Psychol. Res. 59, 176–186 (1996)
30.28 T. Schack, H. Ritter: The cognitive nature of action – Functional links between cognitive psychology, movement science, and robotics, Prog. Brain Res. 174, 231–250 (2009)
30.29 P. Thagard: Mind: Introduction to Cognitive Science (MIT Press, Cambridge 2005)
30.30 A.M. Williams, N.J. Hodges, J.S. North, G. Barton: Perceiving patterns of play in dynamic sport tasks: Investigating the essential information underlying skilled performance, Perception 35, 317–332 (2006)
30.31 J.J. Gibson: The Senses Considered as Perceptual Systems (Houghton-Mifflin, Boston 1966)
30.32 J.J. Gibson: The theory of affordances. In: Perceiving, Acting and Knowing: Toward an Ecological Psychology, ed. by R. Shaw, J. Bransford (Lawrence Erlbaum Associates, New York 1977) pp. 67–82
30.33 J.J. Gibson: The Ecological Approach to Visual Perception (Houghton-Mifflin, Boston 1979)
30.34 C.T. Kello, B.C. Beltz, J.G. Holden, G.C. Van Orden: The emergent coordination of cognitive function, J. Exp. Psychol. Gen. 136, 551–568 (2007)
30.35 P.N. Kugler, M.T. Turvey: Information, Natural Law, and The Self-Assembly of Rhythmic Movement (Lawrence Erlbaum Associates, Hillsdale 1987)
30.36 E. Thelen, L.B. Smith: A Dynamic Systems Approach to the Development of Cognition and Action (MIT Press, Cambridge 1994)
30.37 P.L.C. Van Geert, K.W. Fischer: Dynamic systems and the quest for individual-based models of change and development. In: Toward a New Grand Theory of Development? Connectionism and Dynamic Systems Theory Reconsidered, ed. by J.P. Spencer, M.S.C. Thomas, J. McClelland (Oxford Univ. Press, Oxford 2009), Chap. 16
30.38 G.C. Van Orden, J.G. Holden, M.T. Turvey: Self-organization of cognitive performance, J. Exp. Psychol. Gen. 132, 331–335 (2003)
30.39 M.L. Wijnants: A review of theoretical perspectives in cognitive science on the presence of scaling in coordinated physiological and cognitive processes, J. Nonlinear Dyn. (2014), doi:10.1155/2014/962043
30.40 K. Davids, D. Araújo: The concept of Organismic Asymmetry in sport science, J. Sci. Med. Sport 13, 633–640 (2010)
30.41 J.F. Grehaigne, D. Bouthier, B. David: Dynamic-system analysis of opponent relationships in collective actions in soccer, J. Sports Sci. 15, 137–149 (1997)
30.42 N. Hurtado, V.A. Marchman, A. Fernald: Does input influence uptake? Links between maternal talk, processing speed and vocabulary size in Spanish-learning children, Dev. Sci. 11, F31–F39 (2008)
30.43 A. Weisleder, A. Fernald: Talking to children matters: Early language experience strengthens processing and builds vocabulary, Psychol. Sci. 24, 2143–2152 (2013)
30.44 P.L.C. Van Geert: Nonlinear complex dynamic systems in developmental psychology. In: Chaos and Complexity in Psychology: The Theory of Nonlinear Dynamical Systems, ed. by S.J. Guastello, M. Koopmans, D. Pincus (Cambridge Univ. Press, New York 2009) pp. 242–281
30.45 M. Van Dijk, P. Van Geert, K. Korecky-Kröll, I. Maillochon, S. Laaha, W.U. Dressler, D. Bassano: Dynamic adaptation in child–adult language interaction, Lang. Learn. 63, 243–270 (2013)
30.46 J.M. Ottino: Engineering complex systems, Nature 427, 399 (2004)
30.47 C.T. Kello, G.D. Brown, R. Ferrer-i-Cancho, J.G. Holden, K. Linkenkaer-Hansen, T. Rhodes, G.C. Van Orden: Scaling laws in cognitive sciences, Trends Cogn. Sci. 14, 223–232 (2010)
30.48 J.A.S. Kelso: Dynamic Patterns: The Self-Organization of Brain and Behavior (MIT Press, Cambridge 1995)
30.49 R. Duarte, D. Araújo, L. Freire, H. Folgado, O. Fernandes, K. Davids: Intra- and inter-group coordination patterns reveal collective behaviors of football players near the scoring zone, Hum. Mov. Sci. 31, 1639–1651 (2012)
30.50 R. Duarte, D. Araújo, K. Davids, B. Travassos, V. Gazimba, J. Sampaio: Interpersonal coordination tendencies shape 1-vs-1 sub-phase performance outcomes in youth soccer, J. Sports Sci. 30, 871–877 (2012)
30.51 J. Headrick, K. Davids, I. Renshaw, D. Araújo, P. Passos, O. Fernandes: Proximity-to-goal as a constraint on patterns of behaviour in attacker–defender dyads in team games, J. Sports Sci. 30, 247–253 (2012)
30.52 W. Frencken, J. Van Der Plaats, C. Visscher, K. Lem-
to the study of development, Dev. Rev. 25, 408–442 (2005)
30.61 R. Ruhland, P. Van Geert: Jumping into syntax: Transitions in the development of closed class words, Brit. J. Dev. Psychol. 16, 65–95 (1998)
30.62 M. Van Dijk, P. Van Geert: Disentangling behavior in early child development: Interpretability of early child language and its effect on utterance length measures, Infant Behav. Dev. 28, 99–117 (2005)
30.63 P. Van Geert, M. Van Dijk: Focus on variability: New tools to study intra-individual variability in developmental data, Infant Behav. Dev. 25, 340–374 (2002)
30.64 W. Briki, R.J.R. Den Hartigh, K.D. Markman, C. Gernigon: How do supporters perceive positive and negative psychological momentum changes during a simulated cycling competition?, Psychol. Sport Exerc. 15, 216–221 (2014)
30.65 A. Nowak, R.R. Vallacher: Dynamical Social Psychology (Guilford, New York 1998)
30.66 P. Van Geert: The dynamic systems approach in the study of L1 and L2 acquisition: An introduction, Mod. Lang. J. 92, 179–199 (2008)
30.67 B.R. Jansen, H.L. Van der Maas: Evidence for the phase transition from Rule I to Rule II on the balance scale task, Dev. Rev. 21, 450–494 (2001)
30.68 H.L. Van der Maas, P.C. Molenaar: Stagewise cognitive development: An application of catastrophe theory, Psychol. Rev. 99, 395–417 (1992)
30.69 H.S. Hock, J.A.S. Kelso, G. Schöner: Bistability and hysteresis in the organization of apparent motion patterns, J. Exp. Psychol. Hum. Percept. Perform.
30.78 A.L. Goldberger, L.A. Amaral, J.M. Hausdorff, P.C. Ivanov, C.K. Peng, H.E. Stanley: Fractal dynamics in physiology: Alterations with disease and aging, Proc. Natl. Acad. Sci. 99, 2466–2472 (2002)
30.79 J.M. Hausdorff, Y. Ashkenazy, C.K. Peng, P.C. Ivanov, H.E. Stanley, A.L. Goldberger: When human walking becomes random walking: Fractal analysis and modeling of gait rhythm fluctuations, Phys. Stat. Mech. Appl. 302, 138–147 (2001)
30.80 M.L. Wijnants, A.M. Bosman, F. Hasselman, R.F.A. Cox, G. Van Orden: 1/f scaling in movement time changes with practice in precision aiming, Nonlinear Dyn. Psychol. Life Sci. 13, 75–94 (2009)
30.81 P. Van Geert: Dynamic Systems of Development: Change Between Complexity and Chaos (Harvester, New York 1994)
30.82 P. Van Geert: Dynamic modeling for development and education: From concepts to numbers, Mind Brain Educ. 8, 57–73 (2014)
30.83 D. Lubinski: Exceptional cognitive ability: The phenotype, Behav. Genet. 39, 350–358 (2009)
30.84 D. Lubinski, C.P. Benbow: Study of mathematically precocious youth after 35 years: Uncovering antecedents for the development of math-science expertise, Perspect. Psychol. Sci. 1, 316–345 (2006)
30.85 D.K. Simonton: Talent development as a multidimensional, multiplicative, and dynamic process, Curr. Dir. Psychol. Sci. 10, 39–43 (2001)
30.86 R.J.R. Den Hartigh, M.W.G. Van Dijk, H.W. Steenbeek, P.L.C. Van Geert: A dynamic network model to explain the development of excellent human performance, Front. Psychol. 7 (2016), doi:10.3389/fpsyg.2016.00532
30.87 J.C. Huber, R. Wagner-Dobler: Scientific production: A statistical analysis of authors in mathematical logic, Scientometrics 50, 323–337 (2001)
30.88 E. O’Boyle Jr, H. Aguinis: The best and the rest: Revisiting the norm of normality of individual performance, Pers. Psychol. 65, 79–119 (2012)
30.89 D.K. Simonton: Talent and its development: An emergenic and epigenetic model, Psychol. Rev. 106, 435–457 (1999)
30.90 J. Laherrere, D. Sornette: Stretched exponential distributions in nature and economy: “Fat tails” with characteristic scales, Eur. Phys. J. 2, 525–539 (1998)
30.91 S. Redner: How popular is your paper? An empirical study of the citation distribution, Eur. Phys. J. B. 4, 131–134 (1998)
30.92 M. Sutter, M.G. Kocher: Power laws of research output, Scientometrics 51, 405–414 (2001)
30.93 J.C. Huber: A statistical analysis of special cases of creativity, J. Creat. Behav. 34, 203–225 (2000)
31. From Neural Circuitry to Mechanistic Model-Based Reasoning
Jonathan Waskan
mechanisms in science is achieved through the use
of scale-model-like mental representations.
A central form of model-based reasoning in science, particularly in the special sciences, is model-based reasoning about mechanisms. This form of reasoning can be effected with the aid of external representational aids (e.g., formalisms, diagrams, and computer simulations) and through the in-the-head manipulation of representations. Philosophers of science have devoted most of their attention to the former, but the latter is arguably at the heart of most of what passes for explanatory understanding in science (Sect. 31.1). Psychologists have long theorized that humans and other creatures (e.g., rats) reason about spatial, kinematic, and dynamic relationships through the use of mental representations, often termed mental models, that are structurally similar to scale models, though clearly the brain does not instantiate the very properties of a modeled system in the way that scale models do (Sect. 31.2). A key challenge facing this view is thus to show that brains are capable of realizing representations that are like scale models in crucial respects. There have been several failed attempts to show precisely this, but a look at how computers are utilized to model mechanical interactions offers a useful way of understanding how brains might realize mental representations of the relevant sort (Sect. 31.3). This approach meshes well with current research on mental maps in rats. In addition, it has useful ramifications for research in artificial intelligence (AI) and logic (Sect. 31.4), and it offers a promising account of the generative knowledge that scientists bring to bear when testing mechanistic theories while also shedding light on the role that external representations of mechanisms play in scientific reasoning (Sect. 31.5).
that phenomenon is generated” [31.7]. Bechtel claims that the representations underlying this mental animation process may have a structure similar to that of the diagrams scientists use in their thinking and to the animated renderings of computer simulations scientists construct to represent proposed mechanisms in action. As for prediction, he notes [31.6]:

“what the scientist advances is a representation of a mechanism [. . . ] She or he then evaluates the representation by using it to reason about how such a mechanism would be expected to behave under a variety of circumstances and testing these expectations against the behavior of the actual mechanism.”

In other words, once the scientist possesses a model of the mechanisms that may be responsible for an occurrence, which may take the form of a mental model, he or she may then use it to formulate predictions in order to test that model.

While external representational artifacts may sometimes be required in order to achieve explanatory understanding of how a mechanism could produce a given phenomenon, plausibly those artifacts are not themselves sufficient for explanatory understanding. (For evidence that there is a crucial psychological component to explanatory understanding, see [31.8].) Instead, representational artifacts may have the important function of facilitating understanding by enhancing the scientist’s ability to mentally simulate the process by which the proposed mechanism would produce the target phenomenon. (As shown in Sect. 31.4.2, external representational aids may also enable forms of reasoning that would otherwise (e.g., due to the complexity of the mechanism) be impossible.) Through manipulation of those mental simulations, scientists may also discover novel predictions of a given model.
Part F | 31.2
landmark monograph, The Nature of Explanation. Regarding everyday reasoning, Craik suggests [31.9]:
“If the organism carries a small-scale model of external reality and of its own possible actions within its head, it is able to try out various alternatives, conclude which is the best of them, react to future situations before they arise [. . . ] and in every way to react in a much fuller, safer, and more competent manner to the emergencies which face it.”
to the one in Fig. 31.1b, where the alley that the rats had previously learned to traverse was blocked. Upon discovering this, the vast preponderance of rats then chose the alley that led most directly to where the food source had been in previous trials. On the basis of such experiments, Tolman concluded that rats navigate with the
aid of cognitive maps of the relative spatial locations of objects in their environment.
Later, Shepard and Metzler would show that the time it takes for people to determine if two three-dimensional (3-D) structures have the same shape is proportional to the relative degree of rotational displacement of those structures [31.13]. One neat explanation for this finding is that people engage in the mental rotation of 3-D models of the two structures until they are aligned in such a fashion as to enable easier comparison. In another landmark study of mental imagery, Kosslyn showed that reaction times for scanning across mental images of a map were proportional to distance, but not to the number of intervening objects, suggesting that spatial reasoning is better explained by a process akin to scanning across a real map than by a process of sentence-based reasoning (e.g., working through a list structure) [31.14].
All of this research points to the existence of mental models of two-dimensional (2-D) and 3-D spatial relationships, but to support the full range of inferences implicated in mechanistic model-based scientific reasoning, mental models would need to capture kinematic and dynamic relations as well. There is some support for the existence of these models as well. For instance, Schwartz and Black observed similar, proportional reaction times when subjects were asked to determine whether or not a knob on one gear would, when that gear is rotated, fit into a groove on a connecting gear (Fig. 31.2a) [31.15]. Schwartz and Black found, moreover, that subjects were able to “induce patterns of behavior from the results depicted in their imaginations” [31.16]. Subjects might, for instance, infer and remember that the second in a series of gears will, along with every other even-numbered gear, turn in the opposite direction of the drive gear (Fig. 31.2b). Having inferred this through simulation, the information becomes stored as explicit knowledge, thereby eliminating the need to generate the knowledge anew for each new application.
Fig. 31.2a,b Knob and groove on connecting gears (a), (after [31.15]). Gears in series (b), (after [31.16])
In addition, [31.17, 18] have shown that mental modeling of dynamic relationships is often effected in piecemeal fashion, a process that is much better suited for tracing a sequence of interactions through a system than for simulating collections of dynamic effects all at once. All of this research fits well with Norman’s early assessment of mental models. He notes [31.19]:
“1. Mental models are incomplete.
2. People’s abilities to run their models are severely limited.
3. Mental models are unstable: People forget the details of the system they are using [. . . ]
4. Mental models do not have firm boundaries: similar devices and operations get confused with one another.”
These limitations on the human ability to construct and manipulate mental models surely have a great deal to do with more general limitations on the capacity of human working memory and with the high cognitive load associated with creating, maintaining, and manipulating mental models.
In everyday reasoning with mental models, the behaviors of the component structures in our models will not typically be tied in any direct way to fundamental physical laws (e.g., Newtonian, quantum mechanical, or relativistic). Rather, many of the kinematic and dynamic principles governing object behavior in our mental simulations will be rooted in early experiences of collisions, impenetrability, balance and support, projectiles, blocking, and so forth [31.20–22]. In addition, in everyday reasoning, and even more so in scientific reasoning about mechanisms, many of the behaviors of the components of our models will not be the result of early learning. Some of these will be one-off brute events – such as a meteor striking the earth, a gene mutating, or a latch coming undone – for which one does not have or require (in order to formulate a satisfactory answer to the question of why the explanandum occurred) any deeper explanation. Such occurrences might be imposed upon a mental model in much the same way that one would impose them – that is, through direct intervention – on a scale model. In the same way, one could also impose newly learned or hypothesized regularities on a model. Some of these might be discovered through simple induction (one might notice that one’s car engine becomes louder in cold weather) or through prior model-based reasoning (as in Schwartz’ study with gears). However, when formulating mechanical explanations, particularly in science, one sometimes simply hypothesizes, as a way of making sense of the available data, that a particular regularity obtains. A good example of this is the way that the hypothesis of periodic geomagnetic pole flipping was used to make sense of the patterns of magnetization in rocks found lateral to mid-ocean rifts [31.1]. Such ideas accord well with recent work regarding mechanistic explanation in the philosophy of science, where it is generally recognized that our models of mechanisms
From Neural Circuitry to Mechanistic Model-Based Reasoning 31.3 Mental Models in the Brain: Attempts at Psycho-Neural Reduction 675
typically bottom out at brute activities [31.5] or functions [31.6].
The above empirical research sheds light on the properties of the models we use to reason about mechanisms in everyday life and in science. There is, in addition, a great deal of research that simply hypothesizes that we do utilize such models to understand other cognitive processes such as language comprehension, concepts [31.23], or learning [31.24–30].
The hypothesis of mental models has also been invoked by Johnson-Laird to explain deductive reasoning, though here the term mental model is used somewhat differently than it is in the research cited above [31.31]. (Below, I explain in greater depth the contrast between deductive reasoning more generally and the mental models approach to mechanistic reasoning espoused here.) Like many proponents of mental models, Johnson-Laird and Byrne do claim to be directly inspired by Craik, an inspiration that shows up in their suggestion that mental models have “a structure that is remote from verbal assertions, but close to the structure of the world as humans conceive it” [31.32]. However, if we look more closely at the way in which Johnson-Laird employs the mental models hypothesis in accounting for reasoning processes (deductive, inductive, and abductive), it begins to look as though he has something very different in mind. For instance, with regard to deductive reasoning – that is, reasoning that mainly involves the semantic properties of topic-neutral logical operators such as if. . . then. . . , and, all, and some – Johnson-Laird proposes that we reason internally through a process not unlike the formal method of truth table analysis. For instance, on Johnson-Laird’s view, the conditional, If the door is pushed, then the bucket will fall, would be mentally represented as something like the following spatial array, which lists those scenarios (models) that would be consistent with the truth of the statement (¬ here signals negation):
door pushed      bucket falls
¬door pushed     bucket falls
¬door pushed     ¬bucket falls
If presented with the additional premise, The bucket did not fall, one could then eliminate all but the last of these models, enabling a valid deduction to The door was not pushed. The formal, topic-neutral nature of this strategy means that it works in exactly the same way regardless of what items (e.g., balloons, satellites, or mice) we are reasoning about. To say nothing of the viability of the approach, Johnson-Laird’s proposals regarding deductive (as well as inductive and abductive) reasoning thus seem, except insofar as they appeal to such structures as spatial arrays, at odds with his avowed view that mental models have a structure closer to the world than to our descriptions of it.
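The elimination strategy described above for the door-and-bucket conditional can be illustrated with a short sketch. This is only a hedged illustration, not Johnson-Laird’s own formalism: it assumes that each model is a pair of truth values (door pushed, bucket falls), and the names models_of_conditional and eliminate are hypothetical, introduced here for the example.

```python
from itertools import product

def models_of_conditional():
    """List the (door_pushed, bucket_falls) scenarios consistent with
    'If the door is pushed, then the bucket will fall'."""
    return [(pushed, falls)
            for pushed, falls in product([True, False], repeat=2)
            if (not pushed) or falls]

def eliminate(models, premise):
    """Keep only the models that are consistent with an added premise."""
    return [m for m in models if premise(m)]

# The three rows of the spatial array in the text.
models = models_of_conditional()

# Additional premise: the bucket did not fall.
remaining = eliminate(models, lambda m: not m[1])

# The conclusion follows if the door is unpushed in every surviving model.
door_not_pushed = bool(remaining) and all(not pushed for pushed, _ in remaining)
print(models, remaining, door_not_pushed)
```

Nothing in the sketch depends on what the two propositions are about, which is the sense in which the strategy is topic-neutral: relabeling the pair as any other antecedent and consequent leaves the procedure unchanged.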
Part F | 31.3
terialize is a demonstration that brains do or, what is even more worrisome, that they could harbor mental models that are like scale models in crucial respects. One can see how this might raise concerns about the mental models hypothesis. After all, if brains cannot realize such models then the above explanatory appeals to mental models come out looking misguided from the outset. At the same time, there is a competing hypothesis which faces no such difficulties. In its most audacious form, it is the proposal that all of cognition is effected through formal computational operations – that is, operations that involve the application of syntax-sensitive inference rules to syntactically structured (sentential) representations.
Proponents of the computational theory of cognition know that they have nothing to fear, at least with regard to the matter of whether or not brains are capable of realizing the relevant kinds of syntax-crunching operations. McCulloch and Pitts showed,
a universal Turing machine [31.33]. Indeed, it was in no small part because von Neumann recognized the functional similarities between McCulloch–Pitts neurons and electronic switches (e.g., transistors) that he was inspired to create the first fully programmable computers, ENIAC and EDVAC. More recently, it has been shown that recurrent neural networks are, memory limitations notwithstanding, capable of implementing computers that are Turing complete [31.34]. There is, then, no longer any doubt that it is possible to bridge the divide between neural machinations and syntax-crunching operations.
In contrast, a satisfactory demonstration that neural machinations might realize mental models – that is, nonsentential mental representations that are like scale models in crucial respects – has proven far more elusive. Indeed, difficulties arise the moment one tries to specify what the crucial respects might be, as is evidenced by the fact that each past attempt at doing this
has been argued, not without justification, to run afoul of one or the other of the following two desiderata:
1. An adequate account of mental models must be compatible with basic facts about the brain.
2. An adequate account of mental models must be specific enough to distinguish mental models from other kinds of representation (sentential representations).
Again, this is no small matter, for given that brains are known to be capable of formal computational operations, if it cannot be shown that they are also capable of realizing mental models, this will cast doubt on all those psychological theories mentioned above that advert to mental models. This is a concern made all the more pressing by the fact that proponents of the computational theory of cognition have no shortage of alternative explanations for the behavioral data cited in support of mental models. For instance, to the extent that people report having model-like phenomenology, this might be dismissed as a mere epiphenomenon of the actual, underlying computational operations. Similarly, to the extent that behavioral data, such as reaction times, suggest reliance upon model-like mental representations that undergo continuous transformations, this might be chalked up to demand characteristics (subjects may feel compelled to pretend that they are scanning a map). Some of these specific objections could be vulnerable in that they give rise to their own testable predictions [31.35], but, as explained below, proponents of the computational theory have an ace up their sleeve, for computational accounts are flexible enough to handle virtually any behavioral data. All of this is quite general, so let us turn to some of the specific attempts
Mere Isomorphism
The most straightforward form of isomorphism invoked in this literature is what might be termed bare isomorphism, or isomorphism simpliciter, which is a purported relationship between mental models and what they represent. Despite initial appearances, this is the form of isomorphism that Craik seems to have had in mind. He claims, for instance: “By a model we thus mean any physical or chemical system which has a similar relation-structure to that of the process it imitates” [31.9]. Latter-day proponents of this proposal include Cummins [31.36] and Hegarty, who, in an attempt to summarize the dominant view of mental models in psychology, notes [31.18]:
“a mental model (or situation model) is a representation that is isomorphic to the physical situation that it represents and the inference processes simulate the physical processes being reasoned about.”
Fig. 31.3 Kelvin’s first tide predicting device (photo by William M. Connoley)
One serious concern about this approach is that it is too liberal, which is to say that it leads one to classify too wide a range of representations as models. Consider, for instance, that one of Craik’s favored examples of a representation with a similar relation structure to what it represents is Kelvin’s Tide Predictor, a device that consists of an ingenious system of gears and pulleys arranged so as to support truth-preserving inferences regarding the tides (Fig. 31.3). Says Craik [31.9],
“My hypothesis then is that thought models, or parallels, reality–that its essential feature is [. . . ] symbolism, and that this symbolism is largely of the
same kind as that which is familiar to us in mechanical devices which aid thought and calculation.”
This, of course, is no different from what proponents of the computational theory of cognition currently maintain. After all, any syntax-crunching system capable of supporting truth-preserving inferences with respect to a given physical system will have to be isomorphic with it – that is, there will have to be correspondences between the parts and relations in the system and the components of the representation – in ways that get preserved over the course of computation. To that extent, one might even say that the inference process simulates, or even pictures [31.37], the process being reasoned about. In short, then, the proposal that mental models are merely isomorphic with what they represent is far too vague to satisfy desideratum (2.) above. Indeed, it is for this very reason that researchers have tried to find a more restrictive notion of isomorphism, one that can distinguish models from sentential representations.
Physical Isomorphism
Perhaps the most restrictive such notion is that of structural [31.38] or physical [31.39] isomorphism, which involves instantiating the very same properties, and arrangements thereof, as the represented system. This appears to be the kind of isomorphism that Thagard has in mind when he claims [31.11] (also see [31.40]):
“Demonstrating that neural representation can constitute mental models requires showing how they can have the same relational structure as what they represent, both statically and dynamically.”
Thagard cites Kosslyn’s research as indicative of how this demand might be met, and in Kosslyn too, we do find frequent appeals to structural isomorphisms. For instance, noting the retinotopic organization of areas of visual cortex that are implicated in mental imagery, Kosslyn claims, “these areas represent depictively in the most literal sense [. . . ]” [31.41].
Unfortunately, the postulation of physically isomorphic mental representations is highly suspect for several reasons. To start with, the kind of retinotopy that one finds in areas such as V1 is highly distorted relative to the world due to the disproportionate amount of cortex devoted to the central portion of the retina (the fovea). A square in the visual field is thus not represented in the cortex by sets of neurons that lie in straight, let alone in parallel, lines. Moreover, visual representation seems not to be carried out through the activity of any single retinotopically organized neural ensemble. Rather, vision involves the combined activity of a variety of systems that are, to a considerable extent, anatomically and functionally distinct [31.42–44]. Lastly, the kind of retinotopy pointed out by Kosslyn is restricted to two spatial dimensions, and a 2-D representational medium cannot realize representations that are physically isomorphic with what they represent in three dimensions. Nor, a fortiori, can such a medium realize representations that are physically isomorphic in both 3-D and causal respects. Crudely put, there are no literal buckets, balls, or doors in the brain. (Perhaps it is worth noting, as well, how inessential structural isomorphism is to information processing in neural networks, even in the case of 2-D retinotopic maps. The relative physical locations of neural cell bodies seem irrelevant when compared to the patterns of connectivity between neurons, the strengths and valences of connections, and the schemes of temporal coding the neurons employ. One would expect then that, so long as all of this is preserved, cell bodies might be tangled up in arbitrary ways without affecting processing.)
Functional Isomorphism
The main problem with the appeal to physical isomorphism, one that has long been appreciated, is that it fails to satisfy desideratum (1.). As Shepard and Chipman note, “With about as much logic, one might as well argue that the neurons that signal that the square is green should themselves be green!” [31.38]. Recognizing this, and recognizing the weakness of appeals to mere isomorphism, Shepard and Chipman push for the following moderate notion of isomorphism [31.38, italics added for emphasis]:
“isomorphism should be sought – not in the first-order relation between (a) an individual object, and (b) its corresponding internal representation – but in the second-order relation between (a) the relations among alternative external objects, and (b) the relations among their corresponding internal representations. Thus, although the internal representation for a square need not itself be square, it should [. . . ] at least have a closer functional relation to the internal representation for a rectangle than to that, say, for a green flash or the taste of persimmon.”
The appeal to second-order isomorphism would, they hoped, provide an alternative to physical isomorphism that is both consistent with basic brain facts (desideratum (1.)) and distinct from sentential accounts (desideratum (2.)).
Another moderate account of isomorphism was put forward at the same time by Huttenlocher et al. [31.45]. They had a particular interest in how subjects make ordering inferences (viz., those involving the ordering of three items along such dimensions as size, weight and
height) like this one
Linus is taller than Prior.
Prior is taller than Mabel.
∴ Linus is taller than Mabel.
Huttenlocher et al. suggested that subjects might use representations that “are isomorphic with the physically realized representations they use in solving analogous problems (graphs, maps, etc.) [. . . ]” [31.45]. The essence of their proposal was that the mental representations that subjects form in order to solve such problems might function like spatial arrays rather than like sentences. For instance, what seems distinctive about external sentential representations of three-term ordering syllogisms like the one above is that, because each premise is represented in terms of a distinct expression, terms that denote particular individuals must be repeated. On the other hand, when such inferences are made with the aid of external spatial arrays, the terms need not be repeated. For instance, one can make inferences about the taller-than relation on the basis of the left-of relation with the help of marks on a paper like these
L P M
In fact, the introspective reports obtained by Huttenlocher et al. did support the idea that subjects were constructing the functional equivalents of spatial arrays – for instance, subjects reported that symbols representing individuals were not repeated – and on this basis they claimed that subjects might be carrying out three-term ordering inferences using mental repre-
Anderson is generally credited with confirming this suspicion by pointing out the possible tradeoffs that can be made between assumptions about representational structure and those concerning the processes that operate over the representations [31.46]. He showed that the possible structure-process tradeoffs render computational accounts flexible enough to handle virtually any behavioral finding. Most have since endorsed his thesis that it is, at least after the fact, always possible to “generate a propositional (i.e., sentential) model to mimic an imaginal model” [31.46]. Alternatively, as Palmer puts it, if you create the right sentential model it will be functionally isomorphic to what it represents in just the sense that a nonsentential model is supposed to be [31.39].
Imagery and Perception
One last way in which one might try to satisfy the above desiderata, at least with regard to spatial models, is to point out that visual mental imagery involves the utilization of visual processing resources. Brooks [31.47] and Segal and Fusella [31.48], for instance, discovered that performance on visual imagery tasks is diminished when subjects must perform a concurrent visual processing task but not when they perform an auditory task – that is, they found that there is interference between mental imagery and visual perception but not between mental imagery and auditory perception (see also [31.36]). However, if these findings are meant to provide a model-based alternative to computational theories, the attempt would appear to have the same fundamental flaw as the appeal to functional isomorphism. As Block notes, because perceptual processing can, in principle, also be explained in terms of computational
by proponents of the computational theory of mind, to distinguish external images and scale models from sentential representations. Three such features concern the sorts of entities, properties, and processes that each form of representation is naturally suited for representing:
1. Images and scale models are not naturally suited for representing abstract entities, properties, and processes (e.g., war criminal, ownership, or economic inflation). They are much better suited for representing concrete entities, properties, and processes (e.g., a bucket, one object being over another, or compression).
2. Images and scale models are not naturally suited for representing general categories (e.g., triangles or automobiles). They are better suited for representing specific instances of categories (Note: Genera differ from abstracta in that the former can be concrete (e.g., rocks) and the latter can be specific (e.g., the Enlightenment)).
3. Images and scale models are not naturally suited for singling out specific properties of specific objects [31.37, 50]. For instance, it would be difficult, using a scale model, to represent just the fact that Fred’s car is green, for any such model will simultaneously represent many other properties, such as the number of doors and wheels, the body type, and so on.
In contrast, sentential representations (those constructed using natural and artificial languages) have little trouble representing abstracta (e.g., war criminal), genera (triangle), and specific properties of specific objects (e.g., Fred’s car is green).
While images and scale models are relatively disadvantaged in the above respects, they are much better suited for supporting inferences regarding the consequences of alterations to specific, concrete systems. The fact that syntax-crunching systems are quite limited in this regard first came to light as a consequence of early work in formal-logic-inspired, sentence-and-rule-based AI. The general problem confronting syntax-crunching approaches came to be known as the frame problem [31.51].
In its original formulation, the frame problem had much to do with the challenge of endowing a sentence-and-rule-based representational system with the ability to anticipate what will not change following an alteration to the world (e.g., tipping over a bottle changes its orientation but not its color). Today, however, the frame problem is regarded as something more general – namely, the problem of endowing computational systems (and other artifacts) with the kind of commonsense knowledge that the average human possesses about what will change and what will stay the same following alterations to the objects in the world. As Hayes puts it [31.52]:
“The frame problem arises in attempts to formalise problem-solving processes involving interactions with a complex world. It concerns the difficulty of keeping track of the consequences of the performance of an action in, or more generally of the making of some alteration to, a representation of the world.”
The frame problem can actually be broken down into at least two component problems, the prediction problem [31.53] and the qualification problem [31.54].
Fig. 31.4 A toy world: A doorway, a bucket, and a ball (after [31.55])
As it confronts computational devices, the prediction problem can be summed up as follows: In order to support inferences about the consequences of alterations to even simple physical systems, a sentence-and-rule system would have to contain innumerable rules that explicitly specify how objects will behave relative to one another following each of innumerable possible alterations. For a simple illustration, consider what we all know about the consequences of different ways of altering the items in Fig. 31.4. We know, for example, what would happen were we to use the bucket to throw the ball through the open doorway, were we to place the bucket over the ball and slide the bucket through the doorway, were we to set the bucket containing the ball atop the slightly ajar door and then shove the door open, and so on indefinitely. To endow a sentence-and-rule system with the ability to predict the consequences of these various alterations, one would have to build in, corresponding to each one, a separate data structure
specifying the starting conditions, the alteration, and the consequences of that alteration. If these take the form of conditional statements, the system could then make inferences utilizing domain-general (e.g., topic-neutral, deductive) machinery. Alternatively, the information could be encoded directly as domain-specific inference rules (e.g., production-system operators). Either way, from an engineering standpoint, the problem that quickly arises is that no matter how many of these statements or rules one builds into the knowledge base of the system, there will generally be countless other bits of commonsense knowledge that one has overlooked. Notice, moreover, that scaling the scenario up even slightly (e.g., such that it now includes a board) has an exponential effect on the number of potential alterations and, as such, on the number of new data structures that one would have to incorporate into one’s model [31.53]. As Hayes says [31.52]:
“One does not want to be obliged to give a law of motion for every aspect of the new situation [. . . ] especially as the number of frame axioms increases rapidly with the complexity of the problem.”
Moreover, as explained in the manual for a past incarnation of the production system Soar [31.56]:
“when working on large (realistic) problems, the number of operators (i.e., domain-specific rules) that may be used in problem solving and the number of possible state descriptions will be very large and probably infinite.”
As if the prediction problem were not problem enough, it is actually compounded by the other facet of the frame problem, the qualification problem [31.54].
qualifications would have to be added to the relevant sentence or rule. Once again, in realistic situations, the challenge of specifying all of the qualifications is magnified exponentially.
The general failing of sentence-and-rule-based representations that the frame problem brings to light is that they only support predictions concerning the consequences of alterations and the defeaters of those consequences if those alterations, consequences, and defeaters have been spelled out, antecedently and explicitly, as distinct data structures. Representations of this sort – that is, representations that require distinct structures to support predictions regarding the consequences of each type of alteration to the represented system – are sometimes termed extrinsic representations. (The intrinsic-extrinsic distinction discussed here was introduced by Palmer [31.39] but modified by Waskan [31.57, 58].)
It is worth a quick digression to note that, while the terminology has changed, these general concerns about the limitations of extrinsic representations antedate work in contemporary AI by over three hundred years. They show up, for instance, in Descartes’ best-explanation arguments for dualism in his Discourse on the Method. Descartes there despairs of there ever being a mechanical explanation for, or an artifact that can duplicate, the average human’s boundless knowledge of the consequences of interventions on the world [31.59]:
“If there were machines which bore a resemblance to our bodies and imitated our actions [. . . ] we should still have two very certain means of recognizing that they were not real men [. . . ] (Firstly, humans have the ability to converse.) Secondly,
itly, was the possibility of (to use Chomsky’s term) a generative inference mechanism – that is, one that embodies boundless knowledge of implications through finite means.

What Descartes failed to notice was that there were already artifacts (i.e., scale models) that exhibited the requisite generativity. Indeed, in contemporary AI, the benefits of an appeal to scale-model-like representations are now well known. Starting with the prediction problem, one can use a reasonably faithful scale model of the setup depicted in Fig. 31.4 in order to predict what would happen were one to use the bucket to throw the ball through the open doorway, were one to place the bucket over the ball and slide the bucket through the doorway, were one to set the bucket containing the ball atop the slightly ajar door and then shove the door open, and so on indefinitely. To use Haugeland’s terms, the side effects of alterations to such representations mirror the side effects of alterations to the represented system automatically [31.60] – which is to say, without requiring their explicit specification. (This only holds, of course, to the extent that the model is a faithful reproduction. Unless the model is a perfect replica, which includes being to scale, there will be some limits on inferential fidelity, though this does not undermine the claim that scale models are generative.) Notice also that incremental additions to the represented system will only have an incremental effect on what needs to be built into the representation. The addition of a board to the system above, for instance, can be handled by the simple addition of a scale model of the board to the representation.

Nor do scale models suffer from the qualification problem. To see why, notice that much of what is true of a modeled domain will be true of a scale model of that domain. For instance, with regard to a scale model of the setup in Fig. 31.4, it is true that the scale model of the ball will fall out of the scale model of the bucket when it is tipped over, but only if the ball is not wedged into the bucket, there is no glue in the bucket, and so on indefinitely. Just like our own predictions, the predictions generated using scale models are implicitly qualified in an open-ended number of ways. With scale models, all of the relevant information is implicit in the models and so there is no need to represent it all explicitly using innumerable distinct data structures. Representations of this sort are termed intrinsic representations. Summing up, scale models are immune to the frame problem, for one can use them to determine, on an as-needed basis, both the consequences of countless alterations to the modeled system and the countless possible defeaters of those consequences – that is, one simply manipulates the model in the relevant ways and reads off the consequences.

31.3.3 Does Computational Realization Entail Sentential Representation?

The above distinguishing features can help us to know better whether we are dealing with model-like or sentence-like representations and, ultimately, to appreciate how one might bridge the gap from neurophysiology to mental models. As noted above, a similar bridge was constructed from neurophysiology to computational processes by showing that artifacts (e.g., collections of McCulloch–Pitts neurons or wires and transistors) characterized by a complex circuitry not unlike that of real brains can be configured so as to implement, at a higher level of abstraction, processes that exhibit the hallmarks of traditional syntax-crunching. Because neurons have similar information-processing capabilities as these artifacts, implementing a set of formal operations on an electronic computer is already very nearly an existence proof that brain-like systems can realize the same set of operations.

Might this strategy offer a template for constructing a similar bridge to high-level models? There is surely no shortage of computer simulations of mechanical systems, and at least as they are depicted on a computer’s display, these simulations look for all the world like images and scale models. Many would argue, however, that this approach to bridging the neuron-model divide is a nonstarter. The worry, in short, is that it fails to satisfy desideratum (2.) above. To see why, it will be helpful to look at the kinds of computational models of mental imagery offered up by researchers such as Kosslyn [31.14] and Glasgow and Papadias [31.61].

Kosslyn’s model of mental imagery has several components [31.14]. One is a long-term store that contains sentential representations of the shape and orientation of objects. These descriptive representations are utilized for the construction of representations in another component, the visual buffer, which encodes the same information in terms of the filled and empty cells of a computation matrix. The cells of the matrix are indexed by x, y coordinates, and the descriptions in long-term memory take the form of polar coordinate specifications (i.e., specifications of the angle and distance from a point of origin) of the locations of filled cells. Control processes operate over the coordinate specifications in order to perform such functions as panning in and out, scanning across, and mental rotation.

One distinctive feature of actual (e.g., paper-and-ink) spatial matrix representations is that they embody some of the very same properties and relationships (spatial ones) as – which is just to say that they are physically isomorphic with – the things they represent. But Kosslyn’s computational matrix representa-
682 Part F Model-Based Reasoning in Cognitive Science
tions (CMRs) are clearly not physically isomorphic with what they represent. After all, Kosslyn’s visual buffer representations are not real matrix representations that utilize cells arranged in Euclidean space; they are computational matrix representations. To be sure, modelers may sometimes see literal pictures on the output displays of their computers, but the representations of interest are located in the central processing unit (CPU) (viz., in random-access memory (RAM)) of the computer running the model. Accordingly, the control operations responsible for executing representational transformations like rotation do not make use of inherent spatial constraints, but rather they operate over the coordinate specifications that are stored in the computer’s memory. Details aside, at a certain level of description, there can be no doubt that the computer is implementing a set of syntax-sensitive rules for manipulating syntactically structured representations; this is what computers do. As Block puts it, “Once we see what the computer does, we realize that the representation of the line is descriptional” [31.49]. The received view, then, a view that has gone nearly unchallenged, is that if a representation of spatial, kinematic, or dynamic properties is implemented using a high-level computer program, then the resulting representations must be sentential in character [31.49, 62, 63]. (That Fodor shares this sentiment is suggested by his claim that “if [...] you propose to co-opt Turing’s account of the nature of computation for use in a cognitive psychology of thought, you will have to assume that thoughts themselves have syntactic structure” [31.64].)

It would thus seem that the strongest claim that can possibly be supported with regard to CMRs is that they function like images. Yet, as Anderson notes, it is always possible, through clever structure-process tradeoffs, to create a sentential system that mimics an imagistic one [31.46]. Indeed, rather than supporting the mental models framework, one might well take computer simulations of mental modeling as concrete evidence for Anderson’s claim. Likewise, there is a case to be made that CMRs and their brethren are, unlike scale models, extrinsic representations [31.62]. After all, the computers that run them implement syntax-sensitive rules that provide explicit specifications of the consequences of alterations. This is no small matter. From the standpoint of cognitive science, one of the most important virtues of the hypothesis that we utilize mental representations akin to scale models was that scale models constitute intrinsic representations of interacting worldly constraints and are thus immune to the frame problem. One could, then, be forgiven for thinking that any attempt to build a bridge from neurons to models by following the template set by computational theories – that is, by noting that certain computational artifacts instantiate the relevant kind of processing – will be doomed to fail from the outset.

31.3.4 What About POPI?

Consider, however, that upon gazing directly at a vast collection of electrical or electrochemical circuits, one will see no evidence of the harboring or manipulation of sentential representations. In Monadology, Leibniz turned an analogous observation about perceptual experience into an objection to materialism [31.65]:

“It must be confessed, moreover, that perception, and that which depends on it, are inexplicable by mechanical causes, that is, by figures and motions. And, supposing that there were a mechanism so constructed as to think, feel and have perception, we might enter it as into a mill. And this granted, we should only find on visiting it, pieces which push one against another, but never anything by which to explain a perception.”

A similar objection might be leveled regarding computational processes. Again, one sees no evidence of this kind of processing when one looks at electronic or electrochemical circuitry. Clearly something has gone wrong.

What Leibniz overlooked – and this may be because he lacked the conceptual tools made available by the information age – was a grasp of the principle of property independence (POPI). The basic idea of POPI is that properties characterizing a system when it is studied at a relatively low level of abstraction are often absent when it is studied at a higher level, and vice versa. It is POPI that allows computer scientists to say that a system which is characterized by electronic switches and relays at level n may nevertheless be best described in terms of the storing of bits of information in numerically addressable memory registers at level n + 1 and in terms of the application of syntax-sensitive rules to syntactically structured representations at level n + 2. It is also the very thing that enables proponents of computational theories of cognition to say that brains and computational artifacts are, despite superficial appearances, capable of implementing the application of syntax-sensitive rules to syntactically structured representations.

However, when proponents of computational theories of cognition insist that computational implementation (e.g., of CMRs) entails sentential representation, they are turning their backs on the very principle that enabled them to bridge the divide between low-level circuitry and high-level computational operations; they are turning their backs on POPI. Indeed, nothing about POPI entails that all syntax-crunching systems must be char-
From Neural Circuitry to Mechanistic Model-Based Reasoning 31.3 Mental Models in the Brain: Attempts at Psycho-Neural Reduction 683
acterized in terms of sentences and inference rules at the highest level of abstraction. POPI thus opens up at least logical space for systems that engage in syntax-crunching operations at one level but that harbor and manipulate nonsentential models at a higher level.

In point of fact, in this logical space reside actual systems, including finite element models (FEMs). These were first developed in the physical (e.g., civil and mechanical) engineering disciplines for testing designs, but they have since become a staple tool in the sciences for exploring the ramifications of theories, generating novel predictions, and facilitating understanding. For our current purposes, what matters most about FEMs is that they provide an existence proof that computational processes can realize nonsentential representations that are like scale models and unlike sentential representations in all of the crucial respects listed above.

To see why, notice first that there are (among others) two important levels of abstraction at which a given FEM may be understood. As with scale models, one may understand FEMs at the relatively low level of the principles that govern their implementing medium. What one finds at this level are sentential specifications of coordinates (e.g., for polygon vertices) along with rules, akin to the fundamental laws of nature, which constrain how those coordinates may change (e.g., due to collisions and loads) (Fig. 31.5). (For a close analogy, think of the basic rules of Conway’s Game of Life.) When a given model is run, at this low level one finds a massive number of iterative number-crunching operations. Not unlike Leibniz, enemies of the idea of computationally realized nonsentential models have seized upon this low level with their suggestion that computational systems harbor only sentential representations. At this level, however, it is not even obvious that we are dealing with representations (of worldly objects and properties) at all, any more than we are, for instance, when we fixate upon the constraints governing the behaviors of individual Lego blocks.

Fig. 31.5 Polymesh representation of a blunt impact to a semirigid sheet of material (after [31.57])

One only finds representations of objects when one turns to the higher level of the models that are realized, and multiply realizable, by the aforementioned modeling media. And when we take a close look at the properties of these high-level FEMs, we find that they share several characteristics that have long been taken, including by those who suggest that computational implementation entails sentential representation, to distinguish sentential representations from scale models.

To start with, like scale models and unlike sentential representations, FEMs are not (by themselves) naturally suited to representing abstract entities, properties, and processes (e.g., war criminal, ownership, economic inflation). They are much better suited for representing concrete entities, properties, and processes (e.g., a bucket, one object being over another, and compression). Nor are FEMs naturally suited to representing general categories (e.g., triangles or automobiles). They are far better suited for representing specific instances of those categories. Lastly, FEMs are not naturally suited to singling out specific properties of specific objects. For instance, using an FEM, it would be difficult to represent just the fact that Fred’s car is green, for any such model will simultaneously represent many other properties, such as the number of doors and wheels, the body type, and so on. In short, just like scale models, FEMs are always representations of specific, concrete systems. By these reasonable standards, FEMs ought to be considered computationally realized nonsentential models that are the close kin of scale models.

The case for this claim looks even stronger once we consider whether or not FEMs constitute intrinsic representations. As we have seen, the received view is that FEMs and their brethren (e.g., CMRs) are extrinsic representations, for the constraints governing how the coordinates of primitive modeling elements may change must be encoded antecedently and explicitly. Indeed, at the level of coordinates and transformation rules, one gets nothing for free. However, once a modeling medium has been used to construct a suitable FEM of a collection of objects, the model can then be altered in any of countless ways in order to determine the possible consequences of the corresponding alterations to the represented objects. One can, for instance, use a high-fidelity FEM of the door, bucket, ball system to infer, among other things, what would happen were we to place the bucket over the ball and slide the bucket through the doorway, what would happen were the bucket used to throw the ball at the open doorway, what would happen were the air pressure dramatically decreased, and so on indefinitely [31.57]. The consequences of these alterations need not be anticipated or explicitly incorporated into the system. Indeed, as with scale models, much of the point of constructing FEMs
is to find out how a system will behave in light of whichever alterations an engineer or scientist can dream up.

It bears repeating that it is not at the level of the primitive operations of an implementation base that we find intrinsic representations, but at the level of the representations realized by a given, primitively constrained implementation base. Part of what justifies this claim is the fact that certain constraints will be inviolable at the level of the model, and thus a great deal of information will be implicit in the model, because it has been implemented using a particular kind of medium. As Pylyshyn notes [31.63]:

“the greater number of formal properties built into a notation in advance, the weaker the notational system’s expressive power (though the system may be more efficient for cases to which it is applicable). This follows from the possibility that the system may no longer be capable of expressing certain states of affairs that violate assumptions built into the notation. For example, if Euclidean assumptions are built into a notation, the notation cannot be used to describe non-Euclidean properties [...]”

This, in fact, is very close to an apt characterization of what is going on in the case of FEMs. Given that a particular model has been realized through the use of a primitively constrained medium, certain constraints will be inviolable at the representational level and a great deal of information will be implicit [31.57]. As Mark Bickhard (in correspondence) summarizes the point:

“Properties and regularities are only going to be intrinsic at one level of description if they are built-in

son and Palmer are surely right that, post hoc, one can always constrain a sentential representational system so that it mimics the output of a model-based system, but the post hoc character of the strategy is precisely what gets sentential approaches into trouble vis-à-vis the frame problem. Consider, for instance, that the traditional AI approach is to take any physical implication of which humans express knowledge and, after the fact, to build it into the knowledge base of one’s system as a sentence or inference rule. (Despite its shortcomings, this strategy is alive and well, as is evidenced by Lenat’s massive ongoing Cyc project.) But to solve, or rather to avoid, the frame problem, one must rely upon representations that embody all of this boundless information as tacit knowledge – that is, the information cannot be explicitly encoded at the outset, but it can later be generated, and thereby become explicit knowledge, on an as-needed basis. Put simply, to exhibit anything approaching true functional isomorphism with scale models, what is needed are high-level, intrinsic, nonsentential models.

To sum up, those who would contend that FEMs (or even CMRs of 2-D spatial properties) are, qua computational, necessarily extrinsic and sentential have overlooked the fact that there are multiple levels of abstraction at which a given computational model can be understood. At the relatively low level of the modeling medium, there are unquestionably extrinsic representations of the principles governing the permissible transformation of primitive modeling elements. At a higher level, one finds models that share many distinguishing features, including immunity to the frame problem, with the scale models they were in large part invented to replace. Thus, we find once again that FEMs are like
compatible with basic facts about how brains operate. All of this provides a much-needed foundation for all of that psychological work cited above that adverts to mental models.

One advantage of showing that low-level computations can realize higher-level mental models is that it renders the mental models hypothesis robust enough to withstand the discovery that the brain is a computational system at some level of description. Even if the brain is not a computational system (i.e., in the syntax-crunching sense), the manner in which computational systems realize intrinsic, nonsentential models will nevertheless remain quite instructive. It suggests a general recipe for the creation of intrinsic models that can be followed even without the computational intermediary: Start by creating a representational medium such that a large number of primitive elements are constrained to obey a handful of simple behavioral principles. Next construct models from this highly productive medium. (Productive is here used in Fodor’s sense – that is, to denote a medium capable of representing an open-ended number of distinct states of affairs [31.66].) What emerges are generative structures capable of supporting an open-ended number of mechanical inferences. At the level of the medium, running such a model involves the recursive application of the basic constraints on the modeling-element behaviors. This will typically be a massive, parallel, constraint-satisfaction process. Given that this form of processing is the forte of neural networks, there should be little doubt that neural machinations are up to the task. (At the same time, one should not overestimate the inherent immunity of neural networks to the frame problem [31.58]. It is only by implementing a primitively constrained modeling medium that neural networks can be expected to realize intrinsic representations of complex, interacting worldly constraints.)

31.3.6 Bottom-Up Approaches

Thus far, we have largely approached the question of the neural realizability of mental models in the abstract, and from the top down. This is partly because there has been a relative dearth of work that moves in the opposite direction, from the bottom up. One exception is Thagard’s [31.11] recent work on the topic, which appeals to such biologically plausible simulations of neural networks as those of Eliasmith and Anderson [31.67]. Unfortunately, Thagard has yet to offer evidence that the neural encoding strategies he discusses exhibit any of the central features, discussed here, that distinguish modeling from syntax-crunching. Most notably, the neural representations he cites have not yet been shown to exhibit a significant degree of spatial, kinematic, or causal generativity. The proof of the pudding here is in the eating.

To the extent that there have been significant advances in the bottom-up endeavor, they mostly issue from research – such as that of Nobel laureates John O’Keefe, May-Britt Moser, and Edvard Moser – on the biological neural networks that underwrite spatial reasoning abilities in rats. As you will recall, Tolman’s pioneering work on maze navigation suggested that rats have an onboard medium for the construction of generative spatial maps of their location relative to barriers and important items such as food and drink. O’Keefe and Nadel are famous for showing that the rat’s hippocampus contains place cells which fire preferentially when an animal reaches a particular location in its environment, cells that fire in sequence as a rat moves from one location to another [31.68]. Moser and Moser subsequently showed that the rat’s uncanny spatial navigation abilities also depend upon grid cells in the nearby entorhinal cortex [31.69]. Individual grid cells fire when an animal is in any of several, roughly evenly spaced locations. When lines are drawn to connect these points, they collectively form what (purely by coincidence) looks a great deal like the kind of 2-D polymesh shown in Fig. 31.5. While each grid cell is tuned to a collection of locations, different grid cells have sparser or denser coverage of the same region of space. Collectively they provide effective coverage of the entire region of space in which the animal finds itself.

Importantly, O’Keefe et al. note regarding place cells that [31.70]

“there does not appear to be any obvious topographical relation between the field locations (i.e., the places to which cells become temporarily tuned) and the anatomical locations of the cells relative to each other within the hippocampus.”

Nor do grid cells in the entorhinal cortex exploit any obvious structural isomorphisms between their respective anatomical locations and the spatial layout of the environment. However, acting in concert, the two types of cells enable effective navigation, as if the organism had an internal map that preserves relative locations (place cells) and distances (grid cells). In other words, the two systems encode maps that are functionally isomorphic with real maps of the environment. Moreover, they provide a productive modeling medium, one which, not unlike a collection of Lego blocks, can be used and reused, through a process called remapping, to encode information about an open-ended number of new environments [31.71]. The maps constructed in this medium are generative with regard to 2-D spatial properties in the aforementioned sense, as is shown by their role in enabling rats to find efficient new
ways to a destination when familiar routes are blocked. More recent research suggests that the rat’s place cells are also somewhat sensitive to vertical displacement from a reference plane, perhaps enabling 3-D mapping capabilities [31.72]. Nor are the lessons learned here applicable only to rats, for a large body of research suggests that the same anatomical systems may be implicated in human spatial navigation [31.73].

Our deepest understanding of how real neural networks create spatial mental models thus suggests that brains implement a reusable modeling medium and that, by exploiting the kinds of functional, rather than physical, isomorphisms that make neural realizability feasible, nothing is lost in the way of generativity. It also bears mentioning that this modeling medium is well suited for producing models of the organism’s location relative to its specific, concrete environment. As such, it may (taken in isolation) be ill suited for representing abstracta or genera. As for the singling out of specific properties of specific objects, it may be that models that are realized by neurophysiological processes have a natural advantage over scale models in that populations representing specific properties may cry out for attention (by oscillating at the appropriate frequency). There is, moreover, no reason why these lessons could not scale up, so to speak, to account for the human ability to run off-line models of spatial, kinematic, and dynamic relationships. Of course, in humans, the neocortex is likely to play a much more prominent role. As of yet, however, there is little understanding of the precise manner in which the neocortex does, or might, realize mental models.
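The general recipe given earlier (a medium of many primitive elements obeying a few simple principles, from which models are then built) and the rerouting behavior attributed to rats can be illustrated with a toy sketch. What follows is only an illustration under stated assumptions, not a model of hippocampal or entorhinal processing: the grid, the names `distance_field`, `route`, and `block`, and the four-neighbor relaxation rule are all hypothetical choices made for this example. Each free cell repeatedly applies one local constraint (its distance to the goal is one more than that of its nearest free neighbor). No rule about detours is ever written down, yet a new route can be read off once a familiar one is blocked.

```python
INF = float("inf")

# Hypothetical toy environment: 'S' start, 'G' goal, '#' barrier, '.' free.
OPEN_MAP = [
    "S...G",
    ".###.",
    ".....",
]


def neighbors(r, c, rows, cols):
    """Yield the in-bounds four-connected neighbors of cell (r, c)."""
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            yield nr, nc


def find(grid, symbol):
    """Return the (row, col) of the first cell holding `symbol`."""
    for r, row in enumerate(grid):
        c = row.find(symbol)
        if c != -1:
            return r, c
    raise ValueError(f"{symbol!r} not found")


def distance_field(grid, goal):
    """Relax the single local rule until nothing changes (a fixed point).

    Sequential sweeps stand in here for what would be a parallel
    constraint-satisfaction process in the medium described in the text.
    """
    rows, cols = len(grid), len(grid[0])
    dist = [[INF] * cols for _ in range(rows)]
    dist[goal[0]][goal[1]] = 0
    changed = True
    while changed:
        changed = False
        for r in range(rows):
            for c in range(cols):
                if grid[r][c] == "#" or (r, c) == goal:
                    continue
                best = min(
                    (dist[nr][nc] for nr, nc in neighbors(r, c, rows, cols)
                     if grid[nr][nc] != "#"),
                    default=INF,
                )
                if best + 1 < dist[r][c]:
                    dist[r][c] = best + 1
                    changed = True
    return dist


def route(grid):
    """Read a shortest route off the settled field, or None if unreachable."""
    start, goal = find(grid, "S"), find(grid, "G")
    dist = distance_field(grid, goal)
    if dist[start[0]][start[1]] == INF:
        return None
    path = [start]
    while path[-1] != goal:
        r, c = path[-1]
        # Step to the free neighbor that is closest to the goal.
        path.append(min(
            ((nr, nc) for nr, nc in neighbors(r, c, len(grid), len(grid[0]))
             if grid[nr][nc] != "#"),
            key=lambda p: dist[p[0]][p[1]],
        ))
    return path


def block(grid, cell):
    """Return a copy of the map with a barrier placed at `cell`."""
    r, c = cell
    return [row[:c] + "#" + row[c + 1:] if i == r else row
            for i, row in enumerate(grid)]
```

In this sketch `route(OPEN_MAP)` follows the top row in 4 steps, while `route(block(OPEN_MAP, (0, 2)))` yields an 8-step detour along the bottom row. The detour is generated on an as-needed basis by altering the model and rerunning it, rather than being stored in advance.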
that there is a determinate computational solution to the prediction and qualification problems. FEMs are generative in that they can be manipulated in any of countless ways in order to make inferences about how alterations to the environment might play out and, by the same token, about the ways in which those consequences might be defeated. It would thus behoove AI researchers to incorporate media for the construction of intrinsic models within the core inference machinery of their devices. Indeed, there has been some movement in this direction in recent years. For instance, though past manuals for the Soar production-system architecture evidence a certain degree of exasperation when it comes to the frame problem, more recent manuals indicate that Soar’s designers have begun to offload mechanical reasoning to nonsentential models. Laird notes, for instance [31.74]:

“With the addition of visual imagery, we have demonstrated that it is possible to solve spatial rea-

as a way of making sense of behavioral findings regarding how humans engage in commonsense reasoning about the world. For instance, after paying homage to Craik, MIT researchers Battaglia et al. describe their innovative approach to commonsense reasoning as follows [31.75]:

“We posit that human judgments are driven by an intuitive physics engine (IPE), akin to the computer physics engines used for quantitative but approximate simulation of rigid body dynamics and collisions, soft body and fluid dynamics in computer graphics, and interactive video games.”

They simulate the IPE with FEMs of full-blown 3-D kinematic and dynamic relationships. They note that a similar IPE in humans might allow us to read off from our simulations the answers to questions of What will happen? regarding innumerable novel scenarios. Their pioneering work also breaks new ground in that it be-
From Neural Circuitry to Mechanistic Model-Based Reasoning 31.5 Mechanistic Explanation Revisited 687
gins to account for probabilistic reasoning by building infer that the cat is trapped inside. But if I then see
a bit of uncertainty into models and treating multiple the cat walking through the kitchen and am told that
runs of a model as a statistical sample. my daughter was given a new electronic cat toy, my
All of this work is very much in the spirit of conclusion would be undermined while at the same
Schwartz’ claim that “inferences can emerge through time leaving the original premise (that there is meow-
imagined actions even though people may not know ing coming from the closet) intact.
the answer explicitly” [31.76, italics mine]. It also fits One diagnosis for why deduction is monotonic is
with the following suggestion of Moulton and Kosslyn [31.35, italics mine]:

“the primary function of mental imagery is to allow us to generate specific predictions based upon past experience. Imagery allows us to answer what if questions by making explicit and accessible the likely consequences of being in a specific situation or performing a specific action.”

31.4.2 Exduction

Another important lesson to be learned from computationally realized intrinsic models is that they support a form of mechanistic reasoning that has found its way into few, if any, standard reasoning taxonomies. As Glasgow and Papadias claim [31.61]:

“The spatial structure of images has properties not possessed by deductive sentential representations [. . . ] spatial image representations [. . . ] support nondeductive inference using built-in constraints on the processes that construct and access them.”

Of course, there is more to be said about the process of model-based mechanistic reasoning than that it is not deductive. In fact, the process shares with (valid) deductive reasoning the property of being monotonic. What makes deduction a monotonic (indefeasible) reasoning process is that the conclusion of a valid argument cannot be overturned simply by adding premises; it can only be overturned by rejecting one or more of the premises from which the conclusion was deduced. Other forms of reasoning (inductive generalization, analogical reasoning, abduction) are defeasible in that one can overturn their conclusions simply by adding relevant premises. For instance, if I hear a meowing noise emanating from my daughter’s closet door, I may

that, in a certain sense, the premises of a valid deduction already contain the information stated in the conclusion, so adding information takes nothing away from the support that those premises lend to the conclusion. That means that insofar as the original premises are true, the conclusion must be as well, and insofar as the conclusion is false, there must be something wrong with the premises used to derive it. But deduction is formal, in that topic-neutral logical particles are what bear the entirety of the inferential load – that is, the specific contents (consistently) connected and quantified over drop out as irrelevant.

The use of scale models and FEMs makes evident that there is another form of monotonic reasoning in addition to deduction. As explained above, information derived regarding the consequences of interventions on a modeled system is to a significant extent already contained (i.e., it is implicit) in the models themselves. The only way to overturn a model-based inference is to call into question some aspect or other of the model from which it was derived. By the same token, if the conclusion is incorrect, there must be something wrong with the model. But unlike deduction, model-based reasoning is not effected by abstracting away from specific contents and allowing logical particles to bear the inferential load. Instead, it is the specific, concrete contents of the models that do all of the work. As yet, this form of monotonic reasoning lacks a name. Let us thus call it exduction (ex, “out”, and duce, “lead”). Like deduction, exduction may be implemented externally through the use of representational artifacts, but the hypothesis being explored here is just that we also sometimes engage in internal exductive reasoning through the use of mental models. If this hypothesis is correct, then exduction must be added to our standard taxonomy of internal reasoning processes and placed alongside deduction under the broader heading of monotonic reasoning.
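
The contrast between monotonic and defeasible inference drawn in this section can be made concrete with a small sketch. The Python below is illustrative only and is not taken from the chapter: the rule format, the function names, and the defeater cat_seen_elsewhere are assumptions introduced for the example, and the conditional about the lost ring is borrowed from the mechanic case in Sect. 31.5.1. It shows only that forward-chained deductive conclusions survive added premises while a default conclusion does not. It does not model exduction itself, which on the chapter's account depends on the concrete contents of a model rather than on rules.

```python
def deductive_closure(facts, rules):
    """Forward-chain rules of the form (premises, conclusion) to a fixed point."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if set(premises) <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived


def default_conclusions(facts, defaults):
    """Apply defaults of the form (trigger, defeaters, conclusion).

    A conclusion is drawn only if its trigger is present and none of its
    defeaters is among the facts, so adding a fact can retract it.
    """
    facts = set(facts)
    return {
        conclusion
        for trigger, defeaters, conclusion in defaults
        if trigger in facts and not (set(defeaters) & facts)
    }


# Hypothetical rule base (assumption): the mechanic's conditional.
RULES = [(("ring_lost",), "power_lost")]

# Hypothetical default (assumption): meowing suggests a cat in the closet,
# unless the cat is seen somewhere else.
DEFAULTS = [("meowing_from_closet", ("cat_seen_elsewhere",), "cat_in_closet")]

if __name__ == "__main__":
    # Monotonic: the extra premise leaves the earlier conclusions in place.
    print(deductive_closure({"ring_lost"}, RULES))
    print(deductive_closure({"ring_lost", "air_filter_clogged"}, RULES))
    # Defeasible: the extra premise removes the earlier conclusion.
    print(default_conclusions({"meowing_from_closet"}, DEFAULTS))
    print(default_conclusions({"meowing_from_closet", "cat_seen_elsewhere"}, DEFAULTS))
```

In the deductive case the only way to lose power_lost is to drop ring_lost or the rule itself, which mirrors the point that a valid conclusion can be overturned only by rejecting a premise.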
processes, we can begin to see what the payoff might be in terms of our understanding of model-based reasoning about mechanisms in science.

31.5.1 The Prediction and Ceteris Paribus Problems

To give some of the flavor of where this might lead, consider that one benefit of the foregoing realization story regarding mental models for the philosophy of science is that it offers a solution to two longstanding problems: the surplus-meaning problem and the ceteris paribus problem (a.k.a., the problem of provisos). Both problems arose as a consequence of attempts to apply formal, mostly deductive methods to provide a logical reconstruction of scientific reasoning.

The surplus-meaning problem has to do with the fact that explanatory hypotheses have, and are known to have, countless implications beyond the happenings they explain. To keep things simple, consider the nonscientific case of what a mechanic knows about the operation of an automobile engine. Imagine, in particular, that someone has brought an automobile into the mechanic’s shop complaining that the engine has suffered a drop-off in power. Listening to the engine, the mechanic might decide that the engine has blown a ring. Setting aside the question of creativity, one might provide the following formal, deductive reconstruction of his explanatory model:

If one of the cylinders has lost a ring,
then the engine will have lost power.

One of the cylinders has lost a ring.

to reconstruct mechanistic explanations using extrinsic representations – is that they fail to capture the full complexity of what anyone who possesses that explanation must know, if only implicitly. The problem is that there is too much surplus-meaning to express it all explicitly [31.77], and these added implications are essential to how we assess the adequacy of explanations, whether in everyday life or in science. As Greenwood explains [31.78]:

“Where this surplus meaning comes from [. . . ] is a matter of some dispute, but that genuine theories poses [sic.] such surplus meaning is not – for this is precisely what accounts for their explanatory power and creative predictive potential.”

Notice also that the mechanic’s model of why the car lost power not only contains information about the various other things he should expect to find if that model is correct; it also contains information about the countless ways in which each of these expectations might, consistent with the truth of the explanation, be defeated. The mechanic knows, for instance, that replacing the rings will restore power, but only if it is not the case that one of the spark plug wires was damaged in the process, the air filter has become clogged with dust from a nearby construction project, and so on indefinitely.

Whether we are dealing with commonsense or scientific reasoning about mechanisms, the problem with attempts at formalizing our knowledge of the ways in which a given implication is qualified is that what we know far outstrips what can be expressed explicitly in the form of, say, a conditional generalization. In philos-
source of the dark information comes into focus. After all, these two problems are just variants on the prediction and qualification problems of AI. This is not surprising given that both sets of problems were discovered through early attempts to deductively reconstruct everyday and scientific reasoning about mechanisms. Naturally, the same solution applies in both cases: Eschew the appeal to extrinsic representations and formal inferences in favor of an appeal to intrinsic models and exductive inferences. The promising idea that emerges is that scientists may be utilizing intrinsic mental models to understand the mechanisms that are (or might be) responsible for particular phenomena. Such models would endow the scientist with boundless tacit knowledge of the further implications of a given mechanistic hypothesis and of the countless ways in which those implications are qualified.

31.5.2 Beyond Mental Models

We must not forget, however, that our mental models are limited by working memory capacity and by the heavy cognitive load associated with mental modeling. Even so, scientists are somehow able to formulate and comprehend some remarkably complex mechanical explanations. Seen in this light, it is no surprise that, in their reasoning about mechanisms, humans rely heavily upon external representational artifacts such as diagrams. These can act as aids to memory, both short- and long-term, enabling us to off-load some of the cognitive burden to the world and thereby compensating for our otherwise limited ability to keep track of the simultaneous influences of many mechanical components (see [31.82]). Indeed, when aided by external diagrams, Hegarty et al. found that high- and low-imagery subjects performed about equally well on a task that required model-based reasoning about mechanisms [31.18]. The compensatory influence of external representations is strong indeed (see also Bechtel, Chap. 27).

Of course we do not just utilize static pictures to make sense of natural phenomena; we also sometimes use scale models [31.83]. On the present view, the reason so many have come to view these models as an apt metaphor for in-the-head reasoning may be that scale models recapitulate, albeit in a way that overcomes many of our cognitive frailties, the structure of our internal models of mechanisms. However, with the advent of sophisticated computer models, we now have an even better tool for investigating the implications of mechanical hypotheses. As we have seen, certain computer models (e.g., FEMs) are like scale models in that they constitute intrinsic nonsentential representations of actual or hypothetical mechanisms. However, these models have the added virtue that one can easily freeze the action, zoom in or out, slow things down, and even watch things play out in reverse. These models thus constitute what Churchland and Sejnowski term a “fortunate preparation” [31.84]. Such models provide an even more apt analogy for understanding our own native mental models, for both sorts of models are realized at a low level by complicated circuitry, and both tend to bottom out well above the level of nature’s fundamental laws. (One way of describing the interplay between external models (in their various forms) and internal mental models would be to say that the latter are part and parcel of scientific cognition, whereas the former are representational artifacts created to aid cognition. An alternative, somewhat speculative, proposal is that our external models are no less part of the fabric of scientific cognition than are our internal mental models [31.85].)

As noted in Sect. 31.2, over and above the quantitative limits imposed by working memory capacity, scale models and FEMs are, and our own mental models may well be, limited in certain qualitative respects, such as their ability to represent abstracta. But surely thoughts about abstracta play a big role in scientific reasoning. One way of accounting for this is to say that our deficiencies with regard to modeling abstracta are the precise problem that analogy and metaphor were created to solve. This would make sense of why the language we use to represent abstracta (e.g., economic inflation) is so shot through with analogies and metaphors rooted in concrete domains [31.86–88].

Fig. 31.6 Various configurations of circuitry, batteries, and resistors (component labels in the figure: Battery, Resistor)
690 Part F Model-Based Reasoning in Cognitive Science
Gentner and Gentner’s study of human reasoning about electricity lends some credence to this view [31.89]. Gentner and Gentner found that in conversation, nonexpert subjects commonly likened the flow of electricity to water moving through pipes or crowds moving through corridors. Each analogy happens to yield its own unique set of inaccurate predictions about how electrical current moves through particular configurations of electrical components (Fig. 31.6). Gentner and Gentner found that subjects’ errors in reasoning about these components tracked the analogies they invoked when discussing electricity. This suggests that these analogies run deeper than the surface features of language and penetrate right into subjects’ mental models of the flow of electricity. Analogies and metaphors are perhaps not the whole story of how we think about abstracta, but they may well be an important part of it.
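
As a point of comparison for the analogy-driven predictions just discussed, the following sketch computes the standard textbook resistances and currents for simple series and parallel arrangements of the kind depicted in Fig. 31.6. It is an illustration under stated assumptions, not a reconstruction of Gentner and Gentner's materials: the component values, the function names, and the idealization (an ideal battery and ohmic resistors) are all hypothetical.

```python
def series_resistance(resistances):
    """Equivalent resistance of resistors connected end to end (ohms)."""
    return sum(resistances)


def parallel_resistance(resistances):
    """Equivalent resistance of resistors connected side by side (ohms)."""
    return 1.0 / sum(1.0 / r for r in resistances)


def current(voltage, resistance):
    """Ohm's law: current (amperes) through an ideal circuit."""
    return voltage / resistance


if __name__ == "__main__":
    battery_volts = 6.0      # hypothetical ideal battery (assumption)
    resistors = [2.0, 2.0]   # hypothetical resistor values in ohms (assumption)

    print("one resistor: ", current(battery_volts, resistors[0]))
    print("two in series:", current(battery_volts, series_resistance(resistors)))
    print("two parallel: ", current(battery_volts, parallel_resistance(resistors)))
```

Under these idealizations, adding a second resistor in series lowers the total current while adding it in parallel raises it, which is the kind of configuration-dependent outcome against which an analogy's predictions can be checked.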
31.6 Conclusion
It was noted at the beginning that mental models might play a crucial role in the process of mechanistic explanation and prediction in science. As such, we can only attain a deep understanding of science itself if we first understand the nature of this mental, model-based, reasoning process. We then saw that experimental psychologists have long maintained that mental models are distinct from sentential representations in much the way that scale models are. Insofar as this hypothesis is viable, we can expect that experimental psychology will provide crucial insight into both the nature and limits of our onboard mental models. At the same time, it is important to recognize that the many appeals to distinctively model-like mental representations in psychology will be considered suspect so long as we lack a reasonable way of spelling out what sorts of representational structures mental models are supposed to be in a way that (i) shows these models to be distinct from sentential representations while (ii) allowing for their realization by neurophysiological processes. We can see the way forward, however, if we first pay attention to some of the distinctive features of scale models that distinguish them from sentential representations. If we then turn to the computational realm, we see that these very features (including immunity to the notorious frame problem) are exhibited by certain computational models of mechanisms such as FEMs. An appeal to the principle that sustains the computational theory of cognition (POPI) enables us to understand how this could be so and how high-level, nonsentential, intrinsic models of mechanisms could in principle be realized by neurophysiological processes. The broader viability of this realization story for mental models is suggested by recent work in both AI and experimental psychology and by the elegant solution it offers to the surplus-meaning and ceteris paribus problems in the philosophy of science. Going forward, the idea that our scientific reasoning about mechanisms might, to a large extent, involve the manipulation of representations that are like scale models in crucial respects can be regarded as at least one sturdy pillar of a promising hypothesis regarding the nature of model-based reasoning in science.
Part F | 31
References
31.1 R. Giere: Explaining Science: A Cognitive Approach (Univ. Chicago Press, Chicago 1988)
31.2 P. Railton: A deductive-nomological model of probabilistic explanation, Philos. Sci. 45, 206–226 (1978)
31.3 W. Salmon: Scientific Explanation and the Causal Structure of the World (Princeton Univ. Press, Princeton 1984)
31.4 W. Salmon: Causality and Explanation (Oxford Univ. Press, New York 1998)
31.5 P. Machamer, L. Darden, C. Craver: Thinking about mechanisms, Philos. Sci. 67, 1–25 (2000)
31.6 W. Bechtel: Mental Mechanisms: Philosophical Perspectives on Cognitive Neuroscience (Psychology Press, New York 2009)
31.7 C. Wright, W. Bechtel: Mechanisms and psychological explanation. In: Philosophy of Psychology and Cognitive Science, ed. by P. Thagard (Elsevier, New York 2007) pp. 31–79
31.8 J. Waskan, I. Harmon, A. Higgins, J. Spino: Investigating lay and scientific norms for using explanation. In: Modes of Explanation: Affordances for Action and Prediction, ed. by M. Lissack, A. Graber (Palgrave Macmillan, New York 2014) pp. 198–205
31.9 K. Craik: The Nature of Explanation (Cambridge Univ. Press, Cambridge 1943)
31.10 N. Nersessian: Mental modeling in conceptual change. In: International Handbook of Conceptual Change, ed. by S. Vosniadou (Routledge, London 2007) pp. 391–416
31.11 P. Thagard: The Cognitive Science of Science (MIT Press, Cambridge 2012)
31.12 E. Tolman: Cognitive maps in rats and men, Psychol. Rev. 55, 189–208 (1948)
31.13 R. Shepard, J. Metzler: Mental rotation of three-dimensional objects, Science 171, 701–703 (1971)
31.14 S. Kosslyn: Image and Mind (Harvard Univ. Press, Cambridge 1980)
From Neural Circuitry to Mechanistic Model-Based Reasoning References 691
31.15 D. Schwartz, J. Black: Analog imagery in mental model reasoning: Depictive models, Cogn. Psychol. 30, 154–219 (1996)
31.16 D. Schwartz, J. Black: Shuttling between depictive models and abstract rules: Induction and fallback, Cogn. Sci. 20, 457–497 (1996)
31.17 M. Hegarty: Mechanical reasoning by mental simulation, Trends Cogn. Sci. 8, 280–285 (2004)
31.18 M. Hegarty, S. Kriz, C. Cate: The roles of mental animations and external animations in understanding mechanical systems, Cogn. Instruct. 21, 325–360 (2003)
31.19 D. Norman: Some observations on mental models. In: Mental Models, ed. by D. Gentner, A. Stevens (Lawrence Erlbaum Associates, Hillsdale 1983) pp. 7–14
31.20 A. Leslie, S. Keeble: Do six-month old infants perceive causality?, Cognition 25, 265–288 (1987)
31.21 A. Schlottman: Seeing it happen and knowing how it works: How children understand the relation between perceptual causality and underlying mechanism, Dev. Psychol. 35, 303–317 (1999)
31.22 R. Baillargeon, J. Li, Y. Gertner, D. Wu: How do infants reason about physical events? In: The Wiley-Blackwell Handbook of Childhood Cognitive Development, ed. by U. Goswami (Blackwell, Oxford 2011) pp. 11–48
31.23 L. Barsalou, K. Solomon, L. Wu: Perceptual simulation in conceptual tasks, Proc. 4th Conf. Int. Cogn. Linguist. Assoc. Cult. Typol. Psychol. Perspect. Cogn. Linguist., Vol. 2, ed. by M. Hiraga, C. Sinha, S. Wilcox (John Benjamins, Amsterdam 1999) pp. 209–228
31.24 W. Brewer: Scientific theories and naive theories as forms of mental representation: Psychologism revived, Sci. Educ. 8, 489–505 (1999)
31.25 R. Langacker: Concept, Image, and Symbol: The Cognitive Basis of Grammar (Mouton de Gruyter, New York 1991)
31.26 A. Goldberg: Constructions: A Construction Grammar Approach to Argument Structure (Univ. Chicago Press, Chicago 1995)
31.27 G. Fauconnier: Mental Spaces (MIT Press, Cambridge 1985)
31.28 A. Garnham: Mental Models as Representations of Discourse and Text (John Wiley, New York 1987)
31.29 L. Talmy: Force dynamics in language and cognition, Cogn. Sci. 12, 49–100 (1988)
31.30 P. Johnson-Laird: How is meaning mentally represented? In: Meaning and Mental Representation, ed. by E. Eco, M. Santambrogio, P. Violi (Indiana Univ. Press, Bloomington 1988) pp. 99–118
31.31 P. Johnson-Laird: Mental Models: Towards a Cognitive Science of Language, Inference, and Consciousness (Harvard Univ. Press, Cambridge 1983)
31.32 P. Johnson-Laird, R. Byrne: Deduction (Lawrence Erlbaum Associates, Hillsdale, New Jersey 1991)
31.33 W. McCulloch, W. Pitts: A logical calculus of the ideas immanent in nervous activity, Bull. Math. Biophys. 5, 115–133 (1943)
31.34 S. Franklin, M. Garzon: Computation by discrete neural nets. In: Mathematical Perspectives on Neural Networks, ed. by P. Smolensky, M. Mozer, D. Rumelhart (Lawrence Erlbaum Associates, Mahwah, New Jersey 1996) pp. 41–84
31.35 S. Moulton, S. Kosslyn: Imagining predictions: Mental imagery as mental emulation, Philos. Trans. R. Soc. B 364, 1273–1280 (2009)
31.36 R. Cummins: Representations, Targets, and Attitudes (MIT Press, Cambridge 1996)
31.37 L. Wittgenstein: Philosophical Investigations (Macmillan, New York 1953)
31.38 R. Shepard, S. Chipman: Second-order isomorphism of internal representations: Shapes of states, Cogn. Psychol. 1, 1–17 (1970)
31.39 S. Palmer: Fundamental aspects of cognitive representation. In: Cognition and Categorization, ed. by E. Rosch, B. Lloyd (Lawrence Erlbaum Associates, Hillsdale, New Jersey 1978) pp. 259–303
31.40 N. Nersessian: The cognitive basis of model-based reasoning in science. In: The Cognitive Basis of Science, ed. by P. Carruthers, S. Stich, M. Siegal (Cambridge Univ. Press, Cambridge 2002) pp. 133–153
31.41 S. Kosslyn: Image and Brain: The Resolution of the Imagery Debate (MIT Press, Cambridge 1994)
31.42 S. Zeki: The functional organization of projections from striate to prestriate visual cortex in the rhesus monkey, Cold Spring Harbor Symp. Quant. Biol. 40, 591–600 (1976)
31.43 M. Mishkin, L. Ungerleider, K. Macko: Object vision and spatial vision: Two cortical pathways, Trends Neurosci. 6, 414–417 (1983)
31.44 E. DeYoe, D. Van Essen: Concurrent processing streams in monkey visual cortex, Trends Neurosci. 11, 219–226 (1988)
31.45 J. Huttenlocher, E. Higgins, H. Clark: Adjectives, comparatives, and syllogisms, Psychol. Rev. 78, 487–514 (1971)
31.46 J.R. Anderson: Arguments concerning representations for mental imagery, Psychol. Rev. 85, 249–277 (1978)
31.47 L. Brooks: Spatial and verbal components in the act of recall, Can. J. Psychol. 22, 349–368 (1968)
31.48 S. Segal, V. Fusella: Influence of imaged pictures and sounds on detection of visual and auditory signals, J. Exp. Psychol. 83, 458–464 (1970)
31.49 N. Block: Mental pictures and cognitive science. In: Mind and Cognition, ed. by W.G. Lycan (Basil Blackwell, Cambridge, Massachusetts 1990) pp. 577–606
31.50 J.A. Fodor: Imagistic representation. In: Imagery, ed. by N. Block (MIT Press, Cambridge 1981) pp. 63–86
31.51 J. McCarthy, P. Hayes: Some philosophical problems from the standpoint of artificial intelligence. In: Machine Intelligence, ed. by B. Meltzer, D. Michie (Edinburgh Univ. Press, Edinburgh 1969) pp. 463–502
31.52 P. Hayes: The frame problem and related problems in artificial intelligence. In: Readings in Artificial Intelligence, ed. by B. Webber, N. Nilsson (Morgan Kaufman, Los Altos 1981) pp. 223–230
31.53 L. Janlert: The frame problem: Freedom or stability? With pictures we can have both. In: The Robot’s Dilemma Revisited: The Frame Problem in Artificial Intelligence, ed. by K.M. Ford, Z. Pylyshyn (Ablex Publishing, Norwood, New Jersey 1996) pp. 35–48
31.54 J. McCarthy: Applications of circumscription to formalizing common-sense knowledge, Artif. Intell. 28, 86–116 (1986)
31.55 J. Waskan: Applications of an implementation story for non-sentential models. In: Model-Based Reasoning in Science and Technology, ed. by L. Magnani, W. Carnielli, C. Pizzi (Springer, Berlin 2010) pp. 463–476
31.56 C. Congdon, J. Laird: The Soar User’s Manual: Version 7.0.4 (Univ. Michigan, Ann Arbor 1997)
31.57 J. Waskan: Intrinsic cognitive models, Cogn. Sci. 27, 259–283 (2003)
31.58 J. Waskan: Models and Cognition (MIT Press, Cambridge 2006)
31.59 R. Descartes: Discourse on the method. In: The Philosophical Writings of Descartes (Cambridge Univ. Press, Cambridge 1988) pp. 20–56, trans. by J. Cottingham, R. Stoothoff, D. Murdoch
31.60 J. Haugeland: An overview of the frame problem. In: Robot’s Dilemma, ed. by Z. Pylyshyn (Ablex Publishing Corp, Norwood 1987) pp. 77–93
31.61 J. Glasgow, D. Papadias: Computational imagery, Cogn. Sci. 16(3), 355–394 (1992)
31.62 K. Sterelny: The imagery debate. In: Mind and Cognition, ed. by W. Lycan (Blackwell, Cambridge 1990) pp. 607–626
31.63 Z. Pylyshyn: Computation and Cognition: Toward a Foundation for Cognitive Science (MIT Press, Cambridge, Massachusetts 1984)
31.64 J.A. Fodor: The Mind Doesn’t Work That Way (MIT Press, Cambridge 2000)
31.65 G. Leibniz: The Monadology. In: The Monadology and Other Philosophical Writings (Clarendon Press, Oxford 1898) pp. 215–271, trans. by R. Lotta
31.66 J.A. Fodor: The Language of Thought (Thomas Y. Crowell, New York 1975)
31.67 C. Eliasmith, C.H. Anderson: Neural Engineering: Computation, Representation and Dynamics in Neurobiological Systems (MIT Press, Cambridge 2003)
31.68 J. O’Keefe, L. Nadel: The hippocampus as a cognitive map (Oxford Univ. Press, New York 1978)
31.69 V. Brun, M. Otnass, S. Molden, H. Steffenach, M. Witter, M. Moser, E. Moser: Place cells and place recognition maintained by direct entorhinal-hippocampal circuitry, Science 296, 2243–2246 (2002)
31.70 J. O’Keefe, N. Burgess, J. Donnett, K. Jeffery, E. Maguire: Place cells, navigational accuracy, and the human hippocampus, Philos. Trans. R. Soc. B 353, 1333–1340 (1998)
31.71 M. Fyhn, T. Hafting, A. Treves, M. Moser, E. Moser: Hippocampal remapping and grid realignment in entorhinal cortex, Nature 446, 190–194 (2007)
31.72 F.S.J. Knierim: Coming up: In search of the vertical dimension in the brain, Nat. Neurosci. 14, 1102–1103 (2011)
31.73 K. Woollett, E. Maguire: Acquiring ‘the knowledge’ of London’s layout drives structural brain changes, Curr. Biol. 21, 2109–2114 (2011)
31.74 J. Laird: Extending the Soar cognitive architecture, Proc. 1st AGI Conf., ed. by P. Wang, B. Goertzel, S. Franklin (IOS Press, Amsterdam 2008) pp. 224–235
31.75 P. Battaglia, J. Hamrick, J. Tenenbaum: Simulation as an engine of physical scene understanding, Proc. Natl. Acad. Sci. USA 110, 18327–18332 (2013)
31.76 D. Schwartz: Physical imagery: Kinematic versus dynamic models, Cogn. Psychol. 38, 433–464 (1999)
31.77 H. Reichenbach: Experience and Prediction: An Analysis of the Foundations and the Structure of Knowledge (Univ. Chicago Press, Chicago 1938)
31.78 J. Greenwood: Folk psychology and scientific psychology. In: The Future of Folk Psychology, ed. by J. Greenwood (Cambridge Univ. Press, Cambridge 1991) pp. 1–21
31.79 J.A. Fodor: Psychosemantics: The Problem of Meaning in the Philosophy of Mind (MIT Press, Cambridge 1984)
31.80 R. Giere: Laws, theories, and generalizations. In: The Limits of Deductivism, ed. by A. Grünbaum, W. Salmon (Univ. California Press, Berkeley 1988) pp. 37–46
31.81 J. Waskan: Knowledge of counterfactual interventions through cognitive models of mechanisms, Int. Stud. Philos. Sci. 22, 259–275 (2008)
31.82 A. Clark: Mindware: An Introduction to the Philosophy of Cognitive Science (Oxford Univ. Press, New York 2013)
31.83 M. Weisberg: Who is a modeler?, Br. J. Philos. Sci. 58, 207–233 (2007)
31.84 P. Churchland, T. Sejnowski: Neural representation and neural computation, Philos. Perspect. Action Theory Philos. Mind 4, 343–382 (1988)
31.85 R. Giere: Models as parts of distributed cognitive systems. In: Model Based Reasoning: Science, Technology, Values, ed. by L. Magnani, N. Nersessian (Kluwer, Amsterdam 2002) pp. 227–241
31.86 G. Lakoff, M. Johnson: Metaphors We Live By (Univ. Chicago Press, Chicago 1980)
31.87 G. Lakoff: Women, Fire, and Dangerous Things (Univ. Chicago Press, Chicago 1987)
31.88 D. Fernandez-Duque, M. Johnson: Attention metaphors: How metaphors guide the cognitive psychology of attention, Cogn. Sci. 23, 83–116 (1999)
31.89 D. Gentner, D.R. Gentner: Flowing waters or teeming crowds: Mental models of electricity. In: Mental Models, ed. by D. Gentner, A. Stevens (Lawrence Erlbaum Associates, Hillsdale 1983) pp. 99–129
Part G Modelling and Computational Issues
Ed. by Francesco Amigoni, Viola Schiaffonati
Since the 1950s, computational progress has steadily increased the relevance of the discourse about models, making it relevant not only to scientists and philosophers but also to computer scientists, programmers, and logicians. Emphasis has been put both on the application of computational tools to a range of disciplines and on the computational issues themselves. Computing is playing an increasing role in several scientific endeavors, in modeling and simulating different entities, and in practically enhancing the performance of various scientific activities. At the same time, computational tools are becoming more and more complex and present several open issues that require consideration from technical, methodological, and epistemological points of view.

This part of the Handbook of Model-Based Science discusses the modeling and computational issues arising in this context, with the aim of offering an articulated, although necessarily incomplete, picture of the field. It is composed of six chapters that alternate between general perspectives and specific fields of application.

This part opens with Chap. 32 on Computational Aspects of Model-Based Reasoning by Gordana Dodig-Crnkovic and Antonio Cicchetti, offering an introductory overview of the use of computational models and tools for the study of cognition and model-based reasoning. From simple agents, like bacteria, to complex human cognitive systems, computation is understood as physical, natural, embodied, and distributed, and it is discussed in relation to the symbol-manipulation view of classical computationalism.

Chapter 33 by Peter Sozou, Peter Lane, Mark Addis, and Fernand Gobet discusses computational scientific discovery as a particularly interesting and successful field of application of computer-supported model-based science. The chapter reviews the application of computational methods in the formulation of scientific ideas and acknowledges the importance of this field not only for historical reasons, with the first systems having played a disruptive role in the philosophical debate on scientific discovery, but also because of its increasing importance in many areas of science.

Chapter 34, Computer Simulations and Computational Models in Science by Cyrill Imbert, presents a very rich examination of computational science and computer simulations by accounting for the constant attempts to extend human computational capacities. The chapter covers a wide variety of topics and themes with the awareness that epistemological analyses of simulations are, to a large degree, contextual and that these analyses require developing insights about the evolving relation between human capacities and computational science.

Simulation is also the topic of Chap. 35, Simulation of Complex Systems by Paul Davidsson, Franziska Klügl, and Harko Verhagen, which opens with a discussion of what characterizes complex systems, such as huge ecosystems and traffic systems. Different approaches to modeling complex systems are presented, but particular attention is devoted to agent-based simulations, to their intuitiveness and flexibility, to some solutions proposed in recent years, and to the still open problems, which are discussed in a critical perspective.

Chapter 36, by Francesco Amigoni and Viola Schiaffonati, Models and Experiments in Robotics, surveys the practices being employed in experimentally assessing the special class of computational models embedded in robots. This assessment is particularly challenging due to the difficulty of satisfactorily estimating the interactions between robots and their environments. Moreover, by also considering related topics such as simulations, benchmarks, standards, and competitions, this chapter shows how the recent debate on the implementation of the experimental method in this field is still very open.

Chapter 37, Biorobotics by Edoardo Datteri, provides an overview of the biorobotic strategy for testing mechanistic explanations of animal behavior, starting from a reflection on the various roles played by robotic simulations in scientific research. Besides the history and state of the art of biorobotics, the chapter also addresses some key epistemological and methodological issues mainly concerning the relationships between biorobots and the theoretical models under investigation.

It is not by chance that this part of the volume ends with this chapter on biorobotics: if one of the main common traits of the other chapters has been the foundational role of the philosophical tools in discussing computational models in model-based science, this last chapter also shows how computational models and tools can offer new insights into traditional philosophical problems, and thus represents an ideal and critical conclusion offering further reflections on the articulation between computation and philosophy.
32. Computational Aspects of Model-Based Reasoning
Part G | 32.1
Gordana Dodig-Crnkovic, Antonio Cicchetti
Biologists not only model life forms on comput- computational social science (Lazer et al., 2009),
ers; they treat the gene, and even whole organisms, as evidenced, for example, by the growth in jour-
as information systems. Philosophy, artificial intel- nals, conferences, books and research funding. In
ligence, and cognitive science don’t just construct the digital humanities ‘critical inquiry involves the
computational models of mind; they take cognition application of algorithmically facilitated search, re-
to be computation, at the deepest levels. trieval, and critical process that [. . . ] originat[es] in
Physicists don’t just talk about the informa- humanities-based work’; therefore ‘exemplary tasks
tion carried by a subatomic particle; they propose traditionally associated with humanities computing
to unify the foundations of quantum mechanics hold the digital representation of archival materials
with notions of information. Similarly for linguists, on a par with analysis or critical inquiry, as well
artists, anthropologists, critics, etc.” as theories of analysis or critical inquiry originating
in the study of those materials’ (Schreibman et al.,
With the emphasis on philosophical facets of the
2008: xxv). In social sciences, Lazer et al. argue that
computational turn, Charles Ess and Ruth Hagengru-
‘computational social science is emerging that lever-
ber depict the same process of steady increase in the
ages the capacity to collect and analyze data with an
role of computation as follows [32.2]:
unprecedented breadth and depth and scale’ (2009).
“In the West, philosophical attention to computation Latour speculates that there is a trend in these
and computational devices is at least as old as Leib- informational cascades, which is certainly reflected
niz. But since the early 1940s, electronic computers in the ongoing digitalisation of arts, humanities and
have evolved from a few machines filling several social science projects that tends towards ‘the direc-
rooms to widely diffused – indeed, ubiquitous – de- tion of the greater merging of figures, numbers and
vices, ranging from networked desktops, laptops, letters, merging greatly facilitated by their homoge-
smartphones and the internet of things. Along the nous treatment as binary units in and by computers
way, initial philosophical attention – in particular, to (Latour, 1986: 16).”
the ethical and social implications of these devices
Finally, from the perspective of computing itself as
(Norbert Wiener, 1950) – became sufficiently broad
a field, Peter J. Denning ascertains [32.4, p. 1-1; p. 1-4]:
and influential as to justify the phrase the computa-
tional turn by the 1980s. “Computing is integral to science – not just as
In part, the computational turn referred to the a tool for analyzing data but also as an agent of
multiple ways in which the increasing availability thought and discovery. It has not always been this
and usability of computers allowed philosophers to way. Computing is a relatively young discipline. It
explore a range of traditional philosophical inter- started as an academic field of study in the 1930s
ests – e.g., in logic, artificial intelligence, philo- with a cluster of remarkable papers by Kurt Gödel,
sophical mathematics, ethics, political philosophy, Alonzo Church, Emil Post, and Alan Turing. The
epistemology, ontology, to name a few – in new papers laid the mathematical foundations that would
ways, often shedding significant new light on tradi- answer the question, what is computation? and dis-
tional issues and arguments. Simultaneously, com- cussed schemes for its implementation. These men
puter scientists, mathematicians, and others whose saw the importance of automatic computation and
work focused on computation and computational sought its precise mathematical foundation.”
devices often found their work to evoke (if not
force) reflection and debate precisely on the philo- “All this suggests that computing has developed
sophical assumptions and potential implications of a paradigm all its own (Denning and Freeman,
their research.” 2009). Computing is no longer just about algo-
rithms, data structures, numerical methods, pro-
Looking from the perspective of arts, humanities,
gramming languages, operating systems, networks,
and social sciences David M. Berry [32.3] observes:
databases, graphics, artificial intelligence, and soft-
“The importance of understanding computational ware engineering, as it was prior to 1990. It now
approaches is increasingly reflected across a num- also includes exciting new subjects including In-
ber of disciplines, including the arts, humanities ternet, web science, mobile computing, cyberspace
and social sciences, which use technologies to shift protection, user-interface design, and information
the critical ground of their concepts and theories – visualization. The resulting commercial applica-
something that can be termed a computational turn. tions have spawned new research challenges in
This is shown in the increasing interest in the social networking, endlessly evolving computation,
digital humanities (Schreibman et al., 2008) and music, video, digital photography, vision, massive
Computational Aspects of Model-Based Reasoning 32.2 Models of Computation 697
multiplayer online games, user-generated content, and much more. [. . . ] The central focus of the computing paradigm can be summarized as information processes–natural or constructed processes that transform information. They can be discrete or continuous.”
Denning reminds us that at the beginning of the field “Computation was taken to be the mechanical steps followed to evaluate mathematical functions. Computers were people who did computations”, while today computing plays a much more diverse role: “Computing is not only a tool for science but also a new method of thought and discovery in science” [32.4, p. 1-3].
Computing is the fourth great domain of science, together with the traditional domains of the physical, life, and social sciences [32.5].
In more detail, here is how Computer Science is relevant for other sciences in the view of Samson Abramsky and Bob Coecke [32.6]:
“Computer Science has something more to offer to the other sciences than the computer. In particular, in the mathematical and logical understanding of fundamental transdisciplinary scientific concepts such as interaction, concurrency and causality, synchrony and asynchrony, compositional modeling and reasoning, open versus closed systems, qualitative versus quantitative reasoning, operational methodologies, continuous versus discrete, hybrid systems and more. Computer Science is far ahead of many other sciences, due in part to the challenges arising from the amazing rapidity of the technology change and development it is constantly being confronted with. One could claim that computer scientists (maybe without realizing it) constitute an avant-garde for the sciences in terms of providing fresh paradigms and methods.”
All the above observations, made by researchers with different perspectives on computing, provide evidence supporting the commonly observed emergence of the computational turn in its different manifestations – from technological to cultural, artistic, cognitive, conceptual, and modeling aspects. As Edsger Dijkstra famously said, “Computing is no more about computers than astronomy is about telescopes” (as quoted in [32.7]) – that is to say that computers (as we know them and develop them) are only the tools for computing. Understanding of computing requires understanding of computational processes and their mechanisms.
Generative computational methods, according to Stephen Wolfram, enable the development of a new kind of science [32.8], and a new way of thinking that Jeannette M. Wing termed computational thinking [32.9].
“[t]he process of utilizing computer technology to complete a task. Computing may involve computer hardware and/or software, but must involve some form of a computer system.”
This is general enough as a description of the computation process, but it leaves open the question of what a computer actually is. It implicitly seems to assume that a computer is a technological device. However, in recent years a new field of computing has been developing, labeled Natural Computing [32.21] or Computing Nature [32.22, 23], in which processes in nature are understood as some kind of (intrinsic, physical) computation.
32.2.1 Turing Model of Computation and Its Scope
The first classical model of computation that also served as a definition of computation is the Turing machine model, which takes computation to be computation of a mathematical function. The Logical Computing Machine (Turing’s own expression for Turing machines) was an attempt to give a mathematically precise definition of an algorithm, that is, a mechanical procedure (followed by a human using pencil and paper, and given unlimited resources). The Church–Turing thesis says that a function on the natural numbers is computable [32.24, 25] if and only if it is describable by a Turing machine model.
Besides the Turing machine model, several other models of computation were defined, such as lambda calculus, cellular automata, register machines, and substitution systems, which have been shown to be equivalent to using general recursive functions. The Church–Turing thesis has long served as a definition for computation. There has never been a proof, but the evidence for its validity comes from the evident practical equivalence of the mentioned computational models.
Georg Kampis claims that the Church–Turing thesis applies only to simple systems [32.26]. According to Kampis, complex systems such as those found in biology must be modeled as self-referential, self-organizing structures called component systems, whose behavior goes far beyond the simple Turing machine model, as a more general model of computation [32.26, p. 223]:
“a component system is a computer which, when executing its operations (software) builds a new hardware [. . . ] [W]e have a computer that re-wires itself in a hardware-software interplay: the hardware defines the software and the software defines new hardware. Then the circle starts again.”
I would add an obvious remark. The Turing machine is supposed to be given from the outset – its logic, its (unlimited) physical resources, and the meanings ascribed to its actions. The Turing Machine essentially presupposes a human as a part of a system – the human is the one who poses the questions, provides resources, sets up the rules and interprets the answers.
However, today the dramatically increased interactivity and connectivity of computational devices have changed our understanding of the nature of computing [32.27]. Computing models have been successively extended from the initial abstract symbol-manipulating models of stand-alone, discrete sequential machines to the models of physical computing in the natural world, which are in general concurrent, asynchronous processes. For the first time it is possible to model living systems, their informational structures, and dynamics on both symbolic and subsymbolic information processing levels [32.28]. Currently, computation models are being developed to describe embedded, interactive, and networked computing systems [32.29] with an ambition to encompass present-day distributed computational architectures with their concurrent time behavior.
Ever since the time of Turing, the definition of computation has been the subject of debate. A special issue of the journal Minds and Machines (1994, 4, 4) was devoted to the question What is Computation? The most general is the view of computation as information processing, found in a number of mathematical accounts of computing; see [32.30] for exposition. Understanding of computation as information processing is also widespread in biology, neuroscience, cognitive science, and a number of other fields. An illuminating case is presented by David Baltimore in How biology became an information science [32.31]. Barry Cooper and Jan van Leeuwen’s Turing centenary volume [32.32] illustrates the current state of the art regarding the Turing model and its scope.
In general, for a process to qualify as computation, a mechanism that ensures definability of its behavior must exist, such as an algorithm, a network topology, a physical process, or similar [32.11].
The characterization of computing can be made in several dimensions with orthogonal types: digital/analog, symbolic/subsymbolic, interactive/batch, and sequential/parallel. Nowadays digital computers are used to simulate all sorts of natural processes, including those that in physics are understood as continuous. However, it is important to distinguish between the mechanism and the model of computation [32.33].
32.2.2 Computation as Information Processing
Luciano Floridi [32.34] presents the list of the five most interesting areas of research for the field of information (and computation) philosophy, containing 18 fundamental questions. Information dynamics is of special interest, as information processing (computation).
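As a concrete reference point for the Turing machine model of Sect. 32.2.1, against which the information-processing view is set, the following is a minimal sketch of a deterministic one-tape machine in Python. The function name `run_turing_machine`, the rule-table encoding, and the unary successor example are illustrative assumptions and do not come from the cited sources.

```python
from collections import defaultdict

BLANK = "_"  # assumed symbol for an empty tape cell


def run_turing_machine(rules, tape, state="scan", halt="halt", max_steps=10_000):
    """Run a deterministic one-tape Turing machine and return the final tape.

    rules maps (state, symbol) -> (new_state, symbol_to_write, move),
    where move is +1 (one cell right) or -1 (one cell left).
    """
    cells = defaultdict(lambda: BLANK, enumerate(tape))
    head = 0
    steps = 0
    while state != halt:
        if steps >= max_steps:
            raise RuntimeError("step limit reached without halting")
        # Look up the rule for the current state and scanned symbol,
        # write the new symbol, then move the head.
        state, cells[head], move = rules[(state, cells[head])]
        head += move
        steps += 1
    if not cells:
        return ""
    lo, hi = min(cells), max(cells)
    return "".join(cells[i] for i in range(lo, hi + 1)).strip(BLANK)


# Hypothetical example: the successor function n -> n + 1 on natural
# numbers written in unary (n is encoded as n consecutive "1" symbols).
SUCCESSOR = {
    ("scan", "1"): ("scan", "1", +1),    # skip over the existing 1s
    ("scan", BLANK): ("halt", "1", +1),  # append one more 1 and stop
}

result = run_turing_machine(SUCCESSOR, "111")
```

Everything the machine does is fixed in advance by the rule table, and a human supplies the input and reads the output, which is the point of the remark in Sect. 32.2.1 that the model presupposes a human as part of the system.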
Information and computation are two complementary ideas in a similar way to continuum and a discrete set. In its turn, the continuum – discrete set dichotomy may be seen in a variety of disguises such as time – space; wave – particle; geometry – arithmetic; interaction – algorithm; computation – information. The two elements in each pair presuppose each other and are inseparably related, so that Dodig-Crnkovic introduces the concept of info-computation, which emphasizes this dual character of information and computation as its dynamics [32.1, 35, 36]. The field of Philosophy of Information is so closely interconnected with the Philosophy of Computation that it would be appropriate to call it Philosophy of Information and Computation, having in mind the dual character of information-computation. Burgin puts it in the following way [32.30]:
“It is necessary to remark that there is an ongoing synthesis of computation and communication into a unified process of information processing. Practical and theoretical advances are aimed at this synthesis and also use it as a tool for further development. Thus, we use the word computation in the sense of information processing as a whole. Better theoretical understanding of computers, networks, and other information processing systems will allow us to develop such systems to a higher level.”
The traditional mathematical theory of computation is the theory of algorithms. Ideal, theoretical computers are mathematical objects and they are equivalent to algorithms, or abstract automata (Turing machines), or effective procedures, or recursive functions, or formal languages. New envisaged future computers are information-processing devices. That is what makes the difference. Syntactic mechanical symbol manipulation is replaced by information (both syntactic and semantic) processing. Compared to new computing paradigms, Turing machines form a proper subset of the set of information-processing devices, in much the same way as Newton’s theory of gravitation is a special case of Einstein’s theory, or Euclidean geometry is a limit case of non-Euclidean geometries.
According to [32.30] there are three distinct components of information-processing systems: hardware (physical devices), software (programs that regulate its functioning), and infoware that represents information processed by the system. Infoware is a shell built around the software–hardware core, which was the traditional domain of automata and algorithm theory. Communication of information and knowledge takes place on the level of infoware.
Peter J. Denning comments on the relationship between computation and information with respect to the classical Turing idea that computation is equivalent to algorithm execution [32.4]:
“First, some information processes are natural. Second, we do not know whether all natural information processes are produced by algorithms. The second statement challenges the traditional view that algorithms (and programming) are at the heart of computing. Information processes may be more fundamental than algorithms.”
Floridi’s philosophy of information, developed as [32.37]
“a new philosophical discipline, concerned with a) the critical investigation of the conceptual nature and basic principles of information, including its dynamics (especially computation and flow), utilization and sciences and b) the elaboration and application of information-theoretic and computational methodologies to philosophical problems.”
can be seen as parallel to the philosophy of computation as developed by Cantwell Smith in his Origins of Objects [32.18].
To better represent information processing in biological systems, computational modeling is applied with computation taken to be distributed, massively concurrent, heterogeneous, and asynchronous. Dodig-Crnkovic proposes to adopt Hewitt et al.’s actor model of computation [32.38, 39]. In this model, computation is the process of message passing between actors (agents) in an interacting network. Hewitt provides the following description [32.40, p. 161]:
“In the Actor Model, computation is conceived as distributed in space, where computational devices communicate asynchronously and the entire computation is not in any well-defined state. (An Actor can have information about other Actors that it has received in a message about what it was like when the message was sent.) Turing’s Model is a special case of the Actor Model.”
Hewitt’s computational systems consist of computational agents – informational structures capable of acting on their own behalf. Unlike the logical model of the Turing machine, Hewitt’s model of computation is inspired by physics [32.23].
When defining computation as information processing in a network of agents, those networks can consist of molecules or cells like bacteria or neurons, thus constituting networks of networks on the hierarchy of scales of informational structures with computational dynamics.
As we base the model of computation on the concept of information, it is in place to analyze the relationships between the two in more detail.
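The message-passing view of computation described above can be illustrated with a small sketch. It is a deliberately simplified, single-threaded approximation: real actor systems run concurrently and give no global ordering of events, whereas here a scheduler loop delivers one message per actor per round. All class and method names (`Actor`, `ActorSystem`, `Relay`, `spawn`, `deliver`) are hypothetical and chosen only for illustration.

```python
from collections import deque


class Actor:
    """An agent with private state that reacts only to incoming messages."""

    def __init__(self, name, system):
        self.name = name
        self.system = system
        self.mailbox = deque()

    def send(self, target, message):
        # Asynchronous send: the message is queued, not handled immediately.
        self.system.deliver(target, message)

    def receive(self, message):
        raise NotImplementedError


class ActorSystem:
    """Toy scheduler; an assumption standing in for true concurrency."""

    def __init__(self):
        self.actors = {}

    def spawn(self, actor_cls, name, **kwargs):
        actor = actor_cls(name, self, **kwargs)
        self.actors[name] = actor
        return actor

    def deliver(self, target, message):
        self.actors[target].mailbox.append(message)

    def run(self, max_rounds=1000):
        for _ in range(max_rounds):
            busy = [a for a in self.actors.values() if a.mailbox]
            if not busy:
                return
            for actor in busy:
                actor.receive(actor.mailbox.popleft())
        if any(a.mailbox for a in self.actors.values()):
            raise RuntimeError("messages still pending after max_rounds")


class Relay(Actor):
    """Records each signal and passes a weakened copy to its neighbor."""

    def __init__(self, name, system, neighbor=None):
        super().__init__(name, system)
        self.neighbor = neighbor
        self.received = []

    def receive(self, message):
        self.received.append(message)
        if self.neighbor is not None and message > 1:
            self.send(self.neighbor, message // 2)


system = ActorSystem()
a = system.spawn(Relay, "a", neighbor="b")
b = system.spawn(Relay, "b", neighbor="c")
c = system.spawn(Relay, "c")
system.deliver("a", 8)  # inject an initial signal of strength 8
system.run()
```

No actor can read another actor's state directly; each one knows about the others only through the messages it has received, which mirrors the parenthetical remark in Hewitt's description.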
700 Part G Modelling and Computational Issues
In this section, we detail the concept of computation as information processing to later on make explicit its connections to cognition in the following sections.
Sometimes the current lack of the consensus about the definition of information is termed scandal of information. At the same time, we can talk even about the corresponding scandal of computation, that is, the current lack of consensus about the concept of computation. Denning describes historical development of the concept of computation [32.4] as it appears in different contexts and various communities of practice.
However, this situation is not as unique as it may seem. There is no simple commonly accepted definition of life, and yet biologists study it and make constant progress in understanding its different aspects, forms, and processes. Actually no such unique definition of many among central concepts of sciences exists, and yet this situation is not experienced as a scandal. A fresh example is dark energy, which together with dark matter constitutes 95.1% of the content of the universe. We do not know what they actually are.
Instead of understanding a concept as equivalent to a (linear) string of symbols that constitute its definition, we can rather represent it as a network of relationships such as is kind of, is a member of, is a part of, is a substance of, is similar to, pertains to, causes, etc., which connect the concept with other concepts that are connected with still further concepts.
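The idea of a concept as a node in a network of labeled relationships can be sketched as a small graph structure. The class name `ConceptNetwork`, its methods, and the particular links between concepts are assumptions made for illustration; only the relation labels are taken from the text.

```python
from collections import defaultdict, deque


class ConceptNetwork:
    """Concepts linked by labeled relationships (a labeled graph)."""

    def __init__(self):
        # concept -> list of (relation label, other concept)
        self.edges = defaultdict(list)

    def relate(self, source, relation, target):
        self.edges[source].append((relation, target))
        # Store the reverse link so the network can be walked both ways.
        self.edges[target].append((f"inverse of: {relation}", source))

    def neighborhood(self, concept, max_hops):
        """Return {other concept: hop distance} within max_hops links."""
        distance = {concept: 0}
        queue = deque([concept])
        while queue:
            current = queue.popleft()
            if distance[current] == max_hops:
                continue
            for _relation, other in self.edges[current]:
                if other not in distance:
                    distance[other] = distance[current] + 1
                    queue.append(other)
        del distance[concept]
        return distance


# Hypothetical links, chosen only to show the mechanism.
net = ConceptNetwork()
net.relate("datum", "is a member of", "data")
net.relate("data", "is a kind of", "information")
net.relate("information", "pertains to", "knowledge")

near = net.neighborhood("datum", 1)
far = net.neighborhood("datum", 3)
```

Widening `max_hops` shows how a concept connects to other concepts that are in turn connected with still further ones, without any single defining string being required.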
[Figure: network of terms associated with the concept information, with clusters such as message (content, subject matter, substance), data (datum, raw data, metadata), news, evidence, details, example, database, format, and cognition (knowledge)]
This constitutes family resemblance (Familienähnlichkeit) of Ludwig Wittgenstein [32.42, §66] where
“There is no reason to look, as we have done traditionally – and dogmatically – for one, essential core in which the meaning of a word is located and which is, therefore, common to all uses of that word. We should, instead, travel with the word’s uses through ‘a complicated network of similarities overlapping and criss-crossing (PI §66)’”
Wittgenstein explains in [32.42, §67]:
“Why do we call something a number? Well, perhaps because it has a direct relationship with several things that have hitherto been called number; and this can be said to give it an indirect relationship to other things we call the same name. And we extend our concept of number as in spinning a thread we twist fibre on fibre. And the strength of the thread does not reside in the fact that some one fibre runs through its whole length, but in the overlapping of many fibres.”
As a result, boundaries dissolve [32.42, §68]:
“What still counts as a game and what no longer does? Can you give the boundary? No.”
Figures 32.1 and 32.2 illustrate what visualizations of the concepts information and computation may look like, and how Wittgenstein’s idea of family resemblance becomes apparent in them.
Given that information/computation are common ideas spanning fields from physics, chemistry, biology, and theories of mind, to ecologies and social systems, the bridging over such a large range is achieved by a network of inter-related concepts (family of concepts) that enable us to traverse from field to field, the main point being keeping a common thread. That means that the concept is not a reductionist one, but a networked, rhizomatic idea in the sense of [32.43].
The world today is seen as governed and controlled by natural laws that execute programs in materio, unlike the mechanistic world governed by fixed and unchangeable laws expressed by the equations of Newtonian physics. Wolfram argues that “information processes underlie every natural process in the universe” [32.8].
The important difference is that while equations operate on data, programming languages can also operate on higher-level data structures and even on physical objects, as in the case of cyber-physical systems.
[Figure: network of terms associated with the concept computation, including calculation, estimate, procedure, derivative, integral, interpolation, extrapolation, forecast, and number crunching]
(Executable) Models
The belief in the mathematical/geometrical essence of the world can be traced back to Plato and the Pythagoreans, and later on reappears with Galileo, in his 1632 Dialogue Concerning the Two Chief World Systems, where he argues that the book of nature is written in the language of mathematics. Plato’s ideal of eternal, unchangeable forms can be found in mathematics to this day. Even though mathematical formulas can be used to compute time-dependent processes, equations themselves are symbolic structures, persistent and immovable. Time dependency comes from a human performing computation, actively using static structures of mathematical algorithms to trace time behaviors of real-world systems. Platonic ideal forms, however remote from the physical realizations and questions of finite material resources, were long considered to represent the true nature of the world, while changes were supposed to be something ephemeral, uninteresting, and too earthly for a mathematician or a scientist to bother about. Up to quite recently this detachment from the real-time aspects of the world was commonly taken for granted and justified. The change happened with computing machinery getting integrated with dynamically changing physical objects, such as in embedded systems technologies or process control, where real-time computation processes must match real-time behaviors of the physical environment. This situation radicalized even more with the mobile distributed information and communication technologies, for which the system dynamics is the most prominent characteristic. Rapidly, eternal forms are becoming something remote and less noted. Everything is in the process of change, communication, timely response, and resource optimization, as this new world of embodied and embedded computation is physical in nature and thus substrate-dependent. The whole field of cyber-physical systems is emerging, at different levels of scale, from nano to macroscopic. In this decisive step from idealized abstract forms toward concrete material processes, computation has come close to the messy and complex world of concurrent and context-dependent processes in living beings [32.27].
One important shift is also in the role of an observer [32.4]:
“Computational expressions are not constrained to be outside the systems they represent. The possibility of self-reference makes for very powerful computational schemes based on recursive designs and executions and also for very powerful limitations on computing, such as the noncomputability of halting problems. Self-reference is common in natural information processes; the cell, for example, contains its own blueprint.”
One important aspect of models is the direction of their generation – bottom up or top down. Mathematical models are typically top-down while computational models are frequently bottom-up or generative, described by Wolfram as a new kind of science [32.8]. Fields modeling living organisms, like synthetic biology, present the challenge of bridging the gap between the two, enabling the circular motion from bottom up to top down and back.
Barry S. Cooper addresses this difference between mathematical and computational approaches [32.44] in his article detecting the mathematician’s bias and the current return to embodied (physical, natural) computation.
Unlike pure mathematics, computing can provide modeling tools for biomolecular systems, such as the abstraction of a molecule as a computational agent, in which a system of interacting molecules is modeled as a system of interacting computational agents [32.45]. Petri nets, State charts, and the Pi-calculus, developed for systems of interacting computations, can be successfully used for modeling of biomolecular systems, such as signaling, regulatory, and metabolic pathways and even multicellular processes. Processes, the basic interacting computational entities of these languages, have an internal state and interaction capabilities. Process behavior is governed by reaction rules specifying the response to an input message based on its content and the state of the process. The response can include state change, a change in interaction capabilities, and/or sending messages. Complex entities are described hierarchically [32.45].
Computer science distinguishes between two levels of description of a system: specification (what the system does) and implementation (how the system is built). Biological function of a biomolecular system emerges thus from the semantic equivalence between the low-level and high-level computational descriptions [32.45].
The difference between mathematical and computational models can be summarized as the distinction between denotational and operational semantics models given by Fisher and Henzinger in the following [32.46]:
Denotational (mathematical) models present a set of equations showing relationships between different quantities and their time changes. They are approximated numerically.
Computational Aspects of Model-Based Reasoning 32.5 Computation in the Wild 703
Operational semantics models are based on algorithms, whose execution (computation) simulates the behavior of the system in time.
The two semantics have different roles – while executable algorithms connect directly to the physical process and can be used in interactions with them, denotational semantics is descriptive and can be used to reason about systems.
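The contrast between the two kinds of model can be made concrete with a toy example. The system (first-order decay of a population of molecules), the function names, and all parameter values are assumptions chosen for illustration and are not taken from Fisher and Henzinger. The denotational model is the equation dx/dt = -kx, evaluated in closed form and approximated numerically with Euler steps; the operational model is an executable procedure in which each molecule independently decays with probability k·dt per time step.

```python
import math
import random


def denotational_exact(x0, k, t):
    """Closed-form solution x(t) = x0 * exp(-k t) of dx/dt = -k x."""
    return x0 * math.exp(-k * t)


def denotational_euler(x0, k, t, dt):
    """Numerical approximation of the same equation by forward Euler."""
    x = x0
    for _ in range(round(t / dt)):
        x += -k * x * dt
    return x


def operational_simulation(n_molecules, k, t, dt, rng):
    """Executable model: every surviving molecule decays with
    probability k*dt in each time step."""
    alive = n_molecules
    for _ in range(round(t / dt)):
        decayed = sum(1 for _ in range(alive) if rng.random() < k * dt)
        alive -= decayed
    return alive


X0, K, T, DT = 2000, 0.5, 2.0, 0.01  # illustrative values only
exact = denotational_exact(X0, K, T)
euler = denotational_euler(X0, K, T, DT)
simulated = operational_simulation(X0, K, T, DT, random.Random(1))
```

The equation describes the relationship between quantities and supports reasoning about the system (for instance, that the half-life is ln 2 / k), while the simulation produces one possible run of the system, fluctuations included, and could in principle be coupled step by step to an external process.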
“Natural computing is the field of research that investigates models and computational techniques inspired by nature and, dually, attempts to understand the world around us in terms of information processing. [. . . ] There is information processing in nature, and the natural sciences are already adapting by incorporating tools and concepts from computer science at a rapid pace. Conversely, a closer look at nature from the point of view of information processing can and will change what we mean by computation.”
James Crutchfield et al. [32.52] make a distinction between designed computation (that is what computer machinery performs) from intrinsic computation (that is all processes in nature that are inherently computational and that are used in computing machinery to compute on the basic hardware level). In that way, they are able to argue that information processing in dynamical systems as well is computational in nature –
“Morphological Computation is based on the observation that biological systems seem to carry out relevant computations with their morphology (physical body) in order to successfully interact with their environments. This can be observed in a whole range of systems and at many different scales. It has been studied in animals – e.g., while running, the functionality of coping with impact and slight unevenness in the ground is delivered by the shape of the legs and the damped elasticity of the muscle-tendon system – and plants, but it has also been observed at the cellular and even at the molecular level – as seen, for example, in spontaneous self-assembly.”
As can be observed in nature, living systems from the simplest unicellular to the most complex multicellular organisms are heterogeneous, distributed, massively concurrent, and largely asynchronous (in spite of certain common regularities like circadian rhythm
with oscillation of about 24 hours, which is found in animals, plants, fungi, and bacteria). This kind of systems are hardest to cope with in conventional computing [32.58]. The approach to modeling such systems suggested by Luca Cardelli is as “collectives of interacting stochastic automata, with each automaton representing a molecule that undergoes state transitions” [32.59]. Aviv Regev and Ehud Shapiro suggest that we should use lessons learned from modeling sequence and structure in biomolecular systems, which already have good computational models [32.45]:
“Sequence and structure research use computers and computerized databases to share, compare, criticize and correct scientific knowledge, to reach a consensus quickly and effectively. Why can’t the study of biomolecular systems make a similar computational leap? Both sequence and structure research have adopted good abstractions: DNA-as-string (a mathematical string is a finite sequence of symbols) and protein-as-three-dimensional-labeled-graph, respectively. Biomolecular systems research has yet to find a similarly successful one.”
32.5.2 New Computationalism. Nonsymbolic versus Symbolic Computation
Greg Chaitin argued in [32.60], in the tradition from Leibniz to number, epistemology can be seen as information theory. Reality for an agent is defined by information and its processes [32.61] where information processes for a cognizing agent proceed from deepest levels of cell-cognition, self-organized and with emergent properties from subsymbolic signal processing up to symbolic level of human (natural and formal) languages.
As presented in [32.23, pp. 1–22], it is often argued that computationalism is the opposite of connectionism and that connectionist networks and dynamic systems are not computational. This would imply that human mind, as network of processes resulting from the activity of human brain, cannot be adequately modeled computationally. However, if we define computation in a sense of natural computation, instead of symbol manipulation as in the Turing machine, it is obvious that processes in the physical substrate of the human brain are natural computation and consequently models of connectionist networks and dynamical systems do correspond to computational processes.
One of the central and long-standing controversies when it comes to understanding of computation in biological (cognitive) systems is the relationship between
manipulation is the way of classical Turing computation, while subsymbolic processes such as going on in dynamic systems are frequently not even considered as computing. Andy Clark argues convincingly for the necessity of both kinds of processes, subsymbolic and symbolic for human-level cognition [32.62]. Information is relative to a cognizing agent and what is information, symbolic, subsymbolic, continuous, or discrete is a question of level of description or agency. From the everyday experience of a human agent, water and air are continua. However, on the molecular level (thus from the perspective of molecular agency) water and air consist of discrete objects – molecules. Atomic nucleus is seen as a continuum in Bohr’s liquid drop model, while on the level of constituent nucleons, it is a discrete system. On the level of nucleons, we can see continuum but on the deeper level of their constituent quarks as agents, there is a discrete behavior. In general, what an agent registers as continuous or discrete depends on both the system one examines and the type of agent – its structures and ways of interaction. In the dynamic systems models, [32.63]:
“[T]he general idea is that cognition should be characterized as a continual coupling among brain, body, and environment that unfolds in real time, as opposed to the discrete time steps of digital computation. The emphasis of the dynamical approach is on how the brain/body/environment system as a whole changes in real time, and dynamics is proposed as the best framework for capturing that change. This is said to contrast with computation’s focus on internal structure i. e., its concern with the static organization of information processing and representational structure in a cognitive system.”
Computational modeling of cognitive processes requires computing tools that contain not only Turing Machine model but also physical computation on the level of biological substrate. That is also the claim made by Matthias Scheutz in the Epilogue of the book Computationalism: New Directions [32.64, p. 176] where he notices that:
“Today it seems clear, for example, that classical notions of computation alone cannot serve as foundations for a viable theory of the mind, especially in light of the real-world, real-time, embedded, embodied, situated, and interactive nature of minds, although they may well be adequate for a limited subset of mental processes (e.g., processes that participate in solving mathematical problems). Reservations about the classical conception of com-
symbolic and subsymbolic computation, where symbol putation, however, do not automatically transfer and
Computational Aspects of Model-Based Reasoning 32.5 Computation in the Wild 705
apply to real-world computing systems. This fact is often ignored by opponents of computationalism, who construe the underlying notion of computation as that of Turing-machine computation.”

Classical computationalism was the view that the classical theory of computation (Turing-machine model, universal, and disembodied) might be enough to explain cognitive phenomena. New computationalism (natural computationalism) emphasizes that embodiment is essential and thus physical computation, hence natural computationalism. The view of Scheutz is supported by Gerard O’Brien [32.65], who argues that

“cognitive processes, are not governed by exceptionless, representation-level rules; they are instead the work of defeasible cognitive tendencies subserved by the non-linear dynamics of the brains neural networks.”

Dynamical characterization of the brain is consistent with the analog interpretation of connectionism. But dynamical systems theory is typically not considered to be a computational framework. O’Brien and Opie [32.66] thus search for an answer to the question of how connectionist networks compute, and come up with the following characterization:

“Connectionism was first considered as the opposed to the classical computational theory of mind. Yet, it is still considered by many that a satisfactory account of how connectionist networks compute is lacking. In recent years networks were much in focus and agent models as well so the number of those who cannot imagine computational networks has rapidly decreased.”

As in classical computationalism only symbolic computation was taken into account, it is important to understand the connection between symbolic and subsymbolic information processing [32.67, p. 119]:

“Symbolic simulation is thus a two-stage affair: first the mapping of inference structure of the theory onto hardware states which defines symbolic computation; second, the mapping of inference structure of the theory onto hardware states which (under appropriate conditions) qualifies the processing as a symbolic simulation. Analog simulation, in contrast, is defined by a single mapping from causal relations among elements of the simulation to causal relations among elements of the simulated phenomenon.”

Both symbolic and subsymbolic (analog) simulations depend on causal/analog/physical and symbolic types of computation on some level, but in the case of symbolic computation it is the symbolic level where information processing is observed. Similarly, even though in the analog model symbolic representation exists at some high level of abstraction, it is the physical agency of the substrate and its causal structure that define computation (simulation).

Gianfranco Basti [32.68] suggests how to

“integrate in one only formalism the physical (natural) realm, with the logical-mathematical (computation), studying their relationships. That is, the passage from the realm of the causal necessity (natural) of the physical processes, to the realm of the logical necessity (computational), and eventually representing them either in a sub-symbolic, or in a symbolic form. This foundational task can be performed, by the discipline of theoretical formal ontology.”

Walter Jackson Freeman offers an accurate characterization of the relationship between the physical/subsymbolic and the logical/symbolic level in the following passage [32.69]:

“The symbols are outside the brain. Inside the brains, the construction is effected by spatiotemporal patterns of neural activity that are operators, not symbols. The operations include formation of sequences of neural activity patterns that we observe by their electrical signs. [...] Neural operators implement non-symbolic communication of internal states by all mammals, including humans, through intentional actions. [...] I propose that symbol-making operators evolved from neural mechanisms of intentional action by modification of non-symbolic operators.”

Consequently, the brain uses internal subsymbolic computing to manipulate relevant external objects/symbols.
706 Part G Modelling and Computational Issues
32.6 Cognition: Knowledge Generation by Computation of New Information

In the framework of Humberto Maturana and Francisco Varela, cognition is a capacity of all living beings, no matter how small or simple [32.70]. It is a characteristic of organisms that increases in complexity from the simplest ones, such as bacteria, to the most complex forms of cognition found in humans. Maturana and Varela’s view is gaining substantial support through the study of cognitive capacities of bacteria [32.71–79] and others. Social cognition has been studied in bacterial colonies, swarms, and films. Lorenzo Magnani and Emanuele Bardone summarize current findings in the following [32.80]:

“[A]ll organisms, including bacteria, are able to perform elementary cognitive functions because they sense the environment and process internal information for ‘thriving on latent information embedded in the complexity of their environment’ (Ben Jacob, Shapira, and Tauber, 2006) p. 496.”

In light of contemporary research results, the earlier completely dominating, and to this day still widespread, view of cognition as an exclusively human capacity appears as a gross simplification. The more we learn about the ways living organisms cope with their environments, the more we understand that even the simplest organisms exhibit cognitive behaviors, that is, adaptive information processing increasing their probability of survival. According to James Alan Shapiro [32.78]:

“bacteria utilize sophisticated mechanisms for intercellular communication and even have the ability to commandeer the basic cell biology of higher plants and animals to meet their own needs.”

As an example of the level at which the subtleties of bacterial cognitive behavior are known, we refer to Stephan Schauder and Bonnie L. Bassler, who reveal the specifics of bacterial communication and quorum sensing both within and between species of bacteria [32.75]:

“Bacteria communicate with one another using chemical signaling molecules as words. Specifically, they release, detect, and respond to the accumulation of these molecules, which are called autoinducers. Detection of autoinducers allows bacteria to distinguish between low and high cell population density, and to control gene expression in response to changes in cell number. This process, termed quorum sensing, allows a population of bacteria to coordinately control the gene expression of the entire community.”

Insights into the cognitive processes of bacteria have very far-reaching consequences for our understanding of life and cognition [32.78]:

“[T]he recognition of sophisticated information processing capacities in prokaryotic cells represents another step away from the anthropocentric view of the universe that dominated pre-scientific thinking. Not only are we no longer at the physical center of the universe; our status as the only sentient beings on the planet is dissolving as we learn more about how smart even the smallest living cells can be.”

Regarding information processing (computation) in bacteria, [32.73] emphasizes that bacterial information processing differs from the Turing machine model of computation. Unlike in the Turing machine, in a bacterial colony the hardware (physical system, bacteria) changes in response to an input (signals or molecules from the environment) through information processing, resulting in a new configuration/form plus possibly some output in signals/molecules. Bacteria typically exchange molecules as information, and this may also include exchange of genetic material. This type of computation is an example of Kampis’ component systems [32.26] and presents physical computation [32.54, 81–83]. Yet another take on physical processes behind living agency and its evolution is elaborated by Terrence Deacon [32.84]. Even though Deacon himself is not a computationalist, the models he develops can be understood as mechanistic and interpreted as computation. For [32.85], bacterial cognition is a case of interactive biological (hyper)computation, that is, computation beyond the Turing machine model.

According to [32.86], dynamical systems [32.87], analog neural networks [32.88], and oracle Turing machines [32.89] have in common that they introduce elements that are not Turing computable, that is, they introduce hyper-computation. Bournez and Cosnard compare capabilities of discrete versus dynamical systems and conclude that “many dynamical systems have intrinsically super-Turing capabilities.” Models of hypercomputation or super-Turing computation models of biological systems are studied in [32.90].

Advancing computational models of cognition and bridging the gap between bacterial and human cognition calls for studies of cognition in other living organisms. Even though human cognition is usually superior to animal and plant cognition, it is not always the case. For example, many animals have superior senses like vision, hearing, far better motoric skills, and some
memory tasks [32.91]:

“Young chimpanzees have an extraordinary working memory capability for numerical recollection – better even than that of human adults tested in the same apparatus following the same procedure.”

Understanding cognition in other organisms has extraordinary value in understanding mechanisms of cognition and their evolution. For example, the study of cognition in fish can help us find ecological factors that affect the evolution of particular cognitive abilities. One can study the relationship between the size of specific brain areas and cognitive abilities and the stages in the development of decision abilities [32.92]. Animal studies can be used for tracing the evolution of cognitive capacities, quantitatively testing possible correlations between certain cognitive abilities and life history, morphology, or socioecological variables, measuring whether phylogenetic similarity corresponds to the cognitive skills throughout species, etc. [32.93]. However, the traditional and even to this day predominant view is that only humans possess cognition. As a consequence, cognitive science developed the vast majority of its models and theories exclusively about human cognition.

32.6.1 Distributed Cognition and Model-Based Reasoning

Even though animals (including birds) use tools for different purposes, human intelligence is defined as “faculty to create artificial objects, in particular tools to make tools” [32.94]. In the embodied interaction with the environment humans are “engaged in a process of cognitive niche construction” as they delegate certain cognitive functions to the environment [32.80]:

“In this sense, we argue that a cognitive niche emerges from a network of continuous interplays between individuals and the environment, in which people alter and modify the environment by mimetically externalizing fleeting thoughts, private ideas, etc., into external supports. [...] Artifactual cognitive objects and devices extend, modify, or substitute natural affordances actively providing humans and many animals with new opportunities for action.”

Underlying computational cognitive mechanisms enable the process of model construction. If we take knowledge to include not only propositional knowledge (knowledge that) but also nonpropositional knowledge (knowledge how), we can say that bacteria know how to find food and avoid dangers in the environment.

“Knowledge generation can be naturalized by adopting computational model of cognition and evolutionary approach. In this framework knowledge is seen as a result of the structuring of input data (data → information → knowledge) by an interactive computational process going on in the agent during the adaptive interplay with the environment, which clearly presents developmental advantage by increasing agent’s ability to cope with the situation dynamics.”

Scientific knowledge is obviously human knowledge and it includes both propositional (typically theoretical) and nonpropositional (typically practical) knowledge. Ronald N. Giere suggests that models in science are best understood “as being components of distributed cognitive systems,” where the process of scientific cognition “is distributed between a person and an external representation” [32.96].

The idea of distributed cognition can be traced back to David Rumelhart and James McClelland [32.97], and it has been developed over the years in a number of prominent works such as [32.98–102].

The related idea termed extended mind has been proposed by Andy Clark and David Chalmers in [32.103], meaning that humans use tools and other suitable objects in the environment to perform cognitive tasks. Enactivism is a connected movement in the philosophy of mind whose proponents argue that we should understand mental abilities as essentially related to the extended body and to action [32.104].

In the study of the capabilities of networks of simple processors, Rumelhart and McClelland [32.97] found that they are good at recognizing patterns in the input. The generalization to the human brain is that it recognizes patterns through the activation of changes in the states of neurons induced by sensory inputs. Rumelhart and McClelland suggest that “humans do the kind of cognitive processing required for these linear activities by creating and manipulating external representations.”

In a distributed cognitive system information processing happens through parallel distributed processing (PDP). In this view “the regular or law-like behavior of a complex system is the consequence of interactions among constituent elements” [32.105]. The main ideas of PDP models – such as that cognitive functions arise from neural mechanisms, representations are distributed, cognitive development is driven by learning, cognitive structure is quasi-regular, behavior is sensitive to multiple ordered constraints, processing is ordered and continuous – are now standard assumptions in many research domains.
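The claim that pattern recognition can arise from interactions among simple units may be illustrated with a small sketch. The Python code below assumes a Hopfield-style associative memory trained with a Hebbian rule; this particular network type, the function names train_hopfield and recall, and the two 8-unit patterns are illustrative assumptions and are not taken from [32.97] or [32.105]. Each stored pattern is held in the whole weight matrix rather than in any single unit (a distributed representation), and recall proceeds by repeated local updates of unit states.

```python
# Illustrative sketch only: a Hopfield-style associative memory.
# The network type, the names, and the patterns are assumptions for illustration.

def train_hopfield(patterns):
    """Build a symmetric weight matrix with the Hebbian rule."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += p[i] * p[j] / n
    return weights


def recall(weights, state, sweeps=5):
    """Update units one at a time; each unit only sees its weighted inputs."""
    s = list(state)
    for _ in range(sweeps):
        for i in range(len(s)):
            net_input = sum(w * x for w, x in zip(weights[i], s))
            s[i] = 1 if net_input >= 0 else -1
    return s


# Two hypothetical stored patterns (units take the values +1 or -1).
PATTERN_A = [1, 1, 1, 1, -1, -1, -1, -1]
PATTERN_B = [1, -1, 1, -1, 1, -1, 1, -1]

if __name__ == "__main__":
    w = train_hopfield([PATTERN_A, PATTERN_B])
    noisy = [-1, 1, 1, 1, -1, -1, -1, -1]  # PATTERN_A with its first unit flipped
    print(recall(w, noisy) == PATTERN_A)
```

In this toy setting the network should restore the stored pattern from an input with one corrupted unit, although no single unit or weight encodes the pattern by itself; the regular behavior is a consequence of the interactions among the units.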
is seen as a means of socially distributed cognition that supports human communication [32.106]:

“There is no language of thought. Rather, thinking in language is a manifestation of a pattern matching brain trained on external linguistic structures (see also [32.107]).”

This distributed view of language implies that “cognition is not only embodied, but also embedded in a society and in a historically developed culture” [32.101]. Even here, as in the case of language games, we see the social aspect of cognitive artifacts. In the same way as a concept is a node in a network of related concepts, connected with several types of relationships, distributed cognition in general is a network that in principle can extend indefinitely. What we consider to be relevant depends on the agent and the context.

The computational model of cognition is closely related to the idea of agency. An agent in this context is defined as an entity capable of acting on its own behalf. Agent models are especially suitable for modeling of distributed cognitive systems and they are used for the study of adaptive behavior and learning. Among new trends in modeling there are generative agent-based models, where the complexity of a system results from the time development of the interactive behavior of simple constitutive parts (such as a swarm). Especially successful are new network models of complex systems [32.108], where the focus is on the properties of various network structures in an emerging network science. Commenting on the current development of computing, [32.109] declare:

“Multiplicities, flux, materialities, heterogeneities, and co-construction are features that are becoming increasingly evident within new configurations of computing.”

Those features are particularly suitable in modeling of living (cognitive) systems.

32.6.2 Computational Aspects of Model-Based Reasoning in Science

Nancy Nersessian [32.110] searches for “the cognitive basis of model-based reasoning in science,” especially for model-based creative reasoning that results in “representational change across the sciences,” thus investigating the central issue of creativity in science, asking “how are genuinely novel scientific representations created, given that their construction must begin with existing representations?”

entific cognition, and the nature of model-based reasoning in science in order to give an account of their cognitive basis and the role they play in representational change. She introduced the term model-based reasoning to denote the construction and manipulation of representations, both sentential and those related to external mediators [32.111, 112]. Model-based reasoning is applied to, among others, thought experiments, visual representations, and analogical reasoning [32.113, 114].

As Giere [32.115] emphasizes, models are not only tools but they also play a central role in the construction of knowledge. “Models are important, not as expressions of belief, but as vehicles for exploring the implications of ideas (McClelland 2010)” [32.105].

One of the ways of acquiring knowledge besides deduction and induction is abduction, which leads to knowledge discovery. Along with sentential and model-based theoretical abduction, [32.116] identifies manipulative abduction as thinking and learning through doing. Manipulative abduction is thus situated in the domain of extended cognition and presents an extra-theoretical behavior developed through manipulation of artifacts, such as written notes, diagrams, experimental set-ups, visual and other simulations, etc. One of the illustrative examples of extended cognition is diagrammatic reasoning [32.117, 118], [32.119]:

“What is interesting about diagrammatic reasoning is the interaction between the diagram and a human with a fundamentally pattern-matching brain. Rather than locating all the cognition in the human brain, one locates it in the system consisting of a human together with a diagram. It is this system that performs the cognitive task, for example, proving the Pythagorean theorem.”

Giere provides further examples of reasoning with pictorial representations and reasoning with physical and abstract models [32.96, p. 237]:

“The idea of distributed cognition is typically associated with the thesis that cognition is embodied. In more standard terms, one cannot abstract the cognition away from its physical implementation.”

This agrees with Fresco’s conclusions from his book Physical Computation and Cognitive Science [32.54].

Based on the idea that “a complex system, as the cognitive one, and its transformations, can be described in terms of a configurational structure” [32.120], morphodynamical abduction is then the abduction expressed through the geometrical framework of configurational structure. Magnani in [32.121] explains that
Computational Aspects of Model-Based Reasoning 32.7 Model-Based Reasoningand Computational Automation of Reasoning 709
“different mental states are defined by their geometrical relationships within a larger dynamical environment. This suggests that the system, in any given instant, possesses a general morphology we can study by observing how it changes and develops.”

This suggests the possibility of representing aspects of abductive reasoning in the framework of dynamical systems, as processes of natural computation. In [32.121] the authors argue that:

“Creative and selective abduction can be viewed as a kind of process related to the transformations of the attractors responsible of the cognitive system behavior. In the context of naturalized phenomenology we have described anticipation and abduction in the light of catastrophic rearrangement of attractors.”

This insight about the character of major qualitative shifts in understanding can be extended to aspects of scientific discovery, as Thagard addressed for conceptual change and scientific revolutions [32.122].

In the context of model-based reasoning it is instructive to take software as an example of executable computational models that enable transformations from system requirements to design and implementation. Software is used as a cognitive tool of extended cognition, which facilitates cognitive information processing by automation of reasoning, performing cognitive tasks of computation, search, control, etc., and has already resulted in radically new conceptualizations such as cyber-physical systems and the Internet of things.
alizable at all. Therefore, even though programming can be seen as a form of modeling of a system and its behavior, here we will refer to models as higher abstraction level representations, very often pictorial, of a software system [32.125]:

“Modeling, in the broadest sense, is the cost-effective use of something in place of something else for some cognitive purpose. It allows us to use something that is simpler, safer or cheaper than reality instead of reality for some purpose. A model represents reality for the given purpose; the model is an abstraction of reality in the sense that it cannot represent all aspects of reality. This allows us to deal with the world in a simplified manner, avoiding the complexity, danger and irreversibility of reality.”

From the 1980s, modeling techniques in software engineering have become more and more widespread. Their initial exploitation was confined to documentation/communication purposes only, while from the late 1990s they have been progressively adopted to provide some form of automation. Interestingly, computing is perhaps the only domain (except for logic) that has meta-products, that is, it realizes products that are developed by means of the same methods used to realize the design/modeling tools (unlike, e.g., civil engineering, where physics and computer graphics are exploited to model the final physical product, houses and bridges). It is not surprising that modeling techniques can be conceived as a derived product of description logics (DL), that is, a family of formal languages used for knowledge representation more expressive than propositional logic. However, as [32.126] argues, “these diagrammatic modeling languages provide no extensional, mathematical semantics, nor any automated reasoning facilities.”

The reasons for an evolution toward automated mechanisms are often connected with consistency problems: if a software system is documented through models and then implemented by hand, no one can guarantee that what was coded is compliant with what was designed. Even more important, in future maintenance activities it will be very difficult to keep the consistency between models and code. These issues find their reasons in the gap between the domain-specific description of the problem and the programming language encoding of a solution, as interpreted by the programmer and narrowed down by the constructs available in the language (see the notion of agency explained later on). As a consequence, a paradoxical situation will appear in which modeling artifacts have been introduced as an additional effort that, however, leaves the initial problem unsolved – aiding in the maintenance of complex systems.

Modeling languages like the UML (Unified Modeling Language; Object Management Group (OMG) [32.127]) are formal enough to be used for automated reasoning. They exploit pictorial, or better, diagrammatic representations of a system to document its structure and functions. They are typical examples of the previously mentioned diagrammatic reasoning, where diagrams are objects supporting distributed cognition. By saying that the UML is formal we mean that models conforming, for example, to the UML have to adhere to a well-defined set of rules and constraints. More precisely, a model has to adhere to a set of constraints defined by means of a metamodel, that is, a modeling language definition.

The most common way of defining a modeling language is to specify concepts and relationships between them. In general, language elements are interconnected through associations with multiplicities restricting possible source(s) and target(s) of a certain relation. Moreover, elements can contain other elements, and elements can specialize other elements. If a model adheres to all the production rules defined in the metamodel, it is said to conform to the metamodel. The metamodel itself has to adhere to precise rules to specify a language (like those we informally listed so far), which are typically referred to as the meta-metamodel. The abstraction hierarchy is limited at this point by prescribing that the meta-metamodel is defined by itself.

In order to improve the usability and readability of models, very often language concepts are not directly shown to users. On the contrary, they are in general amalgamated into information units and rendered in a more compact/intuitive way. Such utility elements, typically called syntactic sugar in programming languages terminology, introduce the distinction between abstract and concrete syntax of models. The former defines the structure of the model in terms of the metamodel it conforms to, while the latter represents how the model is rendered to the user. In this respect, the metamodel defines conformance rules in terms of the abstract syntax, while a certain abstract syntax can give place to multiple concrete syntaxes.

Modeling languages can be divided into two main categories: general purpose (GPL) and domain-specific (DSL). The former category refers to languages that have not been defined with a particular application domain in mind, and hence exploit general concepts to represent a certain real-life phenomenon in abstract terms. The latter category instead is typically built bottom-up and exploits the vocabulary specific to a certain applicative field. When linking those categories to model-based reasoning, it is intuitively easier to grasp the message carried by a model when produced by a DSL, since it exploits concepts specific to a certain domain, very
often with the help of an adequate (pictorial) concrete syntax. On the contrary, GPLs’ models are less intuitive since they re-encode certain phenomena through generic modeling concepts and their concrete syntax has to exploit very general modeling elements. However, the specificity of DSLs is also their Achilles’ heel: whenever concepts need to be added and/or refined, vast revisions can be required, in general to change users’ rendering of elements, attached semantics, and so forth. For example, for a tiny language supporting the graphical definition of mathematical expressions, if we consider elementary school usage we could define the language taking into account only natural numbers. As soon as we want to extend the language to support real numbers everything needs to be revised: notably, negative or rational numbers not only need a different representation, but also enable additional operations.

The concepts exploited to define a language implicitly establish the ontology of an idealized user using models for a certain purpose. In this respect, the specification of concepts, their relationships, and their graphical rendering through a concrete syntax are all conceived with this goal in mind. In the case of GPLs, the ontology is indirectly derived as a mapping between generic concepts and corresponding domain-specific items. Notably, the mathematical language mentioned above could be represented as a package containing four classes, each one identifying an arithmetical operation, and hence called Addition, Subtraction, Multiplication, and Division, respectively. In turn, each class would contain attributes, called, for example, operand_1, operand_2, and result. Moreover, each class would contain a method do_calculation performing the appropriate operation on the operands and returning the result. It is worth noticing that in this case the GPL offers concepts like package, class, and so forth, by which it is encoding reality. On the contrary, a DSL could define a Calculator including the four operations, each of which requires two input parameters and returns an output. Whether exploiting a GPL or using a DSL, it is assumed that the user has a clear idea of the correspondence between the sign + and the addition operation, and so forth. Even more important, the user should make correct use of operands, putting them in the correct order, especially when performing subtractions or divisions.

When talking about model-based reasoning we usually refer to semantics, that is, the meaning that is associated with concepts and relationships constituting a model [32.128]:

“a language consists of a syntactic notation (syntax), which is a possibly infinite set of legal ele-

mantic domain. Thus, any language definition must consist of the syntax semantic domain and semantic mapping from the syntactic elements to the semantic domain.”

Semantics is divided into two main categories, namely structural and behavioral semantics. The former is based on the metamodel definition itself, while the latter focuses on the execution of models conforming to a given metamodel. For instance, if we consider again the mathematical expression language, we could define the subtraction operator as taking two natural numbers and producing one natural number. At this point we can also impose that the first input shall be greater than the second. Although we did not specify what a user would do with a model conforming to such a metamodel, it is quite intuitive that we are defining semantics by the metamodel, since we are narrowing down inputs and output to natural numbers (and we are also constraining the maximum number of inputs to two). Moreover, from an ontological perspective the terms subtraction, operand, and result should be self-explicative about the concepts provided by the language.

Behavioral semantics adds the dynamics of model elements to the structural part, or the definition of the process related to the ontology defined with the language. Such a description could be done in informal ways, even by means of natural language (this is for instance done for the specification of large portions of the UML). Taking again the math example, the semantics of the subtraction concept would be defined as “The subtraction operation subtracts the value of operand_2 from operand_1.” The main issues with informal approaches are their proneness to ambiguous interpretations and the impossibility of exploiting them for automation purposes (e.g., interpretation, analysis, execution). A second possibility to give semantics to metamodel concepts is by means of a behavioral sublanguage embedded in the language itself. In some cases the structural part can be decorated with behavioral descriptions (e.g., scripts in a specific programming language); in other cases some portions of the language can be devoted to model behavior (for instance, the UML specification includes behavioral diagrams like state machines. Moreover, there exist UML extensions enabling behavioral description through so-called action languages). The model-driven research area introduced model transformations as another possible alternative, described in the following [32.129]:

“the definition of the semantics of a language can be accomplished through the definition of a mapping
ments, together with the meaning of those elements, between the language itself and another language
which is expressed by relating the syntax to a se- with well-defined semantics such as Abstract State
712 Part G Modelling and Computational Issues
Machines, Petri Nets, or rewriting logic. These se- Here model transformations are automatic mecha-
Part G | 32.8
mantic mappings between semantic domains are nisms mapping models toward other models as well as
very useful not only to provide precise semantics streams of characters. In this respect, a transformation
to DSLs, but also to be able to simulate, analyze performs a semantic anchoring between the structural
or reason about them using the logical and seman- concepts defined in the metamodel and the correspond-
tic framework available in the target domain. The ing elements generated as target of the transformation
advantage of using a model-driven approach is that execution. From an ontological perspective, the trans-
these mappings can be defined in terms of model formation in this case conveys the definition of the
transformations.” process.
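The three ways of giving semantics to the subtraction concept discussed above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the names Subtraction, operand_1, operand_2, and do_calculation come from the text, whereas the function to_expression, the use of a dataclass, and the choice of an arithmetic expression string as the target notation are hypothetical and not part of any cited metamodel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subtraction:
    """Structural part: a subtraction concept with exactly two operands."""
    operand_1: int
    operand_2: int

    def __post_init__(self) -> None:
        # Structural semantics: narrow the inputs down to natural numbers.
        for name in ("operand_1", "operand_2"):
            value = getattr(self, name)
            if isinstance(value, bool) or not isinstance(value, int) or value < 0:
                raise ValueError(f"{name} must be a natural number, got {value!r}")
        # Additional constraint mentioned in the text: the first input
        # shall be greater than the second.
        if not self.operand_1 > self.operand_2:
            raise ValueError("operand_1 must be greater than operand_2")

    def do_calculation(self) -> int:
        """Behavioral semantics: subtract the value of operand_2 from operand_1."""
        return self.operand_1 - self.operand_2


def to_expression(model: Subtraction) -> str:
    """Hypothetical model-to-text transformation.

    It anchors the structural concept in a target notation (here an
    arithmetic expression) whose meaning is assumed to be well defined.
    """
    return f"{model.operand_1} - {model.operand_2}"


if __name__ == "__main__":
    model = Subtraction(operand_1=7, operand_2=3)
    print(model.do_calculation())
    print(to_expression(model))
```

In this sketch the constructor checks play the role of the metamodel constraints, the method carries the behavior embedded in the language, and the transformation delegates meaning to the target notation.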
ent points in time and recognize precisely the kind of evolution they have been subject to. Notably, it is very difficult to distinguish between the deletion of an element and subsequent addition of a new one versus the simple modification of the same element from the older toward a newer version. Even if this could appear as a minor aspect, it has relevant consequences. In fact, in the former case whatever was derived from the removed element should be deleted as well, whereas in the latter situation interconnected elements should be tracked and updated accordingly. In turn, the propagation of updates poses interpretation problems as well. Recalling the simple evolution example related to the subtraction operation (from natural to real numbers) mentioned before, it is not difficult to foresee what kind of problems developers will incur when trying to reflect the impact of change on different related elements. Notably, if a concept in the language was renamed, the question is: should its concrete syntax also be revised? How? And what about transformation mappings written on the basis of the previous version of the modified concept?
What can be seen as a promising trend to alleviate the issues described above is the introduction of a separation between what have been called the structural and ontological aspects of a certain language. The structural aspects refer to generic syntactic rules that are invariant with respect to the information carried by the language. On the contrary, the ontological aspects are domain specific and are strictly related to a particular application context [32.130]. In general, structural aspects are simpler to manage, since there is no domain semantics involved. On the contrary, ontological aspects embed the semantics of a certain domain, and therefore typically require user guidance. If we take the example of natural language, structural aspects are grammar rules for sentence construction, while the ontological part is the argument discussed in the text. In this respect, a better denomination would be informative and cognitive aspects of a language, respectively. In such a way, it would not only be clearer that the latter aspects include semantic details, but it would also make it possible to introduce the notion of agency and agent, to locate possible multiple interpretations due to agency (something that gets mixed in the term ontology).
A separation between semantics and ontology, by introduction of the model of agent-centered ontology, is due to the fact that ontology (all that exists) depends on for whom it exists (see the later discussion). For example, for humans, invisible microscopic layers of physical reality did not exist before the invention of microscopes; birds collide with windows because they do not see glass – for them it does not exist before the collision. For them much of human civilization does not exist otherwise than as physical objects. The simpler an agent in the world is, the simpler is its ontology, that is, all that exists and can exist for that agent. Living agents self-organize at an increasing number of layers of agency [32.131]. The core is the basic physical structure of elementary particles and forces from which new layers of organization emerge – from physics to chemistry, biology, and the cognitive layer. Humans, on the top of the hierarchy, have the highest known number of levels of organization. Starting with the model of layered agency, from the basic physical primitives to cognitive functions, we can define semantics for an agent. For an agent it is characteristic that it is capable of acting in the world.
The most fundamental is physical agency, where agents act completely automatically by simply obeying physical laws. Chemical agency is a level above and it builds on the basic physical agency of elementary physical constituents, taken in bigger chunks of molecules that interact with each other. Biological agency emerges from chemical agency when chemical structures and cycles are established, which enable stable self-sustaining and self-reproducing formations [32.84] like the first viruses (in between crystal and living organism) and bacteria as the simplest cellular organisms. Symbol-manipulation-based agency arises first with organisms possessing a nervous system, which helps them to model and predict their environment. Social-level distributed cognition emerges in networks of agents, from the simplest living organisms to humans and anticipated intelligent machinery that is currently being developed. On the highest level of organization, there are languages shared by a community of users, used for coordinating actions and control of the environment. Even programming languages belong here.
Software is based on programming languages, which constitute a logical framework and syntactic rules. With respect to agency and agents in modeling for SE, some considerations are in order. From the earlier sections of this chapter, it is clear that our cognitive mechanisms are shaped by our embodied pattern recognition capability, which in turn has been trained through individual life and experiences as well as based on evolutionary developments of the species. In this respect, model users are agents – the ones who design the language to specify models, the ones who abstract a certain real-life phenomenon through the model, and the ones who map the information carried by a model back to a corresponding real-life domain. Agency depends on the background experience of a user that influences the interpretation of concepts represented through models. Notably, domain experts may have a deeper knowledge of certain aspects of the system that would introduce
References
32.1 G. Dodig-Crnkovic: Investigations into Information Semantics and Ethics of Computing (Mälardalen Univ. Press, Västerås 2006)
32.2 C. Ess, R. Hagengruber (Eds.): The computational turn: Past, presents, futures?, Proc. IACAP 2011 Conf. (Monsenstein Vannerdat, Münster 2011)
32.3 D.M. Berry: The computational turn: Thinking about the digital humanities, Cult. Mach. 12, 1–22 (2011)
32.4 P. Denning: Structure and organization of computing. In: Computing Handbook. Computer Science and Software Engineering, ed. by T. Gonzalez, J. Diaz-Herrera, A. Tucker (Chapman Hall/CRC, Boca Raton 2014)
32.5 P. Denning, P. Rosenbloom: The fourth great domain of science, Commun. ACM 52(9), 27–29 (2009)
32.6 S. Abramsky, B. Coecke: Physics from computer science: A position statement, Int. J. Unconv. Comput. 3(3), 179–197 (2007)
32.7 P.J. Denning: The great principles of computing, Am. Sci. 98, 369 (2010)
32.8 S. Wolfram: A New Kind of Science (Wolfram Media, Champaign 2002)
32.9 J.M. Wing: Computational thinking, Commun. ACM 49(3), 33–35 (2006)
32.10 M. Burgin, G. Dodig-Crnkovic: A taxonomy of computation and information architecture, Proc. 2015 Eur. Conf. Softw. Archit. Workshops (ECSAW ’15) (ACM, New York 2015), Article 7
32.11 G. Dodig-Crnkovic: Shifting the paradigm of philosophy of science: Philosophy of information and a new Renaissance, Minds Mach. 13(4), 521–536 (2003)
32.12 M. Burgin: Theory of Information: Fundamentality, Diversity and Unification (World Scientific, Singapore 2010)
32.13 J. van Benthem, P. Adriaans: Philosophy of Information (North Holland, Amsterdam 2008)
32.14 J. van Benthem: Logic and the dynamics of information, Minds Mach. 13(4), 503–519 (2003)
32.15 J. van Benthem: Logical pluralism meets logical dynamics?, Australas. J. Logic 6, 182–209 (2008)
32.16 J. van Benthem: Logical Dynamics of Information and Interaction (Cambridge Univ. Press, Cambridge 2011)
32.17 P. Allo: Logical pluralism and semantic information, J. Philos. Logic 36(6), 659–694 (2007)
32.18 B. Cantwell Smith: On the Origin of Objects (MIT Press, Cambridge 1998)
32.19 M. Tedre: The Science of Computing: Shaping a Discipline (CRC Press/Taylor Francis, Boca Raton 2014)
32.20 BusinessDictionary: http://www.businessdictionary.com/definition/computing.html#ixzz3 NZLB7QAD (2014), visited 02.08.2016
32.21 G. Rozenberg, T. Bäck, J.N. Kok (Eds.): Handbook of Natural Computing (Springer, Berlin, Heidelberg 2012)
32.22 H. Zenil (Ed.): A Computable Universe. Understanding Computation & Exploring Nature As Computation (World Scientific/Imperial College Press, Singapore 2012)
32.23 G. Dodig-Crnkovic, R. Giovagnoli: Computing Nature (Springer, Berlin, Heidelberg 2013)
32.24 A. Church: Abstract No. 204, Bull. Amer. Math. Soc. 41, 332–333 (1935)
32.25 A. Church: An unsolvable problem of elementary number theory, Amer. J. Math. 58, 354 (1936)
32.26 G. Kampis: Self-Modifying Systems in Biology and Cognitive Science: A New Framework for Dynamics, Information, and Complexity (Pergamon Press, Amsterdam 1991)
32.27 S. Navlakha, Z. Bar-Joseph: Distributed information processing in biological and computational systems, Commun. ACM 58(1), 94–102 (2015)
32.28 G. Dodig-Crnkovic: Significance of models of computation from Turing model to natural computation, Minds Mach. 21(2), 301–322 (2011)
32.29 E. Eberbach, D. Goldin, P. Wegner: Turing’s ideas and models of computation. In: Alan Turing: Life and Legacy of a Great Thinker, ed. by C. Teuscher (Springer, Berlin, Heidelberg 2004) pp. 159–194
32.30 M. Burgin: Super-Recursive Algorithms (Springer, New York 2005)
32.31 D. Baltimore: How biology became an information science. In: The Invisible Future, ed. by P. Denning (McGraw-Hill, New York 2001) pp. 43–56
32.32 S.B. Cooper, J. van Leeuwen: Alan Turing. His Work and Impact (Elsevier Science, Amsterdam 2013)
32.33 M. Burgin, G. Dodig-Crnkovic: From the closed classical algorithmic universe to an open world of algorithmic constellations. In: Computing Nature, ed. by G. Dodig-Crnkovic, R. Giovagnoli (Springer, Berlin, Heidelberg 2013) pp. 241–253
32.34 L. Floridi: Open problems in the philosophy of information, Metaphilosophy 35(4), 554–582 (2004)
32.35 G. Dodig-Crnkovic, S. Stuart: Computation, Information, Cognition: The Nexus and the Liminal (Cambridge Scholars Pub., Newcastle 2007)
32.36 G. Dodig-Crnkovic, V. Müller: A dialogue concerning two world systems: Info-computational vs. mechanistic. In: Information and Computation, ed. by G. Dodig-Crnkovic, M. Burgin (World Scientific, Singapore 2011) pp. 149–184
32.37 L. Floridi: What is the philosophy of information?, Metaphilosophy 33(1/2), 123–145 (2002)
32.38 C. Hewitt, P. Bishop, P. Steiger: A universal modular ACTOR formalism for artificial intelligence, Proc. 3rd Int. Joint Conf. Artif. Intell. IJCAI, ed. by N.J. Nilsson (William Kaufmann, Stanford 1973) pp. 235–245
32.39 C. Hewitt: Actor model for discretionary, adaptive concurrency, CoRR (2010) http://arxiv.org/abs/1008.1459
32.40 C. Hewitt: What is computation? Actor model versus Turing’s model. In: A Computable Universe,
tion. In: Communications and Discoveries From Multidisciplinary Data, ed. by S. Iwata, Y. Oshawa, S. Tsumoto, N. Zhong, Y. Shi, L. Magnani (Springer, Berlin, Heidelberg 2008) pp. 3–40
32.81 S. Stepney, S.L. Braunstein, J.A. Clark, A.M. Tyrrell, A. Adamatzky, R.E. Smith, T. Addis, C. Johnson, J. Timmis, P. Welch, R. Milner, D. Partridge: Journeys in non-classical computation I: A grand challenge for computing research, Int. J. Parallel Emerg. Distr. Syst. 20, 5–19 (2005)
32.82 S. Stepney, S.L. Braunstein, J.A. Clark, A.M. Tyrrell, A. Adamatzky, R.E. Smith, T. Addis, C. Johnson, J. Timmis, P. Welch, R. Milner, D. Partridge: Journeys in non-classical computation II: Initial journeys and waypoints, Int. J. Parallel Emerg. Distr. Syst. 21, 97–125 (2006)
32.83 G. Piccinini: Computation in physical systems. In: The Stanford Encyclopedia of Philosophy, ed. by E. N. Zalta (Fall 2012 Edition) http://plato.stanford.edu/archives/fall2012/entries/computation-physicalsystems/
32.84 T. Deacon: Incomplete Nature. How Mind Emerged from Matter (Norton, New York, London 2011)
32.85 C.E. Maldonado, A.N. Gómez Cruz: Biological hypercomputation: A new research problem in complexity theory, Complexity 20(4), 8–18 (2015)
32.86 F. Hernandez Quiroz: Computational and human mind model. In: The Computational Turn: Past, Presents, Futures?, ed. by C. Ess, R. Hagengruber (Monsenstein Vannerdat, Münster 2011) pp. 104–106
32.87 O. Bournez, M. Cosnard: On the computational power and super-Turing capabilities of dynamical systems, Theor. Comput. Sci. 168, 417–459 (1996)
32.88 H.T. Siegelmann: Turing on super-Turing and adaptivity, Prog. Biophys. Mol. Biol. 113(1), 117–126 (2013)
32.89 A.M. Turing: Systems of logic based on ordinals, Proc. Lond. Math. Soc. s2-45, 161–228 (1939)
32.90 C.E. Maldonado, A.N. Gómez Cruz: Biological hypercomputation: A concept is introduced, http://arxiv.org/abs/1210.4819 (2012)
32.91 S. Inoue, T. Matsuzawa: Working memory of numerals in chimpanzees, Curr. Biol. 17(23), R1004–R1005 (2007)
32.92 R. Bshary, W. Wickler, H. Fricke: Fish cognition: A primate’s eye view, Anim. Cogn. 5(1), 1–13 (2002)
32.93 E.L. MacLean, L.J. Matthews, B.A. Hare, C.L. Nunn, R.C. Anderson, F. Aureli, E.M. Brannon, J. Call, C.M. Drea, N.J. Emery, D.B.M. Haun, E. Herrmann, L.F. Jacobs, M.L. Platt, A.G. Rosati, A.A. Sandel, K.K. Schroepfer, A.M. Seed, J. Tan, C.P. van Schaik, V. Wobber: How does cognition evolve? Phylogenetic comparative psychology, Anim. Cogn. 15(2), 223–238 (2012)
32.94 H. Bergson: Creative Evolution (Dover, New York 1998)
32.95 G. Dodig-Crnkovic: Knowledge generation as natural computation, J. Syst. Cybern. Inf. 6(2), 12–16 (2008)
32.96 R.N. Giere: Models as parts of distributed cognitive systems. In: Model-Based Reasoning: Science, Technology, Values, ed. by L. Magnani, N. Nersessian (Kluwer, Dordrecht 2002) pp. 227–241
32.97 D.E. Rumelhart, J.L. McClelland, PDP Research Group: Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Volume 1: Foundations (MIT Press, Cambridge 1986)
32.98 L. Suchman: Plans and Situated Actions (Cambridge Univ. Press, Cambridge 1987)
32.99 F. Varela, E. Thompson, E. Rosch: The Embodied Mind: Cognitive Science and Human Experience (MIT Press, Cambridge 1991)
32.100 E. Hutchins: Cognition in the Wild (MIT Press, Cambridge 1995)
32.101 A. Clark: Being There: Putting Brain, Body and World Together Again (Oxford Univ. Press, Oxford 1997)
32.102 A. Clark: Supersizing the Mind Embodiment, Action, and Cognitive Extension (Oxford Univ. Press, Oxford 2008)
32.103 A. Clark, D.J. Chalmers: The extended mind, Analysis 58(1), 7–19 (1998)
32.104 T. Froese, T. Ziemke: Enactive artificial intelligence: Investigating the systemic organization of life and mind, Artif. Intell. 173(3/4), 466–500 (2009)
32.105 T.T. Rogers, J.L. McClelland: Parallel distributed processing at 25: Further explorations in the microstructure of cognition, Cogn. Sci. 38, 1024–1077 (2014)
32.106 R. Giere: Scientific cognition as distributed cognition. In: The Cognitive Basis of Science, ed. by P. Carruthers, S.P. Stich, M. Siegal (Cambridge Univ. Press, Cambridge 2002) p. 285
32.107 W. Bechtel: What knowledge must be in the head in order to acquire knowledge? In: Communicating Meaning: The Evolution and Development of Language, ed. by B.M. Velichkovsky, D.M. Rumbaugh (Lawrence Erlbaum, New Jersey 1996)
32.108 A.-L. Barabasi: The architecture of complexity, IEEE Control Syst. Mag. 27(4), 33–42 (2007)
32.109 I. Cafezeiro, C. Gadelha, V. Chaitin, I.C. da Marques: A knowledge-construction perspective on human computing, collaborative behavior and new trends in system interactions, Lect. Notes Comput. Sci. 8510, 58–68 (2014)
32.110 N.J. Nersessian: The cognitive basis of model-based reasoning in science. In: The Cognitive Basis of Science, ed. by P. Carruthers, S. Stich, M. Siegal (Cambridge Univ. Press, Cambridge 2002) pp. 133–153
32.111 N.J. Nersessian: Should physicists preach what they practice? Constructive modeling in doing and learning physics, Sci. Educ. 4, 203–226 (1995)
32.112 N.J. Nersessian: Model-based reasoning in conceptual change. In: Model-Based Reasoning in Scientific Discovery, ed. by L. Magnani, N. Nersessian, P. Thagard (Kluwer/Plenum, New York 1999) pp. 5–22
32.113 L. Magnani, N.J. Nersessian, P. Thagard (Eds.): Model-Based Reasoning in Scientific Discovery (Kluwer/Plenum, New York 1999)
32.114 L. Magnani, N.J. Nersessian: Model-Based Reasoning: Science, Technology, Values (Springer, New York 2002)
32.115 R.N. Giere: Using models to represent reality. In: Model-Based Reasoning in Scientific Discovery, ed. by L. Magnani, N. Nersessian, P. Thagard (Kluwer/Plenum, New York 1999) pp. 41–57
32.116 L. Magnani: Model-based and manipulative abduction in science, Found. Sci. 9(3), 219–247 (2004)
32.117 N.H. Narayanan, M. Suwa, H. Motoda: Hypothesizing behaviors from device diagrams. In: Diagrammatic Reasoning: Cognitive and Computational Perspectives, ed. by J. Glasgow, N.H. Narayanan, B. Chandrasekaran (MIT Press, Cambridge 1995) pp. 501–534
32.118 D. Wang, J. Lee, H. Zeevat: Reasoning with diagrammatic representations. In: Diagrammatic Reasoning: Cognitive and Computational Perspectives, ed. by J. Glasgow, N.H. Narayanan, B. Chandrasekaran (MIT Press, Cambridge 1995)
32.119 R.N. Giere: Scientific Perspectivism (Univ. Chicago Press, Chicago 2006)
32.120 L. Magnani: Abductive Cognition: The Epistemological and Eco-Cognitive Dimensions of Hypothetical Reasoning (Springer, Berlin, Heidelberg 2009)
32.121 L. Magnani, M. Piazza: Morphodynamical abduction. Causation by attractors dynamics of explanatory hypotheses in science, Found. Sci. 10, 107–132 (2005)
32.122 P. Thagard: Conceptual Revolutions (Princeton Univ. Press, Princeton 1992)
32.123 T. Deacon: The Symbolic Species: The Co-Evolution of Language and the Brain (Norton, New York, London 1997)
32.124 K. Czarnecki: Generative Programming. Principles and Techniques of Software Engineering Based on Automated Configuration and Fragment-Based Component Models, Ph.D. Thesis (Department of Computer Science and Automation, Technical University of Ilmenau, Ilmenau 1998)
32.125 J. Rothenberg: The nature of modeling. In: Artificial Intelligence, Simulation, and Modeling, ed. by L.E. William, K.A. Loparo, N.R. Nelson (Wiley, New York 1989) pp. 75–92
32.126 F. Dau, P.W. Eklund: A diagrammatic reasoning system for the description logic ALC, J. Vis. Lang. Comput. 19(5), 539–573 (2008)
32.127 Object Management Group (OMG): UML 2.4.1 Superstructure Specification. OMG Document Number: formal/2011-08-06 (OMG 2011), http://www.omg.org/spec/UML/2.4.1/
32.128 D. Harel, B. Rumpe: Meaningful modeling: What’s the semantics of “semantics”?, IEEE Comput. 37(10), 64–72 (2004)
32.129 A. Vallecillo: A journey through the secret life of models, Persp. Workshop: Model Eng. Complex Syst. (MECS), Dagstuhl Sem. Proc. 08331. (2008)
32.130 T. Kühne: Matters of (meta-) modeling, Soft. Syst. Model. 5(4), 369–385 (2006)
32.131 G. Dodig-Crnkovic: Info-computational constructivism and cognition, Constr. Found. 9(2), 223–231 (2014)
33. Computational Scientific Discovery
Computational scientific discovery is becoming increasingly important in many areas of science. This chapter reviews the application of computational methods in the formulation of scientific ideas, that is, in the characterization of phenomena and the generation of scientific explanations, in the form of hypotheses, theories, and models. After a discussion of the evolutionary and anthropological roots of scientific discovery, the nature of scientific discovery is considered, and an outline is given of the forms that scientific discovery can take: direct observational discovery, finding empirical rules, and discovery of theories. A discussion of the psychology of scientific discovery includes an assessment of the role of induction. Computational discovery methods in mathematics are then described. This is followed by a survey of methods and associated applications in computational scientific discovery, covering massive systematic search within a defined space; rule-based reasoning systems; classification, machine vision, and related techniques; data mining; finding networks; evolutionary computation; and automation of scientific experiments. We conclude with a discussion of the future of computational scientific discovery, with consideration of the extent to which scientific discovery will continue to require human input.
33.1 The Roots of Human Scientific Discovery
33.2 The Nature of Scientific Discovery
33.3 The Psychology of Human Scientific Discovery
33.4 Computational Discovery in Mathematics
33.4.1 Logic Theorist
33.4.2 AM and EURISKO
33.4.3 GRAFFITI
33.5 Methods and Applications in Computational Scientific Discovery
33.5.1 Massive Systematic Search Within a Defined Space
33.5.2 Rule-Based Reasoning Systems
33.5.3 Classification, Machine Vision, and Related Techniques
33.5.4 Data Mining
33.5.5 Finding Networks
33.5.6 Evolutionary Computation
33.5.7 Automation of Scientific Experiments
33.6 Discussion
References
Science is concerned with characterizing and explaining observations and phenomena. For most of history, it has been an exclusively human activity. However, the development of computers has had a substantial impact on science. The assessment and testing of scientific models has seen the application of computational methods, often with spectacular success. Among these major successes have been the development of numerical and simulation methods to compute the predictions of scientific models [33.1, 2].
A more ambitious endeavor is the use of computational methods to represent, in some sense, the formulation of scientific ideas – the characterization of phenomena and the generation of scientific explanations, in the form of hypotheses, theories, and models. This is the focus of this review. What computational methods have been developed so far, and how have they been applied? What scope is there for further developments and applications?
We begin by discussing the evolutionary biological and anthropological roots of scientific discovery, and the establishment of scientific discovery as a human endeavor. We then set out important features of the nature of the scientific discovery process, and the different forms of discovery. This provides a basis for considering the psychology of scientific discovery. This is followed by a discussion of computational discovery methods in mathematics, which, because of the close links between mathematics and the theoretical sciences, provides a useful prelude to computational discovery in science. We then survey methods and applications in computational scientific discovery. We finally draw conclusions, with particular attention paid to what type of problems in scientific discovery computational methods are best suited to, and the future of computational scientific discovery.
taxonomies as the first stage of the discovery process. power than Kepler’s laws. A putative chemical structure
However, we do not consider direct observational dis- can be regarded as a theory, and therefore the discovery
covery to be limited to the formation of taxonomies, of a chemical structure (such as the discovery by Wat-
as it may include discoveries that fit within existing son and Crick of the double helix structure of DNA)
taxonomies, such as the discovery of Pluto described can be regarded as discovery of a scientific theory, but
above. Additionally, the formation of taxonomies is not it is not clear that there is a clear division between this
always the first stage of scientific discovery; for exam- discovery and direct observational discovery; in con-
ple, the classification of organisms according to how trast, the discovery of the theory of evolution by natural
recently they last shared a common ancestor depends selection is very different from a direct observational
on the prior theory that the organisms in question are discovery.
descended from a common ancestor. In short, there Philosophers of science have been concerned
are cases where direct observational discovery does mostly with the discovery of scientific explanations,
not involve the formation of taxonomies, and cases that is, of empirical rules and theories. The essential
where the formation of taxonomies is a late stage in characteristic of a scientific explanation is that it be
the discovery process. We therefore consider direct logically coherent [33.21]. But how are scientific ex-
observational discovery to be a basic form of sci- planations (in the form of empirical rules or theories)
entific discovery. It generally occurs at the birth of generated in the first place? And how should a choice
a scientific field, but often continues as the field ma- be made between competing empirical rules or theo-
tures. ries?
A different form of discovery, which generally Assessing potential explanations is problematic be-
comes after a field has matured sufficiently to have cause there is no agreed objective method to assess a po-
accepted terms for describing observations and quan- tential explanation against an alternative (nor against
titative measures of these observables, is concerned the possibility that no explanation so-far advanced is
with finding empirical rules. Such a rule is a use- a good explanation of the phenomenon in question).
ful description of some aspect of the world to which There are some generally agreed principles that make
a number of observations conform. Prominent exam- a potential explanation more likely to be accepted.
ples are Kepler’s laws of planetary motion, Weber’s These are that it is better if it fits the data more closely,
law (also known as the Weber–Fechner law) in psy- better if it is more parsimonious, and better if it is more
chology, which states that the just noticeable difference plausible. However, while metrics can sometimes be
between two stimuli is proportional to the size of the calculated for how well a theory fits the data, and for its
stimuli, and the constancy of the speed of light in any parsimony, the question of plausibility is ultimately one
reference frame, established by the Michelson–Morley for human judgment. There is also the problem of how
experiment of 1887. This form of discovery can, if de- much weight to put on all these factors. Humans appear
sired, be divided into those that involve qualitative rules to have developed pragmatic, context-specific methods
on the one hand and quantitative on the other; an ad- of making acceptable inferences [33.3]: The psychol-
ditional distinction concerns whether or not the rules ogy of scientific discovery is discussed in more detail
involve unobserved entities [33.19]. in Sect. 33.3.
Another category of discovery is theories. A theory The process of finding an explanation for a speci-
is an underlying explanation, accounting for a set of ob- fied phenomenon of interest has the characteristics of
servations by means of a causal process. For example, an inverse problem. These problems have the follow-
Newton’s theory of gravitation explains Kepler’s ob- ing form. There is some state of the world S which,
servations by means of a deeper, causal principle. The in conjunction with certain physical laws, gives rise to
theory of evolution by natural selection, conceived by a data set D. Given S, it is in principle straightforward
Darwin and Wallace, explains a wide range of observa- to calculate D. This is known as the forward problem.
tions regarding organisms’ adaptations, and forms the The inverse problem is to calculate S from D. There
basis of the modern science of behavioral ecology. may be no solution S that generates D exactly, or there
722 Part G Modelling and Computational Issues
may be several. An example of such a problem concerns mirage data. A mirage is an optical distortion caused by meteorological conditions, which result in variation in the atmospheric refractive index [33.22]. The forward problem is to calculate what a mirage should look like from the refractive index profile. The inverse problem is to deduce the refractive index profile (and from this the temperature profile) from mirage data [33.23]. One approach involves a form of regularization: finding a refractive index profile to minimize a cost function [33.24]. This cost function includes an error term, which depends on how far the mirage data predicted by the proposed profile differs from the real data; computation of this term involves solving the forward problem for the proposed refractive index profile. Another term penalizes the proposed refractive index profile according to a measure of its implausibility.

Vision problems are inverse problems [33.25]. A common characteristic of inverse problems is that, as with problems of scientific explanation, there is generally no clear, objective measure of the best solution; rather, ad hoc problem-specific measures are needed. As with the example of the inversion of mirage data [33.24], a feature of almost every computational approach to inverse problems is that a candidate solution is tested by solving the forward problem and calculating the data that would be predicted under this candidate solution.

Most inverse problems do not involve scientific discovery – for example, inverting mirages to find a temperature profile would not be considered scientific discovery – but the similarities between scientific discovery and inverse problems are clear, and indeed scientific discovery can be regarded as an inverse problem. It can be cast as follows: Given a suitable theoretical or empirical account of the relevant aspect of the world, specified in sufficient detail, it is possible to calculate what observations should follow from it: that is the forward problem. The inverse problem is to go from the observations to the empirical rule or the theory. It is therefore pertinent to consider the computational methods used in inverse problems. Computational optimization methods are widely used [33.24, 26]. Evolutionary computational methods have been applied [33.27–29]: These are particularly suited to problems in which there are several local optima, where standard optimization methods often get stuck at a local optimum. We will discuss evolutionary computational methods for scientific discovery in more detail in Sect. 33.5.

It must be remembered, however, that not all scientific discovery involves finding an explanation for a specified problem. Rich data sets allow data-driven research in which new entities are directly discovered, and new empirical rules suggested by data analysis [33.18]; we will describe in Sect. 33.5 how computational methods play a large part in these processes.
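The regularization idea described above for mirage data (a cost made of an error term plus an implausibility penalty, with the forward problem solved for every candidate) can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the toy forward model (a three-point average), the roughness penalty, the coordinate search, and the names `forward`, `cost` and `invert` are hypothetical and are not taken from [33.24].

```python
def forward(profile):
    """Toy forward problem: each observation averages a value with its neighbours."""
    n = len(profile)
    out = []
    for i in range(n):
        window = profile[max(0, i - 1):min(n, i + 2)]
        out.append(sum(window) / len(window))
    return out


def cost(profile, data, weight):
    """Error term (misfit of predicted data) plus a weighted implausibility term."""
    predicted = forward(profile)  # solving the forward problem for the candidate
    error = sum((p - d) ** 2 for p, d in zip(predicted, data))
    # Assumed plausibility measure: rough (jagged) profiles are penalized.
    roughness = sum((b - a) ** 2 for a, b in zip(profile, profile[1:]))
    return error + weight * roughness


def invert(data, weight=0.1, step=0.05, sweeps=200):
    """Coordinate search: nudge one value at a time, keep changes that lower the cost."""
    profile = list(data)  # crude starting guess: the data themselves
    best = cost(profile, data, weight)
    for _ in range(sweeps):
        for i in range(len(profile)):
            for delta in (step, -step):
                trial = profile[:]
                trial[i] += delta
                c = cost(trial, data, weight)
                if c < best:
                    profile, best = trial, c
    return profile, best


true_profile = [1.0, 1.0, 2.0, 3.0, 3.0, 2.0]  # hypothetical hidden state S
data = forward(true_profile)                    # synthetic data set D
start_cost = cost(list(data), data, 0.1)
recovered, final_cost = invert(data, weight=0.1)
print(recovered, final_cost)
```

The choice of `weight` is exactly the kind of ad hoc, problem-specific judgment the text mentions: a larger value favors smoother candidate profiles at the expense of fit to the data.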
ment space. Klahr and Dunbar report that subjects vary in their approach to discovery: The principal approach of one group, the theorists, was to formulate hypotheses and then test them experimentally, while a second group, the experimenters, conducted many experiments without explicit hypotheses. This would seem to provide evidence that humans can use both Popperian and Baconian approaches to discovery. In the assessment of hypotheses, however, humans are prone to confirmation bias, that is, putting disproportionate weight on studies that would tend to confirm a favored hypothesis [33.37].

Qin and Simon [33.38] show how human subjects can discover a scientific law (Kepler’s third law of planetary motion) by a process of exploring possible, though simple, algebraic relationships between two variables, and using the results of early explorations to make better-informed guesses of the relationship. This approach mimics the workings of the BACON computational scientific discovery method for finding scientific laws [33.39]. However, the participants in the experiment of Qin and Simon [33.38] were primed to expect that some sort of simple relationship between the variables existed. Kepler did not know for certain that such a relationship existed, so the extent to which the experiment of Qin and Simon represents the situation faced by Kepler is questionable. Nevertheless, Qin and Simon’s results do provide some support for the role of rule-based reasoning in scientific discovery. Kulkarni and Simon [33.40] developed the KEKADA computational system for explaining general scientific processes; it is able to replicate the discovery of Hans Krebs’s hypothesis for the urea–ornithine cycle [33.41], using rule-based methods, again providing some support for the idea that inductive methods are important in human scientific discovery. See Sect. 33.5 for more details about the BACON and KEKADA computational discovery systems.

Notwithstanding the Baconian view of discovery, it is generally held that at least some forms of human scientific discovery involve something which can be termed creativity. Boden [33.42] suggests that there are three forms of creativity. The first is combinational creativity: the new (unexpected) combination of familiar ideas. The second is exploratory creativity: the exploration of accepted, structured spaces. The third is transformational creativity, which involves ideas outside the rules of an accepted space. A capacity that contributes to creativity, and is particularly important in scientific discovery, is the use of analogy [33.3]; this has been the subject of cognitive modeling, with an emphasis on the role of memory [33.43].
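The exploration of simple algebraic relationships between two variables, as in the Kepler task discussed above, can be sketched as a search for a combination of powers that stays nearly constant across observations. This is a brute-force simplification and not BACON's actual heuristic procedure. The planetary values are approximate textbook figures supplied here as an assumption, and the names `PLANETS`, `spread` and `find_invariant` are hypothetical.

```python
from itertools import product

# Approximate textbook values (assumed for illustration):
# semi-major axis in astronomical units, orbital period in years.
PLANETS = {
    "Mercury": (0.387, 0.241),
    "Venus": (0.723, 0.615),
    "Earth": (1.000, 1.000),
    "Mars": (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
}


def spread(values):
    """Relative spread: 0 would mean the quantity is perfectly invariant."""
    mean = sum(values) / len(values)
    return (max(values) - min(values)) / abs(mean)


def find_invariant(rows, max_power=3):
    """Try a**p * t**q for small integer powers and keep the most constant one."""
    best = None
    for p, q in product(range(1, max_power + 1), range(-max_power, max_power + 1)):
        if q == 0:
            continue  # a power of one variable alone is not a relationship
        values = [a ** p * t ** q for a, t in rows]
        score = spread(values)
        if best is None or score < best[0]:
            best = (score, p, q)
    return best


score, p, q = find_invariant(list(PLANETS.values()))
print(f"most constant combination: a^{p} * T^{q} (relative spread {score:.4f})")
```

With these values the most constant combination is the cube of the distance divided by the square of the period, which is Kepler's third law. Note that the search is given the two relevant variables in advance, which mirrors the priming of participants noted in the text.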
As in chess, where the legality of a move does not imply its quality, LT uses several proof strategies to make its search efficient. These strategies look for situations where the above rules of inference may be applied: For example, when wanting to show a implies c and knowing that b implies c, then proving a implies b would enable use of the syllogism inference.

Although intended as a program for proving logical statements, Newell et al. [33.48] also make some novel claims about the program’s ability to simulate the behavior of human problem solvers, and were among the first authors to argue for the use of computer programs to aid in understanding psychological processes. See Gobet and Lane [33.50] for a discussion of LT’s impact on psychology. Around the same time, a program was developed [33.51] that did not lay any claims to modeling human behavior, but used more intensive computational strategies to prove all the theorems in Chaps. 1–3 of Whitehead and Russell [33.49].

33.4.2 AM and EURISKO

Automated Mathematician (AM) [33.52, 53] is a program which uses heuristics to discover conjectures (and hence potential theorems) in mathematics. Unlike LT, it is unable to directly prove theorems. Instead, conjectures are created by modifying existing theorems, and tested empirically against specific examples. AM works within the set and number theory areas of mathematics, in part because it is easy for the computer to generate examples of, say, a conjecture about numbers. AM has been criticized [33.54], but serves as a model of how a mathematical discovery system can be designed.

AM begins with a basic set of concepts and can create well-known conjectures in set theory (such as subsets) and number theory (such as prime numbers, or Goldbach’s conjecture). AM represents each concept in a frame, holding information including: its algorithm, examples, which other concepts it is related to by generalization or specialization, and other related concepts. A concept is then selected out of the current pool, based on a measure of interestingness, and adapted versions of the concept are created. These adaptations may be a specialization of the concept, a restriction of its domain, or similar. In addition, and crucially, a human observer may indicate a concept for the system to work on next.

As AM proved fairly successful in mathematics, a related system, EURISKO, was built to attempt to discover new search heuristics: EURISKO is a meta-discovery system, which discovers new ways for discovery to occur. As Lenat and Brown [33.55] explain, one of the discoveries of EURISKO was that AM’s success was largely down to how mathematical concepts were represented. Replicating AM’s success in other domains requires careful work on formulating the internal representation of the domain. In particular, changing the syntactic form of an expression should be reflected in meaningful changes to the semantics (to the meaning).

33.4.3 GRAFFITI

GRAFFITI [33.56] is a system for developing conjectures in an area of mathematics known as graph theory. A graph is a set of nodes interconnected by edges, and graph theory has important applications in many scientific areas, including physics, chemistry and computer science (see Sects. 33.5.1 and 33.5.5).

GRAFFITI, like AM, is used to generate conjectures but is unable to construct proofs. The conjectures are formed from a database of invariants of a graph, such as the diameter (greatest number of edges on a shortest path between any two nodes), rank (number of nodes minus the number of connected components), or chromatic number (the number of colors required to color a graph so that no two adjacent nodes are of the same color). Simple sums of these invariants are then generated, and tested against a database of known graphs.

As the conjectures can require considerable computational time to check, a pair of heuristics are used to try to focus on interesting conjectures. The beagle heuristic is used to check that the conjecture is not trivial, for example, that an invariant is less than itself plus 1. The dalmatian heuristic is used to check that the conjecture is different to ones already in GRAFFITI’s database. If no counterexample can be found by the program, the conjecture is passed to the user. Conjectures are then published, and may be picked up and further analysed by graph theorists. GRAFFITI is one of the few systems proven to make conjectures which mathematicians find interesting, and has helped to advance the field. A substantial number of the conjectures have resulted in publications: a list may be found at [33.57].
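The generate-and-test cycle described for GRAFFITI can be sketched on a very small scale: compute a few invariants for a database of graphs, form conjectures from simple sums of invariants, drop trivial ones, and keep those with no counterexample. This is a minimal sketch under stated assumptions and not GRAFFITI's implementation. The four example graphs, the maximum-degree invariant, the conjecture form `a <= b + c`, and all function names are hypothetical choices made for illustration; the triviality filter only loosely mirrors the beagle heuristic, and the novelty check of the dalmatian heuristic is omitted.

```python
from collections import deque
from itertools import combinations_with_replacement, product


def make_graph(n, edges):
    """Undirected graph as a dict mapping each node to its set of neighbours."""
    g = {v: set() for v in range(n)}
    for a, b in edges:
        g[a].add(b)
        g[b].add(a)
    return g


def distances_from(g, start):
    """Breadth-first search: shortest path length from start to each reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in g[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def diameter(g):
    # Longest shortest path; assumes the graph is connected.
    return max(max(distances_from(g, v).values()) for v in g)


def max_degree(g):
    return max(len(neighbours) for neighbours in g.values())


def rank(g):
    seen, components = set(), 0
    for v in g:
        if v not in seen:
            components += 1
            seen.update(distances_from(g, v))
    return len(g) - components


def chromatic_number(g):
    # Brute force over colourings; only sensible for very small graphs.
    nodes = list(g)
    for k in range(1, len(nodes) + 1):
        for colours in product(range(k), repeat=len(nodes)):
            colour = dict(zip(nodes, colours))
            if all(colour[v] != colour[w] for v in g for w in g[v]):
                return k
    return len(nodes)


INVARIANTS = {"diameter": diameter, "max_degree": max_degree,
              "rank": rank, "chromatic": chromatic_number}


def surviving_conjectures(graphs):
    """Return conjectures (a, b, c), read as a <= b + c, with no counterexample."""
    table = [{name: f(g) for name, f in INVARIANTS.items()} for g in graphs]
    survivors = []
    for a in INVARIANTS:
        for b, c in combinations_with_replacement(INVARIANTS, 2):
            if a in (b, c):
                continue  # trivial: an invariant is at most itself plus something
            if all(row[a] <= row[b] + row[c] for row in table):
                survivors.append((a, b, c))
    return survivors


database = [
    make_graph(4, [(0, 1), (1, 2), (2, 3)]),                          # path
    make_graph(5, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]),          # cycle
    make_graph(4, [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]),  # complete
    make_graph(4, [(0, 1), (0, 2), (0, 3)]),                          # star
]
survivors = surviving_conjectures(database)
for a, b, c in survivors:
    print(f"{a} <= {b} + {c}")
```

A conjecture that survives such a small database is only a candidate: as in the text, surviving conjectures would still have to be passed to a human for proof or refutation.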
Computational Scientific Discovery 33.5 Methods and Applications in Computational Scientific Discovery 725
Table 33.1 Principal examples of scientific discovery systems

System | Type of discovery | Domain | Main technique
DENDRAL [33.58] | Chemical structures (topological) | Chemistry/biochemistry | Partly rule-based, partly brute force exhaustive consideration of possible structures
MECHEM [33.59] | Chemical reaction pathways | Chemistry | Systematic search using defined components to a defined level of complexity; rule based/reasoning framework for controlling search
BACON [33.39] | Scientific laws | General | Rule-based/reasoning
GLAUBER [33.60] | Qualitative rules | Primarily chemical reactions, but potentially general | Rule-based/reasoning
KEKADA [33.40] | Scientific processes | General | Rule-based system that seeks to explain phenomena by recursively generating hypotheses; can also propose experiments
GOLEM [33.61] | Predicting three-dimensional structure of proteins | Biochemistry | Statistical classifier, using a machine learning method to determine the classification rules
Storrie-Lombardie et al. [33.62] | Classifying galaxies | Astronomy | Classifier based on a neural network, trained using backpropagation
Shamir [33.63] | Classifying galaxies | Astronomy | Classifier based on weighted proximity to a number of descriptors determined from test data
Tiffin et al. [33.64] | Candidate genes for disease causation | Biomedical | Data mining, combining gene expression data and biomedical literature
Warmr [33.65] | Potentially carcinogenic chemicals | Chemistry | Data mining combined with rule-based reasoning
GRAM [33.66] | Co-expressed genes and regulatory networks | Biomedical | Network generation from pairwise measures of expression, then incremental node addition
PC algorithm [33.67] | Causal relationships between variables | General | Network refinement by successive deletion and directional interpretation of edges
Guindon and Gascuel [33.68] | Deducing phylogenies from DNA | Evolutionary biology | Hill-climbing to improve estimated phylogenetic tree, based on maximum-likelihood methods applied to DNA data
Frias-Martinez and Gobet [33.69] | Process-based theories | Psychology | Evolutionary computation (genetic programming)
Schmidt and Lipson [33.70] | Scientific laws | General | Symbolic regression, based on genetic programming
Robot scientist [33.71] | Formulation and experimental testing of simple hypotheses | Biochemistry | Rule-based reasoning and control of a robot
33.5.1 Massive Systematic Search Within a Defined Space

The first significant achievement in computational scientific discovery was a system known as heuristic DENDRAL. This began in 1965, with the purpose of automatically finding structures in organic chemistry [33.58, 72]; an important motivation was to provide a test-bed for the applicability of ideas in the emerging field of artificial intelligence (AI). Heuristic DENDRAL relies on the idea of making intensive use of specific knowledge, together with harnessing computational power for searching. It systematically evaluates all the topologically distinct arrangements of a set of atoms consistent with the rules of chemistry. Later versions also consider three-dimensional geometry, and use data beyond mass spectrometry. Alongside heuristic DENDRAL is a sister system called meta-DENDRAL, the purpose of which is to learn the rules of mass spectrometry from empirical data. While heuristic DENDRAL carries out computational scientific discovery directly, meta-DENDRAL is a method for developing a tool used in scientific discovery. It is therefore heuristic DENDRAL which is of more direct interest to the present study. The core of heuristic DENDRAL is a plan-generate-test algorithm. The planner devises hypotheses that reject or propose certain classes of chemical graph. A key feature of the planner is that it incorporates specialist knowledge to constrain the set of potential solutions considered by the generator. The generator was designed to exhaustively and efficiently generate all the possible chemical structures, within specified constraints. The testing part of the algorithm includes a prediction component. This takes a proposed structure and generates a predicted mass spectrum: This can be compared to the real data. This testing of a possible solution by comparing the predicted data it would generate to the real data involves a principle discussed in the consideration of inverse problems in Sect. 33.2: testing a proposed solution to an inverse problem by solving the forward problem for this solution, and comparing the predicted outcome to the real data.

It should be emphasized that heuristic DENDRAL is not just a number cruncher: The use of specialized knowledge to constrain the set of potential solutions considered – justifying the term heuristic – was very important to the success of this system. However, number crunching was also necessary: the capacity to use computational power to systematically evaluate a set of potential solutions which is too large a job to undertake manually.

Another system that makes good use of systematic search within a defined space is MECHEM [33.59]. This is concerned with finding chemical reaction pathways, that is, the steps involved in a chemical reaction. An example is a possible pathway in the urea–ornithine cycle, originally proposed by Krebs [33.41].

\[
\begin{aligned}
\text{Ornithine} + \mathrm{NH_3} + \mathrm{CO_2} &\rightarrow \mathrm{H_2O} + \mathrm{C_6H_{13}N_3O_3} \\
\mathrm{NH_3} + \mathrm{C_6H_{13}N_3O_3} &\rightarrow \text{Arginine} + \mathrm{H_2O} \\
\text{Arginine} + \mathrm{H_2O} &\rightarrow \text{Urea} + \text{Ornithine}
\end{aligned}
\]

In this proposed pathway, the chemical species $\mathrm{C_6H_{13}N_3O_3}$ has been conjectured: It was not observed, nor was it part of the input data. The hypothesis-formation algorithm used in MECHEM makes use of two complexity parameters, specifying the number of steps and species to be contained in a hypothesis. Then the hypothesis generator finds the possible hypotheses within this constraint. If they are all rejected, then at least one of the complexity parameters is incremented. The MECHEM system also has a higher level (rule-based) decision system, allowing it to indicate that new experimental evidence should be sought, or that the problem be suspended.

33.5.2 Rule-Based Reasoning Systems

The BACON research program [33.20, 38], as with DENDRAL described above, originated in artificial intelligence research. BACON is a series of systems for discovering empirical rules, in the form of laws, by uncovering relationships within data sets. It makes use of rule-based induction, looking initially for simple relationships between variables, such as an invariant ratio or product. In what it achieves, BACON has a lot in common with regression, and can be considered a form of dimension reduction. Later versions of BACON go beyond simple regression and dimension reduction by generating properties representing intrinsic properties of entities; an example is the refractive index, generated in the discovery of Snell’s law of refraction.

The GLAUBER discovery system [33.20, 60] uses a similar rule-based induction process to BACON, but is concerned with qualitative empirical rules. For example, it can discover the law that every acid combines with every alkali to produce some salt.

The KEKADA system [33.40] is a tool with the purpose of understanding scientific processes; it has been applied to the urea–ornithine cycle, replicating the steps undertaken by Krebs [33.41] to formulate his famous hypothesis describing how the cycle operates. In seeking to replicate how scientists act, it includes a problem chooser module: This determines which discovery task to attempt when there are several potential tasks on the agenda, according to considerations such as how important a task is, and how accurately it can be studied. It allows for a puzzling experimental finding to be added to the agenda for investigation. A hypothesis generator creates hypotheses when faced with a new problem. There are also rules to propose experiments whose findings could change confidence in existing hypotheses. The results of experiments feed into the system via hypothesis modifiers and confidence modifiers.

While rule-based reasoning can be considered to be the main discovery technique in the above systems, it should be noted that rule-based reasoning is also an important component of several other scientific discovery systems. Thus, for example, heuristic DENDRAL, described above, has a rule-based system to control which searches it carries out, while GOLEM, described below, uses rule-based reasoning to learn classification rules.

33.5.3 Classification, Machine Vision, and Related Techniques

Computerized processes for classification, alongside techniques for object/instance recognition, including machine vision methods, are becoming increasingly important. It is therefore not surprising that such systems have found applications in computational scientific discovery. The GOLEM system has been shown to successfully produce hypotheses about protein structure [33.61]. It uses machine learning to determine rules for predicting the structure. The basic algorithm takes a random sample of pairs of example residues, taken from all the proteins in the system, and computes common properties. These properties are then made into a rule. GOLEM therefore makes use of rule-based techniques, within the envelope of what is in effect a classification system.

In astronomy, data sets may include image data, and computational methods can be applied to problems such as the identification and classification of astronomical objects. Storrie-Lombardie et al. [33.62] use a neural network to classify galaxies into five types, based on 13 variables measured by machine, using a backpropagation algorithm to train the network. Shamir [33.63] describes the automatic classification of galaxies using a method which first converts each image to a number of low-level descriptors, and then uses discriminant analysis to find the descriptors which are most informative. A weighted distance between two feature vectors can then be computed; the predicted class of a test image is given by the class of the training image that has the smallest weighted distance to it. A similar approach has been applied to the classification of structures in biological images [33.73].

33.5.4 Data Mining

Methods used for finding patterns from large amounts of data, sometimes from disparate sources, have been termed data mining [33.74, 75]. Scientific literature can be a useful source of such data [33.76]. For example, candidate genes for causation of disease can be determined from computational discovery of important statistical associations, using literature databases together with protein function [33.77] or gene expression [33.64] data. Computational data-mining techniques have also been used, in combination with rule-based methods, to identify candidate carcinogenic compounds, using a database of carcinogenetic tests of compounds [33.65] (Fig. 33.1). A deep commonality between machine learning (and classification) methods on the one hand and data mining techniques on the other is that both involve finding statistical associations in data. In machine learning, the emphasis tends to be on the techniques for finding associations within a given set of data, while data mining is often more concerned with techniques for extracting useful data from diverse or technically challenging sources. Machine-learning and data-mining approaches can be used together [33.78].

33.5.5 Finding Networks

Sometimes a set of entities interact together, in a way that governs some process. This is termed a network. An example is a gene network, that is, a set of genes and proteins that interact to govern a biological process [33.79]. Gene networks can be discovered through a computational process that combines different sources of evidence about interactions between genes and regulator proteins [33.66].

Finding networks frequently involves the use of optimization methods, to find a set of hypothesized interactions that maximizes some objective function, based on statistical measures of the strength of these interactions. The objective function can include a complexity penalty term, aimed at preventing overfitting and ensuring network sparseness [33.80]. Computational implementation of exhaustive search, clustering, or an optimization method which moves individual nodes to increase connectivity, can find networks of interacting molecules [33.81]. The structure and parameters of ecological networks, denoting how species interact in an ecosystem, can be estimated using computational methods based on an analysis of biological flows from prey to predator species [33.82, 83]. Combinatorial optimization techniques can be used to find the most important members of a social network [33.84].

It is often important to characterize probabilistic dependencies between variables in a network, and, where feasible, to seek evidence for causality. Probabilistic dependencies can be represented in a form of directed acyclic graph (i.e., where any relationship between two
nodes A and B has a direction and there are no cycles) known as a Bayesian network (BN) [33.85]. In a BN, a node is conditionally independent of its nondescendants given its parents. One of the more successful computational approaches in finding such networks involves first connecting all nodes with edges, then deleting edges between variables that are independent, or conditionally independent given a subset of the remaining variables [33.67]; another step involves inferring the direction of the remaining edges from conditional dependencies. See Haughton et al. [33.86] for a review of computational methods. Such networks can provide evidence for causality [33.87, 88], though this approach has its critics [33.89].

Closely related to the discovery of networks is the discovery of ancestral relationships between entities. This is of particular interest in biology, for the discovery of phylogenetic relationships [33.90]. An important technique is the computational application of maximum likelihood methods, which seek the phylogenetic tree that maximizes the probability of the observations given the tree [33.68, 91].

33.5.6 Evolutionary Computation

With the continuing development of faster computers, it has become possible to evaluate increasingly large sets of candidate hypotheses, models, or theories for those which are consistent with available data, and which are admissible on other grounds such as parsimony and plausibility. How to generate the candidate hypotheses, models, or theories to be tested remains a serious challenge. Evolutionary computation is a process for searching a large space of potential candidate solutions by mimicking the Darwinian process of natural selection. The two main methods of evolutionary computation are genetic algorithms, a method for finding a set of parameters, and genetic programming, a method for finding an algorithm, in the form of a computer program. Here, we consider the application of genetic programming [33.92, 93] to scientific discovery. Genetic programming involves creating an initially random population of programs. In successive iteration cycles, a new population is generated by preferential selection of those individuals which are better with respect to a defined fitness function. In addition, a small amount of random change is introduced into the new programs (mutation); this helps the process explore new regions of the search space. New variants are also created by randomly combining two existing members of the population (crossover).

The development of process-based theories in psychology involves finding plausible psychological theories, constructed from basic processes, to explain the
term memory, and simple logical operations. This set of operators is based on previous theories in the psychological literature. In the genetic program, the fitness function measures how well predictions resulting from a candidate psychological theory fit a data set; it can also penalize theories that are too complex. Frias-Martinez and Gobet [33.69] and Lane et al. [33.94] have shown that the technique can be used to generate theories to explain subjects’ behavior in the delayed match to sample task [33.95], in which subjects must match a new image to one of two images they have seen previously. The theories developed using genetic programming generate predictions that fit the empirical data well. In addition, simple and surprising theories can be created [33.69]. Figure 33.2, adapted from [33.69], shows an example of such a theory. Input 1 is the new image, while input 2 is one of the previous images. The operators used in the theory are listed in Table 33.2. This capacity to generate unexpected theories offers the potential for these techniques to provide new insights into the psychological phenomena under study.

Fig. 33.2 Example of a theory generated by genetic programming. According to this theory, the delayed match to sample task is accomplished by comparing one of the new images to the original image; the second new image is not used (after [33.69])

Table 33.2 Operators used in the theory shown in Fig. 33.2

Operator | Description
Progn2 | Function: executes two inputs sequentially. Input: Input1, Input2. Output: the output produced by Input2
PutSTM | Function: writes the input into short term memory (STM). Input: Input1. Output: the element written in STM (Input 1)
Compare12 | Function: compares positions 1 and 2 of STM and returns empty (NIL) if they are not equal or the element if they are equal. Input: none. Output: NIL or the element being compared

Schmidt and Lipson [33.70] describe a system for finding laws embedded in data sets. It uses a technique known as symbolic regression, and is based on genetic programming. A key test for a candidate law is that it should have good predictive ability, based on partial derivatives between pairs of variables. Schmidt and Lipson [33.70] represent an equation as a computer program. An initially random population of equations is created, and then a genetic programming algorithm is applied.

33.5.7 Automation of Scientific Experiments

A great deal of scientific experimentation involves labor-intensive cycles of setting up experiments, and collecting and analyzing data. The robot scientist [33.71, 96] automates this process, with the automation controlled by a computer algorithm. It iteratively collects and analyzes data and generates hypotheses, determined directly from data, for applications such as establishing which genes are involved in encoding enzymes. Some previous scientific discovery methods have incorporated in their logic the capacity to propose experiments [33.40, 59], but the robot scientist is probably the first practical application of a fully automated robotic system in which real world, rather than computational, experiments are formulated, executed, and analyzed.

Other automated systems, while arguably not as complete as the robot scientist, have been developed for the automation of the collection and analysis of data in hostile environments [33.97]. This applies in areas such as studies of underwater environments [33.98] and space exploration [33.99].
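The genetic programming cycle described in Sect. 33.5.6 (a random initial population of programs, preferential selection by a fitness function, mutation, and crossover) can be made concrete with a small symbolic-regression sketch. It is a toy under stated assumptions, not the method of [33.69] or [33.70]: the fitness here is a squared prediction error plus a complexity penalty rather than a criterion based on partial derivatives, and the expression encoding, parameter values, target data, and all names are hypothetical.

```python
import random

OPS = {"add": lambda a, b: a + b,
       "sub": lambda a, b: a - b,
       "mul": lambda a, b: a * b}


def random_tree(rng, depth):
    """A program is either a terminal or a tuple (operator, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(["x", 1.0, 2.0])
    return (rng.choice(list(OPS)), random_tree(rng, depth - 1), random_tree(rng, depth - 1))


def evaluate(tree, x):
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree


def size(tree):
    return 1 + size(tree[1]) + size(tree[2]) if isinstance(tree, tuple) else 1


def fitness(tree, data, penalty=0.01):
    """Lower is better: squared prediction error plus a complexity penalty."""
    return sum((evaluate(tree, x) - y) ** 2 for x, y in data) + penalty * size(tree)


def paths(tree, path=()):
    """Yield the position of every subtree as a tuple of child indices."""
    yield path
    if isinstance(tree, tuple):
        yield from paths(tree[1], path + (1,))
        yield from paths(tree[2], path + (2,))


def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree


def replace(tree, path, new):
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replace(tree[path[0]], path[1:], new)
    return tuple(parts)


def mutate(tree, rng):
    """Mutation: swap a randomly chosen subtree for a fresh random one."""
    return replace(tree, rng.choice(list(paths(tree))), random_tree(rng, 2))


def crossover(a, b, rng):
    """Crossover: graft a random subtree of b onto a random position in a."""
    return replace(a, rng.choice(list(paths(a))), get(b, rng.choice(list(paths(b)))))


def evolve(data, generations=30, pop_size=60, seed=0, max_size=31):
    rng = random.Random(seed)
    population = [random_tree(rng, 3) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda t: fitness(t, data))
        history.append(fitness(population[0], data))
        parents = population[: pop_size // 2]  # preferential selection
        children = [population[0]]             # keep the current best unchanged
        while len(children) < pop_size:
            child = crossover(rng.choice(parents), rng.choice(parents), rng)
            if rng.random() < 0.2:
                child = mutate(child, rng)
            if size(child) <= max_size:        # reject overgrown programs
                children.append(child)
        population = children
    best = min(population, key=lambda t: fitness(t, data))
    history.append(fitness(best, data))
    return best, history


data = [(x, x * x + x) for x in (-2.0, -1.0, 0.0, 1.0, 2.0)]  # hypothetical hidden law
best, history = evolve(data)
print(best, history[-1])
```

Because the best individual is always carried over, the recorded fitness can never get worse from one generation to the next; whether the hidden law is actually recovered depends on the random seed and the parameter choices.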
33.6 Discussion
The roots, nature, and psychology of scientific discovery (Sect. 33.1–33.3) provide a context for understanding the potential application of computational discovery systems, and their limitations. Section 33.4 has described examples of computational discovery systems in mathematics; these have characteristics, such as the use of logic, and of searching through large numbers of cases to find possible patterns, in common with scientific discovery systems.

The survey of scientific discovery methods in Sect. 33.5 has shown that computers can do scientific discovery, in the form of characterizing phenomena and generating scientific explanations. Yet the wholesale replacement of human scientists by computers is not on the horizon. Quite simply, computers are better suited to some aspects of the scientific discovery process than others.

What computers do particularly well is number crunching: carrying out large numbers of calculations and data operations, with great accuracy and at speed which exceeds human capability by several orders of magnitude. It follows that those areas where computational discovery methods have been most successful tend to be those for which number crunching can most readily be applied to discovery processes. The first significant discovery system, heuristic DENDRAL, illustrates this. The discovery of chemical topologies, and structures, involves consideration of a potentially enormous range of candidate solutions. Computational brute force enables a systematic search to be carried out, albeit within a limited parameter range. Rule-based methods for making the search methods efficient, and the facility to use specific knowledge to constrain the search, are important, but it is the number-crunching capacity of computers that made this approach feasible, even with the available computing power of the mid-1960s, when the system was first proposed. The later MECHEM system for finding chemical pathways similarly depends on number crunching to carry out a systematic search of possible pathways. This is not to suggest that humans use such forms of systematic search. As Giza [33.100] has argued, computational

successful, in that these methods have not discovered any new and important rules, generalizations, or laws: See Gillies [33.30] for a discussion of this claim. This does not detract from the possible use of such systems to shed light on understanding an important part of the basis for how humans make scientific discoveries. It does, however, suggest that systems that are primarily rule-based do not constitute the most practically important tools currently available for computational scientific discovery.

Classification, machine vision and data-mining methods provide important computational discovery tools, which fully utilize a computer’s capability to process large amounts of information [33.65, 73]. These methods essentially involve detecting or recognizing patterns, which is similar in principle to finding conjectures in mathematics [33.52]. Automatic determination of network relationships is likely to prove to be increasingly important [33.80], and may benefit from future advances in statistical modeling and optimization techniques, but human input is required for problems where a strong degree of judgment and prior knowledge is required to construct causal relationships [33.89]. In the specific field of determining phylogenetic relationships, using established maximum-likelihood methods, computational discovery has been very successful [33.91].

The use of evolutionary computational methods in scientific discovery is relatively recent, and shows promise. The application to psychological theories [33.69] opens up the possibility of automatically discovering useful models involving a sequence of simple processes. The approach, however, requires human judgment in specifying the characteristics of operators used to construct models, in the extraction of data from published results, and in the interpretation of results. Automation of experimental procedures will continue to be important in hostile environments [33.97], and may find further application for relatively predictable cycles of hypothesis proposal and testing [33.71].

For computers to be really effective at the kinds of discovery methods that humans use will require basic developments in strong artificial intelligence, that is,
scientific discovery systems can proceed in a radically computers which can replicate general human capac-
different manner from noncomputational methods, and ities. Is such a development likely? Penrose [33.101]
employ criteria for choosing candidate discoveries that has suggested that human consciousness may be de-
are different from those employed by human scientists. pendent on nonalgorithmic physical processes, and not
The systems that are primarily rule-based, representable by a computational algorithm, and that
such as BACON [33.20], GLAUBER [33.60] and this poses a serious barrier to strong AI; by way of
KEKADA [33.40], are those that come closest to example he draws on the way mathematical truth is
attempting to replicate how humans carry out scientific discovered, arguing that there is no general algorithm
discovery, at least to the extent that humans use induc- for determining the truth of a mathematical proposition.
tive rules. It has been argued that this approach is not Among the critics of this line of reasoning, however, is
Dennett [33.102], who suggests that people make use of reasoning methods that evolved for reasons related to survival, and that these may happen to work well for assessing mathematical propositions a high proportion of the time. Another possible barrier to strong AI is that inference depends on context [33.103, 104]: So far, humans have proved better than machines in being able to judge how to make good generalizations. Gillies [33.30] argues that humans have a political superiority to computers because computers are designed and built by humans in order to carry out human tasks; it follows that

“if a computer is designed to solve problems a, b, c, ..., it is likely to give rise to further problems x, y, z, ... which the computer system itself will not be able to solve, but which will need some human thinking for their resolution.”

In any event, the development of strong AI does not appear to be on the immediate horizon. There is no clear consensus on the prospects for strong AI in the medium and long-term future.

A recurring theme in this review has been that the application of computational scientific discovery systems requires human input – in areas such as judging putative causal relationships [33.89], judging the correct context for making generalizations [33.103, 104], and specifying operators used for constructing evolutionary computational models [33.69]. An important open question is the extent to which such aspects of human judgmental skill could eventually be automated, but there does not seem to be a strong prospect of substantial automation of this sort in the near future.

The immediate trend will be for more development of discovery systems oriented to the kind of things computers do well. This is the case in several modern applications of AI, where problems are solved using the computer’s ability to use large quantities of data, rather than in mimicry of how humans do the same task. For example, Google translate does not try to construct models of what a sentence means, but instead scans the internet for data on how each sentence may be translated. In this way, the problem can be solved with a computer, but without employing a method that is human-like. Similarly, it is likely that areas of scientific discovery will be achieved by computers working in computer-like ways, with humans providing added value in terms of synthesis or creative insights. This harnessing of the complementary qualities of humans and machines is likely to increase the rate at which scientific discoveries are made [33.105].

In conclusion, computational scientific discovery methods are an increasingly important tool in science. But the role of the human scientist remains, for the foreseeable future, essential.
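The generate-and-test pattern described in this section for heuristic DENDRAL (a systematic search over candidate solutions, pruned by domain knowledge) can be illustrated with a toy sketch. It is not DENDRAL's actual algorithm: the function names, the restriction to carbon, hydrogen and oxygen, the use of integer nominal masses, and the single pruning rule are all assumptions made for illustration.

```python
from itertools import product

# Nominal (integer) atomic masses. Restricting the search to C, H and O
# is an illustrative assumption, not a feature of DENDRAL.
MASS = {"C": 12, "H": 1, "O": 16}


def candidate_formulas(target_mass, max_atoms=20):
    """Brute-force step: enumerate every (C, H, O) count with the target mass."""
    for c, h, o in product(range(max_atoms + 1), repeat=3):
        if MASS["C"] * c + MASS["H"] * h + MASS["O"] * o == target_mass:
            yield (c, h, o)


def is_plausible(formula):
    """Knowledge-based constraint (assumed for this sketch): require at least
    one carbon, an even hydrogen count, and at most 2C + 2 hydrogens."""
    c, h, _ = formula
    return c >= 1 and h % 2 == 0 and h <= 2 * c + 2


def search(target_mass):
    """Generate all candidates, then keep those that pass the constraint."""
    return [f for f in candidate_formulas(target_mass) if is_plausible(f)]


if __name__ == "__main__":
    # Nominal mass 46: five compositions match the mass, two survive pruning.
    print(search(46))
```

For a nominal mass of 46, the enumeration yields five compositions and the constraint leaves (1, 2, 2) and (2, 6, 1), that is, CH2O2 and C2H6O. The exhaustive loop is the number-crunching part; the pruning rule stands in for the specific knowledge used to constrain the search.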
References
33.1 H.K. Versteeg, W. Malalasekera: An Introduction to Computational Fluid Dynamics: The Finite Volume Method (Pearson Education, Harlow 2007)
33.2 D.W. Heermann: Computer-Simulation Methods in Theoretical Physics (Springer, Berlin 1990)
33.3 J. Holland, K. Holyoak, R. Nisbett, P. Thagard: Induction: Processes of Inference, Learning, and Discovery (MIT, Cambridge 1986)
33.4 D.W. Stephens: Change, regularity, and value in the evolution of animal learning, Behav. Ecol. 2, 77–89 (1991)
33.5 S.M. Reader, K.N. Laland: Social intelligence, innovation, and enhanced brain size in primates, Proc. Nat. Acad. Sci. 99, 4436–4441 (2002)
33.6 N.J. Emery, N.S. Clayton: The mentality of crows: Convergent evolution of intelligence in corvids and apes, Science 306, 1903–1907 (2004)
33.7 A.M. Auersperg, B. Szabo, A.M. von Bayern, A. Kacelnik: Spontaneous innovation in tool manufacture and use in a Goffin’s cockatoo, Curr. Biol. 22, R903–R904 (2012)
33.8 M.H. Christiansen, S. Kirby: Language evolution: Consensus and controversies, Trends Cogn. Sci. 7, 300–307 (2003)
33.9 J. Diamond: Guns, Germs and Steel (Vintage, London 2005)
33.10 P. Curd: Presocratic philosophy. In: The Stanford Encyclopedia of Philosophy (Winter 2012 Edition), ed. by E.N. Zalta (2012) http://plato.stanford.edu/archives/win2012/entries/presocratics/
33.11 C. Shields: Aristotle. In: The Stanford Encyclopedia of Philosophy (Spring 2014 Edition), ed. by E.N. Zalta (2014) http://plato.stanford.edu/archives/spr2014/entries/aristotle/
33.12 J.R. Platt: Strong inference, Science 146, 347–353 (1964)
33.13 P.M.S. Blackett: Memories of Rutherford. In: Rutherford at Manchester, ed. by J.B. Birks (Heywood, London 1962) pp. 102–113
33.14 L.A. Geddes: Looking back how measuring electric current has improved through the ages, IEEE Potentials 15, 40–42 (1996)
33.15 W.L. Bragg: The diffraction of short electromagnetic waves by a crystal, Proc. Camb. Philos. Soc. 17, 43–57 (1913)
33.16 L. Pauling, R.B. Corey, H.R. Branson: The structure of proteins: Two hydrogen-bonded helical configurations of the polypeptide chain, Proc. Nat. Acad. Sci. 37, 205–211 (1951)
33.17 J.D. Watson, F.H. Crick: Molecular structure of nucleic acids, Nature 171, 737–738 (1953)
33.18 D.B. Kell, S.G. Oliver: Here is the evidence, now what is the hypothesis? The complementary roles of inductive and hypothesis driven science in the post genomic era, Bioessays 26, 99–105 (2004)
33.19 P. Langley: The computational support of scientific discovery, Int. J. Human-Comput. Stud. 53, 393–410 (2000)
33.20 P. Langley, H.A. Simon, G. Bradshaw, J.M. Zytkow: Scientific Discovery: Computational Explorations of the Creative Processes (MIT, Cambridge 1987)
33.21 A. Machado, F.J. Silva: Toward a richer view of the scientific method: The role of conceptual analysis, Am. Psychol. 62, 671–681 (2007)
33.22 R. Greenler: Rainbows, Halos, and Glories (Cambridge Univ. Press, Cambridge 1980)
33.23 W.G. Rees, C.M. Roach, C.H.F. Glover: Inversion of atmospheric refraction data, JOSA A 8, 330–338 (1991)
33.24 P.D. Sozou: Inversion of mirage data: An optimization approach, JOSA A 11, 125–134 (1994)
33.25 M. Bertero, T.A. Poggio, V. Torre: Ill-posed problems in early vision, Proc. IEEE 76, 869–889 (1988)
33.26 M.V. Afonso, J.M. Bioucas-Dias, M.A. Figueiredo: An augmented Lagrangian approach to the constrained optimization formulation of imaging inverse problems, IEEE Trans. Image Process. 20, 681–695 (2011)
33.27 H.Y. Li, C.Y. Yang: A genetic algorithm for inverse radiation problems, Int. J. Heat Mass Transf. 40, 1545–1549 (1997)
33.28 C.L. Karr, I. Yakushin, K. Nicolosi: Solving inverse initial-value, boundary-value problems via genetic algorithm, Eng. App. Artif. Intell. 13, 625–633 (2000)
33.29 D.K. Karpouzos, F. Delay, K.L. Katsifarakis, G.D. Marsily: A multipopulation genetic algorithm to solve the inverse problem in hydrogeology, Water Resour. Res. 37, 2291–2302 (2001)
33.30 D. Gillies: Artificial Intelligence and Scientific Method (Oxford Univ. Press, Oxford 1996)
33.31 F. Bacon: Novum Organum (Open Court, Chicago 1994), ed. by P. Urbach, J. Gibson, originally published in 1620
33.32 K.R. Popper: Conjectures and Refutations: The Growth of Scientific Knowledge (Routledge and Kegan Paul, London 1963)
33.33 K.R. Popper: The Logic of Scientific Discovery (Unwin Hyman, London 1990), 14th impression
33.34 D. Campbell: Blind variation and selective retention in creative thought as in other knowledge processes, Psychol. Rev. 67, 380–400 (1960)
33.35 D. Simonton: Origins of Genius (Oxford Univ. Press, Oxford 1999)
33.36 D. Klahr, K. Dunbar: Dual space search during scientific reasoning, Cogn. Sci. 12, 1–48 (1988)
33.37 K. Dunbar, J. Fugelsang: Scientific thinking and reasoning. In: The Cambridge Handbook of Thinking and Reasoning, ed. by K.J. Holyoak, R.G. Morrison (Cambridge Univ. Press, Cambridge 2005) pp. 705–725
33.38 Y. Qin, H.A. Simon: Laboratory replication of scientific discovery processes, Cogn. Sci. 14, 281–312 (1990)
33.39 P. Langley, G. Bradshaw, H.A. Simon: BACON 5: The discovery of conservation laws, Proc. 7th Int. Jt. Conf. Artif. Intell., Br. Columbia (AAAI, Palo Alto 1981) pp. 121–126
33.40 D. Kulkarni, H.A. Simon: The processes of scientific discovery: The strategy of experimentation, Cogn. Sci. 12, 139–175 (1988)
33.41 H.A. Krebs: The discovery of the ornithine cycle of urea synthesis, Biochem. Educ. 1, 19–23 (1973)
33.42 M.A. Boden: Creativity and artificial intelligence, Artif. Intell. 103, 347–356 (1998)
33.43 P. Thagard, K.J. Holyoak, G. Nelson, D. Gochfeld: Analog retrieval by constraint satisfaction, Artif. Intell. 46, 259–310 (1990)
33.44 E.P. Wigner: The unreasonable effectiveness of mathematics in the natural sciences. Richard Courant lecture in mathematical sciences delivered at New York University, May 11, 1959, Commun. Pure Appl. Math. 13, 1–14 (1960)
33.45 S. Colton: Computational discovery in pure mathematics. In: Computational Discovery of Scientific Knowledge, Lecture Notes in Computer Science, Vol. 4660, ed. by S. Džeroski, L. Todorovski (Springer, Berlin Heidelberg 2007) pp. 175–201
33.46 S. Colton, A. Bundy, T. Walsh: On the notion of interestingness in automated mathematical discovery, Int. J. Human-Comput. Stud. 53, 351–375 (2000)
33.47 C.E. Larson: A survey of research in automated mathematical conjecture-making, DIMACS Ser. Discrete Math. Theor. Comput. Sci. 69, 297–318 (2005)
33.48 A. Newell, J.C. Shaw, H.A. Simon: Elements of a theory of human problem solving, Psychol. Rev. 65, 151–166 (1958)
33.49 A.N. Whitehead, B. Russell: Principia Mathematica, Vol. 1 (Cambridge Univ. Press, Cambridge 1910)
33.50 F. Gobet, P.C.R. Lane: Human problem solving: Beyond Newell et al.’s (1958) elements of a theory of human problem solving. In: Cognitive Psychology: Revisiting the Classic Studies, ed. by D. Groome, M.W. Eysenck (Sage, Thousand Oaks 2015) pp. 133–145
33.51 H. Wang: Toward mechanical mathematics. In: Automation of Reasoning: Classical Papers on Computational Logic 1957–1966, ed. by J. Siekmann, G. Wrightson (Springer, Berlin 1983) pp. 244–264
33.52 D.B. Lenat: AM: An Artificial Intelligence Approach to Discovery in Mathematics as Heuristic Search (Dept. Computer Science, Stanford Univ., Stanford 1976)
33.53 R. Davis, D.B. Lenat: Knowledge-Based Systems in Artificial Intelligence (McGraw-Hill, New York 1982)
33.54 G.D. Ritchie, F.K. Hanna: AM: A case study in AI methodology, Artif. Intell. 23, 249–268 (1984)
33.55 D.B. Lenat, J.S. Brown: Why AM and EURISKO appear to work, Artif. Intell. 23, 269–294 (1984)
33.56 S. Fajtlowicz: On conjectures of Graffiti, Ann. Discrete Math. 38, 113–118 (1988)
33.57 E. Delavina: Bibliography on conjectures, methods and applications of Graffiti (2016), http://cms.dt.uh.edu/faculty/delavinae/research/wowref.htm
33.58 R.K. Lindsay, B.G. Buchanan, E.A. Feigenbaum, J. Lederberg: DENDRAL: A case study of the first expert system for scientific hypothesis formation, Artif. Intell. 61, 209–261 (1993)
33.59 R.E. Valdes-Perez: Theory-driven discovery of reaction pathways in the MECHEM system, Proc. 10th Natl. Conf. Artif. Intell., San Jose (AAAI, Palo Alto 1992) pp. 63–69
33.60 J.M. Zytkow, H.A. Simon: Normative systems of discovery and logic of search, Synthese 74, 65–90 (1988)
33.61 S. Muggleton, R.D. King, J.E. Sternberg: Protein secondary structure prediction using logic-based machine learning, Protein Eng. 5, 647–657 (1992)
33.62 M.C. Storrie-Lombardi, O. Lahav, L. Sodre, L.J. Storrie-Lombardi: Morphological classification of galaxies by artificial neural networks, Mon. Not. R. Astron. Soc. 259, 8P–12P (1992)
33.63 L. Shamir: Automatic morphological classification of galaxy images, Mon. Not. R. Astron. Soc. 399, 1367–1372 (2009)
33.64 N. Tiffin, J.F. Kelso, A.R. Powell, H. Pan, V.B. Bajic, W.A. Hide: Integration of text- and data-mining using ontologies successfully selects disease gene candidates, Nucleic Acids Res. 33, 1544–1552 (2005)
33.65 R.D. King, A. Srinivasan, L. Dehaspe: Warmr: A data mining tool for chemical data, J. Comput.-Aided Mol. Des. 15, 173–181 (2001)
33.66 Z. Bar-Joseph, G.K. Gerber, T.I. Lee, N.J. Rinaldi, J.Y. Yoo, F. Robert, D.K. Gifford: Computational discovery of gene modules and regulatory networks, Nat. Biotechnol. 21, 1337–1342 (2003)
33.67 P. Spirtes, C. Glymour: An algorithm for fast recovery of sparse causal graphs, Soc. Sci. Comput. Rev. 9, 62–72 (1991)
33.68 S. Guindon, O. Gascuel: A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood, Syst. Biol. 52, 696–704 (2003)
33.69 E. Frias-Martinez, F. Gobet: Automatic generation of cognitive theories using genetic programming, Minds Mach. 17, 287–309 (2007)
33.70 M. Schmidt, H. Lipson: Distilling free-form natural laws from experimental data, Science 324, 81–85 (2009)
33.71 R.D. King, J. Rowland, S.G. Oliver, M. Young, W. Aubrey, E. Byrne, M.L. Kata, K. Karkham, P. Pir, L.N. Soldatova, A. Sparkes, K.E. Whelan, A. Care: The automation of science, Science 324, 85–89 (2009)
33.72 B.G. Buchanan, E.A. Feigenbaum: DENDRAL and Meta-DENDRAL: Their applications dimension, Artif. Intell. 11, 5–24 (1978)
33.73 N. Orlov, L. Shamir, T. Macura, J. Johnston, D.M. Eckley, I.G. Goldberg: WND-CHARM: Multi-purpose image classification using compound image transforms, Pattern Recognit. Lett. 29, 1684–1693 (2008)
33.74 U. Fayyad, G. Piatetsky-Shapiro, P. Smyth: From data mining to knowledge discovery in databases, AI Magazine 17, 37 (1996)
33.75 X. Wu, V. Kumar, J.R. Quinlan, J. Ghosh, Q. Yang, H. Motoda, D. Steinberg: Top 10 algorithms in data mining, Knowl. Inf. Sys. 14, 1–37 (2008)
33.76 L. Hirschman, J.C. Park, J. Tsujii, L. Wong, C.H. Wu: Accomplishments and challenges in literature data mining for biology, Bioinformatics 18, 1553–1561 (2002)
33.77 C. Perez-Iratxeta, P. Bork, M.A. Andrade: Association of genes to genetically inherited diseases using data mining, Nat. Genet. 31, 316–319 (2002)
33.78 N.M. Ball, R.J. Brunner: Data mining and machine learning in astronomy, Int. J. Mod. Phys. D 19, 1049–1106 (2010)
33.79 H. Kitano: Systems biology: A brief overview, Science 295, 1662–1664 (2002)
33.80 M. Hecker, S. Lambeck, S. Toepfer, E. Van Someren, R. Guthke: Gene regulatory network inference: Data integration in dynamic models–a review, Biosys. 96, 86–103 (2009)
33.81 V. Spirin, L.A. Mirny: Protein complexes and functional modules in molecular networks, Proc. Natl. Acad. Sci. 100, 12123–12128 (2003)
33.82 R.E. Ulanowicz: Quantitative methods for ecological network analysis, Comput. Biol. Chem. 28, 321–339 (2004)
33.83 P. Kavanagh, N. Newlands, V. Christensen, D. Pauly: Automated parameter optimization for ecopath ecosystem models, Ecolo. Model. 172, 141–149 (2004)
33.84 S.P. Borgatti: Identifying sets of key players in a social network, Comput. Math. Organ. Theory 12, 21–34 (2006)
33.85 Z. Ghahramani: An introduction to hidden Markov models and Bayesian networks, Int. J. Pattern Recog. Artif. Intell. 15, 9–42 (2001)
33.86 D. Haughton, A. Kamis, P.A. Scholten: A review of three directed acyclic graphs software packages: MIM, tetrad, and WinMine, Am. Stat. 60, 272–286 (2006)
33.87 D.M. Hausman, J. Woodward: Independence, invariance and the causal Markov condition, Br. J. Phil. Sci. 50, 521–583 (1999)
33.88 C. Glymour: Learning, prediction and causal Bayes nets, Trends Cogn. Sci. 7, 43–48 (2003)
33.89 N. Cartwright: Causation: One word, many things, Phil. Sci. 71, 805–820 (2004)
33.90 J.P. Huelsenbeck, F. Ronquist, R. Nielsen, J.P. Bollback: Bayesian inference of phylogeny and its impact on evolutionary biology, Science 294, 2310–2314 (2001)
33.91 Z. Yang: PAML 4: Phylogenetic analysis by maximum likelihood, Mol. Biol. Evol. 24, 1586–1591 (2007)
33.92 J. Koza: Genetic Programming: On the Programming of Computers by Means of Natural Selection, Vol. 1 (MIT, Cambridge Massachusetts 1992)
33.93 R. Poli, W. Langdon, N. McPhee: A field guide to genetic programming, http://www.gp-field-guide.org.uk (2008)
33.94 P.C.R. Lane, P.D. Sozou, M. Addis, F. Gobet: Evolving process-based models from psychological data using genetic programming. In: AISB50: Selected Papers, ed. by M. Bishop, K. Devlin, Y. Erden, R. Kibble, S. McGregor, M. Majid al-Rifaie, A. Martin, M. Figueroa, S. Rainey (AISB, London 2015) pp. 144–149
33.95 L. Chao, J. Haxby, A. Martin: Attribute-based neural substrates in temporal cortex for perceiving and knowing about objects, Nat. Neurosci. 2, 913–919 (1999)
33.96 A. Sparkes, W. Aubrey, E. Byrne, A. Clare, M.N. Khan, M. Liakata, R.D. King: Towards robot scientists for autonomous scientific discovery, Autom. Exp. 2, 1 (2010)
33.97 J.G. Bellingham, K. Rajan: Robotics in remote and hostile environments, Science 318, 1098–1102 (2007)
33.98 I. Vasilescu, K. Kotay, D. Rus, M. Dunbabin, P. Corke: Data collection, storage, and retrieval with an underwater sensor network, Proc. 3rd Int. Conf. Embed. Networked Sens. Syst. (2005) pp. 154–165
33.99 J. Schwendner, F. Kirchner: Space Robotics: An overview of challenges, applications and technologies, KI-Künstliche Intell. 28, 71–76 (2014)
33.100 P. Giza: Automated discovery systems and scientific realism, Minds Mach. 12, 105–117 (2002)
33.101 R. Penrose: The Emperor’s New Mind: Concerning Computers, Minds and the Laws of Physics (Oxford Univ. Press, Oxford 1989)
33.102 D.C. Dennett: Betting your life on an algorithm, Behav. Brain Sci. 13, 660–661 (1990)
33.103 G. Gigerenzer: Strong AI and the problem of second order algorithms, Behav. Brain Sci. 13, 663–664 (1990)
33.104 M. Addis, P.D. Sozou, P.C. Lane, F. Gobet: Computational scientific discovery and cognitive science theories, Proc. IACAP, ed. by V. Müller (Springer, Heidelberg 2016)
33.105 C. Glymour: The automation of discovery, Daedalus 133, 69–77 (2004)
34. Computer Simulations and Computational Models in Science
Cyrille Imbert
cepted across scientific fields as legitimate ways of reaching results (Sect. 34.1). Also, a great variety of computational models and computer simulations can be met across science, in terms of the types of computers, computations, computational models, or physical models involved and they can be used for various types of inquiries and in different scientific contexts (Sect. 34.2). For this reason, epistemological analyses of computer simulations are contextual for a great part. Still, computer simulations raise general questions regarding how their results are justified, how computational models are selected, which type of knowledge is thereby produced (Sect. 34.3), or how computational accounts of phenomena partly challenge traditional expectations regarding the explanation and understanding of natural systems (Sect. 34.4). Computer simulations also share various epistemological features with experiments and thought experiments; hence, the need for transversal analyses of these activities (Sect. 34.5). Finally, providing a satisfactory and fruitful definition of computer simulations turns out to be more difficult than expected, partly because this notion is at the crossroads of difficult questions like the nature of representation and computation or the success of scientific inquiries (Sect. 34.6). Overall, a pointed analysis of computer simulations in parallel requires developing insights about the evolving place of human capacities and humans within (computational) science (Sect. 34.7).

34.1 Computer Simulations in Perspective
34.1.1 The Recent Philosophy of Scientific Models and Computer Simulations
34.1.2 Numerical Methods and Computational Science: An Old Tradition
34.1.3 A More or Less Recent Adoption Across Scientific Fields
and Their Specificities
34.2.3 Digital Machines, Numerical Physics, and Types of Equivalence
34.2.4 Non-Numerical Digital Models
34.2.5 Nondeterministic Simulations
34.2.6 Other Types of Computer Simulations
34.3 Epistemology of Computational Models and Computer Simulations
34.3.1 Computer Simulations and Their Scientific Roles
34.3.2 Aspects of the Epistemological Analysis of Computer Simulations
34.3.3 Selecting Computational Models and Practices
34.3.4 The Production of ‘New’ Knowledge: In What Sense?
34.4 Computer Simulations, Explanation, and Understanding
34.4.1 Traditional Accounts of Explanation
34.4.2 Computer Simulations: Intrinsically Unexplanatory?
34.4.3 Computer Simulations: More Frequently Unexplanatory?
34.4.4 Too Replete to Be Explanatory? The Era of Lurking Suspicion
34.4.5 Bypassing the Opacity of Simulations
34.4.6 Understanding and Disciplinary Norms
34.5 Comparing: Computer Simulations, Experiments and Thought Experiments
34.5.1 Computational Mathematics and the Experimental Stance
34.5.2 Common Basal Features
34.5.3 Are Computer Simulations Experiments?
34.5.4 Knowledge Production, Superiority Claims, and Empiricism
34.5.5 The Epistemological Challenge of Hybrid Methods
For several decades, much of science has been computational, that is, scientific activity where computers play a central and essential role. Still, computational science is larger than the set of activities resorting to computer simulations. For example, experimental science, from vast experiments in nuclear physics at the European Organization for Nuclear Research (CERN) to computational genomics, relies heavily on computers and computational models for data acquisition and their treatment, but does not seem to involve computer simulations proper. In any case, there is a great and still proliferating variety of types of computer simulations, which are used for different types of inquiries and in different types of theoretical contexts. For this reason, one should be careful when describing the philosophy of computer simulations and nonjustified generalizations should be avoided. At the same time, how much the development of computer simulations has been changing science is a legitimate question. Computer simulations raise questions about the traditional conceptualization of science in terms of experiments, theories and models, about the ways that usual scientific activities like predicting, theorizing, controlling, or explaining are carried out with the help of these new tools and, more generally, how the production of scientific knowledge by human creatures is modified by computer simulations. Importantly, while the specific philosophical analysis of computer simulations is recent (even if it was preceded by the development of the philosophical study of scientific models) and computational science is a few decades old, the development of computational tools and mathematical techniques aimed at bypassing the complexity of problems belongs to a much older tradition. This means that claims about how much computer simulations change science, and how much a closer attention to computer simulations should change our picture of scientific activity, are questions to be treated with circumspection.
tational models, and digital simulations of empirical systems as such did not really start until the early 1990s, with articles by Humphreys [34.7, 8], Rohrlich [34.9] or Hartmann [34.10]. (Such a description of the field is necessarily unfair to earlier works about the use of computer simulations in the empirical sciences. Particular mention should be given to the works of Bunge [34.11] or Simon [34.12].) An article by Hughes about the investigations of the Ising model [34.13], a special issue of Science in Context edited by Sismondo [34.14] and works by Winsberg [34.15–17], who completed his Ph.D. in 1999 about computer simulations, also contributed to the development of this field. Finally, in 2006, the Models and Simulations Conference took place, which was the first of what was to become a still active conference series, which has contributed to making the issue of computational science one of the fields of philosophy of science.

Philosophical works about scientific models, a very close field, were not significantly older. The importance of the notion of set-theoretic model had been emphasized by partisans of the model-theoretic view of theories in the 1970s, but, if one puts aside works by pioneers like Black [34.18] or Hesse [34.19], this did not launch investigations about scientific models proper. Overall, the intense epistemological study of models did not start until the 1980s, in particular with a seminal article by Redhead about scientific models in physics [34.20]. Members of the Stanford School also argued against the view that science was unified and that theories played a dominant role in scientific activities such as the selection and construction of models [34.21], and conversely emphasized the autonomy of experimental and modeling practices. This context was appropriate for an independent investigation about the role of models in science, which bloomed at the end of the 1990s [34.22] and was further fed by a renewal of interest in the question of scientific representation [34.23–25]. These investigations of models paved the way for new studies focused neither on theories nor on experiments. However, while the difficulty of exploring a model was already acknowledged in works by Redhead and Cartwright, interest in the actual modes of its exploration, in particular by computer simulations, was not triggered. Indeed, the focus remained on the effects of the complexity of the inquiry on scientific representations, with studies about simplifications, approximations, or idealizations (even Laymon’s 1990 paper [34.26], in spite of its apparent focus on computer simulation, mainly deals with the nature of approximation and what it is to accept or believe a theory), or how to articulate the model-theoretic view of theories and the uses of models and representations in actual scientific practices, by taking into account scientific users, qua intentional cognitive creatures [34.27, 28], and their cognitively constrained ways to handle models by means of inferences, graphs, pictures or diagrams (Kulvicki [34.29], Giardino Chap. 22, this volume; Bechtel Chap. 27, this volume). Overall, in spite of the close connection within scientific practice between the uses of models and their computational explorations, the issue of computational models and computer simulations was not seen clearly as a fruitful field of inquiry of its own, this trend of thought being explicitly and vividly brought to the fore in 2008 in a deliberately provocative paper by Frigg and Reiss [34.30].

34.1.2 Numerical Methods and Computational Science: An Old Tradition

The second relevant chronology is that of the advancement in attempts to solve complex mathematical problems by developing computing machines and mathematical methods. Importantly, while the development of digital computers in the mid-twentieth century changed the face of scientific computation, humans did not wait for this decisive breakthrough to extend their mathematical and computational powers. Further, as Mahoney wrote, “the computer is not one thing, but many different things, and the same holds true of computing” [34.31], and it is only in the twentieth century that different historical strands related to logic, mathematics, or technologies came together. On the one hand, early mathematical procedures, like Newton’s method to find the roots of real-valued functions, or Euler’s method to solve ordinary differential equations, were developed to provide numerical approximations for problems in numerical analysis. This field was already important to investigate physical systems but, with the advent of digital computers, it became a crucial part of (computational) science. On the other hand, mechanical calculating tools, such as abacuses or slide rules, were used from antiquity through the centuries. The invention by Pascal of a device (the Pascaline) to perform additions and subtractions, and the conceptualization by Babbage of mechanical computing systems fed by punched cards, were important additional steps. Human computers were also used. For example, in 1758, Clairaut predicted the return of Halley’s comet, by dividing the computational work with other colleagues [34.32]. Gaspard de Prony produced the logarithm and trigonometric tables in the early nineteenth century by dividing the computational tasks into elementary operations, which were carried out by unemployed hairdressers with little education. Human computers were used during World
738 Part G Modelling and Computational Issues
War I to compute artillery tables and World War II to help with the Manhattan project [34.33, 34]. Finally,
mechanical analog computers were developed for scientific purposes by engineers and scientists like
Thomson or Kelvin, in the late nineteenth century, Vannevar Bush, between the two World Wars, or Enrico
Fermi, in 1947, and such computers were used till the 1960s. Finally, even in the digital era, new
technological change can have a large impact. For decades, access to computational resources was difficult
and only possible in the framework of big projects. Typically, Schelling’s first simulations of residential
segregation [34.35] were hand made. An important recent step has been the development of personal
computers, which has brought more flexibility and may have triggered the development of new modeling
practices [34.36].

34.1.3 A More or Less Recent Adoption Across Scientific Fields

The development of computational science and the use of computational models and simulation methods vary
from one field to another. From the 1940s onward, computer simulations have been used in physics, and
computers were also used in artificial intelligence as early as the late 1950s. However, some fields have
resisted such methods, and still do, as far as commonly accepted mainstream methods are concerned. Typically,
the development of computational models and computer simulations in the human and social sciences,
with the possibility of analyzing diachronic interactions between agents (versus static models describing
equilibria) is much more recent. As emphasized earlier, Schelling’s initial dynamic model of segregation
was first run manually in 1969. Attempts to use computational science to predict social and economic
behavior were globally met with suspicion in the 1960s and 1970s, all the more since these studies were
often carried out by scholars who did not belong to well-entrenched traditions in these fields (such as
scientists studying complexity, including human behavior, in institutions like the Santa Fe Institute).
Overall, in economics, computer simulations are still not accepted [34.37]. Similarly, the development of
a specific (and still somewhat distinct) subfield using computational methods to analyze social phenomena
is recent, with the edition by Hegselmann et al. of the volume Modelling and Simulation in the Social
Sciences from the Philosophy of Science Point of View [34.38], the need felt to create, in 1998, the Journal
of Artificial Societies and Social Simulation and the publication in 2005 of the handbook Simulation for
the Social Scientist by Gilbert and Troitzsch [34.39].

34.1.4 Methodological Caveat

These different chronological perspectives call for the following comments.

First, philosophers should be careful when developing an epistemology of computational models and
computer simulations. Modeling and simulating practices have been developed in various epistemic contexts
in scientific fields in which well-entrenched theories are more or less present and which have different
methodological and scientific norms. Thus, the role of computer simulations and their epistemological
assessment can significantly differ from one case to another, and bold generalizations should be carefully
justified or avoided. As just mentioned, the use of computer simulations is central and accepted in fields
like climate science (even if it raises important problems) but is still regarded with great suspicion in
fields like economics [34.37, 40].

Second, how much computational models and computer simulations correspond to epistemologically different
practices, which should be described in terms of some computational turn, cannot be assumed, but
should be investigated on a case-by-case basis regarding all potentially relevant aspects. This can be
illustrated with the question of the tractability of scientific models. Humphreys, in his 2004 book
Extending Ourselves, proposes the following two principles to analyze science: “It is the invention and
deployment of tractable mathematics that drives much progress in the physical sciences”; and its converse
version: “most scientific models are specifically tailored to fit, and hence are constrained by, the
available mathematics” [34.41, pp. 55–56]. These two principles suggest both a continuist and discontinuist
reading of the development of science. First, students of science need to assess which precise aspects of
scientific practices have been changed by the development of computers and whether these changes should be
seen as a scientific revolution, or simply as an extension of existing modes of reasoning [34.42]. In this
perspective, questions about the tractability and complexity of models can no longer be ignored, and may be
crucial to an understanding of how new branches of modeling and computational practices can develop and of
how the dynamics of science can be qualitatively different [34.43]. At the same time, scientific practices
were also constrained by the available mathematics before the advent of computers, and new findings in
mathematics already paved the way for the development of new scientific practices. For example,
Lakatos emphasizes that [34.44, p. 137]

“the greatness of the Newtonian programme comes partly from the development – by Newtonians of
Computer Simulations and Computational Models in Science 34.2 The Variety of Computer Simulations and Computational Models 739
classical infinitesimal analysis which was a crucial precondition for its success.”

From this point of view, a continuist reading is also required.

Third, the computational perspective may require partly revising the philosophical treatment of questions
about science, and scientific representation in particular. Computer simulations are actual activities of
investigation of scientific models, and, for this reason, the tractability and computational constraints
that they face can hardly be ignored when analyzing them. They force us to adopt an in-practice perspective,
where what matters is not the logical content of representations (that is, the information which agents can
access in principle, with unlimited resources), but the results and conclusions which agents can in practice
reach with the inferential resources they have [34.41, §5.5]. By contrast, traditional analyses of scientific
models adopt an in-principle stance: the question of their exploration and of the tractability of the
methods used to explore them is one question among others, and is implicitly idealized away when discussing
other issues. This implies surreptitiously smuggling in the unjustified claim that the distinction between
what is possible in principle and what is possible in practice can be ignored for the investigation of these
other issues, which may sometimes be controversial.

At the same time, philosophers of science draw their examples from the scientific literature, which, by
definition, presents successful investigations of models which must have been found to be, one way or
another, tractable enough regarding the inquiries pursued. In brief, discussions about the scientific models
which are found in scientific practices ipso facto concern computationally tractable models, or models
having computationally tractable versions.

How much these remarks imply that existing analyses about scientific models have been discreetly
skewed, or on the contrary that the constraints of tractability have already been taken into account, needs
to be ascertained, and the answer may be different depending on the question investigated. For example, for
decades the question of the relations between fields has mainly been treated in terms of relations between
theories. While this perspective is in part legitimate, recent investigations suggest that tractable models
may also be a relevant unit to analyze scientific theoretical, methodological or practical transfers between
fields [34.41, §3.3], [34.45, 46]. In any case, when discussing questions related to scientific
representation, explanation, or confirmation, philosophers of science must bear in mind that answers may
sometimes differ for the models that scientists work with daily (and which more and more require computers
to be investigated), and for simple analytically solvable models, which philosophers more naturally focus
upon, and which may have a specific scientific status regarding the construction of knowledge and the
development of families of models in each field.
trajectories of S in the state space of a computational model of S (working characterization).

This characterization is not meant as a full-blown definition (Sect. 34.6) but as a synthetic presentation
of important features of computer simulations.

First, it emphasizes that the development of computers is a central step in the recent evolution of science,
which was made possible by steady conceptual and technical progress in the twentieth century. It can
therefore be expected that computational aspects are often, though not necessarily always, central for the
epistemological analysis of computational science and computer simulations (Sect. 34.3). Second, the
working definition is meant to emphasize that all uses of

operations upon continuous physical signals. Thus, operations that would be difficult to program on a digital
computer are immediately possible on an analog machine. The specificity of analog machines is that they
contain physical elements whose dynamics decisively contribute to the dynamic instantiation of these
mathematical operations. For a machine to be used as an analog computer, its physical dynamics must be
explicitly known and completely under control so that there is no uncertainty about the operations which are
carried out. While systems like wind tunnels cannot be made to compute several different dynamics, mechanical
analog computers like the differential analyzer and electrical analog computers can be used as
general-purpose com-
dimensionless quantities [34.55, Chap. 8]. While the need to describe systems in terms of dimensionless
quantities is a general one in the empirical sciences [34.56–58], and is also crucial for digital
simulations, here it is specifically important to understand the type of reasoning involved in analog
simulations. Indeed, the physical description of the simulating and simulated systems matters only insofar
as one needs to justify that they instantiate a common dimensionless dynamical structure. In brief, such
analogical reasoning does not involve any direct comparison between the physical material properties of the
simulating and simulated systems: the mathematical structure mediates the comparison. In other words, even
with analog simulations, an analysis of the similarities of the two systems is irrelevant once one knows
which analog computation is being carried out by both systems.

34.2.3 Digital Machines, Numerical Physics, and Types of Equivalence

In digital machines, information is processed discretely, coded in binary digits (1 or 0), and stored in
transistors. Computations involve the transition between computational states. These transitions are
described in terms of logical rules between the states. If these rules can be described in a general form,
they may be described in terms of equations involving variables. Digital computers can have various types of
architecture with different computational performances. Traditionally, software was written for sequential
computation, in which one instruction is executed at a time. In contrast, modern supercomputers are designed
to solve tasks in parallel, and parallelism can be supported at different levels of architecture, which
often implies the need to adapt algorithms, if not models, to parallel computation [34.59].

Digital machines can be used to develop different types of computer simulations. Much computational
science is numerical: binary sequences code for numbers and computers carry out numerical operations on
these numbers by processing the binary strings. Since computers can only devote limited memory to
representing numbers (e.g., with floating-point representation), numerical science usually involves
numerical approximations. In other words, computer simulations do not provide exact solutions to equations,
even if the notion of an exact solution is not as straightforward as philosophers usually take it to
be [34.60].

Different types of equivalence between computations, and, by extension, computer simulations, should be
distinguished beyond equivalence at the bit level [34.61]. Logical and mathematical expressions and
algorithms can be mathematically equivalent when they refer to, or compute, the same mathematical object
or some of its properties. Because of floating-point representation, round-off errors cannot be avoided in
simulations. When algorithms result in small cumulative errors, they are stable and two such stable
algorithms may be considered as numerically equivalent, although they need not be computationally equivalent
in terms of their computational efficiency. Finally, based on the type of inquiry pursued, wider notions of
representational equivalence can be defined at the computational model or computer simulation level.
Typically, two computations yielding the same average quantity, or describing the same topology of a
trajectory, may be considered as equivalent. Overall, this shows that analyses of the failures and
predictive or explanatory successes of computer simulations must often be rooted in the technical details of
computational practices [34.62]. From this point of view, an important part of computational science can be
seen as the continuation of the numerical analysis tradition presented in Sect. 34.1.2.

34.2.4 Non-Numerical Digital Models

A large part of science gives a central role to scientific theories couched in terms of differential
equations relating continuous functions with their derivatives. For this reason, much of computational
science is based on finite-difference equations aimed at finding approximate numerical solutions to
differential equations. However, this theory- and equation-oriented picture does not exhaust actual practices
in computational science. First, computer simulations can be carried out in the absence of theories, which
turns out to be a problem when it comes to the explanatory value of computer simulations (Sect. 34.4).
Second, even when equation-based theories exist, computational models are not necessarily completely
determined by these theories and by mathematical results describing how to discretize equations
appropriately (Sect. 34.3.2). Finally, even when well-entrenched, equation-based theories exist, digital,
but non-numerical, computer simulations can be developed. This perspective was advocated in the 1980s
by computer scientists like Fredkin, Toffoli, or Margolus. Building on the idea previously expressed by
Feynman, that maybe “nature, at some extremely microscopic scale, operates exactly like discrete computer
logic” [34.63], they wanted to develop “a less roundabout way to make nature model itself” [34.64, p. 121]
than the use of computers to obtain approximate numerical solutions of equations. The idea was to represent
physical processes more directly by means of physically minded models, with interactions on a spatial
lattice providing an emulation “of the spatial locality of physical law” [34.65] and to use exact models
obeying discrete symbolic dynamics to dispense with numerical approximations. In practice, this resulted in
the renewed development of cellular automata (hereafter CA) studies and their use for empirical
investigations. A CA involves cells in a specific geometry; each cell can be in one of a finite set of
states, and evolves following a synchronous local update rule based on the states of the neighboring cells.
The field of CA was pioneered in the 1940s by Ulam’s works on lattice networks and von Neumann’s works on
self-replicating automata. It was shown over the decades that such models, though apparently
over-simplistic, can not only be successfully used in fields as different as the social sciences [34.66] and
artificial life [34.67], but also physics, in which they were shown in the late 1970s and 1980s to
be a mesoscopic alternative to the macroscopic Navier–Stokes equations [34.68].

34.2.5 Nondeterministic Simulations

Another important distinction is between deterministic and nondeterministic algorithms. From the outset,
computers were used to execute nondeterministic algorithms, which may behave differently for different runs.
Nondeterministic simulations involve using random number generators, which can be based on random
signals produced by random physical processes, or on algorithms producing pseudorandom numbers with
good randomness properties. Overall, the treatment of randomness in computer simulations is a tricky issue
since generating truly random signals, with no spurious regularities which may spoil the results by
introducing unwanted patterns, turns out to be difficult.

Monte Carlo methods, also called Monte Carlo experiments, are a widely used type of nondeterministic
simulations. They were central to the Manhattan project, which led to the production of the first nuclear
bombs and contributed heavily to the development of computer simulations. They can be used for various
purposes such as the calculation of mathematical quantities like Pi or the assessment of average quantities
in statistical physics by appropriately sampling some interval or region of a state space. These practices
are hard to classify and, depending on the case, seem to correspond to computational methods, experiments,
or full-blown computer simulations. Metropolis and Ulam [34.69] is a seminal work; Galison [34.70, 71]
offers historical studies, and Humphreys [34.8] and Beisbart and Norton [34.72] offer epistemological
analyses.

34.2.6 Other Types of Computer Simulations

It is difficult to do justice to all the kinds of simulations that are seen in scientific practice. New
computational methods are regularly invented, and these often challenge previous attempts to provide rational
typologies. Further, the features presented in the previous sections are often mixed in complex ways. For
example, CA-based methods in fluid dynamics, which were not originally supposed to involve numbers or
offer exact computations, were finally turned into lattice Boltzmann methods, which involve making local
averages [34.73]. Here, I shall merely present types of computer simulations that are widely discussed in
the philosophical literature.

Agent-Based Methods

Agent-based methods involve the microlevel description of agents and their local interactions (in contrast
to global descriptions like balance or equilibrium equations), and provide tools to analyze the microscopic
generation of phenomena. They are often opposed to equation-based approaches, but the distinction is not
completely sharp, since equations do not need to describe global behaviors and, when discretized, often
yield local update rules. Agent-based models and simulations are used across fields to analyze artificial,
social, biological, etc., agents. CA models like the Schelling model of segregation can be seen as
agent-based models even though most such agent-based models also involve numbers in the descriptions of
local interactions. Because they rely on microscopic descriptions, agent-based simulations are often at the
core of debates about issues such as emergence [34.74], explanation [34.75], or methodological individualism
in science [34.76].

Coupled and Multiscale Models

Extremely elaborate computational models, developed and studied by large numbers of scientists, are
increasingly used to investigate complex systems such as Earth’s atmosphere, be it for the purpose of precise
predictions and weather forecasting or for the analysis of larger, less precise trends in climate studies.
While in fluid dynamics, it is sometimes possible to carry out direct simulations, where the whole range of
spatial and temporal scales from the dissipation to the integral scale are represented [34.77, Chap. 9],
such methods are too costly for atmosphere simulations, in which subgrid models of turbulence or cloud
formation need to be included (see Edwards [34.78] and Heymann [34.79] for accessible and clear
introductions). Also, different models sometimes need to be coupled, as in the case of global coupled
ocean-atmosphere general circulation models.

These complex computer simulations raise a number of epistemological issues. First, in the case of
multiscale or coupled models, the physical and computational compatibility of the different models can be
Computer Simulations and Computational Models in Science 34.3 Epistemology of Computational Models and Computer Simulations 743
a tricky issue, and one must be careful that it does not create spurious behavior in the computer simulation
(see Winsberg [34.16, 80], Humphreys [34.41, 81] for more analyses about such models). Second, since there
are various ways of taking into account subgrid phenomena, pluralism in the modeling approaches cannot
be avoided [34.82]. Importantly, the existence of different incompatible models need not be seen as a
problem, and scientists can try to learn by comparing their results or elaborate ensemble methods to try to
deal with uncertainties [34.83]. The development of investigations of such large-scale phenomena requires
collective work, both within and between research teams. Typically, not only the interpretation of the
models, their justification, and the numerical codes [34.84], but also the standard of results [34.78, 85]
must be shared by members of the corresponding communities. An important but still unexplored question is
how much the collective dimension of these activities influences epistemologically how they are carried out.
From this point of view, the epistemology and philosophy of computational models and computer simulations
can be seen as another chapter of the analysis of the collective dimension of science.

Computational Methods versus Computer Simulations

Not all major families of mathematical and computational methods are used to produce computational
models or computer simulations of empirical systems. Evolutionary algorithms are used for the investigation
of artificial worlds, or of foundational issues about evolution, and they have important applications in the
field of optimization methods. Artificial neural networks are used in the field of machine learning and data
learning and to predict the behavior of physical systems from large databases. Bayesian networks are helpful
to model knowledge, develop reasoning methods, or to treat data. Overall, all these computational methods
have clear practical applications. They can be used for scientific tasks, sometimes concurrently with
computer simulations in the case of predictive tasks. However, no genuine representations of physical systems
and their dynamics seem to be attached to their use, even if, as the development of CA-based simulations has
shown, novel formal settings may eventually have unexpected applications for modeling purposes in the
empirical sciences.
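
To make the notion of a CA invoked here more concrete (cells with a finite set of states, evolving under a synchronous local update rule, as characterized in Sect. 34.2.4), the following Python sketch may help. It is only an illustration and is not taken from the works cited: it assumes a one-dimensional lattice with two states per cell, nearest-neighbor interactions, and periodic boundaries, and the names step and run are hypothetical.

```python
def step(cells, rule=110):
    """Return the next configuration of a 1D, two-state cellular automaton.

    Assumptions (illustrative only): nearest-neighbor interactions,
    periodic boundaries, and Wolfram-style rule numbering, where bit k of
    `rule` is the next state for the neighborhood (left, center, right)
    read as the binary number k.
    """
    n = len(cells)
    table = [(rule >> k) & 1 for k in range(8)]
    # Synchronous update: every new state is computed from the OLD
    # configuration, so the order in which cells are visited is irrelevant.
    return [
        table[(cells[(i - 1) % n] << 2) | (cells[i] << 1) | cells[(i + 1) % n]]
        for i in range(n)
    ]


def run(cells, steps, rule=110):
    """Return the list of configurations visited, initial one included."""
    history = [list(cells)]
    for _ in range(steps):
        history.append(step(history[-1], rule))
    return history


if __name__ == "__main__":
    width = 31
    start = [0] * width
    start[width // 2] = 1  # a single active cell in the middle
    for row in run(start, 15, rule=90):
        print("".join("#" if c else "." for c in row))
```

The update involves only exact operations on discrete symbols, with no floating-point arithmetic, which is the sense in which such models were meant to dispense with numerical approximations.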
ple, predictive or explanatory knowledge, or knowledge about how systems behave and can be controlled are
of different types. Some scientific roles can be general (predicting) and others are very specific. For
example, computer simulations are used to develop evidential standards in physics by simulating detection
procedures and identifying patterns of data (signatures) [34.86]. Overall, developing a coherent and
fine-grained epistemology of computer simulations would require drawing a map of their various roles to see
how much their epistemological features are general or contextual and role specific.

Let us now be more specific. In the twentieth century, the role of experiments, as sources of empirical

simulations, corresponding to the general epistemological problems raised by such general functions. In any
case, to complete the picture, one needs to go deeper into the analysis of the roles that computer
simulations serve within scientific practices and how they fulfill these roles in various types of contexts.
This program is not incompatible with the philosophical perspectives of some of the advocates of the
so-called practice turn in science [34.88], and in particular of authors who put contextually described
scientific activities at the core of their description of science [34.89, 90].

34.3.2 Aspects of the Epistemological Analysis of Computer Simulations
vided for specific problems. Overall, scientists in this field still actively investigate and debate how
much algorithms can be verified (see Fetzer [34.93], Asperti et al. [34.94] and Oberkampf and Roy [34.95]
for discussions). At a higher mathematical level, as we saw earlier, many computational methods provide
numerical methods for approximately solving problems, and the stability of algorithms can be a source of
concern, which means that analyzing computational errors is part of the epistemology of simulations [34.62].

Finally, one needs to assess whether the approximations in the solution, as well as the representational
inadequacies of the model, are acceptable regarding the physical inquiry pursued. At this interpretational
level, because of the variety of theoretical contexts in which computer simulations are carried out, there
is no single and general way in which the reliability of the results they provide can be analyzed. The
credentials of computer simulations will be different depending on whether a sound theory is being used,
how much uncertainty there is about the initial conditions, how complex the target system is, whether drastic
simplification assumptions have been made, etc. Also, depending on what the simulation is used for, and what
type of knowledge it is meant to provide, the justificatory requirements will be more or less stringent.
It takes different arguments to justify that based on a simulation one knows how to represent, control,
predict, explain, or understand the behavior of the system (see Sect. 34.4 for a discussion of the last two
cases, and [34.96] for similar analyses). Similarly, precise quantitative spatial-temporal predictions are in
need of much more pointed justifications than computer simulations aimed at studying average quantities or
qualitative behaviors of systems. Importantly, this discussion of the reliability of computer simulations
overlaps significantly with that of the epistemology of physical models, and with how the results obtained
from approximate, idealized, coarse-grained, or simply partly false models can still be scientifically
valuable (see Portides Chap. 2, this volume; Frigg and Nguyen Chap. 3, this volume). However, in the present
context, it is important to emphasize that, even if the content of models obviously constrains the
reliability of the information that can be extracted from them, models do not by themselves produce results;
only procedures which investigate them do. In this perspective, the epistemology of computer simulations is
a reminder that reliability primarily characterizes practices or activities that produce knowledge and that
models, taken alone, are not such practices. In other words, epistemological discussions about the
reliability of models as knowledge providers make sense only by explicitly reintroducing such practices or
when it can be assumed that reliably extracting all their content is possible, an assumption that, in the
framework of computational science, is often not plausible.

From Theoretical to Empirical Justifications

Computer simulations have often been viewed as ways of exploring theories by hypothetico-deductive
methods. This characterization captures a part of the truth, since existing theories are often a starting
point for the construction of computer simulations. In simple cases, computer simulations can mainly be
determined by theories, as in the case of direct simulations [34.77] in fluid dynamics, which derive from
the Navier–Stokes equations, and in which all relevant scales are simulated and no turbulence model is
involved.

However, taken as a general description, this view misrepresents how computer simulations are often
produced and their validity justified. As emphasized by Lenhard [34.97], even when theoretical equations are
in the background, computer simulations often result from some cooperation between theory and experiments.
For example, when Norman Phillips managed in 1955 to reproduce atmospheric wind and pressure relations with
a six-equation model, it was uncertain which arrangement of equations could lead to an adequate model of
the global atmosphere, and experimental validation was essential to confirm his speculative modeling
assumptions. Overall, the role of empirical inputs in simulation studies is usually crucial to develop
phenomenological modules of models, parameterize simulations, or investigate their reliability based on
their empirical successes [34.15, 17].

At the same time, since computer simulations are used precisely in cases where empirical data are absent,
sparse, or unreliable [34.16], sufficient data to build up and empirically validate a computational model
may be missing. In brief, in some cases, computer simulations can be sufficiently constrained neither by
theories nor by data and are somewhat autonomous. From an epistemological point of view, this potential
situation of theoretical and experimental under-determination is not something to be hailed, since it
undermines the scientific value of their results (see also Sect. 34.4.2).

The Epistemology of Complex Systems

Because computer simulations are generally used to analyze complex systems, their epistemology partly
overlaps that of complex systems and their modeling. It involves the analysis of simplification procedures
at the representational or demonstration levels and of how various theoretical or experimental justifications
are often used concomitantly. Overall, when it comes to investigating complex systems, obtaining reliable
knowledge is difficult. Thus, any trick or procedure that
works is welcome and the result is often what Winsberg has labeled a motley epistemology [34.16].
At the same time, sweeping generalizations should be avoided. Philosophers studying computer simulations have too often cashed out their epistemology in terms of that of the most complex cases, such as computer simulations in climate science, which are characterized by extreme uncertainties and the complexity of their dynamics. But computer simulations are used to investigate systems that have various types and degrees of complexity, and whose investigation meets different sorts of difficulties. It is completely legitimate, and politically important, that philosophers epistemologically analyze computational models and computer simulations in climate science (see, e.g., [34.98] for an early influential article). However, to obtain a finer grained and more disentangled picture of the epistemology of computer simulations, and not to put everything in the same boat, a more analytic methodology should be applied. More specifically, one should first analyze how the results are justified in simpler cases of computer simulations where specific scientific difficulties are met independently. In a second step, it can be analyzed how adding up scientific difficulties changes justificatory strategies and when exactly more holistic epistemological analyses are appropriate [34.99]. In this perspective, much remains to be done.
may prove useful in cases in which problems can be located by testing parts of the process and applying a dichotomy procedure. Local transparency requires that all details of the physical models and computational algorithms used be transparent, which may be more or less the case. Usually, computer simulations make heavy use of mathematical resource libraries such as code lines, routines, functions, algorithms, etc. In applied science, more or less opaque computational software can be proposed to simulate various types of systems, for example, in computational fluid dynamics [34.91, p. 567]. This raises epistemological problems since black-box software is built on physical models with limited domains of physical validity, and results will usually be returned even when users unknowingly apply such software outside these domains of validity.
Another form of epistemic opacity for individual scientists arises from the fact that investigating natural systems by computer simulations may require different types of experts, both from the empirical and mathematical sciences. As a result, no single scientist has a thorough understanding of all the details of the computational model and computational dynamics. This type of opacity is not specific to computer simulations, since it is a consequence of the epistemic dependence between scientists within collaborations [34.103].
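The concern about black-box software raised above can be made concrete with a small sketch. It is not taken from the chapter: the function names and the cutoff `re_max` are hypothetical and chosen only for illustration. Stokes' law for the drag on a sphere is a physical model that holds only at low Reynolds number, yet a routine implementing it returns a number for any input. A guarded wrapper makes the domain of validity explicit instead of leaving it to the user's background knowledge.

```python
import math


def stokes_drag(radius, speed, viscosity):
    """Drag force on a sphere from Stokes' law, F = 6*pi*mu*R*v (SI units).

    The formula is physically valid only for creeping flow (low Reynolds
    number), but nothing here stops a caller from using it elsewhere.
    """
    return 6.0 * math.pi * viscosity * radius * speed


def reynolds_number(radius, speed, density, viscosity):
    """Reynolds number of the flow around a sphere, based on its diameter."""
    return density * speed * (2.0 * radius) / viscosity


def guarded_stokes_drag(radius, speed, density, viscosity, re_max=1.0):
    """Same model, but refuses to answer outside its assumed validity domain.

    re_max is an illustrative cutoff (an assumption), not a sharp physical bound.
    """
    re = reynolds_number(radius, speed, density, viscosity)
    if re > re_max:
        raise ValueError(f"Re = {re:.3g} exceeds {re_max}: Stokes' law not applicable")
    return stokes_drag(radius, speed, viscosity)


if __name__ == "__main__":
    water = {"density": 1000.0, "viscosity": 1.0e-3}  # approximate values for water
    # Micron-sized particle: well inside the creeping-flow regime.
    print(guarded_stokes_drag(1.0e-6, 1.0e-4, **water))
    # Centimetre-sized sphere at 1 m/s: the opaque routine still returns a number...
    print(stokes_drag(1.0e-2, 1.0, water["viscosity"]))
    # ...whereas the guarded version reports that the model does not apply.
    try:
        guarded_stokes_drag(1.0e-2, 1.0, **water)
    except ValueError as err:
        print(err)
```

The unguarded function corresponds to the opaque case described in the text: the user receives a result either way, and only knowledge of the underlying physical model reveals that the second result is meaningless.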
tions, experiments, etc.) and the conclusions that are obtained from them, other aspects of material, computational, cognitive, or social natures, potentially unknown to the scientific agents involved in the inquiry, may play a role in explaining why these activities were actually carried out. For example, in the case of the Millennium Run (a costly simulation in astrophysics), the results were made publicly accessible. Scientists who were not involved in the process leading to the decision to carry out this simulation could try to make the best of it since it was already there and milk it as much as possible for different purposes. Or, some scientists may decide to study biological entities like proteins or membranes by means of Monte Carlo simulations, because members of their teams happen to be familiar with these tools. However, once they have decided to do so, they must still justify how their computer simulations support their conclusions.
In the perspective of explaining actual scientific uses, one also needs to distinguish between explanations aimed at accounting for specific uses (e.g., Why was the Millennium simulation carried out in 2005 by the Virgo consortium?) and those aimed at explaining more general patterns, corresponding to the use of practices of a given type, within or across several fields of science (e.g., Why are Monte Carlo simulations regularly used in this area of physics?, Why are they used regularly in science?). Importantly, since different instantiations of a pattern may have different explanations, the aggregated frequency of a scientific practice, like that of the use of the Ising model across science, may be the combined effect of general transversal factors and of inquiry- or field-specific features [34.105].
Field-Specific versus Cross-Disciplinary Explanations
A tempting move has often been to answer that scientific choices are primarily, if not completely, theory driven – and are therefore field specific. After all, theories guide scientists in their predictive and explanatory activities by fueling the content of their representations of natural systems. However, a reason to look for additional elements of explanations is that the spectrum of actual modeling and computational practices is smaller than our scientific knowledge and goals would allow [34.21, 41, 106]. For example, why do the harmonic oscillator, the Ising model, the Lotka–Volterra model, Monte Carlo simulations, etc., play such prominent roles throughout science?
As highlighted by Barberousse and Imbert [34.105], a variety of significantly different explanations of the greater or lesser use of models of a given type, and of scientific practices, can be found, beyond the straightforward suggestion that there are regularities in nature, which are mirrored by modeling and computational practices.
Local Factors
The explanation may be rooted in the specificities of modeling and computational activities. In particular, if explaining is better achieved by limiting the number of (types of) (computational) models [34.21, pp. 144–5], or explanatory argument patterns [34.107], it is no surprise that often the same computational models and practices are met. Also, scientists may feel the need to avoid dispersion of their efforts in cases when research programs need to be pursued for a long time before good results can be reached and it is more profitable to exploit a local mine than to go digging somewhere else [34.106, Chaps. 4 and 5], [34.21, pp. 143–4]. More generally, the recurrence of computational practices may be viewed as another example of the benefits of adopting scientific standards [34.108].
One may also, in the Kuhnian tradition, put the emphasis on the education of scientists, who are taught to see new problems through the lens of specific problems or exemplars [34.106, p. 189], and emphasize that this education has a massive influence on the lineages of models or practices which are later developed. This story can have a more or less contingentist version, depending on why the original models or practices at the lineage seeds are adopted in the first place, and why these uses are perpetuated and scientists do not emancipate themselves from them after schooling.
Theories may also play an indirect role in the selection of computational models. For example, models naturally couched in the standard formalism of a theory may be easier to use, even if the same physics can also be put to work by using other models. Barberousse and Imbert [34.100] analyze the case of lattice methods for fluid simulations in depth, which, though significantly different from approaches based on Navier–Stokes differential equations, can be used for the same purposes, even if this requires spending time learning and harnessing new methods and formalisms, which physicists may be reluctant to do.
Computational and Mathematical Explanations
As seen in Sect. 34.1.4, Humphreys [34.41, 81] suggests that most scientific models are tailored to fit the available mathematics, hence the importance in scientific practice of tractable models (see Humphreys’s notion of computational template [34.41, §3.4], and further analyses in [34.45]). Even if one grants the potential importance of such mathematical and computational factors, cashing out in detail the corresponding explanation is not straightforward. Barberousse and
Imbert [34.105] emphasize that there are various computational explanations. The objective computational landscape (how intrinsically difficult problems are, how frequent easy problems are) probably influences how science develops, even if knowing exactly what it looks like and how it constrains scientific activity is of the utmost difficulty. However, the epistemic computational landscape (scientists’ beliefs about the objective computational landscape) may be just as important since it frames modeling choices made by scientists.
Other potentially influential factors may also include how difficult it is to explore the objective landscape (and the corresponding beliefs regarding the easiness of this exploration), how much scientists, who try to avoid failure, are prone to resort to tractable models, or which techniques are used to select such tractable models (since some specific techniques, like polynomial approximations, may repeatedly select the same models within the pool of tractable models). Finally, modeling conservativeness may also stem from the computational and result pressure experienced by scientists, that is, how scarce computational resources are in their scientific environment and how much scientists need to publish results regularly.
Universality, Minimality, and Multiple Realizability
Other explanations may be offered in terms of how weak the hypotheses required to satisfy a model or a distribution are. For example, the Poisson distribution is often met because various types of stochastic processes satisfy the few conditions that are required to derive it [34.41, pp. 88–89]. Relationships between models and how models approximate each other may also be important. Typically, the Gaussian distribution is the limit of various other distributions (see, however, Lyon [34.109] for a more refined analysis and the claim that in Nature Gaussian distributions are common, but not pervasive). More generally, models that capture universal features of physical systems and are rooted in basic properties, such as their topology, can be expected to be met more often. Therefore, for reasons having to do with the mathematics of coarse-grain descriptions, and the explanation of multiple realizability, many systems fall into the same class and have similar descriptions [34.110–112] when minimal, macro-level, or simply qualitative models are built and explored.
Importantly, the above explanations are not mutually exclusive. Typically, the emphasis on tractability may be a general one in the sense that models always need to be tractable if they are to be used by scientists.
34.3.4 The Production of ‘New’ Knowledge: In What Sense?
Be Careful of Red Herrings!
It is commonly agreed that computer simulations produce new knowledge, new data, new results, or new information about physical systems (Humphreys [34.41], Winsberg [34.113, pp. 578–579], Norton and Suppe [34.114, p. 88], Barberousse et al. [34.91, p. 557], Beisbart [34.115]). This can be considered as a factual statement, since contemporary science, which is considered to produce knowledge, relies more and more heavily on computer simulations.
At the same time, the notion of knowledge should not be a red herring. It is commonly considered that experiments, inferences, thought experiments, representations, or models can bring knowledge, which then generates the puzzle that widely different activities have similar powers. The puzzle may be seen as somewhat artificial since knowledge, especially scientific, can be of different types [34.81], and when new knowledge is produced, the novelty can also be of different types. In this perspective, it may be that what is produced by each of these activities falls under the same general concept but is significantly different. From this point of view, the real question concerning computer simulations is not whether they produce knowledge, but in which particular sense they produce knowledge, what kind of knowledge they produce, what is specific to the knowledge produced by computer simulations, and what type of novelty is involved.
A comparison can be made with thought experiments, for which the question of how they can produce new knowledge has also been debated. Both activities correspond to the exploration of virtual situations, and do not involve direct interactions with the systems investigated. From this point of view, computer simulations and thought experiments can be seen as platonic investigations of ideas, with this difference that, for computer simulations, the mind is assisted by computers [34.41, pp. 115–116]. Overall, computer simulations have been claimed to sometimes play the same role of unfolding as thought experiments [34.87], have sometimes been equated with some types of thought experiments [34.116], and it has been suggested that computational modeling might bring the end of thought experiments [34.117]. Importantly, even if thought experiments are perhaps less used in science than formerly, this latter claim seems implausible. The reason is that there are different kinds of thought experiments, and many reveal conceptual possibilities that have little to do with computational explorations. Arguably, the possibility of setting up computer simulations would have added nothing to famous thought experiments such as
those made by Galileo, Einstein, Podolsky and Rosen, or Schrödinger. (I am grateful to Paul Humphreys for emphasizing this point.) In any case, a satisfactory account of these activities should account for both similarities and differences in how they work epistemologically and how they are used.
In any case, the question of how and what we can learn about reality by using these methods arises, even if the sources of puzzlement do not exactly touch the same points in each case. Indeed, how mental thought experiments work is more opaque than how computer simulations do. For this reason, their rational reconstruction as logical arguments [34.118, p. 354] is more controversial than that of computer simulations [34.115], and it is less clear whether their positive or negative epistemic credentials are those of the corresponding reconstructed argument [34.119]. (For example, if certain thought experiments are reliable because mental reasoning capacities about physical situations have been molded by evolution, development, or daily experiments, it is not clear that their logical reconstruction will more vividly make clear why they are reliable.) The situation is clearer for computer simulations since the process is externalized and is based on more transparent mechanisms (see, however, Sect. 34.3.2). Then, if computer simulations are nothing other than (computationally assisted) thinking corresponding to the application of formal rules, and their output is somewhat contained in the description of the computational model, how knowledge is generated is clearer but the charge of the lack of novelty is heavier.
The Need for an Adequate Notion of Content
Suppose that a physical system S is in a state s at time t and obeys deterministic dynamics D. Then, the description of D and s characterizes a mathematical structure M, which is the trajectory of S in its phase space and is known as such. If a computer simulation unfolds this trajectory, then it explicitly shows which states S will be in. At the same time, any joint description of one of these states and of the dynamics denotes the same structure M, which is known to characterize the evolution of S. So, from a logical point of view, no new content has been unraveled by the computer simulation, which can at best be seen as a means of producing new descriptions of identical contents. In brief, if knowledge is equated with knowledge of logical content, computer simulations do not seem to be necessarily producing new knowledge. We may even be tempted to describe computer simulations as somewhat infertile and thereby perpetuate a tradition according to which formal or mechanical procedures to draw inferences, and rules of logic in particular, are sterile, as far as discovery is concerned, and can at best be used to present pieces of knowledge that have already been found – a position defended by Descartes in 1637 in the Discours de la Méthode [34.120]. This kind of puzzle, though particularly acute for computer simulations, is not specific to them and is nothing new for philosophers of language – Frege and Russell already analyzed similar ones. However, this shows that, pace the neglect of linguistic issues in the present philosophy of science, without an adequate theory of reference and notion of content that would make clear what exactly we know and do not know when we make a scientific statement, we are ill-equipped to precisely analyze the knowledge generated by computer simulations [34.41, 121].
Computational science may also remain somewhat mysterious if one reasons with the idealizations usually made by philosophers of science. As emphasized in Sect. 34.1.4, idealizing away the practical constraints faced by users is characteristic of much traditional philosophy of science and theories of rationality. In the present case, it is true that “in principle, there is nothing in a simulation that could not be worked out without computers” [34.122, p. 368]. Nevertheless, adopting this in-principle position is unlikely to be fruitful here since, when it comes to actual computational science, which scientific content can be reached in practice is a crucial issue if one wants to understand how computational knowledge develops and pushes back the boundaries of science (see Humphreys [34.41, p. 154] and Imbert [34.102, §6]).
Overall, it is clear that present computational procedures and computer simulations do contribute to the development of scientific knowledge. Thus, it is incumbent on epistemologists and philosophers of science to develop conceptual frameworks to understand how and in what sense computer simulations extend our science and what type of novelty is involved.
Computer Simulations and Conceptual Emergence
Computer simulations unfold the content of computational models. How should the novelty of the knowledge that they bring us be characterized? Since the notion of novelty is also involved in discussions about emergence, the literature about this latter notion can be profitably put to work here.
Just as emergence may concern property instances and not types [34.123, 124, p. 589], the notion of novelty needed here should apply to tokens of properties instantiated in distinctive systems and circumstances, or to specific regularities the scope of which covers such tokens and circumstances. For example, the appearance of vortices in fluids is in a sense nothing new, since the behavior of fluids is covered by existing theories in fluid dynamics, no new concept is involved, and other phenomena of this type are already understood for some well-studied configurations. At the same time, finding out that patterns of vortices emerge in configurations of a new type is a scientific achievement and the discovery of some new piece of knowledge.
Importantly, as emphasized by Barberousse and Vorms [34.125, p. 41], the notion of novelty should be separated from that of surprise. When the exact value of a variable is precisely learnt and lies within the range that is enabled by some physical hypothesis or principle, we have a kind of unsurprising novelty. Barberousse and Vorms give an example from experimental science, but computer simulations may also provide exact values for quantities, which agree with general laws (e.g., laws of thermodynamics) and are therefore partly expected.
In addition, computer simulations can provide cases of surprising novelty, concerning behaviors that are covered by existing theories like chaotic behavior for classical mechanics. Indeed, the Lorenz attractor and behaviors of a similar type were discovered by means of computer simulations of a simplified mathematical model initially designed to analyze atmospheric convection, and this stimulated the development of chaos theory [34.125, p. 42].
This leads us to a type of novelty, related to what Humphreys calls conceptual emergence. Something is conceptually emergent relative to a theoretical framework F when it requires a conceptual apparatus that is not in F to be effectively represented [34.41, p. 131], [34.123, p. 585]. The conceptual apparatus may require new predicates, new laws and sometimes the introduction of a new theory. Importantly, conceptual emergence is not merely an epistemic notion. It does not depend on the concepts we already possess and the conceptual irreducibility is between two conceptual frameworks. Further, even if instances of the target pattern can be described at the microlevel without the conceptually emergent concepts, the description of the pattern itself, if it is made without these novel concepts, is bound to be a massive disjunction of microproperties, which misrepresents the macro-pattern qua pattern. Also, the same conceptually emergent phenomenon may arise in different situations and its description may therefore require an independent conceptual framework, just like the regularities of special sciences require new concepts, unless one is prepared to describe their content in terms of a massive disjunction of all the cases they cover [34.126].
Interestingly, various phenomena investigated by computational science are conceptually emergent. Even if computer simulations are sufficient to generate them, identifying, presenting, and understanding them may require further analyses of the simulated data, redescriptions at higher scales and the development of new theoretical tools. For example, traffic stop-and-go patterns in CA models of car traffic, emergent phenomena in agent-based simulations, and much of the knowledge acquired in classical fluid dynamics seem to correspond to the identification and analysis of conceptually emergent phenomena. Effectively, it is by conceptually representing these phenomena in different frameworks that one manages to gain novel information about these systems, above and beyond our blind knowledge of the microdynamics that generates them.
It is important to emphasize that the different types of novelty described above are also met in experiments exploring the behavior of systems for which the fundamental physics is known. In other words, the potential novelty of experimental results should not be overemphasized. Even if only experiments can confound us [34.127, pp. 220–221] through results which are not covered by our theories or models, many of the new empirical data that these experiments provide us with are no more novel than those produced by computer simulations. The statements describing these results are not strongly referential, in the sense that no unknown aspects of the deep nature of the corresponding systems would be unveiled by a radically new act of reference [34.87, pp. 3463–3464]. These statements derive from what we already know about the physical systems investigated, and the experimental systems unravel them for us. In this sense, they are merely weakly referential.
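The stop-and-go patterns in CA models of car traffic mentioned above can be illustrated with a minimal sketch. It uses elementary cellular automaton rule 184 on a circular road, a standard toy model of traffic and not necessarily the model the cited authors have in mind; the function names are hypothetical. The update rule only states, at the microlevel, that a car advances when the cell ahead is empty. The functions `flow` and `jammed_cars` are redescriptions at a coarser level, and a notion such as "jam" appears only there.

```python
def step_rule184(road):
    """One synchronous update of rule 184 on a circular road.

    road is a list of 0 (empty cell) and 1 (car). A car moves one cell
    forward if and only if the cell ahead is currently empty.
    """
    n = len(road)
    new_road = [0] * n
    for i, cell in enumerate(road):
        if cell == 1:
            ahead = (i + 1) % n
            if road[ahead] == 0:
                new_road[ahead] = 1  # free cell ahead: the car advances
            else:
                new_road[i] = 1      # blocked: the car stays where it is
    return new_road


def flow(road):
    """Macro-level quantity: number of cars able to move at the next step."""
    n = len(road)
    return sum(1 for i in range(n) if road[i] == 1 and road[(i + 1) % n] == 0)


def jammed_cars(road):
    """Macro-level quantity: number of cars blocked by a car directly ahead."""
    return sum(road) - flow(road)


def run(road, steps):
    """Apply the update rule a given number of times and return the final road."""
    for _ in range(steps):
        road = step_rule184(road)
    return road


if __name__ == "__main__":
    road = [1] * 15 + [0] * 5  # dense traffic: 15 cars on 20 cells
    for _ in range(10):
        print("".join("#" if c else "." for c in road), "jammed:", jammed_cars(road))
        road = step_rule184(road)
```

In the printed rows for the dense road, the block of stopped cars shifts backwards from one row to the next although each individual car only ever moves forwards. Describing that block as a single moving object is a choice of conceptual framework made by the observer; the update rule itself contains no such concept.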
tems they investigate (Sect. 34.4.4). Even if computer simulations fail to meet these expectations because of their epistemic opacity, understanding may sometimes be regained by appropriately visualizing the results or studying phenomena at a coarser level (Sect. 34.4.5). In any case, scientific judgments about such issues are influenced by disciplinary norms, which may sometimes evolve with the development of computational science (Sect. 34.4.6).
34.4.1 Traditional Accounts of Explanation
Philosophers of science have discussed intensively the issue of scientific explanation over the last decades. The seminal works of Hempel were published in the 1940s, when computational science started to develop. However, until recently, discussions about computer simulations and explanations did not interact with each other – which could suggest that for theorists of explanation, how explanations are produced does not in fact matter. While it is true that many of the examples of explanatory inquiries analyzed in the literature are simple and, at least in their most elementary versions, do not belong to computational science, it is hard to see why computer simulations could not in some cases satisfy the requirements corresponding to major accounts of explanations. According to the deductive-nomological (hereafter DN) model, one explains a phenomenon when a sentence describing it is logically deduced from true premises essentially containing a scientific law [34.128, pp. 247–248]. For example, the explanation of the trajectory of a comet, by means of a computer simulation of its trajectory based on the laws of classical (or relativistic) mechanics together with the initial positions of all bodies significantly influencing its trajectory, seems to qualify as a perfect example of DN explanation – provided that computer simulations can be seen as deductions [34.91, 115].
Analogous statements can be made concerning the causal and unificationist models of explanations. The computer simulation of the comet’s trajectory is a way to trace the corresponding causal processes, described in terms of mark transmission [34.129] or of conserved quantities such as energy and momentum [34.130]. Other causal theorists of explanation like Railton have claimed that explanatory progress is made by detailing the various causal mechanisms of the world and all the nomological information relevant to the investigated phenomenon; the corresponding “ideal explanatory text” is thereby slowly unveiled [34.131]. But one should note that, because such ideal explanatory texts are necessarily complex, their investigation is almost inevitably made by computational means.
Similarly, computer simulations can sometimes be instantiations of argument patterns that are part of what Kitcher describes as the explanatory store unifying our beliefs [34.107]. For example, the computation of the comet’s trajectory can be seen as an instantiation of “the Newtonian schema for giving explanations of motions in terms of underlying forces” [34.132, p. 121, p. 179].
Be this as it may, computer simulations have often been claimed, both by scientists and philosophers, to be somewhat problematic concerning explanatoriness and lacking some of the features that are expected to go with the fulfillment of explanatory requirements. This reproach of unexplanatoriness can be understood in several senses.
34.4.2 Computer Simulations: Intrinsically Unexplanatory?
One may first claim that computer simulations in general, or some specific types of them, do not meet one’s favorite explanatory requirements. For example, agent-based simulations may be described as usually neither involving covering laws nor providing explanatory causal mechanisms or histories [34.75, 133]. However, one should not ascribe to computer simulations reproaches that should be made to the field itself. If a field does not offer well-entrenched causal laws and one is convinced that explanations should be based on such laws, then the computer simulations made in such fields are not explanatory, but this has nothing to do with computer simulations in general. Also, some computer simulations are built with scientific material like phenomenological regularities, which potentially makes them unexplanatory, but this material could also be used in the context of explanatory inquiries involving arguments or closed-form solutions to models. Thus, the problem comes from the use of this material and not from the reliance on one or another mode of demonstration – and claiming that computer simulations are unexplanatory is like blaming the hammer for the hardness of the rock.
For this reproach to be meaningful (and specific to computer simulations), it should be the case that other inquiries based on the same material are indeed explanatory, but that the corresponding explanations based on computer simulation are not, because of specific features of computer simulations or some types of them. It is not completely clear how this can be so. Computer simulations are simply means of exploring scientific models and hypotheses by implementing algorithms, which provide information about tractable versions of these models or hypotheses. Therefore, their explanatory peculiarity, if any, should be an effect of
specific features like the use of algorithms, coding lan- unless one shows why unexplanatoriness stems from
guages, or external computational processes. specific features of (some types of) computer simula-
There is no denying that the need to format scien- tions qua simulations. In the absence of such conceptual
tific models and hypotheses into representations that analyses, one can simply conclude that some scientific
are suitable for computational treatment comes with uses of computer simulations, or some computational
constraints. For example, a recent challenge has been practices, turn out to be unexplanatory.
to adapt coupled circulation models and their algo-
rithms to the architecture of modern massively parallel 34.4.3 Computer Simulations:
supercomputers. Similarly, when one uses CA mod- More Frequently Unexplanatory?
els for fluid dynamics, the physical hypotheses must
be expressed in the straightjacket of up-to-date rules A different claim is that, given the current uses of
between neighboring cells on a grid. Beyond these gen- computer simulations in science, they are more often
uine constraints on computational practices, one should unexplanatory than other scientific items or activities,
remember that, computational languages, provided they even if this is partly a contingent matter of fact. The
Part G | 34.4
are rich enough, are content neutral in the sense that any explanatoriness of computer simulations can be threat-
content that can be expressed with some language can ened in various ways. Computational models may be
also be expressed with them. Similarly, computational built on false descriptions of target systems or may lack
devices like the computers we use daily are universal theoretical support and simply encapsulate phenomeno-
machines in the sense that any solution to a compu- logical regularities; they may have been spoiled by the
tational problem (or inference) that can be produced approximations, idealizations, and modeling tricks used
by other machines can also be produced by them. For to simplify models and make them tractable; they may
these reasons, it is hard to see why, in principle, com- depart from the well-entrenched explanatory norms in
puter simulations should be explanatorily limited, since a field or may not correspond to accepted explanatory
the theoretical content and inferences related to other methods. Clearly, none of these features is specific to
means of inquiries can also be processed by them. computer simulations. However, it may be the case that
The case of CA models abovementioned exempli- because of their current uses in science, computer sim-
fies nicely this point. For several decades, CA mod- ulations more frequently instantiate them.
els have been used under various names in various
fields; from Schelling’s investigations about spatial seg- The Lure of Computational Explorations
regation in neighborhoods, analysis of shock waves Because they are powerful heuristic tools, and because
in models of car traffic, models of galaxies, inves- other means of exploration are often not available,
tigations of the Ising model, to fluid dynamics (see computer simulations are more often used to toy and
Ilachinski [34.134] for a survey). Because existing the- tinker with hypotheses, models, or mechanisms and,
ories and scientific laws are not expressed in terms of more generally, to experiment on theories [34.135,
CA, some philosophers have claimed that CA-based 136]. This may especially be the case in fields where
simulations were merely phenomenological [34.135, there is no well-established theory to justify (or inval-
pp. 208–209], [34.9, p. 516]. Nevertheless, Barberousse idate) the construction of models, or where collected
and Imbert [34.100] have argued that such bold general evidence is not sufficient to check that the simulated
statements do not resist close scrutiny. They present the mechanisms correspond to actual mechanisms in target
case of lattice gas models of fluids and argue that, be- systems. For example, in cognitive science, competing
yond their unusual logical nature, from a physical point theories of the mind and its architecture coexisted for
of view, such mesoscopic models and computer sim- decades, and even modern techniques of imaging like
ulations make use of the same underlying physics of fMRI (functional magnetic resonance imaging), though
conserved quantities as more classical models, and can empirically informative, do not provide sufficient
be seen as no less theoretical than competing computer evidence to determine how the brain works precisely in
simulations of fluids based on macroscopic Navier– terms of causal mechanisms. Accordingly, in this field,
Stokes equations. Therefore, there is no reason why developing a model that is able to simulate the cog-
such computer simulations could not be usable for sim- nitive performances of an agent does not imply that
ilar explanatory purposes. one has understood and explained how her brain works,
Overall, there is no denying that some (and possi- and more refined strategies that constrain the functional
bly many) computer simulations are not explanatory. architecture must be developed if one wants to make
Providing various examples of unexplanatory computer explanatory claims [34.4, Chap. 5]. The issue is all
simulations is scientifically valuable, but it says nothing the more complex in this specific field since the
general about their general lack of explanatory power, inquiry may also involve determining (versus assuming)
Computer Simulations and Computational Models in Science 34.4 Computer Simulations, Explanation, and Understanding 753
whether neural processes are computations [34.137]. Similarly, in the social sciences, empirically validating a simulation is far from straightforward and, as a result, the epistemic and, in particular, explanatory value of computer simulations is often questionable [34.138].
Overall, since computer simulations offer powerful tools to investigate hypotheses and match phenomena, it is tempting for scientists to take a step further and claim that their computer simulations have explanatory value. In brief, computer simulations offer a somewhat natural environment for such undue explanatory claims.
The Worries of Under-Determination
In the case of computer simulations, the higher frequency of inappropriate explanatory claims may be reinforced by the combination of several factors.
When toying with hypotheses, scientists are often interested in trying to reproduce some target phenomenology, so they often do not tinker in a neutral way. The specific problem with computer simulations is that, in many cases, getting the phenomenology right is somewhat too easy, and the general problem of under-determination of theoretical claims by the evidence is particularly acute.
First, computer simulations are often used in cases where data are scarce, incomplete, or of low quality (see, e.g., [34.78, Chap. 10] for the case of climate data and how making data global was a long and difficult process). The scarcity of data can also be a primary motivation to use computer simulations to inquire about a system for which experiments are difficult or impossible to carry out, as in astrophysics [34.139]. Furthermore, knowledge of the initial and boundary conditions with which the computer simulations should be fed may also be incomplete, which leaves more latitude for scientists to fill in the blanks and possibly match data. As a result, confidence in the result of computer simulations like the Millennium Run and in their representational and explanatory success is in part undermined [34.139].
Second, computer simulations usually involve more variables and parameters than theories. For example, for a 10 × 10 grid with cells characterized by three variables, the total number of variables is already 300. This raises the legitimate suspicion that, by tuning variables in an appropriate way, there is always a means to obtain the right phenomenology. (Ad hoc tuning is of course not completely straightforward, since the many variables involved in a computer simulation are usually jointly constrained. Typically, in a fluid simulation, all cells of the grid obey the same update rule and are correlated.) This possibility of tuning variables and parameters is indeed used in fields like machine learning, which can be based, for example, on the use of artificial neurons. In such fields, one first combines a limited number of elementary mathematical functions (e.g., artificial neurons) that, when adequately parameterized, reproduce potentially complex behaviors found in databases (the learning phase). In a second step, one uses the parameterized functions (e.g., the trained neural network) on new cases in the hope that extrapolation and prediction are possible. In such cases, even if the right phenomenology is reproduced, and extrapolation partly works, it is clear that the trained neural network and the corresponding mathematical functions do not explain the phenomena. Overall, this means that the ability to reproduce some potentially complex phenomenon is far from being sufficient to claim that the corresponding computer simulation has explanatory power (see also [34.140] for the issue of the over-fitting of computer simulations to data).
Third, when scientists do succeed, they may be subject, like other human creatures, to confirmation biases, overestimate their success, and tend to ignore the fact that various mechanisms or laws can produce the same data (or that other aspects of their computer simulations do not fit). While such biases are not specific to computational inquiries, they are all the more epistemologically dangerous since matching phenomena is easy.
Complex Systems Resist Explanation
Because they are very powerful tools, computer simulations are specifically used for difficult investigations, which usually have features that may spoil their explanatory character [34.141, 142]. Typically, in the natural sciences, computer simulations and computational methods are centrally used for the study of so-called complex systems [34.143, 144], see also Chap. 35. Realistically investigating complex systems would imply taking into account many interrelated nonlinear aspects of their dynamics, including long-distance interactions, and, in spite of the power of modern computers, the corresponding models are usually intractable. Therefore, drastic simplifications need to be made in both the construction of the model and its mathematical treatment, which often threatens the epistemic value of the results.
Importantly, for the above reasons, the problem of the explanatory value of computer simulations can arise even in fields like fluid dynamics where the underlying theories are well known. It is no surprise that this problem is more acute in fields, such as the human and social sciences, in which no such theories are available, the investigated objects are even more complex, sound data are more difficult to collect and interpret, and the very nature of what counts as a sound explanation and genuine understanding is more debated [34.145, 146], especially in relation to computer simulations [34.133].
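The second worry above, that tuning enough variables always yields the right phenomenology, can be made concrete with a small sketch. It is only an illustration under stated assumptions: the function names, the saturating target_system, and the use of Lagrange polynomial interpolation in place of a trained neural network are all choices made here, not taken from the chapter. The sketch counts the variables of the 10 × 10 grid mentioned in the text and then shows a model with as many free coefficients as observations matching the data exactly while extrapolating poorly.

```python
def n_state_variables(rows, cols, per_cell):
    """Count the state variables of a rows x cols grid of cells."""
    return rows * cols * per_cell


def interpolating_model(xs, ys):
    """Return the polynomial of degree len(xs) - 1 through every data point.

    Lagrange interpolation stands in for any model with as many tunable
    coefficients as data points: the fit is exact by construction,
    whatever process generated the data.
    """
    def model(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return model


def target_system(x):
    """Hypothetical target phenomenon: a response that saturates below 1."""
    return x / (1.0 + x)


# "Learning phase": tune the model on six observations.
observed_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
observed_y = [target_system(x) for x in observed_x]
model = interpolating_model(observed_x, observed_y)

# The phenomenology is matched on the observed cases ...
fit_error = max(abs(model(x) - y) for x, y in zip(observed_x, observed_y))

# ... but the tuned model says little about an unobserved case.
extrapolation_error = abs(model(10.0) - target_system(10.0))

print(n_state_variables(10, 10, 3))
print(fit_error, extrapolation_error)
```

Matching the six observations tells us nothing about whether the polynomial captures the mechanism behind target_system, which is the point made in the text about reproduced phenomenology and explanatory power.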
For these reasons, even if there are good arguments for claiming that computer simulations do not fare worse than other methods like analytic models or experiments (see [34.40] for the case of economics), it is not surprising that their potential explanatory power is undervalued.
Overall, it is plausible that computer simulations often have less explanatory power than other methods, and that this does not stem from their nature but from the type of uses they usually have in science. If this is the case, the question of the explanatory power of computer simulations is to be treated on a case-by-case basis by using the same criteria as for assessing the explanatory power of other scientific activities, pace the distrust that shrouds the use of computer simulations.
34.4.4 Too Replete to Be Explanatory? The Era of Lurking Suspicion
Theories of explanation should capture our intuitions about what is explanatory. From this point of view, it is interesting to see whether computer simulations meet these intuitions, especially when they fulfill the explanatory requirements described by theories of explanation.
Computer Simulations and Explanatory Relevance
Good explanations should not include explanatorily irrelevant material. While determining whether some piece of information is explanatorily relevant to some target fact is a scientific task, finding a satisfactory notion of explanatory relevance is a task for philosophers. Despite progress on this problem, current accounts of explanation still fall short of capturing this notion [34.147, 148]. At the same time, existing results are sufficient to understand why computer simulations raise concerns regarding explanatory relevance.
Scientific information, in particular causal laws, accounts for the behavior of phenomena. Thus, it is legitimate, when trying to explain some phenomenon, to show that its occurrence can be derived from a scientific description of the corresponding system. Nevertheless, even then, one may fall short of satisfying the requirement of explanatory relevance. This is clearly explained by Salmon in his 1989 review of theories of explanation, where he asks “Why are irrelevancies harmless to arguments but fatal to explanations?” and further states that “irrelevant premises are pointless, but they have no effect whatever on the validity of the argument” [34.149, p. 102]. While philosophers have mainly focused on the discussion of irrelevant unscientific premises, the problem actually lies deeper. Parts of the content of laws or mechanisms, essentially involved in explanatory arguments, can be irrelevant to the explanation of aspects of phenomena that are covered by these laws or mechanisms [34.148]. So the problem is not simply to discard inessential (unscientific or scientific) premises, but also to determine, within the content of the scientific premises that are essentially used in explanatory derivations, what is relevant and what is not [34.102, 148, 150].
This problem is especially acute for computer simulations. Take a computer simulation that unfolds the detailed evolution of a system based on the description of its initial state and the laws governing it. Then all aspects of the computational model are actually used in the computational derivation. At the same time, not all such aspects are necessarily explanatorily relevant with respect to all facets of the computed behavior. Typically, some aspects of the computed behavior may simply depend on the topology of the system, on symmetries in its dynamics or initial conditions, on the fact that some initial quantity is above some threshold, etc.
Accordingly, the following methodological maxim may be proposed: the more an explanation (resp. an argument) contains independent pieces of scientific information, the more we are entitled to suspect that it contains irrelevancies (regarding the target behavior).
At the same time, one should remain aware that explaining some target phenomenon may sometimes irreducibly require that all the massive gory details involved in the simulation of the corresponding system are included. For example, as chaos theory shows, explaining the emergence and evolution of a hurricane may essentially require describing the flapping of a butterfly’s wings weeks earlier.
An additional problem is that there is no general scientific method to tell whether a premise, or some part of the information it conveys, is relevant. Contrary to what the hexed salt example [34.151] may perhaps suggest, irrelevant pieces of information within an explanation do not wear this irrelevance on their sleeves and are by no means easy to identify. This is the problem of the lack of transparency, or of opacity, of irrelevant information.
Overall, since they are based on informationally replete descriptions of their target systems, computer simulations legitimately raise the suspicion of being computational arguments that contain many irrelevancies, and therefore of being poor explanations – even when they are not.
Computer Simulations, Understanding, and Inferential Immediacy
Mutatis mutandis, similar conclusions can be reached regarding the issue of computational resources. Since
this issue is closely related to the question of how much computer simulations can bring about understanding, things shall be presented through the lens of this latter notion.
It is usually expected that explanations bring understanding. Theorists of understanding, while disagreeing on the precise nature of this notion, have explored its various dimensions, which provides a good toolkit to analyze how computer simulations fare on this issue.
Hempel cashes out the notion of understanding in terms of nomic expectability. From this point of view, taken as explanatory arguments, computer simulations seem able to provide understanding since, like other scientific representations, they can rely on nomological regularities. Further, in contrast to sketchy explanations, they make the nomic dependence of events explicit. Consider the explanation analyzed by Scriven that “the impact of my knee on the desk caused the tipping over of the inkwell” [34.152]. The hidden strategy described by Woodward [34.153] is to claim that the value of this latter nonnomological explanation is to be measured against an ideal explanation, which is fully deductive and nomological and describes the detailed succession of events that led to the stain on the carpet, even if this complete explanation is often inaccessible. From this point of view, a computer simulation can offer a way to approach such an ideal explanation, by providing an explicit deduction of the lawful succession of events that brought about the explanandum. However, an epistemic problem is that, once such a computer simulation has been carried out (and properly stored), it is possible to explicitly highlight any part of it, but it is not possible to scrutinize all parts because there are too many of them. This is one of the reasons why computer simulations are intrinsically opaque to human minds [34.41, §5.3], see also Sect. 34.3.2.
Be this as it may, causal theorists of explanation should agree that computer simulations often contribute significantly to developing our understanding by reducing uncertainty about the content of causal ideal explanatory texts, as requested in [34.131].
Computer simulations also seem to be able to provide unificatory understanding. For unificationists like Kitcher, understanding is a matter of “deriving descriptions of many phenomena using the same pattern of derivation again and again” [34.107, p. 423]. Since computer simulations offer more ways of deriving phenomena, by providing new patterns of derivation or instantiating existing patterns in more complex cases, at least some of them contribute to unification.
Things are less straightforward with Woodward’s account of explanation and understanding. Woodward argues that a good explanation provides “understanding by exhibiting a pattern of counterfactual dependence between explanans and explanandum” [34.154, p. 13]. From this point of view, computer simulations fare well since, if one does not go beyond their domain of validity, they provide general patterns of counterfactual dependence between their inputs I and outputs O, which are obtained by applying t times their update algorithms (UA), that is, more formally, $O(t, I) = UA^t(I)$.
Is there a philosophical catch? Woodward also requires that the pattern of counterfactual dependence be described in terms of a functional relation. But what is to count as a function in this context? Functions can be defined explicitly (by means of algorithms) or implicitly (by means of equations). The advantage of computer simulations is that they provide algorithmic formulations based on elementary operations of how the explanandum varies with the explanans. From this point of view, computer simulations are more explicit than models, which simply provide equations linking the explanans and the explanandum. However, the problem is that with computer simulations any kind of functional immediacy is lost, since it is computationally costly to carry out the algorithm. Indeed, Woodward usually describes straightforward examples of functional dependence like $Y = 3X_1 + 4X_2$. With such functions, we may feel that the description of the counterfactual dependence is just there, since, by simply instantiating the variables and carrying out the few operations involved, specific numerical relations are accessible. In such simple cases, a human mind can do the work by itself and answer the corresponding what-if-things-had-been-different (what-if) questions. In contrast, with a computer simulation, computing the output takes much computational power. So the tentative conclusion is that computer simulations provide understanding in Woodward’s sense, but this understanding is not immediately accessible, the degree of (non)-immediacy being described by the computational resources it takes to answer each what-if question. Importantly, an equation-based model may give the illusion of immediacy, since the equation presents a short description of how the variables are correlated. However, one should bear in mind that short equations can be unsolvable, and short descriptions of algorithms (like $O(t, I) = UA^t(I)$) with simple inputs can yield complex behaviors that are computationally costly to predict [34.155].
Similar conclusions can be reached if one focuses on analyses of understanding proper. De Regt and Dieks propose to analyze understanding in terms of intelligibility, where this latter notion implies the ability to recognize qualitative characteristic consequences without performing exact calculations [34.156]. In this sense, understanding seems to be a matter of immediacy, as was already suggested by Feynman, who described it as
the ability to foresee the behavior of a system, at least qualitatively, or the consequences of a theory, without solving exactly the equations or performing exact calculations [34.157, Vol. 2, 2–1].
Depending on the case, foreseeing consequences requires logical and cognitive operations to a greater or lesser extent. Thus, the above ideas may be rephrased in a more gradualist way, by saying that the fewer inferential or computational steps one needs to go through to foresee the behavior of a system or the consequences of a theory, the better we understand it. In this perspective, computer simulations fare very badly, since they involve going through many gory computational steps and, even once these have been carried out, scientists usually end up with no simple picture of the results and no inferential shortcuts that could exempt them from this computational stodginess for future similar investigations.
Understanding: What Do We Lose with Computer Simulations?
Before the advent of computational science, explanatory advances in science were always the direct product of human minds and pen-and-rubber methods. Therefore, any actual scientific explanation that satisfied the requirements for explanatoriness was also human sized, and the epistemic benefits logically contained within such explanations could actually be enjoyed by competent and informed epistemic agents. In [34.158, p. 299], Hempel states that an explanatory argument shows that “the occurrence [of an event] was to be expected” and he adds “in a purely logical sense.” This addition emphasizes that expectation should not be understood as a psychological notion nor refer to the psychological aspects of the activity of explaining. In the case of computer simulations, this addition is somewhat superfluous. Nomic expectability remains for scientists, since, based on computer simulations, they may know that they can entertain the belief that an event should happen. However, this belief is completely cold. Since the activity of reasoning is externalized in computers, it is no longer part of the proper cognition of scientists and does not come with the psychological side-effects associated with first-person epistemic activities, such as emotions or feelings of expectation, impressions of certainty and clarity, or the oft-mentioned aha or eureka feeling which usually comes with first-person experiences of understanding. In other words, with computer simulations, the mind is no longer the carrier of the activity of explanation, and simply records what it should believe. Unfortunately, epistemic benefits associated with the individual ability to carry out this activity are also lost. Since the explanatory argument can no longer be surveyed by a human mind, the details of the relations between the premises of the explanatory argument and its conclusion are opaque. Therefore, scientists are no longer able to encompass uno intuitu all aspects of the explanation and how they are related, to develop expectations about counterfactual situations (in which similar hypotheses are met), and the unificatory knowledge that only global insights can provide is also lost.
Overall, with computer simulations the objective intelligibility that is enclosed in explanations and can be accessed by first-person epistemic appropriation of the explanatory arguments can no longer be completely enjoyed by scientists (see also [34.159] for further analyses about epistemic opacity in this context). In this perspective, the problem of computer simulations is not that they have less explanatory value but that we cannot have epistemic access to this explanatory value. In brief, this problem would not pertain to the logic of computer-simulation-based explanations but to their epistemology.
New Standards for Understanding?
The gradualist description regarding the need for cognitive and logical operations to foresee consequences (see Sect. 34.4.4 Computer Simulations, Understanding, and Inferential Immediacy) suggests that the boundary between cases where intelligibility is present or is lost is not completely sharp. Importantly, the ability to foresee consequences depends on various factors such as the knowledge of physical or mathematical theorems to facilitate deductions, the knowledge of powerful formalisms to facilitate inferences, how much the intuition of scientists has been trained to anticipate consequences of a certain type and has somewhat internalized inferential routines, etc. [34.102, §6.4]. In other words, at least in some cases, the frontiers of what has a computational explanation, but remains unintelligible to a human mind, can be pushed back to some extent.
This raises the question of how much the frontiers of intelligibility can be extended and whether the ideal of inferential or computational briefness for explanations should be considered as a normative standard. Two positions are possible. One may claim that genuine explanations should always yield the possibility for human subjects to access the corresponding understanding. Or one may claim that, as shown by computational science, we have gone beyond human-sized science, not all good explanations can be comprehended by human minds, and this is not a defect of our science, even if it is clearly an epistemic inconvenience.
A motivation for endorsing the former claim is that the lack of intelligibility of explanations often stems from epistemic flaws of the agents producing them and can be corrected. Typically, in science, results are often
laboriously proved and, with the advance of scientific understanding, shorter and clearer proofs, or quicker algorithms, are found.
Overall, it seems sound to adopt the following methodological maxim: the more resources we need to produce (or check) an explanation (resp. an argument, a proof), the more we are entitled to suspect, in the absence of contrary evidence, that the explanation is unduly complex. From this point of view, computer simulations do not seem flawless, since they make abundant use of computational and inferential resources. Accordingly, it is legitimate to suspect computer simulations of providing unduly complex explanations, which have simpler versions yielding the expected accessible understanding.
Nevertheless, this philosophical stance may be inappropriate in many cases. There is a strong suspicion that explaining phenomena often requires using an irreducible amount of resources. This idea of computational irreducibility has been vocally advanced, though not clearly defined, by Wolfram [34.155], and philosophers have toyed with closely related intuitions in recent discussions about emergence [34.74, 123, 124, 160–162]. Capturing the idea in a clear, robust and fruitful definition is a difficult ongoing task [34.163]. However, there seems to be an agreement that this intuitive notion is not empty, which is what matters for the purpose of the present discussion. Overall, this means that in all such cases, asking for computationally simple explanations does not make sense, since such explanations do not exist. In this perspective, tailoring our explanatory ideals to our human capacities is wishful thinking, since in many cases, the inaccessibility of the usual epistemic benefits of explanations does not stem from our epistemic shortcomings.
This suggests that we may have to bite the bullet and say that, sometimes, computer simulations do bring full-fledged explanation and objective understanding, even if, because of our limited cognitive capacities, we cannot enjoy this understanding and the epistemic benefits harbored by such explanations. In other words, both of the above philosophical options are correct, though in different cases.
Ideally, one would like to be able to know when each of these two options should be adopted. Unfortunately, determining whether a computational process can be shortcut, or a computational problem solved by quicker algorithms, seems to be opaque in practice (the problem of the lack of transparency of the optimality of the computational process). This means that in most cases, when facing a computational explanation of a phenomenon, one does not know whether there are computationally or inferentially shorter versions of this explanation (and we are to be epistemically blamed for being so explanatorily laborious), or whether one cannot do better (and the process is intrinsically complex).
Overall, because determining whether explanations are informationally minimal (regarding the use of relevant information) and whether arguments or computations are optimal is opaque, computer simulations are doomed to remain shrouded in suspicion about their explanatoriness, even in cases in which there is no better (that is, shorter or less informationally replete) explanation. In brief, the era of suspicion regarding the explanatoriness of computer simulations will not end soon.
34.4.5 Bypassing the Opacity of Simulations
Even when computer simulations are epistemically opaque, some strategies can be tried to regain predictive power, control, and potentially understanding regarding the corresponding inquiries.
Understanding, Control and Higher Level Patterns
As emphasized by Lenhard [34.159], by manipulating computational models and observing which behavior patterns are obtained, scientists can try to control the processes involved and develop “a feeling for the consequences.” Lenhard suggests that this understanding by control, which is oriented toward design rules and predictions, corresponds to a pragmatic account of understanding, which is also involved in the building of reliable technological artifacts.
Other authors have emphasized that, even if the details of computer simulations cannot be followed by human minds, one may sometimes still obtain valuable insights by building coarse-grained representations of the corresponding target systems and analyzing whether macro-dynamics emerge when microinformation is thrown away [34.164]. Surprisingly, the existence of coarse-grained dynamics seems to be compatible with complex, potentially computationally irreducible, dynamics at the microlevel [34.165, 166], even if this by no means warrants that control or understanding can always be regained at the macro-level. Thus, the question arises as to when and how much epistemic virtues like predictive power, control, and potentially understanding, which are somewhat lost at the microlevel, can be partly recovered at the macro-level, and how the corresponding patterns can be detected. The treatment of such questions requires the analysis of logical and mathematical relations between descriptions of systems at different scales and, for this reason, it should benefit from ongoing debates and research in the philosophical
and scientific literature about the emergence of simple behavior in complex systems.
Visualization and Understanding
Another important issue is how to exploit macro-level patterns that are present in computer simulations to restore partial cognitive grasp of the simulated systems by humans. Given the type of creatures that we are, and in particular the high visual performance of the brain, using visual interfaces can be part of the answer. Indeed, the format of scientific representations partly determines what scientists can do with them, whereas, as emphasized by [34.41, p. 95], philosophers have often considered the logical content of a representation to be the only important element to analyze. To go further into these issues, sharp analyses of representational systems and their properties are required. Tools and concepts developed in the Goodmanian tradition prove to be extremely useful [34.167]. For example, Kulvicki [34.29] highlights how much graphs and images can present information more immediately than symbolic representations can. This notion of immediacy is cashed out in terms of semantic salience, syntactic salience or extractability. Vorms further shows how taking into account formats of representation in the analysis of scientific reasoning is crucial, since inferences have different cognitive costs depending on the format of representation [34.168]. Jebeile [34.169] applies similar concepts to computational models and argues that visualization tools can have a specific explanatory role since they do not merely present computational data in more accessible ways, but also suggest interpretations that are not contained in the original data, highlight relations between these data, and thereby point at elements of answers to what-if questions.
Overall, the issue of how much visualization can convey objective understanding remains debated. For example, Kuorikoski [34.164] acknowledges that visual representations are cognitive aids but emphasizes that they often merely bring about a feeling and illusion of understanding. So, there is a need for epistemological analyses which would make clear in which cases, and how, visual representations can be reliable guides and self-certifying vectors of knowledge, which partly enable their users to determine whether and how much they should trust them.
34.4.6 Understanding and Disciplinary Norms
All the above discussion has been based on general arguments about explanations and understanding. However, as already emphasized, explanatory norms sometimes differ from one field to another, economics being, at least in its mainstream branches, a paradigmatic case of a field in which simulation methods are shunned [34.37]. Similarly, the explanatory status of computer simulations and computational models varies across fields like cognitive sciences, artificial intelligence [34.137], artificial life [34.170] or within fields themselves (see, e.g., [34.171] for the case of computational chemistry and [34.79] for that of climate science). This is not the place to discuss whether these variations regarding explanatory norms are deep, or whether they result from differences in theoretical contexts, in the degrees of complexity of the systems investigated, in the difficulty of collecting evidence about them, in the scientific maturity and empirical success of these fields, etc. Such questions cannot be answered on the basis of armchair investigations. Field-specific studies of the explanatoriness of computer simulations, made by scholars who are at the same time acutely aware of present discussions about scientific explanation, are needed.
experiments as the source of primary evidence upon which science is built (Sect. 34.5.4). In any case, discussions about the relationships between experiments and computer simulations should remain compatible with the actual existence of hybrid (both computational and experimental) methods (Sect. 34.5.5).
When in the 1990s philosophers of science started investigating computer simulations, they soon realized that the object of their inquiry cross-cut traditional categories like those of theories, models, experiments or thought experiments. Similarities with experiments were particularly striking, since, among other things, computer simulations involved the treatment of massive data and statistical reasoning, required robustness analysis, and were claimed to yield new knowledge. As a result, computer simulations were suggestively dubbed by various authors as computer experiments, numerical experimentation or in-silico thought experiments, even though it was not always conceptually clear what these potentially metaphorical characterizations meant exactly.
All such similarities are worth analyzing and potentially call for explanations. They may be the sign of an identical nature between (some of) these activities, of common essential features, or may just be shallow or fortuitous. Clarifying this issue is also a way to analyze these activities more acutely by singling out what is specific to each or common to them and to determine to what extent epistemological insights can be transferred between them.
34.5.1 Computational Mathematics and the Experimental Stance
Experimental Proofs in Mathematics
Since aspects related to the representation of material systems are absent from mathematics, a comparison with this field may prove fruitful for analyzing what exactly is experimental in computational science.
The mathematical legitimacy of computers for the production of proofs has been discussed for several decades. Computational proofs like that of the four-color theorem by Appel et al. [34.172, 173] were rapidly labeled quasi-empirical and discussions raged about how they should be interpreted [34.174, p. 244]. Such computational proofs can actually be seen as having roots in the older tradition of quasi-empirical mathematics, practiced for example by mathematicians like Euler, and philosophically defended by authors like Lakatos [34.175] or Putnam [34.176]. Interestingly, even in these contexts, the labels empirical or experimental were used to refer to various aspects of the activity of proving results.
Like experiments, computational proofs involve external processes, which are fallible. Their reliability can then be seen as being partly of a probable nature and needs to be assessed a posteriori by running these external processes several times and checking that the apparatus involved worked correctly. By contrast, proofs which can be actively and directly produced by human minds can provide a priori knowledge, the validity of which is assessed by (mentally) inspecting the proof itself, qua mathematical entity. Further, computational proofs, like experiments and empirical methods in mathematics, usually provide particular numerical results: as the computational physicist Keith Roberts writes, “each individual calculation is [...] analogous to a single experiment or observation and provides only numerical or graphical results” (quoted in [34.70, p. 137]). Therefore, to obtain more general statements (and possibly theories), probabilistic inductive steps are needed. Overall, such debates illustrate the need to clarify the use in this context of labels like experimental or empirical.
The Experimental Stance
The case of computational mathematics also makes clear how scientists can adopt an experimental stance for inquiries where no physical process is investigated, and the nature of the object which is experimented upon is completely known.
Experimenting involves being able to trigger changes, or to intervene on material or symbolic dynamical processes, and to record how they vary accordingly. As noted by Dowling [34.136, p. 265] and Jebeile [34.169, II, §7.2], processes for which the dynamics is known can also work as black boxes, since the opacity of the process may stem either from our lack of knowledge about its dynamics, or from the mathematical unpredictability (or epistemic inaccessibility) of its known dynamics. In this perspective, contrary to Guala [34.177], being a black box is not a specific feature of experiments.
Finally, when experimenting on a material or formal object, it is better that interactions with the object be made easy and the results be easily accessible to the experimenters (e.g., by means of visual interfaces) so that tinkering is made possible [34.136] and intuitions, familiarity, and possibly some form of understanding [34.159, 169, III] can be developed.
34.5.2 Common Basal Features
Some similarities of computer simulations and experiments (and thought experiments) may be accounted for by highlighting common basal features of these activities, which in turn account for the existence of their common epistemological features, such as the shared
760 Part G Modelling and Computational Issues
concerns of practitioners of experiments and computer simulations for “error tracking, locality, replicability, and stability” [34.70, p. 142]. In this perspective, one should characterize the nature and status of these common basal features.
Role or Functional Substitutability
Though computer simulations, thought experiments and experiments are activities of different types, they can sometimes be claimed to play identical roles. Typically, computer simulations are used to gain knowledge about how physical systems behave (hereafter behavioral knowledge) when experiments are unreliable, or carrying them out is politically or ethically unacceptable [34.41, p. 107]. Importantly, acknowledging that computer simulations can sometimes be used as substitutes for experiments by no means implies that they can play all the roles of experiments (Sect. 34.5.4). Further, one should be aware that, at a high level of abstraction, all activities may be described as doing similar things; therefore, these shared roles should in addition be shown to have nontrivial epistemological implications. For example, one may argue that providing knowledge or producing data are roles that are played by computer simulations, thought experiments, or experiments. However, this may be seen as some partially sterile hand-waving. Indeed, this points at an overly abstract similarity if these activities produce items of knowledge of totally different types, and nothing epistemologically valuable can be inferred from this shared characterization (see [34.81] for a presentation of the different types of knowledge involved in science).
El Skaf and Imbert [34.87] take an additional step when they claim that these activities can in certain cases be functionally substitutable, that is, that we can sometimes use one instead of the other for the purpose of a common inquiry – which remains compatible with the fact that these activities do not play the roles in question in the same way, that they come with different epistemic credentials, provide different benefits, and therefore, as role holders, are not epistemologically substitutable. El Skaf and Imbert, in particular, claim that computer simulations, experiments, and thought experiments are sometimes used for the purpose of unfolding scenarios (see also Hughes’ notion of demonstration in Sect. 34.6.1) and argue that investigations concerning the possibility of a physical Maxwellian demon were indeed pursued by experimental, computational and thought experimental means. The existence of such common roles then provides grounds for analyzing similarities in the epistemological structure of the corresponding inquiries.
Morrison [34.178] goes even further since she argues that some computer simulations are used as measuring instruments and therefore that they have the same epistemic status as experimental measurements. She first claims that models can serve as measuring instruments, and then shows that this role can be fulfilled in connection with both computer simulations and experiments, which are similarly model-shaped. An important part of her strategy is to relax the conditions for something to count as an experiment, by discreetly giving primacy, in the definitions of scientific activities, to the roles which are played (here measuring) and by downplaying the importance of physical interactions with the investigated target systems in the definition of experiments (which are simply seen as a way to perform this measuring role). Giere’s rejoinder denies the acceptability of this strategy, and follows the empiricist tradition, when he claims that “a substitute for a measurement is not a measurement, which traditionally requires causal interaction with the target system” [34.179, p. 60]. Indeed, the potential additional pay-offs of experiments, as primary sources of radically new evidence, come from these causal interactions. Accordingly, their specificity does not come from their roles, qua information sources (since thought experiments, models, or theories are also information sources), but from the type of epistemological credentials that come with the corresponding information, and ground our ultimate scientific beliefs. A different nonempiricist epistemology might be developed, but the bait must then be swallowed explicitly, and it must be explained why such an epistemology, in which activities are exclusively individuated on the basis of their function and the importance of other differences is downplayed, should be preferred. In any case, an account of how to individuate these functions would be needed, since at a high level of abstraction, various activities can be seen as performing the same function.
Beyond Anthropocentric Empiricism
To practice science, humans need to collect observations and make inferences. Since human capacities are limited, various instruments have been developed to extend them and these instruments have been partly computational for decades. These parallel developments of observational and inferential capacities come with common epistemological features. In both cases, restricted empiricism, which gives a large and central role to human sensorial or inferential capacities in the description of how scientific activities are carried out, is no longer an appropriate paradigm to understand scientific practices. Indeed, the place of human capacities within modern science needs to be reconsidered [34.8, 41, 180]. Further, the externalization of observations and inferences comes at the price of some epistemic opacity and passivity for the practitioner, since, as humans, we
Computer Simulations and Computational Models in Science 34.5 Comparing: Computer Simulations, Experiments 761
no longer consciously carry out these activities. Instead we simply state the results of experimental or computational apparatus. However, this also comes with gains in objectivity since observational and informational procedures are now carried out by external, transparent and controlled apparatus, which no longer have hidden psychological biases or commit fallacies.
The development of computational instruments and computer simulations also raises similar epistemological problems. For example, the apparently innocuous notion of data seems to raise new issues in the context of computational science. Computer simulations, like models, have been claimed to be useful to probe physical systems and to be used as measuring instruments [34.178]. Whatever the interpretation of such statements (Sect. 34.5.3), it is a fact that both computer simulations and computational instruments provide us with data, which raises transversal questions.
A datum is simply the value of a variable. It can be taken to describe a property of any object. In this simple sense, data coming from experiments and computer simulations can play a similar role by standing for the properties of some target system within some representational inquiries. Furthermore, in both cases, their interpretation usually involves heavy computational treatments. In particular, mathematical transforms of various types serve to separate information from noise, remove artifacts, or recover information about a system property out of intertwined causal signals, as in computed tomography imaging techniques [34.121]. From this point of view, as emphasized by Humphreys [34.181], here one departs from a principle frequently used by traditional empiricists, according to which “the closer we stay to the raw data, the more likely those data are to be reliable sources of evidence.”
At the same time, there are different types of epistemological data, and the need for their common study should not introduce confusion in their understanding. In science, one seeks to determine how much data reliably stand for their target, and which properties exactly they refer to. Humphreys’s remark about the computational treatment of data, reproduced above, highlights the fact that causal information concerning the source is crucial to treat and interpret data and to determine what empirical content they bring about this source (this is the inverse inference problem), given that data do not wear on their sleeves details of how they were produced. From this point of view, experimental and computational data have utterly different causal histories – so what gives its sense to the computational treatment is potentially of a different nature [34.91, 121]. Overall, more pointed comparative analyses of data obtained by computer simulations and computational instruments are still to be carried out, to understand their semantics and epistemology and highlight both their nonaccidental similarities and specific differences (see [34.182] for the case of computational instruments).
Computational science must also face the challenge of data management. While the steps of traditional mathematical proofs and arguments, once produced, can be verified by scientists, things are usually different for computer solutions, even if they are merely executions of computational programs [34.91], or arguments [34.72]. Details of computer simulations are in general not stored since this would require too much memory (even if, in some cases like the Millennium Run, scientists may decide to keep track of the evolution of the computer simulation). In other words, like experimental science, computational science involves choosing which data to keep track of, developing powerful devices to store them, finding appropriate ways to organize them, providing efficient interfaces to visualize, search, and process them, and, more generally, developing new methods to produce knowledge from them. This also raises questions about how these data can or should be accessed by the scientific community, and which economic model is appropriate for them [34.183]. In brief, the epistemology of computer simulations here meets that of big data [34.184, 185], even if it cannot be assumed that ongoing debates and analyses about the latter, because they are mostly focused on questions raised by empirically collected data, will naturally apply to, or be insightful for, the corresponding problems raised by computer simulations.
Different Activities, Similar Patterns of Reasoning
As noted by Parker [34.186], strategies developed to build confidence in experimental results, and described in particular by Allan Franklin, seem to have close analogs for the justification of results generated by computer simulations. Indeed, the interpretation of the results of computer simulations as evidence for hypotheses about physical systems can sometimes be made through an error-statistical perspective [34.187] as in the case of experiments [34.188].
Similar patterns of reasoning are also used to argue in favor of the existence of specific mechanisms or entities on the basis of patterns within data, modes of visualizations of these patterns, or our ability to manipulate the actual or represented systems and find pattern regularities in their behavior (see [34.71] for a description of the homomorphic tradition, in which visual forms are given much importance, in contrast to the homologic tradition, which relies more on logical relationships). More generally, visualization techniques, aimed at facilitating reasoning about results present in
large databases, are crucial in the case of both experiments and computer simulations (Sect. 34.4.4).
Importantly, these similarities may have different explanations. For example, they may simply stem from the need to treat massive amounts of data by efficient standard procedures, or be a consequence of features shared by experimental and computational data, independently of their quantity, like the presence of noise, or may correspond to the application of general types of evidential or explanatory arguments to data having different natures.
The Reproducibility of Results
Reproducibility is a typical requirement for experiments, though it is one that is sometimes difficult to achieve because of the tacit knowledge involved in the carrying out of experiments [34.189]. Similar problems may arise with computer simulations. Even if the latter are nothing more than computations and are in principle reproducible, in practice reproducibility may sometimes be difficult, especially in the context of big science. For example, computer simulations may be too big to be reproduced (all the more since scientists have in general little incentive to reproduce results). Numerical codes may not be public (because they are not published or shared), and many of the computational details may be left tacit. Finally, computer simulations involving stochastic processes may not be exactly reproducible because the random numbers come from external physical signals or because the details of the pseudorandom number generator are not made public.
Experimenters’ and Simulationists’ Regresses
Good scientific results are usually expected to be robust against various changes [34.190], in particular those related to implementation or material details, and this is why failure of exact reproducibility should not be a worry.
Still, when one faces an inability to reproduce a result, the problem may arise from a lack of robustness or flaw in the original experiment or computer simulation, or from a failure to reproduce it correctly. Accordingly, as emphasized by Gelfert [34.191], computer simulations are affected by a problem similar to that of the experimenter’s regress [34.192], which arises when, to determine whether an experimental apparatus is working properly, scientists have no criterion other than the fact that it produces the expected results. As noted by Godin and Gingras [34.193], regresses like that highlighted by Collins are instances of well-known types of arguments already analyzed in the framework of ancient skepticism (more specifically, regresses or circular relations regarding justification). As such, they are specific neither to experiments nor to computer simulations – even if solutions to these problems, such as those described by Godin and Gingras or Franklin [34.194], may be partly activity-specific. In any case, adopting a general comparative perspective provides a way to analyze more acutely what is epistemologically specific or common to scientific activities.
34.5.3 Are Computer Simulations Experiments?
Some authors go as far as claiming that, at least in some cases, what we call computer simulations are in fact experiments. In this perspective, Monte Carlo methods, sometimes labeled Monte Carlo experiments or Monte Carlo simulations, seem to be a philosophical test case (like analog simulations, Sect. 34.2.2). Such methods are used to compute numbers (e.g., pi), sample target distributions or produce dynamical trajectories with adequate average properties. They rely crucially on the use of randomness [34.8, 72]. They may look closer to experiments because they sometimes use physical systems, like a Geiger counter, to generate random events. Still, Beisbart and Norton claim that Monte Carlo methods are not experiments, since randomizers can be replaced by computer codes of pseudorandomizers [34.72, p. 412]. This shows that these computer simulations do not require contact with the randomizer as an external object; therefore no direct empirical discovery about the nature of physical systems can be made by them and they should not be seen as having an experimental nature. In brief, in Monte Carlo simulations, the physical systems involved are simply used as computers to generate mathematically random sequences.
Beyond the analysis of specific cases, some authors have defended the bolder claim that all computer simulations are experiments (what Winsberg calls the identity thesis [34.195, §5]). While this goes against inherited scientific common sense (computations are not experiments!), the claim should be carefully examined. Indeed, in principle there is no impossibility here: while computations, logically defined, are not experiments, we need physical machines to carry them out. Therefore, in the end, computers, instruments and experimental systems are physical systems that we use for the purpose of doing science – and it all boils down to how we conceptualize in a coherent and fruitful way these external worldly activities. In brief, perhaps, after all, we would be better off revising our epistemological notions so that computer simulations are seen as genuine examples of experiments – a revisionary position with regard to the empiricist tradition since it ignores the specificity of experiments as primary evidential sources of knowledge.
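As a purely illustrative aside, which is not drawn from any of the works cited, the Monte Carlo case and the earlier remarks on the reproducibility of stochastic simulations can be made concrete with a short Python sketch. The function name estimate_pi, the sample size, and the seed value are assumptions chosen for the illustration. The sketch estimates pi by random sampling, once with a seeded pseudorandom generator and once with a generator fed by operating-system entropy, which here merely stands in for an external physical randomizer.

```python
import math
import random


def estimate_pi(n_samples, rng):
    """Estimate pi by sampling points uniformly in the unit square.

    The fraction of points falling inside the quarter disc of radius 1
    approximates pi / 4. `rng` is any object with a random() method.
    """
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n_samples


N = 50_000      # illustrative sample size (assumption)
SEED = 12345    # arbitrary seed (assumption)

# Pseudorandomizer: the whole sequence is fixed by the seed,
# so two runs with the same seed repeat the same computation.
run_a = estimate_pi(N, random.Random(SEED))
run_b = estimate_pi(N, random.Random(SEED))

# Randomizer fed by operating-system entropy: it cannot be re-seeded,
# so the exact run cannot be replayed, only statistically matched.
run_c = estimate_pi(N, random.SystemRandom())

print("seeded runs identical:", run_a == run_b)
print("seeded estimate error:", abs(run_a - math.pi))
print("entropy-based estimate error:", abs(run_c - math.pi))
```

Under these assumptions, the two seeded runs coincide exactly, whereas the entropy-based run can only be expected to agree with them within statistical error. Both kinds of generator serve the estimate equally well, which is the sense in which the randomizer can be swapped for a pseudorandomizer, and exact reproducibility is available only when the generator and its seed are disclosed.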
In what follows, I review existing arguments in favor of the claim that computer simulations are experiments, and how these arguments have been criticized. Overall, as we shall see, in contrast to what is claimed in [34.195, §5], it is very dubious that discussions about the identity thesis are simply a matter of perspective and where the emphasis is placed. A minute, conceptually rigorous, and sharp treatment of this question can be found in [34.53, 72, 91, 196] and [34.169, Chap. 7].
Problems with Analyses in Terms of Common Physical Structures
Some authors analyze computer simulations as manipulations of physical systems (the computers), which instantiate or realize models that are also instantiated or realized by the investigated physical systems.
Norton and Suppe [34.114] are good representatives of this tradition. They first try to describe formal relations between what they call a lumped model, the structure of the target system, and the programmed computer, which is supposed to embed the lumped model. They further argue that these relations account for the experiment-like character of computer simulations: instead of experimenting on real systems, computer simulations are used as physical stand-ins or analogs to probe real-world phenomena, and one thereby learns things about the represented systems. This suggestive position has charmed various authors. It also has similarities with accounts of scientific representation made in terms of similarity [34.28], isomorphism, or weaker relationships between the representation and the target system [34.197, 198], even if the authors that defend the above view have not so far adopted this line of argument.
However, in the case of computer simulations, this view does not seem to resist close scrutiny, for reasons specific to computational activities. While in the case of analog simulations both the represented system and the analog computer instantiate a common mathematical structure (Sect. 34.2.2), such a claim cannot be made for digital computers. The general idea is that steps of computational processes are multiply realizable and that, conversely, how physical states of computers are to be interpreted is contextual and partly arbitrary [34.4]. It is true that for every step of a computation to be carried out in practice, one needs to use a physical machine that can be seen as instantiating the corresponding transition rule. However, physically different machines can be used to carry out different parts of a computation (for example when the computation is distributed). Furthermore, even if a single machine is used, different runs of the program will correspond to different physical processes, since the computer may process several tasks at the same time and contextually decide how its memories are organized, and even within the same computation, a single part of the memory may be used at different steps to code for different physical variables [34.91, pp. 564–566], [34.196, pp. 81–84]. Overall, in the general case, the relation between the physical states of the represented target system and the physical states of the computer(s) that may be used to simulate its behavior is a many-to-many one, and the idea that the phenomenon is recreated in the machine “is fundamentally flawed for it contradicts basic principles of computer architecture” [34.196, p. 84]: in the case of a successful computer simulation, one can simply say that every step of the computation has been carried out by some appropriate physical mechanism, but there is no such thing as a computer instantiating the structure of the model investigated. (Note that the argument based on multiple realizability is in the spirit of those originally developed by Fodor [34.126] in his discussion of the reduction of the special sciences).
Problems with Common Analyses in Terms of Intervention or Observation
Computer simulations have also been claimed to qualify as experiments “in which the system intervened on is a programmed digital computer” [34.199, p. 488], or to involve observations of the computer as a material system [34.114, p. 88]. Winsberg even goes as far as to claim that [34.195]
“nothing but a debate about nomenclature [...] would prevent us from saying that the epistemic target of a storm simulation is the computer, and that the storm is merely the epistemic motivation for studying the computer.”
Such claims can be answered along the same lines as the previous argument. There is of course no denying that when one runs a computer simulation one interacts with the interface of the computer, which triggers some physical change in the computer so that the right computation is carried out. Similarly, once the computation is finished, the physical state of the memory in which the result is stored triggers a causal mechanism that produces changes in the interface so that the result can be read by the user. However, the definition of an intervention at the model level does not determine a specific intervention at the physical level of the computer. The reason is that, as emphasized above, even within the same computational process, the way that the intervened model variable is physically represented in the computer may vary, and how the computer, qua physical system, evolves precisely may depend on various parameters such as the other tasks that it carries out at the same time. In brief, the idea that actual computer simulations, defined at the model level, could be seen as the investigation of the computer, qua physical machine, which is used to carry them out, seems to be riddled with insuperable difficulties.
Finally, one should mention that epistemic access to the physical states of the computer corresponding to the successive steps of a computation is usually not possible in practice [34.196, p. 81].
Problems with General Analyses in Terms of Epistemological and Representational Structure
Some authors have also argued that computer simulations and experiments share an epistemological structure, or epistemological aspects, and have used this claim to justify the identity thesis.
For example, it has been claimed that in both cases one interacts with a system to gain knowledge about a target system, and the internal and external validity of the processes needs to be checked. This type of analysis stems from a 2002 paper by Guala [34.177] in which he presents a laboratory experiment in economics aimed at investigating behavioral decision making by giving decisional tasks to real human subjects in the laboratory. In this case, a hypothesis about how agents behave in the laboratory is investigated (internal validity hypotheses); then, based on similarities between the experimental situation and the real-life situation, an external hypothesis is made about the behavior of agents in real-life situations (external validity hypothesis). The notion of internal validity comes from social science and corresponds to the (approximate) truth of inferences about causal relationships regarding the system that is experimented on. External validity corresponds to the (approximate) truth of the generalization of causal inferences from an initial system, for which internal validity has been demonstrated, to a larger class of systems. Guala further claims that both computer simulations and experiments fit this epistemological description in terms of internal and external validity arguments, but cautiously concludes that their “difference must lie elsewhere” [34.177]. According to him, computer simulations and experiments are different, since in the latter case there is a material similarity between the object and the target, whereas, in the former case, there is a formal similarity between the simulating and the simulated systems (a claim which seems to be falling
in the type of background knowledge that researchers use to justify the external validity hypothesis [34.113, p. 587], a position which is again revisionary with regard to the empiricist tradition if this is the only specificity ascribed to experiments.
A serious worry is that describing the investigation of the computational model in terms of internal validity is problematic and artificial, since, as can be seen above, computer simulations cannot be considered as investigations of the causal behavior of the computer, qua physical system. For the same reason, the use of the notion of external validity is inappropriate, since for computer simulations inferences about the target system do not involve the generalization of causal relations taking place in the computer to other systems by comparing their material properties but involve the representational validity of the computational model.
A final problem is that the characterization of the methodology of experimental studies in terms of internal and external validity, though useful in the social sciences, is not a general one. Using it as an accepted general framework to compare experiments and computer simulations looks like a hasty extrapolation of the case of laboratory experiments in experimental economics, not to mention the fact that economics may be seen as a bold pick to build a general conceptual framework for experimental studies.
It is true that in experiments, the measured properties are often not the ones that we are primarily interested in and the former are used as evidence about these latter target properties. Typically, vorticity in turbulent flows is difficult to measure directly, and is often assessed by measuring velocity, based on imaging techniques. In more complex cases, the properties measured can be seen as a way to observe different and potentially remote target systems, as is vividly analyzed by Shapere with his case study of the observation of the core of the Sun by the counting of ³⁷Ar atoms in a tank within a mine on Earth [34.180]. Importantly, in all such cases, the measuring apparatus, the directly measured property, and the indirectly probed target system are related by causal processes. The uses of the collected empirical information then vary with the type of inquiry pursued. The evidence may be informational about the physics of a particular system, like the Sun. Or, it may be used to confirm or falsify theories (as in the case of the 1919 experiment by Eddington and the relativ-
under the above criticism directed at Norton and Suppe ity theory). In some cases, though by no means all, it
and their followers). may be used to draw inferences about the nature or be-
Guala’s conceptual description is endorsed by most havior of a larger class of similar systems – which are
authors who try to picture computer simulations as not related to the measured system by a causal rela-
some sort of experiment. For example, Winsberg ac- tionships. If this latter case of reasoning about external
cepts the description, but claims that the difference validity is taken as paradigmatic for experiments, and
between experiments and computer simulations lies the causal processes between the target experimented
Computer Simulations and Computational Models in Science 34.5 Comparing: Computer Simulations, Experiments 765
systems (the source) and the measuring apparatus (the [34.201] takes it for granted that “experiments are com-
receptors), which are present in all experiments, are monly thought to have epistemic privilege over simula-
considered as a secondary feature, experimental activi- tions” and claims that this is in fact a context-sensitive
ties are misrepresented. As Peschard nicely puts it “the issue. As we shall see, if one puts aside the question of
idea that the experiments conducted in the laboratory the specific role of experiments as the source of primary
are aimed at understanding some system that is outside evidence about nature, it is not clear whether the gen-
the laboratory is a source of confusion” [34.200]. Gen- eral version of the superiority claim has actually been
eral conceptual frameworks that do not introduce such defended, or whether a straw man is attacked.
confusion are however possible. For example, Peschard
proposes [34.200] Computer Simulations, Experiments and the
Production of Radically New Evidence
“to make a distinction between the target system,
Let us try to specify what the general superiority claim
that is manipulated in the experiment or represented
could be and how it has really been defended.
in the computer simulation, and the epistemic mo-
The obvious sense in which experiments may be su-
tivation, which in both cases may be different from
Part G | 34.5
perior is that they can provide scientists with primary
the target system ”
evidence about physical systems, which originate in in-
(see also the distinction between the result of the teractions with these systems, and cannot be the product
unfolding of a scenario and the final result of the inquiry of our present theoretical beliefs. It is unlikely that com-
in [34.87]). puter simulation can endorse this role. As Simon pithily
Overall, the common description provided by puts it, “a simulation is no better than the assumptions
Guala, and heavily relied upon in [34.113, 199] to sup- built into it, and a computer can do only what it is
port versions of the identity thesis can be defended only programmed to do” [34.12, p. 14]. From this perspec-
by squeezing experiments and computer simulations tive, experiments have the potential to surprise us in
into a straightjacket which misrepresents these activi- a unique way, in the sense that they can provide results
ties, is not specifically fruitful, and meets insuperable contradictory to our most entrenched theories, whereas
difficulties. a computer simulation cannot be more fertile than the
scientific model used to build it (even if computer sim-
Materiality Matters ulations can surprise us and bring about novel results,
Clearly, for both experiments and computer simula- see Sect. 34.3.4). This is what Morgan seems to have
tions, materiality is crucial. However, it does matter in mind when she emphasizes that “[N]ew behaviour
differently, and one does not need to endorse a version patterns, ones that surprise and at first confound the
of the identity thesis to acknowledge the importance profession, are only possible if experimental subjects
of materiality when claiming for example that, to un- are given the freedom to behave other than expected,”
derstand computational science, the emphasis should whereas “however unexpected the model outcomes,
be on computer simulations which can be in practice, they can be traced back to, and re-explained in terms of,
and therefore materially, carried out by actual sys- the model” [34.202, pp. 324–5]. In brief, experiments
tems [34.41, 91]. are superior in the sense that, in the empirical sciences,
For experiments, material details are relevant they can serve a function which computer simulations
throughout the whole inquiry when producing, dis- cannot.
cussing and interpreting results, their validity and their Roush [34.203] has highlighted another aspect re-
scope (especially if one tries to extrapolate from the in- garding which experiments can be superior to simula-
vestigated system to a larger class of materially similar tions. She first insists that we should compare the two
ones). By contrast, for computer simulations, material methods other things being equal, especially in terms of
details are important to establish the reliability of the what is known about the target situation. Then, in any
computation, but not beyond: only the mathematical case in which there are elements in the experimenter’s
and physical details of the investigation matter when study system that affect the results and are unknown,
discussing and interpreting the results of the computer we may still run the experiment and learn how the target
simulation and the reliability of the inquiry. system behaves; by contrast, in the same epistemic situ-
ation, the simulationist cannot build a reliable computer
34.5.4 Knowledge Production, simulation that yields the same knowledge. However,
Superiority Claims, and Empiricism when all the physical elements that affect the result are
known, a simulation may be as good as an experiment,
The question of the epistemic superiority of experi- and it is a practical issue to determine which one can in
ments over simulations has also been discussed. Parke practice be carried out in the most reliable way.
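Roush's contrast can be made concrete with a toy numerical sketch. The Python fragment below is only an illustration under stated assumptions: the function names (`true_system`, `run_experiment`, `run_simulation`), the linear response, the noise level, and the value of the hidden factor are all hypothetical and are not taken from [34.203]. The "experiment" samples a system that contains a factor the investigator does not know about, whereas the "simulation" can only encode the factors that the modeler has listed.

```python
import random


def true_system(x, hidden=0.5):
    """Hypothetical physical system: a known linear response plus a
    contribution from a factor the investigator may not know about."""
    return 2.0 * x + hidden


def run_experiment(x, trials=1000, noise=0.1, seed=0):
    """Interact with the system itself: the hidden factor shapes every
    reading whether or not the experimenter has identified it."""
    rng = random.Random(seed)
    readings = [true_system(x) + rng.gauss(0.0, noise) for _ in range(trials)]
    return sum(readings) / trials


def run_simulation(x, known_factors):
    """Compute the model's prediction: only factors that were written
    into the model can influence the result."""
    return 2.0 * x + known_factors.get("hidden", 0.0)


if __name__ == "__main__":
    x = 1.0
    print("experiment, hidden factor unknown:", run_experiment(x))
    print("simulation, hidden factor unknown:", run_simulation(x, {}))
    print("simulation, hidden factor known:  ",
          run_simulation(x, {"hidden": 0.5}))
```

In this sketch the averaged experimental reading stays close to the true behavior even when the hidden factor is unknown, while the simulation matches it only once that factor has been included in the model. This is the "other things being equal" comparison in miniature, and nothing more general should be read into it.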
Thus, for a quantitative comparison to be meaningful it should be related to roles which can be shared by experiments and computer simulations, such as the production of behavioral knowledge about physical systems, the relevant dynamics of which is known (Sect. 34.5.2).
Grounds for Comparative Claims
Scientists and philosophers have emphasized over the last decades that computer simulations are often mere simulations [34.177], the results of which should be treated with caution. As seen above, economists shun simulation; similarly, Peck states that evolutionary biologists view simulations with suspicion and even contempt [34.204, p. 530]. Nevertheless, however well advised these judgments may be, they cannot by themselves support a general and comparative claim of superiority in favor of other methods, but at most the claim that, in fields where other methods are successful and computer simulations have little epistemic warrant or face serious problems, these other methods will usually or on average be more reliable (exceptions remaining possible).
Some authors have discussed the comparative claim by analyzing the power of the types of inferences made to justify knowledge claims in each case. In [34.199], Parker adopts Guala’s description of experiments (resp. computer simulations) as having material (resp. formal) similarities with their target systems (see the discussion in Sect. 34.5.3) and studies the claim that inferences made on the basis of material similarities would have an epistemic privilege. (Guala does not seem to endorse a comparative claim. He argues that material similarities are a specific feature of experiments, implying that the prior knowledge needed to develop simulations is different from that needed to develop experiments.) Again, the common description in terms of internal and external validity regarding the inferences from one physical system to another gives the semblance of a new problem. However, if, as suggested above, the material properties of computers matter only in so far as they enable scientists to make logically sound computations, and no similarity between systems is involved, the grounds and rationale for this discussion between the properties of the computer and those of the target system collapse. A way to save the argument is to claim that the aforementioned formal similarities are simply those between the computational model and the target system, but then the question boils down to the much more familiar comparison between model-based knowledge (here extracted by computational means) and some type of experiment-based knowledge.
On what grounds could the general privilege of experiment-based behavioral knowledge then be defended? Since experiments and computer simulations are different activities, which are faced with specific difficulties, it is hard to see why computer simulations should always fare worse. Why could simulations based on reliable models not sometimes provide more reliable information than hazardous experiments? Indeed, it is commonly agreed that, when experiments cannot be carried out, are unreliable, or ethically unacceptable, computer simulations may be a preferable way to gain information [34.41, p. 107].
Justified Contextual Superiority Claims
Interestingly, superiority claims can sometimes be made in specific contexts. Morgan presents cases in economics in which a precise and contextual version of the superiority claim may be legitimate [34.202].
Like Guala, Morgan discusses laboratory experiments in economics, that is, purified, controlled, and constrained versions of real-world systems, which are studied in artificial laboratory environments (in contrast with field experiments, which “follow economic behavior in the wild” [34.202, p. 325]) and are aimed at investigating what is or would be the case in actual (nonsimplified) economic situations. Mathematical models can also be used for such inquiries and, in each case, scientists run the risk of describing artificial behaviors. Morgan then makes the following contextual claim that “any comparison with the model experiment is still very much to the real experiment’s advantage here” [34.202, p. 321] (my emphasis) on the grounds that, in this case, the problem of making ampliative analog inferences from laboratory system to real-world systems is nothing compared with the problem of the realism of assumptions for models exploring artificial models [34.202, pp. 321–322]. She does not justify this point further, but a plausible interpretation is that, in such cases, mathematical models necessarily abstract away essential parts of the dynamics of decision making, which arguably are preserved in experiments because of the material similarity between the laboratory and real agents. In brief, while material similarity plays a role in her argument, she does not make the general claim in the core of her paper that material similarity will always provide more reliable grounds for external validity claims than other methods (even if her formulation is less cautious in her conclusion).
Overall, such sound contextual comparative judgments require two premises: first that in some context computer simulations are not reliable (or have reliability r) and second that in the same context material similarities provide reasonably reliable inferences (or have reliability s > r). (Indeed, analogical reasoning based on material similarities, in which one reasons based on systems that are representative of or for larger classes of systems [34.127], can sometimes be a powerful way to make sound – though not infallible! – contextual inferences. As emphasized by Harré and Morgan, “shared ontology [...] has epistemological implications” [34.202, p. 323], since “the apparatus is a version of the naturally occurring phenomenon and the material setup in which it occurs” [34.205, pp. 27–8]. After all, different samples of the same substance obey the same laws, even if contextual influences may change how they behave and not every extrapolation is possible.)
34.5.5 The Epistemological Challenge of Hybrid Methods
Whether computer simulations and experiments are ontologically, conceptually, and epistemologically distinct activities or not, it is a fact that jointly experimental and computational mixed activities have been developed by scientists. Their study was pioneered by Morgan, who presents various types of hybrid cases in economics [34.206] and biomechanics [34.127]. For example, she reports different mixed studies aimed at investigating the strength of bones and carried out by cutting slices of bone samples, photographing them, creating digital 3-D images, and applying the laws of mechanics to these experiment-based representations. Morgan further attempts to provide a principled typology of these activities. This proves difficult because “modern science is busy multiplying the number of hybrids on our epistemological map” and because the qualities of hybrids “run along several dimensions” [34.127, p. 233]. Overall, sciences illustrate “how difficult it is to cut cleanly, in any practical way, between the philosopher’s categories of theory, experiment and evidence” [34.127, p. 232], and, we may add, computer simulations or thought experiments.
Should these hybrid methods lead philosophers to reconsider the conceptual frontiers between experiments and computer simulations? We can first note that their existence may be seen as a confirmation that the traditional picture of science, in which theoretical, representational or inferential methods on one hand and experimental activities on the other play completely different but complementary roles, is not satisfactory (Sect. 34.5.2). Then, if one grants that activities like experiments, thought experiments and computer simulations can sometimes play identical roles, it is no surprise that they can also be jointly used to fulfill them. Similarly, a group of four online players of queen of spades sometimes involves virtual players – but most people will be reluctant to see this as sufficient grounds for claiming that bots are human creatures.
In any case, these hybrid activities raise epistemological questions. What, if anything, distinguishes a computer simulation that makes heavy use of empirical data from a measurement involving the computational refinement of such data [34.53, 121]? How much should the results of these methods be considered as empirical? Overall, what type of knowledge and data is thereby generated (see [34.53] for incipient answers)?
simulations, qua formal tools (e.g., agent-based, CA [. . . ] Simulations are closely related to dynamic
models, equation-based simulations, etc.,) can be used models. More concretely, a simulation results when
in different epistemic contexts for different purposes, the equations of the underlying dynamic model are
and require totally different epistemological analyses. solved. This model is designed to imitate the time-
The case of CA-based computer simulations exempli- evolution of a real system. To put it another way,
fies the risk of too quick essentialist characterizations. a simulation imitates one process by another pro-
While it was believed that these models were appropri- cess. In this definition, the term process refers solely
ate for phenomenological simulations only [34.9, 135], to some object or system whose state changes in
their use in fluid dynamics has shown that they could time. If the simulation is run on a computer, it is
supply theoretical models based on the same underly- called a computer simulation.”
ing physics as traditional methods [34.100].
The following section is organized as follows. Ex- This definition has been criticized along the follow-
isting definitions and the problems they raise are pre- ing lines. First, as noted by Hughes [34.13, p. 130],
sented first, and then issues that a good definition of the definition rules out computer simulations that do
Part G | 34.6
computer simulations should clarify are emphasized. not represent the time evolution of systems, whereas ar-
guably one can simulate how the properties of models
34.6.1 Existing Definitions of Simulations or systems vary in their phase space with other param-
eters, such as temperature. Accordingly, a justification
Computer-Implemented Methods for the privilege granted to the representation of tempo-
As emphasized by Humphreys, a crucial feature of sim- ral trajectories should be found, or the definition should
ulations is that they enable scientists to go beyond what be refined, for example, by saying that computer simu-
is possible for humans to do with their native inferential lations represent successive aspects or states of a well-
abilities and pen-and-paper methods. Accordingly, he defined trajectory of a system along a physical variable
offered in 1991 the following working definition [34.7]: through its state space. Second, the idea that a specific
trajectory is meant to be represented may also have to be
“A computer simulation is any computer-imple-
abandoned. For example, in Monte Carlo simulations,
mented method for exploring the properties of
we learn something about average values of quantities
mathematical models where analytic methods are
along sets of target trajectories by generating a poten-
unavailable.”
tial representative of these trajectories, but the computer
This definition requires that we possess a clear defi- simulations are not aimed at representing any trajectory
nition of what counts as an analytic method, which is in particular. One may also want a computer simula-
not a straightforward issue [34.60]. Further, as noted tion to be simply informative about structural aspects of
by Hartmann et al. [34.10, pp. 83–84], it is possi- a system. Overall, the temporal dynamics of the simulat-
ble to simulate processes for which available models ing computer is a crucial aspect of computer simulations
are analytically solvable. Finally, as acknowledged by since it “enables us to draw conclusions about the be-
Humphreys, the definition covers areas of computer-as- havior of the model” [34.13, p. 130] by unfolding these
sisted science that one may be reluctant to call computer conclusions in the temporal dimension of our world,
simulations. Indeed, this distinction does sometimes but the temporal dynamics of the target system may not
matter in scientific practice. Typically, economists are have to be represented for something to count as a com-
not reluctant to use computers to analyze models but puter simulation.
shun computer simulations [34.37]. Since both com- Third, the definition is probably too centered on
putational methods and computer simulations involve models and their solutions [34.207], since it equates
computational processes, their difference must be either computer simulations with the solving of a dynamic
in the different types (or uses) of computations involved model that represents the target system. This is tan-
either at the mathematical and/or the representational tamount to ignoring the fact that describing computer
level. simulations as mathematical solutions of dynamic mod-
els is not completely satisfactory. What is being solved
One Process Imitating Another Process is a computational model (as in Humphreys’s defini-
Hartmann proposes the following characterization, tion [34.41], see below), which can be significantly
which gives the primacy to the representation of the different from, and somewhat independent of, the ini-
temporal evolution of systems [34.10, p. 83]: tial dynamic model of the system, which usually de-
rives from existing theories. Effectively, different layers
“A model is called dynamic, if it [. . . ] includes as- of models, often justified empirically, can be needed
sumptions about the time-evolution of the system. in-between [34.13, 97, 208]. For this reason, the repre-
Computer Simulations and Computational Models in Science 34.6 The Definition of Computational Models and Simulations 769
sentational relation between the initial dynamic model Computer Simulations as the Concrete
and the target system, and between the computational Production of Solutions to Computational
system and the target system, are epistemologically dis- Models
tinct. In order to answer problems with the previous defini-
Finally, the definition may be reproached for en- tions, Humphreys proposed in 2004 another definition
tertaining a recurrent confusion about the role of ma- of computer simulations, which is built along the fol-
teriality in computer simulations (Sect. 34.5.3), by lowing lines [34.41]. He defines the notion of a theoret-
describing the representational relation as being be- ical template, which is implicitly defined as a general
tween two physical processes, and not between the relation between quantities characterizing a physical
computational model and succession of mathemati- system, like Newton’s second law, Schrödinger’s equa-
cal states which unfold it (in whatever way they are tion, or Maxwell’s equations. A theoretical template
physically implemented and computed) and the target can be made less general by specifying some of its
system. variables. When the result is computationally tractable,
we end up with a computational template. (Thus, what
Part G | 34.6
Computer Simulations as Demonstrations qualifies as a computational template seems to depend
Hughes does not propose a specific definition of com- on our computational capacities at a given time.) When
puter simulations since he believes that computer sim- a computational template is given (among other things)
ulations naturally fit in the DDI account of scien- an interpretation, construction assumptions, and an ini-
tific representation that he otherwise defends [34.13, tial justification, it becomes a computational model.
p. 132]. According to the DDI, which involves de- Finally, Humphreys offers the following characteriza-
notation, demonstration, and interpretation as compo- tion [34.41, pp. 110–111]:
nents [34.13, p. 125]:
“System S provides a core simulation of an object
“Elements of the subject of the model (a physi- or process B just in case S is a concrete computa-
cal system evincing a particular kind of behavior, tional device that produces, via a temporal process,
like ferromagnetism) are denoted by elements of solutions to a computational model [. . . ] that cor-
the model; the internal dynamic of the model then rectly represents B, either dynamically or statically.
allows conclusions (answers to specific questions) If in addition the computational model used by S
to be demonstrated within the model; these conclu- correctly represents the structure of the real system
sions can then be interpreted in terms of the subject R, then S provides a core simulation of system R
of the model.” with respect to B.”
The demonstration can be carried out by a physical Another important distinction lies between the com-
model (in the case of analog simulations) or by a log- puter simulation of the behavior of a system and that
ical or mathematical deduction, such as a traditional of its dynamics [34.41, p. 111] since, even when the
mathematical proof, or a computer simulation. Further, computational model initially represents the structure
according to Hughes, in contrast to Hartmann’s ac- and dynamics of the system, the way its solutions are
count, “the DDI account allows for more than one layer computed may not follow the corresponding causal pro-
of representation” [34.209, p. 79]. Overall, a virtue of cesses. Indeed, in a computer simulation, the purpose
this account is that it emphasizes the common episte- is not that the computational procedure exactly mim-
mological structures of different activities by pointing ics the causal processes, but that it efficiently yields
at a similar demonstrative step, which excavates the the target information from which an appropriate dy-
epistemic content and resources of the model (see namic representation of the target causal processes can
also [34.210] for refinements, [34.87] for an analysis finally be built for the user. For reasons of computa-
which extends the idea of demonstration, or unfolding, tional efficiency, the representation may be temporally
to thought experiments and some types of experiments, and spatially dismembered at the computational level
and [34.72] for the related idea that computer simula- (e.g., by computing the successive states in a different
tions are arguments). While as a definition of computer order), as may happen with the use of parallel process-
simulation, Hughes’s sketchy proposal has somewhat ing, or of any procedure aimed at partially short cutting
been neglected (see however [34.208]) it is a legiti- the actual physical dynamics.
mate contender and it remains to be seen how much The space here is insufficient to analyze all the as-
a more developed version of is would provide a fruitful pects of the above definition and to do justice to their
framework for philosophical discussions about com- justification – all the more so since further compli-
puter simulations. cations may be required to accommodate even more
770 Part G Modelling and Computational Issues
complex cases [34.208]. Suffice it to say that this elab- they tend to describe computer simulations as involv-
orate definition, which is aimed at providing a synthetic ing a representational relationship between two material
answer to the problems raised by previous definitions, systems and to misconstrue how computers work (see
is one of the most regularly referred to in the literature. again Sect. 34.5.3). They thereby tend to misrepre-
sent the epistemological role of the physical properties
34.6.2 Pending Issues of computers and the fact that computational science
involves two distinct steps; one in which computer sci-
Simulating or Computing entists warrant that the computer is reliable and another
Giving a definition of computer simulations implies in which scientists use computations and do not need to
choosing which notions should be regarded as primi- know anything about computers qua physical systems.
tive and how to order them logically. Some authors first A way out of this deadlock may be to use a flexible
define the notion of simulation and present computer notion of simulation, which can be applied to relations
simulations as a specific type of simulations. For exam- between physical or logical–mathematical simulating
ple, Bunge first defines the notion of analogy, then that processes and the target simulated physical processes.
Part G | 34.6
of simulation, and finally that of representation, as sub- Then, the question remains as to what exactly is gained
relation of simulation. For him, an object x simulates (and lost) from an epistemological point of view by
another object y when (among other things) (1) there putting in the same category modes of reasoning of
is a suitable analogy between x and y and (2) the anal- such different types – if one puts aside the empha-
ogy is valuable to x, or to another party z that controls x sis on the obvious similarities with analog simulations,
(see [34.11, p. 20] for more details). which are a very specific type of computer simulation
A potential benefit of this strategy is that it be- (Sect. 34.2.2). Overall, it is currently far from clear
comes possible to unify in the same general framework whether this unificatory move should be philosophi-
various different types of analogous relations between cally praised.
systems such as organism versus society, organism ver-
sus automaton, scale ship versus its model, computer Abstract Entities or Physical Processes
simulations of both molecular and biological evolution, etc. Similarly, Winsberg [34.195, §1.3] suggests that the hydraulic dynamic scale model of the San Francisco Bay should be viewed as a case of simulation (see [34.211] for a recent presentation and philosophical discussion of this example in the context of modeling). While scale models can obey the same dimensionless equations as their target systems and be used to provide analog simulations of them, Winsberg’s claim is not uncontroversial and may require an extension of the notion of simulation. Indeed, the model and the Bay itself do not exactly obey the same mathematical equations. For example, distortions between the vertical and horizontal scales in the model increase the hydraulic efficiency, which requires adding copper strips and calibrating the model empirically. Therefore, this is not exactly a case of a bona fide analog simulation (Sect. 34.2.2) but of a complex dynamical representation between closely analogous systems. In any case, if one adopts such positions, it is then a small step to describe other cases of analogical reasoning between material systems (and possibly cases of experimental economics, in which the dynamics of the analogous target system is not precisely known and external validity is to be assessed by comparing the material systems involved) as cases of simulations (Sect. 34.5.3).

At the same time, unification is welcome only if it is really fruitful (and is, of course, not misleading). As seen above, the problem with such analyses is that

Arguably, computations are logical entities that can be carried out by physical computers. The question then arises whether computer simulations should also be seen as abstract logical entities, or as material processes instantiating abstract computations. Hartmann’s definitions present computer simulations as processes, whereas Humphreys’s definition is more careful in the sense that the computing systems simply produce the solution or provide the computer simulation. Clearly, to analyze computational science, it is paramount to take into account material and practical constraints, since a computer simulation is not really a part of our science and we have no access to its content unless a material system carries it out for us. At the same time, just as the identity of a text is not at the material level, the identity of a computer simulation (and the corresponding equivalence relationship between runs of the same computer simulation) is defined at the logical (if not the mathematical) level, and the physical computer simply presents a token of the computer simulation. From this point of view, the material existence of computer simulations and the in principle/in practice distinction emphasized by Humphreys [34.41] have epistemological, not ontological, significance, that is, they pertain to what we may learn by interacting with actual tokens of computer simulations [34.91, p. 573] but not to the nature of computer simulations. Similarly, the identity of a proof seems to be at the logical level, even if a proof has no existence
nor use for us unless some mathematician provides some token of it.

Success, Failure, and the Definition of Computer Simulations

A computer simulation is something that reproduces the behavior or dynamics of a target system. The problem with characterizations of this type is that they make computer simulation a success term: if a computer simulation mis-reproduces the target behavior, it is no longer a computer simulation. This problem is a general one for representations, but is especially acute for scientific representations (Frigg and Nguyen Chap. 3, this volume). Indeed, while anything in art can be used to represent anything else, scientific representations are meant to be informative about the natural systems they represent. This is one of their essential specificities and, arguably, a definition according to which any process could be described as a scientific computer simulation of any other process is not satisfactory. At the same time, one does not want something to be a computer simulation, or a scientific representation, based on whether it is scientifically successful and exactly mirrors its target (remember that, for some scientific inquiries, representational faithfulness is not a goal and may even impede the success of the investigation [34.212] and [34.23, Chaps. 1 and 3]).

An option is to say that something is a scientific representation if it correctly depicts what its user wants it to represent. However, this may raise a problem for computer simulations that were carried out and had subsequent nonintended uses, like the Millennium simulation. It may also raise a problem for fictions, which strictly speaking seem to represent nothing [34.25, p. 770].

Finally, failed representations, which do not represent what their producers believe them to depict, are also a problem. Representational inquiries can fail in many ways, and failures are present on a daily basis in scientific activity, from theories and experiments to models and simulations. For this reason, descriptions of scientific activities should be compatible with failure, especially if they are to account for scientific progress and the development of more successful inquiries. Indeed, it would be odd to claim that many of the computer simulations that scientists perform and publish about are actually not computer simulations. Further, whether a genuine computer simulation is carried out should be in general transparent to the practitioner, and this cannot be the case if computer simulation is defined as a success term and scientific failure is frequent (see also [34.213, pp. 57–58]).

Overall, a question is to determine where the frontier should lie between unsuccessful or failed computer simulations, and potential cases in which something that scientists believed to be a computer simulation actually is not. This in turn requires knowing how computer simulations can fail [34.92] and which failures are specific to them. In brief, one needs to be able to decide on a justified basis which failures disqualify something from being a computer simulation and which ones simply alter its scientific, epistemic, or semantic value. This analysis may also have to be coherent with analyses about how other types of scientific activities such as experiments and thought experiments can fail [34.214], especially when these activities play similar or identical roles.

An option to consider is that something is a computer simulation based on criteria that do not involve empirical success, and that it qualifies as an empirical success depending on additional semantic properties and on whether it correctly represents the relevant aspects of its (real or fictional) target system(s). This option is potentially encompassing enough (the scientifically short-sighted student can be said to perform a computer simulation), but discriminating between good and bad computer simulations is still possible. It is compatible with the fact that research inquiries are often open and scientists need not know in advance what in their results will have representational value in the end. Finally, it is also compatible with a different treatment of representational and implementation failures. Indeed, the possibility of being unsuccessful at the representational level is consubstantial to empirical inquiries and is in this sense normal. By contrast, an implementation failure is simply something that should be fixed. It corresponds to a case in which we did not manage to carry out the intended computation, whereas computing is not supposed to be a scientific obstacle, and we learn nothing by fixing the failure.

Natural, Intentional, or Social Entities?

A similar but distinct issue is to determine which type of objects computer simulations are, qua token physical processes carried out by computing devices, a question which is close to that of the nature of physical computers and is also related to that of the ontology of models (Gelfert Chap. 1, this volume).

Arguably, they are not simply natural objects which are defined by some set of physical properties and exist independently of the existence of the agents using them. Indeed, because computations can be multirealized and some runs of computations are built by patching together different bits of computation on physically different machines, it is unlikely that all computations can be described in terms of natural kind predicates (massively disjunctive descriptions not being allowed here) [34.126].
Further, for both computations and computer simulations, pragmatic conditions of use seem to matter. To quote Guala commenting on the anthropomorphism of Bunge’s definition (see above) [34.177, p. 61]:

“it makes no sense to say that a natural geyser simulates a volcano, as no one controls the simulating process and the process itself is not useful to anyone in particular.”

Indeed, even if any physical system can be seen as computing some (potentially trivial) functions (see below), not every physical object can be used as a (general) computer, and we may have to endorse a position along the lines of Searle’s notion of social objects [34.215], or of any analysis doing the same work: a physical object X counts as Y in virtue of certain cognitive acts or states out of which it acquires certain sorts of functions (here computing), given that such objects need to demonstrate appropriate physical properties so that they may serve these functions for us. A specificity of computer simulations is that, unlike entrenched social objects, such as cars or wedding rings, a small group of users may actually be enough for a physical system to be seen as carrying out a computer simulation. Thus, the evolution of a physical system (like a fluid) may count for some users as an analog computer, which performs a computer simulation, and for other users as an experiment, even if experiments and computer simulations are in general objects of different types, and this case is unlikely to be met in practice (Sect. 34.5.3).

In any case, what is needed for something to be used as a computer or a computer simulation is not completely clear. The physical process must clearly be recognized as instantiating a computer model. Control is useful but not necessarily mandatory (e.g., we may use the geyser to simulate a similar physical system, even if the geyser would not count as a controlled versatile analog computer). The possibility to extract the computed information is clearly useful, an issue that matters for discussions about analog and quantum computer simulations, and of course cryptography.

An alternative position is not to mention users in the definition and to claim that, pace the peculiar case of man-made computations (which may make heavy use of the possibility offered by multiple realizability, see Sect. 34.5.3), physical processes are the one-piece physical instantiations of running computer models (resp. computer simulations) and, as such, are computations (even if, sometimes, trivial ones). See [34.216] for a sober assessment of this pancomputationalist position. In this perspective, one may say that it is a practical problem to create artificial human-friendly computers which can in addition be controlled and the information of which can be extracted. While such positions may be palatable for those, like Konrad Zuse, Edward Fredkin and their followers [34.64, 217, 218], who want to see nature as a computer, it is not clear that such pancomputationalist theses, whatever their intrinsic merits for discussing foundational issues like the computational power of nature or which types of computers are physically possible, serve the purpose of understanding science as it is actually practiced.

An important distinct question is whether intentional or pragmatic analyses should also be endorsed regarding computational models and computer simulations, qua representational mathematical entities, that is, how much the intentions of users and conditions detailing how their use by scientists is possible should be part and parcel of their definitions. Arguably, a scientific model is not simply a piece of syntax or an entity which inherently and by itself represents, completely or partially, a target system in virtue of the mathematical similarities it intrinsically possesses with this system. In order to understand how scientific representations and computer simulations work and actually play their scientific role, their description may have to include captions, legends, argumentative contexts, intentions of users, etc., since these elements are part of what makes them scientifically meaningful units. Indeed, how one and the same mathematical model represents varies significantly depending on the inquiry, subject matter, and knowledge of the modelers. This is particularly clear in the case of computational templates, which are used across fields of research for different representational and epistemological purposes [34.41, §3.7], and which are scientific units at the level of which different types of theoretical and conceptual exchanges take place within and across disciplines [34.45]. Overall, this issue is not specific to computer simulations but can be raised for other scientific representations [34.23, 168, 219–221]. Thus, this point shall not be developed further.

Computer Simulations and Computational Inquiries

How should computer simulations be delineated? Computer simulations do not wear on their sleeves how they were built, how they contribute to scientific inquiries, how they should be interpreted, and how their results should be analyzed. Accordingly, authors like Frigg and Reiss distinguish between computer simulations in the narrow sense (corresponding to the use of the computer), and in the broad sense (corresponding to the “entire process of constructing, using and justifying a model that involves analytically intractable mathematics” [34.30, p. 596]). See also the distinction between the unfolding
of a scenario and the computational inquiry involving this unfolding at its core [34.87], or the description of how the demonstration activity is encapsulated in other activities in the DDI account of representation [34.13].

Whatever choice is made, there is a tension here. As underlined above, an analysis of the identity of scientific representations cannot rest only on the logical and mathematical properties of scientific models and their similarities with their physical targets, and indications about how these representations are to be interpreted cannot be discarded as irrelevant to the analysis of their nature and uses. At the same time, computer simulation, qua computational process, and the arguments that are developed by humans about it, are activities of different natures and play different roles. Therefore, an encompassing definition should not blur the specificities of the different components of computational inquiries (just as a good account of thought experiments should not blur that they crucially involve mental activities at their core and are part of inquiries also involving scientific arguments).

34.6.3 When Epistemology Cross-Cuts Ontology

Whatever the exact definition of computer simulations, it is clear that they are of a computational nature, involve representations of their target systems, and that their dynamics is aimed at investigating the content of these representations.

Importantly, whereas the investigation of scientific representations is traditionally associated with the production of theoretical knowledge, the nature of computer simulations does not seem to determine the type of knowledge they produce.

Clearly, computer simulations can yield theoretical knowledge when they are used to investigate theoretical models. At the same time, even if computer simulations are not experiments (Sect. 34.5.3), they produce knowledge which may qualify as empirical in different and important senses. As we have seen, computer simulations provide information about natural systems, the validity of which may be justified by empirical credentials rooted in interactions with physical systems for aspects as various as the origin of their inputs, the flesh of their representations of systems (see in Sect. 34.5.5 the examples by Morgan about the studies of the strength of bones), the calibration or choice of their parameters, or their global validation by comparison with experiments (Sect. 34.3.2). However, information about the dynamics represented cannot completely be of empirical origin, since it involves the description of general relations between physical states, and general relations cannot be observed.

From this point of view, computer simulations may be seen as a mathematical mode of demonstrating the content of scientific representations that is in a sense neutral regarding the type of content that is processed: empirically (resp. theoretically) justified representations in, empirically (resp. theoretically) justified information (or knowledge) out. This suggests that when analyzing and classifying types of scientific data and knowledge, the ways that they are produced and processed (experimentally or computationally) and where their reliability comes from (e.g., theoretical credentials or experimental warrants) are, at least in part, independent questions.

34.7 Conclusion: Human-Centered, but not Human-Tailored
in formalisms which made their symbolic manipulation possible for humans (hence the success of differential calculus), problems were selected in such a way that they could be solved by humans, results were retrieved in ways such that humans could survey or browse them, etc.

34.7.1 The Partial Mutation of Scientific Practices

The use of computers within representational inquiries has modified, and keeps modifying, scientific practices. Theorizing is easier and therefore less academically risky, even in the absence of well-entrenched backing-up theories; solutions to new problems become tractable and how scientific problems are selected evolves; the models which are investigated no longer need to be easily manipulated by human minds (e.g., CA are well adapted for computations, but ill-suited to carry out mental inferences [34.43]); the exploration of models is primarily done by computers, making mental explorations and traditional activities like thought experiments somewhat more dispensable [34.117] and [34.41, pp. 115–116]; the treatment of computational results, as well as their verification, is carried out by computational procedures; the storage of data, but also their exploration by expected or additional inquirers, are also computer based. Finally, the human, material, and social structure of science is also modified by computers, with a different organization of scientific labor, the emergence in the empirical sciences of computer-oriented scientists, like numerical physicists and computational biologists or chemists, or the development of big computational pieces of equipment and centers, the access to which is scientifically controlled by the scientific community (as for big experimental pieces of equipment).

34.7.2 The New Place of Humans in Science

Overall, the place and role of humans in science has been modified by computational science. Arguably, human minds are still at the center of (computational) science, like spiders in their webs or pilots in their spacecraft, since science is still led, controlled, and used by people. Thus, we are in a hybrid scenario in which we face what Humphreys calls the anthropocentric predicament of how we, as humans, can “understand and evaluate computationally based scientific methods that transcend our own abilities” [34.42, p. 134]. In other words, interfaces and interplays between humans and computers are the core loci from which computational science is controlled and its results skimmed by its human beneficiaries. More concretely, scientific problems still need to be selected; computational models, even if designed for computers, need to be scientifically chosen (e.g., CA-based models of fluids were first demonstrated to yield the right Navier–Stokes-like behavior by means of traditional analytic methods [34.43]); results of computer simulations, even if produced and processed by computers, need to be analyzed relative to the goals of our inquiries; and ultimately scientific human-sized understanding needs to be developed for new fundamental or applied scientific orientations to be taken.

34.7.3 Analyzing Computational Practices for Their Own Sake

Over the last three decades, philosophers of science have emphasized that in most cases computer simulations cannot simply be viewed as extensions of our theoretical activities. However, as discussed above, the assimilation of computer simulations with experimental studies is still not satisfactory. A temptation has been to describe the situation as one in which computer studies lie in between theories and experiments. While this description captures the inadequacy of traditional characterizations based on a sharp and exclusive dichotomy between scientific activities, it is at best a metaphor. Further, this one-dimensional picture does little justice to, let alone help one understand, the intricate and multidimensional web of complex and context-sensitive relations between these activities.

An alternative is to analyze computational models, computer simulations, and computational science for their own sake. Indeed, computer simulations clearly provide a variety of new types of scientific practices, the analysis of which is a problem in its own right. Importantly, this by no means implies that these practices require a radically new or autonomous epistemology or methodology. Similarly, mathematical and scientific problems can be genuinely independent, even when in the end they can be reduced by complex procedures to a set of known or solved problems. Indeed, the epistemology of computer simulations often overlaps piecewise with that of existing activities like theorizing, experimenting, or thought experimenting. Disentangling these threads, clarifying similarities, highlighting specific features of computational methods, and analyzing how the results of computer simulations are justified in actual cases is an independent task for naturalistic philosophers, even if one believes that, in principle, computer simulations boil down to specific mixes of already existing, more basic activities.
34.7.4 The Epistemological Treatment of New Issues

In practice, the analysis of computer simulations has raised philosophical issues which were not treated by philosophers before computational studies were taken as an independent object of inquiry, either because they were ignored or unnoticed in the framework of previous descriptions of science, or because they are genuinely novel [34.96, 124, 207]. This a posteriori justifies making the epistemological analysis of computational models and computer simulations a specific field of the philosophy of science. How much computer simulations will keep modifying scientific practices and how much their philosophical analysis will finally bring about further changes in the treatment of important issues like realism, empiricism, confirmation, explanation, or emergence, to quote just a few, remains an open question.

Acknowledgments. I have tried to present a critical survey of the literature with the aim of clarifying discussions. I thank the editors for giving me the opportunity to write this article and for being so generous with space. I also thank T. Boyer-Kassem, J.M. Durán, E. Arnold, and specifically P. Humphreys for feedback or help concerning this review article. I am also grateful to A. Barberousse, J.P. Delahaye, J. Dubucs, R. El Skaf, R. Frigg, S. Hartmann, J. Jebeile, M. Morrison, M. Vorms, and H. Zwirn for stimulating exchanges over the last couple of years about the issue of models and simulations and related questions. All remaining shortcomings are mine.

Various valuable review articles, such as (J.M. Durán: A brief overview of the philosophical study of computer simulations, Am. Philos. Assoc. Newslett. Philos. Comput. 13(1), 38–46 (2013), W.S. Parker: Computer simulation, In: The Routledge Companion to Philosophy of Science, 2nd edn., ed. by S. Psillos, M. Curd (Routledge, London 2013)), have recently been written about the issue of computer simulations. (P. Humphreys: Computational science in Oxford bibliographies online, (2012) doi:10.1093/OBO/9780195396577-0100) presents and discusses important references and may be used as a short but insightful research guide.
References
34.1 M. Mahoney: The histories of computing(s), Interdiscip. Sci. Rev. 30(2), 119–135 (2005)
34.2 A.M. Turing: Computing machinery and intelligence, Mind 59, 433–460 (1950)
34.3 A. Newell, H.A. Simon: Computer science as empirical inquiry: Symbols and search, Commun. ACM 19(3), 113–126 (1976)
34.4 Z.W. Pylyshyn: Computation and Cognition: Toward a Foundation for Cognitive Science (MIT Press, Cambridge 1984)
34.5 H. Putnam: Brains and behavior. In: Analytical Philosophy: Second Series, ed. by R.J. Butler (Blackwell, Oxford 1963)
34.6 J.A. Fodor: The Language of Thought (Crowell, New York 1975)
34.7 P. Humphreys: Computer simulations, Proceedings of the Biennial Meeting of the Philosophy of Science Association, Vol. 2, ed. by A. Fine, M. Forbes, L. Wessels (Univ. Chicago Press, Chicago 1990) pp. 497–506
34.8 P. Humphreys: Numerical experimentation. In: Philosophy of Physics, Theory Structure and Measurement Theory, Patrick Suppes: Scientific Philosopher, Vol. 2, ed. by P. Humphreys (Kluwer, Dordrecht 1994)
34.9 F. Rohrlich: Computer simulations in the physical sciences, Proceedings of the Biennial Meeting of the Philosophy of Science Association, ed. by A. Fine, M. Forbes, L. Wessels (Univ. Chicago Press, Chicago 1991) pp. 507–518
34.10 S. Hartmann: The world as a process: Simulations in the natural and social sciences. In: Modelling and Simulation in the Social Sciences from the Philosophy of Science Point of View, Theory and Decision Library, ed. by R. Hegselmann, U. Mueller, K.G. Troitzsch (Kluwer, Dordrecht 1996) pp. 77–100
34.11 M. Bunge: Analogy, simulation, representation, Rev. Int. Philos. 87, 16–33 (1969)
34.12 H.A. Simon: The Sciences of the Artificial (MIT Press, Boston 1969)
34.13 R.I.G. Hughes: The Ising model, computer simulation, and universal physics. In: Models as Mediators: Perspectives on Natural and Social Science, ed. by M.S. Morgan, M. Morrison (Cambridge Univ. Press, Cambridge 1999) pp. 97–145
34.14 S. Sismondo: Models, simulations, and their objects, Sci. Context 12(2), 247–260 (1999)
34.15 E. Winsberg: Sanctioning models: The epistemology of simulation, Sci. Context 12(2), 275–292 (1999)
34.16 E. Winsberg: Simulations, models, and theories: Complex physical systems and their representations, Philos. Sci. 68, S442–S454 (2001)
34.17 E. Winsberg: Simulated experiments: Methodology for a virtual world, Philos. Sci. 70(1), 105–125 (2003)
34.18 M. Black: Models and Metaphors: Studies in Language and Philosophy (Cornell Univ. Press, New York 1968)
34.19 M. Hesse: Models and Analogies in Science (Sheed Ward, London 1963)
34.20 M. Redhead: Models in physics, Br. J. Philos. Sci. 31, 145–163 (1980)
34.21 N. Cartwright: How the Laws of Physics Lie (Clarendon, Oxford 1983)
34.22 M. Morgan, M. Morrison: Models as Mediators (Cambridge Univ. Press, Cambridge 1999)
34.23 B. Van Fraassen: Scientific Representation: Paradoxes of Perspective (Clarendon Press, Oxford 2008)
34.24 R. Frigg: Scientific representation and the semantic view of theories, Theoria 55, 49–65 (2006)
34.25 M. Suárez: An inferential conception of scientific representation, Philos. Sci. 71(5), 767–779 (2004)
34.26 R. Laymon: Computer simulations, idealizations and approximations, Proceedings of the Biennial Meeting of the Philosophy of Science Association (Univ. Chicago Press, Chicago 1990) pp. 519–534
34.27 R.N. Giere: Understanding Scientific Reasoning (Holt Rinehart Winston, New York 1984)
34.28 R.N. Giere: Explaining Science: A Cognitive Approach (Univ. Chicago Press, Chicago 1988)
34.29 J. Kulvicki: Knowing with images: Medium and message, Philos. Sci. 77(2), 295–313 (2010)
34.30 R. Frigg, J. Reiss: The philosophy of simulation: Hot new issues or same old stew?, Synthese 169(3), 593–613 (2008)
34.31 M. Mahoney: The history of computing in the history of technology, Ann. Hist. Comput. 10(2), 113–125 (1988)
34.32 D.A. Grier: Human computers: The first pioneers of the information age, Endeavour 25(1), 28–32 (2001)
34.33 L. Daston: Enlightenment calculations, Crit. Inq. 21(1), 182–202 (1994)
34.34 I. Grattan-Guinness: Work for the hairdressers: The production of de Prony’s logarithmic and trigonometric tables, Ann. Hist. Comput. 12(3), 177–185 (1990)
34.35 T. Schelling: Models of segregation, Am. Econ. Rev. 59(2), 488–493 (1969)
34.36 A. Johnson, J. Lenhard: Towards a new culture of prediction. Computational modeling in the era of desktop computing. In: Science Transformed?: Debating Claims of an Epochal Break, ed. by A. Nordmann, H. Radder, G. Schiemann (Univ. Pittsburgh Press, Pittsburgh 2011)
34.37 A. Lehtinen, J. Kuorikoski: Computing the perfect model: Why do economists shun simulation?, Philos. Sci. 74(3), 304–329 (2007)
34.38 R. Hegselmann, U. Mueller, K.G. Troitzsch: Modelling and Simulation in the Social Sciences from the Philosophy of Science Point of View (Springer, Dordrecht, Netherlands 1996)
34.39 G.N. Gilbert, K.G. Troitzsch: Simulation for the Social Scientist (Open Univ. Press, Berkshire 2005)
34.40 J. Reiss: A plea for (good) simulations: Nudging economics toward an experimental science, Simul. Gaming 42(2), 243–264 (2011)
34.41 P. Humphreys: Extending Ourselves. Computational Science, Empiricism, and Scientific Method (Oxford Univ. Press, Oxford 2004)
34.42 P. Humphreys: Computational science and its effects. In: Science in the Context of Application, Boston Studies in the Philosophy of Science, Vol. 274, ed. by M. Carrier, A. Nordmann (Springer, New York 2011), pp. 131–142, Chap. 9
34.43 A. Barberousse, C. Imbert: Le tournant computationnel et l’innovation théorique. In: Précis de Philosophie de La Physique, ed. by S. Le Bihan (Vuibert, Paris 2013), in French
34.44 I. Lakatos: Falsification and the methodology of scientific research programmes. In: Criticism and the Growth of Knowledge, ed. by I. Lakatos, A. Musgrave (Cambridge Univ. Press, Cambridge 1970) pp. 91–195
34.45 T. Knuuttila, A. Loettgers: Magnets, spins, and neurons: The dissemination of model templates across disciplines, The Monist 97(3), 280–300 (2014)
34.46 T. Knuuttila, A. Loettgers: The productive tension: Mechanisms vs. templates in modeling the phenomena. In: Representations, Models, and Simulations, ed. by P. Humphreys, C. Imbert (Routledge, New York 2012) pp. 3–24
34.47 A. Carlson, T. Carey, P. Holsberg (Eds.): Handbook of Analog Computation, 2nd edn. (Electronic Associates, Princeton 1967)
34.48 M.C. Gilliland: Handbook of Analog Computation: Including Application of Digital Control Logic (Systron-Donner Corp, Concord 1967)
34.49 V.M. Kendon, K. Nemoto, W.J. Munro: Quantum analogue computing, Philos. Trans. R. Soc. A 368(1924), 3609–3620 (2010)
34.50 C. Shannon: The mathematical theory of communication, Bell Syst. Tech. J. 27, 379–423 (1948)
34.51 M.B. Pour-El: Abstract computability and its relation to the general purpose analog computer (Some connections between logic, differential equations and analog computers), Trans. Am. Math. Soc. 199, 1–28 (1974)
34.52 M. Pour-El, I. Richards: Computability in Analysis and in Physics. Perspective in Mathematical Logic (Springer, Berlin, Heidelberg 1988)
34.53 E. Arnold: Experiments and simulations: Do they fuse? In: Computer Simulations and the Changing Face of Scientific Experimentation, ed. by J.M. Durán, E. Arnold (Cambridge Scholars Publishing, Newcastle upon Tyne 2013)
34.54 R. Trenholme: Analog simulation, Philos. Sci. 61(1), 115–131 (1994)
34.55 P.K. Kundu, I.M. Cohen, H.H. Hu: Fluid Mechanics, 3rd edn. (Elsevier, Amsterdam 2004)
34.56 S.G. Sterrett: Models of machines and models of phenomena, Int. Stud. Philos. Sci. 20, 69–80 (2006)
34.57 S.G. Sterrett: Similarity and dimensional analysis. In: Philosophy of Technology and Engineering Sciences, ed. by A. Meijers (Elsevier, Amsterdam 2009)
34.58 G.I. Barenblatt: Scaling, Self-Similarity, and Intermediate Asymptotics, Cambridge Texts in Applied Mathematics, Vol. 14 (Cambridge Univ. Press,
(rather than an approximation of) differential equations in modeling physics, Physica D 10, 117–127 (1984)
34.65 N. Margolus: Crystalline computation. In: Feynman and Computation: Exploring the Limits of Computers, ed. by A. Hey (Westview, Boulder 2002)
34.66 R. Hegselmann: Understanding social dynamics: The cellular automata approach. In: Social Science Microsimulation, ed. by K.G. Troitzsch, U. Mueller, G.N. Gilbert, J. Doran (Springer, London 1996) pp. 282–306
34.67 C.G. Langton: Studying artificial life with cellular automata, Physica D 22, 120–149 (1986)
34.68 B. Hasslacher: Discrete Fluids, Los Alamos Sci. Special issue 15, 175–217 (1987)
34.69 N. Metropolis, S. Ulam: The Monte Carlo method, J. Am. Stat. Assoc. 44(247), 335–341 (1949)
34.70 P. Galison: Computer simulations and the trading zone. In: The Disunity of Science: Boundaries, Contexts, and Power, ed. by P. Galison, D. Stump (Stanford Univ. Press, Stanford 1996) pp. 118–157
34.71 P. Galison: Image and Logic: A Material Culture of Microphysics (Univ. Chicago Press, Chicago 1997)
34.72 C. Beisbart, J. Norton: Why Monte Carlo simulations are inferences and not experiments. In: International Studies in Philosophy of Science, Vol. 26, ed. by J.W. McAllister (Routledge, Abington 2012) pp. 403–422
practices in meteorology and astrophysics, Stud. Hist. Philos. Sci. Part B 41, 273–281 (2010), Special Issue: Modelling and simulation in the atmospheric and climate sciences
34.85 M. Sundberg: The dynamics of coordinated comparisons: How simulationists in astrophysics, oceanography and meteorology create standards for results, Soc. Stud. Sci. 41(1), 107–125 (2011)
34.86 E. Tal: From data to phenomena and back again: Computer-simulated signatures, Synthese 182(1), 117–129 (2011)
34.87 R. El Skaf, C. Imbert: Unfolding in the empirical sciences: Experiments, thought experiments and computer simulations, Synthese 190(16), 3451–3474 (2013)
34.88 L. Soler, S. Zwart, M. Lynch, V. Israel-Jost: Science After the Practice Turn in the Philosophy, History, and Social Studies of Science (Routledge, London 2014)
34.89 H. Chang: The philosophical grammar of scientific practice, Int. Stud. Philos. Sci. 25(3), 205–221 (2011)
34.90 H. Chang: Epistemic activities and systems of practice: Units of analysis in philosophy of science after the practice turn. In: Science After the Practice Turn in the Philosophy, History and Social Studies of Science, ed. by L. Soler, S. Zwart, M. Lynch, V. Israel-Jost (Routledge, London 2014) pp. 67–79
34.73 S. Succi: The Lattice Boltzmann Equation for Fluid 34.91 A. Barberousse, S. Franceschelli, C. Imbert:
Dynamics and Beyond (Clarendon, Oxford 2001) Computer simulations as experiments, Synthese
34.74 A.M. Bedau: Weak emergence, Philos. Perspect. 169(3), 557–574 (2009)
11(11), 375–399 (1997) 34.92 P. Grim, R. Rosenberger, A. Rosenfeld, B. Ander-
34.75 T. Grüne-Yanoff: The explanatory potential of ar- son, R.E. Eason: How simulations fail, Synthese
tificial societies, Synthese 169(3), 539–555 (2009) 190(12), 2367–2390 (2013)
34.76 B. Epstein: Agent-based modeling and the fal- 34.93 J.H. Fetzer: Program verification: The very idea,
lacies of individualism. In: Models, Simulations, Commun. ACM 31(9), 1048–1063 (1988)
and Representations, ed. by P. Humphreys, C. Im- 34.94 A. Asperti, H. Geuvers, R. Natarajan: Social pro-
bert (Routledge, London 2011) p. 115444 cesses, program verification and all that, Math.
34.77 S.B. Pope: Turbulent Flows (Cambridge Univ. Struct. Comput. Sci. 19(5), 877–896 (2009)
Press, Cambridge 2000) 34.95 W.L. Oberkampf, C.J. Roy: Verification and Vali-
34.78 P.N. Edwards: A Vast Machine: Computer Models, dation in Scientific Computing (Cambridge Univ.
Climate Data, and the Politics of Global Warming Press, Cambridge 2010)
(MIT Press, Cambridge 2010) 34.96 W.S. Parker: Computer simulation. In: The Rout-
34.79 M. Heymann: Understanding and misunder- ledge Companion to Philosophy of Science, ed. by
standing computer simulation: The case of at- S. Psillos, M. Curd (Routledge, London 2013)
778 Part G Modelling and Computational Issues
34.97 J. Lenhard: Computer simulation: The cooperation Caskill, N. Packard, S. Rasmussen (MIT Press, Cam-
between experimenting and modeling, Philos. bridge 2000) pp. 497–506
Sci. 74(2), 176–194 (2007) 34.117 S. Chandrasekharan, N.J. Nersessian, V. Subrama-
34.98 N. Oreskes, K. Shrader-Frechette, K. Belitz: Verifi- nian: Computational modeling: Is this the end of
cation, validation, and confirmation of numerical thought experimenting in science? In: Thought
models in the earth sciences, Science 263(5147), Experiments in Philosophy, Science and the Arts,
641–646 (1994) ed. by J. Brown, M. Frappier, L. Meynell (Rout-
34.99 J. Lenhard, E. Winsberg: Holism, entrenchment, ledge, London 2012) pp. 239–260
and the future of climate model pluralism, Stud. 34.118 J.D. Norton: Are thought experiments just what
Hist. Philos. Sci. 41(3), 253–262 (2010) you thought?, Can. J. Philos. 26, 333–366 (1996)
34.100 A. Barberousse, C. Imbert: New mathematics for 34.119 J.D. Norton: On thought experiments: Is there
old physics: The case of lattice fluids, Stud. Hist. more to the argument?, Philos. Sci. 71, 1139–1151
Philos. Sci. Part B 44(3), 231–241 (2013) (2004)
34.101 J.M. Boumans: Understanding in economics: 34.120 R. Descartes: Discours de la méthode. In: Oeuvres
Gray-box models. In: Scientific Understanding: de Descartes, Vol. 6, ed. by C. Adam, P. Tannery (J.
Philosophical Perspectives, ed. by H.W. de Regt, Vrin, Paris 1996), first published in 1637
S. Leonelli, K. Eigner (Univ. Pittsburgh Press, Pitts- 34.121 P. Humphreys: What are data about? In: Computer
Part G | 34
34.134 A. Ilachinski: Cellular Automata: A Discrete Uni- 34.155 S. Wolfram: A New Kind of Science (Wolfram Me-
verse (World Scientific, Singapore 2001) dia, Champaign 2002)
34.135 E.F. Keller: Models, simulation and computer ex- 34.156 H.W. de Regt, D. Dieks: A contextual approach to
periments. In: The Philosophy of Scientific Exper- scientific understanding, Synthese 144(1), 137–170
imentation, ed. by H. Radder (Univ. Pittsburgh (2005)
Press, Pittsburgh 2003) pp. 198–215 34.157 R.P. Feynman, R.B. Leighton, M.L. Sands: The
34.136 D. Dowling: Experimenting on theories, Sci. Con- Feynman Lectures on Physics, Vol. 3 (Addison-
text 12(2), 261–273 (1999) Wesley, Reading 1963)
34.137 G. Piccinini: Computational explanation and 34.158 C. Hempel: Reasons and covering laws in
mechanistic explanation of mind. In: Cartogra- historical explanation. In: The Philosophy of
phies of the Mind, ed. by M. Marraffa, M. De Caro, C.G. Hempel: Studies in Science, Explanation, and
F. Ferretti (Springer, Dordrecht 2007) pp. 23–36 Rationality, ed. by J.H. Fetzler (Oxford Univ. Press,
34.138 E. Arnold: What’s wrong with social simulations?, Oxford 2000), first published in 1963
The Monist 97(3), 359–377 (2014) 34.159 J. Lenhard: Surprised by a nanowire: Simulation,
34.139 S. Ruphy: Limits to modeling: Balancing ambi- control, and understanding, Philos. Sci. 73(5),
tion and outcome in astrophysics and cosmology, 605–616 (2006)
Simul. Gaming 42(2), 177–194 (2011) 34.160 M. Bedau: Downward causation and the auton-
Part G | 34
34.140 B. Epstein, P. Forber: The perils of tweaking: How omy of weak emergence, Principia 6, 5–50 (2003)
to use macrodata to set parameters in com- 34.161 P. Huneman: Determinism, predictability and
plex simulation models, Synthese 190(2), 203–218 open-ended evolution: Lessons from computa-
(2012) tional emergence, Synthese 185(2), 195–214 (2012)
34.141 W. Bechtel, R.C. Richardson: Discovering Com- 34.162 C. Imbert: Why diachronically emergent proper-
plexity: Decomposition and Localization as ties must also be salient. In: World Views, Science,
Strategies in Scientific Research (MIT Press, and Us: Philosophy and Complexity, ed. by C. Ger-
Cambridge 1993) shenson, D. Aerts, B. Edmonds (World Scientific,
34.142 H. Zwirn: Les Systèmes complexes (Odile Jacob, Singapore 2007) pp. 99–116
Paris 2006), in French 34.163 H. Zwirn, J.P. Delahaye: Unpredictability and
34.143 Y. Bar-Yam: Dynamics of Complex Systems (West- computational irreducibility. In: Irreducibility
view, Boulder 1997) and Computational Equivalence, Emergence,
34.144 R. Badii, A. Politi: Complexity: Hierarchical Struc- Complexity and Computation, Vol. 2, ed. by
tures and Scaling in Physics (Cambridge Univ. H. Zenil (Springer, Berlin, Heidelberg 2013)
Press, Cambridge 1999) pp. 273–295
34.145 D. Little: Varieties of Social Explanation: An In- 34.164 J. Kuorikoski: Simulation and the sense of un-
troduction to the Philosophy of Social Science derstanding. In: Models, Simulations, and Repre-
(Westview, Boulder 1990) sentations, ed. by P. Humphreys, C. Imbert (Rout-
34.146 H. Kincaid: Philosophical Foundations of the So- ledge, London 2012)
cial Sciences: Analyzing Controversies in Social 34.165 C.R. Shalizi, C. Moore: What Is a Macrostate?
Research (Cambridge Univ. Press, Cambridge 1996) Subjective Observations and Objective Dynamics
34.147 C. Hitchcock: Discussion: Salmon on explanatory (2003) arxiv:cond-mat/0303625
relevance, Philos. Sci. 62, 304–320 (1995) 34.166 N. Israeli, N. Goldenfeld: Computational irre-
34.148 C. Imbert: Relevance, not invariance, explana- ducibility and the predictability of complex phys-
toriness, not manipulability: Discussion of Wood- ical systems, Phys. Rev. Lett. 92(7), 074105 (2004)
ward’s views on explanatory relevance, Philos. 34.167 N. Goodman: Language of Arts (Hackett, Indi-
Sci. 80(5), 625–636 (2013) anapolis 1976)
34.149 W.C. Salmon: Four Decades of Scientific Explana- 34.168 M. Vorms: Formats of representation in scientific
tion (Univ. Pittsburgh Press, Pittsburgh 2006) theorizing. In: Models, Simulations, and Repre-
34.150 G. Schurz: Relevant deduction, Erkenntnis 35(1– sentations, (Routledge, London 2012) pp. 250–273
3), 391–437 (1991) 34.169 J. Jebeile: Explication et Compréhension Dans Les
34.151 H.E. Kyburg: Comment, Philos. Sci. 32, 147–151 Sciences Empiriques. Les modèles Scientifiques et
(1965) le Tournant Computationnel, Ph.D. Thesis (Uni-
34.152 M. Scriven: Explanations, predictions, and laws. versité Paris, Paris 2013)
In: Scientific Explanation, Space, and Time, Vol. 34.170 S. Bullock: Levins and the lure of artificial worlds,
3, ed. by H. Feigl, G. Maxwells (Univ. Minnesota The Monist 97(3), 301–320 (2014)
Press, Minneapolis 1962) pp. 170–230 34.171 J. Lenhard: Autonomy and automation: Compu-
34.153 J. Woodward: Scientific explanation. In: The tational modeling, reduction, and explanation in
Stanford Encyclopedia of Philosophy, ed. by quantum chemistry, The Monist 97(3), 339–358
E.N. Zalta (Stanford Univ., Stanford 2014), http:// (2014)
plato.stanford.edu/archives/win2014/entries/ 34.172 K. Appel, W. Haken: Every planar map is four col-
scientific-explanation/ orable. I. Discharging, Ill. J. Math. 21(3), 429–490
34.154 J. Woodward: Making Things Happen (Oxford (1977)
Univ. Press, Oxford 2003) 34.173 K. Appel, W. Haken, J. Koch: Every planar map is
four colorable. II. Reducibility, Ill. J. Math. 21(3),
780 Part G Modelling and Computational Issues
Part G | 34
783
Part G | 35.1
help us in, for example, estimating how different infrastructure investments will affect the transport system and understanding the behavior of large Internet-based systems in different situations. This type of system is becoming the focus of research and sustainable management as there are now techniques, tools and the computational resources available. This chapter discusses modeling and simulation of such complex systems. We will start by discussing what characterizes complex systems.

35.3.2 Engineering Agent-Based Simulations
35.4 Summing Up and Future Trends
References
central terms in the related vocabulary. A second related approach is termed complex adaptive systems, originating more from biology, based on the idea that complexity arises from many entities adapting to their local environment.

35.1.1 Features Associated with Complex Systems

Following the discussion above, an intuition of what a complex system is can best be imparted by describing the features of complex systems. None of the features alone captures a complex system, and a complex system does not need to exhibit all of them. A complex system in physics may have a completely different shape compared to a complex system in biology or social science. Nevertheless, a combination of those features can be identified in any complex system that we will consider in the following.

Nonlinearity
Nonlinearity is one of the most frequently identified properties of a complex system. It basically means that when results, solutions, or behavior from multiple elements are combined into one system, they do not add up in a linear way, but rather in a nonlinear way due to interactions between elements. Examples of nonlinear systems are systems in which saturation occurs. Also prominent are systems in which a small change causes a big effect somewhere else. Weather, economies, and social systems are examples of this kind of chaotic system.

Nonlinear phenomena are hard to model, capture and analyze mathematically. Often simulation is a last resort for handling the complexity, yet not every simulation paradigm is suitable.

Distributedness, Scale and Interaction
Many systems termed complex, especially in biology or social science, are large and distributed. That means they contain a huge number of entities that are distributed in some way. This may not just refer to geographic distribution, but also to an entity taking a position in a network of relations. The important idea is that there is a form of local interaction: an entity or component interacts with a (selected) number of others distributed over some form of more or less abstract environment. Scale also plays an important role, simply because a small system can be surveyed as a whole, whereas in a huge one the effort of perceiving what happens where in that distributed system obscures the overall dynamics, or as Auyang [35.3, p. 11] puts it:

“The relational network is chiefly responsible for the great variety of phenomena in the world. It is also responsible for the difficulty in studying large composite systems.”

Clearly, the origins of complexity are not merely in the number of entities capable of participating in an interaction, but there is some form of trade-off between the complexity of interaction and the number of interacting entities. Clearly the interaction of hundreds of millions of humans creates a complex system of epidemic spread [35.4], but there are also examples of systems with only a few entities but complex interactions; for example, the model describing the emergence of political actors [35.5] contains only 10 entities deciding whether to pay tribute or not based on decisions taken before. Although there are only a few entities, their decision making and interaction are based on several feedback loops, resulting in a complex system.

Multiple Levels of Observation, Self Organization, Emergence
Auyang [35.3] assumes that complexity arises from large-scale composition. The idea of a system composed of many interacting entities may lead to complexity even if the interaction does not lead to nonlinearity. In such a system, there are at least two levels of observation: the individual entity and the overall system level. The behavior and patterns observable at the latter originate from actions and interactions among the lower-level entities and between the entities and their environment. If there is some manifestation at a high level (some form of organization), this may in turn affect the lower-level entities.

Two additional concepts are relevant in relation to the multilevel feature of complex systems: self organization and emergence. Self organization denotes some process in which local interactions of lower-level entities produce some form of sustainable regularity from an initially unordered situation. Often random fluctuations are responsible for locating the produced regularity. In his famous book Kauffman [35.6] discusses self organization in biological systems based on (evolutionary) adaptation, clearly distinguishing true self organization from organization following some predefined scheme expressing some intention to build structure. A key to self organization is that local decision making is adaptive to a changed environment; that means the local behavior is not fully predefined but at least conditional on environmental conditions. Prominent examples of self-organizing systems can be found in many natural systems: percolation processes in physics, genesis of structures in biology, bird flocking, etc. The principles are also used in technology, mainly in systems such as swarm robotics or biologically inspired optimization.
A related concept, capturing phenomena or patterns originating from local interaction on a lower level, is emergence. In addition to self organization, it contains some associated aspect of astonishment. As this is often very subjective, the actual definition of the term emergence is partially controversial. The whole is more than the sum of its parts forms the intuitive characterization. This idea fascinated researchers from Aristotle to Holland [35.7]. More formal approaches for defining emergence are based on the idea that the description of the overall phenomenon uses a different vocabulary than the description of the lower-level entities producing the phenomenon. Well-known examples are the Mexican wave in cheering audiences, in which the individual spectators stand up depending on their neighbors’ actions so that a wave forms, or traffic jams originating from too high a density of vehicles. The congestion travels in the opposite direction to the individual vehicles. Darley [35.8] discusses the dilemma of relating the definition of emergence to some limited understanding by an observer and defines emergence as a phenomenon that is best predicted by simulation.

Adaptivity, Flexible Decision Making and Feedback Loops
The features tackled so far mostly address the system level, but there is an important aspect also in the behavior of the individual entities making up the complex system: For producing a complex system, decision making and interaction of the lower-level entities require some degree of freedom. The entities must be able to adapt their decision making to their local context. Only if the entities can adapt can feedback loops be formed and something complex (unexpected) be produced. Flexible decision making refers to the next action the entity chooses to perform, possibly being related to the extent to which an entity does something or referring to the selection of the interaction partner. Adaptivity hereby means a change in the behavior in reaction to some immediate environmental context, for example governed by some rules that make an entity change its movement direction before bumping into a suddenly occurring obstacle. True learning is different from adaptive behavior: it changes the behavior program itself – for example, the entity learns a new rule about how to deal with obstacles. Learning can also happen on the system (population) level in the form of evolution combining the production of new entities with a fitness-based selection and/or survival.

35.1.2 Summing Up

The last section attempted to capture the idea of a complex system based on particular features. The main assumption hereby is that a complex system originates from multilevel systems in which phenomena and patterns on the overall system level are produced by entities that are capable of flexible behavior and interaction.

Complex systems – as characterized here – are best analyzed and their overall behavior is best predicted using simulation. In the following, we will focus on different approaches and different motivations to modeling and simulating complex systems.
parameters and initial conditions. Computer simulation, on the other hand, is often used when simple closed-form analytic solutions are not possible, which is typically the case for complex systems.

Computer simulation consists of three main steps:

1. Designing a model of an actual or theoretical system
2. Executing the model on a computer, and
3. Analyzing the execution output.

Although there are many different types of computer simulation, they typically attempt to generate a sample of representative scenarios for a model in which a complete enumeration of all possible states would be prohibitive or impossible. In this chapter we will describe simulation as a tool for understanding the behavior of complex systems. For a deeper philosophical treatment of the concept of simulation, we refer to Chap. 34 of this book by Imbert.

35.2.1 Macro-Level Versus Micro-Level Simulation

describe common ways of implementing micro-level simulations.

Cellular Automata
Individual-based modeling can be traced back to von Neumann, who in the 1950s invented what was later termed cellular automata [35.12]. The core definition of a cellular automaton refers to a model of a system that is discrete in time, space and state, yet the term is often used in a broader way to denote models with discrete space (cells). In principle, such a simulation model consists of a grid of cells, i.e., entities, each in one of a finite number of states. The state of a cell at time t is a function of the states of a finite number of cells (called its neighborhood) at time t − 1. These neighbors are a selection of cells relative to the specified cell, and do not change. Every cell has the same rule for updating, based on the values in this neighborhood. Each time the rules are applied to the whole grid a new generation is created. These were used by Conway [35.13] in the 1970s when he constructed the
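
The definition of a cellular automaton given above (discrete time, space and state, a fixed neighborhood, one shared update rule, one generation per application to the whole grid) can be sketched in a few lines. The following is a minimal illustration under stated assumptions: a one-dimensional ring of binary cells, a three-cell neighborhood, and the update rule encoded as an 8-bit lookup table (the common rule-number convention). None of these choices comes from this chapter, and the function names are hypothetical.

```python
from typing import List


def step(cells: List[int], rule: int) -> List[int]:
    """Compute generation t from generation t - 1 for a 1D binary automaton.

    Each cell looks at (left, centre, right) on a ring; the three bits select
    one bit of the 8-bit rule number, which becomes the cell's new state.
    """
    n = len(cells)
    new_cells = []
    for i in range(n):
        left, centre, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        index = (left << 2) | (centre << 1) | right
        new_cells.append((rule >> index) & 1)
    return new_cells


def run(cells: List[int], rule: int, generations: int) -> List[List[int]]:
    """Return the initial configuration followed by all later generations."""
    history = [list(cells)]
    for _ in range(generations):
        history.append(step(history[-1], rule))
    return history


if __name__ == "__main__":
    start = [0] * 31
    start[15] = 1  # a single active cell in the middle
    for row in run(start, rule=90, generations=15):
        print("".join("#" if c else "." for c in row))
```

All cells are updated synchronously from the previous generation, which is why `step` builds a new list instead of modifying `cells` in place.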
and the macro-level phenomena caused by their interaction (rather than by complex behavior of the individual as in the case of humans). Thus, in these models little attention is paid to individual variation and the individual decision making is rather primitively modeled. A prominent example of sociophysics is the work of Galam [35.17].

Another prominent example of a model in which local interactions lead to interesting macro-level behavior is the Boid model by Reynolds [35.18], which simulates coordinated animal motion such as bird flocks and fish schools. The underlying spatial representation and consequently the state of the entities in the form of direction of movement are continuous values. So it is not a cellular automaton in the narrow sense, yet it belongs to the same category of models that generate complex system behavior from locally interacting, simple entities.

Dynamic Micro-Simulation
One of the first, and simplest, ways of performing micro-level simulation in social science is often called dynamic micro-simulation [35.19, 20]. It is used to simulate the effect of the passing of time on individuals. Data from a (preferably large) random sample from the population to be simulated is used to initially characterize the simulated individuals. Some examples of sampled features are: age, sex, employment status, income, and health status. A set of transition probabilities is used to describe how these features will change over a given time period, e.g., there is a probability that an employed person will become unemployed over the course of a year. The transition probabilities are applied to the population for each individual in turn and then repeatedly reapplied for a number of simulated time periods. Sometimes it is necessary to also model changes in the population, for example birth, death, and marriage. This type of simulation can be used to, for example, predict the outcome of different social policies. However, the quality of such simulations depends on the quality of:

• The random sample, which must be representative, and
• The transition probabilities, which must be valid and complete.

Micro-Level Simulation in Technical Domains
Also for socio-technical systems, that means for systems in which people and technology interact, a number of micro-level modeling and simulation techniques exist. Examples are complex production lines in which human workers cooperate with machines executing more or less automated process steps. Formulating a complex system using, for example, Petri nets or queuing systems is based on a process-oriented point of view of locally interacting entities and is particularly apt for entities traveling through a complex system in a more or less automated way not involving individual reasoning and on-the-fly adaptation.

Petri nets hereby have been accepted as a powerful formal modeling tool for dealing with performance and functionality issues in systems with distributed concurrent processes based on local behavior – for example complex software systems (see as an introduction [35.21]). According to its basic definition, the core of a Petri net (of the type condition-event net) model is a bipartite graph consisting of a finite set of places connected with elements from a finite set of transitions. Places can hold tokens (one or more tokens with or without colors) that travel from place to place when a transition fires. The colors of the token represent its internal state and allow formulating behavior depending on internal state.
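
The token game described above can be sketched as follows. This is a minimal illustration, assuming uncolored tokens and the usual firing rule for place/transition nets (a transition is enabled when every input place holds at least one token; firing removes one token from each input place and adds one to each output place). The production-line place names and the `Transition` type are hypothetical and not taken from this chapter.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Marking = Dict[str, int]  # place name -> number of tokens currently held


@dataclass(frozen=True)
class Transition:
    name: str
    inputs: Tuple[str, ...]   # places a token is taken from (no duplicates)
    outputs: Tuple[str, ...]  # places a token is put into


def enabled(marking: Marking, t: Transition) -> bool:
    """A transition may fire only if every input place holds a token."""
    return all(marking.get(place, 0) >= 1 for place in t.inputs)


def fire(marking: Marking, t: Transition) -> Marking:
    """Return the marking after firing t; the input marking is left unchanged."""
    if not enabled(marking, t):
        raise ValueError(f"transition {t.name!r} is not enabled")
    new_marking = dict(marking)
    for place in t.inputs:
        new_marking[place] -= 1
    for place in t.outputs:
        new_marking[place] = new_marking.get(place, 0) + 1
    return new_marking


if __name__ == "__main__":
    # Hypothetical production step: a waiting part is processed by an idle machine.
    process = Transition(
        name="process",
        inputs=("part_waiting", "machine_idle"),
        outputs=("part_done", "machine_idle"),
    )
    marking = {"part_waiting": 2, "machine_idle": 1, "part_done": 0}
    while enabled(marking, process):
        marking = fire(marking, process)
        print(marking)
```

Colored tokens, as mentioned in the text, would replace the integer counts by collections of token objects carrying internal state; that extension is left out here to keep the sketch short.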
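
Returning to dynamic micro-simulation as described earlier in this section, the procedure (characterize individuals from a sample, then apply transition probabilities to each individual in turn for a number of periods) can be sketched as below. This is only an illustration under assumptions: the two features, the yearly period, the probability values and all function names are hypothetical, and changes in the population such as birth and death are left out.

```python
import random
from typing import Any, Dict, List

Person = Dict[str, Any]  # e.g. {"age": 30, "employed": True}


def simulate_year(population: List[Person], p_lose_job: float,
                  p_find_job: float, rng: random.Random) -> None:
    """Apply the yearly transition probabilities to each individual in turn."""
    for person in population:
        person["age"] += 1
        if person["employed"]:
            if rng.random() < p_lose_job:
                person["employed"] = False
        elif rng.random() < p_find_job:
            person["employed"] = True


def run(sample: List[Person], years: int, p_lose_job: float = 0.05,
        p_find_job: float = 0.30, seed: int = 0) -> List[Person]:
    """Simulate a copy of the sample for the given number of yearly periods."""
    rng = random.Random(seed)
    population = [dict(person) for person in sample]  # keep the sample intact
    for _ in range(years):
        simulate_year(population, p_lose_job, p_find_job, rng)
    return population


if __name__ == "__main__":
    # Hypothetical sample; a real study would draw it from survey data.
    sample = [{"age": 20 + i % 40, "employed": i % 5 != 0} for i in range(1000)]
    result = run(sample, years=10)
    rate = sum(p["employed"] for p in result) / len(result)
    print(f"employment rate after 10 simulated years: {rate:.2f}")
```

The two quality conditions named in the text map directly onto the two inputs of `run`: the sample must be representative, and the transition probabilities must be valid and complete for the features being simulated.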
Queuing systems are used for performance analysis in distributed systems in which entities are traveling through a system, for example patients through a hospital or jobs through a production system (for an introduction, see [35.22]). Queuing system models are also graphs with two different types of nodes: servers and queues. The servers represent resources by which the jobs have to be processed. If the server is busy, the job has to wait in the queue in front of that resource. Every queue may have its own queuing discipline. Connections between the different elements are either deterministic (without branching) or probabilistic (branching or merging).

These micro-level models and simulations are used for complex socio-technical systems – with a focus on technology – that benefit from a distributed, process-oriented point of view. In contrast to cellular automata, space is abstracted to times that a token or job needs for traveling between places or servers. Simulated time is usually handled event-based, that means simulated time is advanced to the times at which a change happens that triggers other changes. A general formalism for object-oriented modeling and simulation is Discrete Event System Specification (DEVS) [35.23], which can be used for micro-level modeling and simulation of complex systems. Based on the concept of eventually coupled systems, the underlying abstractions are quite generic and thus can be used without doubt, but they are not specifically supportive for modeling complex systems. Below, we will introduce agent-based simulation – a micro-level modeling approach for complex systems that is more general than these more formalized modeling and simulation paradigms.

35.2.2 Purpose of Modeling Complex Systems

Modeling and simulation of complex systems can be done for different purposes, such as:

• Supporting theory building and evaluation
• Supporting the engineering of systems, e.g., validation, testing, etc.
• Supporting planning, policy making, and other decision making and
• Training, in order to improve a person’s skills in a certain domain.

It is possible to distinguish between four types of end users: scientists, who use the models in the research process to gain new knowledge or verify hypotheses; policymakers, who use them for making strategic decisions; managers (of systems), who use them to make operational decisions; and other professionals, such as architects, who use them in their daily work. Below we describe how these types of end users may use modeling and simulation of complex systems for different purposes.

Supporting Theory Building and Evaluation
In the context of theory building, a simulation model can be seen as an experimental method or as a theory in itself [35.24]. In the former case, simulations are run to test the predictions of theories, whereas in the latter case simulations in themselves are formal models of theories. Using a formal language for describing ambiguous, natural language-based theories helps to find inconsistencies and other problems, and thus contributes to theory building.

Simulation may also be used to evaluate a particular theory, model, hypothesis, or system, or compare two or more of these. Moreover, simulation can be used to verify whether a theory, model, hypothesis, system, or software is correct. Using simulation studies as an experimental tool offers great possibilities. For instance, many experiments with human societies are either unethical or even impossible to conduct. Experiments in silico, on the other hand, are fully possible. These can also breathe new life into the ever-present debate in sociology on the micro-macro link [35.25].

Supporting the Engineering of Systems
Many large-scale technical systems are distributed and involve complex interactions between humans and machines. The idea is to model the behavior of human users in terms of software entities (see next section). In particular, this seems useful in situations where it is too expensive, difficult, inconvenient, tiresome, or even impossible for real human users to test out a new technical system. Of course, the technical system, or parts thereof, may also be simulated. For instance, if the technical system includes hardware that is expensive and/or special purpose, it is natural to simulate this part of the system as well when testing out the control software. An example of such a case is the testing of control systems for intelligent buildings, where software entities simulate the behavior of the people in the building [35.26].

Supporting Planning, Policy Making, and Other Decision Making
In simulation for decision making the focus is on exploring different possible future scenarios in order to choose between alternative actions. Besides this type of prediction, modeling of complex systems may be used for analysis, to gain deeper knowledge and understanding of a certain phenomenon.

An area in which several studies of this kind have been carried out is disaster management, such as experiments concerning different roles and the efficiency of reactions to emergencies [35.27]. Here also software
entities are models of humans. Based on individuals’ observations, personal characteristics and skills, past experience and role characteristics, and social network, the entities create a plan to execute. The effect of adding a role (floor warden) in a fire alarm scenario upon the evacuation efficiency in an abstract environment is analyzed. Sometimes environmental information is based on GIS (geographical information system) data, thereby tying the simulation closer to the physical reality [35.24]. In yet another study, real-world data were used for both the environment and the software entities’ internal decision-making model to analyze the effect of different insurance policies on the willingness of modeled humans to pay for a disaster insurance policy [35.28].

Another application area for this type of simulation study is disease spreading. Again, software entities are used to represent human beings and the simulation model is linked to real-world geographical data. One study [35.29] also included software entities that represent towns acting as the epicenter of disease outbreak. The town entity’s behavior repertoire consisted of different containment strategies. The simulation model can be quickly adapted to local circumstances via the geographical data (given that there is data on the population as well) and is used to determine the effects of different containment strategies.

A third area where social simulation has been used to support planning and policy making is traffic and transport. An example of this is the simulation of all car travel in Switzerland during morning peak traffic [35.30].

Training
The main advantage of using modeling and simulation for training purposes is to be part of a real-world-like situation without real-world consequences. Especially in the military the use of simulation for training purposes is widespread. Also in medicine, where mistakes can be very expensive in terms of money and lives, the use of simulation in education is on the rise.

An early product in this area was a tool to help train police officers to manage large public gatherings such as crowds, demonstrations, and marches [35.31].
dents theoretical insights in the area of psychological theory.
Whereas the physical state often is simple to model, the modeling of the mental state is typically much more complex, especially if the individuals modeled are human beings. A common approach is to model the beliefs, desires, and intentions of the individual, for instance by using the belief–desire–intention (BDI) model [35.38, 39]. Such a model may include the social state of the individual, i.e., which norms it adheres to, which coalitions it belongs to, etc. Although the BDI model is not based on any experimental evidence of human cognition, it has proved to be quite useful in many applications. There has also been some work on incorporating emotions in models of the mental state of individuals [35.40] as well as obligations (cf. the beliefs, obligations, intentions and desires (BOID) model [35.41], which extends the BDI model with obligations).
The modeling of the behaviors (and decisions) of the individuals can be done in a variety of ways, from simple probabilities to sophisticated reasoning and planning mechanisms. As an example where the former is used, we can mention dynamic micro-simulation as described in Sect. 35.2.1, Dynamic Micro-Simulation, which was one of the first ways of performing individual-based simulation and is still frequently used. In traditional micro-simulation, the behavior of each individual is regarded as a black box. The behavior is modeled in terms of probabilities and no attempt is made to justify these in terms of individual preferences, decisions, plans, etc. Thus, better results may be gained if the cognitive processes of the individuals are simulated as well. Opening the black box of individual decision making can be done in several ways. A basic and common approach is to use decision rules, for instance, in the form of a set of situation-action rules. That is, if an individual and/or the environment is in state X then the individual will perform action Y. By combining decision rules and the BDI model, quite sophisticated behavior can be modeled. Other models of individual cognition used in agent-based social simulation include Soar (a computer implementation of Allen Newell’s unified theory of cognition [35.42]), which was for example used in Steve for generating a believable simulated tutor [35.43]. Another unified theory of individual cognition for which a computer implementation exists is Adaptive Control of Thought-Rational (ACT-R) [35.44], which is realized as a production system. A less general example is the Consumat model [35.45], a meta-model combining several psychological theories on decision making in a consumer situation. Also, nonsymbolic approaches such as neural networks have been used to model the agents’ decision making [35.27].
As we have seen, the behavior of an individual can be either deterministic or stochastic. Also the basis for the behavior of the individuals may vary. We can identify the following categories:
The state of the individual itself: In most social simulation models the physical and/or mental state of an individual plays an important role in determining its behavior.
The state of the environment: The state of the environment surrounding the individual also often influences the behavior of an individual. Thus, an individual may act differently in different contexts although its physical and mental state is the same.
The state of other individuals: One popular type of simulation where the behaviors of individuals are based on the state of other individuals is that using cellular automata. As introduced above, the state of each cell is updated as a function of the states of a particular set of neighbors. In this case, information about the state of other individuals can be seen as gained through observations. Another possibility is to gain the knowledge through communication, and in this case the individuals do not have to be limited to the neighbors.
Social states (norms etc.) as viewed by the agent: For simulation of social behavior the agents need to be equipped with mechanisms for reasoning at the social level (unless the social level is regarded as emergent from individual behavior and decision making). Several models have been based on theories from economics, social psychology, sociology, etc. An example of this is provided by Guye-Vuillème [35.46], who has developed an agent-based model for simulating human interaction in a virtual reality environment. The model is based on sociological concepts such as roles, values, and norms and motivational theories from social psychology to simulate persons with social identities and relationships.
In most simulation studies, the behavior of the individuals is static, i.e., the decision rules or reasoning mechanisms do not change during the simulation. However, human beings and most animals do have an ability to adapt and learn. Modeling dynamic behavior of individuals through learning or adaptation can be done in many ways. For instance, both ACT-R and Soar have learning built in. Other types of learning include the internal modeling of individuals (or the environment) where the models are updated more or less continuously.
Finally, there are some more general aspects to consider when modeling individuals. One such aspect is whether all the agents share the same behavior or behave differently, i.e., representation of behavior is either individual or uniform. Another general aspect is the number of individuals modeled, i.e., the size of the model, which may vary from a few individuals to billions of individuals. Moreover, the population of individuals could be either static or dynamic. In dynamic populations, changes in the population are modeled, typically births and deaths.
Model of the Interaction Between Individuals
In dynamic micro-simulation each simulated individual is considered in isolation without regard to their interaction with others. However, in many situations the interaction between the individuals is crucial for the behavior at system level. Thus, in such cases better results will be achieved if the interactions between individuals are simulated. Two important aspects of interactions are who is interacting with whom, i.e., the interaction topology, and the form of interaction.
A basic form of interaction is physical interaction (or interaction based on spatial proximity). As we have seen, this is used in simulations based on cellular automata, e.g., in the Game of Life, introduced in Sect. 35.2.1, Dynamic Micro-Simulation. Another example is the Boids model [35.18], which simulates coordinated animal motion such as bird flocks and fish schools in order to study emergent phenomena. In these examples, the interaction topology is limited to the individuals immediately surrounding an individual. In other cases, as we will see below, the interaction topology is defined more generally in terms of a (social) network. Such a network can be either static, i.e., the topology does not change during a simulation, or dynamic. In these networks the interaction is typically language based. An example of this is the work by Verhagen [35.47] where agents that are part of a group use direct communication between the group members to form shared group preferences regarding the decisions they make. Communication is steered by the structure of the social network regardless of the physical location of the agents within the simulated world.
Model of the Environment
The state of the environment is usually represented by a set of (global) parameters, for example temperature. In addition, there are a number of important aspects of the environment model, such as:
Spatial explicitness: In some models, there is actually no notion of physical space at all. An example of a scenario where location is of less importance is innovation networks [35.48], in which the individuals are high tech firms that each have a knowledge base that they use to develop artifacts to launch on a simulated market. The firms are able to improve their innovations through research or by exchanging knowledge with other firms. However, in many scenarios location is very important, and in those each individual (and sometimes each object) is assigned a specific location at each time step of the simulation. In this case, the individuals may be either static (the entity does not change location during the simulation) or mobile. The location could either be specified as absolute positions in the environment, or in terms of relative positions between entities. In some areas the simulation software is integrated with a geographical information system (GIS) in order to achieve a closer match to reality, see [35.49].
Time: There are in principle two ways to address time, and one is to ignore it. In static simulation time is not explicitly modeled; there is only a before and an after state. However, most simulations are dynamic, where time is modeled as a sequence of time steps. Typically, each individual may change state between each time step.
Exogenous events: This is the case when the state of the environment, for example the temperature, changes without any influence or action from the individuals. Exogenous events, in case they are modeled, may also change the state of entities, for example, decay of resources, or cause new entities to appear. This is a way to make the environment stochastic rather than deterministic.
35.3.2 Engineering Agent-Based Simulations
Factors to Consider When Choosing a Model
In contrast to some of the more traditional approaches, such as system dynamics modeling, agent-based modeling does not yet have any standard procedures that can support the model development. During the last decade some attempts in this direction have been proposed. For example, Grimm et al. [35.50] proposed a structure for documenting an agent-based simulation model, originally in the area of ecological systems. However, it is often the case that the only formal description of the model is the actual program code. Even so, it may be useful to use the unified modeling language (UML) to specify the model [35.51].
Some of the modeling decisions are determined by the features of the system to be simulated, in particular those regarding the interaction model and the environment model. The hardest design decision is often how the mental state and the behaviors of individuals should be modeled, in particular in the case when the individuals are human beings. For simpler animals or machines, a feature vector together with a set of transition rules
is often sufficient. Depending on the phenomena being studied, this may be sufficient also when modeling human beings. Gilbert [35.52] provides some guidelines on whether a more sophisticated cognitive model is necessary or not. He states that the most common reason for ignoring other levels is that the properties of these other levels can be assumed to be constant, and exemplifies this by studies of markets in equilibrium where the preferences of individual actors are assumed to remain constant (note, however, that this may not always be true). Another reason for ignoring other levels, according to Gilbert, is when there are many alternative processes at the lower level that could give rise to the same phenomenon. He exemplifies this by the famous study by Schelling [35.15] regarding residential segregation. Although Schelling used a very crude model of the mental state and behavior of the individuals, i.e., ignoring the underlying motivations for household migration, the simulation results were valid (as the underlying motivations were not relevant for the purpose of Schelling’s study). On the other hand, there are many situations where a more sophisticated cognitive model is useful, in particular when the mental state or behavior of the individual provides constraints on, or in other ways influences, the behavior at the system level. However, as Gilbert concludes, the current research is not sufficiently mature to give advice on which cognitive model to use (BDI, Soar, ACT-R, or other). Rather, he suggests that more pragmatic considerations should guide the selection.
The model of the environment is mostly dictated by the system to be simulated, where the modeler has to decide on the granularity of the values of the attributes of the environment. The interaction model is often chosen based on the theory or practical situation that is the basis for the simulation, but sometimes the limitations of the formal framework used restrict the possibilities. Also here the modeler has to decide upon the granularity of the values of the attributes. In general, Edmonds and Moss [35.53] give the advice that a modeler shall not optimize for simplicity of the model, but more for understandability and believability.
More Issues in Engineering Agent-Based Simulations
Decision making about the granularity of agent decision is not the only issue when developing an agent-based simulation model for a complex system – yet it is the most important one. In the following we will discuss more issues. A more elaborate discussion of issues can be found in [35.54].
Generative Micro-Macro Link. A basic reason for the attractiveness of agent-based simulation for complex system modeling comes from its generative nature [35.36]. The structure and dynamics of the overall system are not directly described, but generated from behavior and interactions of simulated, individual agents. So, there are at least two levels of modeling and observation: the low-level agents and the aggregate system level. Running the low level produces the structure and behavior on the aggregate level. In general, a formal a priori analysis before simulating the system is hardly possible; only by running the simulation can the what, where and when of a social phenomenon emerging from the low-level agents be fully determined. Formal analyses (or even prediction) of overall simulation outcomes from low-level agent behavior are difficult or impossible.
In many applications, a certain macro level phenomenon in the original system is to be reproduced or optimized. Thus, the micro-level rules determining the behavior have to be adapted in a way that the intended aggregate phenomenon is produced. For agent-based simulation, exploratory, experience-based, less formal methodologies appear to be more appropriate [35.55]. The basic question on how to come up with the appropriate low-level behavior is left to individual creativity and experience. This issue, together with the general level of granularity of the model, has also been discussed in the last subsection.
Critical Parameter Structures and Calibration. A simple model is preferable as it contains fewer assumptions and thus fewer parameters. Parameters may be factors in formulas or thresholds for decision making. Initial values for the state variables of all entities are also parameters. A model with too many parameters can be tuned to produce anything. That is, a constellation of parameter values can be found so that the given agent actions and interactions produce any intended overall outcome. So, structural falsification becomes impossible. This limits the analytical value of the model if not accompanied with rigorous processes for quality assurance.
Many parameters also cause practical problems: they need to be set to appropriate values. The necessary effort for calibration becomes an issue in developing agent-based simulations. Here, one has to pay attention, as the sheer number of parameters may be critical. This can be remedied by putting parameters in relation to each other. Another important problem relates to the nature of the parameters themselves. A single parameter can have an enormous effect on the overall aggregated behavior when shared by many agents or if there are nonlinear feedback loops amplifying its effect. Izquierdo and Polhill [35.56] denoted decision threshold parameters as knife-edge parameters: if the
behavior of the agent changes depending on this parameter. Setting such a parameter homogeneously for all agents can cause chaotic behavior: small changes in the parameter values result in completely different phenomena. This issue can be addressed by allowing individual values for the individual agents or by smoothing the effects of a threshold with some stochastic transition. Finally, the test whether the simulation leads to the intended system and individual behavior may not be automated – especially in abstract models without sufficient underlying empirical data – but human intelligence is needed to identify whether the generated pattern of the complex system is the one sought.
Size and Scalability. There is a variety of complex system models with respect to the number of agents, ranging from one-agent systems capable of complex interaction behavior to large-scale simulations with several millions of agents. For many complex system models, a minimum agent number is necessary – for example the effect of pheromone-based ant recruitment cannot be shown by only a small number of simulated ants, but the number of agents has to be synchronized with the environmental configuration and the evaporation rate of the pheromone used to establish a pheromone trail. Scalability with respect to agent numbers is only half of the story: scalability also depends on the complexity of the agent behavior and architecture as discussed above. Complex system models contain agents that are capable of flexible decision making, which is clearly more costly than fully scripted behavior programs. With an appropriate tool, such as one of the platforms discussed in Sect. 35.3.2, More Issues in Engineering Agent-Based Simulations, at least the simpler scalability issues are addressable.
Other Technical Issues. Besides those principled problems, one can identify engineering issues at technical design and implementation – often supported by tools and platforms. If an analysis of the dynamics of the complex system model is needed, the model must be implemented. This is challenging in a way similar to multi-agent systems. Simulated multi-agent systems also consist of distributed intelligent decision makers, each with its own thread of control and its local beliefs, interacting and acting in parallel. In addition to the challenges of developing a multi-agent system, one can identify:
Issues about extended design choices on the environmental model. For facilitating the design of the agents, the environmental model can be augmented: A prominent example is crowd simulations using floor fields capturing gradient data for path finding. Information is explicitly stored in the environment without any correspondence to the original system, but for making agent implementation more efficient. Again, it is a matter of modelers’ experience to know how far one can go with these additions.
During simulation, virtual time is advanced to express the dynamics of the model. As environment and time are artificial, the modeler needs a way for explicitly handling artificial parallelism of the agents’ update. In principle, every agent could run in its own software process, but for simple agents explicitly handling virtual parallelism is more efficient. Depending on the infrastructure used, the modeler has to take care of these low-level aspects of simulation implementation.
Tools for Agent-Based Simulation of Complex Systems
Despite the many available tools, implementation of an agent-based simulation model is still not trivial. The currently most prominent tools suitable for complex systems are Swarm, Repast, MASON and NetLogo. Platforms such as SeSAm support modeling better, yet simulation runs tend to be slow. In the following, we will briefly discuss these tools. More elaborate comparisons can be found in [35.57, 58] or [35.59].
Swarm. Swarm [35.60] is one of the earliest tools for implementation of agent-based simulations (ABSs) and complex systems. Practically, it provides libraries (in Objective-C or, more recently, also Java) that developers can use when building their simulations. Agents are hierarchically organized in Swarms.
Repast. Repast (Recursive Porous Agent Simulation Toolkit [35.61]) is also a Java-based platform. Following the hierarchical structure of Swarm, Repast provides a library of classes for the most common tasks associated with the implementation of an ABS. Besides, since the initial focus of Repast was social science, it includes some tools that are useful in this domain such as network analysis. Repast Simphony forms a visual modeling extension based on state charts.
MASON. MASON (Multi-Agent Simulator Of Neighborhoods [35.62]) forms a library based on Java with the goal to particularly support large-scale simulations.
NetLogo. NetLogo [35.63] is currently probably the most used platform. It was particularly designed for complex system modeling and simulation with the end user in mind. A NetLogo model has basically three elements. The first is the actual implementation of model
behavior. The modeling language used resembles StarLogo, which is easy to understand and learn. The second element of a NetLogo model is the simulation interface for visualization and parameter settings, and the third explicit element is structured documentation.
NetLogo is becoming increasingly popular due to its extensive documentation, the existence of good tutorials, and a large library of preexisting models. Introductory books such as [35.64] are based on NetLogo.
SeSAm. The Shell for Simulated Agent Systems [35.65] provides a fully visual interface for the model development. A proprietary model representation language forms the basis for visual programming, etc. The kernel of a SeSAm simulation consists of the behavior models of agents and the world, which is represented as a special, global agent that may manage different kinds of maps.
Other Tools. In repositories for agent-based simulation platforms many more systems are listed [35.66] and [35.67]. An overview is practically impossible. Many other tools are specifically designed for particular purposes. For example, MadKit relies upon an organizational model of agents’ societies. Therefore its particular strength is in models focusing on intra- and inter-organizational processes. Similarly, CORMAS (Common-pool Resources and Multi-Agent Systems) is a programming environment that targets natural resources management.
In principle, an agent-based simulation platform shall not just support the implementation of agent behavior, but provide basic infrastructure for, for example, integration of input data, handling virtual time, model instrumentation, data collection, and others. Depending on the nature of the tool, the expressiveness of the language for capturing the agent behavior might be limited. Whether it is sufficient or not depends on the actual objective behind simulating the complex system. Nevertheless, a good development platform alleviates some of the issues discussed above and thus enables the modeler to concentrate on the core aspects of the model.
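Several of the points made in this section (situation-action rules, virtual time with explicitly handled parallelism, data collection, and the knife-edge behavior of a shared decision threshold) can be illustrated together in a few lines of Python. This is only a minimal sketch under stated assumptions, not the API of Swarm, Repast, MASON, NetLogo or SeSAm: the names `Agent`, `Simulation` and `build`, the rule "be active if at least a given share of all agents is active", and the logistic smoothing controlled by `noise` are all hypothetical choices made for this example.

```python
import math
import random
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical agent: a binary state plus a decision threshold."""
    active: bool
    threshold: float      # share of active agents needed to be active next step
    noise: float = 0.0    # 0.0 keeps the hard threshold; > 0 smooths it

    def decide(self, active_share: float, rng: random.Random) -> bool:
        """Situation-action rule: if enough agents are active, be active."""
        if self.noise == 0.0:
            return active_share >= self.threshold
        # Stochastic transition: logistic probability centred on the threshold.
        z = (active_share - self.threshold) / self.noise
        z = max(-50.0, min(50.0, z))   # keep math.exp within float range
        return rng.random() < 1.0 / (1.0 + math.exp(-z))


@dataclass
class Simulation:
    agents: list
    rng: random.Random
    time: int = 0                                  # virtual time in discrete steps
    history: list = field(default_factory=list)    # collected (time, share) pairs

    def active_share(self) -> float:
        return sum(a.active for a in self.agents) / len(self.agents)

    def step(self) -> None:
        # Phase 1: all agents decide from the same snapshot of the old state.
        share = self.active_share()
        decisions = [a.decide(share, self.rng) for a in self.agents]
        # Phase 2: commit every decision at once, then advance virtual time.
        for agent, state in zip(self.agents, decisions):
            agent.active = state
        self.time += 1
        self.history.append((self.time, self.active_share()))

    def run(self, steps: int) -> list:
        for _ in range(steps):
            self.step()
        return self.history


def build(n: int, n_active: int, threshold: float,
          noise: float = 0.0, seed: int = 1) -> Simulation:
    """Create n agents sharing one threshold; the first n_active start active."""
    agents = [Agent(i < n_active, threshold, noise) for i in range(n)]
    return Simulation(agents, random.Random(seed))


if __name__ == "__main__":
    for n_active in (24, 25):
        print(n_active, build(100, n_active, 0.25).run(3)[-1])
    print("smoothed", build(100, 24, 0.25, noise=0.05).run(3)[-1])
```

With 100 agents and a shared hard threshold of 0.25, starting from 24 active agents ends with no active agent, whereas starting from 25 ends with every agent active: a one-agent change in the initial state flips the aggregate outcome. Setting `noise` above zero, or giving each agent its own `threshold`, corresponds to the two remedies mentioned above. The two-phase `step` is one simple way to emulate parallel agent updates in a single thread, since no agent sees another agent's new state before all have decided.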
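The floor fields mentioned under Other Technical Issues can be sketched in a similar way. The following assumes a small grid world in which `#` marks a wall, computes for every walkable cell its step distance to the nearest exit by breadth-first search, and lets an agent walk downhill on that field. The function names `floor_field` and `next_cell` and the grid encoding are assumptions made for this illustration; crowd simulation models typically use richer fields than this static distance map.

```python
from collections import deque

# 4-neighborhood used both for building the field and for moving.
MOVES = ((1, 0), (-1, 0), (0, 1), (0, -1))


def floor_field(grid, exits):
    """Return step distances to the nearest exit; None for walls/unreachable cells.

    grid  -- list of equal-length strings, '#' is a wall (assumed encoding)
    exits -- list of (row, col) exit cells
    """
    rows, cols = len(grid), len(grid[0])
    dist = [[None] * cols for _ in range(rows)]
    queue = deque()
    for r, c in exits:
        dist[r][c] = 0
        queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr, dc in MOVES:
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] != "#" and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist


def next_cell(dist, pos):
    """Move to the neighbor with the lowest field value; stay if none is lower.

    pos must be a walkable cell that can reach an exit.
    """
    best = pos
    for dr, dc in MOVES:
        nr, nc = pos[0] + dr, pos[1] + dc
        inside = 0 <= nr < len(dist) and 0 <= nc < len(dist[0])
        if inside and dist[nr][nc] is not None:
            if dist[nr][nc] < dist[best[0]][best[1]]:
                best = (nr, nc)
    return best


if __name__ == "__main__":
    grid = ["....",
            ".##.",
            "...."]
    field = floor_field(grid, exits=[(0, 3)])
    pos = (2, 0)
    while field[pos[0]][pos[1]] != 0:
        pos = next_cell(field, pos)
        print(pos)
```

The field is stored in the environment and has no counterpart in the real building; it exists only so that each agent's path finding reduces to a local comparison of neighboring cells, which is the efficiency argument made in the text.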
35.4 Summing Up and Future Trends
The ability to understand and manage different types of complex systems is becoming more and more important for research, business, and government. As we have seen, agent-based modeling and simulation seems a promising approach to many problems involving the simulation of complex systems of interacting entities. Although a large number of different methods and tools for agent-based modeling and simulation have been developed, it seems that the full potential of the agent concept often is not realized. In particular, this is the case when modeling complex systems that include human actors. For instance, most models use a very primitive model of agent cognition, yet as argued in [35.52] cognitive layers of agent architectures should be intertwined with the social layer.
The question of how to balance complexity of the agents’ reasoning and transparency and comprehensiveness of the overall system behavior forms one of the many open issues from a methodological point of view. Although agent-based modeling of complex systems forms a highly attractive approach for the full variety of possible objectives, there is not yet any established method for developing a model in a systematic way, comparable to system dynamics. This may be due to the fact that modeling per se often contributes to understanding a complex system. That is, the complex system to be modeled is not fully understood by the stakeholders or even by the modelers themselves before the modeling and simulation endeavor starts – independent of which objective the model is developed for. Thus, model development and model analysis also contain elements of original system exploration that, especially in the case of complex systems, may lead to new questions and insights influencing the ongoing model development process. Even if the specific open issues that we discussed in Sect. 35.3.2, More Issues in Engineering Agent-Based Simulations, are addressed by new methodologies to be developed, this will probably remain a profound issue as long as there are systems that appear to be complex. More generally, we argue that the art and practice of engineering agent-based models is an important area of future research, since for handling, predicting and especially for understanding complex systems, modeling and simulation form the centerpiece of any activity.
References
Artif. Life IV Proc. Fourth Int. Workshop Synth. Simul. Living Syst., ed. by R.A. Brooks, P. Maes (MIT Press, Cambridge 1994) pp. 411–416
35.9 J.S. Lansing: “Artificial societies” and the social sciences, Artif. Life 8, 279–292 (2002)
35.10 R.K. Sawyer: Artificial societies – Multi-agent systems and the micro-macro link in sociological theory, Sociol. Meth. Res. 31(3), 325–363 (2003)
35.11 H.V.D. Parunak, R. Savit, R.L. Riolo: Agent-based modeling vs. equation-based modeling: A case study and users’ guide, Lect. Notes Comput. Sci. 1534, 10–25 (1998)
35.12 J.L. Schiff: Cellular Automata: A Discrete View of the World (Wiley, Hoboken 2008)
35.13 M. Gardner: The fantastic combinations of John Conway’s new solitaire game “life”, Sci. Am. 223, 120–123 (1970)
35.14 J.M. Epstein, R.L. Axtell: Growing Artificial Societies: Social Science from the Bottom Up (MIT Press, Cambridge 1996)
35.15 T.C. Schelling: Dynamic models of segregation, J. Math. Sociol. 1(1), 143–186 (1971)
35.16 N.A. Barricelli: Symbiogenetic evolution processes realized by artificial methods, Methodos 9(35–36), 143–182 (1957)
35.17 S. Galam: Sociophysics: A review of Galam models, Int. J. Mod. Phys. C 19(3), 409–440 (2008), doi:10.1142/S0129183108012297
35.18 C.W. Reynolds: Flocks, herds, and schools: A distributed behavioral model, Comput. Graph. 21(4), 25–34 (1987)
35.19 N. Gilbert: Computer simulation of social processes. Social research update, Issue 6, Department of Sociology, University of Surrey, UK (1994), http://sru.soc.surrey.ac.uk/SRU6.html, Accessed 15 Feb 2015
35.20 N. Gilbert, K.G. Troitzsch: Simulation for the Social Scientist, 2nd edn. (Open Univ. Press, Maidenhead
N. Venkatasubramanian: Multi-agent simulation of disaster response, Proc. 1st Int. Workshop Agent Technol. Disaster Manag., Hakodate (2006)
35.28 L. Brouwers, H. Verhagen: Applying the Consumat model to flood management policies, 4th Workshop Agent-Based Simul. (2004) pp. 29–34
35.29 D. Yergens, J. Hiner, J. Denzinger, T. Noseworthy: IDESS – A multi-agent-based simulation system for rapid development of infectious disease models, Int. Trans. Syst. Sci. Appl. 1(1), 51–58 (2006)
35.30 B. Raney, N. Cetin, A. Völlmy, M. Vrtic, K. Axhausen, K. Nagel: An agent-based microsimulation model of Swiss travel: First results, J. Netw. Spatial Econ. 3(1), 23–41 (2003)
35.31 R. Williams: An agent based simulation environment for public order management training, West. Simul. Multiconf., Object-Oriented Simul. Conf., San Diego (1993) pp. 151–156
35.32 M. Wooldridge: An Introduction to Multiagent Systems (Wiley, Hoboken 2009)
35.33 Y. Shoham: Agent-oriented programming, Artif. Intell. 60, 51–92 (1992)
35.34 M.W. Macy, R. Willer: From factors to actors: Computational sociology and agent-based modeling, Annu. Rev. Sociol. 28, 143–166 (2002)
35.35 M.J. Prietula, K.M. Carley, L. Gasser (Eds.): Simulating Organizations: Computational Models of Institutions and Groups (MIT Press, Cambridge 1998)
35.36 J.M. Epstein: Generative Social Science: Studies in Agent-Based Computational Modeling (Princeton Univ. Press, Princeton 2007)
35.37 J. Künzel, V. Hämmer: Simulation in university education: The artificial agent PSI as a teaching tool, Simulation 82(11), 761–768 (2006)
35.38 M.E. Bratman: Intentions, Plans, and Practical Reason (Harvard Univ. Press, Cambridge 1987)
35.39 M. Georgeff, B. Pell, M. Pollack, M. Tambe, In: Agent-Based Modelling and Simulation in the
M. Wooldridge: The Belief-Desire-Intention model Social and Human Sciences, ed. by D. Phan, F. Am-
of agency, Lect. Notes Comput. Sci. 1555, 1–10 (1998) blard (Bardwell, Oxford 2007) pp. 273–294
35.40 A.L.C. Bazzan, R.H. Bordini: A framework for the 35.52 N. Gilbert: When does social simulation need cog-
simulation of agents with emotions: Report on ex- nitive models? In: Cognition and Multi-Agent In-
periments with the iterated prisoners dilemma, 5th teraction From Cognitive Modeling to Social Sim-
Int. Conf. Auton. Agents (2001) pp. 292–299 ulation, ed. by R. Sun (Cambridge Univ. Press,
35.41 J. Broersen, M. Dastani, Z. Huang, J. Hulstijn, L. Van Cambridge 2006) pp. 428–432
der Torre: The BOID architecture: Conflicts between 35.53 B. Edmonds, S. Moss: From KISS to KIDS – An ‘anti-
beliefs, obligations, intentions and desires, 5th Int. simplistic’ modelling approach, Lect. Notes Com-
Conf. Auton. Agents (2001) pp. 9–16 put. Sci. 3415, 130–144 (2004)
35.42 A. Newell: Unified Theories of Cognition (Harvard 35.54 F. Klügl: “Engineering” agent-based simulation
Univ. Press, Cambridge 1994) models?, Lect. Notes Comput. Sci. 7852, 179–196
35.43 G. Méndez, J. Rickel, A. de Antonio: Steve meets (2012)
Jack: The integration of an intelligent tutor and 35.55 E. Norling, B. Edmonds, R. Meyer: Informal ap-
a virtual environment with planning capabilities, proaches to developing simulation models. In:
Lect. Notes Comput. Sci. 2792, 325–332 (2003) Simulating Social Complexity, Understanding Com-
35.44 J.R. Anderson, D. Bothell, M.D. Byrne, S. Douglass, plex Systems, ed. by B. Edmonds, R. Meyer
C. Lebiere, Y. Qin: An integrated theory of mind, (Springer, Berlin, Heidelberg 2013) pp. 39–55
Psychol. Rev. 111(4), 1036–1060 (2004) 35.56 L.R. Izquierdo, J.G. Polhill: Is your model suscep-
35.45 M. A. Janssen, W. Jager: An integrated approach to tible to floating-point errors?, J. Artif. Soc. Soc.
simulating behavioural processes: A case study of Simul. 9(4), 4 (2006)
the lock-in of consumption patterns, J. Artif. Soc. 35.57 S.F. Railsback, S.L. Lytinen, S.K. Jackson: Agent-
Soc. Simul. 2(2) (1999) based simulation platforms: Review and develop-
35.46 A. Guye-Vuillème: Simulation of Nonverbal Social ment recommendations, Simulation 82(9), 609–
Interaction and Small Groups Dynamics in Virtual 623 (2006)
Environments, Ph.D. Thesis (Ècole Polytechnique 35.58 C. Nikolai, G. Madey: Tools of the trade: A survey of
Fédérale de Lausanne, Lausanne 2004) various agent based modeling platforms, J. Artif.
35.47 H. Verhagen: Simulation of the learning of norms, Soc. Soc. Simul. 12(2), 2 (2009)
Soc. Sci. Comput. Rev. 19(3), 296–306 (2001) 35.59 F. Klügl, A.L.C. Bazzan: Agent-based modelling and
35.48 N. Gilbert, A. Pyka, P. Ahrweiler: Innovation net- simulation, AI Mag. 33, 29–40 (2012)
works – A simulation approach, J. Artif. Soc. Soc. 35.60 Swarm Development Group: http://www.swarm.
Simul. 4(3) (2001) org
35.49 M. Schüle, R. Herrler, F. Klügl: Coupling GIS and 35.61 Argonne National Laboratory: http://www.repast.
multi-agent simulation – Towards infrastructure sourceforge.net
for realistic simulation, Lect. Notes Comput. Sci. 35.62 George Mason University: http://cs.gmu.edu/
3187, 228–242 (2004) ~eclab/projects/mason/
35.50 V. Grimm, U. Berger, F. Bastiansen, S. Eliassen, 35.63 NetLogo Team: http://ccl.northwestern.edu/
V. Ginot, J. Giske, J. Goss-Custard, T. Grand, netlogo
S.K. Heinz, G. Huse, A. Huth, J.U. Jepsen, 35.64 S. Railsback, V. Grimm: Agent-Based and
C. Jørgensen, W.M. Mooij, B. Müller, G. Pe’er, Individual-Based Simulation – A Pra