
Original Research Article

Big Data & Society
July–December 2016: 1–11
© The Author(s) 2016
DOI: 10.1177/2053951716665128
bds.sagepub.com

Algorithms and their others: Algorithmic culture in context

Paul Dourish

Abstract
Algorithms, once obscure objects of technical art, have lately been subject to considerable popular and scholarly scrutiny.
What does it mean to adopt the algorithm as an object of analytic attention? What is in view, and out of view, when we
focus on the algorithm? Using Niklaus Wirth’s 1975 formulation that “algorithms + data structures = programs” as a
launching-off point, this paper examines how an algorithmic lens shapes the way in which we might inquire into
contemporary digital culture.

Keywords
Algorithms, practice, materiality, configurations, visibility, code

University of California, Irvine, CA, USA

Corresponding author:
Paul Dourish, University of California, 5086 Donald Bren Hall, Irvine, CA 92697-3440, USA.
Email: jpd@ics.uci.edu

Creative Commons CC-BY: This article is distributed under the terms of the Creative Commons Attribution 3.0 License (http://www.creativecommons.org/licenses/by/3.0/) which permits any use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).

Downloaded from by guest on August 25, 2016

Introduction

During my time as an undergraduate student in computer science, algorithms were objects of concern in a variety of ways – as practical rubrics for the design of effective and efficient computer programs, as catalogs of ways of working, as abstract formulations in textbooks and research papers, or as mathematical conundrums that might appear in exam papers. Alongside compilers, libraries, specifications, languages, and state machines, they formed part of the intellectual furniture of that world.

From that perspective, it’s rather odd to find that algorithms are now objects of public attention, arising as topics of newspaper articles and coffee shop conversations. When digital processes become more visible as elements that shape our experience, then algorithms in particular become part of the conversation about how our lives are organized. From discussions over the role that algorithms might play in hiring (Hansel, 2007) or credit scoring (Singer, 2014) to inquiries into the assumptions behind the algorithms that set the ambient temperature in office buildings (Belluck, 2015), an awareness has developed that algorithms, somehow mysterious and inevitable, are contributing to the shape of our lives in ways both big and small.

The public discussion of algorithms emerges out of (and arises at the intersection of) a series of other conversations. Some, for example, are entwined with discussions of “Big Data,” with a focus on the ways that online activities create data streams from which algorithms extract patterns that guide the action of institutions, corporations and states. Others frame discussions of algorithms in terms of automation and in particular the kinds of high-speed action associated with, say, programmed trading in stock markets, high-frequency automated trades carried out by computer systems without human intervention (Buenza and Millo, 2013). Still others are concerned with the ways that algorithmic developments are transforming aspects of the labor relation, positioning human beings as resources to be deployed according to programmed responses to demand, for instance in ride-sharing services like Uber (e.g. Rosenblat and Stark, 2016). Each of these is a broader or ongoing conversation into which the algorithmic has become incorporated.

Relatedly, algorithms have also become objects of academic attention in social and cultural studies, often in the context of similar concerns. Working

across a number of areas, including finance, labor politics, governance, public policy, and organizational strategy, scholars such as Barocas (2014), Gillespie (2012), Glaser (2014), Manovich (2013), Pasquale (2015), Seaver (2015), and Ziewitz (2015) have turned attention to the way that algorithms are embedded within topics of academic investigation or indeed may constitute a significant new topic of attention in themselves. As with the public discussion, this academic interest in algorithms is sometimes driven by the way that algorithms are beginning to arise as objects of significance within existing academic domains; in other cases, it arises as part and parcel of a broader interest in harnessing the tools of cultural analysis to understand contemporary digital culture and its platforms (e.g. Cox, 2012; Fuller, 2008; Mackenzie, 2006; Manovich, 2001; Montford and Bogost, 2009).

However, the complex embedding of the topic of algorithms into these related concerns raises some difficulties. In particular, it requires us to be careful about the bounds and limits of algorithms and their functioning. Just what is it that we have in view when we focus on “algorithms” as the central object of analytic attention?

In 1975, the pioneering computer scientist Niklaus Wirth published a book entitled “Algorithms + Data Structures = Programs” (Wirth, 1975). Wirth was one of a group of researchers and academics who developed and advocated for the idea of “structured programming,” an approach to the design and engineering of software systems that emphasized the stepwise, modular decomposition of problems and a similarly structured approach to software design and construction. This approach made computer programs easier to develop (especially by teams of programmers) and easier to analyze, as well as more naturally aligning computer programs, as engineering artifacts, with the sorts of mathematical mechanisms by which they could be analyzed and assessed. Wirth did not simply cheer for this position from the sidelines; his own work in programming language design and development provided software engineers with the tools they needed to adopt the model, which soon became (and indeed, in variant forms, remains) standard industrial practice.

“Algorithms + Data Structures = Programs” focused on the practice of software design in the structured programming tradition, setting out the case for the mutual design of algorithmic processes and the regularized data representations or “data structures” over which they would operate. At a time when the development and analysis of algorithms was the dominant and most prestigious area of computer science, Wirth wanted to emphasize the concomitant importance of data structures for those building effective software systems.

Wirth’s formulation – algorithms + data structures = programs – highlights important concerns too for those concerned with algorithms and digital culture.

The first is that algorithms and programs are different entities, both conceptually and technically. Programs may embody or implement algorithms (correctly or incorrectly), but, as I will elaborate, programs are both more than algorithms (in the sense that programs include non-algorithmic material) and less than algorithms (in the sense that algorithms are free of the material constraints implied by reduction to particular implementations).

The second, related, observation is that since algorithms arise in practice in relation to other computational forms, such as data structures, they need to be analyzed and understood within those systems of relation that give them meaning and animate them. There is, in other words, within Wirth’s formula, an analytic warrant for a relational and differential analysis of algorithm alongside data, data structure, program, process, and other analytic entities. This is not to dissolve the algorithm in a sea of relations, but rather to understand how algorithm – as a technical object, as a form of discourse, as an object of professional practice, and as a topic of public or academic concern – comes to play the particular role that it does.

The goal of this paper is to sketch just this sort of relational analysis and to place algorithm in juxtaposition with other relevant terms both in order to identify aspects of the scope and limits of “algorithm” as a conceptual tool, and to understand how algorithms come to act within broader digital assemblages. As Neyland (2016) notes, the danger to be guarded against here is taking an essentializing view of algorithms. Similarly, my argument here should not be read as an essentialist argument, seeking a foundational truth of the nature of algorithms as natural occurrences. No such naturalism can be sustained. Instead, the argument here is one of ethnographic responsibility and practical politics. With respect to ethnographic responsibility, I note that “algorithm” is a term of art within a particular professional culture – that of computer scientists, software designers, and machine learning practitioners – and I seek to understand the limits and particularities of that term’s use as a members’ term, its emic character, in much the same way as we might similarly explore, respect, and analyze the consequences of members’ terms within other cultural milieux. Secondly, as a matter of practical politics, I take it that the domain of Big Data is one into which social science seeks to make an intervention, and suggest that critiques of algorithmic reasoning that set their own terms of reference for key terminology are unlikely to hit home. Again, this is not to grant primacy or authority to a technical interpretation; the goal rather is to


understand what that technical interpretation is, and what consequences it might hold for social and cultural analysis. The paper takes up the question of what algorithms do within the domain of Big Data’s professional practices, as “convening” objects (Ananny, 2016), and as objects that live in dynamic relations to the other material and discursive elements of software systems and the settings that produce them. In doing so, I hope to be able to identify fruitful directions for taking up the algorithm as an object of attention within software studies and allied domains.

Algorithms and their others

In computer science terms, an algorithm is an abstract, formalized description of a computational procedure. Algorithms fall into different types according to their properties or domains – combinatorial algorithms deal with counting and enumeration, numerical algorithms produce numerical (rather than symbolic) answers to equational problems, while probabilistic algorithms produce results within particular bounds of certainty. Algorithms may also vary in terms of their analytic characteristics, such as generalized performance characteristics (e.g. how their mean-time or best-time performance varies with the size of the data sets over which they operate). As part of the stock-in-trade of computer scientists and software engineers, some algorithms are known by the names of their inventors (Dijkstra’s algorithm, the Viterbi algorithm, Gouraud shading, or Rivest-Shamir-Adleman) while others are known by conventional names (e.g. QuickSort, Fast Fourier Transform, Soundex, or sort-merge join).

The significance of some of these properties – formalization, abstraction, identity, and so on – becomes clearer when we look at algorithms in the context of their “others” – related but distinct phenomena that emphasize different aspects of the sociotechnical assembly. In speaking of what an algorithm “is” and “is not,” I am not asserting its stable technical identity; rather, my motive is to be ethnographically true to a members’ term and members’ practice. As such, then, the limits of the term algorithm are determined by social engagements rather than by technological or material constraints. While social understandings and practices evolve, algorithm, as a term of technical art, nonetheless displays for members some precision and a meaning within a space of alternatives. When technical people get together, the person who says, “I do algorithms” is making a different statement than the person who says, “I study software engineering” or the one who says, “I’m a data scientist,” and the nature of these differences matters to any understanding of the relationship between data, algorithms, and society. Accordingly, an investigation of the particular territory staked out by the term “algorithm”, in among other related terms and phenomena, seems worthwhile, especially if the algorithm is presented as a site of particularly valuable leverage in contemporary debates.

With that caution in mind, then, we can consider the work that the term “algorithm” does and might do for social analysis contextually.

Algorithm and automation

Perhaps the most diffuse concern expressed by discussion of algorithms is that which uses the notion metonymically to address the regime of digital automation most broadly. Here, the concern is not with algorithms as such, but with a system of digital control and management achieved through sensing, large-scale data storage, and algorithmic processing within a legal, commercial, or industrial framework that lends it authority. We might point here to discussions of credit scoring (e.g. Zarsky, 2016), digitally enhanced public surveillance (e.g. Graham and Wood, 2003), or plagiarism detection (e.g. Introna, 2016) as cases where concerns with the algorithmic, in part or in whole, stand in for critiques of the larger regime of computer-based monitoring and control. To be sure, crucial issues of labor politics, social justice, personal privacy, public accountability, and democratic participation are thrown up by this technologically enabled system of management, and the expansion of the sorts of regulative, coercive, and divisive processes that are the legacy of Charles Babbage and Frederick Taylor, and algorithms play a critical role in these. Indeed, these are among the most important areas of political analysis that an understanding of “algorithm” as a term of technical art and practice can illuminate. Nonetheless, the wholesale equation of algorithm and automation makes this work more, rather than less, difficult. If we want to be able to speak of algorithms analytically in order to identify their significance as specific technical and discursive formulations, then we need to be able to better identify how they operate as part of, but not as all of, the larger framework.

Algorithm and code

At a greater level of specificity, we might consider the distinctions to be drawn between algorithms and code. In various forms, code has been a particular focus of attention in software studies, acting as it does as a site of material, textual, and representational production. Code is software-as-text, and particularly in the form of “source code,” the human-readable expressions of program behavior that are the primary focus of programmers’ productive attentions, it has perhaps been examined in particular by those working under the umbrella of


“critical code studies” (see, e.g., Berry, 2011; Montford et al., 2012).

In textbooks and research papers, algorithms are often expressed in what is informally called “pseudo-code,” a textual pastiche of conventional programming languages that embodies general ideas that most languages share without committing to the syntactic or semantic particulars of any one. Pseudo-code expresses the abstract generality of an algorithm, the idea that it can be operationalized in any programming language while transcending the particulars of each. It also expresses the promise of an algorithm, the idea that it is code-waiting-to-happen, ready to be deployed and brought to life in programs yet to be written (Introna, 2016). The idea that the relationship between the algorithm and the code is largely a temporal one is perhaps, then, not surprising, and yet there are distinctions that have a good deal of significance from an analytic perspective. I will outline four here.

First, while the transformation of an algorithm (described in mathematical terms or in pseudo-code) into code may be relatively straightforward (although it is not necessarily so), the reverse process – to read the algorithm off the code – is not at all a simple process. There are a number of circumstances in which this need arises. Assessing whether an algorithm has been correctly implemented by a piece of code, for example, is one case of attempting to “read off” the algorithm (as implemented) from the code, and the complexity of this is made clear by the many cases in which errors slip through. Within the domain of Internet security, for example, there have been a number of headline cases lately where trusted code did not in fact correctly implement the algorithm that it was meant to embody, leaving systems open for attack and data breaches; the “Heartbleed” incident is among the best known (Durumeric et al., 2014). The difficulty of reading an algorithm off the code also lies at the heart of patent disputes (over whether a given piece of code does or does not implement a protected algorithm, for instance) as well as simply cropping up as a practical problem for a programmer charged with understanding, maintaining, modifying, or porting an existing software system written by another (or sometimes even the code we wrote ourselves).

Second, algorithms and code have different locality properties. One of the reasons, in fact, that the algorithm may not be easy to read off the code is that the algorithm may not happen all in one place. The algorithm, an apparently singular object when it appears on the page of a book, becomes many different snippets of code distributed through a large program. Even if they happen in sequence when a program is executed, they may not occur together or even nearby within the text of a program. In a program, they may be intermixed with elements of other algorithms, or they might simply be distributed between different modules, different methods, or different functions, so that the operation of the algorithm is (intentionally or unintentionally) obscured.

Third, algorithms are manifest differently on different code platforms. Object-oriented languages, procedural languages, functional languages, and declarative languages are all based on different paradigms for code expression and so will express the same algorithm quite differently. Particular examples of those language styles have different features and different sets of libraries, and will be able to rely on those in different ways to carry out some of the algorithm’s operations. Different computer architectures, different data storage technologies, different arrangements of memory hierarchy, and other features of a platform mean that the code of an algorithm is highly variable and highly specific. The “governing dynamics” of algorithms (Ananny, 2016), then, are only in part algorithmic; they are as much platform effects.

The fourth observation is something of a corollary to the others, although one with particular consequences. One reason that an algorithm can be hard to recover from a program is that there is a lot in a program that is not “the algorithm” (or “an algorithm”). The residue is machinic, for sure; it is procedural, it involves the stepwise execution of one instruction followed by another, and it follows all the rules of layout, control flow, state manipulation, and access rights that shape any piece of code. But much of it is not actually part of the – or any – algorithm. An algorithm might express, for example, how to transform one kind of data representation into another, or how to reach a numerical result for a formula, or how to transform data so that a particular constraint will hold (e.g. to sort numbers) – but actual programs that implement these algorithms need to do a lot more besides. They read files from disks, they connect to network servers, they check for error conditions, they respond to a user interrupting a process, they flash signals on the screen and play beeps, they shuffle data between different storage units, they record their progress in log files, they check for the size of a screen or the free space on a disk, and many other things besides. An algorithm may express the core of what a program is meant to do, but that core is surrounded by a vast penumbra of ancillary operations that are also a program’s responsibility and also manifest themselves in the program’s code. In other words, while everything that a program does and that code expresses is algorithmic in the sense that it is specified in advance by formalization, it is not algorithm, in the sense that it goes beyond things that algorithms express, or even what the term “algorithm” signals as a term of professional practice.
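
The fourth observation can be made concrete with a small sketch. What follows is an illustration only, written in Python under stated assumptions: the function names (`insertion_sort`, `parse_numbers`, `run`) and the logger name are hypothetical and are not drawn from any system discussed in this paper. The first function is something a computer scientist would recognize as "the algorithm" (a textbook insertion sort); the other two are the kind of ancillary material, such as input cleaning, error checking, and logging, that a working program wraps around it.

```python
import logging
from typing import Iterable, List

# Hypothetical logger name, chosen for illustration only.
logger = logging.getLogger("sort_program")


def insertion_sort(values: List[int]) -> List[int]:
    """The 'algorithm' proper: textbook insertion sort on a copy of the input."""
    result = list(values)
    for i in range(1, len(result)):
        key = result[i]
        j = i - 1
        # Shift larger elements one slot to the right to make room for key.
        while j >= 0 and result[j] > key:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = key
    return result


def parse_numbers(lines: Iterable[str]) -> List[int]:
    """Ancillary work: input cleaning and error checking, not part of the algorithm."""
    numbers = []
    for line_no, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue  # skip blank lines
        try:
            numbers.append(int(text))
        except ValueError:
            # Record the problem and carry on; a policy choice, not an algorithmic one.
            logger.warning("line %d is not an integer: %r", line_no, text)
    return numbers


def run(lines: Iterable[str]) -> List[int]:
    """The 'program': parsing and logging first, and only then the algorithm."""
    numbers = parse_numbers(lines)
    logger.info("sorting %d values", len(numbers))
    return insertion_sort(numbers)
```

Even in this toy case, roughly half of the code belongs to the penumbra rather than to the sort itself, and a reader handed only `run` would have to follow two calls to find where the algorithm actually lives.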


Algorithm and architecture

The third distinction that it is useful to take up is that between algorithm and architecture. This is an elaboration of part of the earlier discussion, but an elaboration that has particular relevance in the context of contemporary networked systems.

I noted above that algorithms, in the sense of particular formulations of program behavior, may not be easily localizable in code. That is, although they are often defined in terms of a “sequence of steps” or “sequence of operations”, that sequence may not be laid out as a sequence of statements or sequence of lines in a program’s text. The algorithm, then, is distributed or fragmented in a program.

Most contemporary programs of any complexity, however, are extremely large – often numbering in the hundreds of thousands or millions of lines of code – and must be arranged according to some organizational structure in order to help programmers and teams manage their complexity and comprehend the whole. So-called software “architecture” concerns the arrangement of units, modules, or elements of a larger system, and the patterns of interaction between those units. The nature of the units and the nature of the communication between them depend both on the system’s architecture and on the underlying platform. Units might relate to each other as libraries, as inheritance hierarchies, as containerized components, as client/server, or in a host of related ways. The details are not of relevance to the argument here, but the point is this: first, that “the algorithm”, to the extent that it can be treated as a unit, may not be localized even within a module, never mind within a simple extent of code; and second, that modules may be highly isolated from each other, their code unavailable to each other, perhaps written by different programmers, running on different computers, located within different administrative and management domains, and so forth.

For instance, we might talk of the algorithm by which the Internet manages the flow of data in a Transmission Control Protocol (TCP) stream. Data flow must be regulated so as to avoid congestion on transmission lines, and indeed the development of a new congestion avoidance algorithm in the late 1980s was crucial in allowing the Internet to scale to its current size (Jacobson, 1988). This “algorithm” though is hard to locate in practice. It is an algorithm that governs the behavior of two parties, the two end-points of a communication on a network, so they are, by definition, almost always on two different computers. Those different computers quite likely run two different implementations of the TCP/IP protocols, written by different people, and quite possibly the private, undisclosed code belonging to two different organizations. Galloway (2004) has examined protocol as a form of decentralized control, focusing on the questions of conformance and regulation that underlie networked actions, but the protocol, as an agreement or specification to which both parties must conform, obscures, to some extent, the algorithm itself. The algorithm specifies how a protocol should be implemented but it cannot be easily located as an algorithm in the running system, distributed as it is between different sites. More generally, the factoring of system behavior into a range of components, some of which are bound together in the same address space, some of which are distributed as different threads or processes, some of which are implemented on different computers, many of which are visible to each other only through restricted interfaces, often means that the “algorithm” can not only not be located within an easily delineated stretch of code, but not even within a single computer or the network of a single organization.

Given how many contemporary systems are network-based or network-backed, are designed for large-scale clusters, or even just depend on multi-core or graphics processor-based architectures common in contemporary personal platforms from desktops to wearables, the question of distribution is pervasive. Introna (2016) suggests the language of Barad’s (2007) agential realism as a way of thinking about this, recognizing that the “algorithm” is itself an “agential cut”, a means of constituting some semi-stable object within a dynamic and unfolding socio-technical assembly. This does not diminish the power of “algorithm” as a way of accounting for the operation of a digital assemblage, by any means, but it does imply that “algorithm” may dissolve into nothing when we drill down into the specific elements of a system that might be subject to audit or focused critical or forensic examination (cf. Kirschenbaum, 2008). Introna’s analysis shows that we should examine both what work it takes to identify certain aspects of a running system as the manifestations of an algorithm, and also what is achieved through that collective process and practice of identification.

Algorithm and materialization

The final distinction to explore here is that between the algorithm and its manifestation not just in a piece of code or even in a larger software system but in a specific instantiation – as a running system, running in a particular place, on a particular computer, connected to a particular network, with a particular hardware configuration. All of these critically shape the effect that the algorithm has.

That material configurations limit the effectiveness or reach of algorithms is no surprise; algorithmic formulations do not take into account the storage speeds,


network capacities, instruction pipelines, or memory hierarchies, each of which can have a crucial effect on algorithmic performance. More interestingly, though, the converse is also true – our experience of algorithms can change as infrastructure changes.

Consider an example taken from nuclear weapon simulation (see Dourish and Mazmanian, 2013). Due to nuclear test ban treaties, the nuclear powers have not detonated nuclear weapons in several decades. However, they continue to develop and introduce new weapons. To do so with no testing would be foolhardy, and so new designs are tested but only through simulation (Gusterson, 2001, 2008). In fact, we might argue that it was the ability to produce credible digital simulations of nuclear explosions that made test limitation treaties possible. At this point, the design of new nuclear warheads and weapons is so intrinsically tied to the technology of simulation that one could cite the technology of simulation as one of the major limits upon the production of new weapons. Advances in simulation technology make new simulations practical, and those new simulations open up new avenues for weapons design. Note that the algorithms do not need to change in this scenario; only the technologies upon which they are implemented. The simulation – the algorithm – remains unchanged, but the shifting technological base upon which an implementation of that algorithm runs means that the capacities of that algorithm and its effectiveness within a design process are changing. New technologies shift the effect and impact of an algorithm without changing the algorithm itself; they expand the bounds of algorithmic possibility.

Security infrastructures are a second area where these changes have made a difference. For instance, even simplistic so-called “brute-force attacks” on password systems (systematically attempting every possible password) that were once infeasibly hard with simple password technology are now trivial; more sophisticated attacks on more complicated cryptographic systems are similarly now just a matter of assembling enough computing power.

In a wide-ranging examination of algorithms that takes the Viterbi path algorithm as its key example, Mackenzie (2005) takes up some of these questions. The algorithm is powerful and has many applications, but much of what makes it effective in our world is the fact that particular implementations of the algorithm can be embodied in devices and infrastructures with specific operating capacities. Mackenzie’s analysis focuses on digital temporality, and here we find a key concern with algorithms and their materialization. To speak of an algorithm like the Viterbi algorithm as “fast” is to speak of its complexity, its efficiency and the conditions that limit its performance, but this tells us nothing about how quickly or slowly it might actually perform in practice. The only things that have actual measurable performance (measured in seconds or fractions thereof) are implementations, in software or in silicon. The algorithm, in other words, must be understood both as a formalized account of computational possibilities and as a practical tool, and the relationship between these two is not fixed.

Inscrutability

Stretching across all these discussions is a series of distinctions that seem to anchor the social analysis of algorithms. Algorithms are presented as fast, rather than slow; as automated, rather than hands-on; as machinic, rather than human. Each of these presents a series of problems when algorithms move into new domains.

Perhaps the most significant contrast, though, concerns the problems of inscrutability. The focus of several examinations has been the question of accountability and assessment thrown up by the fact that algorithms are opaque; their operation cannot be examined as easily as that of human actors, for a variety of reasons, leading us to look for new ways to make algorithmic processes visible, to render algorithms accountable, and to find within the algorithmic process some opportunity for audit, external review, and examination (e.g. Pasquale, 2015; Sandvig et al., 2014). Here, I draw on a recent article in these pages by Jenna Burrell (2016), who lays out some of the foundations for algorithmic opacity in order to trouble some of these calls for audit.

Algorithmic opacity

Burrell begins from the problems posed by opaque algorithms. For those for whom algorithmic practice potentially embodies an end-run around traditional forms of legislative accountability, this opacity is a severe problem, and some, such as Pasquale (2015), have argued that algorithms need to be available to audit. Burrell points out, though, that there are multiple different sources of algorithmic opacity, with different relations to mechanisms of redress such as audit. The first is the trade-secret protection that governs many of the algorithms that lie behind services such as Google, Facebook, and Twitter, but also those that are used by financial institutions and other corporations. Audit might have the most force here, where algorithms are held as secrets. A second source of opacity is that the ability to read or understand algorithms is a highly specialized skill, available only to a limited professional class; it depends upon particular education and training. This suggests that audit, at least under

contemporary arrangements, will always be a professionalized and specialized technical practice; with respect to audit, we might be concerned about the problems that have attended financial audit in cases like that of Enron, for example. However, most problematic is Burrell’s third source of opacity. As she notes, many of the algorithms that have social and cultural significance, including those that shape the flow of information in social media, the distribution of search results in search engines, and the production of recommendations in online retail, are statistical machine learning algorithms. Operating over large amounts of data, they observe, characterize, and act on patterns that arise in the data. But these patterns are purely statistical and probabilistic phenomena – they are not human designations. A “top-down” approach might operate in terms of human-identified traits, and then seek to find them in the data; the bottom-up approach of statistical machine learning is to identify the patterns first and then see if they can be made sense of for human needs. So, for example, a “bottom-up” algorithm for handwriting recognition has no concept of the alphabet. It has not been programmed with the shape of the letters “A” or “g”. It has instead been exposed to thousands of examples, on the basis of which it is programmed to recognize certain arrangements of strokes as being characteristic of particular letters. Audit, in this case, has no power to reveal what the algorithm knows, because the algorithm knows only about inexpressible commonalities in millions of pieces of training data.

The question of what we know and what we can say about the operation of machine learning or Big-Data algorithms of this sort is a key issue at stake in algorithmic analysis. During my years of computer science training, to have an algorithm was to know something. Algorithms were definitive procedures that led to predictable results. The outcome of the algorithmic operation was known and certain. Much of the debate about “algorithms” at the moment focuses on a particular class of algorithm – statistical machine learning techniques – that produce, instead, unknowns. More accurately, they produce analyses of data that are known and understood in some terms (in terms of the formal properties of the data set – its patterns and regularities) but unknowable in others (in the terms of the domain that the data represents). When my credit card company deems a particular purchase or stream of purchases “suspicious” and puts a security hold on my card, the company cannot explain exactly what was suspicious – they know that there’s something odd but they don’t know what it is.

When algorithms come to play a role in social affairs, this begins to matter. As reported by Gillespie (2011), activists in the Occupy Wall Street (OWS) movement were surprised to note that the OWS activities never became a “trending topic” on Twitter (highlighted because of user activity). Some were convinced that this must have indicated censorship; after all, how could the latest pop sensation’s haircut or new tattoo be more important than this mass political action? The engineers at Twitter were adamant that no censorship had gone on, but were themselves unable to explain why OWS had not become a trending topic. They can explain the algorithm (although it’s a trade secret, so they don’t) – the factors that contribute, the ordering and weighting of different properties of tweets and hashtags – but that is not, in itself, enough to account for what happens in the system. To understand that, one must be able to characterize the specific dynamics of the ever-roiling mass of data – the way that people pick up ideas, the dynamics of how they repeat them, the geographical waves of interest, all going by at millions of tweets per minute. It is not just that we cannot easily recreate the circumstances and forensically figure this out (as Heraclitus 2.0 might say, you cannot step twice into the same data stream) but also that the patterns that are being analyzed are ephemeral. And yet we need to find ways to narrate them.

Although the forms of analysis in which statistical machine learning techniques are embedded are referred to with the term “Big Data,” there are in fact two scalar moves at work. The first is a move from small to big – from individual data to large data sets, from one record to an accumulated mass of data (as in the Quantified Self movement – cf. Neff and Nafus, 2016), or from one person to a large population. This is the scalar move from which Big Data gains not only its name but also certain claims to statistical meaningfulness, and it is the move that allows statistical techniques to start to describe features of populations. The second move, though, is from big to small again, and it is the key move in narrating or accounting for the results of Big Data analysis. Machine learning techniques cluster data but humans read and narrate the clusters that arise as signaling certain categories of people – pregnant women, dual-income Minneapolis families in the market for a new car, disaffected voters, or people likely to cheat on their taxes. Each act of categorization – or more accurately, of narration – is a move from big to small, a reduction of a mass of data points to a narrative element or a defining characteristic, drawn generally from the domain about which we want to know. Electoral data is gathered in order to tell us about voters, and so we find voters in it; purchase data formulates people as consumers, and so we find consumer categories in it. And we find not only voters and consumers, but voters and consumers who can be made sense of in terms that make sense in the domain – geography, income, lifestyle, history, engagement, interests, and inclinations. Big Data analysis says
“this happens along with that” but the narratives we tell of why are human ones, not technical ones. We are inclined only to find things in Big Data that we expected, in some sense, to find – or at least, we find the kinds of things that we can make sense of.

It is useful here, then, to return to Wirth’s formulation – algorithms + data structures = programs. It speaks to the inherent duality of algorithms and data in the production of running systems, and the problems of attempting to understand one without the other. Wirth speaks of data structures, rather than data, because algorithms are designed around data structures – around forms and regularities rather than around content. (An algorithm for sorting numbers is the same no matter what the numbers – and indeed the same algorithm should also be able to sort names, files, or dates.) Similarly, Burrell’s concern with opacity also directs us to be concerned about structures and regularities in data sets and the mechanisms by which we struggle to name them. Concerns with algorithms as inscrutable and illegible may direct us instead towards the need to examine the sources of the apparent legibility of data.

Some recent moves by European legislators have shifted the conversation from “audit” to “explanation,” arguing that citizens who are substantively affected by the action of an algorithmic system should have a right to an explanation of how that decision was made (Goodman and Flaxman, 2016). The notion of “explanation” here reflects the duality of algorithm and data and the way that each can play a role in automated decision-making. At the same time, though, it raises other questions, including, first, what degree of explanation can successfully “explain” results, and, perhaps more pertinently, how the production of such an explanation – which must, of course, be generated algorithmically – can itself be explained.

Directions

What lessons might we draw from this analysis and what directions does it suggest for future analytic work around algorithms? Should we conclude that the term “algorithm” is too beset with problems and misunderstandings to function effectively in critique, and that perhaps it is time to declare a moratorium on its use? Conceptual confusions certainly abound, but the term still carries weight and value if we can appropriately locate it within a larger analytic frame.

One consequence is to pair analyses of algorithms with analyses of the various phenomena of data – data items, data streams, and data structures – upon which they operate and in relation to which they are formulated. The rise of interest in Big Data techniques (e.g. Boellstorff and Maurer, 2015; Kitchin, 2014) is of course a significant source of interest in algorithms in the first place, but the topic of data structures – the specific representations that organize data in order to make it processable by algorithms – has been less prominent. The consequences of representational forms – of the way that data must be shaped to be processed by databases or other informational systems (e.g. Curry, 1998; Dourish, 2014), the organizing principles of data archives (e.g. Edwards et al., 2011) or the relationships between data format, data transmission, and representation (e.g. Dourish, 2015; Galloway, 2004) – are the necessary dual of algorithmic processing. While privacy discussions focus on data generation and accumulation, data organization – the data structures of Wirth’s aphoristic equation – requires similar scrutiny.

A second concern to which our attention might be drawn on the basis of this exploration is the question of algorithmic identity. How might we go about identifying and pinpointing algorithms, given the vagaries of implementation and the flux of evolution? How can algorithms be isolated and examined, and how much sense does it even make to attempt that exercise? Calls for audit and accountability, or even the manifestation of particular algorithms in order to trace aspects of their history or movements, require some attention to the identity conditions upon which algorithmic sameness or similarity is founded. As Gillespie (2012) has noted, algorithms shift and evolve in deployment, particularly those hidden behind trade secrecy barriers; talking in any coherent way about “Google’s search term prediction” algorithm, for example, is deeply problematic given the invisible shifts in implementation and strategy that lie behind the scenes. Mackenzie (2005) considers the patterns of repeatability that algorithms embody within themselves, although one might extend his analysis to consider the forms of repeatability at work in either the successive use of algorithms over different data sets, or the multiple embodiments of “the same” algorithm in different platforms and technologies. Again, the concern is not to engage in an essentializing project with the goal of laying down the criteria for algorithmic sameness; the concern is more to understand how algorithms are identified as, used as, or made to be the same in different settings, circulating as they do among platforms, institutions, corporations, and applications.

In turn, then, this might direct our attention towards a third concern, that of the temporalities of algorithms – not just the temporalities of their own processes (although those matter, because not all algorithms produce answers quickly) but also the temporalities of their evolution as implemented and deployed. Perhaps especially important here is the co-evolution of algorithms and data streams, particularly in cases where these are
mutually influential. An algorithm for, say, modeling climate data is not directly tied to the climate data itself, although it might influence the design of new sensors and data collection instruments (Edwards, 2010), but the algorithm by which Twitter determines the “trending topics” that it will report does exist in a feedback loop with the data over which it operates, since trending topics are displayed to users of Twitter and influence their own action, including the topics they search and the postings they retweet and comment upon.

These concerns with algorithmic identity and evolution point towards an alternative approach to algorithm studies which might put aside the question of what an algorithm is as a topic of conceptual study and instead adopt a strategy of seeking out and understanding algorithms as objects of professional practice for computer scientists, software engineers, and system developers. What power does the notion of “algorithm” have within their conversations and collaborations, and in what way are algorithms invoked, identified, traded, performed, produced, boasted of, denigrated, and elided? What are computer scientists doing when they “do” algorithms, and for whom? In this approach, we might examine the algorithm as a feature of the world of professional practice and as a member category. A useful model here might be Eric Livingstone’s (1986) ethnographic study of the work of mathematicians and the role and nature of “proof” in their lived work. Focusing on the “proof” not as an abstract truth but as a material form, something to be written on blackboards, demonstrated in conversation, and codified into academic career narratives, Livingstone provides an account of the emergence of an object of professional practice within the everyday practical work of a scientific community. As studies by Mackenzie (2015), Neyland (2016) and Seaver (2015) begin to show, algorithms may benefit from a similar approach. Wendy Chun (2008) has argued cogently for the need to resist fetishizing technical objects such as source code or algorithm, pointing out that a capitulation to purely technical accounts risks obscuring the social and cultural practices by which those technical objects are animated in practice. While acknowledging the force of this argument, I have suggested that both ethnographic responsibility and practical politics require that the term “algorithm” as an analytic category nonetheless be wielded with some precision. Clearly, its emic character is not the limit of what can be said for, with, or about it, but we must nevertheless be at least conscious of where and when we make deliberate moves to invoke the term in order to do new conceptual work, and with what consequences. If the term “algorithm” appears in social analyses to mean just what it means emically, then it risks missing the many other elements in relation to which the algorithm arises; but by corollary, if it appears in social analyses with some new and different meaning, then it becomes difficult to imagine critiques hitting home in the places where we hope to effect change.

Finally, one of the more intriguing issues to arise in this exploration, and perhaps one that merits further attention, is the relationship between algorithmic and non-algorithmic within technological practice. That is, if algorithms are distinguishable elements of software design, delineable, identifiable, and perhaps even nameable, then we also begin to recognize that there are other elements in software systems that are machinic and programmed but not actually themselves governed by the sorts of things that are normally demarcated as “algorithms.” Some may be expressible algorithmically, but they are not themselves the things with which algorithm designers or algorithm analysts concern themselves. These include the happenstance interaction of different systems not necessarily designed in concert (such as the interactions between different flows on a network, different services on a server, or different modules in an application), but they also include the work “around the edges” of algorithms even in their most direct implementation – the housekeeping, the error-checking, the storage management, and so on. Given the easy slippage between “algorithmic” and “machinic,” or between “algorithmic” and “automated,” the emergence of a category of programmed but not algorithmic activity within computer systems – not governed by algorithms in the sense in which that term is used within computational practice – is intriguing and suggestive. Certainly, it speaks to the potential problems that software studies might have in talking to some of its potential audiences if it talks purely in terms of “algorithms”. Further, it speaks to the disappearance within algorithm-oriented analysis of the work of making algorithms work. Perhaps, too, it suggests some useful parallels with, say, the elements of engineered systems that are not themselves outcomes of processes of design or engineering, and other gaps, holes, and rifts between systems-as-manifest and systems-as-studied.

Understanding the limits and specificities of “algorithm,” then, holds out the opportunity both to engage more meaningfully in interdisciplinary dialogue and to open up new areas for analysis around the edges of algorithmic systems.

Acknowledgements

I would particularly like to thank Evelyn Ruppert for providing the invitation to present the lecture on which this paper is based, and Matthew Fuller, Martin Brynskov, Lone Koefoed Hansen, Adrian Mackenzie, and others in audiences at Goldsmiths and Aarhus Universities for their feedback.
Jenna Burrell kindly shared an early copy of her paper on algorithmic opacity. Much of my thinking on this topic has developed in conversation with Tarleton Gillespie, Scott Mainwaring, Bill Maurer, Helen Nissenbaum, Phoebe Sengers, Nick Seaver, Malte Ziewitz, and other collaborators in the Intel Science and Technology Center for Social Computing. Anonymous reviewers for the journal provided invaluable feedback that has improved the paper considerably.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: in part by the National Science Foundation under awards 1525861 and 1556091.

References

Ananny M (2016) Toward an ethics of algorithms: Convening, observation, probability, and timeliness. Science, Technology & Human Values 41(1): 93–117.
Barad K (2007) Meeting the Universe Halfway: Quantum Physics and the Entanglement of Matter and Meaning. Durham, NC: Duke University Press.
Barocas S (2014) Panic Inducing: Data Mining, Fairness, and Privacy, PhD Thesis, New York University, NY.
Belluck P (2015) Chilly at work? Office formula was devised for men. New York Times, 3 August. Available at: http://www.nytimes.com/2015/08/04/science/chilly-at-work-a-decades-old-formula-may-be-to-blame.html (accessed 5 December 2015).
Berry D (2011) The Philosophy of Software: Code and Mediation in the Digital Age. Basingstoke, UK: Palgrave Macmillan.
Boellstorff T and Maurer B (eds) (2015) Data, Now Bigger and Better! Chicago, IL: Prickly Paradigm Press.
Buenza D and Millo Y (2013) Folding: Integrating algorithms into the floor of the New York Stock Exchange. Working paper, Social Science Research Network (SSRN).
Burrell J (2016) How the machine ‘thinks’: Understanding opacity in machine learning algorithms. Big Data and Society 3(1): 1–12.
Chun W (2008) On “sourcery”, or code as fetish. Configurations 16(3): 299–324.
Cox G (2012) Speaking Code: Coding as Aesthetic and Political Expression. Cambridge, MA: MIT Press.
Curry M (1998) Digital Places: Living with Geographical Information Technologies. London, UK: Routledge.
Dourish P (2014) NoSQL: The shifting materialities of database technology. Computational Culture 4.
Dourish P (2015) Packets, protocols, and proximity: The materialities of internet routing. In: Parks and Starosielski (eds) Signal Traffic: Critical Studies of Media Infrastructures. Champaign, IL: University of Illinois Press, pp. 183–204.
Dourish P and Mazmanian M (2013) Media as material: Information representations as material foundations for organizational practice. In: Carlile, Nicolini, Langley, et al. (eds) How Matter Matters: Objects, Artifacts, and Materiality in Organization Studies. Oxford, UK: Oxford University Press, pp. 92–118.
Durumeric Z, Kasten J, Adrian D, et al. (2014) The matter of heartbleed. In: Proceedings of ACM internet measurement conference IMC’14, Vancouver, BC, Canada, pp. 475–488.
Edwards P (2010) A Vast Machine: Computer Models, Climate Data, and the Politics of Global Warming. Cambridge, MA: MIT Press.
Edwards P, Mayernik M, Batcheller A, et al. (2011) Science friction: Data, metadata, and collaboration. Social Studies of Science 41(5): 667–690.
Fuller M (2008) Software Studies: A Lexicon. Cambridge, MA: MIT Press.
Galloway A (2004) Protocol: How Control Exists after Decentralization. Cambridge, MA: MIT Press.
Gillespie T (2011) Can an algorithm be wrong? Available at: http://culturedigitally.org/2011/10/can-an-algorithm-be-wrong/ (accessed 5 December 2015).
Gillespie T (2012) The relevance of algorithms. In: Gillespie T, Boczkowski P and Foot K (eds) Media Technologies: Essays on Communication, Materiality, and Society. Cambridge, MA: The MIT Press.
Glaser V (2014) Enchanted algorithms: How organizations use algorithms to automate decision-making routines. In: Proceedings of the Annual Meeting of the Academy of Management, Philadelphia, PA.
Goodman B and Flaxman S (2016) EU regulations on algorithmic decision-making and a “Right to Explanation”. In: International conference on machine learning workshop on human interpretability in machine learning (WHI 2016), June, New York, NY, pp. 26–30.
Graham SDN and Wood D (2003) Digitizing surveillance: Categorization, space, inequality. Critical Social Policy 23(2): 227–248.
Gusterson H (2001) The virtual nuclear weapons laboratory in the new world order. American Ethnologist 28(2): 417–437.
Gusterson H (2008) Nuclear futures: Anticipating knowledge, expert judgment and the lack that cannot be filled. Science and Public Policy 35(8): 551–560.
Hansel S (2007) Google answer to filling jobs is an algorithm. New York Times, 3 January. Available at: http://www.nytimes.com/2007/01/03/technology/03google.html (accessed 5 December 2015).
Introna LD (2016) Algorithms, governance, and governmentality: On governing academic writing. Science, Technology & Human Values 41(1): 17–49.
Jacobson V (1988) Congestion avoidance and control. In: Proceedings of ACM symposium on communications architectures and protocols SIGCOMM’88, Stanford, CA, pp. 314–329.
Kirschenbaum M (2008) Mechanisms: New Media and the Forensic Imagination. Cambridge, MA: MIT Press.
Kitchin R (2014) The Data Revolution: Big Data, Open Data, Data Infrastructures and Their Consequences. London, UK: Sage.
Livingstone E (1986) The Ethnomethodological Foundations of Mathematics. Boston, MA: Routledge & Kegan Paul.
Mackenzie A (2005) Protocols and the irreducible traces of embodiment: The Viterbi algorithm and the mosaic of machine time. In: Hassan (ed) 24/7: Time and Temporality in the Network Society. Stanford, CA: Stanford University Press, pp. 89–108.
Mackenzie A (2006) Cutting Code: Software and Sociality. Pieterlen, Switzerland: Peter Lang International Academic Publishers.
Mackenzie A (2015) The production of prediction: What does machine learning want? European Journal of Cultural Studies 18(4–5): 429–445.
Manovich L (2001) The Language of New Media. Cambridge, MA: MIT Press.
Manovich L (2013) Software Takes Command. London, UK: Bloomsbury.
Montford N and Bogost I (2009) Racing the Beam: The Atari Video Computer System. Cambridge, MA: MIT Press.
Montford N, Baudoin P, Bell J, et al. (2012) 10 PRINT CHR$(205.5+RND(1)); : GOTO 10. Cambridge, MA: MIT Press.
Neff G and Nafus D (2016) The Quantified Self. Cambridge, MA: MIT Press.
Neyland D (2016) Bearing account-able witness to the ethical algorithmic system. Science, Technology & Human Values 41(1): 50–76.
Pasquale F (2015) The Black Box Society: The Secret Algorithms that Control Money and Information. Cambridge, MA: Harvard University Press.
Rosenblat A and Stark L (2016) Uber’s drivers: Information asymmetries and control in dynamic work. International Journal of Communication 10: 3758–3784.
Sandvig C, Hamilton K, Karahalios K, et al. (2014) Auditing algorithms: Research methods for detecting discrimination on internet platforms. In: Annual Meeting of the International Communication Association. Seattle, WA, pp. 1–23.
Seaver N (2015) Working with algorithms: Plans and mess. In: Kai Franz (ed) Serial Nature. Stuttgart: Edition Solitude.
Singer N (2014) The scoreboards where you can’t see your score. New York Times, 27 December. Available at: http://www.nytimes.com/2014/12/28/technology/the-scoreboards-where-you-cant-see-your-score.html (accessed 5 December 2015).
Wirth N (1975) Algorithms + Data Structures = Programs. Englewood Cliffs, NJ: Prentice-Hall.
Zarsky T (2016) The trouble with algorithmic decisions: An analytic road map to examine efficiency and fairness in automated and opaque decision making. Science, Technology & Human Values 41(1): 118–132.
Ziewitz M (2015) Governing algorithms: Myth, mess, and methods. Science, Technology & Human Values 41(4): 3–16.