Sound, Space, Gravity: A Kaleidoscopic Hearing (part I)1
How does sound in cinema evoke space? When we say sound is evoking space,
what exactly is the space being referred to and where is it located? How does
sound space work with the space suggested by the images? This essay proposes to
explore these questions by paying close auditory attention to one of the most
technically innovative films of the past decade, Gravity (Cuarón 2013). By
contextualising the film's sonic achievements in a series of kaleidoscopically
historical and theoretical discussions I hope to show that the way in which
cinema immerses its audience with sound is no straightforward matter; instead
it involves an intricate intertwining of sound technologies, industrial
conventions, perceptual habituation and the idiosyncratic mode of audiovisual
presentation that cinema offers.

We accept seen space as real only when it contains sounds as well, for these
give it the dimension of depth.
Béla Balázs (1970: 207)

The New Soundtrack 6.1 (2016): 115

DOI: 10.3366/sound.2016.0079
© Edinburgh University Press and the Contributors

1. I wish to thank Tom Gunning, Yuri Tsivian and Jim Lastra for their support in writing this essay. Eric Dienstfrey has read a draft and given valuable input.

sound space
vocal proxemics
panning body sounds

2. To my knowledge there are at least twelve such trailers: aurora, argon, canyon, city/Broadway, curious George, Dolbee, Egypt, enlighten, game, rain, stomp, the train. Some of these have more than one version and some (aurora, argon, enlighten) are rather short, logo-only pieces. Sobchack mentions eight.
3. The phrase 'when the ear dreams' is from Gaston
4. Sobchack's description of the sonic motions of these trailers recalls Germaine Dulac's notion of pure cinema, that is, a cinema of pure motion: '. . .a visual symphony, a rhythm of arranged movements in which the shifting of a line or of a volume in a changing cadence creates emotion without any crystallization of ideas.' Quoted in (Gunning 2007: 38)

Dong Liang

Impressed and intrigued by nine purposefully oneiric Dolby Digital
promotional trailers,2 especially their promise to open people's ears in a
new way, Vivian Sobchack, in an essay provocatively titled 'When the Ear
Dreams', muses on the phenomenological consequences of hearing highly
spatialised sounds and how these trailers make our assumptions between
seeing and hearing in the cinema less certain, if not completely reversed
(2005: 2)3. 'Throughout,' Sobchack describes, 'emphasis is on sound
emergent, moving, swelling and fading, on sounds separated, spatialised,
and amplified to create an intensified sense of acoustical presence and
sonic immersion' (2005: 8). The Dolby Digital trailers are prime examples
of how an audiovisual vignette may be, as Sobchack shrewdly observes,
'made to foreground sound (as well as to corporately shape it)' (2005: 3).
Yet the basic characteristics of the sound space offered by these trailers are
meant to be deployed eventually in mainstream filmmaking. In other
words, although a sound space that consists largely of pure sonic motion4
may be better characterised as sonic avant-garde, the attempt is fuelled by
the same desire that has propelled cinema sound's exploration of space.
How does sound in cinema evoke space? When we say sound is evoking
space, what exactly is the space being referred to and where is it located?
These are intricate questions that can only be fully explored in a much
more copious theoretical/historical treatment of the subject. Yet a recent
work, Gravity (Cuarón 2013), gives us a perfect example of what might be
the key theoretical and historical issues at stake concerning sound space.
The film is, in many ways, another milestone of cinema sound since the
Dolby Digital trailers. Most interestingly, while the said trailers highlight
the nature of a constructed sound space untethered to a narrative context,
here what is involved is somewhat similar, but within a fully fledged
narrative context. The Dolby Digital trailers take place in an imaginary
space; Gravity takes place in outer space. Both spaces are foreign to an
average filmgoer's experience. As the opening quote from Balázs suggests,
to our perception, a space is only real to the extent that it is filled with
sound. Without sound, a space film might run the danger of becoming a
cartoonish blank space without depth. Although Gravity is a film set in a
space that carries no sound, it cannot afford to do without sound. In fact,
in Gravity the issue of sound space seems to have acquired much urgency,
or should I say, 'gravity', since sound plays a critical role in conveying the
sense of losing oneself in space. It is also sound that infuses the immensity
of the space with human presence, populates it and makes it palpable and
therefore navigable.
This case study gives a close hearing to the sound of Gravity. Yet it
also raises many (hence the term 'kaleidoscopic') interlocking issues of a
theoretical and historical nature such as voice panning, vocal proxemics,
sound-image scale matching and the unacknowledged role played by body
sounds such as breath and heartbeat. In addition to paying close attention
to the sonic achievements of the film, this essay also endeavors to situate
them in a historically persistent mode of sound space construction.
By doing so I hope to establish that the notion of sound space can act as a
useful gauge that peruses in its unique fashion the history of film sound.

Gravity: A Kaleidoscopic Hearing (part I)

The essay is published here in two parts. In the first part, I introduce the
film, the spacesuit genre and then focus on its panning practice: how a
particular sound (object) such as voice, music, or even heartbeat moves
through speakers and creates a sense of space in the process; how this
practice draws from the past and breaks with tradition. While Gravity's
panning is justified by its camera movement and idiosyncratic diegetic
setting, its continuity editing, a tried-and-true method of constructing
space through images, becomes problematic with the extensive call for
spatialisation. The second part of this essay is concerned with the spatial
acoustics of the human voice and body sounds, two types of sound that are
significant components in the soundscape of Gravity.

5. The dialogue between ground control and the astronauts was taken nearly verbatim from transcripts and recordings. See http:// original/167. Accessed 15 Feb 2015.


7. The same kind of problem faces films such as Dredd (Travis 2012) or Frank (Abrahamson 2014) where during the entirety (or near entirety, in the case of Frank) of the film the protagonist never takes off his helmet.

Space exploration has been an attractive theme in the sci-fi genre since
Georges Méliès's Trip to the Moon (1902). In the wake of the US's effort
to catch up in the space race and the formation of NASA in 1958,
Hollywood was able to put a spin on the genre and apply a dry coating
of rocket science to its basic dramaturgical elements. Despite
its many varieties (e.g., the majestic 2001: A Space Odyssey (Kubrick 1968),
the contemplative and episodic The Right Stuff (Kaufman 1983), the
fastidious5 Apollo 13 (Howard 1995)), a space film often contains the
following generic ingredients: machine malfunction that endangers space
travelers;6 overabundance of technical jargon that could have come from a
NASA documentary, which this genre closely borders; the archetypal plot
of overcoming disasters and returning home triumphantly; a parallel
depiction of astronauts' life on earth as average human beings (i.e. in the
backyard, holding a beer) and their wives' anxiety. The problem of the
genre, or its strength, consists in balancing what is familiar (astronauts as
emotionally comprehensible and predictable creatures so that our fear can
be projected and made visible), with what is unfamiliar but attractive (the
outer space experience) and what is neither familiar nor attractive but
nevertheless key to the genre: the highly verbalised, extremely technical
aspect of space travelling.
While Gravity has clearly drawn from its many predecessors in what
we might call the 'spacesuit genre', it could well be the ultimate spacesuit
film. Apart from the final scene, where Dr. Ryan Stone (Sandra Bullock)
lands on earth (still no human being in sight), the entire film takes place in
outer space, under zero gravity. Other than a few remarkable moments
where Stone exposes her body, for the entire length of the film all we can
see of the characters are their faces, the rest hidden in the cumbersome
and asexual spacesuit. This constitutes a very real challenge not only to
acting, but also to viewing. How are the actors and actresses able to convey
their emotional states with all their bodies and limbs blocked from
our view and their movements neutralised by the generic, all-covering
spacesuit? How can a film connect and communicate with its audience,
when everything except the face7 (and a few interior shots) is generated by
CGI and animated on computer? After all, what defines the film as a live
action film, instead of computer animation?8

6. Susan Sontag in an influential article explains that we need malicious machine against human to happen: we have a disaster complex. (Sontag

8. The claim that Gravity is an animation film is not as far-fetched as it may sound. In fact, the animation part of the whole process seems to dwarf if not trivialise the live action part. It also makes the distinction between pre-production, production and post-production obsolete. In an early interview with Wired, Alfonso Cuarón made the following comment that might remind us of Lev Manovich's radical proposal that the history of cinema is but an entr'acte of the history of animation: 'We had to do the whole film as an animation first. We edited that animation, even with sound, just to make sure the timing worked with the sound effects and music. And once we were happy with it, we had to do the lighting in the animation as well. Then all that animation translated to actual camera moves and positions for the lighting and actors [. . .] We animated for two, maybe two and a half years before we started shooting the actors. Then we shot the film, and then the poor animators had to start from scratch because they had to base their final animations on what was shot.' http:// Accessed 15 Feb 2015.
9. http:// blog/2013/11/07/gravitypart-1-two-charactersadrift-in-an-experimentalfilm/. Accessed 15 Feb 2015. The film indeed makes use of a new camera mount, called the Isis, that bears an uncanny similarity to the device that Snow commissioned in order to shoot his film.
10. The film is not shot in 3D, but relies entirely on a conversion process to convert 2D images. Yet the effect is more than convincing. Kristin Thompson admits, Even I, no fan of 3D, have seen it in that format at two of my three viewings so far and would do so
11. A detailed description of this workflow can be found in (Kaufman n.d.)
12. In Aningaaq, a companion short film directed by Jonás Cuarón, co-writer of Gravity (and son of Alfonso Cuarón), this man's identity and story is revealed.
13. The song is called 'Angels Are Hard to Find', from Hank Williams Jr.'s 1974 album Living Proof. Written as a prayer to God, Hank confesses to having failed at his past love, but promises to be good to his future love if he gets a chance.

It is the sound, again, that comes to the rescue.

To say this is by no means to underestimate the visual achievement of
the film. In fact, Gravity strikes a unique position between two kinds
of filmmaking. On the one hand, it is a narrative fiction with a
characteristically Hollywoodian plot designed for a mass audience. On the
other, the visual and auditory aspects of the film go significantly beyond
what the conventions of mainstream production dictate. Variety's
Scott Foundas refers to it as 'the world's biggest avant-garde movie'.
J. Hoberman echoes the view by calling it 'blockbuster modernism'. Kristin
Thompson, even before seeing the film, compares it to Michael Snow's
La Région Centrale (1971).9 The film's unprecedented camera movement,
unique conception of stereoscopic imagery10 and intriguing production
workflow11 merit careful unpacking and critical analyses.
This case study shall focus on the film's sonic achievement. Let us not
forget: of the seven Oscars the film garnered, three are located on the
soundtrack. Apart from the original score composed by Steven Price,
Gravity became the first film pre-mixed in Dolby Atmos to win Oscars in
the categories of sound mixing (Skip Lievsay, Niv Adiri, Christopher
Benstead and Chris Munro) and sound editing (Glenn Freemantle).
This is a bit ironic, because the film begins, as it happens, with a black
screen onto which the following inscriptions are shown consecutively:
At 600KM above planet Earth the temperature fluctuates between +258
and - 148 degrees Fahrenheit. There is nothing to carry sound. No air
pressure. No oxygen. Life in space is impossible.
There is nothing to carry the sound. Except that the film cannot be
silent. In fact, the film is not short of any sound that a traditional film may
embrace. Music has a strong presence in this film: already, while the titles
unravel, a crescendo is building up, ultimately reaching deafening volume,
synchronised with the image of a giant planet earth viewed from outer
space. Throughout the film, the composer uses a skilful mix of analogue
(traditional string instruments and less conventional instruments such as
glass harp and glass harmonica) and electronic (synthesiser) sounds whose
fusion recalls the score composed by Vangelis in Blade Runner (Scott
1982). Moreover, the numerous collisions in the film are accompanied/
synchronised by sonic booms, whose psychological effect straddles the
category of music and sound effect.
We also hear, from the very beginning, nonstop dialogue from both
visible and invisible characters (Houston, the Explorer and the mysterious
person with dog barking and baby cries12). Lieutenant Matt Kowalski
(George Clooney) is an obsessively talkative character; some viewers may
find him charismatic; others, offensive. But he may be forgiven on the ground
that much of the film's back plot needs to unravel through his verbal
impertinence. It is also true that he fulfills a vital function, namely, to fill
the radio communications with therapeutic chattering. His voice sounds
just like the country music13 from his portable radio: a little naive, but
certainly sunny. Even when his oxygen level is as low as two percent,
the talking must go on, lest the characters become engulfed in a vast void
and agoraphobic infinity. By the same token, Houston will kindly indulge
in Kowalski's babbling, because hearing voices simply affirms the
functioning of the communication channel: as long as they are talking,
everything should be fine. Ryan Stone is initially more reserved; but as the
film progresses, she also develops a compulsive talking habit. She has to
speak her thoughts out loud, constantly, to assure herself as well as the
audience. Many times, both Kowalski and Stone report to Houston 'in the
blind'. They do this not necessarily to communicate the gravity of their
situation to mission control, but to overcome their own fear by wrapping
personal loss with an indifferent, official tone.
Finally, there is no lack of sound effects in the film. Critical reception
tends to focus on what happened in outer space. Indeed the film has long
segments in outer space, perhaps more than any feature film ever made.
But there are also long interior segments.14 The soundscape of the ISS is
full of beeping, hissing, metal clicking, electric sparks, grinding and all
sorts of factory noises. The fire sequence, although short, is intense on
sound effects (alarms, extinguisher, shaking, rumbling, explosion).15 True
to the slogan 'no air, no sound', the film claims to have used contact
microphones to record its sound effects when characters work in outer
space. Instead of airborne waves, this type of microphone records only
the vibrations of physical objects it touches. The sound designer Lievsay
describes it as 'an extremely clever conceit' and attributes it to Glenn
Freemantle. I am unable to confirm if such sounds indeed resemble what
an astronaut would hear under the circumstances. Given how sound effects
are normally used in movies, it should not surprise us that the film's
avowed sonic realism does not amount to a rigorous authenticity of how
things really sound in outer space but instead aims at simulating a familiar
experience, namely, muffled sounds from underwater.16
Gravity begins, after its very own audiovisual big bang, with a gradual
phasing in of a radio conversation. Several indistinct voices can be
identified, which grow louder with the Explorer space shuttle getting
bigger on the screen. One by one these voices are anchored to bodies on
the screen. But the anchoring is not accomplished through showing the
perfectly synchronised images and sounds of moving lips, the towering
achievement of the talkie. Instead, it is accomplished by a consistent
mapping of the spatial location of the voice: a voice is associated with a
body, because it is perceived as coming from the precise location that a
body is found on the screen. The kind of spatial synchronisation at work
here bears uncanny resemblance to an idea that emerged in the early years
of transition to sound, where tying the voice and the body as closely as
possible in space was believed to be a necessary condition for sound (talking)
cinema. In a 1928 American Cinematographer article titled 'New Light on
Talkies' (Kroesen) the author suggests that the screen should be divided
and so arranged that sound will be reproduced only at or as near the point
of action as possible.17 This idea entailed insurmountable technical
difficulties at the time: Into how many sections should the screen be
divided? How to implement a smooth transition from one speaker to
another (only manual switching of one single track was available at the
time)? It is therefore hardly surprising that the project bore no fruit
whatsoever. Yet the idea seems to have survived well and is finally realised,
after almost a century, in Gravity.

14. This includes the better part of the second half of the film where Stone travels through ISS, Soyuz, Tiangong and Shenzhou. The only time spent outside is when she goes out to detach the Soyuz (46:48-53:30) and later when she jumps from Soyuz to Tiangong (1:11:10-1:13:20). Therefore the time in space includes the first part of the film (about 37 minutes) and the two sections mentioned here (7 minutes). They make up a little over half of the film's total length (83 min).
15. When Stone closes the trapdoor above we can hear sounds coming from overhead speakers in an Atmos theater.
16. One of the film's most blatant mistakes is of a similar nature. When Stone holds Kowalski by a rope both of them are still under zero gravity; a gentle pull and he will come toward her. Yet the film presents the situation as if she was holding the rope from a cliff on earth, and she needs to let go in order to survive: a scenario popularised by numerous movies.
17. Quoted in (Altman 1992: 48)

Although Ryan Stone's voice is the first that emerges from the
indistinguishable sonic background, it remains disembodied before the
two other astronauts are visually anchored. As the camera approaches
the Explorer from afar, the audience is able to identify Kowalski first as
the source of a narcissistically cheerful voice as he cruises from screen
right to left, getting very close to the camera at one point. Kowalski's voice
and the moving diegetic music (supposedly playing from a portable device
he carries but never shown in close-up) are carefully panned with his screen
location and this spatial synchronisation not only helps the audience to
identify a character, but also adds depth to the space seen. Immediately
after the identification of Kowalski, Houston makes a reference to a
character that is for the first time brought to the audience's attention:
Shariff. A voice with a noticeable Indian accent answers. At this point
what we take as Shariff's body becomes centered on the screen and his
voice sounds accordingly close-by. The identification remains partial,
however, for its lack of either spatial movement or facial revelation.
The anchoring is confirmed only when Shariff is told to take the rest of the
day off and starts his Macarena dance.
Once the film helps us to identify characters in the space, it then
proceeds to move the bodies out of the frame while using the spatial
location of their voices to keep track of them. A character's entrance into
the frame can thus be predicted by the trajectory of his or her voice moving
through the auditorium. Even mission control is pinned down: in the
beginning Ed Harris's voice emanates from the lower left corner, that is, where the
blue planet is located; as the camera moves to a position where the earth is
framed in the upper right corner, his voice is relocated accordingly.
Clearly, the perceived precision of this panning trajectory depends on
the particular theatre's sound system. The Dolby Atmos system here
constitutes a definite advantage thanks to its two arrays of ceiling speakers.
If, in previous surround sound practices, the screen has been acting as a
giant threatening magnet enforcing a strict hierarchy of sounds' spatial
characteristics, here the screen is transformed into a harmless window that
looks into the space. All the voices enter and depart from the magnetic
field with ease, as if the electricity-powered screen-magnet were shut down.
'The screen is dispossessed,' Chion anticipated this
experience more than a decade ago, 'and the image comes to float like
a poor little fish in this vast acoustic aquarium' (Chion 1999: 166). Phenomenologically
speaking, this loosening of the audience's visual fixation on the screen is
accompanied by a competing bodily awareness of the auditorium. Each
time a voice is carried outside the luminous container and roams free it
calls attention to itself and becomes an attraction, like a firework
that thrusts into the dark canvas of the auditorium. The effect is such that
I frequently find myself following the invisible sound with my eyes before
I realise that I am staring at speakers on the wall or on the ceiling.
The film's first subjective shot is also worth noticing for its intricate
manipulation of space. I say subjective shot, knowing that the film's camera
movement (in the Lubezki-Cuarón signature style) merges seamlessly
what traditionally should have been several shots, subjective or not. To be
precise, what I call a subjective shot is the middle portion of the film's
second shot, which is executed in a symmetrical fashion, starting from and
finishing by showing Stone's body in the distance. Between these two
extreme long-shot scales the camera transits from free-floating (the camera
remains still relative to Stone so we can observe her movement) to a fixed
position in relation to Stone. The virtual18 camera then closes in on her
face and eventually penetrates her helmet. While the digitally sutured
camera movement manifests itself as one seamless flow, the accompanying
sound marks the entrance and departure by sounding significantly
different. Outside the helmet, the soundtrack has a sort of ostinato
underscore that we may interpret as the ruthless rhythm of outer space.
The moment the camera enters the private sphere of the helmet this
ostinato is gone and a new high-pitched sound is switched on. In conjunction
with this change Stone's breathing now has radically different acoustics.
Thanks to the expressiveness of sound space, therefore, camera movement
becomes a much more intense bodily experience for the audience. Similar
to what happens in Star Wars (Lucas 1977), where Obi-Wan Kenobi's
urge 'use the Force' resonates in all channels, here Stone's breathing is
momentarily spread out in the auditorium before settling down to the rear.
This sonic movement is justified by the camera taking up a classic
subjective view, with its characteristic instability and blurred vision.
Arguably for the first time in cinema history, a subjective shot is
perceptually reinforced by the voice of the very subject coming exclusively
from the surround back, as if the audience is magically shrunk and
relocated to the tiny space between her mouth and the helmet.
In consistently pushing the dialogue out of the screen (or rather, the
front speakers), Gravity presents a significant challenge to the established
codes of surround sound, and signifies a triumphant return to a crucial idea
in the history of sound space. For the last two decades, a key compromise
made by surround sound technologies is that certain sounds can break free
from the magnetic force field of the screen while others remain forever
trapped. The human voice, especially that which carries vital narrative
information, traditionally belongs firmly to the latter category. This is not
because the previous iterations of sound technology do not possess a means
to send human voice to the other end of the auditorium. Quite the
contrary. Experimental stereo films in the 1950s, Dolby Stereo six track
films in the 1970s, or Dolby Digital and other formats of digital surround
sound in the 1990s have all attempted, at one point in their lifespan,
to involve surround sound in the role of storytelling. 'For a few years,'
Rick Altman recalls the early days of Dolby Stereo, 'every menace,
every attack, every emotional scene seemed to begin or end behind the
spectators. Finally, it seemed, the surround channel had become an integral
part of the film's fundamental narrative fiber.'19 But the deplorable
condition (limited frequency range, poor maintenance) of those surround
speakers forced sound designers to reconsider their options. Upon
discovering how critical information carefully placed in the surround
channels was not properly played, as the story goes,20 Ben Burtt initiated a
retreat that other sound designers soon followed. The emancipation of the
voice adds another record of failure. The deployment of narratively crucial
sounds in rear speakers ended up being undesirable; it remains a possibility,
albeit one seldom realised. The resulting codes of surround sound dictate a
hierarchical order of channels and speakers, each designated for one
specific purpose.

18. For an ambitious theoretical proposal on the nature of camera movement in the digital age see (Pierson 2015).
19. Chion for instance recognises some precedents of the practice: 'voices can circulate around and beyond the screen, in orbit (the ghost voices in Poltergeist, space communications in numerous sci-fi films like Alien)' (Chion 1999: 167). This said, I suspect Chion's description might be only partly true or even inaccurate, since the two films were made in 1982 and 1979 respectively, when the technology could not really afford this practice.
20. (Blake 1984: 45)

Gravity could have been composed with this conventional schema of
frontalised sound. But instead it does not hesitate to push narratively
important dialogues outside the shadow of the screen. Indeed, the film is
conceived in such a way that moving voices out of the screen becomes a
necessity. The film is in a unique position to treat all directions in the
auditorium almost equally, as this is precisely how the images are designed
to render the diegetic space. 'There is nothing to carry the sound. No air
pressure. No oxygen.' The film opens with these lines. To this one might
also add, no sense of direction. Although we can only look ahead of us
and the screen still possesses a degree of directionality (up, down, left,
right), the lack of a stable reference horizon upsets the audience's sense of
orientation and their mental construction of diegetic space. Unlike other
space exploration films Gravity doesn't offer us a familiar ground (e.g.,
a backyard BBQ) before we take off. The audience is thrown into space
from the very first moment and stays there for almost the entirety of the
film. It is only in the final shot of the film that the horizontality inherent in any
earth film returns: we are able to tell that the camera tracks horizontally
and tilts up to frame, from an extremely low angle, Stone's triumphant
stance; none of these terms would make much sense if we were still in
outer space. To say 'almost' is to convey the residual horizontality still
implied by the camera movement: for the majority of the time it does tend to
frame characters in an upright position while frequently upsetting it with
opposites. Likewise, the sound space needs to disguise its current dark
zone (as it stands, sound would not yet come from underneath the
floor), and the camera movement acts accordingly, never allowing a
character to cruise out of the frame from the bottom.
Even considering the fact that voice panning is concentrated in specific
segments of the film (the film does become much less radical in terms of
form after the first thirty minutes), a treatment like this is intensely
subversive. The film is able to afford such treatment partly because it is
fully motivated by the plot; more importantly it is supported, or should
I say demanded, by the innovative camera movement. The film offers a
fascinating occasion to observe the effect of constantly hearing voices from
anywhere other than its locus classicus, i.e., behind the screen. It certainly
does result in the audience's attention being divided. But whether this
constitutes an attraction or distraction seems to depend, first, on the
narrative and emotive justification of the said divergence, and second, on
the currently underarticulated nature of experiencing sounds that centre
around us. One thing is certain, however. Instead of wasting the audience's
limited cognitive resources, the unprecedented level of spatialised voice
in Gravity actually serves a purpose: it compensates for the sense of
disorientation induced by the dazzling camera movement.
Last but not least, the curious case of voice orbiting gives us a
chance to revisit Mark Kerins's ultrafield theory (Kerins 2010). Clearly,
if the orbiting voices in Gravity do construct a sound space, this space is
hardly an accurate acoustic rendering of the diegetic space of the film. The
sounds, transmitted from radio, are not sonic waves travelling through air,
and therefore cannot be indicative of any particular direction. Unless a
character has grown, as Vertov may have dreamt, a pair of radio-ears that
can natively interpret electromagnetic waves, she would hear all the sounds
equally and closely miked. The perfect spatialisation, therefore, is entirely
constructed for the sake of the audience's sense of immersion. Even with
the presence of air, one can still easily find evidence in the film where
sound spatialisation does not constitute an accurate mapping of the
diegetic space. Instead it often suggests, albeit with considerable subtlety,
crucial narrative intention long before the images have the liberty to do so.
For instance, in the episode of Kowalski's posthumous visit, everything
looks normal: Kowalski enters, takes off his helmet, takes a sip out of the
vodka bottle and proceeds to tell Stone what her next step should be. But
the sound has already betrayed his presence. During their conversation,
there is a heavily echoed21 version of Kowalski's favorite country music,
which is played only in the overhead speakers. That the sound is not
coming from this space is already a clear hint of the nature of this
encounter. Towards the end of their conversation, Kowalski's voice starts
to drift around the auditorium while he visually remains seated in the same
place. By having the rambling of the voice contradict the visual, the film
presents an acoustic dissolution of Stone's hallucination.22 The dissolution
gradually leads to the return of the alarm sound that is clearly diegetic in
the space of the Soyuz.
Although Gravity represents a special case that most films cannot easily
follow, its presentation of sound space is still exemplary of Hollywood's
dominant ideology of spatialisation. This ideology dictates that if the
sounds indeed create a space, it shall be a space highly schematic in what it
chooses to represent (and it carefully disguises what it cannot); it shall be
created for the specific purpose of guiding the audience in their make-believe.
The sound space thus constructed is artificially matched to the
image space, so that the two can mutually reinforce each other. What is
frequently referred to as immersion, therefore, is the result of such
matching of two spaces. A sense of immersion thus becomes a valuable
resource that effectively enhances the audience's psychological alignment
with characters. Continuously tracked by sonic coordinates, Kowalski's
circling around the auditorium is meant to offer a sonic attraction as well
as an aid for the audience's construction of diegetic space. But the voice
panning also serves to situate the audience at the center of his circling,
which facilitates our identification with the heroine. The uninterrupted
vocal instructions establish a soothing presence. Like a dancing
partner's reassuring instructions, they help us to overcome our own fear
in facing the void.

21. This effect resembles what the alarm (oxygen level low) sounds like
immediately before the episode, indicating its subjective nature.
22. Having voices wandering around the auditorium has been used in the past
to signify hallucination. In 12 Monkeys (1998), for instance, there is a scene
where Cole (Bruce Willis) is confined to a cell where seemingly
disembodied voices assault him from all

The kind of close matching between sound space and image space as
exhibited in the opening shot of Gravity works best when the camera
movement presents a continuum of space of which the screen is nothing
but a moving window. Sound reinforces the sense of self-movement by
sketching out an invisible yet logical trajectory of objects outside that
window. This kind of close matching encounters significant resistance,
however, when editing is introduced. Inevitably, editing presents a
relocation of humans and objects in space that is not only discontinuous
but necessarily significant. The system of continuity editing, for example,
dictates a movement of between 30 and 180 degrees. To maintain close spatial matching
in these circumstances would mean to force the audience to become aware
of the previously habituated and largely invisible jump by accentuating it
with sound.
In Gravity, the tethered space walk sequence in which Kowalski and
Stone make their way to the ISS is highly indicative of the problem. This
sequence initially alternates between two-shots of both characters tethered together
and full shots of Stone. Kowalski's voice jumps in the auditorium space
whenever a cut happens: from right behind the screen to the left rear
speakers. The pattern repeats itself three times before other shots are used
(Kowalski's POV looking at a wrist mirror; CU of Stone; a new two-shot
from the back). Indeed, a conventional dialogue sequence such as this
would raise no eyebrows at all if it were not for the fact that the audience,
having just been successfully primed with an impressive close spatial
matching of two spaces, now faces a huge gap between the two. Not only
does the sound call attention to the editing and make it no longer invisible,
but it questions the validity of the practice of editing that breaks up the
hitherto continuous space.
To understand the extent to which space matching has been taken
for granted in contemporary audiovisual media authoring, consider the
curious (but already customary) phenomenon of revamping films made
in the monophonic or stereophonic format into 5.1 surround sound, a
phenomenon that remains underexplored in film sound scholarship. Take
for example the Blu-ray release of On Her Majesty's Secret Service (Hunt
1969). The film's original theatrical release had a stereo soundtrack (the first
stereo track in a Bond movie). But in order to accommodate the 5.1
DTS-HD (or elsewhere Dolby TrueHD) lossless format that is the sine
qua non of today's Blu-rays, the film's soundtrack has undergone nothing
less than an overhaul. Besides relocating numerous spot effects and
ambiences (ocean noise, the clop of a horse, cars zipping, skiers blasting
down the mountain, pulsing helicopter blades, fireworks bursting, etc.)
into the rear channels (a standard operating procedure these days), the
new soundtrack also moves the voices into the surround channel where
it sees fit. The dialogue sequence between Blofeld and Bond, after his
disguise is blown, is a case in point. In this scene Bond sits on a sofa and
Blofeld on a chair opposite Bond several metres away. Instead of staging
the scene with two-shots or over-the-shoulder shots, the camera is
positioned at the midpoint of their eyeline (naturally the eyeline is slightly
off-kilter). The soundtrack's renovation sees this as a perfect opportunity
to apply voice panning: when Blofeld or Bond is speaking off screen, the
voice clearly comes from the rear speaker, indicating the source's
spatial location. As the shots alternate, the to-and-fro bouncing of the
voice becomes very noticeable, which in turn calls attention to the editing
itself. On several occasions, to maintain the consistency of the practice, a
voice would jump in mid-phrase from straight ahead to behind, the
experience of which is not unlike instant teleporting.
The above two examples show that editing in cinema, especially the
most banal but quintessential shot-and-reverse-shot formula, might
become the biggest hurdle for close space matching as an element of
contemporary audiovisual aesthetics. The full gravity of the situation can
also be understood by a historical lesson: this problem is in fact not a new
one. Indeed, what we are highlighting here is the same complaint made about
many stereo films of the 1950s such as The Robe (Koster 1953).
As the first CinemaScope film, The Robe benefits from the stereo
recording of many of its dialogue scenes. While boosting realism by
accurately matching the sound space and the image space, the practice was
found distracting when editing was involved. As voices from characters
located at different parts of the screen are sonically (instead of visually)
located, the integrity of the sound space comes to the fore. Various shots
from the same scene stand out by their distinct sonic depths, threatening to
call attention to themselves instead of merging seamlessly into the overall
flow of the sequence.
What The Robe wanted to revive, voluntarily or not, can be regarded as
a dream conceived two decades earlier with the advent of sound. The early
talkies notably produced an uncanny impression, albeit only for a short
time: the audience had difficulty believing that the body and
the voice were one, because they didn't seem to come from the same spot in
space and their spatial signatures didn't match. Lucy Fischer comments,
'[. . .] the creation of a sound/image illusion was a highly tenuous process,
and one whose success revolved around the parameter of space' (1985:
232).23 The problem of the talkie, or at least its perceived imperfection in
fusing two spaces, has fuelled several initiatives in sound technology that
aim to explore how sound space can be convincingly captured and
represented. Among these is Alan Blumlein's attempt24 to use a coincident
pair of microphones (a clever contraption now called the Blumlein pair)
that picks up phase differences and converts them into amplitude
differences by a pair of loudspeakers. Blumlein and his colleagues also
made a series of experimental recordings and films to demonstrate the
technology and to see if there was any commercial interest from the film
and music industry. One of these films, literally called 'walking and
talking', shows a stage where three men in suits or lab coats walk from left
to right, and vice versa, while counting numbers and days of the week.
Again, these experiments seem to work well only by neglecting editing as a
fundamental tool of feature-length filmmaking.

23. Fischer's remark comments on the following observation from Rudolf
Arnheim's 1933 work: 'These phenomena are probably largely due to the fact
that sound arouses an illusion of actual space, while a picture has
practically no depth.' Rudolf Arnheim, Film (Faber and Faber, 1933), 235.
24. By admittedly anecdotal accounts, Blumlein went to see a talkie with his
wife in 1931 and was troubled by the sound reproduction, especially
how the sound didn't match the image. The brilliant engineer
declared to his wife on their way home that he knew how to fix it.

Instead of a pair of microphones located at the same spot in space but
pointing in different directions, the stereo films of the 1950s use an array
of spaced microphones to record a scene, an idea that can be traced to
experiments conducted by Steinberg and Snow at Bell Laboratories in the
1930s. Both, however, conceive the scene as rather static, with no camera
movements and cuts; again, much like the canned theatre or music
performance in the first Vitaphone shorts. But unlike those shorts, or the
early talkies, a multiple microphone setup doesn't really solve the problem,
as each microphone would introduce its own sound space, easily perceived
as in conflict with others. Given all these problems of perception, hardly
solvable to this day, it is rather fortunate for cinema to have been without sound
for thirty years, for if it had had sound from the very beginning, probably
neither camera movement nor cutting would have been invented! Without
the burden of sound space, the camera is free to move around and the
image track enjoys an exclusive liberty to construct space at its own pace.
All these become problematic once the sound space is involved.
In my view, Hollywood's answer to the conflict of interest between
editing, camera movement and sound space integrity consists of essentially
two strategies: either the sound space is reduced to a barely legible sketch
(as in the majority of 1930s films) or it is entirely reconstructed from
scratch to avoid conflict with the image space. The latter process seems to
dominate contemporary filmmaking by treating every film as if it were a
cartoon. Instead of using various stereo recording techniques that register
the spatial location of sounds and their movements, a manual process is
introduced to locate and pan dramaturgically important sound across
channels. This technique of panning sound was first developed in
Disney's Fantasound. John Belton reports (1992: 157) that by 1958
Fox's CinemaScope films no longer used stereo to record dialogue and
sound effects. Instead Fox went back to what Fantasound did two decades
earlier and revived the panning practice. Panning may achieve good results
when applied carefully, so that its inherent technical complications
(divergence control, spread, etc.) can be controlled. It may work fine if
what is involved is on-screen or off-screen continuous movement. But
when it comes to cuts that place the characters at opposite ends of the
auditorium, the effect can be somewhat jarring. It is perhaps for this
reason, and the complexity of the process itself, that Ioan Allen claims:
'by the 1970s dialogue panning had all but died out' (Allen 1991). The
convenient solution of leaving dialogue to the centre prevails.
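
As a rough illustration of what manual panning involves, the following minimal Python sketch applies a constant-power (sine/cosine) pan law to a mono signal. It is an assumption made for the sake of illustration, not a description of Fantasound, of Fox's practice or of any mixing console: the function names and the simple left/right layout are hypothetical, and the divergence and spread controls mentioned above are not modelled.

```python
import math


def constant_power_gains(position):
    """Return (left_gain, right_gain) for a pan position.

    position runs from -1.0 (hard left) through 0.0 (centre) to 1.0 (hard right).
    The sine/cosine law keeps left**2 + right**2 equal to 1, so the total
    power stays constant as the sound travels across the two channels.
    """
    if not -1.0 <= position <= 1.0:
        raise ValueError("position must lie between -1.0 and 1.0")
    angle = (position + 1.0) * math.pi / 4.0  # maps [-1, 1] onto [0, pi/2]
    return math.cos(angle), math.sin(angle)


def pan_mono(samples, positions):
    """Turn mono samples into (left, right) pairs, one pan position per sample."""
    if len(samples) != len(positions):
        raise ValueError("samples and positions must have the same length")
    stereo = []
    for sample, position in zip(samples, positions):
        left_gain, right_gain = constant_power_gains(position)
        stereo.append((sample * left_gain, sample * right_gain))
    return stereo


# A continuous movement: the position glides from left to right.
glide = pan_mono([1.0] * 5, [-1.0, -0.5, 0.0, 0.5, 1.0])

# A cut: the position jumps from hard left to hard right between two samples.
jump = pan_mono([1.0] * 4, [-1.0, -1.0, 1.0, 1.0])
```

In the glide the gains change gradually from one sample to the next; in the jump the whole signal moves from one channel to the other at once, which is the abrupt relocation that a cut imposes on a panned voice.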
Despite its inherent disruption, the idea of closely matching image and
sound space seems to have an almost irresistible appeal. As a result, the
practice is being revived with every new iteration of sound technology.
At this point it is unclear to what extent the use of sonic teleporting has
become an established norm in the industry. The phenomenon remains
somewhat inexplicable to sound designers and mixers themselves. But if,
indeed, a new generation of filmgoers will eventually take the orbiting
voices as a fairly unobtrusive phenomenon, it is hard to imagine that
anyone can be habituated to the effect of instant teleporting (unless that,
too, has become a banal reality). Mark Kerins claims in his theory of the
ultrafield that an 'accurate' (I would suggest instead the term 'close')
matching of the sound space and the image space helps the audience to
locate characters in the diegesis independent of the image. While this is
quite true if, as we have seen in the opening shot of Gravity, the two spaces
are presented in a continuous fashion, the theory of the ultrafield may have
significant difficulties explaining dialogue sequences based on the most
banal shot-reverse shot structure. How can the acoustical information be
of any help if it keeps contradicting itself? How could anyone locate the
characters in the diegesis with eyes closed if they keep jumping every two
seconds? If, as in Kerins's examples, this kind of disorientation is precisely
what is needed in the plot (it might be said that every blockbuster needs a
scene of this nature), then disorientation is what you get. But what about
other circumstances (i.e., the majority of the time) when disorientation is not
wanted? Regardless of what path cinema sound might take, the issue of
space will certainly play a key role in its future negotiations.
Abrahamson, L. (2014), Frank, film, UK, Ireland, USA: Runaway Fridge
Productions, Film 4.
Allen, Ioan (1991), 'Matching the Sound to the Picture', Dolby Laboratories.
Altman, Rick (1991), '24-Track Narrative? Robert Altman's Nashville',
CiNeMAS, 1(3).
Altman, Rick (1992), 'Sound Space', in Sound Theory, Sound Practice,
NJ: Routledge.
Altman, Robert (1974), California Split, film, USA: Columbia Pictures Co.
Altman, Robert (1975), Nashville, film, USA: ABC Entertainment.
Altman, Robert (1976), Buffalo Bill and the Indians, film, USA: Dino de
Laurentiis Co.
Altman, Robert (1977), Three Women, film, USA: Lions Gate.
Altman, Robert (1978), A Wedding, film, USA: Lions Gate.
Balazs, Bela (1970), Theory of the Film; Character and Growth of a
New Art, New York: Dover Publications.
Belton, John (1992), '1950s Magnetic Sound: Frozen Revolution',
in Rick Altman, ed. Sound Theory, Sound Practice, New York:
Blake, Larry (1984), Film Sound Today: An Anthology of Articles from
Recording Engineer/Producer, Hollywood: Reveille Press.
Chion, Michel (1994), Audio-vision: Sound on Screen, New York:
Columbia University Press.
Chion, Michel (1999), The Voice in Cinema, New York: Columbia
University Press.
Cuaron, Alfonso (2013), Gravity, film, US: Warner Bros.
Coulthard, Lisa (2012), 'Haptic Aurality: Resonance, Listening and
Michael Haneke', Film-Philosophy, 16(1), pp. 16–29.
Doane, Mary Ann (1980), 'The Voice in the Cinema: The Articulation of
Body and Space', Yale French Studies (60), pp. 33–50.
Dreyer, Carl Theodor (1928), The Passion of Joan of Arc, film, France:
Societe generale de films.
Fischer, Lucy (1985), 'Applause: The visual and acoustic landscape', in
Film Sound: Theory and Practice, pp. 232–246.
Fondas, Scott (2013), 'Why Gravity could be the world's biggest
avant-garde movie', Variety, Oct 7.
Gilliam, Terry (1998), Twelve Monkeys, film, USA: Universal et al.
Godard, Jean-Luc (1967), Two or Three Things I Know about her, film,
France: Argos Films et al.
Gray, Tim (2013), 'Sandra Bullock On the Emotional Rollercoaster of
Filming Gravity (VIDEO)', http://variety.com/2013/film/awards/sandra-bullock-on-the-emotional-rollercoaster-of-shooting-gravity1200688811/,
accessed 29 July 2015.
Gunning, Tom (2007), 'Moving Away from the Index: Cinema and the
Impression of Reality', differences, 18(1), pp. 29–52.
Hoberman, J. (2013), 'Drowning in the digital abyss', The New York
Review of Books, Oct 11.
Howard, Ron (1995), Apollo 13, film, USA: Universal.
Hunt, Peter R. (1969), On Her Majesty's Secret Service, film, UK: Danjaq.
Kaufman, Debra (n.d.), 'Creating the 3D in Gravity', https://
accessed 9 July 2015.
Kaufman, Philip (1983), The Right Stuff, film, USA: The Ladd Company/
Warner Bros.
Kerins, Mark (2010), Beyond Dolby (Stereo): Cinema in the Digital Sound
Age, Bloomington: Indiana University Press.
Kroesen, J.C. (1928), 'New Light on Talkies', American
Cinematographer 9.4 (July).
Koster, Henry (1953), The Robe, film, USA: 20th Century Fox.
Kubrick, Stanley (1968), 2001: A Space Odyssey, film, USA: MGM.
Lucas, George (1977), Star Wars, film, USA.
Maas, Arnt (2008), 'The proxemics of the mediated voice', in Lowering
the Boom: Critical Studies in Film Sound, Urbana: University of Illinois
Press, pp. 36–50.
Marks, Laura U. (2000), The Skin of the Film: intercultural cinema,
embodiment, and the senses, Durham: Duke University Press.
Méliès, Georges (1902), Trip to the Moon, film, France: Star Film.
Pierson, Ryan (2015), 'Whole-Screen Metamorphosis and the Imagined
Camera (Notes on Perspectival Movement in Animation)', Animation,
10(1), pp. 6–21.
Quinlivan, Davina (2012), The Place of Breath in Cinema, Edinburgh:
Edinburgh University Press.
Scott, Ridley (1982), Blade Runner, film, USA: The Ladd Company et al.
Scott, Ridley (2012), Prometheus, film, USA: 20th Century Fox et al.
Snow, Michael (1971), La Region Centrale, film, Canada.
Sobchack, Vivian (1992), The Address of the Eye: A Phenomenology of Film
Experience, Princeton, N.J.: Princeton University Press.
Sobchack, Vivian (2005), 'When the Ear Dreams: Dolby Digital and the
Imagination of Sound', Film Quarterly, 58, pp. 2–15.
Sontag, Susan (1965), 'Imagination of Disaster', Commentary, 42–8.
Travis, Pete (2012), Dredd, film, USA: DNA Films et al.
Sokurov, Aleksandr (2011), Faust, film, Russia: Proline Film.
Vadim, Roger (1968), Barbarella, film, France: Dino de Laurentiis Co.
Von Trier, Lars (1996), Breaking the Waves, film, Denmark: Argus Film
Produktie et al.
Welles, Orson (1941), Citizen Kane, film, USA: RKO.
Dong Liang recently received his PhD in Cinema & Media Studies from
the University of Chicago. His dissertation is titled 'The World Heard:
Sound, Film Theory and the Cinematic Experience'. He is currently
teaching at San Jose State University and his research interests include film
sound, digital cinema, media technologies, 3D and virtual reality.
Contact: dliang@uchicago.edu
The second half of this article will be published in The New
Soundtrack 6.2.