Pinecone Research Labs
A systematic review of the most appropriate methods of achieving spatially enhanced audio for headphone use
Benjamin Costerton
May, 2013
Abstract
Contents
1 Introduction
1.1 Primary research areas
3 Methodology
3.1 Introduction
3.2 Summary
3.3 Extended Methodology
3.4 Method of Research Analysis
4 Results
4.1 Introduction
5 Discussion
5.1 Timbral Issues
5.2 Timbral Issues Related to Headphones
5.3 5.1 Surround to Binaural
5.4 Issues with 5.1 Surround - Binaural Re-recording
5.5 Audio/Visual Scene
5.6 End User Manipulation
6 Conclusion
6.1 HRTFs
6.2 Externalisation
6.3 Re-recording Techniques
6.4 Suggested Further Research
7 Glossary
8 References
9 Bibliography
1 Introduction

In 2013 surround sound can be experienced not only in cinemas but also at home while watching movies (Anon, (F). n.d), playing video games (Anon, (M). n.d), while on the computer (Anon, (N). n.d) and, in some cases, in the car (Anon, (O). 2013). However, surround sound is not yet found in portable media. Mp3 players, tablets, phones, and other forms of portable media that are commonly used with headphones do not have access to this same surround audio experience.

Personal digital media is widely used every day at home, at work, on trains, buses, and in the car, as well as many other places. A 2012 market survey found that 30.5% of the UK population owns a smartphone or tablet, and estimates show that by 2016 this could rise to 65.1% (Anon, (K). 2013). In a 2011 study on portable media usage, 558 smartphone users and 419 tablet users were asked where they would commonly use their devices. When asked about using devices on a short commute ("on a bus or tube" (Evans, S. 2011)), 51% of smartphone users and 26% of tablet users said they use their devices. 66% of smartphone users and 55% of tablet users said they would use their devices on longer commutes ("Long train journey or on plane" (Evans, S. 2011)), and 53% of smartphone users and 24% of tablet users stated they would use their devices while "waiting for a bus/train, in a queue" (Evans, S. 2011). The study also showed that 34% of smartphone users and 36% of tablet users regularly use their devices for 'Music stored on device', 12% of smartphone users and 24% of tablet users regularly use their devices for 'films stored on device', and finally 20% of smartphone users and 28% of tablet users say they use their devices for 'casual gaming'. To have access to privacy while listening to content on such devices users will need to incorporate headphones into their set-up, and with a 34% increase in the sale of wireless headphones in 2012 (Anon, (L). n.d), this could be an indication of the need for more immersive audio on our devices.

The soundtracks that accompany video on these devices are in most cases a two-channel mix produced after the original multi-channel (5.1, 7.1 etc.) mixes found on DVD and Blu-ray. Commonly this is done via Lt/Rt (Left total/Right total) down-mixing, for example: "When the mixer completes the final surround mix for a film, he processes the film through a Dolby DS4 processor...The result is a 2-channel print master ready to be converted into optical soundtrack" (Purcell, J. 2007 p.314). These mixes normally limit spatial perception to a horizontal field, only allowing humans to hear differences from left to right. With the use of headphones, it is possible to achieve extremely realistic 3D sound, as an individual's own spatial perception can be recreated using the speakers covering his/her ears. This level of 3D realism in sound can be found in binaural media. By reproducing in the recordings the spatial cues that the brain deciphers every day, the brain is able to believe it is hearing sound from all directions while only listening to the two speakers (headphones). The dissertation explores the possibilities of producing a more realistic 2.0 stereo soundtrack to accompany visual content on portable media devices.

1.1 Primary research areas

The research has taken the following areas of personal media consumption and audio format into consideration. Surround audio in a consumer sense has been relatively successful in that several hardware developers such as Sony (Anon, (F). n.d), Samsung (Anon, (G). n.d), LG
(Anon, (H). n.d), Panasonic (Anon, (I). n.d) and Philips (Anon, (J). n.d) have their own 5.1 home cinema systems on the market. These companies will, however, have to consider how many speakers consumers will want in their homes before they feel uncomfortable. There are currently headphones on the market with multiple drivers in each ear cup, designed specifically to achieve surround sound in headphones. The Razer Tiamat 7.1 features "10 discrete drivers built-in to deliver the ultimate 7.1 Surround Sound experience" (Anon, (A). n.d), and is one solution the manufacturer Razer feels is the answer to achieving surround sound for headphones.

The research has explored the possibilities of achieving improved spatial characteristics for 2.0 stereo headphones, but has not explored any multi-channel headphone options. The dissertation focuses on changing end-user and content creator technology as little as possible, so demanding a change in headphone usage is not something that has been considered.

An area of study that is heavily focused on in the research is binaural audio. The term binaural audio is used when recordings have been made in a way that replicates how a human hears and localises sound. "The term binaural stereo is usually reserved for signals that have been recorded or processed to represent the amplitude and timing characteristics of the sound pressures at two human ears" (Rumsey, F. 2001 p.13). Binaural recordings are commonly made using a dummy head, featuring microphones where you would find eardrums, artificial ear canals, and replicas of human pinnae (the part of the ear visible from outside the head). The two microphones placed inside the head will record sound independently while being subjected to all the spatial cues that humans hear every day and use to localise sound. To experience the desired effect, binaural recordings must be listened to over headphones, so that the signals captured on the left of the stereo image are played only to your left ear, and the right signals only to your right ear. When listening to binaural recordings you are listening to what the microphones on the respective side of the dummy head recorded, and all the spatial cues that were recorded along with it. Because of this you will be able to hear sound in 3D over a normal pair of stereo headphones, using only basic spatial information that your brain deciphers every day.

Headphones therefore will always play a significant part in the results of binaural audio, as well as spatially improved recordings and synthesis technology. The dissertation acknowledges that the frequency response of the different headphones used will affect the end results of spatially improved audio, and that timbral issues may surround different headphone choices. "[Are timbral issues brought about by the use of BRIR and HRTF data] any worse than the difference between some cheap headphones that you get with an mp3 player versus some nice Sennheisers" (Mason, R. Personal communication, 2013). The dissertation however does not discuss the consumer habits of headphone usage, nor does it explore in great detail the implications of using different headphones with audio designed to achieve a greater spatial impression. This would have to be explored in a separate body of research, as the factors involved, and how they would affect the results of this body of research, are too great. It is important to note that the research will not be looking into the use of binaural soundtracks for DVD/Blu-ray releases, and will not be looking to replace a 2.0 stereo mix on DVD/Blu-ray. Instead the dissertation focuses on digital versions of content designed for use on portable devices such as tablets and mp3 players where there is a sole consumer present.

The dissertation does not discuss in any great detail the use of head-tracking as this would insist on changes to end-user hardware. Instead the research focuses on
methods of achieving spatial awareness where changing or manipulating end-user hardware is kept to a minimum. There are issues surrounding the 'audio scene' and its relationship to what is seen on the screen, and how the relationship between the two is broken when users move their heads. There are currently non-portable solutions such as the Realiser A8 developed by Smyth Research (Anon, (E). n.d) that include the use of head tracking to shift the audio scene with the user, but the research is not focusing in any great detail on these areas.

The dissertation concludes with what have been found to be the most significant contributing factors that will need to be considered in order to achieve a spatially enhanced audio track. The dissertation explores methods of making content that is originally mixed for a multi-speaker setup available to portable media users in a form that is closer to the original than a 2.0 stereo mix, and that will improve end-user experiences.

The use of ambisonic recordings is also taken into consideration during the research because of its flexibility to reproduce a 3D space over several formats. However, as this adds limitations and will insist on change within the post production process for the audio content, it is not considered in any great detail in this research.

The underlying rule throughout the dissertation is to make as little change as possible to methods used in content creation and to end-user habits. There is a strong emphasis on how appropriate each proposed method is to achieving a spatially improved soundtrack in relation to both content creators and end-users.

2 Literature Review

2.1 HRTFs, ITD, ILD, and EQ

As the space between each person's ear drums, the length of his/her ear canals, and the shape of the pinnae are different, each person will have a unique set of head-related transfer functions, or HRTFs. HRTFs are a collection of measurements referring to how a certain head hears sound from different sound cues. The information given in an HRTF shows how the brain-ear combination can decipher, among other things, which direction a sound is coming from. There is a problem with collecting HRTFs however. As no two humans have the same shaped head, each person's HRTFs will be different, and therefore recording HRTF values for binaural reproduction will only create an accurate representation for the subject used in the recording. This can cause problems for binaural reproduction, as the brain will not be used to these alien sound cues. "People that have tried experiments where they are given another person's HRTF, by blocking their own pinnae and feeding signals directly to the ear canal, have found that their localising ability is markedly reduced. After a short time, though, they appear to adapt to the new information." (Rumsey, F. 2001 p. 26). These HRTFs are made up of several pieces of information including interaural time difference (ITD), interaural level difference (ILD), changes in equalisation (EQ), and resonances from parts of the ear such as the concha and the ear canal.

The ITD and ILD are key measurements for shifting the location of a phantom audio image; however, on their own it is hard to achieve anything more than horizontal panning. Changing the ITD and ILD of sound cues can shift the phantom image of sound as seen in conventional panning techniques. As well as the size of people's heads being inconsistent, there are some common problems when measuring time differences. While measuring localisation cues, time
difference in this example, attention needs to be paid to the frequencies that our ears, or a binaural microphone, will be hearing. Rumsey explains in his book Spatial Audio (2001) that human ability to localise sound depends on frequency, and that humans will only be sensitive to a difference in phase at low frequencies, generally no higher than 1 kHz (Rumsey, F. 2001). Rumsey goes on to state that "[Soundwaves] also give ambiguous information above about 700 Hz where the distance between the ears is equal to half a wavelength of the sound" (Rumsey, F. 2001 p. 23). This 'ambiguous' information is brought about because humans are less able to localise sound once half a wavelength is shorter than the space between the two ears. If the ears are hearing two different waveforms at the same time, the brain is less able to determine which ear is leading and which ear is lagging behind, causing confusion about the location of sound sources.

2.2 EQ

A possible key factor in producing usable HRTF data for spatially enhanced audio is EQ. Research has shown that there are changes in EQ depending on the direction as well as the elevation of a sound source (Han, H.L. 1994) and that this relies on the different parts of our outer ear (helix, fossa, antihelix, and concha). As sound arriving at the head is greeted by an outer ear that does not change in everyday life, EQ could possibly be used to alter the perception of audio in the vertical and horizontal fields, and assist in height and front/back perception.

H.L. Han explains (Han, H.L. 1994) what effect different parts of the ear have on altering the EQ of sound sources at different heights on a 90° azimuth. "...the concha is the most active part in funneling sound if the source is on ear level or up, and that the part between the concha and helix acts as an acoustic amplifier for sounds coming from below" (Han, H.L. 1994, p.19). He goes on to state that "shoulder reflections do not play a significant role in the high frequencies... any effects of such a reflection must be insignificant compared with the effects of the ear canal and the pinna" (Han, H.L. 1994, p. 19). H.L. Han's research goes on to show how the concha is the most important part of the ear when it comes to detecting early reflections, and is therefore important for ITD and ILD. "Covering part F [Fossa] with eardown has very little effect on the first arriving sounds, while covering the concha C makes the double peak [initial sound and early reflections shown on a graph] fuse into one. Therefore only the concha is responsible for splitting up the first arriving sound". (Han, H.L. 1994, p. 19). These results show that as sound arrives at the ear from different directions, the EQ of the sound will be changed, and therefore so will the perception of location. In the same way that sound arriving from different heights will be affected by different parts of the outer ear, sound arriving from different directions at 0° elevation should also be affected by the outer ear. The outer ear is clearly different when viewed from 30°/-30° and 110°/-110° angles (common locations of L/R and Ls/Rs speakers in a 5.1 speaker set-up), so differences should be expected in the EQ response of sound sources behind and in front of the head. If these differences in EQ are experienced when listening to sound cues in front of and behind the head, it should be possible to use this EQ information to aid in reproducing sound at different locations relative to the head.

2.3 Concha Resonance/Externalisation

As the research is focusing on a situation where the listener is mainly using headphones, the headphones used will play a big part in the success of the final reproduction. A known issue with headphone use is the sense of
externalisation, or more importantly the lack of it. "The so-called concha resonance (that created by the main cavity in the centre of the pinna) is believed to be responsible for creating a sense of externalization" (Rumsey, F. 2001 p.26). As the majority of headphones supplied with devices such as iPods/mp3 players, phones, and tablets are in-ear designs (meaning that the driver sits in the concha as opposed to over the entire ear), sound will not resonate in the concha before entering the ear canal and therefore this sense of externalisation will not occur. As a result, listening to audio using such headphones can make sound sources seem to come from inside the head, an undesirable characteristic.

2.4 Audio/Visual Scene

... impression of a very large screen (Anon, (B). n.d). The glasses also feature ear buds close to the ears, locking the audio and visual scenes together. Users no longer have the freedom to explore their surroundings using this technology, but it may be a solution to pairing the audio and visual back together. However, consideration will have to be made as to whether users will want the audio and visual scene to be locked together in this format. In the real world the audio and visual scenes are locked together, but humans also have the freedom to explore. In everyday life humans have the freedom to move their heads slightly to better hear a sound cue, and therefore to locate a given sound. Using technology such as the Vuzix glasses removes this functionality and possibly the ability to localise sound.
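The interaural cues described in section 2.1 can be put into rough numbers. The sketch below is illustrative only: it computes the frequency at which half a wavelength equals a given path between the ears, and an interaural time difference using Woodworth's spherical-head approximation. The speed of sound (343 m/s), the head radius (0.0875 m) and the choice of the Woodworth model are assumptions made for this example and are not taken from the sources cited in this dissertation.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees C


def half_wavelength_frequency(ear_path_m):
    """Frequency (Hz) whose half wavelength equals the path between the ears."""
    return SPEED_OF_SOUND / (2.0 * ear_path_m)


def woodworth_itd(azimuth_deg, head_radius_m=0.0875):
    """Interaural time difference (s) for a rigid spherical head.

    Woodworth approximation, intended for azimuths from 0 (straight ahead)
    to 90 degrees (directly to one side). The head radius is an assumption.
    """
    theta = math.radians(azimuth_deg)
    return head_radius_m / SPEED_OF_SOUND * (theta + math.sin(theta))


# Under these assumptions a path of about 0.245 m gives roughly 700 Hz,
# the figure quoted from Rumsey in section 2.1.
print(round(half_wavelength_frequency(0.245)))
# ITD for a source at 30 degrees, a typical front loudspeaker angle.
print(woodworth_itd(30.0))
```

A shorter assumed path between the ears pushes the ambiguous region higher in frequency, which is one way of seeing why head size matters when such cues are measured on one person and replayed to another.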
3.2 Summary
The dissertation has explored many of the most discussed methods of achieving an improved spatial awareness over headphones and conclusions have been drawn. Many of these topics, including the points raised in the table above, are explored in greater detail in the discussion section.

5 Discussion

5.1 Timbral Issues

In January 2012 BBC R&D worked together with BBC Radio 4 to produce a binaural production of Private Peaceful, the book by Michael Morpurgo (Brun, R. 2012). The 88 minute dramatization featured a reproduction of a 5.1 speaker system, and had 4 variations. At the start of each variation the listener would hear a series of test signals, allowing for a choice of which version gives the listener the best spatial experience. By doing this, BBC R&D have accepted that there will be variations in the success of the binaural reproduction, and therefore provided different mixes based on different sets of HRTF data. The release of Private Peaceful had an accompanying survey which all listeners were asked to complete. It asked questions about the success that the binaural reproduction had with the listeners and which version (1-4) the listener thought was most successful.
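A binaural reproduction of a loudspeaker layout is commonly sketched as convolving each loudspeaker feed with a pair of head-related impulse responses (one per ear) and summing the results. The minimal example below shows that idea for several alternative impulse response sets, mirroring the idea of offering listeners more than one version. It is not the BBC's implementation: the function names, the channel labels and the two-sample placeholder impulse responses are all hypothetical, and real measured responses would be far longer.

```python
def convolve(signal, impulse):
    """Direct-form linear convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(impulse) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse):
            out[i + j] += s * h
    return out


def add_into(total, part):
    """Sum two sample lists, padding the shorter one with zeros."""
    length = max(len(total), len(part))
    total = total + [0.0] * (length - len(total))
    part = part + [0.0] * (length - len(part))
    return [a + b for a, b in zip(total, part)]


def render_binaural(channels, hrir_set):
    """Mix loudspeaker feeds down to a left-ear and a right-ear signal.

    channels: {speaker name: samples}
    hrir_set: {speaker name: (left-ear response, right-ear response)}
    """
    left, right = [], []
    for name, samples in channels.items():
        left_ir, right_ir = hrir_set[name]
        left = add_into(left, convolve(samples, left_ir))
        right = add_into(right, convolve(samples, right_ir))
    return left, right


# Hypothetical placeholder responses: the far ear hears a delayed, quieter copy.
hrir_set_a = {"L": ([1.0, 0.0], [0.0, 0.5]), "R": ([0.0, 0.5], [1.0, 0.0])}
hrir_set_b = {"L": ([0.9, 0.0], [0.0, 0.6]), "R": ([0.0, 0.6], [0.9, 0.0])}

# A single click in the left loudspeaker feed, silence in the right.
feeds = {"L": [1.0, 0.0, 0.0], "R": [0.0, 0.0, 0.0]}

# One rendered version per response set, so a listener could pick between them.
versions = {
    label: render_binaural(feeds, hrirs)
    for label, hrirs in {"A": hrir_set_a, "B": hrir_set_b}.items()
}
```

Each entry in `versions` holds a left-ear and right-ear signal for the same programme material, so the only difference between versions is the set of impulse responses applied.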
The point raised by Dr Mason in this interview is that even if a successful reproduction of audio in a more realistic 3D space is achieved, the effect could be damaged by the end user's choice of headphones. Cheaper headphones, and indeed more expensive headphones with EQ colouration, will have an influence on how the audio is heard by the end user. With the research that this paper discusses into constructing HRTF and BRIR data through measuring impulse responses and EQ, among other methods, it is clear that this data would need to remain intact for a successful reproduction ...

Using re-recording techniques like this loses control over the space of the room, meaning that once the audio is recorded, the size of the room cannot be changed. The audio is also being applied to one set of HRTF data, one room, and a fixed position, meaning head tracking would not be an option for audio re-recorded in this manner. As re-recording techniques will use a single impersonal collection of HRTFs, users will likely find the recordings confusing at first, which lowers the chances of success. Having one output from this technique means the end user will not be able to choose which binaural
recording works best for them, as seen in the Private Peaceful radio drama produced by BBC R&D and BBC Radio 4. Having a single binaural re-recording of multi-channel audio removes the need for multiple versions of the audio, which would not fit onto distribution formats such as DVD and Blu-ray, but it also limits how well the audio is received by end users, and therefore will not work as a simple solution to the problem.

These techniques however are seen in other areas of audio production. Walter Murch, sound designer and re-recording mixer for films such as Apocalypse Now (1979) and the Godfather trilogy (1972, 1974, 1990), has discussed his use of 'worldizing', a technique in which sounds or lines of dialogue are played over a loudspeaker into a desired location, and then re-recorded for use in post production (Murch, W. n.d). These lines of dialogue were then balanced against the original recordings until the desired effect was produced. This example shows how the space required for the audio could not be designed in post production, and therefore needed to be sourced and recorded. Of course this example dates back to a time when digital reverb was not at everyone's disposal, and the only way to replicate a space was to record it. This technique won't in itself solve spatial awareness for multi-channel audio, but does address the issue of increasing a sense of externalisation in the same way as convolution reverb will. A further example of this has been discussed previously in this paper through the work of Irving Townsend on the 1959 album Kind of Blue by Miles Davis (1959). Once the album had been mixed, every component of the mix was sent to an echo-chamber (Kahn, A, 2002, p.102). This example shows how an entire body of commercial work was processed through re-recording techniques, and the longevity of this piece of work suggests that this process does not detract from the usability of the final product. Again it should be mentioned that this technique was used on a mono sound source, and even though the re-recording technique can achieve reverb, and therefore assist in externalisation, it does not achieve spatial awareness of multiple sound sources or solve issues of impersonal HRTFs.

5.4 Issues with 5.1 Surround - Binaural re-recording

It is however important to point out that using re-recording techniques with multi-channel audio is a more complex method of processing than that used on Kind of Blue, where the mix was only just experimenting with the use of two-channel stereo.

Consideration will have to be made to the equipment used in the re-recording, what limitations this equipment will have, and what effects the equipment will have on the audio, colouration or loss of frequencies for example. Using such recording methods for multi-channel audio, as previously discussed, involves binaural microphones, and therefore a choice will have to be made as to what speaker arrangement and which microphone is used. As previously mentioned, having an impersonal set of HRTF data with binaural recordings can introduce timbral and spatial problems for users. Decisions would also have to be made about the number of binaural recordings produced for any media, and how these mixes will be introduced into the market. Content creators and distributors would have to consider whether having an optional 'binaural mix' on a DVD, Blu-ray or download would be well received by the end user market.

5.5 Audio Visual Scene

As discussed earlier when looking at technology such as the Vuzix Wrap 1200 (Anon, (B). n.d), it is important to consider the relationship between audio and video
in consumer technology. This relationship is different to everyday life, where audio and visual scenes are normally locked together. For example, as a car drives from point A to point B, the visual cue (the car) and the sound cue (sound produced by the car) will move together and therefore have a fixed relationship. The challenge experienced with the audio/visual scene relationship in binaural synthesis, and indeed in all headphone use, is that the audio scene is fixed to the user's head, and the visual scene is not. This means turning away from the visual source displayed will not have any effect on the audio that accompanies it, and this removes some very basic human functionality.

One solution to audio-visual scene problems is head tracking. "Head tracking is a means by which the head movements of the listener can be monitored by the replay system" (Rumsey, F. 2001 p. 72). The replay system being used has to track where the user's head is and in what direction it is facing, and modify the HRTFs in real time. Head tracking may become very useful in the future of 3D audio as it can simulate normal physical actions performed in everyday hearing. For example, when a sound is heard off to one direction that needs to be investigated, humans will turn their heads, sometimes only slightly, to give a better audio image of the sound source. With head tracking, this is possible as the centre of the audio scene is controllable. This therefore is a downside to using spatially enhanced recordings without head tracking. If a sound cue is away from the centre of a mix in spatially enhanced audio where head-tracking is not present, the sound source cannot be moved towards the centre of the audio scene. It will remain fixed in whatever location it was placed in during the mix. But is this really that much of a problem, seeing as the creators of the content have control over where sound is placed in multi-channel loudspeaker set-ups? Whether users often feel the need to explore their audio surroundings when listening to a multi-channel speaker set-up at the movies or in home theatre is not discussed in the dissertation. The discussion over the audio visual scene is based on a 'default position' in which users of mobile devices will place their device in front of the line of sight in order to have a pleasant and comfortable viewing experience. The same argument of a 'default' position is used when discussing speaker set-ups, in which users will sit within the speaker arrangement and in front of a screen.

The requirements for head-tracking to work also include slightly more sophisticated hardware than seen in smartphones and tablets in 2013. This is due to the hardware needed to track and alter real time HRTF data; see the Smyth Research Realiser A8 for example (Anon, (E). n.d). In Francis Rumsey's book Spatial Audio he speaks of experiments carried out at the Institut für Rundfunktechnik in 1999, and notes that experiments carried out using head tracking for the localization of sound improved the usability of binaural audio. "Front-back reversals were virtually eliminated. Even more interestingly, they found that substituting the dummy head for a simple sphere microphone (no pinnae) produced very similar results, suggesting that the additional spectral cues provided by the pinnae were of relatively low importance compared with the effect of head rotation" (Rumsey, F. 2001 p. 73). These results show that the use of head movement is more important than what can be learnt from the human head and spectral cues alone, and that to achieve a full reproduction of 3D audio over headphones, head-tracking may be hard to ignore.

5.6 End user manipulation

It has become apparent that delivering a piece of ready-to-go spatially improved multi-channel audio has many issues surrounding it. Early in the research head-tracking was ruled out as a research topic and a possible method of improving 3D spatial awareness for end-users, based on the fact that end-user technology would have to be adapted. However, it has become apparent that this may be the only option available in order to avoid the series of issues involved with manipulating audio pre-user involvement as discussed previously.

... spatially enhanced audio on a portable media device.
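The head-tracking behaviour described in section 5.5 can be sketched in a few lines: the replay system subtracts the listener's head yaw from each source's azimuth and then picks the HRTF measured closest to the resulting head-relative angle. The following is an illustrative sketch only. The function names, the sign convention (positive angles to the listener's right) and the 30-degree measurement grid are hypothetical, and a real system such as the Realiser A8 may work quite differently.

```python
def wrap_degrees(angle):
    """Wrap an angle in degrees to the range [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def head_relative_azimuth(source_azimuth, head_yaw):
    """Azimuth of a source as heard from the current head orientation."""
    return wrap_degrees(source_azimuth - head_yaw)


def nearest_measurement(azimuth, measured_azimuths):
    """Pick the measured HRTF direction closest to the requested azimuth."""
    return min(measured_azimuths, key=lambda m: abs(wrap_degrees(azimuth - m)))


# Hypothetical grid: HRTFs measured every 30 degrees around the head.
grid = list(range(-180, 180, 30))

# A front-right loudspeaker at 30 degrees; the listener turns 30 degrees
# towards it, so the source should now be rendered straight ahead.
relative = head_relative_azimuth(30.0, 30.0)
print(relative, nearest_measurement(relative, grid))
```

Without head tracking the yaw term is effectively fixed at zero, which is the situation described above where a source remains wherever it was placed in the mix.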
6 Conclusion

The following concludes the key areas of discussion found to be most relevant. Each subject featured in the conclusion has been previously discussed in the dissertation. The conclusion also features what have been seen as the most appropriate areas of further research.

6.1 HRTFs

The majority of methods employing a one-size-fits-all solution will without doubt face problems at an end-user level. The dissertation has been heavily focused throughout on improving spatial awareness for multi-channel audio on portable media with the end user in mind, and although methods such as measuring HRTF data or binaural re-recording may achieve success on a theoretical level, it has been shown that this may not be the case for every end-user. Several of the pieces of research discussed comment that timbral issues are the main problem in achieving greater spatial awareness (Pike, C. Personal Communication. 2012). The solution to creating better spatial awareness for end-users will rely heavily on the appropriateness of whatever technology is decided on, and if timbral issues are a worry after creating four separate mixes (Pike, C. Personal Communication. 2012), it may suggest that implementing HRTFs prior to distribution may not be appropriate for the end-user.

The dissertation has concluded that however extensive the HRTF data that is collected at a content creation level, it may never be close enough to the personal HRTF data that end users can obtain through technology such as the Smyth Research Realiser A8 (Anon, (E). n.d). The conclusion can be made that the most appropriate method of making multi-channel audio accessible to end-users over headphones would be to distribute multi-channel audio for decoding at an end-user level.

It is also worth noting that badly implemented HRTFs could quickly damage the chances of technology that strives to improve spatial awareness over headphones being accepted. If end-users do not experience an improvement in their spatial awareness of audio from the first use, as the technology would advertise, then the credibility of the technology would be greatly damaged, and end users could possibly lose faith in achieving greater spatial awareness.

6.2 Externalisation

Methods of achieving externalisation through technology such as the VRM Box from Focusrite have been discussed, but as with many aspects of spatial audio what works for one individual may not work for another. As explained by Dr Russell Mason, "for some listeners it can work fine, as in people get externalisation and they can imagine it being there, wherever the loudspeakers are arranged, without head-tracking. [For] other listeners it never seems to work, they never seem to get away from the fact that they have headphones on, so therefore it must be inside my head... People are used to listening on headphones and getting [audio] inside their head so therefore they can't believe it's anything different" (Mason, R. Personal communication, 2013). Any results concluded will always be dependent on acceptance from end-users. As Dr Mason previously stated, some users simply will never accept externalisation while listening to audio over headphones.

6.3 Re-recording Techniques

A binaural re-recording of multi-channel audio would create a 3D image, and
21
depending on the room would capture a personalised HRTF data for the end-user
good sense of externalisation, but through a similar calibration set-up seen
unfortunately the same issues surrounding with the Smyth Research Realiser A8, and
a one-size-fits-all HRTF recording still be able to track head movements of the
apply with re-recording. Many end-users user, together with room and monitor
will undoubtedly experience issues characteristics collected through
because "its not your own pinna, and you convolution reverb and BRIR, similar to
are having to use someone else's ears what is seen with the VRM Box, then this
effectively" (Mason, R. Personal may give the user the best chance of
communication, 2013). As discussed experiencing an improved spatial
previously, using someone else's pinna awareness while listening to audio on
can cause confusion and timbral issues for portable media devices.
many users.
By having an application implement the
Another key point made at the beginning spatial characteristics needed to improve
of the dissertation was to cause as little spatial awareness at the end-user stage,
disturbance to how content creators and nothing is being asked from the content
end-users work and consume their audio creators. Access to multi-channel audio is
content. By introducing binaural re- possible through the use of DVD and Blu-
recording techniques as a possible ray, and although this information would
solution, content creators are being asked need to be extracted from such formats,
to introduce a further stage in the using this multi-channel audio would mean
distribution process. Asking the production no effort would have to be made by
companies to introduce such a process content creators/distributors to allow end-
would not appear to be an appropriate users to experience improved spatial
option. awareness in portable media.
7 Glossary:

2.0 Stereo - Stereo sound featuring two channels of audio designed to be played in front of the listener, equidistant from the listener's head.
5.1 Surround - Surround sound format consisting of three front channels, two rear channels, and a sub channel (LFE).
7.1 Surround - Surround sound format consisting of three front channels, two side channels, two rear channels, and a sub channel (LFE).
LFE - Low Frequency Effects. This channel is the (.1) in a multi-channel setup.
Binaural - Binaural refers to audio recordings that attempt to replicate the human hearing experience as closely as possible. Binaural recordings are often made using a dummy head featuring microphones where the eardrums are found in a human. More sophisticated models feature moulds of real human ears, and the addition of synthetic ear canals. In some cases shoulders are also added to the dummy head to assist in replicating the human body more accurately.
Dummy head - A model of a human head featuring microphones where eardrums would be found. Other human characteristics are added to more sophisticated models.
Binaural Synthesis - Refers to post-production techniques designed to manipulate stereo recordings with the aim of making them more realistic in reference to human hearing.
Ambisonics - Multi-channel recording technique designed to reproduce acoustics and directionality in sound recordings.
HRTF - Head Related Transfer Function. A collection of personal measurements documenting how an individual hears sound in relation to acoustic properties and directionality.
HRIR - Head Related Impulse Responses. An impulse response recorded in any given space, normally with a binaural head.
Impulse Response - A short (normally 1 frame) sound, normally a clap or starting pistol, recorded in a room. Recording the sound excites the acoustic properties of the room for use in convolution reverb.
ITD - Inter-aural Time Difference. The time difference between two ears experienced from a sound reaching the head.
ILD/IID - Inter-aural Loudness Difference (also Inter-aural Intensity Difference). The loudness/intensity difference between two ears experienced from a sound reaching the head.
EQ - The frequency information of sound.
Phantom Image - A sound source (not of physical existence) created between physical sound sources through the use of panning.
BRIR - Binaural Room Impulse Responses. Impulse response recordings used to excite a room's acoustic properties, allowing for use in post-production. For BRIR these recordings are made using a binaural head.
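
The ITD and ILD entries above can be made concrete with a small Python sketch that estimates both values from a pair of ear signals. This is an illustrative assumption rather than a method used in the dissertation: the function names, the sign conventions, and the toy signals are invented for the example. ITD is taken as the lag at the peak of the cross-correlation, and ILD as the ratio of RMS levels in decibels. Both functions assume non-empty, non-silent signals.

```python
import math


def interaural_level_difference_db(left, right):
    """ILD in dB; positive when the left-ear signal is louder."""
    rms_left = math.sqrt(sum(x * x for x in left) / len(left))
    rms_right = math.sqrt(sum(x * x for x in right) / len(right))
    return 20.0 * math.log10(rms_left / rms_right)


def interaural_time_difference_s(left, right, sample_rate, max_lag):
    """ITD in seconds from the cross-correlation peak.

    Positive when the right-ear signal lags the left-ear signal,
    i.e. the sound reached the left ear first.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, sample in enumerate(left):
            m = n + lag
            if 0 <= m < len(right):
                score += sample * right[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sample_rate


# Toy signals (hypothetical): the right ear receives the same click two
# samples later and at half the amplitude.
left_ear = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
right_ear = [0.0, 0.0, 0.0, 0.5, 0.0, 0.0]
ild_db = interaural_level_difference_db(left_ear, right_ear)
itd_s = interaural_time_difference_s(left_ear, right_ear, 48000, 4)
```

With these toy signals the left ear is about 6 dB louder and leads by two samples, which at an assumed 48 kHz sample rate is roughly 42 microseconds.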
8 References:

Apocalypse Now. (1979) Directed by Francis Ford Coppola. 153 mins. Zoetrope Studios. DVD
Anon, (A). (n.d) Razer Tiamat 7.1 [Online]. Razer. Available from: http://www.razerzone.com/store/razer-tiamat-71 [Accessed 9th March 2013]
Anon, (B). (n.d) Vuzix Wrap 1200 [Online]. Vuzix. Available from: http://www.vuzix.com/consumer/products_wrap_1200.html [Accessed 20th February 2013]
Anon, (C). (n.d) MPEG Surround [Online]. MPEG Surround. Available from: http://www.mpegsurround.com/index.html [Accessed 23rd January 2013]
Anon, (D). (n.d) VRM Box [Online]. Focusrite. Available from: http://uk.focusrite.com/usb-audio-interfaces/vrm-box [Accessed 21st March 2013]
Anon, (E). (n.d) Realiser A8 [Online]. Smyth Research. Available from: http://smyth-research.com/products.html [Accessed 21st March 2013]
Anon, (F). (n.d) BDV-E2100 3D Blu-ray Home Cinema System [Online]. Sony. Available from: http://www.sony.co.uk/product/hch-systems-with-blu-ray-disc/bdv-e2100 [Accessed 11th April 2013]
Anon, (G). (n.d) HT-D550W; 5.1ch DVD Home Theatre System [Online]. Samsung. Available from: http://www.samsung.com/hk_en/consumer/tv-av/home-theater/5-1-home-theatre-set/HT-D550W/ZK?pid=hk_en_5.1hometheatresetsubtype_keyvisual1_ht-d550w/zk_20130123 [Accessed 11th April 2013]
Anon, (H). (n.d) LG BH9520TW Cinema 3D Sound Home Cinema System [Online]. LG. Available from: http://www.lg.com/uk/home-entertainment/lg-BH9520TW-home-cinema-system [Accessed 11th April 2013]
Anon, (I). (n.d) SC-BTT290 3D Blu-ray Home Cinema [Online]. Panasonic. Available from: http://www.panasonic.co.uk/html/en_GB/Products/Home+Entertainment/Blu-ray+Home+Cinema/SC-BTT290/Overview/9406265/index.html [Accessed 11th April 2013]
Anon, (J). (n.d) 5.1 Home theater 3D Blu-ray, dock iPod/iPhone [Online]. Philips. Available from: http://www.philips.co.uk/c/home-cinema-systems/3d-blu-ray-dock-ipod-iphone-hts5562_12/prd/ [Accessed 11th April 2013]
Anon, (K). (2013) In the UK, Will Mobile Payment Go Mainstream? [Online]. eMarketer. Available from: http://www.emarketer.com/Article/UK-Will-Mobile-Payments-Go-Mainstream/1009641 [Accessed 11th April 2013]
Anon, (L). (n.d) Half of Tablet and Smartphone Users Are Using These Devices to Listen to Music, According to The NPD Group [Online]. NPD Group. Available from: https://www.npd.com/wps/portal/npd/us/news/press-releases/half-of-tablet-and-smartphone-users-are-using-these-devices-to-listen-to-music-according-to-the-npd-group/ [Accessed 11th April 2013]
Anon, (M). (n.d) Complete 5.1 Surround Sound System Licenced for Xbox 360 HD Game Console [Online]. Pioneer. Available from: http://www.pioneerelectronics.com/PUSA/Home/Home-Theater-Systems/HTS-GS1 [Accessed 11th April 2013]
Anon, (N). (n.d) Logitech Speaker System Z906 [Online]. Logitech. Available from: http://www.logitech.com/en-gb/product/speaker-system-Z906 [Accessed 11th April 2013]
Anon, (O). (2013) A6 allroad quattro; Audio & Communication [Online]. Audi. Available from: http://www.audi.co.uk/new-cars/a6/a6-allroad-quattro/audio-and-communication/bose-surround-sound.html [Accessed 11th April 2013]
Anon, (P). (n.d) Ear Force DSS2 Surround Sound Processor [Online]. Turtle Beach. Available from: http://www.turtlebeach.com/product-detail/dolby-processor-accessories/ear-force-dss2/33 [Accessed 12th April 2013]
Anon, (Q). (n.d) Ear Force Z6A Multi-Speaker Surround Sound [Online]. Turtle Beach. Available from: http://www.turtlebeach.com/product-detail/pc-headsets/ear-force-z6a/44 [Accessed 13th April 2013]
Anon, (R). (2013) iPhone; Tech Specs [Online]. Apple. Available from: http://www.apple.com/uk/iphone/specs.html [Accessed 15th April 2013]
Brun, R. (2012) Private Peaceful: Drama in surround sound [Online]. BBC. Available from: http://www.bbc.co.uk/blogs/radio4/2012/02/private_peaceful.html [Accessed 13th April 2013]
Coolican, H. (2008) Research Methods and Statistics in Psychology. Bookpoint Ltd. Oxon.
Evans, S. (2011) Portable Devices [Online]. Harris Interactive. Available from: http://www.harrisinteractive.com/vault/HI_UK_Corp-%20Portable-Device-Research.pdf [Accessed 11th April 2013]
Han, H.L. (1994) Measuring a Dummy Head in Search of Pinna Cues. AES. USA
Cheng, C. Wakefield, G. (2001) Introduction to head-related transfer functions (HRTFs): Representations of HRTFs in time, frequency, and space. AES. USA
your-face-maybe-better-than-you-do/ [Accessed 24th March 2013]
Merimma, J. (2010) Modifications of HRTF Filters to Reduce Timbral Effects in Binaural Synthesis, Part 2: Individual HRTFs. AES. USA
Miles Davis. (1959) Kind of Blue. Columbia Records. California.
Murch, W. (n.d) Walter Murch Articles [Online]. Filmsound.org. Available from: http://filmsound.org/murch/murch.htm [Accessed 8th February 2013]
Purcell, J. (2007) Dialogue Editing for Motion Pictures. Focal Press. Oxford
Rumsey, F. (2001) Spatial Audio. Focal Press. Oxford
Rumsey, F. (2011) Whose head is it anyway?. AES. USA
Silverman, D. (2011) Interpreting Qualitative Data. Sage Publications Ltd. London.
The Godfather. (1972) Directed by Francis Ford Coppola. 175 mins. Paramount Pictures. DVD
The Godfather: Part 2. (1974) Directed by Francis Ford Coppola. 200 mins. Paramount Pictures. DVD
The Godfather: Part 3. (1990) Directed by Francis Ford Coppola. 162 mins. Paramount Pictures. DVD