
An investigation of Inter-Application Audio Routing on

the Macintosh OS X operating system.

Author: Richard Hallum

This document was presented as part of my final research for a Master of Music Technology
Degree.
Table of Contents

1. Abstract

2. Aim

3. Objectives and outcomes

4. The Report

Background

The research process

Introduction

An overview of audio in OS X

The Problem

Third party inter-application audio routing

Miscellaneous audio routing software

Third party software routing solutions examined

Conclusion

5. Bibliography

6. Acknowledgements

7. Appendices

2008 Richard Hallum 2



1 Abstract
The personal computer has now developed to the stage where it can be used to run several audio
applications simultaneously. The Apple Macintosh OS X platform provides many advantages for
music and multimedia users. It has an elegant and intuitive user interface which many creative
users prefer. Pre-emptive multi-tasking and protected memory give this modern operating
system impressive speed and reliability [1]. Perhaps surprisingly, OS X cannot route audio
between applications. This paper investigates why that is, and in particular examines the latency
and synchronisation issues that are inherent in audio streaming. These present very real
challenges for audio software developers who want to provide a solution that meets the demands
of audio professionals. Several third party solutions do exist, and these are examined and
compared.

2 Aim
The aim of this research is primarily to identify all available methods for routing audio between
applications on the Macintosh OS X operating system. Available software titles will be
tested in practice to find one or more workable solutions and to report on their effectiveness. The
research will also investigate what has been written to date on inter-application audio in OS X.

This project will:


- Review any literature on this subject
- Investigate how audio is supported on the Mac OS X operating system with the Apple
  technologies CoreAudio, and QuickTime
- Compare how MIDI is supported on the Mac OS X operating system with the Apple
  technologies CoreMIDI
- Compare how audio is supported on the Windows XP and Vista operating systems
- Evaluate third party software solutions for inter-application audio routing
- Evaluate other third party audio software solutions such as device selectors, and audio
  capture software
- Develop tutorial resources to instruct on the setup and use of some of the audio software
  tested
- Implement installation of the most suitable software solution on the MAINZ MIDI room
  computers

[1] This aspect of software design is examined in Appendix VIII.



3 Objectives and Expected Outcomes
Objective: To study CoreAudio (including QuickTime) to establish background information for
this research project. This will involve sufficient depth to develop an understanding of audio
handling on OS X from a user perspective, and identify if inter-application audio routing is
included in the OS X operating system. Detailed descriptions of programming will not be covered
in this report.
Outcome: A report on CoreAudio and QuickTime audio will be written. A brief description of
Windows audio will also be added. It will be established whether or not native support for
inter-application audio routing is an inherent feature of OS X.

Objective: To survey all available third party software solutions for inter-application audio
routing. A list of these is included in the resource section of this document. The intention is to
trial the various software titles available, and where applicable conduct objective tests of
increased latency and microprocessor usage. Combinations of certain third party audio routing
and control solutions will also be included to identify possible conflicts.
Outcome: Software examined will be Direct Connect, JACK, Rewire, and Soundflower. Test results
for these will be discussed, tabulated, and graphed in the report.

Objective: To survey other available third party audio utility software for OS X. This will
include any programme or system extension that can route or capture the audio in OS X. Other
utilities (such as file converters, editing software) will not be considered in this report.
Outcome: Software examined will be Audio Hijack, Detour, Line In, PTHVolume, Sound Menu,
SoundSource, WireTap, and WireTap Anywhere. Test results for these will be provided in the
report.

Objective: To trial the available software on a MacBook. This is the most common model of
computer that is used by the Diploma in Audio Engineering & Music Production students at
MAINZ. Therefore this particular model will be used for the most extensive part of the software
evaluation.
Outcome: Students will be able to install and use at least one of the audio routing solutions
available. Instructions will be written, and screen-movie tutorials created.

Objective: To compare CoreAudio routing with CoreMIDI routing. Since OS X version 10.3 there
has been a solution for inter-application routing of MIDI data. It is therefore useful to relate
the development of these two related software technologies.
Outcome: A section on CoreMIDI will be included in the report.

Objective: To investigate the functionality of the new Aggregate Audio Devices feature that was
added to the Audio MIDI Setup in OS X version 10.4 on the Intel Mac.
Outcome: The Aggregate Audio Device will be used to create a full-duplex device.

Objective: To provide a written report of the findings of the software testing. This will detail
all experiment results and provide a comparison of the features and effectiveness of each
software programme. A glossary will be included. Background information on JACK will also be
covered.
Outcome: Results will be tabulated, and graphed where appropriate. A detailed comparison chart
of the evaluated software will provide a quick reference to the software attributes. A glossary
of terms used in this report will be added. An in depth study of JACK will detail its theory of
operation.

Objective: To accompany the report with QuickTime screen movies of any installation and/or
setup procedures, where this method of explanation is preferable to using text and/or
screenshots.
Outcome: Setup procedures and screen movie tutorials will be created for using Soundflower,
JACK, and WireTap Anywhere.



4 The Report

4.1 Background

The Certificate in Audio Engineering and Music Production at MAINZ uses Windows computers
for the MIDI / digital audio workstations. On this platform it is possible to route audio from
various sources via the Mixer window. One example of use is to record the output of MIDI sounds
(from the soundcard) back into an audio track of a sequencer programme. Students continuing on
with the Diploma in Audio Engineering and Music Production use Macintosh computers and are
often surprised to find that there is no similar audio mixer feature in OS X. A third party solution
would therefore be highly desirable, particularly as the students have their own laptops.
A preliminary search on this topic indicated that few if any comprehensive surveys of software for
inter-application audio routing have been undertaken. This report will therefore provide useful
information for anyone using music/audio production software on a Macintosh computer.

4.2 A Note on the Research Process

This topic has been covered in MTEC6709 A1 and A2, and MTEC6710 A1 so only a summary is
provided here. The original idea for this research was to find out what solution(s) exist to route
audio between applications on the OS X operating system.
Initial research on the web indicated that several third party software solutions do exist, and these
were duly tested. Altogether, a considerable number of possibly useful pieces of audio software
were discovered. Not all of these can provide audio routing but they were evaluated in any case,
and tested where appropriate.
At first it appeared that very little information was available on this subject, so I had to dig
deep to find enough to work with. As time went on many resources were discovered and
reviewed (as indicated by the bibliography), and this has enabled me to provide some background
information on CoreAudio and JACK. Some topic areas were infertile, e.g. no details of the
operating principle of Soundflower were found.
Over the six-month period of this investigation I have kept to the original direction of the
research, i.e. answering the question "What is the best way to route audio between applications
on OS X?" The research was expanded somewhat to provide sufficient background information to
enable the reader to understand the issues facing the software developers who tackle this
problem.
Originally, my thoughts were to test all the available Macintosh computer models but limit the
testing to only JACK and Soundflower. Once testing was underway it became clear that to
comprehensively test all combinations on all models (laptops, desktops, towers, Intel & PPC) as
well as Tiger and Leopard would take more time than I had available so I scaled it back to three
models, and two OS versions. Testing on the PPC Mac was also minimised, as this model is no
longer used at MAINZ. In any case, testing has been rigorous and it is doubtful if more exhaustive
testing would alter the outcome of this investigation.



4.3 Introduction

Over the last ten years personal computers have developed enormously in computing power. One
of the areas that has benefited greatly from this advancement has been audio production. This has
enabled a professional digital audio workstation (DAW) to be built using no more than a standard
computer with suitable software and a hardware interface for audio I/O (input and output). In the
1980s computer music creation meant using the MIDI data format but in the mid to late 90s the
ability to add audio recording and playback to MIDI sequencers became increasingly
commonplace. Initially audio track counts were limited to two (i.e. stereo) but we have now
reached the point where it is possible to have dozens of tracks recorded simultaneously, providing
there is sufficient I/O, and many more bussed on the computer internally.

4.4 An Overview of Audio in OS X

Implementation of good audio handling was part of the original design of OS X. The previous
operating system (OS 9) had reached its limits in many areas, including sound. Apple's system
extension for sound (Sound Manager) was limited to 2-channel, 16-bit audio. There were also
latency issues depending on what hardware was used (Wherry, 2003). To work around these
limitations music software companies developed proprietary systems that address the OS and
hardware at a lower level (Steinberg's ASIO, Emagic's EASI, MOTU's MAS, Digidesign's
DAE).
Apple's solution in OS X is CoreAudio. This audio server (the hardware abstraction layer; HAL)
sits between the hardware drivers and the application software. The benefit of this system is that
programmers can develop audio applications without having to write for specific hardware, as the
APIs (Application Programming Interfaces) address the HAL. This provides a consistent,
high specification multi-channel audio routing system within OS X. Timing and synchronisation
of audio signals are fundamental to the design, as is low latency (Wherry, 2003). Latency figures
of 15ms were considered good on OS 9. In OS X this can be reduced to around 1ms (Ray J., Ray
W.C., 2003).
Closely associated with CoreAudio is Apple's QuickTime. QuickTime is best known as a media
player application but beyond this visible part it also performs many other tasks, including media
synchronisation and file conversion. A part of QuickTime is the MIDI implementation, which
uses a DLS (Downloadable Sounds) synthesiser called the QuickTime Music Synthesizer. Audio
output is handled by CoreAudio and will automatically be sent to the default audio device
(Apple, 2005).
Also provided is a standard type of plugin architecture (Audio Units; AU) that any software
manufacturer can use. Since Logic is an Apple product its plugins are entirely in the AU format.
There are many freeware AU plugins now available. Other commercial developers have generally
hedged their bets by providing plugins in multiple formats. Digidesign continues to restrict Pro
Tools to their own formats (AS, RTAS, TDM, HTDM).
For a more detailed explanation of CoreAudio and QuickTime refer to Appendix VII.

4.5 The Problem

The features of CoreAudio provide a stable and suitable way to handle most audio situations on
OS X. Version 10.2 (Jaguar) introduced significant improvements, such as the Audio
MIDI Setup utility (Cooper, 2003), and since then several enhancements have been added, most
notably the Aggregate Audio feature in Audio MIDI Setup which allows separate pieces of
hardware to act as one virtual I/O.



With the introduction of Jaguar came unlimited multi-channel audio support and it is this feature
that is most relevant to this study. The problem is that the design of multichannel functionality has
been related to addressing the growing use of multichannel I/O. While this is natural and useful it
has not kept pace with another development in audio software. Increasingly, audio production is
being performed (sometimes completely) "in the box". Therefore, it is equally important for many
users to be able to route 2-channel (and sometimes multi-channel) audio between applications
within the same computer (Davis, 2006). Lack of support in this area is frustrating users (PeterB,
2008). Paul Davis says of his solution, JACK: "much to my surprise it provides something Apple
do not provide themselves, which is inter-application communication. I was under the impression
that CoreAudio provided that, apparently it does not" (Davis, 2003).
OS X is rooted in UNIX, an OS where APIs are based on the read/write model derived from the
"everything is a file" abstraction. The problem with this design is that it fails to force developers
to pay sufficient attention to the real-time nature of audio. In particular, it becomes difficult to
facilitate inter-application audio routing if different programmes are not running synchronously
(Davis, 2003).
To date only one roundup of possible solutions has been found. Andi summarises six possible
sound routing methods, and recommends Soundflower. His evaluation is somewhat limited.
Although JACK is included he states "i didnt have the time to figure things out with jack os x"
(sic) (Andi, 2007).

4.6 Third Party Inter-Application Audio Routing

A thorough search for utility software that can route audio between applications on OS X
revealed two main possibilities (Andi, 2007). These are JACK and Soundflower, both of which are
freeware (as are most of the utilities covered in this review).
JACK (JACK Audio Connection Kit) is an open source software solution written by Paul
Davis and his associates. It is non-commercial in origin and to some extent relies on users having
good computer literacy (Vucic, p23). JACK is available from two websites (www.jackaudio.org and
www.jackosx.com) and does include a detailed manual. There is also a support forum on Yahoo
(a Yahoo login is required). A network enabled audio communication tool (NetJack) is provided
with JACK but this has been disabled in the latest version (0.77): "NetJack has been temporarily
removed from the Jack OS X package, until it is fixed." (Jack OS X, p21). This indicates a
weakness in this type of software, in that it is still somewhat experimental compared to
commercial release programmes. On the other hand, Vucic identifies that the software is
conceptualised without having to be a commodity in a specific market and can therefore focus
solely on the issue of inter-application data exchange (Vucic, p22). The design philosophy of JACK
has been to provide a system to seamlessly move audio data between programmes and/or an audio
interface (Davis, 2006). Although it is only on version 0.77, development has been quite fast, with
13 versions released to the public since its introduction in 2004. Version 0.75 introduced
optimisation for running on dual processor machines (Jack OS X, 2008).
JACK's interconnections are realised using a virtual patchbay (the Connections Manager) to
configure the list of inputs, outputs, and the connections between them. One of JACK's key
features is that it does not add any perceptible latency to the routed audio (Vucic, p25). "Jack
has been designed from the ground up to be suitable for professional audio work." (Jack OS X,
2008). Synchronous execution of all clients (e.g. applications) is also a design priority. To
achieve these goals JACK has a central audio engine called Jack Server that can communicate with
all clients, whether they are an I/O hardware interface, AU, or CoreAudio application.
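
The idea of a central engine that runs every client synchronously and routes audio through a patchbay can be illustrated with a minimal sketch. This is a toy model only and not the JACK API: the class ToyServer, its method names, and the callback signature are all hypothetical. It assumes that each client supplies a callback which is run exactly once per buffer period, and that a client receives the mix of whatever sources are patched into it.

```python
from typing import Callable, Dict, List, Tuple

Buffer = List[float]
# Hypothetical callback signature: (input buffer, frame count) -> output buffer
ProcessCallback = Callable[[Buffer, int], Buffer]


class ToyServer:
    """Toy stand-in for a central routing engine (not the real JACK API)."""

    def __init__(self, buffer_size: int) -> None:
        self.buffer_size = buffer_size
        self.clients: Dict[str, ProcessCallback] = {}
        self.order: List[str] = []
        self.connections: List[Tuple[str, str]] = []  # (source, destination)

    def add_client(self, name: str, callback: ProcessCallback) -> None:
        self.clients[name] = callback
        self.order.append(name)

    def connect(self, source: str, destination: str) -> None:
        # Equivalent to making one patch in a virtual patchbay.
        self.connections.append((source, destination))

    def run_cycle(self) -> Dict[str, Buffer]:
        """Run every client once, in registration order, for one buffer period."""
        outputs: Dict[str, Buffer] = {}
        for name in self.order:
            # Mix the outputs of every source patched into this client
            # that has already run in the current cycle.
            mixed = [0.0] * self.buffer_size
            for source, destination in self.connections:
                if destination == name and source in outputs:
                    mixed = [a + b for a, b in zip(mixed, outputs[source])]
            outputs[name] = self.clients[name](mixed, self.buffer_size)
        return outputs
```

Because every client is called with the same frame count in the same cycle, no client can run ahead of or behind another, which is the property the read/write model does not enforce.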

Soundflower is also open source freeware and is provided by Cycling '74, primarily to accompany
their product Max. It is a kernel driver and presents itself as a CoreAudio device (Ingalls, 2007). It
is very effective for straightforward setups and can provide either 2 or 16 channels for audio
routing, but is not as flexible as JACK (Davis, 2008, p43). The documentation is extremely brief
and while it is possible to search for Soundflower on the Cycling '74 website
(www.cycling74.com) the information is biased toward Max support.
Thus there is a shortage of available support for this utility and the only way to find out its
usefulness was to trial it. This can be an issue, as fine-tuning requirements for audio handling are
common. Without installation, version and compatibility details any problems can easily go
unresolved. One forum (PeterB, 2008) has two posts stating that Soundflower is too buggy.
Both Jack and Soundflower do provide uninstallers, which is important as these utilities dig
deeper into the operating system than most applications.
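
The general concept of a virtual device that passes audio from one application to another can be sketched as a simple loopback buffer. This is not a description of how Soundflower works internally, since no details of its operating principle were found. The class LoopbackDevice and its methods are hypothetical, and the sketch assumes that one application writes samples to the device's output side while another reads them from its input side.

```python
from collections import deque
from typing import Deque, List


class LoopbackDevice:
    """Hypothetical virtual device: samples written to it can be read back out."""

    def __init__(self, capacity: int) -> None:
        # Bounded first-in, first-out store. If the reader falls behind,
        # the oldest samples are discarded to make room for new ones.
        self._fifo: Deque[float] = deque(maxlen=capacity)

    def write(self, samples: List[float]) -> None:
        """Called by the sending application (the device acts as its output)."""
        self._fifo.extend(samples)

    def read(self, count: int) -> List[float]:
        """Called by the receiving application (the device acts as its input).

        Returns silence (0.0) for any samples that are not yet available.
        """
        out: List[float] = []
        for _ in range(count):
            out.append(self._fifo.popleft() if self._fifo else 0.0)
        return out
```

In this model nothing ties the writer and the reader to a common clock, so an underrun simply produces silence. That is one reason why synchronisation between applications matters for a routing solution.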

An alternative solution is Rewire. This is a commercial solution and is usually associated with
Reason. Rewire can stream up to 256 audio channels as well as 255 MIDI busses between two
audio applications (Propellerhead, 2008) and acts as a plugin. In a typical setup Reason is
synchronised to a host (e.g. Logic, Pro Tools) and a MIDI track in the host can drive virtual
synthesisers in Reason. Reason can in turn output audio back into tracks in the host application.
Transport control can be from either the host or Reason. While many audio applications are now
Rewire-enabled, the support often simply allows a standalone version to run alongside the host
without the developer having to rewrite the programme source code as various plugins (AU, VST,
RTAS). Rewire is written using the obsolete CFM (Code Fragment Manager) format and must be
wrapped to work in a Mach-O (Mach Object file format) application, creating processing overhead
(James, 2004). Functionality under Rewire is therefore varied, and it does not offer a universal
approach to inter-application audio routing.

4.7 Miscellaneous Audio Routing Software

As well as utilities that route audio within the Macintosh computer there are several other utilities
that enhance the ability of the user to select audio inputs and outputs (functions that are already
handled with the Apple utilities Sound Preferences and Audio MIDI Setup). The third party
offerings are small applications providing enhancements to the sound functionality. Line In
(Rogue Amoeba, 2008) allows for passing the audio input directly through to the audio output.
Sound Menu (Aspirine, 2007) and SoundSource (Rogue Amoeba, 2008) both allow menu-bar
access to switch audio input and output. None of these solve the issue of inter-application audio
routing. The convenience offered by having audio input and output switching from the menu-bar
is important if used in conjunction with Jack or Soundflower as the audio source and/or
destination will probably be set frequently.
An alternative approach to solving the problem is to capture the audio by recording it as it is
streamed to the computer output. Two third party software solutions do this: WireTap "allows you
to record any audio playing through your Mac" (Ambrosia Software). Audio Hijack performs the
same task and also includes the ability to enhance the audio with effects plugins (Rogue Amoeba,
2008). A new version named WireTap Anywhere has now been released and is intended to
supersede WireTap Pro. Interestingly, these programmes are the only commercial products
available for OS X audio routing. These applications appear to provide a workable solution to the
problem but fail to maintain a timing reference between applications. They do, however, offer the
ability to capture audio from hidden sources, e.g. when a web browser handles streaming audio.
Detour is freeware from the Rogue Amoeba software company. It can send different audio to
different outputs, or lower the volume of some applications in relation to others (Rogue Amoeba,
2005). This used to be a commercial utility and is no longer supported [2] but is still available and
suitable for PPC (PowerPC) Macintosh machines running Tiger (OS X 10.4).
Evaluations of these and other audio software utilities are in Appendix IV.

[2] The final release was version 1.5.5, dated Oct. 2005 (Detour readme.pdf).



4.8 Third Party Software Routing Solutions Examined

a. Soundflower
i. Soundflower v1.2.1

Soundflower was developed by Cycling '74 and is open source software. It is a background
application in the form of a kernel extension and has no GUI. It is accompanied by the optional
Soundflowerbed, which adds a menu to the menu bar with the following dropdown menu items:

Fig. 1 Soundflowerbed menu items (Cycling '74)

Installation of Soundflower was straightforward. Soundflowerbed does not need installation as the
programme runs when the icon is double-clicked, and shows a flower icon in the menu bar.
Soundflowerbed must be rerun every time the machine is booted, unless it is installed as a startup
item [System Preferences/Accounts]. If Soundflowerbed is run without Soundflower, a menu message
appears in place of the Soundflowerbed menu items: "Soundflower is not installed!!" Uninstalling
requires using the Terminal to run a shell script. Soundflowerbed tends to hang when the
uninstaller runs. Logging out and in again fixed this. It is therefore best not to have
Soundflowerbed running during the uninstall process.
To use Soundflower it is not necessary to use Soundflowerbed, as it is just a matter of selecting
Soundflower as the output of the sending application and as the input of the receiving
application. Some audio applications do not provide access to inputs and/or outputs. In this case
Soundflower must be selected in either the Sound pane (System Preferences), or in Audio Devices
(Audio MIDI Setup). A handy menu item in Soundflowerbed is "Audio Setup", which opens the
Audio MIDI Setup. It is also possible to set the input and output connections from the
Soundflowerbed menu. This can be done for 2 channels or 16 channels (although few applications
can provide 16 channel access). An example is shown below, where the Built-in Output has been
assigned to Soundflower.



Fig. 2 Selecting outputs in Soundflowerbed (Cycling '74)

Buffer size is adjustable from 64 to 2048 samples, and Soundflower is very tolerant of this
setting on a MacBook: any value in this range could be used with no sign of glitching. Latency is
set by the Logic buffer setting, and adjusting the Soundflower buffer had no effect. Native Intel
Mac support was introduced in version 1.2.

ii. Soundflower version 1.3.1

This version has been modified to allow use of volume, mute, and gain controls in Audio MIDI
Setup (AMS), and balance control in the Sound preference pane. It is a patch of version 1.3, which
was an unofficial release bundled with some Rogue Amoeba software, and is currently maintained
by Joachim Bengtsson (http://thirdcog.eu). Installation and performance on an Intel iMac was
identical to version 1.2.1 with no problems arising during testing.

b. JACK
i. JACK version 0.77

Installation of JACK is reasonably straightforward. A full listing of exactly where each of the
forty files goes is included in the documentation. An uninstaller is provided. This runs an uninstall
command from the Terminal and logs the uninstall activity. On the MacBook the Jack Router
appears in the AMS Audio Devices menus but did not always show in the Sound preference pane.
Setup (as distinct from installation) of JACK is more involved than for Soundflower. Several
things must be set before the router can be used. Firstly, on an Intel Mac the built-in sound input
and sound output appear as separate devices. JACK treats all clients as mono, full duplex and
cannot deal with separate input and output devices, so they must be combined in an AMS
aggregate device for JACK to be able to communicate with them. The aggregate device must be
created using an administrator account. JACK is also unable to process stereo interleaved audio,
but this is not so much of an issue as stereo tracks are de-interleaved on playback from the client
application. Secondly, correct setup order is important. Details of the correct setup procedure are
shown in Appendix III of this report. If a strict setup sequence is not followed, applications will
fail to appear in the Connections Manager. In any case some applications will not show in the
Connections Manager until they are actually playing an audio file. This is identified in the manual
but is unusual software behaviour and could cause users some confusion.



The Connections Manager interface is somewhat ambiguous in operation. Double clicking is
performed to make a connection. This can be done on either the Send or Receive device (so long
as the Receive device is highlighted). Some form of graphical patchbay would greatly enhance the
user experience.

Fig. 3 The Connections Manager showing Logic output going to the Built-in Sound (Davis P.,
2008)

Fig. 5 The JackPilot Window, showing CPU load on a MacBook (Davis P., 2008)

JackPilot is the simple GUI for JACK and provides a button to start/stop the JACK server.
The Routing button opens the Connections Manager window. CPU load is shown in JackPilot. The
value shown is sometimes considerably lower than that shown in the Apple Activity
Monitor utility. Where applicable, both results have been included in Appendix I of this report.
For the tests the Connections Manager allowed two possible configurations: i) connecting from
the Logic output into the System audio and then connecting the system audio output into the
Logic input, or ii) connecting the Logic output to its input directly. The second is the preferred
option as the levels remain at unity gain.
For the latency tests JACK had to be persuaded to allow virtual loopback from Logic's audio
output to its input. This was done by setting a parallel input into Logic from QuickTime. For some
reason JACK will not reconnect Logic's output until something else is connected to its input (the
Logic output continues to default to the built-in audio output). Once a connection is made from
QuickTime to Logic's input then Logic's output can be seen on a record enabled channel in Logic.
If the QuickTime connection is deleted the Logic connection also disappears. Feedback is
prevented as JACK allocates separate busses for each connection. JACK version 0.78 was also
tested (on Leopard, 10.5.4), and does allow loopback of Logic output to input audio, and therefore
did not require this workaround.
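
One common way of reading a latency figure from a loopback recording is sketched below. It assumes that a single-sample impulse was played at sample position 0 and that the recorded track is available as a list of sample values. The function name and the threshold value are hypothetical, and this is not necessarily the procedure used for the tests reported in the appendices.

```python
from typing import List, Optional


def loopback_delay_samples(recorded: List[float],
                           threshold: float = 0.5) -> Optional[int]:
    """Return the position of the first recorded sample at or above threshold.

    Assumes the test impulse left the application at sample position 0, so the
    returned position is the round-trip delay in samples. Returns None if no
    sample reaches the threshold (i.e. the impulse was never recorded).
    """
    for position, value in enumerate(recorded):
        if abs(value) >= threshold:
            return position
    return None
```

For example, a recording in which the impulse first appears at position 128 would give a delay of 128 samples.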



Fig. 6 The JackPilot Preferences window (Davis P., 2008)

Preferences can be set for a number of JACK parameters. Buffer sizes range from 32 to 2048
samples, and the Sample Rate can be set to 44.1kHz, 48kHz, or 96kHz. During latency testing, the
latency was set by the JACK buffer setting. Adjustment of the Logic buffer size had no effect on
the latency results. This is the opposite behaviour to Soundflower, where Logic overrode the
router buffer size setting. Two special situations where the JACK buffer must be set to 1024 are
included in the documentation. These are when using the JACK AU plugin or the Apple DVD
Player (Davis, 2008).
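
The relationship between buffer size, sample rate, and the latency contributed by one buffer is simple arithmetic: latency in milliseconds is 1000 multiplied by the buffer size in samples, divided by the sample rate in Hz. The sketch below tabulates this for the range of settings mentioned above. The power-of-two steps between 32 and 2048 are an assumption, as only the end points of the range are given here, and the function names are hypothetical.

```python
from typing import Dict, Tuple


def buffer_latency_ms(buffer_samples: int, sample_rate_hz: float) -> float:
    """Time taken to fill or play one buffer, in milliseconds."""
    return 1000.0 * buffer_samples / sample_rate_hz


# Assumed power-of-two steps between the stated limits of 32 and 2048 samples.
BUFFER_SIZES = [32, 64, 128, 256, 512, 1024, 2048]
SAMPLE_RATES_HZ = [44100, 48000, 96000]


def latency_table() -> Dict[Tuple[int, int], float]:
    """Map (buffer size, sample rate) to single-buffer latency, rounded to 0.01 ms."""
    return {
        (size, rate): round(buffer_latency_ms(size, rate), 2)
        for size in BUFFER_SIZES
        for rate in SAMPLE_RATES_HZ
    }
```

At 44.1kHz this gives 0.73 ms for a 32 sample buffer and 23.22 ms for a 1024 sample buffer, which shows why the buffer setting dominates the latency result.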
The number of input and output channels is also set in the preference window, for both the audio
interface and the virtual channels that will be used for inter-application communication. JACK
also provides for saving Connections Manager setups (these are called Studio Setups) and this is a
real advantage for complex or seldom used configurations.
MIDI is supported in JACK's architecture, but not implemented. At the time of writing there is no
intention on the developers' part to go ahead with MIDI on OS X, as inter-application MIDI
communication is now available using the IAC busses.
If the JACK server is quit without first shutting down the clients, Logic remains stable: it
senses the loss of the JACK device and switches to the default device (built-in audio). QuickTime
handles this situation less favourably and needs to be quit. If the JACK server is restarted at this
point it does not respond and so no connections are made. It is therefore a good idea to heed the
warning message and not stop the JACK server while clients are running. One strange attribute of
JACK is that JackPilot can be quit without shutting down the JACK server. It is uncertain
whether this was a design feature or simply occurs as a result of the modularity of the three part
JACK system.

Fig. 7 JACK warning message (Davis P., 2008)

Details of JACK's principles of operation are in Appendix V. Basic testing of QJackCtl, an
alternative GUI for JACK, was also undertaken. For more information refer to Appendix V.



c. Rewire
i. Rewire v1.7
Rewire is another way of transferring audio between two applications. It is a product of
Propellerhead Software, and is primarily used with their virtual rack / sequencer Reason, although
a number of other software products are compatible with Rewire. Most commercial sequencers
can act as host (Logic, Cakewalk, Digital Performer, Pro Tools, Cubase). Rewire appears as a
plugin and allows audio and MIDI data to be transferred between the host and any Rewire-enabled
application.
According to the limited information available on the website, Rewire offers high-precision
synchronisation and complete glitch-free sync (www.propellerhead.se, 2008). Rewire can support
a total of 256 audio channels. These may be shared across several clients (Reason has a maximum
of 64 channels). It can also transmit MIDI signals between applications with a maximum of 4080
MIDI channels (255 x 16). An additional feature is that, where applicable, the two applications
can have their transports synchronised. It is unclear if the synchronisation claims relate to this,
or to sample-accurate synchronisation of the streamed audio. Rewire is a reliable piece of
software but did show some instability with MIDI routing (see Appendix X). The audio busses can
be disabled and enabled during audio streaming with no problems. Latency has been specified as
64 samples (1.45 ms) (Walker, 1999). CPU usage for Rewire itself is probably insignificant but the
combined usage of Pro Tools and Reason being connected via Rewire is about 45% on a MacBook.
Rewire is easy to set up, as there are no settings on the plugin. The sample rate is set at 44.1kHz
and the buffer size setting is automatic. It does offer reliable interchange of audio and MIDI
between specific host and client applications, with sample-accurate synchronisation. The transport
synchronisation is also reliable and the two programmes run in parallel so the transport functions
of either will work equally well.

Fig. 8 The Rewire setup. Audio is being routed from Reason to Pro Tools and MIDI is
being routed from Pro Tools to Reason



d. DirectConnect

In 2000 Digidesign introduced a similar system of audio interconnection software called
DirectConnect. It was a TDM or RTAS plugin allowing streaming of up to 32 channels of 24-bit
audio from an audio application into Pro Tools. It is no longer available.

4.9 Conclusion

This research has set out to provide a comprehensive survey of audio routing on the OS X
operating system, evaluation and testing of the audio utility software, and creation of tutorials on
its use.
All of the software tested performed remarkably well³, with very few actual crashes.
Furthermore most of the software tested was tolerant of other audio utilities running concurrently.
Much of this robustness can be attributed to the exceptional stability of OS X. CoreAudio offers
an effective, low latency, sample accurate audio architecture for OS X. By design, it does not,
however, feature inter-application audio routing.
Several possible audio routing solutions do exist but some are specialised (e.g. the
DLSMusicDevice), or are limited to certain applications (e.g. Rewire). Others must work as
plugins (e.g. AUNetSend / Rec). Only JACK and Soundflower offer a global solution.
JACK is a professional, non-host-based solution, which adds zero latency to synchronised audio.
Although it is an elegant piece of software, JACK's architecture is somewhat complex and this
requires more care with installation and use. JACK can use an appreciable amount of the CPU (e.g.
30% on a PPC Mac, 20% on a MacBook), and this increased with higher sample rates (e.g. 57% @
96kHz). Applications that don't allow the user to separately choose input and output drivers can
be handled better by JACK than by Soundflower. Also, JACK's AU plugin provides for more
complex patching configurations.
Soundflower is a simpler solution, yet surprisingly robust. In contrast to some of the software
tested, it appeared in every audio device list. Soundflower's best latency was 2.9mS and, while not
matching JACK (0.73mS), it is a very small amount. Due to its ease of installation, setup, and use I
would recommend it for educational use. It will be installed on all the MAINZ MIDI computers,
and suggested as the preferred solution for students to use on their laptops.
A number of other audio routing and control utilities were evaluated, and in some cases audio
capture software (eg Audio Hijack or WireTap Anywhere) will be the best solution when stereo
audio needs to be detoured into a file. Additional utilities such as Sound Menu, SoundSource and
PTHVolume provide a handy way to select devices and control volume.
It should be noted that for many users transferring audio data by way of file exporting/importing
will be the preferred method to get audio between applications, as many tracks of audio can be
exchanged faster than in real time. If time-stamped formats (such as Broadcast Wave files or AAF)
are used then synchronisation of the audio is maintained.
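
As an illustration of how a time stamp preserves synchronisation, the sketch below converts a Broadcast Wave style time reference (a sample count since midnight) into a clock position. The function name and the example values are hypothetical and are not taken from the software tested in this report; the only assumption about the format is that the time reference is stored as a count of samples.

```python
def time_reference_to_clock(time_reference, sample_rate=44100):
    """Convert a sample count since midnight into (hours, minutes, seconds, samples).

    time_reference: number of samples since midnight (assumed time-stamp form)
    sample_rate:    sample rate of the file in Hz
    """
    total_seconds, leftover_samples = divmod(time_reference, sample_rate)
    hours, rest = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return hours, minutes, seconds, leftover_samples


# Hypothetical example: a file stamped 1 hour, 5 seconds and 100 samples after midnight.
stamp = 44100 * 3605 + 100
print(time_reference_to_clock(stamp))
```

Two files carrying the same time reference can be placed at the same position by the receiving application, which is why no manual alignment is needed after import.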
Setup procedures for three common scenarios have been documented in Appendix III and
accompanying video tutorials produced (on CD2).

³ Mac users who have migrated from OS 9 to OS X will best appreciate this statement.



5 Bibliography

ALSA Project, 2008, Advanced Linux Sound Architecture (ALSA) project homepage, ALSA
Project, accessed 17.9.08
http://www.alsa-project.org/main/index.php/Main_Page

Andi, 2007, Sound Routing, Sojamo, accessed 12.6.08
http://www.sojamo.de/blog/2007/10/15/sound-routing/

Apple Inc., 2008, Audio, Apple Developer Connection, accessed 24.5.08
http://developer.apple.com/audio/

Apple Inc., 2008, Audio Unit Programming Guide, Apple Developer Connection, accessed
24.5.08
http://developer.apple.com/documentation/MusicAudio/Conceptual/AudioUnitProgrammingGuide/AudioUnitProgrammingGuide.pdf

Apple Inc., 2006, Architecture of Mac OS X Audio, Apple Developer Connection, accessed
28.5.08
http://developer.apple.com/documentation/DeviceDrivers/Conceptual/WritingAudioDrivers/AudioOnMacOSX/chapter_2_section_3.html

Apple Inc., 2004, CoreAudio-Introduction, Apple Developer Connection, accessed 28.5.08
http://developer.apple.com/documentation/MusicAudio/Reference/CoreAudio

Apple Inc., 2007, CoreAudio Technologies, Apple Developer Connection, accessed 28.5.08
http://developer.apple.com/audio/coreaudio.html

Apple Inc., 2007, Introduction to Mac OS X Technology Overview, Apple Developer Connection,
accessed 6.9.08
http://developer.apple.com/documentation/MacOSX/Conceptual/OSX_Technology_Overview/About/chapter_1_section_1.html#//apple_ref/doc/uid/TP40001067-CH204-TPXREF101

Apple Inc., 2006, Introduction to QuickTime Musical Architecture, Apple Developer Connection,
accessed 24.5.08
http://developer.apple.com/documentation/QuickTime/RM/MusicAndAudio/qtma/A-Intro/chapter_1000_section_1.html

Apple Inc., 2008, Mac OS X Snow Leopard, Apple, accessed 20.6.08
http://www.apple.com/macosx/snowleopard/

Apple Inc., 2006, Technical Note TN2091-Device input using the HAL Output Audio Unit, Apple
Developer Connection, accessed 16.6.08
http://developer.apple.com/technotes/tn2002/tn2091.html

Apple Inc., 2005, QuickTime Overview-Architecture, Apple Developer Connection, accessed
24.5.08
http://developer.apple.com/documentation/QuickTime/RM/Fundamentals/QTOverview/QTOverview_Document/chapter_1000_section_2.html



AppleInsider, 2004, CoreAudio in Mac OS X Tiger to improve audio handling, Apple Insider,
accessed 29.5.08
http://forums.appleinsider.com/showthread.php?s=&threadid=45567

Ardour Foundation, 2007, Getting Audio In, Out and Around Your Computer, Ardour Manual,
accessed 6.6.08
http://ardour.org/files/manual/sn-configuring-jack.html

Ash M., 2006, Why CoreAudio is Hard, mikeash.com, accessed 6.6.08
http://www.mikeash.com/?page=pyblog/why-coreaudio-is-hard.html

Audio Engineering Society, 2006, AES information document for digital audio Personal
computer audio quality measurements (AES-6id-2006), AES, accessed 15.9.08
www.aes.org/publications/standards/preview.cfm?ID=6

Bengtsson J., 2008, No Sound Plays on PPC Macs, Get Satisfaction, accessed 17.10.08
http://getsatisfaction.com/cycling74/topics/1_3_no_sound_plays_on_ppc_macs_works_fine_on_intel

Cakewalk, 2008, Latency: What's Required vs What's Possible, Cakewalk DevXchange, accessed
18.9.08
http://www.cakewalk.com/DevXchange/audio_i.asp

Chartier D., 2008, SoundSource 2: Real audio controls in your menu bar, Ars Technica, accessed
7.9.08
http://arstechnica.com/journals/apple.ars/2008/03/04/soundsource-2-real-audio-controls-in-your-menu-bar

Cohen P., 2008, WireTap Anywhere lets you redirect Mac audio, Mac Publishing, accessed
19.9.08
http://www.macworld.com/article/135011/2008/08/wta.html

Cooper S., 2003, Audio and MIDI under Mac OS X Revisited, NZMac.com, accessed 20.5.08
http://www.nzmac.com/features/multimedia/audio-and-midi-under-mac-os-x-revisited.html

Corbett R., van den Doel K., Pai D., 2002, Evaluation of Low Latency Audio Synthesis using a
Native Java ASIO Interface, Dept. of Computer Science, University of British Columbia,
accessed 9.11.08
http://www.cs.ubc.ca/~rcorbett/lat02.pdf

Cosgrovee K., 2005, [linux-audio-user] audacity jack, linux-audio-user, accessed 5.9.08
http://music.columbia.edu/pipermail/linux-audio-user/2005-July/024519.html

Davis P., 2003, The JACK Audio Connection Kit, Linux Audio Systems, accessed 10.9.08
http://jackaudio.org/documentation

Davis P., 2006, Requirements for OS X, Ardour, accessed 13.9.08
http://ardour.org/osx_system_requirements

Dominic, 2008, WireTap Anywhere progress log, Ambrosia Software Inc., accessed 24.6.08
http://www.ambrosiasw.com/forums/index.php?showtopic=119722

Donner M., 2006, Making Connections with Rewire, Penton Media Inc., accessed 15.10.08
http://emusician.com/mag/emusic_making_connections_rewire/



Ekeroot J., 2007, Audio Software Development: an Audio Quality Perspective, Luleå University
of Technology, accessed 14.9.08
epubl.ltu.se/1402-1552/2008/059/LTU-DUPP-08059-SE.pdf

Epson, 2001, Inside Mac OS X: System Overview, ZDnet.co.uk, accessed 13.6.08
http://whitepapers.zdnet.co.uk/0,1000000651,260014354p-39000512q,00.htm

Fielding S., 2008, QjackCtl and the Patchbay, rncbc.org aka Rui Nuno Capela, accessed 9.10.08
http://www.rncbc.org/drupal/node/76

Fiera D., 2008, WireTap Anywhere progress log, Ambrosia Software Web Board, accessed
11.9.08
http://www.ambrosiasw.com/forums/index.php?showtopic=119722

Figlar N., 2007, Jack Quickstart Guide, 64 Studio, accessed 9.10.08
http://64studio.com/quickstart_jack

Fischmann S., 2006, Free OS X Audio Utilities, Jelsoft Enterprises Ltd, accessed 22.9.08
http://www.soundsonline-forums.com/archive/index.php/t-5012.html

Gore J., nd, How to Maximize Your Inputs with Aggregate Audio, Apple Pro Techniques,
accessed 2.6.08
http://www.apple.com/pro/techniques/aggregateaudio/

Haddad P., 2008, PTHVolume 2, PTH Consulting, accessed 8.9.08
http://pth.com/products/pthvolume/

Isaacson C., 2007, Software Pipelines, Quovadx Inc., accessed 24.9.08
http://www.roguewave.com

James D., 2004, Linux Jack Sound Server Ported to OS X, Sound on Sound, accessed 12.9.08
http://www.soundonsound.com/sos/Oct04/articles/applenotes.htm

James J. D., 2007, Internal Mac Sound Recorder?, Ask MetaFilter, accessed 22.5.08
http://ask.metafilter.com/66436/Internal-Mac-Sound-Recorder

Kerner S., 2004, Open Source Awards 2004: Paul Davis for JACK, CNet Networks, accessed
9.10.08
http://articles.techrepublic.com.com/5100-10878_11-5136755.html

Kirn P., 2008, Leopard Audio Woes and Digidesign; 10.5.2 is a lemon for music?, Create Digital
Music, accessed 17.10.08
http://createdigitalmusic.com/2008/05/21/leopard-audio-woes-and-digidesign-1052-is-a-lemon-for-music/

Kirn P., 2007, Leopard: Incompatibilities with JACK, Soundflower, Create Digital Music,
accessed 17.10.08
http://createdigitalmusic.com/2007/12/03/leopard-incompatibilities-with-jack-soundflower-finder-audio-previews/

Kirn P., 2007, Vista for Music + Pro Audio, Create Digital Music, accessed 18.9.08
http://createdigitalmusic.com/2007/01/19/vista-for-music-pro-audio-exclusive-under-the-hood-with-cakewalks-cto/



Kuper R., 2004, Multiprocessing in SONAR 3.1, Cakewalk DevXchange, accessed 18.9.08
http://www.cakewalk.com/DevXchange/multiprocessing.asp

Lei Lei, 2007, iVol by Livecn, www.huasing.org, accessed 10.9.08
http://livecn.huasing.org/ivol/

Letz S., Orlarey Y., Fober D., 2005, Jack audio server for multiple machines, Grame Research,
accessed 10.9.08
http://www.grame.fr/Recherche/Publications/list/index.php?p=list.php?lang=uk&type=ARCH

Letz S., Orlarey Y., Fober D., 2005, jackdmp: Jack server for multi-processor machines, Grame
Research, accessed 10.9.08
http://www.grame.fr/Recherche/Publications/list/index.php?p=list.php?lang=uk&type=ARCH

Letz S., Orlarey Y., Fober D., Davis P., 2004, Jack Audio Server: MacOSX port and multi-
processor version, Grame Research, accessed 10.9.08
http://www.grame.fr/Recherche/Publications/list/index.php?p=list.php?lang=uk&type=ARCH

MacMillan K., Droettboom M., Fujinaga I., 2001, Audio Latency Measurements of Desktop
Operating Systems, Peabody Institute of the Johns Hopkins University, accessed 4.11.08
http://www.music.mcgill.ca/~ich/research/icmc01/latency-icmc2001.pdf

MacMusic, nd, Music & Audio on Macintosh, MacMusic, accessed 18.6.08
http://www.macmusic.org/home/?lang=en

McIntosh J., Toporek C., Stone C., 2003, Mac OS X in a Nutshell, O'Reilly and Associates,
Sebastopol, CA, USA.

Mertin O., 2004, A Big Mac and a Side of Plug-Ins, Penton Media, accessed 16.10.08
http://emusician.com/mag/emusic_big_mac_side/

moki, 2008, Ambrosia seeks WireTap Anywhere (MacOS X) beta test, BigBlueLounge.com,
accessed 24.6.08
http://www.bigbluelounge.com/forums/viewtopic.php?t=43395&sid=2c83c56afc0741a8ee25ec1771dc4cfc

Moore J., 2006, Re: WireTap, CoreAudio's API, and system capture, and kexts, Apple Mailing
Lists, accessed 27.10.08
http://lists.apple.com/archives/coreaudio-api/2006/Jan/msg00101.html

Moore M., 2006, Apple Mailing Lists, accessed 27.10.08
http://lists.apple.com/archives/coreaudio-api/2006/Jan/msg00101.html

Native Instruments, 2008, Mac OS X Tuning Tutorial, Native Instruments Support, accessed
20.9.08
http://www.native-instruments.com/index.php?id=niosxtut5&L=1

Native Instruments, 2007, Native Instruments Setup Guide, Native Instruments, Berlin, Germany

Orenstein D., 2000, Quickstudy: Application Programming Interface (API), Computerworld Inc.
http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=43487

Parcher M., 2005, An alternative method for recording computer audio, Mac OS X Hints, accessed
8.6.08
http://www.macosxhints.com/article.php?story=20051219161639252



PeterB, 2008, Any alternative to Soundflower to route audio output into input?, ArsTechnica
Openforum, accessed 12.6.08
http://episteme.arstechnica.com/eve/forums/a/tpc/f/8300945231/m/746003552931

Phillips D., 2007, Jack Sync: A Primer for Linux Users, Linux Journal, accessed 9.10.08
http://www.linuxjournal.com/node/1004080

Pitcarn J., 2007, Record offset in Logic, Opus Locus, accessed 4.11.08
http://www.opuslocus.com/logic/record_offset.php

Poole L., Cohen D., 2002, Macworld Mac OS X Bible, Hungry Minds, NY, USA

Propellerhead, 2008, Rewire, Reason, accessed 27.6.08
http://www.propellerheads.se/products/reason/index.cfm?fuseaction=get_article&article=rewire

Quentin, 2005, New Audio Units in OS X 10.4, Rogue Amoeba Software, accessed 23.9.08
http://www.rogueamoeba.com/utm/posts/Article/autiger-2005-05-19-21-00

Ray J., Ray W.C., 2003, Mac OS X Unleashed, 2nd ed., Sams, Indianapolis, IN, USA

Rudolph B., 2003, Steinberg VST System Link, Mix Online, accessed 21.9.08
www.barryrudolph.com/mix/pdfs/steinbergvst.pdf

Schotman H., 2005, Basic Studio Setup using Audio Hijack Pro (v2.0), Hugo Schotman, Zurich,
accessed 15.10.08
http://log.hugoschotman.com/hugo/2005/02/basic_studio_se.html

Schotman H., 2005, How to Add or Change a Soundflower Device, Hugo Schotman, Zurich,
accessed 17.10.08
http://log.hugoschotman.com/hugo/2005/04/how_to_add_or_c.html

Schotman H., 2005, Soundflowerbed & Skype don't seem to like each other, Hugo Schotman,
Zurich, accessed 16.10.08
http://log.hugoschotman.com/hugo/2005/03/soundflowerbed_.html

Sellers D., 2004, Mac OS and the music plug-in situation, Macsimum News, accessed 2.6.08
http://www.macsimumnews.com/index.php/archive/mac_os_and_the_music_plug_in_situation

Shaffer J., Rosenzweig G., 2004, MacAddict guide to Making Music with GARAGEBAND, Que,
CA, USA

Shirkey P., 2006, JACK user documentation, Linux Audio Users Guide, accessed 16.9.08
http://lau.linuxaudio.org/jack/

SourceForge, 2008, Midishare, SourceForge.net, accessed 25.5.08
http://sourceforge.net/projects/midishare/

SourceForge, 2008, Jack OS X, SourceForge.net, accessed 25.5.08
http://sourceforge.net/projects/jackosx

Tim, 2007, Getting to Know JACK (QjackCtl), 64 Studio, accessed 9.10.08
http://www.64studio.com/manual/audio/jack



Ubuntu documentation, 2007, HowToQjackCtlConnections, Canonical Ltd., accessed 9.10.08
https://help.ubuntu.com/community/HowToQjackCtlConnections

Vucic V., nd, Free Software Audio Applications for Audio Playback, Recording, Editing,
Production and Radio Broadcast Management and Automation, Linux Center, Serbia and
Montenegro, accessed 17.5.08
http://www.gnulinuxcentar.org/Audio_Tools_Scan.pdf

Walker M., 1999, Rewired for Sound, Sound on Sound Ltd, accessed 15.10.08
http://www.soundonsound.com/sos/nov99/articles/rewire.htm

Wherry M., 2003, Mac OS X For Musicians, Sound on Sound, accessed 27.5.08
http://www.soundonsound.com/sos/Apr03/articles/osx.asp?print=yes

Wherry M., 2005, Mac OS X Tiger: A Musician's Guide, Sound on Sound, accessed 27.5.08
http://www.soundonsound.com/sos/jul05/articles/tiger.htm

Wiffen P., 2004, Investigating CoreAudio Performance Under Mac OS X, Sound on Sound,
accessed 27.5.08
http://www.soundonsound.com/sos/nov04/articles/coreaudio.htm

Wikipedia, 2008, Pipeline (software), Wikimedia Foundation, accessed 12.9.08
http://en.wikipedia.org/wiki/Pipes_and_filters

Winkler P., Shirkey P., Rzewnicki E., Knecht M., Low-Latency HOWTO, linuxaudio.org,
accessed 9.10.08
http://lowlatency.linuxaudio.org/



6 Acknowledgements
I would like to thank Heather for allowing me to skip doing dishes whenever an assignment was
due in.



7 Appendices

I. Test results
   a. Latency test methods
   b. Latency test results
   c. CPU load test results
II. Comparison chart
III. Setup procedures
IV. Other audio routing and control utilities
V. JACK details
   a. Theory of operation
   b. Alternative GUI
VI. Latency
VII. How audio is handled in OS X
   a. CoreAudio
   b. QuickTime
   c. Audio on the Windows PC
VIII. Software reliability
IX. Combinations
X. A Rewire MIDI problem
XI. Software resources
XII. Glossary



I. Results of the Tests

The latency tests use a simple technique that can be implemented on a sample-accurate
sequencer, rather than requiring software developers' tools (e.g. Million Monkeys from the
Apple SDK). It is the additional latency (if any) of the routing software that is
the focus of these tests. The results were obtained using Logic Pro v8. CPU load tests were not
originally planned, but it became apparent that some of the audio routing software places a
considerable workload on the CPU. As the computer will be running at least two other audio
applications at the time the routing software is active, this could not be ignored.
Initial testing was done on a MacBook (2 GHz Intel Core 2 Duo, OS 10.4.11, 3GB RAM). Other
computer models and/or operating systems used are indicated in the following tables.
Housekeeping routines were run and permissions verified before the tests were run.
Test files are on CD1. The results of testing various combinations of audio software are in
Appendix IX.

Latency⁴ Test Methods

The test setup was to record the relative latencies of each of the audio routing programmes against
the others. The absolute system latency is not considered. As a reference, latency was also
recorded when using a loopback cable. In either case the round trip latency⁵ is measured (i.e.
input + output), where:

Latency (in mS) = 1000 × (sample count / sample rate)
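
The conversion can be checked with a few lines of Python. This is only a sketch of the arithmetic above; the function name is illustrative, and the default sample rate of 44.1kHz is the rate used for all recordings in these tests.

```python
def samples_to_ms(sample_count, sample_rate=44100):
    """Convert a delay measured in samples to milliseconds."""
    return 1000 * sample_count / sample_rate


# Sample counts that appear in the latency tables of this appendix.
for count in (32, 64, 128, 2048):
    print(f"{count:>5} samples = {samples_to_ms(count):.2f} mS")
```

At 44.1kHz this gives 0.73, 1.45, 2.90 and 46.44 mS respectively, matching the delay values listed against those sample counts in the tables.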

The testing procedure was as follows:


A transient signal was used to generate a pulse. In the case of the first method the source was a
Test Oscillator plugin using a 1kHz needle pulse waveform. This was inserted on a mono channel
with the output panned hard left so as to only appear on the L side of the output channel. A second
mono channel with input set to input 1 (L) was panned hard right and sent to the output channel.
A series of stereo channels were created to record both these signals. The tone was manually
pulsed on for three bursts using the channel mute control while the signal was re-recorded onto a
stereo channel.
The recording was analysed using the sample editor, which gives a sample count reading between
the L and R signals. Results were averaged on readings from the leading and trailing edge of the
three tone-bursts.
The second method used a wavefile as the source. A percussive electronic signal was used, and
the setup was otherwise similar. The wavefile was trimmed in the sample editor so the leading
edge of the envelope was exactly at 0 time. The wavefile was then put in an audio track and
copied on the beat for two bars. Once each recording was done the new track was merged with the
original track and analysed in the sample editor, using the same technique as above. All
recordings were taken at a sample rate of 44.1kHz.
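
The readings in these tests were taken by hand in the Logic sample editor. Purely as an illustration of the same idea, the sketch below finds the leading edge of a pulse in each channel and reports the offset between them. The threshold value, the function names and the synthetic test signal are all assumptions made for this example and were not part of the test procedure.

```python
def first_onset(channel, threshold=0.1):
    """Return the index of the first sample whose magnitude reaches the threshold."""
    for index, value in enumerate(channel):
        if abs(value) >= threshold:
            return index
    return None


def offset_in_samples(left, right, threshold=0.1):
    """Delay of the right (re-recorded) channel relative to the left (source) channel."""
    left_onset = first_onset(left, threshold)
    right_onset = first_onset(right, threshold)
    if left_onset is None or right_onset is None:
        raise ValueError("no pulse found in one of the channels")
    return right_onset - left_onset


# Synthetic stereo recording: the source pulse is at sample 100 on the left,
# and the routed copy arrives 64 samples later on the right.
left = [0.0] * 1000
right = [0.0] * 1000
left[100] = 1.0
right[164] = 1.0

delay = offset_in_samples(left, right)
print(delay, "samples =", round(1000 * delay / 44100, 2), "mS")
```

Averaging the offsets from several pulses, as was done with the three tone-bursts, reduces the effect of any single misread edge.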

⁴ For a description of latency, see Appendix VI.
⁵ Here, latency is defined as the minimum time needed for the computer to store a sample from an audio interface into
application memory and copy the same sample from application memory to the audio interface output.



[Diagram: ADC -> RAM -> HD -> RAM -> DAC, with CPU stages on the input and output sides, labelled Input latency and Output latency]

Fig. 9a Overall system latency is the combination of input and output delays.

Latency Measurements

[Diagram: two signal paths, each divided into a playback delay and a record delay]
i) Software Routing: RAM -> CPU -> RAM -> CPU -> HD (labels: Test tone plugin; DUT, e.g. Soundflower)
ii) Cable: HD -> CPU -> DAC -> ADC -> CPU -> HD (labels: transient sample; Cable)

Fig. 9b The methods used for testing latency: i) software, ii) via loopback cable



Fig. 10 The latency test setup. A narrow width pulse waveform is used to get a fast risetime.

Latency Test Results

The best-case latency test results are shown below. These values are taken from the following
tables. A result for WireTap Anywhere was not possible, as it does not allow loopback of the
audio for a single application.

Fig. 11 Lowest latency comparison

Notes: 1. tests using test tone are in the folder: Testing/Latency Toneburst Tests.
2. tests using the wavefile are in the folder: Testing/ Latency Reference Tests.



A. Latency for MacBook (2 GHz Intel core 2 duo, OS 10.4.11, 3GB RAM)

Table 1. Reference Tests using a Test Tone plugin as the source signal

Device tested    Logic Audio buffer size   Samples   Delay (mS)   Soundflower buffer size
Loopback cable   1024                      3720      84.35        N/A
                 32                        744       18.14
                 32*                       720       16.33
* with System Preference (Energy Saver) set to Better Performance

Table 2. Reference Tests using a time-aligned+ percussive wave file as the source signal

Device tested    Logic Audio buffer size   Samples   Delay (mS)   Soundflower buffer size
Loopback cable   32***                     50        1.13         N/A
                 32*                       112       2.54
                 32**                      50        1.13
* with Safety Buffer enabled
** with Delay Compensation on
*** with Delay Compensation off
+ sample truncated so the signal attack begins at exactly 0 mS in Logic Audio.

Table 3. Tests using a Test Tone plugin as the source signal

Device tested        Logic Audio buffer size   Samples   Delay (mS)   Router buffer size
Soundflower v1.2.1   64                        128       2.9          512
                     1024                      2048      46.44        512
                     32                        64        1.45         512
                     32                        64        1.45         2048
                     32                        64        1.45         64
                     32*                       64        1.45         64
Jack v0.77           512                       512       11.61        512
                     32                        512       11.61        512
                     32                        32        0.73         32
                     512                       32        0.73         32
* with System Preference (Energy Saver) set to Better Performance



Table 4. Tests using a time-aligned+ percussive wave file as the source signal

Device tested        Logic Audio buffer size   Samples   Delay (mS)   Soundflower buffer size
Soundflower v1.2.1   32                        50        1.13         64
                     32                        50        1.13         2048

B. Latency for iBook (1.42 GHz PowerPC G4, OS 10.4.11, 1GB RAM)

Table 5. Tests using a Test Tone plugin as the source signal

Device tested        Logic Audio buffer size   Samples   Delay (mS)   Router buffer size
Jack v0.77           1024                      512       11.61        512
                     256                       512       11.61        512
                     1024                      32        0.73         32*
                     32                        32        0.73         32*
                     32                        64        1.45         64
Soundflower v1.2.1   32**                      64        1.45         64
                     1024                      2048      46.44        64
                     1024                      2048      46.44        2048
                     32**                      64        1.45         2048
                     64                        128       2.90         2048
* recording was completed with no noticeable degradation, but brought up a sync error
(error trying to sync audio between send and receive devices).
** dropouts of recorded signal occurred (with or without I/O Safety Buffer enabled).
All tests done with System Preference (Energy Saver) set to Better Performance

C. Latency for iMac (2.0 GHz Intel Core 2 Duo, OS 10.5.4, 3GB RAM)

Table 6. Tests using a Test Tone plugin as the source signal

Device tested        Logic Audio buffer size   Samples   Delay (mS)   Router buffer size
Jack v0.77           512                       512       11.6         512
                     512                       32        0.73         32
Soundflower v1.2.1   32                        128       2.9          64
                     32                        128       2.9          512
Soundflower v1.3.1   512                       1024      23.2         512
                     32                        128       2.9          64
                     32                        128       2.9          2048
                     1024                      2048      46.4         64
                     32                        128       2.9          512
                     64                        128       2.9          512
                     128                       256       5.8          512



CPU Load Test Results

Like many other audio tasks, audio routing can take a considerable amount of the available
processing power. These figures are indicative only as they often fluctuate. Where considerable
variances occurred between the peak and steady values, these have been listed separately. Figures
were obtained by observing the % CPU usage in the Activity Monitor utility over a period of one
minute.

Fig. 12 The Activity Monitor CPU Usage window. Each graph is for an individual CPU core.
Green indicates User %. Red is System %. The window is swept about every 1.5 minutes. This
graph is showing the CPU usage while streaming audio from Reason into Logic on a MacBook

CPU usage is also shown in JackPilot, so I originally used those figures when testing JACK.
They vary considerably from those in Activity Monitor, generally reading lower values. These are
included for interest only and the unbracketed numbers should be used for comparison purposes.
The CPU meter in JackPilot represents the sum of the real-time audio thread CPU use of all Jack
clients. Thus it gives a partial picture of total CPU use, since it does not take into account any GUI
impact on CPU use of either JackPilot or of the Jack clients (Davis P, 2008).

A. CPU Load Table

Device tested      Logic Audio buffer size   Router buffer size   Computer   CPU usage (%)
Jack               N/A                       64                   PPC*       20 (20)
Jack               N/A                       32                   PPC*       30 (29-50)
Jack               N/A                       512                  PPC*       8 (5)
Soundflower#       64                        N/A                  PPC*       0+
Jack               N/A                       32                   Intel**    20 (14)
Jack               N/A                       64                   Intel**    12.5 (8.5) +++
Soundflower#       64                        N/A                  Intel**    0++
WireTap Anywhere   N/A                       N/A                  Intel***   6I, 6U, 22P
Jack               N/A                       32                   Intel***   13 (1.5) I, 14 (6) U
Jack               N/A                       512                  Intel***   6 (0.2) I
Rewire             N/A                       N/A                  Intel***   1.2 I, 5 U^
Audio Hijack Pro   N/A                       32                   Intel**    2I, 10U
Table 7. CPU usage for various software, hardware, and buffer sizes



*iBook (1.42 GHz PowerPC G4, OS 10.4.11, 1GB RAM)
**MacBook (2 GHz Intel core 2 duo, OS 10.4.11, 3GB RAM)
***iMac (2 GHz Intel core 2 duo, OS 10.5.4, 3GB RAM)

Jack readings are the combined CPU load of JackPilot and Jackdmp (the Jack Server).
bracketed results were as shown in JackPilot
+ Logic Pro CPU usage was 17.5% when idle and 55% when recording.
++ Logic Pro CPU usage was 6.5% when idle and 24% when recording.
+++ Logic Pro CPU usage was 8% when idle and 25% when recording.
Audio Hijack idle usage was 0.2% with hijacking inactive.
# Soundflower is a Kernel extension but showed no noticeable increase in kernel_task activity
^ Pro Tools CPU usage was 16% when idle and 40% when streaming audio
I= idle, U= in use (i.e. while streaming 2 channel audio), P= peak

Fig. 13. Worst Case CPU Load comparison. These results are from Activity Monitor.

B. CPU Load vs. buffer size for JACK on a MacBook (2 GHz Intel core 2 duo, OS 10.4.11,
3GB RAM)

Buffer size   Activity Monitor (%)   JackPilot (%)
32            18.8                   1.2
64            12.5                   0.9
128           9.5                    0.7
256           7.3                    0.28
512           6                      0.2
1024          5.5                    0.13
2048          5.1                    0.09

Table 8. CPU load vs. buffer size for JACK



The JACK FAQ states, "the only impact of using JACK is a slight increase in the amount of work
done by the CPU" (Davis, 2006). While it is true that the JackPilot test results are in the order of
1% and can be disregarded, Activity Monitor results are much higher. This would suggest that the
overall CPU use is significant.
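
One way to read the Table 8 figures is to set them against the number of audio blocks that must be processed each second, which is the sample rate divided by the buffer size. The sketch below tabulates this using the Activity Monitor readings from Table 8. The suggestion that per-block overhead contributes to the higher load at small buffer sizes is an assumption offered for illustration and was not measured in these tests; the names used are hypothetical.

```python
SAMPLE_RATE = 44100  # Hz, the sample rate used for Table 8

# Activity Monitor readings (%) from Table 8, keyed by JACK buffer size in samples.
TABLE_8 = {32: 18.8, 64: 12.5, 128: 9.5, 256: 7.3, 512: 6.0, 1024: 5.5, 2048: 5.1}


def callbacks_per_second(buffer_size, sample_rate=SAMPLE_RATE):
    """Number of audio blocks that must be processed each second."""
    return sample_rate / buffer_size


def report(table=TABLE_8):
    """Print block rate next to the measured CPU load for each buffer size."""
    for buffer_size, cpu in table.items():
        rate = callbacks_per_second(buffer_size)
        print(f"{buffer_size:>5} samples  {rate:8.1f} blocks/s  {cpu:5.1f} % CPU")


report()
```

At a buffer size of 32 samples the server handles about 1378 blocks per second, against roughly 22 per second at 2048 samples, while the measured load falls from 18.8% to 5.1%.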

Fig. 13 CPU usage vs. buffer size for JACK. The sample rate is 44.1kHz

C. CPU Load vs. Sample Rate for JACK on a MacBook (2 GHz Intel core 2 duo, OS
10.4.11, 3GB RAM). The buffer size is 32 samples.

Sample rate (kHz)   Activity Monitor (%)   JackPilot (%)
44.1                17.3                   1.4
48                  19                     1.5
96                  57                     5

Table 9. jackdmp CPU Load vs. Sample Rate for JACK



Fig. 14 CPU usage vs. sample rate for jackdmp

Fig. 15 System CPU usage (red) is high when running JACK at 96 kHz on a MacBook.
Total average usage is over 40%, most of which is being used by JACK



II. Comparison Chart

The following table provides a quick way to compare the features and specifications of the software
tested in this report. Gaps in the table exist where a function could not be tested, or no data could
be found. More detailed explanations of the capabilities and performance of each software title are
in the text of this report.

Table Legend

# separate Jack parts are: Jackdmp 0.71, JackRouter 0.8.7, JackPilot 1.6.3

? stability is evidenced by running without crashing (eg quitting, freezing, not responding).
Reliability was appraised during normal operation (eg connecting, disconnecting, streaming)

$$ where CPU loads were variable; I= idle, U= in use, P= peak

* for details see the latency performance charts

** prone to high levels of digital noise on some interfaces (eg M-Box)

*** for 2 channels in and out, as measured in Activity Monitor

**** for an average user: simple = can be easily set up with little or no instructions, average = can be
easily set up with simple instructions, complex = can only be set up by following detailed
instructions

NA not applicable

++ not tested; data is from the developer's website

+ for further information refer to Combination test results in Appendix IX



Comparison Chart

Soundflower  Jack#  WireTap Studio  Audio Hijack  AUNetsend/rec  Detour  cable
Version  1.2.1  0.77  1.0.4  2.8.1  1.4.0, 1.4.1  1.5.5  N/A
Ease of Installation  simple  simple  simple  simple  Auto (with OS)  simple  simple
Documentation  minimal  50 page manual  help file  82 page manual  minimal  help file  N/A
Support  minimal  good  no  N/A
Ease of Setup  simple  complex****  simple  simple  simple  simple  simple
Reliability?  very good  very good  good  good  good  good  very good**
Ease of Un-installation  Uninstaller provided  Uninstaller provided  simple  N/A  N/A
Features:
No. of audio channels  2 or 16  unlimited  2  2  2  2  2 (on built in audio)
No. of MIDI channels  0  0  0  0  0  0
18
Buffer size settings (samples)  64 - 2048  32 - 4096  N/A  32-6144, 384-2  N/A  none  N/A
Latency (measured)  very low*  0  0  low*
Internal Bit depth  32 bit floating decimal point  16, 24, 32 FP
Global or specific solution  global  global  global  global  semi-global  semi-global  global
System Architecture  Kernel Extension  Synchronous Server/Client  Kernel Extension
Compatibility+  (cable: OK on most Mac models. Can only be used on an iBook with an external I/O as there is no line-in.)
OS versions tested  10.4.11  10.4.11  10.4.11  10.4.11  10.5.4  10.4.11
Digidesign Core Audio driver  Y
Conflicts with other utilities  N  N
Hardware: PPC  Y  Y  Y  Y  Y
Intel Macs  Y  Y  Y  Y  Y  N
CPU load***$$  0%  12.5%  1%I, 22%U, 100%P  2%I, 10%U  0%  0%
Cost  free  free  US$69  US$32  free  free  $10 cable
URL  www.cycling74.com  www.jackosx.com  www.ambrosiasw.com  www.rogueamoeba.com  www.apple.com  www.rogueamoeba.com  N/A
Utilities accessed from menu  AMS  Sound Prefs  none  AMS  none  none  N/A
AMS
Stability? (no. of crashes)  0  1  0  0  0  0  0



Comparison Chart

WireTap Anywhere  DLSMusicDevice  Rewire  SoundSource  Sound Menu  QuickTime  Core Audio
Version  1.0.1  1.4.0  1.7  2.0  1.5.1  7.5.5  3.1.0
Ease of Installation  simple  Auto (with OS)  simple  simple  simple  Auto (with OS)  Auto (with OS)
Documentation  good on-line help  none  little  Readme file  Readme file  Developer docs  Developer docs
Support  online tutorials  online tutorials
Ease of Setup  simple  simple  average  N/A  N/A  simple  simple
Reliability?  good  limited  good  excellent  excellent  excellent  excellent
Ease of Un-installation  Uninstaller provided  N/A  Uninstaller available  simple  N/A  N/A
Features:
No. of audio channels  2  2  256  2  2  Unlimited  Unlimited
No. of MIDI channels  0  0  4080  0  0  N/A
Buffer size  Set by client
Latency  30mS++  NA  NA  Very low
Internal Bit depth  Sample-dependent  32 bit floating decimal point
Global or specific solution  global  specific  specific  global  global  specific  global
System Architecture  Server (host application)/Client  Container format  Synchronous execution-via-callback API
Compatibility+
OS versions tested  10.5.4  10.4.11, 10.5.4  10.4.11  10.4.11  10.4.11  10.4.11, 10.5.5  10.4.11, 10.5.5
Digidesign Core Audio driver  Y  Y  Y
Conflicts with other utilities  N  N  N  N  N
Hardware: PPC  Y  Y  Y  Y  Y
Intel Macs  Y  Y  Y  Y  Y  Y  Y
CPU load  6%
Cost  US$129  Included with OS  bundled  free  free  Included with OS  Included with OS
URL  www.ambrosiasw.com  www.apple.com  www.propellerheads.se  www.rogueamoeba.com  www.aspirine.li  www.apple.com  www.apple.com
Utilities accessed from menu  N  N/A  N/A  Sound Prefs, AMS  Sound Prefs  N  Sound Prefs
Stability? (no. of crashes)  0  0  1  0  0  0  0



III. Setup Procedures

1. Routing QuickTime audio into Logic, using Soundflower (refer to screen movie 1):

1. In the AMS set the default output to Soundflower (2ch)
2. In Logic set the record channel I/O = No output
3. In the Logic Preferences: Audio/Devices, set the CoreAudio Device to Soundflower (2ch)
4. Adjust the record volume in QuickTime
5. Record to an audio track in Logic
6. In Logic return the channel I/O to Output 1-2 and disable the track record
7. Play back the recording in Logic

Fig. 16 Re-recording MIDI as audio in Logic, using Soundflower

Audio from iTunes can be recorded in a similar way (using the iTunes volume to set the
record level). This also works for the QuickTime Music Synthesiser (i.e. playing a MIDI file);
however, the level is quite low and cannot be adjusted with the QuickTime volume. This can
be fixed by putting a Gain plugin into the record channel and setting the gain to around 10dB.



2. AMS and Sound Preferences operation:

When an input or output is selected in either the AMS or the Sound preference pane, the other
will automatically update to the new setting.
The input and output volumes can only be adjusted in the Sound preference pane. These
controls also set the level for recording and playback. If both are set to around 66% this will
give unity gain on loopback.
A feature that is not apparent is that the output volume has two separate settings: one for
headphones (when there is a plug in the jack), and another for the speaker volume.

3. Setup Procedure for recording QuickTime into Logic Audio (Intel Mac),
using JACK (refer to screen movie 3):

1. In the Audio MIDI Setup create an Aggregate Device from the Audio menu. From the
list select Built in Input and Built in Output. The Clock setting is not important, as
Jack will synchronise the audio. Name the device, e.g. Full Duplex (refer to screen
movie 2)
2. Start JackPilot
3. In JackPilot Preferences select this Aggregate device as the Interface. Also check that
the Virtual Input and Virtual Output channel numbers are set to 2 and that the buffer
size is 512. Deselect Auto Connect with physical ports.
4. Start the Jack Server using the Start button on JackPilot.
5. In the Audio MIDI Setup select JackRouter as the Input and Output devices.
6. Start QuickTime and start playing a file. Check that QuickTime appears in the Send
Ports list of the Connections Manager, which is accessed from the Routing button on
JackPilot.
7. Start Logic Audio and in Preferences/Audio/Devices/CoreAudio select JackRouter as
the device. Set the buffer size to 512. Click on the Apply Changes button. A pop-up
"initialising CoreAudio" window should appear momentarily.
8. In the Connections Manager Logic Audio should now be showing on the Receive
Ports list. Click once on QuickTime in the Send Ports list (this will highlight it in
blue) and double click on Logic Audio in the Receive Ports list (it will turn red). The
connection will now show in the Connections list on the right. To delete a connection
double click on either the Send or Receive device again.
9. In Logic record enable a stereo audio track. The QuickTime audio should now be seen
on the channel meter.
10. To hear output, Logic Audio needs to be connected to the System Device. In the
Connections Manager click on Logic Audio in the Send Ports list, and then double
click on system in the Receive Ports list. To save this setup, go to the JackPilot file
menu (save studio setup).



Fig. 17 Routing QuickTime to Logic using JACK. The setup for a PPC Mac is similar
except step 1 can be ignored.

4. Routing Logic audio and MIDI into QuickTime Pro, using WireTap
Anywhere (refer to screen movie 4):

1. In WireTap Anywhere create a device for Logic and name it Logic
2. Select this device (it will turn blue)
3. In QuickTime Pro Preferences/Recording set Microphone: WireTap: Logic and
Quality: Device Native.
4. In QuickTime Pro select File/ New Audio Recording
5. In Logic push play. Check that audio is showing on the meter in the QuickTime Pro
Audio Recording window.
6. Push record and then push play in Logic.
7. Push stop on Logic and then stop on the QuickTime Pro recording
8. A .mov file will be saved onto the desktop
9. Use File/Export, Export: Sound to Wave to create a .wav file.



IV. Other Audio Routing and Control Utilities
There are several other audio utilities that offer useful routing and control functionality. The two
main purposes are to capture and record audio within the computer, and connect or select
applications and audio devices. The software companies Rogue Amoeba and Ambrosia are most
active in this area.

WireTap

The original WireTap was a free utility from Ambrosia Software and provided the ability to
record any audio playback in OS X. Recordings are saved to file, either compressed or
uncompressed.

Fig. 18 The WireTap GUI (Ambrosia Software, 2003)

WireTap Studio provided extra functionality and a more advanced GUI. Features added include
adding audio effects, an audio library window, comprehensive on-line help, and scheduled
recording. The main audio addition was the ability to source from two audio devices
simultaneously and adjust the relative levels of each.

Fig. 19 The WireTap Studio GUI (floating controls, source tab) (Ambrosia Software, 2006)

Ambrosia has just announced (3Q, 2008) the release of its third generation WireTap product,
WireTap Anywhere. The GUI is a System Preference pane where audio devices are configured.
Up to 12 virtual devices can be created and each can capture the audio of up to 16 applications
simultaneously (dominic, 2008). An uninstaller is provided, which lets the user remove the files
with or without the associated kernel extension. The reason for this is that the kernel extension
(AmbrosiaAudioSupport) is also used by other Ambrosia software, namely Snapz Pro X and
WireTap Studio.



Fig. 20 The WireTap Anywhere GUI (Ambrosia Software, 2007)

WireTap Anywhere was tested on a 2 GHz Intel iMac with 3GB of RAM. The operating system
was Leopard 10.5.4. CPU power used was 22% peak and 6% while streaming 2 channel audio.
This utility is quite easy to set up and use and has an excellent on-line help file. There are also six
tutorial videos on the Ambrosia website which cover all the features. Unlike previous versions of
WireTap, this utility is an audio patchbay and does not have any record to file function. Recording
is done in any other chosen application (e.g. QuickTime Pro). An advantage of WireTap Anywhere
is that it will route audio to another application for recording without disconnecting the audio
stream to the sound output. This enables monitoring of the audio while setting up and recording.
Under test WireTap Pro performed very smoothly with no signs of crashing or glitching the audio.
Devices can be selected and the power switched on or off while the audio is streaming without
any adverse effects. WireTap Anywhere also has the option to run in AU mode, in which case it is
not necessary to create audio devices. For this to work the audio application must be AU
Generator compatible. I was unable to test the latency of WireTap Anywhere, as it would not
allow audio loopback. According to Dominic Feira from Ambrosia Software the latency is about
30mS at a 44.1kHz sampling rate for the AU Generator, and 32mS for the AUHAL plugin.
Lowest latency is 27mS (http://www.ambrosiasw.com/forums).

Fig. 21 WireTap devices as they appear in the GarageBand Preferences (Apple, 2008)



Audio Hijack Pro v2.8.1

This application performs the same function as WireTap. It can record audio from any source on
OS X (applications, audio devices, internet streamed audio), and it is possible to combine these
into a single recording. With the addition of Soundflower it can also record system audio. Audio
can be recorded in four formats: AIFF (16 bit or 24 bit), ALAC, MP3, or AAC. A Quick Record
feature allows hijacking audio without having to use all the settings for a full session. A
convenient Split button allows the user to start a new file while audio is streaming. The input
buffer size can be set between 32 and 6144 samples, and the output buffer can be set between 384
and 262144 samples. An optional install (Instant Hijack) lets this programme hijack audio from an
application that is already open. Otherwise Audio Hijack must be run first.

Fig. 22 Audio Hijack Pro GUI (Rogue Amoeba, 2008). Source Type can be set to Application,
Audio Device, AM/FM Radio, or System Audio.

Fig. 23 Advanced options for the Application as a source setting (Rogue Amoeba, 2008).
Stream Index has 8 levels "to allow you to receive audio from applications which output
audio in non-standard ways". No further description of this feature is given.



Audio Hijack Pro 2 has a considerable number of extra features such as adding AU effects, using
the Apple Automator function, running it using Applescript, CD burning, and recording podcasts
using the inbuilt timers. These fall outside the scope of this paper so they will not be explained
here.

Detour v1.5.5

Detour is a system preference pane, allowing application-independent audio routing, with
independent volume control, to any sound devices enabled on a PPC Macintosh. Development of
Detour has been discontinued and the final version is 1.5.5, which was released in 2005. All of the
application-redirects and most other features can be accessed from the menu-bar Detour menu.

Fig. 24 The Detour Applications Redirects window (Rogue Amoeba, 2005)

Fig. 25 The Detour devices window (Rogue Amoeba, 2005)



Fig. 26 The Detour menu showing QuickTime set to Soundflower with a volume of 70% (Rogue
Amoeba, 2005)

SoundSource

SoundSource is an audio utility allowing menu-bar access to control selection of the input, output
and system audio devices.

Fig. 27 The SoundSource menu (Rogue Amoeba, 2008). This is the Leopard version; volume
sliders do not appear on the Tiger version.

A limitation of most Macintosh models is the inability to isolate sound output between the
headphone (audio out) jack and the internal loudspeaker. The Mac Pro does offer separate output
control and SoundSource features an auto-switch to headphones menu item to mimic the way
other models switch the internal speaker off if the headphone jack is in use (Rogue Amoeba,
2008).



Sound Menu

An alternative to SoundSource, this is another menu-bar utility to allow access to the sound
devices. The features are similar, but direct access to the AMS is not provided. It does indicate
when mute is on, which is shown as a cross through its menu-bar speaker icon.

Fig. 28 The Sound Menu GUI (Aspirine, 2008)

Line In

This utility allows any input audio to be directly routed to any output device. The advanced tab
brings up a second window where input buffer size can be set between 32 and 6144 samples, and
the output buffer can be set between 384 and 262144 samples. The minimum output buffer size
automatically sets to twice the input buffer size. Tests on a MacBook revealed the minimum
output buffer size allowed was 512 samples, which gives a latency of 11.6 mS. Below these
values the thru function was muted.
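
The relationship between buffer size and latency quoted above can be checked with a short calculation. This is a minimal sketch that assumes a 44.1kHz sample rate (the rate is not stated for this test, but it is the value that reproduces the 11.6 mS figure); the function name is illustrative only.

```python
def buffer_latency_ms(frames: int, sample_rate_hz: float = 44100.0) -> float:
    """Time taken to play out one buffer of `frames` samples, in milliseconds."""
    return frames / sample_rate_hz * 1000.0

# Buffer sizes mentioned in the text: the limits of the two sliders,
# plus the 512 sample minimum found on the MacBook.
for frames in (32, 384, 512, 6144, 262144):
    print(f"{frames:>6} samples -> {buffer_latency_ms(frames):8.1f} mS")
```

At the assumed rate, 512 samples corresponds to roughly 11.6 mS, which agrees with the measured figure.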

Fig. 29 The Line In GUI (Rogue Amoeba, 2008)

PTHVolume

This is a system preference pane, which allows individual volume control for each audio device
from the menu bar. It requires OS 10.5. Custom keyboard shortcuts can be set individually for any
device that allows volume control.

Fig. 30 The GUI for PTHVolume (PTH Consultants, 2008). Faders can be shown either
vertically or horizontally.



Fig. 31 The preference pane for PTHVolume showing keyboard shortcuts set for Soundflower
(PTH Consultants, 2008).

Distributed Audio Processing

VST System Link is a distributed computing system, which allows audio and MIDI to be
transferred in real-time between a number of computers. It was introduced in 2002 and is
proprietary software of Steinberg so it requires Cubase or Nuendo as the host. The LSB of the 24
bit audio word is used to maintain sample accurate synchronisation. A typical configuration
might include a keyboardist with many virtual synths operating on one computer that does not
affect the mixing engineer's computer running with many VST effects and plug-ins. (Rudolph,
2003).
Apple's Logic Audio Node is a similar system for distributed audio processing using Ethernet and
the built-in networking capabilities of Mac OS X.
With the increased processing power available with the Intel CPUs it will be unnecessary for
many users to require extra processing on remote machines. The ability to freeze tracks has also
reduced CPU usage of virtual instruments to acceptable levels for a lot of situations. Neither of
these systems was tested as this paper is concerned with audio communication on a single
computer.

AUNetSend v1.4.0 and AUNetReceive v1.4.1

These two AU plugins work together and are also intended to transfer audio between two
computers using the Bonjour protocol to communicate over Ethernet. It is also possible to transfer
2-channel audio between applications on a single machine. I was able to successfully stream audio
from Logic into Audio Hijack Pro.



Fig. 32 The two AU Network windows (Apple, 2008)

I also tested routing audio from a Logic audio track via a channel insert and back into the input of
an audio instrument channel (and from there to mix out). This worked without any problems with
the AUNetReceive automatically connecting when it senses audio playing.

Fig. 33 The setup for testing the AUNetwork plugins in Logic

By bussing the incoming audio stream onto a new audio track I was able to check the latency,
which was found to be 0 samples. The AUNetSend plugin allows the user to set the data format.
16 bit, 24 bit and 32 bit floating point are presented as well as AAC (32 to 128kb/s), µ-Law, and
IMA 4:1 data compression.



DLSMusicDevice v1.4.0

This is an Apple AU plugin, which allows a MIDI sound device to be inserted into an audio
instrument channel of a sequencer. By default this will be the QuickTime Music Synthesiser
output but any Soundfont2 device can be selected if it resides in the Library/Audio/Sounds/Banks
folder. The DLSMusicDevice will recognise most of the GM2 MIDI commands including
velocity, pitchbend, volume and bank select, reverb, chorus, and program change (Shaffer,
Rosenzweig, 2004). I found it worked well for a small number of tracks but when tried with 16
tracks of MIDI the audio glitched badly (testing was on a 2GHz MacBook).

Fig. 34 The DLSMusicDevice edit window (Apple, 2008)

Digidesign CoreAudio Driver

This allows any CoreAudio enabled application to access Pro Tools hardware, such as the Mbox,
or Digi 002. A separate driver is required for the Mbox 2 series. It provides for full duplex
recording and playback of audio up to 24 bit and 96kHz. Up to 18 channels of I/O are possible
with a Digi 002. It is also possible to use it on a TDM Pro Tools system but only the first 8
channels of I/O are available. Buffer sizes can be set from 128 to 2048 samples, depending on the
Digidesign hardware used.

Fig. 35 The Digidesign CoreAudio Manager (Digidesign, 2008). On a MacBook or Intel iMac
the buffer size can be set to either 512 or 1024 samples.

The correct order of connection is important when using the Digidesign CoreAudio Driver. It is:
1) Start Digi CoreAudio Manager
2) Select Digidesign HW in Sound Prefs or AMS
(if Sound Prefs is already running; quit and reopen it)
3) Start the audio application, open an audio file, and play or record



V. JACK

Several third party software developers have built audio systems for OS X that route audio
between applications. Rewire from Propellerhead Software, and Soundflower, originally from
Cycling '74 (Matt Ingalls) and now maintained by Joachim Bengtsson (ThirdCog), are two such
examples. Neither of these is described in much detail so the exact theory of operation and
specifications are difficult to know. In contrast, JACK has been well documented by its
developers, particularly Paul Davis and Dan Nigrin. It is an Open Source programme so there is
also a considerable amount of information published by other interested parties. For this reason,
and the fact that JACK proposes to be a professional solution, JACK is considered here in some
depth.
Vucic (nd) has dedicated several pages of his paper (Free Software Audio Applications: an
overview of functionality and usability) to describing JACK. On p21 he compares freeware to
commercial software, maintaining that the two work on different sets of principles. Features of
commercial software occur as a result of market research and are designed to address a particular
user segment, as well as working well only with certain hardware, such as proprietary audio I/O
interfaces. Non-commercial software is free from such market driven decisions and may include a
variety of unusual features. It can also be driven by a purely problem-solving approach, as is the
case with JACK. Paul Davis describes how the impetus for developing JACK was one of finding a
solution to a problem, the question of how to get different audio applications to talk to each
other ... "you'd want to do something that seemed very obvious and you just could not do it"
(Kerner, 2004).

Theory of Operation

JACK was conceived in 2001 and originally written for the Linux operating system, where it can
utilise the scalability and reliability of this OS. Other Linux solutions (eg NAS, artsd, esd) do not
provide sample accurate synchronised I/O of multiple audio streams, nor were they designed for
performance within low latency systems (Phillips, 2006). Stephane Letz successfully ported
JACK to OS X, and the initial public release for OS X was on the 7th of January 2004. This was
version 0.4 and was named Jack Tools. In the same month its designer, Paul Davis was awarded a
Bronze Open Source Award for the Linux version of JACK (Kerner, 2004). From the outset the
design objective was to create a professional specification solution, which would allow streaming
of high bandwidth data between independent applications with low latency.
Not all audio software is designed for professional use. JACK does meet this criterion, and
therefore must be evaluated against the statement "JACK is different from other audio server
efforts in that it has been designed from the ground up to be suitable for professional audio work.
This means that it focuses on two key areas: synchronous execution of all clients, and low latency
operation." (Davis P, 2008)
It is anticipated that data other than audio could be supported, so in future video may be added. It is
possible to run more than one JACK server, where each would form its own independent setup.
Applications connected to JACK have their own graphical interfaces with JACK making no
specifications as to different GUI toolkits or libraries. Consequently, a JACK setup can be spread
across multiple system processes. A primary design goal was to provide sample-accurate
synchronisation of all clients. To achieve this goal all clients must process the audio for an exact
period of time, in other words each client must execute the audio in exact lock-step. Other
specifications of JACK are that it uses only mono audio streams (i.e. interleaved audio is not
supported), and 32 bit floating point (to IEEE-754 6, normalised to a range of -1.0,+1.0) is used.

6 IEEE Standard for binary floating-point arithmetic, ANSI/IEEE Std 754-1985



Included in the design goal is the ability to add or remove clients while the JACK server is
running as well as to connect applications that are running.
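
The stream format described above (separate mono streams of floating point samples normalised to the range -1.0 to +1.0) can be illustrated by converting a block of interleaved 16 bit stereo audio. This is only a sketch of the general idea and is not taken from the JACK source: the function name and sample values are hypothetical, and Python floats are 64 bit rather than the 32 bit values JACK uses.

```python
from array import array

def deinterleave_to_float(samples, channels):
    """Split interleaved 16 bit PCM into one list of floats per channel.

    Dividing by 32768 maps the integer range -32768..32767
    onto the normalised range -1.0 to (just under) +1.0.
    """
    scale = 32768.0
    return [[s / scale for s in samples[ch::channels]] for ch in range(channels)]

# Hypothetical interleaved stereo block: L R L R L R
stereo = array("h", [0, 32767, -32768, 16384, 8192, -8192])
left, right = deinterleave_to_float(stereo, 2)
print(left)   # one mono stream per channel
print(right)
```

Each resulting list corresponds to what would be carried on a single mono port, so a stereo connection needs two ports.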
Other approaches typically use plugin APIs that are shared objects that must execute in the
context of the host application. Rewire and DirectConnect are examples of high-level plugin APIs
which use shared objects. To simplify design these are run within a single process by the host
application. One of the more difficult tasks any OS has is the correct real-time scheduling of each
audio application. UNIX approaches generally employ sound servers using a push model. This
means that an application can write any amount of data, from very small to very large, whenever it
suits it. The push model does not maintain synchronous operation of all applications so it cannot
generate audio at the same time we actually hear it. This is sufficient for consumer applications
such as MP3 players but not satisfactory for professional audio due to the large amount of
buffering required, which increases latency. Most of these models do not provide inter-application
routing. JACK uses a pull model to ensure accurate control of the audio data. In this model data is
drawn from the application by the JACK server.
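
The pull model can be sketched in a few lines. The example below is a toy illustration written for this report, not JACK's actual API: the class and function names are hypothetical. The point it shows is that the server, not the application, decides when audio is produced and how much, by asking every client for exactly one period of frames on each cycle.

```python
from typing import Callable, List

class PullServer:
    """Toy server: once per period it asks every client for exactly `period` frames."""

    def __init__(self, period: int):
        self.period = period
        self.clients: List[Callable[[int], List[float]]] = []

    def register(self, process: Callable[[int], List[float]]) -> None:
        self.clients.append(process)

    def run_cycle(self) -> List[float]:
        """Pull one period from each client and mix the results."""
        mix = [0.0] * self.period
        for process in self.clients:
            block = process(self.period)
            if len(block) != self.period:
                raise ValueError("client must return exactly one period")
            mix = [m + s for m, s in zip(mix, block)]
        return mix

def constant_source(level: float) -> Callable[[int], List[float]]:
    """Hypothetical client that outputs a constant level."""
    return lambda nframes: [level] * nframes

server = PullServer(period=4)
server.register(constant_source(0.25))
server.register(constant_source(0.5))
out = server.run_cycle()
print(out)
```

In a push model the clients would instead write blocks of arbitrary size whenever they chose, and the server would have to buffer them to line the streams up.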

The JACK software consists of several parts. The JACK Server (jackdmp) manages the
clients and is optimised to use the processing power of multi-core CPU machines.
JackPilot is the GUI that allows connection setup and control of jackdmp. The JackRouter is a
CoreAudio driver that allows any OS X audio application to be a JACK client. There are also
JACK plugins (AU and VST) providing for additional audio routing possibilities. For example a
third application can act as an effects insert. A NetJack module has also been developed which
allows streaming of audio over a network. At present this feature has been suspended until fixed.

Fig. 36 JACK architecture diagram showing the Linux version. (Davis P., 2003)



Fig. 37 JACK architecture diagram showing the OS X version. (Davis P., 2003)

The central part of JACK is the JACK Server, which controls the audio data transmission between
clients. These may be either applications or hardware devices, such as built in audio or an external
audio I/O interface unit. A small number of native JACK applications exist but most use the
JackRouter (JAR) to communicate with jackdmp. When the first client signals jackdmp via the
CoreAudio driver, jackdmp activates a client graph. This is a set of nodes to be executed
consecutively on a periodic basis. Each client has its own audio process function to be called in a
specific order. The graph is executed by the driver calling jackdmp at a regular interval
determined by the buffer size. The server essentially passes this interrupt to each client in turn.
The design criterion is that all audio data processing and transfer is completed for all clients within
two consecutive interrupts. This must include server/client communication (Davis P, 2008).
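
The idea of a client graph that is executed in a specific order on every interrupt can be sketched as follows. The connections loosely follow the QuickTime to Logic to system routing used in the setup procedures; the processing functions are hypothetical, and the use of a topological sort to derive the call order is an assumption made for illustration rather than a description of how jackdmp is implemented.

```python
from graphlib import TopologicalSorter

# Hypothetical connections: each key maps to the set of clients it reads from.
connections = {
    "QuickTime": set(),
    "Logic": {"QuickTime"},
    "system": {"Logic"},
}

# A topological sort is one way (assumed here) to obtain a valid call order,
# so that every client runs after the clients that feed it.
order = list(TopologicalSorter(connections).static_order())

def quicktime(inputs, nframes):
    return [0.5] * nframes                 # a source: generates audio

def logic(inputs, nframes):
    return [s * 0.5 for s in inputs[0]]    # halves the level of its one input

def system(inputs, nframes):
    return list(inputs[0])                 # passes audio on to the output

process_fns = {"QuickTime": quicktime, "Logic": logic, "system": system}

def run_cycle(nframes):
    """Execute the whole graph once, as the server would on each interrupt."""
    buffers = {}
    for name in order:
        inputs = [buffers[src] for src in sorted(connections[name])]
        buffers[name] = process_fns[name](inputs, nframes)
    return buffers

buffers = run_cycle(4)
print(order)
print(buffers["system"])
```

Because every client processes the same number of frames in the same cycle, the audio arriving at the output stays sample-aligned with its source.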

Paul Davis described JACK in some detail at the Linux Audio Developers Conference, ZKM
Karlsruhe in March 2003 and much of the information included here is drawn from that
presentation.



Fig. 38 Typical audio I/O architecture (Davis P., 2003). An interrupt to the OS via the CPU
wakes the driver. The read and write lines are the system calls. Data transfer between the User
layer and the buffers in the Kernel layer use either a memory copy command or DMA. Davis
describes this as the best API model.

Using the method of two interrupts per hardware buffer, JACK can react to the input signal within
one interrupt interval and deliver audio within two interrupt intervals.

The Context Switch

The best audio applications are built with a high level of integration between the GUI and the DSP
code. Normally a software designer will want to keep to one developer Toolkit so that both the
GUI and DSP code can be run in the same process. This is not always possible with audio
software where for example two applications that need to be connected run in different processes.
One of the critical aspects in the design of JACK is to allow plugins running in another process to
interface with jackdmp. The advantages are that this isolates the host from plugin errors, avoids
the requirement for IPC (inter-process communication) between the GUI and the DSP code, and
developers can choose their own GUI Toolkits. Paul Davis expresses frustration on this point:
"one of the most annoying features of writing audio application software is we cannot integrate
good applications written in different toolkits" (Davis P, 2003).
JACK uses a Context Switch to facilitate clients running in a different process. This is the ability
of a CPU to switch between processes by saving all the register values for the current process on
the stack and then restoring all the register values for the next process. On a machine using
virtual memory (i.e. any personal computer) the same address in two different processes rarely
refers to the same physical memory location. Therefore the virtual memory addresses must be
mapped to physical memory, making context switching computationally intensive. The task can
be handled in hardware, e.g. the Translation Lookaside Buffer (TLB) does this mapping on an Intel
processor. Upon each switch, the memory (and cache) must be invalidated, and the impact of this
is to introduce some delay. This is linear with CPU speed for each register save/restore cycle but
the TLB effects are dependent on a number of factors outside the programme's control. These
include the front-side bus speed, memory speed, cache size, and the process working set size. On
a modern x86 processor register switching time is in the order of 10 to 50 µS. Cache and/or TLB
effects can increase this time by a factor of two or four. The total time available to process 64
frames per interrupt at 48kHz is 1333 µS. This is the lowest order of latency that many current
PCI interfaces can support. While 50 µS is not a lot, if JACK is serving ten clients this would be
500 µS, which is nearly 40% of the total time available.
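
The arithmetic behind these figures can be reproduced as a worked example. The sketch below uses only the numbers quoted in the paragraph above (64 frames, 48kHz, ten clients, and the upper estimate of 50 microseconds per context switch); the variable names are illustrative.

```python
def period_us(frames: int, rate_hz: int) -> float:
    """Time between interrupts for a given period size, in microseconds."""
    return frames / rate_hz * 1_000_000

budget_us = period_us(64, 48_000)      # total time available per interrupt
clients = 10
switch_us = 50                         # upper estimate for one context switch
overhead_us = clients * switch_us      # time lost to switching alone
fraction = overhead_us / budget_us

print(f"budget   = {budget_us:.0f} uS")
print(f"overhead = {overhead_us} uS ({fraction:.1%} of the budget)")
```

The overhead works out at 37.5% of the period, which is the "nearly 40%" referred to above, before any cache or TLB effects are taken into account.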
One of the most common things affecting the context switch time is the process working set size.
If the program has to use a large amount of memory it will invalidate a lot of cache and the cache
will need to be refilled (and vice versa). Several of JACK's design goals are significant in
determining the low latency. Whereas the internal clients are run as shared object APIs (as are VST
plugins and Rewire), the external clients also need to communicate with jackdmp efficiently.
Memory copy techniques have been avoided and instead JACK uses shared memory to exchange data.
Communication is done using FIFOs rather than signals/slots event notification as these were
found to be too slow. FIFOs provide an extremely fast and lightweight IPC mechanism. A
compile option in JACK enables FIFOs to be placed on a RAM-based file system. There are
other advantages in using FIFOs in the model, namely that they are easily reconfigurable when the
graph changes. The FIFOs on the server side can be left open, and clients can close and reopen to
reflect the graph order.
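
The combination of shared memory for the audio data and FIFOs for signalling can be sketched as a chain of wake-ups. This is a simplified illustration and makes several assumptions: threads and anonymous pipes stand in for JACK's separate processes and named FIFOs, a single array stands in for the shared memory segment, and the gain values are arbitrary. Only a one byte token travels through each pipe; the audio itself is never copied.

```python
import os
import threading
from array import array

# Stands in for a shared-memory audio buffer of 32 bit floats.
shared = array("f", [0.5, 0.5, 0.5, 0.5])

def client(read_fd: int, write_fd: int, gain: float) -> None:
    os.read(read_fd, 1)                  # block until the previous stage signals
    for i in range(len(shared)):
        shared[i] *= gain                # process the buffer in place, no copy
    os.write(write_fd, b"x")             # wake the next stage in the chain

# One pipe per hop: server -> client A -> client B -> server
pipes = [os.pipe() for _ in range(3)]
threads = [
    threading.Thread(target=client, args=(pipes[0][0], pipes[1][1], 0.5)),
    threading.Thread(target=client, args=(pipes[1][0], pipes[2][1], 0.5)),
]
for t in threads:
    t.start()

os.write(pipes[0][1], b"x")              # the server starts the cycle
os.read(pipes[2][0], 1)                  # and waits for the last client to finish

for t in threads:
    t.join()
for r, w in pipes:
    os.close(r)
    os.close(w)

result = list(shared)
print(result)
```

Changing the graph order in this sketch only means handing each client a different pair of pipe ends, which reflects the reconfigurability mentioned above.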

This paper will not look at how JACK functions at a coding level but it is informative to briefly
consider the operation of the external client. Within the bounds of the external client process
delimiter reside a request socket and an event socket, both of which are duplex. The request
socket can initiate a request to jackdmp. An example would be a client shutdown message. The
request may also communicate with the application GUI or any other process thread (eg for MIDI
I/O). The event socket accepts control signals from jackdmp. This enables the client to disconnect
or reconnect to another client and is able to communicate with a special audio thread created by
jackdmp for the client. This JACK thread runs the main event loop and controls the data to the
FIFO from the previous client and the data going to the next client via the output FIFO. Other
threads cannot have audio I/O and must not be able to block the JACK thread. They should also
register shutdown callbacks so the programme is aware of JACK shutting down its client. Thus
the client is responsible for managing at least one thread (the audio thread), and handles the
process, xrun (buffer under-run or overrun), and sample rate change callbacks. It must be able to
function in real-time without making any blocking calls. For this reason programming functions
such as malloc, sleep, read/write/open/close, and some pthread (POSIX threads) synchronisation
functions must be avoided.



Fig. 39 A JACK external client (Davis, 2003)

[Diagram: Jackdmp containing the Driver, the Engine and internal clients A, B, C, G, H and I,
with external clients D, E and F outside it. Legend: Interrupt from Audio Interface, Call,
Context switch.]

Fig. 40 JACK block diagram (after Davis P, 2003). External clients are run as separate
processes, like normal applications.



JACK provides the following options for the CoreAudio driver:

-c, --channels         Maximum number of channels (default: 0)
-i, --inchannels       Maximum number of input channels (default: 0)
-o, --outchannels      Maximum number of output channels (default: 0)
-C, --capture          Provide capture ports. Optionally set CoreAudio device name
                       (default: will take default CoreAudio input device)
-P, --playback         Provide playback ports. Optionally set CoreAudio device name
                       (default: will take default CoreAudio output device)
-m, --monitor          Provide monitor ports for the output (default: false)
-D, --duplex           Provide both capture and playback ports (default: true)
-r, --rate             Sample rate (default: 44100)
-p, --period           Frames per period (default: 128)
-d, --device           CoreAudio device name
                       (default: will take default CoreAudio device name)
-I, --input-latency    Extra input latency (frames) (default: 0)
-O, --output-latency   Extra output latency (frames) (default: 0)
-l, --list-devices     Display available CoreAudio devices (default: true)
-L, --async-latency    Extra output latency in asynchronous mode (percent)
                       (default: 100)

Notes:

1. The period (-p) specifies the number of frames between process() function calls.
JACK's latency therefore is --period / --rate.
2. The maximum number of output ports defaults to 128. QjackCtl allows selection of
512 ports. More are available with sufficient memory.
3. Several additional settings relating to memory management and sound card
performance are available on the Linux version of JACK. For example -n
(--nperiods) specifies the number of periods in the hardware buffer.
The default value is 2. The buffer size (bytes) = (-p)*(-n)*4. (Phillips, 2006)

Table 10. JACK parameter settings (after Capela R., 2008)
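
The two formulas in the notes to Table 10 can be evaluated for the default settings. This sketch takes the latency in note 1 to be the period divided by the sample rate, and the factor of 4 in note 3 to be the four bytes of a 32 bit sample; the function names are illustrative and are not part of JACK.

```python
def jack_latency_ms(period: int, rate: int) -> float:
    """Latency of one period (--period / --rate), in milliseconds."""
    return period / rate * 1000.0

def hw_buffer_bytes(period: int, nperiods: int = 2, bytes_per_sample: int = 4) -> int:
    """Hardware buffer size from note 3: (-p) * (-n) * 4."""
    return period * nperiods * bytes_per_sample

# Default CoreAudio driver settings from Table 10: -p 128, -r 44100
print(f"{jack_latency_ms(128, 44100):.2f} mS per period")
print(f"{hw_buffer_bytes(128)} bytes in the hardware buffer")

# The 512 frame buffer used in the JackPilot setup procedure
print(f"{jack_latency_ms(512, 44100):.2f} mS per period")
```

With the default of two periods in the hardware buffer, the time to play out the whole buffer is twice the per-period figure.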

Paul Davis has indicated several ways in which JACK could be improved in future. Several
changes to the Kernel are proposed. Although JACK employs real-time scheduling, in practice
processes don't always get run when they are supposed to be run. Deterministic scheduling
would overcome this problem. Also a faster IPC method has been considered, using futexes/
doors, as FIFOs must go through various kernel locks. Reducing interrupt blocking is another area
where improvements could be made. There are still a number of places in the kernel where
interrupts get blocked for a while. Delays may be small but can use up to 25% of total available
processing time to achieve lowest latency. Dynamically increasing the number of ports would not
require the user to specify a fixed number of ports before starting the JACK server. MIDI
implementation has also been considered and it would be easy to add MIDI using the port model
(where that data length was a set number of MIDI bytes). Full MIDI support would require a
different model that can handle asynchronous MIDI message communication (Davis P., 2003).
JACK is a server/client model, which is hard to make truly robust, as a badly coded application
can cause the system to hang. Furthermore it uses a pull model to ensure sample-accurate
synchronisation of the streamed audio data and this does not work with applications that are not
callback-based. Porting software written on a push model to a pull model can be difficult. It
should be kept in mind that JACK is still a work in progress. The OS X version has not yet reached
version 1.0. This indicates that although JACK is open source freeware, the development team is
serious about creating and improving an audio routing solution suitable for professional use.

QJackCtl v0.3.2 - an alternative GUI

QJackCtl is an alternative GUI for JACK developed by Rui Nuno Capela. It provides several
enhancements to the JACK GUI; more parameters can be set and these can be saved between
sessions, application transport control is provided, it has a more detailed run status, the
connections are presented in graphical (rather than tabular) form, and configurations can be saved
in a visual patchbay. The graphical interface of QJackCtl has somewhat of a mixed PC and OS X
look, due to it being a cross-platform solution that has been ported from the Linux version.

Fig. 41 The QJackCtl Connections window (Capela R., 2008) is very similar to JACK's,
but cables (in this case horizontal lines) show the connections made

Fig. 42 The Patchbay (Capela R., 2008), like JACK, allows saved configurations but
also with cables showing the connections

Fig. 43 QJackCtl main window showing the transport controls (Capela R., 2008)

Fig. 44 The settings window (Capela R., 2008) allows GUI access to many more
parameters than JACK. Latency is calculated and shown here, according to the sample
rate and samples per buffer.

Fig. 45 The Status window (Capela R., 2008) provides considerably more information
than JackPilot. Transport state can also show BPM and Timecode. XRUN can detail six
types of buffer error count

VI. Latency

Latency can be defined as the delay occurring between an action and its effect. Latency is a
central issue when using a personal computer for audio work. There is no processing on a sample
by sample basis; instead the CPU will input, process, and output the audio in blocks. One
solution is to have dedicated DSP chips to handle the audio processing (eg Pro Tools TDM
systems). Any delay introduced by an inter-application audio system will also be significant if it
is more than a few milliseconds. As the total delay is cumulative, it is essential that low
latency is a priority in the routing design.
Latency can be split into Input and Output delays. The input latency is defined as the amount of
time a system takes to respond to the input signal. First, data arrives at the audio I/O, then it is
buffered until the next interrupt, when the CPU can access it. For output latency the process is as follows: the
data is ready for the I/O and starts filling the output buffer. When the next interrupt executes the
CPU can deliver the data. If other data is already queued on the audio interface hardware then
additional delays can be incurred before the audio output takes effect. Shortest latency will be
achieved if there are two interrupts per hardware buffer. This is a variation of double buffering,
which is common in graphical programming. The buffer is divided into two, with an interrupt as it
crosses from one half to the other. The hardware works on one half; either writing to it or reading
data from it, and the software works on the other half. One layer of the buffer is used for the audio
input, and one for the audio output. While the user application writes to the output buffer 0, the
hardware driver writes to input buffer 1; after that, starting from the following frame, the buffers
are switched to output buffer 1 and input buffer 0 respectively.
To summarise, the total output latency will equal the buffer size and the input latency will equal
the interrupt interval. Through latency, therefore, will equal the output latency.
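The double-buffering scheme described above can be sketched in a few lines. This is an illustrative model only, not driver code: the class and method names are hypothetical, the hardware is simulated by a method call, and only one direction (input) is shown.

```python
from typing import List


class DoubleBuffer:
    """A buffer split into two halves: the hardware fills one half
    while the software works on the other."""

    def __init__(self, frames_per_half: int) -> None:
        self.frames_per_half = frames_per_half
        self.halves: List[List[float]] = [
            [0.0] * frames_per_half,
            [0.0] * frames_per_half,
        ]
        self.hw_index = 0  # half currently owned by the (simulated) hardware

    def hw_write(self, block: List[float]) -> None:
        """Simulated hardware writing one block into the half it owns."""
        if len(block) != self.frames_per_half:
            raise ValueError("block must fill exactly one half")
        self.halves[self.hw_index][:] = block

    def interrupt(self) -> List[float]:
        """Raised as the hardware crosses from one half to the other:
        hand the filled half to the software and switch halves."""
        ready = self.halves[self.hw_index]
        self.hw_index ^= 1
        return ready


if __name__ == "__main__":
    buf = DoubleBuffer(4)
    buf.hw_write([1.0, 2.0, 3.0, 4.0])
    block = buf.interrupt()             # software now owns the first half
    buf.hw_write([5.0, 6.0, 7.0, 8.0])  # hardware fills the second half meanwhile
    print(block)                        # [1.0, 2.0, 3.0, 4.0]
    print(buf.interrupt())              # [5.0, 6.0, 7.0, 8.0]
```

The software never touches the half the hardware is writing, so the worst-case wait for new data is one half-buffer, which is the interrupt interval referred to in the summary above.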
At 44.1kHz each sample is 22.676µS so a latency of 100 samples takes 2.27mS. Sound travels at
approximately 344 m/s so two musicians spaced 3m apart will each hear the other with a delay of
8.7mS. While this is acceptable for live performance, in more critical situations such as overdub
recording using headphones, any delay of more than 3mS will make singing or playing in time
difficult (Davis, 2003).
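The figures in the preceding paragraph follow from simple arithmetic, reproduced here as a short sketch. The function names are hypothetical; the speed of sound is the approximate 344 m/s used in the text.

```python
SPEED_OF_SOUND_M_S = 344.0  # approximate value used in the text


def sample_period_us(rate_hz: float) -> float:
    """Duration of one sample in microseconds."""
    return 1_000_000.0 / rate_hz


def samples_to_ms(n_samples: int, rate_hz: float) -> float:
    """Latency in milliseconds of a buffer of n_samples."""
    return 1000.0 * n_samples / rate_hz


def acoustic_delay_ms(distance_m: float) -> float:
    """Time in milliseconds for sound to travel distance_m through air."""
    return 1000.0 * distance_m / SPEED_OF_SOUND_M_S


if __name__ == "__main__":
    print(round(sample_period_us(44100), 3))    # 22.676
    print(round(samples_to_ms(100, 44100), 2))  # 2.27
    print(round(acoustic_delay_ms(3.0), 1))     # 8.7
```

By the same calculation, the 3mS limit quoted for headphone overdubbing corresponds to about 132 samples at 44.1kHz.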
Virtual Instrument developers have gradually reduced the minimum buffer sizes as faster
computers have become available. For example Native Instruments' latest versions allow the user
to set the output latency to 8mS. The Arturia Jupiter-8V audio buffer size can be set as low as 14
samples (0.3 mS at 44.1kHz). To achieve stable performance with this order of latency requires a
fast computer to prevent buffer overruns occurring.
On a 2GHz or greater Intel Mac good results can be obtained but only by careful optimisation and
dedicating the computer for sequencing use. Although CPU power has increased dramatically in
the last five years there has also been an even greater demand for it by the DSP of audio software.
Some Virtual Instruments and other signal processing plugins use complex algorithms requiring
numerous computations, and these put a considerable workload on the CPU. A buffer setting of
5mS is a good compromise between minimal delay and reliable audio streaming without glitching.
The CPU workload increases as buffer size is reduced so a fast computer is required to achieve
latencies of this order. The relationship between buffer size, latency, and CPU load for audio
(using the ASIO API) has been explored at the University of British Columbia (Corbett, van den
Doel, Pai, 2002).

Fig. 46 A water tank analogy of buffer size. The tank represents the buffer size. Water can flow
into the tank sporadically but will leave the tank in a continuous flow. The larger tank can
store more water to compensate for irregular inflow but takes longer to fill (latency).

Fig. 47 Best case latency measurements of comparable systems (after MacMillan, Droettboom,
Ichiro, 2001). Figures in red show results for testing under extra load. Best overall result was
CoreAudio.

VII. How Audio is handled in OS X

a. CoreAudio

Apple's objectives for the design of audio on OS X were twofold. First, it is a system where latency is
inherently low and is determined by the application. This provides a specification suitable for
professional multi-channel audio I/O. Secondly, Apple wanted to address a problem that had
arisen in OS 9, where third party developers were left to find many of the solutions for audio
support (eg ASIO, EASI, MAS7). Although often referred to as drivers these are actually user-
level APIs and latency results vary depending on whether a generic driver is enabled or one
specified for the actual hardware. For many users OS X succeeds in providing a standardised and
integrated audio system solution.
The reasons that CoreAudio does not allow inter-application audio routing are stated by Jeff
Moore in an Apple Mailing List: "Mac OS X's audio system was designed first and foremost for
performance ... we have opted for better performance at the cost of not being able to provide this
feature" (Moore, 2006). CoreAudio is designed for minimal latency whilst retaining glitch-free
audio and uses a pull model to achieve this with reliability. Callbacks are run in real-time
threads so there are some restrictions on what they can do.
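The pull model referred to above can be illustrated with a minimal sketch. This is not the CoreAudio API: every name here is hypothetical, and plain Python functions stand in for render callbacks. The point is only that the output end of the chain asks for exactly the number of frames it needs, and each stage in turn pulls from the stage before it.

```python
from typing import Callable, List

# A render callback returns exactly the number of frames requested.
RenderCallback = Callable[[int], List[float]]


def make_ramp() -> RenderCallback:
    """Hypothetical source: produces 0.0, 1.0, 2.0, ... on demand."""
    state = {"next": 0}

    def render(n_frames: int) -> List[float]:
        start = state["next"]
        state["next"] += n_frames
        return [float(i) for i in range(start, start + n_frames)]

    return render


def make_gain(source: RenderCallback, gain: float) -> RenderCallback:
    """Hypothetical processor: pulls from its source, then scales the result."""

    def render(n_frames: int) -> List[float]:
        return [gain * sample for sample in source(n_frames)]

    return render


def output_cycle(head: RenderCallback, n_frames: int) -> List[float]:
    """One I/O cycle: the output asks the head of the chain for n_frames."""
    return head(n_frames)


if __name__ == "__main__":
    chain = make_gain(make_ramp(), 0.5)
    print(output_cycle(chain, 4))  # [0.0, 0.5, 1.0, 1.5]
    print(output_cycle(chain, 2))  # [2.0, 2.5]
```

Because nothing is rendered until the output requests it, every stage works on the same frames in the same cycle. The restriction noted above applies here too: a callback that blocks delays the whole chain.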

Fig. 48 CoreAudio architecture (Apple, 2004)

CoreAudio is generally described in terms of layers, where the Kernel layer includes the audio
drivers and the I/O kit. Audio drivers use the I/O kit to perform many of their tasks, such as
providing timing mechanisms and buffers.
Above this layer resides the User layer, which includes the Hardware Abstraction Layer (HAL).
The HAL is used to present a consistent interface to any application to interact with hardware.
Timing information and latency values are therefore handled by a standardised part of the OS.

7 ASIO: Audio Stream Input Output (Steinberg); EASI: Enhanced Audio Streaming Interface (Emagic);
MAS: MOTU Audio System (Mark of the Unicorn).

Sample accurate timing is fundamental to CoreAudio and is achieved with timestamps.
A specific AU (the AUHAL) is used to get audio in and out of the HAL.

Fig. 49 The recording studio concept of CoreAudio (Apple, 2007)

[Diagram, top to bottom: CoreAudio application; API (CoreAudio frameworks); User/Kernel boundary;
CoreAudio driver; Hardware (sound card).]

Fig. 50 CoreAudio layers (after Ekeroot, 2007)

[Diagram, top to bottom. User level: Audio Application; AU Framework; Audio Toolbox Framework
(Audio File, Converter & CODEC; AU Graph; Music Player; CoreAudio Clock); Core MIDI Framework;
MIDI Server Framework; CoreMIDI Drivers; System Sound; OpenAL; CoreAudio Framework (HAL).
User/Kernel boundary. Kernel level: I/O kit; CoreAudio Drivers. Hardware: Audio I/O; MIDI I/O.]

Fig. 51 CoreAudio architecture

The audio HAL is central to obtaining reliable audio on OS X. The audio HAL uses extremely
accurate timecode on OS X to ensure clients perform their I/O at the proper time, according to
their buffer size. HAL predicts when the I/O cycle has completed. The driver sets up the hardware
buffer and takes a timestamp for every buffer cycle. The Audio HAL keeps track of each
timestamp and uses the sequence of timestamps to predict the current location of the audio I/O
engine (in terms of sample frame read or written) at any time. Given that information, it can
predict when a cycle will complete and sets its wake-up timestamp accordingly. This model,
combined with the ability of the I/O Kit Audio family to receive audio data from each client
asynchronously, allows any number of clients to provide audio data that gets mixed into the final
output. It also allows different client buffer sizes; one client can operate at a very low buffer size
(and a correspondingly low latency) while at the same time another client may use a much larger
buffer. As long as the timestamps provided by the driver are accurate, the family and the Audio
HAL do all of the work to make this possible. (Apple, 2006)
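The idea of predicting the position of the I/O engine from a sequence of timestamps can be sketched as a simple linear extrapolation. This is an assumption made for illustration, not a description of how the HAL is actually implemented; the function names and the (time, frame) pairs are hypothetical.

```python
from typing import List, Tuple

# (host time in seconds, sample frame position) taken once per buffer cycle
Timestamp = Tuple[float, int]


def estimate_rate(stamps: List[Timestamp]) -> float:
    """Estimate frames per second from the first and last timestamps."""
    if len(stamps) < 2:
        raise ValueError("at least two timestamps are needed")
    (t0, f0), (t1, f1) = stamps[0], stamps[-1]
    return (f1 - f0) / (t1 - t0)


def predict_frame(stamps: List[Timestamp], now: float) -> float:
    """Predict the engine's current sample frame at host time `now`."""
    t_last, f_last = stamps[-1]
    return f_last + estimate_rate(stamps) * (now - t_last)


def predict_wake_time(stamps: List[Timestamp], target_frame: int) -> float:
    """Predict the host time at which the engine reaches target_frame."""
    t_last, f_last = stamps[-1]
    return t_last + (target_frame - f_last) / estimate_rate(stamps)


if __name__ == "__main__":
    # Hypothetical driver timestamps: 441 frames every 10 ms (44.1 kHz).
    stamps = [(0.00, 0), (0.01, 441), (0.02, 882)]
    print(round(predict_frame(stamps, 0.025), 1))     # 1102.5
    print(round(predict_wake_time(stamps, 1323), 3))  # 0.03
```

A client with a small buffer would simply ask for a nearer target frame than a client with a large one, which is how different buffer sizes can coexist as long as the driver's timestamps are accurate.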

Audio Toolbox APIs
  Audio File and Converter      File read/writes to file or buffer.
                                File conversion (CODEC or bit depth).
  AU Graph                      Allows dynamic modification.
                                Data is pulled from the head node (the output AU node).
  Music Player                  Playback of sequenced MIDI data or CoreAudio events.
                                Can output to AU Graph.
  CoreAudio Clock               Reference clock; can supply SMPTE, samples, bars,
  (added in version 10.3)       beats & clicks.

Audio Units API                 Supports 32 bit FP linear PCM (non-interleaved)

Table 11. OS X Audio API features

User access to CoreAudio in OS X is usually via the Sound pane of System Preferences or the Audio
MIDI Setup (AMS) utility. In accordance with Apple GUI design philosophy, there are very few things
that can be adjusted in the Sound pane. The AMS allows setting a few more parameters such as
sample rate, where these may be adjusted for a given device. Interestingly, the MIDI side of the
AMS has more functionality. There is a MIDI patchbay where virtual MIDI devices can be
connected to and from any MIDI interface. This includes naming and setting which MIDI
channels will be allowed for a given device. There are also two special devices in the MIDI device
pane. The IAC driver allows for unlimited MIDI busses to be created for inter-application MIDI
data. Secondly, the Network device allows MIDI communication between computers, using
Bonjour over Ethernet.

b. QuickTime

QuickTime is a cross-platform multimedia technology that is primarily used to read, write,
convert and stream video and/or audio data. It supports over 100 data types, including image,
video, audio8 and web streaming formats. Media is handled in a data structure called a QuickTime
movie, which acts as either a pointer or container file for the media. The QuickTime movie has a
track or set of tracks for each data type.
CoreAudio handles the audio output of QuickTime. By default this is sent to the default audio
device, but from version 7 it is possible to specify any output device. There are also settings for
track volume, balance, channel assignment, pitch shift, playback speed, and an eight-band real-
time spectrum analyser. Full functionality, including basic editing (truncate, loop), is provided in
the Pro version.
An extensible component of QuickTime is the QuickTime Music Architecture (QTMA). This can
play a sequence of music notes from an application, either internally or externally. Using
QuickTime Pro, a Standard MIDI File (SMF) can be imported and saved as an audio file or as a
QuickTime movie. In this case the convention of General MIDI applies, with channel 10 set to
drums.

8 Supported audio formats are: uncompressed 8, 16, 24 or 32 bit, IMA, µLaw, aLaw, AVI, WAV, DV audio, MP3,
MPEG-4 audio.

Fig. 52 QTMA architecture (Apple, 2006)

The Note Allocator plays individual notes with the application setting which instrument will
sound. The synthesiser is selectable from any Soundfonts installed. The default device is the
QuickTime Music Synthesizer, which has a specification similar to General MIDI 2, with 24-
voice polyphony. Timing of a sequence of notes is handled by the Tune Player, which can play an
entire sequence in an asynchronous manner, without application intervention. Tune Player is also
responsible for volume setting and transport control. QuickTime music events (such as note,
controller, marker) are held in a QuickTime movie track, which uses a media handle to access the
Tune Player.

A problem arises when trying to bounce MIDI tracks as part of a stereo mix in Logic. During
playback both audio tracks and any MIDI tracks assigned to the QuickTime Music Synthesiser are
heard as they are both routed to the default audio device (i.e. the internal speakers). A bounce
operation is handled inside Logic but the MIDI data is sent out to QuickTime so any instruments
playing from the QuickTime Music Synthesiser are excluded. This condition is identified in Apple
article 300898 but no fix is given. The solution is to use Soundflower or JACK to route the audio
as detailed in Appendix III.

c. Audio on a Windows PC

For the purpose of comparison, a short description of how audio is handled on the Windows
operating system is included in this section.
Audio routing on the PC is more flexible than on a Macintosh due to the Kernel Mixer. As well as
a master volume, independent volume, balance and muting controls are present for Line input,
Mic input, CD playback, MIDI sounds (from the soundcard), and Wave (from hard disk or
memory).

Fig. 53 The playback mixer in Windows XP (Microsoft, 2001)

Fig. 54 Audio paths on the Windows PC (AES, 2006). Some paths have been omitted. For
example the Mic input can also be directed to the Playback Mixer.

WDM9 is a unified driver model, which provides audio mixing and resampling using a system
component named KMixer. This allows multi-client access to hardware and unlimited audio
streams mixed in real-time. In a Windows XP system internal buffering in the KMixer adds 30mS
of latency to audio playback which most applications are stuck with, as they have no method to
bypass the KMixer. Windows Vista introduced a new model named WaveRT. This has picked up
on several methods used in the Linux ALSA architecture such as mapping the hardware buffer
and control registers into user space so that the driver never touches the audio data. This
eliminates using IRPs (I/O Request Packets) and associated kernel-user transitions. WaveRT
supports both push and pull models of data transfer.

9 The Win32 Driver Model

VIII. Software Reliability

While it is exceptionally rare to find a piece of software that simply does not work it is probably
true to say it is also exceptionally rare to find a piece of software that never fails. In the ever-
changing world of computer software we are accustomed to a certain amount of downtime due to
software misbehaving. With audio software there tends to be less certainty of stability than with
software for other purposes. This is largely because of the real-time scheduling required.
Audio and Music software has remained highly competitive, and is generally provided by small
software developer companies. Exceptions are Logic Audio, which is now a product of Apple
Computers, and Pro Tools from Digidesign, which is a part of the Avid Company. It is interesting
to note that Digidesign maintain a stable software product by specifying exactly (in some detail)
what hardware and Operating System are required for compatibility with any given version of Pro
Tools. Problems do still arise; an extreme case was Pro Tools LE, which was not qualified for use
with the Leopard OS until mid 2008. This resulted in a wait of over six months for many users
who had upgraded to a new Macintosh which came with Leopard preinstalled. In most cases
incompatibilities can be fixed within a shorter timeframe.
Historically, music and audio software has been relatively immature when released to the public
and while there has been some improvement in this area this remains a non-trivial issue for any
professional user. In comparison a product such as Microsoft Office is virtually bug-free, and
furthermore patches are rare. The rapid evolution of OS X has exacerbated the problem, with
some audio developers having to play catchup because newer versions of the OS are released
frequently. In my experience, each major release of OS X has had some issues in handling audio.
As the minor releases get up to about 5 (eg 10.4.5) things seem to settle down and maximum
performance is obtained with few reliability problems. As further minor releases are added there is
sometimes an increase in stability issues. This was the case in Tiger but 10.4.11 is quite reliable
now and this is probably due to the audio developers being able to eventually fix all the bugs,
since Tiger is no longer a moving target. Leopard has been particularly troublesome to some audio
users. In the case of 10.5.2 several audio software developers issued statements advising against
using it altogether (Kirn, 2008).
Several commentators (Sellers, 2004, Mertin, 2004) have noted that audio plugins in general have
a high degree of incompatibility, either with the host application or the OS.

IX. Combinations

Due to the variety and complexity of sound operations that we now expect to perform on a
computer, there are some situations where a combination of two or more of the sound utilities are
needed to achieve the desired result.
A related aspect is how well the sound utilities live together on the same computer. Most users will
rarely need to use several audio streaming utilities simultaneously.
Several sound utilities will, however, most likely be installed on one machine and used at various times. It is
important that the software will flush completely out of the machine's RAM when quit and that
any processes that automatically load do not interfere with other applications.
During the testing phase I ran the following utilities on a single computer: JACK,
Soundflower/Soundflowerbed, SoundSource, Sound Menu, WireTap Anywhere, PTHVolume,
Rewire, and AudioHijack. Some of these are very compatible, but care must be taken when using
other audio utilities with Soundflower, Rewire and JACK.
A problem arose while I was testing on an Intel iMac (OS 10.4.5) where all the audio hung. This
was a situation where JACK, the Digidesign CoreAudio driver, QuickTime, Logic Express, and
Soundflower were all activated. For some reason QuickTime failed to appear in the JACK
connections manager and JACK hung. CPU usage was at a constant 51% and almost all of this
was caused by Soundflower. Force-quitting Soundflower fixed this but JACK, Logic Express, and
QuickTime Player refused to be force quit as CoreAudio had hung. The computer needed a
hardware restart before work could recommence.

Fig. 55 CPU usage during the hang, and after Soundflower was force quit

Fig. 56 The Force Quit window, which is invoked using the keystrokes Option-Apple-Escape
(Apple, 2008)

Fig. 57 Activity Monitor (Apple, 2008) shows CoreAudio has hung. Jackdmp was still active
with 3 threads but refused to force-quit.

A basic setup of bringing 2 channels of audio into Pro Tools (v7.4) from Reason (v4) using the
Rewire plugin was also connected successfully. An incompatibility with MIDI was encountered
while testing Rewire and is detailed in Appendix IX.
The following tables indicate if any two of the trialled applications were basically compatible
when tested together. It should be noted that comprehensive compatibility testing of all the
practical combinations would take an enormous amount of time and was not possible in the
timeframe allotted for this research.

                             JACK  Rewire  Audio   Line  Sound  Sound   PTH  WireTap   Digidesign  Sound
                                           Hijack  In    Menu   Source  Vol  Anywhere  CoreAudio   flower
JACK                         -     Y       Y       Y     Y      Y       Y    Y         Y           Y
Rewire                             -       Y       Y     Y      Y       Y    Y         N           Y
Audio Hijack                               -       Y     Y      Y       Y    Y         Y           Y
Line In                                            -     Y      Y       Y    Y         Y           Y
Sound Menu                                               -      Y       Y    Y         Y           Y
SoundSource                                                     -       Y    Y         Y           Y
PTHVolume                                                               -    Y         Y           Y
WireTap Anywhere                                                             -         NT          Y
Digidesign CoreAudio driver                                                            -           Y
Soundflower                                                                                        -

Table 12 Compatibility chart 1. Does the software work while other software is active?
Long term testing was not implemented on all programmes, so this is only a guide; your mileage
may vary. NA = not applicable, NT = not tested.

Fig. 58 (left image) Soundflower (Cycling74, 2008) is able to sense the Digidesign CoreAudio
driver, and JACK on OS 10.4.11 but does not see the Digidesign CoreAudio driver on OS 10.5.4;
(right image) the WireTap Anywhere (Ambrosia, 2008) add source menu. It can see almost any
other audio application.

Acknowledges ->
                             JACK  Rewire  Audio   Line  Sound  Sound   PTH  WireTap   Digidesign  Sound
                                           Hijack  In    Menu   Source  Vol  Anywhere  CoreAudio   flower
JACK                         -     N       N       N     NA     NA      NA   N         N           N
Rewire                       N     -       N       N     NA     NA      NA   N         NT          N
Audio Hijack                 Y     N       -             NA     NA      NA   N         N           Y
Line In                      NA    N       N       -     NA     NA      NA   N         NT          Y
Sound Menu                   N     N       N       N     -      NA      NA   N         N           Y
SoundSource                  N     N       N       N     NA     -       NA   N         N           Y
PTHVolume                    Y     N       N       N     NA     NA      -    N         N           Y
WireTap Anywhere             Y     N       N       NA    NA     NA      NA   -         N           Y
Digidesign CoreAudio driver  N     N       N       N     N      N       N    N         -           N
Soundflower                  Y     N       N       N     N      N       N    N         Y           -

Table 13 Compatibility chart 2. Does the software acknowledge other software?
NA = not applicable, NT = not tested.

Fig. 59 Pro Tools/Rewire/Reason are outputting audio to an Mbox. At the same time,
QuickTime Player is streaming audio through Soundflower to be recorded into Logic Pro
(MacBook)

Fig. 60 Here, Pro Tools/Rewire/Reason are outputting audio to an Mbox while Audio Hijack
Pro captures audio from the QuickTime Player (MacBook)

Pro Tools provides a special case as it has its own audio API; the Digidesign Audio Engine
(DAE). This is a unique audio routing solution involving both kernel extensions and a user layer
Framework. I was interested to see if it was possible to run a Pro Tools session simultaneously
with audio streaming through JACK and discovered that this was quite possible on an Intel iMac,
with no glitches or hangs.

Fig. 61 Here, audio is playing from Pro Tools to an Mbox. At the same time audio is routed
through JACK from the QuickTime Player and recorded in Logic Pro.

Fig. 62 Activity Monitor shows that CPU usage is moderate (JACK 20%, Logic 21%, Pro Tools
15%, and QuickTime 12%).

It was also possible to use a Pro Tools session simultaneously with Soundflower. As the DAE
runs completely separately from CoreAudio it simply ignores either JACK or Soundflower.

Fig. 63 Playing audio from Pro Tools to an Mbox. At the same time audio is routed through
Soundflower from the QuickTime Player and recorded in Logic Pro.

Hugo Schotman has described a setup for podcasting that uses both Audio Hijack Pro and
Soundflower. Audio Hijack Pro is used to record and Soundflower routes the sound in and out of
the setup. He identifies that Soundflowerbed has given some problems when used to monitor
Soundflower. A partial solution has been to reset the sample rate to 44.1kHz as it defaults to
48kHz. A workaround is to use the Auxiliary Device Output instead of Soundflowerbed to
monitor the mix, or to have an additional Audio Hijack session to send audio from Soundflower to
the headphones. Schotman notes that the disadvantage to these methods is increased latency
(Schotman, 2005).

Fig. 63 Podcasting setup using Soundflower and Audio Hijack Pro (Schotman H., 2005)

X. A Rewire MIDI Problem

Usually the Rewire MIDI setup is straightforward as MIDI data comes in from an external
keyboard and drives both Reason and a Pro Tools MIDI track. On this occasion I had no external
keyboard and used a small virtual keyboard application called midikeys
(http://www.manyetas.com/). The setup was to generate a simple MIDI sequence in Logic Pro
(v8) and send that to Pro Tools and Reason via midikeys, using two IAC busses, as follows:

Logic (sequence) -> IAC buss 1 -> midikeys -> IAC buss 2 -> Pro Tools -> Rewire MIDI buss -> Reason

Fig. 64 The inter-application MIDI setup

When the Rewire plugin was activated Reason failed to load, its CPU usage went to 90%, and it
could not be force quit. Furthermore Reason or Rewire was holding Pro Tools and not allowing it
to be quit.

Fig. 65 The Pro Tools error message (Digidesign, 2008)

The problem was fixed by first opening Reason in stand alone mode and setting up the MIDI input
by disabling the USB MIDI port driver and enabling the IAC buss after the IAC driver was
enabled in Audio MIDI Setup. Once this was done the normal sequence of starting Reason
automatically by inserting the Rewire plugin into Pro Tools allowed normal operation.

Fig. 66 The Reason Preferences window where the MIDI input device is set (Propellerhead,
2008)

XI. Software Resources

Sound Menu v1.5.1
Mort D'ivresse
Aspirine Software Products
http://www.aspirine.li

Jack v0.77, 0.78
Paul Davis, Stephane Letz, Johnny Petrantoni, Dan Nigrin
http://www.jackaudio.org
http://www.jackosx.com
http://sourceforge.net/projects/jackosx

QJackCtl v0.3.2
Rui Nuno Capela
http://qjackctl.sourceforge.net

Soundflower v1.2.1, 1.3, 1.3.1, 1.4
Cycling '74
30 Clementina Street
San Francisco, CA 94103
USA
http://www.cycling74.com
http://code.google.com/p/soundflower/

Rewire
Propellerhead Software
Rosenlundsgatan 29c
11863 Stockholm
Sweden
http://www.propellerheads.se/

Pro Tools LE 7.4
Digidesign
Avid Technology, Inc.
Avid Technology Park
One Park West
Tewksbury, MA 01876
U.S.A.
http://www.digidesign.com/

VST Systemlink
Steinberg Media Technologies GmbH
Neuer Hoeltigbaum 22-32
22143 Hamburg
Germany
http://www.steinberg.net/en/home.html

WireTap Studio v1.0.4, WireTap Anywhere v1.0.1
Ambrosia Software
PO Box 23140
Rochester, NY 14692
USA
http://www.AmbrosiaSW.com

Audio Hijack Pro v2.8.1, Detour v1.5.5, Line In v2.0.3, SoundSource v2.0
Rogue Amoeba
22 Kidder Ave. #3
Somerville, MA 02144
USA
www.rogueamoeba.com

Apple OS X 10.4.11 (Tiger), Apple OS X 10.5.3, 10.5.4, 10.5.5 (Leopard), including System
Preferences/Sound v3.0, Audio MIDI Setup v2.2.2, CoreAudio v3.1.0, Activity Monitor v10.5
Apple QuickTime v7.4.5, 7.5, Apple iTunes v7.6
Apple Logic Pro 8
Apple
1 Infinite Loop , Cupertino, CA 95014
USA
http://www.apple.com/

PTHVolume v2.2.0
PTH Consulting
Flower Mound,
Texas,
USA
http://pth.com

XII. Glossary

ADC
Analogue to Digital Converter

AGGREGATE DEVICE
A feature of the AMS that allows for inputs and outputs on multiple hardware devices to be
addressed as one virtual device.

AMS (AUDIO MIDI SETUP)
The OS X version of MIDI Manager or OMS, the AMS allows the user to select audio devices,
and configure a virtual MIDI patchbay. It provides multiple configurations, device naming,
channel filtering. In version 10.3 (Panther) the IAC has been included with unlimited busses.
There is support for USB, firewire, PCMCIA, and PCI. OMS was developed for Opcode by Doug
Wyatt, who moved to Apple to develop AMS.

API
Application Programming Interface
A high level set of computer instructions provided as part of the operating system for the purpose
of providing easy and standard procedures with which an application can call the OS.
There are five APIs available as part of OS X; namely, POSIX, Cocoa, Carbon, Java, and
Toolbox. Audio routing software such as JACK or Rewire can also be considered APIs.
Linux has several audio APIs, the most common being OSS, ALSA, and LADSPA.

ASIO
Audio Stream Input Output
ASIO is a low latency, sample accurate audio API developed by Steinberg software. It provides
callbacks and double buffering to achieve this. On OS X it has been largely superseded by
CoreAudio.

AU
Audio Unit
The native format for OS X audio plugins.

AUGRAPH
The AUGraph is a high-level representation of a set of AudioUnits, along with the connections
between them. It provides for realtime routing changes, and maintaining representation even when
AudioUnits are not instantiated.

ASYNCHRONOUS
A system of data communication where no clocking is present. For MIDI, start and stop bits must
be added to the data to indicate what the transmission state is to the receive device register.
Asynchronous Transfer Mode (ATM) is an example of a sophisticated method of streaming audio
and video on the Internet. Instead of data packets it uses cells with a standard data length of 48
bytes to avoid jitter and delay.

BUFFER MEMORY
A relatively small amount of RAM dedicated to compensate for different speeds of input and
output data. Usually associated with some hardware device (eg hard disk, video card). Buffer
under-runs occur if output demand exceeds data supply; buffer over-runs are where output data
cannot be used fast enough. Inter-application audio streaming also requires buffer memory (see
also Latency).

CALLBACK
A type of computer instruction, which allows a low level process to call a function defined in a
higher level. The advantage of this method of programme execution for audio routing is that it
allows relative isolation between the driver and the audio data.

CORE MIDI
MIDI implementation in OS X. Features include: applications can share multi-port MIDI
interfaces, inter-application bussing, USB MIDI class specification compliance, and low latency
(< 1mS). Jitter is designed to be less than 200µS.

CPU
Central Processing Unit
The microprocessor that handles overall control of the computer system and is responsible for
most of the data processing.

DAC
Digital to Analogue Converter

DIGIDESIGN COREAUDIO DRIVER
The Digidesign CoreAudio Driver is a single-client, multichannel sound driver that allows
CoreAudio-compatible applications to record and play back through Digidesign hardware. Full-
duplex recording and playback of 24-bit audio is supported at sample rates up to 96 kHz. The
CoreAudio Driver provides up to 18 channels of I/O depending on the Pro Tools System. Buffer
sizes can be set from 128 to 2048 samples.

DUPLEX
Half duplex means that data can be transferred only in one direction at a time. Full duplex means
that simultaneous bi-directional data communication can occur.

FIFO (Named Pipe)
A method of inter-process communication. In computing a Pipeline is a method for connecting the
output of one process into the input of another. Named Pipes are created and deleted outside of the
attached process. FIFO stands for First In, First Out and refers to the property that the order of
bytes going in is the same coming out.

FRAME
A time-coincident set of samples of the various channels in the audio stream. For PCM audio 1
Frame = 1 Packet.

FRAMEWORK
A type of software bundle that packages a dynamic shared library (executable code) with the
resources that the library requires.

GLITCH
A transient interruption to the audio signal.

GUI
Graphical User Interface

HAL
Hardware Abstraction Layer
A high level software layer between the hardware and application so that it can communicate with
the hardware in a consistent way, and without needing to address the specifics of the hardware.



JACK
Jack Audio Connection Kit

JACKDMP
The JACK server.

KERNEL
The core or lowest level of the operating system.

LATENCY
The delay between the time of input stimulus and observed output. In a computer handling audio
this will generally be the time difference between when audio enters the system and when it
leaves the system. As audio must be streamed in and/or out of the computer in real time without
glitching, a buffer memory is required. The main factor affecting latency is therefore buffer size.
Other considerations are the ADC and DAC conversion delays, and CPU efficiency (i.e. how many
clock cycles are needed to execute operations).
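
The dependence of latency on buffer size can be sketched with a short calculation: the time taken to fill (or drain) one buffer is the buffer length in frames divided by the sample rate. The Python below is a minimal illustration under that assumption only. The function name is hypothetical, and the result covers a single buffer; it ignores converter delays and any additional buffers in the signal path.

```python
def buffer_latency_ms(buffer_frames, sample_rate_hz):
    """Time needed to fill or drain one buffer, in milliseconds."""
    return 1000.0 * buffer_frames / sample_rate_hz


if __name__ == "__main__":
    # Example buffer sizes at a 44.1 kHz sample rate.
    for frames in (128, 512, 2048):
        print(frames, "frames:", round(buffer_latency_ms(frames, 44100), 2), "ms")
```

Halving the buffer size halves this contribution to latency, but leaves less time for the CPU to deliver each block of audio.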

MIDI
Musical Instrument Digital Interface
A simplex (unidirectional) asynchronous data communications protocol running at 31.25 kbit/s.
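
The data rate sets a lower limit on how quickly a message can be sent over a MIDI cable. The sketch below assumes the serial framing defined in the MIDI 1.0 specification (one start bit, eight data bits and one stop bit per byte), which is not stated in this glossary; the names used are hypothetical.

```python
MIDI_BITS_PER_SECOND = 31250   # 31.25 kbit/s
BITS_PER_BYTE_ON_WIRE = 10     # assumed framing: 1 start + 8 data + 1 stop bit


def midi_transfer_time_us(num_bytes):
    """Time to transmit num_bytes over a MIDI cable, in microseconds."""
    return num_bytes * BITS_PER_BYTE_ON_WIRE * 1_000_000 / MIDI_BITS_PER_SECOND


if __name__ == "__main__":
    # A Note On message is three bytes: status, note number, velocity.
    print(midi_transfer_time_us(3), "microseconds")
```

Under these assumptions a single byte takes 320 µs, so a three-byte Note On occupies the cable for just under 1 ms.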

MIDI MANAGER
Apple's MIDI Manager offered a high level interface to the Mac OS to correctly support the
timing accuracy required by MIDI hardware and software under MultiFinder. It was for doing
Inter-Application Communication and for allowing multiple applications to address the serial port.
MIDI Manager did not come with the System - it was available to developers or as licensed
software with MIDI application packages.

MULTI-THREADING
Where a CPU appears to execute more than one series of instructions (threads) simultaneously
by switching between them (a procedure called scheduling). The scheduler can set thread priorities.

MULTI-PROCESSING
Sharing the CPU workload between two or more processors. The Intel Core 2 Duo technology
effectively combines two processors on one chip.

OMS
OMS (Open MIDI System) is similar to MIDI Manager in that it extends the Mac OS for MIDI
applications. It has some features not found in MIDI Manager such as SMPTE synchronization
(SMPTE synchronization is the job of MIDI Time Code; it does not require OMS or MIDI
Manager). OMS was essential to many pre OS X Macs for achieving full MIDI functionality.
OMS allows an application to address a large number of discrete MIDI cables through one serial
port, and also allows real-time IAC (Inter-Application Communication) between applications, like
a sequencer and a softsynth (maximum of 4 busses). A feature is the OMS patchnames library.
Development ceased at version 2.3.8. It was supplied free to all interested MIDI/Music
developers.

OPEN SOURCE
Software where the Source code is provided freely to users (usually along with the compiled
application). Some limitations of use and distribution of the software may still apply but will
usually allow anyone to use, extend, fix, or modify the software.



OPERATING SYSTEM (OS)
A software interface between the hardware, the user interface, and the application software.

PLUGIN
An application which is designed only to work from within its host application. An audio plugin
adds some extra sound processing or generating functionality and allows audio (and/or MIDI data)
to automatically stream between the plugin and host applications. Parameters are controlled using
the particular plugin's GUI.

PPC
PowerPC (Performance Optimization With Enhanced RISC - Performance Computing)
A RISC architecture CPU used in Macintosh computers (PowerMac series and later G series).

RAM
Random Access Memory
The amount of RAM is an important determiner of computer performance. The OS X POSIX API
does not support locking pages into real memory. Page outs can cause audio dropouts, so running
unnecessary applications should be avoided when running audio communication software.

Fig. 67 The System Memory tab in Activity Monitor (Apple, 2008). The high ratio of page ins to
page outs indicates a good surplus of RAM.

SYNCHRONOUS
Where multiple data streams are stepped in time, or clocked. For digital audio to be truly
synchronous, each frame (i.e. sample word) of the multiple audio streams must be kept in step.

UNIX
An operating system started at Bell Laboratories in 1969. OS X uses the BSD version of UNIX.

XRUN
Either a buffer under-run or over-run. In the first case an application or the CPU is not fast enough
delivering audio data to the buffer. In the second case the output from the buffer cannot be
processed fast enough by an application or the CPU. XRUNs will usually result in audible pops.
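
The two cases can be illustrated with a toy simulation of a buffer that is written to and read from once per processing cycle. This is a simplified sketch, not a model of CoreAudio or JACK: the function name, the half-full starting level and the fixed per-cycle frame counts are all assumptions made for the example.

```python
def count_xruns(capacity, produced_per_tick, consumed_per_tick, ticks):
    """Return (underruns, overruns) for a buffer starting half full."""
    fill = capacity // 2
    underruns = 0
    overruns = 0
    for _ in range(ticks):
        # The producer writes its block of frames into the buffer.
        fill += produced_per_tick
        if fill > capacity:
            overruns += 1      # buffer full: the excess frames are lost
            fill = capacity
        # The consumer reads its block of frames from the buffer.
        fill -= consumed_per_tick
        if fill < 0:
            underruns += 1     # buffer empty: the consumer has nothing to play
            fill = 0
    return underruns, overruns


if __name__ == "__main__":
    print(count_xruns(1024, 128, 128, 100))  # matched rates
    print(count_xruns(1024, 64, 128, 20))    # producer too slow
    print(count_xruns(1024, 128, 64, 20))    # consumer too slow
```

With matched rates the fill level never moves and no XRUNs occur. When either side falls behind, the buffer only postpones the first XRUN; a larger buffer delays it further at the cost of added latency.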
