
WHITE PAPER

Validation of iMotions’ Emotion Evaluation System embedded in
Attention Tool® 3.0
Jakob de Lemos, Golam Reza Sadeghnia,
Íris Ólafsdóttir, Ole Jensen
Version 2 - Apr 2010

Keywords: emotions, emotional response, emotion technology, non-invasive physiological
measurements, visual attention, arousal, emotional activation, decision making.

Abstract:
This paper describes how Attention Tool® can be used to measure human emotions and which
statistical outputs are provided in the tool for each tested visual stimulus. The method in Attention
Tool for measuring the emotional strength, also known as physiological arousal, is based on pupil
size variation, eye blink pattern and gaze behavior. Furthermore, the method is evaluated against
galvanic skin response (GSR) recordings, which are a well-known and widely used method for
measuring arousal. The comparison of these two methods shows that Attention Tool is just as good
an evaluator of physiological arousal as the currently used GSR.

Introduction
Emotions are a multifaceted and complex phenomenon with many different conceptualizations and
theories (Scherer, 2000). One group of theories considers emotions as biologically determined
responses that were attained through evolutionary challenges (Cosmides & Tooby, 2000). Other
theories consider emotions as based more on learning and cognitive evaluation (Scherer et al,
2001). However, despite theoretical differences, most emotion researchers agree that emotions are
manifested and can be assessed in relation to a subjective (experiential), a physiological (bodily)
and a behavioral (acting) dimension (Lang, 1988). Research indicates that emotions play an
important role in adjusting advantageously to the environment, and are critical for decision making,
problem solving and rational behavior in everyday life (Damasio, 1994).

Emotions affect decision making and behavior indirectly. A strong emotional response to an
advertisement or a product design can be the deciding factor at the moment of choosing
one product over another. People are often not conscious of their emotional response;
it is therefore more accessible through psychophysiological measurements than through self-report
or questionnaires. Self-report and questionnaires also have an innate tendency to make
respondents reflect on the answer instead of giving an immediate response. Cognition (thoughts
and their associations) is unavoidable when using self-report and questionnaires, and will often give
misleading results if the intention is to measure the immediate emotional response. Attention Tool
can access the physiological response connected to an emotional response quickly and efficiently
through eye movements, eye blink pattern and pupil size variation. The result is a good indication
of the emotional effect that an advertisement or other visual stimuli may have on the respondent on
a subconscious level.

© iMotions - Emotion Technology A/S . Denmark, India, USA . info@imotionsglobal.com . www.imotionsglobal.com

CONFIDENTIAL Redistribution is not permitted without written permission from iMotions. 1/20

Attention Tool is the first and currently the only non-intrusive tool on the market that can make
psychophysiological measurements for emotional response evaluation. There exist intrusive tools
such as functional magnetic resonance imaging (fMRI), electroencephalogram (EEG), positron
emission tomography (PET) and galvanic skin response (GSR) that can measure different aspects
of human emotions. GSR is a very good indicator of the strength of an emotion, broadly known as
physiological arousal (Emotional Activation in Attention Tool) and has been used in lie detector
apparatus (polygraph) for 100 years. The problem with GSR is that it is intrusive (it must be
connected to the body of the respondent). In this paper, Attention Tool is compared to GSR
measurements to validate the arousal estimation system embedded in the tool.

Attention Tool provides a range of statistical information on tested stimuli. For a group of no
fewer than ten respondents, the average Emotional Activation estimate for the group is provided
along with the corresponding confidence interval and data quality. Furthermore, Attention Tool 3.0
provides a statistical hypothesis test to investigate whether the difference in the average arousal
estimate for two different stimuli is significant.

A study was performed in which 66 stimuli were shown to 50 males and 50 females while eye
tracking data and GSR data were recorded simultaneously. Attention Tool and GSR were used to
classify stimuli as belonging to one of two groups: high arousal or low arousal stimuli. The
classification results from these two methods were consistent. For females, 20 out of 24 stimuli
were classified into the correct category, and for males 16 out of 17. The highest Pearson’s
correlation coefficient observed was ρ = 0.84. According to this study, Attention Tool 3.0 can
be considered to be at least as reliable an arousal evaluator as the best of the GSR measurement
systems that exist on the market today.
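
As a quick arithmetic check, the agreement counts reported above can be converted into percentages. This is only a sketch: the function name is illustrative and not part of Attention Tool, and only the counts 20/24 and 16/17 are taken from the study.

```python
def agreement_rate(n_agree: int, n_total: int) -> float:
    """Fraction of stimuli that both methods place in the same arousal class."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_agree <= n_total:
        raise ValueError("n_agree must lie between 0 and n_total")
    return n_agree / n_total


# Counts reported in the study summary.
females = agreement_rate(20, 24)
males = agreement_rate(16, 17)

print(f"females: {females:.1%}, males: {males:.1%}")
```

Rounded to whole percentages, these are the 83% and 94% figures quoted later in the paper.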

Table of contents
Measuring emotions
  What is an emotion?
  How can emotions be measured?
  How do emotions affect decision making?
Attention Tool approach to emotional response evaluation
  Statistical output
Validation of emotion evaluation method in Attention Tool
  Study design and performance
    Participants
    Stimuli selection
    Procedure
    Equipment setup
    Eye tracker data processing
    Signal processing of galvanic skin response data
  EPOC study results: Validation of Attention Tool
    Precision of galvanic skin response and eye tracking
    A brief note on Arousal calculations
    Confusion matrix accuracy of low and high arousal classification
    Linear regression of measurements
Conclusion
References

Measuring emotions
Before describing how Attention Tool can access emotional response, it is relevant to answer a few
questions: what is an emotion, how can it be measured, and how does it affect decision making?
A brief discussion of these issues follows.

What is an emotion?
There is a general acceptance that emotions are elicited by a specific situation, person or object
(real or imagined). An emotion includes changes in three different reactive systems:

• there is an experience of emotion, often referred to as feelings (Damasio, 1994)

• emotions are accompanied by expressive display, e.g. postures, gestures, facial and vocal
expressions (Ekman, 1971; Scherer, 1986)

• emotions are accompanied by bodily responses that comprise changes in the somatic and
autonomic nervous system, as well as in the endocrine and immune system. These, in
turn, modify specific psychophysiological responses such as reflex, cardiovascular,
electrodermal, gastrointestinal or pupillary activity (Cacioppo, Tassinary & Berntson,
2000)

Many researchers agree that emotions can be assessed and measured in relation to a limited
number of dimensions (Christie & Friedman, 2004). One of the most relevant is the concept of
arousal, which is the intensity connected to the elicited emotion (Lang, 1995). Another dimension
often used is valence, which groups emotions along a scale from pleasant (positive) to unpleasant
(negative). Thus, from a dimensional perspective, emotions are often considered as regions
in a coordinate space spanned by valence (pleasant or unpleasant emotion) and arousal (intensity)
(Lang et al, 1993).

How can emotions be measured?


There exist a few methods for accessing emotional response by monitoring bodily reactions.
These bodily reactions are reflected by skin conductance and pupillary response, among other
peripheral markers. These changes are to a high degree controlled by the amygdala in the limbic
system. Following is a list of some of the most widely used methods for measuring bodily reactions
following the experience of an emotion.

fMRI is a neuroimaging technique that measures the hemodynamic response related to neural
activity in the brain, as a method of observing which areas of the brain are active at any given time
(functional imaging). Blood releases oxygen to active neurons at a greater rate than to inactive
neurons. The resulting difference between oxygenated and deoxygenated blood leads to variations in
magnetisation, which can be detected using an MRI scanner. The test subject lies in a tube, which
makes it virtually impossible to imitate a natural experimental setup. Additionally, MRI equipment
is expensive and requires highly trained staff.

EEG is the measurement of electrical activity produced by the brain and is recorded from
electrodes placed on the scalp, so the method is highly intrusive. Furthermore, the signal is always
disrupted by other electrical signal sources in the body, such as muscular activity. EEG can be used
to measure different aspects of emotions but the equipment is quite expensive and requires highly
trained staff.

GSR is related to the electrodermal system. It is sensitive to rapid changes in the hydration of
the skin and is typically recorded from the surface of the fingers (Hugdahl, 1995). The response is
secondary to a hormonal change induced by the sympathetic division of the autonomic nervous system.
There is a relationship between sympathetic activity and the intensity of an emotion, although the
response is not known to identify the specific emotion being elicited. GSR equipment is quite
simple to use and has a low cost. The disadvantage is that GSR is intrusive, as electrodes must
be attached to the fingers. The GSR signal is also highly sensitive to body movements, making it
quite difficult to use in a natural experimental setup. The respondent is required not to move any
part of the body or breathe deeply during the recording.

PET is an imaging technique that uses small amounts of radioactive substances injected into the
blood stream, which can be traced by the scanner as blood flow in an area increases with
brain activity. The technique results in precise functional images, but both equipment and
maintenance are very costly. PET requires injection of a tracer and is therefore highly intrusive as
well as invasive. PET is also considered to be unhealthy if a subject is repeatedly exposed, as the
injected tracer is a radioisotope.

Eye tracking (Attention Tool) is primarily used to measure eye gaze, but can also measure
pupil size and blinks with high precision. These parameters reflect, among other things, the
immediate emotional response and may indicate interest in the subject of attention and/or sexual
stimulation (Hess & Polt, 1960). Eye blink rate and pupil dilation have been found to be indicators
of cognitive processing as well as the level of emotional arousal (Cramon, 1977). The eye trackers
available on the market today are non-intrusive and relatively low cost.

How do emotions affect decision making?


Decision making has until recently been viewed as a purely cognitive process. It was assumed to
consist of a rational cost-benefit analysis in which all possible alternatives were weighed and a
decision was then made on the basis of that analysis. Emotions were considered unhelpful, only a
disturbance. However, the growing interest over recent decades in the role of emotions in decision
making tells us otherwise.

Recent research has shown that emotions are critical for advantageous adjustment to the
environment, as well as for decision making and problem solving (Forgas, 1995; Damasio, 1994).
Background emotions (which are subconscious) are constantly associated with the new situation
being experienced. They continuously work as a subconscious, automatic guide in decision making.
These are called somatic markers (Damasio, 1994).

During decision making the somatic markers from the reward- and punishment-associated
experiences related to it are summed up to produce a net somatic state which automatically
excludes many alternatives. Then the conscious mind is left with a small number of alternatives to
be considered. If this automatic mechanism were not there, the conscious mind would be overloaded
with information. The role of somatic markers (emotions) is twofold: the first is to automatically
reduce the number of behavior alternatives by retrieving the emotional states previously encountered
in similar experiences, and the other is to mark the new situation with a new, modified somatic
marker. The somatic marker consists of the emotions that are associated with the situation or object.

Emotions also affect visual attention and the processing of visual information. Some studies
suggest that the valence of an emotion determines the nature of subsequent information processing.
For example, unpleasant negative emotions are thought to narrow, whereas pleasant positive
emotions broaden the attention focus (Isen, 1999). The human memory system also depends on
emotional information, both at an encoding and a retrieval level. The attention-grabbing nature of
emotionally arousing objects often leads to a stronger focus on these, and emotional arousal has
been shown to enhance declarative memory (Kensinger & Schacter, 2005).

Given these links between emotions, recall and decision making, the intensity of the
immediate emotion, known as arousal or Emotional Activation as interpreted by Attention Tool,
gives much information about a tested stimulus. This is of great value when analyzing the impact
of a visual stimulus on an emotional and subconscious level.

Attention Tool approach to emotional response evaluation


Knowledge about the relation between emotional response and pupillary reaction has been
available since the seventies. This knowledge was not utilised until recently, when reliable eye
tracking hardware capable of measuring pupil size, eye blink and gaze coordinates with high
accuracy became available on the market. Today, there exist dozens of companies providing high
quality eye trackers for a reasonable price, which is accelerating the research in this field
dramatically.

Attention Tool can measure the emotional response of a group to exposed stimuli with good
precision. From the gaze movements, pupil size variation and blink characteristics it is possible to
evaluate the arousal level of a respondent. This method has been successfully compared to GSR
measurements, where the correct high and low arousal classification was observed to be 83% for
females and 94% for males (N = 45 on average). The precision of the Attention Tool emotion
evaluation system and a proof of concept are described in detail in the next chapter. Attention
Tool provides statistical information about the emotional response to the tested stimuli, which is
described below.

Statistical output
The emotional response is evaluated for a stimulus exposed to a group of no fewer than 10 valid
respondents. It is recommended to use at least 30 respondents of the same gender to evaluate the
average arousal level for the stimulus. In Attention Tool 3.0 the arousal measure is referred to as
Emotional Activation, indicating the strength of the emotion elicited by the presented
stimulus. Attention Tool 3.0 provides the following statistics of emotional response:

• Average arousal estimate (Emotional Activation) of the group

• Standard deviation in arousal for the group

• 95% confidence interval for mean arousal

• Affectivity given as number of respondents below (unaffected) and above or equal to
(affected) 5.0 in Emotional Activation

• Statistical hypothesis test to investigate if the difference in the average arousal estimate for
two different stimuli is significant or not on a 95% confidence level

• Data quality as the percentage of valid measurements for each stimulus

The average arousal, or Emotional Activation, tells how strong an emotional reaction the group
experienced on average to the exposed stimulus. The scale is linear and ranges from zero, indicating
no reaction, to ten, indicating a strong emotional response.

The 95% confidence interval of the Emotional Activation is given in the user interface of the
software. Additional statistical information is provided in a tabulated text file output, including the
95% and 90% confidence intervals, the standard deviation of the Emotional Activation and number
of valid respondents. This information can be used to make further statistical tests and compare
results between studies and segments.

Affectivity is calculated from the individual arousal estimate. It indicates how big a fraction of the
respondents experienced a strong emotional response (arousal ≥ 5) when exposed to the stimulus.
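
The group statistics listed above can be sketched as follows. This is an illustration under stated assumptions, not the Attention Tool implementation: the function name and the sample values are hypothetical, the sample standard deviation is used, and the 95% confidence interval uses the normal approximation (z = 1.96), since the paper does not state how the interval is computed.

```python
import math
import statistics


def group_summary(arousal, threshold=5.0, z=1.96):
    """Summarise per-respondent arousal estimates (0 to 10 scale) for one stimulus."""
    n = len(arousal)
    if n < 10:
        # The paper requires no fewer than 10 valid respondents.
        raise ValueError("at least 10 valid respondents are required")
    mean = statistics.fmean(arousal)
    sd = statistics.stdev(arousal)        # sample standard deviation (assumption)
    half_width = z * sd / math.sqrt(n)    # normal-approximation interval (assumption)
    affected = sum(1 for a in arousal if a >= threshold)
    return {
        "n": n,
        "mean": mean,
        "sd": sd,
        "ci95": (mean - half_width, mean + half_width),
        "affected": affected,             # arousal >= 5
        "unaffected": n - affected,       # arousal < 5
    }


# Hypothetical arousal estimates for ten respondents.
sample = [2.0, 3.5, 4.0, 5.0, 5.5, 6.0, 6.5, 7.0, 4.5, 6.0]
summary = group_summary(sample)
print(summary)
```

With only ten respondents a t-based interval would be wider than the normal approximation used here, which is one reason the recommendation of at least 30 respondents matters.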

When comparing two stimuli, it is interesting to know whether the average arousal estimates for the
stimuli are significantly different. One way to get an indication of this is to compare the
confidence intervals and see if they overlap. A more reliable method is to perform a test of
significance to investigate if there is a significant difference between the average arousal estimates
of the stimuli. Attention Tool 3.0 provides a comparison matrix for all tested stimuli telling if the
null hypothesis was retained (indicating no significant difference) or rejected (indicating that there
is a difference). This comparison is performed using a Z-test with a 95% confidence level,
corresponding to a two-tailed significance level of α = 0.05 (0.025 in each tail).
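
A two-tailed Z-test on two group means can be sketched as below. The paper does not say whether the tool uses a paired or an unpaired test, so this sketch assumes the independent two-sample form computed from each group's mean, standard deviation and size; the function name and the example numbers are hypothetical. A 95% confidence level is taken as a two-tailed α of 0.05.

```python
import math
from statistics import NormalDist


def two_sample_z_test(mean_a, sd_a, n_a, mean_b, sd_b, n_b, alpha=0.05):
    """Return (z, p_value, reject_null) for H0: the two mean arousal levels are equal."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    if standard_error == 0:
        raise ValueError("standard error is zero; the test is undefined")
    z = (mean_a - mean_b) / standard_error
    # Two-tailed p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha


# Hypothetical summary statistics for two stimuli, 45 respondents each.
z, p_value, reject = two_sample_z_test(6.1, 1.8, 45, 4.9, 2.0, 45)
print(f"z = {z:.2f}, p = {p_value:.4f}, reject H0: {reject}")
```

Running this for every pair of stimuli and storing the `reject` flags would give a comparison matrix of the kind described above.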

Data quality is calculated from the amount of valid data for each stimulus. If the respondent looks
away from the screen, blinks a lot or for any other reason causes poor data collection, he or she is
defined as an outlier and excluded from the data set.

Validation of emotion evaluation method in Attention Tool


The arousal evaluation in Attention Tool has been compared to GSR for verification. This chapter
describes a study, referred to as the EPOC (Extended Proof of Concept) study, that iMotions
designed to prove the concept of its emotion evaluation system (iEE).

Study design and performance


The study took place in a laboratory at iMotions’ headquarters in Copenhagen during the years
2006 and 2007. The selection of participants, the stimuli and the study procedure are described
below.

Participants
One hundred persons with normal physiological and psychological health, using no medication and
capable of being measured for GSR response on the second phalanx of the index and middle
fingers of the left hand were selected for testing. Correctly functioning vision was an important
factor, therefore only respondents with no ophthalmologic diseases or congenital anomaly of the
vision or eyes were selected. The group was equally divided into 50 male and 50 female
respondents as previous studies demonstrated important gender differences in relation to emotional
reactions (Lang et al, 1993). All subjects were Danish citizens and fluent in Danish. They ranged
from 18 to 49 years of age with an average age of 33 years in each gender segment.

Stimuli selection
Using the norms provided by Lang et al (2005), 45 pictures from the International Affective Picture
System (IAPS) database and 21 advertisements were selected. The IAPS database consists of a
combination of emotionally charged color photographs designed to elicit pleasant (positive) or
unpleasant (negative) emotions, and neutral pictures that cause less emotional response. All 900
stimuli in the IAPS database have been evaluated on the arousal and valence dimensions using a
rating system called the Self-Assessment Manikin (SAM), through a series of experiments carried
out by Peter J. Lang in 1988. Using SAM, the respondents were asked to rate their emotional
response to the photographs, on the arousal and valence dimensions, immediately after being
presented with the stimulus. The average and standard deviation of the ratings derived in Peter J.
Lang’s study were used to select a subset of the IAPS pictures for use in the iMotions EPOC
study.

The criteria for the selected stimuli are:

• Maximum standard deviation for arousal according to the IAPS database is less than or
equal to 2.0 (Lang, Bradley & Cuthbert, 2005)

• The variance in the average arousal ratings across all stimuli was to be maximized by
selecting photographs of low, medium and high arousal values

• The variance in valence across all stimuli was kept high
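
The selection criteria can be sketched as a filter followed by a spread-out pick. This is a simplified stand-in rather than the procedure actually used: the picture identifiers and ratings are made up (they are not IAPS values), the names are hypothetical, and taking evenly spaced pictures from the arousal-sorted list is just one simple way to cover low, medium and high arousal.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Picture:
    picture_id: str
    arousal_mean: float
    arousal_sd: float
    valence_mean: float


def select_stimuli(pictures: List[Picture], n_select: int,
                   max_arousal_sd: float = 2.0) -> List[Picture]:
    """Keep pictures with arousal SD <= max_arousal_sd, then spread picks over arousal."""
    eligible = sorted(
        (p for p in pictures if p.arousal_sd <= max_arousal_sd),
        key=lambda p: p.arousal_mean,
    )
    if n_select <= 0:
        return []
    if n_select >= len(eligible):
        return eligible
    if n_select == 1:
        return [eligible[len(eligible) // 2]]
    # Evenly spaced indices from the lowest to the highest mean arousal.
    step = (len(eligible) - 1) / (n_select - 1)
    return [eligible[math.floor(i * step + 0.5)] for i in range(n_select)]


# Hypothetical norms, not taken from the IAPS database.
catalogue = [
    Picture("img_a", 1.9, 1.3, 5.0),
    Picture("img_b", 3.2, 1.8, 6.1),
    Picture("img_c", 4.6, 2.3, 4.0),  # excluded: arousal SD above 2.0
    Picture("img_d", 5.1, 2.0, 7.2),
    Picture("img_e", 6.4, 1.9, 2.5),
    Picture("img_f", 7.5, 1.7, 1.9),
]
chosen = select_stimuli(catalogue, n_select=3)
print([p.picture_id for p in chosen])
```

A fuller version would also check the spread of `valence_mean` among the chosen pictures, in line with the third criterion.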

Separate norms of arousal and valence exist for men and women in the IAPS database (Lang et al,
1993). For females the minimum and maximum average SAM arousal ratings are 1.87 (IAPS no.
7175) and 7.77 (IAPS no. 3000). For males the least and most arousing pictures have average SAM
arousal ratings of 1.55 (IAPS no. 7010) and 8.25 (IAPS no. 4210), respectively.

The arousal ratings of the pictures selected for the EPOC study range from 1.97 (IAPS no. 7010) to
7.56 (IAPS no. 6230) for females, and from 1.55 (IAPS no. 7010) to 7.76 (IAPS no. 4800) for males.
Valence ranges from 1.86 (IAPS no. 3010) to 7.86 (IAPS no. 8200) for females, and from 1.80 (IAPS
no. 3120) to 8.21 (IAPS no. 4180) for males. Moreover, the pictures represent many categories, in order to span
across a large spectrum of emotions. The content varies from mutilated bodies, erotic pictures,
social and ethical content to household objects, food, sports and typical print advertisements. The
IAPS SAM reporting method is based on self evaluation and not physiological measurements.
Therefore it is only used to validate our use of the IAPS system.

Procedure
The respondent was welcomed and escorted to the test room. He or she was seated in a comfortable
chair at a distance of approximately 70 cm from the monitor. The room was sound attenuated and
dimly lit, with a fairly consistent temperature. For each testing session, the same female instructor
calibrated the eye measurement equipment and attached electrodes for recording skin conductance.
The respondent’s left arm was placed on a table and two electrodes of type EL507 with isotonic gel
(manufactured by Biopac systems) were attached to the second phalanx of the middle and index
fingers. The respondent was then instructed to sit as still as possible, as body movements may
influence GSR recordings. The respondent was also asked not to look away from the screen, as this
may lead to unusable eye tracking data.

The data collection started with a gaze calibration of the respondent. After a successful gaze
calibration, the respondent was told that a series of pictures would be displayed for an equal
amount of time. Before the first stimulus image, however, a series of five intensity images
(experienced as grey noise) was shown for light calibration purposes. Then, for each stimulus there
was a block consisting of one black interslide of two seconds duration, one greyscale interslide of
fifteen seconds duration, the stimulus slide of six seconds duration, one SAM arousal scale pictogram
slide and finally one SAM valence scale pictogram slide. The SAM slides had no time restriction:
they were designed to wait for a rating from the user before proceeding to the next slide.

The greyscale interslide shown before each stimulus slide was created by scrambling that stimulus
slide. It serves as light adaptation for the respondent, as well as a psychological pause in between
stimuli. The total 17 seconds of interslide duration was varied randomly by a couple of seconds, to
counteract the respondent’s tendency to learn when to expect stimulus onset. The duration
was also designed to allow the skin conductance responses to return to baseline (Bechara et
al., 1996). Each stimulus was presented for six seconds, the same time as in previous related
studies (Bradley et al., 2001; Lang et al., 1993).
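
The per-stimulus block structure can be sketched as a schedule builder. The slide durations and the random presentation order come from the procedure described here; everything else is an assumption made for illustration, in particular the names, the ±2 second jitter range, the uniform distribution, and applying the jitter to the greyscale interslide only. The study itself was run in E-Prime, not with this code.

```python
import random
from typing import List, NamedTuple, Optional

BLACK_S = 2.0      # black interslide duration (from the procedure)
GREY_S = 15.0      # nominal greyscale interslide duration (from the procedure)
STIMULUS_S = 6.0   # stimulus duration (from the procedure)
JITTER_S = 2.0     # assumed size of "a couple of seconds"


class Slide(NamedTuple):
    kind: str
    duration_s: Optional[float]        # None means: wait for the respondent's rating
    stimulus_id: Optional[str] = None


def build_schedule(stimulus_ids, seed=None) -> List[List[Slide]]:
    """Return one block of five slides per stimulus, in random stimulus order."""
    rng = random.Random(seed)
    order = list(stimulus_ids)
    rng.shuffle(order)
    blocks = []
    for sid in order:
        grey = GREY_S + rng.uniform(-JITTER_S, JITTER_S)
        blocks.append([
            Slide("black", BLACK_S),
            Slide("scrambled_grey", grey, sid),   # scrambled version of the stimulus
            Slide("stimulus", STIMULUS_S, sid),
            Slide("sam_arousal", None),
            Slide("sam_valence", None),
        ])
    return blocks


stimulus_ids = [f"stim_{i:02d}" for i in range(66)]
schedule = build_schedule(stimulus_ids, seed=1)
print([(s.kind, s.duration_s) for s in schedule[0]])
```

Passing a fixed `seed` makes a given respondent's order reproducible, which helps when GSR and eye tracking records are aligned afterwards.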

Figure 1 shows the slide order during data collection. The light calibration slides and gaze calibration
are shown once, followed by the interslide, stimulus slide and SAM rating slides for each of the 66
stimuli, as the arrow indicates.

Immediately after the display of each stimulus slide, the respondent had to evaluate the valence and
arousal using the SAM rating system on a PC keyboard. On the screen the SAM arousal and
valence scales were displayed one after the other, illustrated by pictograms (see Figure 1). It was
ensured that every participant was familiar with this system, and the use of the scales was
demonstrated with a few sample pictures. These were intended to illustrate neutral and extreme
ratings in either valence direction.

After the use of the scales had been explained, a short practice session of four pictures was run,
in which the respondent could rate the pictures. After the practice session, the instructor ensured that
the respondent understood the procedure and left the room. The 66 selected IAPS and
advertisement pictures were presented automatically in a random order.

Figure 2 shows the test situation. The respondent is sitting in front of the eye tracker, while her GSR
and eye properties are being measured. Using the keyboard with her right hand she rates the pictures
on the SAM scales immediately after exposure.

When the respondent had completed the study session, the instructor entered the room for a short
debriefing and noted demographic information of the respondent. The session was video recorded
for later analysis and typically lasted between half an hour and 45 minutes.

Equipment setup
The stimuli were presented on a 17 inch Tobii computer monitor with built-in eye tracking sensors
(Tobii model 1750, see footnote 1). Stimulus presentation was carried out by the software package
E-Prime, which is a computer program designed for behavioral experiments. Three sorts of data were
recorded. E-Prime registered the respondent’s SAM ratings for each picture and recorded eye
tracking data through the Tobii eye tracker. The Biopac GSR equipment (see footnote 2), which is a
professional system for physiological recordings, recorded the GSR data. Every time a target
stimulus was presented, a marker signal was sent from E-Prime to the GSR software program (Biopac
AcqKnowledge version 3.8.1) for synchronization. The setup is shown in Figure 3.

Figure 3 illustrates the communication between the hardware used in the EPOC study. GSR
data were recorded on the laptop, eye tracker data were recorded on a stationary
computer and a synchronization signal connected the two machines through the Biopac system.

Eye tracker data processing


Eye tracker data was exported from E-Prime into an Emotion Tool (former version of Attention
Tool) compliant database and loaded into Emotion Tool for emotional response classification. The
classification was performed using iMotions’ emotion evaluation system, iEE 2.0, on which
Attention Tool 3.0 is based. Signal processing of GSR data was a manual process, carried out in
accordance with the guidelines of Kenneth Hugdahl (see Hugdahl, 2001).

1 See http://www.tobii.com
2 The Biopac modules MP100ACE, UIM100C and GSR100C were used.

Signal processing of galvanic skin response data


The GSR signal was filtered to remove small fluctuations which do not belong to the galvanic
skin response frequency spectrum. Figure 4 shows an example of the filtered GSR signal from
stimulus onset time until five seconds after stimulus presentation.

Figure 4 shows an example of a filtered galvanic skin response signal. A is the stimulus onset time,
B is the time instant where the signal is at maximum, C is where the search for a maximum starts and
D where it ends.

The amplitude of the response is defined as the peak of the GSR signal minus the minimum value of
the signal. First, the peak is located in the window from one second to five seconds after stimulus
onset. Then, the minimum value is located in the interval from stimulus onset until the time
instant of the peak:

$$\text{amplitude} = \mathrm{GSR}(B) - \min_{A \le t \le B} \mathrm{GSR}(t), \qquad B = \arg\max_{C \le t \le D} \mathrm{GSR}(t)$$

A is the stimulus onset time and B is the time instant where the GSR signal is at its maximum
between instants C and D; C = A + 1 s and D = A + 5 s.
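The amplitude rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the uniformly sampled input and the `sample_rate_hz` parameter are hypothetical and not taken from the study, where the processing was done manually.

```python
from typing import Sequence


def gsr_amplitude(signal: Sequence[float], sample_rate_hz: float, onset_s: float) -> float:
    """Peak-minus-minimum amplitude of a filtered GSR trace for one stimulus.

    Assumes a uniformly sampled signal (hypothetical parameter: sample_rate_hz).
    """
    a = round(onset_s * sample_rate_hz)          # A: stimulus onset
    c = round((onset_s + 1.0) * sample_rate_hz)  # C = A + 1 s
    d = round((onset_s + 5.0) * sample_rate_hz)  # D = A + 5 s
    if a < 0 or d >= len(signal):
        raise ValueError("signal does not cover the window from A to D")
    # B: index of the maximum between C and D (inclusive)
    b = max(range(c, d + 1), key=lambda i: signal[i])
    # Minimum between onset A and the peak B (inclusive)
    minimum = min(signal[a:b + 1])
    return signal[b] - minimum
```

Note that a spike between A and C does not count as the peak, because the search for the maximum only starts at C.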

If there has been a large fluctuation with high amplitudes prior to stimulus onset time, this might
indicate heavy respiration, physical movements by the respondent or perhaps a technical disruption
of the signal, all of which may distort the response measured after stimulus onset time. Such a
response is therefore marked as an outlier and omitted from average response calculations.

The respondent was also video recorded during the data collection for later analysis and outlier
detection. The video recordings were used to observe any obvious disturbance causing an invalid
GSR recording, such as the mentioned body movements or heavy respiration.

EPOC study results: Validation of Attention Tool


The results from the EPOC study are presented in this chapter. First, the precision of the two
types of measurement equipment is compared to give an overview of the limitations of the techniques
and the pros and cons of galvanic skin response and eye tracking. The correlation between the
simultaneously acquired measurements from eye tracking and galvanic skin response is investigated
next. This is done by looking at how the iMotions emotion evaluation system behaves in high and low
arousal regions, computing the confusion matrix accuracy of iEE versus GSR. The whole arousal range
is then investigated by performing a simple linear regression between iEE arousal and GSR arousal.

The comparison proves iMotions’ emotion evaluation system encapsulated in Attention Tool 3.0 to
be a good quantitative arousal estimator.

Precision of galvanic skin response and eye tracking


Before comparing the ability of the two methods in estimating arousal, it is relevant to compare the
precision of data collected by galvanic skin response measurements and eye tracking.

The simultaneously collected skin conductance and eye tracking data from the EPOC study are
normalised, and the coefficient of variation (Cv) of the measurements is compared on a feature
level. The extracted features are the basis for arousal calculations in both measurement techniques.
The coefficient of variation is calculated as

$$C_v = \frac{\sigma}{\mu}$$

where σ is the standard deviation and µ is the average value of the feature.

The coefficient of variation is a normalised, dimensionless measure of dispersion that can be used
to compare the relative standard deviation of datasets from different sources. It is given as a
ratio and indicates whether the individual measurements are in agreement with the average of the
sample or not. It is desirable that the Cv is less than unity, meaning that the standard deviation
of a set of measurements is smaller than the average of the measurements. The higher the Cv, the
less reliable the individual measurements in a sample are with regard to the average of the sample.
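As a small illustration, the coefficient of variation of one feature could be computed as follows. This is a sketch, not the study's implementation: the function name is hypothetical, and the use of the population standard deviation is an assumption, since the paper does not state whether the population or the sample form was used.

```python
from statistics import mean, pstdev
from typing import Sequence


def coefficient_of_variation(values: Sequence[float]) -> float:
    """Cv = sigma / mu for one feature (population standard deviation assumed)."""
    mu = mean(values)
    if mu == 0:
        raise ValueError("Cv is undefined when the average is zero")
    return pstdev(values) / mu


# A Cv below 1 means the standard deviation is smaller than the average.
print(coefficient_of_variation([2, 4, 4, 4, 5, 5, 7, 9]))  # 0.4
```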

Data from the 66 stimuli in the EPOC study are used to calculate the Cv of all fundamental features
for computing arousal from eye tracking and galvanic skin response measurements. The Cv of these
features is plotted in Figure 5 as a function of the normalised average feature values. The
comparison shows that the relative standard deviation of the skin response measurements is on
average four times as high as that of the eye tracking measurements. In fact, the standard
deviation of skin response amplitudes is about twice as high as their average. For the eye tracking
features, the opposite can be observed: Figure 5 shows that the standard deviation of the main
features used for computing arousal from eye tracking is less than half the average value of the
features. The Cv of the eye tracking features is also notably more stable, staying around the same
level throughout the range of average feature values per stimulus, whereas the Cv of the skin
response features varies widely.

Figure 5, coefficient of variation for features from eye tracking and skin conductance. The
features for eye tracking have a much lower dispersion.

Assuming that both types of equipment are capable of measuring physiological arousal triggered by
visual stimuli, and that the given features are the basis for calculating emotional arousal, we can
conclude that the data from the eye tracker are better suited for delivering consistent,
low-dispersion features that can be used for arousal estimation.

A brief note on arousal calculations


For calculating arousal with the galvanic skin response equipment, the natural logarithm of the
GSR amplitudes is used.

The iMotions emotion evaluation system on the other hand is considered to be a black box, where
eye tracking features are fed to the input and the output is an average and standard deviation of iEE
2.0 arousal.

Figure 6, iEE 2.0 is considered to be a black box unit in this whitepaper. The inputs are features
extracted from eye blink, gaze coordinates and pupil diameter. The output is the average arousal and
its standard deviation.

Confusion matrix accuracy of low and high arousal classification


For both iEE and GSR it is quite easy to identify low and high arousal emotional responses. In the
middle region of the responses, however, it is more difficult to tell with confidence whether the
elicited response belongs to the lower or the higher region of the arousal scale. This is
particularly evident in GSR measurements (see Figure 5), where the dispersion of measurements is
especially high in the low to middle region. In general, it is difficult to use GSR measurements
with fair precision due to the overall high dispersion of data.

High variance is an inherent part of physiological measurements that reflect a transition from an
affected to an unaffected state, analogous to a calm or excited bodily reaction. It is therefore
quite difficult to maintain low variance, especially in the transition region. Using a pair of
scales as a metaphor, the transition between the affected and unaffected states in the middle
region can be seen as “tipping the scales”. This is also expected to be the case with eye tracking
measurements, although the overall variance is somewhat lower than for skin response measurements.

The accuracy of the iEE arousal scale when compared to GSR amplitudes is calculated by classifying
the responses of either method into two categories: low arousal and high arousal response. For this
purpose, we have defined a middle region (a grey zone) in which the variance is regarded as too
high for either iEE or GSR. Data from stimuli that were considered to be in this grey zone were
left out, whereas the remaining stimuli were used, separately for the two genders. The definitions
of the two classes of responses are listed in Table 1 below.

           High arousal class                      Low arousal class
GSR        GSR amplitude > average GSR amplitude   GSR amplitude < average GSR amplitude
           of all stimuli + 15%                    of all stimuli - 15%
iEE 2.0    iEE 2.0 Emotional Involvement > 6.0     iEE 2.0 Emotional Involvement < 4.0

Table 1, definition of low arousal and high arousal classes for GSR and iEE data.
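The class definitions in Table 1 can be expressed as two small functions. This is a sketch under an explicit assumption: "average + 15%" is read here as 15% above (and "- 15%" as 15% below) the average GSR amplitude of all stimuli, which the table does not spell out. The function names and the `margin` parameter are hypothetical.

```python
from statistics import mean
from typing import Optional, Sequence


def classify_gsr(amplitude: float, all_amplitudes: Sequence[float],
                 margin: float = 0.15) -> Optional[str]:
    """Return "high", "low", or None when the stimulus falls in the grey zone."""
    average = mean(all_amplitudes)
    if amplitude > average * (1 + margin):   # assumed reading of "average + 15%"
        return "high"
    if amplitude < average * (1 - margin):   # assumed reading of "average - 15%"
        return "low"
    return None


def classify_iee(emotional_involvement: float) -> Optional[str]:
    """Return "high" above 6.0, "low" below 4.0, otherwise None (grey zone)."""
    if emotional_involvement > 6.0:
        return "high"
    if emotional_involvement < 4.0:
        return "low"
    return None
```

A stimulus would enter the comparison only if both functions return a class, which matches the idea of leaving out grey-zone stimuli for either method.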

From the total number of tested stimuli, 24 stimuli fulfilled the criteria for females and 17 stimuli
for males.

The classification resulted in 8 correct classifications as high arousal and 12 correct classifications
as low arousal for females. The two methods disagreed on 4 of the tested stimuli for females.

For males, 5 stimuli were correctly classified as high arousal and 11 as low arousal, leaving
inconsistency between the methods for only one of the stimuli. The results are listed in Table 2.

Females                 GSR high arousal   GSR low arousal
iEE 2.0 high arousal    8                  2
iEE 2.0 low arousal     2                  12

Males                   GSR high arousal   GSR low arousal
iEE 2.0 high arousal    5                  0
iEE 2.0 low arousal     1                  11

Table 2, arousal classification results from GSR and Attention Tool. The accuracy of the classifications
is 83% for females and 94% for males.

This leaves 20 out of 24 correctly classified stimuli for females, or 83%, and 16 out of 17 stimuli,
or 94%, correctly classified for males.
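The accuracy figures follow directly from the confusion matrices in Table 2: the agreeing cells (high/high and low/low) divided by the total number of classified stimuli. A minimal check, with a hypothetical helper name:

```python
def confusion_accuracy(high_high: int, high_low: int, low_high: int, low_low: int) -> float:
    """Share of stimuli on which iEE 2.0 and GSR agree.

    Arguments are the four cells of a 2x2 confusion matrix, read row by row
    (iEE class first, GSR class second).
    """
    total = high_high + high_low + low_high + low_low
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (high_high + low_low) / total


females = confusion_accuracy(8, 2, 2, 12)   # 20 of 24 stimuli agree
males = confusion_accuracy(5, 0, 1, 11)     # 16 of 17 stimuli agree
print(f"{females:.0%} {males:.0%}")         # 83% 94%
```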

Linear regression of measurements


The arousal values attained with galvanic skin response measurements and iEE are compared by
performing a linear regression of the two. Pearson’s correlation coefficient (ρ) is calculated
for two datasets: one with all 66 stimuli from the EPOC study, and one with the low and high
arousal stimuli chosen according to Table 1.
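For readers who want to repeat this kind of comparison on their own data, the sketch below computes Pearson's correlation coefficient and an ordinary least-squares regression line using only the standard library. The function names are hypothetical, and the choice of iEE arousal as x and GSR arousal as y simply follows the axes described for Figure 7; no study data are reproduced here.

```python
from math import sqrt
from typing import Sequence, Tuple


def _centered_sums(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float, float, float]:
    """Means of x and y plus the centered sums of squares and cross-products."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return mx, my, sxy, sxx, syy


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's correlation coefficient between x and y."""
    _, _, sxy, sxx, syy = _centered_sums(x, y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / sqrt(sxx * syy)


def linear_fit(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Least-squares slope and intercept of y regressed on x."""
    mx, my, sxy, sxx, _ = _centered_sums(x, y)
    if sxx == 0:
        raise ValueError("slope is undefined when x is constant")
    slope = sxy / sxx
    return slope, my - slope * mx
```

With per-stimulus averages, `pearson_r(iee_arousal, gsr_arousal)` would give the coefficient and `linear_fit(iee_arousal, gsr_arousal)` the regression line; both list names are placeholders.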

Figure 7 shows GSR arousal as a function of iEE arousal, the corresponding regression line and 95%
prediction interval of males and females. All stimuli in the EPOC study were used for this plot.

The effect of the greater dispersion of individual measurements around the middle range of the
arousal scale is also visible in the regression shown in Figure 7. The correlation coefficient is ρ =
0.39 for females and ρ = 0.38 for males for the whole dataset. For the second dataset, the
correlation coefficient is ρ = 0.63 for females and ρ = 0.84 for males, as listed in Table 3.

Correlation coefficient    ρ Females   ρ Males
All stimuli                0.39        0.38
Low dispersion stimuli     0.63        0.84

Table 3, Pearson’s correlation coefficient for all stimuli in the EPOC study and for a subset of the
stimuli where high or low arousal was observed.

It can be argued that the correlation between the two methods is weak for the whole dataset. This
should be seen in the light of the disadvantages of galvanic skin response measurements and the
effect of the greater dispersion around the centre of the arousal scales. Looking at the
correlation coefficients attained with the low dispersion dataset, it is evident that there is a
strong correlation between GSR and iEE when the arousal level is not around the centre of the
arousal scale. This does not mean that the iEE arousal scale fails around the centre, but that the
two methods, especially GSR, have a greater dispersion in this region, which makes close agreement
in arousal levels between the methods less likely and thus results in a lower correlation
coefficient.

Conclusion
Emotions can be measured through the physiological response of the body and these measurements
are of high value. One of the advantages of this information is the possibility of analysing how
advertisements that have not yet entered the market are going to affect potential consumers. Many
tools on the market today can measure a physiological reaction connected with emotions, but most
of them are expensive and difficult to use, and all of them are intrusive. GSR is one of these
measurement methods, albeit a less costly one, and is considered to be a very good indicator of
arousal in the field of psychophysiology.

The coefficient of variation of the features that Attention Tool uses for arousal estimation was
compared with that of the GSR features used for computing physiological arousal. The result showed
that eye tracking features have a much lower dispersion of the individual measurements in a sample
with regard to the average of the sample. The comparison revealed that eye tracking is capable of
delivering up to four times as consistent measurements of a peripheral physiological feature for
arousal computation as galvanic skin response. Moreover, galvanic skin response is an intrusive
method that needs skilled staff to derive a meaningful result from the data output. The respondent
is also required to have a minimum understanding of, and to comply with, the equipment’s extreme
sensitivity in order to attain useful results. This is due to the fragile nature of the biological
signal that is being recorded from the skin. Eye tracking requires no involvement from the
respondent once the calibration procedure has succeeded, and is 100% non-intrusive.

Classification of the stimuli into high and low arousal response categories showed consistent
results. GSR arousal and iEE 2.0 arousal classified 83% of the stimuli into the same category for
females and 94% of the stimuli for males. Pearson’s correlation coefficient further substantiated
the agreement between the methods. The correlation coefficient was ρ = 0.39 for females and
ρ = 0.38 for males for the whole dataset from the EPOC study. Due to high variance in emotional
response in the middle range of the arousal scales, the correlation increased markedly when stimuli
in this region were omitted from the dataset. For the dataset that was used for the confusion
matrix accuracy calculations, the correlation was ρ = 0.63 for females and ρ = 0.84 for males.

Based on the above, we conclude that Attention Tool 3.0 is just as precise in determining the
intensity of an emotion as the current, broadly used GSR method, thus allowing the iMotions
emotion evaluation system, on which Attention Tool 3.0 is based, to act as the future window to
the human emotion spectrum. In addition, Attention Tool 3.0 is non-intrusive, and its data output
is analyzed and finalized automatically.

References

• Beatty, J. & Lucero-Wagoner, B. (2000): The pupillary system, in Cacioppo, J.,
Tassinary, L.G. & Berntson, G. (Eds.): The Handbook of Psychophysiology, Cambridge
University Press, Hillsdale, New York.

• Biopac: http://www.biopac.com

• Bradley, M. M. & Lang, P. J. (1994): Measuring emotion: SAM and the semantic
differential, Journal of Behavior Therapy and Experimental Psychiatry, 25, 49-59.

• Bradley, M. M., Cuthbert, B. N. & Lang, P. J. (1999): Affect and the startle reflex, in
Dawson, M.E., Schell, A. & Boehmelt, A. (Eds.): Startle modification: Implications for
neuroscience, cognitive science and clinical science, Stanford University Press, Stanford,
242–276.

• Bradley, M.M., Codispoti, M., Cuthbert, B.N. & Lang, P.J. (2001): Emotion and
Motivation I: Defensive and Appetitive Reactions in Picture Processing, Emotion, 1 (3),
276–298

• Cacioppo, J.T. & Gardner, W.L. (1999): Emotion, Annual Review of Psychology, 50, 191-
214

• Calvo, M.G. & Lang, P.J. (2004): Gaze patterns when looking at emotional pictures:
Motivationally biased attention, Motivation and Emotion, 28 (3)

• Christie, I. C. & Friedman, B. H. (2004): Autonomic specificity of discrete emotion and
dimensions of affective space: a multivariate approach, International Journal of
Psychophysiology, 51 (2), 143-153

• Cosmides, L. & Tooby, J. (2000): Evolutionary Psychology and the Emotions, in Lewis,
M. & Haviland-Jones, J.M. (Eds.): Handbook of Emotions, The Guilford Press, New York

• Dichter, G.S., Tomarken, A.J. & Baucom, B.R. (2002): Startle modulation before, during
and after exposure to emotional stimuli, International Journal of Psychophysiology, 43,
191-196

• E-Prime: www.pstnet.com

• Granholm, E. & Steinhauer, S.R. (2004): Pupillometric Measures of Cognitive and
Emotional Processes, International Journal of Psychophysiology, 52, 1–6

• Hugdahl, K. (2001): Psychophysiology: The Mind-body Perspective, Harvard University
Press

• E-mail correspondence with Kenneth Hugdahl (May 2006).

• Lang, P.J. (1988): What are the Data of Emotion?, in Hamilton, V., Bower, G.H. & Frijda,
N.H. (Eds.): Cognitive Perspectives on Emotion and Motivation, Kluwer Academic
Publishers, the Netherlands

• Lang, P.J., Bradley, M.M., & Cuthbert, B.N. (2005): International affective picture system
(IAPS): Affective ratings of pictures and instruction manual, Technical Report A-6,
University of Florida, Gainesville, FL.

• Lang, P.J., Greenwald, M.K., Bradley, M.M. & Hamm, A.O. (1993): Looking at pictures:
Affective, facial, visceral, and behavioural reactions, Psychophysiology, 30, 261-273

• Parrott, W.G. & Hertel, P. (1999): Research methods in cognition and emotion, in
Dalgleish, T. & Power, M.J. (Eds.): Handbook of Cognition and Emotion, John Wiley &
Sons, Chichester

• Ruiz-Padial, L., Sollers, J.J., Vila, J. & Thayer, J.F. (2003): The rhythm of the heart in the
blink of an eye: Emotion-modulated startle magnitude covaries with heart rate variability,
Psychophysiology, 40, 306–313

• Scherer, K.R. (2000): Psychological Models of Emotion, in Borod, J.C. (Ed.): The
Neuropsychology of Emotion, Oxford University Press

• Scherer, K.R., Schorr, A. & Johnstone, T. (Eds.) (2001): Appraisal Processes in Emotion –
Theory, Methods, Research, Oxford University Press

• Stanners, R.F., Coulter, M., Sweet, A.W. & Murphy, P. (1979): The pupillary response as an
indicator of arousal and cognition, Motivation and Emotion, 3 (4), 319-340

• Steinhauer, S. R., Boller, F., Zubin, J. & Pearlman, S. (1983): Pupillary dilation to
emotional visual stimuli revisited, Psychophysiology, 20

• Tobii: www.tobii.com

• Tranel, D. (2000): Electrodermal Activity in Cognitive Neuroscience: Neuroanatomical
and Neuropsychological Correlates, in Lane, R.D. & Nadel, L. (Eds.): Cognitive
Neuroscience of Emotion, Oxford University Press, New York
