Ermanno Pitacco
University of Trieste (Italy)
Michel Denuit
UCL, Louvain-la-Neuve (Belgium)
Steven Haberman
City University, London (UK)
Annamaria Olivieri
University of Parma (Italy)
Great Clarendon Street, Oxford OX2 6DP
Oxford University Press is a department of the University of Oxford.
It furthers the University’s objective of excellence in research, scholarship,
and education by publishing worldwide in
Oxford New York
Auckland Cape Town Dar es Salaam Hong Kong Karachi
Kuala Lumpur Madrid Melbourne Mexico City Nairobi
New Delhi Shanghai Taipei Toronto
With offices in
Argentina Austria Brazil Chile Czech Republic France Greece
Guatemala Hungary Italy Japan Poland Portugal Singapore
South Korea Switzerland Thailand Turkey Ukraine Vietnam
Oxford is a registered trade mark of Oxford University Press
in the UK and in certain other countries
Published in the United States
by Oxford University Press Inc., New York
© Ermanno Pitacco, Michel Denuit, Steven Haberman, and Annamaria Olivieri 2009
The moral rights of the authors have been asserted
Database right Oxford University Press (maker)
First published 2009
All rights reserved. No part of this publication may be reproduced,
stored in a retrieval system, or transmitted, in any form or by any means,
without the prior permission in writing of Oxford University Press,
or as expressly permitted by law, or under terms agreed with the appropriate
reprographics rights organization. Enquiries concerning reproduction
outside the scope of the above should be sent to the Rights Department,
Oxford University Press, at the address above
You must not circulate this book in any other binding or cover
and you must impose the same condition on any acquirer
British Library Cataloguing in Publication Data
Data available
Library of Congress Cataloging in Publication Data
Data available
Typeset by Newgen Imaging Systems (P) Ltd., Chennai, India
Printed in Great Britain
on acid-free paper by
CPI Antony Rowe, Chippenham, Wiltshire
ISBN 978–0–19–954727–2
10 9 8 7 6 5 4 3 2 1
Preface
consider two further topics that are of great importance in the context of
life annuities and mortality forecasts but which are less traditional as far as
actuarial books are concerned. These are mortality at the very old ages (i.e.
the problem of ‘closing’ the life table) and the concept of ‘frailty’ as a tool
to represent heterogeneity in populations due to unobservable risk factors.
Chapter 3 considers mortality trends during the past century. The well-
known background is that average human life span has roughly tripled over
the course of human history. Compared to all of the previous centuries, the
20th century has been characterized by a huge increase in average longevity.
As we demonstrate in several chapters, there is no evidence which shows
that improvements in longevity are tending to slow down. This chapter
aims to illustrate the observed decline in mortality over the 20th century,
on the basis of Belgian mortality statistics, using several of the mortality
indices that have been introduced in Chapters 1 and 2. We also illus-
trate the trends in mortality indices for insurance data from the Belgian
insurance market, which have been provided by the Banking, Finance and
Insurance Commission (in Brussels). We note the key point that emerges
from actuarial history that, in order to protect an insurance company from
mortality improvements, actuaries need to resort to life tables incorporat-
ing a forecast of the future trends of mortality rates (the so-called projected
tables). The building of these projected life tables is the main topic of the
next chapters.
Chapter 4 aims at describing the various methods that have been pro-
posed by actuaries and demographers for projecting mortality. Many of
these have been used in an actuarial context, in particular for pricing and
reserving in relation to life annuity products and pension products and
plans, and in the demographic field, mainly for population projections. First,
the idea of a ‘dynamic’ approach to mortality modelling is introduced. Then,
projection methods are presented and our starting point is the extrapolation
procedures which are still widely used in current actuarial practice. More
complex methods follow, in particular those methods based on mortality
laws, on model tables, and on relations between life tables. The Lee–Carter
method, which has been recently proposed, and some relevant extensions
are briefly introduced (while a more detailed discussion, together with var-
ious examples of its implementation, is presented in Chapters 5 and 6). The
presentation is thematic rather than following a strict chronological order.
In order to obtain an insight into the historical evolution of mortality fore-
casts, the reader can refer to the final section of this chapter, in which some
landmarks in the history of dynamic mortality modelling are identified.
There is a variety of statistical models used for mortality projection, rang-
ing from the basic regression models, in which age and time are viewed
rules which could be implemented within internal models are tested and a
comparison is also developed with the requirement for longevity risk set by
Solvency 2, in its current state of development. With regard to risk trans-
fers, particular attention is devoted to capital market solutions, that is, to
longevity bonds. The possible design of reinsurance arrangements is exam-
ined in connection with the hedging opportunities arising from some of
these capital market solutions. The main issues concerning policy design
and the pricing of longevity risk are sketched. The possible behaviour of
the annuitant with respect to the planning of her/his retirement income,
which should be carefully considered in order to choose an appropriate
design of life annuity products, is also examined.
Our approach to writing this book has been to allocate prime responsi-
bility for each chapter to one or two authors and then for us all to provide
comments and input. Thus, Chapters 1 and 4 were written by Ermanno
Pitacco; Chapter 2 by Ermanno Pitacco and Annamaria Olivieri jointly;
Chapters 3 and 5 by Michel Denuit; Chapter 6 by Steven Haberman; and
Chapter 7 by Annamaria Olivieri. We would like to add that a book like
this will never be the result of the inputs of just the authors. Thus, we each
would like to acknowledge the support that we have received from a range
of colleagues. First, we would each like to thank our respective institutions
for the stimulating environment that has enabled us to complete this project.
Michel Denuit would like to acknowledge the inputs by Natacha Brouhns
and Antoine Delwarde, who both worked on the topic of this book as
PhD students under his supervision at UCL. Andrew Cairns kindly pro-
vided detailed comments on an earlier version of Chapters 3 and 5, which
led to significant improvements, in particular with regard to mortality
projection models. Discussions and/or collaborations with many esteemed
colleagues helped to clarify the analysis of mortality and its consequence
for insurance risk management, including Enrico Biffis, Hélène Cossette,
Claudia Czado, Pierre Devolder, Jan Dhaene, Paul Eilers, Esther Frostig,
Anne-Cécile Goderniaux, Montserrat Guillen, Étienne Marceau, Christian
Partrat, Christian Robert, Jeroen Vermunt, and Jean-François Walhin. Luc
Kaiser, Actuary at the BFIC, kindly supplied mortality data about the Belgian
life insurance market. Particular thanks go to all the participants
in the ‘Mortality’ task force of the Royal Society of Belgian Actuaries,
directed by Philippe Delfosse. Interesting discussions with the practising
actuaries involved also helped to clarify some issues. In that respect, Michel Denuit
would like to thank Pascal Schoenmaekers from Munich Re for stimulating
exchanges. Michel Denuit would like to stress his beneficial involvement in
the working party appointed by the Belgian federal government in order to
produce projected life tables for Belgium. Special thanks in this regard go to
Micheline Lambrecht and Benoît Paul from FPB. Also, Michel Denuit has
benefited from partnerships with (re)insurance companies, especially with
Daria Khachakidze and Laure Olié from SCOR, and with Lucie Taleyson
from AXA. The financial support of the Communauté française de Belgique
under contract ‘Projet d’Actions de Recherche Concertées’ ARC 04/09-320
and of Banque Nationale de Belgique under grant ‘Risk measures and
Economic capital’ is gratefully acknowledged.
Steven Haberman would like to express his deep gratitude to his long-
term research collaborator, Arthur Renshaw, for his contributions to their
joint work which has underpinned the ideas in Chapters 5 and 6 and for
stimulating discussions about mortality trends. He would also like to thank
his close colleague, Richard Verrall, for his contributions and advice on
modelling mortality, as well as their recent PhD students, Terry Sithole
and Marwa Khalaf-Allah, and their research assistant, Zoltan Butt, who
have all worked on the subject of mortality trends and their impact on
annuities and pensions. Steven Haberman would also like to thank Adrian
Gallop from the Government Actuary’s Department for providing mortality
data for England and Wales (by individual year of age and calendar year)
that facilitated the modelling of trends by cohort. The financial support,
provided through annual research grants, received from the Continuous
Mortality Investigation Bureau of the UK Actuarial Profession is gratefully
acknowledged.
Annamaria Olivieri and Ermanno Pitacco would like to thank Enrico
Biffis and Pietro Millossovich for stimulating exchanges and collaborations,
Patrizia Marocco and Fulvio Tomè from Assicurazioni Generali for interesting
discussions on various practical aspects of longevity, and Marco Vesentini
from Cattolica Assicurazioni, Verona, for providing useful material. The
financial support from the Italian Ministero dell’Università e della Ricerca is
gratefully acknowledged; thanks to the research project ‘Income protection
against longevity and health risks: financial, actuarial and economic analysis
of pension and health products. Market trends and perspectives’, coordi-
nated by Ermanno Pitacco, various stimulating meetings have been held.
Finally, special thanks go to all the participants of the Summer School
of the Groupe Consultatif Actuariel Europeen on the topic ‘Modelling
mortality dynamics for pensions and annuity business’ held twice in Italy
(Trieste, 2005; Parma, 2006). Their feedback and comments have been
very useful and such Continuing Professional Development initiatives offer
to the lecturers involved exciting opportunities for the merging of theoret-
ical approaches and practical issues, which we hope have been retained as
a theme in this book.
Contents
Preface
1 Life annuities
1.1 Introduction
1.2 Annuities-certain versus life annuities
1.2.1 Withdrawing from a fund
1.2.2 Avoiding early fund exhaustion
1.2.3 Risks in annuities-certain and in life annuities
1.3 Evaluating life annuities: deterministic approach
1.3.1 The life annuity as a financial transaction
1.3.2 Actuarial values
1.3.3 Technical bases
1.4 Cross-subsidy in life annuities
1.4.1 Mutuality
1.4.2 Solidarity
1.4.3 ‘Tontine’ annuities
1.5 Evaluating life annuities: stochastic approach
1.5.1 The random present value of a life annuity
1.5.2 Focussing on portfolio results
1.5.3 A first insight into risk and solvency
1.5.4 Allowing for uncertainty in mortality assumptions
1.6 Types of life annuities
1.6.1 Immediate annuities versus deferred annuities
1.6.2 The accumulation period
1.6.3 The decumulation period
1.6.4 The payment profile
1.6.5 About annuity rates
1.6.6 Variable annuities and GMxB features
1.7 References and suggestions for further reading
2.1 Introduction
2.2 Life tables
2.2.1 Cohort tables and period tables
2.2.2 ‘Population’ tables versus ‘market’ tables
2.2.3 The life table as a probabilistic model
2.2.4 Select mortality
2.3 Moving to an age-continuous context
2.3.1 The survival function
2.3.2 Other related functions
2.3.3 The force of mortality
2.3.4 The central death rate
2.3.5 Assumptions for non-integer ages
2.4 Summarizing the lifetime probability distribution
2.4.1 The life expectancy
2.4.2 Other markers
2.4.3 Markers under a dynamic perspective
2.5 Mortality laws
2.5.1 Laws for the force of mortality
2.5.2 Laws for the annual probability of death
2.5.3 Mortality by causes
2.6 Non-parametric graduation
2.6.1 Some preliminary ideas
2.6.2 The Whittaker–Henderson model
2.6.3 Splines
2.7 Some transforms of the survival function
2.8 Mortality at very old ages
2.8.1 Some preliminary ideas
2.8.2 Models for mortality at highest ages
2.9 Heterogeneity in mortality models
2.9.1 Observable heterogeneity factors
2.9.2 Models for differential mortality
2.9.3 Unobservable heterogeneity factors. The frailty
2.9.4 Frailty models
2.9.5 Combining mortality laws with frailty models
2.10 References and suggestions for further reading
3.1 Introduction
3.2 Data sources
3.2.1 Statistics Belgium
3.2.2 Federal Planning Bureau
3.2.3 Human mortality database
3.2.4 Banking, Finance, and Insurance Commission
3.3 Mortality trends in the general population
3.3.1 Age-period life tables
3.3.2 Exposure-to-risk
3.3.3 Death rates
3.3.4 Mortality surfaces
3.3.5 Closure of life tables
3.3.6 Rectangularization and expansion
3.3.7 Life expectancies
3.3.8 Variability
3.3.9 Heterogeneity
3.4 Life insurance market
3.4.1 Observed death rates
3.4.2 Smoothed death rates
3.4.3 Life expectancies
3.4.4 Relational models
3.4.5 Age shifts
3.5 Mortality trends throughout EU
3.6 Conclusions
References
Index
1 Life annuities
1.1 Introduction
Great attention is currently devoted to the management of life annuity port-
folios, both from a theoretical and a practical point of view, because of the
growing importance of annuity benefits paid by private pension schemes.
In particular, the progressive shift from defined benefit to defined contribu-
tion pension plans has increased the interest in life annuities, which are the
principal delivery mechanism of defined contribution pension plans.
Among the risks which affect life insurance and life annuity portfolios,
longevity risk deserves a deep and detailed investigation and requires the
adoption of proper management solutions. Longevity risk, which arises
from the random future trend in mortality at adult and old ages, is a rather
novel risk. Careful investigations are required to represent and measure it,
and to assess the relevant impact on the financial results of life annuity
portfolios and pension plans.
This book provides a comprehensive and detailed description of methods
for projecting mortality, and an extensive introduction to some important
issues concerning the longevity risk in the area of life annuities and pension
benefits.
Conversely, the present chapter mainly has an introductory role, aiming
at presenting the basic structure of life annuity products. Moving from
the simple model of the annuity-certain, typical features of life annuity
products are presented (Section 1.2). From an actuarial point of view, the
presentation progressively shifts from very traditional deterministic models
(Section 1.3) to more modern stochastic models (Section 1.5). An appropri-
ate stochastic approach allows us to capture the riskiness inherent in a life
annuity portfolio, and in particular the risks arising from random mortality.
Cross-subsidy mechanisms which work (or may work) in life annuity
portfolios and pension plans are described in Section 1.4.
Assume that the amount $S$ is available at a given time, say at retirement, and
is used to build up a fund. Denote the retirement time with $t = 0$, and assume
that the year is the time unit. In order to get her/his post-retirement income,
the retiree withdraws from the fund at time $t$ the amount $b_t$ ($t = 1, 2, \dots$).
Suppose that the fund is managed by a financial institution which guarantees
a constant annual rate of interest $i$.
Denote with $F_t$ the fund at time $t$, immediately after the payment of the
annual amount $b_t$. Clearly:
$$F_t = F_{t-1}(1 + i) - b_t \quad \text{for } t = 1, 2, \dots \tag{1.1}$$
with $F_0 = S$. Thus, the annual variation in the fund is given by
$$F_t - F_{t-1} = F_{t-1}\, i - b_t \quad \text{for } t = 1, 2, \dots \tag{1.2}$$
Figure 1.1 illustrates the causes explaining the behaviour of the fund
throughout time, formally expressed by equation (1.2).
The behaviour of the fund throughout time obviously depends on the
sequence of withdrawals $b_1, b_2, \dots$. In particular, if for all $t$ the annual
withdrawal is equal to the annual interest credited by the fund manager,
that is,
$$b_t = F_{t-1}\, i \tag{1.3}$$
then, from (1.1) we immediately find
$$F_t = S \tag{1.4}$$
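[Figure 1.1. Annual variation of the fund between times $t - 1$ and $t$: the fund moves from $F_{t-1}$ to $F_t$, increased by the interest (+) and reduced by the annual payment (−). Horizontal axis: time; vertical axis: fund.]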
$$b = Si \tag{1.5}$$
follows.
Conversely, if we assume a constant withdrawal $b$,
$$b > Si \tag{1.6}$$
Clearly, the exhaustion time $m$ depends on the annual amount $b$ (and on the
interest rate $i$ as well), as can easily be seen from equation (1.2).
The sequence of $m$ constant annual withdrawals $b$ (with $m$ defined by
conditions (1.8), and possibly completed by the exhausting withdrawal at
time $m + 1$) constitutes an annuity-certain.
Example 1.1 Assume S = 1000. Figure 1.2 illustrates the behaviour of
the fund when i = 0.03 and for different annual amounts b. Conversely,
Fig. 1.3 shows the behaviour of the fund for various interest rates i, assuming
b = 100.
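The fund trajectories of Example 1.1 can be reproduced with a few lines of code. The sketch below iterates recursion (1.1) with a constant withdrawal $b$. The function names are illustrative, and the exhaustion time is taken here, as an assumption, to be the last time $m$ at which the fund is still non-negative after the withdrawal, since conditions (1.8) are not reproduced in this text.

```python
def fund_path(S, i, b, horizon):
    """Return [F_0, ..., F_horizon] from F_t = F_{t-1} * (1 + i) - b, see (1.1)."""
    path = [S]
    for _ in range(horizon):
        path.append(path[-1] * (1 + i) - b)
    return path


def exhaustion_time(S, i, b, max_years=1000):
    """Last time m at which the full amount b can still be withdrawn.

    Assumed reading of the exhaustion time: F_m >= 0 and F_{m+1} < 0.
    Returns None when b <= S * i, because the fund then never runs out.
    """
    if b <= S * i:
        return None
    fund, m = S, 0
    while m < max_years and fund * (1 + i) >= b:
        fund = fund * (1 + i) - b
        m += 1
    return m


S, i = 1000.0, 0.03
for b in (50.0, 75.0, 100.0, 125.0):
    # withdrawal, exhaustion time, fund after 10 years
    print(b, exhaustion_time(S, i, b), round(fund_path(S, i, b, 10)[-1], 2))
```

Changing `i` while holding `b = 100` gives the comparison across interest rates described in the example.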
[Figure 1.2. Fund $F_t$ against time $t$ (0 to 35) for $b = 50, 75, 100, 125$.]
[Figure 1.3. Fund $F_t$ against time $t$ (0 to 16) for $i = 0.02, 0.03, 0.04, 0.05$.]
$$m \approx \mathrm{Mod}[T_x] \tag{1.9}$$
Thus, with a high probability the exhaustion time will coincide with the
residual lifetime. Notwithstanding, events like $T_x > m$, or $T_x < m$, may
occur and hence the retiree bears the risk originating from the randomness
of her/his lifetime. Conversely, the choice
$$m = \omega - x \tag{1.10}$$
clearly with $l_x V_0 = l_x S$
From (1.11), we find the following recursion describing the evolution of
the individual fund:
$$V_t = \frac{l_{x+t-1}}{l_{x+t}}\, V_{t-1}(1 + i) - b \tag{1.12}$$
with $V_0 = S$. Recursion (1.12) can also be written as follows:
$$V_t = V_{t-1}(1 + i) + \frac{l_{x+t-1} - l_{x+t}}{l_{x+t}}\, V_{t-1}(1 + i) - b \tag{1.13}$$
Thus, the annual variation in the fund is given by
$$V_t - V_{t-1} = V_{t-1}\, i + \frac{l_{x+t-1} - l_{x+t}}{l_{x+t}}\, V_{t-1}(1 + i) - b \tag{1.14}$$
It is worth noting from (1.14) that the annual decrement of the individual
fund can be split into three contributions (see Figure 1.4):
Contribution (b), which does not appear in the model describing the
annuity-certain (see Figure 1.1), is maintained thanks to a cross-subsidy
among annuitants, that is, the so-called mutuality effect. For more details,
see Section 1.4.1.
In the case of life annuities, the individual fund Vt (as defined by recursion
(1.12)) is called the reserve.
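As a rough illustration of recursion (1.12), the sketch below rolls the reserve forward and splits each annual variation into the three terms visible in (1.14): interest, the term financed by mutuality, and the payment $b$. Everything numerical is an assumption made for the sake of a runnable example: the survivors come from expression (1.32) with the scenario parameters quoted in Example 1.3, the interest rate is 5%, and $S$ is set to $b$ times the expected present value of the payments so that the reserve runs out at the maximum age.

```python
def survivors(G, H, x, omega=110):
    """Notional survivors l_x, ..., l_omega with l_x = 1, using expression (1.32)."""
    l = [1.0]
    for age in range(x, omega):
        l.append(l[-1] / (1.0 + G * H ** age))  # p = 1 - q = 1 / (1 + G H^age)
    return l


def reserve_steps(S, b, i, l):
    """Apply recursion (1.12); return rows (t, V_t, interest, mutuality)."""
    rows, V_prev = [], S
    for t in range(1, len(l)):
        interest = V_prev * i
        mutuality = (l[t - 1] - l[t]) / l[t] * V_prev * (1 + i)
        V = l[t - 1] / l[t] * V_prev * (1 + i) - b
        rows.append((t, V, interest, mutuality))
        V_prev = V
    return rows


# Illustrative inputs only (assumed, not the first-order basis of the text)
G, H, i, x, b = 0.0000023, 1.134, 0.05, 65, 1.0
l = survivors(G, H, x)
S = b * sum(l[t] / l[0] * (1 + i) ** -t for t in range(1, len(l)))
rows = reserve_steps(S, b, i, l)
for t, V, interest, mutuality in rows[:3] + rows[-1:]:
    print(t, round(V, 4), round(interest, 4), round(mutuality, 4))
```

In each row, `interest + mutuality - b` equals the change in the reserve, which is the decomposition stated by (1.14).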
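[Figure 1.4. Annual variation of the reserve between times $t - 1$ and $t$: the reserve moves from $V_{t-1}$ to $V_t$ through interest (+), mutuality (+), and the annual payment (−). Horizontal axis: time; vertical axis: reserve.]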
– market risk, more precisely interest rate risk, as we have assumed that i is
the guaranteed interest rate which must be credited to the fund whatever
the return from the investment of the fund itself may be;
– liquidity risk, as the annual payment obviously requires cash availability.
Conversely, the retiree does not take any financial risk thanks to the
guaranteed interest rate, whereas she/he bears the risk related to her/his
random lifetime, as seen above.
Now, let us move to the life annuity. According to the structure of this
product (at least as defined in Section 1.2.2), the annuitant does not bear
any risk. Actually, the annuity is paid throughout the whole lifetime and
the amount of the annual payment is guaranteed.
Conversely, the annuity provider first bears the market risk and the
liquidity risk as in the annuity-certain model. Further, if the actual life-
times of annuitants lead to numbers of survivors greater than the estimated
ones, the cross-subsidy mechanism (see Section 1.2.2 and Fig. 1.4) cannot
finance the payments to the annuitants still alive. In other words, contri-
bution (b), which is required to maintain the individual fund Vt , should be
and we have
$$S = b \sum_{t=1}^{\omega-x} {}_t p_x \,(1 + i)^{-t} \tag{1.18}$$
where
$$a_{\overline{h}|} = \frac{1 - (1 + i)^{-h}}{i} \tag{1.20}$$
denotes the present value of a temporary annuity-certain consisting of $h$
unitary annual payments in arrears;
– the symbol $q_{x+h}$ denotes the probability of an individual age $x + h$ dying
within one year, formally
$$q_{x+h} = P[T_{x+h} < 1] \tag{1.21}$$
we note that, assuming $\omega$ as the maximum age, $q_\omega = 1$;
– hence, ${}_h p_x\, q_{x+h}$ is the probability of an individual currently age $x$ dying
between ages $x + h$ and $x + h + 1$; in symbols
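Note that
$$V_t = b\, a_{x+t} \tag{1.30}$$
starting from a (notional) initial value $l_x$. For example, assume for $q_{x+h}$ the
following expression:
$$q_{x+h} = \begin{cases} \dfrac{G\,H^{x+h}}{1 + G\,H^{x+h}} & \text{if } x + h < 110 \\[1ex] 1 & \text{if } x + h = 110 \end{cases} \tag{1.32}$$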
[Figure: reserve $V_t$ against age $x + t$ (65 to 115).]
The relation between S (the single premium) and b (the annual benefit)
relies on the equivalence principle, as S is the expected present value of
the sequence of annual amounts b. The adoption of this principle complies
with common (but not necessarily sound) actuarial practice. Actually, when
the equivalence principle is used for pricing insurance products and life
annuities in particular, a safe-side technical basis (or prudential basis, or
first-order basis) is chosen, namely an interest rate i lower than the estimated
investment yield, and a set of probabilities expressing a mortality level lower
than that expected in the life annuity portfolio. The estimated investment
yield and the mortality actually expected constitute the scenario technical
basis (or realistic basis, or second-order basis).
For simplicity, assume a constant estimated investment yield $i^*$; denote
with $q^*_{x+h}$, $h = 0, 1, \dots, \omega - x$, the realistic probabilities of death. The
survival probabilities, ${}_t p^*_x$, can be calculated from the $q^*_{x+h}$ as stated by relation
(1.23). The resulting actuarial value of the life annuity, $a^*_x$, is clearly given
(see (1.27)) by
$$a^*_x = \sum_{t=1}^{\omega-x} {}_t p^*_x \,(1 + i^*)^{-t} \tag{1.33}$$
The difference $a_x - a^*_x$ can be interpreted as the expected present value (at
time $t = 0$) of the profit generated by the life annuity contract. Note that,
if $i^* > i$, the yield from investment contributes to the profit. Usually profit
participation mechanisms assign a (large) part of the investment profit to
policyholders, and so the expected profit $a_x - a^*_x$ should be taken as gross
of the profit participation.
Example 1.3 Assume $i = 0.03$ and the $q_{x+h}$ adopted in Example 1.2
as the items of the safe-side technical basis (i.e. the pricing basis);
conversely, for the scenario basis assume $i^* = 0.05$ as the estimated investment
yield, and the mortality level described by probabilities $q^*_{x+h}$ given by
the expression (1.32) implemented with the parameters $G^* = 0.0000023$,
$H^* = 1.134$. With these assumptions, we have that $i^* > i$ and $q^*_{x+h} > q_{x+h}$.
We find that $a^*_{65} = 11.442$, and hence the expected present value of the
profit produced by a life annuity with a unitary annual payment, that is,
with $b = 1$, is $a_{65} - a^*_{65} = 2.731$.
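The scenario value $a^*_{65}$ of Example 1.3 can be checked numerically. The sketch below is one possible implementation of (1.33) combined with expression (1.32), using the parameters $G^*$, $H^*$ and $i^*$ quoted in the example and the maximum age 110 from (1.32). The function name is illustrative. The result should come out close to the quoted 11.442.

```python
def annuity_value(G, H, i, x, omega=110):
    """a_x = sum over t = 1..omega-x of tp_x * (1 + i)^-t, with q from (1.32)."""
    v = 1.0 / (1.0 + i)
    tpx, total = 1.0, 0.0
    for t in range(1, omega - x + 1):
        odds = G * H ** (x + t - 1)
        tpx *= 1.0 - odds / (1.0 + odds)  # survive from age x+t-1 to age x+t
        total += tpx * v ** t
    return total


# Scenario basis of Example 1.3
a_star_65 = annuity_value(G=0.0000023, H=1.134, i=0.05, x=65)
print(round(a_star_65, 3))
```

The first-order value $a_{65}$ would follow from the same function once the first-order parameters of Example 1.2 are supplied; they are not repeated here, so the sketch stops at the scenario value.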
1.4.1 Mutuality
where
$$\theta_{x+t} = \frac{l_{x+t-1} - l_{x+t}}{l_{x+t}} \tag{1.35}$$
1 The expression ‘Implied Longevity Yield’ and its acronym ‘ILY’ are registered trademarks and
property of CANNEX Financial Exchanges.
Example 1.4 In Fig. 1.6 the quantity $\theta_{x+t}$ is plotted for $x = 65$ and
$t = 0, 1, \dots$. The underlying technical basis is the first-order basis, with
$i = 0.03$ and the $q_{x+t}$ defined in Example 1.2. It is interesting to note that,
when moderately old ages are involved (say, in the interval 65–75), the values
of $\theta$ are rather small. In such a range of ages, they could be ‘replaced’
with a higher yield from investments (provided that riskier investments can
be accepted), and so, in that age interval, a withdrawal process could be
preferred to a life annuity. Conversely, as the age increases, $\theta$ reaches very high
values, which obviously cannot be replaced by investment yields. So, when
old and very old ages are concerned, the life annuity is the only technical
tool which guarantees a lifelong constant income. As regards theoretical
results showing that the annuitization constitutes the optimal choice, see
Section 1.7.
[Figure 1.6. The quantity $\theta_{x+t}$ against age (65 to 105).]
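The shape described in Example 1.4 can be explored with a short computation of (1.35). As an assumption, the sketch uses the scenario parameters quoted in Example 1.3 rather than the first-order basis behind Fig. 1.6, so the numbers are indicative only. Under expression (1.32) the ratio reduces to $G H^{x+t-1}$, which is why it grows geometrically with age.

```python
def theta(l):
    """theta_{x+t} = (l_{x+t-1} - l_{x+t}) / l_{x+t} for t = 1, 2, ..., see (1.35)."""
    return [(l[t - 1] - l[t]) / l[t] for t in range(1, len(l)) if l[t] > 0]


# Assumed inputs: scenario values of Example 1.3, maximum age 110
G, H, x, omega = 0.0000023, 1.134, 65, 110
l = [1.0]
for age in range(x, omega):
    l.append(l[-1] / (1.0 + G * H ** age))  # p = 1 - q = 1 / (1 + G H^age)

th = theta(l)  # th[0] refers to age 66, th[-1] to age 110
for age in (66, 75, 85, 95, 105):
    print(age, round(th[age - x - 1], 4))
```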
1.4.2 Solidarity
itself. The weighting should reflect the expected numbers of (future) insureds
belonging to the various risk classes.
Assume that, as far as pricing is concerned, the population is split into
rating classes rather than into risk classes. The rationale of this grouping
may be, for example, a simplification in the tariff structure.
When two or more risk classes are aggregated into one rating class,
some insureds pay a premium higher than their ‘true’ premium, that is,
the premium resulting from the risk classification, while other insureds pay
a premium lower than their ‘true’ premium. Thus, the equilibrium inside
a rating class relies on a money transfer among individuals belonging to
different risk classes. This transfer is usually called solidarity (among the
insureds).
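A toy calculation may help to fix the idea of solidarity. In the sketch below, two risk classes are merged into one rating class whose premium is the average of the ‘true’ premiums, weighted by the expected numbers of insureds. All names and figures are hypothetical and serve only to show the direction of the money transfer.

```python
def rating_class_premium(true_premiums, expected_numbers):
    """Average of the risk-class premiums, weighted by expected numbers of insureds."""
    total = sum(expected_numbers.values())
    return sum(true_premiums[k] * expected_numbers[k] for k in true_premiums) / total


true_premiums = {"class A": 14.0, "class B": 16.0}   # hypothetical 'true' premiums
expected_numbers = {"class A": 600, "class B": 400}  # hypothetical expected insureds

p = rating_class_premium(true_premiums, expected_numbers)
# Positive value: the class pays more than its 'true' premium in aggregate
transfers = {k: (p - true_premiums[k]) * expected_numbers[k] for k in true_premiums}
print(p, transfers)
```

The transfers sum to zero only if the actual numbers of insureds match the expected ones, which is the equilibrium condition discussed in the text.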
Clearly, such a premium system may cause adverse selection, as individ-
uals forced to provide solidarity to other individuals can reject the policy,
moving to other insurance solutions (or, more generally, risk management
actions). The severity of this self-selection phenomenon depends on how
people perceive the solidarity mechanism, as well as on the premium systems
adopted by competitors in the insurance market. In any event, self-selection
can jeopardize the technical equilibrium inside the portfolio, which depends
on actual versus expected numbers of insureds belonging to the various risk
classes grouped into a rating class. So, in practice, solidarity mechanisms
can work provided that they are compulsory (e.g. imposed by insurance
regulation) or they constitute a common market practice.
As regards life annuities, risk classes are usually based on age and gen-
der. In particular, it is well known that females experience a mortality
lower than males and a higher expected lifetime. So, if for some reason
the same premium rates (only depending on age) are applied to all annu-
itants, a solidarity effect arises, implying a money transfer from males to
females.
The solidarity effect is stronger when the number of rating classes is
smaller, compared with the number of risk classes. In the private insur-
ance field, an extreme case is achieved when one rating class only relates
to a large number of underlying risk classes. Outside of the private insur-
ance area, the solidarity principle is commonly applied in social security.
In this field, the extreme case arises when the whole national population
contribute to fund the benefits, even if only a part of the population itself is
eligible to receive benefits; so, the burden of insurance is shared among the
community.
Finally, it is interesting to stress the implications of this argument. Mutuality
affects the benefit (or claim) payment phase, so that ‘direction’ and
‘measure’ of the mutuality effect in a portfolio are only known ex-post. Conversely,
solidarity affects the premium income phase, and hence its direction
and measure are known ex-ante.
$$S = B\, E[a_{\overline{K}|}] \tag{1.38}$$
of France at the time of King Louis XIV. In this plan, a fund was raised
by subscriptions. Let $S$ denote the amount collected by the State. Then, the
State had to pay each year the interest on $S$, at a given annual interest rate
$i$. The constant annual payment $Si$ was to be divided equally among the
surviving members of the group and would terminate with the death of the
last survivor. Thus, according to our notation, the duration of the annuity
is $K$ (see definition (1.37)), and we have $B = Si$. Note that
$$\frac{B}{S} = i = \frac{1}{a_{\overline{\infty}|}} \tag{1.39}$$
where $a_{\overline{\infty}|} = 1/i$ is the present value of a perpetuity (given the discount
rate $i$). As
$$\frac{S}{a_{\overline{\infty}|}} < \frac{S}{a_{\overline{\omega-x}|}} < \frac{S}{E[a_{\overline{K}|}]} \tag{1.40}$$
(assuming that the same discount rate is used for all the present values), we
find that Tonti’s original scheme did not fulfil the equivalence principle,
and was instead favourable to the issuer (i.e. to the State).
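The mechanics of the scheme just described can be sketched by simulation: each year the fixed amount $B = Si$ is shared equally among whoever is still alive. The group size, the amount collected, and in particular the mortality rates below are hypothetical choices made only to obtain a runnable example.

```python
import random


def tontine_payments(n_members, S_total, i, q_by_year, seed=0):
    """Per-survivor payments in a Tonti-style scheme paying B = S_total * i each year."""
    rng = random.Random(seed)
    B = S_total * i
    alive, payments = n_members, []
    for q in q_by_year:
        # each member alive at the start of the year dies with probability q
        alive = sum(1 for _ in range(alive) if rng.random() >= q)
        if alive == 0:
            break  # the annuity terminates with the death of the last survivor
        payments.append(B / alive)
    return payments


q_by_year = [min(1.0, 0.01 * 1.1 ** t) for t in range(60)]  # hypothetical mortality
pay = tontine_payments(n_members=100, S_total=100_000.0, i=0.03, q_by_year=q_by_year)
print(len(pay), round(pay[0], 2), round(pay[-1], 2))
```

Because the total paid per year is fixed, the individual payment can only stay level or rise as members die, which is the feature that distinguishes it from a guaranteed benefit.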
(a) The tontine scheme clearly implies a cross-subsidy among the annuitants,
and in particular a mutuality effect arises as each dying annuitant
releases a share of the amount $B$, which is divided among the surviving
annuitants.
(b) A basic difference between tontine annuities and ordinary life annuities
should be recognized. In an ordinary life annuity, the annual (individual)
benefit b is stated and guaranteed, in the sense that the life annuity
provider has to pay the amount b to the annuitant for her/his whole
residual lifetime, whatever the mortality experienced in the portfolio (or
pension plan) may be. Conversely, in a tontine scheme the sequence of
amounts b1 , b2 , . . . paid to each annuitant depends on the actual size of
the surviving tontine group. Note that, when managing an ordinary life
annuity portfolio the annuity provider takes the risk of a poor mortality
experience in the portfolio (see Section 1.2.3), whereas in a tontine
scheme the only cause of risk is the lifetime of the last survivor. Further,
it should be noted that, for a given technical basis and a given amount S,
the annual benefit b is likely to be much higher than the initial payments
$$b_t = \frac{l_x}{l_{x+t}}\, \frac{S}{a_{\overline{\omega-x}|}} < \frac{S}{a_x} = b \tag{1.42}$$
It should be noted that, although formulae (1.18) and (1.19) involve prob-
abilities, the model built up so far is a deterministic model, as probabilities
are only used to determine expected values. A first step towards stochastic
models follows.
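Equation (1.19) implicitly involves the random present value $Y$,
$$Y = a_{\overline{K_x}|} \tag{1.43}$$
of a life annuity (see also (1.25)). The possible outcomes of the random
variable $Y$ are as follows:
$$\begin{aligned}
y_0 &= a_{\overline{0}|} = 0 \\
y_1 &= a_{\overline{1}|} = (1 + i)^{-1} \\
&\;\;\vdots \\
y_{\omega-x} &= a_{\overline{\omega-x}|} = (1 + i)^{-1} + (1 + i)^{-2} + \dots + (1 + i)^{-(\omega-x)}
\end{aligned}$$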
and we have
[Figure: probability distribution of the present value of the annuity. Horizontal axis: present value of the annuity (0 to 20); vertical axis: probability.]
[Figure 1.8. Probability distributions of (a) $L_{70}$ and (b) $L_{85}$. Horizontal axis: number of survivors (0 to 100); vertical axis: probability.]
Example 1.7 Figure 1.8(a) and (b) illustrate the probability distribution
of L70 and L85 respectively, under the following assumptions: x = 65,
l65 = 100, q∗x+t as specified in Example 1.3.
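Distributions like those of Example 1.7 can be computed directly if, as assumed here, the lifetimes of the $l_{65} = 100$ annuitants are independent and identically distributed, so that the number of survivors is binomial. The sketch uses the scenario probabilities of Example 1.3. The function names are illustrative.

```python
from math import comb


def survival_prob(G, H, x, t):
    """tp_x under expression (1.32): product of 1 / (1 + G H^age) over t years."""
    p = 1.0
    for age in range(x, x + t):
        p /= 1.0 + G * H ** age
    return p


def survivors_pmf(l_x, p):
    """P[L = k] for k = 0..l_x when L is Binomial(l_x, p) (independence assumed)."""
    return [comb(l_x, k) * p ** k * (1 - p) ** (l_x - k) for k in range(l_x + 1)]


G, H = 0.0000023, 1.134  # scenario parameters of Example 1.3
p5, p20 = survival_prob(G, H, 65, 5), survival_prob(G, H, 65, 20)
pmf_70 = survivors_pmf(100, p5)    # distribution of L_70
pmf_85 = survivors_pmf(100, p20)   # distribution of L_85
print(max(range(101), key=pmf_70.__getitem__), max(range(101), key=pmf_85.__getitem__))
```

The last line prints the most likely number of survivors at ages 70 and 85; the spread of `pmf_85` is visibly wider than that of `pmf_70`.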
Consider now the random behaviour over time of the fund $Z_t$ defined for
$t = 1, 2, \dots, \omega - x$, as follows:
$$L_{x+t} = \sum_{j=1}^{l_x} I_{\{T_x^{(j)} > t\}} \tag{1.49}$$
$$M_t = Z_t - L_{x+t}\, V_t \tag{1.51}$$
represents the assets in excess of the level required (according to the first-order
basis) to meet expected future obligations.
Example 1.9 Figures 1.11(a) and (b) represent the (simulated) statistical
distribution of $M_5$ and $M_{20}$ respectively, based on the simulated sample
previously adopted. The erratic behaviour in these figures (as well as in
Figures 1.11(a), 1.11(b), 1.12(a), 1.12(b), and 1.14) is clearly due to the
simulation procedure; smoother results can be obtained by increasing the
number of simulations.
[Figure: panels (a) and (b), fund $Z_t$ against time $t$, for $t$ from 0 to 5 and from 15 to 20.]
[Figure: frequency distributions of $Z_5$ and $Z_{20}$.]
[Figure: frequency distributions of $M_5$ and $M_{20}$. Vertical axis: frequency.]
Of course, causes of risk other than mortality could be introduced into our
model, and typically the investment risk, in particular arising from random
fluctuations (i.e. ‘volatility’) in the investment yield. To this purpose, the
sequence of annual investment yields must be simulated, on the basis of
an appropriate model for stochastic interest rates, and used in place of the
estimated yield $i^*$. We do not deal with these problems, which are beyond
the scope of the present chapter.
[Figure: frequency distributions of $M_5$ and $M_{20}$. Vertical axis: frequency.]
Let us now refer to the random present value at time $t = 0$, $Y_0^{(\Pi)}$, of
future benefits in a portfolio consisting of one generation of life annuities.
We have
$$Y_0^{(\Pi)} = b \sum_{t=1}^{\omega-x} L_{x+t}\,(1+i)^{-t} \tag{1.52}$$
If we calculate the expected value of $Y_0^{(\Pi)}$ using the first-order basis, we
have
$$E[Y_0^{(\Pi)}] = b \sum_{t=1}^{\omega-x} E[L_{x+t}]\,(1+i)^{-t} = b\, l_x \sum_{t=1}^{\omega-x} {}_t p_x\,(1+i)^{-t} = l_x V_0 \tag{1.53}$$
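Figure 1.13. Probability distribution of $Y_0^{(\Pi)}$; α-percentile. [Marked on the plot: 0, $E[Y_0^{(\Pi)}]$, $y_\alpha$, and the probability $1-\alpha$.]
α-percentile of the probability distribution of $Y_0^{(\Pi)}$ (see Fig. 1.13):
$$V_0^{(\Pi;\alpha)} = y_\alpha \tag{1.55}$$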
Example 1.11 Using the data of the previous examples, from the simulated
distribution of $Y_0^{(\Pi)}$ (see Fig. 1.14) we find the results shown in Table 1.1.
Note that, conversely, we have $P[Y_0^{(\Pi)} > V_0^{(\Pi)}] = 0.209$ (where $V_0^{(\Pi)} =
E[Y_0^{(\Pi)}] = 100000$).
It is worth noting that the calculation of the portfolio reserve $V_0^{(\Pi)}$ (and,
in general, $V_t^{(\Pi)}$) according to (1.54) represents the traditional approach
that is adopted in actuarial practice. In this context, the presence of risks is
taken into account simply via the first-order basis adopted in implementing
formula (1.54). Conversely, the reserving approach based on the probability
distribution of $Y_0^{(\Pi)}$ (and $Y_t^{(\Pi)}$ in general) and leading to the portfolio reserve
$V_0^{(\Pi;\alpha)}$ ($V_t^{(\Pi;\alpha)}$) allows for risks via the choice of an appropriate percentile
of the distribution itself.
Figure 1.14. Statistical distribution of $Y_0^{(\Pi)}$. [Horizontal axis: $Y_0^{(\Pi)}$; vertical axis: probability.]
Table 1.1.
α       $y_\alpha$
0.75    92067.033
0.90    101553.815
0.95    102608.253
0.99    104480.738
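The percentile-based reserve (1.55) can be illustrated by simulation. The following is a minimal sketch under stated assumptions, not a reproduction of Example 1.11: the Gompertz-type mortality, its parameters, the interest rate, the portfolio size, and the number of simulations are all hypothetical, lifetimes are assumed independent, and the same mortality is used both to simulate and to compute the expected value. Each simulated outcome of $Y_0^{(\Pi)}$ is obtained as the sum over annuitants of $b\,a_{\overline{K}|}$, which coincides with the right-hand side of (1.52).

```python
import bisect
import math
import random

B, c = 0.00003, 1.1          # hypothetical Gompertz parameters
x, omega, i, b = 65, 110, 0.03, 1.0
n_lives, n_sims = 100, 2000

def p(t):
    """t_p_x under the assumed Gompertz law."""
    return math.exp(-B * c**x * (c**t - 1) / math.log(c))

horizon = omega - x
# Annuity-certain values a_k (k payments in arrears), k = 0, ..., omega - x
a = [0.0]
for k in range(1, horizon + 1):
    a.append(a[-1] + (1 + i) ** -k)
# Distribution function of the curtate lifetime: P[K <= k] = 1 - p(k + 1),
# with all remaining probability mass placed at k = omega - x
cdf = [1 - p(k + 1) for k in range(horizon)] + [1.0]

rng = random.Random(12345)   # fixed seed, for reproducibility only

def simulate_Y0():
    """One simulated outcome of the portfolio present value Y0."""
    return b * sum(a[bisect.bisect_left(cdf, rng.random())] for _ in range(n_lives))

sample = sorted(simulate_Y0() for _ in range(n_sims))
# Expected value as in (1.53), here with the same mortality used to simulate
expected = b * n_lives * sum(p(t) * (1 + i) ** -t for t in range(1, horizon + 1))

def percentile(alpha):
    """Empirical alpha-percentile y_alpha of the simulated distribution."""
    return sample[min(n_sims - 1, math.ceil(alpha * n_sims) - 1)]

for alpha in (0.75, 0.90, 0.95, 0.99):
    print(alpha, round(percentile(alpha), 3))
print("expected value:", round(expected, 3))
```

Increasing `n_sims` gives a smoother empirical distribution, in line with the remark made for the simulated figures.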
life annuities (and other living benefits), that is, use mortality assumptions
which include a forecast of future mortality trends. Notwithstanding, what-
ever hypothesis is assumed, the future trend in mortality is random, and
hence an uncertainty risk arises, namely a risk due to uncertainty in the
representation of the future mortality scenario.
Example 1.12 Assume the first-order basis already used in the previous
examples. To describe the (future) mortality scenario, use the model (1.32)
with the following alternative parameters:
We assume that scenario (2) (which coincides with the scenario adopted
as the second-order basis in previous examples) represents the best
estimate mortality hypothesis. Scenario (1) involves a higher mortality
level and hence can be considered ‘optimistic’ from the point of view
of the annuity provider. Conversely, scenario (3) expresses a lower mor-
tality level and thus constitutes a ‘pessimistic’ mortality forecast. We
obtain:
$$a_{65}^{(1)} = 11.046, \quad a_{65}^{(2)} = 11.442, \quad a_{65}^{(3)} = 12.102$$
Figure 1.16. Conditional probability distributions of the random present value of the life annuity. [Horizontal axis: present value of the life annuity.]
Let us continue to focus on an immediate life annuity, and denote with b the
annual benefit and S the net single premium (i.e., disregarding expense load-
ings). It is natural to look at the amount S as the result of an accumulation
process carried out during (a part of) the working life of the annuitant.
Let us now denote with x the age at the beginning of the accumulation
process, that is, at time 0. The accumulation process stops at time n, so that
x + n is the age at the beginning of the decumulation phase.
The relation between S and b is given, according to the equivalence
principle, by
$$S = b\, a_{x+n} \tag{1.57}$$
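[Figure: fund/reserve against time ($0, 1, 2, \dots, n-1, n, n+1, \dots$) and age ($x, \dots, x+n$); the accumulation phase ends at time $n$, when the fund equals $S$, and is followed by the decumulation phase.]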
As regards the accumulation process, this can be carried out via vari-
ous tools, for example insurance policies providing a survival benefit at
maturity (time n). Some policy arrangements will be described in
Section 1.6.2.
Conversely, it is possible to look jointly at the accumulation and the decu-
mulation phase, even in actuarial terms. Consider a deferred life annuity of
one monetary unit per annum, with a deferred period of n years. Assume
now that each annual payment is due at the beginning of the year (annuity
in advance). The actuarial value at time 0, n| äx , is given by
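$${}_{n|}\ddot{a}_x = \sum_{h=n}^{\omega-x} (1+i)^{-h}\, {}_h p_x \tag{1.58}$$
$$P = b\, \frac{{}_{n|}\ddot{a}_x}{\ddot{a}_{x:\overline{n}|}} \tag{1.59}$$
where
$$\ddot{a}_{x:\overline{n}|} = \sum_{h=0}^{n-1} (1+i)^{-h}\, {}_h p_x \tag{1.60}$$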
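The quantities in (1.58)–(1.60) can be evaluated directly once a technical basis has been fixed. The Python sketch below is illustrative only: the Gompertz-type survival probabilities, the parameter values, the limiting age, and all function names are assumptions introduced here, not the basis used in the examples of this chapter.

```python
import math

def survival_prob(x, h, B=0.00003, c=1.1):
    """h-year survival probability h_p_x under a hypothetical Gompertz
    force of mortality mu_y = B * c**y (illustrative parameters only)."""
    # integral of B * c**y over (x, x + h) equals B * c**x * (c**h - 1) / ln(c)
    return math.exp(-B * c**x * (c**h - 1) / math.log(c))

def deferred_annuity_due(x, n, i, omega=110):
    """n|ä_x as in (1.58): sum over h = n, ..., omega - x."""
    return sum((1 + i) ** -h * survival_prob(x, h) for h in range(n, omega - x + 1))

def temporary_annuity_due(x, n, i):
    """ä_{x:n} as in (1.60): sum over h = 0, ..., n - 1."""
    return sum((1 + i) ** -h * survival_prob(x, h) for h in range(n))

def annual_premium(b, x, n, i):
    """Annual level premium P as in (1.59)."""
    return b * deferred_annuity_due(x, n, i) / temporary_annuity_due(x, n, i)

x, n, i, b = 40, 25, 0.03, 1000.0
P = annual_premium(b, x, n, i)
print(f"n|ä_x = {deferred_annuity_due(x, n, i):.4f}")
print(f"ä_x:n = {temporary_annuity_due(x, n, i):.4f}")
print(f"P     = {P:.2f}")
```

Re-running the sketch with a lower interest rate or lighter mortality shows how sensitive the premium is to a basis that is fixed at time 0.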
(a) Formulae (1.58) and (1.60) rely on the assumption that the technical
basis is chosen at time 0, when the insured is aged x. If for example
x = 40, this means that the technical rate of interest will be guaranteed
throughout a period of, maybe, fifty years or even more. Further, the
life table adopted should keep its validity throughout the same period.
(b) In the case that the policyholder dies before time n, no benefit is due. This
is, of course, a straight consequence of the policy structure, according
to which the only benefit is the deferred life annuity.
Feature (b) is likely to have a negative impact on the appeal of the annuity
product. However, the problem can be easily removed by adding to the
policy a rider benefit such as the return of premiums in case of death during
the deferred period, or including some death benefit with term n.
The problems arising from aspect (a) are much more complex, and require
a re-thinking of the structure and design of the life annuity product. As a
first step, we provide an analysis of the main features of life annuity prod-
ucts, addressing separately the accumulation period and the decumulation
period.
where ${}_n E_x = (1+i)^{-n}\, {}_n p_x$ denotes the actuarial value of a pure endowment
with a unitary amount insured.
Clearly, relation (1.61) relies on the assumption that the same technical
basis is adopted for both the period of accumulation and decumulation. As
already noted, this implies a huge risk for the life annuity provider. So, an
important idea is to address separately the two periods, possibly delaying
the choice of the technical basis to be adopted for the life annuity.
As regards the accumulation period, the pure endowment can be replaced
by a purely financial accumulation, via an appropriate savings instrument.
Then, the loss in terms of the mutuality effect is very limited when (part
of) the working period is concerned. Hence, a very modest extra-yield can
replace the mortality drag.
Example 1.13 In Fig. 1.18, the function θ (see Section 1.4.1) is plotted
against the age in the range 40–64. Note that θ is consistent with formula
(1.34), with given values for the mathematical reserve, however with b = 0.
The underlying technical basis is the first-order basis adopted in Example
1.4. It is interesting to compare the graph in Fig. 1.18 (noting the scale
on the vertical axis) with the behaviour of the function θ throughout the
decumulation period, illustrated in Fig. 1.6.
[Figure 1.18: the function θ plotted against age (40 to 65); vertical axis from 0 to 0.007.]
$$P_h = b_h\; {}_{n-h|}\ddot{a}^{[h]}_{x+h} \tag{1.62}$$
$$b = \sum_{h=0}^{n-1} b_h \tag{1.63}$$
Let us denote with n the starting point of the decumulation period, and
with x + n the annuitant’s age. Let S be the amount, available at time n, to
finance the life annuity. In the case of a deferred life annuity, S is given by
the mathematical reserve at time n of the annuity itself.
The relation between S and the annual payment b depends on the policy
conditions which define the (random) number of payments, and hence the
duration of the decumulation period. Let us denote with K the number of
payments. Focussing on a life annuity in arrears only, the following cases
are of practical interest:
(d) If the annuitant dies soon after time n, neither the annuitant nor the
annuitant’s estate receive much benefit from the purchase of the life
We have so far assumed that the annuity payment depends on the life-
time of one individual only, namely the annuitant. However, it is possible
to define annuity models involving two (or more) lives. Some examples
(referring to two lives) follow:
In equation (1.69) it has been assumed that the annuity continues with
the same annual amount until the death of the last survivor. A modified
form provides that the amount, initially set to $b$, will be reduced
following the first death: to $b'$ if the individual (2) dies first, and to $b''$
if the individual (1) dies first. Thus
$$S = b'\, a_y^{(1)} + b''\, a_z^{(2)} + (b - b' - b'')\, a_{y,z} \tag{1.71}$$
with $b' < b$, $b'' < b$. Conversely, in many pension plans the last-survivor
annuity commonly provides that the annual payment is reduced only if
the retiree, say life (1), dies first. Formally, $b' = b$ (instead of $b' < b$) in
equation (1.71).
(f) A reversionary annuity (on two individuals) is payable while a given
individual, say individual (2), is alive, but only after the death of
the other individual. In this case, the number of payments is
$K = \max\{0, K_z^{(2)} - K_y^{(1)}\}$, and the first payment (if any) is made at time
$K_y^{(1)} + 1$. Such an annuity can be used, for example, as a death benefit
in pension plans, to be paid to a surviving spouse or dependant.
Adding capital protection clearly reduces the annuity benefit (for a given
single premium).
death and in case of life. The GMxBs are usually defined in terms of the
amount resulting from the accumulation process (the account value) at some
point of time, compared with a given benchmark (which may be expressed
in terms of the interest rate, a fixed benefit amount, etc.).
One or more GMxBs can be included in the policy as riders
to the basic variable annuity product. A brief description of some GMxBs
follows:
example, the GMWB might guarantee that the policyholder will receive
for 20 years an annual amount equal to 5% of the premiums paid.
Some policies do not allow the policyholder to withdraw money after
the commencement of the annuity payments.
– the random present value of the whole life assurance (with a unitary sum
assured) is $(1+i)^{-T_x}$, and then, according to usual actuarial notation,
the expected present value is
$$\bar{A}_x = E[(1+i)^{-T_x}]$$
– the random present value of the standard endowment is $(1+i)^{-\min\{T_x, n\}}$,
and hence
$$\bar{A}_{x,\overline{n}|} = E[(1+i)^{-\min\{T_x, n\}}]$$
As regards the stochastic approach to actuarial values, see also the sem-
inal contribution by Sverdrup (1952). Mortality risks in life annuities are
analysed by McCrory (1986).
The objectives and main design features of life annuity products are exten-
sively dealt with by Black and Skipper (2000). We have mainly referred
to this textbook in Section 1.6. Various papers and reports have been
recently devoted to innovation in life annuity products, especially address-
ing the impact of longevity risk. See, for example Cardinale et al. (2002),
Department for Work and Pensions (2002), Retirement Choice Working
Party (2001), Richard and Jones (2004), Wadsworth et al. (2001), Swiss
Re (2007), Blake and Hudson (2000). Variable annuities are addressed in
particular by Sun (2006) and O’Malley (2007).
The book by Milevsky (2006) constitutes an updated reference in the
context of life annuities and post-retirement choices.
Great effort has been devoted to the analysis of life annuities from an
economic perspective, in particular in the framework of wealth management
and human life cycle modelling. We only cite the seminal contribution by
Yaari (1965), whereas for other bibliographic suggestions the reader can
refer to Milevsky (2006). The extra yield defined in Section 1.4.1 is the key
element behind the seminal result of Yaari (1965). He shows that a risk
averse, life cycle consumer facing an uncertain time of death would, under
certain assumptions (e.g. the absence of bequest, and the absence of other
sources of randomness), find it optimal to invest 100% of his/her wealth in
an annuity (priced on an actuarially fair basis).
An extensive discussion on the concepts of mutuality and solidarity (how-
ever with some terms used with a meaning different from that adopted in
the present chapter) is provided by Wilkie (1997).
Finally, some references concerning the history of life annuities and the
related actuarial modelling follow. For the early history of life annuities
the reader can refer to Kopf (1926). The paper by Hald (1987) is more
oriented to actuarial aspects, and constitutes an interesting introduction to
the early history of life insurance mathematics. Haberman (1996) provides
extensive information about the history of actuarial science up to 1919,
while in Haberman and Sibbett (1995) the reader can find the reproduc-
tion of a number of milestone papers in actuarial science. The papers by
Pitacco (2004a) and Pitacco (2004c) mainly deal with the evolution of mor-
tality modelling, ranging from Halley’s contributions to the awareness of
longevity risk.
2 The basic mortality model
2.1 Introduction
Some elements of the basic mortality model underlying life insurance, life
annuities and pensions have been already introduced in Chapter 1, while
presenting the structure of life annuities; see in particular Sections 1.2 and
1.3. In Chapter 2, we consider the mortality model in more depth. We adopt
a more structured presentation of the fundamental ideas, which means that
some repetition of elements from Chapter 1 is unavoidable.
However, new concepts are also introduced. In particular, an age-
continuous framework is defined in Section 2.3, in order to provide some
tools needed when dealing with mortality projection models.
Indices summarizing the probability distribution of the lifetime are
described in Section 2.4, whereas parametric models (i.e. mortality ‘laws’)
are presented in Section 2.5. Basic ideas concerning non-parametric gradu-
ation are introduced in Section 2.6. Transforms of the survival function are
briefly addressed in Section 2.7.
Less traditional topics, yet of great importance in the context of life
annuities and mortality forecasts, are dealt with in Sections 2.8 and 2.9,
respectively: mortality at very old ages (i.e. the problem of ‘closing’ the life
table), and the concept of ‘frailty’ as a tool to represent heterogeneity in
populations, due to unobservable risk factors.
A list of references and suggestions for further reading (Section 2.10)
concludes the chapter. As regards references to actuarial and statistical
literature, in order to improve readability we have avoided the use of
citations throughout the text of the first sections of this chapter, namely
the sections devoted to traditional issues. Conversely, important contri-
butions to more recent issues are cited within the text of Sections 2.8
and 2.9.
$$l_{x+1} = l_x\,(1 - q_x) \tag{2.1}$$
assumed in principle, at least when long periods of time are referred to.
Hence, in life insurance applications, the use of period life tables should be
restricted to products involving short or medium durations (5 to 10 years,
say), like term assurances and endowment assurances, whilst it should be
avoided when dealing with life annuities and pension plans. Conversely,
these products require life tables which allow for the anticipated future
mortality trend, namely projected tables constructed on the basis of the
experienced mortality trend.
For any given sequence $l_0, l_1, \dots, l_\omega$ it is usual to define
$$d_x = l_x - l_{x+1}; \quad x = 0, 1, \dots, \omega \tag{2.2}$$
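The recursion (2.1) and the definition (2.2) translate directly into code. The following is a small sketch with made-up death probabilities for a toy table; the function name, the radix, and the numerical values are illustrative assumptions only.

```python
def build_life_table(q, radix=100_000.0):
    """Return the lists l and d from annual death probabilities q[0..omega].

    l[x + 1] = l[x] * (1 - q[x])   as in (2.1)
    d[x]     = l[x] - l[x + 1]     as in (2.2)
    """
    l = [radix]
    for qx in q:
        l.append(l[-1] * (1.0 - qx))
    d = [l[x] - l[x + 1] for x in range(len(q))]
    return l, d

# Hypothetical death probabilities for a toy table with limiting age omega = 4;
# q[omega] = 1, so that nobody survives beyond omega.
q = [0.10, 0.20, 0.40, 0.70, 1.00]
l, d = build_life_table(q)
for x in range(len(q)):
    print(x, round(l[x], 1), round(d[x], 1))
```

By construction the $d_x$'s add up to the radix $l_0$, since every member of the notional cohort dies at some age not exceeding ω.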
Mortality data, and hence life tables, can originate from observations con-
cerning a whole national population, a specific part of a population (e.g.
retired workers, disabled people, etc.), an insurer’s portfolio, and so on.
Life tables constructed on the basis of observations involving a whole
national population (usually split into females and males) are commonly
referred to as population tables.
Market tables are constructed using mortality data arising from a collec-
tion of insurance portfolios and/or pension plans. Usually, distinct tables
are constructed for assurances (i.e. insurance products with a positive sum
at risk, for example term and endowment assurances), annuities purchased
on an individual basis, pensions (i.e. annuities paid to the members of a
pension plan).
The rationale for distinct market tables lies in the fact that mortality levels
may significantly differ as we move from one type of insurance product to
another. The case of different types of life annuities has been discussed in
Section 1.6.5.
Market tables provide experience-based data for premium and reserve
calculations and for the assessment of expected profits. Population tables
can provide a starting point when market tables are not available. More-
over, population tables usually reveal mortality levels higher than those
expressed by market tables and hence are likely to constitute a prudential
$$p_x = 1 - q_x \tag{2.7}$$
Consider, for example, a group of insureds, all age 45, deriving from a pop-
ulation whose mortality can be described by a given life table. Is q45 (drawn
So, the answer to the above question is negative if the insureds have entered
insurance in different years: it is reasonable to expect that an individual,
who has just bought insurance, will be of better health than an individual
who bought insurance several years ago.
Hence, the attained age (45, in the example) should be split as follows:
where the number in square brackets denotes the age at policy issue, whereas
the second number denotes the time since policy issue. In general, $q_{[x]+u}$
denotes the probability of an individual currently aged $x + u$, who bought
insurance at age x, dying within one year.
According to point (b), it is usual to assume:
We denote by $x_{\min}$ and $x_{\max}$ the minimum and, respectively, the maximum
age at entry. The set of sequences (2.16), for $x = x_{\min}, x_{\min}+1, \dots, x_{\max}$, is
called a select table. In particular, the table used after the select period is
called an ultimate life table.
Conversely, life tables in which mortality depends on attained age only (as
is the case for the life tables described in Section 2.2.1) are called aggregate
tables.
Select mortality also concerns life annuities. The person purchasing a life
annuity is likely to be in a state of good health, and hence it is reasonable to
assume that her/his probabilities of death, for a certain period after policy
issue, are lower than the probabilities of other individuals with the same
age. In this case, a self-selection effect works.
Remark The selection effect, due to medical ascertainment (in the case of
insurances with death benefit) or self-selection (in the case of life annuities),
operates during the first years after policy issue, and the related age-pattern
of mortality is often called issue-select. Another type of selection is allowed
for, when some contingency can adversely affect the individual mortality.
For example, in actuarial calculations regarding insurance benefits in the
case of disability, the mortality of disabled policyholders is usually con-
sidered to be dependent on the time elapsed since the time of disablement
inception (as well as on the attained age). In this case, the mortality is called
inception-select.
Suppose that we have to evaluate the survival and death probabilities (like
(2.8), (2.9) and (2.10)) when ages and times are real numbers. Tools other
than the life table (as described in Section 2.2) are then needed.
Assume that the function S(t), called the survival function and defined
for t ≥ 0 as follows:
$$S(t) = P[T_0 > t] \tag{2.17}$$
has been assigned. Clearly, T0 denotes the random lifetime for a new-
born. In the age-continuous framework, it is usual to assume that the
possible outcomes of Tx lie in (0, +∞); nonetheless, we can assume that
the probability measure outside the interval (0, ω) is zero, where ω is the
limiting age.
Consider the probability (2.4); we have
$$P[T_x > h] = P[T_0 > x+h \mid T_0 > x] = \frac{P[T_0 > x+h]}{P[T_0 > x]} \tag{2.18}$$
we then find
$${}_h p_x = \frac{S(x+h)}{S(x)} \tag{2.19}$$
For probability (2.5), via the same reasoning, we obtain
$${}_{h|k} q_x = \frac{S(x+h) - S(x+h+k)}{S(x)} \tag{2.20}$$
and, in particular
$${}_k q_x = \frac{S(x) - S(x+k)}{S(x)} \tag{2.21}$$
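Formulae (2.19)–(2.21) can be applied to any survival function, with ages and durations given by real numbers. As a hedged illustration, the sketch below uses a hypothetical Gompertz-type survival function; the parameter values and function names are assumptions made here for the example.

```python
import math

def S(t, B=0.00003, c=1.1):
    """Hypothetical Gompertz survival function S(t) = P[T_0 > t]."""
    return math.exp(-B * (c**t - 1) / math.log(c))

def p(x, h):
    """h_p_x = S(x + h) / S(x), see (2.19)."""
    return S(x + h) / S(x)

def q_deferred(x, h, k):
    """h|k_q_x = (S(x + h) - S(x + h + k)) / S(x), see (2.20)."""
    return (S(x + h) - S(x + h + k)) / S(x)

def q(x, k):
    """k_q_x = (S(x) - S(x + k)) / S(x), see (2.21)."""
    return q_deferred(x, 0, k)

x = 65.5   # ages and durations may be any non-negative real numbers
print(p(x, 10.25), q(x, 10.25), q_deferred(x, 5.0, 2.5))
```

The same functions also satisfy the familiar identities, for example that the deferred death probability equals the survival probability over the deferred period times the death probability at the attained age.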
Turning back to the life table, we note that, since $l_x$ is the expected number
of people alive at age $x$ out of a cohort initially consisting of $l_0$ individuals, we
have:
$$l_x = l_0\, P[T_0 > x] \tag{2.22}$$
and, in terms of the survival function,
$$l_x = l_0\, S(x) \tag{2.23}$$
(provided that all individuals in the cohort have the same age-pattern of
mortality, described by S(x)). Thus, the lx ’s are proportional to the values
which the survival function takes on integer ages x, and so the life table can
be interpreted as a tabulation of the survival function.
Remark If a mathematical formula has been chosen to express the function
S(t), ‘exact’ survival and death probabilities can be calculated, with ages
and times given by real numbers. Conversely, when the survival function is
tabulated at integer ages only, for example, derived from the life table setting
S(x) = lx /l0 (see (2.23)), approximate methods are needed to calculate
survival and death probabilities at fractional ages. Some of these methods
are described in Section 2.3.5.
[Figure 2.1: the survival function $S(x)$ against age $x$, panels (a) and (b); vertical axis from 0 to 1.]
– the survival curve moves (in a north easterly direction over time) towards
a rectangular shape, and hence the term rectangularization is used to
describe this feature;
– the point of maximum downwards slope of the survival curve progres-
sively moves towards the very old ages; this feature is called the expansion
of the survival function.
$$F_0(t) = {}_t q_0 \tag{2.25}$$
Of course, we have
$$F_0(t) = 1 - S(t) \tag{2.26}$$
The following relation holds between the pdf $f_0(t)$ and the distribution
function $F_0(t)$:
$$F_0(t) = \int_0^t f_0(u)\, du \tag{2.27}$$
Usually it is assumed that, for $t > 0$, the pdf $f_0(t)$ is a continuous function.
Then, we have
$$f_0(t) = \frac{d}{dt} F_0(t) = -\frac{d}{dt} S(t) \tag{2.28}$$
The pdf f0 (t) is frequently called the curve of deaths.
Figure 2.2(a) illustrates the typical behaviour of the pdf f0 (t). Equation
(2.28) justifies the relation between the curve of deaths and the survival
curve (see Fig. 2.1(a)). In particular, we note that the point of maximum
downward slope in the survival curve corresponds to the modal point (at
adult-old ages) in the curve of deaths.
Moving to the remaining lifetime at age x, Tx (x > 0), the following
relations link the distribution function and the pdf of Tx with the analogous
functions relating to T0 :
From functions Fx (t) and fx (t) (and in particular, via (2.29) and
(2.30), from F0 (t) and f0 (t)), all of the probabilities involved in actuarial
[Figure 2.2: (a) the curve of deaths $f_0(x)$ against age $x$; (b) the force of mortality $\mu_x$ against age $x$.]
As clearly appears from (2.36), the survival function S(x) can be obtained
once the force of mortality has been chosen. Clearly, the possibility of
finding a ‘closed’ form for S(x) strictly depends on the structure of µx .
Relations between the force of mortality and the basic mortality functions
relating to an individual age x can be easily found. For example, from (2.34)
and (2.30), we obtain
$$\mu_{x+t} = \frac{f_0(x+t)}{S(x+t)} = \frac{f_x(t)}{1 - F_x(t)} \tag{2.37}$$
and hence
$$f_x(t) = {}_t p_x\, \mu_{x+t} \tag{2.38}$$
$${}_{h|1} q_x = {}_h p_x\, q_{x+h}; \quad h = 0, 1, \dots, \omega$$
$$f_x(t) = {}_t p_x\, \mu_{x+t}; \quad t \geq 0$$
(see (2.38)). The analogy between the right-hand sides of the two expressions
is evident. Note, however, that $f_x(t)$ (as well as $\mu_{x+t}$) does not
represent a probability, the probability of a person age $x$ dying between
age $x + t$ and $x + t + dt$ being given by $f_x(t)\, dt$.
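The interpretation of $f_x(t)\,dt$ as an (approximate) probability can be checked numerically. The sketch below assumes a hypothetical Gompertz force of mortality with made-up parameters; it compares the exact probability of death in a short interval with the approximation given by the density in (2.38).

```python
import math

B, c = 0.00003, 1.1   # hypothetical Gompertz parameters

def mu(y):
    """Force of mortality mu_y = B * c**y (Gompertz form, assumed here)."""
    return B * c**y

def p(x, t):
    """t_p_x = exp(-integral of mu over (x, x + t)), in closed form."""
    return math.exp(-B * c**x * (c**t - 1) / math.log(c))

def f(x, t):
    """Density of T_x from (2.38): f_x(t) = t_p_x * mu_{x+t}."""
    return p(x, t) * mu(x + t)

x, t, dt = 65.0, 12.0, 1e-4
# Probability of dying between ages x + t and x + t + dt, computed exactly ...
exact = p(x, t) - p(x, t + dt)
# ... and approximated by f_x(t) dt
approx = f(x, t) * dt
print(exact, approx)
```

The two numbers agree more closely as `dt` shrinks, whereas `f(x, t)` on its own is a rate per unit of time and may exceed 1 for other parameter choices.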
The behaviour of the force of mortality over the interval (x, x + 1) can be
summarized by the central death rate at age x, which is usually denoted by
mx . The definition is as follows:
$$m_x = \frac{\int_0^1 S(x+u)\, \mu_{x+u}\, du}{\int_0^1 S(x+u)\, du} = \frac{S(x) - S(x+1)}{\int_0^1 S(x+u)\, du} \tag{2.40}$$
$$\tilde{m}_x = \frac{S(x) - S(x+1)}{(S(x) + S(x+1))/2} \tag{2.41}$$
Note that m̃x can also be expressed in terms of the annual probabil-
ity of survival or the annual probability of death. Indeed, from (2.41) we
immediately obtain:
$$\tilde{m}_x = 2\, \frac{1 - p_x}{1 + p_x} = \frac{2\, q_x}{2 - q_x} \tag{2.42}$$
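To see how close the approximation (2.41)–(2.42) is to the central death rate (2.40), both can be computed for a given survival function. The sketch below is illustrative: it assumes a hypothetical Gompertz survival function and evaluates the integral in (2.40) with the composite Simpson rule; names and parameters are not taken from the text.

```python
import math

B, c = 0.00003, 1.1   # hypothetical Gompertz parameters

def S(t):
    """Assumed survival function."""
    return math.exp(-B * (c**t - 1) / math.log(c))

def central_death_rate(x, steps=1000):
    """m_x as in (2.40); the integral of S(x + u) over (0, 1) is evaluated
    by the composite Simpson rule (steps must be even)."""
    h = 1.0 / steps
    total = S(x) + S(x + 1)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * S(x + j * h)
    integral = total * h / 3
    return (S(x) - S(x + 1)) / integral

def central_death_rate_approx(x):
    """m~_x as in (2.42), i.e. with the integral replaced by the
    trapezoidal value (S(x) + S(x + 1)) / 2 as in (2.41)."""
    qx = 1 - S(x + 1) / S(x)
    return 2 * qx / (2 - qx)

for x in (40, 65, 90):
    print(x, central_death_rate(x), central_death_rate_approx(x))
```

Under these assumptions the two rates differ only slightly, the gap widening at the oldest ages where the survival function bends more within the year.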
Assume that a life table (as described in Section 2.2) is available. How to
obtain the survival function for all real ages x, and probabilities of death
and survival for all real ages x and durations t? In what follows, we describe
three approximate methods widely used in actuarial practice:
and assume $S(x) = 0$ for $x > \omega$, and so the survival function is a piecewise
linear function. It is easy to prove that, from (2.43), we obtain in
particular ${}_t q_x = t\, q_x$, that is, a uniform distribution of deaths between
In the age-continuous context, the life expectancy (or expected lifetime) for
a newborn, denoted with ē0 , is defined as follows:
$$\bar{e}_0 = E[T_0] = \int_0^{\infty} t\, f_0(t)\, dt \tag{2.49}$$
The definition can be extended to all (real) ages x. So, the expected
remaining lifetime at age x is given by
$$\bar{e}_x = E[T_x] = \int_0^{\infty} t\, f_x(t)\, dt \tag{2.51}$$
Note that, for an individual age x, the random age at death can be
expressed as x + Tx , and then the expected age at death is given by
x + E[Tx ] = x + ēx (2.53)
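The integrals in (2.49) and (2.51) rarely have a closed form, but they are easy to evaluate numerically. The following sketch assumes a hypothetical Gompertz law with made-up parameters and replaces the infinite upper limit with a finite horizon beyond which survival is negligible; the helper names are introduced here for illustration.

```python
import math

B, c = 0.00003, 1.1   # hypothetical Gompertz parameters

def p(x, t):
    """t_p_x under the assumed Gompertz law."""
    return math.exp(-B * c**x * (c**t - 1) / math.log(c))

def f(x, t):
    """Density of T_x: f_x(t) = t_p_x * mu_{x+t}."""
    return p(x, t) * B * c**(x + t)

def integrate(g, a, b, steps=20000):
    """Composite trapezoidal rule."""
    h = (b - a) / steps
    return h * (0.5 * g(a) + sum(g(a + j * h) for j in range(1, steps)) + 0.5 * g(b))

def life_expectancy(x, horizon=130.0):
    """Integral of t * f_x(t) dt as in (2.51), truncated at a finite horizon."""
    return integrate(lambda t: t * f(x, t), 0.0, horizon)

e0, e65 = life_expectancy(0.0), life_expectancy(65.0)
print(f"e_0  = {e0:.2f}")
print(f"e_65 = {e65:.2f}; expected age at death = {65 + e65:.2f}")   # see (2.53)
```

The expected age at death for an individual aged 65 exceeds the life expectancy at birth, because it is conditional on having survived to 65.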
– the modal value (at adult ages) of the curve of death, Mod[T0 ], also called
the Lexis point;
– the median value of the probability distribution of T0 , Med[T0 ], or
median age at death.
thus, the entropy is minus the mean value of ln S(x), weighted by S(x); it
is possible to prove that, as deaths become more concentrated, the value
of H declines and, in particular, H = 0 if the survival function has a
perfectly rectangular shape.
– As deaths become more concentrated in an increasingly narrow interval,
the slope of the survival curve becomes steeper. A simple variability mea-
sure is thus the maximum downward slope of the graph of S(x) in the
adult and old age range. Thus, a lower variability implies a steeper slope.
Formally, the slope at the point of fastest decline is
$$\max_x \left\{-\frac{d}{dx} S(x)\right\} = \max_x \{S(x)\, \mu_x\} = \max_x \{f_0(x)\} \tag{2.61}$$
Note that the point of fastest decline is Mod[T0 ], that is, the Lexis point.
$${}_{x_1} q_0 = 1 - S(x_1) \tag{2.62}$$
which, for x1 small (say 1, or 5), provides a measure of infant mortality;
– the percentiles of the probability distribution of T0 ; in particular, the
10-th percentile, usually called endurance, is defined as the age ξ such that
S(ξ) = 0.90 (2.63)
– the interquartile range is defined as follows:
$$\mathrm{IQR}[T_0] = x'' - x' \tag{2.64}$$
where $x'$ and $x''$ are respectively the first quartile (the 25-th percentile)
and the third quartile (the 75-th percentile) of the probability distribution
of $T_0$, namely the ages such that $S(x') = 0.75$ and $S(x'') = 0.25$; note that
the IQR decreases as the lifetime distribution becomes less dispersed.
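Markers defined through the survival function, such as the endurance (2.63), the quartiles, and the interquartile range (2.64), can be obtained by inverting $S$ numerically. The sketch below is a minimal illustration under a hypothetical Gompertz law; the bisection routine and all names are assumptions made for the example, and the closed-form Lexis point holds for the Gompertz form only.

```python
import math

B, c = 0.00003, 1.1   # hypothetical Gompertz parameters

def S(x):
    """Assumed survival function."""
    return math.exp(-B * (c**x - 1) / math.log(c))

def age_at_survival_level(level, lo=0.0, hi=150.0, tol=1e-10):
    """Age xi with S(xi) = level, found by bisection (S is decreasing)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if S(mid) > level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

endurance = age_at_survival_level(0.90)   # (2.63)
x1 = age_at_survival_level(0.75)          # first quartile x'
x3 = age_at_survival_level(0.25)          # third quartile x''
iqr = x3 - x1                             # (2.64)
median = age_at_survival_level(0.50)
# For a Gompertz law the density f_0 = S * mu peaks where mu_x = ln(c),
# which gives the Lexis point in closed form.
lexis = math.log(math.log(c) / B) / math.log(c)
print(endurance, x1, median, x3, iqr, lexis)
```

With a tabulated survival function the same bisection could be applied after interpolating between integer ages.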
Probabilities ${}_k p_x$ are derived from the $q_x$'s according to (2.7) and (2.8), and,
in turn, the $q_x$'s are determined as the result of a (recent) period mortality
observation. The quantity $\mathring{e}_x$ is usually called the (complete) period life
expectancy.
The life expectancy drawn from a period life table can be taken as a
reasonable estimate of the remaining lifetime for an individual currently
age x only if we accept the hypothesis that, from now on, the age-pattern
of mortality will remain unchanged. See also the comments in Section 2.2.1
regarding the construction of the life table in terms of lx .
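[Figure: the curve of deaths $f_0(x)$ against age $x$, with the markers $\max\{f_0(x)\}$, ${}_{x_1}q_0$, $x_1$, the endurance, the quartiles $x'$ and $x''$, $\mathrm{IQR}[T_0]$, $\bar{e}_0$, the Lexis point, and $\bar{e}_{65} + 65$.]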
$$\mu_x = B\, c^x \tag{2.66}$$
$$\mu_x = \alpha\, e^{\beta x} \tag{2.67}$$
$$\mu_x = A + H\, x + B\, c^x \tag{2.72}$$
The first term decreases as the age increases and represents the infant mortal-
ity. The second term, which has a ‘Gaussian’ shape, represents the mortality
hump (mainly due to accidents) at young-adult ages. Finally, the third term
(of Gompertz type) represents the senescent mortality.
In 1932 Perks proposed two mortality laws. The first Perks law is as
follows:
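$$\mu_x = \frac{\alpha\, e^{\beta x} + \gamma}{\delta\, e^{\beta x} + 1} \tag{2.74}$$
Conversely, the second Perks law has the following more general structure:
$$\mu_x = \frac{\alpha\, e^{\beta x} + \gamma}{\delta\, e^{\beta x} + e^{-\beta x} + 1} \tag{2.75}$$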
As we will see in Section 2.8, Perks’ laws have an important role in repre-
senting the mortality pattern at very old ages (say, beyond 80); moreover,
the first Perks law can be reinterpreted in the context of the ‘frailty’ models
(see Section 2.9.5).
The Weibull law, proposed in 1951 in the context of reliability theory, is
given by
$$\mu_x = A\, x^B \tag{2.76}$$
or, in equivalent terms:
$$\mu_x = \frac{\alpha}{\beta} \left(\frac{x}{\beta}\right)^{\alpha-1} \tag{2.77}$$
The GM class of models (namely, the Gompertz-Makeham class of
models), proposed by Forfar et al. (1988), has the following structure:
$$\mu_x = \sum_{i=0}^{r-1} \alpha_i\, x^i + \exp\left\{\sum_{j=0}^{s-1} \beta_j\, x^j\right\} \tag{2.78}$$
with the proviso that when r = 0 the polynomial term is absent, and when
s = 0 the exponential term is absent. The general model in the class (2.78) is
usually labelled as GM(r, s). Note that, in particular, GM(0, 2) denotes the
Gompertz law, GM(1, 2) the first Makeham law and GM(2, 2) the second
Makeham law. Models used by the Continuous Mortality Investigation
Bureau in the UK to graduate the force of mortality µx are of the GM(r, s)
type. In particular, models GM(0, 2), GM(2, 2), and GM(1, 3) have been
widely used.
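The GM(r, s) structure lends itself to a compact implementation. The sketch below assumes that the polynomial term has r coefficients and the exponent has s coefficients, with powers starting from zero, so that GM(0, 2) and GM(1, 2) reduce to the Gompertz and first Makeham forms; the function names and parameter values are hypothetical.

```python
import math

def gm_force(x, alphas, betas):
    """Force of mortality of a GM(r, s) model, with r = len(alphas) and
    s = len(betas): a polynomial plus the exponential of a polynomial.
    An empty coefficient list means that the corresponding term is absent."""
    poly = sum(a * x**i for i, a in enumerate(alphas)) if alphas else 0.0
    expo = math.exp(sum(b * x**j for j, b in enumerate(betas))) if betas else 0.0
    return poly + expo

# Hypothetical Makeham-type parameters: mu_x = A + B * c**x
A, B, c = 0.0005, 0.00003, 1.1

def gompertz(x):
    """GM(0, 2): exp(ln B + x ln c) = B * c**x."""
    return gm_force(x, [], [math.log(B), math.log(c)])

def makeham(x):
    """GM(1, 2): A + B * c**x."""
    return gm_force(x, [A], [math.log(B), math.log(c)])

print(gompertz(70), makeham(70))
```

Passing longer coefficient lists gives the higher-order members of the class, for example two polynomial coefficients and two exponent coefficients for GM(2, 2).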
Various mortality laws have been proposed in terms of the annual proba-
bility of death, qx , and in terms of the odds φx (see (2.14)). For example,
Beard proposed in 1971 the following law:
$$q_x = A + \frac{B\, c^x}{E\, c^{-2x} + 1 + D\, c^x} \tag{2.79}$$
$$\phi_x = A - H\, x + B\, c^x \tag{2.80}$$
φx = ePx (2.81)
When various (say, $r$) causes of death are singled out, the force of mortality
$\mu_x$ can be expressed in terms of ‘partial’ forces of mortality, each force
pertaining to a specific cause:
$$\mu_x = \sum_{k=1}^{r} \mu_x^{(k)} \tag{2.88}$$
where $\mu_x^{(k)}$ refers to the k-th cause of death.
Makeham proposed a reinterpretation of his first law (see (2.70)) in terms
of partial forces of mortality. Let
$$A = \sum_{k=1}^{m} A_k \tag{2.89}$$
and
$$B = \sum_{k=m+1}^{m+n} B_k \tag{2.90}$$
whence
$$\mu_x = \sum_{k=1}^{m} A_k + c^x \sum_{k=m+1}^{m+n} B_k = \sum_{k=1}^{m+n} \mu_x^{(k)} \tag{2.91}$$
where
$$\Delta^k y_h = \sum_{i=0}^{k} (-1)^i \binom{k}{i}\, y_{h+k-i} \tag{2.93}$$
– λ is a (constant) parameter.
The first term on the right-hand side of formula (2.92) provides a mea-
sure of the discrepancy between observed and graduated values. The choice
of each weight wh allows us to attribute more or less importance to the
squared deviation related to the h-th observation. In particular, referring
to the graduation of mortality rates, an appropriate choice of the weights
should reflect a low importance attributed to the raw mortality rates con-
cerning very old ages at which few individuals are alive, and hence the
observed values could be affected by erratic behaviour. To this purpose,
the weights can be chosen to be inversely proportional to the estimated
variance of the observed mortality rates.
The second term on the right-hand side of (2.92) quantifies the degree
of roughness in the set of graduated values. Usually, the value of k is
set equal to 2, 3, or 4. Finally, the parameter λ allows us to express our
‘preference’ regarding features of the graduation results: higher values of
λ denote a stronger preference for a smooth behaviour of the graduated
values, whereas lower values express more interest in the fidelity of the
graduated values to the observed ones.
The objective function can be generalized and modified. For example,
it has been proposed to replace, in the first term of the right-hand side of
(2.92), the squared deviations with other powers. As regards the second
term, a mixture of differences of various orders can be used instead of the
k-th differences only.
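As a minimal numerical sketch of this trade-off, assume that the objective function (2.92) is the weighted sum of squared deviations plus λ times the sum of squared k-th differences, as described above. Setting its gradient to zero then gives a linear system for the graduated values. The function names below are illustrative and not taken from any library, and the linear solver is a plain Gaussian elimination written only to keep the example self-contained.

```python
def kth_differences(y, k):
    """Return the k-th forward differences of the sequence y."""
    d = list(y)
    for _ in range(k):
        d = [d[i + 1] - d[i] for i in range(len(d) - 1)]
    return d


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def whittaker_graduate(z, w, lam, k=2):
    """Minimise sum_h w_h (z_h - y_h)^2 + lam * sum_h (Delta^k y_h)^2 over y."""
    n = len(z)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    # cols[j][r] = K[r][j], where K is the (n - k) x n matrix of k-th differences
    cols = [kth_differences(e, k) for e in identity]
    # First-order condition: (W + lam * K'K) y = W z
    a = [[lam * sum(cols[i][r] * cols[j][r] for r in range(n - k))
          + (w[i] if i == j else 0.0) for j in range(n)] for i in range(n)]
    return solve_linear(a, [w[i] * z[i] for i in range(n)])
```

With λ = 0 the raw values are returned unchanged, whereas data lying exactly on a polynomial of degree k − 1 are left untouched for any λ, since their k-th differences vanish.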
2.6.3 Splines
where
$$(x - \xi_h)_+ = \begin{cases} 0; & x < \xi_h \\ x - \xi_h; & x \ge \xi_h \end{cases} \tag{2.97}$$
for h = 1, . . . , m. The corresponding representation of the spline function
is given by:
$$s(x) = \sum_{j=0}^{r} \alpha_j\, x^j + \sum_{h=1}^{m} \beta_h \left[(x - \xi_h)_+\right]^r \tag{2.98}$$
where the αj ’s and the βh ’s are the coefficients of the linear combination.
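The representation (2.98) can be evaluated directly. The following sketch is only illustrative: the function names, the coefficient values and the single knot are hypothetical choices, and the degree r is read off from the number of polynomial coefficients.

```python
def truncated_power(x, knot, r):
    """[(x - knot)_+]^r, with (x - knot)_+ defined as in (2.97)."""
    return (x - knot) ** r if x >= knot else 0.0


def spline_value(x, alphas, betas, knots):
    """Evaluate s(x) as in (2.98); the degree is r = len(alphas) - 1."""
    r = len(alphas) - 1
    poly = sum(a * x ** j for j, a in enumerate(alphas))
    kinks = sum(b * truncated_power(x, xi, r) for b, xi in zip(betas, knots))
    return poly + kinks


# Hypothetical cubic spline with one knot at 2: s(x) = x^3 - [(x - 2)_+]^3
alphas, betas, knots = [0.0, 0.0, 0.0, 1.0], [-1.0], [2.0]
values = [spline_value(x, alphas, betas, knots) for x in (1.0, 2.0, 3.0)]
```

To the left of the knot the spline coincides with the polynomial part; to the right, the truncated power term switches on without breaking continuity at the knot.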
If d is the dimension of the space, then any basis consists of d elements.
We denote by b1 , b2 , . . . , bd a basis. Hence, any spline s in the space can be
represented as a linear combination of these functions, namely
where the wh ’s are positive weights. Using (2.99) to express the spline func-
tion, our best-fit problem can be stated as follows: find the coefficients
γ1 , γ2 , . . . , γd which minimize the function
$$G(\gamma_1, \gamma_2, \ldots, \gamma_d) = \sum_{h=1}^{n} w_h \left[\sum_{j=1}^{d} \gamma_j\, b_j(x_h) - z_h\right]^2 \tag{2.102}$$
is zero outside a given short interval, the matrix involved by solving the
related set of simultaneous equations has many entries equal to zero, and
this improves the tractability of the best-fit problem.
Spline functions can be introduced by adopting a different approach,
namely the ‘variational approach’. Following Champion et al. (2004), we
start by defining an interpolation problem. Assume that we need to find a
function f interpolating the n data points (x1 , z1 ), (x2 , z2 ), . . . , (xn , zn ), that
is, such that
f (xh ) = zh ; h = 1, 2, . . . , n (2.103)
Clearly, [f ] generalizes the functional (2.104). The first term on the right-
hand side of (2.105) provides a measure of the discrepancy between the data
zh ’s and the graduated values f (xh )’s, whereas the second term can be inter-
preted as a measure of smoothness. The parameter λ allows us to express
our preference in the trade-off between closeness to data and smoothness.
The analogy with the structure of formula (2.92) is self-evident.
It can be proved that, among all functions f with continuous second
derivatives, there is a unique function which minimizes the functional
(2.105).
Finally, it is worth noting that the spline functions so far dealt with are
‘univariate’ splines, as their domains consist of intervals of real numbers.
Extension to a bivariate context is possible; an example will be presented
in Section 5.4, together with the more general concept of P-splines (namely,
‘Penalized’ splines).
2.7 Some transforms of the survival function 73
[Figures: three two-panel plots, (a) and (b), against age, comparing the parameter choice a = 0, b = 1 with, respectively, a = −0.2 or 0.2 and b = 1; a = 0 and b = 1.25 or 0.75; a = −0.2 and b = 1.25.]
where ω denotes, as usual, the limiting age. Thus, the transform is the ratio
of the average annual probability of death beyond age x to the average
annual probability of death prior to age x (both probabilities being referred
to a newborn).
[Figure: force of mortality µx plotted against age x; curves labelled Gompertz–Makeham–Thiele, Lindbergson, and logistic.]
in many countries, and provide stronger evidence about the shape of the
mortality curve at old and very old ages.
In particular, it has been observed that the force of mortality is slowly
increasing at very old ages, approaching a rather flat shape. In other words,
the exponential rate of mortality increase at very old ages is not constant,
as for example in Gompertz’s law (see (2.66)), but declines (see Fig. 2.7).
However, a basic problem arises when discussing the appropriateness of
mortality laws in representing the pattern of mortality at old ages: ‘what’
force of mortality are we dealing with? We will return to this important
issue in Section 2.9.3.
As classical mortality laws may fail in representing the very old-age mor-
tality, shifting from the exponential assumption may be necessary in order
to fit the relevant pattern of mortality.
In Perks’ laws (see (2.74) and (2.75)), the denominators have the effect
of reducing the mortality especially at old and very old ages. In particular
the graph of the first law is a logistic curve.
76 2 : The basic mortality model
The logistic model for the force of mortality proposed by Thatcher (1999)
assumes that
$$\mu_x = \frac{\delta\,\alpha\,e^{\beta x}}{1 + \alpha\,e^{\beta x}} + \gamma \tag{2.110}$$
Its simplified version, used in particular for studying long-term trends and
forecasting mortality at very old ages, has δ = 1 and hence has only three
parameters, namely α, β, and γ:
$$\mu_x = \frac{\alpha\,e^{\beta x}}{1 + \alpha\,e^{\beta x}} + \gamma \tag{2.111}$$
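A short sketch may help to see the two regimes of the logistic model: at adult ages the force of mortality grows almost exponentially, whereas at extreme ages it levels off at δ + γ. The function name and the parameter values used below are hypothetical and chosen only for illustration.

```python
import math


def logistic_force(x, alpha, beta, gamma, delta=1.0):
    """Force of mortality as in (2.110); delta = 1 gives the simplified form (2.111)."""
    e = alpha * math.exp(beta * x)
    return delta * e / (1.0 + e) + gamma


# Hypothetical parameter values, for illustration only
alpha, beta, gamma = 1e-5, 0.1, 0.001
young = logistic_force(30, alpha, beta, gamma)     # close to gamma + alpha * exp(beta * 30)
plateau = logistic_force(300, alpha, beta, gamma)  # close to 1 + gamma
```

The age 300 is of course not meant literally; it is used only to display the limiting value of the curve.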
The model proposed by Coale and Kisker (see Coale and Kisker (1990))
relies on the so-called exponential age-specific rate of change of central
death rates, defined as follows:
$$k_x = \ln \frac{m_x}{m_{x-1}} \tag{2.113}$$
$$m_x = \exp(a\,x^2 + b\,x + c) \tag{2.116}$$
Let us index with (S) standard mortality and with (D) a different (higher
or lower) mortality. Below, some examples of differential mortality models
follow.
In any case, x is the current age and t the time elapsed since policy issue
(t ≥ 0), whence x − t is the age at policy issue.
2.9 Heterogeneity in mortality models 79
Models (2.117) and (2.118) are usually adopted for substandard risks.
Letting a = 1 and $b = \delta\, q_{x-t}^{(S)}$, δ > 0, in (2.117) ($b = \delta\, \mu_{x-t}^{(S)}$ in (2.118)) the
so-called additive model is obtained, where the increase in mortality depends
on initial age. An alternative model is obtained choosing b = θ, θ > 0, that
is, a mortality increase which is constant and independent of the initial age;
such a model is consistent with extra-mortality due to accidents (related
either to occupation or to extreme sports). Letting a = 1 + γ, γ > 0, and
b = 0 the so-called multiplicative model is derived, where the mortality
increase depends on current age. When risk factors are only temporarily
effective (e.g. some diseases which either lead to an early death or have a
short recovery time), the extra-mortality may apply only up to some proper
time τ; for t > τ, standard mortality is assumed, so that a = 1 and b = 0.
Models (2.119) and (2.120) are very common in actuarial practice, both
for substandard and preferred risks, due to their simplicity; they are called
age rating or age shifting models. Model (2.120), in particular, can be
formally justified, assuming the Gompertz law for the standard force of
mortality and the multiplicative model for differential mortality. Actually,
if $\mu_x^{(S)} = \alpha\,e^{\beta x}$ (see (2.67)), we have from (2.118), with a = 1 + γ and b = 0,
$$\mu_x^{(D)} = (1 + \gamma)\,\alpha\,e^{\beta x} = \alpha\,e^{\beta(x+z)} = \mu_{x+z}^{(S)} \tag{2.125}$$
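The age shift implied by (2.125) follows from equating (1 + γ) with e^{βz}, that is, z = ln(1 + γ)/β. The following sketch checks this numerically; the function name and the Gompertz parameter values are hypothetical.

```python
import math


def age_shift(gamma, beta):
    """Age shift z solving (1 + gamma) = exp(beta * z), as implied by (2.125)."""
    return math.log(1.0 + gamma) / beta


# Hypothetical values: 50% extra-mortality, Gompertz parameters alpha and beta
alpha, beta, gamma = 1e-4, 0.1, 0.5
z = age_shift(gamma, beta)                       # about 4.05 years
x = 60
substandard = (1.0 + gamma) * alpha * math.exp(beta * x)
shifted = alpha * math.exp(beta * (x + z))       # standard force at age x + z
```

Note that z does not depend on the age x, which is what makes a single age shift an exact substitute for the multiplicative model under the Gompertz law.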
The second approach is most interesting. We will deal with this approach
only. In the following discussion, the term heterogeneity refers to unob-
servable risk factors only; in respect of the observable risk factors, the
population is instead assumed to be homogeneous.
In order to develop a continuous model for heterogeneity, a proper
characterization of the unobservable risk factors must be introduced. In
their seminal paper, Vaupel et al. (1979) extend the earlier work of Beard
(1959, 1971) and define the frailty as a non-negative quantity whose level
expresses the unobservable risk factors affecting individual mortality. The
underlying idea is that those people with a higher frailty die on average
earlier than others. Several models can be developed, which are susceptible
to interesting actuarial applications.
With reference to a population (defined at age 0, and as such closed to new
entrants), we consider people currently aged x. They represent a heterogeneous
group, because of the unobservable factors. Let us assume that, for any
individual, such factors are summarized by a non-negative variable, viz the
frailty. The specific value of the frailty of the individual does not change
over time, but remains unknown. On the contrary, because of deaths, the
distribution of people in respect of frailty does change with age, given that
people with low frailty are expected to live longer; we denote by Zx the
random frailty at age x, for which a continuous probability distribution
Note that the pdf of Zx is given by the pdf of Z0 , adjusted by the ratio
S(x|z)/S̄(x) which updates at age x the proportion of people with frailty
z. It is also interesting to stress that the assessment of gx (z) is based on
an update of g0 (z) with regard to the number of survivors with frailty z
compared to what would be expected over the whole population.
We define the average force of mortality in the population as
$$\bar\mu_x = \frac{\int_0^\infty \mu_x(z)\, S(x|z)\, g_0(z)\, dz}{\int_0^\infty S(x|z)\, g_0(z)\, dz} = \frac{\int_0^\infty h_0(x, z)\, dz}{\bar S(x)} \tag{2.138}$$
that is, to
$$\bar\mu_x = \mu_x\, \bar z_x \tag{2.140}$$
where $\bar z_x = \int_0^\infty z\, g_x(z)\, dz = E[Z_x]$ represents the expected frailty at age x.
Note that the average force of mortality coincides with the standard one
only if z̄x = 1. A similar relation holds for model (2.130): we easily find
µ̄x = b + µx z̄x .
It is easy to show that
$$\frac{d}{dx}\, \bar z_x = -\mu_x\, \mathrm{Var}[Z_x] < 0 \tag{2.141}$$
Then, according to (2.140), µ̄x varies less rapidly than µx . This is due
to the fact that those with a high frailty die earlier, therefore leading to a
reduction of z̄x with age. If one disregards the presence of heterogeneity,
on average an underestimation of the force of mortality follows when one
cohort only is addressed.
We have in particular
$$E[Z_0] = \bar z_0 = \frac{\delta}{\theta} \tag{2.143}$$
$$\mathrm{Var}[Z_0] = \frac{\delta}{\theta^2} \tag{2.144}$$
The coefficient of variation of Z0
$$\mathrm{CV}[Z_0] = \frac{\sqrt{\mathrm{Var}[Z_0]}}{E[Z_0]} = \frac{1}{\sqrt{\delta}} \tag{2.145}$$
shows that δ plays the role of measuring, in relative terms, the level of
heterogeneity in the population. If δ → ∞, then CV[Z0] → 0, that is, the
which is the pdf of a random variable Gamma(δ, θ+H(x)). Thus, the Gamma
distribution has a self-replicating property, and the relevant parameters
need to be chosen with reference to the distribution at age 0.
So it follows that
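$$E[Z_x] = \bar z_x = \frac{\delta}{\theta + H(x)} \tag{2.149}$$
$$\mathrm{Var}[Z_x] = \frac{\delta}{(\theta + H(x))^2} \tag{2.150}$$
$$\mathrm{CV}[Z_x] = \frac{\sqrt{\mathrm{Var}[Z_x]}}{E[Z_x]} = \frac{1}{\sqrt{\delta}} \tag{2.151}$$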
Note that whilst the expected value of the frailty reduces with age, its relative
variability remains constant.
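The behaviour of the moments (2.149)–(2.151) can be illustrated with a few lines of code. The sketch below takes the cumulative standard force of mortality H(x) as a plain number; the function name and the parameter values are hypothetical.

```python
def frailty_moments(delta, theta, cum_hazard):
    """Mean, variance and coefficient of variation of Z_x ~ Gamma(delta, theta + H(x))."""
    rate = theta + cum_hazard
    mean = delta / rate
    var = delta / rate ** 2
    return mean, var, var ** 0.5 / mean


# Hypothetical parameters with delta = theta, so that the expected frailty at age 0 is 1
at_birth = frailty_moments(4.0, 4.0, 0.0)   # H(0) = 0
later = frailty_moments(4.0, 4.0, 4.0)      # an age at which H(x) = 4
```

In this example the expected frailty halves between the two ages, while the coefficient of variation stays at 1/√δ = 0.5.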
We can give an interesting interpretation for the average survival
function. Rearranging (2.146) we find
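$$\bar S(x) = \left(\frac{\theta}{\delta}\, \frac{\delta}{\theta + H(x)}\right)^{\delta} = \left(\frac{\bar z_x}{\bar z_0}\right)^{\delta} \tag{2.152}$$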
and then we argue that the average survival function at age x, that is, the
average probability of newborns attaining age x, depends on the compar-
ison between the expected frailty level at age x and age 0; this result is
independent of the particular mortality law that we adopt for the standard
force of mortality, which actually has not yet been introduced, and is simply
due to the properties of the Gamma distribution.
The population force of mortality is
$$\bar\mu_x = \mu_x\, \frac{\delta}{\theta + H(x)} \tag{2.153}$$
Usually, the initial values of the parameters of the Gamma distribution are
chosen so that z̄0 = 1, that is, θ = δ. So we have
$$\bar\mu_x = \mu_x\, \frac{\delta}{\delta + H(x)} \tag{2.154}$$
Referring to adult ages, we can assume the Gompertz law (see (2.67)) for
describing the standard force of mortality. So the cumulative standard force
of mortality is
$$H(x) = \int_0^x \alpha\, e^{\beta t}\, dt = \frac{\alpha}{\beta}\, (e^{\beta x} - 1) \tag{2.155}$$
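Rearrange as
$$\bar\mu_x = \frac{1}{\theta - \frac{\alpha}{\beta}}\; \frac{\alpha\,\delta\, e^{\beta x}}{1 + \frac{\alpha}{\beta\theta - \alpha}\, e^{\beta x}} \tag{2.157}$$
Let $\frac{\alpha\delta}{\theta - (\alpha/\beta)} = \alpha'$, $\frac{\alpha}{\beta\theta - \alpha} = \delta'$; so
$$\bar\mu_x = \frac{\alpha'\, e^{\beta x}}{1 + \delta'\, e^{\beta x}} \tag{2.158}$$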
which is the first Perks law (see (2.74)), with γ = 0. Hence, (2.156) has a
logistic shape; see Fig. 2.8.
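The equivalence between the Gamma-frailty average force of mortality and the Perks form can be checked numerically. The sketch below computes the population force of mortality once from (2.153) with the Gompertz cumulative force (2.155), and once from (2.158) with the two constants defined just before it. Function names and parameter values are hypothetical, and βθ > α is assumed so that both constants are positive.

```python
import math


def population_force(x, alpha, beta, delta, theta):
    """Average force of mortality (2.153), with Gompertz standard force and H(x) from (2.155)."""
    h = alpha / beta * (math.exp(beta * x) - 1.0)
    return alpha * math.exp(beta * x) * delta / (theta + h)


def perks_form(x, alpha, beta, delta, theta):
    """The same quantity written in the logistic form (2.158)."""
    a1 = alpha * delta / (theta - alpha / beta)   # first constant before (2.158)
    d1 = alpha / (beta * theta - alpha)           # second constant before (2.158)
    e = math.exp(beta * x)
    return a1 * e / (1.0 + d1 * e)


# Hypothetical parameters, with delta = theta so that the mean frailty at age 0 is 1
params = (1e-4, 0.1, 5.0, 5.0)
pairs = [(population_force(x, *params), perks_form(x, *params)) for x in (0, 40, 80, 120)]
```

Since the ratio of the two constants equals βδ, the average force of mortality flattens towards βδ at extreme ages, while the individual Gompertz force keeps increasing exponentially.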
The logistic model for describing mortality within a heterogeneous pop-
ulation may be built also adopting a different approach (see Cummins
et al. (1983); Beard (1971)). With reference to a heterogeneous population,
assume that the individual force of mortality is Gompertz, with unknown
‘base’ mortality; hence
µx = A eβx (2.159)
where A (the parameter for base mortality) is a random quantity, specific to
the individual, whilst β (the parameter for senescent mortality) is common
to all individuals and known. Let ϕ(a) denote the pdf of A; the population
force of mortality is then
$$\bar\mu_x = \int_0^\infty a\, e^{\beta x}\, \varphi(a)\, da = e^{\beta x}\, E[A] \tag{2.160}$$
[Figure: left panel, two curves labelled x = 0 and x = 85, horizontal axis from 0.4 to 1.6; right panel, curves labelled Gompertz and Perks, ages 65 to 115.]
$$\bar\mu_x = \frac{\alpha\, e^{\beta x}}{1 + \delta\, e^{\beta x}} \tag{2.162}$$
which is still a particular case of (2.74), with γ = 0. Note, however, that
this choice implies that the probability distribution of A depends on age.
What we have just described can be easily classified under the multiplica-
tive frailty model. Actually, if A in (2.159) is replaced with αz (with α certain
and z random), one finds (2.128). The Perks model then follows by choosing
a Gamma distribution for Z0 , with appropriate parameters. However, this
approach is less elegant than that proposed by Vaupel et al. (1979), given
that in (2.159) the distribution of A is not forced to depend on age. Actually,
the multiplicative model allows for extensions and generalizations; further,
it does not require a Gompertz force of mortality.
3.1 Introduction
Life expectancy at birth among early humans was likely to be between 20
and 30 years, as testified by evidence that has been gleaned from tombstone
inscriptions, genealogical records, and skeletal remains. Around 1750, the
first national population data began being collected in the Nordic countries.
At that time, life expectancy at birth was around 35–40 years in the more
developed countries. It then rose to about 40–45 by the mid-1800s. Rapid
improvements began at the end of the 19th century, so that, by the middle
of the 20th century it was approximately 60–65 years. By the beginning of
the 21st century, life expectancy at birth had reached about 70 years. The
average life span has thus roughly tripled over the course of human history.
Much of this increase has happened in the past 150 years: the 20th century
has been characterized by a huge increase in average longevity compared
to all of the previous centuries. Broadly speaking, the average life span
increased by 25 years in the 10,000 years before 1850. Another 25-year
increase took place between 1850 and 2000. And there is no evidence that
improvements in longevity are tending to slow down.
The first half of the 20th century saw significant improvement in
the mortality of infants and children (and their mothers) resulting from
improvements to public health and nutrition that helped to withstand infec-
tious diseases. Since the middle of the 20th century, gains in life expectancy
have been due more to medical factors that have reduced mortality among
older persons. Reductions in deaths due to the ‘big three’ killers (cardio-
vascular disease, cancer, and strokes) have gradually taken place, and life
expectancy continues to improve.
The population of the industrialized world underwent a major mortality
transition over the course of the 20th century. In recent decades, the pop-
ulations of developed countries have grown considerably older, because of
two factors – increasing survival to older ages as well as the smaller numbers
90 3 : Mortality trends during the 20th century
of births (the so-called ‘baby bust’ which started in the 1970s). In this new
demographic context, questions about the future of human longevity have
acquired a special significance for public policy and fiscal planning. In par-
ticular, social security systems, which in many industrialized countries are
organized according to the pay-as-you-go method, are threatened by the
ageing of the population due to the baby bust combined with the increase in
life expectancy. As a consequence, many nations are discussing adjustments
or deeper reforms to address this problem.
Thus, mortality is a dynamic process and actuaries need appropriate tools
to forecast future longevity. We believe that any sound procedure for pro-
jecting mortality must begin with a careful analysis of past trends. This
chapter aims to illustrate the observed decline in mortality, on the basis
of Belgian mortality statistics. The mortality experience during the 20th cen-
tury is carefully studied by means of several demographic indicators which
have been introduced in Chapter 2. Specifically, after having presented the
different sources of mortality statistics, we compute age-specific death rates,
life expectancies, median lifetimes and interquartile ranges, inter alia, as well
as survival curves. We also compare statistics gathered by the insurance
regulatory authorities with general population figures in order to measure
adverse selection. A comparison between the mortality experience of some
EU member countries is performed in Section 3.5.
Before proceeding, let us say a few words about the notation used in
this chapter. Here, we analyse mortality in an age-period framework. This
means that we use two dimensions: age and calendar time. Both age and
calendar time can be either discrete or continuous variables. In discrete
terms, a person aged x, x = 0, 1, 2, . . ., has an exact age comprised between
x and x+1. This concept is also known as ‘age last birthday’ (i.e., the age of
an individual as a whole number of years, by rounding down to the age at
the most recent birthday). Similarly, an event that occurs in calendar year
t occurs during the time interval [t, t + 1]. This two-dimension setting is
formally defined in Section 4.2.1; see Table 4.1. Otherwise, we follow the
notation introduced in the previous chapters.
The Human Mortality Database (HMD) was launched in May 2002 to provide
detailed mortality and population data to those interested in the history
of human longevity. It has been put together by the Department of Demog-
raphy at the University of California, Berkeley, USA, and the Max Planck
Institute for Demographic Research in Rostock, Germany. It is freely avail-
able at http://www.mortality.org and provides a highly valuable source of
mortality statistics.
HMD contains original calculations of death rates and life tables for
national populations, as well as the raw data used in constructing those
tables. The HMD includes life tables provided by single years of age up to
109, with an open age interval for 110+. These period life tables represent
the mortality conditions at a specific moment in time. We refer readers
to the methods protocol available from the HMD website for a detailed
exposition of the data processing and table construction.
For Belgium, data were compiled by Dana Glei, Isabelle Devos and Michel
Poulain. They cover the period starting in 1841 and ending in 2005. How-
ever, data are missing during World War I. This is why we have decided to
restrict the study conducted in this chapter to the period 1920–2005.
As explained in Section 2.2, life table analyses are based upon an analytical
framework in which death is viewed as an event whose occurrence is prob-
abilistic in nature. Life tables create a hypothetical cohort (or group) of,
say, 100,000 persons at age 0 (usually of males and females separately) and
subject it to age-gender-specific annual death probabilities (the number of
deaths per 1,000 or 10,000 or 100,000 persons of a given age and gender)
observed in a given population. In doing this, researchers can trace how the
100,000 hypothetical persons (called a synthetic cohort) would shrink in
numbers due to deaths as the group ages.
As stressed in Section 2.2.1, there are two basic types of life tables: period
life tables and cohort life tables. A period life table represents the mortality
experience of a population during a relatively short period of time, usually
between one and three years. Life tables based on population data are gen-
erally constructed as period life tables because death and population data
are most readily available on a time period basis. Such tables are useful
in analysing changes in the mortality experienced by a population through
time. These are the tables used in the present chapter.
We analyse the changes in mortality as a function of both age x and cal-
endar time t. This is the so-called age-period approach. In this chapter, we
assume that the age-specific forces of mortality are constant within bands
of age and time, but are allowed to vary from one band to the next. This
extends to a dynamic setting the constant force of mortality assumption
(b) in Section 2.3.5.
Specifically, let us denote as Tx (t) the remaining lifetime of an individual
aged x at time t. Compared to Section 2.2.3, we supplement the notation
Tx for the remaining lifetime of an x-aged individual with an extra index
t representing calendar time. This individual will die at age x + Tx (t) in
year t + Tx (t). Then, qx (t) is the probability that an x-aged individual in
calendar year t dies before reaching age x + 1, that is, qx (t) = P[Tx (t) ≤ 1].
Similarly, px (t) = 1 − qx (t) is the probability that an x-aged individual in
calendar year t reaches age x + 1, that is, px (t) = P[Tx (t) > 1].
The force of mortality µx (t) at age x and time t is formally defined as
$$\mu_x(t) = \lim_{\Delta \searrow 0} \frac{P[x < T_0(t - x) \le x + \Delta \mid T_0(t - x) > x]}{\Delta} \tag{3.1}$$
Compare (3.1) to (2.32)–(2.34). Now, given any integer age x and calendar
year t, we assume that
$$\mu_{x+\xi_1}(t + \xi_2) = \mu_x(t) \quad \text{for } 0 \le \xi_1, \xi_2 < 1 \tag{3.2}$$
This is best illustrated with the aid of a coordinate system that has calendar
time as abscissa and age as ordinate, as in Fig. 3.1. Such a representation
is called a Lexis diagram after the German demographer who introduced
it. Both time scales are divided into yearly bands, which partition the Lexis
plane into square segments. Formula (3.2) assumes that the mortality rate
is constant within each square, but allows it to vary between squares; see
Fig. 3.1 for a graphical interpretation. Since life tables do not include mor-
tality measures at non-integral ages or for non-integral durations, (3.2) can
also be seen as a convenient interpolation method to expand a life table for
estimating such values.
Under (3.2), we have for integer age x and calendar year t that
$$p_x(t) = \exp\left(-\int_0^1 \mu_{x+\xi}(t + \xi)\, d\xi\right) = \exp(-\mu_x(t)) \tag{3.3}$$
Figure 3.1. Illustration of the basic assumption (3.2) with a Lexis diagram.
3.3 Mortality trends in the general population 95
which extends (2.36). For durations s less than 1 year, we have under
assumption (3.2) that
$${}_s p_x(t) = \exp\left(-\int_0^s \mu_{x+\xi}(t + \xi)\, d\xi\right) = \exp(-s\, \mu_x(t)) = \bigl(p_x(t)\bigr)^s \tag{3.4}$$
Moreover, the forces of mortality and the central death rates (see Section
2.3.4 for formal definitions) coincide under (3.2), that is, µx (t) = mx (t).
This makes statistical inference much easier since rates are estimated by
dividing the number of occurrences of a selected demographic event in a
(sub-) population by the corresponding number of person-years at risk (see
next section).
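As a small illustration of how assumption (3.2) expands a life table to fractional durations, the following sketch recovers the constant force of mortality from a one-year death probability via (3.3) and then applies (3.4). The function name and the numerical value of the death probability are hypothetical.

```python
import math


def survival_prob(q, s=1.0):
    """s-year survival probability, 0 <= s <= 1, under (3.2), starting from q_x(t)."""
    mu = -math.log(1.0 - q)    # constant force within the square, from (3.3)
    return math.exp(-s * mu)   # equals (1 - q) ** s, as in (3.4)


q = 0.02                                  # hypothetical one-year death probability
half_year = survival_prob(q, 0.5)         # survival over half a year
full_year = half_year * half_year         # two half-years give back 1 - q
```

The multiplicative consistency over sub-periods is the reason why the constant-force assumption is a convenient interpolation rule.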
3.3.2 Exposure-to-risk
When working with death rates, the appropriate notion of risk exposure
is the person-years of exposure, called the (central) exposure-to-risk in
the actuarial literature. The exposure-to-risk refers to the total number of
‘person-years’ in a population over a calendar year. It is similar to the aver-
age number of individuals in the population over a calendar year adjusted
for the length of time they are in the population.
Let us denote as ETRxt the exposure-to-risk at age x last birthday during
year t, that is, the total time lived by people aged x last birthday in calendar
year t. There is an easy expression for the average exposure-to-risk that is
valid under (3.2). As in (1.45), let Lxt be the number of individuals aged x
last birthday on January 1 of year t. Then,
$$E[\mathrm{ETR}_{xt} \mid L_{xt} = l] = l \int_{\xi=0}^{1} \bigl(p_x(t)\bigr)^{\xi}\, d\xi = \frac{l}{\mu_x(t)}\, \bigl(1 - p_x(t)\bigr) = \frac{-l\, q_x(t)}{\ln(1 - q_x(t))} \tag{3.5}$$
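The closed form in (3.5) can be checked against a direct numerical integration of the survival curve within the year. The sketch below uses a midpoint rule; the function names, the population size and the death probability are hypothetical, and 0 < q < 1 is assumed so that the logarithm is well defined and non-zero.

```python
import math


def expected_exposure(l, q):
    """Expected person-years lived during the year by l individuals, as in (3.5)."""
    return -l * q / math.log(1.0 - q)


def exposure_by_integration(l, q, steps=10000):
    """Midpoint-rule approximation of l times the integral of (1 - q)^xi over [0, 1]."""
    return sum(l * (1.0 - q) ** ((i + 0.5) / steps) for i in range(steps)) / steps


closed_form = expected_exposure(1000, 0.1)        # hypothetical l and q
numerical = exposure_by_integration(1000, 0.1)
```

For small death probabilities the result is close to l(1 − q/2), the familiar approximation obtained by assuming that deaths occur on average in the middle of the year.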
Hence, provided the population size is large enough, we get the
approximation
$$\mathrm{ETR}_{xt} \approx \frac{-L_{xt}\, q_x(t)}{\ln(1 - q_x(t))} \tag{3.6}$$
that can be used to reconstitute the ETRxt ’s from the Lxt ’s and the qx (t)’s in
the case where the ETRxt ’s are not readily available. This formula appears
Note that the method of recording the calendar year of death and the age
last birthday at death means that the death counts Dxt cover individuals
born on January 1 in calendar year t−x−1 through December 31 in calendar
year t − x (i.e., two successive calendar years) with a peak representation
around January 1 in calendar year t − x.
Under the assumption (3.2) and using (3.3), the contribution of individual
i to the likelihood may be written as
if he survives, and
If the individual lifetimes are mutually independent, the likelihood for the
Lxt individuals aged x is then equal to
$$\mathcal{L}\bigl(\mu_x(t)\bigr) = \prod_{i=1}^{L_{xt}} \exp(-\tau_i\, \mu_x(t))\, \bigl(\mu_x(t)\bigr)^{\delta_i}$$
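A likelihood of this form is maximized in closed form: taking logarithms gives −µ Σ τ_i + (Σ δ_i) ln µ, whose maximum is attained at the ratio Σ δ_i / Σ τ_i, that is, deaths divided by exposure. The sketch below assumes, as is usual for likelihoods of this kind, that τ_i is the time lived by individual i at age x during year t and that δ_i equals 1 if the individual dies and 0 otherwise; the function names and the sample values are hypothetical.

```python
import math


def log_likelihood(mu, taus, deltas):
    """Logarithm of the product of exp(-tau_i * mu) * mu ** delta_i over individuals."""
    return sum(-t * mu + d * math.log(mu) for t, d in zip(taus, deltas))


def mle_force(taus, deltas):
    """Maximum likelihood estimate: number of deaths over total time lived."""
    return sum(deltas) / sum(taus)


# Hypothetical sample: five individuals, two of whom die during the year
taus = [1.0, 1.0, 0.5, 1.0, 0.25]
deltas = [0, 0, 1, 0, 1]
mu_hat = mle_force(taus, deltas)   # 2 deaths over 3.75 person-years
```

Evaluating the log-likelihood slightly above and below the estimate gives lower values, as expected at a maximum.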
Figure 3.2. Death rates (on the log scale) for Belgian males (top panel) and Belgian females (bottom panel) from period life tables 1880–1890, 1928–1932, 1968–1972, and 2000–2002. Source: Statistics Belgium.
The trend in the logarithm of the $\hat m_x(t)$’s for some selected ages is depicted
in Figs 3.3 and 3.4. An examination of Fig. 3.3 reveals distinct behaviours
for age-specific death rates affecting Belgian males. At age 20, a rapid reduction
took place after a peak which occurred in the early 1940s due to World
War II. A structural break seems to have occurred, with a relatively high
level of mortality before World War II, and a much lower level after 1950.
Since the mid-1950s, only modest improvements have occurred for the
$\hat m_{20}(t)$’s. This is typical for ages around the accident hump, where male
mortality has not really decreased since the 1970s. At age 40, the same
decrease after World War II is apparent, followed by a much slower reduction
after 1960. The decrease after 1970 is nevertheless more marked than
at age 20. At ages 60 and 80, mortality rates have declined rapidly after
1970, whereas the decrease during 1920–1970 was rather moderate. We
note that the effect of World War II is much more important at younger
ages than at older ages. This clearly shows that gains in longevity have been
concentrated on younger ages during the first half of the 20th century, and
have then moved to older ages after 1950.
Figure 3.3. Trend in observed death rates (on the log scale) for Belgian males at ages 20, 40, 60, and 80, period 1920–2005. Source: HMD.
Figure 3.4. Trend in observed death rates (on the log scale) for Belgian females at ages 20, 40, 60, and 80, period 1920–2005. Source: HMD.
The analysis for Belgian females illustrated in Fig. 3.4 parallels that for
males for ages 20 and 40, but with several differences. At age 20, modest
improvements are visible after the mid-1950s. At age 40, more pronounced
reductions occurred after 1960. At older ages, the rate of decrease is more
regular, and has tended to accelerate after 1980.
This acceleration is a feature seen in a number of Western European
countries. Kannisto et al. (1994) report an acceleration in the late 1970s in
the rate of decrease of mortality rates at ages over 80 in an analysis of mortality
rates for 9 European countries with reliable mortality data at these ages over
an extended period.
Figure 3.5. Observed death rates (on the log scale) for Belgian males (top panel) and Belgian
females (bottom panel), ages 0 to 109, period 1920–2005. Source: HMD.
At higher ages (above 80), death rates displayed in Fig. 3.5 appear rather
smooth. This is a consequence of the smoothing procedure implemented
in HMD. Death rates for ages 80 and above were estimated according
to the logistic formula and were then combined with death rates from
younger ages in order to reconstitute life tables. To have an idea of the
behaviour of mortality rates at the higher ages, we have plotted in Fig. 3.6
the rough death rates observed for the Belgian population. As discussed
in Section 2.8, we clearly see from Fig. 3.6 that data at old ages produce
suspect results (because of small risk exposures): the pattern at old and
very old ages is heavily affected by random fluctuations because of the
scarcity of data. Sometimes, data above some high age are not available
at all.
Recently, some in-depth demographic studies have provided sounder
knowledge about the slope of the mortality curve at very old ages.
It has been documented that the force of mortality is slowly increasing at
very old ages, approaching a rather flat shape. The deceleration in the rate
of increase in mortality rates can be explained by the selective survival of
healthier individuals at older ages (see, e.g., Horiuchi and Wilmoth (1998)
for more details, as well as the discussion about frailty in Section 2.9.3).
Demographers and actuaries have suggested various techniques for estimat-
ing the force of mortality at old ages and for completing the life table. See
Section 2.8.2 for examples and references. Here, we apply a simple and
powerful method proposed by Denuit and Goderniaux (2005).
The starting point is standard: there is ample empirical evidence that
the one-year death probabilities behave like the exponential of a quadratic
polynomial at older ages, that is, $q_x(t) = \exp(a_t + b_t x + c_t x^2)$. Hence, a
log-quadratic regression model of the form
$$\ln \hat q_x(t) = a_t + b_t\, x + c_t\, x^2 + \epsilon_{xt} \tag{3.14}$$
which retains as working assumption that the limit age 130 will not be
exceeded. Secondly, an inflection constraint
$$\left.\frac{\partial}{\partial x}\, q_x(t)\right|_{x=130} = 0 \quad \text{for all } t \tag{3.16}$$
which is used to ensure that the behaviour of the ln qx (t)’s will be ultimately
concave. This is in line with empirical studies that provide evidence
of a decrease in the rate of mortality increase at old ages. One explanation
proposed for this deceleration is the selective survival of healthier
individuals to older ages, as noted above.
Figure 3.6. Observed death rates (on the log scale) for Belgian males (top panel) and Belgian
females (bottom panel), period 1950–2004. Source: Statistics Belgium.
Note that both constraints are imposed here at age 130. In general, the
closing age could also be treated as a parameter and selected from the data
(together with the starting age xt , thereby determining the optimal fitting
age range).
These two constraints yield the following relation between the $a_t$’s, $b_t$’s,
and $c_t$’s for each calendar time t:
$$a_t + b_t\, x + c_t\, x^2 = c_t\, (130 - x)^2 \tag{3.17}$$
for $x = x_t, x_t + 1, \ldots$ and $t = t_1, t_2, \ldots, t_m$. The $c_t$’s are then estimated
on the basis of the series $\{\hat q_x(t),\ x = x_t, x_t + 1, \ldots\}$ relating to year t
from equation (3.14), noting the constraints imposed by (3.17). It is worth
mentioning that the two constraints underlying the modelling of the qx (t)
for high x are in line with empirical demographic evidence.
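Under (3.17), the regression (3.14) has a single free parameter per calendar year, since ln qx (t) is proportional to (130 − x)². A minimal sketch of the fit follows, assuming ordinary least squares through the origin as the estimation criterion (the text above does not specify the estimator); the function names and the synthetic death probabilities are hypothetical.

```python
import math

LIMIT_AGE = 130  # closing age at which both constraints are imposed


def fit_c(ages, q_hat):
    """Least-squares estimate of c in ln q_x = c * (LIMIT_AGE - x)^2 (no intercept)."""
    u = [(LIMIT_AGE - x) ** 2 for x in ages]
    y = [math.log(q) for q in q_hat]
    return sum(ui * yi for ui, yi in zip(u, y)) / sum(ui * ui for ui in u)


def completed_q(x, c):
    """Fitted death probability at age x; equals 1 at the limit age."""
    return math.exp(c * (LIMIT_AGE - x) ** 2)


# Synthetic 'observed' probabilities generated with c = -0.001, for illustration only
ages = list(range(75, 101))
q_obs = [math.exp(-0.001 * (LIMIT_AGE - x) ** 2) for x in ages]
c_hat = fit_c(ages, q_obs)
```

With real data the observed probabilities would scatter around the fitted curve, and the quality of the fit could be summarized by the usual R² of the regression.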
Let us now apply this method to the data displayed in Fig. 3.6. The
optimal starting age is selected from the age range 75–89. It turns out to
be around 75 for all of the calendar years. Therefore, we fix it to be 75 for
both genders and for all calendar years. The $R^2$ values corresponding to the fitted
regression models (3.14), as well as the estimated regression parameters $c_t$,
are displayed in Fig. 3.7. We keep the original $\hat{q}_x(t)$ for ages below 85 and
we replace the death probabilities for ages over 85 with the fitted values
coming from the constrained quadratic regression (3.14). The results for
calendar years 1950, 1960, 1970, 1980, 1990, and 2000 can be seen in
Fig. 3.8 for males and in Fig. 3.9 for females. The completed mortality
surfaces are displayed in Fig. 3.10.
Figure 3.7. Adjustment coefficients ($R^2$) and estimated regression parameters ($c_t$) for model
(3.14)–(3.17), males and females.
Figure 3.8. Completed life tables for Belgian males, years 1950, 1960, 1970, 1980, 1990, and 2000, together with empirical death probabilities (broken
line), on the log-scale. [Axes: age x, ln q_x.]
Figure 3.9. Completed FPB life tables for Belgian females, years 1950, 1960, 1970, 1980, 1990, and 2000, together with empirical death probabilities (broken
line), on the log-scale. [Axes: age x, ln q_x.]
Figure 3.10. Completed death rates (on the log scale) for Belgian males (top panel) and Belgian
females (bottom panel), period 1920–2005. [Axes: age x, calendar year t.]
Figure 3.11. Survival curves for Belgian males (top panel) and Belgian females (bottom panel)
corresponding to the 1880–1890, 1928–1932, 1968–1972, and 2000–2002 period life tables.
Source: Statistics Belgium. [Axes: age x, S(x).]
Figure 3.12. Observed median lifetimes at birth (top panel) and at age 65 (bottom panel), period
1920–2005. Source: HMD. [Series: males, females; horizontal axis: calendar year t.]
become less variable and less obviously bimodal. We clearly observe that
the point of fastest decline increases with time, which empirically supports
the expansion phenomenon.
The index, life expectancy, has been formally defined in Section 2.4.1. Life
expectancy statistics are very useful as summary measures of mortality, and
they have an intuitive appeal. However, it is important to interpret data
on life expectancy correctly when their computation is based on period
life tables. Period life expectancies are calculated using a set of age-specific
mortality rates for a given period (either a single year, or a run of years), with
Figure 3.13. Observed proportion of ages at death for Belgian males (top panel) and Belgian
females (bottom panel) corresponding to 1880–1890, 1928–1932, 1968–1972, and 2000–2002
period life tables. Source: Statistics Belgium. [Axes: age x, f(x).]
In this formula, the ratio $(1 - \exp(-\mu_{x+k}(t)))/\mu_{x+k}(t)$ is the average
fraction of the year lived by an individual alive at age $x + k$, and the product
$\prod_{j=0}^{k-1} \exp(-\mu_{x+j}(t))$ is the probability ${}_k p_x^{\uparrow}(t)$ of reaching age $x + k$ computed
from the period life table.
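The two factors just described can be combined numerically. The sketch below assumes that the period life expectancy is the sum over $k \geq 0$ of the probability of reaching age $x + k$ times the average fraction of the year lived at that age, with the force of mortality held constant within each year of age. The function name and the test values are illustrative only.

```python
import math


def period_life_expectancy(mu):
    """Period life expectancy from forces of mortality mu_x, mu_{x+1}, ...

    `mu` lists the forces of mortality for successive integer ages, each
    assumed constant over its year of age; the sum stops at the last age given.
    """
    survival = 1.0  # probability of reaching age x + k (running product)
    total = 0.0
    for m in mu:
        # average fraction of the year lived by someone alive at age x + k
        fraction = (1.0 - math.exp(-m)) / m if m > 0 else 1.0
        total += survival * fraction
        survival *= math.exp(-m)  # update the probability of reaching the next age
    return total


# Sanity check with a constant force of mortality: the result approaches 1 / mu.
e_const = period_life_expectancy([0.1] * 500)
```

With a constant force of mortality of 0.1 the sum telescopes to $(1 - e^{-50})/0.1$, which is indistinguishable from 10, the mean of the corresponding exponential lifetime.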
Figure 3.14 shows the trend in the period life expectancies at birth $e_0^{\uparrow}(t)$
and at retirement age $e_{65}^{\uparrow}(t)$ by gender. The period life expectancy at a
particular age is based on the death rates for that and all higher ages that
were experienced in that specific year. For life expectancies at birth, we
observe a regular increase after 1950, with an effect due to World War II
which is visible before that time (especially at the beginning and at the end
of the conflict for $e_0^{\uparrow}(t)$, and during the years preceding the conflict as well as
during the war itself for $e_{65}^{\uparrow}(t)$). Little increase was experienced from 1930
to 1945. It is interesting to note that period life expectancies are affected
by sudden and temporary events, such as a war or an epidemic.
3.3.8 Variability
Figure 3.14. Observed period life expectancies at birth (top panel) and at age 65 (bottom panel)
for Belgian males (continuous line) and Belgian females (dotted line), period 1920–2005. Source:
HMD.
to the value 0.25 of the survival curve minus the age corresponding to the
value 0.75 of this curve; see (2.64). The former age (called the third quar-
tile) is attained by 25% of the population whereas 75% of the population
reaches the latter age (called the first quartile). The interquartile range is
thus the width of the age interval containing the central 50% of deaths in
the population. As age at death becomes less variable, we would expect
that this measure would decrease. It is very simple to calculate because it
equals the difference between the ages where the survival curve S crosses the
probability levels 0.25 and 0.75. Being the length of the span of ages con-
taining the middle 50% of deaths, it possesses a simple interpretation. Note
that the rectangularization of survival curves is associated with decreasing
interquartile range.
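As an illustration of how simple the calculation is, the following sketch reads the interquartile range off a tabulated survival curve. It assumes a strictly decreasing curve given at integer ages and uses linear interpolation between tabulated ages; the function names and the toy survival curve are hypothetical.

```python
import numpy as np


def age_at_survival_level(ages, surv, level):
    """Age at which a strictly decreasing survival curve crosses `level`."""
    ages = np.asarray(ages, dtype=float)
    surv = np.asarray(surv, dtype=float)
    # np.interp needs increasing x-coordinates, so the decreasing curve is reversed
    return float(np.interp(level, surv[::-1], ages[::-1]))


def interquartile_range(ages, surv):
    """Age where S crosses 0.25 minus age where S crosses 0.75."""
    third_quartile = age_at_survival_level(ages, surv, 0.25)
    first_quartile = age_at_survival_level(ages, surv, 0.75)
    return third_quartile - first_quartile


# Toy survival curve (deaths uniformly spread over ages 0 to 100).
ages = np.arange(0, 101)
surv = 1.0 - ages / 100.0
iqr = interquartile_range(ages, surv)
```

For this toy curve the quartiles fall at ages 25 and 75, so the interquartile range is 50 years; a more rectangular curve would give a smaller value.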
Figure 3.15. Observed interquartile range at birth (top panel) and at age 65 (bottom panel)
for Belgian males (continuous line) and Belgian females (dotted line), period 1920–2005. Source:
HMD.
Figure 3.15 depicts the interquartile range at birth and at age 65. Whereas
the interquartile range at birth clearly decreases over time, there is an
upward trend at age 65. This suggests that even if variability is decreasing
for the entire lifetime, this may not be the case for the remaining lifetime at
age 65.
3.3.9 Heterogeneity
Figure 3.16 displays the period life tables for the Belgian individual life
insurance market, group life insurance market, and the general population
observed in the calendar years 1995, 2000, and 2005. The variability in
the set of death rates is clearly much higher for the insurance market, as
exposures-to-risk are considerably smaller. This is why smoothing the mar-
ket experience to make the underlying trend more apparent is desirable.
This is achieved as explained below.
The standardized mortality ratio (SMR) is a useful index for comparing
mortality experiences: actual deaths in a particular population are com-
pared with those which would be expected if ‘standard’ age-specific rates
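applied. Precisely, the SMR is defined as
$$\mathrm{SMR} = \frac{\sum_{(x,t)\in\mathcal{D}} \mathrm{ETR}_{xt}\, \hat{m}_x(t)}{\sum_{(x,t)\in\mathcal{D}} \mathrm{ETR}_{xt}\, \hat{m}_x^{\mathrm{stand}}(t)} = \frac{\sum_{(x,t)\in\mathcal{D}} D_{xt}}{\sum_{(x,t)\in\mathcal{D}} \mathrm{ETR}_{xt}\, \hat{m}_x^{\mathrm{stand}}(t)}$$
where $\mathcal{D}$ is the set of ages and calendar years of interest.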
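The definition translates directly into a ratio of actual to expected deaths. The sketch below is a minimal illustration: the function name and the numbers are made up, and each array position stands for one (age, calendar year) cell of the set under study.

```python
import numpy as np


def smr(deaths, exposure, m_standard):
    """Standardized mortality ratio: actual deaths over expected deaths.

    deaths     : observed death counts D_xt, one entry per (age, year) cell
    exposure   : exposures-to-risk ETR_xt for the same cells
    m_standard : 'standard' death rates for the same cells
    """
    deaths = np.asarray(deaths, dtype=float)
    expected = np.asarray(exposure, dtype=float) * np.asarray(m_standard, dtype=float)
    return float(deaths.sum() / expected.sum())


# Made-up example with three cells: 25 actual deaths against 50 expected.
example_smr = smr(deaths=[5, 8, 12],
                  exposure=[1000.0, 1000.0, 1000.0],
                  m_standard=[0.010, 0.016, 0.024])
```

An SMR of 0.5, as in this toy example, means that the population under study experienced half the deaths that the standard rates would have produced with the same exposure.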
Here are the SMRs by calendar year for the life insurance market: com-
puted over 1993–2005, the estimated SMR is equal to 0.5377419 for ages
Figure 3.16. General population (broken line) and individual (circle) and group (triangle) life insurance market death rates (on the log scale) observed in
1995, 2000, and 2005 for Belgian males (top panel) and females (bottom panel). Source: HMD for the general population and BFIC for insured lives.
45–64 and to 0.3842981 for ages 65 and over for individual policies, and
to 0.495525 and to 0.8042604 for group policies. The same values com-
puted over 2000–2005 are equal to 0.4796451, 0.3699633, 0.4963897, and
0.8692767, respectively. Note that the values for group contracts, ages 45–
64 have been computed by excluding calendar year 2001, which appeared to
be atypical for group life contracts before retirement age. We see that SMR’s
are around 50% for individual and group life insurance contracts before
retirement age, and then decrease to reach 40% for individual policies and
increase to 80% for group life policies.
It is clear from Fig. 3.16 that death rates based on market data exhibit
considerable variations. This is why some smoothing is desirable in order
to obtain a better picture of the underlying mortality experienced by insured
lives. Since possible changes in underwriting practices or tax reforms are
likely to affect market death rates, we smooth the death rates across ages by
calendar year, as in Hyndman and Ullah (2007). To this end, we use local
regression techniques.
Local regression is used to model a relation between a predictor variable
(or variables) x and a response Y, which is related to the predictor variable.
Typically, x represents age in the application that we have in mind in this
chapter, while Y is some (suitably transformed) demographic indicator such
as the logarithm of the death rate or the logit of the death probability. The
logarithmic and logit transformations involved in these models ensure that
the dependent variables can assume any possible real values.
As pointed out by Loader (1999), smoothing methods and local regres-
sion originated in actuarial science in the late 19th and early 20th centuries,
in the problem of graduation. See Section 2.6 for an introduction to these
concepts. Having observed $(x_1, Y_1), (x_2, Y_2), \ldots, (x_m, Y_m)$, we assume a
model of the form $Y_i = f(x_i) + \epsilon_i$, $i = 1, 2, \ldots, m$, where $f(\cdot)$ is an unknown
function of $x$, and $\epsilon_i$ is an error term, assumed to be Normally distributed
with mean 0 and variance $\sigma^2$. This term represents the random departures
from $f(\cdot)$ in the observations, or variability from sources not included in the
$x_i$'s. No strong assumptions are made about $f$, except that it is a smooth
function that can be locally well approximated by simple parametric func-
tions. For instance, invoking Taylor’s theorem, any differentiable function
can be approximated locally by a straight line, and a twice differentiable
function can be approximated locally by a quadratic polynomial.
In order to estimate f at some point x, the observations are weighted in
such a way that the largest weights are assigned to observations close to
3.4 Life insurance market 119
$$O_W(x) = \sum_{i=1}^{m} w_i(x) \left( Y_i - \beta_0(x) - \beta_1(x)\, x_i \right)^2 \qquad (3.21)$$
Denoting as
$$\bar{x}_w = \frac{\sum_{i=1}^{m} w_i(x)\, x_i}{\sum_{i=1}^{m} w_i(x)} \qquad (3.22)$$
the weighted average of the $x_i$'s in the smoothing window, the minimization
of the objective function $O_W(x)$ gives
$$\hat{f}(x) = \hat{\beta}_0(x) + \hat{\beta}_1(x)\, x = \frac{\sum_{i=1}^{m} w_i(x)\, Y_i}{\sum_{i=1}^{m} w_i(x)} + (x - \bar{x}_w)\, \frac{\sum_{i=1}^{m} w_i(x)\, (x_i - \bar{x}_w)\, Y_i}{\sum_{i=1}^{m} w_i(x)\, (x_i - \bar{x}_w)^2} \qquad (3.23)$$
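The closed form (3.23) is easy to evaluate directly. The following sketch does so for a single fitting point. The tricube weight function and the bandwidth are assumptions made for illustration, since the weights $w_i(x)$ are not specified in this passage, and the function names are hypothetical.

```python
import numpy as np


def tricube(u):
    """Tricube kernel (an assumed choice of weight function)."""
    u = np.abs(u)
    return np.where(u < 1.0, (1.0 - u ** 3) ** 3, 0.0)


def local_linear(x0, x, y, bandwidth):
    """Local linear estimate of f at x0, following the closed form (3.23)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = tricube((x - x0) / bandwidth)   # largest weights close to x0
    x_bar = np.sum(w * x) / np.sum(w)   # weighted average of the x_i, as in (3.22)
    level = np.sum(w * y) / np.sum(w)   # first term of (3.23)
    slope = np.sum(w * (x - x_bar) * y) / np.sum(w * (x - x_bar) ** 2)
    return float(level + (x0 - x_bar) * slope)


# Exactly linear toy data: the local linear fit reproduces the line.
ages = np.arange(40, 99)
log_rates = -9.0 + 0.09 * ages
fitted = local_linear(70.3, ages, log_rates, bandwidth=10.0)
```

Because the fit is locally linear, exactly linear data are reproduced without bias at any point, including points near the boundary of the age range, where the weighted average of the ages no longer coincides with the fitting point.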
The first term ensures that f (·) will fit the data as well as possible. The
second term penalizes roughness of f (·); it imposes some smoothness on
the estimated $f(\cdot)$. The factor $\lambda$ quantifies the amount of smoothness: if
$\lambda \to +\infty$ then $f'' = 0$ and we get a linear fit; and if $\lambda \to 0$ then $f$ perfectly
interpolates the data points.
Figure 3.17. General population (broken line) death rates and individual (circle) and group (triangle) life insurance market smoothed death rates (on the log
scale) observed in 1994 for Belgian males (top panel) and females (bottom panel). [Horizontal axis: age.]
If $x_1 < x_2 < \cdots < x_m$ then the solution $\hat{f}_\lambda$ is a cubic spline with knots
$x_1, x_2, \ldots, x_m$; see Section 2.6.3. This means that $\hat{f}_\lambda$ coincides with a
third-degree polynomial on each interval $(x_i, x_{i+1})$ and possesses continuous first
and second derivatives at each $x_i$.
Remark Instead of working in a Gaussian regression model, we could also
move to the generalized linear modelling framework by implementing a
local likelihood maximization principle. Consider for instance the Bernoulli
model where P[Yi = 1] = 1 − P[Yi = 0] = p(xi ). The contribution of the
ith observation to the log-likelihood is
Figure 3.18 gives the life expectancy at age 65 for the general population
and for insured lives, computed on the basis of observed death rates.
We see that the life expectancies for the group life insurance market are
close to the general population ones. This is due to the moderate adverse
selection present in the collective contracts, where the insurance coverage is
made compulsory by the employment contract, noting that there is a selec-
tion effect through being employed (the so-called ‘healthy worker effect’).
On the contrary, the effect of adverse selection seems to be much stronger
for individual policies. This is due to the particular situation prevailing
Figure 3.18. Life expectancy at age 65 for males (top panel) and females (bottom panel): General
population (diamond) and individual (circle) and group (triangle) life insurance market. Source:
HMD for the general population and BFIC for insured lives.
in Belgium, where no tax incentives are offered for buying life annuities or
other life insurance products after retirement. This explains why only people
with improved health status consider insurance products as valuable assets.
Note that this situation has recently changed in Belgium, where purchasing
life annuities at retirement age is now encouraged by the government.
Actuaries are aware that the nominee of a life annuity is, with a high proba-
bility, a healthy person with a particularly low mortality in the first years of
life annuity payment and, generally, with an expected lifetime higher than
for ages x = 40, 41, . . . , 98 and calendar years 1994–2005. The similarity
with (3.24) is clearly apparent. Now, population death rates are used as
explanatory variables, instead of age x. Note that both variables could
enter the model as covariates, but we need here to establish a link between
population and insurance market mortality statistics that will be exploited
in Chapter 5. Figure 3.19 describes the result of the procedure for males,
whereas Fig. 3.20 is the analogue for females.
Figures 3.19 and 3.20 suggest that a linear relationship exists between
population and market death rates (at least for older ages). If we fit the
regression model
$$\ln \hat{m}_x^{\mathrm{BFIC}}(t) = a + b \ln \hat{m}_x^{\mathrm{HMD}}(t) + \epsilon_{xt} \qquad (3.31)$$
to the observed pairs $(\ln \hat{m}_x^{\mathrm{HMD}}(t), \ln \hat{m}_x^{\mathrm{BFIC}}(t))$ that are available for ages
60–98, and calendar years 1994 to 2005, we obtain estimated values for $b$
that are significantly less than 1 (for group and individual policies, males
and females). Moreover, the estimations are very sensitive to the age and
time ranges included in the analysis. Let us briefly explain why b < 1 seems
inappropriate.
Mortality reduction factors express the decrease in mortality at some
future time t + k compared with the current mortality experience at time
t. They are widely used to produce projected life tables and are formally
introduced in Section 4.3.2. The link between the regression model (3.31)
and the mortality reduction factors for the insurance market is as follows.
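It is easily seen that if the linear relationship given above indeed holds
true then
$$\ln \frac{m_x^{\mathrm{BFIC}}(t + k)}{m_x^{\mathrm{BFIC}}(t)} = b \ln \frac{m_x^{\mathrm{HMD}}(t + k)}{m_x^{\mathrm{HMD}}(t)} \qquad (3.32)$$
$$\Leftrightarrow \quad \frac{m_x^{\mathrm{BFIC}}(t + k)}{m_x^{\mathrm{BFIC}}(t)} = \left( \frac{m_x^{\mathrm{HMD}}(t + k)}{m_x^{\mathrm{HMD}}(t)} \right)^{b} \qquad (3.33)$$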
Figure 3.19. Relational models for males: observed pairs $(\ln \hat{m}_x^{\mathrm{HMD}}(t), \ln \hat{m}_x^{\mathrm{BFIC}}(t))$ are displayed in the left panels, the estimated functions f in (3.30) are
displayed in the middle panels, and the resulting fits are displayed in the right panels, individual policies in the top panels, group policies in the bottom panels.
Figure 3.20. Relational models for females: observed pairs $(\ln \hat{m}_x^{\mathrm{HMD}}(t), \ln \hat{m}_x^{\mathrm{BFIC}}(t))$ are displayed in the left panels, the estimated functions f in (3.30) are
displayed in the middle panels, and the resulting fits are displayed in the right panels, individual policies in the top panels, group policies in the bottom panels.
so that the mortality reduction factor for the market is equal to the mortality
reduction factor for the general population raised to the power b. The same
reasoning obviously holds for the group life insurance market. We note that
the mortality reduction factors are less than 1 in the presence of decreasing
trends in mortality rates.
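The step from the linear relationship to the power relation between reduction factors can be spelled out as follows. This is a sketch that ignores the error term; the numerical values at the end are purely illustrative.

```latex
% Linear relationship written at times t + k and t (error term ignored):
\ln m_x^{\mathrm{BFIC}}(t+k) = a + b \ln m_x^{\mathrm{HMD}}(t+k), \qquad
\ln m_x^{\mathrm{BFIC}}(t)   = a + b \ln m_x^{\mathrm{HMD}}(t).
% Subtracting the second equation from the first eliminates the intercept a:
\ln \frac{m_x^{\mathrm{BFIC}}(t+k)}{m_x^{\mathrm{BFIC}}(t)}
  = b \ln \frac{m_x^{\mathrm{HMD}}(t+k)}{m_x^{\mathrm{HMD}}(t)} .
% Exponentiating both sides gives the power relation.
% Illustration: a population reduction factor of 0.8 combined with b = 0.9 gives
0.8^{0.9} = \exp(0.9 \ln 0.8) \approx 0.818 > 0.8 .
```

In this illustration the market reduction factor is closer to 1 than the population one, that is, a value of $b$ below 1 translates into slower mortality improvements for insured lives.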
As socio-economic class mortality differentials have widened over time,
we expect mortality improvements for assured lives to have been greater
than in the general population. This statement is based on the fact that
the socio-economic class mix of this group is higher than the population
average. Of course, there may be distortion factors, like changes in under-
writing practices, or reforms in tax systems. Considering that the estimated
values for parameters b are less than 1, the interpretation is that the speed of
the future mortality improvements in the insured population is somewhat
smaller than the corresponding speed for the general population. This is
not desirable and only reflects the changes in the tax regimes in Belgium,
lowering adverse selection.
This is why we now consider the following model:
$$\ln \hat{m}_x^{\mathrm{BFIC}}(t) = f(x) + \ln \hat{m}_x^{\mathrm{HMD}}(t) + \epsilon_{xt} \qquad (3.34)$$
$$O_t(\Delta) = \sum_{x=65}^{80} \left( \hat{e}_x^{\mathrm{BFIC}}(t) - \hat{e}_{x-\Delta}^{\mathrm{HMD}}(t) \right)^2 \qquad (3.35)$$
We select the optimal value of $\Delta(t)$ by a grid search over $\{-10, -9, \ldots, 10\}$.
Then, the overall age shift is determined by minimizing $O(\Delta) = \sum_{t=1994}^{2005} O_t(\Delta)$. This gives the values displayed in Table 3.1.
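The grid search can be sketched in a few lines. The code below assumes that life expectancies are stored in dictionaries keyed by integer age and that the objective compares the insured life expectancy at age x with the population life expectancy at age x minus the shift, summed over ages 65 to 80. The function names and the toy life expectancies are hypothetical and do not reproduce the Belgian data.

```python
def shift_objective(e_ins, e_pop, shift, ages=range(65, 81)):
    """Sum of squared gaps between e_ins[x] and e_pop[x - shift] over the given ages."""
    return sum((e_ins[x] - e_pop[x - shift]) ** 2 for x in ages)


def best_shift(e_ins, e_pop, grid=range(-10, 11)):
    """Age shift on the grid minimizing the objective for one calendar year."""
    return min(grid, key=lambda s: shift_objective(e_ins, e_pop, s))


def overall_best_shift(yearly_pairs, grid=range(-10, 11)):
    """Age shift minimizing the sum of the yearly objectives.

    `yearly_pairs` is a list of (e_ins, e_pop) dictionaries, one pair per year.
    """
    return min(grid, key=lambda s: sum(shift_objective(e_ins, e_pop, s)
                                       for e_ins, e_pop in yearly_pairs))


# Toy data: insured lives aged x have the life expectancy of the population aged x - 3.
e_pop = {x: 0.002 * (110 - x) ** 2 for x in range(50, 95)}
e_ins = {x: e_pop[x - 3] for x in range(65, 81)}
shift_one_year = best_shift(e_ins, e_pop)
shift_all_years = overall_best_shift([(e_ins, e_pop), (e_ins, e_pop)])
```

In this toy example the search recovers a shift of 3 for a single year and for the pooled objective. Population life expectancies must be available on the whole range of shifted ages, here 55 to 90.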
Figure 3.21. Estimated SMR’s from (3.34) for males (top panels) and females (bottom panels),
individual (left) and group (right) life insurance market. [Horizontal axis: age x.]
Table 3.1. Optimal age shifts obtained from the objective functions $O_t$ in (3.35),
$t = 1994, 1995, \ldots, 2005$, and $O = \sum_{t=1994}^{2005} O_t$.
1994 −8 −6 −4 −1
1995 −7 −6 −1 0
1996 −9 −8 −2 −1
1997 −8 −5 −2 −1
1998 −6 −4 −1 −1
1999 −9 −6 −1 −1
2000 −8 −5 −1 0
2001 −9 −8 −2 0
2002 −9 −5 0 1
2003 −9 −4 −1 0
2004 −8 −4 1 3
2005 −6 −3 1 1
1994–2005 −9 −5 −1 0
Figure 3.22. Log-likelihood L as a function of the age shift for males (top panels) and females
(bottom panels), individual (left) and group (right) life insurance market.
Figure 3.23. Life expectancy at birth in the EU for males for, from upper left to lower right, Sweden, Italy, Spain, West-Germany, France and England &
Wales, compared to Belgium (broken line). Source: HMD. [Horizontal axis: calendar year.]
Figure 3.24. Life expectancy at age 65 in the EU for males for, from upper left to lower right, Sweden, Italy, Spain, West-Germany, France and England &
Wales, compared to Belgium (broken line). Source: HMD. [Horizontal axis: calendar year.]
Figure 3.25. Life expectancy at birth in the EU for females for, from upper left to lower right, Sweden, Italy, Spain, West-Germany, France and England &
Wales, compared to Belgium (broken line). Source: HMD. [Horizontal axis: calendar year.]
Figure 3.26. Life expectancy at age 65 in the EU for females for, from upper left to lower right, Sweden, Italy, Spain, West-Germany, France and England
& Wales, compared to Belgium (broken line). Source: HMD. [Horizontal axis: calendar year.]
3.6 Conclusions
As clearly demonstrated in this chapter, mortality at adult and old ages
reveals decreasing annual death probabilities throughout the 20th century.
There is an ongoing debate among demographers about whether human
longevity will continue to improve in the future as it has done in the past.
Demographers such as Tuljapurkar and Boe (2000) and Oeppen and Vaupel
(2002) argue that there is no natural upper limit to the length of human life.
The approach that these demographers use is based on an extrapolation
of recent mortality trends. The complexity and historical stability of the
changes in mortality suggest that the most reliable method of predicting
the future is merely to extrapolate past trends. However, this approach has
been criticized because it ignores factors relating to lifestyle and the
environment that might influence future mortality trends. Olshansky et al.
(2005) have suggested that the future life expectancy might level off or even
decline. This debate clearly indicates that there is considerable uncertainty
about future trends in longevity.
Mortality improvements are viewed as a positive change for individuals
and as a substantial social achievement. Nevertheless, they pose a chal-
lenge for the planning of public retirement systems as well as for the private
life annuities business. Longevity risk is also a growing concern for com-
panies faced with off-balance-sheet or on-balance-sheet pension liabilities.
More generally, all the components of social security systems are affected by
mortality trends and their impact on social welfare, health care and societal
planning has become a more pressing issue. And the threat has now become
a reality, as testified by the failure of Equitable Life, the world’s oldest life
insurance company, in the UK in 2001. Equitable Life sold deferred life
annuities with guaranteed mortality rates, but failed to predict the improve-
ments in mortality between the date the life annuities were sold and the date
they came into effect.
Despite the fact that the study of mortality has been core to the actuarial
profession from the beginning, booming stock markets and high interest
rates and inflation have largely hidden this source of risk. In the recent past,
with the lowering of inflation, interest rates, and expected equity returns,
mortality risks have no longer been obscured.
Low nominal interest rates have made increasing longevity a much big-
ger issue for insurance companies. When living benefits are concerned, the
calculation of expected present values (which are needed in pricing and
reserving) requires an appropriate mortality projection in order to avoid
underestimation of future costs. This is because mortality trends at adult/old
ages reveal decreasing annual death probabilities. In order to protect the
company from mortality improvements, actuaries have to resort to life
tables including a forecast of the future trends of mortality (the so-called
projected tables). The building of such life tables will be the topic of the
next chapters.
4 Forecasting mortality: An introduction
4.1 Introduction
This chapter describes various methods proposed by actuaries and
demographers for projecting mortality. Many of these have actually been
used in the actuarial context, in particular for pricing and reserving in
relation to life annuity products and pensions, and in the demographic field,
mainly for population projections.
First, the idea of a ‘dynamic’ approach to mortality modelling is intro-
duced. Then, projection methods are presented starting from extrapolation
procedures which are still widely used in current actuarial practice. More
complex methods follow, in particular methods based on mortality laws,
on model tables, and on relations between life tables. The Lee–Carter
method, recently proposed, and some relevant extensions are briefly intro-
duced, whereas a more detailed discussion, together with some examples of
implementation, is presented in Chapters 5 and 6.
The presentation does not follow a chronological order. In order to obtain
an insight into the historical evolution of mortality forecasts the reader
should refer to Section 4.9.1, in which some landmarks in the history of
dynamic mortality modelling are identified.
Allowing for future mortality trends (and, possibly, for the relevant uncer-
tainty of these trends) is required in a number of actuarial calculations and
applications. In particular, actuarial calculations concerning pensions, life
annuities, and other living benefits (provided, e.g. by long-term care cov-
ers and whole life sickness products) are based on survival probabilities
which extend over a long time horizon. To avoid underestimation of the
relevant liabilities, the insurance company (or the pension plan) must adopt
an appropriate forecast of future mortality, which should account for the
most important features of past mortality trends.
Various aspects of mortality trends can be captured looking at the
behaviour, through time, of functions representing the age-pattern of
(a) an increasing concentration of deaths around the mode (at old ages) of
the curve of deaths is evident; so the graph of the survival function moves
towards a rectangular shape, whence the term rectangularization to
denote this aspect; see Fig. 3.11 for an actual illustration, and Fig. 4.1(a)
for a schematic representation;
(b) the mode of the curve of deaths (which, owing to the rectangularization,
tends to coincide with the maximum age ω) moves towards very old
ages; this aspect is usually called the expansion of the survival function;
see Fig. 3.13 for an actual illustration, and Fig. 4.1(b) for a schematic
representation;
(c) higher levels and a larger dispersion of accidental deaths at young ages
(the so-called young mortality hump) have been more recently observed;
see Fig. 3.2 for an illustration.
From the above aspects, the need for a dynamic approach to mortal-
ity assessment clearly arises. Addressing the age-pattern of mortality as a
dynamic entity underpins, from both a formal and a practical point of view,
any mortality forecast and hence any projection method.
Turning back to age-specific functions, we assume now that both age and
calendar year are integers. Hence, $\Psi(x, t)$ can be represented by a matrix
whose rows correspond to ages and columns to calendar years. In particular,
let $\Psi(x, t) = q_x(t)$, where $q_x(t)$ denotes the probability of an individual
aged $x$ in the calendar year $t$ dying within one year (namely, the one-year
probability of death in a dynamic context).
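The matrix representation can be illustrated with a small array. The sketch below assumes the usual three readings of such a matrix: a column gives the age-pattern in one calendar year, a row gives the mortality profile at one age over time, and a diagonal follows one generation. The numbers are made up and the helper names are hypothetical.

```python
import numpy as np

ages = np.arange(60, 65)       # rows: ages 60..64
years = np.arange(2000, 2005)  # columns: calendar years 2000..2004

# Made-up one-year death probabilities q_x(t): increasing in age, decreasing in time.
q = 0.01 * 1.1 ** (ages[:, None] - 60) * 0.98 ** (years[None, :] - 2000)


def period_table(q, years, year):
    """Column of the matrix: age-pattern of mortality in a given calendar year."""
    col = int(np.where(years == year)[0][0])
    return q[:, col]


def mortality_profile(q, ages, age):
    """Row of the matrix: mortality at a given age across calendar years."""
    row = int(np.where(ages == age)[0][0])
    return q[row, :]


def cohort_sequence(q, ages, years, birth_year):
    """Diagonal of the matrix: cells (x, t) with t - x equal to the year of birth."""
    return np.array([q[i, j]
                     for i, a in enumerate(ages)
                     for j, t in enumerate(years)
                     if t - a == birth_year])


table_2002 = period_table(q, years, 2002)
profile_62 = mortality_profile(q, ages, 62)
cohort_1940 = cohort_sequence(q, ages, years, 1940)
```

Here the generation born in 1940 is aged 60 in 2000, 61 in 2001, and so on, so its sequence coincides with the main diagonal of the toy matrix.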
The elements of the matrix (see Table 4.1) can be read according to three
arrangements:
1. How are the items in the database interpreted? Are they correctly inter-
preted as observed outcomes of random variables (e.g. frequencies of
death), or, conversely, are they simply taken as ‘numbers’?
2. The projected table, resulting from the extrapolation procedure, is
a two-dimensional array of numbers, providing point estimates of
future mortality. How do we get further information, namely, interval
estimates?
If the answer to question (1) is ‘data are simply numbers’, then the extrap-
olation procedure does not allow for any statistical feature of the infor-
mation available, as, for example, the reliability of the data. Conversely,
when the data are interpreted as the outcomes of random variables, the
extrapolation procedure must rely on sound statistical assumptions and,
as a consequence, future mortality can be represented in terms of both
point and interval estimates (whilst only point estimates can be provided
by extrapolation procedures only based on ‘numbers’).
Various traditional projection methods consist in extrapolation pro-
cedures simply based on ‘numbers’. First, we will describe these meth-
ods which, in spite of several deficiencies, offer a simple and intuitive
introduction to mortality forecasts.
Let us assume that several period observations (or ‘cross-sectional’ obser-
vations) are available for a given population (e.g. males living in a country,
pensioners who are members of a pension plan, etc.). Each observation
consists of the age-pattern of mortality for a given set X of ages, say
$X = \{x_{\min}, x_{\min} + 1, \ldots, x_{\max}\}$. The observation referred to calendar year
t is expressed by
constitutes the data base for mortality projections. Note that each sequence
on the right-hand side of (4.5) represents the observed mortality profile at
age x.
We assume that the trend observed in past years (i.e. in the set of years
T ) can be graduated, for example, via an exponential function. Further, we
suppose that the observed trend will continue in future years. Then, future
mortality can be estimated by extrapolating the trend itself (see Fig. 4.4).
the result is a function ψx (t) for each age x. This may lead to inconsisten-
cies with regard to the projected age-pattern of mortality, as we will see in
Section 4.5.3.
$$q_x(t) = q_x(t')\, R_x(t - t') \qquad (4.6)$$
4.3 Projection by extrapolation of annual probabilities of death 145
$$R_x(t - t') = R(t - t') \tag{4.7}$$
Let us suppose that the observed mortality profiles are such that the
behaviour over time of the logarithms of the qx ’s is, for each age x, approx-
imately linear (see Fig. 4.7). Then, we can find a value δx such that, for
h = 1, 2, . . . , n − 1, we have approximately:
Hence
$$\frac{q_x(t_{h+1})}{q_x(t_h)} \approx e^{-\delta_x (t_{h+1} - t_h)} \tag{4.9}$$
or, defining $r_x = e^{-\delta_x}$:
$$\frac{q_x(t_{h+1})}{q_x(t_h)} \approx r_x^{\,t_{h+1} - t_h} \tag{4.10}$$
Assume that, for each age x, the parameter δx (or rx ) is estimated, for
example via a least squares procedure. So, the graduated probabilities q̂x (t)
can be calculated. The constraint q̂x (tn ) = qx (tn ) is usually applied in the
estimation procedure.
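The estimation step can be sketched as follows. This is only an illustration under stated assumptions: the data are made up, the function names `fit_delta` and `q_hat` are ours, and the constraint q̂x(tn) = qx(tn) is imposed by fitting the log-ratios ln qx(th) − ln qx(tn) with a line through the origin.

```python
import math

def fit_delta(years, q_obs):
    """Least-squares estimate of delta_x, with the fitted line forced through (t_n, ln q_x(t_n))."""
    t_n, ln_q_n = years[-1], math.log(q_obs[-1])
    u = [t - t_n for t in years]                  # time measured from the last observation
    y = [math.log(q) - ln_q_n for q in q_obs]     # log-ratio to q_x(t_n)
    slope = sum(ui * yi for ui, yi in zip(u, y)) / sum(ui * ui for ui in u)
    return -slope                                 # delta_x > 0 corresponds to declining mortality

def q_hat(t, years, q_obs, delta):
    """Graduated (t <= t_n) or extrapolated (t > t_n) probability of death."""
    return q_obs[-1] * math.exp(-delta * (t - years[-1]))

# Made-up observations for a single age x (assumption, not data from the text).
years = [1980, 1985, 1990, 1995, 2000]
q_obs = [0.0200, 0.0185, 0.0168, 0.0155, 0.0141]

delta = fit_delta(years, q_obs)
r = math.exp(-delta)                              # annual reduction factor r_x
q_2010 = q_hat(2010, years, q_obs, delta)
```

Other ways of imposing the constraint are possible; fitting through the last point is simply the most direct one.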
146 4 : Forecasting mortality: An introduction
[Figure: ln qx(t) and qx(t) plotted against time t.]
The extrapolation formula (4.11) (as well as, for instance, formula (4.17)
in Section 4.3.5) originates from the analysis of the mortality profiles, and
hence constitutes an example of the horizontal approach.
For the calculation of parameters rx ’s (or δx ’s), procedures other than least
squares estimation can be used. An example follows.
Suppose, as above, that n period tables are available. For each age x and
for h = 1, 2, . . . , n − 1, calculate the quantities $r_x^{(h)}$ as follows:
$$r_x^{(h)} = \left(\frac{q_x(t_{h+1})}{q_x(t_h)}\right)^{\frac{1}{t_{h+1} - t_h}} \tag{4.13}$$
Each weight, wh , should be chosen in a way to reflect both the length of the
time interval between observations and the statistical reliability attaching to
the observations themselves. Trivially, if we set wh = (th+1 − th )/(tn − t1 )
for all h, only the lengths of the time intervals are accounted for, and so
expression (4.14) reduces to
$$r_x = \left(\frac{q_x(t_n)}{q_x(t_1)}\right)^{\frac{1}{t_n - t_1}} \tag{4.15}$$
so that rx is determined only by the first and last values of qx (t) in the past
data.
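The reduction to the first and last observations can be checked numerically. The sketch below assumes that the weighted average of the interval factors is a weighted geometric mean (the form under which the stated reduction holds); the data and function names are illustrative only.

```python
import math

def interval_factors(years, q_obs):
    """Annualized factors r_x^(h) between consecutive observation years."""
    return [
        (q_obs[h + 1] / q_obs[h]) ** (1.0 / (years[h + 1] - years[h]))
        for h in range(len(years) - 1)
    ]

def weighted_geometric_mean(factors, weights):
    """prod_h factors[h] ** weights[h], computed on the log scale."""
    return math.exp(sum(w * math.log(r) for r, w in zip(factors, weights)))

# Made-up, unequally spaced observations for a single age x.
years = [1980, 1985, 1992, 2000]
q_obs = [0.0200, 0.0185, 0.0166, 0.0141]

r_h = interval_factors(years, q_obs)
span = years[-1] - years[0]
weights = [(years[h + 1] - years[h]) / span for h in range(len(years) - 1)]

r_weighted = weighted_geometric_mean(r_h, weights)
r_endpoints = (q_obs[-1] / q_obs[0]) ** (1.0 / span)   # first and last values only
```

With these weights the intermediate observations cancel out, which is why weights reflecting statistical reliability are usually preferable.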
Let us turn back to the exponential formula. From (4.11) it follows that, if
rx < 1, then
qx (∞) = 0 (4.16)
where αx ≥ 0 for all x; see Fig. 4.8. The reduction factor is thus given by
$$R_x(t - t') = \alpha_x + (1 - \alpha_x)\, r_x^{\,t - t'} \tag{4.18}$$
$$q_x(\infty) = \alpha_x\, q_x(t') \tag{4.19}$$
[Figure: two panels showing qx(t) against time t from the base year t′, starting at qx(t′); one panel marks the asymptotic level αx qx(t′).]
then, fx (m) is the proportion of the total mortality decline assumed to occur
by time m. Dividing both numerator and denominator by qx (t′), we obtain:
$$f_x(m) = \frac{1 - R_x(m)}{1 - R_x(\infty)} = \frac{(1 - \alpha_x)(1 - r^m)}{1 - \alpha_x} = 1 - r^m \tag{4.22}$$
Note that, since we have assumed rx = r for all x, we have fx (m) = f (m).
Hence
$$r = (1 - f(m))^{\frac{1}{m}} \tag{4.23}$$
The choice of the couple (m, f (m)) unambiguously determines the
parameter r. Finally, we have
$$R_x(t - t') = \alpha_x + (1 - \alpha_x)\,(1 - f(m))^{\frac{t - t'}{m}} \tag{4.24}$$
For example, if we assume that 60% of the total mortality decline occurs
in the first 20 years, we set (m, f (m)) = (20, 0.60), and so $r = 0.40^{1/20} = 0.9552$.
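The numerical example can be reproduced with a few lines of code. The sketch below is illustrative: the function names are ours and the asymptotic level `alpha = 0.5` is an arbitrary assumption used only to evaluate the reduction factor.

```python
def r_from_decline(m, f_m):
    """Annual factor r implied by a fraction f_m of the total decline occurring within m years."""
    return (1.0 - f_m) ** (1.0 / m)

def reduction_factor(s, alpha, m, f_m):
    """R_x(s) = alpha + (1 - alpha) * (1 - f_m) ** (s / m), with s = t - t'."""
    return alpha + (1.0 - alpha) * (1.0 - f_m) ** (s / m)

r = r_from_decline(20, 0.60)                  # 0.40 ** (1/20)

alpha = 0.5                                   # hypothetical asymptotic level
R_0 = reduction_factor(0, alpha, 20, 0.60)    # no reduction in the base year
R_20 = reduction_factor(20, alpha, 20, 0.60)
share_after_20 = (1.0 - R_20) / (1.0 - alpha) # proportion of the total decline by year 20
```

Whatever value of alpha is chosen, the share of the total decline reached after m years equals f(m), since alpha cancels in the ratio.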
(called the Sachs formula) where a and b are constants and $a^{\frac{t - t'}{x + b}}$ represents
the reduction factor, also constitutes a particular case of (4.25), as can be
easily proved.
Note that formulae (4.11) and (4.17) (and some related expressions)
explicitly refer to the base year t′ (usually related to the most recent
observation, that is, t′ = tn ). Conversely, formula (4.25) as well as other
formulae presented in Section 4.3.9 do not explicitly address a fixed calendar
year. Nonetheless, a link with a given calendar year can be introduced
via parameters, as illustrated, for example, by formula (4.26).
It is easy to see that, for any year t, the reduction factor increases (i.e.
the mortality improvement reduces) linearly with increasing age, from
$0.50 + 0.50\,(0.40)^{\frac{t - t'}{20}}$ at age 60 and below, to unity at age 110 and above.
For any given age x, the rate of improvement decreases as t increases.
Further, following the analysis in Section 4.3.6, it is easy to prove that
expression (4.28) for the reduction factor, with f = 0.60, implies that 60%
of the total (asymptotic) mortality improvement (at any age x) is assumed
to occur in the first 20 years.
Example 4.3 A recent implementation of formula (4.17) by the Continuous
Mortality Investigation Bureau is as follows (see CMIB (1999)). In this case,
the reduction factor is given by
$$R_x(t - t') = \alpha_x + (1 - \alpha_x)\,(1 - f_x)^{\frac{t - t'}{20}} \tag{4.30}$$
Example 4.4 An exponential formula has also been used in the United
States. The Society of Actuaries published the 1994 probabilities of death
as the base table and the annual improvement factors 1 − rx ; see Group
Annuity Valuation Table Task Force (1995). The projected probabilities of
death are determined as follows:
The parameter rx varies from 0.98 to 1, being equal to 1 for x > 100, for
both males and females.
with ax,1 < 0 to express mortality decline. This formula is not usually
adopted because of its obvious disadvantage that for large t a negative
probability is predicted. The polynomial extrapolation formula (4.35) with
p = 3 is called the Esscher formula.
$$q_x(t) = \frac{e^{c_{x,0} + c_{x,1} t}}{1 + e^{c_{x,0} + c_{x,1} t}} \tag{4.41}$$
that is, from the relevant cohort table (see also Section 4.2.2). Then, the
probability of a person age x in year t being alive at age x + k is given by:
$${}_k p_x^{\nearrow}(t) = \prod_{j=0}^{k-1} \left[1 - q_{x+j}(t + j)\right] \tag{4.43}$$
where the superscript ↗ recalls that we are working along a diagonal band
in the Lexis diagram (see Section 3.3, and Fig. 3.1 in particular), or, simi-
larly, along a diagonal of the matrix in Table 4.1 with the proviso that the
ordering of the lines is inverted. Note that explicit reference to the year of
birth τ is omitted, as this is trivially given by τ = t − x.
For example, to calculate, in the calendar year t, the expected remaining
lifetime of an individual age x in that year, the following formula should
be adopted, rather than formula (2.65) (which relies on the assumption of
unchanging mortality after the period observation from which the life table
4.4 Using a projected table 153
was drawn):
$$\mathring{e}_x^{\nearrow}(t) = \sum_{k=1}^{\omega - x} {}_k p_x^{\nearrow}(t) + \frac{1}{2} \tag{4.44}$$
The quantity $\mathring{e}_x^{\nearrow}(t)$ is usually called the (complete) cohort life expectancy,
for a person age x in year t. If a decline in future mortality is expected (and
hence represented by the projected cohort table), the following inequality
holds:
$$\mathring{e}_x^{\nearrow}(t) > \mathring{e}_x \tag{4.45}$$
where $\mathring{e}_x$ denotes the period life expectancy (see Section 2.4.3).
Note that, in a dynamic framework, the period life expectancy should be
denoted as follows:
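$$\mathring{e}_x^{\uparrow}(t) = \sum_{k=1}^{\omega - x} {}_k p_x^{\uparrow}(t) + \frac{1}{2} \tag{4.46}$$
with
$${}_k p_x^{\uparrow}(t) = \prod_{j=0}^{k-1} \left[1 - q_{x+j}(t)\right] \tag{4.47}$$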
where the superscript ↑ recalls that we are working along a vertical band in
the Lexis diagram, or, similarly, along a column of the matrix in Table 4.1.
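The difference between working along a diagonal and along a column of the projected matrix can be illustrated numerically. The sketch below assumes a purely hypothetical projected table (a Gompertz-like base table reduced by a constant annual factor); the function names and all parameter values are ours, not taken from the text.

```python
def q_proj(x, t, base_year=2000, r=0.98, omega=110):
    """Hypothetical projected q_x(t): made-up base table times r ** (t - base_year)."""
    if x >= omega:
        return 1.0
    base = min(1.0, 0.00005 * 1.10 ** x)          # made-up base-year probabilities
    return min(1.0, base * r ** (t - base_year))

def life_expectancy(x, t, cohort, omega=110):
    """Sum over k = 1, ..., omega - x of the k-year survival probability, plus 1/2."""
    total, surv = 0.0, 1.0
    for k in range(1, omega - x + 1):
        j = k - 1
        year = t + j if cohort else t             # diagonal band vs. column of the matrix
        surv *= 1.0 - q_proj(x + j, year)         # surv is now the k-year survival probability
        total += surv
    return total + 0.5

e_cohort = life_expectancy(65, 2000, cohort=True)    # uses q_{65+j}(2000 + j)
e_period = life_expectancy(65, 2000, cohort=False)   # uses q_{65+j}(2000)
```

Under any projection with declining mortality the cohort figure exceeds the period one, because every survival factor along the diagonal is at least as large as the corresponding factor in the column.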
The same cohort-based approach should be adopted to calculate actuarial
values of life annuities, for both pricing and reserving. Hence, various cohort
tables should be simultaneously used, according to the year of birth of the
individuals addressed in the calculations.
(1) A birth year τ̄ is chosen and the cohort table pertaining to the generation
born in year τ̄ is only addressed; so, the probabilities
qxmin (τ̄ + xmin ), qxmin +1 (τ̄ + xmin + 1), . . . , qx (τ̄ + x), . . . (4.48)
[Figure: projected table with entries qx(t′) for ages 0 to ω − 1 over past and future calendar years, with the two paths labelled (1) and (2).]
where xmin denotes the minimum age of interest, are used in actuarial
calculations. Thus, just one diagonal of the matrix {qx (t)} is actu-
ally used. The choice of τ̄ should reflect the average year of birth of
annuitants or pensioners to whom the table is referred.
(2) A (future) calendar year t̄ is chosen and the projected period table
referring to year t̄ is only addressed; and so the probabilities
Following approach (1), and using the superscript [τ̄] to denote reference
to the cohort table for the generation born in year τ̄, the probability
of being alive at age x + k is given (for any year of birth τ = t − x) by
$${}_k p_x^{[\bar{\tau}]} = \prod_{j=0}^{k-1} \left[1 - q_{x+j}(\bar{\tau} + x + j)\right] \tag{4.50}$$
Adopting approach (2), and denoting by [t̄] ↑ the reference to the period
table for year t̄, the probability of being alive at age x + k is conversely given
(for any year of birth τ = t − x) by
$${}_k p_x^{[\bar{t}]\uparrow} = \prod_{j=0}^{k-1} \left[1 - q_{x+j}(\bar{t})\right] \tag{4.51}$$
For people born in year τ = t −x, the probabilities (4.43) (which are related
to the year of birth τ) should be used, whereas approach (1) leads to the use
of probabilities (4.50), which are independent of the actual year of birth. To
reintroduce a dependence on τ, at least to some extent, we use the following
probabilities:
qxmin +h(τ) (τ̄ + xmin + h(τ)), qxmin +1+h(τ) (τ̄ + xmin + 1 + h(τ)), . . . ,
qx+h(τ) (τ̄ + x + h(τ)), . . . (4.52)
Note that all the probabilities involved belong to the same diagonal referred
to within approach (1).
This adjustment (often called Rueff’s adjustment) involves an age-shift
of h(τ) years. Assuming a mortality decline, the function h(τ) must satisfy
the following relations:
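$$h(\tau) \begin{cases} \geq 0 & \text{for } \tau < \bar{\tau} \\ = 0 & \text{for } \tau = \bar{\tau} \\ \leq 0 & \text{for } \tau > \bar{\tau} \end{cases} \tag{4.53}$$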
$${}_k p_x^{[\bar{\tau};\, h(\tau)]} = \prod_{j=0}^{k-1} \left[1 - q_{x+h(\tau)+j}(\bar{\tau} + x + h(\tau) + j)\right] \tag{4.54}$$
where the superscript also recalls the age-shift. Probabilities given by for-
mula (4.54) can be adopted to approximate the cohort life expectancy (see
Section 4.4.1) as well as actuarial values of life annuities.
τ h(τ)
1901–1910 5
1911–1920 4
1921–1929 3
1930–1937 2
1938–1946 1
1947–1953 0
1954–1960 −1
1961–1967 −2
1968–1975 −3
1976–1984 −4
≥ 1985 −5
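The age-shift mechanism can be sketched as a simple lookup followed by a product of one-year survival probabilities. In the sketch below the age-shift values are those tabulated above, while the reference cohort table `q_ref` (indexed by age) is entirely made up, and the function names are ours.

```python
# (first birth year, last birth year, age shift h) as tabulated above
AGE_SHIFT = [
    (1901, 1910, 5), (1911, 1920, 4), (1921, 1929, 3), (1930, 1937, 2),
    (1938, 1946, 1), (1947, 1953, 0), (1954, 1960, -1), (1961, 1967, -2),
    (1968, 1975, -3), (1976, 1984, -4),
]

def age_shift(tau):
    """Age shift h(tau) for a person born in year tau."""
    if tau >= 1985:
        return -5
    for first, last, h in AGE_SHIFT:
        if first <= tau <= last:
            return h
    raise ValueError("birth year outside the tabulated range")

def survival_with_shift(q_ref, x, k, tau):
    """k-year survival from age x, read from the reference cohort table at age x + h(tau)."""
    h = age_shift(tau)
    p = 1.0
    for j in range(k):
        p *= 1.0 - q_ref[x + h + j]
    return p

# Hypothetical reference cohort table: q_ref[a] is the probability of death at age a.
q_ref = [min(1.0, 0.00005 * 1.10 ** a) for a in range(121)]

p_old = survival_with_shift(q_ref, 65, 10, 1925)    # shifted to age 68
p_ref = survival_with_shift(q_ref, 65, 10, 1950)    # no shift
p_young = survival_with_shift(q_ref, 65, 10, 1970)  # shifted to age 62
```

A positive shift treats an older generation as if it were older than it is, which lowers its survival probabilities relative to the reference generation.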
procedure can be applied to the set of parameters (instead of the set of age-
specific probabilities), with a dramatic reduction in the dimension of the
forecasting problem, namely in the number of the ‘degrees of freedom’.
Consider a law, for example, describing the force of mortality:
µx = ϕ(x; α, β, . . . ) (4.55)
In a dynamic context, the calendar year t enters the model via its parameters
α1 , α2 , . . . , αn ⇒ α(t)
β1 , β2 , . . . , βn ⇒ β(t)
...
[Figure: a parameter α plotted against time t, with observed values αh at years th graduated over the past and the function α(t) extrapolated beyond t′.]
[Tables: two age-by-calendar-year matrices (ages xmin, ..., x, ..., xmax; calendar years t1, t2, ..., th).]
γ1 , γ2 , . . . , γm ⇒ γ(τ)
δ1 , δ2 , . . . , δm ⇒ δ(τ)
...
functions A(t), B(t), C(t), . . . are used to express the dependency of the
age-pattern of mortality on the calendar year t.
Example 4.8 We assume that, for each past calendar year t, the odds
φx (t) = qx (t)/px (t) are graduated using (2.81). Then, we have
$$\phi_x(t) = \phi_x(t')\, r^{s} \tag{4.66}$$
$$\ln \phi_x(t) = \alpha + \beta x + s \ln r \tag{4.68}$$
Defining
$$w = -\frac{\ln r}{\beta} \tag{4.69}$$
we finally obtain:
It is well known that, whilst the Weibull law does not fit well the age-
pattern of mortality throughout the whole life span (especially because of
the specific features of infant and young-adult mortality), it provides a rea-
sonable representation of mortality at adult and old ages. Moreover, the
choice of the Weibull law is supported by the possibility of easily express-
ing, in terms of its parameters, the mode (at adult ages) of the distribution
of the random lifetime T0 , that is, the Lexis point,
$$\mathrm{Mod}[T_0] = \beta \left(\frac{\alpha - 1}{\alpha}\right)^{\frac{1}{\alpha}}; \quad \alpha > 1 \tag{4.74}$$
where Γ denotes the complete gamma function (see, e.g. Kotz et al. (2000)).
Moments for the remaining lifetime at age x > 0, Tx , can similarly be
derived.
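The mode and the first two moments of the Weibull lifetime are easy to evaluate numerically. The sketch below uses the standard Weibull results for the mean and variance in terms of the gamma function; we assume, without being able to confirm it from the text, that these correspond to the formulae referred to here. The parameter values are illustrative only.

```python
import math

def weibull_mode(alpha, beta):
    """Mode of T_0 (the Lexis point), defined for alpha > 1."""
    if alpha <= 1:
        raise ValueError("the mode at adult ages requires alpha > 1")
    return beta * ((alpha - 1.0) / alpha) ** (1.0 / alpha)

def weibull_mean(alpha, beta):
    """Standard Weibull mean: beta * Gamma(1/alpha + 1)."""
    return beta * math.gamma(1.0 / alpha + 1.0)

def weibull_variance(alpha, beta):
    """Standard Weibull variance: beta^2 * (Gamma(2/alpha + 1) - Gamma(1/alpha + 1)^2)."""
    g1 = math.gamma(1.0 / alpha + 1.0)
    g2 = math.gamma(2.0 / alpha + 1.0)
    return beta ** 2 * (g2 - g1 ** 2)

def weibull_density(x, alpha, beta):
    """Density of T_0 implied by the force of mortality (alpha/beta) * (x/beta)^(alpha - 1)."""
    return (alpha / beta) * (x / beta) ** (alpha - 1.0) * math.exp(-((x / beta) ** alpha))

alpha, beta = 8.0, 85.0                     # illustrative parameter values
mode = weibull_mode(alpha, beta)
mean = weibull_mean(alpha, beta)
variance = weibull_variance(alpha, beta)
```

Tracking how the mode and the variance respond to changes in α and β gives a quick check on whether a chosen pair of parameter trends produces the intended expansion and rectangularization.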
The above possibility facilitates the choice of laws which reflect specific
future trends of mortality. When a dynamic mortality model is con-
cerned, the force of mortality must be addressed as a function of the
(future) calendar year t (according to the vertical approach), or the year of
birth τ (diagonal approach). Hence, referring for example to the diagonal
approach, we generalize formula (2.77) as follows:
$$\mu_x(\tau) = \frac{\alpha(\tau)}{\beta(\tau)} \left(\frac{x}{\beta(\tau)}\right)^{\alpha(\tau) - 1} \tag{4.77}$$
Functions α(τ) and β(τ) should be chosen in order to reflect the assumed
trends in the rectangularization and expansion processes. To this purpose,
formulae (4.74) to (4.76) provide us with a tool for checking the validity of
a choice of the above functions.
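[Figure: projected qx(t) for two ages x1 and x2 plotted against time t, with the years t′ and t* marked.]
[Table: age-by-calendar-year matrix (ages xmin, ..., x, ..., xmax; calendar years t1, t2, ..., th), summarized by a function Φ(x, t).]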
with the proviso that some of the γij may be preset to 0. Lj (x̄) are Legendre
polynomials. The variables x̄ and t̄ are the transformed ages and trans-
formed calendar years, respectively, such that both x̄ and t̄ are mapped
onto [−1, +1]. Note that the first of the two multiplicative terms on the
right hand side is a graduation model GM(0, s + 1), while the second one
may be interpreted as an age-specific trend adjustment term (provided that
at least one of the γij is not preset to zero). Formula (4.78) has been pro-
posed by Renshaw et al. (1996) for modelling with respect to age and time,
noting that, for forecasting purposes, low values of r should be preferred –
that is, polynomials in t with a low degree.
A further implementation of this model has been carried out by Sithole
et al. (2000). Trend analysis of UK immediate annuitants’ and pensioners’
mortality experiences (provided by the CMIB) suggested the adoption of
the following particular formula (within the class of models (4.78)):
$$\mu_x(t) = \exp\left[\beta_0 + \sum_{j=1}^{3} \beta_j L_j(\bar{x}) + \left(\alpha_1 + \gamma_{11} L_1(\bar{x})\right) \bar{t}\,\right] \tag{4.79}$$
$$\mu_x(t) = \mu_x(t')\, R_x(t - t') \tag{4.80}$$
4.6 Other approaches to mortality projections 165
where, as usual, t′ is the base year for the mortality projection. From (4.5.30)
we obtain:
$$R_x(t - t') = \exp\left[(\alpha_1 + \gamma_{11} \bar{x})\, \frac{t - t'}{w}\right] \tag{4.81}$$
where w denotes half of the calendar year range for the investigation period.
Hence:
$$R_x(t - t') = \exp\left[(a + b\,x)(t - t')\right] \tag{4.82}$$
(with a < 0 and b > 0, which result from the fitting of the observed data).
Renshaw and Haberman (2003b) consider a regression-based forecasting
model of the following simple structure:
$$\ln m_x(t) = a_x + b_x (t - t') \tag{4.83}$$
Then, introducing a reduction factor that is related to the central death rate
and interpreting the term ax as the logarithm of the central death rate for the
base year, ln mx (t′), we have that
$$m_x(t) = m_x(t')\, R_x(t - t') \tag{4.84}$$
and
$$R_x(t - t') = \exp\left[b_x (t - t')\right] \tag{4.85}$$
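For a single age, this regression and the implied reduction factor can be sketched as follows. The observations are made up, the fit is by ordinary least squares (an assumption; the cited authors may use a different fitting criterion), and the function names are ours.

```python
import math

def fit_log_linear(years, m_obs, base_year):
    """OLS fit of ln m_x(t) = a_x + b_x * (t - base_year) for one age x."""
    s = [t - base_year for t in years]
    y = [math.log(m) for m in m_obs]
    n = len(s)
    s_bar, y_bar = sum(s) / n, sum(y) / n
    b = (sum((si - s_bar) * (yi - y_bar) for si, yi in zip(s, y))
         / sum((si - s_bar) ** 2 for si in s))
    a = y_bar - b * s_bar
    return a, b

def reduction_factor(b, s):
    """R_x(s) = exp(b_x * s), with s = t - t'."""
    return math.exp(b * s)

# Made-up central death rates for a single age x.
years = [1990, 1995, 2000, 2005]
m_obs = [0.0300, 0.0270, 0.0245, 0.0220]

a, b = fit_log_linear(years, m_obs, base_year=2005)
m_base = math.exp(a)                          # fitted base-year rate
m_2025 = m_base * reduction_factor(b, 20)     # projection 20 years ahead
```

Note that exp(a) is the fitted, not the observed, base-year rate; the two coincide only if the fit is constrained to pass through the last observation.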
From Sections 4.3 and 4.5, it clearly emerges that a number of projection
methods are based on the extrapolation of observed mortality trends, pos-
sibly via the parameters of some mortality law. Important examples are
provided by formulae (4.11), (4.20), and (4.63). Although it seems quite
natural that mortality forecasts are based on past mortality observations,
different approaches to the construction of projected tables can be adopted.
with r < 1. Note that formula (4.20) can be easily linked to (4.88), choosing
αx such that qx (t′) αx = q̃x .
[Figure: in a given population, life tables yield markers; the observed trend in markers is extrapolated, and the projected markers are used to enter a set of model tables.]
Trends in some markers are analysed and then projected, possibly using
some mathematical formula, in order to predict their future values. Pro-
jected age-specific probabilities of death are then obtained by entering the
system of model tables for the various projected values of the markers. The
procedure is sketched in Fig. 4.15.
denoting by (x, τ) the logit of the survival function, S(x, τ), for the cohort
born in the calendar year τ, we have:
$$(x, \tau) = \frac{1}{2} \ln \frac{1 - S(x, \tau)}{S(x, \tau)} \tag{4.89}$$
Note that, when a model for the resistance function (see (4.92) and (4.93))
is assumed, the resulting projection model can be classified as an analytical
model, even though it does not directly address the survival function.
The Petrioli–Berti model has been used to project the mortality of the
Italian population, and then has been adopted by the Italian Association
of Insurers in order to build up projected mortality tables for life annuity
business.
Hence, stochastic assumptions about mortality are required, that is, prob-
ability distributions for the random numbers of deaths, and a statistical
structure linking forecasts to observations must be specified (see Fig. 4.16).
In a stochastic framework, the results of the projection procedures
consist in
• Point estimates
• Interval estimates
of future mortality rates (see Fig. 4.17) and other life table functions.
Clearly, traditional graduation–extrapolation procedures, which do not
explicitly allow for randomness in mortality, produce just one numerical
value for each future mortality rate (or some other age-specific quantity).
Moreover, such values can hardly be interpreted as point estimates, because
of the lack of an appropriate statistical structure and model.
[Figures: observed mortality frequencies up to time t′, linked to the stochastic process by a model for its probabilistic structure; projected quantities such as e65(t) plotted against time t.]
series, for example, as a random walk with drift. Starting from a given year
t′, forecasted mortality rates are then computed, for t > t′, as follows:
$$m_x(t) = \exp(\hat{\alpha}_x + \hat{\beta}_x \kappa_t) = m_x(t') \exp\left[\hat{\beta}_x (\kappa_t - \hat{\kappa}_{t'})\right] \tag{4.96}$$
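The point forecast under a random walk with drift can be sketched in a few lines. The values of the time index, of the age response and of the base-year rate below are made up, and the function names are ours; the drift is estimated by the mean annual increment of the fitted index, and only the central forecast (no prediction interval) is computed.

```python
import math

def drift_estimate(kappa):
    """Mean annual increment of the fitted index: (kappa_n - kappa_1) / (n - 1)."""
    return (kappa[-1] - kappa[0]) / (len(kappa) - 1)

def forecast_kappa(kappa, horizon):
    """Central forecast kappa_n + h * drift for h = 1, ..., horizon."""
    d = drift_estimate(kappa)
    return [kappa[-1] + d * h for h in range(1, horizon + 1)]

def forecast_rate(m_base, beta_x, kappa_base, kappa_future):
    """m_x(t) = m_x(t') * exp(beta_x * (kappa_t - kappa_t'))."""
    return m_base * math.exp(beta_x * (kappa_future - kappa_base))

# Made-up fitted index over eight consecutive years, and made-up values for one age x.
kappa_hat = [10.0, 8.1, 6.3, 3.9, 2.2, 0.0, -1.8, -4.1]
beta_x = 0.02
m_base = 0.015

drift = drift_estimate(kappa_hat)
kappa_fc = forecast_kappa(kappa_hat, 10)
m_fc = [forecast_rate(m_base, beta_x, kappa_hat[-1], k) for k in kappa_fc]
```

Because the central forecast of the index is linear in the horizon, each projected rate declines at the constant annual rate exp(beta_x × drift).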
The LC method implicitly assumes that the random errors are homoskedas-
tic. This assumption, which follows from the ordinary least squares estima-
tion method that is used as the main statistical tool, seems to be unrealistic,
as the logarithm of the observed mortality rate is much more variable at
older ages than at younger ages, because of the much smaller number of
deaths observed at old and very old ages.
In Brouhns et al. (2002b) and Brouhns et al. (2002a), possible improve-
ments of the LC method are investigated, using a Poisson random variation
for the number of deaths. This is instead of using the additive error term εx,t
in the expression for the logarithm of the central mortality rate (see (4.95)).
In terms of the force of mortality µx (t), the Poisson assumption means
that the random number of deaths at age x in calendar year t is given by
$$D_x(t) \sim \mathrm{Poisson}\left(\mathrm{ETR}_x(t)\, \mu_x(t)\right) \tag{4.97}$$
where ETRx (t) is the central number of exposed to risk. In order to define
the Poisson parameter ETRx (t) µx (t), Brouhns et al. (2002a) and Brouhns
et al. (2002b) assume a log-bilinear force of mortality, that is,
ln µx (t) = αx + βx κt (4.98)
hence with the structure expressed by (4.95), apart from the error term.
The meaning of the parameters αx , βx , κt is essentially the same as for
the corresponding parameters in the LC model. The parameters are then
determined by maximizing the log-likelihood based on (4.97) and (4.98).
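The objective function being maximized can be written down directly. The sketch below evaluates the Poisson log-likelihood for given parameter values; it does not perform the maximization itself, and the function name and the toy inputs are ours.

```python
import math

def poisson_loglik(deaths, exposure, alpha, beta, kappa):
    """Log-likelihood of D_x(t) ~ Poisson(ETR_x(t) * exp(alpha_x + beta_x * kappa_t)).

    deaths[i][j] and exposure[i][j] refer to age index i and calendar-year index j.
    """
    ll = 0.0
    for i, a in enumerate(alpha):
        for j, k in enumerate(kappa):
            lam = exposure[i][j] * math.exp(a + beta[i] * k)   # Poisson mean
            d = deaths[i][j]
            ll += d * math.log(lam) - lam - math.lgamma(d + 1) # log Poisson probability
    return ll

# One-cell check: exposure 2, force of mortality exp(0) = 1, three deaths observed.
ll_single = poisson_loglik([[3]], [[2.0]], alpha=[0.0], beta=[1.0], kappa=[0.0])

# Toy 2 x 2 data set (made up) to show the calling convention.
deaths = [[12, 10], [30, 27]]
exposure = [[1000.0, 1000.0], [1000.0, 1000.0]]
ll_toy = poisson_loglik(deaths, exposure,
                        alpha=[-4.5, -3.6], beta=[0.4, 0.6], kappa=[0.1, -0.1])
```

The term involving the factorial of the death counts does not depend on the parameters and can be dropped when maximizing; it is kept here so that the value is a proper log-probability.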
Brouhns et al. (2002b) do not modify the time series part of the LC
method. Hence, the estimates α̂x and β̂x are used with the forecasted κt
in order to generate future mortality rates (as in (4.96)), as well as other
age-specific quantities.
4.8 Further issues 173
First, consider the following projection model referred to the mortality odds
φx (t) = qx (t)/px (t):
$$\phi_x(t) = \phi_x(t')\, r^{\,t - t'} \tag{4.100}$$
where the first term on the right-hand side does not depend on t, whereas
the second term does not depend on x. Denoting the first term with A(x)
and the second term with B(t), equation (4.100) can be rewritten as follows:
with finite sets for the values of x and t. Constraints are usually as follows:
$$\sum_{x} \alpha_x = \sum_{t} \beta_t = \sum_{t - x} \gamma_{t-x} = 0 \tag{4.106}$$
Further weak points can be found in APC models like (4.102) and
(4.103). In particular, these models assume an age-independent period
effect, or an age-independent cohort effect, whereas the impact of mortality
improvements over time (or between cohorts) may vary with age.
As far as statistical evidence is concerned, both period and cohort effects
seem to impact on mortality improvements. In particular, it is reason-
able that period effects summarize contemporary factors, for example,
the general health status of the population, availability of healthcare ser-
vices, critical weather conditions, etc. Conversely, cohort effects quantify
historical factors, for example, World War II, diet, smoking habits, etc.
From a practical point of view, the main difficulty in implementing projec-
tion models allowing for cohort effects obviously lies in the fact that statist-
ical data for a very long period are required, and such data are rarely available.
4.9 References and suggestions for further reading 175
Conversely, from a general point of view, the role of period and cohort effects
in quantifying factors that affect mortality improvements suggests that we
consider future likely scenarios and, in particular, causes of death.
denote the force of mortality in both the vertical (with z = t) and the
diagonal (with z = t − x) approach. For the graduation of the parameters,
Cramér and Wold (1935) assumed that, in both the vertical and the diagonal
approach, α(z) is linear while ln β(z) and ln γ(z) are logistic.
The assumption formulated in 1934 by Kermack, McKendrick, and
McKinlay constitutes another example of the diagonal approach to mor-
tality projections. As Pollard (1949) notes, these authors showed that, for
some countries, it was reasonable to assume that the force of mortality
depended on the attained age x and the year of birth τ = t − x, and they
deduced that µx (t) = C(x) D(τ), where C(x) is a function of age only and
D(τ) is a function of the year of birth only; see also Section 4.8.1.
There are a number of both theoretical and practical papers dealing with
mortality forecasts, produced by actuaries as well as by demographers. The
reader interested in various perspectives on forecasting mortality should
refer to Tabeau et al. (2001), and Booth (2006), in which a number of
(see Section 4.4.3) has been dealt with by Delwarde and Denuit (2006); see
also Chapter 3.
Considerable research work has been recently devoted to improving and
generalizing the LC methodology. In particular, the reader should refer to
Carter (1996), Alho (2000), Renshaw and Haberman (2003a, b, c), Brouhns
and Denuit (2002), Brouhns et al. (2002b). See also the list of references in
Lee (2000).
Among the extensions of the LC method, we note the following developments.
Carter (1996) incorporates in the LC methodology uncertainty
about the estimated trend of mortality κt , through a specific model for the
trend itself. Renshaw and Haberman (2003c) have noted that the standard
LC methodology fails to capture and then project the recent upturn in crude
mortality rates in the age range 20–39 years. So, an extension of the LC
methodology is proposed, in order to incorporate in the LC model specific
age differential effects.
Booth et al. (2002) have developed systematic methods for choosing the
most appropriate subset of the data to use for modelling – the graduation
subset of Fig. 4.4. The importance of ensuring that the estimates α̂x and β̂x
are smooth with respect to age so that irregularities are not magnified via
extrapolations into the future has been discussed by Renshaw and Haber-
man (2003a), Renshaw and Haberman (2003c), De Jong and Tickle (2006),
and Delwarde et al. (2007).
A cause-of-death projection study was proposed by Pollard (1949), based
on Australian population data.
As regards scenario-based mortality forecasts, Gutterman and Van-
derhoof (1998) stress that a projection methodology should allow for
relationships between causes (e.g. advances in medical science) and effects
(mortality improvements).
5 Forecasting mortality: applications and examples of age-period models
5.1 Introduction
As explained in Chapter 4, actuaries working in life insurance and pensions
have been using projected life tables for some decades. But the problem
confronting actuaries is that people have been living much longer than they
were expected to according to the life tables being used for actuarial com-
putations. What was missing was an accurate estimation of the speed of the
mortality improvement: thus, most of the mortality projections performed
during the second half of the 20th century have underestimated the gains
in longevity. The mortality improvements seen in practice have quite con-
sistently exceeded the projected improvements. As a result, insurers have,
from time to time, been forced to allocate more capital to support their in-
force annuity business, with adverse effects on free reserves and profitability.
From the point of view of the actuarial approach to risk management, the
major problem is that mortality improvement is not a diversifiable risk. Tra-
ditional diversifiable mortality risk is the random variation around a fixed,
known life table. Mortality improvement risk, though, affects the whole
portfolio and can thus not be managed using the law of large numbers
(see Chapter 7 for a detailed discussion of systematic and non-systematic
risks). In this respect, longevity resembles investment risk, in that it is
non-diversifiable: it cannot be controlled by the usual insurance mecha-
nism of selling large numbers of policies, because they are not independent
in respect of that source of uncertainty. However, longevity is different
from investment risk in that there are currently no large traded markets
in longevity risk so that it cannot easily be hedged. The reaction to this
problem is twofold. First, actuaries are trying to produce better models for
mortality improvement, paying more attention to the levels of uncertainty
involved in the forecasts. The second part of the reaction is to look to the
182 5 : Age-period projection models
After the Crédit Suisse longevity index based on the expectation of life
derived from US data, the more comprehensive JPMorgan LifeMetrics has
innovated by producing publicly available indices on population longevity.
LifeMetrics is a toolkit for measuring and managing longevity and mor-
tality risk. LifeMetrics advisors include Watson Wyatt and the Pensions
Institute at Cass Business School. LifeMetrics Index provides mortality rates
and period life expectancy levels across various ages, by gender, for each
national population covered. Currently the LifeMetrics Index publishes
index values for the United States, England & Wales, and The Netherlands.
All of the methodology, algorithms and calculations are fully disclosed and
open. The LifeMetrics toolkit includes a set of computer based models that
can be used in forecasting mortality and longevity. These models have been
evaluated in the research paper ‘A quantitative comparison of eight stochas-
tic mortality models using data from England & Wales and the United
States’ by Cairns et al. (2007). The R source code required to run the forecast
models is available for download along with a user guide.
We also mention two other resources which are available from the web
(but which were not used in the present book). Federico Girosi and Gary
King offer the YourCast software that makes forecasts by running sets of
linear regressions together in a variety of sophisticated ways. This open
source software is freely available from http://gking.harvard.edu/yourcast/.
It implements the methods introduced in Federico Girosi and Gary King’s
manuscript on Demographic Forecasting, to be published by Princeton
University Press.
Further, we note the recent initiative of the British CMIB (Continuous
Mortality Investigation Bureau), that is, the bureau affiliated to the UK actu-
arial profession, with the function of producing mortality tables for use by
insurers and pension plans. The CMIB has made available software running
on R with the aim of illustrating the P-Spline methodology for projecting
mortality. CMIB software now allows the fitting of the Lee-Carter model
as well, but with restricted ARIMA specifications. For more details, please
consult http://www.actuaries.org.uk.
Before embarking on the presentation of the Lee–Carter and the Cairns–
Blake–Dowd approaches, let us say a few words about the material not
included in the present chapter. First, we do not consider possible cohort
effects, and limit our analysis to the age and period dimensions. For countries
like Belgium, cohort effects are weak enough to be neglected.
However, for countries like the UK, cohort effects are significant and must
be accounted for. Chapter 6 is devoted to the inclusion of cohort effects in
the Lee–Carter and Cairns–Blake–Dowd models discussed here.
Lee and Carter (1992) proposed a simple model for describing the secu-
lar change in mortality as a function of a single time index. Throughout
this chapter, we assume that assumption (3.2) is fulfilled, that is, that the
age-specific mortality rates are constant within bands of age and time, but
allowed to vary from one band to the next. Recall that under (3.2), the force
of mortality µx (t) and the death rate mx (t) coincide.
Lee and Carter (1992) specified a log-bilinear form for the force of
mortality µx (t), that is,
ln µx (t) = αx + βx κt (5.1)
The specification (5.1) differs structurally from parametric models given
that the dependence on age is non-parametric, and represented by the
sequences of αx ’s and βx ’s. Interpretation of the parameters is quite simple:
exp αx is the general shape of the mortality schedule and the actual forces
of mortality change according to an overall mortality index κt modulated
by an age response βx (the shape of the βx profile tells which rates decline
rapidly and which slowly over time in response to changes in κt ). The parameter
βx represents the age-specific patterns of mortality change. It indicates
the sensitivity of the logarithm of the force of mortality at age x to varia-
tions in the time index κt . In principle, βx could be negative at some ages
x, indicating that mortality at those ages tends to rise when falling at other
ages. In practice, this does not seem to happen over the long-run, except
sometimes at the very oldest ages. There is also some evidence of negative βx
estimates for males at young adult ages in certain industrialized countries.
This has been attributed to an increase in mortality due to AIDS in the late
1980s and 1990s.
In a typical population, age-specific death rates have a strong tendency
to move up and down together over time. The specification (5.1) uses this
tendency by modelling the changes over time in age-specific death rates as
5.2 Lee–Carter mortality projection model 187
driven by a scalar factor κt . This strategy implies that the modelled death
rates are perfectly correlated across ages, which is the strength but also the
weakness of the approach. As pointed out by Lee (2000), the rates of decline
in the ln µx (t)’s at different ages are given by βx (κt − κt−1 ) so that they
always maintain the same ratio to one another over time. In practice, the
relative speed of decline at different ages may vary. In such a case, the
extended version of the Lee–Carter model introduced by Booth et al. (2002)
– see equation (5.14) – or the Cairns–Blake–Dowd approach might be
preferable.
Remark Many models produce projected death rates that tend to 0. Hence,
some constraint should be imposed on the long-term behaviour of the
death rates. In that respect, limit life tables that have been discussed in
Section 4.6.1 may be specified, or we can use a forecast that incorporates
a theoretical maximum achievable life expectancy. This feature implies a
slowdown in the rate of mortality decline as the theoretical maximum life
expectancy is reached. If we denote as $\mu_x^{\infty}$ the limiting force of mortality,
the model becomes $\ln\left(\mu_x(t) - \mu_x^{\infty}\right) = \alpha_x + \beta_x \kappa_t$.
level and age pattern of mortality by country, the general time pattern of
mortality change, and the speed and age pattern of mortality change by
country. As for the Lee–Carter model, the extrapolation of estimates κ̂t
gives future mortality rates for given gender, age, time, and country. The
main interest of this method lies in the estimation of a unique time series
(or two if each gender is treated separately) which gives mortality rates for
all countries and age-time categories.
As expected, the analysis conducted by Delwarde et al. (2006) reveals
that age is the most important factor determining mortality rate. The time
effect is more relevant than the country effect if weights are taken into
account, which is a sign of convergence. In other words, the time horizon
is more important than the country, but since the country effect is not neg-
ligible, the differences between country-specific death rates increase with
time. These results allow us to compare the mortality experience observed
in the G5 countries through the same model and also to produce forecasts.
An estimated average death rate and a common index of mortality decline
can be obtained from the analysis, which is essential for economists. Most
financial and insurance decisions are taken on the basis of a worldwide
view, more than on a regional or particular location. From this analysis,
one can obtain baseline mortality forecasts from the pooled G5 popu-
lation, but at the same time, one can see the influence of each gender,
age, time trend, and country on the mortality forecast. In this way, the
observed past behaviour of the G5 is summarized in a single model and
the identification and comparison of each country specific effect become
much easier.
5.2.2 Calibration
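$$\ln \mu_x(t) = \tilde{\alpha}_x + \tilde{\beta}_x \tilde{\kappa}_t \tag{5.2}$$
with $\tilde{\alpha}_x = \alpha_x + c_1 \beta_x$, $\tilde{\beta}_x = \beta_x / c_2$ and
$\tilde{\kappa}_t = c_2 (\kappa_t - c_1)$. Therefore, we need to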
impose two constraints on the parameters αx , βx , and κt in order to prevent
the arbitrary selection of the parameters c1 and c2 .
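The invariance behind this identifiability issue can be checked numerically. The sketch below is only an illustration: the parameter values and the helper names `log_mu` and `transform` are made up for the example and do not come from the text.

```python
import math

def log_mu(alpha, beta, kappa):
    """Return the table ln mu_x(t) = alpha_x + beta_x * kappa_t as nested lists."""
    return [[a + b * k for k in kappa] for a, b in zip(alpha, beta)]

def transform(alpha, beta, kappa, c1, c2):
    """Reparameterize: alpha + c1*beta, beta/c2, c2*(kappa - c1), with c2 != 0."""
    alpha_t = [a + c1 * b for a, b in zip(alpha, beta)]
    beta_t = [b / c2 for b in beta]
    kappa_t = [c2 * (k - c1) for k in kappa]
    return alpha_t, beta_t, kappa_t

# Illustrative (made-up) parameter values for three ages and four years.
alpha = [-6.0, -4.5, -3.0]
beta = [0.5, 0.3, 0.2]
kappa = [3.0, 1.0, -1.0, -3.0]

original = log_mu(alpha, beta, kappa)
reparam = log_mu(*transform(alpha, beta, kappa, c1=2.5, c2=-4.0))

# Both parameter sets give the same log forces of mortality.
same = all(
    math.isclose(u, v, abs_tol=1e-12)
    for row_u, row_v in zip(original, reparam)
    for u, v in zip(row_u, row_v)
)
print(same)
```

Any non-zero c2 and any c1 leave the fitted surface unchanged, so the data alone cannot pin down a single parameter set.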
A pair of additional constraints are thus required on the parameters for
estimation to circumvent this problem. To some extent, the choice of the
constraints is a subjective one, although some choices are more natural than
others. In the literature, the parameters in (5.1) are usually subject to the
constraints
$$\sum_{t=t_1}^{t_n} \kappa_t = 0 \quad \text{and} \quad \sum_{x=x_1}^{x_m} \beta_x = 1 \tag{5.3}$$
Note that the lack of identifiability of the Lee-Carter model is not a real
problem. It just means that the likelihood associated with the model has an
infinite number of equivalent maxima, each of which would produce identi-
cal forecasts. Adopting the constraints (5.3) consists in picking one of these
equivalent maxima. The important point is that the choice of constraints
has no impact on the quality of the fit, or on forecasts of mortality. Some
care is needed, however, in any bootstrap procedures used for simulation
(see Section 5.8).
Empirical studies reveal that using the observed dxt ’s as weights (i.e.
wxt = dxt ) has the effect of bringing the parameters estimated into close
agreement with the Poisson-response-based estimates (discussed below).
However, the choice of the death counts as weights is questionable, and
the Poisson maximum likelihood approach described in the next section
has better statistical properties, and should therefore be preferred for infer-
ence purposes. The reason is that a valid weighted least-squares approach
must use exogenous weights, but obviously the number of deaths is a random
variable. As such, estimates resulting from the minimization of $O_{WLS}$
have no known statistical properties and can be strongly biased.
Effective computation: Singular value decomposition
Setting $\frac{\partial}{\partial \alpha_x} O_{LS}$ equal to 0 yields
$$\sum_{t=t_1}^{t_n} \ln \hat{m}_x(t) = (t_n - t_1 + 1)\,\alpha_x + \beta_x \sum_{t=t_1}^{t_n} \kappa_t \tag{5.7}$$
Since $\sum_{t=t_1}^{t_n} \kappa_t = 0$ by the constraint (5.3), we get
$$\hat{\alpha}_x = \frac{1}{t_n - t_1 + 1} \sum_{t=t_1}^{t_n} \ln \hat{m}_x(t) \tag{5.8}$$
The minimization of (5.5) thus consists in taking for $\hat{\alpha}_x$ the row average
of the $\ln \hat{m}_x(t)$'s. When the model (5.4) is fitted by ordinary least-squares,
the fitted value of αx exactly equals the average of $\ln \hat{m}_x(t)$ over time t so
that exp αx represents the general shape of the mortality schedule. We then
obtain the $\hat{\beta}_x$'s and $\hat{\kappa}_t$'s from the first term of a singular value
decomposition of the matrix $\ln \hat{m}_x(t) - \hat{\alpha}_x$.
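Specifically, death rates can be combined to form a matrix
$$M = \begin{pmatrix} m_{x_1}(t_1) & \cdots & m_{x_1}(t_n) \\ \vdots & \ddots & \vdots \\ m_{x_m}(t_1) & \cdots & m_{x_m}(t_n) \end{pmatrix} \tag{5.9}$$
of dimension (xm − x1 + 1) × (tn − t1 + 1). Model (5.1) is then fitted so that
it reproduces M as closely as possible. Now, let us create the matrix
$$Z = \ln \hat{M} - \hat{\alpha} = \begin{pmatrix} \ln \hat{m}_{x_1}(t_1) - \hat{\alpha}_{x_1} & \cdots & \ln \hat{m}_{x_1}(t_n) - \hat{\alpha}_{x_1} \\ \vdots & \ddots & \vdots \\ \ln \hat{m}_{x_m}(t_1) - \hat{\alpha}_{x_m} & \cdots & \ln \hat{m}_{x_m}(t_n) - \hat{\alpha}_{x_m} \end{pmatrix} \tag{5.10}$$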
of dimension (xm −x1 +1)×(tn −t1 +1). Approximating the zxt ’s with their
Lee–Carter expression βx κt indicates that the absence of age-time interac-
tions is assumed, that is, the βx ’s are fixed over time and the κt ’s are fixed
over ages. Most data sets do not comply with the time-invariance of the
βx ’s, unless the optimal fitting period has been selected as explained below.
Now, the $\hat{\beta}_x$'s and $\hat{\kappa}_t$'s are such that they minimize
$$\tilde{O}_{LS}(\beta, \kappa) = \sum_{x=x_1}^{x_m} \sum_{t=t_1}^{t_n} \bigl( z_{xt} - \beta_x \kappa_t \bigr)^2 \tag{5.11}$$
provided that $\sum_{j=1}^{x_m - x_1 + 1} v_{1j} \neq 0$. The constraints (5.3) are then satisfied by
the $\hat{\beta}_x$'s and $\hat{\kappa}_t$'s. Note that the second and higher terms of the singular value
decomposition together comprise the residuals. Typically, for low mortality
populations, the first order approximation (5.12) behind the Lee–Carter
model accounts for about 95% of the variance of the $\ln \hat{m}_x(t)$'s.
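The least-squares fit by singular value decomposition can be sketched with NumPy as follows. This is a minimal illustration, not the authors' implementation: the function name `fit_lee_carter_svd` is hypothetical, the input is assumed to be a complete matrix of log death rates with ages in rows and calendar years in columns, and the normalization follows the constraints (5.3).

```python
import numpy as np

def fit_lee_carter_svd(log_m):
    """Fit alpha_x, beta_x, kappa_t to log death rates (ages in rows, years in columns)."""
    log_m = np.asarray(log_m, dtype=float)
    alpha = log_m.mean(axis=1)                  # row averages, as in (5.8)
    z = log_m - alpha[:, None]                  # centred matrix Z of (5.10)
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    scale = u[:, 0].sum()                       # must be non-zero to normalise beta
    if np.isclose(scale, 0.0):
        raise ValueError("first age vector sums to zero; cannot impose sum(beta) = 1")
    beta = u[:, 0] / scale                      # sum of beta_x equals 1
    kappa = s[0] * scale * vt[0, :]             # sums to 0 because each row of Z does
    return alpha, beta, kappa
```

Because only the product of the age and time vectors matters, the sign ambiguity of the singular vectors cancels in the normalization.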
Remark As pointed out by Booth et al. (2002), the original approach by
Lee and Carter (1992) makes use of only the first term of the singular value
decomposition of the matrix of centered log death rates. In principle, the
second- and higher-order terms could be incorporated in the model. The full
expanded model is
$$\ln \hat{m}_x(t) = \alpha_x + \sum_{j=1}^{r} \beta_x^{[j]} \kappa_t^{[j]} \tag{5.14}$$
where r is the rank of the $\ln m_x(t) - \alpha_x$ matrix. In this case, $\beta_x^{[j]} \kappa_t^{[j]}$ is referred
to as the jth order term of the approximation. Any systematic variation
in the residuals from fitting only the first term would be captured by the
second and higher terms. In their empirical illustration, Booth et al. (2002)
find a diagonal pattern in the residuals that was interpreted as a cohort-
period effect. We will come back to the modelling of cohort effects in the
next chapter. Brouhns et al. (2002b) have tested whether the inclusion of
a second log-bilinear term significantly improves the quality of the fit, and
this was not the case in their empirical illustrations.
Renshaw and Haberman (2003a) report on the failure of the first-order
Lee–Carter model to capture important aspects of the England and Wales
mortality experience (despite explaining about 95% of the total variance)
together with the presence of noteworthy residual patterns in the second-
order term. As a consequence, Renshaw and Haberman (2003b) have
investigated the feasibility of constructing mortality forecasts on the basis
of the first two sets of SVD vectors, rather than just on the first set of such
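$$0 = \sum_{x=x_1}^{x_m} \beta_x \bigl( \ln \hat{m}_x(t) - \alpha_x - \beta_x \kappa_t \bigr), \qquad t = t_1, t_2, \ldots, t_n \tag{5.15}$$
$$0 = \sum_{t=t_1}^{t_n} \kappa_t \bigl( \ln \hat{m}_x(t) - \alpha_x - \beta_x \kappa_t \bigr), \qquad x = x_1, x_2, \ldots, x_m$$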
$$\xi^{(k+1)} = \xi^{(k)} - \frac{f(\xi^{(k)})}{f'(\xi^{(k)})}$$
Each time one of the Lee–Carter parameters αx , βx and κt is updated, the
already revised values of the other parameters are used in the iterative
formulas. The recurrence relations are thus as follows:
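$$\hat{\alpha}_x^{(k+1)} = \hat{\alpha}_x^{(k)} + \frac{\sum_{t=t_1}^{t_n} \bigl( \ln \hat{m}_x(t) - \hat{\alpha}_x^{(k)} - \hat{\beta}_x^{(k)} \hat{\kappa}_t^{(k)} \bigr)}{t_n - t_1 + 1}$$
$$\hat{\kappa}_t^{(k+1)} = \hat{\kappa}_t^{(k)} + \frac{\sum_{x=x_1}^{x_m} \hat{\beta}_x^{(k)} \bigl( \ln \hat{m}_x(t) - \hat{\alpha}_x^{(k+1)} - \hat{\beta}_x^{(k)} \hat{\kappa}_t^{(k)} \bigr)}{\sum_{x=x_1}^{x_m} \bigl( \hat{\beta}_x^{(k)} \bigr)^2} \tag{5.16}$$
$$\hat{\beta}_x^{(k+1)} = \hat{\beta}_x^{(k)} + \frac{\sum_{t=t_1}^{t_n} \hat{\kappa}_t^{(k+1)} \bigl( \ln \hat{m}_x(t) - \hat{\alpha}_x^{(k+1)} - \hat{\beta}_x^{(k)} \hat{\kappa}_t^{(k+1)} \bigr)}{\sum_{t=t_1}^{t_n} \bigl( \hat{\kappa}_t^{(k+1)} \bigr)^2}$$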
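The recurrence relations can be sketched in plain Python as below. The function name `lee_carter_iterate`, the starting values, and the fixed number of iterations are assumptions made for the illustration. The constraints (5.3) are not imposed inside the loop, so only the fitted values αx + βx κt are directly comparable with another fit.

```python
def lee_carter_iterate(log_m, n_iter=50):
    """Run the update scheme (5.16) on log death rates indexed [age][year]."""
    n_ages, n_years = len(log_m), len(log_m[0])
    # Starting values are an assumption: alpha = 0, flat beta, kappa = 0.
    alpha = [0.0] * n_ages
    beta = [1.0 / n_ages] * n_ages
    kappa = [0.0] * n_years

    def resid(x, t):
        # Residual evaluated with the most recently updated parameter values.
        return log_m[x][t] - alpha[x] - beta[x] * kappa[t]

    for _ in range(n_iter):
        for x in range(n_ages):
            alpha[x] += sum(resid(x, t) for t in range(n_years)) / n_years
        beta_sq = sum(b * b for b in beta)
        for t in range(n_years):
            kappa[t] += sum(beta[x] * resid(x, t) for x in range(n_ages)) / beta_sq
        kappa_sq = sum(k * k for k in kappa)
        if kappa_sq == 0.0:
            break  # nothing left for the bilinear term to explain
        for x in range(n_ages):
            beta[x] += sum(kappa[t] * resid(x, t) for t in range(n_years)) / kappa_sq
    return alpha, beta, kappa
```

In practice a convergence criterion on the sum of squared residuals would replace the fixed iteration count used here.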
194 5 : Age-period projection models
$$\sum_{x=x_1}^{x_m} D_{xt} = \sum_{x=x_1}^{x_m} \mathrm{ETR}_{xt} \exp\bigl( \hat{\alpha}_x + \hat{\beta}_x \zeta \bigr) \tag{5.17}$$
in ζ. So, the κt ’s are reestimated in such a way that the resulting death rates
(with the previously estimated $\hat{\alpha}_x$ and $\hat{\beta}_x$), applied to the actual risk exposure,
produce the total number of deaths actually observed in the data for
the year t in question. There are several advantages to making this second
stage estimate of the parameters κt . In particular, it avoids sizable discrep-
ancies between predicted and actual deaths (which may occur because the
model (5.4) is specified by means of logarithms of death rates). We note
that no explicit solution is available for (5.17), which has thus to be solved
numerically (using a Newton–Raphson procedure, for instance).
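A Newton–Raphson solution of (5.17) for a single calendar year might look as follows. This is a sketch under stated assumptions: the function name `refit_kappa`, the starting value, and the stopping rule are illustrative choices, and the inputs are lists over ages for one year t.

```python
import math

def refit_kappa(deaths, exposures, alpha, beta, start=0.0, tol=1e-10, max_iter=100):
    """Solve sum_x D_x = sum_x ETR_x * exp(alpha_x + beta_x * zeta) for zeta."""
    total_deaths = sum(deaths)
    zeta = start
    for _ in range(max_iter):
        fitted = [e * math.exp(a + b * zeta) for e, a, b in zip(exposures, alpha, beta)]
        f = sum(fitted) - total_deaths                      # f(zeta)
        f_prime = sum(b * d for b, d in zip(beta, fitted))  # f'(zeta)
        if f_prime == 0.0:
            raise ZeroDivisionError("derivative vanished; try another starting value")
        step = f / f_prime
        zeta -= step
        if abs(step) < tol:
            return zeta
    raise RuntimeError("Newton-Raphson did not converge")
```

A natural starting value is the first-stage estimate of κt for the year in question. When all βx share the same sign, f is monotone in ζ and the root is unique.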
It is worth mentioning that more than one solution for (5.17) may arise
when the $\hat{\beta}_x$'s do not all have the same sign. A nonuniform sign for the $\hat{\beta}_x$'s
implies that mortality is increasing at some ages and decreasing at others.
This is not normally expected to happen, except sometimes at advanced ages
(but the phenomenon disappears when the actuary starts the modelling by
closing the life tables). Therefore, solving (5.17) usually does not pose any
problem.
in ζ.
The advantage of this second adjustment procedure is that it does not
require exposures-to-risk nor death counts and is thus generally applicable.
Note that, as before, numerical problems may arise when the $\hat{\beta}_x$'s do not
have the same sign, but we believe that this problem is unlikely to occur in
practice.
Statistical model
Let us now assume that the actuary has at his/her disposal observed death
counts Dxt and corresponding exposures ETRxt . Then, the least-squares
approach can be applied to the ratio of the death numbers to the exposure
(i.e. to the $\hat{m}_x(t) = D_{xt}/\mathrm{ETR}_{xt}$'s, as explained above). The method
presented in this section better exploits the available information, and does
not assume that the variability of the $\hat{m}_x(t)$'s is the same whatever the age
x. Specifically, we assume that the number of deaths at age x in year t
has a Poisson random variation. To justify this approach, we prove that
assumption (3.2) is compatible with Poisson modelling for death counts.
To this end, let us focus on a particular pair: age x – calendar year t. We
observe Dxt deaths among Lxt individuals aged x on January 1 of year t.
We assume that the remaining lifetimes of these individuals are independent
and identically distributed. The likelihood function (3.12) is proportional to
the Poisson likelihood, that is, the one obtained under the assumption that
Dxt is Poisson distributed with mean ETRxt µx (t) = ETRxt exp(αx + βx κt )
where the parameters are still subjected to the constraints (5.3). Therefore,
provided that we resort to the maximum likelihood estimation procedure,
working on the basis of the ‘true’ likelihood (3.12) or working on the
basis of the Poisson likelihood are equivalent, once the assumption (3.2)
has been made.
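$$L(\alpha, \beta, \kappa) = \sum_{x=x_1}^{x_m} \sum_{t=t_1}^{t_n} \Bigl\{ D_{xt} (\alpha_x + \beta_x \kappa_t) - \mathrm{ETR}_{xt} \exp(\alpha_x + \beta_x \kappa_t) \Bigr\} + \text{constant}. \tag{5.21}$$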
Equivalently, the parameters are estimated by minimizing the associated
deviance defined as
D = −2(L(α, β, κ) − Lf ) (5.22)
where Lf is the log-likelihood of the full or saturated model (characterized
by equating the fitted and actual numbers of deaths).
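The log-likelihood (5.21) and the deviance (5.22) can be evaluated directly, as in the sketch below. The function names are hypothetical, inputs are assumed to be nested lists indexed [age][year], and the additive constant of (5.21) is dropped since it cancels in the deviance.

```python
import math

def poisson_loglik(deaths, exposures, alpha, beta, kappa):
    """Log-likelihood (5.21) up to its additive constant."""
    total = 0.0
    for x, (a, b) in enumerate(zip(alpha, beta)):
        for t, k in enumerate(kappa):
            eta = a + b * k
            total += deaths[x][t] * eta - exposures[x][t] * math.exp(eta)
    return total

def poisson_deviance(deaths, exposures, alpha, beta, kappa):
    """Deviance (5.22): twice the log-likelihood gap to the saturated model."""
    dev = 0.0
    for x, (a, b) in enumerate(zip(alpha, beta)):
        for t, k in enumerate(kappa):
            d = deaths[x][t]
            fitted = exposures[x][t] * math.exp(a + b * k)
            # The term d*ln(d/fitted) is taken as 0 when d = 0 (its limit).
            log_term = d * math.log(d / fitted) if d > 0 else 0.0
            dev += 2.0 * (log_term - (d - fitted))
    return dev
```

The deviance is zero when fitted and actual deaths coincide in every cell and positive otherwise, which makes it a convenient quantity to monitor during the iterations.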
The criterion used to stop the procedure is a relative increase in the log-
likelihood function that is smaller than a pre-selected sufficiently small fixed
number.
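The maximum likelihood estimates of the parameters coming out of
(5.23) have to be adapted in order to fulfill the constraints (5.3): specifically,
we replace $\hat{\kappa}_t$ with $(\hat{\kappa}_t - \bar{\kappa}) \sum_{x=x_1}^{x_m} \hat{\beta}_x$, $\hat{\beta}_x$ with $\hat{\beta}_x / \sum_{x=x_1}^{x_m} \hat{\beta}_x$, and
$\hat{\alpha}_x$ with $\hat{\alpha}_x + \hat{\beta}_x \bar{\kappa}$, where $\bar{\kappa}$ denotes the average of the $\hat{\kappa}_t$'s.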
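The rescaling that enforces the constraints (5.3) is easy to sketch in code. The function name `impose_constraints` is hypothetical, and the quantity subtracted from the κ estimates is taken to be their average, which is what the zero-sum constraint requires.

```python
def impose_constraints(alpha, beta, kappa):
    """Rescale estimates so that sum(kappa) = 0 and sum(beta) = 1, as in (5.3)."""
    kappa_bar = sum(kappa) / len(kappa)     # average of the kappa estimates
    beta_sum = sum(beta)
    if beta_sum == 0.0:
        raise ValueError("sum of beta is zero; the rescaling is undefined")
    new_alpha = [a + b * kappa_bar for a, b in zip(alpha, beta)]
    new_beta = [b / beta_sum for b in beta]
    new_kappa = [(k - kappa_bar) * beta_sum for k in kappa]
    return new_alpha, new_beta, new_kappa
```

The fitted values αx + βx κt are unchanged by this transformation, so the likelihood and the forecasts are unaffected.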
Remark As pointed out by Renshaw and Haberman (2006), the error struc-
ture can be imposed by specifying the second moment properties of the
model, as in the framework of generalized linear modelling. This allows for
a range of options for the choice of the error distribution, including Pois-
son, both with and without dispersion, as well as Gaussian, as used in the
original approach by Lee and Carter (1992).
which is similar to (5.17) except that the sum is now over calendar time
instead of age. So, the estimated κt ’s are such that the resulting death rates
applied to the actual risk exposure produce the total number of deaths
actually observed in the data for each age x. Sizable discrepancies between
predicted and actual deaths are thus avoided.
$$L(\alpha, \beta, \kappa) = \sum_{t=t_1}^{t_n} \sum_{x=x_1}^{x_m} \Bigl\{ d_{xt} \ln\bigl( 1 - \hat{q}_x(t) \bigr) + d_{xt} \ln \hat{q}_x(t) \Bigr\} + \text{constant} \tag{5.26}$$
have suggested the replacement of the Poisson model with a Mixed Poisson
one. Given $\epsilon_{xt}$, the number of deaths Dxt is assumed to be Poisson
distributed with mean $\mathrm{ETR}_{xt} \exp(\alpha_x + \beta_x \kappa_t + \epsilon_{xt})$. Unconditionally, Dxt obeys
a mixture of Poisson distributions. The $\epsilon_{xt}$'s are assumed to be independent
and identically distributed. A prominent example consists in taking the
Dxt ’s to be Negative Binomial distributed. See also Renshaw and Haberman
(2008).
Mortality data from the life insurance market often exhibit overdisper-
sion because of the presence of duplicates. It is common for individuals to
hold more than one life insurance or annuity policy and hence to appear
more than once in the count of exposed to risk or deaths. In such a case,
the portfolio is said to contain duplicates, that is, the portfolio contains sev-
eral policies concerning the same lives. It is well known that the variance
becomes inflated in the presence of duplicates. Consequently, even if the
portfolio (or one of its risk classes) is homogeneous, the presence of duplicates
would increase the variance and cause overdispersion. The overdispersed
Poisson and Negative Binomial models for estimating the parameters of log-
bilinear models for mortality projections are thus particularly promising for
actuarial applications.
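The variance inflation caused by duplicates can be seen in a stylized case. Assume, purely for illustration (this setting is not taken from the text), that every life holds exactly c policies and that the number of deaths among lives, D, is Poisson distributed with mean λ. The number of policies terminating by death is then D* = cD, so that

```latex
\mathrm{E}[D^*] = c\,\lambda, \qquad
\mathrm{Var}[D^*] = c^2 \lambda, \qquad
\frac{\mathrm{Var}[D^*]}{\mathrm{E}[D^*]} = c > 1 \quad \text{whenever } c > 1 .
```

A policy-based count therefore has a variance exceeding its mean even though the underlying lives are homogeneous, which is the overdispersion described above.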
[Figure 5.1 plots omitted: panels Alpha and Beta against Age, Kappa against Time.]
Figure 5.1. Estimated αx , βx , and κt (from left to right), x = 0, 1, . . . , 104 (top panels) and x = 60, 61, . . . , 104 (bottom panels), t = 1920, 1921, . . . , 2005,
obtained with HMD data by minimizing the sum of squares (5.5) with the estimated κt ’s adjusted by refitting to the period life expectancies at birth or at age
60 (for the estimated κt ’s, the values before adjustment are displayed in broken line).
5.3 Cairns–Blake–Dowd mortality projection model 203
It is important to mention that the sole use of the proportion of the total
temporal variance (as measured by the ratio of the first singular value to the
sum of singular values) is not a satisfactory diagnostic indicator. An exam-
ination of the residuals is needed to check for model adequacy (see below).
The fitted mortality surfaces are depicted in Fig. 5.2. These surfaces
should be compared with Fig. 5.3. The mortality experience appears rather
smooth, with some ridges around 1940–1945.
We now fit the log-bilinear model to the HMD data set by the method of
Poisson maximum likelihood. All of the ages 0–104 are included in the anal-
ysis. Figure 5.3 (top panels) plots the estimated αx , βx and κt . The estimated
parameters are compared with those obtained by minimizing the sum of the
squared residuals (5.5). We see that the least-squares and Poisson maximum
likelihood procedures produce very similar sets of estimated parameters αx ,
βx , and κt .
As above, we restrict ourselves to ages above 60. Figure 5.3 (bottom pan-
els) plots the estimated αx , βx , and κt . The estimated parameters are com-
pared with those obtained by minimizing least squares. We observe sizeable
discrepancies between the $\hat{\beta}_x$'s produced by the least-squares and Poisson
maximum likelihood procedures, whereas the $\hat{\alpha}_x$'s and $\hat{\kappa}_t$'s remain similar.
where κt[1] and κt[2] are themselves stochastic processes. This specification
does not suffer from any identifiability problems so that no constraints need
to be specified.
We see that age is now treated as a continuous covariate and enters the
model in a linear way on the logit scale. The intercept κt[1] and slope κt[2]
parameters make up a bivariate time series the future path of which governs
the projected life tables. The intercept period term κt[1] is generally declining
[Figure 5.2 plots omitted: two surfaces of log death rates against calendar year t and age x.]
Figure 5.2. Fitted death rates (on the log scale) for Belgian males, ages 0–104 (top panel) and
ages 60–104 (bottom panel), period 1920–2005.
[Figure 5.3 plots omitted: panels Alpha and Beta against Age, Kappa against Time.]
Figure 5.3. Estimated αx , βx , and κt (from left to right), x = 0, 1, . . . , 104 (top panels) and x = 60, 61, . . . , 104 (bottom panels), t = 1920, 1921, . . . , 2005,
obtained with HMD data by maximizing the Poisson log-likelihood (5.21) (the values obtained by least-squares are displayed in broken line).
over time, which corresponds to the feature that mortality rates have been
decreasing over time at all ages. Hence, the upward-sloping plot of the logit
of death probabilities against age is shifting downwards over time. If during
the fitting period, the mortality improvements have been greater at lower
ages than at higher ages, the slope period term κt[2] would be increasing over
time. In such a case, the plot of the logit of death probabilities against age
would become steeper as it shifts downwards over time.
Sometimes, the logit of the death probabilities qx (t) plotted against age x
exhibits a slight curvature after retirement age. This curvature can be mod-
elled by including a quadratic term in age in the Cairns–Blake–Dowd model.
However, the dynamics of the time factor associated with this quadratic
effect often remains unclear and when combined with the quadratic age
term, its contribution to mortality dynamics is highly complex.
The Cairns–Blake–Dowd model has two time series κt[1] and κt[2] which
affect different ages in different ways. This is a fundamental difference com-
pared with the 1-factor Lee–Carter approach where a single time series
induces perfect correlation in mortality rates at different ages from one
year to the next. There is empirical evidence to suggest that changes in the
death rates are imperfectly correlated, which supports the Cairns–Blake–
Dowd model or the 2-factor Lee–Carter model represented by equation
(5.14) with r = 2. Compared to the 1-factor Lee–Carter model, the Cairns–
Blake–Dowd model thus allows changes in underlying mortality rates that
are not perfectly correlated across ages. Also, the longer the run of data that
the actuary uses, the better the 2-factor model performs relative to its 1-factor
counterpart. For example, if we consider the entire 20th century, mortality
improvements concentrate on younger ages during the first half of the cen-
tury and on higher ages during the second half. We need a 2-factor model to
capture these two different dynamics. Note, however, that the restriction to
the optimal fitting period in the Lee–Carter case favours recent past history
so that the inclusion of a second factor may not be needed.
Note that the switch from a unique time series to a pair of time-dynamic
factors has far-reaching consequences when we discuss securitization, as the
existence of an imperfect correlation structure implies, for example, that
hedging longevity-linked liabilities would require more than one hedging
instrument.
5.3.2 Calibration
these observations, we would like to estimate the intercept κt[1] and slope
κt[2] parameters. This can be done by least-squares. This means that the
regression model
$$\ln \frac{\hat{q}_x(t)}{\hat{p}_x(t)} = \kappa_t^{[1]} + \kappa_t^{[2]} x + \epsilon_x(t) \tag{5.29}$$
is fitted to the observations of calendar year t, where the $\hat{q}_x(t)$'s are the
crude one-year death probabilities, and where the error terms $\epsilon_x(t)$ are
independent and Normally distributed, with mean 0 and constant variance
σ². The objective function
$$O_t(\kappa) = \sum_{x=x_1}^{x_m} \left( \ln \frac{\hat{q}_x(t)}{\hat{p}_x(t)} - \kappa_t^{[1]} - \kappa_t^{[2]} x \right)^2 \tag{5.30}$$
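For a single calendar year, the least-squares fit of the logit of the crude death probabilities on age reduces to a simple linear regression. The sketch below is illustrative only: the function name `fit_cbd_year` is hypothetical, and the crude probabilities are assumed to lie strictly between 0 and 1.

```python
import math

def fit_cbd_year(ages, q):
    """OLS fit of ln(q_x / p_x) = k1 + k2 * x for one calendar year."""
    y = [math.log(qx / (1.0 - qx)) for qx in q]      # logit, with p = 1 - q
    n = len(ages)
    x_bar = sum(ages) / n
    y_bar = sum(y) / n
    sxx = sum((x - x_bar) ** 2 for x in ages)
    sxy = sum((x - x_bar) * (yi - y_bar) for x, yi in zip(ages, y))
    k2 = sxy / sxx                                   # slope
    k1 = y_bar - k2 * x_bar                          # intercept
    sse = sum((yi - k1 - k2 * x) ** 2 for x, yi in zip(ages, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1.0 - sse / sst                             # adjustment coefficient R^2
    return k1, k2, r2
```

Repeating the fit for each calendar year yields the bivariate series of intercepts and slopes, together with a goodness-of-fit measure per year.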
[Figure 5.4 plots omitted: panels k[1], k[2], and R2[t], each against calendar year t.]
Figure 5.4. Estimated $\kappa_t^{[1]}$ and $\kappa_t^{[2]}$ parameters together with the values of the adjustment coefficient by calendar year (from left to right), for ages
x = 0, 1, . . . , 104 (top panels) and x = 60, 61, . . . , 104 (bottom panels), t = 1920, 1921, . . . , 2005, obtained with HMD data by least-squares.
to right, we see the estimated κt[1] ’s, the estimated κt[2] ’s, and the value
of the adjustment coefficient R²(t) for each calendar year t. The bottom
panels give the corresponding results for the restricted age range 60,
61, . . . , 104.
When all of the ages are considered, the estimated κt[1] ’s exhibit a down-
ward trend, which expresses the improvement in mortality rates over time
for all ages. A peak around 1940–1945 indicates a higher mortality experi-
ence during World War II. The estimated κt[2] ’s tend to increase over time,
indicating that mortality improvements have been comparatively greater at
younger ages over the period 1920–2005. We note that World War II also
affected the estimated κt[2] ’s, with a decrease in the early 1940s. The values
of the adjustment coefficient R²(t) indicate that the Cairns–Blake–Dowd
model explains from about 80% of the variance in 1920 to about 95% in
the early 2000s.
If we restrict the age range to 60, 61, . . . , 104, we see that the goodness-
of-fit is greatly increased, with adjustment coefficients larger than 99%. The
Cairns–Blake–Dowd model takes advantage of the approximate linearity in
age (on the logit scale) at higher ages to provide a parsimonious represen-
tation of one-year death probabilities. The adjustment coefficients close to
1 demonstrate the ability of the Cairns–Blake–Dowd model to describe the
mortality experienced in Belgium. The trend in the estimated intercept and
slope parameters is less clear, unless we restrict our interest to the latter part
of the 20th century, where the estimated κt[1] ’s and κt[2] ’s become markedly
linear (with a decreasing trend for the former, and an increasing one for the
latter).
5.4 Smoothing
5.4.1 Motivation
Actuaries use projected life tables in order to compute life annuity prices,
life insurance premiums as well as reserves that have to be held by insurance
companies to enable them to pay the future contractual benefits.
Any irregularities in these life tables would then be passed on to the price
list and to balance sheets, which is not desirable. Therefore, as long as these
irregularities do not reveal particular features of the risk covered by the
insurer, but are likely to be caused by sampling errors, actuaries prefer to
resort to statistical techniques to produce life tables that exhibit a regular
progression, in particular with respect to age.
Durban and Eilers (2004) have smoothed death rates with P-splines in the
context of a Poisson model. The P-spline approach is an example of a regres-
sion model and is similar to the generalized linear modelling discussed in
Section 4.5.4. But unlike generalized linear models, P-splines allow for more
flexibility in modelling observed mortality.
Regression models take a family of basis functions, and choose a com-
bination of them that best fits the data according to some criterion. The
P-spline approach uses a spline basis, with a penalty function that is
introduced in order to avoid overfitting. P-splines are related to B-splines
which have been discussed in Section 2.6.3. Recall that univariate, or uni-
dimensional, B-splines are a set of basis functions each of which depends
on the placement of a set of ‘knot’ points providing full coverage of the
range of data. Defining B-splines in two dimensions is straightforward. We
define knots in each dimension, and each set of knots gives rise to a uni-
variate B-spline basis. The two-dimensional B-splines are then obtained by
multiplying the respective elements of these two bases.
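The product construction can be sketched with the standard Cox–de Boor recursion for the univariate basis. The function names, the cubic degree, and the equally spaced knots used in the usage note are assumptions made for the illustration and are not prescribed by the text.

```python
def bspline_basis(i, degree, knots, u):
    """Cox-de Boor recursion: value at u of the i-th B-spline of the given degree."""
    if degree == 0:
        return 1.0 if knots[i] <= u < knots[i + 1] else 0.0
    left = right = 0.0
    d1 = knots[i + degree] - knots[i]
    d2 = knots[i + degree + 1] - knots[i + 1]
    if d1 > 0:
        left = (u - knots[i]) / d1 * bspline_basis(i, degree - 1, knots, u)
    if d2 > 0:
        right = (knots[i + degree + 1] - u) / d2 * bspline_basis(i + 1, degree - 1, knots, u)
    return left + right

def tensor_basis(i, j, degree, age_knots, year_knots, x, t):
    """Two-dimensional basis function B_ij(x, t) as a product of univariate B-splines."""
    return bspline_basis(i, degree, age_knots, x) * bspline_basis(j, degree, year_knots, t)
```

With, say, age knots every 10 years from 30 to 130 and year knots every 10 years from 1890 to 2040, the cubic bases cover ages 60 to 100 and years 1920 to 2010, and the two-dimensional basis functions sum to one at every point of that region.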
Durban and Eilers (2004) have suggested a decomposition of µx (t) as
follows:
$$\ln \mu_x(t) = \sum_{i,j} \theta_{ij} B_{ij}(x, t) \tag{5.32}$$
$(\theta_{i+1,j-1} - 2\theta_{ij} + \theta_{i-1,j+1})^2$ across cohorts. The CMI Bureau in the UK has
suggested the use of age and cohort penalties (see also Chapter 6). Each
of these penalties involves an unknown weight coefficient that has to be
selected from the data.
Note that there is a difference in the structural assumption behind the P-
spline approach, compared with the Lee–Carter and Cairns–Blake–Dowd
alternative approaches: the P-spline approach assumes that there is smooth-
ness in the underlying mortality surface in the period effects as well as in the
age and cohort effects. Some further extensions have recently been proposed
to account for period shocks.
The P-splines approach is a powerful smoothing procedure for the
observed mortality surface. Using the penalty to project the θij ’s to the
future, it is also possible to use this tool to forecast future mortality rates,
by extrapolating the smooth mortality surface. However, as pointed out
by Cairns et al. (2007), the P-spline approach to mortality forecasting is
not transparent. Its output is a smooth surface fitted to historical data and
then projected into the future. An important difference (compared with the
Lee–Carter and Cairns–Blake–Dowd alternatives) is that forecasting with
the P-splines approach is a direct consequence of the smoothing process.
The choice of the penalty then corresponds to a view of the future pattern
of mortality. In contrast, the two stages of fitting the data and extrapolating
past trends are kept separate in the Lee–Carter and Cairns–Blake–Dowd
approaches. This is an advantage for actuarial applications, since it allows
for more flexibility.
Moreover, the form of the penalty is usually difficult to infer from the
data, whereas it entirely drives the P-spline mortality forecast (a similar
feature occurs in period-based mortality graduation using splines when
mortality rates are extrapolated beyond the data to the oldest ages). The
degree of smoothing in empirical applications depends on the variability
of the observed death rates. The size of the population under study
and the range of ages considered thus both influence the smoothing
coefficient and, possibly, the choice of the penalty. In the Lee–Carter
and Cairns–Blake–Dowd approaches, these features of the data do not
directly affect the projection of the time index. As the order of the penalty
has no discernible effect on the smoothness of the observed data, it is
hard to deduce it from the observed data. The choice of the penalty,
in fact, corresponds to a view of the future pattern of mortality: future
mortality continuing at a constant level, future mortality improving at a
constant rate or future mortality improving at an accelerating (quadratic)
rate.
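The link between the order of the penalty and the implied view of the future can be illustrated on a single sequence of coefficients. The sketch assumes that, beyond the data, the coefficients are chosen so that every new difference of the given order vanishes (the penalty is then not increased by the projected values). The function name `extrapolate` and the sample coefficients are hypothetical.

```python
from math import comb

def extrapolate(theta, order, steps):
    """Extend theta so that every new difference of the given order is zero."""
    out = list(theta)
    for _ in range(steps):
        # Solve sum_{k=0}^{d} (-1)^k C(d, k) theta_{n+1-k} = 0 for theta_{n+1}.
        nxt = -sum((-1) ** k * comb(order, k) * out[-k] for k in range(1, order + 1))
        out.append(nxt)
    return out[len(theta):]

theta = [0.0, -1.0, -3.0]           # made-up coefficients on the log scale
print(extrapolate(theta, 1, 2))     # order 1: constant level
print(extrapolate(theta, 2, 2))     # order 2: constant rate of improvement
print(extrapolate(theta, 3, 2))     # order 3: accelerating (quadratic) improvement
```

The same fitted history thus leads to three different projections, which is why the choice of the penalty order amounts to a view on future mortality.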
As can be seen from Fig. 5.1, the estimated βx ’s exhibit an irregular pat-
tern. This is undesirable from an actuarial point of view, since the resulting
projected life tables will also show some erratic variations across ages.
Bayesian formulations assume some sort of smoothness of age and period
effects in order to improve estimation and facilitate prediction. A Bayesian
treatment of mortality projections has been proposed by Czado et al. (2005).
Note that the estimated αx ’s are usually very smooth, since they represent
an average effect of mortality at age x (however, Renshaw and Haberman
(2003a) experiment with different choices for αx , representing different
averaging periods and hence different levels of smoothing, as well as explicit
graduation of the αx estimates). The estimated κt ’s are often rather irregular,
but the projected κt ’s, obtained from some time series model (as explained
below), will be smooth. Hence, we only need to smooth the βx ’s in order
to get projected life tables with mortality varying smoothly across the ages.
This can be achieved by penalized least-squares or maximum likelihood
methods.
The estimated Lee–Carter parameters are traditionally obtained by min-
imizing (5.5). This has produced estimated βx ’s and κt ’s with an irregular
shape in the majority of empirical studies. In order to smooth the estimated
βx ’s we can use the objective function
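$$O_{PLS}(\alpha, \beta, \kappa) = \sum_{x=x_1}^{x_m} \sum_{t=t_1}^{t_n} \bigl( \ln \hat{\mu}_x(t) - \alpha_x - \beta_x \kappa_t \bigr)^2 + \pi_\beta \sum_{x=x_1}^{x_m} \bigl( \beta_{x+2} - 2\beta_{x+1} + \beta_x \bigr)^2 \tag{5.33}$$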
$$L(\alpha, \beta, \kappa) - \frac{1}{2} \pi_\beta \sum_{x=x_1}^{x_m} \bigl( \beta_{x+2} - 2\beta_{x+1} + \beta_x \bigr)^2 \tag{5.34}$$
As above, the selection of the optimal value for the roughness penalty
coefficient πβ is based on cross validation.
Here, we adopt a very simple strategy in our case study: instead of fitting
the Lee–Carter model to the rough mortality surface, we first smooth it
using the methods described in Section 3.4.2 and then we fit the model to
the resulting surface.
Remark An alternative approach to smoothing the βx ’s has also been suggested.
It is more ad hoc in nature than the above, in that it introduces an extra
stage in the modelling process. Thus, Renshaw and Haberman (2003a,c)
smooth the Lee–Carter βx estimates using linear regression as well as cubic
B-splines and natural cubic splines and the methods of least-squares.
Many actuarial studies have based the projections of mortality on the statis-
tics relating to the years from 1950 to the present. The question then
becomes why the post-1950 period better represents expectations for the
future than does the post-1900 period, for example. There are several justi-
fications for the use of the second half of the 20th century. First, the pace of
mortality decline was more even across all ages over the 1950–2000 period
than over the 1900–2000 period. Second, the quality of mortality data, par-
ticularly at the older ages, for the 1900–1950 period is questionable. Third,
infectious diseases were an uncommon cause of death by 1950, while heart
disease and cancer were the two most common causes, as they are today.
This view seems to imply that the diseases affecting death rates from 1900
through 1950 are less applicable to expectations for the future than the
dominant causes of death from 1950 through 2000.
According to Lee and Carter (1992), the length of the mortality time
series was not critical as long as it was more than about 10–20 years.
However, Lee and Miller (2001) obtained better fits by restricting the
start of the calibration period to 1950 in order to reduce structural shifts.
Specifically, in their evaluation of the Lee–Carter method, Lee and Miller
(2001) have noted that for US data the forecast was biased when using
the fitting period 1900–1989 to forecast the period 1990–1997. The main
source of error was the mismatch between fitted rates for the last year
of the fitting period (1989 in their study) and actual rates in that year.
This is why a bias correction is applied. It was also noted that the βx
pattern did not remain stable over the whole 20th century. In order to
obtain more stable βx ’s, Lee and Miller (2001) have adopted 1950 as
the first year of the fitting period. Their conclusion is that restricting the
[Figure 5.5 plots omitted: panels Alpha and Beta against Age, Kappa against Time.]
Figure 5.5. Estimated αx , βx , and κt (from left to right), x = 0, 1, . . . , 104 (top panels) and x = 60, 61, . . . , 104 (bottom panels), t = 1920, 1921, . . . , 2005,
obtained with smoothed HMD death rates by minimizing the sum of squares (5.5) with the resulting κt ’s adjusted by refitting to the period life expectancies at
birth (corresponding values obtained without smoothing are displayed in broken line).
Booth et al. (2002) have designed procedures for the selection of an opti-
mal calibration period which identifies the longest period for which the
estimated mortality index parameter κt is linear. Specifically, these authors
seek to maximize the fit of the overall model by restricting the fitting period
in order to maximize the fit to the linearity assumption. The choice of the
fitting period is based on the ratio of the mean deviances of the fit of the
underlying Lee–Carter model to the overall linear fit. This ratio is computed
by varying the starting year (but holding the jump-off year fixed) and the
chosen fitting period is that for which the ratio is substantially smaller than
for periods starting in previous years.
More specifically, Booth et al. (2002) assume, a priori, that the trend
in the adjusted κ̂t’s is linear, based on the ‘universal pattern’ of mortality
decline that has been identified by several researchers, including Lee and
Carter (1992) and Tuljapurkar and Boe (2000). When the κ̂t’s depart from
linearity, this assumption may be better met by appropriately restricting the
fitting period. As noted above, the ending year is kept equal to tn and the
fitting period is then determined by the starting year (henceforth denoted
as tstart).
Restricting the fitting period to the longest recent period (tstart, tn)
for which the adjusted κ̂t’s do not deviate markedly from linearity
has several advantages. Since systematic changes in the trend in κ̂t are
avoided, the uncertainty in the forecast is reduced accordingly. Moreover,
the βx’s are likely to satisfy better the assumption of time invariance.
5.5 Selection of an optimal calibration period 217
Finally, the estimate of the drift parameter more clearly reflects the recent
experience.
An ad hoc procedure for selecting tstart has been suggested in Denuit and
Goderniaux (2005). Precisely, the calendar year tstart ≥ t1 is selected in such
a way that the series {κ̂t, t = tstart, tstart + 1, . . . , tn} is best approximated
by a straight line. To this end, the adjustment coefficient R² (which is the
classical goodness-of-fit criterion in linear regression) is maximized (as a
function of the number of observations included in the fit).
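The selection rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: R² is taken to be the ordinary coefficient of determination of a least-squares line, the names `r_squared`, `select_t_start`, and `min_length` are hypothetical, and the minimum window length is a safeguard that is not part of the text (without it, a two-point window would trivially give R² = 1). The series is a toy one, not the Belgian data.

```python
def r_squared(ts, ys):
    """Coefficient of determination of the least-squares line through (ts, ys)."""
    n = len(ts)
    t_bar = sum(ts) / n
    y_bar = sum(ys) / n
    sxx = sum((t - t_bar) ** 2 for t in ts)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(ts, ys))
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy ** 2 / (sxx * syy)


def select_t_start(years, kappa, min_length=10):
    """Return (t_start, R^2) maximizing R^2 over the admissible starting years.

    The ending year is held fixed at the last observation. min_length is a
    hypothetical safeguard (an assumption, not taken from the text).
    """
    best = None
    for i in range(len(years) - min_length + 1):
        r2 = r_squared(years[i:], kappa[i:])
        if best is None or r2 > best[1]:
            best = (years[i], r2)
    return best


# Toy time index: flat until 1970, then a linear decline (illustrative only).
years = list(range(1950, 2006))
kappa = [0.0 if t < 1970 else -1.5 * (t - 1970) for t in years]
t_start, r2 = select_t_start(years, kappa)
```

If an adjusted R² were intended instead, only `r_squared` would need to change; the search over starting years stays the same.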
Note that in Denuit and Goderniaux (2005), the κt ’s are replaced by
a linear function of t and a parametric regression model (using a linear
effect term for the continuous covariate calendar time with an interaction
with the categorical variable age, together with a term for the categor-
ical variable age) is then used. Even if this approach produces almost
the same projections as the Lee–Carter method, it underestimates the
uncertainty in mortality forecasts. The resulting confidence intervals are
then artificially narrow because of the imposition of the linear trend in
the κt ’s.
The situation is slightly different in the Cairns–Blake–Dowd model. As
the time-varying parameters are estimated separately for each calendar year,
they remain unaffected if we modify the range of calendar years of
interest. Considering Fig. 5.4, we clearly see that the slope and intercept
parameters become linear only in the last part of the observation period
(especially for ages 60 and over). Therefore, it is natural to extrapolate their
future path on the basis of recent experience only. The approach suggested
by Denuit and Goderniaux (2005) is easily extended to the Cairns-Blake-
Dowd setting, by selecting the starting year as the maximum of the starting
years for each time factor. The deviance approach proposed by Booth et al.
(2002) can also easily be adapted to the Cairns–Blake–Dowd model.
Note, however, that the selection of the optimal fitting period is subject
to criticisms, in the sense that it could lead to an underestimation of the
uncertainty in forecasts, and artificially favours the Lee–Carter specifica-
tion. The same comment applies in the Cairns–Blake–Dowd approach. We
do not share this view, and we believe that the selection of the optimal
fitting period is an essential part of the mortality forecast.
We first consider the Lee–Carter fit. Applying the method of Booth et al.
(2002) gives tstart = 1978. The ad hoc method suggested in Denuit and
Goderniaux (2005) roughly confirms this choice. Restricting the age range
to 60 and over yields tstart = 1974. Again, the ad-hoc method agrees with
this choice.
Whereas the common practice would consist of taking all of the available
data 1920–2005, we discard here observations for the years 1920–1977
when all of the ages are considered, and observations for the years 1920–
1973 when the analysis is restricted to ages 60 and over. Here, short-term
trends are preferred even if long-term forecasts are needed for annuity pric-
ing. The reason is that past long-term trends are not expected to be relevant
to the long-term future. Note that the fact that the optimal fitting period is
selected on the basis of goodness-of-fit criteria to the linear model results
in relatively small deviations from this short-term linear trend, but the
shorter fitting period results in a more rapid widening of the confidence
intervals.
The final estimates based on observations comprised in the optimal fitting
period are displayed in Fig. 5.6 which plots the estimated αx , βx , and κt .
We see that the estimated αx ’s and κt ’s obtained with and without prior
smoothing closely agree whereas the estimated βx ’s are smoothed in an
appropriate way. The model explains 67.70% of the total variance for males
on the basis of unsmoothed data, 90.57% of the total variance for males
on the basis of smoothed data for ages 0–104. The model explains 92.62%
of the total variance for males on the basis of unsmoothed data, 95.74% of
the total variance for males on the basis of smoothed data for ages 60 and
over.
For the Cairns–Blake–Dowd model, the optimal projection periods now
become 1969–2005 when all of the ages are included in the analysis and
1979–2005 when ages are restricted to the range 60–104. Note that the esti-
mated time indices are not influenced by the restriction of the time period,
so that those displayed in Fig. 5.4 remain valid.
Figure 5.6. Estimated αx, βx, and κt (from left to right; panels Alpha against Age, Beta against Age, and Kappa against Time), x = 0, 1, . . . , 104 (top panels) and x = 60, 61, . . . , 104 (bottom panels), obtained by minimizing the
sum of squares (5.5) over the optimal fitting period 1978–2005 for ages 0–104 and 1974–2005 for ages 60–104 with smoothed HMD death rates (corresponding
values obtained without smoothing are displayed in broken line).
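where $\hat\epsilon_x(t) = \ln \hat m_x(t) - (\hat\alpha_x + \hat\beta_x \hat\kappa_t)$. In the Cairns–Blake–Dowd case, these
residuals are given by
\[
r_{xt} = \frac{\hat\epsilon_x(t)}{\sqrt{\dfrac{1}{(x_m - x_1 - 1)(t_n - t_1 + 1)} \displaystyle\sum_{x=x_1}^{x_m} \sum_{t=t_1}^{t_n} \hat\epsilon_x(t)^2}} \tag{5.36}
\]
where $\hat\epsilon_x(t) = \ln\big(\hat q_x(t)/\hat p_x(t)\big) - \big(\hat\kappa_t^{[1]} + \hat\kappa_t^{[2]} x\big)$.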
If the residuals rxt exhibit some regular pattern, this means that the model
is not able to describe all of the phenomena appropriately. In practice,
looking at (x, t) → rxt , and discovering no structure in those graphs ensures
that the time trends have been correctly captured by the model.
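As an illustration of the standardization in (5.36), the following Python sketch rescales a matrix of raw residuals by the square root of their sum of squares divided by (xm − x1 − 1)(tn − t1 + 1). The function name and the toy residuals are hypothetical; the divisor is read off the formula as printed, with the number of ages equal to xm − x1 + 1 and the number of years equal to tn − t1 + 1.

```python
import math


def standardized_residuals(eps):
    """Standardize raw residuals eps[i][j] (age index i, year index j).

    The divisor (n_ages - 2) * n_years corresponds to
    (x_m - x_1 - 1)(t_n - t_1 + 1) in (5.36).
    """
    n_ages = len(eps)
    n_years = len(eps[0])
    divisor = (n_ages - 2) * n_years
    sum_sq = sum(e * e for row in eps for e in row)
    scale = math.sqrt(sum_sq / divisor)
    return [[e / scale for e in row] for row in eps]


# Toy raw residuals for 3 ages and 2 calendar years (illustrative only).
eps = [[0.1, -0.1], [-0.2, 0.2], [0.1, -0.1]]
r = standardized_residuals(eps)
```

Plotting `r` against age and calendar year, for instance as a heat map, is then a matter of choosing a plotting tool; any visible structure points to a trend that the model has missed.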
With a Poisson, Binomial, or Negative Binomial random component, it is
more appropriate to consider the deviance residuals in order to monitor the
quality of the fit. These residuals are defined as the signed square root of the
contribution of each observation to the deviance statistics. These residuals
should also be displayed as a function of time at different ages, or as a
function of both age and calendar year.
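For the Poisson case, the deviance residual has a simple closed form: the sign of (observed minus fitted deaths) times the square root of 2[d ln(d/d̂) − (d − d̂)], with the convention that d ln(d/d̂) = 0 when d = 0. The short Python sketch below computes it for a single cell; the function name is hypothetical, and the Binomial and Negative Binomial variants would need their own deviance contributions.

```python
import math


def poisson_deviance_residual(d_obs, d_fit):
    """Signed square root of one observation's contribution to the Poisson deviance."""
    if d_fit <= 0:
        raise ValueError("fitted death count must be positive")
    log_term = d_obs * math.log(d_obs / d_fit) if d_obs > 0 else 0.0
    contribution = 2.0 * (log_term - (d_obs - d_fit))
    sign = 1.0 if d_obs >= d_fit else -1.0
    # max(..., 0.0) guards against tiny negative values from rounding.
    return sign * math.sqrt(max(contribution, 0.0))


# Illustrative values only: observed versus fitted numbers of deaths.
residuals = [poisson_deviance_residual(d, f) for d, f in [(12, 10.0), (8, 10.0), (10, 10.0)]]
```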
We find that the residuals computed from the model fitted to ages 0–104
reveal systematic patterns and comparatively large values at young ages. In
the Lee–Carter case, the fit around the accident hump is very poor, with
large negative residuals for ages below 20. The residuals are positive for all
of the higher ages. The same phenomenon appears with the Cairns–Blake–
Dowd fit, with huge positive residuals around age 0. Overall, we find that
the inclusion of young ages significantly deteriorates the quality of the fit
at the higher ages. The presence of a trend in the residuals violates the
independence assumption and homoskedasticity does not hold as the graph
presents clustering. The large residuals before the accident hump suggest
that the Lee–Carter and Cairns–Blake–Dowd approaches are not able to
account for the particular mortality dynamics at younger ages. Since older
ages are the most relevant in pension and annuity applications, we restrict
the analysis to ages 60 and over.
5.7 Mortality projection 221
Figure 5.7. Residuals for Belgian males (plotted by Age against Time, with a shading scale for the residual values), Lee–Carter model, ages x = 60, 61, . . . , 104 (top panel),
and Cairns–Blake–Dowd model, ages x = 60, 61, . . . , 104 (bottom panel).
There are a few basic steps to fitting ARIMA models to time series data.
The main point is to identify the values of the autoregressive order p, the
order of differencing d, and the moving average order q. If the time index
is not stationary, then a first difference (i.e. d = 1) can help to remove the
time trend. If this proves unsuccessful then it is standard to take further
differences (i.e. investigate d = 2 and so on). Preliminary values of p and
q are chosen by inspecting the autocorrelation function and the partial
autocorrelation function of the κt ’s. More details can be found in standard
textbooks devoted to time series analysis.
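The identification steps above can be sketched with plain Python: difference the time index, compute the sample autocorrelations of the differenced series, and compare them with a rough white-noise band of ±1.96/√n. The series below is simulated and purely hypothetical, the helper names are illustrative, and the partial autocorrelation function is left out for brevity.

```python
import math
import random


def difference(series, d=1):
    """Apply d successive first differences to a list of numbers."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def sample_acf(series, max_lag):
    """Sample autocorrelations at lags 0, 1, ..., max_lag."""
    n = len(series)
    mean = sum(series) / n
    c0 = sum((v - mean) ** 2 for v in series)
    return [
        sum((series[t] - mean) * (series[t + k] - mean) for t in range(n - k)) / c0
        for k in range(max_lag + 1)
    ]


# Hypothetical time index: a simulated random walk with drift (not real data).
rng = random.Random(0)
kappa = [0.0]
for _ in range(40):
    kappa.append(kappa[-1] - 0.5 + rng.gauss(0.0, 1.0))

diffs = difference(kappa, d=1)          # d = 1 removes the linear time trend
acf = sample_acf(diffs, max_lag=10)
bound = 1.96 / math.sqrt(len(diffs))    # rough 95% band under white noise
flagged_lags = [k for k in range(1, 11) if abs(acf[k]) > bound]
```

An empty or nearly empty `flagged_lags` would be consistent with p = q = 0 after one difference; a dedicated time series package would normally be used for the formal tests.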
The appropriateness of the Lee–Carter approach has been questioned
by several authors. The rigid structure imposed by the model necessitates
the selection of an optimal fitting period (which is also conservative in
the context of life annuities, that is, it tends to overstate the expected
value of annuities). The Gaussian distributional assumption imposed on
the κt ’s means that large jumps are unlikely to occur. This feature can
be problematic for death benefits, where negative jumps correspond to
events which threaten the financial strength of the insurance company. For
instance, insurers currently are worrying about an avian influenza pan-
demic which could cause the death of many policyholders. On the basis of
vital registration data gathered during the 1918–1920 influenza pandemic,
extrapolations indicate that if the mortality were concentrated in a sin-
gle year, it would increase global mortality by 114%. However, neglecting
such jumps is conservative for life annuities. Positive jumps correspond-
ing to sudden improvements in mortality thanks to the availability of new
medical treatments are considered to be unlikely to occur, since it would
take some time for the population to benefit from these innovative treat-
ments. Hence, the assumptions behind the Lee–Carter model are compatible
with mortality projections for life annuity business, and we do not need to
acknowledge explicitly period shocks in the stochastic mortality model.
We note also that the optimal fitting period, that has been widely used, has
tended to start after the three pandemics of the 20th century (1918–1920,
1957–1958, and 1968–1970).
5.7.2.1 Stationarity
Time series analysis procedures require that the variables being studied be
stationary. We recall that a time series is (weakly) stationary if its mean
and variance are constant over time, and the covariance for any two time
periods (t and t + k, say) depends only on the length of the interval between
the two time periods (here k), not on the starting time (here t).
Figure 5.8. Autocorrelation function (on the left; ACF against Lag) and partial autocorrelation function (on the
right; Partial ACF against Lag) of the estimated κt’s obtained with completed data for the ages 60 and over.
5.7.2.2 Random walk with drift model for the time index
As no autocorrelation coefficient nor partial autocorrelation coefficient
of the differenced time index appears to be significantly different from
0, an ARIMA(0,1,0) process seems to be appropriate for the estimated
κt ’s. The Ljung–Box–Pierce test supports this model. Running a Shapiro–
Wilk test yields a p-value of 23.08%, which indicates that the residuals
seem to be approximately Normal. The corresponding Jarque-Bera p-value
equals 48.27%, which confirms that there is no significant departure from
Normality.
The previous analysis suggests that for Belgian mortality statistics, a ran-
dom walk with drift model is suitable for modelling the estimated κt ’s (as
is the case in many of the empirical studies in the literature). In this case,
κt = κt−1 + d + ξt (5.38)
where the ξt ’s are independent and Normally distributed with mean 0 and
variance σ 2 , and where d is known as the drift parameter. In this case,
\[
\kappa_{t_n+k} = \kappa_{t_n} + kd + \sum_{j=1}^{k} \xi_{t_n+j} \tag{5.39}
\]
Therefore, the conditional standard errors for the forecast increase with the
square root of the distance of the forecast horizon k.
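The square-root growth of the forecast error follows directly from (5.39): the sum of k independent errors has variance kσ². A minimal Python sketch, assuming the drift and σ are known (so parameter uncertainty is ignored) and using hypothetical values throughout:

```python
import math


def rw_forecast(kappa_last, drift, sigma, horizon, z=1.96):
    """Point forecasts and approximate 95% bands for a random walk with drift.

    Returns a list of tuples (k, mean, lower, upper) for k = 1, ..., horizon.
    The conditional standard error at horizon k is sigma * sqrt(k).
    """
    out = []
    for k in range(1, horizon + 1):
        mean = kappa_last + k * drift
        se = sigma * math.sqrt(k)
        out.append((k, mean, mean - z * se, mean + z * se))
    return out


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
forecast = rw_forecast(kappa_last=0.0, drift=-0.5, sigma=1.0, horizon=4)
```

In this toy case the band at horizon 4 is exactly twice as wide as the band at horizon 1, since √4 = 2.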
Using the random walk with drift model for forecasting κt is equivalent
to forecasting each age-specific death rate to decline at its own rate. Indeed,
it follows from (5.38) that the difference in expected log-mortality rates
between times t + 1 and t is
\[
\beta_x \big( \mathrm{E}[\kappa_{t+1}] - \mathrm{E}[\kappa_t] \big) = \beta_x d
\]
The ratio of death rates in two subsequent years of the forecast is equal
to exp(βx d) and is thus invariant over time. The product βx d is therefore
equal to the rate of mortality change over time at age x. In such a case, the
parameter βx can be interpreted as a normalized schedule of age-specific
rates of mortality change over time.
It is important to notice that the future mortality age profile produced by
the Lee–Carter model always becomes less smooth over time, as pointed
out by Girosi and King (2007). This explains why this approach has
been designed to forecast aggregate demographic indicators, such as life
expectancies (or actuarial indicators like annuity values), and not future
period or cohort life tables. This comes from the fact that the forecast of
the log-death rates is linear over time from (5.42): as the βx ’s vary with age,
the age profile of log-mortality will eventually become less smooth over
time, since the distance between log-mortality rates in adjacent age groups
can only increase. Each difference in βx is amplified as we forecast further
into the future. Sometimes, the forecast lines converge for a period, but after
converging they cross and the age profile pattern becomes inverted.
The dynamics (5.38) ensures that κt −κt−1 , t = t2 , t3 , . . . , tn , are indepen-
dent and Normally distributed, with mean d and variance σ 2 . The maximum
likelihood estimators of d and σ 2 are given by the sample mean and vari-
ance of the κt − κt−1 ’s, that is, the maximum likelihood estimators of the
model parameters are
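\[
\hat d = \frac{1}{t_n - t_1} \sum_{t=t_2}^{t_n} \big( \hat\kappa_t - \hat\kappa_{t-1} \big) = \frac{\hat\kappa_{t_n} - \hat\kappa_{t_1}}{t_n - t_1} \tag{5.43}
\]
and
\[
\hat\sigma^2 = \frac{1}{t_n - t_1} \sum_{t=t_2}^{t_n} \big( \hat\kappa_t - \hat\kappa_{t-1} - \hat d \big)^2 \tag{5.44}
\]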
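The two estimators (5.43) and (5.44) amount to the sample mean and the (biased, divisor tn − t1) sample variance of the first differences. A short Python sketch with a hypothetical function name and toy values:

```python
def fit_random_walk_with_drift(kappa):
    """Maximum likelihood estimates (d_hat, sigma2_hat) as in (5.43) and (5.44).

    kappa holds the estimated time index for years t_1, ..., t_n, so that
    len(kappa) - 1 equals t_n - t_1.
    """
    diffs = [b - a for a, b in zip(kappa, kappa[1:])]
    m = len(diffs)
    d_hat = sum(diffs) / m
    sigma2_hat = sum((x - d_hat) ** 2 for x in diffs) / m
    return d_hat, sigma2_hat


# Toy time index (illustrative only).
kappa = [0.0, -1.0, -3.0, -4.0, -6.0]
d_hat, sigma2_hat = fit_random_walk_with_drift(kappa)
```

As (5.43) shows, the drift estimate depends only on the first and last values of the series: here it equals (−6 − 0)/4 = −1.5.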
Remark Carter (1996) has developed a method in which the drift d in the
random walk forecasting equation for κt is itself allowed to be a random
variable. This is done using state-space methods for modelling time series.
Nevertheless, it is noteworthy that the forecast and probability intervals
remain virtually unchanged compared to the simple random walk with drift
model.
The analysis of each time index in isolation parallels the analysis performed
for the Lee–Carter time index. These preliminary results have now to be sup-
plemented with a bivariate analysis of the time series κ t = (κt[1] , κt[2] )T that
goes beyond the scope of this book. When fitted to data, the changes over
time in κ t have often been approximately linear, at least in the recent past.
This suggests that the dynamics of the time factor κ t could be appropriately
described by a bivariate random walk with drift of the form
\[
\begin{aligned}
\kappa_t^{[1]} &= \kappa_{t-1}^{[1]} + d_1 + \xi_t^{[1]} \\
\kappa_t^{[2]} &= \kappa_{t-1}^{[2]} + d_2 + \xi_t^{[2]}
\end{aligned}
\tag{5.45}
\]
where d1 and d2 are the drift parameters, and ξ t = (ξt[1] , ξt[2] )T are inde-
pendent bivariate Normally distributed random pairs, with zero mean and
variance-covariance matrix
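\[
\Sigma = \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{pmatrix} \tag{5.46}
\]
\[
\hat d_i = \frac{\hat\kappa_{t_n}^{[i]} - \hat\kappa_{t_1}^{[i]}}{t_n - t_1}, \quad i = 1, 2 \tag{5.47}
\]
This gives $\hat d_1 = -0.0757558$, $\hat d_2 = 0.0007619443$, $\hat\sigma_1^2 = 0.01563272$,
$\hat\sigma_2^2 = 3.3048 \times 10^{-6}$, and $\hat\sigma_{12} = -0.0002247978$ for Belgian males for the
period 1979–2005.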
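To illustrate how the bivariate random walk with drift can be simulated, the Python sketch below uses the estimates quoted above and a hand-written 2×2 Cholesky factor to generate correlated Normal errors. The starting values of the two time indices are hypothetical placeholders (they are not given here), and the function names are illustrative.

```python
import math
import random

# Estimates quoted in the text for Belgian males, 1979-2005.
D1, D2 = -0.0757558, 0.0007619443
S11, S22, S12 = 0.01563272, 3.3048e-6, -0.0002247978


def cholesky_2x2(s11, s22, s12):
    """Lower-triangular factor L with L L^T equal to the covariance matrix."""
    l11 = math.sqrt(s11)
    l21 = s12 / l11
    l22 = math.sqrt(s22 - l21 * l21)
    return l11, l21, l22


def simulate_path(k1_last, k2_last, horizon, rng):
    """One simulated future path of (kappa1, kappa2) under (5.45)."""
    l11, l21, l22 = cholesky_2x2(S11, S22, S12)
    k1, k2 = k1_last, k2_last
    path = []
    for _ in range(horizon):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        k1 += D1 + l11 * z1
        k2 += D2 + l21 * z1 + l22 * z2
        path.append((k1, k2))
    return path


# Starting values below are hypothetical placeholders, not fitted values.
rng = random.Random(2009)
path = simulate_path(k1_last=-10.0, k2_last=0.10, horizon=20, rng=rng)
```

The implied correlation between the two error terms is strongly negative (about −0.99 with these figures), so the simulated increments of the two indices tend to move in opposite directions.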
While a bivariate random walk with drift model has been used in connection
with the Cairns–Blake–Dowd approach to mortality forecasting, mean
reverting alternatives might have a stronger biological justification. Andrew
Cairns pointed out in a personal communication that negative autocorrelation
coefficients between the $\hat\kappa_t^{[2]} - \hat\kappa_{t-1}^{[2]}$’s indicate that at higher ages good
years and bad years alternate. This can be explained as follows: if a flu epidemic
kills a lot of the unhealthy older people, it leaves the healthier older
people alive, and mortality is then low in the following year.
The projections made so far, while interesting, reveal nothing about the
uncertainty attached to the future mortality. In forecasting, it is important
to provide information on the error affecting the forecasted quantities. In
the traditional demographic approach to mortality forecasting, a range of
uncertainty is indicated by high and low scenarios, around a medium fore-
cast that is intended to be a best estimate. However, it is not clear how to
interpret this high-low range unless a corresponding probability distribution
is specified.
In this respect, prediction intervals are particularly useful. This section
explains how to get such margins on demographic indicators in the Lee–
Carter setting. The ideas are easily extended to the Cairns–Blake–Dowd
setting.
In the current application, it is impossible to derive the relevant prediction
intervals analytically. The reason for this is that two very different sources
of uncertainty have to be combined: sampling errors in the parameters αx ,
βx , and κt , and forecast errors in the projected κt ’s. An additional compli-
cation is that the measures of interest – life expectancies or life annuities
premiums and reserves – are complicated non-linear functions of the param-
eters αx , βx , and κt and of the ARIMA parameters. The key idea behind the
bootstrap is to resample from the original data (either directly or via a fitted
model) in order to create replicate data sets, from which the variability of the
quantities of interest can be assessed. Because this approach involves repeat-
ing the original data analysis procedure with many replicate sets of data, it
is sometimes called a computer-intensive method. Bootstrap techniques are
particularly useful when, as in our problem, theoretical calculation with the
fitted model is too complex.
If we ignore the other sources of errors, then the confidence bounds
on future κt ’s can be used to calculate prediction intervals for demo-
graphic indicators. Even if for long-run forecasts (over 25 years), the
error in forecasting the mortality index clearly dominates the errors in
fitting the mortality matrix, prediction intervals based on κt alone seri-
ously understate the errors in forecasting over shorter horizons. We know
from Lee and Carter (1992), Appendix B, that prediction intervals based
on κt alone are a reasonable approximation only for forecast horizons
greater than 10–25 years. If there is a particular interest in forecasting over
the shorter term, then we cannot make a precise analysis of the forecast
errors.
Because of the importance of appropriate measures of uncertainty in an
actuarial context, the next sections aim to derive prediction intervals taking
into account all of the sources of errors. To fix the ideas, we will con-
sider a cohort life expectancy ex (t) as defined in Section 4.4.1 or in (5.57)
below, but the approach is easily adapted to other demographic or actuarial
indicators.
We then estimate the time series model using the κtb as data points. This
yields a new set of estimated ARIMA parameters. We can then generate a
projection κtb , t ≥ tn + 1 using these ARIMA parameters. The future errors
ξtb are sampled from a univariate Normal distribution with a mean of 0 and
a standard deviation of σεb . The κt ’s are thus projected on the basis of
the reestimated ARIMA model. We do not select a new ARIMA
model but keep the ARIMA model selected on the basis of the original
data. Nevertheless, the parameters of these models are reestimated with the
bootstrapped data.
The first step is meant to take into account the uncertainty in the param-
eters αx ’s, βx ’s, and κt ’s. The second step deals with the fact that the
uncertainty in the ARIMA parameters depends on the uncertainty in the
αx ’s, βx ’s and κt ’s parameters. The third step ensures that the uncertainty
of the forecasted κt ’s not only depends on the ARIMA standard error, but
also on the uncertainty of the ARIMA parameters themselves. Finally, in the
computation of the relevant measures in step four, all sources of uncertainty
are taken into account.
This yields B realizations $\alpha_x^b$, $\beta_x^b$, $\kappa_t^b$ and projected $\kappa_t^b$ on the basis of which
we can compute the measure of interest $e_x(t)$. Assume that B bootstrap
estimates $e_x^b(t)$, b = 1, 2, . . . , B, have been computed. The (1 − 2α) percentile
interval for $e_x(t)$ is given by $\big(e_x^{b(\alpha)}(t), e_x^{b(1-\alpha)}(t)\big)$, where $e_x^{b(\zeta)}(t)$ is the 100 ×
ζth empirical percentile of the bootstrapped values for $e_x(t)$, which is equal
to the (B × ζ)th value in the ordered list of replications $e_x^b(t)$, b = 1, 2, . . . , B.
For instance, in the case of B = 1,000 bootstrap samples, the 95th and the
5th empirical percentiles are, respectively, the 950th and 50th numbers
in the increasing ordered list of 1,000 replications of $e_x(t)$.
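The percentile interval can be read off an ordered list of replicates exactly as described. A minimal Python sketch follows; the function name is hypothetical, and rounding B × ζ to the nearest integer is an assumption made here for the cases where the product is not a whole number.

```python
def percentile_interval(replicates, alpha):
    """(1 - 2*alpha) percentile interval from bootstrap replicates.

    The lower bound is the (B*alpha)th and the upper bound the
    (B*(1 - alpha))th value in the increasing ordered list (1-based ranks).
    """
    ordered = sorted(replicates)
    b = len(ordered)
    lower_rank = max(1, round(b * alpha))
    upper_rank = min(b, round(b * (1 - alpha)))
    return ordered[lower_rank - 1], ordered[upper_rank - 1]


# Stand-in replicates 1, 2, ..., 1000 (not real bootstrap output), so that the
# value returned equals its rank in the ordered list.
replicates = [float(i) for i in range(1000, 0, -1)]
lower, upper = percentile_interval(replicates, alpha=0.05)
```

With these stand-in values the 90% interval is (50.0, 950.0), that is, the 50th and 950th ordered replicates.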
Note that these bootstrap procedures account for parameter uncertainty
as well as Arrowian uncertainty (also known as risk, in which the set
of future outcomes is known and probabilities can be assigned to each
of the possible outcomes based on a known model with known parame-
ters generating the distribution of future outcomes). Knightian uncertainty,
by comparison, acknowledges the presence of both model uncertainty and
parameter uncertainty. Allowing for model uncertainty would require the
consideration of several mortality projection models and the assignment to
these of probabilities in line with their relative likelihoods.
Remark Empirical studies conducted in Renshaw and Haberman (2008)
reveal varying magnitudes of the Monte Carlo based confidence and pre-
diction intervals under different sets of identifiability constraints. Such
diverse results are attributed by these authors to the overparametrization
present in the model rather than to the non-linearity of the parametric
structure.
\[
\dot\mu_x(t_n + s) = \exp\big( \hat\alpha_x + \hat\beta_x \dot\kappa_{t_n+s} \big) \tag{5.50}
\]
In this case, the jump-off rates (i.e. the rates in the last year of the fitting
period or jump-off year) are fitted rates. The basic Lee–Carter method has
been criticized by Bell (1997) for the fact that a discontinuity is possible
between the observed mortality rates and life expectancies for the jump-
off year and the forecast values for the first year of the forecast period.
The bias arising from this discontinuity would then persist throughout the
forecast.
As suggested by Bell (1997), Lee and Miller (2001), Lee (2000), Renshaw
and Haberman (2003a), Renshaw and Haberman (2003c), the forecast
could be started with observed death rates rather than with fitted ones.
This would help to eliminate a jump between the observed and forecasted
death rates in the first year of the forecast as the model does not fit age-
specific death rates exactly in the last year. If the fitting period is sufficiently
long, then the difference between the observed and the fitted death rates
can be appreciable. Specifically, the forecast mortality rates are aligned to
the latest available empirical mortality rates as
\[
\dot\mu_x(t_n + s) = \hat m_x(t_n) \exp\big( \hat\beta_x ( \dot\kappa_{t_n+s} - \hat\kappa_{t_n} ) \big) = \hat m_x(t_n) \, RF(x, t_n + s) \tag{5.51}
\]
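The difference between the two jump-off choices can be made concrete with a small Python sketch. It contrasts a forecast started from fitted rates, exp(α̂x + β̂x κ̇), with one started from the observed rate in the last year and multiplied by the reduction factor exp(β̂x(κ̇ − κ̂tn)). All function names and numerical values are hypothetical.

```python
import math


def forecast_from_fitted(alpha_x, beta_x, kappa_future):
    """Forecast death rate when the jump-off rates are the fitted ones."""
    return math.exp(alpha_x + beta_x * kappa_future)


def forecast_from_observed(m_obs_last, beta_x, kappa_future, kappa_last):
    """Forecast death rate aligned to the observed rate in the jump-off year."""
    reduction_factor = math.exp(beta_x * (kappa_future - kappa_last))
    return m_obs_last * reduction_factor


# Hypothetical parameters for one age (illustrative only).
alpha_x, beta_x, kappa_last = -4.0, 0.02, -10.0
m_obs_last = 0.016                      # observed rate in the jump-off year
kappa_path = [kappa_last - 0.5 * s for s in range(0, 4)]   # s = 0, 1, 2, 3

fitted = [forecast_from_fitted(alpha_x, beta_x, k) for k in kappa_path]
aligned = [forecast_from_observed(m_obs_last, beta_x, k, kappa_last) for k in kappa_path]
ratios = [a / f for a, f in zip(aligned, fitted)]
```

At s = 0 the aligned forecast reproduces the observed rate, and the ratio between the two forecasts stays constant over the horizon, which is why a jump-off mismatch persists throughout the forecast when fitted rates are used.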
Figure 5.9. Histograms (Frequency on the vertical axis) for the 10,000 bootstrapped values of the cohort life expectancies at age
65 in year 2006 for the general population, males: residuals bootstrap in the top panel, Poisson
bootstrap in the bottom panel.
The FPB was asked in 2003 by the Pension Ministry to produce (in collab-
oration with Statistics Belgium) projected life tables to be used to convert
pension benefits into life annuities in the second pillar. A working party was
set up by the FPB with representatives from Statistics Belgium, BFIC, the
Royal Society of Belgian Actuaries and UCL. The results are summarized
in the Working Paper 20-04 available from http://www.plan.be.
The FPB model specifies qx (t) = exp(αx + βx t) where αx = ln qx (0) and
βx is the rate of decrease of qx (t) over time. Thus, each age-specific death
probability is assumed to decline at its own exponential rate. The αx ’s and
βx ’s are first estimated by the least-squares method, that is, minimizing the
objective function
\[
\sum_{x=x_1}^{x_m} \sum_{t=t_1}^{t_n} \big( \ln \hat q_x(t) - \alpha_x - \beta_x t \big)^2 \tag{5.52}
\]
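Because the objective function (5.52) is a sum over ages of separate sums of squares, it can be minimized age by age with an ordinary simple linear regression of ln q̂x(t) on t. The Python sketch below does this for one age; the function name and the synthetic death probabilities are hypothetical, and time is measured from an origin t = 0 so that αx = ln qx(0).

```python
import math


def fit_log_linear(times, q_values):
    """Least-squares estimates (alpha_x, beta_x) in ln q_x(t) = alpha_x + beta_x * t."""
    y = [math.log(q) for q in q_values]
    n = len(times)
    t_bar = sum(times) / n
    y_bar = sum(y) / n
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (v - y_bar) for t, v in zip(times, y))
    beta = sxy / sxx
    alpha = y_bar - beta * t_bar
    return alpha, beta


# Synthetic death probabilities for one age, declining by 2% a year (not real data).
times = list(range(10))
q_values = [math.exp(-4.0 - 0.02 * t) for t in times]
alpha_x, beta_x = fit_log_linear(times, q_values)
projected_q = math.exp(alpha_x + beta_x * 20)   # extrapolation to t = 20
```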
The method used by Andreev and Vaupel (2006) is based on Oeppen and
Vaupel (2002). Plotting the highest period female life expectancy attained
for each calendar year from 1840 to 2000, Oeppen and Vaupel (2002) have
noticed that the points fall close to a straight line, starting at about 45 years
in Sweden and ending at about 85 years in Japan. They find that record
female life expectancy has increased linearly by 2.43 years per decade from
1840 to 2000 (with an adjustment coefficient R² = 99.2%). The record
male life expectancy has increased linearly from 1840 to 2000 at a rate of
2.22 years per decade (with R² = 98%). Moreover, there is no indication
of either an acceleration or deceleration in the rates of change. If the trend
continues, they predict that female record life expectancy will be 97.5 by
mid-century and 109 years by 2100. Life expectancy can be forecast for a
given country by considering the gap between national performance and
the best-practice level. See also Lee (2003).
Andreev and Vaupel (2006) combine the approach due to Oeppen and
Vaupel (2002) with the Lee–Carter model to gain stability over the long
run. More precisely, they assume that the linear trend in the best practice
female life expectancy continues into the future and also that the differ-
ence between the life expectancy of a particular country and the general
trend stays constant over time. Then, the life expectancy at birth can be
forecast as
\[
e_0^{\uparrow}(t) = e_0^{\uparrow}(t_n) + s(t - t_n) \tag{5.53}
\]
where s is the pace of increase in the best practice life expectancy over time
that has been estimated by Oeppen and Vaupel (2002) and $e_0^{\uparrow}(t)$ is the life
expectancy at birth in the particular country. Andreev and Vaupel (2006)
do not use separate values of s for males and females but the female value
of 0.243 for both genders.
Andreev and Vaupel (2006) consider ages 50 and over so that they need to
deduce the value of $e_{50}^{\uparrow}(t)$ from $e_0^{\uparrow}(t)$. To do so, they start with a forecast of
death rates by the linear decline model (according to which each age-specific
death rate is forecasted to decline at its own independent rate) along the
lines of
\[
\dot\mu_x(t_n + s) = \hat\mu_x(t_n) \exp(-g_x s) \tag{5.54}
\]
where $g_x$ is the annual rate of decline for the mortality rate at age x.
Then, the forecasted death rates are multiplied by a constant factor so that
the life expectancy at birth matches the $e_0^{\uparrow}(t)$ values coming from (5.53).
The value of $e_{50}^{\uparrow}(t)$ is then obtained from these adjusted death rates.
Given the estimated value of $e_{50}^{\uparrow}(t)$, we need to calculate the set of mortality
rates at ages over 50 that correspond to this value. Andreev and Vaupel
(2006) use the Kannisto model
\[
\mu_x(t) = \frac{a_t \exp(b_t x)}{1 + a_t \exp(b_t x)} \tag{5.55}
\]
which is fitted to data for ages 50 and over by the method of Poisson
maximum likelihood. The $a_t$’s are then projected into the future from
the linear model
\[
\ln a_t = \beta_0 + \beta_1 t \tag{5.56}
\]
Then, for each t ≥ tn + 1, the parameter $b_t$ is determined to match $e_{50}^{\uparrow}(t)$
given the value of $a_t$ obtained from (5.56).
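The last step, determining b for a given a so that the Kannisto curve reproduces a target life expectancy at age 50, can be sketched as a one-dimensional root search. The Python code below is only an illustration under simplifying assumptions that are not taken from the text: the life expectancy is computed on a single Kannisto curve with the force of mortality held constant within each year of age and truncated at a hypothetical maximum age of 120, and bisection is used because the life expectancy decreases as b increases. All names and numerical values are hypothetical.

```python
import math


def kannisto(a, b, x):
    """Force of mortality at age x under the Kannisto model (5.55)."""
    z = a * math.exp(b * x)
    return z / (1.0 + z)


def life_expectancy(a, b, x0=50, x_max=120):
    """Approximate life expectancy at age x0 on one Kannisto curve.

    Assumes a constant force of mortality within each year of age and
    ignores survival beyond x_max (both are simplifying assumptions).
    """
    survival, total = 1.0, 0.0
    for x in range(x0, x_max):
        mu = kannisto(a, b, x + 0.5)
        p = math.exp(-mu)
        total += survival * (1.0 - p) / mu   # expected time lived in [x, x + 1)
        survival *= p
    return total


def solve_b(a, target, lo=0.0, hi=0.3, tol=1e-10):
    """Find b such that life_expectancy(a, b) matches target, by bisection."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if life_expectancy(a, mid) > target:
            lo = mid          # mortality too light: increase b
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Round-trip check with hypothetical parameters a = 2e-5 and b = 0.1.
a_t = 2e-5
target_e50 = life_expectancy(a_t, 0.1)
b_t = solve_b(a_t, target_e50)
```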
5.9 Forecasting life expectancies 237
This method may produce a jump in death rates. To avoid this drawback,
the death rates can be blended with the death rates produced by the Lee–
Carter method over a short period of time. Specifically, the Lee–Carter
model is fitted to data for ages 50 and over, and the estimated κt’s are
adjusted by refitting to the $e_{50}^{\uparrow}(t)$’s. The bias correction ensures that the
forecasted death rates closely agree with the latest available death rates in
the first years of the forecast. The weight assigned to the Lee–Carter death
rates is 1 − k/(n + 1) for years tn + k, k = 1, 2, . . . , n, where n is the length
of the blending period. The value of n ranges from 10 for countries where
the model (5.55) provides a good fit to 40 where this is not the case.
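The blending weights can be tabulated with a few lines of Python. Two assumptions are made here and are not confirmed by the text: the weight is read as 1 − k/(n + 1), and the blend is taken to be a plain weighted average of the two death rates (a blend on the log scale would be an equally plausible reading). The function names and the rates are hypothetical.

```python
def lee_carter_weight(k, n):
    """Weight given to the Lee-Carter rate in year t_n + k, k = 1, ..., n."""
    return 1.0 - k / (n + 1)


def blend(rate_lee_carter, rate_kannisto, k, n):
    """Weighted average of the two death rates (assumed arithmetic blend)."""
    w = lee_carter_weight(k, n)
    return w * rate_lee_carter + (1.0 - w) * rate_kannisto


# Weights over a blending period of n = 10 years.
n = 10
weights = [lee_carter_weight(k, n) for k in range(1, n + 1)]

# Hypothetical death rates at one age from the two models.
blended = [blend(0.020, 0.018, k, n) for k in range(1, n + 1)]
```

The weights fall linearly from 10/11 in the first forecast year to 1/11 in the last year of the blending period, so the blended rate moves gradually from the Lee–Carter value towards the Kannisto-based value.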
Figure 5.10. Forecast of cohort life expectancies at age 65, e65(t), for the general population (circle)
with 90% confidence intervals (gray-shaded area), together with values obtained from the Cairns–
Blake–Dowd model (triangle).
close agreement, with slightly larger values coming from the Lee–Carter
approach.
Figure 5.11 displays the cohort life expectancies at age 65 resulting from
the Lee–Carter forecast for the general population, together with the official
FPB values and the corresponding values obtained by Andreev and Vaupel
(2006). The small differences (of <6 months) between the FPB forecasts
and the projections obtained in this chapter remain stable over time. The
official FPB forecasts lie inside the 90% confidence interval for the cohort
life expectancy at age 65. Hence, the FPB forecast is as plausible as the
Lee–Carter projection performed in this chapter. These two projections,
however, significantly differ from the results derived in Andreev and Vaupel
(2006), which are either implausibly small or become rapidly significantly
larger than the present forecasts.
Considering the values obtained by Andreev and Vaupel (2006) using the
Lee–Carter methodology, the differences relative to the forecast obtained
in the present study can be explained as follows. First, Andreev and Vau-
pel (2006) use age groups 50–54, 55–59,…, 100+ and not single years of
ages. Next, the optimal fitting period is not determined by Andreev and
Vaupel (2006) who routinely used 1950–2000. Finally, the forecast starts
with death rates observed in the last year with available data (i.e. 2000 in
their case). We see that the projections obtained in this chapter from the
Lee–Carter model after the optimal fitting period has been selected exceed
those produced by Andreev and Vaupel (2006) by the same methodology
Figure 5.11. Forecast of cohort life expectancies at age 65, e65(t) against t (2010–2025), for the general population (circle)
with 90% confidence intervals (gray-shaded area), together with official FPB values (triangle),
with values obtained by Andreev and Vaupel (2006) using the Lee–Carter methodology (square),
and with values obtained by Andreev and Vaupel (2006) using the Oeppen and Vaupel (2002)
modified methodology (diamond).
from calendar years 1950–2000. The selection of the optimal fitting period
may thus have a dramatic effect on the forecast, and is in line with the
conservative actuarial approach.
Andreev and Vaupel (2006) apply the same rate of increase of 0.243 years
per calendar year for both genders in order to forecast future life expectancies using
the Oeppen–Vaupel line of record life expectancies. The life expectancy at age 50 is
deduced from the projected life expectancy at birth using a forecast of death
rates by the linear decline model (i.e. letting each age-specific death rate
decline at its own independent rate by fitting a random walk with drift model
separately to the log of death rates in each age group). Finally, the Lee–
Carter projection is combined with a Kannisto model to produce projected
life tables. We see from Fig. 5.11 that this method yields a much higher life
expectancy at age 65 than the other approaches. Moreover, the speed of
improvement exceeds the other forecasts.
It is interesting to note that all of the mortality forecasting models con-
sidered in the present chapter (Lee–Carter with optimal fitting period,
Cairns–Blake–Dowd, FPB, and Oeppen-Vaupel) agree about the forecasts
of e65 (t) in the next few years. Significant differences compared with the
Oeppen–Vaupel approach emerge from 2013, this approach suggesting sig-
nificantly higher values for the life expectancy at retirement age than its
competitors.
Following Blake and Dowd (2007), we produce longevity fan charts for
e65 (2006 + t), t = 0, 1, . . . , 20 based on residuals bootstrap (with B =
10,000). The result is shown in Fig. 5.12. Such charts depict some cen-
tral projection of the forecasted variable, together with bounds around this
showing the probabilities that the variable will lie within specified ranges.
The chart in Fig. 5.12 shows the central 10% prediction interval with the
heaviest shading surrounded by the 20%, 30%, . . ., 90% prediction inter-
vals with progressively lighter shading. The shading becomes stronger as
the prediction interval narrows. We can therefore interpret the degree of
shading as reflecting the likelihood of the outcome: the darker is the shad-
ing, the more likely is the outcome. The fan in Fig. 5.12 consists of 9 grey
bands of varying intensity. The upper and lower boundaries correspond to
paths of the forecast 95% and 5% quantiles, and the inner edges of the
bands in the fan correspond to the 10%, 15%, . . ., 90% quantiles. The
darkest band in the middle is bounded by the 45% and 55% quantiles.
Note that the quantiles are calculated for each year in isolation. The fan
chart in Fig. 5.12 shows that longevity risk is rather low. The question as to
whether these narrow confidence bounds are realistic remains an open one.
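The construction of the fan described above can be sketched in a few lines. This is a minimal illustration only: the simulated paths below are hypothetical stand-ins for the bootstrap output (they are not the data behind Fig. 5.12), and the function names are assumptions introduced here.

```python
import numpy as np

def fan_chart_quantiles(paths, levels=None):
    """Pointwise quantiles of simulated paths (rows = replications, columns = years)."""
    if levels is None:
        levels = np.arange(5, 100, 5) / 100.0  # 5%, 10%, ..., 95%
    q = np.quantile(np.asarray(paths, dtype=float), levels, axis=0)
    return levels, q

def nested_bands(levels, q):
    """Pair symmetric quantiles into bands, widest (90%) first, narrowest (10%) last."""
    n = len(levels)
    return [(round(float(levels[n - 1 - i] - levels[i]), 2), q[i], q[n - 1 - i])
            for i in range(n // 2)]

# Hypothetical simulated paths standing in for bootstrap output (not real data).
rng = np.random.default_rng(0)
B, T = 10_000, 21
paths = 18.0 + 0.1 * np.arange(T) + 0.05 * rng.standard_normal((B, T)).cumsum(axis=1)

levels, q = fan_chart_quantiles(paths)
bands = nested_bands(levels, q)
print(len(bands), bands[0][0], bands[-1][0])
```

Each quantile is computed year by year, in line with the remark that the quantiles are calculated for each year in isolation; a band therefore does not describe the behaviour of any single simulated path.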
Let us now forecast the period life expectancies for calendar years
1981–2005, 1991–2005, and 2001–2005 on the basis of the observations
[Figure panels omitted: plots with vertical axis e65(t).]
Figure 5.13. Observed period life expectancies at age 65 for the general population (circles),
together with forecast based on 1950–1980, 1950–1990, and 1950–2000 periods (triangles)
surrounded by 90% prediction intervals.
gives a point forecast far below the actual life expectancies observed during
1981–2005. Moreover, the prediction intervals are wider compared to the
1950–1990 and 1950–2000 periods. The Lee–Carter model would thus
clearly have underestimated the actual gains in longevity after 1980 on the
basis of the 1950–1980 observation period. The forecast becomes better
when the 1950–1990 and 1950–2000 periods are considered. The Lee–
Carter model captures the trends in the observed period life expectancies,
which remain in the prediction intervals.
6 Forecasting mortality: applications and examples of age-period-cohort models
6.1 Introduction
In this chapter, we consider the proposal that the models introduced in
Chapter 5 should be extended to include components that represent a cohort
effect, as well as how this proposal has been implemented. We illustrate
this implementation with a case study based on the UK experience. The
justification for this proposal comes initially from some descriptive studies
of mortality trends in the United Kingdom which demonstrate that there
is a strong birth cohort effect present. Thus, the Government Actuary’s
Department, which was responsible at the time for the official UK popula-
tion projections, has highlighted, in a series of reports (GAD 1995, 2001,
and 2002), the existence of cohorts in the United Kingdom who have expe-
rienced rapid mortality improvement, relative to those born previously or
more recently. The generations (of both sexes) born approximately between
1925 and 1945 (and centered on the generation with year of birth 1931)
seem to have experienced this more rapid mortality improvement.
Further evidence has come also from the Continuous Mortality Investiga-
tion Bureau (CMIB) in the United Kingdom. In an analysis of the mortality
experience of males with life insurance over an extended period, CMIB
(2002) notes the existence of a similar effect, although this seems to be
centered on a slightly earlier cohort, that is, that born in 1926. A similar
cohort effect is also noted in an investigation into the mortality rates of
male pensioners who are members of insured pension schemes – again the
highest rates of mortality decrease are noted for the 1926 cohort.
The reasons for this so-called cohort effect are not precisely understood.
A number of explanatory factors have been suggested in the literature – a
helpful review is provided by Willets (2004). Among the most plausible
factors are the following. First, the diet in the United Kingdom during the
1940s and early 1950s may have had a beneficial effect on the health of
children growing up during this time. Although this was a time of food
rationing, there is evidence that the average consumption of fresh vegeta-
bles, bread, milk, and fish was higher during those years than in a recent
period like the early 1990s – and, at the same time, average consump-
tion of cheese and meat was lower. Second, the introduction of a universal
social security system in 1948 (following the Beveridge Report of 1942),
the introduction of free secondary school education for all in 1944 and the
establishment of the National Health Service in 1948 meant that the social
conditions for children growing up in the early 1950s would have been
very different from those experienced by earlier generations. Third, there are
strong cohort-related patterns in mortality from diseases that are linked
to smoking, for example, lung cancer and heart disease (ONS 1997). It is
clear that in the United Kingdom (and elsewhere) different generations have
had different smoking histories. Those born around 1920 may have started
smoking during the 1930s, they may have been given free cigarettes during
World War II and would have been smoking for some considerable time
when the deleterious health impact of smoking was first identified in the late
1950s and early 1960s. There is a marked contrast with those born some
20 years later who would have reached adulthood just when these research
findings were being widely discussed.
Given the close association of lung cancer with smoking, Willets (2004)
also examines the patterns of cause-specific mortality rates from lung cancer
in the United Kingdom. He argues that, for males, lung cancer death rates
plotted for different cohorts indicate an upward trend for those born from
1870 onwards with the peak rates for those born between 1900 and 1905
and the greatest average annual improvement for those born in the period
1930–1935.
These findings are supported by published analyses; for example,
Evandrou and Falkingham (2002) have studied smoking prevalence rates.
They estimate that approximately 95% of men born in 1916–1920 had
smoked at some point by the time they reached age 60 – while, for the
cohort born in 1931–1935, the corresponding figure was 25%. Finally,
varying birth cohort sizes may confer benefits in that those born at a time
of low birth rates may acquire social and economic advantages relative to
those born at times of higher birth rates. In this regard, we note that the
period from 1925 to 1945 was a period of falling birth rates sandwiched
between the two post-war ‘baby booms’.
Other hypotheses have been suggested. For example, Catalano and
Bruckner (2006) have tested the ‘diminished entelechy hypothesis’ – this
with an extra pair of bilinear terms $\beta_x^{(0)}\iota_{t-x}$ introduced in order to represent
the cohort effects. We can see that (6.6) is then a natural extension of
equation (4.98) which was introduced in Chapter 4.
It is clear that the structure represented by equations (6.6) and (6.7) gives
rise to a rich sub-structure of models:
Lee–Carter age-period (LC) model: $\beta_x^{(0)} = 0$
Age–Cohort (AC) model: $\beta_x^{(1)} = 0$
plus versions in which either or both of the $\beta_x^{(j)} = 1$, for $j = 0, 1$; these arise where the
application of age adjustments to one or both of the main period-effect
and cohort-effect terms is not found to be significant.
In formulating these structures, in each of the above cases, we may
partition the force of mortality as follows:
$$\mu_x(t) = \exp(\alpha_x)\,RF(x,t) \qquad (6.8)$$
that is, into the product of a static term, representing the age profile
of mortality and incorporating the main age effects $\alpha_x$, and a dynamic
parameterized mortality reduction factor $RF(x,t)$, which contains both the
period ($\kappa_t$) and cohort ($\iota_{t-x}$) effects, each scaled by an age-specific coefficient.
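The partition (6.8) and the nested sub-structures can be illustrated numerically. The sketch below assumes the reduction factor takes the form $RF(x,t) = \exp(\beta_x^{(0)}\iota_{t-x} + \beta_x^{(1)}\kappa_t)$, which is the form implied by (6.27); all parameter values are hypothetical and the function name is introduced here for illustration only.

```python
import numpy as np

def force_of_mortality(alpha, beta0, beta1, kappa, iota, ages, years):
    """mu[x, t] = exp(alpha_x) * RF(x, t), with the assumed form
    RF(x, t) = exp(beta0_x * iota_{t-x} + beta1_x * kappa_t).
    iota is a dict keyed by year of birth z = t - x."""
    mu = np.empty((len(ages), len(years)))
    for i, x in enumerate(ages):
        for j, t in enumerate(years):
            rf = np.exp(beta0[i] * iota[t - x] + beta1[i] * kappa[j])
            mu[i, j] = np.exp(alpha[i]) * rf
    return mu

# Hypothetical toy parameters (not estimates from any data set).
ages = [60, 61, 62]
years = [2000, 2001, 2002]
alpha = np.array([-4.5, -4.4, -4.3])
beta0 = np.array([0.5, 0.3, 0.2])
beta1 = np.array([0.4, 0.3, 0.3])
kappa = np.array([0.0, -0.1, -0.2])
iota = {z: 0.05 * (1940 - z) for z in range(1938, 1943)}

mu_apc = force_of_mortality(alpha, beta0, beta1, kappa, iota, ages, years)
mu_lc = force_of_mortality(alpha, np.zeros(3), beta1, kappa, iota, ages, years)  # beta0 = 0
mu_ac = force_of_mortality(alpha, beta0, np.zeros(3), kappa, iota, ages, years)  # beta1 = 0
```

Setting $\beta_x^{(0)} = 0$ removes the cohort term and leaves the LC structure, while setting $\beta_x^{(1)} = 0$ leaves the AC structure.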
6.2.2.1 Introduction
In Chapter 5, a number of approaches to specifying the error structure
for the model and to model fitting have been described. In this section,
we set aside the standard least-squares and singular value decomposition
approach, and focus on selecting a Poisson response model and using
maximum likelihood estimation (as in Section 5.2.2.3).
Thus, we model the random number of deaths, Dxt , as a Poisson response
variable. As noted in the previous chapter, direct modelling of Dxt is very
useful in many practical applications where, for example, we might need to
simulate the future cash flows of a life annuity or pensions portfolio. We
allow also for over-dispersion and the allocation of prior weights, which
can be important in the presence of empty data cells. This is formalized by
following the approach of generalized linear models and by specifying the
first two moments of the responses Yxt where
$$Y_{xt} = D_{xt},$$
$$E(Y_{xt}) = ETR_{xt}\,\mu_x(t) = ETR_{xt}\exp(\alpha_x)\,RF(x,t) \qquad (6.9)$$
$$\mathrm{Var}(Y_{xt}) = \phi\,E(Y_{xt}) \qquad (6.10)$$
where
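$$y_{xt} = \ln \hat m_x(t), \qquad \hat y_{xt} = \hat\alpha_x + \hat\beta_x\hat\kappa_t \qquad (6.14)$$
$$D(y_{xt},\hat y_{xt}) = \sum_{x,t}\mathrm{dev}(x,t) = \sum_{x,t} 2w_{xt}\int_{\hat y_{xt}}^{y_{xt}} \frac{y_{xt}-u}{V(u)}\,du = \sum_{x,t} w_{xt}(y_{xt}-\hat y_{xt})^2 \qquad (6.15)$$
with weights
$$w_{xt} = \begin{cases} 1, & ETR_{xt} > 0 \\ 0, & ETR_{xt} = 0 \end{cases} \qquad (6.16)$$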
The updating of a typical parameter $\theta$ proceeds according to
$$\mathrm{updated}(\theta) = u(\theta) = \theta - \frac{\partial D/\partial\theta}{\partial^2 D/\partial\theta^2} \qquad (6.17)$$
where D is the deviance of the current model. Table 6.1 provides fuller
details. Effective starting values, conforming to the usual LC constraints
(6.2), are $\hat\kappa_t = 0$, $\hat\beta_x = 1/k$, coupled with the SVD estimate
$$\hat\alpha_x = \frac{1}{t_n - t_1 + 1}\sum_{t=t_1}^{t_n} \ln \hat m_x(t) \qquad (6.18)$$
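The updating cycle for the LC structure under the Gaussian error setting can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: $k$ is taken to be the number of ages, the starting value for $\hat\alpha_x$ is a weighted version of (6.18), and the constraints (6.2) are assumed to be $\sum_t \kappa_t = 0$ and $\sum_x \beta_x = 1$. The input matrix is synthetic.

```python
import numpy as np

def fit_lc_gaussian(y, w=None, n_iter=50):
    """Cyclic one-parameter-at-a-time updates for y_xt ~ alpha_x + beta_x * kappa_t.
    y: (n_ages, n_years) array of log death rates; cells with weight 0 may hold
    any finite placeholder value."""
    y = np.asarray(y, dtype=float)
    n_ages, n_years = y.shape
    w = np.ones_like(y) if w is None else np.asarray(w, dtype=float)

    def resid(alpha, beta, kappa):
        return y - (alpha[:, None] + np.outer(beta, kappa))

    # Starting values: alpha = (weighted) average over t, kappa = 0, beta = 1/k.
    alpha = (w * y).sum(axis=1) / w.sum(axis=1)
    beta = np.full(n_ages, 1.0 / n_ages)
    kappa = np.zeros(n_years)

    for _ in range(n_iter):
        r = resid(alpha, beta, kappa)
        alpha = alpha + (w * r).sum(axis=1) / w.sum(axis=1)
        r = resid(alpha, beta, kappa)
        kappa = kappa + (w * r * beta[:, None]).sum(axis=0) / (w * beta[:, None] ** 2).sum(axis=0)
        r = resid(alpha, beta, kappa)
        beta = beta + (w * r * kappa[None, :]).sum(axis=1) / (w * kappa[None, :] ** 2).sum(axis=1)

    # Re-impose the assumed constraints without changing the fitted values.
    shift, scale = kappa.mean(), beta.sum()
    alpha = alpha + beta * shift
    kappa = (kappa - shift) * scale
    beta = beta / scale
    deviance = float((w * resid(alpha, beta, kappa) ** 2).sum())
    return alpha, beta, kappa, deviance

# Synthetic data with an exact LC structure (hypothetical values).
a_true = np.linspace(-6.0, -2.0, 5)
b_true = np.array([0.10, 0.15, 0.20, 0.25, 0.30])
k_true = np.linspace(10.0, -10.0, 8)
y = a_true[:, None] + np.outer(b_true, k_true)
alpha_hat, beta_hat, kappa_hat, dev = fit_lc_gaussian(y)
```

Updating $\hat\kappa_t$ before $\hat\beta_x$ matters here: with the starting value $\hat\kappa_t = 0$, the denominator of the $\hat\beta_x$ update would otherwise be zero on the first pass.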
Gaussian Poisson
wxt (yxt − ŷxt ) wxt (yxt − ŷxt )
LC u(α̂x ) = α̂x + t u(α̂x ) = α̂x + t
wxt wxt ŷxt
t t
wxt (yxt − ŷxt )β̂x wxt (yxt − ŷxt )β̂x
u(κ̂t ) = κ̂t + x u(κ̂t ) = κ̂t + x
wxt β̂x2 wxt ŷxt β̂x2
x x
wxt (yxt − ŷxt )κ̂t wxt (yxt − ŷxt )κ̂t
u(β̂x ) = β̂x + t u(β̂x ) = β̂x + t
wxt κ̂t2 wxt ŷxt κ̂t2
t t
wxt (yxt − ŷxt )β̂x(0) wxt (yxt − ŷxt )β̂x(0)
x,t x,t
t−x=z t−x=z
APC u(ι̂z ) = ι̂z + 2 u(ι̂z ) = ι̂z + 2
wxt β̂x(0) wxt ŷxt β̂x(0)
x,t x,t
t−x=z t−x=z
(0) (0)
wxt (yxt − ŷxt )ι̂t−x (0) (0)
wxt (yxt − ŷxt )ι̂t−x
u(β̂x ) = β̂x + t u(β̂x ) = β̂x + t
wxt ι̂2t−x wxt ŷxt ι̂2t−x
t t
wxt (yxt − ŷxt )β̂x(1) wxt (yxt − ŷxt )β̂x(1)
u(κ̂t ) = κ̂t + x u(κ̂t ) = κ̂t +
x
(1)2 2
wxt β̂x wxt ŷxt β̂x(1)
x x
(1) (1)
wxt (yxt − ŷxt )κ̂t (1) (1)
wxt (yxt − ŷxt )κ̂t
u(β̂x ) = β̂x + t u(β̂x ) = β̂x + t
wxt κ̂t2 wxt ŷxt κ̂t2
t t
AC u(α̂x ) computedas above u(α̂x ) computedas above
wxt (yxt − ŷxt )β̂x wxt (yxt − ŷxt )β̂x
x,t x,t
t−x=z t−x=z
u(ι̂z ) = ι̂z + u(ι̂z ) = ι̂z +
wxt β̂x2 wxt ŷxt β̂x2
x,t x,t
t−x=z t−x=z
wxt (yxt − ŷxt )ι̂t−x wxt (yxt − ŷxt )ι̂t−x
u(β̂x ) = β̂x + t u(β̂x ) = β̂x + t
wxt ι̂2t−x wxt ŷxt ι̂2t−x
t t
6.2 LC age–period–cohort mortality projection model 251
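$$D(y_{xt},\hat y_{xt}) = \sum_{x,t}\mathrm{dev}(x,t) = \sum_{x,t} 2w_{xt}\int_{\hat y_{xt}}^{y_{xt}}\frac{y_{xt}-u}{V(u)}\,du = \sum_{x,t} 2w_{xt}\left\{ y_{xt}\ln\frac{y_{xt}}{\hat y_{xt}} - (y_{xt}-\hat y_{xt})\right\} \qquad (6.20)$$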
and then revert back to the standard LC constraints (6.2) once convergence
is attained.
where
$$\tilde\iota_{t_n-x+s} = \begin{cases}\hat\iota_{t_n-x+s}, & 0 < s \le x - x_1 \\ \dot\iota_{t_n-x+s}, & s > x - x_1\end{cases}$$
As we have seen in Section 5.7, the time series forecasts are typically gen-
erated using univariate ARIMA processes. The random walk with drift
(or ARIMA(0,1,0) process) features prominently in many of the published
applications of the LC model. If no provision for alignment with the latest
available mortality rates is made (as in equation (6.24)), the extrapolated
mortality rates decompose multiplicatively as
$$\dot m_x(t_n+s) = \exp\left(\hat\alpha_x + \hat\beta_x^{(0)}\hat\iota_{t_n-x} + \hat\beta_x^{(1)}\hat\kappa_{t_n}\right)\dot{RF}(x,t_n+s), \quad s>0 \qquad (6.27)$$
which has the same functional form as (6.8), and can be directly compared
with (6.24). This was the approach originally proposed in Lee and Carter
(1992).
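The random walk with drift mentioned above can be sketched as follows. This is a minimal illustration under simplifying assumptions: the drift and volatility are estimated from first differences, the simulated paths ignore uncertainty in the estimated drift, and the historical index values are hypothetical. The function names are introduced here and are not taken from any library.

```python
import numpy as np

def fit_random_walk_with_drift(kappa):
    """Estimate drift and innovation standard deviation from first differences."""
    steps = np.diff(np.asarray(kappa, dtype=float))
    return steps.mean(), steps.std(ddof=1)

def forecast_random_walk_with_drift(kappa, horizon, n_sims=1000, seed=0):
    """Central forecast kappa_tn + s * drift and simulated paths, s = 1..horizon."""
    drift, sigma = fit_random_walk_with_drift(kappa)
    s = np.arange(1, horizon + 1)
    central = kappa[-1] + drift * s
    rng = np.random.default_rng(seed)
    increments = rng.normal(loc=drift, scale=sigma, size=(n_sims, horizon))
    paths = kappa[-1] + increments.cumsum(axis=1)
    return central, paths

kappa_hist = np.array([0.0, -1.5, -2.5, -4.5, -6.0])  # hypothetical period index
central, paths = forecast_random_walk_with_drift(kappa_hist, horizon=10)
```

The central forecast is linear in the horizon $s$, while the spread of the simulated paths grows with $\sqrt{s}$.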
6.2.4 Discussion
maximum likelihood estimates obtained for the LC model using the itera-
tive fitting processes under the Gaussian error structure given by (6.12), are
the same as those obtained under fitting by SVD. Unlike modelling by SVD,
however, the choice of weights (6.16) means that estimation can proceed,
in the presence of empty data cells, under the Gaussian, Poisson, and any
of the other viable error settings. Wilmoth (1993) uses weights wxt = dxt
in combination with the Gaussian error setting. Empirical studies reveal
that this has the effect of bringing the parameter estimates into close agree-
ment with the Poisson-response-based estimates. When comparing a range
of results obtained under both modelling approaches (with identical model
structures), we have found that the same number of iterations is required
to induce convergence. However, convergence is slow when fitting the APC
model.
As discussed in Section 5.6, diagnostic checks on the fitted model are very
important. For consistency with the model specification, we consider plots
of the standardized deviance residuals
$$r_{xt} = \mathrm{sign}(y_{xt}-\hat y_{xt})\sqrt{\mathrm{dev}(x,t)/\hat\phi} \qquad (6.28)$$
where
$$\hat\phi = \frac{D(y_{xt},\hat y_{xt})}{\nu} \qquad (6.29)$$
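The residuals (6.28) and the scale estimate (6.29) can be computed as in the sketch below, here for the Poisson deviance (6.20). The degrees of freedom $\nu$ are taken, as an assumption, to be the number of cells with non-zero weight minus the number of fitted parameters; the death counts and fitted values are hypothetical.

```python
import numpy as np

def poisson_deviance_terms(y, y_hat, w=None):
    """dev(x,t) = 2 w {y ln(y / y_hat) - (y - y_hat)}, with y ln(y / y_hat) = 0 when y = 0."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    w = np.ones_like(y) if w is None else np.asarray(w, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        log_term = np.where(y > 0, y * np.log(y / y_hat), 0.0)
    dev = 2.0 * w * (log_term - (y - y_hat))
    return np.maximum(dev, 0.0)  # guard against tiny negative rounding errors

def standardized_deviance_residuals(y, y_hat, n_params, w=None):
    dev = poisson_deviance_terms(y, y_hat, w)
    n_cells = dev.size if w is None else int(np.count_nonzero(w))
    nu = n_cells - n_params          # assumed definition of the degrees of freedom
    phi_hat = dev.sum() / nu
    sign = np.sign(np.asarray(y, dtype=float) - np.asarray(y_hat, dtype=float))
    return sign * np.sqrt(dev / phi_hat), phi_hat

deaths = np.array([10.0, 0.0, 5.0, 8.0])   # hypothetical observed deaths
fitted = np.array([8.0, 1.0, 5.0, 10.0])   # hypothetical fitted deaths
r, phi_hat = standardized_deviance_residuals(deaths, fitted, n_params=1)
```

By construction the squared residuals sum to $\nu$, so plots of $r_{xt}$ against age, period, and year of birth are informative about pattern rather than overall scale.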
The sole use of the proportion of the total temporal variance, as measured
by the ratio of the first singular value to the sum of singular values under
SVD, is not a satisfactory diagnostic indicator. However, this index is widely
quoted in the demographic literature: see, for example, Tuljapurkar et al.
(2000).
The parameters αx are estimated simultaneously with the parameters of
the reduction factor RF in both the LC and AC models. A two-stage esti-
mation process is necessary, however, in which αx is estimated separately
to condition on the estimation of RF, when fitting the APC model (and its
substructures). This two-stage approach can also be applied when fitting
the LC and AC models. In the case of the former, empirical studies show
that this has little practical material effect, due to the robust nature of the
αx estimate (6.18).
Figure 6.1. Female mortality experience: residual plots for (a) LC model; (b) AC model; and (c)
APC model.
over the fitted LC model. Similar patterns are observed in the residual plots
for the UK male experience (not reproduced here but the details are available
from the authors).
Turning first to the parameter estimates for the APC modelling approach
(Fig. 6.2), we believe that it is helpful and informative to compare matching
frames between the sexes. Thus, the main age–effect plots (α̂x vs x) display
the familiar characteristics, including the ‘accident’ humps, of static cross-
sectional life-tables (on the log scale), with a more pronounced accident
hump and heavier mortality for males than for females. We recall that these
effects are estimated separately, by averaging crude mortality rates over t
for each x, to condition for both period and cohort effects.
The main period effects plot (κ̂t vs t) is linear for females but exhibits mild
curvature for males, which can be characterized as piece-wise linear with
a knot or hinge positioned in the first-half of the 1970s. This effect is also
present in the separate LC analysis of mortality data of the G7 countries
[Figure 6.2 panels: for each sex, estimates a, b0 and b1 plotted against age (0–90), k against calendar year (1960–2020), and i against year of birth (1860–2020).]
Figure 6.2. Parameter estimates and forecasts for the APC model: (a) females; (b) males.
6.3 Application to United Kingdom mortality data 257
(Tuljapurkar et al., 2000) and has been discussed further for the United
Kingdom by Renshaw and Haberman (2003a). The forecasts for κt are
based on the auto-regressive time series model
Figure 6.3. Current (2000) and projected (2020) ln µx (t) age profiles: (a) LC and AC models; (b)
LC and APC models.
for a range of years t using both the cohort and period method of computing.
(We note that the annuity values represent the expected present value of an
income of one paid annually in arrears while the individual initially aged 65
remains alive.) For the cohort method of computing, we use the following
formulae, which are analogous to (5.57):
$$e_x(t) = \frac{\sum_{h\ge 0} l_{x+h}(t+h)\{1 - \tfrac12 q_{x+h}(t+h)\}}{l_x(t)}, \qquad a_x(t) = \frac{\sum_{h\ge 1} l_{x+h}(t+h)\,v^h}{l_x(t)} \qquad (6.31)$$
where
[Figure 6.4 panels: e(65,t) plotted against period t, 1960–2020.]
Figure 6.4. Projected life expectancies at age 65, computed by period and by cohort methods for
age-period (LC) and age-period-cohort (APC) models.
[Figure 6.5 panels: a(65,t) plotted against period t, 1960–2020.]
Figure 6.5. Projected life annuity values at age 65 (calculated using a 5% per annum fixed interest
rate), computed by period and by cohort under age-period (LC) and age-period-cohort (APC)
models.
[Figure 6.6 frames: upper frame, age 65, life expectancy for t = 2004, 2008, 2012, 2016, 2020; lower frame, period 2000, life expectancy for x = 65, 70, 75, 80, 85; age-period (LC) and age-period-cohort (APC) models compared in each frame.]
Figure 6.6. E +W male mortality: comparison of life expectancy predictions using (i) age-period-
cohort and (ii) age-period Poisson structures. Predictions with intervals by bootstrapping the time
series prediction error in the period (and cohort) components, and selecting the resulting 2.5, 50,
97.5 percentiles.
indicate the burden that increasing longevity may place on such financial
institutions.
As we have discussed in Section 5.8, it is important to be able to qualify
any projections of key mortality indices with measures of the error or
uncertainty present. Because of the complexity of the structure of the APC
version of the LC model, the indices of interest are non-linear functions of the parameters
$\alpha_x$, $\beta_x$, $\kappa_t$, $\iota_{t-x}$ and hence analytical derivations of prediction intervals are not
possible. It is therefore necessary to employ bootstrapping techniques.
In Figs. 6.6 and 6.7, we use the LC and APC models fitted to England and
Wales male mortality rates over the period 1961–2000 in order to compare
estimates of life expectancy and 95% prediction intervals. Specifically, we
show in Fig. 6.6(a) computations of the period life expectancy at age 65 for
various future periods (equivalent to the median of the simulated distribu-
tions) and the corresponding 2.5 and 97.5 percentiles from the simulated
[Figure 6.7 frames: upper frame, age 65, 4% annuity values for t = 2004, 2008, 2012, 2016, 2020; lower frame, period 2000, 4% annuity values for x = 65, 70, 75, 80, 85; age-period (LC) and age-period-cohort models compared in each frame.]
Figure 6.7. E + W male mortality: comparison of 4% fixed rate annuity predictions using (i) age-
period-cohort and (ii) age-period Poisson structures. Predictions with intervals by bootstrapping
the time series prediction error in the period (and cohort) components, and selecting the resulting
2.5, 50, 97.5 percentiles.
The results in Fig. 6.6 (lower frame) show the corresponding figures for
cohort life expectancy for five cohorts of males aged 65, 70, 75, 80, and 85 in
2000. The younger cohorts have estimates of cohort life expectancy that are
higher under the APC model than under the LC model (as in Fig. 6.4). The
prediction intervals under the APC model are much wider for the younger
cohorts. As we consider the older cohorts, we note that the central estimates
and the prediction intervals become more similar under the two models
indicating the particular incidence of the cohort effect which affects those
aged 55–75 in 2000. As expected, under both models, the prediction intervals
are wider for the cohorts aged 65 and 70 in 2000 than for the older cohorts,
and the width decreases in stages as age in 2000 increases. This reflects the
underlying level of projection involved in the calculations – if we regard age
110 as approximately the terminal age in the underlying survival model,
then the cohort estimates at age 65 would involve 45 years of projected
quantities while the cohort estimates at age 85 would involve only 25 years
of projections.
Figure 6.7 reproduces the calculations of Fig. 6.6 but for immediate life
annuities calculated using a constant interest rate of 4% per annum.
We can regard Fig. 6.7 as extending the results of Fig. 6.5 by including
prediction intervals and a more detailed comparison. Figure 6.7 shows the
same principal features as Fig. 6.6.
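The cohort-method quantities compared in Figs. 6.6 and 6.7 follow from (6.31). A minimal sketch is given below, assuming the standard survivorship recursion $l_{x+h+1}(t+h+1) = l_{x+h}(t+h)\{1 - q_{x+h}(t+h)\}$ and $v = 1/(1+i)$; the death probabilities are hypothetical toy values, not projected UK rates.

```python
import numpy as np

def cohort_life_expectancy_and_annuity(q_diag, interest):
    """q_diag[h] = q_{x+h}(t+h), read along the cohort diagonal.
    Returns (e_x(t), a_x(t)) as in (6.31), with l_x(t) normalized to 1."""
    q = np.asarray(q_diag, dtype=float)
    l = np.concatenate(([1.0], np.cumprod(1.0 - q)))   # l[h] = l_{x+h}(t+h) / l_x(t)
    e = np.sum(l[:-1] * (1.0 - 0.5 * q))               # sum over h >= 0
    v = 1.0 / (1.0 + interest)
    h = np.arange(1, len(l))
    a = np.sum(l[1:] * v ** h)                         # sum over h >= 1, paid in arrears
    return e, a

# Hypothetical death probabilities for ages 65, 66, ..., 110 along one cohort.
q_diag = np.minimum(1.0, 0.012 * 1.1 ** np.arange(46))
q_diag[-1] = 1.0                                       # close the table at the last age
e65, a65 = cohort_life_expectancy_and_annuity(q_diag, interest=0.04)
```

The period method uses the same function with all probabilities taken from a single calendar year, $q_{x+h}(t)$, instead of from the diagonal $q_{x+h}(t+h)$.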
where the $\iota_{t-x}$ term represents a cohort effect as in (6.6). Having considered
goodness-of-fit of this family of models to historic data from England and
Wales and the USA, Cairns et al. (2007) investigate two specific versions in
some detail.
The special cases are
where $\bar x$ is the average age in the data set and $x_c$ is a constant parameter that
needs to be estimated. As with the APC version of the Lee–Carter model in
Section 6.2, we need to introduce some identifiability constraints to ensure
that the parameters can be uniquely estimated. Version II is motivated by
the observation that in the applications of the APC model of Section 6.2, the
coefficient of the cohort term $\iota_{t-x}$ is often found to be a decreasing function
of age: (6.37) incorporates the simplest such specification of $\beta_x^{(3)}$.
Cairns et al. (2007) fit the models by the method of maximum likelihood,
assuming that Dxt has a Poisson distribution as assumed in earlier Sections
5.2.2.3 and 6.2.2. For England and Wales data comprising the calendar
years 1961–2004 inclusive and ages 60–89, they find that the best-fitting
model is (6.37). For US data comprising the calendar years 1968–2003
inclusive and ages 60–89 (although data for ages 85–89 are used only for
1980–2003), they find that the best-fitting model is (6.6). When robustness
to the choice of fitting period is considered, the best fits to the historic data
from both countries are obtained for an augmented version of (6.35) viz.
$$\ln\frac{q_x(t)}{p_x(t)} = \beta_x^{(1)}\kappa_t^{(1)} + \beta_x^{(2)}\kappa_t^{(2)} + \beta_x^{(3)}\kappa_t^{(3)} + \beta_x^{(4)}\iota_{t-x} \qquad (6.38)$$
with the specific choices $\beta_x^{(1)} = 1$, $\beta_x^{(2)} = x - \bar x$, $\beta_x^{(3)} = (x-\bar x)^2 - \hat\sigma_x^2$ and
$\beta_x^{(4)} = 1$. Here $\hat\sigma_x^2$ is the average value of $(x-\bar x)^2$. This development of
the model is inspired by the observation that there is some curvature in the
age-profile of $\mathrm{logit}\, q_x(t)$ in the United States data.
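The age functions in (6.38) are fully specified by the age range, so they can be tabulated directly. The sketch below builds them for ages 60–89 and evaluates $q_x(t)$ for one calendar year; the period and cohort values are hypothetical, and the function names are introduced here for illustration.

```python
import numpy as np

def age_functions(ages):
    """beta1 = 1, beta2 = x - xbar, beta3 = (x - xbar)^2 - sigma2, as in (6.38)."""
    x = np.asarray(ages, dtype=float)
    x_bar = x.mean()
    sigma2 = np.mean((x - x_bar) ** 2)
    return np.ones_like(x), x - x_bar, (x - x_bar) ** 2 - sigma2

def death_probabilities(ages, k1, k2, k3, iota):
    """Invert logit q = beta1*k1 + beta2*k2 + beta3*k3 + iota (beta4 = 1)."""
    b1, b2, b3 = age_functions(ages)
    logit_q = b1 * k1 + b2 * k2 + b3 * k3 + np.asarray(iota, dtype=float)
    return 1.0 / (1.0 + np.exp(-logit_q))

ages = np.arange(60, 90)
b1, b2, b3 = age_functions(ages)
# Hypothetical period effects for one year and a zero cohort effect.
q = death_probabilities(ages, k1=-3.0, k2=0.10, k3=0.001, iota=np.zeros(len(ages)))
```

Centring $\beta_x^{(2)}$ and $\beta_x^{(3)}$ so that each averages to zero over the age range means that $\kappa_t^{(1)}$ carries the overall level of $\mathrm{logit}\,q_x(t)$, while the other two period terms carry slope and curvature.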
As in Sections 5.3 and 6.2, we could use the Cairns–Blake–Dowd class
of models for projection purposes. This would require models to be pos-
tulated and estimated for the dynamics of the period and cohort effects
6.5 P-splines model: allowing for cohort effects 265
where $B_{ij}(x,t) = B_i(x)\,B_j(t)$, the $\theta_{ij}$ are parameters to be
estimated from the data, and $B_i$ and $B_j$ are the respective univariate B-splines.
In reality, B-splines can provide a very good fit to the data if we employ a
large number of knots in the year and age dimensions. But this excellent level
of goodness of fit is achieved by sacrificing smoothness in the resulting fit.
The method of P-splines (or penalized splines) has been suggested by Eilers
and Marx (1996) to overcome this problem: in this case, the log-likelihood
is adjusted by a penalty function, with appropriate weights.
Schematically, the penalized log-likelihood would have the following form
for an LC model:
where, as in Section 6.2, we use z = t − x to index cohorts. The λ’s are esti-
mated from the data. As noted in Section 5.4.2, typical choices for quadratic
Thus, the B-splines are used as the basis for the underlying regression and
the log likelihood is modified by penalty functions like the above which
depend on the smoothness of the θij parameters.
The idea of using P-spline regression not just for graduating mortality
data but also for mortality projections was first suggested by CMIB (2005).
In this application, that is, projecting mortality rates, the choice of the P(θ)
function plays a critical role in extending the mortality surface beyond the
range of the data so that projections are a direct consequence of the smooth-
ing process. Thus, a quadratic penalty function effectively leads to linear
extrapolation – in the age and time dimensions, for (6.40) combined with
the choices (6.42) and (6.43); or in the age and year of birth dimensions
for (6.41) combined with the choices (6.42) and (6.44). Different choices
for P(θ) would be possible, and these may have little impact on the quality
of fit to the historic data and hence would be difficult to infer from the
data. However, the impact on the projected mortality surface is consider-
able. The choice of P(θ) corresponds to a decision on the projected trend.
We have seen the implications of a quadratic penalty. Similarly, a linear
penalty function would lead to constant log mortality rates being projected
in the appropriate dimensions and a cubic penalty function would lead to
quadratic log mortality rates being projected in the appropriate dimensions.
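The link between the order of the penalty and the shape of the extrapolation can be checked numerically in one dimension. The sketch below is a deliberate simplification: it uses an identity basis in place of B-splines (a Whittaker-type smoother), a Gaussian least-squares criterion in place of the log-likelihood, and reads the "quadratic" and "linear" penalties as second-order and first-order difference penalties. Future years enter with zero weight, so their coefficients are determined by the penalty alone. The input series is synthetic.

```python
import numpy as np

def penalized_fit(y, n_ahead, lam=10.0, order=2):
    """Minimize sum_i w_i (z_i - theta_i)^2 + lam * ||D theta||^2, where D takes
    differences of the given order and w_i = 0 for the n_ahead future points."""
    y = np.asarray(y, dtype=float)
    n = y.size
    m = n + n_ahead
    w = np.concatenate((np.ones(n), np.zeros(n_ahead)))
    z = np.concatenate((y, np.zeros(n_ahead)))       # placeholders carry zero weight
    D = np.diff(np.eye(m), n=order, axis=0)          # (m - order) x m difference matrix
    return np.linalg.solve(np.diag(w) + lam * D.T @ D, w * z)

# Synthetic log mortality rates with a downward trend (hypothetical values).
rng = np.random.default_rng(1)
n, n_ahead = 30, 10
y = -4.0 - 0.02 * np.arange(n) + 0.01 * rng.standard_normal(n)

theta2 = penalized_fit(y, n_ahead, order=2)   # second-order penalty
theta1 = penalized_fit(y, n_ahead, order=1)   # first-order penalty
```

In the forecast region the second differences of `theta2` vanish, giving a straight-line extrapolation, while the first differences of `theta1` vanish, giving a constant projection.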
Detailed applications of the P-spline methodology indicate that it is bet-
ter suited to graduation and smoothing of historic observational data than
to projection: see, for example, Cairns et al. (2007) and Richards et al.
(2007). Further, we should note that P-spline models can be used to generate
percentiles for the measurement of uncertainty but, unlike the LC
and Cairns–Blake–Dowd families of models, P-spline models are not able
to generate sample paths. In many asset–liability modelling applications in
insurance and pensions, the production of sample paths is an important fea-
ture and could be useful elsewhere such as in the pricing of longevity-linked
financial instruments – see Chapter 7.
7 The longevity risk: actuarial perspectives
7.1 Introduction
In this chapter we deal with the mortality risks borne by an annuity provider,
and in particular with the longevity risk originating from the uncertain
evolution of mortality at adult and old ages.
The assessment of longevity risk requires a stochastic representation of
mortality. Possible approaches are described in Section 7.2, which is also
devoted to an analysis of the impact of longevity risk on the risk profile of
the provider. In Sections 7.3 and 7.4 we take a risk management perspective,
and we investigate possible solutions for risk mitigation. In particular, risk
transfers as well as capital requirements for the risk retained are discussed.
Policy design and the pricing of life annuities allowing for longevity risk are
dealt with in Sections 7.5 and 7.6; such aspects, owing to commercial pressure
and modelling difficulties, are rather controversial. We do not develop
an in-depth analysis, but we instead remark on the main issues. To reach a
proper arrangement of the policy conditions of a life annuity, the possible
behaviour of the annuitant in respect of the planning of her/his retirement
income has to be considered. In Section 7.7 we describe possible choices
available to the annuitant in this respect.
The topics dealt with in this chapter are rather new and not well-
established either in practice or in the literature. So the chapter is based on
recent research. To give a comprehensive view of the available literature,
most contributions are cited in Section 7.8, which is devoted to comments
on further readings; for some specific issues, however, references are also
quoted in the previous sections.
In this chapter, we usually refer to annuitants and insurers. These terms
are, however, used in a generic sense. The discussion could equally be applied
to pensioners, with a proper adjustment of the parameters of the relevant
mortality models, and to annuity providers other than an insurer. Just for
brevity, only annuitants and insurers are mentioned.
Mortality risk may emerge in different ways. Three cases can in particular
be envisaged.
(a) One individual may live longer or less than the average lifetime in the
population to which she/he belongs. In terms of the frequency of deaths
in the population, this may result in observed mortality rates higher
than expected in some years, lower than expected in others, with no
apparent trend in such deviations.
(b) The average lifetime of a population may be different from what is
expected. In terms of the frequency of deaths, it turns out that mortality
rates observed in time in the population are systematically above or
below those coming from the relevant mortality table.
(c) Mortality rates in a population may experience sudden jumps, due to
critical living conditions, such as influenza epidemics, severe climatic
conditions (e.g. hot summers), natural disasters and so on.
In all the three cases, deviations in mortality rates with respect to what is
expected are experienced; an illustration is sketched in Fig. 7.1 where, with
reference to a given cohort, in each panel dots represent mortality rates
observed along time, whereas the solid line plots their forecasted level.
Case (a) is the well-known situation of possible deviations around
expected mortality rates; the mortality risk here comes out as a risk of
[Figure 7.1 panels: three panels, each with vertical axis "Mortality rates".]
Figure 7.1. Experienced (dots) vs expected (solid line) mortality rates for a given cohort.
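The three situations (a)–(c) can be mimicked with a small simulation. The sketch below is purely illustrative: the expected rates, the portfolio size, the 20% systematic shift, and the 50% shock are hypothetical choices, and the number of lives is held fixed each year instead of following a depleting cohort.

```python
import numpy as np

def simulate_observed_rates(q_expected, n_lives, rng,
                            level_shift=1.0, shock_year=None, shock_factor=1.0):
    """Observed mortality rates = binomial deaths / lives, around 'true' rates that
    may differ from the expected ones by a constant factor and/or a one-year shock."""
    q_true = np.asarray(q_expected, dtype=float) * level_shift
    if shock_year is not None:
        q_true = q_true.copy()
        q_true[shock_year] *= shock_factor
    q_true = np.clip(q_true, 0.0, 1.0)
    deaths = rng.binomial(n_lives, q_true)
    return deaths / n_lives

rng = np.random.default_rng(1)
q_expected = 0.01 * 1.06 ** np.arange(20)   # hypothetical forecast rates
n_lives = 100_000

case_a = simulate_observed_rates(q_expected, n_lives, rng)                    # random fluctuations
case_b = simulate_observed_rates(q_expected, n_lives, rng, level_shift=0.8)   # systematic deviations
case_c = simulate_observed_rates(q_expected, n_lives, rng,
                                 shock_year=10, shock_factor=1.5)             # sudden jump
```

Increasing `n_lives` shrinks the scatter in case (a) but leaves the gap in case (b) unchanged, which is one way of seeing why systematic deviations cannot be diversified away by enlarging the portfolio.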
7.2 The longevity risk 269
2 system, where (see CEIOPS (2007)) both the mortality and the longevity
risks are meant to result from uncertainty risk. Mortality risk addresses
possible situations of extra-mortality; concern here is for a business with
death benefits. On the contrary, longevity risk addresses the possible real-
ization of extra-survivorship; clearly, in this case concern is for a business
with living benefits, life annuities in particular. In the following, we disre-
gard this meaning; reference is therefore to what we have described under
items (a)–(c) above and the relevant remarks.
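[Figure residue (diagram of the numbered modelling approaches). Columns: Input (X1, X2) and Output (Y). The first approaches use given values of X1 and X2; the later ones replace them with probability distributions fX1, fX2 and fY, in some cases conditional on alternative assumptions A1, A2, A3 (fX1|Ah, fY|Ah).]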
Let A(τ) denote a given assumption about the mortality trend for people
born in year τ, and 𝒜(τ) the set of such assumptions. The notation
(x, τ + x|A(τ)) refers to the projected mortality quantity conditional on
the specific assumption A(τ). The set of all mortality projections is denoted
as the family {(x, τ + x|A(τ)); A(τ) ∈ 𝒜(τ)}.
In principle, the set 𝒜(τ) can be either discrete or continuous. The former
case is, however, more practicable. Examples may be found in the projections
developed by CMIB, addressing the cohort effect and assuming three
hypotheses about the persistence of such an effect in the future; see CMI
(2002) and CMI (2006).
Let us then suppose that a discrete set has been designed for 𝒜(τ).
Scenario testing, and possibly stress testing, can then be performed. In
particular, the sensitivity of some quantities, such as reserves, profits, and
so on, in respect of future mortality trends can be investigated. As mentioned
in Section 7.2.2, process risk can be explicitly appraised through the
probability distribution function of the lifetime of all the individuals in the
cohort, conditional on a given trend assumption. However, the approach
in respect of parameter risk is deterministic. Some examples are described
in Section 7.2.4.
A step forward consists of assigning a (non-negative and normalized)
weighting structure to 𝒜(τ) (see approach 5 in Fig. 7.2). In this way,
unconditional valuations can be performed, thus accounting explicitly for
parameter risk. Let
parameters. It emerges that, in terms of the survival function itself, the
alternative assumptions imply different levels of rectangularization (i.e.
squaring of the survival function, as witnessed by Var[T65 | Ah(τ)]) and
expansion (i.e. forward shift of the adult age at which most deaths occur,
which is then reflected in the value of E[T65 | Ah(τ)]); see Sections 3.3.6
and 4.1 for the meaning of rectangularization and expansion. The relevant
survival functions and curves of deaths are plotted in Fig. 7.3.
Assumption A3(τ) will be referred to as the best-estimate description of
the mortality trend for cohort τ; its parameters have been obtained by fitting
(7.3) to the current market Italian projected table for immediate annuities
(named IPS55). When comparing the values taken by E[T65 | Ah(τ)] and
Var[T65 | Ah(τ)] (quoted in Table 7.1) under the various assumptions, it
turns out that in respect of A3(τ) at age 65:
In each case, the maximum attainable age has been set equal to 117,
according to the reference projected table.
The portfolio we refer to consists of one cohort of immediate life annu-
ities. We assume that all annuitants are aged x0 at the time t0 of issue.
To shorten the notation, time t will be recorded as the time elapsed since
the policy issue, that is, it is the policy duration; hence, at policy dura-
tion t the underlying calendar year is t0 + t. The lifetimes of annuitants
are assumed, conditional on any given survival function, to be indepen-
dent of each other and identically distributed. Since our objective is the
278 7 : The longevity risk: actuarial perspectives
Figure 7.3. Survival functions (top panel) and curves of deaths (bottom panel) under the
Heligman–Pollard law. [Both panels: horizontal axis Age, from 65 to 115; vertical axes: number of survivors (top) and number of deaths (bottom); one curve for each of the assumptions A1 to A5.]
We are interested in investigating some typical values of $B_t^{(\Pi)}$ and $Y_t^{(\Pi)}$, as
well as the coefficient of variation and some percentiles. We will in particular
consider the impact of longevity risk in relation to the size of the portfolio.
So, unless otherwise stated, a homogeneous portfolio in respect of annual
amounts is considered; that is, we set $b^{(j)} = b$ for all j. Note that in this case
(7.6) may be rewritten as
\[ B_t^{(\Pi)} = b\, N_t \tag{7.9} \]
whilst the present value of future payments for the portfolio may also be
expressed as
\[ Y_t^{(\Pi)} = \sum_{h=t+1}^{\omega-x_0} B_h^{(\Pi)} (1+i)^{-(h-t)} = b \sum_{h=t+1}^{\omega-x_0} N_h (1+i)^{-(h-t)} \tag{7.10} \]
We first adopt approach 4 described in Section 7.2.2 (see also Fig. 7.2).
All valuations are then conditional on a given mortality assumption. We
have
\[ E[Y_t^{(\Pi)} \mid A_h(\tau), n_t] = n_t\, E[Y_t^{(1)} \mid A_h(\tau)] \tag{7.11} \]
Because we are assuming independence of the annuitants’ lifetimes, condi-
tional on a given mortality trend, the following results hold:
This represents the well-known result that the larger the portfolio, the
less risky it is, since with high probability the observed values will be close
to the expected ones. The quantity $CV[Y_t^{(\Pi)} \mid A_h(\tau), n_t]$ is sometimes called
the risk index.
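The pooling effect can be checked numerically. The following is a minimal sketch, assuming a hypothetical Gompertz mortality law (parameters `B` and `c` are illustrative, not the Heligman–Pollard parameters used in the text), a unit annuity paid in arrears and i = 0.03. For independent, identically distributed lifetimes the portfolio variance is n times the individual variance, so the risk index falls as 1/√n.

```python
import math


def lifetime_pmf(x0=65, omega=117, B=2.5e-5, c=1.1):
    """P[K = k], k = 0, 1, ...: curtate lifetime at age x0 (assumed Gompertz law)."""
    pmf, alive = [], 1.0
    for x in range(x0, omega + 1):
        # One-year survival probability; nobody survives beyond age omega.
        p = 0.0 if x == omega else math.exp(-B * c ** x * (c - 1) / math.log(c))
        pmf.append(alive * (1.0 - p))
        alive *= p
    return pmf


def single_annuity_moments(i=0.03):
    """Mean and variance of the present value Y^(1) at time 0 for one annuitant."""
    v = 1.0 / (1.0 + i)
    value, m1, m2 = 0.0, 0.0, 0.0
    for k, prob in enumerate(lifetime_pmf()):
        if k > 0:
            value += v ** k  # present value of k unit payments in arrears
        m1 += prob * value
        m2 += prob * value ** 2
    return m1, m2 - m1 ** 2


def risk_index(n):
    """Coefficient of variation of the portfolio present value for n annuitants."""
    mean1, var1 = single_annuity_moments()
    return math.sqrt(n * var1) / (n * mean1)


cv_by_size = {n: risk_index(n) for n in (1, 100, 10_000)}
```

Multiplying the portfolio size by 100 divides the risk index by 10, whatever mortality law is plugged in, as long as lifetimes are independent given that law.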
Conditional on a given mortality assumption, and because of the independence
among the lifetimes of the annuitants and the assumption of
homogeneity of annual amounts, the percentiles of $Y_t^{(\Pi)}$ could be assessed
through a process of convolution. In practice, however, due to the number
of random variables constituting $Y_t^{(\Pi)}$ (i.e. due to the magnitude of $n_t$),
analytical calculations are not practicable and so we must resort to stochastic
simulation. The ε-percentile of the distribution of $Y_t^{(\Pi)}$, conditional on
assumption $A_h(\tau)$ and an observed size $n_t$ of the in-force portfolio at time
t, is defined as
\[ y_{t,\varepsilon}[A_h(\tau), n_t] = \inf\left\{ u \ge 0 \;\middle|\; P\left[Y_t^{(\Pi)} \le u \mid A_h(\tau), n_t\right] > \varepsilon \right\} \tag{7.15} \]
In particular, we are interested in investigating the right tail of $Y_t^{(\Pi)}$;
therefore, high values for ε should be considered.
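A stochastic simulation of the percentiles in (7.15) might look as follows. This is a sketch under stated assumptions: the Gompertz parameters, the portfolio size, the number of simulations and the seed are hypothetical, and the mortality law stands in for whichever assumption Ah(τ) is being conditioned on.

```python
import math
import random


def lifetime_pmf(x0=65, omega=117, B=2.5e-5, c=1.1):
    """P[K = k] for the curtate lifetime at age x0 (assumed Gompertz law)."""
    pmf, alive = [], 1.0
    for x in range(x0, omega + 1):
        p = 0.0 if x == omega else math.exp(-B * c ** x * (c - 1) / math.log(c))
        pmf.append(alive * (1.0 - p))
        alive *= p
    return pmf


def annuity_values(n_terms, i=0.03):
    """a_k: present value of k unit payments in arrears, k = 0, ..., n_terms - 1."""
    v = 1.0 / (1.0 + i)
    values = [0.0]
    for k in range(1, n_terms):
        values.append(values[-1] + v ** k)
    return values


def simulate_portfolio_pv(n, n_sims=2_000, seed=7):
    """Simulated present values of future payments for n homogeneous annuitants."""
    rng = random.Random(seed)
    pmf = lifetime_pmf()
    a = annuity_values(len(pmf))
    ks = range(len(pmf))
    return [sum(a[k] for k in rng.choices(ks, weights=pmf, k=n))
            for _ in range(n_sims)]


def percentile(sample, eps):
    """Empirical analogue of (7.15): smallest u whose empirical cdf exceeds eps."""
    ordered = sorted(sample)
    j = min(int(eps * len(ordered)), len(ordered) - 1)
    return ordered[j]


sample = simulate_portfolio_pv(n=100)
mean_pv = sum(sample) / len(sample)
# Percentiles per unit of expected value, for the right tail of the distribution.
tail = {eps: percentile(sample, eps) / mean_pv for eps in (0.75, 0.90, 0.95, 0.99)}
```

Dividing each percentile by the simulated mean gives figures comparable in spirit to the "per unit of expected value" presentation used in the tables of this section.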
As far as the distribution of annual outflows $B_t^{(\Pi)}$ is concerned, similar
remarks to those for $Y_t^{(\Pi)}$ can be made. Thus, due to independence
and homogeneity, the random variables $B_t^{(\Pi)}$ have (under the information
available at time 0) a binomial distribution, with parameters $n_0$ and the
survival probability from issue time to policy duration t calculated under
the given mortality assumption. For reasons of space, we omit the relevant
expressions (which are straightforward).
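The omitted expressions follow from the standard moments of the binomial distribution. The sketch below spells them out; the survival probability passed in (`p_t = 0.9`) is a hypothetical input, not a value from the tables, and `b` is the common annual amount.

```python
import math


def outflow_moments(n0, p_t, b=1.0):
    """Mean, variance and coefficient of variation of B_t = b * N_t,
    with N_t ~ Binomial(n0, p_t) and p_t the survival probability to duration t."""
    mean = b * n0 * p_t
    var = b ** 2 * n0 * p_t * (1.0 - p_t)
    return mean, var, math.sqrt(var) / mean


def survivors_pmf(n0, p_t, k):
    """P[N_t = k]: probability that exactly k of the n0 annuitants are alive at t."""
    return math.comb(n0, k) * p_t ** k * (1.0 - p_t) ** (n0 - k)


mean_b, var_b, cv_b = outflow_moments(n0=100, p_t=0.9)
```

With b = 1 the mean is simply the expected number of surviving annuitants.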
respectively, the coefficient of variation for some initial sizes of the portfo-
lio and some percentiles of the present value of future payments, per unit
of expected value. Only the best-estimate assumption is considered. As far
as the coefficient of variation is concerned, we note that at any given time it
decreases rapidly as the size of the portfolio increases, as we commented on
earlier. For a given initial portfolio size, the coefficient of variation increases
in time; this is due to the decreasing residual size of the portfolio and to
annuitants becoming older as well. A similar result is found when analysing
the right tail of the distribution, as it emerges in Table 7.5.
Tables 7.6 and 7.7 highlight the distribution of annual outflows.
In particular, Table 7.6 quotes the expected value of annual outflows under
the different assumptions; we recall that, having set b(j) = 1, what is shown
is the expected number of annuitants (not rounded, to avoid too many
approximations). The remarks are similar to those made for the present
value of future payments.
We now assign the (naive) probability distribution (7.2) to the set 𝒜(τ).
The unknown mortality trend, assumed to lie in 𝒜(τ), is denoted by Ã(τ).
For the unconditional expected present value of future payments, the
following relations hold (the suffix ρ denotes that the underlying probability
distribution is given by (7.2)):
where $E[Y_t^{(1)}] = \sum_{h=1}^{m} E[Y_t^{(1)} \mid A_h(\tau)]\, \rho_h$.
The unconditional variance of $Y_t^{(\Pi)}$ can be calculated as
The first term in the expression for the variance reflects deviations around
the expected value; so it can be thought of as a measure of process
risk. The second term, instead, reflects deviations from the expected value
(i.e. systematic deviations) and so it may be thought of as a measure of
longevity (namely parameter, in our example) risk. Under the unconditional
The first term under the square root shows that random fluctuations rep-
resent a pooling risk, since (in relative terms) their effect is absorbed by
the size of the portfolio. This result is similar to that obtained under the
valuation conditional on a given mortality trend (see (7.13)). The second
term, instead, shows that systematic deviations constitute a non-pooling
risk, which is not affected by changes in the portfolio size. In particular, the
asymptotic value of the risk index
\[ \lim_{n_t \nearrow \infty} CV[Y_t^{(\Pi)} \mid n_t] = \sqrt{\frac{Var_\rho\left[E[Y_t^{(1)} \mid \tilde{A}(\tau)]\right]}{E^2[Y_t^{(1)}]}} \tag{7.19} \]
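The split into pooling and non-pooling components, and the limit in (7.19), can be reproduced with the law of total variance. In the sketch below the weights are those tabulated for the five assumptions, while the conditional means and variances of the individual present value are hypothetical numbers chosen only for illustration.

```python
import math

WEIGHTS = [0.1, 0.1, 0.6, 0.1, 0.1]  # weights rho_h on A1, ..., A5
# Hypothetical conditional moments of Y^(1) under each assumption (illustrative only).
COND_MEAN = [13.5, 14.4, 15.3, 16.0, 16.9]
COND_VAR = [42.0, 40.0, 38.0, 36.0, 34.0]


def unconditional_moments(n):
    """Return (mean, pooling variance, non-pooling variance) for n annuitants."""
    mean1 = sum(r * m for r, m in zip(WEIGHTS, COND_MEAN))
    # Expected conditional variance: grows linearly in n (process risk).
    pooling = n * sum(r * s for r, s in zip(WEIGHTS, COND_VAR))
    # Variance of the conditional mean: grows with n squared (parameter risk).
    non_pooling = n ** 2 * sum(r * (m - mean1) ** 2
                               for r, m in zip(WEIGHTS, COND_MEAN))
    return n * mean1, pooling, non_pooling


def risk_index(n):
    mean, pooling, non_pooling = unconditional_moments(n)
    return math.sqrt(pooling + non_pooling) / mean


def limiting_risk_index():
    """Right-hand side of (7.19): the part of the risk index that does not pool."""
    mean1 = sum(r * m for r, m in zip(WEIGHTS, COND_MEAN))
    var_of_mean = sum(r * (m - mean1) ** 2 for r, m in zip(WEIGHTS, COND_MEAN))
    return math.sqrt(var_of_mean) / mean1
```

As n grows, `risk_index(n)` decreases towards `limiting_risk_index()` rather than towards zero, which is the sense in which systematic deviations are a non-pooling risk.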
Assumption Weight ρh
A1 (τ) 0.1
A2 (τ) 0.1
A3 (τ) 0.6
A4 (τ) 0.1
A5 (τ) 0.1
0 15.290
5 12.985
10 10.625
15 8.317
20 6.187
25 4.353
30 2.894
35 1.824
In Table 7.10, the unconditional variance of $Y_t^{(\Pi)}$ for some portfolio
sizes is shown, split into the pooling and non-pooling components. For
comparison with the conditional valuation, the case n0 = 1 is also quoted.
We note the increase in the magnitude of the variance, due to the non-pooling
part, as the portfolio size increases. Whenever the portfolio is large
at policy issue, the non-pooling component remains important relative to
the pooling component even at high policy durations.
The behaviour of the coefficient of variation in respect of the portfolio
size is illustrated in Table 7.11. When compared with the case allowing for
process risk only (see Table 7.4), the risk index decreases more slowly as the
portfolio size increases. We note, in particular, its positive limiting value,
which is evidence of the magnitude of the systematic risk.
In Table 7.12 the right tail of the distribution of the present value of
future payments is investigated, for some portfolio sizes. We note that the
tail is rather heavier than in the case allowing for process risk only (see
Table 7.5).
Finally, in Tables 7.13–7.15 the distribution of annual outflows is inves-
tigated. Similar remarks hold to those made above for the distribution of
future payments.
Time t    $E[B_t^{(\Pi)}]$
5         963.986
10        900.924
15        795.459
20        633.446
25        419.899
30        203.682
35        60.162
Example 7.3 In this example, we compare the right tail of the distribution
of the present value of future payments assuming the alternative weighting
systems for (7.2) that are presented in Table 7.16. System (a) is the one
allowing for process risk only (see Example 7.1). System (b) is the one
adopted in Example 7.2. System (c) is similar to (b), with the highest weight
assigned to the best-estimate assumption; however, such weight has been
reduced. System (d), finally, consists of a uniform distribution of weights.
We focus on the right tail of the distribution of the present value of future
payments (and not on the other risk measures considered previously, such
as the risk index) due to its practical importance. Actually, reserving or
capital allocation could be based on this quantity (see also Section 7.3.3).
From the details presented for this example in Table 7.17, it seems that
whenever parameter risk is allowed for, the magnitude of the right tail is
not deeply affected by the weighting system (although, of course, the actual
figure does depend on the specific weights). Indeed, a clear difference
emerges between the results found under system (a), on the one hand, and
systems (b)–(d), on the other. This suggests that, when information is poor,
allowing for longevity risk is more important than the actual choice
of the weights.
Table 7.17. Some (unconditional) percentiles of the present value of future payments, per unit
of expected value: $y_{t,\varepsilon}[n_t] / E[Y_t^{(\Pi)} \mid n_t]$, under alternative weighting systems; n0 = 1,000
Probability
Time t ε = 0.75 ε = 0.90 ε = 0.95 ε = 0.99
System (a)
0 0.635% 1.286% 1.631% 2.286%
5 0.820% 1.531% 1.934% 2.668%
10 0.898% 1.923% 2.423% 3.386%
15 1.131% 2.221% 2.854% 4.472%
20 1.354% 2.692% 3.781% 6.223%
25 2.117% 4.281% 5.443% 7.967%
30 3.638% 7.355% 9.765% 14.334%
35 9.155% 18.426% 22.253% 31.641%
System (b)
0 1.057% 5.164% 7.357% 8.591%
5 1.378% 6.300% 9.654% 11.276%
10 1.878% 7.501% 12.471% 14.889%
15 2.108% 9.426% 16.712% 19.332%
20 2.797% 11.838% 22.206% 25.939%
25 3.417% 14.771% 29.772% 34.880%
30 5.100% 19.791% 37.774% 47.734%
35 14.933% 27.796% 48.299% 72.891%
System (c)
0 2.912% 6.785% 7.652% 8.539%
5 3.206% 8.850% 9.822% 11.384%
10 3.615% 11.893% 13.119% 15.178%
15 3.881% 15.645% 17.404% 19.643%
20 4.258% 20.997% 23.363% 26.253%
25 5.047% 27.391% 31.364% 35.474%
30 6.192% 35.431% 41.423% 49.504%
35 16.809% 41.965% 54.794% 74.366%
System (d)
0 3.697% 7.142% 7.642% 8.508%
5 4.609% 9.408% 10.362% 11.702%
10 5.079% 12.195% 13.497% 15.037%
15 5.480% 16.397% 17.687% 19.671%
20 5.929% 21.825% 23.725% 26.542%
25 7.025% 29.249% 32.239% 35.218%
30 8.782% 36.966% 42.059% 50.565%
35 19.443% 46.965% 60.689% 74.138%
We have noted that the most important aspect is to allow for parameter
risk by assigning positive weights to trend assumptions alternative to the
best-estimate one. However, the specific weights do affect the magnitude of
quantities of interest (such as the tail of the distribution of future payments).
Note that
\[ f_t(z) = \sum_{h=1}^{m} f_t(z \mid A_h(\tau))\, \rho(A_h(\tau)) \tag{7.24} \]
represents the (prior) predictive pdf restricted to the age interval [t, ω − x0].
Assume now the observation period [t, t′]. Let d denote the number of
deaths observed in this period. With an appropriate renumbering, let
\[ \mathbf{x} = \{x^{(1)}, x^{(2)}, \dots, x^{(d)}\} \tag{7.25} \]
denote the array of ages at death. We note that the defined observation
procedure implies a Type I-censored sampling (see, for instance, Namboodiri
and Suchindran (1987)).
Using the information provided by the pair (d, x), the (posterior) predic-
tive pdf ft (z|d, x) can be constructed. With this objective in mind, we can
adopt the following procedure (usual in the Bayesian context):
7.3 Managing the longevity risk 293
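Step 1 requires the construction of the likelihood function L(Ah(τ)|d, x).
We have (see, e.g. Namboodiri and Suchindran (1987)):
\[ L(A_h(\tau) \mid d, \mathbf{x}) \propto \left[ \prod_{k=1}^{d} f_t(x^{(k)} - t \mid A_h(\tau)) \right] \left( \frac{S(t' \mid A_h(\tau))}{S(t \mid A_h(\tau))} \right)^{n_t - d} \tag{7.28} \]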
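To illustrate how a likelihood of the form (7.28) can update the weights, the sketch below applies Bayes' rule (posterior weight proportional to prior weight times likelihood) in a discrete-time setting. Several elements are assumptions made for the example: the Gompertz survival function, the five scenario parameters, the recording of deaths by policy year rather than by exact age, and the observed data.

```python
import math


def survival(t, B, c=1.1, x0=65):
    """S(t): probability of surviving t years from age x0 (assumed Gompertz law)."""
    return math.exp(-B * c ** x0 * (c ** t - 1) / math.log(c))


def log_likelihood(B, t, t_end, n_t, death_years):
    """Discrete-time analogue of (7.28).

    Each death in policy year k contributes (S(k-1) - S(k)) / S(t);
    each of the n_t - d survivors contributes S(t_end) / S(t).
    """
    ll = 0.0
    for k in death_years:
        ll += math.log((survival(k - 1, B) - survival(k, B)) / survival(t, B))
    ll += (n_t - len(death_years)) * math.log(survival(t_end, B) / survival(t, B))
    return ll


def posterior_weights(prior, scenarios, t, t_end, n_t, death_years):
    """Posterior weights: prior times likelihood, normalized to sum to one."""
    logs = [math.log(p) + log_likelihood(B, t, t_end, n_t, death_years)
            for p, B in zip(prior, scenarios)]
    top = max(logs)  # subtract the maximum to avoid numerical underflow
    unnorm = [math.exp(value - top) for value in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]


prior = [0.1, 0.1, 0.6, 0.1, 0.1]
# Hypothetical Gompertz levels for five scenarios, from highest to lowest mortality.
scenarios = [3.5e-5, 3.0e-5, 2.5e-5, 2.0e-5, 1.5e-5]
# Hypothetical experience: 40 deaths among 1,000 annuitants over five policy years.
death_years = [1] * 6 + [2] * 7 + [3] * 8 + [4] * 9 + [5] * 10
posterior = posterior_weights(prior, scenarios, t=0, t_end=5,
                              n_t=1_000, death_years=death_years)
```

Because the assumed experience shows fewer deaths than the central scenario would suggest, the posterior shifts weight towards the low-mortality scenarios.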
Several tools can be developed to manage longevity risk. These tools can be
placed and analysed in a risk management (RM) framework.
As sketched in Fig. 7.4, the RM process consists of three basic steps,
namely the identification of risks, the assessment (or measurement) of the
relevant consequences, and the choice of the RM techniques. In what fol-
lows we refer to the RM process applied to life insurance, in general, and
to life annuity portfolios, in particular.
The identification of risks affecting an insurer can follow, for example, the
guidelines provided by IAA (2004) or those provided within the Solvency 2
project (see CEIOPS, 2007 and CEIOPS, 2008). Mortality/longevity risks
belong to underwriting risks; the relevant components have already been
discussed (see Section 7.2.1). Obviously, for an insurer the importance of
the longevity risk within the class of mortality risks is strictly related to the
relative weight of the life annuity portfolio with respect to the overall life
business.
[Figure residue (diagram). Recoverable labels: stochastic models; risk index, VaR, probability of default, ...; risk management techniques; portfolio strategies.]
A rigorous assessment of the longevity risk requires the use of stochastic
models (i.e. approach 5 in Fig. 7.2). In Section 7.2.4 we have provided
some examples of risk measurement, viz the variance, the coefficient of
variation, and the right tail of liabilities – these need to be appropriately
defined; in Section 7.2.4 they were stated in terms of the present value of
future payments and of annual outflows. A further example is given by
Figure 7.5. Annual outflows in a portfolio of immediate life annuities (one cohort). [Plot of annual outflows against time, showing actual outflows, expected values and a threshold.]
The situation occurring in Fig. 7.5, namely, some annual outflows being
above the threshold level, should be clearly avoided. To lower the proba-
bility of such critical situations, the insurer can resort to various portfolio
strategies, in the framework of the RM process.
Figure 7.6 illustrates a wide range of portfolio strategies which aim at risk
mitigation, in terms of lowering the probability and the severity of events
like the situation depicted in Fig. 7.5. In practical terms, a portfolio strategy
can have as targets
Both loss control and loss financing techniques (according to the RM lan-
guage) can be adopted to achieve targets (i) and (ii). Loss control techniques
are mainly performed via the product design, that is, via an appropriate
choice of the various items which constitute an insurance product. In par-
ticular, loss prevention is usually interpreted as the RM technique which
aims to mitigate the loss frequency, whereas loss reduction aims at lowering
the severity of the possible losses.
The pricing of insurance products provides a tool for loss prevention. This
portfolio strategy is represented by path (1) → (a) in Fig. 7.6. Referring to
[Figure residue (diagram of portfolio strategies). Recoverable labels: (1) Single premiums; (2) Allocation; (3) Undistributed profits; (8) Longevity bonds; (a) Threshold; Reserve; Shareholders' capital.]
It should be stressed, however, that such risk transfer solutions mainly rely on
the improved diversification of risks when these are taken by the reinsurer,
thanks to a stronger pooling effect. Notably, such an improvement can be
achieved in relation to process risk (i.e. random fluctuations in the number
of deaths), whilst uncertainty risk (leading to systematic deviations)
cannot be diversified ‘inside’ the insurance–reinsurance process. Hence, to
become more effective, reinsurance transfers must be complemented by a
further transfer, that is, a transfer to capital markets. Such a transfer can
be realized via bonds whose yield is linked to some mortality/longevity
index, so that the bonds themselves generate flows which hedge the payment
of life annuity benefits. While mortality bonds (hedging the risk of
mortality higher than expected) already exist, longevity bonds (hedging
the risk of mortality lower than expected) are yet to appear in the
market.
To the extent that mortality/longevity risks are retained by an insurer, the
impact of a poor experience falls on the insurer itself. To meet an unexpected
amount of obligations, an appropriate level of advance funding may provide
substantial help. To this purpose, shareholders’ capital must be allocated
to the life annuity portfolio (path (2) → (a), as well as (3) → (a) in Fig. 7.6),
and the relevant amount should be determined so as to achieve insurer solvency.
Conversely, the expression ‘no advance funding’ (see Fig. 7.4) refers to
situations where no specific capital allocation is provided in
respect of mortality/longevity risks. In the case of adverse experience, the
unexpected amount of obligations has to be met (at least partially) by the
available residual assets, which are not tied to specific liabilities.
Hedging strategies in general consist of assuming the existence of a risk
which offsets another risk borne by the insurer. In some cases, hedging
strategies involve various portfolios or lines of business (LOBs), or even
the whole insurance company, so that they cannot be placed in the port-
folio framework as depicted in Fig. 7.6. In particular, natural hedging (see
Fig. 7.4) consists of offsetting risks in different LOBs. For example, writing
both life insurance providing death benefits and life annuities for similar
groups of policyholders may help to provide a hedge against longevity risk.
Such a hedge is usually named across LOBs. A natural hedge can be realized
even inside a life annuity portfolio, allowing for a death benefit (possibly
decreasing as the age at death increases) combined with the life annuity;
see Section 1.6.4. Clearly, in the case of a higher than anticipated mortality
improvement, death benefits which are lower than expected will be paid.
Such a hedge is usually called across time.
Clearly, mortality/longevity risks should be managed by the insurer
through an appropriate mix of the tools described above. The choice of the
where $C_t^{(j)}$ is the death benefit payable at time t if death occurs in (t − 1, t),
defined as follows
\[ C_t^{(j)} = b^{(j)}\, a_{x_0+t}^{[A]} = b^{(j)} \sum_{h=1}^{\omega-x_0-t} (1+i)^{-h}\, {}_h p_{x_0+t}^{[A]} \tag{7.30} \]
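As a numerical illustration of (7.30), the sketch below evaluates the death benefit as the actuarial value of the residual annuity. The Gompertz survival probabilities are a hypothetical stand-in for the mortality assumption A(τ); x0 = 65, ω = 117 and i = 0.03 follow the setting used in the examples of this chapter.

```python
import math


def one_year_survival(x, omega=117, B=2.5e-5, c=1.1):
    """p_x under an assumed Gompertz law; nobody survives beyond age omega."""
    if x >= omega:
        return 0.0
    return math.exp(-B * c ** x * (c - 1) / math.log(c))


def death_benefit(t, b=1.0, i=0.03, x0=65, omega=117):
    """C_t = b * sum_{h=1}^{omega - x0 - t} (1 + i)^(-h) * hp_{x0 + t}, as in (7.30)."""
    total, hp = 0.0, 1.0
    for h in range(1, omega - x0 - t + 1):
        hp *= one_year_survival(x0 + t + h - 1)  # h-year survival from age x0 + t
        total += hp / (1.0 + i) ** h
    return b * total


benefits = {t: death_benefit(t) for t in (0, 10, 20, 30)}
```

The benefit decreases with the policy duration and vanishes at the maximum duration ω − x0, which is consistent with a death benefit that falls as the age at death increases.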
The benefit $C_t^{(j)}$ is therefore the mathematical reserve set up at time t to meet
the life annuity benefit, calculated according to the mortality assumption
A(τ) and the annual interest rate i. Note that the individual reserve (meeting
both the life annuity and the death benefit) to be set up at time t according
From the point of view of the annuitant, the previous policy structure has
the advantage of paying back the assets (in terms of the amount stated under
policy conditions) remaining at her/his death, hence meeting bequest expec-
tations. On the other hand, the death benefit is rather expensive. Further
solutions can be studied, in order to reconcile the risk reduction purposes
of the insurer with the request by the annuitant for a high level of the ratio
between the annual amount and the single premium. However, the lower is
the death benefit, the lower is the risk reduction gained by the insurer. To
Table 7.18. Coefficient of variation of the present value of future payments, conditional on
the best-estimate scenario: $CV[Y_t^{(\Pi)} \mid A_3(\tau), n_t]$, in the presence of death benefit (7.30)
Table 7.20. Coefficient of variation of the present value of future payments, conditional
on the best-estimate scenario: $CV[Y_t^{(\Pi)} \mid A_3(\tau), n_t]$, in the presence of death benefit
(7.32)
years; of course, when the death benefit is zero, we find again the case of the
stand-alone life annuity benefit. The risk reduction is lower than in Exam-
ple 7.4, due to the lower death benefit. The increase in the single premium
required at age 65 is lower as well; according to the usual pricing basis
(i = 0.03, mortality assumption A3 (τ)), a 7.173% increase is required with
respect to the case of the stand-alone life annuity.
\[ W_t = W_{t-1}(1 + i_t) - B_t^{(\Pi)}; \quad t = z+1, z+2, \dots \tag{7.33} \]
represents the assets available to meet the residual risks having allowed
for those risks met by the portfolio reserve; shortly, we will refer to Wt
as the total portfolio assets and to Mt as the capital assets in the portfolio
(conversely, Wt − Mt represents assets backing the portfolio reserve).
In line with common practice, we consider solvency to be the ability of
the insurer to meet, with an assigned (high) probability, random liabilities
as they are described by a realistic probabilistic structure. To implement
such a concept, choices are needed in respect of the following items:
1. The quantity expressing the ability of the insurer to meet liabilities; rea-
sonable choices are either the total portfolio assets Wt or, as it is more
usual in practice, the capital assets, Mt , which (clearly) is supposed to
be positive when the insurer is solvent.
2. The time span T which the above results are referred to; it may range
from a short-medium term (1–5 years, say), to the residual duration of
the portfolio.
3. The timing of the results, in particular annual results (e.g. the amount of
portfolio assets at every integer time within T years) versus single figure
results (e.g. the amount of portfolio assets at the end of the time horizon
under consideration, that is, after T years).
Further choices concern how to define the portfolio (just in-force poli-
cies or also future entrants). To make these choices, the point of view from
which solvency is ascertained must be stated. Policyholders, investors and
the supervisory authority represent possible viewpoints in respect of the
insurance business. However, the perspectives of the (current or poten-
tial) policyholders and investors involve profitability requirements possibly
higher than those implied by the need of just meeting current liabilities.
Such requirements would lead to a concept of insurer’s solidity, rather
than solvency. So, we restrict our attention to the supervisory authority’s
perspective.
The supervisory authority is charged mainly with protecting the interests of
current policyholders. So a run-off approach should be adopted (hence
disregarding future entrants). Further, no profit release should be allowed
for within the solvency time-horizon T, nor should any need for capital
allocation be delayed.
Let z be the time at which solvency is ascertained (z = 0, 1, . . . ). The
capital required at time z could be assessed according to one of the following
(alternative) models
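\[ P\left[ \bigcap_{t=z+1}^{z+T} \{ M_t \ge 0 \} \right] = 1 - \varepsilon_1 \tag{7.36} \]
\[ P[M_{z+T} \ge 0] = 1 - \varepsilon_2 \tag{7.37} \]
\[ P\left[ \bigcap_{t=z+1}^{z+T} \{ W_t - Y_t^{(\Pi)} \ge 0 \} \right] = 1 - \varepsilon_3 \tag{7.38} \]
\[ W_t = W_z \frac{1}{v(z,t)} - \sum_{h=z+1}^{t} B_h^{(\Pi)} \frac{1}{v(h,t)} \tag{7.39} \]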
where
\[ \frac{1}{v(h,k)} = (1+i_{h+1})(1+i_{h+2}) \cdots (1+i_k) \tag{7.40} \]
and v(h, k) is the discount factor, based on the annual investment yields, from time k to
time h. Referring to one cohort only, the quantity $Y_t^{(\Pi)}$ can also be written
as (see (7.10))
\[ Y_t^{(\Pi)} = \sum_{h=t+1}^{\omega-x_0} B_h^{(\Pi)}\, v(t,h) \tag{7.42} \]
or also as
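\[ P\left[ \bigcap_{t=z+1}^{z+T} \left\{ (1+i)^{t-(\omega+1-x_0)} \left( W_z (1+i)^{\omega+1-x_0-z} - \sum_{h=z+1}^{\omega-x_0} B_h^{(\Pi)} (1+i)^{\omega+1-x_0-h} \right) \ge 0 \right\} \right] = 1 - \varepsilon_3 \tag{7.45} \]
We note that
\[ W_z (1+i)^{\omega+1-x_0-z} - \sum_{h=z+1}^{\omega-x_0} B_h^{(\Pi)} (1+i)^{\omega+1-x_0-h} = W_{\omega+1-x_0} \tag{7.46} \]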
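Identity (7.46) can be verified numerically by comparing the year-by-year recursion (7.33), with a constant yield i, against the closed form. In the sketch below the initial assets and the outflow path are hypothetical; the only structural assumption is that no payment falls due at duration ω + 1 − x0, since no annuitant is alive beyond age ω.

```python
def assets_by_recursion(w_z, outflows, i, z, horizon):
    """Apply W_t = W_{t-1} (1 + i) - B_t for t = z + 1, ..., horizon."""
    w = w_z
    for t in range(z + 1, horizon + 1):
        w = w * (1.0 + i) - outflows.get(t, 0.0)
    return w


def assets_closed_form(w_z, outflows, i, z, horizon):
    """Left-hand side of (7.46), with horizon = omega + 1 - x0."""
    accumulated = w_z * (1.0 + i) ** (horizon - z)
    paid = sum(b * (1.0 + i) ** (horizon - h)
               for h, b in outflows.items() if h > z)
    return accumulated - paid


x0, omega, i, z = 65, 117, 0.03, 0
horizon = omega + 1 - x0
# Hypothetical outflow path: payments at durations 1, ..., omega - x0, none afterwards.
outflows = {h: 1000.0 * 0.93 ** h for h in range(1, omega - x0 + 1)}
w_rec = assets_by_recursion(15_000.0, outflows, i, z, horizon)
w_closed = assets_closed_form(15_000.0, outflows, i, z, horizon)
```

The two values agree up to floating-point rounding, so the condition inside (7.45) depends only on the sign of the terminal assets.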
\[ P[W_{\omega+1-x_0} \ge 0] = 1 - \varepsilon_3 \tag{7.48} \]
Example 7.6 Let us adopt the inputs of Example 7.2; so, in particular, we
refer to a homogeneous cohort. To focus on mortality, we disregard finan-
cial risk; so we set it = i = 0.03 for all t (i = 0.03 is adopted in the reserving
basis as well). To facilitate the comparisons among the results obtained
under the different requirements, we define the individual reserve as the
\[ V_t^{(j)} = E[Y_t^{(j)} \mid A_3(\tau)] \tag{7.49} \]
Time z    Reserve $V_z^{(1)}$
0         15.259
5         12.956
10        10.599
15        8.294
20        6.167
25        4.336
30        2.877
35        1.807
Further, the same default probability is set for all the requirements, so
ε1 = ε3 = 0.005. Such a level has been chosen to be consistent with the
developing Solvency 2 system (see CEIOPS (2007) and CEIOPS (2008)).
We note that under Solvency 2 a risk margin should be added to (7.49),
calculated according to the Cost of Capital approach; see CEIOPS (2007)
and CEIOPS (2008) for details.
Table 7.22 quotes the individual reserve. Clearly, at any time z the portfolio
reserve is simply $V_z^{(\Pi)} = n_z V_z^{(1)}$, where $V_z^{(1)}$ is the reserve at time z
for a generic annuitant.
In Table 7.23, we state the amount of the capital (per unit of portfolio
reserve) required according to (7.36) and (7.38) for several portfolio sizes.
For (7.36), the maximum possible time-horizon has been chosen. As we
would expect from the previous discussion, the two requirements lead to
similar outputs, at least when mortality only is addressed. In this case, at
least, the outputs suggest that requirement (7.36) is to some extent independent
of the reserve when T takes the maximum possible value for the
time-horizon. It should be stressed that in our investigation no risk margin
is included in $V_z^{(\Pi)}$. Thus, a share of the required capital quoted in
Table 7.23 should be included in the reserve and, possibly, charged to annuitants
through an appropriate safety loading at the issue of the policy. When
interpreting the size of the required capital per unit of the portfolio reserve,
we also point out that the reserve is lower than what would be required by
the supervisory authority, and so the ratios in Table 7.23 would be higher
than what we would find in practice.
Table 7.23. Required capital based on requirements (7.36) and (7.38), facing longevity risk and
mortality random fluctuations
Table 7.24. Required capital based on requirement (7.36), per unit of portfolio reserve:
$M_z^{[R1]}(T) / V_z^{(\Pi)}$, facing longevity risk and mortality random fluctuations
          Time-horizon T = 1                        Time-horizon T = 3
Time z    n0 = 100   n0 = 1,000   n0 = 10,000    n0 = 100   n0 = 1,000   n0 = 10,000
Table 7.25. Required capital based on requirements (7.36) and (7.38), facing mortality
random fluctuations only; mortality assumption A3 (τ)
risk. The capital required to deal with such risk is the change expected in
the net asset value against a permanent reduction by 25% in the current
and all future mortality rates (we do not discuss further details, such as pos-
sible reductions of this amount; see CEIOPS (2007) and CEIOPS (2008)).
Under our hypotheses (we are considering just one cohort, there is no profit
participation, we are disregarding risks other than those deriving from mor-
tality, and so on), the requirement reduces to the difference between the
best-estimate reserve and a reserve set up with a mortality table embedding
probabilities of death 25% lower than in the best-estimate assumption. The
relevant results are quoted in Table 7.27, where the required capital at time
z is denoted by Mz[Solv2] . It is clear that, in relative terms, such an amount is
independent of the portfolio size. We further recall that, under Solvency 2,
no specific capital allocation is required for the risk of random fluctuations,
since they are treated as hedgeable risks.
Required capital
Time z    $M_z^{[R1]}(\omega+1-x_0-z) / V_z^{(\Pi)}$    $M_z^{[R3]} / V_z^{(\Pi)}$    $M_z^{[R1]}(1) / V_z^{(\Pi)}$    $M_z^{[R1]}(3) / V_z^{(\Pi)}$

Time z    $M_z^{[Solv2]} / V_z^{(\Pi)}$
0         7.274%
5         9.080%
10        11.377%
15        14.293%
20        18.000%
25        22.767%
30        29.102%
35        38.065%
Tables 7.26 and 7.27 may suggest that a deterministic approach can be
adopted for allocating capital to deal with longevity risk. In particular, the
assessment of the required capital could be based on a comparison between
the actual reserve and a reserve calculated under a more severe mortality
trend assumption (as turns out to be the case under Solvency 2).
Let $V_z^{(\Pi)[B]}$ be a reserve calculated according to the same valuation principle
adopted for $V_z^{(\Pi)}$ (the equivalence principle, in our implementation),
but based on a worse mortality assumption, so that
\[ V_z^{(\Pi)} \le V_z^{(\Pi)[B]} \tag{7.50} \]
The required capital would be
\[ M_z^{[R4]} = V_z^{(\Pi)[B]} - V_z^{(\Pi)} \tag{7.51} \]
We note that requirement (7.51) would deal with longevity risk only. Further,
no default probability is explicitly mentioned; however, the mortality
assumption adopted in $V_z^{(\Pi)[B]}$ clearly implies some (not explicit) default
probability. The time-horizon implicitly considered is the maximum residual
duration of the portfolio, given that this is the time-horizon referred
to in the calculation of the reserve. We also point out that, to simplify the
assessment of the required capital and to avoid any duplication of risk margins
as well, it is reasonable that reserves in (7.51) are actually based on the
equivalence principle. So the required capital $M_z^{[R4]}$ turns out to be linear
in respect of the portfolio size $n_z$.
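A deterministic requirement of the form (7.51) is straightforward to compute once the two mortality bases are fixed. The sketch below is illustrative: the best-estimate basis is a hypothetical Gompertz law, and the more severe basis is obtained by reducing all death probabilities by 25%, mirroring the stress mentioned for Solvency 2; neither reproduces the tables in the text.

```python
import math


def annuity_value(age, shock=0.0, i=0.03, omega=117, B=2.5e-5, c=1.1):
    """Expected present value of a unit annuity in arrears at the given age.

    Death probabilities come from an assumed Gompertz law and are scaled by
    (1 - shock) to represent a more severe longevity assumption.
    """
    total, surv = 0.0, 1.0
    for x in range(age, omega):
        q = 1.0 - math.exp(-B * c ** x * (c - 1) / math.log(c))
        surv *= 1.0 - (1.0 - shock) * q  # survival to age x + 1
        total += surv / (1.0 + i) ** (x - age + 1)
    return total


def required_capital_r4(n, age=65, shock=0.25):
    """Return (capital, best-estimate reserve) for n annuitants, as in (7.51)."""
    best = n * annuity_value(age)             # reserve on the best-estimate basis
    stressed = n * annuity_value(age, shock)  # reserve on the stressed basis
    return stressed - best, best


m_small, v_small = required_capital_r4(100)
m_large, v_large = required_capital_r4(10_000)
```

The capital is proportional to the number of annuitants, so the ratio of capital to reserve is the same for both portfolio sizes, in line with the linearity noted above.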
To compare requirements (7.36) and (7.38) with (7.51), let us define the
following ratios:
\[ QM_z^{[R1]}(T; n_z) = \frac{M_z^{[R1]}(T)}{V_z^{(\Pi)}} \tag{7.52} \]
\[ QM_z^{[R3]}(n_z) = \frac{M_z^{[R3]}}{V_z^{(\Pi)}} \tag{7.53} \]
\[ QV_z = \frac{M_z^{[R4]}}{V_z^{(\Pi)}} \tag{7.54} \]
Example 7.7 Figure 7.7 plots the ratios (7.53) and (7.54), for several
portfolio sizes, based on calculations performed at time 0. In particular:
Figure 7.7. Ratios $QM_0^{[R3]}(n_0)$ and $QV_0$. (1): $QM_0^{[R3]}(n_0)$; (2): $QV_0$, with $V_0^{(\Pi)[A_5(\tau)]}$; (3):
$QM_0^{[R3]}(n_0)$, with $M_0^{[R3]}$ accounting for random fluctuations only; (4): $QV_0 + QM_0^{[R3]}(n_0)$, with
$M_0^{[R3]}$ accounting for random fluctuations only. [Horizontal axis: portfolio size, from 0 to 12,000; vertical axis: required capital, per unit of reserve, from 0% to 12%.]
We first note that the outputs found under case (2) are very similar to
(indeed, in our example they coincide with) those found adopting requirement
(7.38), as well as requirement (7.36) with $T = \omega + 1 - x_0$ (the ratio
$QV_0$, with $V_0^{(\Pi)[A_5(\tau)]}$, plotted in Fig. 7.7, amounts to 7.562%
for each portfolio size; compare this outcome with the ratios
$QM_0^{[R1]}(\omega + 1 - x_0; n_0) = M_0^{[R1]}(\omega + 1 - x_0)/V_0^{(\Pi)}$
and $QM_0^{[R3]}(n_0) = M_0^{[R3]}/V_0^{(\Pi)}$ in Table 7.26).
This is explained by the fact that the (left) tail of the distribution of assets
(addressed in (7.38) and (7.36)) is heavily affected by the worst scenario
($A_5(\tau)$, in our example) when low probabilities (of default) are addressed.
Thus, when allowing for longevity risk only, requirement (7.36) adopted
with the maximum possible time-horizon and requirement (7.38) reduce to
(7.51). This is why a practicable idea could be to split the capital allocation
process in two steps:
Case (4) in Fig. 7.7 is intended to represent such a choice. We note, however,
that an unnecessary allocation may result from this procedure; as we have
already commented, working separately on the components of mortality
risk is improper and may lead to an inaccurate capital allocation.
[Figure 7.8. Required capital, per unit of reserve: $QM_0^{[R3]}(n_0)$,
plotted against portfolio size for portfolios 1 to 5.]
model, which does not explicitly account for any dependence of the lifetime
of the individual on her/his annual amount.
[Figure 7.9. Required capital, per unit of reserve: $QM_0^{[R3]}(n_0)$, facing
mortality random fluctuations only; mortality assumption $A_3(\tau)$. Plotted
against portfolio size for portfolios 1 to 5.]
[Figure: annuitants 1, 2, ..., n plotted against lifetime (ages 65 and 85
marked), with the reinsurer's intervention indicated.]
reinsurer only if it were compulsory for some annuity providers. This could
be the case, for example, with pension funds, which may be forced by the
supervisory authority to back their liabilities through arrangements with
(re-)insurers.
The XL arrangement is clearly defined on a long-term basis, so imply-
ing a heavy longevity risk charged to the reinsurer. In more realistic terms,
reinsurance arrangements defined on a short-medium period basis could
be addressed. With this objective in mind, stop-loss arrangements could
provide interesting solutions. According to the stop-loss rationale, the rein-
surer’s interventions are aimed at preventing the default of the cedant,
caused by (systematic) mortality deviations.
The effect of mortality deviations can be identified, in particular, by com-
paring the total portfolio assets at a given time with the portfolio reserve
required to meet the insurer’s obligations. A Stop-Loss reinsurance on assets
can then be designed, according to which the reinsurer funds (at least par-
tially) the possible deficiency in assets; Fig. 7.11 sketches this idea (in a
run-off perspective).
Let z be the time of issue (or revision) of the reinsurance arrangement.
Adopting the notation introduced earlier, in practical terms the reinsurer’s
intervention can be limited to the case
$$W_{z+k} < (1 - \pi)\, V_{z+k}^{(\Pi)}, \qquad \pi \ge 0 \tag{7.55}$$
where the amount $\pi V_{z+k}^{(\Pi)}$ represents the ‘priority’ of the
stop-loss treaty and $k$ is a given number of years. We note that setting
$\pi > 0$ may limit the possibility of random fluctuations being transferred.
However, thanks to the fact that the assets and the reserve of a life annuity
portfolio have long-term features, the flows of the arrangement should not be
heavily affected by random fluctuations, at least up to some time. In fact,
close to the natural maturity of the portfolio we may expect that random
fluctuations become predominant relative to systematic deviations; see also
Section 7.2.4. Setting $k > 1$ (e.g. $k = 3$ or $k = 5$) ensures that the
reinsurer intervenes in the more severe situations, and not when the lack of
assets may be recovered by the subsequent flows of the portfolio. However, $k$
should not be set too high, otherwise the funding to the cedant in the critical
cases would turn out to be too delayed in time.
[Figure 7.11: required portfolio reserve and assets available, plotted against
time.]
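The trigger (7.55) can be sketched in a few lines of Python. The text only states when the reinsurer intervenes and that the deficiency is funded at least partially; the size of the payment below (a share of the gap between $(1 - \pi) V_{z+k}^{(\Pi)}$ and the assets) is our assumption, and the function and parameter names are hypothetical.

```python
def asset_stop_loss_payment(assets, reserve, pi, share=1.0):
    """Reinsurer's payment at time z+k under trigger (7.55).

    The payment is triggered only if assets < (1 - pi) * reserve.
    Assumption: the reinsurer funds `share` (in [0, 1]) of the gap between
    the trigger level (1 - pi) * reserve and the available assets.
    """
    if pi < 0:
        raise ValueError("pi must be non-negative")
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must lie in [0, 1]")
    trigger = (1.0 - pi) * reserve
    if assets >= trigger:
        return 0.0  # no intervention: assets are above the trigger level
    return share * (trigger - assets)
```

For instance, with a reserve of 100 and $\pi = 0.1$, assets of 95 trigger no payment, whereas assets of 80 leave a gap of 10 relative to the trigger level of 90.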
A technical difficulty in this treaty concerns the definition of assets and
reserve to be referred to for ascertaining the loss. Further, some control of
the investment policy adopted by the cedant in relation to these assets could
be requested by the reinsurer. For these reasons, the treaty can be conceived
as an ‘internal’ arrangement, that is, within an insurance group (where the
holding company takes the role of the reinsurer of affiliates) or when there
is some partnership between a pension fund and an insurance company (the
latter then acting as the reinsurer, the former as the cedant).
A Stop-Loss reinsurance may be designed on annual outflows, instead of
assets. The rationale, in this case, is that, at a given point in time, longevity
risk is perceived if the amount of benefits to be currently paid to annuitants
is (significantly) higher than expected. A transfer arrangement can then be
– Let z be the time of issue (or revision) of the arrangement. The time
horizon k of the reinsurance coverage should be stated, as well as the
timing of the possible reinsurer’s intervention within it. Within the time
horizon k, policy conditions (i.e. premium basis, mortality assumptions,
and so on) should be guaranteed. As to the timing of the intervention
of the reinsurer, since reference is to annual outflows, it is reasonable to
assume that a yearly timing is chosen. Hence, in the following, we will
make this assumption.
– The mortality assumption for calculating the expected value of the out-
flow, required to define the loss of the cedant. Reasonably, we will adopt
the current mortality table, which will be generically denoted as A(τ) in
what follows.
– The minimum amount $\Lambda'_t$ of benefits (at time $t$, $t = z + 1, z + 2,
  \dots, z + k$) below which there is no payment by the reinsurer. For example,
  $$\Lambda'_t = E[B_t^{(\Pi)} \mid A(\tau), n_z]\,(1 + r) = b\, E[N_t \mid A(\tau), n_z]\,(1 + r) \tag{7.56}$$
  with $r \ge 0$ and $b$ the annual amount for each annuitant; thus the amount
  $\Lambda'_t$ represents the priority of the Stop-Loss arrangement.
– The Stop-Loss upper limit, that is, an amount $\Lambda''_t$ such that
  $\Lambda''_t - \Lambda'_t$ is the maximum amount paid by the reinsurer at
  time $t$. From the point of view of the cedant, the amount $\Lambda''_t$
  should be set high enough so that only situations of extremely high
  survivorship are charged to the cedant. However, the reinsurer reasonably
  sets $\Lambda''_t$ in connection to the available hedging opportunities. We
  will come back to this issue in Section 7.4.3. As to the cedant, a further
  reinsurance arrangement may be underwritten, if available, for the residual
  risk, possibly with another reinsurer; in this case, the amount
  $\Lambda''_t - \Lambda'_t$ operates as the first layer.
[Figure: annual outflows against time, showing actual outflows, expected
values, the priority, the upper limit and the reinsurer's intervention.]
portfolio (viz. the number of living annuitants, joint to the annual amount
of their benefits). On the other hand, as already pointed out, it is more
difficult to avoid the transfer of random fluctuations as well.
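We now define in detail the flows paid by the reinsurer. Let $B_t^{(SL)}$
denote such flow at time $t$, $t = z + 1, z + 2, \dots, z + k$. We have
$$B_t^{(SL)} = \begin{cases}
0 & \text{if } B_t^{(\Pi)} \le \Lambda'_t \\
B_t^{(\Pi)} - \Lambda'_t & \text{if } \Lambda'_t < B_t^{(\Pi)} \le \Lambda''_t \\
\Lambda''_t - \Lambda'_t & \text{if } B_t^{(\Pi)} > \Lambda''_t
\end{cases} \tag{7.57}$$
The net outflow of the cedant at time $t$ (gross of the reinsurance premium),
denoted as $OF_t^{(SL)}$, is then
$$OF_t^{(SL)} = B_t^{(\Pi)} - B_t^{(SL)} = \begin{cases}
B_t^{(\Pi)} & \text{if } B_t^{(\Pi)} \le \Lambda'_t \\
\Lambda'_t & \text{if } \Lambda'_t < B_t^{(\Pi)} \le \Lambda''_t \\
B_t^{(\Pi)} - (\Lambda''_t - \Lambda'_t) & \text{if } B_t^{(\Pi)} > \Lambda''_t
\end{cases} \tag{7.58}$$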
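The piecewise definitions (7.57) and (7.58) translate directly into code. The following Python sketch is only an illustration: the function names are ours, and the priority helper mirrors the example (7.56), with the expected number of survivors supplied by the user rather than computed from a mortality table.

```python
def priority(b, expected_survivors, r):
    """Priority as in example (7.56): b * E[N_t] * (1 + r), with r >= 0."""
    if r < 0:
        raise ValueError("r must be non-negative")
    return b * expected_survivors * (1.0 + r)


def stop_loss_flow(benefits, prio, upper_limit):
    """Reinsurer's payment (7.57): zero below the priority, the excess over
    the priority in the layer, capped at upper_limit - prio above it."""
    if upper_limit < prio:
        raise ValueError("upper limit must not be below the priority")
    return min(max(benefits - prio, 0.0), upper_limit - prio)


def net_outflow_sl(benefits, prio, upper_limit):
    """Cedant's net outflow (7.58), gross of the reinsurance premium."""
    return benefits - stop_loss_flow(benefits, prio, upper_limit)
```

With a priority of 100 and an upper limit of 130, actual benefits of 90, 110 and 150 give reinsurance flows of 0, 10 and 30, and net outflows of 90, 100 and 120 respectively: the cedant is capped at the priority only as long as the benefits stay within the layer.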
The net outflow of the cedant is clearly random but, unless some ‘extreme’
survivorship event occurs, it is protected with a cap. It is interesting
(especially for comparison with the swap-like arrangement described
subsequently) to comment on this outflow. First of all, it must be stressed
that $B_t^{(\Pi)} \le \Lambda'_t$ represents a situation of profit or small
loss to the insurer. On the contrary, the event $B_t^{(\Pi)} > \Lambda''_t$
corresponds to a huge loss. Whenever $\Lambda'_t < B_t^{(\Pi)} \le \Lambda''_t$
a loss results for the insurer, whose severity may range from small (if
$B_t^{(\Pi)}$ is close to $\Lambda'_t$) to high (if $B_t^{(\Pi)}$ is close to
$\Lambda''_t$). So the effect of the Stop-Loss arrangement is to transfer to
the reinsurer all of the loss situations, except for the lowest and the
heaviest ones; any situation of profit, on the contrary, is kept by the cedant.
To reduce further randomness of the annual outflow, the cedant may be
willing to transfer to the reinsurer not only losses, but also profits. Thus,
a reinsurance-swap arrangement on annual outflows can be designed. Let
$B^*_t$ be a target value for the outflows of the insurer at time $t$,
$t = z + 1, z + 2, \dots, z + k$; for example,
$$B^*_t = E[B_t^{(\Pi)} \mid A(\tau), n_z] \tag{7.59}$$
[Figure: annual outflows against time, showing the actual outflow, the target
outflow (level $b\,n_0$ marked) and the flows from/to the cedant.]
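Clearly, when setting $\Lambda'_t = \Lambda''_t = B^*_t$ in (7.62), one finds
(7.60) again. The net outflow (gross of the reinsurance premium) to the cedant
is then
$$OF_t^{(swap\text{-}b)} = B_t^{(\Pi)} - B_t^{(swap\text{-}b)} = \begin{cases}
\Lambda'_t & \text{if } B_t^{(\Pi)} \le \Lambda'_t \\
B_t^{(\Pi)} & \text{if } \Lambda'_t < B_t^{(\Pi)} \le \Lambda''_t \\
\Lambda''_t & \text{if } B_t^{(\Pi)} > \Lambda''_t
\end{cases} \tag{7.63}$$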
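According to (7.63), the cedant's net outflow under the reinsurance-swap with barriers is the actual outflow confined between the two barriers. A minimal Python sketch follows; the function names are ours, and the reinsurance flow is obtained from the identity in (7.63) as the difference between the actual and the net outflow.

```python
def net_outflow_swap_b(benefits, lower, upper):
    """Cedant's net outflow (7.63): benefits confined to [lower, upper]."""
    if upper < lower:
        raise ValueError("upper barrier must not be below the lower barrier")
    return min(max(benefits, lower), upper)


def swap_b_flow(benefits, lower, upper):
    """Flow from the reinsurer (negative: paid by the cedant), obtained as
    benefits minus the net outflow, following the identity in (7.63)."""
    return benefits - net_outflow_swap_b(benefits, lower, upper)
```

With barriers 100 and 130, benefits of 90 imply a payment of 10 from the cedant to the reinsurer, benefits of 110 imply no flow, and benefits of 150 imply a payment of 20 from the reinsurer. When both barriers are set equal to the target $B^*_t$, the net outflow equals the target whatever the actual benefits are.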
For all of the arrangements, a 5-year reinsurance period has been chosen.
To allow for some comparisons, we have assumed that at the beginning of
each reinsurance period a premium must be paid by the cedant, assessed as
the (unconditional) expected present value of future reinsurance flows. We
should point out that this pricing principle does not make practical sense,
given that no risk margin is included; however, with this approach, we can
at least take into account the magnitude of the reinsurance premium. We
assume further that the reinsurer and the cedant adopt the same mortality
model, with the same parameters and that the reserve must be fully set up
by the cedant. The possible default of the reinsurer is disregarded when
assessing the required capital.
In Table 7.29, we give the required capital (per unit of reserve) for
the three arrangements, for different portfolio sizes, as well as for the
case of no reinsurance arrangement (these latter results are taken from
Table 7.23). Because of the increased certainty of the outflows during the
reinsurance period, the lowest amount of required capital is found under the
reinsurance-swap (with no barriers); but clearly, in such an arrangement the
premium for the risk (which we have not considered) could be higher than
in other cases. As already noted, due to the different parameter values, the
outflows under the alternative arrangements are not directly comparable.
It is interesting to note that most of the reduction in the required capital
is gained at the oldest ages, roughly after the Lexis point. Indeed, the most
severe part of the longevity risk is expected to emerge after this age. So, we
can argue that the need for reinsurance emerges in particular at the oldest
ages; at earlier ages, the risk could be managed through other RM tools.
Table 7.29. Required capital, per unit of reserve: $M_z^{[R3]}/V_z^{(\Pi)}$, with and without reinsurance
[Figure 7.14. Flows in the swap-like arrangement between life annuities and
life insurances. Agents: insurer $I_A$ (annuitants), reinsurer, insurer $I_B$
(insureds); flows: annuity premiums, annuity benefits, insurance premiums,
death benefits.]
in the portfolio of insurer $I_B$. Let $n^*_t$ and $d^*_t$ be two given
benchmarks for the number of annuitants at time $t$ for insurer $I_A$ and the
number of deaths in year $(t - 1, t)$ for insurer $I_B$, respectively. Insurers
$I_A$ and $I_B$ agree that the flow $b \cdot \max\{N_t - n^*_t, 0\}$ is paid at
time $t$ by insurer $I_B$ to insurer $I_A$, whilst the flow
$c \cdot \max\{D_t - d^*_t, 0\}$ is paid at the same time by insurer $I_A$ to
$I_B$. This way, insurer $I_A$ is protected against excess survivorship, whilst
insurer $I_B$ is protected in respect of excess mortality. However, insurer
$I_A$ is then exposed to excess mortality, whilst insurer $I_B$ to excess
survivorship.
Cox and Lin (2007) show through numerical assessments that some natural
hedging effects are gained by both insurers, provided that the present value
of future payments for life annuities and for life insurances are the same at
the time of issue.
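The two flows exchanged at time $t$ in this swap-like arrangement can be written as a short Python sketch. The function name and the numerical values in the comment below are purely illustrative.

```python
def mortality_swap_flows(n_t, d_t, n_star, d_star, b, c):
    """Flows exchanged at time t between insurers I_A and I_B.

    Returns (paid by I_B to I_A, paid by I_A to I_B):
    b * max(N_t - n*_t, 0) covers excess survivorship in the annuity portfolio,
    c * max(D_t - d*_t, 0) covers excess deaths in the insurance portfolio.
    """
    to_ia = b * max(n_t - n_star, 0)
    to_ib = c * max(d_t - d_star, 0)
    return to_ia, to_ib


def net_flow_to_ia(n_t, d_t, n_star, d_star, b, c):
    """Net amount received by I_A at time t (negative if I_A is a net payer)."""
    to_ia, to_ib = mortality_swap_flows(n_t, d_t, n_star, d_star, b, c)
    return to_ia - to_ib
```

For example, with benchmarks $n^*_t = 800$ and $d^*_t = 50$, amounts $b = 1$ and $c = 10$, an outcome of 820 annuitants alive and 45 deaths leads to a payment of 20 to $I_A$ and no payment to $I_B$.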
We note that, since new securities are issued, a counterparty risk arises (for
the investor).
The organizational aspects of a securitization transaction are rather com-
plex. Figure 7.15 sketches a simple design for a life insurance deal, focussing
on the main agents involved. The transaction starts in the insurance market,
where policies underwritten give rise to the cash flows which are securi-
tized (at least in part). The insurer then sells the right to some cash flows
to a special purpose vehicle (SPV), which is a financial entity that has been
established to link the insurer to the capital market. Securities backed by
the chosen cash flows are issued by the SPV, which raises monies from the
capital market. Such funds are (at least partially) available to the insurer.
According to the specific features of the transaction, further items may
be added to the structure. For example, a fixed interest rate could be paid
[Figure 7.15: agents: policyholders, insurer, special purpose vehicle (SPV),
capital market; flows: premiums, benefits, funding, price, cash flows,
securities.]
[Figure 7.16. The securitization process in life insurance: a more composite
structure. Agents: policyholders, insurer, special purpose vehicle (SPV),
capital market, credit enhancement mechanism, swap counterparty; flows:
premiums, benefits, funding, price, cash flows, securities, premium,
guarantee.]
$$\phi(I_T) = \frac{\lambda'' I_0 - I_T}{(\lambda'' - \lambda')\, I_0} \tag{7.70}$$
Note that $\lambda' I_0$ and $\lambda'' I_0$ represent two thresholds for the
mortality index.
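The behaviour of the function in (7.70) at the two thresholds can be checked with a minimal Python sketch. The function name and the values $\lambda' = 1.3$, $\lambda'' = 1.5$ used in the check are illustrative assumptions, not parameters taken from the text.

```python
def phi(i_T, i_0, lam_low, lam_high):
    """Function (7.70): (lam_high * I_0 - I_T) / ((lam_high - lam_low) * I_0)."""
    if not lam_low < lam_high:
        raise ValueError("lam_low must be smaller than lam_high")
    return (lam_high * i_0 - i_T) / ((lam_high - lam_low) * i_0)


# Illustrative values only: index base 100, thresholds at 130 and 150.
at_lower = phi(130.0, 100.0, 1.3, 1.5)   # index at the lower threshold
at_upper = phi(150.0, 100.0, 1.3, 1.5)   # index at the upper threshold
midway = phi(140.0, 100.0, 1.3, 1.5)     # index halfway between the two
```

The function equals 1 when the index is at the lower threshold $\lambda' I_0$, decreases linearly in $I_T$, and reaches 0 at the upper threshold $\lambda'' I_0$.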
The coupon is independent of mortality; it could be defined as follows:
$$C_t = S_0\, (i_t + r) \tag{7.71}$$
where $i_t$ is the market interest rate in year $t$ (defined by the bond
conditions) and $r$ is an extra-yield rewarding investors for taking mortality
risk.
We note that for an insurer/reinsurer dealing with life insurances and
taking a short position in the bond, in the case of high mortality experience,
the high frequency of payment of death benefits is counterbalanced by a
reduced payment to investors.
An example of this security is the mortality bond issued by Swiss Re; see,
for example, Blake et al. (2006a).
Mortality bond – example 2. The flows of the bond described in the
previous example try to match the flows in the life insurance portfolio just
at the end of a period of some years. An alternative design of the mortality
bond may provide a match on a yearly basis. This is obtained by letting the
coupon depend on mortality. For example,
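$$C_t = S_0 \times \begin{cases}
i_t + r & \text{if } I_t \le \Lambda'_t \\
(i_t + r)\, \phi(I_t) & \text{if } \Lambda'_t < I_t \le \Lambda''_t \\
0 & \text{if } I_t > \Lambda''_t
\end{cases} \tag{7.72}$$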
Depending on its design, the longevity bond may offer hedging oppor-
tunities to an insurer/reinsurer dealing with life annuities through either a
long or a short position. In the first case, the pay-off of the bond increases
with decreasing mortality; vice versa in the second case. Given the long-
term maturity, it is reasonable that the link is realized through the coupon,
hence providing liquidity on a yearly basis. In the following, we therefore
assume that the principal is fixed.
The reference population should be a given cohort, possibly close to
retirement, that is, with age 60–65 at bond issue. Let $L_t$ be the number of
individuals in the cohort after $t$ years from issue, $t = 0, 1, \dots$; viz.
$L_0 = l_0$ is a known value. A maturity $T$ may be chosen for the bond, with
$T$ high (e.g. $T \ge 85 - \text{initial age}$). In the following, some
possible designs for the coupons are examined.
Longevity bond – example 1. The easiest way to link the coupon to the
longevity experience in the reference population is to let it be proportional
to the observed survival rate. So
$$C_t = C \times \frac{L_t}{l_0} \tag{7.75}$$
where $C$ is a given amount (linking the size of the coupon to the principal of
the bond). We note that in the case of unanticipated longevity the coupon
turns out to be higher than expected (it decreases more slowly than
anticipated); so a long position should be taken by an insurer/reinsurer
dealing with life annuities. A similar bond has been proposed by EIB/BNP
Paribas, although it has not been traded on the market; see Blake, Cairns and
Dowd (2006a) for details.
Longevity bond – example 2. In a similar way to the mortality bond
(example 1 or 2), two thresholds may be assigned, expressing survival levels.
If the number of survivors in the cohort exceeds such thresholds, then the
amount of the coupon is reduced, possibly to 0. The following definition
can be adopted:
$$C_t = C \times \begin{cases}
\dfrac{l''_t - l'_t}{l_0} & \text{if } L_t \le l'_t \\
\dfrac{l''_t - L_t}{l_0} & \text{if } l'_t < L_t \le l''_t \\
0 & \text{if } L_t > l''_t
\end{cases} \tag{7.76}$$
where $l'_t$, $l''_t$ are the two thresholds, expressing a given number of
survivors. For example: $l'_t = \lambda' E[L_t \mid A(\tau)]$,
$l''_t = \lambda'' E[L_t \mid A(\tau)]$, where $1 \le \lambda' < \lambda''$ and
$A(\tau)$ is a given mortality assumption for the reference cohort (assumed to
be born in year $\tau$). We note that, in this case, the lower the mortality
(i.e. the higher $L_t$), the lower the amount of the coupon. A short position
should be taken to hedge life annuity outflows. A similar bond is described
by Lin and Cox (2005).
Longevity bond – example 3. The coupon can be set proportional to the
number of deaths observed in the reference cohort from issue. For example
$$C_t = C \times \frac{l_0 - L_t}{l_0} \tag{7.77}$$
where $l_0 - L_t$ is the observed number of deaths up to time $t$. In contrast
to the previous case, no target is set for such a number. Clearly, also in this
case a short position should be taken to hedge longevity risk.
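The three coupon designs (7.75), (7.76) and (7.77) can be compared through a short Python sketch. The function names and the numbers mentioned afterwards are illustrative only; the thresholds are passed directly rather than derived from a mortality assumption.

```python
def coupon_example_1(C, L_t, l_0):
    """Coupon (7.75): proportional to the observed survival rate."""
    return C * L_t / l_0


def coupon_example_2(C, L_t, l_0, l_low, l_high):
    """Coupon (7.76): reduced, possibly to zero, as survivors exceed thresholds."""
    if not l_low < l_high:
        raise ValueError("l_low must be smaller than l_high")
    if L_t <= l_low:
        return C * (l_high - l_low) / l_0
    if L_t <= l_high:
        return C * (l_high - L_t) / l_0
    return 0.0


def coupon_example_3(C, L_t, l_0):
    """Coupon (7.77): proportional to the observed number of deaths from issue."""
    return C * (l_0 - L_t) / l_0
```

With $C = 10$, $l_0 = 1000$ and 800 survivors, example 1 pays 8 and example 3 pays 2; the two coupons always sum to $C$. With thresholds 700 and 800, example 2 pays 1 if survivors are 650, 0.5 if they are 750 and nothing if they are 850, so the coupon decreases as survivorship increases.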
We will discuss in more detail how to hedge longevity risk through
longevity bonds in Section 7.4.3. We now address some market issues.
There are many difficulties in developing a market for longevity bonds.
A first issue concerns who might be interested in issuing/investing in bonds
that offer hedging opportunities to insurers/reinsurers. In general terms,
one could argue that such securities may offer diversification opportunities,
in particular because of their low correlation with standard financial mar-
ket risk factors. Further, they may give long-term investment opportunities,
which may be rarely available. From the point of view of the issuer of bonds
like example 1, the possibility of building a longevity bond depends, how-
ever, on the availability of financial securities with an appropriate maturity
to match the payments promised under the longevity bond.
hedging is realized by a reinsurer, then reference is to the outflows
$B_t^{(SL)}$, $B_t^{(swap)}$, or $B_t^{(swap\text{-}b)}$, depending on the
reinsurance arrangement dealt with.
In the following, we discuss how the target $OF^*_t$ can be set and reached
according to the hedging tools available in the market. For the sake of
brevity, we assume that the longevity bond is issued at the same time as
the life annuities; some comments will follow in this regard. Thus, unless
otherwise stated, time 0 will be the time of issue of the life annuities and
the bond.
We first consider the case of a longevity bond with coupon (7.75). An
insurer dealing with immediate life annuities should buy $k$ units of such
bond at time 0, so that $F_t = k\, C_t > 0$ at time $t = 1, 2, \dots$. The net
outflow for the insurer at time $t$, $t = 1, 2, \dots$, is then
$$OF_t^{(LB)} = B_t^{(\Pi)} - k\, C_t \tag{7.78}$$
which can be rewritten as
$$OF_t^{(LB)} = b\, n_0\, \frac{N_t}{n_0} - k\, C\, \frac{L_t}{l_0} \tag{7.79}$$
We assume that $N_t/n_0 = L_t/l_0$ for any time $t$; this means that mortality
of annuitants is perfectly replicated by mortality in the reference population.
The net outflow to the insurer then becomes
$$OF_t^{(LB)} = (b\, n_0 - k\, C)\, \frac{L_t}{l_0} \tag{7.80}$$
Note that the net outflow is still random because of the dependence on $L_t$.
However, if $k = b\, n_0/C$ then the term $b\, n_0 - k\, C$ reduces to zero,
and a situation of certainty is achieved (i.e. the hedging would be perfect);
the target outflow for this situation is therefore $OF^*_t = 0$.
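The hedge described by (7.78) to (7.80) can be verified numerically with a minimal Python sketch. The portfolio and cohort sizes are hypothetical, and the second evaluation illustrates what happens when the survival rate of annuitants departs from that of the reference cohort.

```python
def hedge_units(b, n_0, C):
    """Number of bond units k = b * n_0 / C that removes the randomness in (7.80)."""
    return b * n_0 / C


def net_outflow_lb(b, N_t, k, C, L_t, l_0):
    """Net outflow (7.78), with B_t = b * N_t and the coupon C_t = C * L_t / l_0
    of (7.75)."""
    return b * N_t - k * C * L_t / l_0


# Hypothetical figures: 1,000 annuitants, reference cohort of 100,000 lives.
k = hedge_units(b=1.0, n_0=1000, C=10.0)

# Same survival rate (0.8) in both populations: the outflow is fully offset.
matched = net_outflow_lb(b=1.0, N_t=800, k=k, C=10.0, L_t=80_000, l_0=100_000)

# Annuitants survive better (0.82) than the reference cohort (0.80).
mismatched = net_outflow_lb(b=1.0, N_t=820, k=k, C=10.0, L_t=80_000, l_0=100_000)
```

In the matched case the net outflow equals the target $OF^*_t = 0$; in the mismatched case a residual outflow of 20 remains with the insurer.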
In practical terms, perfect hedging is difficult to realize. Although we can
rely on some positive correlation between the survival rate in the reference
population, $L_t/l_0$, and that in the annuitants’ cohort, $N_t/n_0$, it is
unrealistic that they coincide in each year, due to the fact that usually the
annuitants are not representative of the reference population. In particular,
the year of birth of the reference cohort and of annuitants may differ. This
mismatching leads to basis risk in the strategy for hedging longevity risk.
A second aspect concerns the lifetime of the bond. Typically the bond
is not issued when the life annuity payments start. If it is issued earlier,
the previous relations still hold, just with an appropriate redefinition of the
quantities l0 and Lt ; the problem in this case would consist in the availability
of the bond, in the required size, in the secondary market. If the bond is
issued later than the life annuities, the longevity risk of the insurer would be
unhedged for some years (but in a period when annuitants are still young,
and longevity risk is therefore not too severe). In both cases, the basis risk
may be stronger, due to the fact that it is more likely that the years of
birth of annuitants and the reference population are different. The critical
aspect of the lifetime of the bond is its maturity, T. Realistically, T is a
finite time, so that the hedge in (7.79) can be realized just up to time T (and
not for any time t). The insurer has to plan a further purchase of longevity
bonds after time T; however, the availability of bonds, in particular with
the features required for the hedging, is not certain. In the case that further
longevity bonds are available in the future, the basis risk may worsen in
time, given that for any bond issue a cohort of new retirees is likely to be
referred to.
We now move to longevity bonds with coupon (7.76) and (7.77). As
already mentioned in Section 7.4.2, such bonds require a short position to
hedge longevity risk. This position is, however, difficult for an insurer (or
other annuity provider) to realize on its own, because of the complexity
of the deal. It is reasonable to assume that some form of reinsurance is
purchased by the annuity provider. The reinsurer, who transacts business
on a larger scale than the insurer, then hedges its position through longevity
bonds, typically issued by an SPV (see Fig. 7.17).
Let us assume that a reinsurer is able to issue a bond with coupon (7.76).
The reinsurer should be willing, in this case, to underwrite the Stop-Loss
arrangement on annual outflows, whose reinsurance flows are described
[Figure 7.17. Longevity risk transfer from the annuity provider to the capital
market. Agents: annuitants, annuity provider, reinsurer, SPV, capital market;
flows: premiums, annual payments, benefits, coupons and principal, income from
bond sale.]
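Since we are aiming at perfect hedging, the thresholds $\Lambda'_t$,
$\Lambda''_t$ in the reinsurance arrangement are reasonably chosen according to
the features of the longevity bond. So we assume that
$\Lambda'_t = (l'_t/l_0)\, b\, n_0$ and $\Lambda''_t = (l''_t/l_0)\, b\, n_0$.
We can rewrite (replacing the relevant quantities and rearranging)
$$NF_t^{(SL)} = b\, n_0 \times \begin{cases}
0 & \text{if } \dfrac{N_t}{n_0} \le \dfrac{l'_t}{l_0} \\
\dfrac{N_t}{n_0} - \dfrac{l'_t}{l_0} & \text{if } \dfrac{l'_t}{l_0} < \dfrac{N_t}{n_0} \le \dfrac{l''_t}{l_0} \\
\dfrac{l''_t - l'_t}{l_0} & \text{if } \dfrac{N_t}{n_0} > \dfrac{l''_t}{l_0}
\end{cases}
\; + \; k\, C \times \begin{cases}
\dfrac{l''_t - l'_t}{l_0} & \text{if } \dfrac{L_t}{l_0} \le \dfrac{l'_t}{l_0} \\
\dfrac{l''_t - L_t}{l_0} & \text{if } \dfrac{l'_t}{l_0} < \dfrac{L_t}{l_0} \le \dfrac{l''_t}{l_0} \\
0 & \text{if } \dfrac{L_t}{l_0} > \dfrac{l''_t}{l_0}
\end{cases} \tag{7.82}$$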
[Figure 7.18. Flows for a reinsurer dealing with a Stop-Loss arrangement on
annual outflows and issuing a longevity bond – example 2. Shown against time:
annuity outflows, flow to investors, flow to the insurer.]
$$= b\, n_0\, \frac{l''_t - l'_t}{l_0} \tag{7.84}$$
which is a non-random situation. A graphical representation is provided in
Fig. 7.18.
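The non-random outcome (7.84) can be checked numerically with a minimal Python sketch of the two components in (7.82). We assume, as in the case of the longevity bond of example 1, that $k\,C = b\,n_0$ and that the survival rate of the annuitants coincides with that of the reference cohort; all names and figures are illustrative.

```python
def reinsurer_net_flow(b, n_0, N_t, k, C, L_t, l_0, l_low, l_high):
    """Total flow (7.82): Stop-Loss payment to the cedant plus coupons to investors."""
    low, high = l_low / l_0, l_high / l_0
    # Stop-Loss payment, with priority and upper limit expressed as survival rates.
    to_cedant = b * n_0 * min(max(N_t / n_0 - low, 0.0), high - low)
    # Coupon (7.76) of the longevity bond, per unit of bond.
    if L_t <= l_low:
        coupon = C * (l_high - l_low) / l_0
    elif L_t <= l_high:
        coupon = C * (l_high - L_t) / l_0
    else:
        coupon = 0.0
    return to_cedant + k * coupon


# Hypothetical figures: b = 1, n_0 = 1,000, C = 10, hence k = b * n_0 / C = 100.
params = dict(b=1.0, n_0=1000, k=100.0, C=10.0, l_0=100_000,
              l_low=70_000, l_high=80_000)
# Survivors in the reference cohort; annuitants follow the same survival rate.
flows = [reinsurer_net_flow(N_t=1000 * L / 100_000, L_t=L, **params)
         for L in (60_000, 75_000, 90_000)]
target = 1.0 * 1000 * (80_000 - 70_000) / 100_000  # right-hand side of (7.84)
```

In all three survivorship scenarios the total flow equals the target of 100 (up to rounding): what the reinsurer pays to the cedant is exactly offset by the reduction in the coupons paid to investors.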
The assumptions on which such a perfect hedging strategy is based are
the same as those adopted for the longevity bond – example 1, that is,
– the survival rate in the annuitant population, Nt /n0 , is the same as that
observed in the reference population, Lt /l0 ;
– the lifetime of the bond coincides with the lifetime of the life annuity
portfolio; in particular, no maturity has been set.
It is clear that such conditions are unrealistic, so that the reinsurer
transfers the longevity risk to investors only partially. In any case, the
target outflow in setting the hedging strategy in this case is
$OF^*_t = b\, n_0\, \frac{l''_t - l'_t}{l_0}$. A similar strategy is described
by Lin and Cox (2005), albeit without calling explicitly for a reinsurance
arrangement between an insurer and a reinsurer.
[Figure 7.19. Flows for a reinsurer dealing with a reinsurance-swap arrangement
and issuing a longevity bond – example 3. Shown against time: annual outflow,
net outflow (level $b\,n_0$ marked), flow to investors, flows from/to the
cedant.]
to a reinsurer, a default risk arises for the insurer. This aspect should be
accounted for when allocating capital for the residual longevity risk borne
by the insurer itself.
So far in this chapter we have dealt with longevity risk referring to a portfo-
lio of immediate life annuities. The need for taking into account uncertainty
in future mortality trends and hence for a sound management of the impact
of longevity risk has clearly emerged.
However, life annuity products other than immediate life annuities are
sold on a number of insurance markets and, in many products, the severity
of longevity risk can be even higher than what has emerged in the previous
investigations. We now introduce some remarks considering cases other
than immediate life annuities.
The technical features of several types of life annuities have already been
examined in Chapter 1, and the relevant traditional pricing tools as well
(see, in particular, Section 1.6). Unsatisfactory features of such models can
be easily understood if one analyses the models themselves under the per-
spective of a dynamic mortality scenario. In this section, we develop some
general comments on the pricing of life annuities allowing for longevity
risk; a few examples are then mentioned in Section 7.6.
In Section 1.6, we recalled that in the traditional guaranteed life annuity
product the technical basis is stated when the premiums are fixed. So
(a) a deferred life annuity with (level) annual premiums implies the highest
longevity risk borne by the insurer, as the technical basis is stated at
policy issue (hence, well before retirement);
(b) a single premium immediate life annuity implies the lowest longevity
risk, as the technical basis is stated at retirement time only;
(c) the arrangement with single recurrent premiums represents an interme-
diate solution, given that the technical basis can be stated specifically
for each premium.
It follows that a stronger safety loading is required for solution (a) than
for (b), with solution (c) at some intermediate level. Clearly, in order to
calculate properly the safety loading required for the implied longevity risk,
some pricing model is needed. Alternatively, policy conditions that allow
for a revision of the technical basis should be included in the policy, as will
be commented later.
As recalled in Section 1.6, in case (b) the accumulation of
the amount funding an immediate life annuity can be obtained through
some insurance saving product, for example, an endowment insurance.
A package, in particular, can be offered, in which an endowment for the
accumulation period is combined with an immediate life annuity for the
decumulation period.
Combining an endowment insurance with a life annuity provides the
policyholder with
(a) an insurance cover against the risk of early death during the working
period;
(b) a saving instrument for accumulating a sum at retirement, to be (partly)
converted into a life annuity;
(c) a life annuity throughout the whole residual lifetime.
[Figure: risks borne by the insurer along the time axis (0, n): mortality risk,
risk of surrender, investment risk, annuitization risk; the sum at risk, the
reserve and the amount $C_1$ are marked.]
Section 1.6, we let 0 be the time of issue of the endowment, n the maturity
of the endowment and the retirement time as well, x the age at time 0.
During the accumulation period, that is, throughout the policy duration
of the endowment, the insurer in particular bears:
As regards the longevity risk, the time interval throughout which the
insurer bears the risk itself clearly coincides with the time interval involved
by the immediate life annuity, if the annuity rate $1/a_{x+n}$ is stated and
hence guaranteed at retirement time only. We recall that the annuity rate
converts the sum at maturity $S$ (used as a single premium) into a life annuity
of annual amount $b$ according to the relation $b = S/a_{x+n}$ (see (1.57)).
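The conversion $b = S/a_{x+n}$ can be illustrated with a minimal Python sketch. We assume, for illustration only, that the annuity factor is the expected present value of a unit annuity paid in arrears, computed from a vector of survival probabilities and a flat interest rate; the survival vectors are hypothetical and truncated to a few years.

```python
def annuity_factor(survival, rate):
    """Assumed form of a_{x+n}: sum over t of v^t * p_t, with p_t the
    probability of surviving t years from retirement (payments in arrears)."""
    v = 1.0 / (1.0 + rate)
    return sum(p * v ** (t + 1) for t, p in enumerate(survival))


def annual_amount(S, a):
    """Annual amount b = S / a_{x+n} obtained from the sum at maturity S."""
    if a <= 0:
        raise ValueError("the annuity factor must be positive")
    return S / a


# Hypothetical survival probabilities (illustrative only, truncated to 5 years).
a_current = annuity_factor([0.98, 0.95, 0.91, 0.86, 0.80], rate=0.03)
a_improved = annuity_factor([0.985, 0.96, 0.93, 0.89, 0.84], rate=0.03)

b_current = annual_amount(1000.0, a_current)
b_improved = annual_amount(1000.0, a_improved)
```

Lower mortality raises the annuity factor and therefore lowers the annual amount obtained from the same sum $S$; this is the reason why the time at which the annuity rate is guaranteed matters for the insurer.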
Even if the annuity rate is stated at time n only, it is worth noting that
the endowment policy contains an ‘option to annuitize’. Apart from the
severity of the longevity risk implied by the guarantee on the annuity rate,
the presence of this option determines the insurer’s exposure to the risk
of adverse selection, as most of the policyholders who annuitize the maturity
benefit will be in good health (see Section 1.6.5).
The so-called guaranteed annuity option (GAO) (see Section 1.6.2) entitles
the policyholder to choose at retirement between the current annuity rate
(i.e. the annuity rate applied at time n for pricing immediate life annuities)
and the guaranteed one.
By definition, the GAO condition implies a guaranteed annuity rate
(GAR). In principle, the GAR can be stated at any time t, 0 ≤ t ≤ n. In
practice, the GAR stated at policy issue, that is, at time 0, constitutes a more
appealing feature of the life insurance product. If the GAR is stated at time
n only, the GAO vanishes and the insurance product simply provides the
policyholder with a life annuity with a guaranteed annual amount. What-
ever may be the time at which the GAR is stated, the life annuity provides
a guaranteed benefit, so that it can be referred to as a guaranteed annuity
(see Fig. 7.21).
[Figure 7.21: guaranteed annuity, with GAR stated at time $t$
($0 \le t \le n$), implying a GAO, or GAR stated at time $n$.]
Conversely, the expression non-guaranteed annuity denotes a life annuity
product in which the technical basis (and in particular the mortality basis)
can be changed during the annuity payment period; in practice, this means
that the annual amount of the annuity can be reduced, according to the
mortality experience. Clearly, such an annuity is a rather poor product
from the point of view of the annuitant.
As a consequence of the GAR, the insurer bears the longevity risk
(and the market risk, as the guarantee concerns both the mortality table
and the rate of interest) from the time at which the guaranteed rate
is stated on. Obviously, the longevity (and the market) risk borne by
the insurer decreases as the time at which the guaranteed rate is stated
increases.
The importance of an appropriate pricing of a GAO, and therefore of
an appropriate setting of a GAR, is witnessed by the default of Equitable
Life. The unanticipated decrease in interest and mortality rates experienced
during the 1990s caused the GAOs issued by Equitable during the 1980s
to become deeply in the money at the end of the 1990s. As a consequence,
in 2000 the Equitable was forced to close to new life and pension
business.
Pricing a life annuity product within the GAR framework requires the use
of a projected mortality table. The more straightforward (and traditional)
approach for pricing the guarantee consists of adopting a table that includes
a safety loading to meet mortality improvements higher than expected. One
should, however, be aware of the fact that the possibility of unanticipated
mortality improvements reduces the reliability of such a safety loading (as
happened to Equitable). A more appropriate approach requires a pricing
model explicitly allowing for the longevity risk borne by the insurer, rather
than a safety loading roughly determined; see Section 7.6.
[Figure 7.22. Reduction in the annual amount from b[1] to b'[1] at time r, when a new projected table becomes available; time axis: 0, h, r, n.]
b[1] = (7.89)
a[1]
x+n (h)
Assume that the insurer promises to pay the annual amount b[1] from
time n on, with the proviso that no dramatic improvement in the mortality
experienced occurs before time n. Conversely, if such an improvement is
experienced (and it results, for example, from a new projected life table
available at time r, h < r ≤ n), then the insurer can reduce the annual
amount to a lower level b'[1] (see Fig. 7.22). So a policy condition must be
added, leading to a conditional GAR product. Some constraints are usually
imposed (e.g. by the supervisory authority); in particular:
(a) the mortality improvement must exceed a stated threshold (e.g. in terms
of the increase in the life expectancy at age 65);
(b) r ≤ n − 2, say;
(c) no more than one reduction can be applied in a given number of years;
(d) whatever the mortality improvements may be, the reduction in the
annual amount must be less than or equal to a given share ρ, that is,
$$\frac{b^{[1]} - {b'}^{[1]}}{b^{[1]}} \le \rho \tag{7.90}$$
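As a rough numerical illustration of constraint (d), the following Python sketch computes the guaranteed amount from (7.89) and caps a proposed reduction at the share ρ, as required by (7.90). The rule used to propose the reduced amount (recomputing S/a with an updated annuity value), the function name and all figures are assumptions made for illustration only; constraints (a)–(c) are not modelled.

```python
def conditional_gar_amount(S, a_initial, a_updated, rho):
    """Return (b, b_reduced) for a conditional GAR.

    S          amount converted into the annuity
    a_initial  annuity value underlying the rate stated at time h
    a_updated  annuity value under the new projected table (hypothetical input)
    rho        maximum admissible relative reduction, 0 <= rho <= 1
    """
    b = S / a_initial                # guaranteed amount, as in (7.89)
    proposed = S / a_updated         # assumed rule for the reduced amount
    floor = b * (1.0 - rho)          # lowest amount allowed by (7.90)
    b_reduced = max(min(proposed, b), floor)   # never above b, never below the floor
    return b, b_reduced


b, b_reduced = conditional_gar_amount(S=100_000.0, a_initial=15.0, a_updated=16.0, rho=0.05)
relative_cut = (b - b_reduced) / b   # the uncapped cut would be 6.25%, so the cap binds
```

With a larger admissible share (say ρ = 0.10) the cap would not bind in this example, and the reduced amount would simply be the one implied by the updated annuity value.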
[Figure. Increase in the annual amount from b[2] to b'[2] at time s, when experienced mortality is higher than expected; time axis: 0, h, n, s.]
Let us now turn to the case in which the insurer charges a rigorous (i.e.
lower) annuity rate $1/a^{[2]}_{x+n}(h)$. Hence, the annuity amount is given by
$$b^{[2]} = \frac{S}{a^{[2]}_{x+n}(h)} \tag{7.91}$$
Figure 7.24. Annual amount in a product with conditional GAR in the decumulation period. [Plot labels: reduction in the annual amount; b[3], b[2]; experienced mortality lower than expected; time axis: 0, h, n, s.]
description in the present chapter: for example, there are different opinions
on evolving mortality and hence on the appropriate stochastic model to
allow for uncertain mortality trends, and the data for estimating the main
parameters are unavailable.
On the other hand, pricing models for longevity risk are required when
dealing with life annuities and longevity bonds. Therefore, in this section,
we summarize a few of the main proposals which have been described in
the literature. However, this subject is still developing in the recent
literature, and we do not aim to give a comprehensive illustration of the
several proposals that have been put forward.
We first address the present value of life annuities. Denuit and Dhaene
(2007) and Denuit (2007) allow for randomness in the probabilities of
death within a Lee–Carter framework. Due to the importance of such a
framework, we briefly describe their approach. Let us adopt the standard
Lee–Carter framework, where the future forces of mortality are decom-
posed in a log-bilinear way (see Section 4.7.2). Specifically, the death rate
at age x in calendar year t is of the form exp(αx + βx κt ), where κt , in
particular, is a time index, reflecting the general level of mortality.
We denote as ${}_hP_{x_0}(t_0)$ the random h-year survival probability for an
individual aged x0 in year t0, that is, the conditional probability that this
individual reaches age x0 + h in year t0 + h, given the κt's. Adopting
assumptions (3.2) (from which (3.13) holds), such probability is formally
defined as
$${}_hP_{x_0}(t_0) = \exp\Bigl(-\sum_{s=0}^{h-1} m_{x_0+s}(t_0+s)\Bigr) = \exp\Bigl(-\sum_{s=0}^{h-1} \exp\bigl(\alpha_{x_0+s} + \beta_{x_0+s}\,\kappa_{t_0+s}\bigr)\Bigr) \tag{7.92}$$
where v(0, h) is the discount factor, that is, the present value at time 0 of
a unit payment made at time h. We note that ax0 (t0 ) is a random vari-
able, since it depends on the future trajectory of the time index (i.e. on
κt0 , κt0 +1 , κt0 +2 , . . .). We note also that (7.93) generalizes (1.27).
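The dependence of the annuity value on the trajectory of the time index can be explored numerically. The Python sketch below evaluates the survival probabilities in (7.92) along simulated paths of κt and forms the discounted sum of these probabilities. Several ingredients are assumptions made for illustration: the time index is simulated as a random walk with drift, the discount factor is taken as v(0, h) = (1 + i)^(-h) with a flat rate i, the sum is truncated after 40 years, and the parameter values are hypothetical rather than fitted to any data.

```python
import math
import random


def survival_probs(alpha, beta, kappa):
    """Return [1P, 2P, ..., nP] as in (7.92).

    alpha[s] and beta[s] refer to age x0+s, kappa[s] to calendar year t0+s.
    """
    probs, cumulative_rate = [], 0.0
    for a_s, b_s, k_s in zip(alpha, beta, kappa):
        cumulative_rate += math.exp(a_s + b_s * k_s)   # death rate m_{x0+s}(t0+s)
        probs.append(math.exp(-cumulative_rate))
    return probs


def annuity_value(alpha, beta, kappa, i):
    """Sum of the hP terms weighted by flat-rate discount factors (1+i)^-h."""
    return sum(p / (1.0 + i) ** h
               for h, p in enumerate(survival_probs(alpha, beta, kappa), start=1))


def simulate_kappa(kappa0, drift, sigma, n, rng):
    """One path of the time index, modelled here as a random walk with drift."""
    path = [kappa0]
    while len(path) < n:
        path.append(path[-1] + drift + sigma * rng.gauss(0.0, 1.0))
    return path


# Hypothetical parameters for ages 65, 66, ..., 104 (not fitted to any data).
n = 40
alpha = [-4.5 + 0.1 * s for s in range(n)]
beta = [0.01] * n
rng = random.Random(2024)
values = [annuity_value(alpha, beta, simulate_kappa(0.0, -1.0, 2.0, n, rng), 0.03)
          for _ in range(1000)]
mean_value = sum(values) / len(values)
```

Each entry of `values` is one realization of the random annuity value; the spread of the list gives a first impression of the systematic risk that no increase in portfolio size can remove.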
The distribution function of $a_{x_0}(t_0)$ is difficult to obtain. Useful
approximations have been proposed by Denuit and Dhaene (2007) and Denuit
(2007). Specifically, Denuit and Dhaene (2007) have proposed comonotonic
approximations for the quantiles of the random survival probabilities
${}_hP_{x_0}(t_0)$. Since the expression for $a_{x_0}(t_0)$ involves a weighted sum of
the ${}_hP_{x_0}(t_0)$ terms, Denuit (2007) supplemented the first comonotonic
approximation with a second one. This second approximation is based
on the fact that the ${}_hP_{x_0}(t_0)$ terms are expected to be closely dependent
for increasing values of h, so that it may be reasonable to approximate
the vector of random survival probabilities with its comonotonic
version.
Interesting information can be obtained from a further investigation of
the distribution of $a_{x_0}(t_0)$. We consider a homogeneous portfolio, made of
n0 annuitants at time t0. We refer now to the random variable $a_{K^{(j)}_{x_0}}$, where
$K^{(j)}_{x_0}$ is the curtate lifetime of individual j. Given the time index, the
$K^{(j)}_{x_0}$'s are assumed to be independent and identically distributed, with common
conditional h-year survival probability ${}_hP_{x_0}(t_0)$.
We recall from Denuit et al. (2005) that a random variable X is said
to precede another one Y in the convex order, denoted as X ⪯cx Y, if
the inequality E[g(X)] ≤ E[g(Y)] holds for all the convex functions g
for which the expectations exist. Since X ⪯cx Y ⇒ E[X] = E[Y] and
Var[X] ≤ Var[Y], X ⪯cx Y intuitively means that X is ‘less variable’, or
‘less dangerous’ than Y.
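Now, since the $a_{K^{(j)}_{x_0}}$'s are exchangeable, we have from Proposition 1.1
in Denuit and Vermandele (1998) that
$$a_{x_0}(t_0) = E\bigl[a_{K^{(j)}_{x_0}} \mid \kappa_{t_0+k},\ k = 1, 2, \ldots\bigr] \preceq_{cx} \cdots \preceq_{cx} \frac{\sum_{j=1}^{n_0+1} a_{K^{(j)}_{x_0}}}{n_0+1} \preceq_{cx} \frac{\sum_{j=1}^{n_0} a_{K^{(j)}_{x_0}}}{n_0}. \tag{7.94}$$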
Increasing the size of the portfolio makes the average payment per annuity
less variable (in the ⪯cx-sense), but this average remains random whatever
the number of policies comprising the portfolio, being bounded from below
by $a_{x_0}(t_0)$ in the ⪯cx-sense. We note that, despite the positive dependence
7.6 Allowing for longevity risk in pricing 353
where Φ(·) is the standard normal cdf and λ is the market price of risk
(longevity risk included). The fair price of X is the present value of the
expected value of X, calculated with the risk-free rate and the distorted cdf
F*(x).
Lin and Cox (2005) take X as the lifetime of an annuitant and calibrate
λ using life annuity quotations in the market (assuming that the price of a
life annuity is the present value of future payments, based on the risk-free
rate and the distorted cdf of the lifetime). They then apply the approach to
price mortality-linked securities.
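The calibration idea can be sketched in a few lines of Python. The sketch assumes that the distortion takes the form F*(x) = Φ(Φ⁻¹(F(x)) − λ), a sign convention under which a positive λ lowers the distorted death probabilities and hence raises annuity prices; the sign, the flat discount rate, the illustrative lifetime distribution and all function names are assumptions, and the code is not a reproduction of the procedure in Lin and Cox (2005).

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()   # standard normal: mean 0, standard deviation 1


def wang_transform(p, lam):
    """Distorted probability Phi(Phi^{-1}(p) - lam); the sign of lam is an assumed convention."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    return STD_NORMAL.cdf(STD_NORMAL.inv_cdf(p) - lam)


def annuity_price(death_cdf, rate, lam=0.0):
    """Price of 1 per year in arrears while alive, under the distorted lifetime cdf.

    death_cdf[t-1] is F(t), the best-estimate probability of dying within t years.
    """
    return sum((1.0 - wang_transform(F_t, lam)) / (1.0 + rate) ** t
               for t, F_t in enumerate(death_cdf, start=1))


def calibrate_lambda(death_cdf, rate, quoted_price, lo=-3.0, hi=3.0, tol=1e-10):
    """Bisection for lam; the price increases with lam under the convention above."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if annuity_price(death_cdf, rate, mid) < quoted_price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical best-estimate lifetime cdf: constant 3% annual death probability, 40 years.
death_cdf = [1.0 - 0.97 ** t for t in range(1, 41)]
best_estimate_price = annuity_price(death_cdf, rate=0.03)
quoted_price = annuity_price(death_cdf, rate=0.03, lam=0.25)   # stand-in for a market quote
implied_lambda = calibrate_lambda(death_cdf, 0.03, quoted_price)
```

Here the "market quote" is generated from a known λ, so that the bisection can be checked by recovering that value; in practice the quote would come from observed annuity prices.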
The one-factor Wang transform assumes that the underlying distribu-
tion is known. However, usually F(x) is the best-estimate of the underlying
unknown distribution. The two-factor Wang transform is the cdf F ∗∗ (x)
such that
described in Section 5.3 and price longevity bonds with different terms
to maturity referenced to different cohorts. In particular, they develop a
method for calculating the market risk-adjusted price of a longevity bond,
which allows for mortality trend uncertainty and parameter risk as well.
We finally address the problem of the valuation of a GAO. The GAO
(see Section 7.5.2) can be viewed as a European call option whose underlying
asset is the retail market value of a life annuity at retirement time and
whose strike is the GAR set when the GAO was underwritten. The pay-off of
the option in itself depends on the comparison between the guaranteed
and the current annuity rate. However, the actual exercise of the option
depends also on the preference that the holder expresses for a life annuity
instead of self-annuitization. The intrinsic structure of the pay-off of the
option is, therefore, uncertain because it depends on individual preferences,
with possible adverse selection in respect of the insurer. When assessing the
value of the GAO, individual preferences are usually disregarded in the current
literature. The pricing problem is therefore attacked by assuming that
the policyholder will decide whether to exercise the option solely by comparing
the current market quotes for life annuities with the GAR. Ballotta and Haberman
(2003) address this problem, assuming that the overall mortality risks (and
hence also the longevity risk) are diversified. In Ballotta and Haberman
(2006) the analysis is extended to the case in which mortality risk is incorporated
via a stochastic model for the evolution over time of the underlying
force of mortality.
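The comparison between the guaranteed and the current annuity rate can be made concrete with a small Python sketch. It assumes, for illustration only, that the GAR is expressed as the guaranteed annual amount per unit of fund, that the market quote is the price of a unit life annuity at retirement, and that the option is exercised whenever this is worthwhile on market quotes (individual preferences are disregarded, as in the literature mentioned above). The function and parameter names are hypothetical.

```python
def gao_payoff(fund, gar, market_annuity_value):
    """Pay-off at retirement of a guaranteed annuity option.

    fund                  amount available for conversion at retirement
    gar                   guaranteed annual amount per unit of fund
    market_annuity_value  current market price of a life annuity of 1 per year
    """
    guaranteed_income = fund * gar
    # Market cost of buying the guaranteed income at current annuity prices.
    cost_at_market = guaranteed_income * market_annuity_value
    # The option has value only if that cost exceeds the fund, i.e. if the
    # guaranteed rate exceeds the current rate 1 / market_annuity_value.
    return max(cost_at_market - fund, 0.0)


in_the_money = gao_payoff(fund=1000.0, gar=0.10, market_annuity_value=12.0)
out_of_the_money = gao_payoff(fund=1000.0, gar=0.10, market_annuity_value=8.0)
```

In the first call the current annuity rate (1/12) is below the guaranteed rate (0.10), so the option is in the money; in the second call the current rate (1/8) is above it and the pay-off is zero.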
$$k = -\frac{\ln\left(1 - i\,\dfrac{S}{b}\right)}{\ln(1+i)} \tag{7.97}$$
Example 7.10 With reference to the expected values quoted in Table 7.2 for
time 0, we perform the comparisons discussed above. We assume that the
prevailing mortality table referred to for the traditional actuarial valuation
of the life annuity is given by assumption A3 (τ). All of the other assumptions
are as in Example 7.1; in particular, the actual entry age is x0 = 65.
356 7 : The longevity risk: actuarial perspectives
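Table 7.30 (fixed discount rate, alternative mortality assumptions)
Mortality assumption    Annuity rate          Equivalent number of payments
A1(τ)                   1/14.462 = 0.06915    19.247
A2(τ)                   1/14.651 = 0.06825    19.587
A3(τ)                   1/15.259 = 0.06554    20.707
A4(τ)                   1/15.817 = 0.06322    21.767
A5(τ)                   1/16.413 = 0.06093    22.938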
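Table 7.31 (mortality assumption A3(τ), alternative discount rates)
Discount rate           Annuity rate          Equivalent number of payments
i = 0                   1/21.853 = 0.04576    21.853
i = 0.01                1/19.238 = 0.05198    21.473
i = 0.02                1/17.071 = 0.05858    21.091
i = 0.03                1/15.259 = 0.06554    20.707
i = 0.04                1/13.733 = 0.07282    20.321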
Tables 7.30 and 7.31 give the equivalent number of payments of an annu-
ity certain, for several quoted prices of the life annuity. In particular, in
Table 7.30 the discount rate has been kept fixed, while alternative mor-
tality assumptions have been used; in Table 7.31 the annuity rate is based
on the mortality assumption A3 (τ) while alternative levels of the discount
rate are chosen. Clearly, given the mortality table, the equivalent number
of payments of an annuity certain is higher the lower is the discount rate.
With a fixed discount rate, the equivalent number of payments is higher the
stronger is the mortality improvement implied by the table.
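Formula (7.97) is straightforward to evaluate. The Python sketch below (the function name is hypothetical) takes the annuity value S/b and the discount rate i; for i = 0 it uses the limiting case k = S/b, which is an assumption consistent with the first row of Table 7.31.

```python
import math


def equivalent_payments(annuity_value, i):
    """Equivalent number of payments k of an annuity certain, as in (7.97).

    annuity_value is S/b, the price of the life annuity per unit of annual amount.
    """
    if i == 0:
        return annuity_value                      # limiting case: no discounting
    if i * annuity_value >= 1.0:
        raise ValueError("annuity value is not below the perpetuity value 1/i")
    return -math.log(1.0 - i * annuity_value) / math.log(1.0 + i)


k_a3 = equivalent_payments(15.259, 0.03)   # compare with 20.707 in Tables 7.30 and 7.31
k_a1 = equivalent_payments(14.462, 0.03)   # compare with 19.247 in Table 7.30
k_1pc = equivalent_payments(19.238, 0.01)  # compare with 21.473 in Table 7.31
```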
In Table 7.32 the reference mortality assumption is A3 (τ) and the refer-
ence discount rate is i = 0.03. First, the equivalent discount rate relating to
different mortality assumptions is calculated (third column); then the equiv-
alent rounded entry age is quoted (fourth column). We note that a lower
equivalent discount rate and a lower equivalent entry age emerge from a
stronger assumption about mortality improvements.
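Table 7.32 (reference: mortality assumption A3(τ), discount rate i = 0.03)
Mortality assumption    Annuity rate          Equivalent discount rate    Equivalent entry age
A1(τ)                   1/14.462 = 0.06915    3.501%                      67
A2(τ)                   1/14.651 = 0.06825    3.379%                      66
A3(τ)                   1/15.259 = 0.06554    3%                          65
A4(τ)                   1/15.817 = 0.06322    2.673%                      64
A5(τ)                   1/16.413 = 0.06093    2.343%                      62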
(a) a life annuity provides the annuitant with an inflexible income, in the
sense that, if the whole fund available to the annuitant at retirement is
converted into a life annuity, the annual income is stated as defined by
the annuity rate (apart from the effect of possible profit participation
mechanisms);
(b) a more flexible income can be obtained via a partial annuitization of
the fund, or partially delaying the annuitization itself; the part of the
income not provided by the life annuity is then obtained by drawdown
from the non-annuitized fund;
(c) the life annuity product benefits from a mortality cross-subsidy, as
each life annuity in a given portfolio (or pension plan) is annually
credited with ‘mortality interests’, that is, a share of the technical pro-
visions released by the deceased annuitants, according to the mutuality
principle (see Sections 1.4 and 1.4.1 in particular).
Let us start with point (c). We refer to a life annuity issued at age x0
with annual amount b, whose technical provision (simply denoted by Vt ) is
calculated according to rule (7.49) (adopting a mortality assumption A(τ)).
Recursively, we may express the technical provision as follows:
$$V_0 = S$$
$$V_{t-1}\,(1+i) = (V_t + b)\,p_{x_0+t-1}, \qquad t = 1, 2, \ldots \tag{7.98}$$
where i is the technical interest rate, $p_{x_0+t-1}$ is based on mortality
assumption A(τ) and S is the single premium (see (1.28)). According to a traditional
pricing structure, we may further assume
$$S = b\,a_{x_0} \tag{7.99}$$
where $a_{x_0}$ is calculated according to the same assumptions adopted in (7.98).
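The recursion (7.98) and the premium (7.99) can be checked numerically with the following Python sketch. The survival probabilities are hypothetical and truncated after 30 years (so the sketch effectively treats a 30-year temporary annuity), and the function names are illustrative. The last line isolates the amount credited in the first year over and above interest, which corresponds to the mortality cross-subsidy of point (c).

```python
def annuity_value(p, i):
    """a_x = sum over h >= 1 of (h-year survival probability) * (1+i)^-h."""
    total, survival = 0.0, 1.0
    for h, p_h in enumerate(p, start=1):
        survival *= p_h
        total += survival / (1.0 + i) ** h
    return total


def provisions(S, b, p, i):
    """V_0, V_1, ... from (7.98), solved forward: V_t = V_{t-1} (1+i) / p_{x0+t-1} - b."""
    V = [S]
    for p_t in p:
        V.append(V[-1] * (1.0 + i) / p_t - b)
    return V


# Hypothetical one-year survival probabilities p_{x0}, p_{x0+1}, ... (30 years only).
p = [1.0 - 0.01 * 1.1 ** t for t in range(30)]
i, b = 0.03, 1.0
S = b * annuity_value(p, i)      # single premium, as in (7.99)
V = provisions(S, b, p, i)

# Amount credited in year 1 beyond interest: (V_1 + b) - V_0 (1+i) = (V_1 + b) q_{x0}.
mortality_credit = (V[1] + b) - V[0] * (1.0 + i)
```

Under these assumptions the provision at any time t coincides with b times the annuity value at the attained age, and it runs down to zero at the end of the truncated table.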
To be more realistic, we consider a (financial) profit participation mech-
anism. We denote as b0 the amount of the benefit set at policy issue (so,
The splitting of the variation of the reserve in a year is sketched in Fig. 1.4.
We now address item (b) in the list at the beginning of this Section. As
was discussed in Section 1.2.1, the annuitant may decide not to use S to
buy a life annuity, but simply to invest it and receive the post-retirement
income via a sequence of withdrawals (set at her/his choice). Suppose that
the fund is credited each year with annual interest at the rate g. Further
assume that the annuitant withdraws from the fund a sequence of amounts
7.7 Financing post-retirement income 359
Figure 7.25. Annual withdrawal (panel (a)) and annual investment yield (panel (b)) as a function
of the time to fund exhaustion. [Plots omitted. Panel (a): share α against time to exhaustion m; panel (b): rate g against time to exhaustion m.]
extra return required in each year for this purpose has been called the mor-
tality drag. However, it is worth stressing that a fixed drawdown sequence
leads in any case to wealth exhaustion in a given number of years (possibly
the maximum residual lifetime), whatever the interest rate may be, as was
depicted in Fig. 7.25, panel (b).
Example 7.12 Under the assumptions adopted in Example 7.11 for the life
annuity, Fig. 7.26 plots the extra yield required on individual investments
in each of the k years of delay to compensate for the loss of mutuality. Trivially,
the higher is k, the higher is the required extra yield. Given that the extra
yield must be realized in each of the k years of delay, this target may be
very difficult to reach when the annuitization is planned for a distant time
in the future.
[Figure 7.26. Extra investment yield and life annuity yield, plotted against the delay period k.]
we get
$$F_t = S\,(1+g)^t - \sum_{h=1}^{t} b_h\,(1+g)^{t-h} \tag{7.108}$$
Let $g_k$ be the rate g such that $F_k = V_k$ for a given k. The rate $g_k$ is therefore
defined by the following relation:
$$S\,(1+g_k)^k - \sum_{h=1}^{k} b_h\,(1+g_k)^{k-h} = V_k \tag{7.109}$$
Note that Fig. 7.26 actually plots the rate gk for several choices of k.
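From (7.100), we can express the annual benefit at time t as
$$b_t = b\,(1+r)^t \tag{7.110}$$
Substituting (7.110) into (7.109), with $S = b\,a_{x_0}$ and $V_k = b\,(1+r)^k a_{x_0+k}$, gives
$$b\,a_{x_0}(1+g_k)^k - \sum_{h=1}^{k} b\,(1+r)^h (1+g_k)^{k-h} = b\,(1+r)^k a_{x_0+k} \tag{7.111}$$
or equivalently
$$a_{x_0}(1+g_k)^k - \frac{1+r}{g_k-r}\,(1+g_k)^k + \frac{(1+r)^{k+1}}{g_k-r} = (1+r)^k a_{x_0+k} \tag{7.112}$$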
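Relation (7.111) has no closed-form solution for the rate, but it can be solved numerically. The Python sketch below does so by bisection, per unit of initial benefit (b = 1). The survival probabilities, the technical rate, the choice r = 0 and all function names are assumptions made for illustration; the fund is built year by year as in (7.108).

```python
def annuity_value(p, i):
    """a_x = sum over h >= 1 of (h-year survival probability) * (1+i)^-h."""
    total, survival = 0.0, 1.0
    for h, p_h in enumerate(p, start=1):
        survival *= p_h
        total += survival / (1.0 + i) ** h
    return total


def fund(k, S, b, r, g):
    """F_k from (7.108) with b_h = b (1+r)^h, accumulated year by year."""
    F = S
    for h in range(1, k + 1):
        F = F * (1.0 + g) - b * (1.0 + r) ** h
    return F


def required_yield(k, a_x0, a_x0_plus_k, r, lo=-0.5, hi=1.0, tol=1e-12):
    """Bisection for the rate g_k solving (7.111), per unit of initial benefit."""
    target = (1.0 + r) ** k * a_x0_plus_k
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if fund(k, a_x0, 1.0, r, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical one-year survival probabilities from the entry age onwards.
p = [1.0 - min(0.999, 0.01 * 1.1 ** t) for t in range(55)]
i, r, k = 0.03, 0.0, 10
a_x0 = annuity_value(p, i)
a_x0k = annuity_value(p[k:], i)
g_k = required_yield(k, a_x0, a_x0k, r)
extra_yield = g_k - i            # yield needed on top of the technical rate
```

Repeating the calculation for several values of k gives a curve of the kind plotted in Fig. 7.26, although the numbers depend entirely on the hypothetical inputs used here.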
– in the case of death before time k, the fund available constitutes a bequest
(which is not provided by a life annuity purchased at time 0, because of
the implicit mortality cross-subsidy);
– more flexibility is gained, as the annuitant may change the annual income
modifying the drawdown sequence (with a possible change in the fund
available at time k).
[Figure; labels: contributions (before retirement); annuity purchase; non-annuitized fund, with interests and income drawdown (after retirement); annuitized fund, with interests, mortality and annuity payment (after retirement).]
[Figure; labels: fund; income drawdown; life annuity purchase; life annuity payment.]
Figure 7.30. Arrangements: (1) deferred life annuity; (2) income drawdown. [Plot omitted: annuitization ratio (0% to 100%) against time, over the accumulation period and the post-retirement period.]
[Figure: immediate life annuity; annuitization ratio (0% to 100%) against time, over the accumulation period and the post-retirement period.]
[Figure: combined annuities; same axes.]
[Figure: staggered annuitization; same axes.]
The framework proposed above clearly shows the wide range of choices
leading to different annuitization strategies. Hence, suitable investment and
life annuity products can be designed to meet the different needs and
preferences of the clients. An example in this regard is given by the solutions
providing natural hedging across time (Section 7.3.2), such as the money-back
annuity with death benefit (7.32), which is designed so that at some
future time the death benefit reduces to zero. We note that, as long as the
death benefit is positive, the fund can be regarded as only partially
annuitized. As soon as the death benefit reduces to zero, the fund turns out
to be fully annuitized. Thus, an annuitization strategy is embedded in the
structure of money-back annuities.
also addressing financial risk for life annuity portfolios. In the Lee–Carter
framework, given that the future path of the time index is unknown and
modelled as a stochastic process, the policyholders’ lifetimes become depen-
dent on each other. Consequently, systematic risk is involved. Denuit and
Frostig (2007a) study this aspect of the Lee–Carter model, in particular
considering solvency issues. Denuit and Frostig (2007b) further study the
distribution of the present value of benefits in a run-off perspective. As
the exact distribution turns out to be difficult to compute, various approx-
imations and bounds are derived. Denuit (2008) summarizes the results
obtained in this field.
The literature on risk management in industry and business in general
is very extensive. For an introduction to the relevant topics the reader can
refer, for example, to Harrington and Niehaus (1999), and to Williams,
Smith and Young (1998). Various textbooks address specific phases of
the risk management process. For example, Koller (1999) focuses on the
risk assessment in the risk management process for business and industry,
whereas Wilkinson Tiller, Blinn and Kelly (1990) deal with the topic of risk
financing. Pitacco (2007) addresses mortality and longevity risk within a
risk management perspective.
Several investigations have been performed with regard to natural hedg-
ing. As far as portfolio diversification effects are concerned, the reader may
refer to Cox and Lin (2007), where the results of an empirical investigation
concerning the US market are discussed. With regard to arrangements on
a per-policy basis, some possible designs referring to pension schemes with
combined benefits are discussed in Biffis and Olivieri (2002). Gründl et al.
(2006) analyse natural hedging from the perspective of the maximization
of shareholder value and show, under proper assumptions, that natural
hedging may not be optimal in this regard.
Solvency investigations in portfolios of life annuities are dealt with by
Olivieri and Pitacco (2003). Solvency issues within a Lee–Carter framework
are discussed by Denuit and Frostig (2007a). A review of solvency systems
is provided by Sandström (2006); when the longevity risk is addressed, typ-
ically the required capital in this respect is set as a share of the technical
provision. The most recent regulatory system is provided by the evolving
Solvency 2 system, where the required capital is the change expected in the
net asset value in case of a permanent shock in survival rates; see, for exam-
ple, CEIOPS (2007) and CEIOPS (2008). The idea of assessing the required
capital by comparing assets to the random value of future payments, exam-
ined in Section 7.3.3, has been put forward, for the life business in general,
by Faculty of Actuaries Working Party (1986).
7.8 References and suggestions for further reading 371
Pensions (2002) in the United Kingdom, and the Retirement Choice Work-
ing Party (2001). The paper by Wadsworth et al. (2001) suggests a technical
structure for a fund providing annuities. A comprehensive description of
several annuities markets is provided by Cardinale et al. (2002). Piggot
et al. (2005) describe Group Self-Annuitization schemes, which provide an
example of flexible GAR; however, the benefit in this case is not guaranteed.
Money-back annuities in the United Kingdom represent an interesting
annuitization strategy; see Boardman (2006). Income drawdown issues within
the context of defined contribution pension plans are discussed by Emms
and Haberman (2008) and Gerrard et al. (2006). An extensive presentation of
issues concerning financing the post-retirement income is given by Milevsky
(2006). An informal description of private solutions is provided by Swiss
Re (2007).
The reader interested in the impact of longevity risk on living benefits
other than life annuities can refer, for example, to Olivieri and Ferri (2003),
Olivieri and Pitacco (2002c), and Olivieri and Pitacco (2002b). See also Pitacco
(2004b), where both life insurance and other living benefits are considered.
References
Booth, P., Chadburn, R., Haberman, S., James, D., Khorasanee, Z., Plumb,
R. and Rickayzen, B. (2005). Modern actuarial theory and practice. Boca
Raton: Chapman & Hall/CRC.
Bourgeois-Pichat, J. (1952). Essai sur la mortalité “biologique” de
l’homme. Population, 7(3), 381–394.
Bowers, N. L., Gerber, H. U., Hickman, J. C., Jones, D. A., and Nes-
bitt, C. J. (1997). Actuarial mathematics. The Society of Actuaries,
Schaumburg, Illinois.
Boyle, P. and Hardy, M. (2003). Guaranteed annuity options. ASTIN
Bulletin, 33, 125–152.
Brass, W. (1974). Mortality models and their uses in demography.
Transactions of the Faculty of Actuaries, 33, 123–132.
Brillinger, D. R. (1986). The natural variability of vital rates and associated
statistics. Biometrics, 42, 693–734.
Brouhns, N. and Denuit, M. (2002). Risque de longévité et rentes viagères.
II. Tables de mortalité prospectives pour la population belge. Belgian
Actuarial Bulletin, 2, 49–63.
Brouhns, N., Denuit, M., and Keilegom, van, I. (2005). Bootstrapping
the Poisson log-bilinear model for mortality forecasting. Scandinavian
Actuarial Journal, (3), 212–224.
Brouhns, N., Denuit, M., and Vermunt, J. K. (2002a). Measuring the
longevity risk in mortality projections. Bulletin of the Swiss Association
of Actuaries, 2, 105–130.
Brouhns, N., Denuit, M., and Vermunt, J. K. (2002b). A Poisson log-
bilinear approach to the construction of projected lifetables. Insurance:
Mathematics & Economics, 31(3), 373–393.
Buettner, T. (2002). Approaches and experiences in projecting mortality
patterns for the oldest-old. North American Actuarial Journal, 6(3),
14–25.
Butt, Z. and Haberman, S. (2002). Application of frailty-based mortality
models to insurance data. Actuarial Research Paper No. 142, Dept. of
Actuarial Science and Statistics, City University, London.
Butt, Z. and Haberman, S. (2004). Application of frailty-based mortality
models using generalized linear models. ASTIN Bulletin, 34(1), 175–
197.
Buus, H. (1960). Investigations on mortality variations. In Transactions
of the 16th International Congress of Actuaries, Volume 2, Bruxelles,
pp. 364–378.
Cairns, A. J. G., Blake, D., and Dowd, K. (2006a). Pricing death: frame-
works for the valuation and securitization of mortality risk. ASTIN
Bulletin, 36(1), 79–120.
Cairns, A. J. G., Blake, D., and Dowd, K. (2006b). A two-factor model for
stochastic mortality with parameter uncertainty: theory and calibration.
The Journal of Risk and Insurance, 73(4), 687–718.
Cairns, A., Blake, D., Dowd, K., Coughlan, G., Epstein, D., Ong, A.
and Balevich, I. (2007). A quantitative comparison of stochastic
mortality models using data from England and Wales and the United States.
Pensions Institute Discussion Paper PI-0701, Cass Business School, City
University.
Cardinale, M., Findlater, A., and Orszag, M. (2002). Paying out pensions.
A review of international annuities markets. Research report, Watson
Wyatt.
Carter, L. and Lee, R. D. (1992). Modelling and forecasting US sex
differentials in mortality. International Journal of Forecasting, 8,
393–411.
Carter, L. R. (1996). Forecasting U.S. mortality: a comparison of
Box–Jenkins ARIMA and structural time series models. The Sociological
Quarterly, 37(1), 127–144.
Catalano, R. and Bruckner, T. (2006). Child mortality and cohort lifespan:
a test of diminished entelechy. International Journal of Epidemiology,
35, 1264–1269.
CEIOPS (2007). QIS3. Technical specifications. Part I: Instructions.
CEIOPS (2008). QIS4. Technical specifications.
Champion, R., Lenard, C. T., and Mills, T. M. (2004). Splines. In Ency-
clopedia of actuarial science (ed. J. L. Teugels and B. Sundt), Volume 3,
pp. 1584–1586. John Wiley & Sons.
CMI (2002). An interim basis for adjusting the “92” series mortality pro-
jections for cohort effects. Working Paper 1, The Faculty of Actuaries
and Institute of Actuaries.
CMI (2005). Projecting future mortality: towards a proposal for a stochas-
tic methodology. Working paper 15, The Faculty of Actuaries and
Institute of Actuaries.
CMI (2006). Stochastic projection methodologies: Further progress and
P-spline model features, example results and implications. Working
Paper 20, The Faculty of Actuaries and Institute of Actuaries.
CMIB (1978). Report no. 3. Continuous Mortality Investigation Bureau,
Institute of Actuaries and Faculty of Actuaries.
CMIB (1990). Report no. 10. Continuous Mortality Investigation Bureau,
Institute of Actuaries and Faculty of Actuaries.
CMIB (1999). Report no. 17. Continuous Mortality Investigation Bureau,
Institute of Actuaries and Faculty of Actuaries.
Coale, A. and Kisker, E. E. (1990). Defects in data on old age mortal-
ity in the United States: new procedures for calculating approximately
accurate mortality schedules and life tables at the highest ages. Asian
and Pacific Population Forum, 4, 1–31.
Congdon, P. (1993). Statistical graduation in local demographic anal-
ysis and projection. Journal of the Royal Statistical Society, A, 156,
237–270.
Coppola, M., Di Lorenzo, E., and Sibillo, M. (2000). Risk sources in a
life annuity portfolio: decomposition and measurement tools. Journal
of Actuarial Practice, 8(1–2), 43–61.
Cossette, H., Delwarde, A., Denuit, M., Guillot, F., and Marceau, E.
(2007). Pension plan valuation and dynamic mortality tables. North
American Actuarial Journal, 11, 1–34.
Cowley, A. and Cummins, J. D. (2005). Securitization of life insurance
assets and liabilities. The Journal of Risk and Insurance, 72(2), 193–
226.
Cox, S. H., Fairchild, J. R., and Pedersen, H. W. (2000). Economic aspects
of securitization of risk. ASTIN Bulletin, 30(1), 157–193.
Cox, S. H. and Lin, Y. (2007). Natural hedging of life and annuity
mortality risks. North American Actuarial Journal, 11, 1–15.
Cramér, H. and Wold, H. (1935). Mortality variations in Sweden: a
study in graduation and forecasting. Skandinavisk Aktuarietidskrift, 18,
161–241.
Crimmins, E. and Finch, C. (2006). Infection, inflammation, height
and longevity. Proceedings of the National Academy of Sciences, 103,
498–503.
Cummins, J. D., Smith, B. D., Vance, R. N., and VanDerhei, J. L. (1983).
Risk classification in life insurance. Kluwer-Nijhoff Publishing, Boston,
The Hague, London.
Czado, C., Delwarde, A., and Denuit, M. (2005). Bayesian Poisson
log-bilinear mortality projections. Insurance: Mathematics & Eco-
nomics, 36(3), 260–284.
Dahl, M. (2004). Stochastic mortality in life insurance. Market reserves
and mortality-linked insurance contracts. Insurance: Mathematics &
Economics, 35(1), 113–136.
Dahl, M. and Møller, T. (2006). Valuation and hedging of life insur-
ance liabilities with systematic mortality risk. Insurance: Mathematics
& Economics, 39(2), 193–217.
Davidson, A. R. and Reid, A. R. (1927). On the calculation of rates of
mortality. Transactions of the Faculty of Actuaries, 11(105), 183–232.
Davey Smith, G., Hart, C., Blane, D., and Hole, D. (1998). Adverse
socioeconomic conditions in childhood and cause specific adult mortality: a
prospective observational study. British Medical Journal, 316,
1631–1635.
Wilkinson Tiller, M., Blinn, J. D., and Kelly, J. J. (1990). Essentials of risk
financing. Insurance Institute of America.
Willets, R. C. (2004). The cohort effect: insights and explanations. British
Actuarial Journal, 10, 833–877.
Williams, JR. C. A., Smith, M. L., and Young, P. C. (1998). Risk
management and insurance. Irwin/McGraw-Hill.
Wilmoth, J. R. (1993). Computational methods for fitting and extrapolat-
ing the Lee–Carter model of mortality change. Technical report.
Wilmoth, J. R. (2000). Demography of longevity: Past, present, and future
trends. Journal of Experimental Gerontology, 35, 1111–1129.
Wilmoth, J. R. and Horiuchi, S. (1999). Rectangularization revisited: vari-
ability of age at death within human populations. Demography, 36(4),
475–495.
Wong-Fupuy, C. and Haberman, S. (2004). Projecting mortality trends:
recent developments in the United Kingdom and the United States.
North American Actuarial Journal, 8, 56–83.
Yaari, M. E. (1965). Uncertain lifetime, life insurance, and the theory of
the consumer. Review of Economic Studies, 32(2), 137–150.
Yashin, A. I. and Iachine, I. A. (1997). How frailty models can be used
for evaluating longevity limits: Taking advantage of an interdisciplinary
approach. Demography, 34, 31–48.
Index