Modelling Longevity Dynamics for

Pensions and Annuity Business


MATHEMATICS TEXTS FROM OXFORD UNIVERSITY PRESS
David Acheson: From Calculus to Chaos: An introduction to dynamics
Norman L. Biggs: Discrete Mathematics, second edition
Bisseling: Parallel Scientific Computation
Cameron: Introduction to Algebra
A.W. Chatters and C.R. Hajarnavis: An Introductory Course in
Commutative Algebra
René Cori and Daniel Lascar: Mathematical Logic: A Course with
Exercises, Part 1
René Cori and Daniel Lascar: Mathematical Logic: A Course with
Exercises, Part 2
Davidson: Turbulence
D’Inverno: Introducing Einstein’s Relativity
Garthwaite, Jolliffe, and Jones: Statistical Inference
Geoffrey Grimmett and Dominic Welsh: Probability: An Introduction
G.R. Grimmett and D.R. Stirzaker: Probability and Random Processes,
third edition
G.R. Grimmett and D.R. Stirzaker: One Thousand Exercises in Probability,
second edition
G.H. Hardy and E.M. Wright: An Introduction to the Theory of Numbers
John Heilbron: Geometry Civilized
Hilborn: Chaos and Nonlinear Dynamics
Raymond Hill: A First Course in Coding Theory
D.W. Jordan and P. Smith: Nonlinear Ordinary Differential Equations
Richard Kaye and Robert Wilson: Linear Algebra
J.K. Lindsey: Introduction to Applied Statistics: A modelling approach,
second edition
Mary Lunn: A First Course in Mechanics
Jiří Matoušek and Jaroslav Nešetřil: Invitation to Discrete Mathematics
Tristan Needham: Visual Complex Analysis
John Ockendon, Sam Howison, Andrew Lacey, and Alexander Movchan: Applied Partial Differential Equations
H.A. Priestley: Introduction to Complex Analysis, second edition
H.A. Priestley: Introduction to Integration
Roe: Elementary Geometry
Ian Stewart and David Tall: The Foundations of Mathematics
W.A. Sutherland: Introduction to Metric and Topological Spaces
Dominic Welsh: Codes and Cryptography
Robert A. Wilson: Graphs, Colourings and the Four Colour Theorem
Adrian F. Tuck: Atmospheric Turbulence
André Nies: Computability and Randomness
Pitacco, Denuit, Haberman, and Olivieri: Modelling Longevity Dynamics
for Pensions and Annuity Business

Modelling Longevity
Dynamics for Pensions
and Annuity Business

Ermanno Pitacco
University of Trieste (Italy)

Michel Denuit
UCL, Louvain-la-Neuve (Belgium)

Steven Haberman
City University, London (UK)

Annamaria Olivieri
University of Parma (Italy)

Great Clarendon Street, Oxford OX2 6DP
Oxford University Press is a department of the University of Oxford.
It furthers the University’s objective of excellence in research, scholarship,
and education by publishing worldwide in
Oxford New York
Auckland Cape Town Dar es Salaam Hong Kong Karachi
Kuala Lumpur Madrid Melbourne Mexico City Nairobi
New Delhi Shanghai Taipei Toronto
With offices in
Argentina Austria Brazil Chile Czech Republic France Greece
Guatemala Hungary Italy Japan Poland Portugal Singapore
South Korea Switzerland Thailand Turkey Ukraine Vietnam
Oxford is a registered trade mark of Oxford University Press
in the UK and in certain other countries
Published in the United States
by Oxford University Press Inc., New York
© Ermanno Pitacco, Michel Denuit, Steven Haberman, and Annamaria Olivieri 2009
The moral rights of the authors have been asserted
Database right Oxford University Press (maker)
First published 2009
All rights reserved. No part of this publication may be reproduced,
stored in a retrieval system, or transmitted, in any form or by any means,
without the prior permission in writing of Oxford University Press,
or as expressly permitted by law, or under terms agreed with the appropriate
reprographics rights organization. Enquiries concerning reproduction
outside the scope of the above should be sent to the Rights Department,
Oxford University Press, at the address above
You must not circulate this book in any other binding or cover
and you must impose the same condition on any acquirer
British Library Cataloguing in Publication Data
Data available
Library of Congress Cataloging in Publication Data
Data available
Typeset by Newgen Imaging Systems (P) Ltd., Chennai, India
Printed in Great Britain
on acid-free paper by
CPI Antony Rowe, Chippenham, Wiltshire

ISBN 978–0–19–954727–2

10 9 8 7 6 5 4 3 2 1
Preface

Actuarial science effectively began with the bringing together of compound
interest and life tables, some of which had been derived from observed
mortality rates. One of the first categories of financial problems that early
actuaries tackled was the calculation of annuity values. Thus, the subject
matter of this book can be traced back to the beginnings of the discipline
of actuarial science in the mid-17th century. At this time, states and cities
often raised money for public purposes by the sale of life annuities to their
citizens. One of the first to write on this subject was Jan de Witt, who was
the Prime Minister of the States of Holland, and who demonstrated in 1671
how to calculate the value of annuities using a constant rate of interest and
a hypothetical life table (that was piecewise linear). Another early investi-
gation of the calculation method for annuity values is the seminal work of
Edmund Halley in 1693, which uses population mortality rates.
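In modern terms, the quantity that de Witt and Halley were computing is the expected present value of a unit life annuity. As a sketch, using standard actuarial notation (the symbols below are standard conventions, not taken from this preface):

```latex
% Expected present value of a whole life annuity-immediate sold to a life aged x:
%   v = (1+i)^{-1} is the discount factor at the constant interest rate i,
%   {}_t p_x is the probability (from the life table) that a life aged x
%   survives at least t years, and \omega is the limiting age of the table.
a_x \;=\; \sum_{t=1}^{\omega - x} v^{t} \, {}_{t}p_{x}
```

De Witt's piecewise linear hypothetical table and Halley's population table each supply the survival probabilities \({}_{t}p_{x}\) entering this sum.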
From an overview of this early history, we can identify two key devel-
opments that feature in this book. First, there was the recognition of the
importance of using life tables that were not hypothetical and were not
based on general population data but rather were based on observed mor-
tality data from registers of annuitants. This came in the mid-18th century
– through the work of Nicholas Struyck in 1740, William Kersseboom in
1742, and Antoine Deparcieux in 1746 (Haberman and Sibbett, 1995). In
modern terminology, we would present this in the context of adverse selec-
tion among the holders of life annuities – the tendency for purchasers of
life annuities to live longer than the general population. It was the work of
John Finlaison (the UK’s first Government Actuary) in 1829, which cogently
demonstrated the financial problems that could emerge from overlooking
this fundamental phenomenon. During the first two decades of the 19th
century, the British Government had been selling annuities in an attempt to
reduce the National Debt. The annuities were priced using mortality rates
from a contemporary population-based life table, which failed to allow for
the adverse selection effect and hence the mortality rates were too high for
a portfolio of annuitants. Finlaison uncovered this problem and showed
that the annuities were being sold at a loss. He identified the problem in
1819 and then produced scientific evidence based on a painstaking analysis
of a range of data sets that led to recommendations that were accepted by
the British Government in 1829. Thus, subsequently, Government annuities
were sold on a sound basis in line with Finlaison’s recommendations.
It is noteworthy that issues connected with mortality, annuities and
adverse selection are common features of western 19th century literature.
As pointed out by Skwire (1997), the novels of Jane Austen are a particu-
larly rich source of actuarial references. In the words of Fanny Dashwood
in Sense and Sensibility, ‘people always live for ever when there is any annu-
ity to be paid them; . . . An annuity is very serious business and there is no
getting rid of it’.
The second development related to the understanding that, in the pres-
ence of a downward secular trend in mortality rates, mortality tables for
application to annuities should include some allowance for the expected
future improvements in mortality rates in order to protect the seller of
annuities against future loss. The first tables to make such an allowance
were those produced in the United Kingdom based on insurance company
data covering the period 1900–1920.
Thus, we see that this book is closely related to fundamental practical
problems that actuaries have been trying to address for some years. But the
more immediate history of this book can be traced to the research that has
been carried out by the four authors and to two Summer Schools on which
the authors taught and which were organized by the Groupe Consultatif
Actuariel Européen (i.e. Consultative Group of Actuarial Organizations in
the European Union) in 2005 at the MIB School of Management of Trieste
and in 2006 at the University of Parma. The presentations at the summer
schools were centered on disseminating the results of this research work
in a manner that was accessible to both practitioner and academic audi-
ences. The book is specifically aimed at final year undergraduate students,
MSc students, research students preparing for an MPhil or PhD degree and
practising actuaries (as part of their continuing professional development).
This book deals with some very important topics in the field of actu-
arial mathematics and life insurance techniques. These concern mortality
improvements, the uncertainty in future mortality trends and their rele-
vant impact on life annuities and pension plans. In particular, we consider
the actuarial calculations concerning pensions and life annuities. As we
have noted above, the insurance company (or the pension plan) must
adopt an appropriate forecast of future mortality in order to avoid under-
estimation of the related liabilities. These concepts and models could
be extended to apply to other living benefits, which are provided, for
example, by long-term care insurance products and whole life sickness
covers.

Considerable attention is currently being devoted in actuarial work to the
management of life annuity portfolios, both from a theoretical and a prac-
tical point of view, because of the growing importance of annuity benefits
paid by private pension schemes. In particular, the progressive shift in many
countries from defined benefit to defined contribution pension schemes has
increased the interest in life annuities with a guaranteed annual amount.
This book aims at providing a comprehensive and detailed description
of methods for projecting mortality, and an extensive introduction to some
important issues concerning longevity risk in the area of life annuities and
pension benefits. The following topics are dealt with: life annuities in the
framework of post-retirement income strategies; the basic mortality model;
recent mortality trends; general features of projection models; a discussion
of stochastic projection models, with numerical illustrations; and measuring
and managing longevity risk.
Chapter 1 has an introductory role, and aims to present the basic structure
of life annuity products. Starting from the simple model of the annuity-
certain, typical features of life annuity products are presented. From an
actuarial point of view, the presentation progressively shifts from the tradi-
tional deterministic models to the more modern stochastic models. With an
appropriate stochastic approach, we are able to capture the riskiness inher-
ent in a life annuity portfolio and in particular the risks that arise from
random mortality. Cross-subsidy mechanisms which may operate in life
annuity portfolios and pension plans are then described. Our presentation
of the actuarial structure of life annuities focuses on a very simple annu-
ity model, namely the immediate life annuity. So, problems arising in the
so-called accumulation phase (as well as problems regarding the annuitiza-
tion of the accumulated amount) are initially disregarded. The chapter then
provides a comprehensive description of a number of life annuity models;
actuarial aspects are only briefly mentioned, in favour of more practical
issues, with the objective, in particular, of paving the way for the subsequent
formal presentation.
Some elements of the basic mortality model underlying life insurance,
life annuities, and pensions are introduced in Chapter 1, while present-
ing the structure of life annuities. In Chapter 2, the mortality model is
described in more depth, by adopting a more structured presentation of
the fundamental ideas. At the same time we introduce some new concepts.
In particular, an age-continuous framework is defined, in order to provide
some tools needed when dealing with mortality projection models. Indices
summarizing the probability distribution of the lifetime are described, and
parametric models (often called mortality ‘laws’ in the literature) are pre-
sented. Transforms of the survival function are briefly addressed. We also
consider two further topics that are of great importance in the context of
life annuities and mortality forecasts but which are less traditional as far as
actuarial books are concerned. These are mortality at the very old ages (i.e.
the problem of ‘closing’ the life table) and the concept of ‘frailty’ as a tool
to represent heterogeneity in populations due to unobservable risk factors.
Chapter 3 considers mortality trends during the past century. The well-
known background is that average human life span has roughly tripled over
the course of human history. Compared to all of the previous centuries, the
20th century has been characterized by a huge increase in average longevity.
As we demonstrate in several chapters, there is no evidence which shows
that improvements in longevity are tending to slow down. This chapter
aims to illustrate the observed decline in mortality over the 20th century,
on the basis of Belgian mortality statistics, using several of the mortality
indices that have been introduced in Chapters 1 and 2. We also illus-
trate the trends in mortality indices for insurance data from the Belgian
insurance market, which have been provided by the Banking, Finance and
Insurance Commission (in Brussels). We note the key point that emerges
from actuarial history that, in order to protect an insurance company from
mortality improvements, actuaries need to resort to life tables incorporat-
ing a forecast of the future trends of mortality rates (the so-called projected
tables). The building of these projected life tables is the main topic of the
next chapters.
Chapter 4 aims at describing the various methods that have been pro-
posed by actuaries and demographers for projecting mortality. Many of
these have been used in an actuarial context, in particular for pricing and
reserving in relation to life annuity products and pension products and
plans, and in the demographic field, mainly for population projections. First,
the idea of a ‘dynamic’ approach to mortality modelling is introduced. Then,
projection methods are presented and our starting point is the extrapolation
procedures which are still widely used in current actuarial practice. More
complex methods follow, in particular those methods based on mortality
laws, on model tables, and on relations between life tables. The Lee–Carter
method, which has been recently proposed, and some relevant extensions
are briefly introduced (while a more detailed discussion, together with var-
ious examples of its implementation, is presented in Chapters 5 and 6). The
presentation is thematic rather than following a strict chronological order.
In order to obtain an insight into the historical evolution of mortality fore-
casts, the reader can refer to the final section of this chapter, in which some
landmarks in the history of dynamic mortality modelling are identified.
There is a variety of statistical models used for mortality projection, rang-
ing from the basic regression models, in which age and time are viewed
as continuous covariates, to sophisticated robust non-parametric models.
In Chapter 5, we adopt the age-period framework and we first consider
the Lee–Carter log-bilinear projection model. The key difference from the
classical generalized linear regression approach centers on the interpretation
of time: in the log-bilinear approach, time is modelled as a factor, whereas
in the generalized linear regression approach it is modelled as a known
covariate. In addition to the Lee–Carter model, we also consider the
alternative Cairns–Blake–Dowd mortality forecasting method. Compared with
the Lee–Carter approach, the Cairns–Blake–Dowd model includes two time
factors. This allows the model to capture the imperfect correlation in mor-
tality rates at different ages from one year to the next. This approach can
also be seen as a compromise between the generalized regression approach
and the Lee–Carter views of mortality modelling, in that age enters the
Cairns–Blake–Dowd model as a continuous covariate whereas the effect
of calendar time is captured by two factors (time-varying intercept and
slope parameters). These two approaches are applied to Belgian mortality
statistics and the results are interpreted.
In Chapter 6, our aim is to extend the mortality models described in
Chapter 5 in order to incorporate cohort effects as well as age and period
effects. The cohort effect is a prominent feature of mortality trends in sev-
eral developed countries including the United Kingdom, the United States,
Germany, and Japan. It relates to the favourable mortality experience that
has been observed for those born during the decades between the two world
wars. Given that this is a significant feature of past experience, it is neces-
sary first to be able to model and then to forecast its impact on future
mortality trends. First, we discuss the evidence for the cohort effect, with
particular reference to the United Kingdom. The age–period–cohort version
of the Lee–Carter model is then introduced, along with a discussion of the
error structure, model fitting, and forecasting. A detailed case study is then
presented involving historic data from England and Wales. The cohort ver-
sions of the Cairns–Blake–Dowd and P-splines models are also presented
and their principal features are reviewed.
In Chapter 7, we deal with the mortality risks borne by an annuity
provider, and in particular with the longevity risk which originates from the
uncertain evolution of mortality at adult and old ages. First, we describe pos-
sible approaches to a stochastic representation of mortality, as is required
when modelling longevity risk. Then, an analysis of the impact of longevity
risk on the risk profile of the provider of immediate life annuities is devel-
oped. Taking a risk management perspective, possible solutions for risk
mitigation are then examined. Risk transfers as well as capital requirements
for the risk retained are discussed. As far as the latter are concerned, some
rules which could be implemented within internal models are tested and a
comparison is also developed with the requirement for longevity risk set by
Solvency 2, in its current state of development. With regard to risk trans-
fers, particular attention is devoted to capital market solutions, that is, to
longevity bonds. The possible design of reinsurance arrangements is exam-
ined in connection with the hedging opportunities arising from some of
these capital market solutions. The main issues concerning policy design
and the pricing of longevity risk are sketched. The possible behaviour of
the annuitant with respect to the planning of her/his retirement income,
which should be carefully considered in order to choose an appropriate
design of life annuity products, is also examined.
Our approach to writing this book has been to allocate prime responsi-
bility for each chapter to one or two authors and then for us all to provide
comments and input. Thus, Chapters 1 and 4 were written by Ermanno
Pitacco; Chapter 2 by Ermanno Pitacco and Annamaria Olivieri jointly;
Chapters 3 and 5 by Michel Denuit; Chapter 6 by Steven Haberman; and
Chapter 7 by Annamaria Olivieri. We would like to add that a book like
this will never be the result of the inputs of just the authors. Thus, we each
would like to acknowledge the support that we have received from a range
of colleagues. First, we would each like to thank our respective institutions
for the stimulating environment that has enabled us to complete this project.
Michel Denuit would like to acknowledge the inputs by Natacha Brouhns
and Antoine Delwarde, who both worked on the topic of this book as
PhD students under his supervision at UCL. Andrew Cairns kindly pro-
vided detailed comments on an earlier version of Chapters 3 and 5, which
led to significant improvements, in particular with regard to mortality
projection models. Discussions and/or collaborations with many esteemed
colleagues helped to clarify the analysis of mortality and its consequences
for insurance risk management, including Enrico Biffis, Hélène Cossette,
Claudia Czado, Pierre Devolder, Jan Dhaene, Paul Eilers, Esther Frostig,
Anne-Cécile Goderniaux, Montserrat Guillen, Étienne Marceau, Christian
Partrat, Christian Robert, Jeroen Vermunt, and Jean-François Walhin. Luc
Kaiser, Actuary at the BFIC kindly supplied mortality data about the Bel-
gian life insurance market. Particular thanks go to all the participants
in the ‘Mortality’ task force of the Royal Society of Belgian Actuaries,
directed by Philippe Delfosse. Interesting discussions with the practising
actuaries involved also helped to clarify some issues. In that respect, Michel Denuit
would like to thank Pascal Schoenmaekers from Munich Re for stimulating
exchanges. Michel Denuit would like to stress his beneficial involvement in
the working party appointed by the Belgian federal government in order to
produce projected life tables for Belgium. Special thanks in this regard go to
Micheline Lambrecht and Benoît Paul from FPB. Also, Michel Denuit has
benefited from partnerships with (re)insurance companies, especially with
Daria Khachakidze and Laure Olié from SCOR, and with Lucie Taleyson
from AXA. The financial support of the Communauté française de Belgique
under contract ‘Projet d’Actions de Recherche Concertées’ ARC 04/09-320
and of Banque Nationale de Belgique under grant ‘Risk measures and
Economic capital’ is gratefully acknowledged.
Steven Haberman would like to express his deep gratitude to his long-
term research collaborator, Arthur Renshaw, for his contributions to their
joint work which has underpinned the ideas in Chapters 5 and 6 and for
stimulating discussions about mortality trends. He would also like to thank
his close colleague, Richard Verrall, for his contributions and advice on
modelling mortality, as well as their recent PhD students, Terry Sithole
and Marwa Khalaf-Allah, and their research assistant, Zoltan Butt, who
have all worked on the subject of mortality trends and their impact on
annuities and pensions. Steven Haberman would also like to thank Adrian
Gallop from the Government Actuary’s Department for providing mortality
data for England and Wales (by individual year of age and calendar year)
that facilitated the modelling of trends by cohort. The financial support,
provided through annual research grants, received from the Continuous
Mortality Investigation Bureau of the UK Actuarial Profession is gratefully
acknowledged.
Annamaria Olivieri and Ermanno Pitacco would like to thank Enrico
Biffis and Pietro Millossovich for stimulating exchanges and collaborations,
Patrizia Marocco and Fulvio Tomè from Assicurazioni Generali for interest-
ing discussions on various practical aspects of longevity, and Marco Vesentini
from Cattolica Assicurazioni, Verona, for providing useful material. The
financial support from the Italian Ministero dell’Università e della Ricerca is
gratefully acknowledged; thanks to the research project ‘Income protection
against longevity and health risks: financial, actuarial and economic analysis
of pension and health products. Market trends and perspectives’, coordi-
nated by Ermanno Pitacco, various stimulating meetings have been held.
Finally, special thanks go to all the participants of the Summer School
of the Groupe Consultatif Actuariel Europeen on the topic ‘Modelling
mortality dynamics for pensions and annuity business’ held twice in Italy
(Trieste, 2005; Parma, 2006). Their feedback and comments have been
very useful, and such Continuing Professional Development initiatives offer
the lecturers involved exciting opportunities for the merging of theoret-
ical approaches and practical issues, which we hope has been retained as
a theme in this book.
Contents

Preface v

1 Life annuities 1

1.1 Introduction 1
1.2 Annuities-certain versus life annuities 2
1.2.1 Withdrawing from a fund 2
1.2.2 Avoiding early fund exhaustion 5
1.2.3 Risks in annuities-certain and in life annuities 6
1.3 Evaluating life annuities: deterministic approach 8
1.3.1 The life annuity as a financial transaction 8
1.3.2 Actuarial values 9
1.3.3 Technical bases 12
1.4 Cross-subsidy in life annuities 14
1.4.1 Mutuality 14
1.4.2 Solidarity 16
1.4.3 ‘Tontine’ annuities 18
1.5 Evaluating life annuities: stochastic approach 20
1.5.1 The random present value of a life annuity 20
1.5.2 Focussing on portfolio results 21
1.5.3 A first insight into risk and solvency 24
1.5.4 Allowing for uncertainty in mortality
assumptions 27
1.6 Types of life annuities 31
1.6.1 Immediate annuities versus deferred annuities 31
1.6.2 The accumulation period 33
1.6.3 The decumulation period 36
1.6.4 The payment profile 38
1.6.5 About annuity rates 40
1.6.6 Variable annuities and GMxB features 41
1.7 References and suggestions for further reading 43

2 The basic mortality model 45

2.1 Introduction 45
2.2 Life tables 46
2.2.1 Cohort tables and period tables 46
2.2.2 ‘Population’ tables versus ‘market’ tables 47
2.2.3 The life table as a probabilistic model 48
2.2.4 Select mortality 49
2.3 Moving to an age-continuous context 51
2.3.1 The survival function 51
2.3.2 Other related functions 53
2.3.3 The force of mortality 55
2.3.4 The central death rate 57
2.3.5 Assumptions for non-integer ages 57
2.4 Summarizing the lifetime probability distribution 58
2.4.1 The life expectancy 59
2.4.2 Other markers 60
2.4.3 Markers under a dynamic perspective 62
2.5 Mortality laws 63
2.5.1 Laws for the force of mortality 64
2.5.2 Laws for the annual probability of death 66
2.5.3 Mortality by causes 67
2.6 Non-parametric graduation 67
2.6.1 Some preliminary ideas 67
2.6.2 The Whittaker–Henderson model 68
2.6.3 Splines 69
2.7 Some transforms of the survival function 73
2.8 Mortality at very old ages 74
2.8.1 Some preliminary ideas 74
2.8.2 Models for mortality at highest ages 75
2.9 Heterogeneity in mortality models 77
2.9.1 Observable heterogeneity factors 77
2.9.2 Models for differential mortality 78
2.9.3 Unobservable heterogeneity factors.
The frailty 80
2.9.4 Frailty models 83
2.9.5 Combining mortality laws with frailty models 85
2.10 References and suggestions for further reading 87

3 Mortality trends during the 20th century 89

3.1 Introduction 89
3.2 Data sources 90
3.2.1 Statistics Belgium 91
3.2.2 Federal Planning Bureau 91
3.2.3 Human mortality database 92
3.2.4 Banking, Finance, and Insurance Commission 92
3.3 Mortality trends in the general population 93
3.3.1 Age-period life tables 93
3.3.2 Exposure-to-risk 95
3.3.3 Death rates 96
3.3.4 Mortality surfaces 101
3.3.5 Closure of life tables 101
3.3.6 Rectangularization and expansion 105
3.3.7 Life expectancies 111
3.3.8 Variability 113
3.3.9 Heterogeneity 115
3.4 Life insurance market 116
3.4.1 Observed death rates 116
3.4.2 Smoothed death rates 118
3.4.3 Life expectancies 122
3.4.4 Relational models 123
3.4.5 Age shifts 127
3.5 Mortality trends throughout EU 129
3.6 Conclusions 135

4 Forecasting mortality: an introduction 137

4.1 Introduction 137
4.2 A dynamic approach to mortality modelling 139
4.2.1 Representing mortality dynamics: single-figures
versus age-specific functions 139
4.2.2 A discrete, age-specific setting 140
4.3 Projection by extrapolation of annual probabilities
of death 141
4.3.1 Some preliminary ideas 141
4.3.2 Reduction factors 144
4.3.3 The exponential formula 145
4.3.4 An alternative approach to the exponential
extrapolation 146
4.3.5 Generalizing the exponential formula 147
4.3.6 Implementing the exponential formula 148
4.3.7 A general exponential formula 149
4.3.8 Some exponential formulae used in
actuarial practice 149
4.3.9 Other projection formulae 151
4.4 Using a projected table 152
4.4.1 The cohort tables in a projected table 152
4.4.2 From a double-entry to a single-entry
projected table 153
4.4.3 Age shifting 155
4.5 Projecting mortality in a parametric context 156
4.5.1 Mortality laws and projections 156
4.5.2 Expressing mortality trends
via Weibull’s parameters 160
4.5.3 Some remarks 162
4.5.4 Mortality graduation over age and time 163
4.6 Other approaches to mortality projections 165
4.6.1 Interpolation versus extrapolation:
the limit table 165
4.6.2 Model tables 166
4.6.3 Projecting transforms of life table functions 167
4.7 The Lee–Carter method: an introduction 169
4.7.1 Some preliminary ideas 169
4.7.2 The LC model 171
4.7.3 From LC to the Poisson log-bilinear model 172
4.7.4 The LC method and model tables 173
4.8 Further issues 173
4.8.1 Cohort approach versus period approach.
APC models 173
4.8.2 Projections and scenarios. Mortality
by causes 175
4.9 References and suggestions for further reading 175
4.9.1 Landmarks in mortality projections 175
4.9.2 Further references 178
5 Forecasting mortality: applications and examples of age-period models 181

5.1 Introduction 181
5.2 Lee–Carter mortality projection model 186
5.2.1 Specification 186
5.2.2 Calibration 188
5.2.3 Application to Belgian mortality statistics 200
5.3 Cairns–Blake–Dowd mortality projection model 203
5.3.1 Specification 203
5.3.2 Calibration 206
5.3.3 Application to Belgian mortality statistics 207
5.4 Smoothing 209
5.4.1 Motivation 209
5.4.2 P-splines approach 210
5.4.3 Smoothing in the Lee–Carter model 212
5.4.4 Application to Belgian mortality statistics 213
5.5 Selection of an optimal calibration period 214
5.5.1 Motivation 214
5.5.2 Selection procedure 216
5.5.3 Application to Belgian mortality statistics 217
5.6 Analysis of residuals 218
5.6.1 Deviance and Pearson residuals 218
5.6.2 Application to Belgian mortality statistics 220
5.7 Mortality projection 221
5.7.1 Time series modelling for the time indices 221
5.7.2 Modelling of the Lee–Carter time index 223
5.7.3 Modelling the Cairns–Blake–Dowd time indices 228
5.8 Prediction intervals 229
5.8.1 Why bootstrapping? 229
5.8.2 Bootstrap percentiles confidence intervals 230
5.8.3 Application to Belgian mortality statistics 232
5.9 Forecasting life expectancies 234
5.9.1 Official projections performed by the Belgian
Federal Planning Bureau (FPB) 235
5.9.2 Andreev–Vaupel projections 235
5.9.3 Application to Belgian mortality statistics 237
5.9.4 Longevity fan charts 240
5.9.5 Back testing 240

6 Forecasting mortality: applications and examples of age-period-cohort models 243

6.1 Introduction 243
6.2 LC age–period–cohort mortality projection model 246
6.2.1 Model structure 246
6.2.2 Error structure and model fitting 248
6.2.3 Mortality rate projections 253
6.2.4 Discussion 253
6.3 Application to United Kingdom mortality data 254
6.4 Cairns–Blake–Dowd mortality projection model:
allowing for cohort effects 263
6.5 P-splines model: allowing for cohort effects 265

7 The longevity risk: actuarial perspectives 267

7.1 Introduction 267
7.2 The longevity risk 268
7.2.1 Mortality risks 268
7.2.2 Representing longevity risk: stochastic
modelling issues 270
7.2.3 Representing longevity risk: some examples 273
7.2.4 Measuring longevity risk in a static framework 276
7.3 Managing the longevity risk 293
7.3.1 A risk management perspective 293
7.3.2 Natural hedging 299
7.3.3 Solvency issues 303
7.3.4 Reinsurance arrangements 318
7.4 Alternative risk transfers 330
7.4.1 Life insurance securitization 330
7.4.2 Mortality-linked securities 332
7.4.3 Hedging life annuity liabilities through
longevity bonds 337
7.5 Life annuities and longevity risk 343
7.5.1 The location of mortality risks in traditional
life annuity products 343
7.5.2 GAO and GAR 346
7.5.3 Adding flexibility to GAR products 347
7.6 Allowing for longevity risk in pricing 350


7.7 Financing post-retirement income 354
7.7.1 Comparing life annuity prices 354
7.7.2 Life annuities versus income drawdown 356
7.7.3 The ‘mortality drag’ 359
7.7.4 Flexibility in financing post-retirement income 363
7.8 References and suggestions for further reading 369

References 373

Index 389
1 Life annuities

1.1 Introduction
Great attention is currently devoted to the management of life annuity port-
folios, both from a theoretical and a practical point of view, because of the
growing importance of annuity benefits paid by private pension schemes.
In particular, the progressive shift from defined benefit to defined contribu-
tion pension plans has increased the interest in life annuities, which are the
principal delivery mechanism of defined contribution pension plans.
Among the risks which affect life insurance and life annuity portfolios,
longevity risk deserves a deep and detailed investigation and requires the
adoption of proper management solutions. Longevity risk, which arises
from the random future trend in mortality at adult and old ages, is a rather
novel risk. Careful investigations are required to represent and measure it,
and to assess the relevant impact on the financial results of life annuity
portfolios and pension plans.
This book provides a comprehensive and detailed description of methods
for projecting mortality, and an extensive introduction to some important
issues concerning the longevity risk in the area of life annuities and pension
benefits.
Conversely, the present chapter mainly has an introductory role, aiming
at presenting the basic structure of life annuity products. Moving from
the simple model of the annuity-certain, typical features of life annuity
products are presented (Section 1.2). From an actuarial point of view, the
presentation progressively shifts from very traditional deterministic models
(Section 1.3) to more modern stochastic models (Section 1.5). An appropri-
ate stochastic approach allows us to capture the riskiness inherent in a life
annuity portfolio, and in particular the risks arising from random mortality.
Cross-subsidy mechanisms which work (or may work) in life annuity
portfolios and pension plans are described in Section 1.4.
The presentation of the actuarial structure of life annuities focusses on a
very simple annuity model, namely the immediate life annuity. So, problems
arising in the so-called accumulation phase (as well as problems regard-
ing the annuitization of the accumulated amount) are initially disregarded.
Conversely, in Section 1.6 a comprehensive description of a number of life
annuity models is provided. In this section, actuarial aspects are just men-
tioned, in favour of more practical issues aiming in particular at paving the
way for a following formal presentation.
A list of references and some suggestions for further readings conclude
the chapter (Section 1.7).

1.2 Annuities-certain versus life annuities

1.2.1 Withdrawing from a fund

Assume that the amount S is available at a given time, say at retirement, and
is used to build up a fund. Denote the retirement time with t = 0, and assume
that the year is the time unit. In order to get her/his post-retirement income,
the retiree withdraws from the fund at time t the amount bt (t = 1, 2, . . . ).
Suppose that the fund is managed by a financial institution which guarantees
a constant annual rate of interest i.
Denote with Ft the fund at time t, immediately after the payment of the
annual amount bt . Clearly:
Ft = Ft−1 (1 + i) − bt for t = 1, 2, . . . (1.1)
with F0 = S. Thus, the annual variation in the fund is given by
Ft − Ft−1 = Ft−1 i − bt for t = 1, 2, . . . (1.2)
Figure 1.1 illustrates the causes explaining the behaviour of the fund
throughout time, formally expressed by equation (1.2).
The behaviour of the fund throughout time obviously depends on the
sequence of withdrawals b1 , b2 , . . .. In particular, if for all t the annual
withdrawal is equal to the annual interest credited by the fund manager,
that is,
bt = Ft−1 i (1.3)
then, from (1.1) we immediately find
Ft = S (1.4)
Figure 1.1. Annual variation in the fund providing an annuity-certain.

for all t, whence a constant withdrawal

b = Si (1.5)

follows.
Conversely, if we assume a constant withdrawal b,

b > Si (1.6)

(as probably will be needed to obtain a reasonable post-retirement income)
the drawdown process will exhaust, sooner or later, the fund. Indeed, from
equation (1.2) we have

F0 > F1 > · · · > Ft > · · · (1.7)

and we can find a time m such that

Fm ≥ 0 and Fm+1 < 0 (1.8)

Clearly, the exhaustion time m depends on the annual amount b (and the
interest rate i as well), as it can be easily understood from equation (1.2).
The sequence of m constant annual withdrawals b (with m defined by
conditions (1.8), and possibly completed by the exhausting withdrawal at
time m + 1) constitutes an annuity-certain.
Example 1.1 Assume S = 1000. Figure 1.2 illustrates the behaviour of
the fund when i = 0.03 and for different annual amounts b. Conversely,
Fig. 1.3 shows the behaviour of the fund for various interest rates i, assuming
b = 100. 
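The recursions above are easy to implement. The following sketch (ours, not part of the text; function names are illustrative) computes the fund path of recursion (1.1) and the exhaustion time m defined by conditions (1.8), using the data of Example 1.1 with b = 100:

```python
def fund_path(S, i, b, horizon=50):
    """Fund values F_0, F_1, ..., with F_t = F_{t-1}(1 + i) - b, recursion (1.1)."""
    F = [S]
    for _ in range(horizon):
        F.append(F[-1] * (1 + i) - b)
    return F

def exhaustion_time(S, i, b, horizon=50):
    """The time m such that F_m >= 0 and F_{m+1} < 0, conditions (1.8)."""
    F = fund_path(S, i, b, horizon)
    for t, f in enumerate(F):
        if f < 0:
            return t - 1
    return None  # fund not exhausted within the horizon

m = exhaustion_time(1000, 0.03, 100)   # data of Example 1.1
```

With b = 100 one finds m = 12, while b = S i = 30 leaves the fund constant at S, in line with (1.4) and (1.5).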
Figure 1.2. The fund providing an annuity-certain (i = 0.03); curves for b = 50, 75, 100, 125.

Figure 1.3. The fund providing an annuity-certain (b = 100); curves for i = 0.02, 0.03, 0.04, 0.05.

It is interesting to compare the exhaustion time m with the remaining
lifetime of the retiree. Assume that her/his age at retirement is x, for exam-
ple x = 65. Of course the lifetime is a random variable. Denote with Tx
the remaining random life for a person age x. Let ω denote the maximum
attainable age (or limiting age), say ω = 110. Hence, Tx can take all val-
ues between 0 and ω − x. If Tx < m then the amount Fm is available as a
bequest. Conversely, if Tx > m there are ω − x − m years with no possibility
of withdrawal (and hence no income).
In practice, the annual amount b (for a given interest rate i) could be
chosen by comparing the related exhaustion time m with some quantity
which summarizes the remaining lifetime. For example, a synthetic value
is provided by the expected remaining lifetime E[Tx ]; another possibility
is given by the remaining lifetime with the maximum probability, that is,
the mode of the remaining lifetime, Mod[Tx ]. Note that, to find E[Tx ] or
Mod[Tx ], assumptions about the probability distribution of the lifetime Tx
are needed (see Section 1.3.2).
For example, the value b may be chosen, such that

m ≈ Mod[Tx ] (1.9)

Thus, with a high probability the exhaustion time will coincide with the
residual lifetime. Notwithstanding, events like Tx > m, or Tx < m, may
occur and hence the retiree bears the risk originating from the randomness
of her/his lifetime. Conversely, the choice

m=ω−x (1.10)

obviously removes the risk of remaining alive with no withdrawal possibility,
but this choice would result in a low annual amount b.

1.2.2 Avoiding early fund exhaustion

Risks related to random lifetimes can be transferred from the annuitants
to the annuity provider thanks to a different contractual structure, that is,
the life annuity. To provide a simple introduction to technical features of
life annuities, we adopt now a (very) traditional model; in the following
sections, more modern and general models will be described.
Consider the following transaction: an individual age x pays to a life
annuity provider (e.g. an insurer) an amount S to receive a (life) annuity
consisting in a sequence of annual benefits b, paid at the end of every year
while she/he is alive. Assume that the same type of annuity is purchased at
time t = 0 by a given number, say lx , of individuals all age x.
Let lx+t denote an estimate (at time 0) of the number of individuals (annu-
itants) alive at age x + t (t = 1, 2, . . . , ω − x), out of the initial ‘cohort’ of lx
individuals. As ω denotes the (integer) maximum age, we have by definition
lω > 0 and lω+1 = 0. The following (estimated) cash flows of the annuity
provider are then defined:

(a) an income lx S at time 0;
(b) a sequence of outgoes lx+t b at time t, t = 1, 2, . . . , ω − x.
Let Vt denote the fund pertaining to a generic annuitant at time t. The
total fund of the annuity provider is given by lx+t Vt , and is defined for
t = 1, 2, . . . , ω − x as follows:

lx+t Vt = lx+t−1 Vt−1 (1 + i) − lx+t b (1.11)

clearly with lx V0 = lx S.
From (1.11), we find the following recursion describing the evolution of
the individual fund:
Vt = (lx+t−1 /lx+t ) Vt−1 (1 + i) − b    (1.12)

with V0 = S. Recursion (1.12) can also be written as follows:

Vt = Vt−1 (1 + i) + ((lx+t−1 − lx+t )/lx+t ) Vt−1 (1 + i) − b    (1.13)

Thus, the annual variation in the fund is given by

Vt − Vt−1 = Vt−1 i + ((lx+t−1 − lx+t )/lx+t ) Vt−1 (1 + i) − b    (1.14)

It is worth noting from (1.14) that the annual decrement of the individual
fund can be split into three contributions (see Figure 1.4):

(a) a positive contribution provided by the interest Vt−1 i;
(b) a positive contribution provided by the share of the funds released
because of the death of lx+t−1 − lx+t annuitants in the t-th year, the
share being credited to the lx+t annuitants alive at time t;
(c) a negative contribution given by the benefit b.

Contribution (b), which does not appear in the model describing the
annuity-certain (see Figure 1.1), is maintained thanks to a cross-subsidy
among annuitants, that is, the so-called mutuality effect. For more details,
see Section 1.4.1.
In the case of life annuities, the individual fund Vt (as defined by recursion
(1.12)) is called the reserve.
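The cohort mechanism of Section 1.2.2 can be illustrated numerically. In the sketch below the survivor numbers lx+t are purely hypothetical (our choice, for illustration only); the annual benefit b is chosen so that the cohort fund is exactly exhausted at the maximum age (this is the equivalence relation (1.15) derived in Section 1.3.2), and the individual fund Vt then follows recursion (1.12):

```python
l = [1000, 900, 750, 550, 300, 100]   # hypothetical survivors l_{x+t}, t = 0,...,omega-x
S, i = 1000.0, 0.03
n = len(l) - 1                        # omega - x; l_omega > 0, l_{omega+1} = 0

# Benefit b such that l_x S = b * sum_t l_{x+t} (1 + i)^(-t)
b = l[0] * S / sum(l[t] * (1 + i) ** -t for t in range(1, n + 1))

# Individual fund V_t, recursion (1.12): V_t = (l_{x+t-1}/l_{x+t}) V_{t-1}(1+i) - b
V = [S]
for t in range(1, n + 1):
    V.append(l[t - 1] / l[t] * V[-1] * (1 + i) - b)
```

The final value of the fund is zero: the amounts released by annuitants dying in each year exactly finance, together with interest, the benefits of the survivors.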

1.2.3 Risks in annuities-certain and in life annuities

First, let us focus on the simple model of annuity-certain we have dealt
with in Section 1.2.1, and consider the perspectives of the retiree and the
financial institution providing the annuity.
Figure 1.4. Annual variation in the (individual) fund of a life annuity.

The provider of an annuity-certain does not bear any risk inherent in
the random lifetime of the retiree, as, whatever this lifetime may be, the
annuity will be paid up to the exhaustion of the fund. Conversely, the
annuity provider takes financial risks which can be singled out looking
at the two causes of change in the fund level (see Fig. 1.1). Risks are as
follows:

– market risk, more precisely interest rate risk, as we have assumed that i is
the guaranteed interest rate which must be credited to the fund whatever
the return from the investment of the fund itself may be;
– liquidity risk, as the annual payment obviously requires cash availability.

Conversely, the retiree does not take any financial risk thanks to the
guaranteed interest rate, whereas she/he bears the risk related to her/his
random lifetime, as seen above.
Now, let us move to the life annuity. According to the structure of this
product (at least as defined in Section 1.2.2), the annuitant does not bear
any risk. Actually, the annuity is paid throughout the whole lifetime and
the amount of the annual payment is guaranteed.
Conversely, the annuity provider first bears the market risk and the
liquidity risk as in the annuity-certain model. Further, if the actual life-
times of annuitants lead to numbers of survivors greater than the estimated
ones, the cross-subsidy mechanism (see Section 1.2.2 and Fig. 1.4) cannot
finance the payments to the annuitants still alive. In other words, contri-
bution (b), which is required to maintain the individual fund Vt , should be
partially funded, in this case, by the annuity provider. Conversely, numbers
of survivors less than the estimated ones lead to a provider’s profit.
Hence, the annuity provider takes risks related to the mortality of the
annuitants.

1.3 Evaluating life annuities: deterministic approach

1.3.1 The life annuity as a financial transaction

Purchasing a life annuity constitutes a financial transaction, whose cash
flows are

– a price, or premium, paid by the annuitant to the annuity provider;
– a sequence of amounts, namely the annuity, paid by the annuity provider
to the annuitant while he/she is alive; the payment frequency may be
monthly, quarterly, semi-annual, or annual.

In what follows, we only refer to annual payments, hence disregarding
annuities payable more frequently than once a year (which require special
treatment; see the references cited in Section 1.7). Further, we will assume
(if not otherwise specified) that payments are made at the end of each year
(annuity in arrears).
In the life annuity structure presented in Section 1.2.2, the amount S rep-
resents the premium paid against the annuity with b as the annual payment.
Clearly, the life annuity structure we have described requires a single pre-
mium at time 0, as the annuity is an immediate one. Conversely, for other
annuity models different premium arrangements are feasible, as we will see
in Section 1.6.
The relation between S and b is implicitly defined by recursion (1.11)
(or (1.12)). Solving with respect to S (or b), when b (or S) has been
assigned, leads to an explicit relation between the two amounts. In par-
ticular, S is the expected present value of the life annuity, as we will see in
Section 1.3.2.
Indeed, a reasonable starting point (but not necessarily the only one) for
determining the single premium is given by the calculation of the expected
present value of the life annuity. In particular, when the so-called equiva-
lence principle is adopted, the single premium is set equal to the expected
present value. Other premium calculation principles will be dealt with in
Section 7.6.
1.3.2 Actuarial values

For a given i and a given sequence lx , lx+1 , . . . , lω , from recursion (1.11),
with lx V0 = lx S, we find
lx S = b Σ_{t=1}^{ω−x} lx+t (1 + i)^{−t}    (1.15)

and, referring to a single annuitant,


S = b Σ_{t=1}^{ω−x} (lx+t /lx ) (1 + i)^{−t}    (1.16)

In formula (1.16), S turns out to be the present value of the sequence
of amounts b ‘weighted’ with the ratios lx+t /lx . The numbers of survivors
lx+t (and the interest rate i, as well) are assumed deterministic. Hence the
model relying on these assumptions, and leading in particular to expression
(1.16), is a deterministic one.
Some comments can help in understanding the features of the deter-
ministic model. First, a point in favour of the model is that, in spite of
its deterministic nature, the risk borne by the life annuity provider, arising
from random lifetimes, clearly emerges, although it is not explicitly
accounted for (see Section 1.2.3).
Second, equation (1.16) can be rewritten in ‘probabilistic’ terms, since
lx+t /lx can be interpreted as the estimate of the probability of an individual
age x being alive at age x+t. Denoting with t px this probability, it is formally
defined as follows:

t px = P[Tx > t] (1.17)

and we have
S = b Σ_{t=1}^{ω−x} t px (1 + i)^{−t}    (1.18)

An alternative expression is provided by the following formula:


S = b Σ_{h=1}^{ω−x} ah h px qx+h    (1.19)

where
– the symbol ah , defined as follows:

ah = (1 − (1 + i)^{−h} )/i    (1.20)
denotes the present value of a temporary annuity-certain consisting of h
unitary annual payments in arrears;
– the symbol qx+h denotes the probability of an individual age x + h dying
within one year, formally
qx+h = P[Tx+h < 1] (1.21)
we note that, assuming ω as the maximum age, qω = 1;
– hence, h px qx+h is the probability of an individual currently age x dying
between ages x + h and x + h + 1; in symbols

h px qx+h = P[h ≤ Tx < h + 1] (1.22)

Note that

h px = (1 − qx )(1 − qx+1 ) . . . (1 − qx+h−1 ) (1.23)


The equivalence of (1.18) and (1.19) can be proved using the following
relation:

t px = 1 − Σ_{h=0}^{t−1} h px qx+h    (1.24)

where the sum expresses the probability of dying before age x + t.


Clearly, the right-hand side of expression (1.19) represents the expected
present value, or actuarial value, of the life annuity, thus:
S = b E[aKx  ] (1.25)
where Kx denotes the curtate random remaining lifetime of an individual age
x, namely the integer part of Tx . The quantities h px qx+h , h = 0, 1, . . . , ω−x,
constitute the probability distribution of the discrete random variable Kx .
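The equivalence of (1.18) and (1.19) can also be checked numerically. The sketch below (ours) uses an arbitrary, hypothetical sequence of one-year death probabilities with qω = 1, and compares the two expressions for b = 1:

```python
i = 0.03
v = 1 / (1 + i)
q = [0.02, 0.05, 0.10, 0.25, 0.60, 1.0]   # hypothetical q_{x+h}, h = 0,...,omega-x
n = len(q) - 1                            # omega - x; q[n] = 1 at the maximum age

def tpx(t):
    """Survival probability t_p_x as a product of (1 - q), relation (1.23)."""
    p = 1.0
    for h in range(t):
        p *= 1 - q[h]
    return p

# (1.18): sum of discounted survival probabilities
a_x = sum(tpx(t) * v ** t for t in range(1, n + 1))

# (1.19): expectation of the annuity-certain value a_h of (1.20) over the
# distribution h_p_x * q_{x+h} of the curtate lifetime K_x
a_x_alt = sum((1 - v ** h) / i * tpx(h) * q[h] for h in range(1, n + 1))
```

The two sums agree to machine precision, and the probabilities h px qx+h sum to one over h = 0, . . . , ω − x, as they constitute the distribution of Kx.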
With the symbol commonly used to denote the actuarial value of the life
annuity, we have:
S = b ax (1.26)
where, according to (1.18),
ax = Σ_{t=1}^{ω−x} t px (1 + i)^{−t}    (1.27)
Finally, the quantity Vt can be interpreted as the mathematical reserve of
the life annuity, whose evolution throughout time is described by recursion
(1.12), namely, in probabilistic terms:
Vt = (1/1 px+t−1 ) Vt−1 (1 + i) − b    (1.28)

It should be noted that recursion (1.28) expresses the reserve Vt as the
result of the decumulation process, driven by financial items (the interest
rate i and the payment b) and a demographic item (the probability 1 px+t−1 ).
Under this perspective, the reserve Vt can be interpreted as assets pertaining
to the generic annuitant. Conversely, the annuitant has the right to receive
the annual amount b while she/he is alive. This obligation of the life annuity
provider, viz. a liability, can be expressed as the expected present value at
time t (and hence referred to the annuitant assumed to be alive at time t) of
future annual payments:
b ax+t = b Σ_{h=1}^{ω−x−t} h px+t (1 + i)^{−h}    (1.29)

It is easy to prove, replacing Vt and Vt−1 in equation (1.28) with b ax+t
and b ax+t−1 respectively, that equation (1.28) itself is satisfied. Thus,

Vt = b ax+t (1.30)

whence the amount Vt can be interpreted as the amount of assets exactly
meeting the provider’s liability. Note that the reserve Vt exhausts at the
maximum age ω only.
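Relation (1.30) and recursion (1.28) can be verified together: computing the sequence ax+t backwards from the maximum age (with a hypothetical set of one-year death probabilities, our choice), the reserve Vt = b ax+t satisfies (1.28) at each step and vanishes at age ω only. A sketch:

```python
i = 0.03
v = 1 / (1 + i)
b = 100.0
q = [0.02, 0.05, 0.10, 0.25, 0.60, 1.0]   # hypothetical q_{x+t}, t = 0,...,omega-x
n = len(q) - 1                            # omega - x

# Backward recursion a_{x+t} = p_{x+t} * v * (1 + a_{x+t+1}), with a_omega = 0
a = [0.0] * (n + 1)
for t in range(n - 1, -1, -1):
    a[t] = (1 - q[t]) * v * (1 + a[t + 1])

V = [b * a[t] for t in range(n + 1)]      # reserve V_t = b a_{x+t}, relation (1.30)

# Check recursion (1.28): V_t = (1/p_{x+t-1}) V_{t-1} (1 + i) - b
for t in range(1, n + 1):
    assert abs(V[t] - (V[t - 1] * (1 + i) / (1 - q[t - 1]) - b)) < 1e-9
```

The reserve is positive at every age below ω and exactly zero at the maximum age.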
Example 1.2 In Fig. 1.5 the mathematical reserve Vt is plotted against time
t. We have assumed S = 1000, i = 0.03, x = 65. The estimated numbers
of survivors can be drawn from various data sets. For example, assume that
the probabilities qx+h , h = 0, 1, . . . , ω − x, where x is a given initial age
of interest, have been assigned. From the qx+h ’s, the estimated number of
survivors can be derived via the following recursion:

lx+h+1 = lx+h (1 − qx+h ) (1.31)

starting from a (notional) initial value lx . For example, assume for qx+h the
following expression:


qx+h = G H^{x+h} /(1 + G H^{x+h} )   if x + h < 110    (1.32)
qx+h = 1                             if x + h = 110
Figure 1.5. Mathematical reserve of a life annuity.

with the parameters G = 0.000002, H = 1.13451. From the data assumed,
we obtain a65 = 14.173 and hence b = 70.559.
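Example 1.2 can be reproduced with a few lines of code implementing (1.32), (1.23), and (1.27); up to rounding, the output should match a65 = 14.173 and b = 70.559 (the function names are ours):

```python
G, H = 0.000002, 1.13451
x, omega, i, S = 65, 110, 0.03, 1000.0

def q(age):
    """One-year death probability, expression (1.32)."""
    if age < omega:
        return G * H ** age / (1 + G * H ** age)
    return 1.0

def tpx(t):
    """Survival probability t_p_x, relation (1.23)."""
    p = 1.0
    for h in range(t):
        p *= 1 - q(x + h)
    return p

# Actuarial value (1.27) and annual benefit b = S / a_x, from (1.26)
a_x = sum(tpx(t) * (1 + i) ** -t for t in range(1, omega - x + 1))
b = S / a_x
```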

Remark The first line on the right-hand side of (1.32) approximately
expresses the mortality at older ages according to the first and second
Heligman–Pollard laws, as we will see in Section 2.5.2. 

1.3.3 Technical bases

The relation between S (the single premium) and b (the annual benefit)
relies on the equivalence principle, as S is the expected present value of
the sequence of annual amounts b. The adoption of this principle complies
with common (but not necessarily sound) actuarial practice. Actually, when
the equivalence principle is used for pricing insurance products and life
annuities in particular, a safe-side technical basis (or prudential basis, or
first-order basis) is chosen, namely an interest rate i lower than the estimated
investment yield, and a set of probabilities expressing a mortality level lower
than that expected in the life annuity portfolio. The estimated investment
yield and the mortality actually expected constitute the scenario technical
basis (or realistic basis, or second-order basis).
For simplicity, assume a constant estimated investment yield i∗ ; denote
with q∗x+h , h = 0, 1, . . . , ω − x the realistic probabilities of death. The sur-
vival probabilities, t p∗x , can be calculated from the q∗x+h as stated by relation (1.23).
(1.23). The resulting actuarial value of the life annuity, a∗x , is clearly given
(see (1.27)) by
a∗x = Σ_{t=1}^{ω−x} t p∗x (1 + i∗ )^{−t}    (1.33)

The difference ax − a∗x can be interpreted as the expected present value (at
time t = 0) of the profit generated by the life annuity contract. Note that,
if i∗ > i, the yield from investment contributes to the profit. Usually profit
participation mechanisms assign a (large) part of the investment profit to
policyholders, and so the expected profit ax − a∗x should be taken as gross
of the profit participation.
Example 1.3 For example, assume i = 0.03 and the qx+h adopted in
Example 1.2 as the items of the safe-side technical basis (i.e. the pricing basis);
conversely, for the scenario basis assume i∗ = 0.05 as the estimated invest-
ment yield, and the mortality level described by probabilities q∗x+h given by
the expression (1.32) implemented with the parameters G∗ = 0.0000023,
H ∗ = 1.134. With these assumptions, we have that i∗ > i and q∗x+h > qx+h .
We find that a∗65 = 11.442, and hence the expected present value of the
profit produced by a life annuity with a unitary annual payment, that is,
with b = 1, is a65 − a∗65 = 2.731. 
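A sketch (ours) reproducing Example 1.3: the actuarial value is computed under both the first-order (pricing) basis and the second-order (scenario) basis, and the difference gives the expected present value of the profit. Up to rounding, a∗65 = 11.442 and a65 − a∗65 = 2.731 should be recovered:

```python
def annuity_value(G, H, i, x=65, omega=110):
    """Actuarial value a_x under the mortality law (1.32) with parameters G, H."""
    a, p = 0.0, 1.0
    for t in range(1, omega - x + 1):
        age = x + t - 1
        q = G * H ** age / (1 + G * H ** age)   # age < omega throughout the loop
        p *= 1 - q                              # running survival probability t_p_x
        a += p * (1 + i) ** -t
    return a

a_first = annuity_value(0.000002, 1.13451, 0.03)   # first-order (pricing) basis
a_second = annuity_value(0.0000023, 1.134, 0.05)   # second-order (scenario) basis
profit = a_first - a_second                        # expected present value of profit
```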

An appropriate choice of the first-order basis, for a given scenario basis,
also provides the insurer with a safety loading in order to face an adverse
mortality experience (and/or an adverse yield from investments). In other
words, while the spread between the technical bases produces a (posi-
tive) profit if the insurer experiences a mortality and an investment yield
as described by the scenario basis, the spread itself, increasing the single
premium for a given annual payment or conversely reducing the annual
payment for a given premium, can avoid losses when an adverse experience
occurs.
As regards the choice of the age-patterns of mortality, to adopt as the
first-order and the second-order technical basis respectively, it should be
kept in mind that life annuities may involve very long time intervals, say
25–30 years or even more. Indeed, survival probabilities (i.e. probabilities
t px and t p∗x ) should express reasonable mortality assumptions referring to
the future lifetime of an individual who is currently at age x.
Age-patterns of mortality are commonly available as the result of statis-
tical observations, and usually express the mortality at various ages as it
emerges at the time of the observation itself. As mortality is affected by evi-
dent trends (see Chapter 3), observed mortality (even when resulting from
recent investigations) cannot be directly used to express long term future
mortality, as required when dealing with life annuities. Thus, projection
models (see Chapter 4) are needed to forecast future mortality.

1.4 Cross-subsidy in life annuities


Although insurance transactions can be analysed at an individual level (e.g.
in terms of the equivalence principle), in practice these transactions usually
involve a group of insureds transferring the same type of risk to an insurer.
This is also the case for life annuity products, and actually these products
have been introduced in Section 1.2.2 referring to a cohort of annuitants.
Thanks to the existence of an insured population, money transfers inside
the population itself (i.e. among the policyholders) are possible, causing a
cross-subsidy among the insureds (or annuitants). The term cross-subsidy
broadly refers to some arrangement adopted for sharing among a given
population the cost of a set of benefits. However, various types of cross
subsidy can be recognized. While mutuality underpins the management of
any insurance portfolio (see Section 1.2.2, as regards life annuities), other
types of cross subsidy are not necessarily involved, for example, solidarity.
Further, special cross-subsidy structures may occur with particular policies;
this is the case of tontine schemes in the context of life annuities.
In the following parts of this section, we deal with cross-subsidy mech-
anisms (mutuality, solidarity, and tontines), focussing on life annuity
portfolios.

1.4.1 Mutuality

The mutuality principle underpins the insurance process (whether or not
it is run by a ‘mutual’ insurance company or by a proprietary insurance
company, which is owned by shareholders), and arises from the pooling of
a number of risks. Moreover, the mutuality effect also works in ‘mutual
associations’ of individuals exposed to the same type of risk, even without
resorting (at least in principle) to an insurance company.
The mutuality effect leads to money transfers from insureds (or annu-
itants) who, in terms of actuarial value, paid premiums greater than the
benefits received to insureds in the opposite situation. For example, in a
non-life portfolio the insureds without claims transfer money to the insureds
with claims.
Referring to a life annuity portfolio, it is interesting to focus on the annual
equilibrium between assets available and liabilities. This equilibrium relies
on an asset transfer among annuitants, namely, from annuitants dying in
the year to annuitants alive at the end of the year. This clearly appears
from recursion (1.11), where the accumulated fund pertaining to lx+t−1
annuitants alive at time t −1, whose amount is lx+t−1 Vt−1 (1 +i), is used to
finance benefits to lx+t annuitants (out of the lx+t−1 ) alive at time t, namely,
the payment of the amount lx+t b and the maintenance of the fund lx+t Vt
for future payments. So, resources needed at time t are made available (also)
thanks to this cross subsidy, namely the mutuality effect.
Let us now look at the technical equilibrium under an individual
perspective. Recursion (1.13) can be rewritten, in more compact terms, as

Vt = Vt−1 (1 + i) (1 + θx+t ) − b (1.34)

where
θx+t = (lx+t−1 − lx+t )/lx+t    (1.35)

In terms of survival probabilities, as it emerges from (1.28), we have
θx+t = (1/1 px+t−1 ) − 1.
Looking at recursion (1.34), θx+t can be interpreted as an ‘extra-yield’
which is required to maintain the decumulation process of the individual
reserve Vt , and hence can be interpreted as a measure of the mutuality
effect. The extra yield θx+t is also called the mortality drag, or interest from
mutuality. As already seen in Section 1.2.2, θx+t determines the share of the
funds released because of the death of lx+t−1 − lx+t annuitants in the t-th
year, and credited to the lx+t annuitants alive at time t.

Remark The (annual) extra-yield provided by the mutuality effect is clearly
a function of the current age x + t (see (1.35)). Referring to a given age
interval (x, x + m), the sequence θx , θx+1 , . . . , θx+m can be summarized in
an index, depending on x, m, and the interest rate i, called the implied
longevity yield (ILY).1 This index plays an important role in the analysis of
annuitization alternatives, as we will see in Section 7.7. 

Example 1.4 In Fig. 1.6 the quantity θx+t is plotted for x = 65 and
t = 0, 1, . . .. The underlying technical basis is the first-order basis, with
i = 0.03 and the qx+t defined in Example 1.2. It is interesting to note that,

1 The expression ‘Implied Longevity Yield’ and its acronym ‘ILY’ are registered trademarks and
property of CANNEX Financial Exchanges.
Figure 1.6. A measure of the mutuality effect.

when moderately old ages are involved (say, in the interval 65–75), the val-
ues of θ are rather small. In such a range of ages, they could be ‘replaced’
with a higher yield from investments (provided that riskier investments can
be accepted), and so, in that age interval, a withdrawal process could be pre-
ferred to a life annuity. Conversely, as the age increases, θ reaches very high
values, which obviously cannot be replaced by investment yields. So, when
old and very old ages are concerned, the life annuity is the only technical
tool which guarantees a lifelong constant income. As regards theoretical
results showing that the annuitization constitutes the optimal choice, see
Section 1.7. 
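Fig. 1.6 can be reproduced from (1.35), or equivalently from θx+t = 1/1 px+t−1 − 1 = qx+t−1 /(1 − qx+t−1 ), under the mortality law (1.32) of Example 1.2; the sketch below (ours) computes θ at ages 66 to 110:

```python
G, H = 0.000002, 1.13451
x, omega = 65, 110

def q(age):
    """One-year death probability, expression (1.32), for ages below omega."""
    return G * H ** age / (1 + G * H ** age)

# theta_{x+t} = q_{x+t-1} / (1 - q_{x+t-1}), equivalent to (1.35), for t = 1,...,omega-x
theta = [q(x + t - 1) / (1 - q(x + t - 1)) for t in range(1, omega - x + 1)]
```

As discussed above, θ is well below 3% over the interval 65–75 and rises above 180% near the limiting age, increasing monotonically in between.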

1.4.2 Solidarity

Assume that a population consisting of (potential or actual) insureds is split
into risk classes. Each risk class groups individuals with the same probability
of claim (or death, or survival, etc.).
Risk classes could be directly referred to for pricing purposes, namely
charging with a specific premium rate the individuals belonging to a given
risk class. Conversely, two or more risk classes can be grouped leading to
a rating class, which would be aimed at charging all individuals belong-
ing to a given rating class with the same premium rate. The premium rate
attributed to a rating class should be an appropriate weighted average of
the premiums competing to the risk classes grouped into the rating class
itself. The weighting should reflect the expected numbers of (future) insureds
belonging to the various risk classes.
Assume that, as far as pricing is concerned, the population is split into
rating classes rather than into risk classes. The rationale of this grouping
may be, for example, a simplification in the tariff structure.
When two or more risk classes are aggregated into one rating class,
some insureds pay a premium higher than their ‘true’ premium, that is,
the premium resulting from the risk classification, while other insureds pay
a premium lower than their ‘true’ premium. Thus, the equilibrium inside
a rating class relies on a money transfer among individuals belonging to
different risk classes. This transfer is usually called solidarity (among the
insureds).
Clearly, such a premium system may cause adverse selection, as individ-
uals forced to provide solidarity to other individuals can reject the policy,
moving to other insurance solutions (or, more generally, risk management
actions). The severity of this self-selection phenomenon depends on how
people perceive the solidarity mechanism, as well as on the premium systems
adopted by competitors in the insurance market. In any event, self-selection
can jeopardize the technical equilibrium inside the portfolio, which depends
on actual versus expected numbers of insureds belonging to the various risk
classes grouped into a rating class. So, in practice, solidarity mechanisms
can work provided that they are compulsory (e.g. imposed by insurance
regulation) or they constitute a common market practice.
As regards life annuities, risk classes are usually based on age and gen-
der. In particular, it is well known that females experience a mortality
lower than males and a higher expected lifetime. So, if for some reason
the same premium rates (only depending on age) are applied to all annu-
itants, a solidarity effect arises, implying a money transfer from males to
females.
The solidarity effect is stronger when the number of rating classes is
smaller, compared with the number of risk classes. In the private insur-
ance field, an extreme case is achieved when one rating class only relates
to a large number of underlying risk classes. Outside of the private insur-
ance area, the solidarity principle is commonly applied in social security.
In this field, the extreme case arises when the whole national population
contribute to fund the benefits, even if only a part of the population itself is
eligible to receive benefits; so, the burden of insurance is shared among the
community.
Finally, it is interesting to stress the implications of this argument. Mutuality affects the benefit (or claim) payment phase, so that the 'direction' and 'measure' of the mutuality effect in a portfolio are only known ex-post. Conversely, solidarity affects the premium income phase, and hence its direction and measure are known ex-ante.

1.4.3 ‘Tontine’ annuities

Assume that each one of l_x individuals, all aged x at time t = 0, pays at that time the amount S to a financial institution. Against the total amount 𝒮 = l_x S, the financial institution will pay at the end of each year, that is, at times t = 1, 2, . . ., the (total) constant amount B, while at least one of the individuals of the group is alive.
Each year the amount B is divided equally among the survivors. Hence, each individual (out of the initial l_x) alive at time t receives a benefit b_t which depends on the actual number of survivors at that time. Denoting, as usual, with l_{x+t} the estimated number of survivors, an estimate of b_t is given by B/l_{x+t}. Clearly,

B/l_{x+1} ≤ B/l_{x+2} ≤ · · · ≤ B/l_{x+t} ≤ · · ·   (1.36)

The mechanism of dividing B among the survivors is called a tontine scheme, whereas the sequence (1.36) is called a tontine annuity.
The relation between 𝒮 (the initial income) and B (the annual payment) can be stated (at least in theory) on the basis of the equivalence principle. To this purpose, first note that the duration, K, of the annuity paid by the financial institution is random, being defined as follows:

K = max{K_x^{(1)}, K_x^{(2)}, . . . , K_x^{(l_x)}}   (1.37)

where K_x^{(j)} denotes the random curtate residual lifetime of the j-th individual. Hence, the equivalence principle requires

𝒮 = B E[a_K]   (1.38)

The calculation of E[a_K] is extremely difficult. In practice, a reasonable approximation could be provided by a_{ω−x}. While in general a_{ω−x} > E[a_K], the larger l_x is, the better this approximation becomes, as there is a higher probability that some individual reaches, or at least approaches, the maximum age ω.
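As a rough numerical illustration, the expectation E[a_K] in (1.38) can be estimated by Monte Carlo simulation of the last-survivor duration K of (1.37). The sketch below uses a hypothetical Gompertz-type mortality law q_x = min(1, G·H^x) and rate i = 0.05 rather than the book's tables, so the numbers are illustrative only; it also evaluates the upper bound a_{ω−x} mentioned above.

```python
import random

# Hypothetical technical basis (illustrative only, not the book's tables):
# one-year death probabilities q_x = min(1, G * H**x), valuation rate i.
G, H, i, omega = 2.3e-6, 1.134, 0.05, 120

def q(age):
    return min(1.0, G * H ** age)

def a_certain(n):
    """Present value a_n of an annuity-certain of 1 per annum in arrears."""
    v = 1.0 / (1.0 + i)
    return (1.0 - v ** n) / i

def curtate_lifetime(age):
    """Simulate the curtate residual lifetime K_x of one individual."""
    k = 0
    while age < omega and random.random() > q(age):
        k += 1
        age += 1
    return k

random.seed(42)
x, lx, n_sims = 65, 100, 2000
# K = max of the l_x individual curtate lifetimes, formula (1.37)
sample = [a_certain(max(curtate_lifetime(x) for _ in range(lx)))
          for _ in range(n_sims)]
e_aK = sum(sample) / n_sims        # Monte Carlo estimate of E[a_K]
upper = a_certain(omega - x)       # the approximation a_{omega-x}
```

With a large group the estimate approaches the upper bound a_{ω−x}, in line with the remark above.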
Example 1.5 The tontine annuity derives its name from Lorenzo Tonti (a
Neapolitan banker living most of his life in Paris) who, around 1650, pro-
posed a plan for raising monies to Cardinal Mazzarino, the Chief Minister
of France at the time of King Louis XIV. In this plan, a fund was raised by subscriptions. Let 𝒮 denote the amount collected by the State. Then, the State had to pay each year the interest on 𝒮, at a given annual interest rate i. The constant annual payment 𝒮 i was to be divided equally among the surviving members of the group and would terminate with the death of the last survivor. Thus, according to our notation, the duration of the annuity is K (see definition (1.37)), and we have B = 𝒮 i. Note that

B/𝒮 = i = 1/a_∞   (1.39)

where a_∞ = 1/i is the present value of a perpetuity (given the discount rate i). As

𝒮/a_∞ < 𝒮/a_{ω−x} < 𝒮/E[a_K]   (1.40)

(assuming that the same discount rate is used for all the present values), we find that Tonti's original scheme did not fulfil the equivalence principle, being favourable to the issuer (i.e. to the State). □

Turning back to the general tontine scheme, two points should be stressed.

(a) The tontine scheme clearly implies a cross subsidy among the annui-
tants, and in particular a mutuality effect arises as each dying annuitant
releases a share of the amount B , which is divided among the surviving
annuitants.
(b) A basic difference between tontine annuities and ordinary life annuities
should be recognized. In an ordinary life annuity, the annual (individual)
benefit b is stated and guaranteed, in the sense that the life annuity
provider has to pay the amount b to the annuitant for her/his whole
residual lifetime, whatever the mortality experienced in the portfolio (or
pension plan) may be. Conversely, in a tontine scheme the sequence of
amounts b1 , b2 , . . . paid to each annuitant depends on the actual size of
the surviving tontine group. Note that, when managing an ordinary life
annuity portfolio the annuity provider takes the risk of a poor mortality
experience in the portfolio (see Section 1.2.3), whereas in a tontine
scheme the only cause of risk is the lifetime of the last survivor. Further,
it should be noted that, for a given technical basis and a given amount S, the annual benefit b is likely to be much higher than the initial payments in a tontine scheme. Actually (using the approximation a_{ω−x}), from

B = 𝒮/a_{ω−x} = l_x S/a_{ω−x}   (1.41)

we obtain, for small values of t (such that l_x/l_{x+t} < a_{ω−x}/a_x),

b_t = (l_x/l_{x+t}) · S/a_{ω−x} < S/a_x = b   (1.42)

From inequality (1.42) it follows that achieving a 'good' amount b_t (when compared with b) relies on the mortality experienced in the tontine group. Mainly for this reason, tontine annuities were suppressed by many governments, and are at present prohibited in most countries.
Nevertheless, ideas underlying tontine schemes survive in some mechanisms of profit participation, especially when mortality profits are also involved, as we will see in Section 7.5.3.

1.5 Evaluating life annuities: stochastic approach

1.5.1 The random present value of a life annuity

It should be noted that, although formulae (1.18) and (1.19) involve prob-
abilities, the model built up so far is a deterministic model, as probabilities
are only used to determine expected values. A first step towards stochastic
models follows.
Equation (1.19) implicitly involves the random present value Y,

Y = a_{K_x}   (1.43)

of a life annuity (see also (1.25)). The possible outcomes of the random variable Y are as follows:

y_0 = a_0 = 0
y_1 = a_1 = (1 + i)^{−1}
· · ·
y_{ω−x} = a_{ω−x} = (1 + i)^{−1} + (1 + i)^{−2} + · · · + (1 + i)^{−(ω−x)}

and we have

P[a_{K_x} = y_h] = P[K_x = h]   (1.44)


Figure 1.7. Probability distribution of a_{K_65} (probability plotted against the present value of the annuity).

Calculating the probability distribution of Y = a_{K_x} requires the choice of a technical basis, for example the scenario basis. Moments other than the expected value can then be calculated, for example, the variance of a_{K_x}.

Example 1.6 Figure 1.7 illustrates the probability distribution of a_{K_65}, calculated by adopting the probabilities q*_{x+h} and the interest rate i*, as specified in Example 1.3. In particular, for the variance we find Var(a_{K_65}) = 12.889. □
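The distribution (1.44) can be tabulated directly from a set of one-year death probabilities, using P[K_x = h] = h p_x · q_{x+h}. The sketch below does so under a hypothetical Gompertz-type law q_x = min(1, G·H^x) with i = 0.05, so the mean and variance it produces are illustrative and will not match the book's figures.

```python
# Tabulating the distribution of Y = a_{K_x} from one-year death probabilities.
# Hypothetical basis: q_x = min(1, G * H**x), i = 0.05 (illustrative only).
G, H, i, omega = 2.3e-6, 1.134, 0.05, 120

def q(age):
    return min(1.0, G * H ** age)

def a_certain(h):
    """Present value a_h of an annuity-certain of 1 per annum in arrears."""
    v = 1.0 / (1.0 + i)
    return (1.0 - v ** h) / i

x = 65
probs, surv = [], 1.0
for h in range(omega - x):
    probs.append(surv * q(x + h))   # P[K_x = h] = h p_x * q_{x+h}
    surv *= 1.0 - q(x + h)
probs.append(surv)                  # residual mass at the limiting age

outcomes = [a_certain(h) for h in range(omega - x + 1)]   # y_0, y_1, ...
mean = sum(p * y for p, y in zip(probs, outcomes))        # = a_x
var = sum(p * (y - mean) ** 2 for p, y in zip(probs, outcomes))
```

The mean of this distribution is the actuarial value a_x, while the variance quantifies the dispersion discussed in the text.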

1.5.2 Focussing on portfolio results

Interesting insights into the features of a stochastic approach to life annuity modelling can be achieved by focussing on a group (a portfolio, a pension plan, etc.) of annuitants.
For a given initial number l_x of annuitants, all aged x and all with the same age-pattern of mortality, for example, expressed by the q*_{x+h}'s, the numbers l_{x+t}, t = 1, 2, . . . , ω − x, can be interpreted as expected numbers of survivors at age x + t, out of the initial cohort (see (1.31)).
Actually, the numbers of annuitants alive at time t, t = 1, 2, . . . , ω − x, constitute a random sequence,

L_{x+1}, L_{x+2}, . . . , L_ω   (1.45)

Figure 1.8. Probability distributions of L_70 and L_85.

It is interesting to find the probability distribution of the generic random number L_{x+t}. If we assume that the lifetimes of the annuitants are independent (and identically distributed), then the probability distribution of L_{x+t} is binomial, namely

P[L_{x+t} = k] = \binom{l_x}{k} (_tp*_x)^k (1 − _tp*_x)^{l_x−k};  k = 0, 1, . . . , l_x   (1.46)

and, in particular, we have

E[L_{x+t}] = l_x · _tp*_x   (1.47)

Example 1.7 Figure 1.8(a) and (b) illustrate the probability distributions of L_70 and L_85 respectively, under the following assumptions: x = 65, l_65 = 100, q*_{x+t} as specified in Example 1.3. □

Further insights can be obtained from a consideration of the insurer's cash flows. First, the probability distribution of the annual random payout may be of interest. If we assume that all annuitants receive an annual amount b, the random payout at time t is given by b L_{x+t}, and the related probability distribution of the annual payment is immediately derived from (1.46).
When various individual annual amounts are concerned, deriving the probability distribution of the annual payout is more difficult. In any event, various numerical procedures and approximations are available. As an alternative, Monte Carlo simulation procedures can be used. Simulation procedures can also be used to obtain other results related to a portfolio of life annuities, or a pension plan.

Consider now the random behaviour over time of the fund Z_t defined for t = 1, 2, . . . , ω − x, as follows:

Z_t = Z_{t−1} (1 + i*) − L_{x+t} b   (1.48)

with Z_0 = l_x S. Suppose that the relation between b and S is given by formula (1.26), where a_x has been calculated assuming the first-order technical basis, given by i = 0.03 and the q_{x+h}'s used in Example 1.2.
A 'path' of the fund Z_t can be obtained via simulation of the random numbers L_{x+t}, which in turn can be obtained by simulating the random lifetimes of the annuitants. Indeed, denoting with T_x^{(j)} the remaining lifetime of the j-th annuitant, we have

L_{x+t} = Σ_{j=1}^{l_x} I_{{T_x^{(j)} > t}}   (1.49)

where I_E is the indicator function of the event E.


Note that the expected path E[Z_t], t = 1, 2, . . . , ω − x, can be immediately derived as

E[Z_t] = E[Z_{t−1}] (1 + i*) − E[L_{x+t}] b   (1.50)

the expected numbers E[L_{x+t}] being given by (1.47).

Example 1.8 Figures 1.9(a) and (b) illustrate 10 paths of Z_t, for t = 0, 1, . . . , 5 and t = 15, . . . , 20, respectively. The data already described have been assumed as the input of the simulation procedure, in particular i* = 0.05 and the q*_{x+h}'s used in Example 1.3. Figures 1.10(a) and (b) illustrate the (simulated) statistical distributions of Z_5 and Z_20 respectively, based on a sample of 1000 simulated paths. □

Further interesting aspects may emerge from comparing the behaviour of the fund Z_t with the (random) portfolio reserve, whose amount is L_{x+t} V_t, with V_t given by (1.30) (traditionally implemented with the first-order basis). As the assets actually available are given by Z_t, the (random) quantity

M_t = Z_t − L_{x+t} V_t   (1.51)

represents the assets in excess of the level required (according to the first-order basis) to meet expected future obligations.
Figure 1.9. Some paths of Z_t.

Figure 1.10. Statistical distributions of Z_5 and Z_20.

Example 1.9 Figures 1.11(a) and (b) represent the (simulated) statistical distributions of M_5 and M_20 respectively, based on the simulated sample previously adopted. The erratic behaviour in these figures (as well as in Figures 1.11(a), 1.11(b), 1.12(a), 1.12(b), and 1.14) is clearly due to the simulation procedure; smoother results can be obtained by increasing the number of simulations. □

1.5.3 A first insight into risk and solvency

From the exercise developed in Examples 1.7–1.9, an important feature of stochastic models clearly emerges. Allowing for randomness provides us with a tool for assessing the 'risk' inherent in a life annuity portfolio or a pension plan. As we can see in Figure 1.9(a) and (b), random fluctuations affect the portfolio behaviour, and these are caused (in this example) by the randomness in the number of survivors over time. The risk we are now focussing on is usually named the risk of mortality random fluctuations, or the process risk due to mortality (see also Section 7.2).

Figure 1.11. Statistical distributions of M_5 and M_20.
Figures 1.10(a) and (b) suggest measures which can be used for assessing
the riskiness of a life annuity portfolio in terms of the ‘dispersion’ of the fund
Zt . Analogous considerations emerge from Figure 1.11(a) and (b) in relation
to the quantity Mt . For example, the variance or the standard deviation,
estimated from the statistical distributions, can be used as (traditional) risk
measures.
The possibility of quantifying portfolio riskiness suggests ‘operational’
applications of our stochastic model, provided that it is properly general-
ized. For example, let us focus on the quantity Mt . From Figure 1.11(a) and
(b) it emerges that, with a positive probability, M5 and M20 take negative
values. Of course, the event Mt < 0 indicates an insolvency situation.
So, probabilities of events like Mt < 0 for some t, at least within a
stated time horizon, should be kept reasonably small. In particular, an initial
allocation of (shareholders’) capital, leading to Z0 > S, clearly lowers the
probability of being insolvent.
Example 1.10 Allocating the amount M_0 = 3000, so that Z_0 = 100 S + 3000, leads to the distributions of M_5 and M_20 depicted in Figure 1.12(a) and (b), from which a smaller probability of insolvency clearly emerges. □

Of course, causes of risk other than mortality could be introduced into our model, typically the investment risk, in particular arising from random fluctuations (i.e. 'volatility') in the investment yield. To this purpose, the sequence of annual investment yields must be simulated, on the basis of an appropriate model for stochastic interest rates, and used in place of the estimated yield i*. We do not deal with these problems, which are beyond the scope of the present chapter.

Figure 1.12. Statistical distributions of M_5 and M_20.
Let us now refer to the random present value at time t = 0, Y_0^{(Π)}, of future benefits in a portfolio consisting of one generation of life annuities. We have

Y_0^{(Π)} = b Σ_{t=1}^{ω−x} L_{x+t} (1 + i)^{−t}   (1.52)

If we calculate the expected value of Y_0^{(Π)} using the first-order basis, we have

E[Y_0^{(Π)}] = b Σ_{t=1}^{ω−x} E[L_{x+t}] (1 + i)^{−t} = b l_x Σ_{t=1}^{ω−x} _tp_x (1 + i)^{−t} = l_x V_0   (1.53)

Formula (1.53) provides the (traditional) portfolio reserve, given by

V_0^{(Π)} = E[Y_0^{(Π)}] = l_x V_0   (1.54)

Obvious generalizations lead to Y_t^{(Π)} and E[Y_t^{(Π)}], for t ≥ 0.
However, in a stochastic context, the portfolio reserve can be defined in different ways, in particular in order to allow for the riskiness inherent in the life annuity portfolio. For example, the reserve can be defined as the α-percentile of the probability distribution of Y_0^{(Π)} (see Fig. 1.13):

V_0^{(Π;α)} = y_α   (1.55)

with y_α such that

P[Y_0^{(Π)} > y_α] = 1 − α   (1.56)

Figure 1.13. Probability distribution of Y_0^{(Π)}; α-percentile.

Example 1.11 Using the data of the previous examples, from the simulated distribution of Y_0^{(Π)} (see Fig. 1.14) we find the results shown in Table 1.1. Note that, conversely, we have P[Y_0^{(Π)} > V_0^{(Π)}] = 0.209 (where V_0^{(Π)} = E[Y_0^{(Π)}] = 100000). □
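A percentile reserve in the sense of (1.55)-(1.56) can be read directly off a simulated sample of Y_0^{(Π)}. The sketch below again uses a hypothetical Gompertz-type basis with b = 1, so the scale of the results differs from Table 1.1.

```python
import random

# Simulated distribution of Y_0 = b * sum_t L_{x+t} (1+i)^(-t), formula (1.52),
# and alpha-percentile reserves (1.55)-(1.56). Hypothetical basis throughout:
# q_x = min(1, G * H**x), i = 0.05.
G, H, i, omega = 2.3e-6, 1.134, 0.05, 120

def q(age):
    return min(1.0, G * H ** age)

x, lx, b = 65, 100, 1.0
v = 1.0 / (1.0 + i)

def simulate_y0():
    alive, y0 = lx, 0.0
    for t in range(1, omega - x + 1):
        alive = sum(1 for _ in range(alive) if random.random() > q(x + t - 1))
        y0 += b * alive * v ** t
    return y0

random.seed(3)
sample = sorted(simulate_y0() for _ in range(1000))
mean_y0 = sum(sample) / len(sample)

def percentile_reserve(alpha):
    """Smallest simulated outcome y with an estimated P[Y_0 > y] <= 1 - alpha."""
    return sample[min(len(sample) - 1, int(alpha * len(sample)))]

reserves = {alpha: percentile_reserve(alpha) for alpha in (0.75, 0.90, 0.95, 0.99)}
```

As in Table 1.1, the percentile reserves increase with α, and the high percentiles lie above the expected value, i.e. above the traditional reserve (1.54).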

It is worth noting that the calculation of the portfolio reserve V_0^{(Π)} (and, in general, V_t^{(Π)}) according to (1.54) represents the traditional approach that is adopted in actuarial practice. In this context, the presence of risks is taken into account simply via the first-order basis adopted in implementing formula (1.54). Conversely, the reserving approach based on the probability distribution of Y_0^{(Π)} (and Y_t^{(Π)} in general) and leading to the portfolio reserve V_0^{(Π;α)} (V_t^{(Π;α)}) allows for risks via the choice of an appropriate percentile of the distribution itself.

Figure 1.14. Statistical distribution of Y_0^{(Π)}.

Table 1.1. Percentiles of the probability distribution of Y_0^{(Π)}

α      y_α
0.75    92067.033
0.90   101553.815
0.95   102608.253
0.99   104480.738

1.5.4 Allowing for uncertainty in mortality assumptions

As already mentioned in Section 1.3.3, experience suggests that we should adopt projected mortality tables (or laws) for the actuarial appraisal of life annuities (and other living benefits), that is, use mortality assumptions which include a forecast of future mortality trends. Nevertheless, whatever hypothesis is assumed, the future trend in mortality is random, and hence an uncertainty risk arises, namely a risk due to uncertainty in the representation of the future mortality scenario.
Example 1.12 Assume the first-order basis already used in the previous examples. To describe the (future) mortality scenario, use the model (1.32) with the following alternative parameters:

(1) G^{(1)} = 0.0000025; H^{(1)} = 1.13500
(2) G^{(2)} = 0.0000023 (= G*); H^{(2)} = 1.13400 (= H*)
(3) G^{(3)} = 0.0000019; H^{(3)} = 1.13300

We assume that scenario (2) (which coincides with the scenario adopted as the second-order basis in previous examples) represents the best-estimate mortality hypothesis. Scenario (1) involves a higher mortality level and hence can be considered 'optimistic' from the point of view of the annuity provider. Conversely, scenario (3) expresses a lower mortality level and thus constitutes a 'pessimistic' mortality forecast. We obtain:

a_65^{(1)} = 11.046,  a_65^{(2)} = 11.442,  a_65^{(3)} = 12.102

(with obvious meaning of the notation). □
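A scenario comparison of this kind is straightforward to code. Since the projection model (1.32) is not reproduced in this section, the sketch below substitutes a simple hypothetical law q_x = min(1, G·H^x), evaluated with the parameter pairs of the example at i = 0.05; the resulting values need not match the book's figures, but the qualitative ordering must hold: higher mortality implies a lower actuarial value.

```python
# Actuarial values a_65 under three alternative mortality scenarios.
# Hypothetical law (an assumption, not the book's model (1.32)):
# q_x = min(1, G * H**x), valuation rate i = 0.05.
i, omega = 0.05, 120

def a65(G, H):
    v, p, total = 1.0 / (1.0 + i), 1.0, 0.0
    for h in range(1, omega - 65 + 1):
        p *= 1.0 - min(1.0, G * H ** (65 + h - 1))
        total += v ** h * p
    return total

a_1 = a65(0.0000025, 1.13500)   # scenario (1): 'optimistic' for the provider
a_2 = a65(0.0000023, 1.13400)   # scenario (2): best estimate
a_3 = a65(0.0000019, 1.13300)   # scenario (3): 'pessimistic'
```

Scenario (1) has the highest mortality at every age, so it yields the lowest annuity value; scenario (3), with the lightest mortality, yields the highest.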

The coexistence of more than one mortality scenario (namely, three in Example 1.12) leads to a new modelling framework. When no uncertainty in the future mortality trend is allowed for, and hence just one age-pattern of mortality is assumed (e.g. in terms of probabilities of dying), a deterministic actuarial value of the life annuity follows. Conversely, if we recognize uncertainty in the future pattern of mortality, randomness in actuarial values follows.
Figure 1.15 illustrates three different approaches to uncertainty in mor-
tality assumptions. The first approach (A) simply disregards uncertainty, so
that the related result is a deterministic actuarial value of the life annuity. In
the second case (B), a finite set of scenarios is used to express uncertainty,
from which a finite set of actuarial values follows; clearly, this approach
has been adopted in Example 1.12. Note that, according to this approach,
each actuarial value should be regarded as an expected value conditional
on a given scenario. Finally, the third approach (C) allows for uncertainty
via a continuous set of scenarios and a consequent interval for the (condi-
tional) actuarial value of the life annuity; this approach can be implemented,
for example, assuming a given interval as the set of possible values for a
parameter of the mortality law.
Clearly, the uncertainty risk coexists with the risk of mortality ran-
dom fluctuations. As regards the present value of a life annuity, random
fluctuations lead to the probability distribution depicted for example, in
Fig. 1.14 (see Example 1.11). When allowing also for uncertainty in future
mortality, a set of probability distributions should be addressed. Thus,
referring to approach B (see Fig. 1.15), a finite set of conditional distri-
butions is involved, each one relating to an alternative mortality scenario
(see Fig. 1.16).
Figure 1.15. Mortality scenarios and actuarial values: (A) a deterministic scenario yields a deterministic actuarial value of the life annuity; (B) uncertainty in the scenario (discrete setting) yields a discrete set of actuarial values; (C) uncertainty in the scenario (continuous setting) yields a continuous range of actuarial values.

A comprehensive description of the riskiness inherent in a life annuity product (still excluding financial risks arising from investment performance) requires a further step. By assigning an appropriate probability description of the scenario space, we can move from conditional probability distributions to an unconditional distribution, which 'summarizes' both components of risk, namely the uncertainty risk and the risk of random fluctuations. This topic will be addressed in Chapter 7 when dealing with the assessment of longevity risk.
Figure 1.16. Conditional probability distributions of the random present value of the life annuity, under scenarios (1), (2), and (3).

1.6 Types of life annuities

In the previous sections we have dealt with an immediate life annuity in arrears, that is, a life annuity whose first payment is due one period (one year, according to our assumptions) after the date of purchase, while the last payment is made at the end of the period preceding the death of the annuitant. Although this structure is rather common, a number of other types of life annuities are sold on insurance markets, and paid by pension plans as well.
So, the purpose of this section is to describe a range of annuity types, looking at features of both the accumulation period and the decumulation period (also called the liquidation period, or payout period); see Fig. 1.17.

1.6.1 Immediate annuities versus deferred annuities

Let us continue to focus on an immediate life annuity, and denote with b the annual benefit and S the net single premium (i.e., disregarding expense loadings). It is natural to look at the amount S as the result of an accumulation process carried out during (a part of) the working life of the annuitant.
Let us now denote with x the age at the beginning of the accumulation process, that is, at time 0. The accumulation process stops at time n, so that x + n is the age at the beginning of the decumulation phase.
The relation between S and b is given, according to the equivalence principle, by

S = b a_{x+n}   (1.57)

(see (1.26) and the previous equations).

Figure 1.17. Accumulation and decumulation phases (premiums/savings build up a fund/reserve of amount S over times 0, 1, . . . , n, ages x to x + n; annuity benefits/withdrawals are paid from time n + 1 onwards).

As regards the accumulation process, this can be carried out via various tools, for example insurance policies providing a survival benefit at maturity (time n). Some policy arrangements will be described in Section 1.6.2.
Conversely, it is possible to look jointly at the accumulation and the decumulation phase, even in actuarial terms. Consider a deferred life annuity of one monetary unit per annum, with a deferred period of n years. Assume now that each annual payment is due at the beginning of the year (annuity in advance). The actuarial value at time 0, _{n|}ä_x, is given by

_{n|}ä_x = Σ_{h=n}^{ω−x} (1 + i)^{−h} _hp_x   (1.58)

In this context, it is natural that the accumulation period coincides with the deferred period. In particular, the deferred annuity can be financed via a sequence of n annual level premiums P, paid at times 0, 1, . . . , n − 1. The annual premium for a deferred life annuity of b per annum, according to the equivalence principle, is then given by

P = b · _{n|}ä_x / ä_{x:n}   (1.59)
where

ä_{x:n} = Σ_{h=0}^{n−1} (1 + i)^{−h} _hp_x   (1.60)

Two important aspects of the actuarial structure of deferred life annuities financed by annual level premiums, as is apparent from equations (1.58) and (1.60), should be stressed:

(a) Formulae (1.58) and (1.60) rely on the assumption that the technical basis is chosen at time 0, when the insured is aged x. If, for example, x = 40, this means that the technical rate of interest will be guaranteed throughout a period of perhaps fifty years or even more. Further, the life table adopted should keep its validity throughout the same period.
(b) If the policyholder dies before time n, no benefit is due. This is, of course, a straight consequence of the policy structure, according to which the only benefit is the deferred life annuity.

Feature (b) is likely to have a negative impact on the appeal of the annuity product. However, the problem can easily be removed by adding to the policy a rider benefit, such as the return of premiums in case of death during the deferred period, or by including some death benefit with term n.
The problems arising from aspect (a) are much more complex, and require a re-thinking of the structure and design of the life annuity product. As a first step, we provide an analysis of the main features of life annuity products, addressing separately the accumulation period and the decumulation period.

1.6.2 The accumulation period

The deferred life annuity, as described above, can be interpreted as a pure endowment at age x with maturity at age x + n, 'followed' (in the case of survival at age x + n) by an immediate life annuity, with benefits due at the beginning of each year. In formal terms, from (1.58) we obtain

_{n|}ä_x = (1 + i)^{−n} _np_x Σ_{h=0}^{ω−x−n} (1 + i)^{−h} _hp_{x+n} = _nE_x ä_{x+n}   (1.61)

where _nE_x = (1 + i)^{−n} _np_x denotes the actuarial value of a pure endowment with a unitary amount insured.
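The decomposition (1.61) can be checked numerically: with any consistent set of survival probabilities the two sides agree to rounding error. The basis below is the same hypothetical Gompertz-type law used in the earlier sketches.

```python
# Numerical check of (1.61): n| a-due_x = nEx * a-due_{x+n}.
# Hypothetical basis: q_x = min(1, G * H**x), i = 0.05.
G, H, i, omega = 2.3e-6, 1.134, 0.05, 120

def q(age):
    return min(1.0, G * H ** age)

def hpx(x, h):
    """h-year survival probability for a life aged x."""
    p = 1.0
    for k in range(h):
        p *= 1.0 - q(x + k)
    return p

v = 1.0 / (1.0 + i)
x, n = 40, 25

lhs = sum(v ** h * hpx(x, h) for h in range(n, omega - x + 1))   # n| a-due_x
nEx = v ** n * hpx(x, n)                       # pure endowment value nEx
a_due = sum(v ** h * hpx(x + n, h) for h in range(omega - x - n + 1))
rhs = nEx * a_due                              # right-hand side of (1.61)
```

The agreement follows from the product decomposition _hp_x = _np_x · _{h−n}p_{x+n}, which holds exactly.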
Clearly, relation (1.61) relies on the assumption that the same technical basis is adopted for both the accumulation and the decumulation period. As already noted, this implies a huge risk for the life annuity provider. So, an important idea is to address the two periods separately, possibly delaying the choice of the technical basis to be adopted for the life annuity.
As regards the accumulation period, the pure endowment can be replaced by a purely financial accumulation, via an appropriate savings instrument. The loss in terms of the mutuality effect is then very limited when (part of) the working period is concerned. Hence, a very modest extra-yield can replace the mortality drag.

Example 1.13 In Fig. 1.18, the function θ (see Section 1.4.1) is plotted against age in the range 40–64. Note that θ is consistent with formula (1.34), with given values for the mathematical reserve, but with b = 0. The underlying technical basis is the first-order basis adopted in Example 1.4. It is interesting to compare the graph in Fig. 1.18 (noting the scale on the vertical axis) with the behaviour of the function θ throughout the decumulation period, illustrated in Fig. 1.6. □

Of course, various insurance products including a benefit in case of life at time n can replace the pure endowment throughout the accumulation period. Examples are given by the traditional endowment assurance policy, by various types of unit-linked endowments, and so on. In many cases, some minimum guarantee is provided: for example, the technical rate of interest in traditional insurance products like pure endowments and endowment assurances, or a minimum death benefit and/or a minimum maturity benefit in unit-linked products.

Figure 1.18. Mutuality effect during the accumulation period (the function θ plotted against age).
Whatever the insurance product may be, the benefit at maturity can be
used to purchase an immediate life annuity. However, the ‘quality’ of the
insurance product used for the accumulation can be improved, from the perspective of the policyholder, by including in the product itself an option to annuitize. This option is the possibility of converting the lump sum at
maturity into an immediate life annuity, without the need to cash in the sum
and pay the expense charges related to the underwriting of the life annuity.
Clearly, when an option to annuitize is included in the policy, the insurer
first takes the adverse selection risk, as the policyholders who choose the
conversion into a life annuity will presumably be in good health, with a life
expectancy higher than the average. However, a further risk may arise, due
to the uncertainty in the future mortality trend, that is, the longevity risk.
If the annuitization rate, that is, the quantity 1/ä_{x+n} which is applied to convert the sum available at maturity into an immediate life annuity, is stated (and hence guaranteed) only at maturity, the time interval throughout which the insurer bears the longevity risk clearly coincides with the time interval during which the life annuity is paid.
However, more ‘value’ can be added to the annuity product if the annuiti-
zation rate is guaranteed during the accumulation period, the limiting case
being represented by the annuitization rate guaranteed at time 0, that is, at
policy issue. The opposite limit is clearly given by stating the guaranteed
rate at time n, that is, at maturity.
The so-called guaranteed annuity option (GAO) is a policy condition
which provides the policyholder with the right to receive at retirement either
a lump sum (the maturity benefit) or a life annuity, whose annual amount
is calculated at a guaranteed rate. The annuity option will be exercised by the policyholder if the current annuity rate (i.e. the annuity rate applied by insurers at time n for pricing immediate life annuities) is worse than the guaranteed one.
As regards the accumulation period, the severity of the longevity risk borne by the life annuity provider can be reduced (with respect to the severity involved in a GAO with a guaranteed rate stated at the policy issue date) if the annuity purchase is arranged according to a single-recurrent premium scheme. In this case, with the premium paid at time h (h = 0, 1, . . . , n − 1) a deferred life annuity of annual amount b_h, with deferred period n − h, is purchased. In actuarial terms:

P_h = b_h · _{n−h|}ä^{[h]}_{x+h}   (1.62)
Note that the actuarial value _{n−h|}ä^{[h]}_{x+h} is calculated according to the technical basis adopted at time h. Hence, the annuity total benefit b, given by

b = Σ_{h=0}^{n−1} b_h   (1.63)

is ultimately determined and guaranteed at time n − 1 only. According to this step-by-step procedure, the technical basis, used in (1.62) to determine the amount b_h purchased with the premium P_h, can change every year, so reflecting possible adjustments in the mortality forecast.
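The single-recurrent premium mechanism of (1.62)-(1.63) can be sketched as follows. The revision of the basis at each time h is mimicked here by scaling the parameter G down by a hypothetical annual improvement factor r; the law, the factor, and all figures are illustrative assumptions, not the book's.

```python
# Single-recurrent premium scheme, formulae (1.62)-(1.63): the premium P paid
# at time h buys a slice b_h of deferred annuity-due at the basis in force at
# time h. Hypothetical law q_x = min(1, G * H**x); the annual revision of the
# mortality forecast is mimicked by G_h = G0 * r**h.
i, omega = 0.05, 120
G0, H, r = 2.3e-6, 1.134, 0.98    # r: hypothetical improvement factor

def deferred_due(x, n, G):
    """Value of an annuity-due of 1 deferred n years for a life aged x."""
    v, total = 1.0 / (1.0 + i), 0.0
    for h in range(n, omega - x + 1):
        p = 1.0
        for k in range(h):
            p *= 1.0 - min(1.0, G * H ** (x + k))
        total += v ** h * p
    return total

x, n, P = 40, 25, 1.0             # level premium paid at h = 0, ..., n-1
slices = [P / deferred_due(x + h, n - h, G0 * r ** h) for h in range(n)]
b_total = sum(slices)             # total annuity benefit, formula (1.63)
```

Early premiums buy larger slices, since a long deferral makes the deferred annuity cheap; the total benefit b is only known once the last slice is purchased.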

1.6.3 The decumulation period

Let us denote with n the starting point of the decumulation period, and
with x + n the annuitant’s age. Let S be the amount, available at time n, to
finance the life annuity. In the case of a deferred life annuity, S is given by
the mathematical reserve at time n of the annuity itself.
The relation between S and the annual payment b depends on the policy
conditions which define the (random) number of payments, and hence the
duration of the decumulation period. Let us denote with K the number of
payments. Focussing on a life annuity in arrears only, the following cases
are of practical interest:

(a) If the number of annual payments is stated in advance, say K = m, we


have an annuity-certain, that is, a simple withdrawal process. Then, the
annual benefit b is defined by the following relation:
S = b am (1.64)
(b) In the case of a whole life annuity, the annual payments cease with
the death of the annuitant. Thus, K = Kx+n (where Kx+n denotes the
curtate remaining lifetime), and
S = b ax+n (1.65)
(c) The m-year temporary life annuity pays the annual benefit while the
annuitant survives during the first m years. Then K = min{m, Kx+n },
and
                     m
    S = b ax+n:m = b ∑ (1 + i)−h h px+n        (1.66)
                    h=1

(d) If the annuitant dies soon after time n, neither the annuitant nor the
annuitant’s estate receive much benefit from the purchase of the life
annuity. In order to mitigate (at least partially) this risk, it is possible to
buy a life annuity with a guarantee period (5 or 10 years, say), in which
case the benefit is paid for the guarantee period regardless of whether
the annuitant is alive or not. Hence, for a guarantee period of r years
we have K = max{r, Kx+n }, and

S = b ar + b r|ax+n        (1.67)

We have so far assumed that the annuity payment depends on the life-
time of one individual only, namely the annuitant. However, it is possible
to define annuity models involving two (or more) lives. Some examples
(referring to two lives) follow:

(e) Consider an annuity payable as long as at least one of two individuals
(the annuitants) survives, namely a last-survivor annuity. Let us now
denote by y and z respectively the ages of the two lives at the annuity
commencement, and by Ky(1), Kz(2) their curtate remaining lifetimes.
Thus, K = max{Ky(1), Kz(2)}. The actuarial value of this annuity is usually
denoted by a‾y,z (the bar denoting the last-survivor status), and can be
expressed as

a‾y,z = ay(1) + az(2) − ay,z        (1.68)
where the suffices (1), (2) denote the life tables (e.g. referring to males
and females respectively) used for the two lives, whereas ay,z denotes
the actuarial value of an annuity of 1 per annum, payable while both
individuals are alive (namely a joint-life annuity). Hence,
S = b a‾y,z = b (ay(1) + az(2) − ay,z)        (1.69)
Note that, if we accept the hypothesis of independence between the two
random lifetimes, we have
            +∞
    ay,z =  ∑  (1 + i)−h h py(1) h pz(2)        (1.70)
            h=1

In equation (1.69) it has been assumed that the annuity continues with
the same annual amount until the death of the last survivor. A modified
form provides that the amount, initially set to b, will be reduced
following the first death: to b′ if the individual (2) dies first, and to b″
if the individual (1) dies first. Thus

S = b′ ay(1) + b″ az(2) + (b − b′ − b″) ay,z        (1.71)

with b′ < b, b″ < b. Conversely, in many pension plans the last-survivor
annuity commonly provides that the annual payment is reduced only if
the retiree, say life (1), dies first. Formally, b′ = b (instead of b′ < b) in
equation (1.71).
(f) A reversionary annuity (on two individuals) is payable while a given
individual, say individual (2), is alive, but only after the death of
the other individual. In this case, the number of payments is
K = max{0, Kz(2) − Ky(1)}, and the first payment (if any) is made at time
Ky(1) + 1. Such an annuity can be used, for example, as a death benefit
in pension plans, to be paid to a surviving spouse or dependant.
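The relations (1.64)–(1.70) above can be checked numerically. The sketch below is illustrative only: the survival probabilities, interest rate, and amounts are assumed, payments are in arrears, and the joint-life value relies on the independence hypothesis of (1.70).

```python
def annuity_certain(m, i):
    """Present value of 1 per annum in arrears for m years, as in (1.64)."""
    v = 1.0 / (1.0 + i)
    return sum(v ** h for h in range(1, m + 1))

def life_annuity(p, i, m=None, defer=0):
    """EPV of a life annuity of 1 per annum in arrears.
    p[h-1] = h-year survival probability; m caps the number of payments
    (temporary annuity, (1.66)); defer > 0 gives a deferred annuity."""
    v = 1.0 / (1.0 + i)
    last = len(p) if m is None else min(m, len(p))
    return sum(p[h - 1] * v ** h for h in range(defer + 1, last + 1))

# Assumed survival probabilities (illustrative only)
p1 = [0.985 ** h for h in range(1, 51)]        # life (1), aged y
p2 = [0.990 ** h for h in range(1, 51)]        # life (2), aged z
S, i, m, r = 100_000.0, 0.03, 10, 5

b_certain   = S / annuity_certain(m, i)                                   # (1.64)
b_whole     = S / life_annuity(p1, i)                                     # (1.65)
b_temporary = S / life_annuity(p1, i, m=m)                                # (1.66)
b_guarantee = S / (annuity_certain(r, i) + life_annuity(p1, i, defer=r))  # (1.67)

# Joint-life value under independence, (1.70), then last survivor via (1.68)
a1, a2 = life_annuity(p1, i), life_annuity(p2, i)
a_joint = life_annuity([u * w for u, w in zip(p1, p2)], i)
a_last = a1 + a2 - a_joint                                                # (1.68)
b_last = S / a_last                                                       # (1.69)
```

As expected, adding survival conditions or guarantees moves the annual benefit in the intuitive direction: the temporary life annuity pays more per annum than the annuity-certain of the same term, and the guarantee period or the last-survivor condition lowers the benefit relative to the single-life whole life annuity.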

1.6.4 The payment profile

Level annuities (sometimes called standard annuities) provide an income
which is constant in nominal terms. Thus, the payment profile is flat.
A number of models of ‘varying’ annuities have been derived, mainly
with the purpose of protecting the annuitant against the loss of purchasing
power because of inflation. First, we focus on escalating annuities.

(a) In the fixed-rate escalating annuity (or constant-growth annuity) the
annual benefit increases at a fixed annual rate, α, so that the sequence
of payments is

b1 , b2 = b1 (1 + α), b3 = b1 (1 + α)2 , ...

Usually, the premium is calculated accounting for the annual increase
in the benefit. Thus, for a given amount S (the single premium of the
immediate life annuity), the starting benefit b1 is lower than the benefit
the annuitant would get from a level annuity.
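The effect on the starting benefit can be sketched as follows; the survival probabilities and rates are assumed for illustration, and payments are taken in arrears.

```python
def escalating_benefit(S, p, i, alpha):
    """Starting benefit b1 of a fixed-rate escalating annuity in arrears:
    the payment in year h is b1 * (1 + alpha)**(h - 1), so that
    S = b1 * sum_h (1 + alpha)**(h - 1) * (1 + i)**(-h) * hpx.
    p[h-1] = h-year survival probability (assumed, illustrative)."""
    v = 1.0 / (1.0 + i)
    epv = sum(p[h - 1] * (1 + alpha) ** (h - 1) * v ** h
              for h in range(1, len(p) + 1))
    return S / epv

p = [0.98 ** h for h in range(1, 41)]       # toy survival probabilities
b1_level = escalating_benefit(100_000.0, p, 0.03, 0.00)   # level annuity
b1_grow  = escalating_benefit(100_000.0, p, 0.03, 0.02)   # 2% escalation
```

With a positive escalation rate the EPV per unit of starting benefit is larger, so `b1_grow` comes out below `b1_level`, as stated in the text.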

Various types of index-linked escalating annuities are sold in annuity and
pension markets. Two examples follow:

(b) Inflation-linked annuities provide annual benefits varying in line with
some index, for example a retail-price index (like the RPI in the UK),
usually with a stated upper limit. An annuity provider should invest
the premiums in inflation-linked assets, so that these assets back the
annuities whose payments are linked to a price index.
(c) Equity-indexed annuities earn annual interest that is linked to a stock
or other equity index (e.g., the Standard & Poor’s 500). Usually, the
annuity promises a minimum interest rate.

Moving to investment-linked annuities, we focus on the following
models:

(d) In a with-profit annuity (typically in the UK market), the single premium
is invested in an insurer’s with-profit fund. Annual benefits depend on
an assumed annual bonus rate (e.g. 5%), and on the sequence of actual
declared bonus rates, which in turn depend on the performance of the
fund. In each year, the annual rate of increase in the annuity depends on
the spread between the actual declared bonus and the assumed bonus.
Clearly, the higher the assumed bonus rate, the lower the rate of
increase in the annuity. The benefit decreases when the actual declared
bonus rate is lower than the assumed bonus rate. Although the annual
benefit can fluctuate, with-profit annuities usually provide a guaranteed
minimum benefit.
(e) Various profit participation mechanisms (other than the bonus mecha-
nism described above in respect of with-profit annuities) are adopted,
for example, in many European continental countries. A share (e.g.
80%) of the difference between the yield from the investments back-
ing the mathematical reserves and the technical rate of interest (i.e.
the minimum guaranteed interest, say 2% or 3%) is credited to
the reserves. This leads to increasing benefits, thanks to the extra-
yield.
(f) The single premium of a unit-linked life annuity is invested into unit-
linked funds. Generally, the annuitant can choose the type of fund, for
example medium risk managed funds, or conversely higher risk funds.
Each year, a fixed number of units are sold to provide the benefit pay-
ment. Hence, the benefit is linked directly to the value of the underlying
fund, and then it fluctuates in line with unit prices. Some unit-linked
annuities, however, work in a similar way to with-profit annuities. An
annual growth rate (e.g. 6%) is assumed. If the fund value grows at
the assumed rate, the benefit stays the same. If the fund value growth
is higher than assumed, the benefit increases, whilst if lower the benefit
falls. Some unit-linked funds guarantee a minimum performance in line
with a given index.
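The benefit dynamics described in (d) (and the assumed-growth variant of the unit-linked annuity in (f)) can be sketched as follows. The scaling rule (1 + declared)/(1 + assumed) is a common mechanism, but details vary by policy, and all figures below are assumptions.

```python
def benefit_path(b0, assumed, declared_rates, b_min=None):
    """Year-by-year benefit of a with-profit (or assumed-growth unit-linked)
    annuity: each year the benefit is scaled by (1 + declared)/(1 + assumed);
    an optional guaranteed minimum b_min applies."""
    path, b = [b0], b0
    for g in declared_rates:
        b *= (1.0 + g) / (1.0 + assumed)
        if b_min is not None:
            b = max(b, b_min)
        path.append(b)
    return path

# Assumed bonus rate 5%; declared rates above, at, and below it (illustrative)
path = benefit_path(10_000.0, 0.05, [0.07, 0.05, 0.02], b_min=9_000.0)
```

The path rises when the declared rate exceeds the assumed rate, stays level when they coincide, and falls (down to the guaranteed minimum, if any) when the declared rate is lower.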

We conclude this section by addressing some policy conditions which
provide a ‘final’ payment, namely some benefit after the death of the
annuitant.
The complete life annuity (or apportionable annuity) is a life annuity
payable in arrears which provides a pro-rata adjustment on the death of the
annuitant, consisting in a final payment proportional to the time elapsed
since the last payment date. Clearly, this feature is more important if the
annuity is paid annually, and less important in the case of, say, monthly
payments.

Capital protection represents an interesting feature of some annuity policies,
usually called value-protected annuities. Consider, for example, a
single-premium, level annuity. In the case of early death of the annuitant,
a value-protected annuity will pay to the annuitant’s estate the difference
(if positive) between the single premium and the cumulated benefits paid
to the annuitant. Usually, capital protection expires at some given age
(75, say), after which nothing is paid even if the difference mentioned
above is positive. The capital protection benefit can be provided in
two ways:

– in a cash-refund annuity, the balance is paid as a lump sum;
– in an instalment-refund annuity, the balance is paid in a sequence of
instalments.

Adding capital protection clearly reduces the annuity benefit (for a given
single premium).
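A minimal sketch of the two refund mechanisms follows; the expiry of protection is modelled here as a cap on the number of protected payments, an assumption standing in for the limiting age mentioned above.

```python
def cash_refund(S, b, payments_made, max_protected_payments=None):
    """Lump sum paid on death under a cash-refund annuity: the positive part
    of the single premium S minus the benefits already paid. Nothing is
    paid once protection has expired."""
    if (max_protected_payments is not None
            and payments_made >= max_protected_payments):
        return 0.0
    return max(0.0, S - payments_made * b)

def instalment_refund_balance(S, b, payments_made):
    """Under an instalment-refund annuity the same balance is instead paid
    as a sequence of instalments; here we just return the balance."""
    return max(0.0, S - payments_made * b)
```

For example, with S = 100,000 and b = 8,000, death after 5 payments leaves a protected balance of 60,000, while death after 13 payments leaves nothing, since the cumulated benefits already exceed the premium.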

Remark Note that capital protection constitutes a death benefit, which
decreases as the age at death, and hence the number of annual benefits
paid to the annuitant, increases. For this reason, capital protection
can help in building-up a (partial) ‘natural hedging’ of mortality-longevity
risks inside the annuity product. See Section 7.3.2. 

1.6.5 About annuity rates

The price of life annuities depends on several ‘risk factors’. In particular,
the following are important:

(a) age at time of annuity purchase;
(b) gender;
(c) voluntary annuities versus pension annuities;
(d) information available to the insurer about the annuitant’s expected
lifetime.

The importance of factor (a) is self-evident. Risk factor (b) is usually
taken into account, because of the difference between the age-pattern of
mortality in males and females. However, in uni-sex annuities the same
annuity rate (for a given age at entry) is adopted for males and females.
These annuities involve a solidarity effect (see Section 1.4.2) in the sense
that men cross-subsidize women.
The term voluntary annuities (see point (c)) usually denotes annuities
bought as a consequence of individual choice, that is exercised on a voluntary
basis. Conversely, the term pension annuities refers to benefits paid
to people as a direct consequence of their membership of an occupational
pension plan, or to annuities bought because a compulsory purchase
mechanism applies. Voluntary annuities are usually purchased by people with a
high life expectancy, whereas individuals who know that they have a low
expected lifetime are unlikely to purchase an annuity. The consequence is
that actual voluntary annuitants have a mortality pattern different from the
population as a whole. This fact is known as adverse selection (from the
point of view of the life insurer). In terms of annuity rates, adverse selection
leads to higher premiums for voluntary annuities, compared with pension
annuities.
As regards point (d), insurers offer lower prices, that is, sell special-rate
annuities, to people with an expected lifetime lower than the average one
(or, equivalently, a higher annual benefit for a given single premium). In
particular,

– impaired-life annuities can be sold to people having health problems
certified by a doctor (e.g. diabetes, chronic asthma, high blood pressure,
cancer, etc.);
– enhanced annuities can be purchased by people who self-certify the pres-
ence of some cause of a higher mortality level, like being overweight, or
being a regular smoker.

Remark Enhanced annuities should not be confused with enhanced pensions,
which provide an uplift of the annual benefit if the annuitant enters
a senescent disability state (namely, in the case of a ‘Long-Term Care’
claim). 

1.6.6 Variable annuities and GMxB features

In the previous sections, various ‘guarantees’ have been addressed; for
example: minimum guarantees like the guaranteed interest rate in the accu-
mulation period (Section 1.6.2), a guaranteed minimum annual benefit in
with-profit annuities, a minimum interest rate in equity-indexed annuities,
a minimum performance in unit-linked annuities, a minimum total payout
via capital protection mechanisms (Section 1.6.4).
Packaging a range of guarantees is a feature of variable annuities. These
products are unit-linked investment policies, providing deferred annuity
benefits. The annuity can be structured as a level annuity or a unit-linked
annuity (see Section 1.6.4).
The guarantees, commonly referred to as GMxBs (namely, Guaranteed
Minimum Benefits of type ‘x’), include minimum benefits both in case of

death and in case of life. The GMxBs are usually defined in terms of the
amount resulting from the accumulation process (the account value) at some
point of time, compared with a given benchmark (which may be expressed
in terms of the interest rate, a fixed benefit amount, etc.).
One or more than one GMxB can be included in the policy as a rider
to the basic variable annuity product. Brief descriptions of some GMxBs
follow:

(a) GMDB = Guaranteed Minimum Death Benefit. The GMDB guarantees
a minimum lump sum benefit payable upon the annuitant’s death. The
GMDB can be defined in several ways; for example:
– return of premiums consists in the payment of the greater of the
amount of premiums paid and the account value;
– highest anniversary value pays the greater of the highest account
value at past anniversaries and the current account value (hence,
according to a ratchet mechanism);
– roll-up consists in the payment of the higher of an amount equal to
the premiums paid accumulated at a given interest rate (say, 5%) and
the account value.
The GMDB typically expires either at the end of the accumulation
period, or when a given time (say, 10 years) has elapsed since the
commencement of the decumulation period.
(b) GMAB = Guaranteed Minimum Accumulation Benefit. The GMAB
can be exercised at pre-fixed dates (during the accumulation period);
the policyholder receives, as the surrender value, a lump sum equal
to the higher of the guaranteed amount and the account value. For
example the guaranteed amount can be determined as the premiums
paid accumulated at a given interest rate (say, 5%) according to a roll-
up rule, and can be paid, for example, at the 10th anniversary (measured
from the beginning of the accumulation period).
(c) GMIB = Guaranteed Minimum Income Benefit. The term ‘income’
refers to (annual) amounts payable to the annuitant. The policyholder
receives the higher of the guaranteed amount and the account value,
payable as an annuity whose annual benefit is determined according
to a given interest rate and life table. The guaranteed amount is typ-
ically calculated according to a roll-up accumulation or an annual
ratchet. Hence, the GMIB guarantees a minimum annual income upon
annuitization.
(d) GMWB = Guaranteed Minimum Withdrawal Benefit. The policyholder
receives the greater of return of premiums and the account value,
payable as a sequence of periodic withdrawals throughout time. For

example, the GMWB might guarantee that the policyholder will receive
for 20 years an annual amount equal to 5% of the premiums paid.
Some policies do not allow the policyholder to withdraw money after
the commencement of the annuity payments.
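The three GMDB designs in (a) can be sketched as payoff functions of an (assumed) account-value path; the amounts and the roll-up rate below are illustrative, not taken from the text.

```python
def gmdb_benefit(account_values, premiums_paid, t, kind, roll_up_rate=0.05):
    """Death benefit at anniversary t under three common GMDB designs.
    account_values[k] = account value at anniversary k (assumed path);
    premiums_paid = total premiums paid (a single premium, for simplicity)."""
    current = account_values[t]
    if kind == "return_of_premiums":
        guarantee = premiums_paid
    elif kind == "highest_anniversary":            # ratchet mechanism
        guarantee = max(account_values[:t + 1])
    elif kind == "roll_up":
        guarantee = premiums_paid * (1.0 + roll_up_rate) ** t
    else:
        raise ValueError(f"unknown GMDB type: {kind}")
    return max(guarantee, current)

av = [100.0, 110.0, 95.0, 105.0]    # illustrative account-value path
```

On this path, death at anniversary 3 yields 105 under return of premiums (the account value dominates), 110 under the highest-anniversary ratchet, and the rolled-up premium under the roll-up design.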

GMAB, GMIB, and GMWB are commonly referred to as GLB, namely
Guaranteed Living Benefits.
All GMxBs have option-like characteristics. However, the possible uti-
lization of the GMDB follows the age-pattern of mortality, and hence can
be assessed using a life table (together with assumptions about the perfor-
mance of the financial market). Conversely, the utilization of a GLB depends
on the policyholder’s behaviour, and hence the assessment of its impact is
much more difficult.

1.7 References and suggestions for further reading


In this section, we only quote textbooks and papers dealing with general
aspects of life annuity products. Studies particularly devoted to longevity
risk in life annuity portfolios and pension plans will be quoted in the relevant
sections of the following chapters.
Basic actuarial aspects of life annuities (namely expected present values,
premium calculation, mathematical reserves) are dealt with in almost all of
the main textbooks of actuarial mathematics and life insurance techniques.
The reader can refer, for example, to Bowers et al. (1997), Gerber (1995),
Gupta and Varga (2002), Rotar (2007).
As regards the notation, the use of symbols like aKx (see (1.43)) can
be traced back to de Finetti. Actually, de Finetti (1950, 1957) focussed
on the random present value of insured benefits. For example, in the age-
continuous context,

– the random present value of the whole life assurance (with a unitary sum
assured) is (1 + i)−Tx , and then, according to usual actuarial notation,
the expected present value is

Āx = E[(1 + i)−Tx ]

– the random present value of the standard endowment is (1 + i)− min{Tx ,n} ,
and hence
Āx,n = E[(1 + i)− min{Tx ,n} ]

As regards the stochastic approach to actuarial values, see also the sem-
inal contribution by Sverdrup (1952). Mortality risks in life annuities are
analysed by McCrory (1986).
The objectives and main design features of life annuity products are exten-
sively dealt with by Black and Skipper (2000). We have mainly referred
to this textbook in Section 1.6. Various papers and reports have been
recently devoted to innovation in life annuity products, especially address-
ing the impact of longevity risk. See, for example Cardinale et al. (2002),
Department for Work and Pensions (2002), Retirement Choice Working
Party (2001), Richard and Jones (2004), Wadsworth et al. (2001), Swiss
Re (2007), Blake and Hudson (2000). Variable annuities are addressed in
particular by Sun (2006) and O’Malley (2007).
The book by Milevsky (2006) constitutes an updated reference in the
context of life annuities and post-retirement choices.
Great effort has been devoted to the analysis of life annuities from an
economic perspective, in particular in the framework of wealth management
and human life cycle modelling. We only cite the seminal contribution by
Yaari (1965), whereas for other bibliographic suggestions the reader can
refer to Milevsky (2006). The extra yield defined in Section 1.4.1 is the key
element behind the seminal result of Yaari (1965). He shows that a risk
averse, life cycle consumer facing an uncertain time of death would, under
certain assumptions (e.g. the absence of bequest, and the absence of other
sources of randomness), find it optimal to invest 100% of his/her wealth in
an annuity (priced on an actuarially fair basis).
An extensive discussion on the concepts of mutuality and solidarity (how-
ever with some terms used with a meaning different from that adopted in
the present chapter) is provided by Wilkie (1997).
Finally, some references concerning the history of life annuities and the
related actuarial modelling follow. For the early history of life annuities
the reader can refer to Kopf (1926). The paper by Hald (1987) is more
oriented to actuarial aspects, and constitutes an interesting introduction to
the early history of life insurance mathematics. Haberman (1996) provides
extensive information about the history of actuarial science up to 1919,
while in Haberman and Sibbett (1995) the reader can find the reproduc-
tion of a number of milestone papers in actuarial science. The papers by
Pitacco (2004a) and Pitacco (2004c) mainly deal with the evolution of mor-
tality modelling, ranging from Halley’s contributions to the awareness of
longevity risk.
2  The basic mortality model

2.1 Introduction
Some elements of the basic mortality model underlying life insurance, life
annuities and pensions have already been introduced in Chapter 1, while
presenting the structure of life annuities; see in particular Sections 1.2 and
1.3. In Chapter 2, we consider the mortality model in more depth. We adopt
a more structured presentation of the fundamental ideas, which means that
some repetition of elements from Chapter 1 is unavoidable.
However, new concepts are also introduced. In particular, an age-
continuous framework is defined in Section 2.3, in order to provide some
tools needed when dealing with mortality projection models.
Indices summarizing the probability distribution of the lifetime are
described in Section 2.4, whereas parametric models (i.e. mortality ‘laws’)
are presented in Section 2.5. Basic ideas concerning non-parametric gradu-
ation are introduced in Section 2.6. Transforms of the survival function are
briefly addressed in Section 2.7.
Less traditional topics, yet of great importance in the context of life
annuities and mortality forecasts, are dealt with in Sections 2.8 and 2.9,
respectively: mortality at very old ages (i.e. the problem of ‘closing’ the life
table), and the concept of ‘frailty’ as a tool to represent heterogeneity in
populations, due to unobservable risk factors.
A list of references and suggestions for further reading (Section 2.10)
concludes the chapter. As regards references to actuarial and statistical
literature, in order to improve readability we have avoided the use of
citations throughout the text of the first sections of this chapter, namely
the sections devoted to traditional issues. Conversely, important contri-
butions to more recent issues are cited within the text of Sections 2.8
and 2.9.

2.2 Life tables


2.2.1 Cohort tables and period tables

The life table is a (finite) decreasing sequence l0 , l1 , . . . , lω . The generic item
lx refers to the integer age x and represents the estimated number of people
alive at that age in a properly defined population (from an initial group of
l0 individuals aged 0). The exact meaning of the lx ’s will be explained after
discussing two approaches to the calculation of these numbers.
First, assume that the sequence l0 , l1 , . . . , lω is provided by statistical
evidence, that is by a longitudinal observation of the actual numbers of
individuals alive at age 1, 2, . . . , ω, out of a given initial cohort consisting
of l0 newborns. The (integer) age ω is the limiting age (say, ω = 115), that
is, the age such that lω > 0 and lω+1 = 0. The sequence l0 , l1 , . . . , lω is called
a cohort life table. Clearly, the construction of a cohort table takes ω + 1
years.
Assume, conversely, that the statistical evidence consists of the frequency
of death at the various ages, observed throughout a given period, for exam-
ple one year. Assume that the frequency of death at age x (possibly after a
graduation with respect to x) is an estimate of the probability qx .
Then, for x = 0, 1, . . . , ω − 1, define

lx+1 = lx (1 − qx ) (2.1)

with l0 (the radix) assigned (e.g. l0 = 100,000). Hence, lx is the expected
number of survivors out of a notional cohort (also called a synthetic cohort)
initially consisting of l0 individuals. The sequence l0 , l1 , . . . , lω , defined by
recursion (2.1), is called a period life table, as it is derived from period
observations.
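Recursion (2.1) translates directly into code; the one-year death probabilities below are a toy assumption, not an estimated table.

```python
def build_life_table(q, radix=100_000.0):
    """Period life table from one-year death probabilities via recursion (2.1):
    l_{x+1} = l_x * (1 - q_x), with l_0 = radix."""
    l = [radix]
    for qx in q:
        l.append(l[-1] * (1.0 - qx))
    return l

# Toy, roughly exponential probabilities (illustrative only)
q = [0.001 * 1.09 ** x for x in range(60)]
l = build_life_table(q)
```

The resulting sequence is strictly decreasing, as required of a life table, and the notional cohort interpretation of lx carries over directly.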
Remark Period observations are also called cross-sectional observations,
as they analyse an existing population (in terms of the frequency of death)
‘across’ the various ages (or age groups). 

An important hypothesis underlying recursion (2.1) should be stressed.
As the qx ’s are assumed to be estimated from mortality experience in a given
period (say, one year), the calculation of the lx ’s relies on the assumption
that the mortality pattern does not change in the future.
As is well known, statistical evidence shows that human mortality, in many
countries, has declined over the 20th century, and in particular over its last
decades (see Chapter 3). So, the hypothesis of ‘static’ mortality cannot be

assumed in principle, at least when long periods of time are referred to.
Hence, in life insurance applications, the use of period life tables should be
restricted to products involving short or medium durations (5 to 10 years,
say), like term assurances and endowment assurances, whilst it should be
avoided when dealing with life annuities and pension plans. Conversely,
these products require life tables which allow for the anticipated future
mortality trend, namely projected tables constructed on the basis of the
experienced mortality trend.
For any given sequence l0 , l1 , . . . , lω it is usual to define
dx = lx − lx+1 ; x = 0, 1, . . . , ω (2.2)

thus, dx is the expected number of individuals dying between exact age x
and x + 1, out of the initial l0 individuals. Clearly,

     ω
     ∑  dx = l0        (2.3)
    x=0

2.2.2 ‘Population’ tables versus ‘market’ tables

Mortality data, and hence life tables, can originate from observations con-
cerning a whole national population, a specific part of a population (e.g.
retired workers, disabled people, etc.), an insurer’s portfolio, and so on.
Life tables constructed on the basis of observations involving a whole
national population (usually split into females and males) are commonly
referred to as population tables.
Market tables are constructed using mortality data arising from a collec-
tion of insurance portfolios and/or pension plans. Usually, distinct tables
are constructed for assurances (i.e. insurance products with a positive sum
at risk, for example term and endowment assurances), annuities purchased
on an individual basis, pensions (i.e. annuities paid to the members of a
pension plan).
The rationale for distinct market tables lies in the fact that mortality levels
may significantly differ as we move from one type of insurance product to
another. The case of different types of life annuities has been discussed in
Section 1.6.5.
Market tables provide experience-based data for premium and reserve
calculations and for the assessment of expected profits. Population tables
can provide a starting point when market tables are not available. More-
over, population tables usually reveal mortality levels higher than those
expressed by market tables and hence are likely to constitute a prudential

(or ‘conservative’, or ‘safe-side’) assessment of mortality in assurance portfolios.
Thus, population tables can be used when pricing assurances in order
to include a profit margin (or an implicit safety loading) into the premiums.
Indeed, in the early history of life insurance, population life tables were used
in the calculation of premiums – and this prudential assessment of mortality
led to many insurance companies making unanticipated profits.

2.2.3 The life table as a probabilistic model

We consider a person aged x, and denote by Tx the random variable representing
his/her remaining lifetime. In actuarial calculations, probabilities
like P[Tx > h] and P[h < Tx ≤ h + k] are usually involved. When a life
table is available, these probabilities can be immediately derived from the
life table itself, provided that the ages and durations are integers.
In life insurance mathematics, a specific notation is commonly used for the
probabilities of survival and death. The notation for the survival probability
is as follows:
h px = P[Tx > h] (2.4)

where h is an integer. In particular 1 px can be simply denoted by px ;
clearly 0 px = 1.
The notation for the probability of death is as follows:

h|k qx = P[h < Tx ≤ h + k] (2.5)

If h = 0 the notation k qx is used, and in particular, when h = 0 and k = 1,
the symbol qx is commonly adopted. Trivially, 0 qx = 0.
Note that, in all symbols, the right-hand side subscript denotes the age
being considered. Conversely, the left-hand side subscript denotes some
duration, whose meaning depends on the specific probability addressed.
Starting from recursion (2.1), which defines the life table, and using well
known theorems of probability theory, we can calculate probabilities of
survival and death.
Obviously, for the probability qx (called the annual probability of death)
we have

qx = 1 − lx+1/lx = dx/lx        (2.6)
and hence, for the probability px (called the annual survival probability),

px = 1 − qx (2.7)

Remark Sometimes the one-year probabilities qx and px are called ‘mortality
rate’ and ‘survival rate’ respectively. We do not use these expressions
to denote probabilities of death and survival, as the term ‘rate’ should
refer to a quantity expressing the number of events per unit of time. 

In general, for the survival probability we have

h px = px px+1 · · · px+h−1 = lx+h/lx        (2.8)

while for the probabilities of dying we have

k qx = 1 − k px = 1 − lx+k/lx        (2.9)

and

h|k qx = h px k qx+h = (lx+h − lx+h+k)/lx        (2.10)
Note that the sequence

0|1 qx , 1|1 qx , . . . , ω−x|1 qx        (2.11)

constitutes the probability distribution of the random variable Kx , usually
called the curtate remaining lifetime and defined as the integer part of Tx ;
thus, the possible outcomes of Kx are 0, 1, . . . , ω − x.
Further useful relations are as follows:

h|k qx = h+k qx − h qx (2.12)

h|k qx = h px − h+k px (2.13)


When qx can be expressed as qx = φx /(1 + φx ), the function φx represents
the so-called mortality odds, namely

φx = qx/px        (2.14)

From 0 < qx < 1 (for x < ω), it follows that φx > 0. Thus, focussing on
the odds, rather than on the annual probabilities of dying, can make the
choice of a mathematical formula fitting the age-pattern of mortality easier
(see Section 2.5), as the only constraint is the positivity of the odds.

2.2.4 Select mortality

Consider, for example, a group of insureds, all age 45, deriving from a pop-
ulation whose mortality can be described by a given life table. Is q45 (drawn

from the assumed life table) a reasonable assessment of the probability of
dying for each insured in the group?
In order to answer this question, the following points should be
addressed:

(a) When starting a life insurance policy with an insurance company, an
individual may be subject to medical screening and, possibly, to a med-
ical examination. An individual, who passes such tests and who is not
charged any extra premium, is often called a ‘standard risk’.
(b) It has been observed that the mortality experienced by policyholders
recently accepted (as standard risks) is lower than the mortality expe-
rienced by policyholders (of the same age) with a longer duration since
policy issue.

So, the answer to the above question is negative if the insureds have entered
insurance in different years: it is reasonable to expect that an individual,
who has just bought insurance, will be of better health than an individual
who bought insurance several years ago.
Hence, the attained age (45, in the example) should be split as follows:

attained age = age at entry + time since policy issue

The following notation is usually adopted to denote the annual probabilities
of death for an insured age 45:

q[45] , q[44]+1 , . . . , q[40]+5 , . . .

where the number in square brackets denotes the age at policy issue, whereas
the second number denotes the time since policy issue. In general, q[x]+u
denotes the probability of an individual currently aged x + u, who bought
insurance at age x, dying within one year.
According to point (b), it is usual to assume:

q[45] < q[44]+1 < · · · < q[40]+5 < · · ·

However, experience shows that it is reasonable to assume that the selection
effect vanishes after some years, say r years after policy issue. So, in
general terms, we can assume:

q[x] < q[x−1]+1 < · · · < q[x−r]+r = q[x−r−1]+r+1 = · · · = qx (2.15)

where qx denotes the probability of an individual currently age x, who
bought insurance more than r years ago, dying within one year. The period
r is called the select period.

Referring now to a person who bought insurance at age x, and assuming
a select period of r = 3 years, the following probabilities should be
used:
q[x] , q[x]+1 , q[x]+2 , qx+3 , qx+4 , . . . (2.16)

We denote by xmin and xmax the minimum and maximum age at entry,
respectively. The set of sequences (2.16), for x = xmin , xmin + 1, . . . , xmax , is
called a select table. In particular, the table used after the select period is
called an ultimate life table.
Conversely, life tables in which mortality depends on attained age only (as
is the case for the life tables described in Section 2.2.1) are called aggregate
tables.
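A select-and-ultimate lookup can be sketched as follows; the select period r = 3 and the mortality discounts below are assumptions for illustration, chosen so that the ordering in (2.15) holds.

```python
def q_select(select_rows, ultimate, entry_age, duration):
    """q_{[x]+u}: one-year death probability for a life that entered at age x,
    u years ago. Within the select period the row for entry age x applies;
    afterwards the ultimate table (indexed by attained age) is used."""
    row = select_rows[entry_age]
    if duration < len(row):
        return row[duration]
    return ultimate[entry_age + duration]

# Toy select-and-ultimate table with a select period of r = 3 years:
# selected lives get assumed discounts that wear off with duration.
ultimate = {age: 0.001 * 1.1 ** (age - 40) for age in range(40, 90)}
select = {age: [0.5 * ultimate[age],
                0.7 * ultimate[age + 1],
                0.9 * ultimate[age + 2]]
          for age in range(40, 80)}
```

For the attained age 48, for instance, q_[48] < q_[47]+1 < q_[46]+2 < q_[45]+3 = q48, mirroring relation (2.15).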
Select mortality also concerns life annuities. The person purchasing a life
annuity is likely to be in a state of good health, and hence it is reasonable to
assume that her/his probabilities of death, for a certain period after policy
issue, are lower than the probabilities of other individuals with the same
age. In this case, a self-selection effect operates.

Remark The selection effect, due to medical ascertainment (in the case of
insurances with death benefit) or self-selection (in the case of life annuities),
operates during the first years after policy issue, and the related age-pattern
of mortality is often called issue-select. Another type of selection is allowed
for, when some contingency can adversely affect the individual mortality.
For example, in actuarial calculations regarding insurance benefits in the
case of disability, the mortality of disabled policyholders is usually con-
sidered to be dependent on the time elapsed since the time of disablement
inception (as well as on the attained age). In this case, the mortality is called
inception-select. 

2.3 Moving to an age-continuous context


2.3.1 The survival function

Suppose that we have to evaluate the survival and death probabilities (like
(2.8), (2.9) and (2.10)) when ages and times are real numbers. Tools other
than the life table (as described in Section 2.2) are then needed.
Assume that the function S(t), called the survival function and defined
for t ≥ 0 as follows:
S(t) = P[T0 > t] (2.17)

has been assigned. Clearly, T0 denotes the random lifetime for a new-
born. In the age-continuous framework, it is usual to assume that the
possible outcomes of Tx lie in (0, +∞); nonetheless, we can assume that
the probability measure outside the interval (0, ω) is zero, where ω is the
limiting age.
Consider the probability (2.4); we have

P[Tx > h] = P[T0 > x + h | T0 > x] = P[T0 > x + h] / P[T0 > x]   (2.18)

we then find

h px = S(x + h) / S(x)   (2.19)
For probability (2.5), via the same reasoning, we obtain

h|k qx = (S(x + h) − S(x + h + k)) / S(x)   (2.20)

and, in particular

k qx = (S(x) − S(x + k)) / S(x)   (2.21)
Turning back to the life table, we note that, since lx is the expected number
of people alive at age x out of a cohort initially consisting of l0 individuals, we
have:
lx = l0 P[T0 > x] (2.22)
and, in terms of the survival function,
lx = l0 S(x) (2.23)
(provided that all individuals in the cohort have the same age-pattern of
mortality, described by S(x)). Thus, the lx ’s are proportional to the values
which the survival function takes on integer ages x, and so the life table can
be interpreted as a tabulation of the survival function.
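Relations (2.19)–(2.21), with S(x) = lx /l0 as in (2.23), translate directly into code. A minimal sketch, using a hypothetical life-table excerpt (the lx values are illustrative only):

```python
# Survival and death probabilities (2.19)-(2.21) from a small
# hypothetical life-table excerpt; S(x) = l_x / l_0 as in (2.23).
l = {0: 100000, 40: 95000, 45: 93500, 50: 91000, 60: 82000}

def S(x):
    return l[x] / l[0]

def p(h, x):
    # h px = S(x + h) / S(x), equation (2.19)
    return S(x + h) / S(x)

def deferred_q(h, k, x):
    # h|k qx = (S(x+h) - S(x+h+k)) / S(x), equation (2.20)
    return (S(x + h) - S(x + h + k)) / S(x)

def q(k, x):
    # k qx = (S(x) - S(x+k)) / S(x), equation (2.21)
    return deferred_q(0, k, x)

p10_40 = p(10, 40)            # 10-year survival probability at age 40
q_def = deferred_q(5, 5, 40)  # deferred death probability 5|5 q40
```

As a consistency check, h px + h qx = 1 and h|k qx = h px · k qx+h, both of which follow from the definitions above.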
Remark If a mathematical formula has been chosen to express the function
S(t), ‘exact’ survival and death probabilities can be calculated, with ages
and times given by real numbers. Conversely, when the survival function is
tabulated at integer ages only, for example, derived from the life table setting
S(x) = lx /l0 (see (2.23)), approximate methods are needed to calculate
survival and death probabilities at fractional ages. Some of these methods
are described in Section 2.3.5. 

Figure 2.1(a) illustrates the typical behaviour of the survival function
S(x). This behaviour reflects results of statistical observations on mortality,
as we will see in Chapter 3.

Figure 2.1. Survival functions. [Panels (a) and (b) plot S(x), decreasing from 1 to 0, against age x.]

Figure 2.1(b) focusses on the dynamic aspects of mortality. In particular,
two aspects (which emerge from mortality observations throughout time)
can be singled out:

– the survival curve moves over time (in a north-easterly direction) towards
a rectangular shape, and hence the term rectangularization is used to
describe this feature;
– the point of maximum downwards slope of the survival curve progres-
sively moves towards the very old ages; this feature is called the expansion
of the survival function.

These aspects will be considered in more detail in Chapter 7, when dealing
with longevity risk.

2.3.2 Other related functions

Other functions can be involved in age-continuous actuarial calculations.
The most important is the force of mortality (or mortality intensity), dealt
with in Section 2.3.3. In the present section we introduce the probability
density function (pdf) and the distribution function of the random variable
Tx , x ≥ 0.
First, we focus on the random lifetime T0 . Let f0 (t) and F0 (t) denote,
respectively, the pdf and the distribution function of T0 . In particular, F0 (t)
expresses, by definition, the probability of a newborn dying within t years.
Hence,
F0 (t) = P[T0 < t] (2.24)
or, according to the actuarial notation,

F0 (t) = t q0 (2.25)

Of course, we have
F0 (t) = 1 − S(t) (2.26)

The following relation holds between the pdf f0 (t) and the distribution
function F0 (t):

F0 (t) = ∫_0^t f0 (u) du   (2.27)
Usually it is assumed that, for t > 0, the pdf f0 (t) is a continuous function.
Then, we have

f0 (t) = (d/dt) F0 (t) = − (d/dt) S(t)   (2.28)
The pdf f0 (t) is frequently called the curve of deaths.
Figure 2.2(a) illustrates the typical behaviour of the pdf f0 (t). Equation
(2.28) justifies the relation between the curve of deaths and the survival
curve (see Fig. 2.1(a)). In particular, we note that the point of maximum
downward slope in the survival curve corresponds to the modal point (at
adult-old ages) in the curve of deaths.
Moving to the remaining lifetime at age x, Tx (x > 0), the following
relations link the distribution function and the pdf of Tx with the analogous
functions relating to T0 :

Fx (t) = P[Tx < t] = P[x < T0 ≤ x + t] / P[T0 > x] = (F0 (x + t) − F0 (x)) / S(x)   (2.29)

fx (t) = (d/dt) Fx (t) = (d/dt) F0 (x + t) / S(x) = f0 (x + t) / S(x)   (2.30)

From functions Fx (t) and fx (t) (and in particular, via (2.29) and
(2.30), from F0 (t) and f0 (t)), all of the probabilities involved in actuarial

Figure 2.2. Probability density function and force of mortality. [Panel (a) plots f0 (x), panel (b) the force of mortality µx , against age x.]



calculations can be derived. For example:

t px = 1 − Fx (t) = ∫_t^+∞ fx (u) du = (1/S(x)) ∫_t^+∞ f0 (x + u) du   (2.31)

2.3.3 The force of mortality

We refer to an individual age x, and consider the probability of dying before
age x + t (with x and t real numbers), namely t qx . The force of mortality
(or mortality intensity) is defined as follows:

µx = lim_{t↓0} P[Tx ≤ t] / t = lim_{t↓0} t qx / t   (2.32)

and hence it represents the instantaneous rate of mortality at a given age x.
In reliability theory, this concept is usually referred to as the failure rate or
the hazard function.
From

P[Tx ≤ t] = Fx (t) = (F0 (x + t) − F0 (x)) / S(x)   (2.33)
we obtain

µx = lim_{t↓0} (F0 (x + t) − F0 (x)) / (t S(x)) = f0 (x) / S(x)   (2.34)
or

µx = (− (d/dx) S(x)) / S(x) = − (d/dx) ln S(x)   (2.35)
S(x) dx
Hence, once the survival function S(x) has been assigned, the force of
mortality can be derived. Thus, the force of mortality does not add any
information concerning the age-pattern of mortality, provided that this has
been described in terms of S(x) (or f0 (x), or F0 (x)). Conversely, the role
of the force of mortality is to provide a tool for a fundamental statement
of assumptions about the behaviour of individual mortality as a function
of the attained age. The Gompertz model for the force of mortality (see
Section 2.5.1) provides an excellent example.
Note that, as µx = f0 (x)/S(x) (see (2.34)), the relation between the graph
of µx and the graph of f0 (x) (see Fig. 2.2) can be explained in terms of the
behaviour of S(x). When S(x) is close to 1, the two graphs are quite similar,
whereas as S(x) strongly decreases, µx definitely increases.
From (2.35), with the obvious boundary condition S(0) = 1, we obtain:

S(x) = exp(− ∫_0^x µu du)   (2.36)

As clearly appears from (2.36), the survival function S(x) can be obtained
once the force of mortality has been chosen. Clearly, the possibility of
finding a ‘closed’ form for S(x) strictly depends on the structure of µx .
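When no closed form is available, (2.36) can be evaluated numerically. A minimal sketch, assuming a Gompertz-type intensity (see Section 2.5.1) with purely illustrative parameters, so that the numerical result can be checked against the known closed form:

```python
import math

# Numerical evaluation of S(x) = exp(-H(x)), with H(x) = int_0^x mu_u du
# as in (2.36); the Gompertz intensity mu_x = alpha * e^(beta x) is used
# only as a check, since it admits a closed-form cumulative hazard.
alpha, beta = 5e-5, 0.095    # illustrative parameters, not fitted values

def mu(x):
    return alpha * math.exp(beta * x)

def S_numeric(x, steps=10000):
    # trapezoidal rule for the cumulative hazard H(x)
    h = x / steps
    grid = [mu(i * h) for i in range(steps + 1)]
    H = h * (sum(grid) - 0.5 * (grid[0] + grid[-1]))
    return math.exp(-H)

def S_closed(x):
    # Gompertz closed form: H(x) = (alpha/beta) (e^(beta x) - 1)
    return math.exp(-(alpha / beta) * (math.exp(beta * x) - 1))
```

For intensities without a tractable integral, only `S_numeric` is available; the closed form serves here to confirm the quadrature.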
Relations between the force of mortality and the basic mortality functions
relating to an individual age x can be easily found. For example, from (2.34)
and (2.30), we obtain

µx+t = f0 (x + t) / S(x + t) = fx (t) / (1 − Fx (t))   (2.37)

and hence
fx (t) = t px µx+t (2.38)

Finally, the cumulative standard force of mortality (or cumulative hazard
function) is defined as follows:

H(x) = ∫_0^x µu du   (2.39)

Remark A link between the quantities used in an age-discrete context
(like lx , dx , etc.) and the quantities used in age-continuous circumstances
(like S(x), f0 (x), etc.) may be of interest, especially when comparing
and interpreting graphical representations of data provided by statistical
experience.
The analogy between lx and S(x) immediately emerges from (2.23). As
regards dx (see equation (2.2)), the analogy with the pdf f0 (x) follows from
the fact that the former is minus the first-order difference of the function
lx , while the latter is minus the derivative of the survival function S(x).
Finally, an interesting link can be found between the probabilities h|1 qx
and the pdf fx (t). The quantities

h|1 qx = h px qx+h ;   h = 0, 1, . . . , ω − x

constitute the probability distribution of the curtate lifetime Kx (see (2.10)
and (2.11)). Conversely, in age-continuous circumstances, the pdf of the
probability distribution of Tx is given by

fx (t) = t px µx+t ; t ≥ 0

(see (2.38)). The analogy between the right-hand sides of the two expres-
sions is evident. Note, however, that fx (t) (as well as µx+t ) does not
represent a probability, the probability of a person age x dying between
age x + t and x + t + dt being given by fx (t) dt. 

2.3.4 The central death rate

The behaviour of the force of mortality over the interval (x, x + 1) can be
summarized by the central death rate at age x, which is usually denoted by
mx . The definition is as follows:

mx = ∫_0^1 S(x + u) µx+u du / ∫_0^1 S(x + u) du = (S(x) − S(x + 1)) / ∫_0^1 S(x + u) du   (2.40)

We note that mx is defined as the (age-continuous) weighted arithmetic
mean of the force of mortality over (x, x + 1), the weighting function being
the probability of being alive at age x + u, 0 < u ≤ 1, expressed in terms of
the survival function S(x + u).
The integral ∫_0^1 S(x + u) du can be approximated using the trapezoidal rule
(and an approximation has to be used when only a life table is available).
Then, we obtain an approximation to the central death rate:

m̃x = (S(x) − S(x + 1)) / ((S(x) + S(x + 1))/2)   (2.41)

Note that m̃x can also be expressed in terms of the annual probability
of survival or the annual probability of death. Indeed, from (2.41) we
immediately obtain:

m̃x = 2 (1 − px ) / (1 + px ) = 2 qx / (2 − qx )   (2.42)
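The three expressions in (2.41)–(2.42) are algebraically identical, which is easy to confirm numerically; a small sketch with illustrative values of S(x) and S(x + 1):

```python
# Check of (2.41)-(2.42): the trapezoidal approximation to the central
# death rate expressed through S, through p_x, and through q_x agree.
S_x, S_x1 = 0.90, 0.88          # S(x) and S(x+1), illustrative values
p_x = S_x1 / S_x                # annual survival probability
q_x = 1 - p_x                   # annual death probability

m_from_S = (S_x - S_x1) / ((S_x + S_x1) / 2)   # equation (2.41)
m_from_p = 2 * (1 - p_x) / (1 + p_x)           # first form of (2.42)
m_from_q = 2 * q_x / (2 - q_x)                 # second form of (2.42)
```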

2.3.5 Assumptions for non-integer ages

Assume that a life table (as described in Section 2.2) is available. How can
we obtain the survival function for all real ages x, and probabilities of death
and survival for all real ages x and durations t? In what follows, we describe
three approximate methods widely used in actuarial practice:

(a) Uniform distribution of deaths. Relation (2.23) suggests a practicable
approach. First, set S(x) = lx /l0 for all integer x using the available life
table. Then, for x = 0, 1, . . . , ω − 1 and 0 < t < 1, define

S(x + t) = (1 − t) S(x) + t S(x + 1) (2.43)

and assume S(x) = 0 for x > ω, so that the survival function is a piecewise
linear function. It is easy to prove that from (2.43) we obtain, in
particular, t qx = t · qx , that is, a uniform distribution of deaths between

exact ages x and x + 1, whence the name of this approximation. It is
also easy to prove that, from (2.43) and (2.35),

µx+t = qx / (1 − t · qx )   (2.44)
so that µx+t is an increasing function of t in the interval 0 < t < 1.
(b) Constant force of mortality. Let us assume, for 0 < t ≤ 1
µx+t = µ(x) (2.45)
where µ(x) denotes a value estimated from mortality observations. It
follows, in particular, that t px = e−t µ(x) . This assumption, consisting of a
piece-wise constant force of mortality, is frequently adopted in actuarial
calculations. We note that, from (2.40),
mx = µ(x) (2.46)
(c) The Balducci assumption. Let us define, for 0 < t ≤ 1

t qx = t · qx / (1 − (1 − t) qx )   (2.47)
The Balducci assumption has an important role in traditional actuar-
ial techniques for constructing life tables from mortality observations.
However, it is possible to prove that, from (2.47) and (2.35),
µx+t = qx / (1 − (1 − t) qx )   (2.48)
so that µx+t is a decreasing function of t in the interval 0 < t < 1: for
most ages, this would be an undesirable consequence of the Balducci
assumption.
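The three assumptions can be compared directly. For a common annual probability qx (illustrative value below), the fractional death probabilities t qx they produce differ for 0 < t < 1 but all reduce to qx at t = 1; a minimal sketch:

```python
# t q_x for a fractional duration t under the three assumptions of
# Section 2.3.5, with a common (illustrative) annual probability q_x.
qx = 0.10
t = 0.5

udd      = t * qx                           # (a) uniform distribution of deaths
const_mu = 1 - (1 - qx) ** t                # (b) from t p_x = (p_x)^t
balducci = t * qx / (1 - (1 - t) * qx)      # (c) equation (2.47)
```

At these values the three results are ordered udd < const_mu < balducci, in line with the increasing/constant/decreasing behaviour of µx+t under the three assumptions.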

2.4 Summarizing the lifetime probability distribution


Age-specific functions are usually needed in actuarial calculations. For
example, in the age-discrete context functions like lx , qx , etc. are commonly
used, whereas, for age-continuous calculations, the survival function S(x)
or the force of mortality µx are the usual starting points.
Nevertheless, the role of single-figure indices (or markers), summarizing
the lifetime probability distribution, should not be underestimated. In par-
ticular, important features of past mortality trends can be singled out by
focussing on the behaviour of some indices over time, as we will see in
Chapter 3.

2.4.1 The life expectancy

In the age-continuous context, the life expectancy (or expected lifetime) for
a newborn, denoted by ē0 , is defined as follows:

ē0 = E[T0 ] = ∫_0^∞ t f0 (t) dt   (2.49)

integrating by parts, we also find, in terms of the survival function:

ē0 = ∫_0^∞ S(t) dt   (2.50)

The definition can be extended to all (real) ages x. So, the expected
remaining lifetime at age x is given by

ēx = E[Tx ] = ∫_0^∞ t fx (t) dt   (2.51)

and also, integrating by parts, by

ēx = (1/S(x)) ∫_0^∞ S(x + t) dt   (2.52)

Note that, for an individual age x, the random age at death can be
expressed as x + Tx , and then the expected age at death is given by
x + E[Tx ] = x + ēx (2.53)

For all x, x > 0, the following inequality holds:


x + ēx ≥ ē0 (2.54)

The expected lifetime is often used to compare mortality in various
populations. In this regard, the following aspects should be stressed. The
definition of ēx is based on the probability distribution of the lifetime condi-
tional on being alive at age x. Thus, when x = 0 the probability distribution
involved has the pdf f0 (t) (see (2.49)), and hence mortality at all ages
contributes to the value of ē0 , including, for example, infant mortality.
Conversely, if x > 0 the conditional pdf fx (t) is involved, and so only the
age-pattern of mortality beyond age x determines the value of ēx .
The expected value of the curtate lifetime Kx is called the curtate expec-
tation of life at age x. It is usually denoted by ex , and is defined as
follows:
ex = E[Kx ] = Σ_{k=0}^{ω−x} k · k|1 qx   (2.55)

From (2.55), the following simpler expression can be derived:

ex = Σ_{k=1}^{ω−x} k px   (2.56)

Another interesting quantity is the so-called complete expectation of life
at age x, defined as follows:

e̊x = E[Kx + 1/2] = ex + 1/2   (2.57)

This quantity can be taken as an approximation to ēx , and is useful
when only a life table is available. Indeed, it is possible to prove that e̊x
is an approximation to ēx by applying the trapezoidal rule to the integral
in (2.52).
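The sums (2.56) and (2.57) are straightforward to compute once the qx ’s are given. A minimal sketch, using a toy mortality schedule (geometric growth of qx , forced to 1 at the limiting age; the figures are illustrative only):

```python
# Curtate expectation e_x via (2.56) and the complete expectation
# e_x + 1/2 via (2.57), from a toy q_x schedule (illustrative only).
omega = 110
x = 65
q = {age: 0.01 * 1.09 ** (age - x) for age in range(x, omega)}
q[omega - 1] = 1.0   # the table closes at the limiting age

# k p_x as running products of annual survival probabilities, (2.8)
kpx = [1.0]
for age in range(x, omega):
    kpx.append(kpx[-1] * (1 - q[age]))

e_x = sum(kpx[1:])          # equation (2.56)
e_ring_x = e_x + 0.5        # equation (2.57), the complete expectation
```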
Remark Age-specific functions (namely, functions of age x), like lx , qx , ex ,
etc. in the age-discrete context, and S(x), f0 (x), µx , ēx , etc. in the age-
continuous context, are frequently named biometric functions (or life table
functions, even in the age-continuous context). It should be noted that,
once any one of certain of these functions has been assigned, the other
functions (in the same context) can be derived. For example, in age-discrete
calculations from the lx values we can derive the functions qx , ex , etc.;
in the age-continuous framework, from the force of mortality µx the
survival function can be calculated and then all of the probabilities of
interest. 

2.4.2 Other markers

As is well known in probability theory, the expected value provides a
location measure of a probability distribution, and this is also the case for
the random lifetime T0 (or Tx in general). Other location measures can be
used to summarize the probability distribution of the random lifetime. In
particular:

– the modal value (at adult ages) of the curve of deaths, Mod[T0 ], also called
the Lexis point;
– the median value of the probability distribution of T0 , Med[T0 ], or
median age at death.

A number of variability measures can be used to summarize the dispersion
of the probability distribution of the lifetime. As we will see in Chapter 3,
in a dynamic context interesting information about the rectangularization

process can be obtained from these characteristics. Some examples follow:

– A traditional variability measure is provided by the variance of the
random lifetime, Var[T0 ], or its standard deviation,

σ0 = √Var[T0 ]   (2.58)
– The coefficient of variation, defined as

CV[T0 ] = √Var[T0 ] / E[T0 ] = σ0 / ē0   (2.59)

provides a relative measure of variability.
– The entropy H[T0 ] is defined as follows:

H[T0 ] = − ∫_0^∞ S(x) ln S(x) dx / ∫_0^∞ S(x) dx   (2.60)

thus, the entropy is minus the mean value of ln S(x), weighted by S(x); it
is possible to prove that, as deaths become more concentrated, the value
of H declines and, in particular, H = 0 if the survival function has a
perfectly rectangular shape.
– As deaths become more concentrated in an increasingly narrow interval,
the slope of the survival curve becomes steeper; thus, a lower variability
implies a steeper slope. A simple variability measure is therefore the
maximum downward slope of the graph of S(x) in the adult and old age
range. Formally, the slope at the point of fastest decline is

max_x {− (d/dx) S(x)} = max_x {S(x) µx } = max_x {f0 (x)}   (2.61)

Note that the point of fastest decline is Mod[T0 ], that is, the Lexis point.

Further characteristics of the random lifetime follow:

– the probability of a newborn dying before a given age x1 ,

x1 q0 = 1 − S(x1 )   (2.62)

which, for x1 small (say 1, or 5), provides a measure of infant mortality;
– the percentiles of the probability distribution of T0 ; in particular, the
10-th percentile, usually called endurance, is defined as the age ξ such that
S(ξ) = 0.90 (2.63)
– the interquartile range is defined as follows:

IQR[T0 ] = x′′ − x′   (2.64)

where x′ and x′′ are respectively the first quartile (the 25-th percentile)
and the third quartile (the 75-th percentile) of the probability distribution
of T0 , namely the ages such that S(x′ ) = 0.75 and S(x′′ ) = 0.25; note that
the IQR decreases as the lifetime distribution becomes less dispersed.
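Several of these markers can be computed directly from a tabulated or parametric survival function. A sketch assuming a Weibull-type S(x) = exp(−(x/b)^a) with illustrative parameters (chosen only to mimic a human-like survival curve); the percentile markers follow (2.63)–(2.64) and the entropy follows (2.60):

```python
import math

# Percentile markers (2.63)-(2.64) and the entropy (2.60), computed on
# a fine grid for S(x) = exp(-(x/b)^a) with illustrative a, b.
a, b = 7.0, 85.0

def S(x):
    return math.exp(-((x / b) ** a))

def age_at_level(level, hi=150.0, steps=15000):
    # smallest grid age with S(age) <= level (S is decreasing)
    for i in range(steps + 1):
        x = hi * i / steps
        if S(x) <= level:
            return x
    return hi

endurance = age_at_level(0.90)   # 10th percentile of T0, as in (2.63)
x_lo = age_at_level(0.75)        # first quartile x'
x_hi = age_at_level(0.25)        # third quartile x''
iqr = x_hi - x_lo                # interquartile range (2.64)

# entropy H[T0]: ratio of two integrals, trapezoidal rule on [0, 150]
n, hi = 15000, 150.0
h = hi / n
num = den = 0.0
for i in range(n + 1):
    x = i * h
    w = 0.5 if i in (0, n) else 1.0
    s = S(x)
    num += (-s * math.log(s) if s > 0 else 0.0) * w
    den += s * w
entropy = num / den
```

For this Weibull form the entropy equals 1/a exactly, which provides a check on the quadrature; larger a gives a more rectangular survival curve and a smaller entropy.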

While most markers refer to the probability distribution of T0 , it is
also interesting to single out some characteristics referring to individuals
alive at a chosen age x, that is, concerning the distribution of Tx , say with
x = 65 (of obvious interest when analysing the age-pattern of mortality
of annuitants and pensioners). An example is provided by the expected
remaining lifetime at age x, ēx (or the expected age at death, x + ēx ) (see
(2.52), (2.53)). Other examples are given by

– the variance Var[Tx ], the standard deviation σx = √Var[Tx ], and the
coefficient of variation CV[Tx ];
– the interquartile range IQR[Tx ].

For example, the analysis of the values of IQR[T65 ] related to various
subsequent mortality observations allows us to check whether the
rectangularization phenomenon occurs even when only old ages are addressed.
Figure 2.3 illustrates some markers of practical interest.

2.4.3 Markers under a dynamic perspective

Information provided by markers calculated on the basis of a period
observation must be carefully interpreted, in particular keeping in mind
mortality trends.
Consider, in particular, the complete expectation of life at age x (see
(2.57)), namely

e̊x = Σ_{k=1}^{ω−x} k px + 1/2   (2.65)
k=1

Probabilities k px are derived from the qx ’s according to (2.7) and (2.8), and,
in turn, the qx ’s are determined as the result of a (recent) period mortality
observation. The quantity e̊x is usually called the (complete) period life
expectancy.
The life expectancy drawn from a period life table can be taken as a
reasonable estimate of the remaining lifetime for an individual currently
age x only if we accept the hypothesis that, from now on, the age-pattern
of mortality will remain unchanged. See also the comments in Section 2.2.1
regarding the construction of the life table in terms of lx .

Figure 2.3. Some markers. [The curve of deaths f0 (x) against age x, showing: max{f0 (x)} at the Lexis point; the interquartile range IQR[T0 ] between x′ and x′′ ; the infant mortality measure x1 q0 ; the endurance; ē0 ; and ē65 + 65.]

When the hypothesis of unchanging future mortality is rejected, the
calculation of period quantities like e̊x (as well as other markers) and the
corresponding ‘cohort’ quantities requires the use of appropriate mortality
forecasts, and hence of projected life tables. This aspect will be dealt with
in Section 4.4.1.

2.5 Mortality laws


Since the earliest attempt to describe in analytical terms a mortality sched-
ule (due to A. De Moivre and dating back to 1725), great effort has been
devoted by demographers and actuaries to the construction of analytical
formulae (or laws) that fit the age-pattern of mortality. When a mortality
law is used to fit observed data, the age-pattern of mortality is summarized
by a small number of parameters (two to ten, say, in the mortality laws com-
monly used in actuarial and demographical models). This exercise has the
advantage of reducing the dimensionality of the problem – thus, we could
replace the 120, say, items of a life table by a small number of parameters
without sacrificing much information.

It is beyond the scope of this book to present an extensive list of mortality
laws. Conversely, we focus only on some important laws, which are
interesting because of their possible use in a dynamic context, that is, to
summarize observed mortality trends and to project the age-pattern of
mortality in future years.

2.5.1 Laws for the force of mortality

A number of mortality laws refer to the force of mortality, µx (although
some of them were originally proposed in different terms, for example,
in terms of the life table lx ).
The Gompertz law, proposed in 1825, is as follows:

µx = B c^x   (2.66)

Sometimes the following equivalent notation is used:

µx = α e^(βx)   (2.67)

It is interesting to look at the hypothesis underlying the Gompertz law.
Assume that, moving from age x to age x + ∆x, the increment ∆µx of the
mortality intensity is proportional to its initial value, µx , and to the length
of the interval, ∆x; thus

∆µx = β µx ∆x   (2.68)

This assumption leads to the differential equation

dµx /dx = β µx ,   β > 0   (2.69)
and finally to (2.67), with α > 0. The Gompertz law is used to represent the
age progression of mortality at the old ages, that is, the senescent mortality.
The (first) Makeham law, proposed in 1867, is a generalization of the
Gompertz law, namely
µx = A + B c^x   (2.70)
where the term A > 0 (independent of age) represents non-senescent mor-
tality, for example, because of accidents. An interpretation in more general
terms can be found in Section 2.5.3. The following equivalent notation is
also used:
µx = γ + α e^(βx)   (2.71)
The second Makeham law, proposed in 1890, is as follows:

µx = A + H x + B c^x   (2.72)

and hence constitutes a further generalization of the Gompertz law.



The Thiele law, proposed in 1871, can represent the age-pattern of
mortality over the whole life span:

µx = A e^(−Bx) + C e^(−D(x−E)^2) + F G^x   (2.73)

The first term decreases as the age increases and represents the infant mortal-
ity. The second term, which has a ‘Gaussian’ shape, represents the mortality
hump (mainly due to accidents) at young-adult ages. Finally, the third term
(of Gompertz type) represents the senescent mortality.
In 1932 Perks proposed two mortality laws. The first Perks law is as
follows:

µx = (α e^(βx) + γ) / (δ e^(βx) + 1)   (2.74)
Conversely, the second Perks law has the following more general structure:

µx = (α e^(βx) + γ) / (δ e^(βx) + ε e^(−βx) + 1)   (2.75)
As we will see in Section 2.8, Perks’ laws have an important role in repre-
senting the mortality pattern at very old ages (say, beyond 80); moreover,
the first Perks law can be reinterpreted in the context of the ‘frailty’ models
(see Section 2.9.5).
The Weibull law, proposed in 1951 in the context of reliability theory, is
given by

µx = A x^B   (2.76)

or, in equivalent terms:

µx = (α/β) (x/β)^(α−1)   (2.77)
The GM class of models (namely, the Gompertz–Makeham class of
models), proposed by Forfar et al. (1988), has the following structure:

µx = Σ_{i=0}^{r−1} αi x^i + exp(Σ_{j=0}^{s−1} βj x^j)   (2.78)

with the proviso that when r = 0 the polynomial term is absent, and when
s = 0 the exponential term is absent. The general model in the class (2.78) is
usually labelled as GM(r, s). Note that, in particular, GM(0, 2) denotes the
Gompertz law, GM(1, 2) the first Makeham law and GM(2, 2) the second
Makeham law. Models used by the Continuous Mortality Investigation
Bureau in the UK to graduate the force of mortality µx are of the GM(r, s)
type. In particular, models GM(0, 2), GM(2, 2), and GM(1, 3) have been
widely used.
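The nesting of the Gompertz and Makeham laws inside the GM class can be sketched in code. The following follows the structure of (2.78) (polynomial part plus exponentiated polynomial); parameter values are illustrative, not fitted:

```python
import math

# Evaluation of a GM(r, s) force of mortality as in (2.78):
# a polynomial with r terms plus the exponential of a polynomial
# with s terms (illustrative parameters only).
def gm_mu(x, alphas, betas):
    poly = sum(a * x ** i for i, a in enumerate(alphas))
    expo = math.exp(sum(b * x ** j for j, b in enumerate(betas))) if betas else 0.0
    return poly + expo

# GM(0,2) is the Gompertz law: mu_x = exp(beta0 + beta1 x) = alpha e^(beta x)
beta0, beta1 = math.log(5e-5), 0.095
mu_gompertz = gm_mu(60, [], [beta0, beta1])

# GM(1,2) is the first Makeham law: A + B c^x
A = 3e-4
mu_makeham = gm_mu(60, [A], [beta0, beta1])
```

Adding a second polynomial coefficient (GM(2,2)) would similarly reproduce the second Makeham law.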

2.5.2 Laws for the annual probability of death

Various mortality laws have been proposed in terms of the annual
probability of death, qx , and in terms of the odds φx (see (2.14)). For
example, Beard proposed in 1971 the following law:

qx = A + B c^x / (E c^(−2x) + 1 + D c^x)   (2.79)

Barnett proposed, in 1974, the following law for the odds:

φx = A − H x + B c^x   (2.80)

The odds can also be graduated using the following formula:

φx = e^(Px)   (2.81)

where Px is a polynomial in x. For example, with a first-degree polynomial,
we have

φx = e^(a+b x)   (2.82)
Heligman and Pollard (1980) proposed a class of formulae which aim to
represent the age-pattern of mortality over the whole span of life (as does
Thiele’s law, see (2.73)). The first Heligman–Pollard law, expressed in terms
of the odds, is

φx = A^((x+B)^C) + D e^(−E(ln x−ln F)^2) + G H^x   (2.83)

while the second Heligman–Pollard law, in terms of qx , is given by

qx = A^((x+B)^C) + D e^(−E(ln x−ln F)^2) + G H^x / (1 + G H^x)   (2.84)

Note that, in both cases, at higher ages we have

qx ≈ G H^x / (1 + G H^x)   (2.85)
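The second law (2.84), and the approximation (2.85) at the old ages, can be illustrated numerically. The parameter values below are purely illustrative (not fitted to any population), and the formula is applied for x ≥ 1 because of the ln x term:

```python
import math

# Second Heligman-Pollard law (2.84), with its three components
# computed separately; all parameter values are illustrative only.
A, B, C = 0.0005, 0.02, 0.10      # infant mortality term
D, E, F = 0.0004, 9.0, 22.0       # accident hump at young-adult ages
G, H = 3e-5, 1.10                 # senescent (Gompertz-like) term

def q_hp2(x):
    # valid for x >= 1 (the hump term involves ln x)
    infant = A ** ((x + B) ** C)
    hump = D * math.exp(-E * (math.log(x) - math.log(F)) ** 2)
    senescent = G * H ** x
    return infant + hump + senescent / (1 + senescent)

# at higher ages the first two terms are negligible and (2.85) holds
x = 95
approx = (G * H ** x) / (1 + G * H ** x)
```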

The third Heligman–Pollard law, which generalizes the second one, is as
follows:

qx = A^((x+B)^C) + D e^(−E(ln x−ln F)^2) + G H^x / (1 + K G H^x)   (2.86)
Another generalization of the second law is provided by the fourth
Heligman–Pollard law, which is given by

qx = A^((x+B)^C) + D e^(−E(ln x−ln F)^2) + G H^(x^k) / (1 + G H^(x^k))   (2.87)

2.5.3 Mortality by causes

When various (say, r) causes of death are singled out, the force of mortality
µx can be expressed in terms of ‘partial’ forces of mortality, each force
pertaining to a specific cause:

µx = Σ_{k=1}^{r} µx^(k)   (2.88)

where µx^(k) refers to the k-th cause of death.
Makeham proposed a reinterpretation of his first law (see (2.70)) in terms
of partial forces of mortality. Let

A = Σ_{k=1}^{m} Ak   (2.89)

and

B = Σ_{k=m+1}^{m+n} Bk   (2.90)

whence

µx = Σ_{k=1}^{m} Ak + c^x Σ_{k=m+1}^{m+n} Bk = Σ_{k=1}^{m+n} µx^(k)   (2.91)

2.6 Non-parametric graduation


2.6.1 Some preliminary ideas

The term ‘graduation’ denotes an adjustment procedure applied to a set
of estimated quantities, in order to obtain adjusted quantities which are
close to a reasonable pattern and, in particular, do not exhibit an erratic
behaviour. We note that previous experience and intuition suggest a smooth
progression.
In actuarial science, graduation procedures are typically applied to raw
mortality rates which result from statistical observation. Graduated series
of period mortality rates should exhibit a progressive change over a series
of ages, without sudden and/or huge jumps, which cannot be explained by
intuition or supported by past experience.
A detailed analysis of the various aspects of graduation is beyond the
scope of this book. So, we only focus on some topics which constitute
starting points for projection models presented in Chapters 5 and 6.

Various approaches to graduation can be adopted. In particular, two
broad categories can be recognized:

– parametric approaches, involving the use of mortality laws;
– non-parametric approaches.

According to a parametric approach, a functional form is chosen (e.g.
Makeham’s law, Heligman–Pollard’s law, and so on; see Section 2.5), and
the relevant parameters are estimated in order to find the parameter values
which provide the best fit to the observed data, for example, to mortality
rates. Various fitting criteria can be adopted for parameter estimation,
for example maximum likelihood, based on a Generalized Linear Models
formulation.
The choice of a particular functional form is avoided when a non-
parametric graduation method is adopted. Important methods in this
category are: weighted moving average methods, kernel methods, the
Whittaker–Henderson model, methods based on spline functions. In what
follows, we restrict our attention to the latter two methods only.

2.6.2 The Whittaker–Henderson model

The Whittaker–Henderson approach to graduation is based on the
minimization of an objective function. We denote by z1 , z2 , . . . , zn the
observed values of a given quantity, and by y1 , y2 , . . . , yn the corresponding
graduated values. For example, referring to the graduation of mortality
rates, zh could represent the raw mortality rate at age xh , namely m̂xh , and
yh the corresponding graduated value, mxh .
The objective function (to be minimized with respect to y1 , y2 , . . . , yn ) is
defined as follows:

F(y1 , y2 , . . . , yn ) = Σ_{h=1}^{n} wh (yh − zh )^2 + λ Σ_{h=1}^{n−k} (∆^k yh )^2   (2.92)
h=1 h=1

where

– w1 , w2 , . . . , wn are weights attributed to the squared deviations;


– ∆^k yh is the k-th forward difference of yh , defined as follows:

∆^k yh = Σ_{i=0}^{k} (−1)^i C(k, i) y_{h+k−i}   (2.93)

where C(k, i) denotes the binomial coefficient;

– λ is a (constant) parameter.

The first term on the right-hand side of formula (2.92) provides a mea-
sure of the discrepancy between observed and graduated values. The choice
of each weight wh allows us to attribute more or less importance to the
squared deviation related to the h-th observation. In particular, referring
to the graduation of mortality rates, an appropriate choice of the weights
should reflect a low importance attributed to the raw mortality rates con-
cerning very old ages at which few individuals are alive, and hence the
observed values could be affected by erratic behaviour. To this purpose,
the weights can be chosen to be inversely proportional to the estimated
variance of the observed mortality rates.
The second term on the right-hand side of (2.92) quantifies the degree
of roughness in the set of graduated values. Usually, the value of k is
set equal to 2, 3, or 4. Finally, the parameter λ allows us to express our
‘preference’ regarding features of the graduation results: higher values of
λ denote a stronger preference for a smooth behaviour of the graduated
values, whereas lower values express more interest in the fidelity of the
graduated values to the observed ones.
The objective function can be generalized and modified. For example,
it has been proposed to replace, in the first term of the right-hand side of
(2.92), the squared deviations with other powers. As regards the second
term, a mixture of differences of various orders can be used instead of the
k-th differences only.
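Since (2.92) is quadratic in the yh ’s, its minimizer can be obtained by solving a linear system: writing W = diag(w1 , . . . , wn ) and K for the (n − k) × n matrix of k-th forward differences, setting the gradient to zero gives (W + λ KᵀK) y = W z, a standard result. A pure-Python sketch with illustrative data (uniform weights, a short series of noisy “raw rates”):

```python
# Minimal Whittaker-Henderson graduation of (2.92):
# solve (W + lam * K'K) y = W z, K = k-th forward-difference matrix.

def diff_matrix(n, k):
    # build the (n - k) x n matrix of k-th forward differences by
    # repeatedly differencing the rows of the identity matrix
    rows = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        rows = [[rows[i + 1][j] - rows[i][j] for j in range(n)]
                for i in range(len(rows) - 1)]
    return rows

def solve(M, v):
    # Gaussian elimination with partial pivoting on the augmented matrix
    n = len(v)
    M = [row[:] + [v[i]] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    y = [0.0] * n
    for r in range(n - 1, -1, -1):
        y[r] = (M[r][n] - sum(M[r][j] * y[j] for j in range(r + 1, n))) / M[r][r]
    return y

def whittaker_henderson(z, w, lam, k=2):
    n = len(z)
    K = diff_matrix(n, k)
    # A = diag(w) + lam * K'K ;  b = w * z (elementwise)
    A = [[lam * sum(K[r][i] * K[r][j] for r in range(n - k))
          for j in range(n)] for i in range(n)]
    for i in range(n):
        A[i][i] += w[i]
    b = [w[i] * z[i] for i in range(n)]
    return solve(A, b)

# noisy "raw rates" around a smooth trend (illustrative data)
z = [0.010, 0.013, 0.011, 0.016, 0.015, 0.020, 0.019, 0.024]
w = [1.0] * len(z)
y = whittaker_henderson(z, w, lam=10.0, k=2)
```

The two limiting behaviours described above are easy to verify: with λ = 0 the graduated values reproduce the observations exactly (pure fidelity), while a very large λ forces the k-th differences towards zero (pure smoothness).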

2.6.3 Splines

A spline is a function defined piecewise by polynomials. We denote by
[a, b] an interval of real numbers, and by ξ0 , ξ1 , . . . , ξm , ξm+1 real numbers
such that

a = ξ0 < ξ1 < · · · < ξm < ξm+1 = b   (2.94)
Let s denote the spline function, and p0 , p1 , . . . , pm the polynomials. Thus,
the spline function is defined as follows:

s(x) = p0 (x)   if ξ0 ≤ x < ξ1
       p1 (x)   if ξ1 ≤ x < ξ2
       . . .
       pm (x)   if ξm ≤ x ≤ ξm+1          (2.95)

The m + 2 numbers ξ0 , ξ1 , . . . , ξm+1 are called the knots. In particular,


ξ1 , . . . , ξm are the internal knots. If the knots are equidistantly distributed in
[a, b], the spline is called a uniform spline (a non-uniform spline otherwise).
70 2 : The basic mortality model

As regards the behaviour of s(x) in a neighbourhood of the generic knot


ξh , a measure of smoothness is provided by the maximum order of the
derivative of the polynomials such that the polynomials ph−1 and ph have
common derivative values; if the maximum order is k, the spline is said to
have smoothness (or continuity) of class C k at ξh .
When all polynomials have degree at most r, the spline is said to be of
degree r. A spline of degree 0 is a step function. A spline of degree 1 is
also called a linear spline. An example of a linear spline is provided by
the piecewise linear survival function, constructed by assuming as knots all
of the integer ages and adopting the hypothesis of uniform distribution of
deaths over each year of age (see point (a) in Section 2.3.5).
A spline of degree 3 is a cubic spline. In particular, a natural cubic spline
has continuity C 2 at all of the knots, and the second derivatives of the
polynomials equal to 0 in a and b; thus, the spline is linear outside the
interval [a, b].
It can be proved that, for a given interval [a, b] and a given set of m
internal knots, the set of splines of degree r constitutes a (real) vector space
of dimension d = m + r + 1. A basis for this space is provided by the
following d functions:

1, x, . . . , xr , [(x − ξ1 )+ ]r , . . . , [(x − ξm )+ ]r (2.96)

where

(x − ξ_h)_+ =
  0;         x < ξ_h
  x − ξ_h;   x ≥ ξ_h        (2.97)
for h = 1, . . . , m. The corresponding representation of the spline function
is given by:
s(x) = Σ_{j=0}^{r} α_j x^j + Σ_{h=1}^{m} β_h [(x − ξ_h)_+]^r        (2.98)
where the αj ’s and the βh ’s are the coefficients of the linear combination.
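As an illustration (the knots and coefficients below are invented for the example), the basis (2.96) and the representation (2.98) translate directly into code:

```python
import numpy as np

def trunc_power_basis(x, knots, r=3):
    """Evaluate the d = m + r + 1 basis functions (2.96) at the points x:
    1, x, ..., x^r, [(x - xi_1)_+]^r, ..., [(x - xi_m)_+]^r."""
    x = np.asarray(x, dtype=float)
    cols = [x ** j for j in range(r + 1)]
    cols += [np.maximum(x - xi, 0.0) ** r for xi in knots]
    return np.column_stack(cols)            # shape (len(x), m + r + 1)

def spline_eval(x, knots, coef, r=3):
    """s(x) = sum_j alpha_j x^j + sum_h beta_h [(x - xi_h)_+]^r, cf. (2.98)."""
    return trunc_power_basis(x, knots, r) @ coef

# Illustrative cubic spline (r = 3) with m = 2 internal knots, so d = 6
knots = [1.0, 2.0]
coef = np.array([0.5, 1.0, -0.2, 0.05, 0.3, -0.1])
s = spline_eval(np.linspace(0.0, 3.0, 7), knots, coef)
```

Below the first internal knot the truncated-power terms vanish, so s(x) reduces to the polynomial part alone.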
If d is the dimension of the space, then any basis consists of d elements.
We denote by b1 , b2 , . . . , bd a basis. Hence, any spline s in the space can be
represented as a linear combination of these functions, namely

s(x) = γ1 b1 (x) + γ2 b2 (x) + · · · + γd bd (x) (2.99)

where the coefficients γ1 , γ2 , . . . , γd are uniquely determined by the


function s.
The choice of a basis constitutes a crucial step in the graduation process
through splines. The starting point of this process is the choice of the result

we want to achieve by using a spline function, and the related objective


function to optimize.
We assume that our target is a ‘best fit’ graduation, namely we require
that the spline function is as close as possible (according to a stated criterion)
to our data set, consisting of n points,

(x1 , z1 ), (x2 , z2 ), . . . , (xn , zn ) (2.100)

with a ≤ xh ≤ b for h = 1, 2, . . . , n. For example, referring to actu-


arial applications, the xh ’s may represent ages, whereas the zh are the
corresponding observed mortality rates (namely the m̂xh ’s referred to in
Section 2.6.2).
As regards the best-fit criterion, we focus on the weighted mean square
error, expressed by the quantity

Σ_{h=1}^{n} w_h [s(x_h) − z_h]^2        (2.101)

where the wh ’s are positive weights. Using (2.99) to express the spline func-
tion, our best-fit problem can be stated as follows: find the coefficients
γ_1, γ_2, . . . , γ_d which minimize the function

G(γ_1, γ_2, . . . , γ_d) = Σ_{h=1}^{n} w_h [ Σ_{j=1}^{d} γ_j b_j(x_h) − z_h ]^2        (2.102)

Although minimizing the function G is, in principle, a simple exer-


cise which consists in solving a set of simultaneous equations, in practice
computational difficulties may arise. However, the complexity of the min-
imization problem can be reduced if a particular basis is chosen in order
to express the spline function s, namely the one consisting of the so-called
B-splines.
A formal definition of the B-splines and a detailed discussion of their use
as a basis in graduation problems through splines is beyond the scope of
this Section. The interested reader can refer, for example, to McCutcheon
(1981). We just mention that the idea underlying the B-splines is to choose
a basis such that each spline in the basis is zero outside a short interval.
Typically, the basis consists of cubic polynomial pieces, smoothly joined
together. In particular, when the spline function is uniform (i.e. the knots
are equidistantly distributed), the B-splines are (for a given degree) just
shifted copies of each other. The advantage provided by B-splines in the
minimization problem (2.102) derives from the fact that, as each B-spline

is zero outside a given short interval, the matrix involved by solving the
related set of simultaneous equations has many entries equal to zero, and
this improves the tractability of the best-fit problem.
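A minimal weighted least-squares sketch of problem (2.102), here using the truncated-power basis (2.96) rather than B-splines (with a B-spline basis the design matrix would be banded); the data, weights, and knots are illustrative:

```python
import numpy as np

def fit_spline_wls(xs, zs, ws, knots, r=3):
    # Basis (2.96): 1, x, ..., x^r plus the truncated powers at the internal knots
    B = np.column_stack([xs ** j for j in range(r + 1)] +
                        [np.maximum(xs - xi, 0.0) ** r for xi in knots])
    sw = np.sqrt(ws)
    # Weighted least squares: minimize sum_h w_h (B gamma - z)_h^2, cf. (2.102)
    gamma, *_ = np.linalg.lstsq(B * sw[:, None], zs * sw, rcond=None)
    return gamma, B

rng = np.random.default_rng(1)
xs = np.linspace(60.0, 90.0, 31)
zs = 0.01 * np.exp(0.09 * (xs - 60.0)) + rng.normal(0.0, 5e-4, xs.size)
ws = np.ones_like(xs)
u = xs - 60.0                                  # shifted ages, for conditioning
gamma, B = fit_spline_wls(u, zs, ws, knots=[10.0, 20.0])
fitted = B @ gamma
```

Shifting the ages before building the powers is a purely numerical device: it keeps the design matrix better conditioned without changing the fitted spline.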
Spline functions can be introduced by adopting a different approach,
namely the ‘variational approach’. Following Champion et al. (2004), we
start by defining an interpolation problem. Assume that we need to find a
function f interpolating the n data points (x1 , z1 ), (x2 , z2 ), . . . , (xn , zn ), that
is, such that
f (xh ) = zh ; h = 1, 2, . . . , n (2.103)

Among all functions f fulfilling condition (2.103), we are interested in those


which have a continuous second derivative and a ‘limited’ oscillation (i.e. a
smooth behaviour) in the interval [x1 , xn ]. We introduce the functional
Φ[f] = ∫_{x_1}^{x_n} [f″(x)]^2 dx        (2.104)

(where f″(x) denotes the second derivative of f) as a measure of oscillation.


Then, it is possible to prove that a cubic spline is the only function which
minimizes the functional (2.104).
We now shift from the interpolation problem to a graduation problem.
To this purpose, we use the following functional in order to express our
objective:
Φ[f] = Σ_{h=1}^{n} [z_h − f(x_h)]^2 + λ ∫_{x_1}^{x_n} [f″(x)]^2 dx        (2.105)

Clearly, the functional (2.105) generalizes (2.104). The first term on the right-
hand side of (2.105) provides a measure of the discrepancy between the data
zh ’s and the graduated values f (xh )’s, whereas the second term can be inter-
preted as a measure of smoothness. The parameter λ allows us to express
our preference in the trade-off between closeness to data and smoothness.
The analogy with the structure of formula (2.92) is self-evident.
It can be proved that, among all functions f with continuous second
derivatives, there is a unique function which minimizes the functional
(2.105).
Finally, it is worth noting that the spline functions so far dealt with are
‘univariate’ splines, as their domains consist of intervals of real numbers.
Extension to a bivariate context is possible; an example will be presented
in Section 5.4, together with the more general concept of P-splines (namely,
‘Penalized’ splines).

2.7 Some transforms of the survival function


Some transforms of life table functions may help us in reaching a bet-
ter understanding of some aspects of the age-pattern of mortality (and of
mortality trends as well). Two examples will be provided: the logit trans-
form of the survival function S(x), and the so-called resistance function.
Some aspects of their use in mortality projections will be addressed in
Section 4.6.3.
The logit transform of the survival function is defined as follows:

Λ(x) = (1/2) ln[(1 − S(x)) / S(x)]        (2.106)

Features of this transform have been analysed by Brass (see e.g. Brass (1974)). In particular, Brass noted empirically that Λ(x) can be expressed in terms of the logit of the survival function describing the age-pattern of mortality in a ‘standard’ population, Λ*(x), via a linear relation, that is,

Λ(x) = α + β Λ*(x)        (2.107)
whose parameters are (almost) independent of age.
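The transforms (2.106) and (2.107) are easy to experiment with; in the sketch below the ‘standard’ survival function is an illustrative Gompertz-based one, and the values of α and β are arbitrary:

```python
import numpy as np

def logit_survival(S):
    """Brass logit (2.106): Lambda(x) = 0.5 * ln((1 - S(x)) / S(x))."""
    return 0.5 * np.log((1.0 - S) / S)

def survival_from_logit(L):
    """Invert (2.106): S(x) = 1 / (1 + exp(2 * Lambda(x)))."""
    return 1.0 / (1.0 + np.exp(2.0 * L))

# Illustrative 'standard' survival function (Gompertz-based)
x = np.arange(1, 100)
H_std = (0.0002 / 0.1) * (np.exp(0.1 * x) - 1.0)   # cumulative hazard
S_std = np.exp(-H_std)

# Relational model (2.107): Lambda(x) = alpha + beta * Lambda*(x)
alpha, beta = -0.2, 1.1
S_new = survival_from_logit(alpha + beta * logit_survival(S_std))
```

Inverting the logit recovers the standard survival function exactly, and any (α, β) choice yields a proper survival curve with values in (0, 1).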
Figures 2.4–2.6 show the effect of various choices for the parameters α
and β.
A different transform of the survival function S(x) has been proposed by
Petrioli and Berti (1979). The proposed transform is the so-called resistance
function, defined as follows:
ρ(x) = [S(x)/(ω − x)] / [(1 − S(x))/x]        (2.108)

Figure 2.4. Logit transforms (a) and survival functions (b) for β = 1 and α = 0, −0.2, 0.2.



Figure 2.5. Logit transforms (a) and survival functions (b) for α = 0 and β = 1, 1.25, 0.75.

Figure 2.6. Logit transforms (a) and survival functions (b) for (α, β) = (0, 1) and (−0.2, 1.25).

where ω denotes, as usual, the limiting age. Thus, the transform is the ratio
of the average annual probability of death beyond age x to the average
annual probability of death prior to age x (both probabilities being referred
to a newborn).
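A direct transcription of (2.108); the survival curve and limiting age used here are purely illustrative:

```python
import numpy as np

def resistance(S, x, omega):
    """Petrioli-Berti resistance function (2.108):
    rho(x) = [S(x) / (omega - x)] / [(1 - S(x)) / x]."""
    return (S / (omega - x)) / ((1.0 - S) / x)

omega = 110.0
x = np.arange(1.0, 110.0)
S = 1.0 - (x / omega) ** 2        # illustrative survival curve
rho = resistance(S, x, omega)
```

For instance, at the age where S(x) = 0.75 and x = ω − x, the two average annual probabilities are 0.75 and 0.25 per year, so ρ(x) = 3.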

2.8 Mortality at very old ages


2.8.1 Some preliminary ideas

Several problems arise when analysing the mortality experience of very


old population segments. A first problem obviously concerns the observed
old-age mortality rates, which are heavily affected by random fluctuations
because of the scarcity of the underlying data. In the past, mortality at very old ages was largely
hypothetical and assumptions were normally made as the result of extrap-
olations from younger ages, based on models such as the Gompertz or
the Makeham law. In recent times, mortality statistics have been improved

Figure 2.7. Mortality at highest ages (force of mortality µx against age x: Gompertz–Makeham–Thiele behaviour; Lindbergson; e.g. logistic).

in many countries, and provide stronger evidence about the shape of the
mortality curve at old and very old ages.
In particular, it has been observed that the force of mortality is slowly
increasing at very old ages, approaching a rather flat shape. In other words,
the exponential rate of mortality increase at very old ages is not constant,
as for example in Gompertz’s law (see (2.66)), but declines (see Fig. 2.7).
However, a basic problem arises when discussing the appropriateness of
mortality laws in representing the pattern of mortality at old ages: ‘what’
force of mortality are we dealing with? We will return to this important
issue in Section 2.9.3.
As classical mortality laws may fail in representing the very old-age mor-
tality, shifting from the exponential assumption may be necessary in order
to fit the relevant pattern of mortality.

2.8.2 Models for mortality at highest ages

Several alternative models have been proposed. In Section 2.5.2 we have


addressed the Heligman–Pollard family of laws, which aim to represent
the age-pattern of mortality over the whole span of life. As regards old
ages, according to the first and the second Heligman–Pollard law, q_x can be approximated by G H^x / (1 + G H^x) (see (2.85)). Conversely, the third Heligman–Pollard law when applied to old ages reduces to

q_x = G H^x / (1 + K G H^x)        (2.109)

In Perks’ laws (see (2.74) and (2.75)), the denominators have the effect
of reducing the mortality especially at old and very old ages. In particular
the graph of the first law is a logistic curve.

The logistic model for the force of mortality proposed by Thatcher (1999)
assumes that
µ_x = δ α e^{βx} / (1 + α e^{βx}) + γ        (2.110)
Its simplified version, used in particular for studying long-term trends and
forecasting mortality at very old ages, has δ = 1 and hence has only three
parameters, namely α, β, and γ:

µ_x = α e^{βx} / (1 + α e^{βx}) + γ        (2.111)
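The contrast between the exponential and the logistic behaviour at the highest ages can be seen numerically; the parameter values below are illustrative:

```python
import numpy as np

def mu_gompertz(x, alpha, beta):
    # Exponentially increasing force of mortality, cf. (2.66)
    return alpha * np.exp(beta * x)

def mu_logistic(x, alpha, beta, gamma=0.0):
    # Simplified logistic force of mortality, cf. (2.111)
    e = alpha * np.exp(beta * x)
    return e / (1.0 + e) + gamma

x = np.arange(60, 121)
mg = mu_gompertz(x, 2e-5, 0.11)
ml = mu_logistic(x, 2e-5, 0.11)
```

The logistic force stays below the Gompertz one at every age and flattens towards a plateau (here 1 + γ), whereas the Gompertz force grows without bound.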

A modified version of the Makeham law has been proposed by


Lindbergson (2001), replacing the exponential growth with a straight line
at very old ages:

µ_x =
  a + b c^x               if x ≤ w
  a + b c^w + d (x − w)   if x > w        (2.112)

The model proposed by Coale and Kisker (see Coale and Kisker (1990))
relies on the so-called exponential age-specific rate of change of central
death rates, defined as follows:
k_x = ln(m_x / m_{x−1})        (2.113)

The model assumes that kx is linear beyond age 85:

kx = k85 − (x − 85) s (2.114)

as documented by statistical evidence. The parameter s is determined


assuming that k85 is calculated from empirical data, whereas a predeter-
mined value is given to the mortality rate m110 . For given values of kx ,
x = 85, 86, . . . , 110, we find from (2.113)

m_x = m_85 exp( Σ_{h=86}^{x} k_h )        (2.115)

From (2.115) it follows that the Coale–Kisker model implies an exponential-


quadratic function for central death rates at the relevant ages, that is,

mx = exp(a x2 + b x + c) (2.116)

which is clearly in contrast with the Gompertz assumption.
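A sketch of the Coale–Kisker extrapolation: combining (2.114) with (2.115) at x = 110 gives the closing condition ln(m110/m85) = 25 k85 − 325 s, which determines s; the inputs m85, k85, and m110 below are illustrative values, not empirical ones.

```python
import numpy as np

def coale_kisker(m85, k85, m110):
    """Extrapolate central death rates m_x for x = 85..110 with the
    Coale-Kisker model: k_x = k85 - (x - 85) * s, cf. (2.114), where s is
    fixed by the predetermined value of m110."""
    ages = np.arange(85, 111)
    # sum_{h=86}^{110} k_h = 25*k85 - 325*s must equal ln(m110 / m85)
    s = (25.0 * k85 - np.log(m110 / m85)) / 325.0
    k = k85 - (ages - 85) * s
    # (2.115): m_x = m85 * exp(sum of k_h for h = 86..x); k[0] is not summed
    m = m85 * np.exp(np.cumsum(np.concatenate([[0.0], k[1:]])))
    return ages, m, s

# Illustrative inputs: 'empirical' m85 and k85, predetermined m110
ages, m, s = coale_kisker(m85=0.18, k85=0.09, m110=1.0)
```

By construction the extrapolated sequence starts at m85 and closes exactly at the predetermined m110.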



2.9 Heterogeneity in mortality models


It is well known that any given population is affected by some degree of
heterogeneity, as far as individual mortality is concerned. Heterogeneity in
populations should be approached addressing two main issues:

(i) detecting and modelling observable heterogeneity factors (e.g. age,


gender, occupation, etc.);
(ii) allowing for unobservable heterogeneity factors.

2.9.1 Observable heterogeneity factors

As regards observable factors, mortality depends on:

(1) biological and physiological factors, such as age, gender, genotype;


(2) features of the living environment; in particular: climate and pollution,
nutritional standards (mainly with reference to excesses and deficiencies
in diet), population density, hygienic and sanitary conditions;
(3) occupation, in particular in relation to professional disabilities or
exposure to injury, and educational attainment;
(4) individual lifestyle, in particular with regard to nutrition, alcohol and
drug consumption, smoking, physical activities and pastimes;
(5) current health conditions, personal and/or family medical history, civil
status, and so on.

Item 2 affects the overall mortality of a population. That is why mortality


tables are typically considered specifically for a given geographic area. The
remaining items concern the individual and, when dealing with life insur-
ance, they can be observed at policy issue. Their assessment is performed
through appropriate questions in the application form and, as to health
conditions, possibly through a medical examination.
The specific items considered for insurance rating depend on the types of
benefits provided by the insurance contract (see also Section 2.2.2). The aim
of the insurer is to group people in classes within which insured lives bear
the same expected mortality profile. Age is always considered, due to the
apparent variability of mortality in this regard. Gender is usually accounted
for, especially when living benefits are involved, given that females on aver-
age live longer than males. As far as genetic aspects are concerned, the
evolving knowledge in this area has raised a lively debate (which is still
running) on whether it is legitimate for insurance companies to resort to
genetic tests for underwriting purposes. Applicants for living benefits are

usually in good health condition, so a medical examination is not neces-


sary; on the contrary, a proper investigation is needed for those who buy
death benefits, given that people in poorer health conditions may be more
interested in them and hence more likely to buy such benefits.
When death benefits are dealt with, health conditions, occupation and
smoking status lead to a classification into standard and substandard risks;
for the latter (also referred to as impaired lives), a higher premium level is
adopted, given that they bear a higher probability of becoming eligible for
the benefit. In some markets, standard risks are further split into regular
and preferred risks, the latter having a better profile than the former (e.g.
because they never smoked); as such, they are allowed to pay a reduced
premium rate.
Mortality for people in poorer or better conditions than the average
is usually expressed in relation to average (or standard) mortality. This
allows us to deal only with one life table (or one mortality law), prop-
erly adjusted when substandard or preferred risks are dealt with. For
the case of life annuities, usually specific tables are constructed for each
subpopulation.

2.9.2 Models for differential mortality

Let us index with (S) standard mortality and with (D) a different (higher
or lower) mortality. Below, some examples of differential mortality models
follow.

q_x^(D) = a q_x^(S) + b        (2.117)

µ_x^(D) = a µ_x^(S) + b        (2.118)

q_x^(D) = q_{x+z}^(S)        (2.119)

µ_x^(D) = µ_{x+z}^(S)        (2.120)

q_x^(D) = q_x^(S) ϕ(x)        (2.121)

q_{[x−t]+t}^(D) = q_x^(S) ρ(x − t, t)        (2.122)

q_{[x−t]+t}^(D) = q_{[x−t]+t}^(S) ν(t)        (2.123)

q_{[x−t]+t}^(D) = q_{[x−t]+t}^(S) η(x, t)        (2.124)

In any case, x is the current age and t the time elapsed since policy issue
(t ≥ 0), whence x − t is the age at policy issue.

Models (2.117) and (2.118) are usually adopted for substandard risks.
Letting a = 1 and b = δ q_{x−t}^(S), δ > 0, in (2.117) (b = δ µ_{x−t}^(S) in (2.118)) the so-
called additive model is obtained, where the increase in mortality depends
on initial age. An alternative model is obtained choosing b = θ, θ > 0, that
is, a mortality increase which is constant and independent of the initial age;
such a model is consistent with extra-mortality due to accidents (related
either to occupation or to extreme sports). Letting a = 1 + γ, γ > 0, and
b = 0 the so-called multiplicative model is derived, where the mortality
increase depends on current age. When risk factors are only temporarily
effective (e.g. some diseases which either lead to an early death or have a
short recovery time), parameters a, b may be positive up to some proper
time τ; for t > τ, standard mortality is assumed, so that a = b = 0.
Models (2.119) and (2.120) are very common in actuarial practice, both
for substandard and preferred risks, due to their simplicity; they are called
age rating or age shifting models. Model (2.120), in particular, can be
formally justified, assuming the Gompertz law for the standard force of
mortality and the multiplicative model for differential mortality. Actually,
if µ_x^(S) = α e^{βx} (see (2.67)), we have from (2.118), with a = 1 + γ and b = 0,

µ_x^(D) = (1 + γ) α e^{βx} = α e^{β(x+z)} = µ_{x+z}^(S)        (2.125)

where eβz = 1 + γ. In insurance practice, the age-shifting is often applied


directly to premium rates.
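The equivalence (2.125) between the multiplicative loading and an age shift z = ln(1 + γ)/β is easily checked numerically; the Gompertz parameters and the loading γ below are illustrative:

```python
import numpy as np

# Sketch of the age-shifting equivalence (2.125) under a Gompertz
# standard force of mortality (parameter values illustrative)
alpha, beta, gamma = 2e-5, 0.1, 0.3
z = np.log(1.0 + gamma) / beta           # shift implied by e^(beta z) = 1 + gamma

x = np.arange(40, 100, dtype=float)
mu_std = alpha * np.exp(beta * x)        # standard force of mortality
mu_sub = (1.0 + gamma) * mu_std          # multiplicative model, a = 1 + gamma
mu_shifted = alpha * np.exp(beta * (x + z))
```

The loaded force and the age-shifted force coincide at every age, which is exactly the content of (2.125).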
In (2.121), mortality is adjusted in relation to age. Such a choice is com-
mon when annuities are dealt with. For example, ϕ(x) may be a step-wise
linear function.
The other models listed above concern the effect on mortality of the time
elapsed since policy issue, t (see Section 2.2.4). Model (2.122) expresses
issue-select mortality in terms of aggregate mortality (so that, differential
mortality simply means, in this case, select mortality). Conversely, models
(2.123) and (2.124) express issue-select differential mortality through a
transform of the issue-select standard probabilities of death; in particular,
ν(t) and η(x, t) may be chosen to be linear.
A particular implementation of model (2.117) (with b = 0) is given by
the so-called numerical rating system, introduced in 1919 by New York
Life Insurance and still adopted by many insurers. A set of m risk factors is
referred to. The annual probability of death specific for a given individual is

q_x^(spec) = q_x^(S) ( 1 + Σ_{h=1}^{m} γ_h )        (2.126)

where the parameters γ_h lead to a higher or lower death probability for the individual in relation to the values assumed by the chosen risk factors (clearly, with the constraint −1 < Σ_{h=1}^{m} γ_h < (1/q_x^(S)) − 1). Note that an additive effect of each of the risk factors is assumed.
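A minimal sketch of (2.126); the loadings below are invented for illustration and are not actual rating-system values:

```python
# Numerical rating system, cf. (2.126): additive loadings on a standard rate
def q_specific(q_standard, loadings):
    total = sum(loadings)
    # Admissible range for the total loading: -1 < total < 1/q_standard - 1
    assert -1.0 < total < 1.0 / q_standard - 1.0
    return q_standard * (1.0 + total)

# e.g. loadings for occupation (+25%), build (-10%), smoking status (+40%)
q_spec = q_specific(0.004, [0.25, -0.10, 0.40])
```

With the illustrative loadings summing to +55%, the specific probability is 1.55 times the standard one.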

2.9.3 Unobservable heterogeneity factors. The frailty

Heterogeneity of a population in respect of mortality can be explained by


differences among the individuals; some of these are observable, as discussed
in the previous section, whilst others (e.g. the individual’s attitude towards
health, some congenital personal characteristics) are unobservable.
When allowing for unobservable heterogeneity factors, two approaches
can be adopted:

– A discrete approach, according to which heterogeneity is expressed


through a (finite) mixture of appropriate functions.
– A continuous approach, based on a non-negative real valued variable,
called the frailty, whose role is to include all unobservable factors
influencing the individual mortality.

The second approach is most interesting. We will deal with this approach
only. In the following discussion, the term heterogeneity refers to unob-
servable risk factors only; in respect of the observable risk factors, the
population is instead assumed to be homogeneous.
In order to develop a continuous model for heterogeneity, a proper
characterization of the unobservable risk factors must be introduced. In
their seminal paper, Vaupel et al. (1979) extend the earlier work of Beard
(1959, 1971) and define the frailty as a non-negative quantity whose level
expresses the unobservable risk factors affecting individual mortality. The
underlying idea is that those people with a higher frailty die on average
earlier than others. Several models can be developed, which are susceptible
to interesting actuarial applications.
With reference to a population (defined at age 0, and as such closed to new
entrants), we consider people current age x. They represent a heterogeneous
group, because of the unobservable factors. Let us assume that, for any
individual, such factors are summarized by a non-negative variable, viz the
frailty. The specific value of the frailty of the individual does not change
over time, but remains unknown. On the contrary, because of deaths, the
distribution of people in respect of frailty does change with age, given that
people with low frailty are expected to live longer; we denote by Zx the
random frailty at age x, for which a continuous probability distribution

with pdf gx (z) is assumed. It must be mentioned that the hypothesis of


unvarying individual frailty, which is reasonable when thinking of genetic
aspects, seems weak when referring to environmental factors, which may
change over time affecting the risk of death; however, there is empirical
evidence which validates quite satisfactorily this assumption.
For a person current age x with frailty level z, the (conditional) force of
mortality (see (2.32)) is defined as
µ_x(z) = lim_{t→0+} P[T_x ≤ t | Z_x = z] / t        (2.127)
Now the task is to look at possible relations between µx (z) and a standard
force of mortality, given that mortality analysis requires the joint distribu-
tion of (Tx , Zx ). For brevity, conditioning on Zx = z will be denoted simply
with z.
In Vaupel et al. (1979) a multiplicative model for the force of mortality
has been proposed:
µx (z) = z µx (2.128)
where µx represents the force of mortality for an individual with z = 1; µx
is considered as the standard force of mortality. If z < 1, then µx (z) < µx ,
which suggests that the person is in good condition; vice versa if z > 1.
Note that (2.128) may be adopted also when the standard frailty level is other than 1. Let a, a ≠ 1, be the standard frailty level and µ′_x the standard force of mortality; according to the multiplicative model, µ_x(a) = µ′_x, whence (replacing in (2.128)) µ_x = (1/a) µ′_x. So, following (2.128), the force of mortality for a person age x and frailty level z may be written as

µ_x(z) = (z/a) µ′_x = z µ_x        (2.129)
which coincides with (2.128) using an appropriate definition of the standard
force of mortality and a scaling of the frailty level. A simple generalization
may further be adopted to represent a mortality component independent of
age and frailty (e.g. accident mortality). The model

µ_x(z) = b + z µ_x        (2.130)


may be considered for this purpose. For brevity, in the following we refer
just to (2.128).
We denote with H(x) the cumulative standard force of mortality in (0, x)
(see (2.39)).
Let us refer to age 0. The survival function for a person with frailty z is
S(x|z) = e^{−∫_0^x µ_t(z) dt} = e^{−z H(x)}        (2.131)

The pdf of T0 conditional on a frailty level z, given by f0 (x|z) = S(x|z) µx (z),


can be expressed as
f_0(x|z) = e^{−z H(x)} z µ_x = − (d/dx) S(x|z)        (2.132)
The joint pdf of (T0 , Z0 ), denoted by h0 (x, z), can then be easily obtained.
We have
h0 (x, z) = f0 (x|z) g0 (z) = S(x|z) µx (z) g0 (z) (2.133)
Referring to the whole population, we can define the average survival
function as

S̄(x) = ∫_0^∞ S(x|z) g_0(z) dz        (2.134)
Note that S̄(x) represents the share of people alive at age x out of the initial
newborns.
We now refer to a given age x, x ≥ 0. The pdf of Zx may be derived
from the distribution of (T0 , Z0 ) considering that, as was mentioned earlier,
the distribution of Zx changes because of a varying composition of the
population due to deaths. We can then relate Zx to Z0 as follows:

Zx = Z0 |T0 > x (2.135)

For the pdf of Zx we obtain


g_x(z) = lim_{Δz→0+} P[z < Z_x ≤ z + Δz] / Δz
       = lim_{Δz→0+} ( P[T_0 > x | z < Z_0 ≤ z + Δz] P[z < Z_0 ≤ z + Δz] ) / ( Δz P[T_0 > x] )        (2.136)
from which, under usual conditions, we obtain
g_x(z) = S(x|z) g_0(z) / S̄(x) = S(x|z) g_0(z) / ∫_0^∞ S(x|z) g_0(z) dz        (2.137)

Note that the pdf of Zx is given by the pdf of Z0 , adjusted by the ratio
S(x|z)/S̄(x) which updates at age x the proportion of people with frailty
z. It is also interesting to stress that the assessment of gx (z) is based on
an update of g0 (z) with regard to the number of survivors with frailty z
compared to what would be expected over the whole population.
We define the average force of mortality in the population as
µ̄_x = ∫_0^∞ µ_x(z) S(x|z) g_0(z) dz / ∫_0^∞ S(x|z) g_0(z) dz = ∫_0^∞ h_0(x, z) dz / S̄(x)        (2.138)

Thanks to (2.128) and (2.137) we obtain


µ̄_x = µ_x ∫_0^∞ z g_x(z) dz        (2.139)

that is,

µ̄_x = µ_x z̄_x        (2.140)

where z̄_x = ∫_0^∞ z g_x(z) dz = E[Z_x] represents the expected frailty at age x.
Note that the average force of mortality coincides with the standard one
only if z̄x = 1. A similar relation holds for model (2.130): we easily find
µ̄x = b + µx z̄x .
It is easy to show that
(d/dx) z̄_x = −µ_x Var[Z_x] < 0        (2.141)
Then, according to (2.140), µ̄x varies less rapidly than µx . This is due
to the fact that those with a high frailty die earlier, therefore leading to a
reduction of z̄x with age. If one disregards the presence of heterogeneity,
on average an underestimation of the force of mortality follows when one
cohort only is addressed.
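The updating mechanism (2.137) and the relation µ̄x = µx z̄x in (2.140) can be checked by numerical integration; the initial frailty pdf (a Gamma(2, 1)) and the cumulative hazard value below are illustrative choices.

```python
import numpy as np

def integrate(vals, z):
    # Trapezoidal rule on the grid z
    return float(np.sum(0.5 * (vals[1:] + vals[:-1]) * np.diff(z)))

z = np.linspace(1e-6, 20.0, 8000)
g0 = z * np.exp(-z)                 # illustrative Gamma(2, 1) pdf for Z_0
H = 0.5                             # illustrative cumulative standard hazard H(x)
Sxz = np.exp(-z * H)                # S(x|z) = e^{-z H(x)}, cf. (2.131)
Sbar = integrate(Sxz * g0, z)       # average survival function, cf. (2.134)
gx = Sxz * g0 / Sbar                # updated frailty pdf, cf. (2.137)
zbar_x = integrate(z * gx, z)       # expected frailty E[Z_x], cf. (2.140)
```

For this Gamma(2, 1) choice the numerical results agree with the closed forms of the Gamma case: S̄(x) = (1/(1 + H(x)))² and z̄x = 2/(1 + H(x)), so the expected frailty has indeed decreased from its initial value 2.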

2.9.4 Frailty models

In order to get to numerical valuations (and further analytical results, as


well), the distribution of Z0 must be chosen. In Vaupel et al. (1979), a
Gamma distribution has been suggested, due to its nice features. Let then
Z0 ∼ Gamma(δ, θ). The pdf g0 (z) is therefore
g_0(z) = θ^δ z^{δ−1} e^{−θz} / Γ(δ)        (2.142)

We have in particular
E[Z_0] = z̄_0 = δ/θ        (2.143)

Var[Z_0] = δ/θ^2        (2.144)
The coefficient of variation of Z_0

CV[Z_0] = √Var[Z_0] / E[Z_0] = 1/√δ        (2.145)

shows that δ plays the role of measuring, in relative terms, the level of heterogeneity in the population. If δ → ∞, then CV[Z_0] → 0, that is, the

population can be considered homogeneous; for small values of δ, on the


contrary, the value of CV[Z0 ] is high, representing a wide dispersion, that
is, heterogeneity, in the population.
It can be shown that also Zx , x > 0, has a Gamma distribution, with one
of the two parameters updated to the current age. In order to check this, we
need the expression of the average survival function at age x. Substituting
(2.142) into (2.134), and using (2.131), we have

S̄(x) = ( θ / (θ + H(x)) )^δ ∫_0^∞ [ (θ + H(x))^δ z^{δ−1} e^{−(θ+H(x))z} / Γ(δ) ] dz        (2.146)

Note that (θ + H(x))^δ z^{δ−1} e^{−(θ+H(x))z} / Γ(δ) is the pdf of a random variable Gamma-distributed with parameters (δ, θ + H(x)); hence, the integral in (2.146) reduces to 1. Therefore,

S̄(x) = ( θ / (θ + H(x)) )^δ        (2.147)

Replacing in (2.137) and rearranging we have

g_x(z) = (θ + H(x))^δ z^{δ−1} e^{−(θ+H(x))z} / Γ(δ)        (2.148)

which is the pdf of a random variable Gamma(δ, θ + H(x)). Thus, the Gamma distribution has a self-replicating property, and the relevant parameters need to be chosen with reference to the distribution at age 0.
So it follows that
So it follows that

E[Z_x] = z̄_x = δ / (θ + H(x))        (2.149)

Var[Z_x] = δ / (θ + H(x))^2        (2.150)

CV[Z_x] = √Var[Z_x] / E[Z_x] = 1/√δ        (2.151)
Note that whilst the expected value of the frailty reduces with age, its relative
variability keeps constant.
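The self-replicating property (2.148) and the closed forms (2.147) and (2.149) can be verified by numerical integration; the values of δ, θ, and H(x) below are illustrative.

```python
import numpy as np
from math import gamma as gamma_fn

def gamma_pdf(z, d, t):
    # Gamma(d, t) density, cf. (2.142)
    return t ** d * z ** (d - 1) * np.exp(-t * z) / gamma_fn(d)

def integrate(vals, z):
    # Trapezoidal rule on the grid z
    return float(np.sum(0.5 * (vals[1:] + vals[:-1]) * np.diff(z)))

delta, theta, H = 3.0, 3.0, 0.8
z = np.linspace(1e-6, 25.0, 10000)
g0 = gamma_pdf(z, delta, theta)
num = np.exp(-z * H) * g0        # S(x|z) g0(z), cf. (2.137)
Sbar = integrate(num, z)         # should equal (theta/(theta+H))^delta, cf. (2.147)
gx = num / Sbar                  # should be the Gamma(delta, theta+H) pdf, cf. (2.148)
zbar_x = integrate(z * gx, z)    # should equal delta/(theta+H), cf. (2.149)
```

The numerically updated pdf matches the Gamma(δ, θ + H(x)) density pointwise, confirming that only the second parameter is updated with age.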
We can give an interesting interpretation for the average survival
function. Rearranging (2.146) we find

S̄(x) = ( (δ/(θ + H(x))) / (δ/θ) )^δ = ( z̄_x / z̄_0 )^δ        (2.152)

and then we argue that the average survival function at age x, that is, the
average probability of newborns attaining age x, depends on the compar-
ison between the expected frailty level at age x and age 0; this result is
independent of the particular mortality law that we adopt for the standard
force of mortality, which actually has not yet been introduced, and is simply
due to the properties of the Gamma distribution.
The population force of mortality is

µ̄_x = µ_x δ / (θ + H(x))        (2.153)

Usually, the initial values of the parameters of the Gamma distribution are
chosen so that z̄0 = 1, that is, θ = δ. So we have

µ̄_x = µ_x δ / (δ + H(x))        (2.154)

Only the parameter δ has to be assigned, in a manner which is consistent


with the level of heterogeneity in the population. Finally, the unconditional
pdf of Tx may be easily obtained from previous results.
An alternative choice for the distribution of Z_0 is the Inverse Gaussian distribution. Like the Gamma, this distribution is self-replicating, so that Z_x is Inverse Gaussian for any age x; hence, the relevant parameters need to be chosen only with reference to the distribution at age 0. When an Inverse Gaussian distribution is used, the variability of Z_x decreases, in relative terms, with age; this can be justified by the fact that, as time passes, those with a low (and similar) frailty keep on living, hence reducing the heterogeneity of the population. In this regard, the Inverse Gaussian hypothesis is more interesting than the Gamma. However, some authors (see e.g. Butt and Haberman (2004) and Manton and Stallard (1984)) note that individual frailty is unlikely to remain unchanged over the lifetime, but should increase with age. So the assumption that, within the population, the relative variability keeps constant can be accepted. In the following, we will mainly deal with the Gamma case.

2.9.5 Combining mortality laws with frailty models

Referring to adult ages, we can assume the Gompertz law (see (2.67)) for
describing the standard force of mortality. So the cumulative standard force
of mortality is

H(x) = ∫_0^x α e^{βt} dt = (α/β) (e^{βx} − 1)        (2.155)

If we accept the Gamma assumption for Z0 , then the population force of


mortality is

µ̄_x = α δ e^{βx} / [ (θ − α/β) + (α/β) e^{βx} ]        (2.156)

Rearrange as

µ̄_x = [ 1/(θ − α/β) ] · α δ e^{βx} / [ 1 + (α/(βθ − α)) e^{βx} ]        (2.157)

Let α δ/(θ − α/β) = α′ and α/(βθ − α) = δ′; so

µ̄_x = α′ e^{βx} / (1 + δ′ e^{βx})        (2.158)

which is the first Perks law (see (2.74)), with γ = 0. Hence, (2.156) has a
logistic shape; see Fig. 2.8.
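The algebra leading from (2.156) to (2.158) can be checked numerically; the Gompertz and Gamma parameter values below are illustrative, with θ = δ so that z̄0 = 1:

```python
import numpy as np

# Gompertz standard force + Gamma frailty yields a Perks/logistic mu_bar:
# (2.156) rearranged as (2.158); parameter values illustrative
alpha, beta = 2e-5, 0.1
delta = theta = 4.0                              # so that z_bar0 = 1

x = np.arange(40.0, 111.0)
H = alpha / beta * (np.exp(beta * x) - 1.0)      # cumulative hazard, cf. (2.155)
mu = alpha * np.exp(beta * x)                    # standard (Gompertz) force
mu_bar = mu * delta / (theta + H)                # population force, cf. (2.156)

# Perks form (2.158): alpha' = alpha*delta/(theta - alpha/beta),
# delta' = alpha/(beta*theta - alpha)
a_p = alpha * delta / (theta - alpha / beta)
d_p = alpha / (beta * theta - alpha)
mu_perks = a_p * np.exp(beta * x) / (1.0 + d_p * np.exp(beta * x))
```

The two expressions coincide at every age, and the population force stays below the standard Gompertz force, reflecting the selective survival of low-frailty individuals.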
The logistic model for describing mortality within a heterogeneous pop-
ulation may be built also adopting a different approach (see Cummins
et al. (1983); Beard (1971)). With reference to a heterogeneous population,
assume that the individual force of mortality is Gompertz, with unknown
‘base’ mortality; hence
µx = A eβx (2.159)
where A (the parameter for base mortality) is a random quantity, specific to
the individual, whilst β (the parameter for senescent mortality) is common
to all individuals and known. Let ϕ(a) denote the pdf of A; the population
force of mortality is then
\[ \bar{\mu}_x = \int_0^{\infty} a\, e^{\beta x}\, \varphi(a)\, da = e^{\beta x}\, \mathrm{E}[A] \tag{2.160} \]

[Left panel: Gamma densities of Z_x at ages x = 0 and x = 85. Right panel: Gompertz and Perks forces of mortality, ages 65–115.]
Figure 2.8. Gamma distributions and forces of mortality.



If A ∼ Gamma(ρ, ν), then
\[ \bar{\mu}_x = \frac{\rho}{\nu}\, e^{\beta x} \tag{2.161} \]
Letting \rho = \frac{\alpha}{\delta\beta} and \nu = \frac{1}{\beta}\left(\frac{1}{\delta} + e^{\beta x}\right), we find
\[ \bar{\mu}_x = \frac{\alpha\, e^{\beta x}}{1 + \delta\, e^{\beta x}} \tag{2.162} \]
which is still a particular case of (2.74), with γ = 0. Note, however, that
this choice implies that the probability distribution of A depends on age.
What we have just described can be easily classified under the multiplica-
tive frailty model. Actually, if A in (2.159) is replaced with αz (with α certain
and z random), one finds (2.128). The Perks model then follows by choosing
a Gamma distribution for Z0 , with appropriate parameters. However, this
approach is less elegant than that proposed by Vaupel et al. (1979), given
that it forces the distribution of A in (2.159) to depend on age. Moreover,
the multiplicative model allows for extensions and generalizations; further,
it does not require a Gompertz force of mortality.

2.10 References and suggestions for further reading


As regards the ‘traditional’ mortality model, that is, the model disregarding
specific issues such as mortality at very old ages and frailty, we restrict ourselves
to general references. In some of these, the reader can find references to the
original papers and reports, for example, by Gompertz, Makeham, Thiele,
Perks, and so on.
A number of textbooks of actuarial mathematics deal with life tables
and mortality models, in both an age-discrete and an age-continuous
context. The reader can refer for example to Bowers et al. (1997),
Gerber (1995), Gupta and Varga (2002), Rotar (2007). The textbook by
Benjamin and Pollard (1993) is particularly devoted to mortality analysis
and mortality laws.
The articles by Forfar (2004a) and Forfar (2004b) provide a compact and
effective presentation of life tables and mortality laws respectively.
Graduation methods are dealt with by many actuarial and statistical text-
books. Besides the textbook by Benjamin and Pollard (1993) already cited,
the reader should consult, for example, London (1985), and the article by
Miller (2004) which also provides an extensive list of references. As regards
spline functions and their use to graduate mortality rates, the reader can
refer to McCutcheon (1981), and Champion et al. (2004) and references therein.
Historical aspects are dealt with by Haberman (1996), Haberman and
Sibbett (1995) and Smith and Keyfitz (1977). In particular, in Haberman
and Sibbett (1995) the reader can find the reproduction of milestone papers
in mortality analysis up to 1919.
In relation to mortality at old and very old ages, the deceleration in the
rate of mortality increase is analysed in detail in the demographic literature.
In particular, the reader can refer to Horiuchi and Wilmoth (1998), where
the problem is attacked in the context of the frailty models. A discussion
about non-Gompertzian mortality at very old ages is provided by Olshansky
and Carnes (1997).
Allowing for heterogeneity in population mortality (and, in particular, for
non-observable heterogeneity) constitutes, together with mortality dynam-
ics modelling, one of the most important issues in the evolution of survival
models (see e.g. Pitacco (2004a)). Modelling frailty can suggest new ways to
forecast mortality. Although the earliest contribution to this topic probably
came from the actuarial field (Beard (1959) proposed the idea of individ-
ual frailty for capturing heterogeneity due to unobservable risk factors),
the topic itself was ignored by actuaries up to some time ago. Conversely,
seminal contributions have come from demography and biostatistics, also
concerning the dynamics of mortality and longevity limits (see Vaupel et
al. (1979), Hougaard (1984), and Yashin and Iachine (1997)). However,
very recent contributions show interest in this topic within the actuarial
community; see Butt and Haberman (2002), Butt and Haberman (2004),
Olivieri (2006).
Conversely, the interest of actuaries in observable factors (like gender,
health condition, etc.) can be traced back to the first scientific models for
life insurance. For example, see Cummins et al. (1983) as regards risk clas-
sification in life insurance and the numerical rating system in particular,
that was pioneered by the New York Life Insurance company.
3 Mortality trends during the 20th century

3.1 Introduction
Life expectancy at birth among early humans was likely to be between 20
and 30 years, as testified by evidence gleaned from tombstone
inscriptions, genealogical records, and skeletal remains. Around 1750, the
first national population data began being collected in the Nordic countries.
At that time, life expectancy at birth was around 35–40 years in the more
developed countries. It then rose to about 40–45 by the mid-1800s. Rapid
improvements began at the end of the 19th century, so that, by the middle
of the 20th century it was approximately 60–65 years. By the beginning of
the 21st century, life expectancy at birth has reached about 70 years. The
average life span has thus roughly tripled over the course of human history.
Much of this increase has happened in the past 150 years: the 20th century
has been characterized by a huge increase in average longevity compared
to all of the previous centuries. Broadly speaking, the average life span
increased by 25 years in the 10,000 years before 1850. Another 25-year
increase took place between 1850 and 2000. And there is no evidence that
improvements in longevity are tending to slow down.
The first half of the 20th century saw significant improvement in
the mortality of infants and children (and their mothers) resulting from
improvements to public health and nutrition that helped to combat
infectious diseases. Since the middle of the 20th century, gains in life expectancy
have been due more to medical factors that have reduced mortality among
older persons. Reductions in deaths due to the ‘big three’ killers (cardio-
vascular disease, cancer, and strokes) have gradually taken place, and life
expectancy continues to improve.
The population of the industrialized world underwent a major mortality
transition over the course of the 20th century. In recent decades, the pop-
ulations of developed countries have grown considerably older, because of
two factors – increasing survival to older ages as well as the smaller numbers

of births (the so-called ‘baby bust’ which started in the 1970s). In this new
demographic context, questions about the future of human longevity have
acquired a special significance for public policy and fiscal planning. In par-
ticular, social security systems, which in many industrialized countries are
organized according to the pay-as-you-go method, are threatened by the
ageing of the population due to the baby bust combined with the increase in
life expectancy. As a consequence, many nations are discussing adjustments
or deeper reforms to address this problem.
Thus, mortality is a dynamic process and actuaries need appropriate tools
to forecast future longevity. We believe that any sound procedure for pro-
jecting mortality must begin with a careful analysis of past trends. This
chapter aims to illustrate the observed decline in mortality, on the basis
of Belgian mortality statistics. The mortality experience during the 20th cen-
tury is carefully studied by means of several demographic indicators which
have been introduced in Chapter 2. Specifically, after having presented the
different sources of mortality statistics, we compute age-specific death rates,
life expectancies, median lifetimes and interquartile ranges, inter alia, as well
as survival curves. We also compare statistics gathered by the insurance
regulatory authorities with general population figures in order to measure
adverse selection. A comparison between the mortality experience of some
EU member countries is performed in Section 3.5.
Before proceeding, let us say a few words about the notation used in
this chapter. Here, we analyse mortality in an age-period framework. This
means that we use two dimensions: age and calendar time. Both age and
calendar time can be either discrete or continuous variables. In discrete
terms, a person aged x, x = 0, 1, 2, . . ., has an exact age between
x and x+1. This concept is also known as ‘age last birthday’ (i.e., the age of
an individual as a whole number of years, by rounding down to the age at
the most recent birthday). Similarly, an event that occurs in calendar year
t occurs during the time interval [t, t + 1]. This two-dimensional setting is
formally defined in Section 4.2.1; see Table 4.1. Otherwise, we follow the
notation introduced in the previous chapters.

3.2 Data sources


In this chapter, we use three different sources of mortality data. Official
data coming from a National Institute of Statistics or another governmen-
tal agency, data available from a scientific demographic database allowing
for international comparisons, and market data provided by national
regulatory authorities.

3.2.1 Statistics Belgium

Statistics Belgium is the official statistical agency for Belgium. Formerly
known as NIS-INS, the Directorate-General Statistics Belgium is part of
the Federal Public Service Economy. It is based in Brussels. Its mission
is to deliver timely, reliable and relevant figures to the Belgian govern-
ment, international authorities (like the EU), academics, and the public.
For more information, we refer the reader to the official website at
http://www.statbel.fgov.be. A national population register serves as the
centralizing database in Belgium and provides official population figures.
Statistics on births and deaths are available from this register by basic
demographic characteristics (e.g. age, gender, marital status).
Statistics Belgium constructs period life tables, separately for men
and women. These life tables are available for the periods 1880–1890,
1928–1932, 1946–1949, 1959–1963, 1968–1972, 1979–1982, 1988–
1990, 1991–1993 and 1994–1996. After 1996, period life tables have been
provided each year based on a moving triennium, starting from the 1997–
1999 life table, and continuing with the 1998–2000 life table, 1999–2001
life table, etc. The last available life table relates to the period 2002–2004.
In each case, the mortality experienced by the Belgian population is repre-
sented as a set of one-year death probabilities qx (see Section 2.2.3 for
a formal definition). Here, we use the life tables of the periods 1880–
1890, 1928–1932, 1968–1972, and 2000–2002 to investigate the long-term
evolution of the mortality in Belgium.
Even if the figures are computed from Belgian mortality experience, the
analysis conducted in this chapter applies to any industrialized country and
the findings would be very similar.

3.2.2 Federal Planning Bureau

The Federal Planning Bureau (FPB) is a public utility institution based in
Brussels. The FPB makes studies and projections on socio-economic and
environmental policy issues for the Belgian government. The population
plays an important role in numerous themes examined by the FPB. This is
why the FPB produces regularly updated projected life tables for Belgium.
The official mortality statistics for Belgium come from FPB together
with Statistics Belgium. Specifically, from 1948 to 1993, annual death
probabilities were computed by FPB. From 1994, annual death prob-
abilities are computed by Statistics Belgium and published on a yearly
basis. The annual death probabilities are now available for calendar years

t = 1948, 1949, . . . , 2004 and ages
\[ x = \begin{cases} 0, 1, \ldots, 100, & \text{for } t = 1948, 1949, \ldots, 1993\\ 0, 1, \ldots, 101, & \text{for } t = 1994, 1995, \ldots, 1998\\ 0, 1, \ldots, 105, & \text{for } t = 1999, 2000, \ldots \end{cases} \]

3.2.3 Human Mortality Database

The Human Mortality Database (HMD) was launched in May 2002 to pro-
vide detailed mortality and population data to those interested in the history
of human longevity. It has been put together by the Department of Demog-
raphy at the University of California, Berkeley, USA, and the Max Planck
Institute for Demographic Research in Rostock, Germany. It is freely avail-
able at http://www.mortality.org and provides a highly valuable source of
mortality statistics.
HMD contains original calculations of death rates and life tables for
national populations, as well as the raw data used in constructing those
tables. The HMD includes life tables provided by single years of age up to
109, with an open age interval for 110+. These period life tables represent
the mortality conditions at a specific moment in time. We refer readers
to the methods protocol available from the HMD website for a detailed
exposition of the data processing and table construction.
For Belgium, data were compiled by Dana Glei, Isabelle Devos and Michel
Poulain. They cover the period starting in 1841 and ending in 2005. How-
ever, data are missing during World War I. This is why we have decided to
restrict the study conducted in this chapter to the period 1920–2005.

3.2.4 Banking, Finance, and Insurance Commission

In addition to general population data, we also analyse mortality statistics
from the Belgian insurance market. Any difference between the general pop-
ulation and the insured population is due to adverse selection, as explained
in Section 1.6.5.
Market data are provided by the Banking, Finance and Insurance Com-
mission (BFIC) based in Brussels. BFIC has been created as a result of the
integration of the Insurance Supervisory Authority (ISA) into the Bank-
ing and Finance Commission (BFC). Since 1 January 2004, it has been
the single supervisory authority for the Belgian financial sector. For more
information, we refer readers to the official website http://www.cbfa.be.

Annual tabulations of the number of deaths by age, by gender, and by
policy type are made by the BFIC based on information supplied by insur-
ance companies. Together with the number of deaths, the corresponding
(central) risk exposure is also available in each case. These data allow us to
calculate age-gender-type-of-product specific (central) death rates. We do
not question the quality of the data provided by BFIC.

3.3 Mortality trends in the general population

3.3.1 Age-period life tables

As explained in Section 2.2, life table analyses are based upon an analytical
framework in which death is viewed as an event whose occurrence is prob-
abilistic in nature. Life tables create a hypothetical cohort (or group) of,
say, 100,000 persons at age 0 (usually for males and females separately) and
subject it to age-gender-specific annual death probabilities (the number of
deaths per 1,000 or 10,000 or 100,000 persons of a given age and gender)
observed in a given population. In doing this, researchers can trace how the
100,000 hypothetical persons (called a synthetic cohort) would shrink in
numbers due to deaths as the group ages.
As stressed in Section 2.2.1, there are two basic types of life tables: period
life tables and cohort life tables. A period life table represents the mortality
experience of a population during a relatively short period of time, usually
between one and three years. Life tables based on population data are gen-
erally constructed as period life tables because death and population data
are most readily available on a time period basis. Such tables are useful
in analysing changes in the mortality experienced by a population through
time. These are the tables used in the present chapter.
We analyse the changes in mortality as a function of both age x and cal-
endar time t. This is the so-called age-period approach. In this chapter, we
assume that the age-specific forces of mortality are constant within bands
of age and time, but allowed to vary from one band to the next. This
extends to a dynamic setting the constant force of mortality assumption
(b) in Section 2.3.5.
Specifically, let us denote as Tx (t) the remaining lifetime of an individual
aged x at time t. Compared to Section 2.2.3, we supplement the notation
Tx for the remaining lifetime of an x-aged individual with an extra index
t representing calendar time. This individual will die at age x + Tx (t) in
year t + Tx (t). Then, qx (t) is the probability that an x-aged individual in

calendar year t dies before reaching age x + 1, that is, qx (t) = P[Tx (t) ≤ 1].
Similarly, px (t) = 1 − qx (t) is the probability that an x-aged individual in
calendar year t reaches age x + 1, that is, px (t) = P[Tx (t) > 1].
The force of mortality µx (t) at age x and time t is formally defined as
\[ \mu_x(t) = \lim_{\Delta x \searrow 0} \frac{1}{\Delta x}\, P\left[x < T_0(t - x) \le x + \Delta x \mid T_0(t - x) > x\right] \tag{3.1} \]
Compare (3.1) to (2.32)–(2.34). Now, given any integer age x and calendar
year t, we assume that
\[ \mu_{x+\xi_1}(t + \xi_2) = \mu_x(t) \quad \text{for } 0 \le \xi_1, \xi_2 < 1 \tag{3.2} \]
This is best illustrated with the aid of a coordinate system that has calendar
time as abscissa and age as ordinate, as in Fig. 3.1. Such a representation
is called a Lexis diagram after the German demographer who introduced
it. Both time scales are divided into yearly bands, which partition the Lexis
plane into square segments. Formula (3.2) assumes that the mortality rate
is constant within each square, but allows it to vary between squares; see
Fig. 3.1 for a graphical interpretation. Since life tables do not include mor-
tality measures at non-integral ages or for non-integral durations, (3.2) can
also be seen as a convenient interpolation method to expand a life table for
estimating such values.
Under (3.2), we have for integer age x and calendar year t that
\[ p_x(t) = \exp\left(-\int_0^1 \mu_{x+\xi}(t+\xi)\, d\xi\right) = \exp\left(-\mu_x(t)\right) \tag{3.3} \]

[Lexis diagram: calendar time on the horizontal axis (t − x − 1, t − x, t, t + 1); age on the vertical axis (x, x + 1); the force of mortality is constant on each unit square.]
Figure 3.1. Illustration of the basic assumption (3.2) with a Lexis diagram.

which extends (2.36). For durations s less than 1 year, we have under
assumption (3.2) that
\[ {}_s p_x(t) = \exp\left(-\int_0^s \mu_{x+\xi}(t+\xi)\, d\xi\right) = \exp\left(-s\,\mu_x(t)\right) = \left(p_x(t)\right)^s \tag{3.4} \]
Moreover, the forces of mortality and the central death rates (see Section
2.3.4 for formal definitions) coincide under (3.2), that is, µx (t) = mx (t).
This makes statistical inference much easier since rates are estimated by
dividing the number of occurrences of a selected demographic event in a
(sub-) population by the corresponding number of person-years at risk (see
next section).

3.3.2 Exposure-to-risk

When working with death rates, the appropriate notion of risk exposure
is the person-years of exposure, called the (central) exposure-to-risk in
the actuarial literature. The exposure-to-risk refers to the total number of
‘person-years’ in a population over a calendar year. It is similar to the aver-
age number of individuals in the population over a calendar year adjusted
for the length of time they are in the population.
Let us denote as ETRxt the exposure-to-risk at age x last birthday during
year t, that is, the total time lived by people aged x last birthday in calendar
year t. There is an easy expression for the average exposure-to-risk that is
valid under (3.2). As in (1.45), let Lxt be the number of individuals aged x
last birthday on January 1 of year t. Then,
\[ E\left[\mathrm{ETR}_{xt} \mid L_{xt} = l\right] = l \int_{\xi=0}^{1} {}_{\xi}p_x(t)\, d\xi = \frac{l}{\mu_x(t)}\left(1 - p_x(t)\right) = \frac{-l\, q_x(t)}{\ln\left(1 - q_x(t)\right)} \tag{3.5} \]
Hence, provided the population size is large enough, we get the
approximation
\[ \mathrm{ETR}_{xt} \approx \frac{-L_{xt}\, q_x(t)}{\ln\left(1 - q_x(t)\right)} \tag{3.6} \]
that can be used to reconstitute the ETRxt ’s from the Lxt ’s and the qx (t)’s in
the case where the ETRxt ’s are not readily available. This formula appears

to be useful since, in the majority of the applications to general population
data, the exposure-to-risk is not provided. When the actuary works with
market data, or with statistics gathered from a given insurance portfolio,
the exposures-to-risk are easily calculated so that there is no need for the
approximation formula (3.6).
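Approximation (3.6) can be coded directly; the population size and death probability below are hypothetical, for illustration only:

```python
import math

def approx_exposure(L_xt, q_xt):
    # Eq. (3.6): ETR_xt ~ -L_xt * q_x(t) / ln(1 - q_x(t)), valid under (3.2)
    return -L_xt * q_xt / math.log(1.0 - q_xt)

L_xt = 100_000   # individuals aged x last birthday on January 1 of year t (hypothetical)
q_xt = 0.015     # one-year death probability q_x(t) (hypothetical)

etr = approx_exposure(L_xt, q_xt)

# The exposure lies between the year-end survivors and the initial population,
# since those who die still contribute a fraction of the year.
assert L_xt * (1.0 - q_xt) < etr < L_xt
```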

3.3.3 Death rates

We consider the estimation of µx(t) under assumption (3.2). We will see
that the maximum likelihood estimator of µx(t) is obtained by dividing
the number of deaths recorded at age x in year t by the corresponding
exposure-to-risk ETRxt . This is an expected result since µx (t) and mx (t)
coincide under (3.2).
To get this result in a formal way, let us associate to each of the L_{xt}
individuals alive at the beginning of the period an indicator variable δ_i
defined as
\[ \delta_i = \begin{cases} 1, & \text{if individual } i \text{ dies at age } x\\ 0, & \text{otherwise} \end{cases} \tag{3.7} \]
i = 1, 2, . . . , L_{xt}. Furthermore, let τ_i be the fraction of the year lived by
individual i, and let D_{xt} be the number of deaths recorded at age x last
birthday during calendar year t, from an exposure-to-risk ETR_{xt}. We
obviously have that
\[ \sum_{i=1}^{L_{xt}} \delta_i = D_{xt} \quad \text{and} \quad \sum_{i=1}^{L_{xt}} \tau_i = \mathrm{ETR}_{xt} \tag{3.8} \]

Note that the method of recording the calendar year of death and the age
last birthday at death means that the death counts Dxt cover individuals
born on January 1 in calendar year t−x−1 through December 31 in calendar
year t − x (i.e., two successive calendar years) with a peak representation
around January 1 in calendar year t − x.
Under the assumption (3.2) and using (3.3), the contribution of individual
i to the likelihood may be written as
\[ p_x(t) = \exp\left(-\mu_x(t)\right) \tag{3.9} \]
if he survives, and
\[ {}_{\tau_i}p_x(t)\, \mu_{x+\tau_i}(t + \tau_i) = \exp\left(-\tau_i\, \mu_x(t)\right) \mu_x(t) \tag{3.10} \]
if he dies at time τ_i during year t. Combining expressions (3.9)–(3.10), the
contribution of individual i to the likelihood can be transformed into
\[ \exp\left(-\tau_i\, \mu_x(t)\right) \left(\mu_x(t)\right)^{\delta_i} \tag{3.11} \]



If the individual lifetimes are mutually independent, the likelihood for the
L_{xt} individuals aged x is then equal to
\[ L\left(\mu_x(t)\right) = \prod_{i=1}^{L_{xt}} \exp\left(-\tau_i\, \mu_x(t)\right) \left(\mu_x(t)\right)^{\delta_i} = \exp\left(-\mu_x(t)\, \mathrm{ETR}_{xt}\right) \left(\mu_x(t)\right)^{D_{xt}} \tag{3.12} \]
Note that this likelihood is proportional to the one based on the Poisson
distributional assumption for D_{xt}. Setting the derivative of ln L(µ_x(t)) equal
to 0, we find the maximum likelihood estimate µ̂_x(t) of the force of mortality
µ_x(t), which is given by
\[ \widehat{\mu}_x(t) = \frac{D_{xt}}{\mathrm{ETR}_{xt}} = \widehat{m}_x(t) \tag{3.13} \]
The m̂_x(t)'s are referred to as crude (i.e. unsmoothed) death rates for age
x in calendar year t. The death rate is, thus, the proportion of people of a
given age expected to die within the year, expressed in terms of the expected
number of life-years rather than in terms of the number of individuals ini-
tially present in the group. Often, ETRxt is approximated by an estimate of
the population aged x last birthday in the middle of the calendar year. This
quantity is estimated by a national institute of statistics taking account of
recorded births and deaths and net immigration. Formula (3.6) can also be
used to reconstitute the exposure-to-risk under assumption (3.2).
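A sketch of the crude-rate computation (3.13), with hypothetical counts; by (3.3), the estimated force also converts back into a one-year death probability:

```python
import math

D_xt = 1_480        # deaths at age x last birthday in year t (hypothetical)
ETR_xt = 99_244.0   # central exposure-to-risk in person-years (hypothetical)

m_hat = D_xt / ETR_xt            # eq. (3.13): crude death rate = MLE of mu_x(t)
q_hat = 1.0 - math.exp(-m_hat)   # implied one-year death probability via (3.3)

# The rate exceeds the probability, since exposure counts life-years
# rather than individuals initially present
assert 0.0 < q_hat < m_hat
```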
Figure 3.2 displays the logarithm of the death rates m̂_x(t) for males and
females for four selected periods. They come from the official life tables
constructed by Statistics Belgium, and cover the last 120 years. For each
period, death rates are relatively high in the first year after birth, decline
rapidly to a low point around age 10, and thereafter rise, in a roughly
exponential fashion, before decelerating (or slowing their rate of increase)
at the end of the life span. This is the typical shape of a set of death rates.
From Fig. 3.2, it is obvious that dramatic changes in mortality have
occurred over the 20th century. The striking features of the evolution of
mortality are the downward trends and the substantial variations in shape.
We see that the greatest relative improvement in mortality during the 20th
century occurred at the young ages, which has resulted largely from the con-
trol of infectious diseases. The decrease over time at ages 20–30 for females
reflects the rapid decline in childbearing mortality. The hump in mortal-
ity around ages 18–25 has become increasingly important, especially for
young males. Accidents, injuries, and suicides account for the majority of
the excess mortality of males over females at ages under 45 (this is why this
hump is often referred to as the accident hump).


Figure 3.2. Death rates (on the log scale) for Belgian males (top panel) and Belgian females (bottom panel) from period life tables 1880–1890, 1928–1932, 1968–1972, and 2000–2002. Source: Statistics Belgium.

The trend in the logarithm of the m̂_x(t)'s for some selected ages is depicted
in Figs 3.3 and 3.4. An examination of Fig. 3.3 reveals distinct behaviours
for age-specific death rates affecting Belgian males. At age 20, a rapid reduc-
tion took place after a peak which occurred in the early 1940s due to World
War II. A structural break seems to have occurred, with a relatively high
level of mortality before World War II, and a much lower level after 1950.
Since the mid-1950s, only modest improvements have occurred for the
m̂_20(t)'s. This is typical for ages around the accident hump, where male
mortality has not really decreased since the 1970s. At age 40, the same
decrease after World War II is apparent, followed by a much slower reduction
after 1960. The decrease after 1970 is nevertheless more marked than at age 20.

Figure 3.3. Trend in observed death rates (on the log scale) for Belgian males at ages 20, 40, 60, and 80, period 1920–2005. Source: HMD.

Figure 3.4. Trend in observed death rates (on the log scale) for Belgian females at ages 20, 40, 60, and 80, period 1920–2005. Source: HMD.

At ages 60 and 80, mortality rates have declined rapidly after
1970, whereas the decrease during 1920–1970 was rather moderate. We
note that the effect of World War II is much more important at younger
ages than at older ages. This clearly shows that gains in longevity have been
concentrated on younger ages during the first half of the 20th century, and
have then moved to older ages after 1950.
The analysis for Belgian females illustrated in Fig. 3.4 parallels that for
males for ages 20 and 40, but with several differences. At age 20, modest
improvements are visible after the mid-1950s. At age 40, more pronounced
reductions occurred after 1960. At older ages, the rate of decrease is more
regular, and has tended to accelerate after 1980.
This acceleration is a feature seen in a number of Western European
countries. Kannisto et al. (1994) report an acceleration in the late 1970s in
the rate of decrease of mortality rates at ages over 80 in an analysis of mortality
rates for 9 European countries with reliable mortality data at these ages over
an extended period.

3.3.4 Mortality surfaces

The dynamic analysis of mortality is often based on the modelling of the
mortality surfaces that are depicted in Fig. 3.5. Such a surface consists of a
three-dimensional plot of the logarithm of the m̂_x(t)'s viewed as a function
of both age x and time t. Fixing the value of t, we recognize the classi-
cal shape of a mortality curve visible in Fig. 3.2. Specifically, along cross
sections when t is fixed (or along diagonals when cohorts are followed),
one observes relatively high mortality rates around birth, the well-known
presence of a trough at about age 10, a ridge in the early 20s (which is less
pronounced for females), and an increase at middle and older ages.
Mortality does not vary uniformly over the age-year plane and the advan-
tage of plots as in Fig. 3.5 is that they facilitate an examination of the way
that mortality changes with year and cohort as well as with age. In addition
to random deviation from the underlying smooth mortality surface, the sur-
face is subject to period shocks corresponding to wars, epidemics, harvests,
summer heat waves, etc. Roughness of the surface indicates volatility and
ridges along cross sections at given years mark brief episodes of excess mor-
tality. For instance, higher mortality rates are clearly visible for the years
around World War II.

Figure 3.5. Observed death rates (on the log scale) for Belgian males (top panel) and Belgian females (bottom panel), ages 0 to 109, period 1920–2005. Source: HMD.

3.3.5 Closure of life tables

At higher ages (above 80), death rates displayed in Fig. 3.5 appear rather
smooth. This is a consequence of the smoothing procedure implemented
in HMD. Death rates for ages 80 and above were estimated according
to the logistic formula and were then combined with death rates from
younger ages in order to reconstitute life tables. To have an idea of the
behaviour of mortality rates at the higher ages, we have plotted in Fig. 3.6
the rough death rates observed for the Belgian population. As discussed
in Section 2.8, we clearly see from Fig. 3.6 that data at old ages produce
suspect results (because of small risk exposures): the pattern at old and
very old ages is heavily affected by random fluctuations because of the
scarcity of data. Sometimes, data above some high age are not available
at all.
Recently, some in-depth demographic studies have provided a more
sound knowledge about the slope of the mortality curve at very old ages.
It has been documented that the force of mortality is slowly increasing at
very old ages, approaching a rather flat shape. The deceleration in the rate
of increase in mortality rates can be explained by the selective survival of
healthier individuals at older ages (see, e.g., Horiuchi and Wilmoth, 1998,
for more details, as well as the discussion about frailty in Section 2.9.3).
Demographers and actuaries have suggested various techniques for estimat-
ing the force of mortality at old ages and for completing the life table. See
Section 2.8.2 for examples and references. Here, we apply a simple and
powerful method proposed by Denuit and Goderniaux (2005).
The starting point is standard: there is ample empirical evidence that
the one-year death probabilities behave like the exponential of a quadratic
polynomial at older ages, that is, q_x(t) = exp(a_t + b_t x + c_t x²). Hence, a
log-quadratic regression model of the form
\[ \ln \widehat{q}_x(t) = a_t + b_t x + c_t x^2 + \epsilon_{xt} \tag{3.14} \]
for the observed one-year death probabilities, with the \epsilon_{xt} independent and
Normally distributed with mean 0 and variance σ², is fitted separately to
each calendar year t (t = t1 , t2 , . . . , tm ) and to ages xt and over. Then,
constraints are imposed to mimic the observed behaviour of mortality at
old ages. First, a closure constraint

\[ q_{130}(t) = 1 \quad \text{for all } t \tag{3.15} \]

which retains as working assumption that the limit age 130 will not be
exceeded. Secondly, an inflection constraint
\[ \frac{\partial}{\partial x}\, q_x(t)\, \bigg|_{x=130} = 0 \quad \text{for all } t \tag{3.16} \]
which is used to ensure that the behaviour of the ln q_x(t)'s will be ultimately
concave. This is in line with empirical studies that provide evidence of a
decrease in the rate of mortality increase at old ages. One explanation
proposed for this deceleration is the selective survival of healthier individuals
to older ages, as noted above.

Figure 3.6. Observed death rates (on the log scale) for Belgian males (top panel) and Belgian females (bottom panel), period 1950–2004. Source: Statistics Belgium.
3.3 Mortality trends in the general population 105

Note that both constraints are imposed here at age 130. In general, the
closing age could also be treated as a parameter and selected from the data
(together with the starting age x_t, thereby determining the optimal fitting
age range).
These two constraints yield the following relation between the a_t's, b_t's,
and c_t's for each calendar time t:

a_t + b_t x + c_t x² = c_t (130 − x)²    (3.17)

for x = x_t, x_t + 1, . . . and t = t_1, t_2, . . . , t_m. The c_t's are then estimated
on the basis of the series {q̂_x(t), x = x_t, x_t + 1, . . .} relating to year t
from equation (3.14), noting the constraints imposed by (3.17). It is worth
mentioning that the two constraints underlying the modelling of the q_x(t)
for high x are in line with empirical demographic evidence.
Let us now apply this method to the data displayed in Fig. 3.6. The
optimal starting age is selected from the age range 75–89. It turns out to
be around 75 for all of the calendar years. Therefore, we fix it to be 75 for
both genders and for all calendar years. The R² values corresponding to the
fitted regression models (3.14), as well as the estimated regression parameters
c_t, are displayed in Fig. 3.7. We keep the original q̂_x(t) for ages below 85 and
we replace the death probabilities for ages over 85 with the fitted values
coming from the constrained quadratic regression (3.14). The results for
calendar years 1950, 1960, 1970, 1980, 1990, and 2000 can be seen in
Fig. 3.8 for males and in Fig. 3.9 for females. The completed mortality
surfaces are displayed in Fig. 3.10.
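Under the constraints (3.15) and (3.16), the quadratic (3.14) collapses to ln q_x(t) = c_t (130 − x)², so c_t has a closed-form least-squares estimate. The sketch below is our own illustration with synthetic data, not the authors' code; it completes a table up to the closing age in this way:

```python
import numpy as np

def close_life_table(ages, q_obs, omega=130):
    """Denuit-Goderniaux closure: under q_omega(t) = 1 and a vanishing
    derivative at omega, the log-quadratic model (3.14) reduces to
    ln q_x = c * (omega - x)^2, and c is estimated by least squares."""
    ages = np.asarray(ages, dtype=float)
    y = np.log(np.asarray(q_obs, dtype=float))
    z = (omega - ages) ** 2              # the single regressor left by (3.17)
    c = np.sum(z * y) / np.sum(z ** 2)   # closed-form least-squares estimate
    x_full = np.arange(ages[0], omega + 1)
    q_fit = np.exp(c * (omega - x_full) ** 2)   # fitted, closed-out table
    return c, x_full, q_fit

# synthetic check: data generated from the constrained model itself
true_c = -0.001
ages = np.arange(75, 99)
q_obs = np.exp(true_c * (130 - ages) ** 2)
c_hat, x_full, q_fit = close_life_table(ages, q_obs)
```

By construction the completed probabilities increase with age and equal 1 at the closing age.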

3.3.6 Rectangularization and expansion

Figure 3.11 shows the rectangularization phenomenon. It presents the
population survival functions based on period life tables for, from left to right,
1880–1890, 1928–1932, 1968–1972, and 2000–2002. Survival functions
have been formally introduced in Section 2.3.1. Broadly speaking, they give
the proportion of individuals reaching the age displayed along the x-axis,
where this proportion is computed on the basis of the set of age-specific
mortality rates corresponding to the different period life tables.
As we have noted in the introduction, considerable progress has been
made in the 20th century towards eliminating the hazards to survival
which existed at the young ages in the early 1900s. This is clearly visi-
ble from Fig. 3.11 where the proportion of the population still alive at
some given age increases as we move forward in calendar time. As a con-
sequence, the slope of the survival function has become more rectangular
(less diagonal) through time. This is the so-called ‘curve squaring’ concept,
Figure 3.7. Adjustment coefficients and estimated regression parameters for model
(3.14)–(3.17).

which has been the subject of passionate debate among demographers in
recent years.
Let us now consider the age corresponding to a value of 0.5 for the
survival curve. This age is called the median age at birth and is one of the
standard demographic markers; see Section 2.4.2. Broadly speaking, the
median is the age reached by half of a hypothetical population with mortality
experience reflected by that particular period life table. Figure 3.12 (top
panel) shows the increasing trend in the median lifetime at birth: median
lifetimes are depicted by gender and calendar year, based on period life tables.
Figure 3.12 (bottom panel) is the analogue for the median remaining lifetime
at age 65.

Figure 3.8. Completed life tables for Belgian males, years 1950, 1960, 1970, 1980, 1990, and 2000, together with empirical death probabilities (broken
line), on the log-scale.

Figure 3.9. Completed FPB life tables for Belgian females, years 1950, 1960, 1970, 1980, 1990, and 2000, together with empirical death probabilities (broken
line), on the log-scale.

Figure 3.10. Completed death rates (on the log scale) for Belgian males (top panel) and Belgian
females (bottom panel), period 1920–2005.

Figure 3.11. Survival curves for Belgian males (top panel) and Belgian females (bottom panel)
corresponding to the 1880–1890, 1928–1932, 1968–1972, and 2000–2002 period life tables.
Source: Statistics Belgium.

Rectangularization of survival curves is associated with a reduction in the
variability of age at death. As deaths become concentrated in an increasingly
narrow age range, the slope of the survival curve in that range becomes
steeper, and the curve itself begins to appear more rectangular. A simple
measure of rectangularity is thus the maximum downward slope of the
survival curve S in the adult age range that has been formally defined in
(2.61). Increasing rectangularity according to this measure implies a survival
curve which becomes increasingly vertical at older ages.
Figure 3.13 displays the distribution of ages at death (empirical version
of the theoretical probability density function f defined in (2.28)). It can be
seen that the distribution of ages at death has shifted to the right and has

Figure 3.12. Observed median lifetimes at birth (top panel) and at age 65 (bottom panel), period
1920–2005. Source: HMD.

become less variable and less obviously bimodal. We clearly observe that
the point of fastest decline increases with time, which empirically supports
the expansion phenomenon.

3.3.7 Life expectancies

The index, life expectancy, has been formally defined in Section 2.4.1. Life
expectancy statistics are very useful as summary measures of mortality, and
they have an intuitive appeal. However, it is important to interpret data
on life expectancy correctly when their computation is based on period
life tables. Period life expectancies are calculated using a set of age-specific
mortality rates for a given period (either a single year, or a run of years), with

Figure 3.13. Observed proportion of ages at death for Belgian males (top panel) and Belgian
females (bottom panel) corresponding to 1880–1890, 1928–1932, 1968–1972, and 2000–2002
period life tables. Source: Statistics Belgium.

no allowance for any future changes in mortality. Cohort life expectancies
are calculated using a cohort life table, that is, using a set of age-specific
mortality rates which allow for known or projected changes in mortality at
later ages (in later years).
Period life expectancies are a useful measure of the mortality rates that
have been actually experienced over a given period and, for past years,
provide an objective means of comparison of the trends in mortality over
time, between areas of a country and with other countries. Official life
tables which relate to past years are generally period life tables for these
reasons. Cohort life expectancies, even for past years, may require projected
mortality rates for their calculation. As such, they are less objective because
they are subject to substantial model risk and forecasting error.
In this chapter, we only compute period life expectancies. Cohort life
expectancies will be derived in Chapter 5 using appropriate mortality
projection methods. Let e↑_x(t) be the period life expectancy at age x in
calendar year t. Here, we have used a superscript '↑' to recall that we work
along a vertical band in the Lexis diagram, considering death rates associated
with a given period of time. Specifically, e↑_x(t) is computed from the period
life table for year t, given by the set μ_{x+k}(t), k = 0, 1, . . . . The formula
giving e↑_x(t), under assumption (3.2), is

e↑_x(t) = ∫_{ξ≥0} exp( − ∫_0^ξ μ_{x+η}(t) dη ) dξ
        = [1 − exp(−μ_x(t))] / μ_x(t)
          + Σ_{k≥1} [ ∏_{j=0}^{k−1} exp(−μ_{x+j}(t)) ] · [1 − exp(−μ_{x+k}(t))] / μ_{x+k}(t)    (3.18)

In this formula, the ratio [1 − exp(−μ_{x+k}(t))] / μ_{x+k}(t) is the average
fraction of the year lived by an individual alive at age x + k, and the product
∏_{j=0}^{k−1} exp(−μ_{x+j}(t)) is the probability _k p↑_x(t) of reaching age
x + k computed from the period life table.
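Under the piecewise-constant-force assumption, (3.18) can be evaluated by a short loop. The helper below is our own sketch (truncating the series at the last supplied age), not code from the book:

```python
import math

def period_life_expectancy(mu):
    """Evaluate a sum of the form (3.18): mu[k] is the constant force of
    mortality between ages x+k and x+k+1; the series is truncated at the
    last supplied age."""
    e, surv = 0.0, 1.0
    for m in mu:
        e += surv * (1.0 - math.exp(-m)) / m  # average fraction of year lived
        surv *= math.exp(-m)                  # probability of reaching next age
    return e

# sanity check: under a constant force mu, the life expectancy is 1/mu
mu_const = 0.05
e_const = period_life_expectancy([mu_const] * 10000)
```

With a constant force the truncated sum telescopes to (1 − e^{−μN})/μ, so for large N it recovers 1/μ.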

Figure 3.14 shows the trend in the period life expectancies at birth e↑_0(t)
and at retirement age e↑_65(t) by gender. The period life expectancy at a
particular age is based on the death rates for that and all higher ages that
were experienced in that specific year. For life expectancies at birth, we
observe a regular increase after 1950, with an effect due to World War II
which is visible before that time (especially at the beginning and at the end
of the conflict for e↑_0(t), and during the years preceding the conflict as well as
during the war itself for e↑_65(t)). Little increase was experienced from 1930
to 1945. It is interesting to note that period life expectancies are affected
by sudden and temporary events, such as a war or an epidemic.

3.3.8 Variability

Wilmoth and Horiuchi (1999) have studied different measures of variability
for the distribution of ages at death. These authors favour the interquartile
range, both for its ease of calculation and for its straightforward
interpretation. The interquartile range measures the distance between the lower and
the upper quartiles of the distribution of ages at death in a life table. This
range is formally defined as the difference between the age corresponding

Figure 3.14. Observed period life expectancies at birth (top panel) and at age 65 (bottom panel)
for Belgian males (continuous line) and Belgian females (dotted line), period 1920–2005. Source:
HMD.

to the value 0.25 of the survival curve minus the age corresponding to the
value 0.75 of this curve; see (2.64). The former age (called the third quar-
tile) is attained by 25% of the population whereas 75% of the population
reaches the latter age (called the first quartile). The interquartile range is
thus the width of the age interval containing the 50% central deaths in
the population. As age at death becomes less variable, we would expect
that this measure would decrease. It is very simple to calculate because it
equals the difference between the ages where the survival curve S crosses the
probability levels 0.25 and 0.75. Being the length of the span of ages con-
taining the middle 50% of deaths, it possesses a simple interpretation. Note
that the rectangularization of survival curves is associated with decreasing
interquartile range.

Figure 3.15. Observed interquartile range at birth (top panel) and at age 65 (bottom panel)
for Belgian males (continuous line) and Belgian females (dotted line), period 1920–2005. Source:
HMD.

Figure 3.15 depicts the interquartile range at birth and at age 65. Whereas
the interquartile range at birth clearly decreases over time, there is an
upward trend at age 65. This suggests that even if variability is decreasing
for the entire lifetime, this may not be the case for the remaining lifetime at
age 65.

3.3.9 Heterogeneity

Within populations, differences in life expectancy exist with regard to
gender. Females tend to outlive males in all populations, and have lower
mortality rates at all ages, starting from infancy. This is clear from all of the
figures examined so far in this chapter. Another difference in life expectancy
occurs because of social class, as assessed through occupation, income, or
education.
In recent decades, population data have shown widening mortality dif-
ferentials by socio-economic class. The mortality of the better off classes
has improved more rapidly. The major cause of death responsible for the
widening differential is cardiovascular disease: persons of higher social
classes have experienced much larger declines in death due to cardiovas-
cular disease than persons of lower classes. Other possible explanations
include cigarette smoking (which is known to vary significantly according
to social class) as well as differences in diet, selection mechanisms, poorer
quality housing conditions and occupation. In general, individuals with
higher socio-economic status live longer than those in lower socio-economic
groups. This heterogeneity can be accounted for as discussed in Section 2.9.
We will see below that the effect of social class is significant for insurance
market mortality statistics. Indeed, the act of purchasing life insurance prod-
ucts often reveals that the individual belongs to upper socio-economic class,
which in turn yields lower mortality (even in the case of death benefits).

3.4 Life insurance market

3.4.1 Observed death rates

Figure 3.16 displays the period life tables for the Belgian individual life
insurance market, group life insurance market, and the general population
observed in the calendar years 1995, 2000, and 2005. The variability in
the set of death rates is clearly much higher for the insurance market, as
exposures-to-risk are considerably smaller. This is why smoothing the mar-
ket experience to make the underlying trend more apparent is desirable.
This is achieved as explained below.
The standardized mortality ratio (SMR) is a useful index for comparing
mortality experiences: actual deaths in a particular population are com-
pared with those which would be expected if ‘standard’ age-specific rates
applied. Precisely, the SMR is defined as

SMR = [ Σ_{(x,t)∈D} ETR_{xt} m̂_x(t) ] / [ Σ_{(x,t)∈D} ETR_{xt} m̂^stand_x(t) ]
    = [ Σ_{(x,t)∈D} D_{xt} ] / [ Σ_{(x,t)∈D} ETR_{xt} m̂^stand_x(t) ]
where D is the set of ages and calendar years of interest.
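Computationally the SMR is just observed deaths over deaths expected under the standard rates. The toy sketch below (the function and the numbers are ours, purely illustrative) mirrors the definition:

```python
def smr(records, m_standard):
    """Standardized mortality ratio: each record is (age, exposure, deaths),
    and m_standard maps age -> standard death rate."""
    observed = sum(deaths for _, _, deaths in records)
    expected = sum(etr * m_standard[age] for age, etr, _ in records)
    return observed / expected

# toy portfolio dying at half the standard rates -> SMR = 0.5
standard = {65: 0.020, 66: 0.022, 67: 0.024}
portfolio = [(65, 1000.0, 10), (66, 1000.0, 11), (67, 1000.0, 12)]
ratio = smr(portfolio, standard)
```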
Here are the SMRs by calendar year for the life insurance market: com-
puted over 1993–2005, the estimated SMR is equal to 0.5377419 for ages

Figure 3.16. General population (broken line) and individual (circle) and group (triangle) life insurance market death rates (on the log scale) observed in
1995, 2000, and 2005 for Belgian males (top panel) and females (bottom panel). Source: HMD for the general population and BFIC for insured lives.

45–64 and to 0.3842981 for ages 65 and over for individual policies, and
to 0.495525 and to 0.8042604 for group policies. The same values com-
puted over 2000–2005 are equal to 0.4796451, 0.3699633, 0.4963897, and
0.8692767, respectively. Note that the values for group contracts, ages 45–
64 have been computed by excluding calendar year 2001, which appeared to
be atypical for group life contracts before retirement age. We see that SMR’s
are around 50% for individual and group life insurance contracts before
retirement age, and then decrease to reach 40% for individual policies and
increase to 80% for group life policies.

3.4.2 Smoothed death rates

It is clear from Fig. 3.16 that death rates based on market data exhibit
considerable variations. This is why some smoothing is desirable in order
to obtain a better picture of the underlying mortality experienced by insured
lives. Since possible changes in underwriting practices or tax reforms are
likely to affect market death rates, we smooth the death rates across ages by
calendar year, as in Hyndman and Ullah (2007). To this end, we use local
regression techniques.
Local regression is used to model a relation between a predictor variable
(or variables) x and a response Y, which is related to the predictor variable.
Typically, x represents age in the application that we have in mind in this
chapter, while Y is some (suitably transformed) demographic indicator such
as the logarithm of the death rate or the logit of the death probability. The
logarithmic and logit transformations involved in these models ensure that
the dependent variables can assume any possible real values.
As pointed out by Loader (1999), smoothing methods and local regres-
sion originated in actuarial science in the late 19th and early 20th centuries,
in the problem of graduation. See Section 2.6 for an introduction to these
concepts. Having observed (x1 , Y1 ), (x2 , Y2 ), . . ., (xm , Ym ), we assume a
model of the form Yi = f (xi ) + i , i = 1, 2, . . . , m, where f (·) is an unknown
function of x, and i is an error term, assumed to be Normally distributed
with mean 0 and variance σ 2 . This term represents the random departures
from f (·) in the observations, or variability from sources not included in the
xi ’s. No strong assumptions are made about f , except that it is a smooth
function that can be locally well approximated by simple parametric func-
tions. For instance, invoking Taylor’s theorem, any differentiable function
can be approximated locally by a straight line, and a twice differentiable
function can be approximated locally by a quadratic polynomial.
In order to estimate f at some point x, the observations are weighted in
such a way that the largest weights are assigned to observations close to
x. In many cases, the weight w_i(x) assigned to (x_i, Y_i) to estimate f(x) is
obtained from the formula

w_i(x) = W( (x_i − x) / h(x) )    (3.19)

where W(·) is chosen to be continuous, symmetric, peaked at 0 and
supported on [−1, 1]. A common choice is the tricube weight function
defined as

W(u) = (1 − |u|³)³ for −1 < u < 1, and W(u) = 0 otherwise    (3.20)

The bandwidth h(x) defines a smoothing window (x − h(x), x + h(x)), and
only observations in that window are used to estimate f (x). Within the
smoothing window, f is approximated by a polynomial. The coefficients of
this polynomial are then estimated via weighted least-squares.
The bandwidth h(x) has a critical effect on the local regression. If h(x) is
too small, insufficient data fall within the smoothing window and a noisy
fit results. On the other hand, if h(x) is too large, the local polynomial may
not fit the data well within the smoothing window, and important features
of the mean function may be distorted or even lost. The nearest neighbour
bandwidth is often used. Specifically, h(x) is selected so that the smoothing
window contains a specified number of points.
A high polynomial degree can always provide a better approximation
to f than a low polynomial degree. But high order polynomials have large
numbers of coefficients to estimate, and the result is increased variability in
the estimate. To some extent, the effects of the polynomial degree and band-
width are confounded. It often suffices to chose a low degree polynomial
and to concentrate on choosing the bandwidth in order to obtain a satis-
factory fit. The most common choices are local linear and local quadratic.
A local linear estimate usually produces better fits, especially at the bound-
aries. The weight function W(·) has much less effect on the bias-variance
trade-off, and the tricube weight function (3.20) is routinely used.
Let us approximate f by a linear function β0 (x) + β1 (x)x in the smooth-
ing window (x − h(x), x + h(x)). This leads to local linear regression. The
coefficients β0 (x) and β1 (x) are estimated by minimizing the local residual
sum of squares


O_W(x) = Σ_{i=1}^m w_i(x) [ Y_i − β_0(x) − β_1(x) x_i ]²    (3.21)
Denoting as

x̄_w = Σ_{i=1}^m w_i(x) x_i / Σ_{i=1}^m w_i(x)    (3.22)

the weighted average of the x_i's in the smoothing window, the minimization
of the objective function O_W(x) gives

f̂(x) = β̂_0(x) + β̂_1(x) x
     = Σ_{i=1}^m w_i(x) Y_i / Σ_{i=1}^m w_i(x)
       + (x − x̄_w) · Σ_{i=1}^m w_i(x)(x_i − x̄_w) Y_i / Σ_{i=1}^m w_i(x)(x_i − x̄_w)²    (3.23)

Let us give an interpretation to this expression for f̂(x). The first term in f̂(x)
is the well-known Nadaraya–Watson kernel estimate that is obtained by
approximating f by a constant in the smoothing window (x−h(x), x+h(x)).
The second term is a correction for local slope of the data and skewness of
the xi ’s. A local linear estimate would exhibit bias if the mean function f
had a high curvature.
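The estimate (3.23) translates directly into code. The sketch below is our own (with a fixed bandwidth rather than the nearest-neighbour choice); as a check, a local linear smoother reproduces straight-line data exactly:

```python
import numpy as np

def tricube(u):
    """Tricube weight function (3.20): (1 - |u|^3)^3 on (-1, 1), else 0."""
    u = np.abs(u)
    return np.where(u < 1.0, (1.0 - u ** 3) ** 3, 0.0)

def local_linear(x0, xs, ys, h):
    """Local linear estimate f_hat(x0) from (3.21)-(3.23) with a fixed
    bandwidth h: weighted least squares of a line near x0."""
    w = tricube((xs - x0) / h)
    xw = np.sum(w * xs) / np.sum(w)   # weighted average (3.22)
    slope = np.sum(w * (xs - xw) * ys) / np.sum(w * (xs - xw) ** 2)
    return np.sum(w * ys) / np.sum(w) + (x0 - xw) * slope  # formula (3.23)

# exactness check on noiseless linear "log death rates"
xs = np.linspace(40.0, 98.0, 59)
ys = -9.0 + 0.08 * xs
fit = local_linear(70.0, xs, ys, h=10.0)
```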
Let us now apply this methodology to the life insurance market data. For
a fixed calendar year t, we use the model

ln m̂_x(t) = f(x) + ε_{xt},    x = 40, 41, . . . , 98    (3.24)

where m̂_x(t) is the observed death rate in the insurance market. Hence, the
smoothed death rates are given by exp(f̂(x)), x = 40, 41, . . . , 98. The model
is fitted separately to males and females, and to group and individual mor-
tality experiences. The result is visible in Fig. 3.17 which is the analogue of
Figure 3.16, leading to smoothed mortality curves for the insurance mar-
ket. We see that the individual life experience is consistently better than
the general population mortality. The experience for group life contracts is
better than the general population mortality before retirement age but then
deteriorates and becomes comparable to the general population mortality
after retirement.
Remark Alternatively, f can be estimated by minimizing the objective
function (2.105), that is,
O_λ(f) = Σ_{i=1}^m [ y_i − f(x_i) ]² + λ ∫_ℝ [ f″(u) ]² du    (3.25)

The first term ensures that f(·) will fit the data as well as possible. The
second term penalizes roughness of f(·); it imposes some smoothness on
the estimated f(·). The factor λ quantifies the amount of smoothness: if
λ → +∞ then f″ = 0 and we get a linear fit; and if λ → 0 then f̂ perfectly
interpolates the data points.

Figure 3.17. General population (broken line) death rates and individual (circle) and group (triangle) life insurance market smoothed death rates (on the log
scale) observed in 1994 for Belgian males (top panel) and females (bottom panel).
If x_1 < x_2 < · · · < x_m then the solution f̂_λ is a cubic spline with knots
x_1, x_2, . . . , x_m; see Section 2.6.3. This means that f̂_λ coincides with a
third-degree polynomial on each interval (x_i, x_{i+1}) and possesses continuous
first and second derivatives at each x_i.
Remark Instead of working in a Gaussian regression model, we could also
move to the generalized linear modelling framework by implementing a
local likelihood maximization principle. Consider for instance the Bernoulli
model where P[Yi = 1] = 1 − P[Yi = 0] = p(xi ). The contribution of the
ith observation to the log-likelihood is

l(Y_i, p(x_i)) = Y_i ln p(x_i) + (1 − Y_i) ln(1 − p(x_i))
             = Y_i ln[ p(x_i) / (1 − p(x_i)) ] + ln(1 − p(x_i))    (3.26)

A local polynomial approximation for p(x_i) is difficult since the
inequalities 0 ≤ p(x_i) ≤ 1 must be fulfilled. Therefore, we prefer to work on the
logit scale, defining the new parameter from the logit transformation

θ(x) = ln[ p(x) / (1 − p(x)) ]    (3.27)
Note that θ(x) can assume any real value as p(x) moves from 0 to 1. The
local polynomial likelihood at x is then

Σ_{i=1}^m w_i(x) [ Y_i (β_0(x) + β_1(x) x_i) − ln( 1 + exp(β_0(x) + β_1(x) x_i) ) ]    (3.28)

The estimation of p(x) is then obtained from



p̂(x) = exp( β̂_0(x) + β̂_1(x) x ) / ( 1 + exp( β̂_0(x) + β̂_1(x) x ) )    (3.29)
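A sketch of this local likelihood fit (our own implementation, not taken from the book): maximize (3.28) over (β_0, β_1) by Newton-Raphson and map the result back through (3.29). We centre the covariate at the fitting point, which leaves p̂(x) unchanged but improves the conditioning of the Newton step.

```python
import numpy as np

def tricube_w(u):
    """Tricube weights (3.20)."""
    u = np.abs(u)
    return np.where(u < 1.0, (1.0 - u ** 3) ** 3, 0.0)

def local_logistic(x0, xs, ys, h, n_iter=25):
    """Maximize the local log-likelihood (3.28) by Newton-Raphson with the
    covariate centred at x0, then return p_hat(x0) via (3.29)."""
    w = tricube_w((xs - x0) / h)
    X = np.column_stack([np.ones_like(xs), xs - x0])
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        score = X.T @ (w * (ys - p))                     # gradient of (3.28)
        info = (X * (w * p * (1.0 - p))[:, None]).T @ X  # observed information
        beta = beta + np.linalg.solve(info, score)       # Newton step
    return 1.0 / (1.0 + np.exp(-beta[0]))                # inverse logit (3.29)

# with a flat true probability the local fit should recover it
xs = np.arange(40.0, 99.0)
ys = np.full(xs.shape, 0.3)   # fractional responses stand in for averages
p_hat = local_logistic(70.0, xs, ys, h=15.0)
```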
β1 (x)x

3.4.3 Life expectancies

Figure 3.18 gives the life expectancy at age 65 for the general population
and for insured lives, computed on the basis of observed death rates.
We see that the life expectancies for the group life insurance market are
close to the general population ones. This is due to the moderate adverse
selection present in the collective contracts, where the insurance coverage is
made compulsory by the employment contract, noting that there is a selec-
tion effect through being employed (the so-called ‘healthy worker effect’).
On the contrary, the effect of adverse selection seems to be much stronger
for individual policies. This is due to the particular situation prevailing

Figure 3.18. Life expectancy at age 65 for males (top panel) and females (bottom panel): General
population (diamond) and individual (circle) and group (triangle) life insurance market. Source:
HMD for the general population and BFIC for insured lives.

in Belgium, where no tax incentives are offered for buying life annuities or
other life insurance products after retirement. This explains why only people
with improved health status consider insurance products as valuable assets.
Note that this situation has recently changed in Belgium, where purchasing
life annuities at retirement age is now encouraged by the government.

3.4.4 Relational models

Actuaries are aware that the nominee of a life annuity is, with a high proba-
bility, a healthy person with a particularly low mortality in the first years of
life annuity payment and, generally, with an expected lifetime higher than
average. In order to account for this phenomenon, Delwarde et al. (2004)
have suggested a method for adjusting a reference life table to the experi-
ence of a given portfolio, based on non-linear regression models using local
likelihood for inference.
Denoting as m̂^HMD_x(t) the population death rates contained in the HMD,
and as m̂^BFIC_x(t) their analogue for the life insurance market computed from
BFIC statistics, we consider models of the form

ln m̂^BFIC_x(t) = f( ln m̂^HMD_x(t) ) + ε_{xt}    (3.30)

for ages x = 40, 41, . . . , 98 and calendar years 1994–2005. The similarity
with (3.24) is clearly apparent. Now, population death rates are used as
explanatory variables, instead of age x. Note that both variables could
enter the model as covariates, but we need here to establish a link between
population and insurance market mortality statistics that will be exploited
in Chapter 5. Figure 3.19 describes the result of the procedure for males,
whereas Fig. 3.20 is the analogue for females.
Figures 3.19 and 3.20 suggest that a linear relationship exists between
population and market death rates (at least for older ages). If we fit the
regression model

" BFIC
ln m x " HMD
(t) = a + b ln m x (t) + xt (3.31)

" HMD
to the observed pairs (ln m x " BFIC
(t), ln m x (t)) that are available for ages
60–98, and calendar years 1994 to 2005, we obtain estimated values for b
that are significantly less than 1 (for group and individual policies, males
and females). Moreover, the estimations are very sensitive to the age and
time ranges included in the analysis. Let us briefly explain why b < 1 seems
inappropriate.
Mortality reduction factors express the decrease in mortality at some
future time t + k compared with the current mortality experience at time
t. They are widely used to produce projected life tables and are formally
introduced in Section 4.3.2. The link between the regression model (3.31)
and the mortality reduction factors for the insurance market is as follows.
It is easily seen that if the linear relationship given above indeed holds
true then

ln[ m^BFIC_x(t + k) / m^BFIC_x(t) ] = b ln[ m^HMD_x(t + k) / m^HMD_x(t) ]    (3.32)

⇔ m^BFIC_x(t + k) / m^BFIC_x(t) = [ m^HMD_x(t + k) / m^HMD_x(t) ]^b    (3.33)

Figure 3.19. Relational models for males: observed pairs (ln m " HMD
x " BFIC
(t), ln m x (t)) are displayed in the left panels, the estimated functions f in (3.30) are
displayed in the middle panels, and the resulting fits are displayed in the right panels, individual policies in the top panels, group policies in the bottom panels.

Figure 3.20. Relational models for females: observed pairs (ln m " HMD
x " BFIC
(t), ln m x (t)) are displayed in the left panels, the estimated functions f in (3.30) are
displayed in the middle panels, and the resulting fits are displayed in the right panels, individual policies in the top panels, group policies in the bottom panels.
so that the mortality reduction factor for the market is equal to the mortality
reduction factor for the general population raised to the power b. The same
reasoning obviously holds for the group life insurance market. We note that
the mortality reduction factors are less than 1 in the presence of decreasing
trends in mortality rates.
As socio-economic class mortality differentials have widened over time,
we expect mortality improvements for assured lives to have been greater
than in the general population. This statement is based on the fact that
the socio-economic class mix of this group is higher than the population
average. Of course, there may be distortion factors, like changes in underwriting practices or reforms in tax systems. Considering that the estimated values of the parameters b are less than 1, the interpretation is that the speed of future mortality improvements in the insured population is somewhat smaller than the corresponding speed for the general population. This is not desirable, and presumably only reflects the changes in the tax regimes in Belgium, which lowered adverse selection.
This is why we now consider the following model:

ln m̂_x^BFIC(t) = f(x) + ln m̂_x^HMD(t) + ε_xt   (3.34)

We fit (3.34) to the observed pairs (ln m̂_x^HMD(t), ln m̂_x^BFIC(t)) over calendar years 1994–2005 and ages 60–98. This produces estimated SMR's of the form exp(f̂(x)) that can be used to adapt mortality projections to the insurance market. Note that in (3.34), we force the speed of mortality improvements to be equal to the one for the general population. The quality of the fit of (3.34) is remarkable, as can be seen from the high values of the R²'s: 99.8% for males, individual policies; 97.2% for males, group policies; 99.8% for females, individual policies; and 97.8% for females, group policies. The estimated SMR's are displayed in Fig. 3.21.
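Under (3.34), the least-squares estimate of f(x) is simply the per-age average over calendar years of the log-ratios ln m̂_x^BFIC(t) − ln m̂_x^HMD(t), and the SMR is exp(f̂(x)). The following sketch illustrates this with made-up death rates; the function names, the Gompertz-like shape, and the 0.45 level are assumptions for illustration, not Belgian data:

```python
import math

# Illustrative sketch of fitting model (3.34): for each age x, the least-
# squares estimate of f(x) is the average over calendar years of
# ln m_bfic(x, t) - ln m_hmd(x, t); the SMR is exp(f_hat(x)).
# All rates below are synthetic, not actual HMD/BFIC data.
years = range(1994, 2006)
ages = range(60, 99)

def m_hmd(x, t):
    # hypothetical general-population death rate (Gompertz-like, with trend)
    return math.exp(-10.0 + 0.09 * x - 0.01 * (t - 1994))

def m_bfic(x, t):
    # hypothetical insured-lives rate: 45% of the population rate
    return 0.45 * m_hmd(x, t)

smr = {}
for x in ages:
    f_hat = sum(math.log(m_bfic(x, t)) - math.log(m_hmd(x, t))
                for t in years) / len(years)
    smr[x] = math.exp(f_hat)

print(round(smr[65], 3))   # -> 0.45 by construction
```

Because the synthetic insured rates are an exact constant multiple of the population rates, the fitted SMR is flat in x; on real data the estimated SMR typically varies with age, as in Fig. 3.21.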

3.4.5 Age shifts

Another approach to quantify adverse selection consists in determining age shifts, or Rueff's adjustments. More details can be found in Section 4.4.3. Here, we determine the age shift Δ(t) to minimize the objective function

O_t(Δ) = Σ_{x=65}^{80} ( ê_x^BFIC(t) − ê_{x+Δ}^HMD(t) )²   (3.35)

We select the optimal value of Δ(t) by a grid search over {−10, −9, ..., 10}. Then, the overall age shift Δ is determined by minimizing O(Δ) = Σ_{t=1994}^{2005} O_t(Δ). This gives the values displayed in Table 3.1.
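A minimal sketch of this grid search follows, with synthetic life expectancies built so that the insured behave like the general population aged five years less; the functional forms and all numbers are hypothetical, chosen only so that the search should recover a shift of −5:

```python
# Hypothetical sketch of the grid search behind (3.35). The insured lives
# are given the life expectancy of the general population aged 5 years
# less, so the search should recover a shift of -5; numbers are made up.
def e_hmd(x):
    # synthetic period life expectancy of the general population at age x
    return 82.0 - 0.8 * x + 0.002 * x * x

def e_bfic(x):
    # synthetic insured-lives expectancy: that of the population aged x - 5
    return e_hmd(x - 5)

def objective(shift):
    # O(shift): sum over ages 65..80 of squared expectancy differences
    return sum((e_bfic(x) - e_hmd(x + shift)) ** 2 for x in range(65, 81))

best_shift = min(range(-10, 11), key=objective)
print(best_shift)   # -> -5
```

The overall shift of Table 3.1 is obtained the same way, summing the yearly objectives before minimizing.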

Figure 3.21. Estimated SMR's from (3.34) for males (top panels) and females (bottom panels), individual (left) and group (right) life insurance market.

Table 3.1. Optimal age shifts obtained from the objective functions O_t in (3.35), t = 1994, 1995, ..., 2005, and O = Σ_{t=1994}^{2005} O_t.

Year t Ind., males Ind., females Group, males Group, females

1994 −8 −6 −4 −1
1995 −7 −6 −1 0
1996 −9 −8 −2 −1
1997 −8 −5 −2 −1
1998 −6 −4 −1 −1
1999 −9 −6 −1 −1
2000 −8 −5 −1 0
2001 −9 −8 −2 0
2002 −9 −5 0 1
2003 −9 −4 −1 0
2004 −8 −4 1 3
2005 −6 −3 1 1
1994–2005 −9 −5 −1 0

Considering the period 1994–2005, we see that the actuarial computa-


tions for males, individual policies, should be based on general population
life tables with age decreased by 9 years. The corresponding shift for group
life policies is reduced to −1 year. For females, the values are −5 years for
individual policies with no adjustment for group life contracts.
Let us now briefly explain another approach to get these age shifts. To this
end, we assume that the observed number of deaths at age x in calendar year

Figure 3.22. Log-likelihood L as a function of the age shift Δ for males (top panels) and females (bottom panels), individual (left) and group (right) life insurance market.

t in the insurance market, D_xt^BFIC, is Poisson distributed, with a mean equal to the product of the exposure to risk ETR_xt^BFIC of the market multiplied by the population death rate m̂_{x+Δ}^HMD(t) at age x + Δ. This distributional assumption is not restrictive, as the likelihood (3.12) has been seen to be proportional to a Poisson one. The age shift Δ is then determined by maximizing the likelihood obtained by considering the D_xt^BFIC's as mutually independent, that is, by maximizing the objective function

L(Δ) = ∏_{x,t} exp( −ETR_xt^BFIC m̂_{x+Δ}^HMD(t) ) ( ETR_xt^BFIC m̂_{x+Δ}^HMD(t) )^{D_xt^BFIC} / D_xt^BFIC!

over Δ by a grid search.
The results, obtained from calendar years 1994–2005, are displayed in Fig. 3.22. The log-likelihood L = ln L(Δ) is given as a function of the age shift Δ. We clearly see that the log-likelihoods peak around the age shifts given in Table 3.1.
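The maximization above can be sketched as follows; the Poisson log-likelihood (dropping the ln D! term, which does not depend on the shift) is evaluated on the same grid of shifts. Exposures, the rate function, and the true shift of −4 are invented for illustration:

```python
import math

# Sketch of the Poisson approach: deaths are Poisson with mean
# ETR * m_hmd(x + shift, t); the log-likelihood is maximized over a grid.
# The data below are synthetic, generated with a true shift of -4.
def m_hmd(x, t):
    return math.exp(-9.5 + 0.09 * x)   # hypothetical population death rate

exposures = {(x, t): 1000.0 for x in range(65, 91) for t in (2004, 2005)}
deaths = {key: etr * m_hmd(key[0] - 4, key[1])   # expected deaths at x - 4
          for key, etr in exposures.items()}

def log_lik(shift):
    ll = 0.0
    for (x, t), etr in exposures.items():
        mean = etr * m_hmd(x + shift, t)
        ll += deaths[(x, t)] * math.log(mean) - mean   # drop ln(D!) term
    return ll

best_shift = max(range(-10, 11), key=log_lik)
print(best_shift)   # -> -4
```

As in Fig. 3.22, the log-likelihood is concave in the shift and peaks at the value used to generate the data.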

3.5 Mortality trends throughout EU


This section compares the Belgian mortality experience with several other
EU members: Sweden, Italy, Spain, West-Germany, France, and England
& Wales. Even if the trend is comparable in these countries, regional

differences in death rates produce gaps in life expectancies, especially at


retirement age. All the data used to perform the multinational comparison
come from HMD and all the analysis is performed on a period basis.
Figure 3.23 shows the trend in the period life expectancy at birth for males
in Sweden, Italy, Spain, West-Germany, France, and England & Wales,
compared with Belgium. Figure 3.25 is the analogue for females. Figures
3.24 and 3.26 display the period life expectancy at age 65.
Sweden has had a complete and continually updated register of its population for more than two centuries. In general, life expectancies at birth have increased from 1750 to the present. In 1771–1772, a harvest failure led to famine and epidemics, resulting in a drastic reduction of life expectancy at birth. Another increase in mortality took place during the first decade of the 19th century because of the Finnish war of 1808–1809 and related epidemics. The effect of the 1918 Spanish influenza epidemic is also clearly visible. Because Sweden remained neutral during both world wars, life expectancies were minimally affected relative to other European countries.
Compared to Belgium, we see that the life expectancy at birth is higher in
Sweden for both genders, but that the gap tends to narrow over time. Con-
sidering age 65, we notice an important difference for males, with a clear
advantage for Sweden over Belgium.
Considering the Italian experience, the effect of the Spanish influenza epidemic is clearly visible, as well as the impact of World War II. The speed of longevity improvement seems to be higher in Italy: until the 1950s, life expectancy at birth was higher in Belgium, and this changed in the 1960s. Italy now has a slightly higher life expectancy at birth. The advantage of Italy is even more apparent for life expectancies at age 65.
In addition to the marked effect of the 1918 influenza epidemic, the Spanish civil war (1936–1939) and the post-war period (1941–1942) caused an important decline in life expectancy at birth. As has been observed for Italy, the Belgian advantage over Spain disappeared in the 1960s. Bigger differences exist for life expectancies at age 65, in favour of Spain.
Let us now consider Germany. Instead of considering the whole country,
we restrict ourselves to the territory of the former Federal Republic of Ger-
many (called West Germany), starting after the end of World War II. We
see that the mortality trends in Belgium and in West Germany are similar:
life expectancies at birth and at age 65 closely agree in these two areas.
The trends in life expectancies at birth in France and Belgium are almost identical, despite the fact that the effect of World War II is more pronounced in France. Note also that the conjunction of World War I and the Spanish flu epidemic had a very strong effect on life expectancies in the second half
Figure 3.23. Life expectancy at birth in the EU for males for, from upper left to lower right, Sweden, Italy, Spain, West-Germany, France and England &
Wales, compared to Belgium (broken line). Source: HMD.
Figure 3.24. Life expectancy at age 65 in the EU for males for, from upper left to lower right, Sweden, Italy, Spain, West-Germany, France and England &
Wales, compared to Belgium (broken line). Source: HMD.
Figure 3.25. Life expectancy at birth in the EU for females for, from upper left to lower right, Sweden, Italy, Spain, West-Germany, France and England &
Wales, compared to Belgium (broken line). Source: HMD.
Figure 3.26. Life expectancy at age 65 in the EU for females for, from upper left to lower right, Sweden, Italy, Spain, West-Germany, France and England
& Wales, compared to Belgium (broken line). Source: HMD.

of the 1910s. Quite surprisingly, significant differences appear between the


French and Belgian life expectancies at age 65, with a clear advantage for
France.
Mortality in England and Wales has been significantly influenced by the two world wars as well as by the 1918 flu epidemic. We see that the trends in the life expectancies at birth and at retirement age are very similar in Belgium and in England and Wales.

3.6 Conclusions
As clearly demonstrated in this chapter, mortality at adult and old ages
reveals decreasing annual death probabilities throughout the 20th century.
There is an ongoing debate among demographers about whether human
longevity will continue to improve in the future as it has done in the past.
Demographers such as Tuljapurkar and Boe (2000) and Oeppen and Vaupel
(2002) argue that there is no natural upper limit to the length of human life.
The approach that these demographers use is based on an extrapolation
of recent mortality trends. The complexity and historical stability of the
changes in mortality suggest that the most reliable method of predicting
the future is merely to extrapolate past trends. However, this approach has come in for criticism because it ignores factors relating to lifestyle and the environment that might influence future mortality trends. Olshansky et al.
(2005) have suggested that the future life expectancy might level off or even
decline. This debate clearly indicates that there is considerable uncertainty
about future trends in longevity.
Mortality improvements are viewed as a positive change for individuals
and as a substantial social achievement. Nevertheless, they pose a chal-
lenge for the planning of public retirement systems as well as for the private
life annuities business. Longevity risk is also a growing concern for com-
panies faced with off-balance-sheet or on-balance-sheet pension liabilities.
More generally, all the components of social security systems are affected by
mortality trends and their impact on social welfare, health care and societal
planning has become a more pressing issue. And the threat has now become
a reality, as testified by the failure of Equitable Life, the world’s oldest life
insurance company, in the UK in 2001. Equitable Life sold deferred life
annuities with guaranteed mortality rates, but failed to predict the improve-
ments in mortality between the date the life annuities were sold and the date
they came into effect.
Despite the fact that the study of mortality has been core to the actuarial
profession from the beginning, booming stock markets and high interest

rates and inflation have largely hidden this source of risk. More recently, with the lowering of inflation, interest rates, and expected equity returns, mortality risks are no longer obscured.
Low nominal interest rates have made increasing longevity a much big-
ger issue for insurance companies. When living benefits are concerned, the
calculation of expected present values (which are needed in pricing and
reserving) requires an appropriate mortality projection in order to avoid
underestimation of future costs. This is because mortality trends at adult/old
ages reveal decreasing annual death probabilities. In order to protect the
company from mortality improvements, actuaries have to resort to life
tables including a forecast of the future trends of mortality (the so-called
projected tables). The building of such life tables will be the topic of the
next chapters.
4 Forecasting mortality: An introduction

4.1 Introduction
This chapter aims at describing various methods proposed by actuaries and
demographers for projecting mortality. Many of these have actually been used in the actuarial context, in particular for pricing and reserving in relation to life annuity products and pensions, and in the demographic field,
mainly for population projections.
First, the idea of a ‘dynamic’ approach to mortality modelling is intro-
duced. Then, projection methods are presented starting from extrapolation
procedures which are still widely used in current actuarial practice. More
complex methods follow, in particular methods based on mortality laws,
on model tables, and on relations between life tables. The Lee–Carter
method, recently proposed, and some relevant extensions are briefly intro-
duced, whereas a more detailed discussion, together with some examples of
implementation, is presented in Chapters 5 and 6.
The presentation does not follow a chronological order. In order to obtain
an insight into the historical evolution of mortality forecasts the reader
should refer to Section 4.9.1, in which some landmarks in the history of
dynamic mortality modelling are identified.
Allowing for future mortality trends (and, possibly, for the relevant uncer-
tainty of these trends) is required in a number of actuarial calculations and
applications. In particular, actuarial calculations concerning pensions, life
annuities, and other living benefits (provided, e.g. by long-term care cov-
ers and whole life sickness products) are based on survival probabilities
which extend over a long time horizon. To avoid underestimation of the
relevant liabilities, the insurance company (or the pension plan) must adopt
an appropriate forecast of future mortality, which should account for the
most important features of past mortality trends.
Various aspects of mortality trends can be captured looking at the
behaviour, through time, of functions representing the age-pattern of

mortality. The examples discussed in Chapter 3 clearly witness this


possibility.
Particular emphasis has been placed by many researchers on the
behaviour, for each integer age x, of the quantity qx (i.e. the probability
of dying within one year), drawn from a sequence of life tables relating to
the same kind of population (e.g. males living in a given country, annuitants
of an insurance company, etc.). The graph constructed plotting qx , for any
given age x, against time is usually called the mortality profile. Mortality
profiles are often declining, in particular at adult and old ages.
Further, mortality experience over the last decades shows some aspects
affecting the shape of curves representing the mortality as a function of the
attained age, such as the curve of deaths (i.e. the graph of the probability
density function of the random lifetime, in the age-continuous setting) and
the survival function. In particular (see also Section 2.3.1):

(a) an increasing concentration of deaths around the mode (at old ages) of
the curve of deaths is evident; so the graph of the survival function moves
towards a rectangular shape, whence the term rectangularization to
denote this aspect; see Fig. 3.11 for an actual illustration, and Fig. 4.1(a)
for a schematic representation;
(b) the mode of the curve of deaths (which, owing to the rectangularization,
tends to coincide with the maximum age ω) moves towards very old
ages; this aspect is usually called the expansion of the survival function;
see Fig. 3.13 for an actual illustration, and Fig. 4.1(b) for a schematic
representation;
(c) higher levels and a larger dispersion of accidental deaths at young ages
(the so-called young mortality hump) have been more recently observed;
see Fig. 3.2 for an illustration.

Figure 4.1. Mortality trends in terms of the survival function: (a) rectangularization; (b) expansion.



From the above aspects, the need for a dynamic approach to mortal-
ity assessment clearly arises. Addressing the age-pattern of mortality as a
dynamic entity underpins, from both a formal and a practical point of view,
any mortality forecast and hence any projection method.

4.2 A dynamic approach to mortality modelling


4.2.1 Representing mortality dynamics: single-figures
versus age-specific functions

When working in a dynamic context (in particular when projecting


mortality), the basic idea is to express mortality as a function of the
(future) calendar year t. When a single-figure representation of mortality is
concerned (see Sections 2.4.1 and 2.4.2), a dynamic model is a real-valued function ψ(t). For example, the expected lifetime for a newborn, denoted by ē0 in a non-dynamic context, is represented by ē0(t), a function of the calendar year t (namely the year of birth), when the mortality trend is allowed for. Similarly, the general probability of death in a given population can be
In actuarial calculations, however, age-specific measures of mortality are
usually needed. Then in a dynamic context, mortality is assumed to be a
function of both the age x and the calendar year t. In a rather general setting,
a dynamic mortality model is a real-valued or a vector-valued function ψ(x, t). In concrete terms, a real-valued function may represent one-year probabilities of death, mortality odds, the force of mortality, the survival function, some transform of the survival function, etc. This concept has already been introduced in Section 3.3. Further, a vector-valued function would be involved when causes of death are allowed for.
The projected mortality model is given by the restriction ψ(x, t)|t > t′, where t′ denotes the current calendar year, or possibly the year for which the most recent (reliable) period life table is available. The calendar year t′ is usually called the base year. The projected mortality model (and, in particular, the underlying parameters) is constructed by applying appropriate statistical procedures to past mortality experience.
Although age-specific functions are needed in actuarial calculations, the
interest in single-figure indexes as functions of the calendar year should
not be underestimated. In particular, important features of past mortality
trends can be singled out by focussing on the behaviour of some indexes that

are intended to be markers of the probability distribution of the random


lifetime at birth, T0 , or at some given age x, Tx (see Section 2.4). In a
dynamic context, all such markers should be noted to be functions of the
calendar year t, for example, ē0 (t), σ0 (t), ξ(t), etc.

4.2.2 A discrete, age-specific setting

Turning back to age-specific functions, we now assume that both age and calendar year are integers. Hence, ψ(x, t) can be represented by a matrix whose rows correspond to ages and columns to calendar years. In particular, let ψ(x, t) = qx(t), where qx(t) denotes the probability of an individual aged x in calendar year t dying within one year (namely, the one-year probability of death in a dynamic context).
The elements of the matrix (see Table 4.1) can be read according to three
arrangements:

(a) a vertical arrangement (i.e. by columns),


q0 (t), q1 (t), . . . , qx (t), . . . (4.1)
corresponding to a sequence of period life tables, with each table
referring to a given calendar year t;
(b) a diagonal arrangement,
q0 (t), q1 (t + 1), . . . , qx (t + x), . . . (4.2)
corresponding to a sequence of cohort life tables, with each table
referring to the cohort born in year t;
(c) a horizontal arrangement (i.e. by rows),
. . . , qx (t − 1), qx (t), qx (t + 1), . . . (4.3)
yielding the mortality profiles, with each profile referring to a given
age x.

Table 4.1. One-year probabilities of death in a dynamic context

Age x \ Year t   ...      t − 1         t            t + 1        ...

0                ...   q0(t − 1)     q0(t)      q0(t + 1)     ...
1                ...   q1(t − 1)     q1(t)      q1(t + 1)     ...
...              ...   ...           ...        ...           ...
x                ...   qx(t − 1)     qx(t)      qx(t + 1)     ...
x + 1            ...   qx+1(t − 1)   qx+1(t)    qx+1(t + 1)   ...
...              ...   ...           ...        ...           ...
ω − 1            ...   qω−1(t − 1)   qω−1(t)    qω−1(t + 1)   ...
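The three arrangements can be sketched with a small matrix of made-up probabilities; the dictionary, ages, years, and values below are purely illustrative:

```python
# Sketch of the three readings of the matrix in Table 4.1, using made-up
# one-year death probabilities q[(x, t)].
q = {(x, t): 0.001 * (1 + x) * 0.98 ** (t - 2000)
     for x in range(0, 5) for t in range(2000, 2005)}

# (a) vertical arrangement: the period life table for calendar year 2002
period_2002 = [q[(x, 2002)] for x in range(0, 5)]

# (b) diagonal arrangement: the cohort table for the generation born in 2000
cohort_2000 = [q[(x, 2000 + x)] for x in range(0, 5)]

# (c) horizontal arrangement: the mortality profile at age 3
profile_age3 = [q[(3, t)] for t in range(2000, 2005)]

# with improving mortality, the profile declines with the calendar year
assert all(a > b for a, b in zip(profile_age3, profile_age3[1:]))
```

Note that the cohort reading mixes entries from several period tables, which is exactly why projected (rather than period) tables are needed for pricing long-duration living benefits.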

4.3 Projection by extrapolation of annual probabilities of death
4.3.1 Some preliminary ideas

An extrapolation procedure for mortality simply aims at deriving future


mortality patterns (e.g. future probabilities of death) from a database that expresses past mortality experience. The database typically consists of cross-sectional observations and, possibly, (partial) cohort observations. This idea is sketched in Figs. 4.2 and 4.3.
However, a number of points should be addressed. In particular, consider
the following:

1. How are the items in the database interpreted? Are they correctly inter-
preted as observed outcomes of random variables (e.g. frequencies of
death), or, conversely, are they simply taken as ‘numbers’?
2. The projected table, resulting from the extrapolation procedure, is
a two-dimensional array of numbers, providing point estimates of
future mortality. How do we get further information, namely, interval
estimates?

If the answer to question (1) is ‘data are simply numbers’, then the extrap-
olation procedure does not allow for any statistical feature of the infor-
mation available, as, for example, the reliability of the data. Conversely,

Figure 4.2. The projected table.



Figure 4.3. From the data set to the projected table.

when the data are interpreted as the outcomes of random variables, the
extrapolation procedure must rely on sound statistical assumptions and,
as a consequence, future mortality can be represented in terms of both
point and interval estimates (whilst only point estimates can be provided by extrapolation procedures based only on 'numbers').
Various traditional projection methods consist of extrapolation procedures simply based on 'numbers'. First, we will describe these methods which, in spite of several deficiencies, offer a simple and intuitive introduction to mortality forecasts.
Let us assume that several period observations (or ‘cross-sectional’ obser-
vations) are available for a given population (e.g. males living in a country,
pensioners who are members of a pension plan, etc.). Each observation
consists of the age-pattern of mortality for a given set X of ages, say
X = {xmin , xmin + 1, . . . , xmax }. The observation referred to calendar year
t is expressed by

{qx (t)}x∈X = {qxmin (t), qxmin +1 (t), . . . , qxmax (t)} (4.4)

Let us focus on the set of observation years T = {t1 , t2 , . . . , tn }. Then, we


assume that the matrix

{qx (t)}x∈X ; t∈T = {qx (t1 ), qx (t2 ), . . . , qx (tn )}x∈X (4.5)



Figure 4.4. Extrapolation of the mortality profile.

constitutes the data base for mortality projections. Note that each sequence
on the right-hand side of (4.5) represents the observed mortality profile at
age x.
We assume that the trend observed in past years (i.e. in the set of years
T ) can be graduated, for example, via an exponential function. Further, we
suppose that the observed trend will continue in future years. Then, future
mortality can be estimated by extrapolating the trend itself (see Fig. 4.4).

Remark The choice of the set T is a crucial step in building up a mortality


projection procedure. Even if a long sequence of cross-sectional observa-
tions is available (throughout a time interval of, say, more than 50 years), a
choice restricted to recent observations (over, say, 30–50 years) may be
more reasonable than the whole set of data. Actually, a very long sta-
tistical sequence can exhibit a mortality trend in which recent causes of
mortality improvement have a relatively small weight, whereas causes of
mortality improvement whose effect should be considered extinguished are
still included in the trend itself (see Fig. 4.5). For more information, see
Section 5.5. 

Extrapolation of the qx ’s (namely of the mortality profiles) represents


a particular case of the horizontal approach for mortality forecasts (see
Fig. 4.6). The horizontal approach can be applied to quantities other than
the annual probabilities of death, for example, the mortality odds φx , the
central death rates mx , etc.
Adopting the horizontal approach means that extrapolations are per-
formed independently for each qx (or other age-specific quantity), so that

Figure 4.5. Extrapolation results depending on the graduation period.

Figure 4.6. The horizontal approach.

the result is a function ψx (t) for each age x. This may lead to inconsisten-
cies with regard to the projected age-pattern of mortality, as we will see in
Section 4.5.3.

4.3.2 Reduction factors

As far as future mortality is concerned, let us express the relation between the probability of death at age x, referred to a given year t′ (e.g. t′ = tn) and a generic year t (t > t′) respectively, as follows:

qx(t) = qx(t′) Rx(t − t′)   (4.6)

The quantity Rx(t − t′) is called the variation factor (and usually the reduction factor, as it is expected to be less than 1 because of the prevailing downward trends in probabilities of death) at age x for the interval (t′, t).
A simplification can be obtained by assuming that the reduction factor does not depend on the age x, that is, assuming for all t and x

Rx(t − t′) = R(t − t′)   (4.7)

Mortality forecasts can then be obtained through an appropriate modelling procedure applied to the reduction factor. The structure as well as the parameters of Rx(t − t′) should be carefully chosen. Then, projected mortality will be obtained via (4.6) (provided that we assume that the observed trend, on which the reduction factors are based, will continue in the future).
Remark The approach to projection by extrapolation which we are describing is based on a mathematical formula, namely, the formula for the reduction factor (examples are provided in Sections 4.3.3–4.3.8). Conversely, extrapolation may be based on a graphical method. The graphical approach to extrapolation consists in drawing, for each age x, a smooth curve representing the past trend in probabilities of death, assumed to continue after the calendar year t′, and then reading the projected probabilities from the extrapolated part of the curve. 

4.3.3 The exponential formula

Let us suppose that the observed mortality profiles are such that the behaviour over time of the logarithms of the qx's is, for each age x, approximately linear (see Fig. 4.7). Then, we can find a value δx such that, for h = 1, 2, ..., n − 1, we have approximately:

ln qx(th+1) − ln qx(th) ≈ −δx (th+1 − th)   (4.8)

Hence

qx(th+1) / qx(th) ≈ e^{−δx (th+1 − th)}   (4.9)

or, defining rx = e^{−δx}:

qx(th+1) / qx(th) ≈ rx^{th+1 − th}   (4.10)

Assume that, for each age x, the parameter δx (or rx) is estimated, for example via a least squares procedure. Then, the graduated probabilities q̂x(t) can be calculated. The constraint q̂x(tn) = qx(tn) is usually applied in the estimation procedure.

Figure 4.7. Behaviour of the qx's over time.

Relation (4.10) suggests a natural extrapolation formula. Set t′ = tn, and assume for t > t′:

qx(t) = qx(t′) rx^{t−t′}   (4.11)

from which we can express the reduction factor as follows:

Rx(t − t′) = rx^{t−t′} = e^{−δx (t−t′)}   (4.12)

The extrapolation formula (4.11) (as well as, for instance, formula (4.17) in Section 4.3.5) originates from the analysis of the mortality profiles, and hence constitutes an example of the horizontal approach.
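A sketch of this procedure follows: δx is estimated as minus the least-squares slope of ln qx(t) against t, and (4.11) is applied from the last observed year. The observation years and q_65 values are invented for illustration:

```python
import math

# Sketch of the exponential extrapolation (4.11)-(4.12): delta is estimated
# as minus the least-squares slope of ln q_x(t) against t, and projection
# starts from the last observed year t' = t_n. Observations are made up.
obs_years = [1990, 1995, 2000, 2005]
obs_q65 = [0.0210, 0.0191, 0.0174, 0.0158]   # synthetic q_65(t)

n = len(obs_years)
t_bar = sum(obs_years) / n
y_bar = sum(math.log(v) for v in obs_q65) / n
slope = (sum((t - t_bar) * (math.log(v) - y_bar)
             for t, v in zip(obs_years, obs_q65))
         / sum((t - t_bar) ** 2 for t in obs_years))
delta_65 = -slope   # annual rate of mortality improvement at age 65

def q65_projected(t, t_last=2005):
    # q_x(t) = q_x(t') * exp(-delta_x * (t - t')), i.e. formula (4.11)
    return obs_q65[-1] * math.exp(-delta_65 * (t - t_last))

print(round(delta_65, 4), round(q65_projected(2025), 4))
```

Anchoring the projection at qx(tn) reproduces the usual constraint q̂x(tn) = qx(tn) mentioned above.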

4.3.4 An alternative approach to the exponential extrapolation

For the calculation of the parameters rx (or δx), procedures other than least squares estimation can be used. An example follows.
Suppose, as above, that n period tables are available. For each age x and for h = 1, 2, ..., n − 1, calculate the quantities rx^(h) as follows:

rx^(h) = ( qx(th+1) / qx(th) )^{1/(th+1 − th)}   (4.13)

Then, for each x, we calculate rx as the weighted geometric average of the quantities rx^(h):

rx = ∏_{h=1}^{n−1} ( rx^(h) )^{wh}   (4.14)

The weights must, of course, fulfil the conditions: wh ≥ 0, h = 1, 2, ..., n − 1; Σ_{h=1}^{n−1} wh = 1.

Each weight wh should be chosen in a way that reflects both the length of the time interval between observations and the statistical reliability attaching to the observations themselves. Trivially, if we set wh = (th+1 − th)/(tn − t1) for all h, only the lengths of the time intervals are accounted for, and expression (4.14) reduces to

rx = ( qx(tn) / qx(t1) )^{1/(tn − t1)}   (4.15)

so that rx is determined only by the first and last values of qx(t) in the past data.
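The collapse of (4.14) to (4.15) under length-proportional weights is a telescoping product, and can be checked numerically; the observed values below are invented:

```python
# Check that with weights w_h proportional to interval lengths, the weighted
# geometric average (4.14) of the ratios (4.13) telescopes to (4.15).
# The observed q_x(t) values are made up.
obs = {1990: 0.0210, 1995: 0.0191, 2003: 0.0165}
years = sorted(obs)

# r_x^(h) = (q_x(t_{h+1}) / q_x(t_h)) ** (1 / (t_{h+1} - t_h)), cf. (4.13)
ratios = [(obs[t2] / obs[t1]) ** (1.0 / (t2 - t1))
          for t1, t2 in zip(years, years[1:])]

# w_h = (t_{h+1} - t_h) / (t_n - t_1): non-negative weights summing to 1
weights = [(t2 - t1) / (years[-1] - years[0])
           for t1, t2 in zip(years, years[1:])]

r_weighted = 1.0
for r, w in zip(ratios, weights):
    r_weighted *= r ** w

# formula (4.15): only the first and last observations matter
r_direct = (obs[years[-1]] / obs[years[0]]) ** (1.0 / (years[-1] - years[0]))
print(abs(r_weighted - r_direct) < 1e-12)   # -> True
```

With any other choice of weights, the intermediate observations genuinely influence rx, which is the point of the weighted average.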

4.3.5 Generalizing the exponential formula

Let us turn back to the exponential formula. From (4.11) it follows that, if rx < 1, then

qx(∞) = 0   (4.16)

where qx(∞) = lim_{t→+∞} qx(t). Although the validity of mortality forecasts should be restricted to a limited time interval, it may be more realistic to assign a positive limit to the mortality at any age x. To this purpose, the following formula with an assigned asymptotic mortality can be adopted:

qx(t) = qx(t′) ( αx + (1 − αx) rx^{t−t′} )   (4.17)

where αx ≥ 0 for all x (see Fig. 4.8). The reduction factor is thus given by

Rx(t − t′) = αx + (1 − αx) rx^{t−t′}   (4.18)

Clearly, (4.17) is a generalization of (4.11). From (4.17) we have:

qx(∞) = αx qx(t′)   (4.19)

The exponential formula expressed by equation (4.17) can be simplified by assuming that rx = r for all x, from which we obtain:

qx(t) = qx(t′) ( αx + (1 − αx) r^{t−t′} )   (4.20)

Although the mortality decline is not necessarily uniform across a given


(wide) age range, this assumption can be reasonable when a limited set of
ages is involved in the mortality forecast. This would be the case for mor-
tality projections concerning annuitants or pensioners. In any case, some
flexibility is provided by the parameters αx .

Figure 4.8. Asymptotic mortality in exponential formulae. [Two panels plot q_x(t) against time t: without an asymptote, q_x(t) decays from q_x(t') towards 0; with an asymptote, it decays towards the level q_x(t') α_x.]

4.3.6 Implementing the exponential formula

An alternative version of the exponential formula (4.17) can help in directly


assigning estimates to the parameters rx . Without loss of generality, we
address the simplified structure represented by equation (4.20), so that r is
independent of the age x.
The total (asymptotic) mortality decline, from time t' on, is given by
q_x(t') − q_x(∞), whereas the decline in the first m years is given by
q_x(t') − q_x(t' + m). Let us define the ratio f_x(m) as follows:

$$f_x(m) = \frac{q_x(t') - q_x(t'+m)}{q_x(t') - q_x(\infty)} \qquad (4.21)$$

then f_x(m) is the proportion of the total mortality decline assumed to occur
by time m. Dividing both numerator and denominator by q_x(t'), we obtain:

$$f_x(m) = \frac{1 - R_x(m)}{1 - R_x(\infty)} = \frac{(1-\alpha_x)(1-r^m)}{1-\alpha_x} = 1 - r^m \qquad (4.22)$$

Note that, since we have assumed r_x = r for all x, we have f_x(m) = f(m).
Hence

$$r = (1 - f(m))^{\frac{1}{m}} \qquad (4.23)$$

The choice of the couple (m, f(m)) unambiguously determines the
parameter r. Finally, we have

$$R_x(t - t') = \alpha_x + (1 - \alpha_x)\,(1 - f(m))^{\frac{t-t'}{m}} \qquad (4.24)$$

For example, if we assume that 60% of the total mortality decline occurs
in the first 20 years, we set (m, f(m)) = (20, 0.60), and so r = 0.40^{1/20} = 0.9552.
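As a quick check of the worked example, a minimal sketch (the function name is ours, not from the text):

```python
def reduction_factor(s, alpha_x, m=20, f=0.60):
    """Reduction factor (4.24): alpha_x + (1 - alpha_x) * (1 - f)**(s/m),
    where f is the share of the total decline occurring in the first m years
    and s = t - t' is the number of projection years."""
    r = (1.0 - f) ** (1.0 / m)            # eq. (4.23)
    return alpha_x + (1.0 - alpha_x) * r ** s

r = (1.0 - 0.60) ** (1.0 / 20)            # the (m, f(m)) = (20, 0.60) example
print(round(r, 4))                        # → 0.9552
```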

4.3.7 A general exponential formula

The exponential formulae discussed in Sections 4.3.3 and 4.3.5 can be


placed in a more general context. We assume the following expression for
the annual probability of death:

$$q_x(t) = a_x + b_x\, c_x^t \qquad (4.25)$$

in which the parameters ax , bx , cx depend on age x and are independent of


the calendar year t. Thus, qx (t) is an exponential function of t. Equation
(4.25) then represents a general exponential formula for projections via
extrapolation.
The projection formulae which are currently used in actuarial practice
constitute particular cases of formula (4.25). For instance, with a_x = 0,
b_x = q_x(t') r_x^{−t'}, c_x = r_x, we obtain formula (4.11). With a_x = α_x q_x(t'),
b_x = (1 − α_x) q_x(t') r_x^{−t'}, c_x = r_x, we find the more general formula (4.17).
The projection formula

$$q_x(t) = q_x(t')\, a^{\frac{t-t'}{x+b}} \qquad (4.26)$$

(called the Sachs formula), where a and b are constants and a^{(t−t')/(x+b)} represents
the reduction factor, also constitutes a particular case of (4.25), as can be
easily proved.
Note that formulae (4.11) and (4.17) (and some related expressions)
explicitly refer to the base year t' (usually related to the most recent
observation, that is, t' = t_n). Conversely, formula (4.25) as well as other
formulae presented in Section 4.3.9 do not explicitly address a fixed calendar
year. Nonetheless, a link with a given calendar year can be introduced
via parameters, as illustrated, for example, by formula (4.26).

4.3.8 Some exponential formulae used in actuarial practice

Exponential formulae have been widely used in actuarial practice. Imple-


mentations of these formulae can be found, for instance, in the USA, Great
Britain, Germany and Austria. Some examples follow.
Example 4.1 In the UK, formula (4.11) was used for forecasting the mortal-
ity of life office pensioners and annuitants; see CMIB (1978). In particular,
a simplified version with the same reduction factor at all ages (see (4.7))
was implemented, that is,

$$q_x(t) = q_x(t')\, r^{t-t'} \qquad (4.27)$$

The approximation was considered acceptable from age x = 60 upwards.



Example 4.2 Formula (4.20) was also proposed in the UK; see CMIB
(1990). The reduction factor R_x(t − t'), with t' = 1980 as the base year, is
given by

$$R_x(t - t') = \alpha_x + (1 - \alpha_x)(1 - f)^{\frac{t-t'}{20}} \qquad (4.28)$$

with f = f(20) = 0.60 [see formula (4.24)] and:

$$\alpha_x = \begin{cases} 0.50 & \text{if } x < 60 \\[4pt] \dfrac{x-10}{100} & \text{if } 60 \le x \le 110 \\[4pt] 1 & \text{if } x > 110 \end{cases} \qquad (4.29)$$

It is easy to see that, for any year t, the reduction factor increases (i.e.
the mortality improvement reduces) linearly with increasing age, from
0.50 + 0.50 (0.40)^{(t−t')/20} at age 60 and below, to unity at age 110 and above.
For any given age x, the rate of improvement decreases as t increases.
Further, following the analysis in Section 4.3.6, it is easy to prove that
expression (4.28) for the reduction factor, with f = 0.60, implies that 60%
of the total (asymptotic) mortality improvement (at any age x) is assumed
to occur in the first 20 years. 
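The CMIB (1990) reduction factor of this example can be sketched as follows (the function names are ours; base year t' = 1980):

```python
def alpha_cmib90(x):
    """Piecewise alpha_x of eq. (4.29)."""
    if x < 60:
        return 0.50
    if x <= 110:
        return (x - 10) / 100.0
    return 1.0

def R_cmib90(x, s, f=0.60):
    """Reduction factor (4.28), with s = t - t' projection years."""
    a = alpha_cmib90(x)
    return a + (1.0 - a) * (1.0 - f) ** (s / 20.0)

# At age 110 and above, alpha_x = 1: no improvement is assumed (R = 1);
# at age 60, after 20 years, R = 0.50 + 0.50 * 0.40 = 0.70
assert R_cmib90(110, 30) == 1.0
assert abs(R_cmib90(60, 20) - 0.70) < 1e-12
```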
Example 4.3 A recent implementation of formula (4.17) by the Continuous
Mortality Investigation Bureau is as follows (see CMIB (1999)). In this case,
the reduction factor is given by

$$R_x(t - t') = \alpha_x + (1 - \alpha_x)(1 - f_x)^{\frac{t-t'}{20}} \qquad (4.30)$$

The functions α_x, f_x have been chosen as follows:

$$f_x = \begin{cases} c & \text{if } x < 60 \\[4pt] 1 + (1-c)\,\dfrac{x-110}{50} & \text{if } 60 \le x \le 110 \\[4pt] 1 & \text{if } x > 110 \end{cases} \qquad (4.31)$$

$$\alpha_x = \begin{cases} h & \text{if } x < 60 \\[4pt] \dfrac{(110-x)\,h + (x-60)\,k}{50} & \text{if } 60 \le x \le 110 \\[4pt] k & \text{if } x > 110 \end{cases} \qquad (4.32)$$

where c = 0.13, h = 0.55, and k = 0.29. Parameters have been adjusted so
that t' = 1992 is the base year.

Example 4.4 An exponential formula has also been used in the United
States. The Society of Actuaries published the 1994 probabilities of death
as the base table and the annual improvement factors 1 − r_x; see Group
Annuity Valuation Table Task Force (1995). The projected probabilities of
death are determined as follows:

$$q_x(t) = q_x(1994)\, r_x^{t-1994} \qquad (4.33)$$

The parameter r_x varies from 0.98 to 1, being equal to 1 for x > 100, for
both males and females.

4.3.9 Other projection formulae

Mortality improvements resulting from observed data may suggest assump-


tions other than the exponential decline of annual probabilities of death.
Thus, a formula different from the exponential one (see (4.25)) can be used
to express the probabilities qx (t). Conversely, the exponential formula can
be used to express other life table functions, or a transform of a life table
function, such as the odds φx (t).
Below we present some formulae which have been suggested or used in
applications:
$$q_x(t) = a_x + \frac{b_x}{t} \qquad (4.34)$$

$$q_x(t) = \sum_{h=0}^{p} a_{x,h}\, t^h \qquad (4.35)$$

$$q_x(t) = \frac{e^{G_x(t)}}{1 + e^{G_x(t)}} \qquad (4.36)$$

where G_x(t) is, for each age x, a polynomial in t, that is,

$$G_x(t) = \sum_{h=0}^{p} c_{x,h}\, t^h \qquad (4.37)$$

Some comments about these formulae follow. Formula (4.35) with p = 1
represents the linear extrapolation method:

$$q_x(t) = a_{x,0} + a_{x,1}\, t \qquad (4.38)$$

with a_{x,1} < 0 to express mortality decline. This formula is not usually
adopted because of its obvious disadvantage that for large t a negative
probability is predicted. The polynomial extrapolation formula (4.35) with
p = 3 is called the Esscher formula.

Referring to formula (4.36), note that it can be expressed as follows:

$$\ln \frac{q_x(t)}{p_x(t)} = G_x(t) \qquad (4.39)$$

If observed mortality improvements suggest a linear behaviour of the logarithms
of the odds, and thus an exponential behaviour of the odds, then
we can use formula (4.36) with

$$G_x(t) = c_{x,0} + c_{x,1}\, t \qquad (4.40)$$

and so we have the following expression:

$$q_x(t) = \frac{e^{c_{x,0} + c_{x,1} t}}{1 + e^{c_{x,0} + c_{x,1} t}} \qquad (4.41)$$

4.4 Using a projected table


4.4.1 The cohort tables in a projected table

A projected mortality table is a rectangular matrix {q_x(t)}_{x∈X; t≥t'}, where t'
is the base year. The appropriate use of the projected table requires that, in
each year t, probabilities concerning the lifetime of a person age x in that
year are derived from the diagonal

$$q_x(t),\; q_{x+1}(t+1),\; \dots \qquad (4.42)$$

that is, from the relevant cohort table (see also Section 4.2.2). Then, the
probability of a person age x in year t being alive at age x + k is given by:

$$_k p_x^{\nearrow}(t) = \prod_{j=0}^{k-1} \left[ 1 - q_{x+j}(t+j) \right] \qquad (4.43)$$

where the superscript recalls that we are working along a diagonal band
in the Lexis diagram (see Section 3.3, and Fig. 3.1 in particular), or, simi-
larly, along a diagonal of the matrix in Table 4.1 with the proviso that the
ordering of the lines is inverted. Note that explicit reference to the year of
birth τ is omitted, as this is trivially given by τ = t − x.
For example, to calculate, in the calendar year t, the expected remaining
lifetime of an individual age x in that year, the following formula should
be adopted, rather than formula (2.65) (which relies on the assumption of
unchanging mortality after the period observation from which the life table

was drawn):
$$\mathring{e}_x(t) = \sum_{k=1}^{\omega - x} {}_k p_x^{\nearrow}(t) + \frac{1}{2} \qquad (4.44)$$

The quantity e̊_x(t) is usually called the (complete) cohort life expectancy,
for a person age x in year t. If a decline in future mortality is expected (and
hence represented by the projected cohort table), the following inequality
holds:

$$\mathring{e}_x(t) > \mathring{e}_x \qquad (4.45)$$

where e̊_x denotes the period life expectancy (see Section 2.4.3).
Note that, in a dynamic framework, the period life expectancy should be
denoted as follows:
$$\mathring{e}_x^{\uparrow}(t) = \sum_{k=1}^{\omega - x} {}_k p_x^{\uparrow}(t) + \frac{1}{2} \qquad (4.46)$$

with

$$_k p_x^{\uparrow}(t) = \prod_{j=0}^{k-1} \left[ 1 - q_{x+j}(t) \right] \qquad (4.47)$$

where the superscript ↑ recalls that we are working along a vertical band in
the Lexis diagram, or, similarly, along a column of the matrix in Table 4.1.
The same cohort-based approach should be adopted to calculate actuarial
values of life annuities, for both pricing and reserving. Hence, various cohort
tables should be simultaneously used, according to the year of birth of the
individuals addressed in the calculations.
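The cohort (diagonal) and period (vertical) computations above can be sketched on a synthetic projected table; all numbers below are illustrative, and the uniform 2% annual improvement is an assumption of the sketch, not of the text:

```python
import numpy as np

ages = np.arange(60, 111)                 # ages x = 60, ..., 110
years = np.arange(2000, 2101)             # calendar years, base year t' = 2000

# Illustrative projected matrix {q_x(t)}: q_x(t) = q_x(t') * 0.98**(t - t')
q0 = np.minimum(0.008 * 1.09 ** (ages - 60), 1.0)     # made-up base-year profile
Q = np.minimum(q0[:, None] * 0.98 ** (years - 2000)[None, :], 1.0)

def cohort_e(x, t):
    """Cohort life expectancy (4.44), using diagonal survival (4.43)."""
    i, j = x - ages[0], t - years[0]
    n = min(Q.shape[0] - i, Q.shape[1] - j)
    qs = Q[i + np.arange(n), j + np.arange(n)]        # q_{x+k}(t + k)
    return np.cumprod(1.0 - qs).sum() + 0.5

def period_e(x, t):
    """Period life expectancy (4.46), using vertical survival (4.47)."""
    i, j = x - ages[0], t - years[0]
    qs = Q[i:, j]                                     # q_{x+k}(t)
    return np.cumprod(1.0 - qs).sum() + 0.5

# With declining mortality, inequality (4.45) holds
assert cohort_e(65, 2000) > period_e(65, 2000)
```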

4.4.2 From a double-entry to a single-entry projected table

From a strictly practical point of view, the simultaneous use of various


cohort tables may have some disadvantages. Moreover, probabilities con-
cerning people with the same age x at policy issue vary according to the
issue year t. These disadvantages have often led to the adoption, in actu-
arial practice, of one single-entry table only, throughout a period of some
(say 5, or 10) years. The single-entry table must be drawn, in some way,
from the projected double-entry table.
Single-entry tables can be derived, in particular, as follows (see also
Fig. 4.9):

(1) A birth year τ̄ is chosen and the cohort table pertaining to the generation
born in year τ̄ is only addressed; so, the probabilities
$$q_{x_{\min}}(\bar\tau + x_{\min}),\; q_{x_{\min}+1}(\bar\tau + x_{\min} + 1),\; \dots,\; q_x(\bar\tau + x),\; \dots \qquad (4.48)$$

Figure 4.9. Two approaches to the choice of a single-entry projected table. [The diagram splits the time axis at the base year t' into past and future; within the projected table {q_x(t')}, approach (1) selects a diagonal (a cohort) and approach (2) a column (a period year).]

where xmin denotes the minimum age of interest, are used in actuarial
calculations. Thus, just one diagonal of the matrix {qx (t)} is actu-
ally used. The choice of τ̄ should reflect the average year of birth of
annuitants or pensioners to whom the table is referred.
(2) A (future) calendar year t̄ is chosen and the projected period table
referring to year t̄ is only addressed; and so the probabilities

$$q_{x_{\min}}(\bar t),\; q_{x_{\min}+1}(\bar t),\; \dots,\; q_x(\bar t),\; \dots \qquad (4.49)$$

are adopted in actuarial calculations. Thus, just one column of the


matrix is used. The choice of t̄ should be broadly appropriate to the
mix of life annuity business in force over the medium-term future.

Following approach (1), and using the superscript [τ̄] to denote refer-
ence to the cohort table for the generation born in year τ̄, the probability
of being alive at age x + k is given (for any year of birth τ = t − x) by

$$_k p_x^{[\bar\tau]} = \prod_{j=0}^{k-1} \left[ 1 - q_{x+j}(\bar\tau + x + j) \right] \qquad (4.50)$$

Adopting approach (2), and denoting by [t̄] ↑ the reference to the period
table for year t̄, the probability of being alive at age x+k is conversely given
(for any year of birth τ = t − x) by

$$_k p_x^{[\bar t\,]\uparrow} = \prod_{j=0}^{k-1} \left[ 1 - q_{x+j}(\bar t) \right] \qquad (4.51)$$

Of course, both approaches lead to biased evaluations. Notwithstanding
this deficiency, approach (1) can be 'adjusted' to reduce this bias. A
common adjustment is described in the following section.

4.4.3 Age shifting

For people born in year τ = t −x, the probabilities (4.43) (which are related
to the year of birth τ) should be used, whereas approach (1) leads to the use
of probabilities (4.50), which are independent of the actual year of birth. To
reintroduce a dependence on τ, at least to some extent, we use the following
probabilities:

$$q_{x_{\min}+h(\tau)}(\bar\tau + x_{\min} + h(\tau)),\; q_{x_{\min}+1+h(\tau)}(\bar\tau + x_{\min} + 1 + h(\tau)),\; \dots,\; q_{x+h(\tau)}(\bar\tau + x + h(\tau)),\; \dots \qquad (4.52)$$

Note that all the probabilities involved belong to the same diagonal referred
to within approach (1).
This adjustment (often called Rueff’s adjustment) involves an age-shift
of h(τ) years. Assuming a mortality decline, the function h(τ) must satisfy
the following relations:


$$h(\tau) \begin{cases} \ge 0 & \text{for } \tau < \bar\tau \\ = 0 & \text{for } \tau = \bar\tau \\ \le 0 & \text{for } \tau > \bar\tau \end{cases} \qquad (4.53)$$

The survival probability is then calculated as follows (instead of using


formula (4.50)):

$$_k p_x^{[\bar\tau;\, h(\tau)]} = \prod_{j=0}^{k-1} \left[ 1 - q_{x+h(\tau)+j}(\bar\tau + x + h(\tau) + j) \right] \qquad (4.54)$$

where the superscript also recalls the age-shift. Probabilities given by for-
mula (4.54) can be adopted to approximate the cohort life expectancy (see
Section 4.4.1) as well as actuarial values of life annuities.

Table 4.2. Age-shifting function (table TPRV; τ̄ = 1950; i = 0)

τ            h(τ)

1901–1910 5
1911–1920 4
1921–1929 3
1930–1937 2
1938–1946 1
1947–1953 0
1954–1960 −1
1961–1967 −2
1968–1975 −3
1976–1984 −4
≥ 1985 −5

As regards the determination of the age-shift function h(τ), various cri-


teria can be adopted. We just mention that most criteria are based on
the analysis of the actuarial values of life annuities calculated using the
appropriate probabilities, given by (4.43), and, respectively, using the prob-
abilities (4.54), with the aim of minimizing the ‘distance’ (conveniently
defined) between the sets of actuarial values. When a criterion of this type
is adopted, the function h(τ) depends on the interest rate used in calculating
the actuarial values.
Example 4.5 Table 4.2 shows the age-shifting function used in connection
with the French projected table TPRV. The interest rate assumed for the
construction of the function is i = 0. 
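The age-shift of Table 4.2 lends itself to a small lookup sketch (the band encoding below is ours):

```python
# Rueff age-shift h(tau) from Table 4.2 (table TPRV, tau_bar = 1950),
# encoded as (first birth year of the band, shift); birth years before 1901
# are not tabulated and here fall back to the first band's value
BANDS = [(1901, 5), (1911, 4), (1921, 3), (1930, 2), (1938, 1),
         (1947, 0), (1954, -1), (1961, -2), (1968, -3), (1976, -4), (1985, -5)]

def h(tau):
    """Age shift for year of birth tau, per Table 4.2."""
    shift = BANDS[0][1]
    for start, s in BANDS:
        if tau >= start:
            shift = s
    return shift

# A life born in 1965 enters the tau_bar = 1950 cohort table at age
# x + h(1965) = x - 2, as in the survival probability (4.54)
assert h(1950) == 0 and h(1965) == -2
assert h(1905) == 5 and h(1990) == -5
```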

Remark It is worth noting that adjustments via an age-shifting mechanism


are rather common in life insurance actuarial technique. For example, an
increment in the insured’s age is often used to account for the effects of
impairments on the age-pattern of mortality; see Section 2.9.1 and, in par-
ticular, formulae (2.119) and (2.120). A further example of age-shifting
in the context of mortality projections is presented in Section 4.5.1 (see
Example 4.8). 

4.5 Projecting mortality in a parametric context


4.5.1 Mortality laws and projections

When a mortality law is used to fit observed data, the age-pattern of mortal-


ity is summarized by some parameters (see Section 2.5). Then, the projection

procedure can be applied to the set of parameters (instead of the set of age-
specific probabilities), with a dramatic reduction in the dimension of the
forecasting problem, namely in the number of the ‘degrees of freedom’.
Consider a law, for example, describing the force of mortality:

µx = ϕ(x; α, β, . . . ) (4.55)

In a dynamic context, the calendar year t enters the model via its parameters

µx (t) = ϕ(x; α(t), β(t), . . . ) (4.56)

Let T = {t1 , t2 , . . . , tn } denote the set of observation years. Hence, for


a given set X of ages, the data base is represented by the set of observed
values
{µx (t)}x∈X ; t∈T = {µx (t1 ), µx (t2 ), . . . , µx (tn )}x∈X (4.57)

For each calendar year th , we estimate the parameters to fit the model

µx (th ) = ϕ(x; αh , βh , . . . ) (4.58)

(e.g. via least squares, or minimum χ2 , or maximum likelihood) so that a


set of n functions of age x is obtained

{µx (t1 ), µx (t2 ), . . . , µx (tn )} (4.59)

Trends in the parameters are then graduated via some mathematical


formula, and hence a set of functions of time t is obtained:

α1 , α2 , . . . , αn ⇒ α(t)
β1 , β2 , . . . , βn ⇒ β(t)
...

(see Fig. 4.10).


It is worth noting that the above projection procedure follows a vertical
approach to mortality forecast, as the parameters of the chosen law are
estimated for each period table based on the experienced mortality (see
Fig. 4.11).
Conversely, a diagonal approach can be adopted, starting from parameter
estimation via a cohort graduation (see Fig. 4.12). In this case, parameters
depend on the year of birth τ:

µx (τ) = ϕ(x; γ(τ), δ(τ), . . . ) (4.60)



Figure 4.10. Projection in a parametric framework. [The fitted parameter values α_h for the past observation years t_h (graduation) are extrapolated beyond t' as a function α(t) of future time t.]

Figure 4.11. The vertical approach. [Each column of the age × calendar-year data array, i.e. each period year t_h, is graduated by ϕ(x; α_h, β_h, ...); the fitted parameters are then extrapolated to give ϕ(x; α(t), β(t), ...).]

For each year of birth τh , h = 1, 2, . . . , m, we estimate the parameters to


fit the model
µx (τh ) = ϕ(x; γh , δh , . . . ) (4.61)

so that a set of m functions of age x is obtained

{µx (τ1 ), µx (τ2 ), . . . , µx (τm )} (4.62)



Figure 4.12. The diagonal approach. [Each diagonal of the age × calendar-year data array, i.e. each cohort, is graduated by ϕ(x; γ_h, δ_h, ...); the fitted parameters are then extrapolated to give ϕ(x; γ(τ), δ(τ), ...).]

Trends in the parameters are then graduated via some mathematical


formula, and hence a set of functions of time τ is obtained:

γ1 , γ2 , . . . , γm ⇒ γ(τ)
δ1 , δ2 , . . . , δm ⇒ δ(τ)
...

Example 4.6 A Makeham’s law (see (2.70)), representing mortality dynam-


ics according to the vertical approach, can be defined as follows:

$$\mu_x(t) = A(t) + B(t)\, c(t)^x \qquad (4.63)$$

where t represents the calendar year.


When the diagonal approach is adopted, the dynamic Makeham law is
defined as follows:
$$\mu_x(\tau) = \bar A(\tau) + \bar B(\tau)\, \bar c(\tau)^x \qquad (4.64)$$
where τ = t − x denotes the year of birth of the cohort. 
Example 4.7 In some law-based projection models it has been assumed
that the age-pattern of mortality is represented by one of the Heligman–
Pollard laws (see (2.83) to (2.87)), and that various relevant parameters
are functions of the calendar year. Thus, according to a vertical approach,

functions A(t), B(t), C(t), . . . are used to express the dependency of the
age-pattern of mortality on the calendar year t. 
Example 4.8 We assume that, for each past calendar year t, the odds
φx (t) = qx (t)/px (t) are graduated using (2.81). Then, we have

$$\phi_x(t) = e^{P_x(t)} \qquad (4.65)$$

where Px (t) denotes, for each t, a polynomial in x. Further, we assume that


the odds are extrapolated, for t > t', via an exponential formula, that is,

$$\phi_x(t) = \phi_x(t')\, r^s \qquad (4.66)$$

where s = t − t' and r < 1.


As far as the age-pattern of mortality in the base year t' is concerned, we
assume:

$$P_x(t') = \alpha + \beta x \qquad (4.67)$$
Then, from (4.66) we have:

$$\ln \phi_x(t) = \alpha + \beta x + s \ln r \qquad (4.68)$$

Defining

$$w = -\frac{\ln r}{\beta} \qquad (4.69)$$
we finally obtain:

$$\ln \phi_x(t) = \alpha + \beta(x - ws) = P_{x-ws}(t') \qquad (4.70)$$

By assumption r < 1, and, given the behaviour of probabilities q_x(t') and
p_x(t') as functions of the age x, it is sensible to suppose β > 0. Then we find
w > 0. Hence, a constant reduction factor applied to the odds leads to an
age reduction w for each of the s projection years. If this result is transferred
from the odds φ_x(t) to the probabilities q_x(t), we have approximately:

$$q_x(t) \approx q_{x-ws}(t') \qquad (4.71)$$

Formulae (4.70) and (4.71) provide examples of approximate evaluation


via age-shifting. See also the Remark in Section 4.4.3. 

4.5.2 Expressing mortality trends via Weibull’s parameters

Assume that the probability distribution of the random lifetime at birth, T0 ,


is represented (for a given cohort of lives) by the Weibull law, hence with
force of mortality given by (2.77). The corresponding pdf is then

$$f_0(x) = \frac{\alpha}{\beta} \left( \frac{x}{\beta} \right)^{\alpha-1} e^{-(x/\beta)^\alpha}; \qquad \alpha, \beta > 0 \qquad (4.72)$$

whereas the survival function is given by

$$S(x) = e^{-(x/\beta)^\alpha} \qquad (4.73)$$

It is well known that, whilst the Weibull law does not fit well the age-
pattern of mortality throughout the whole life span (especially because of
the specific features of infant and young-adult mortality), it provides a rea-
sonable representation of mortality at adult and old ages. Moreover, the
choice of the Weibull law is supported by the possibility of easily express-
ing, in terms of its parameters, the mode (at adult ages) of the distribution
of the random lifetime T0 , that is, the Lexis point,
$$\mathrm{Mod}[T_0] = \beta \left( \frac{\alpha - 1}{\alpha} \right)^{\frac{1}{\alpha}}; \qquad \alpha > 1 \qquad (4.74)$$

as well as the expected value and the variance,

$$\mathrm{E}[T_0] = \beta\, \Gamma\!\left( \frac{1}{\alpha} + 1 \right) \qquad (4.75)$$

$$\mathrm{Var}[T_0] = \beta^2 \left[ \Gamma\!\left( \frac{2}{\alpha} + 1 \right) - \left( \Gamma\!\left( \frac{1}{\alpha} + 1 \right) \right)^{\!2} \right] \qquad (4.76)$$

where Γ denotes the complete gamma function (see, e.g. Kotz et al. (2000)).
Moments for the remaining lifetime at age x > 0, Tx , can similarly be
derived.
The above possibility facilitates the choice of laws which reflect specific
future trends of mortality. When a dynamic mortality model is con-
cerned, the force of mortality must be addressed as a function of the
(future) calendar year t (according to the vertical approach), or the year of
birth τ (diagonal approach). Hence, referring for example to the diagonal
approach, we generalize formula (2.77) as follows:
 
$$\mu_x(\tau) = \frac{\alpha(\tau)}{\beta(\tau)} \left( \frac{x}{\beta(\tau)} \right)^{\alpha(\tau)-1} \qquad (4.77)$$

Functions α(τ) and β(τ) should be chosen in order to reflect the assumed
trends in the rectangularization and expansion processes. To this purpose,
formulae (4.74) to (4.76) provide us with a tool for checking the validity of
a choice of the above functions.
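The markers (4.74)–(4.76) are straightforward to evaluate numerically; a sketch follows (the parameter values are illustrative, not calibrated to any table):

```python
from math import gamma

def weibull_markers(alpha, beta):
    """Lexis point (4.74), mean (4.75) and variance (4.76) of a Weibull
    lifetime T0 with shape alpha (> 1) and scale beta."""
    mode = beta * ((alpha - 1.0) / alpha) ** (1.0 / alpha)
    mean = beta * gamma(1.0 / alpha + 1.0)
    var = beta ** 2 * (gamma(2.0 / alpha + 1.0) - gamma(1.0 / alpha + 1.0) ** 2)
    return mode, mean, var

# Raising alpha over successive cohorts, for fixed beta, shrinks the
# variance of T0 and moves the mode upward: a numerical handle on the
# rectangularization and expansion processes
m1, e1, v1 = weibull_markers(7.0, 85.0)
m2, e2, v2 = weibull_markers(9.0, 85.0)
assert v2 < v1 and m2 > m1
```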

Figure 4.13. A possible inconsistency in mortality profile extrapolation. [The extrapolated profiles q_{x1}(t) and q_{x2}(t), with x1 < x2, cross at some future year t*, beyond which q_{x1}(t) > q_{x2}(t).]

4.5.3 Some remarks

Comparing mortality profile extrapolations (i.e. the horizontal approach)


with law-based projections (i.e. the vertical and the diagonal approaches),
we note the following points. First, when the projection consists of a straight
extrapolation of the mortality profiles, inconsistencies may emerge as a
result of the extrapolation itself. For example, we may find that a future
calendar year t* exists such that, for t > t* and for some ages x1 and
x2, with x1 < x2, we have q_{x1}(t) > q_{x2}(t) (see Fig. 4.13), even at old
ages. Hence, appropriate adjustments may be required. Conversely, simple
calculation procedures have an advantage when extrapolating mortality
profiles.
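Such crossings are easy to detect mechanically; a sketch (the improvement rates are synthetic and deliberately inconsistent, so that the profiles cross):

```python
import numpy as np

def crossing_years(Q, years):
    """Years t at which the projected table {q_x(t)} (rows = increasing ages)
    is not increasing in age, i.e. q_{x1}(t) > q_{x2}(t) for some x1 < x2."""
    bad = np.diff(Q, axis=0) < 0
    return [int(t) for j, t in enumerate(years) if bad[:, j].any()]

# Two adjacent old ages whose extrapolated profiles cross: the younger age
# improves more slowly (0.5% per year) than the older one (2% per year)
years = np.arange(2000, 2101)
Q = np.vstack([0.060 * 0.995 ** (years - 2000),   # q_80(t)
               0.065 * 0.980 ** (years - 2000)])  # q_81(t)
bad_years = crossing_years(Q, years)
print(bad_years[0])   # → 2006, the first year with q_80(t) > q_81(t)
```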
A further disadvantage of mortality profile extrapolations is that they do
not ensure the representation of sensible future mortality scenarios. By
contrast, such outcomes can be rather easily produced by controlling the
behaviour of projected parameters in a law-based context (see, in particular,
Section 4.5.2).
As already noted in Section 4.5.1, law-based mortality projections lead to
a dramatic reduction in the dimension of the forecasting problem, namely in
the number of the degrees of freedom. However, the age-pattern of mortal-
ity can be summarized without resorting to mathematical laws (and hence
avoiding the choice of an appropriate mortality law). In particular, some
typical values, or markers (see Section 4.2.1), of the mortality pattern can
be used to this purpose; this aspect is dealt with in Section 4.6.2.
Finally, many authors note that the parameters of most mortality laws are
often strongly dependent, for example the B and c parameters in Makeham's
law (see (2.70)). Hence, univariate extrapolation (as in the vertical and
the diagonal approaches) may be misleading. Conversely, a multivariate
approach may provide a better representation of mortality trends, although
problems in computational tractability may arise.

4.5.4 Mortality graduation over age and time

As seen in the previous sections, the construction of projected quantities


(e.g. the one-year probabilities of death, or the force of mortality) is usually
worked out in two steps separately.
First, mortality tables are built up for various past calendar years and
possibly graduated, in particular using mathematical formulae, for exam-
ple, in order to obtain the force of mortality for each calendar year (see
Section 4.5.1).
Second, when no mortality law is involved, mortality profiles are analysed
in order to construct a formula for extrapolating probabilities of death.
Conversely, when a law-based projection model is used, the behaviour of the
parameters over time is analysed, in order to obtain formulae for parameter
extrapolation.
In conclusion, the construction of the projected mortality is performed
with respect to age and calendar year separately.
The above approach is computationally straightforward, in particular
thanks to the possibility of using well known techniques while performing
the first step. Despite this feature, recent research work has shown that mod-
els which incorporate (simultaneously) both the age variation in mortality
and the time trends in mortality have considerable advantages in terms of
goodness-of-fit and hence, presumably, in terms of forecast reliability.
Mortality projections based on models incorporating age variation and
time trends represent the surface approach to mortality forecasts (see
Fig. 4.14).
We focus on the so-called Gompertz–Makeham class of formulae,
denoted by GM(r, s) and defined in Section 2.5.1 (see (2.78)). Formulae of
the GM(r, s) type can be included in models allowing for mortality trends.
In this section, as an illustration, we introduce the model proposed by
Renshaw et al. (1996), implemented also by Sithole et al. (2000), and
Renshaw and Haberman (2003b), albeit in a modified form.
Consider the following model:
     
$$\mu_x(t) = \exp\left( \sum_{j=0}^{s} \beta_j L_j(\bar x) \right) \exp\left( \sum_{i=1}^{r} \left( \alpha_i + \sum_{j=1}^{s} \gamma_{ij} L_j(\bar x) \right) \bar t^{\,i} \right) \qquad (4.78)$$

Figure 4.14. The surface approach. [A single model Φ(x, t) is fitted to the whole age × calendar-year data array.]

with the proviso that some of the γij may be preset to 0. Lj (x̄) are Legendre
polynomials. The variables x̄ and t̄ are the transformed ages and trans-
formed calendar years, respectively, such that both x̄ and t̄ are mapped
onto [−1, +1]. Note that the first of the two multiplicative terms on the
right hand side is a graduation model GM(0, s + 1), while the second one
may be interpreted as an age-specific trend adjustment term (provided that
at least one of the γij is not preset to zero). Formula (4.78) has been pro-
posed by Renshaw et al. (1996) for modelling with respect to age and time,
noting that, for forecasting purposes, low values of r should be preferred –
that is, polynomials in t with a low degree.
A further implementation of this model has been carried out by Sithole
et al. (2000). Trend analysis of UK immediate annuitants’ and pensioners’
mortality experiences (provided by the CMIB) suggested the adoption of
the following particular formula (within the class of models (4.78)):
 
$$\mu_x(t) = \exp\left( \beta_0 + \sum_{j=1}^{3} \beta_j L_j(\bar x) + \left( \alpha_1 + \gamma_{11} L_1(\bar x) \right) \bar t \right) \qquad (4.79)$$

where we note that r = 1.


Moreover, the reduction factor R_x(t − t') related to the force of mortality
(rather than to the probabilities of death) has been addressed:

$$\mu_x(t) = \mu_x(t')\, R_x(t - t') \qquad (4.80)$$

where, as usual, t' is the base year for the mortality projection. From (4.79)
and (4.80) we obtain:

$$R_x(t - t') = \exp\left( (\alpha_1 + \gamma_{11} \bar x)\, \frac{t - t'}{w} \right) \qquad (4.81)$$

where w denotes half of the calendar year range for the investigation period.
Hence:

$$R_x(t - t') = \exp\left[ (a + b\, x)(t - t') \right] \qquad (4.82)$$

(with a < 0 and b > 0, which result from the fitting of the observed data).
Renshaw and Haberman (2003b) consider a regression-based forecasting
model of the following simple structure:

$$\ln m_x(t) = a_x + b_x (t - t') \qquad (4.83)$$

Then, introducing a reduction factor that is related to the central death rate
and interpreting the term ax as representing the central death rate for the
base year mx (t  ), we have that

$$m_x(t) = m_x(t')\, R_x(t - t') \qquad (4.84)$$

and
$$R_x(t - t') = \exp\left[ b_x (t - t') \right] \qquad (4.85)$$

Renshaw and Haberman (2003b) also experiment with a series of break-point
predictors (equivalent to linear splines) in order to model changes of
slope in the mortality trend that have been observed in the past data. With
one such term, the reduction factor would be

$$R_x(t - t') = \exp\left[ b_x (t - t') + b_x (t - t_0)_+ \right] \qquad (4.86)$$

where (t − t_0)_+ = t − t_0 for t > t_0, and 0 otherwise.
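A sketch of fitting the regression model (4.83) at a single age by least squares; the data below are synthetic, generated around a known trend, so the recovered slope can be checked:

```python
import numpy as np

# Synthetic central death rates m_x(t) at one age over 1980-2000, generated
# with noise around a log-linear trend with slope -0.02 (illustrative only)
years = np.arange(1980, 2001)
t_base = years[-1]                        # base year t'
rng = np.random.default_rng(0)
m_obs = 0.01 * np.exp(-0.02 * (years - t_base) + rng.normal(0.0, 0.01, years.size))

# Least-squares fit of ln m_x(t) = a_x + b_x (t - t'), eq. (4.83)
b_x, a_x = np.polyfit(years - t_base, np.log(m_obs), 1)

def R(s):
    """Reduction factor (4.85) for s = t - t' projection years."""
    return np.exp(b_x * s)

m_2010 = np.exp(a_x) * R(2010 - t_base)   # projected rate via (4.84)
assert -0.03 < b_x < -0.01                # slope recovered near its true value
```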

4.6 Other approaches to mortality projections


4.6.1 Interpolation versus extrapolation: the limit table

From Sections 4.3 and 4.5, it clearly emerges that a number of projection
methods are based on the extrapolation of observed mortality trends, pos-
sibly via the parameters of some mortality law. Important examples are
provided by formulae (4.11), (4.20), and (4.63). Athough it seems quite
natural that mortality forecasts are based on past mortality observations,
different approaches to the construction of projected tables can be adopted.

We suppose that an ‘optimal’ limiting life table can be assumed. The


relevant age-pattern of mortality is to be interpreted as the limit pattern to
which mortality improvements can lead. Let q̃x denote the limit probability
of death at age x, whereas qx (t  ) denotes the current mortality. Then, we
assume that the projected mortality qx (t) is expressed as follows:

$$q_x(t) = I\left[ \tilde q_x,\, q_x(t') \right] \qquad (4.87)$$

where the symbol I denotes some interpolation model.


Example 4.9 Adopting an exponential interpolation formula, we have:

$$q_x(t) = \tilde q_x + \left[ q_x(t') - \tilde q_x \right] r^{t-t'} \qquad (4.88)$$

with r < 1. Note that formula (4.20) can be easily linked to (4.88), choosing
α_x such that q_x(t') α_x = q̃_x.
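A sketch of the exponential interpolation (4.88) towards a limit table (the probabilities and the value of r are illustrative):

```python
def q_interp(q_limit, q_base, s, r=0.97):
    """Exponential interpolation (4.88) from the current probability
    q_x(t') towards the limit probability q~_x; s = t - t', r < 1."""
    return q_limit + (q_base - q_limit) * r ** s

# The projection starts at the current table and tends to the limit table
assert abs(q_interp(0.004, 0.010, 0) - 0.010) < 1e-12
assert abs(q_interp(0.004, 0.010, 500) - 0.004) < 1e-6
```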

Determining a limit table requires a number of assumptions about the


trend in various mortality causes, so that an analysis of mortality by causes
of death should be carried out as a preliminary step (see Section 4.8.2).

4.6.2 Model tables

As noted in Section 4.5.1, when a mortality law is used to fit mortality


experience, the age-pattern of mortality is summarized by some param-
eters. Then, the projection procedure can be applied to each parameter
(instead of each mortality profile), with a dramatic reduction in the dimen-
sion of the forecasting problem. However, the age-pattern of mortality can
be summarized without resorting to mathematical laws, and, in particu-
lar, some markers of the mortality pattern can be used to this purpose (see
Section 4.2.1). The possibility of summarizing the age-pattern of mortality
by using some markers underpins the use of model tables.
The first set of model tables was constructed in 1955 by the United
Nations. A number of mortality tables was chosen, with the aim of rep-
resenting the age-pattern of mortality corresponding to various degrees of
social and economic development, health status, etc. The set was indexed
on the expectation of life at birth, e̊_0, so that each table was summarized
by the relevant value of this marker.
Procedures based on model tables can be envisaged also for mortality
forecasts relating to a given population. With this objective in mind, we
choose a set of tables, representing the mortality in the population at several
epochs, and assumed to represent also future mortality for that population.

Figure 4.15. Model tables for mortality forecasts. (Diagram: in a given population, observed trends in markers are extrapolated; the projected markers are then entered into the set of model tables to obtain future life tables.)

Trends in some markers are analysed and then projected, possibly using
some mathematical formula, in order to predict their future values. Pro-
jected age-specific probabilities of death are then obtained by entering the
system of model tables for the various projected values of the markers. The
procedure is sketched in Fig. 4.15.

4.6.3 Projecting transforms of life table functions

A number of methods for mortality forecasts require that the projection
procedure starts from the analysis of trends in mortality, in terms of
one-year probabilities of death or other fundamental life table functions, such
as the force of mortality (in an age-continuous context) or the survival
function. An alternative approach is to use some transforms of life table
functions which may help us reach a better understanding of some features
of mortality trends. Two examples will be provided: the relational method
and the projection of the resistance function.
The relational method was proposed by Brass (1974), who focussed on
the logit transform of the survival function; see Section 2.7.
For the purpose of forecasting mortality, equation (2.107) can be used in
a dynamic sense. In a dynamic context, the Brass logit transform is partic-
ularly interesting when applied to cohort data, as the logits for successive
birth-year cohorts seem to be linearly related (see Pollard (1987)). Hence,

denoting by Λ(x, τ) the logit of the survival function, S(x, τ), for the cohort
born in the calendar year τ, we have:

Λ(x, τ) = (1/2) ln[(1 − S(x, τ))/S(x, τ)] (4.89)

Referring to a pair of birth years, τk and τk+1 , we assume

Λ(x, τk+1 ) = αk + βk Λ(x, τk ) (4.90)

So, the problem of projecting mortality reduces to the problem of
extrapolating the two series αk and βk . Projected values of various life table
functions can be derived from the inverse logit transform:

S(x, τ) = 1/(1 + exp[2Λ(x, τ)]) (4.91)

Figures 2.4–2.6 show how rectangularization and expansion phenomena,
in particular, can be represented by suitable choices of the parameters α and β.
Application of the Brass transform to cohort-based projections requires a
long sequence of mortality observations, in order to build up cohort survival
functions. Further, inconsistencies may appear, since the method does not
ensure that, for any year of birth τ, S(x1 , τ) > S(x2 , τ) for all pairs (x1 , x2 )
with x1 < x2 . So, negative values for mortality rates qx (τ + x) may follow,
and hence appropriate adjustments in the linear extrapolation procedure
are required.
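The transforms (4.89) and (4.91) and the linear relation (4.90) are straightforward to implement. In the following sketch the survival values and the coefficients αk, βk are illustrative assumptions (not fitted to any cohort), and the final check is of the monotonicity type discussed above.

```python
import math

def logit(S):
    """Brass logit of a survival function value, equation (4.89)."""
    return 0.5 * math.log((1.0 - S) / S)

def inv_logit(lam):
    """Inverse transform, equation (4.91)."""
    return 1.0 / (1.0 + math.exp(2.0 * lam))

def project_survival(S_cohort, alpha, beta):
    """One step of the relational method (4.90): map the survival
    function of cohort tau_k to that of cohort tau_{k+1} through a
    linear relation between their logits."""
    return [inv_logit(alpha + beta * logit(s)) for s in S_cohort]

# hypothetical survival values at a few increasing ages, for illustration
S_k = [0.99, 0.95, 0.80, 0.40, 0.05]
S_k1 = project_survival(S_k, alpha=-0.05, beta=1.0)
# a monotonicity check of the kind discussed in the text:
assert all(a > b for a, b in zip(S_k1, S_k1[1:]))
```

With β = 1 the logit mapping is strictly monotone, so the projected survival function stays decreasing in age; for other parameter combinations the check may fail, which is exactly the inconsistency noted above.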
A different transform of the survival function S(x) has been addressed
by Petrioli and Berti (see Petrioli and Berti (1979); see also Keyfitz (1982)).
The proposed transform is the resistance function, defined in Section 2.7 (see
(2.108)). The resistance function has been graduated with the formula:
ρ(x) = x^α (ω − x)^β exp(Ax^2 + Bx + C) (4.92)

and, in particular, with the three-parameter formula:

ρ(x) = k x^α (ω − x)^β (4.93)

Model tables have been constructed using combinations of the three
parameters, by focussing on the values of some markers. In a dynamic
context, the mortality trend is represented by assuming that (some of) the
parameters of the resistance function depend on the calendar year t. Thus,
referring to equation (4.93), we have:

ρ(x, t) = k(t) x^α(t) (ω − x)^β(t) (4.94)



Note that, when a model for the resistance function (see (4.92) and (4.93))
is assumed, the resulting projection model can be classified as an analytical
model, even though it does not directly address the survival function.
The Petrioli–Berti model has been used to project the mortality of the
Italian population, and then has been adopted by the Italian Association
of Insurers in order to build up projected mortality tables for life annuity
business.
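As a minimal sketch, the three-parameter form (4.93) and its dynamic counterpart (4.94) can be coded as follows; the limiting age ω and the parameter paths are illustrative assumptions, not values fitted to any population.

```python
def resistance(x, k, alpha, beta, omega=110.0):
    """Three-parameter resistance function, equation (4.93)."""
    return k * x ** alpha * (omega - x) ** beta

def resistance_dyn(x, t, k_path, alpha_path, beta_path, omega=110.0):
    """Dynamic version (4.94): the parameters depend on calendar year t
    through user-supplied functions (here purely illustrative)."""
    return resistance(x, k_path(t), alpha_path(t), beta_path(t), omega)

# illustrative (not fitted) trends in the parameters
rho_65 = resistance_dyn(
    65.0, 10,
    k_path=lambda t: 0.5,
    alpha_path=lambda t: 1.2 + 0.001 * t,
    beta_path=lambda t: 0.8,
)
```

In a real application the parameter paths k(t), α(t), β(t) would be estimated from a sequence of period tables and then extrapolated, in the same spirit as the law-based projections of Section 4.5.1.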

4.7 The Lee–Carter method: an introduction


4.7.1 Some preliminary ideas

In general, most of the projection formulae presented in the previous
sections do not allow for the stochastic nature of mortality. Actually, a
number of projection methods used in actuarial practice simply consist in
graduation–extrapolation procedures (see e.g. (4.11), (4.17), (4.63)).
A more rigorous approach to mortality forecasts should take into account
the stochastic features of mortality. In particular, the following points
should underpin a stochastic projection model:

– observed mortality rates are outcomes of random variables representing
past mortality;
– forecasted mortality rates are estimates of random variables representing
future mortality.

Hence, stochastic assumptions about mortality are required, that is, prob-
ability distributions for the random numbers of deaths, and a statistical
structure linking forecasts to observations must be specified (see Fig. 4.16).
In a stochastic framework, the results of the projection procedures
consist in

• Point estimates
• Interval estimates

of future mortality rates (see Fig. 4.17) and other life table functions.
Clearly, traditional graduation–extrapolation procedures, which do not
explicitly allow for randomness in mortality, produce just one numerical
value for each future mortality rate (or some other age-specific quantity).
Moreover, such values can hardly be interpreted as point estimates, because
of the lack of an appropriate statistical structure and model.

Figure 4.16. From past to future: a statistical approach. (Diagram: the sample of observed outcomes of the random mortality frequency up to time t′ is linked, by a model, to the probabilistic structure of the stochastic process whose paths represent the possible future outcomes.)

Figure 4.17. Mortality forecasts: point estimation vs interval estimation.

An effective graphical representation of randomness in future mortality is
given by the so-called fan charts; see Fig. 4.18, which refers to the projection
of the expected lifetime. The fan chart depicts a ‘central projection’ together
with some ‘prediction intervals’. The narrowest interval, namely the one
with the darkest shading, corresponds to a low probability prediction, say
10%, and is surrounded by prediction intervals with higher probabilities,
say 30%, 50%, etc. See also Section 5.9.4.
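The construction of a fan chart can be sketched by simulation. Assuming, purely for illustration, that the underlying index follows a random walk with drift (a model that reappears in Section 4.7.2), the central projection and the prediction intervals are read off the empirical distribution of the simulated paths; all parameter values below are hypothetical.

```python
import random

def simulate_fan(k0, drift, sigma, horizon, n_paths=2000, seed=1):
    """Simulate paths of a random walk with drift and summarise them,
    per future year, by a central projection (median) and a symmetric
    prediction interval: the ingredients of a fan chart."""
    rng = random.Random(seed)
    paths = [[0.0] * horizon for _ in range(n_paths)]
    for path in paths:
        k = k0
        for h in range(horizon):
            k += drift + rng.gauss(0.0, sigma)
            path[h] = k
    bands = []
    for h in range(horizon):
        vals = sorted(path[h] for path in paths)
        pick = lambda p: vals[int(p * (n_paths - 1))]  # empirical quantile
        bands.append({"central": pick(0.50), "lo": pick(0.05), "hi": pick(0.95)})
    return bands

bands = simulate_fan(k0=0.0, drift=-1.0, sigma=0.5, horizon=20)
# the intervals widen with the forecasting horizon, as in Fig. 4.18
```

Plotting nested intervals at several probability levels (10%, 30%, 50%, ...) with progressively lighter shading reproduces the fan-chart picture described above.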
The Lee–Carter (LC) method (see Lee and Carter (1992); Lee (2000))
represents a significant example of the stochastic approach to mortality
forecasts and constitutes one of the most influential proposals in recent
times. A number of generalizations and improvements have been proposed,
which follow and build on the basic ideas of the LC methodology.

Figure 4.18. Forecasting expected lifetime: fan chart. (Diagram: e65(t) with a central projection (point estimate) surrounded by prediction intervals, from time t′ onwards.)

4.7.2 The LC model

In order to represent the age-specific mortality we address the central death
rate. Let mx (t) denote the central death rate for age x at time t, and we
assume the following log-bilinear form:

ln mx (t) = αx + βx κt + εx,t (4.95)

where the αx ’s describe the age-pattern of mortality averaged over time,
whereas the βx ’s describe the deviations from the averaged pattern when
κt varies. The change in the level of mortality over time is described by
the (univariate) mortality index κt . Finally, the quantity εx,t denotes the
error term, with mean 0 and variance σ^2, reflecting particular age-specific
historical influences that are not captured by the model. Expression (4.95)
constitutes the starting point of the LC method.
It is worth stressing that the LC model differs from ‘parametric models’
(namely, mortality laws, see Section 2.5), because in (4.95) the depen-
dence on age is non-parametric and is represented by the sequences of αx ’s
and βx ’s.
The model expressed by (4.95) cannot be fitted by simple regression,
since there is no observable variable on its right-hand side. A least squares
solution can be found by using the first element of the singular value decom-
position. The parameter estimation is based on a matrix of available death
rates, and we note that the system implied by (4.95) is undetermined without
additional constraints. Lee and Carter (1992) propose the normalization
Σx βx = 1, Σt κt = 0, which in turn forces each αx to be an average of the
log-central death rates over calendar years.
Once the parameters αx , βx and κt are estimated, obtaining the estimates
α̂x , β̂x , κ̂t , mortality forecasting proceeds by modelling the values of κt as a time

series, for example, as a random walk with drift. Starting from a given year
t  , forecasted mortality rates are then computed, for t > t  , as follows:

mx (t) = exp(α̂x + β̂x κt ) = mx (t′) exp[β̂x (κt − κ̂t′ )] (4.96)

It is worth noting that mx (t) is modelled as a stochastic process, driven
by the stochastic process κt , from which interval estimates can be
computed for the projected values of mortality rates.
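A minimal first-stage fit and point forecast can be sketched as follows. For self-containedness, the leading singular vectors are obtained here by power iteration (a simple stand-in for the full singular value decomposition mentioned above), the model is fitted to noiseless synthetic data generated from known parameters, and the drift of κt is estimated by the elementary first-to-last-value estimator; none of these implementation choices comes from Lee and Carter (1992) itself.

```python
import math

def lc_fit(log_m):
    """First-stage Lee-Carter estimation: alpha_x is the average of the
    log rates over calendar years; beta_x and kappa_t come from the
    leading singular vectors of the centred matrix (here via power
    iteration), normalised so that sum(beta) = 1 and sum(kappa) = 0."""
    n_age, n_yr = len(log_m), len(log_m[0])
    alpha = [sum(row) / n_yr for row in log_m]
    Z = [[log_m[i][j] - alpha[i] for j in range(n_yr)] for i in range(n_age)]
    b = [1.0] * n_age
    for _ in range(200):  # power iteration on Z Z^T
        c = [sum(Z[i][j] * b[i] for i in range(n_age)) for j in range(n_yr)]
        b = [sum(Z[i][j] * c[j] for j in range(n_yr)) for i in range(n_age)]
        norm = math.sqrt(sum(v * v for v in b))
        b = [v / norm for v in b]
    kappa = [sum(Z[i][j] * b[i] for i in range(n_age)) for j in range(n_yr)]
    s = sum(b)
    beta = [v / s for v in b]        # now sum(beta) = 1
    kappa = [k * s for k in kappa]   # product beta*kappa unchanged
    return alpha, beta, kappa        # sum(kappa) = 0 by construction

def lc_forecast(alpha, beta, kappa, horizon):
    """Point forecast: kappa extrapolated as a random walk with drift
    (drift = average annual change), rates recovered as in (4.96)."""
    drift = (kappa[-1] - kappa[0]) / (len(kappa) - 1)
    return [[math.exp(a + b * (kappa[-1] + drift * h))
             for a, b in zip(alpha, beta)]
            for h in range(1, horizon + 1)]

# synthetic, noiseless log-rates built from a known structure (illustrative)
true_a, true_b = [-5.0, -4.0, -3.0], [0.2, 0.3, 0.5]
true_k = [4.0, 2.0, 0.0, -2.0, -4.0]
log_m = [[a + b * k for k in true_k] for a, b in zip(true_a, true_b)]
alpha, beta, kappa = lc_fit(log_m)
```

On this rank-one synthetic matrix the procedure recovers the generating parameters exactly, which is a convenient sanity check before applying it to real death rates.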

4.7.3 From LC to the Poisson log-bilinear model

The LC method implicitly assumes that the random errors are homoskedas-
tic. This assumption, which follows from the ordinary least squares estima-
tion method that is used as the main statistical tool, seems to be unrealistic,
as the logarithm of the observed mortality rate is much more variable at
older ages than at younger ages, because of the much smaller number of
deaths observed at old and very old ages.
In Brouhns et al. (2002b) and Brouhns et al. (2002a), possible improve-
ments of the LC method are investigated, using a Poisson random variation
for the number of deaths. This is instead of using the additive error term εx,t
in the expression for the logarithm of the central mortality rate (see (4.95)).
In terms of the force of mortality µx (t), the Poisson assumption means
that the random number of deaths at age x in calendar year t is given by

Dx (t) ∼ Poisson(ETRx (t) µx (t)) (4.97)

where ETRx (t) is the central number of exposed to risk. In order to define
the Poisson parameter ETRx (t) µx (t), Brouhns et al. (2002a) and Brouhns
et al. (2002b) assume a log-bilinear force of mortality, that is,

ln µx (t) = αx + βx κt (4.98)

hence with the structure expressed by (4.95), apart from the error term.
The meaning of the parameters αx , βx , κt is essentially the same as for
the corresponding parameters in the LC model. The parameters are then
determined by maximizing the log-likelihood based on (4.97) and (4.98).
Brouhns et al. (2002b) do not modify the time series part of the LC
method. Hence, the estimates α̂x and β̂x are used with the forecasted κt
in order to generate future mortality rates (as in (4.96)), as well as other
age-specific quantities.
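A common elementary way to maximize this Poisson log-likelihood is to update the αx, κt, and βx in turn by one-dimensional Newton steps, re-imposing the normalization after each sweep. The sketch below follows that scheme on synthetic data; it illustrates the approach and is not code taken from the cited papers.

```python
import math

def poisson_lc_fit(D, E, n_iter=500):
    """Maximum likelihood fit of the Poisson log-bilinear model:
    D_x(t) ~ Poisson(ETR_x(t) mu_x(t)),  ln mu_x(t) = alpha_x + beta_x kappa_t.
    Parameters are updated in turn by one-dimensional Newton steps
    (an elementary alternating scheme; a sketch, not production code)."""
    n_age, n_yr = len(D), len(D[0])
    alpha = [math.log(sum(D[i]) / sum(E[i])) for i in range(n_age)]
    beta = [1.0 / n_age] * n_age
    kappa = [j - (n_yr - 1) / 2.0 for j in range(n_yr)]  # non-constant start
    def fit(i, j):  # fitted number of deaths
        return E[i][j] * math.exp(alpha[i] + beta[i] * kappa[j])
    for _ in range(n_iter):
        for i in range(n_age):
            alpha[i] += (sum(D[i][j] - fit(i, j) for j in range(n_yr))
                         / sum(fit(i, j) for j in range(n_yr)))
        for j in range(n_yr):
            kappa[j] += (sum((D[i][j] - fit(i, j)) * beta[i] for i in range(n_age))
                         / sum(fit(i, j) * beta[i] ** 2 for i in range(n_age)))
        for i in range(n_age):
            beta[i] += (sum((D[i][j] - fit(i, j)) * kappa[j] for j in range(n_yr))
                        / sum(fit(i, j) * kappa[j] ** 2 for j in range(n_yr)))
        kbar = sum(kappa) / n_yr              # re-impose the normalization:
        alpha = [a + b * kbar for a, b in zip(alpha, beta)]
        kappa = [k - kbar for k in kappa]     # sum(kappa) = 0
        s = sum(beta)
        beta, kappa = [b / s for b in beta], [k * s for k in kappa]  # sum(beta) = 1
    return alpha, beta, kappa

# synthetic 'expected' deaths built from a known structure (illustrative only)
true_a, true_b = [-5.0, -4.0, -3.0], [0.2, 0.3, 0.5]
true_k = [4.0, 2.0, 0.0, -2.0, -4.0]
E = [[1000.0] * 5 for _ in range(3)]
D = [[E[i][j] * math.exp(true_a[i] + true_b[i] * true_k[j])
      for j in range(5)] for i in range(3)]
alpha, beta, kappa = poisson_lc_fit(D, E)
```

Because the synthetic deaths equal their expected values, the likelihood is maximized at the generating parameters, so the fitted forces of mortality should match the crude rates D/E essentially exactly after convergence.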

4.7.4 The LC method and model tables

An interesting example of projecting mortality patterns using the LC
method is provided by Buettner (2002). The LC method is used to project
mortality patterns on the basis of model tables that are indexed on the
expectation of life at birth e0 (see Section 4.6.2). Since model tables do not contain
any explicit time reference, the LC model has been implemented replacing
the time index κt with an index reflecting the level of life expectancy. Then,
the model is

ln mx (e) = αx + βx κe + εx,e (4.99)
where the parameter κe represents the trend in the level of life expectancy
at birth.

4.8 Further issues


In this section we address some issues concerning mortality forecasts, part of which
are, at least to some extent, beyond the main scope of this book, whereas
others will be developed in the following chapters.

4.8.1 Cohort approach versus period approach. APC models

First, consider the following projection model referred to the mortality odds
φx (t) = qx (t)/px (t):

φx (t) = φx (t′) r^(t−t′) (4.100)

where the first term on the right-hand side does not depend on t, whereas
the second term does not depend on x. Denoting the first term with A(x)
and the second term with B(t), equation (4.100) can be rewritten as follows:

φx (t) = A(x) B(t) (4.101)

Then, we consider the so-called K-K-K hypothesis (formulated in 1934 by
Kermack, McKendrick, and McKinlay), according to which the following
factorization is assumed:

µx (τ) = C(x) D(τ) (4.102)

where τ denotes, as usual, the year of birth of the cohort.
In projection model (4.101), the future mortality structure is split into:

– a factor A(x), expressing the age effect;
– a factor B(t), expressing the year of occurrence effect or period effect.

Conversely, in model (4.102) it is assumed that the future mortality
structure can be split into:

– a factor C(x), expressing the age effect;
– a factor D(τ), expressing the year of birth effect or cohort effect.
Recently, models including both the period effect and the cohort effect
(as well as the age effect) have been proposed. These models are commonly
called APC (Age-Period-Cohort) models. An APC model, referring to the
force of mortality, can be expressed as follows:

µx (t) = Q(x) R(t) S(t − x) (4.103)

(where t − x = τ) or, in logarithmic terms:

ln µx (t) = ln Q(x) + ln R(t) + ln S(t − x) (4.104)

A slightly modified version of (4.104), referring to central death rates (see
Willets (2004)), is as follows:

ln mx (t) = m + αx + βt + γt−x (4.105)

with finite sets for the values of x and t. Constraints are usually as follows:

Σx αx = Σt βt = Σt−x γt−x = 0 (4.106)

The model can be estimated using Poisson maximum likelihood, or
weighted least squares methods. However, no unique set of parameters
results in an optimal fit because of the trivial relation

cohort + age = period
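This identifiability problem can be made concrete with a small numerical experiment: because the cohort index t − x is determined by x and t, an arbitrary linear trend can be moved between the age, period, and cohort effects without changing a single fitted value. The effect values below are illustrative, and the shifts are centred so that the sums Σαx, Σβt, Σγ are left unchanged, showing that the constraints (4.106) cannot remove this indeterminacy.

```python
ages, years = range(3), range(3)
alpha = {x: 0.10 * x for x in ages}          # age effects (illustrative)
beta = {t: -0.20 * t for t in years}         # period effects (illustrative)
gamma = {c: 0.05 * c for c in range(-2, 3)}  # cohort effects, c = t - x

def log_rate(al, be, ga, x, t, m=0.0):
    """ln m_x(t) = m + alpha_x + beta_t + gamma_{t-x}, as in (4.105)."""
    return m + al[x] + be[t] + ga[t - x]

# Move a linear trend d between the three effects.  Centred shifts leave
# each effect's sum unchanged, and the intercept absorbs the remainder.
d = 0.7
xbar = sum(ages) / len(ages)
tbar = sum(years) / len(years)
cbar = sum(gamma) / len(gamma)
alpha2 = {x: alpha[x] + d * (x - xbar) for x in ages}
beta2 = {t: beta[t] - d * (t - tbar) for t in years}
gamma2 = {c: gamma[c] + d * (c - cbar) for c in gamma}
m2 = d * (xbar - tbar + cbar)
for x in ages:
    for t in years:
        assert abs(log_rate(alpha, beta, gamma, x, t)
                   - log_rate(alpha2, beta2, gamma2, x, t, m=m2)) < 1e-12
```

Since every fitted value is identical under the two parametrizations, the data alone cannot tell how much of a linear trend belongs to age, period, or cohort; any choice must come from an extra identifying assumption.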

Further weak points can be found in APC models like (4.102) and
(4.103). In particular, these models assume an age-independent period
effect, or an age-independent cohort effect, whereas the impact of mortality
improvements over time (or between cohorts) may vary with age.
As far as statistical evidence is concerned, both period and cohort effects
seem to impact on mortality improvements. In particular, it is reasonable
to suppose that period effects summarize contemporary factors, for example,
the general health status of the population, availability of healthcare ser-
vices, critical weather conditions, etc. Conversely, cohort effects quantify
historical factors, for example, World War II, diet, smoking habits, etc.
From a practical point of view, the main difficulty in implementing projec-
tion models allowing for cohort effects obviously lies in the fact that statist-
ical data for a very long period are required, and such data are rarely available.

Conversely, from a general point of view, the role of period and cohort effects
in quantifying factors that affect mortality improvements suggests that we
consider future likely scenarios and, in particular, causes of death.

4.8.2 Projections and scenarios. Mortality by causes

When projecting mortality, the collateral information available to the
forecaster can be allowed for. Information may concern, for example, trends
in smoking habits, trends in prevalence of some illness, improvements in
medical knowledge and surgery, etc. Thus, projections can be performed
according to an assumed scenario.
The introduction of relationships between causes (e.g. advances in medi-
cal science) and effects (mortality improvements) underpins mortality pro-
jections which are carried out according to assumed scenarios. Obviously,
some degree of arbitrariness follows, affecting the results.
The projection methods that we have described refer to mortality in
aggregate. Nonetheless, many of them can be used to project mortality
by different causes separately.
Projections by cause of death offer a useful insight into the changing
incidence of the various causes. Conversely, some important problems arise
when this type of projection is adopted. In particular, it should be stressed
that complex interrelationships exist among causes of death, whilst the
classic assumption of independence is commonly accepted. For example,
mortality from heart diseases and lung cancer are positively correlated, as
both are linked to smoking habits. A further problem concerns the difficult
identification of the cause of death for elderly people.
A final issue concerns a phenomenon arising in long-term projections by
cause of death: the projected overall mortality rate comes to be dominated by
the cause of death whose mortality rates are declining at the slowest rate.
For these reasons, many forecasters prefer to carry out mortality projec-
tions only in aggregate terms.

4.9 References and suggestions for further reading


4.9.1 Landmarks in mortality projections

4.9.1.1 The antecedents


As noted by Cramér and Wold (1935), the earliest attempt to project mor-
tality is probably due to the Swedish astronomer H. Gyldén. In a work

presented to the Swedish Assurance Association in 1875, he fitted a straight
line to the sequence of general death rates of the Swedish population
concerning the years 1750–1870, and then extrapolated the behaviour of the
general death rate. A similar graphical fitting was proposed in 1901 by T.
Richardt for sequences of the life annuity values a60 and a65 , calculated
according to various Norwegian life tables, and then projected via
extrapolation for application to pension plan calculations. Note that both
the proposals of Gyldén and Richardt concerned the projection of a
single-figure index.
Mortality trends and the relevant effects on life assurance and pension
annuities were clearly identified at the beginning of the 20th century, as wit-
nessed by various initiatives in the actuarial field. In particular, it is worth
noting that the subject ‘Mortality tables for annuitants’ was one of the top-
ics discussed at the 5th International Congress of Actuaries, held in Berlin
in 1906. Nordenmark (1906), for instance, pointed out that improvements
in mortality must be carefully considered when pricing life annuities and, in
particular, cohort mortality should be addressed to avoid underestimation
of the related liabilities. The 7th International Congress of Actuaries, held
in Amsterdam in 1912, included the subject ‘The course, since 1800, of the
mortality of assured persons’.
As Cramér and Wold (1935) note, a life table for annuities was
constructed in 1912 by A. Lindstedt, who used data from Swedish population
experience and, for each age x, extrapolated the sequence of annual prob-
abilities of death, namely the mortality profile qx (t), hence adopting a
horizontal approach. Probably, this work constitutes the earliest projection
of an age-specific function.

4.9.1.2 Early seminal contributions


Blaschke (1923) proposed a Makeham-based projected mortality model
(see Section 4.5.1). In particular he adopted a vertical approach, consisting
in the estimation of Makeham’s parameters for each period table based on
the experienced mortality, and then in fitting the estimated values. Hence,
projected values for the three parameters were obtained via extrapolation.
In 1924, the Institute of Actuaries in London proposed a horizontal
method for mortality projection (see Cramér and Wold (1935)), assum-
ing that probabilities of death are exponential functions of the calendar
year, from which comes the name ‘exponential model’ frequently used to
denote this approach to mortality projections. Various extrapolation for-
mulae used by UK actuaries in recent times for annuitants and pensioners
tables are particular cases of the early exponential model (see Section 4.3.8).

We now turn to the diagonal approach. In 1927 A. R. Davidson and A. R.
Reid proposed a Makeham-based projection model, in which Makeham’s
law refers to cohort mortality experiences. The relevant parameters were
estimated via a cohort graduation (see Reid and Davidson (1927)).
The use of Makeham-based projections is thoroughly discussed by
Cramér and Wold (1935), dealing with the graduation and extrapolation of
Swedish mortality. In particular, the vertical (i.e. period-based, see (4.63))
and the diagonal (i.e. cohort-based, see (4.64)) approaches are compared.
Let

µx (z) = γ(z) + α(z) β(z)^x

denote the force of mortality in both the vertical (with z = t) and the
diagonal (with z = t − x) approach. For the graduation of the parameters,
Cramér and Wold (1935) assumed that, in both the vertical and the diagonal
approach, α(z) is linear while ln β(z) and ln γ(z) are logistic.
The assumption formulated in 1934 by Kermack, McKendrick, and
McKinlay constitutes another example of the diagonal approach to mor-
tality projections. As Pollard (1949) notes, these authors showed that, for
some countries, it was reasonable to assume that the force of mortality
depended on the attained age x and the year of birth τ = t − x, and they
deduced that µx (t) = C(x) D(τ), where C(x) is a function of age only and
D(τ) is a function of the year of birth only; see also Section 4.8.1.

4.9.1.3 Some modern contributions


Seminal contributions to mortality modelling and mortality projections in
particular have been produced by demographers, throughout the latter half
of the 20th century. The ‘optimal’ life table, model tables and relational
methods probably constitute three of the most influential proposals in recent
times, in the framework of mortality analysis.
The idea of an ‘optimal’ table (see Section 4.6.1) was proposed by
Bourgeois-Pichat (1952). The question was: ‘can mortality decline indef-
initely or is there a limit, and if so, what is this limit?’ While a number
of projection methods are based on the extrapolation of observed mortal-
ity trends, focussing on optimal tables provides an alternative approach to
mortality forecasts, as an interpolation procedure between past data and
the limit table is required.
The possibility of summarizing the age-pattern of mortality by using some
markers underpins the use of ‘model tables’ in mortality projections (see
Section 4.6.2). Model tables were first constructed by the United Nations,

in 1955. Each table is summarized by the relevant value of the expectation
of life at birth.
A new path to mortality forecasting was paved by the ‘relational method’
proposed by W. Brass (see e.g. Brass (1974)), who focussed on the logit
transform of the survival function (see Section 2.7). A different transform
of the survival function, namely the ‘resistance function’, has been addressed
by Petrioli and Berti (1979); see also Keyfitz (1982). In a dynamic context,
the mortality trend is represented assuming that (some of) the parameters
of the resistance function depend on the calendar year t.

4.9.1.4 Recent contributions


In the last decades of the 1900s, various mortality law-based projection
models have been proposed. In particular, Forfar and Smith (1988) have fit-
ted the Heligman–Pollard curve to the graduated English life tables ELT1 to
ELT13, for both males and females, and then have analysed the behaviour of
the relevant parameters. Mortality projections have been performed assum-
ing that various parameters of the Heligman–Pollard law are functions of
the calendar year (see Benjamin and Soliman (1993) and Congdon (1993)
for examples).
In the 1990s, a new method for forecasting the age-pattern of mortal-
ity was proposed and then extended by L. Carter and R.D. Lee (see Lee
and Carter (1992) and Lee (2000)). The LC method addresses the central
death rate to represent the age-specific mortality (see Section 4.7.2). While
traditional projection models provide the forecaster with point estimates
of future mortality rates (or other age-specific quantities), the LC method
explicitly allows for random fluctuations in future mortality, representing
the related effect in terms of interval estimates. The LC methodology con-
stitutes one of the most influential proposals in recent times, in the field of
mortality projections. Indeed, much research work as well as many recent
applications to actuarial problems are directly related to this methodology
(for detailed references see Section 4.9.2).
Finally, frailty models in the context of mortality forecast have been
addressed by Butt and Haberman (2004) and Wang and Brown (1998).

4.9.2 Further references

There are a number of both theoretical and practical papers dealing with
mortality forecasts, produced by actuaries as well as by demographers. The
reader interested in various perspectives on forecasting mortality should
refer to Tabeau et al. (2001), and Booth (2006), in which a number of

approaches to mortality projections are discussed and several applications
are described. Interesting reviews on mortality forecast methods can be
found also in Benjamin and Pollard (1993), Benjamin and Soliman (1993),
National Statistics - Government Actuary’s Department (2001), Olshansky
(1988), Pollard (1987), and Wong-Fupuy and Haberman (2004).
Mortality projections via reduction factors represent a practical and
widely adopted approach to mortality forecast. As regards formulae used
by UK actuaries, the reader should refer to CMIB (1978, 1990, 1999).
Recent contributions to the modelling of reduction factors have been
given by Renshaw and Haberman (2000, 2003a), and Sithole et al.
(2000).
In the field of law-based mortality projections, Felipe et al. (2002) have
used the Heligman–Pollard law 2 for fitting and projecting mortality trends
in the Spanish population. Also more traditional mortality laws have been
used for analysing mortality trends and producing mortality forecasts. For
example, Barnett (1960) has analysed mortality trends through the param-
eters of a modified Thiele’s formula, whereas Buus (1960) has used the
Makeham law, focussing on the interdependence between the parameters.
Poulin (1980) has proposed a Makeham-based projection formula, whereas
Wetterstrand (1981) has used Gompertz’s law. Functions other than the
force of mortality can also be addressed. For example, Beard (1952) built
up a projection model by fitting a Pearson Type III curve to the curve
of deaths, and then taking some parameters (in particular the maximum
age) as functions of the year of birth. The Weibull law has been used
by Olivieri and Pitacco (2002a) and Olivieri (2005), in order to express,
via the relevant parameters, various assumptions about the expansion and
rectangularization of the survival function.
The use of a law-based approach to mortality forecasting is rather contro-
versial. For interesting discussions on this issue, the reader should consult
Keyfitz (1982) and Pollard (1987). Brouhns et al. (2002b) stress that
the estimated parameters are often strongly dependent. Hence, univariate
extrapolation of the parameters may be misleading, whereas a multivari-
ate time series for the parameters is theoretically possible but can lead
to computational intractability. Of course, a distribution-free approach to
mortality projections avoids these problems. Very important examples of
the distribution-free approach are provided by the LC model and several
models aiming to improve the LC methodology.
The practical use of projected tables deserves special attention, especially
when just one cohort table is actually adopted in pricing and reserving (see
Section 4.4.2). In particular, the optimal choice of the age-shifting function

(see Section 4.4.3) has been dealt with by Delwarde and Denuit (2006); see
also Chapter 3.
Considerable research work has been recently devoted to improving and
generalizing the LC methodology. In particular, the reader should refer to
Carter (1996), Alho (2000), Renshaw and Haberman (2003a, b, c), Brouhns
and Denuit (2002), Brouhns et al. (2002b). See also the list of references in
Lee (2000).
Among the extensions of the LC method, we note the following devel-
opments. Carter (1996) incorporates in the LC methodology uncertainty
about the estimated trend of mortality κt , through a specific model for the
trend itself. Renshaw and Haberman (2003c) have noted that the standard
LC methodology fails to capture and then project the recent upturn in crude
mortality rates in the age range 20–39 years. So, an extension of the LC
methodology is proposed, in order to incorporate in the LC model specific
age differential effects.
Booth et al. (2002) have developed systematic methods for choosing the
most appropriate subset of the data to use for modelling – the graduation
subset of Fig. 4.4. The importance of ensuring that the estimates α̂x and β̂x
are smooth with respect to age so that irregularities are not magnified via
extrapolations into the future has been discussed by Renshaw and Haber-
man (2003a), Renshaw and Haberman (2003c), De Jong and Tickle (2006),
and Delwarde et al. (2007).
A cause-of-death projection study was proposed by Pollard (1949), based
on Australian population data.
As regards scenario-based mortality forecasts, Gutterman and Van-
derhoof (1998) stress that a projection methodology should allow for
relationships between causes (e.g. advances in medical science) and effects
(mortality improvements).
5 Forecasting mortality: applications and examples of age-period models

5.1 Introduction
As explained in Chapter 4, actuaries working in life insurance and pensions
have been using projected life tables for some decades. But the problem
confronting actuaries is that people have been living much longer than they
were expected to according to the life tables being used for actuarial com-
putations. What was missing was an accurate estimation of the speed of the
mortality improvement: thus, most of the mortality projections performed
during the second half of the 20th century have underestimated the gains
in longevity. The mortality improvements seen in practice have quite con-
sistently exceeded the projected improvements. As a result, insurers have,
from time to time, been forced to allocate more capital to support their in-
force annuity business, with adverse effects on free reserves and profitability.
From the point of view of the actuarial approach to risk management, the
major problem is that mortality improvement is not a diversifiable risk. Tra-
ditional diversifiable mortality risk is the random variation around a fixed,
known life table. Mortality improvement risk, though, affects the whole
portfolio and thus cannot be managed using the law of large numbers
(see Chapter 7 for a detailed discussion of systematic and non-systematic
risks). In this respect, longevity resembles investment risk, in that it is
non-diversifiable: it cannot be controlled by the usual insurance mecha-
nism of selling large numbers of policies, because they are not independent
in respect of that source of uncertainty. However, longevity is different
from investment risk in that there are currently no large traded markets
in longevity risk so that it cannot easily be hedged. The reaction to this
problem is twofold. First, actuaries are trying to produce better models for
mortality improvement, paying more attention to the levels of uncertainty
involved in the forecasts. The second part of the reaction is to look to the

capital markets to share the risk, through the emergence of mortality-linked
derivatives or longevity bonds. This kind of securitization will be discussed
in Chapter 7.
As explained in the preceding chapter, there is a variety of statistical mod-
els used for mortality projection, ranging from the basic regression models,
in which age and time are viewed as continuous covariates, to sophisti-
cated robust non-parametric models. Mortality forecasting is a hazardous
yet essential enterprise for life annuity providers. This chapter examines the
problem in the favourable circumstances encountered in developed coun-
tries, where extensive historical data are often easily available. A statistical
model (in the form of a regression or a time series) is used to describe
historical data and extrapolate past trends to the future.
In this chapter, we first consider the log-bilinear projection model pio-
neered by Lee and Carter (1992) that has been introduced in Section 4.7.2.
The method describes the log of a time series of age-specific death rates as the
sum of an age-specific component that is independent of time and another
component that is the product of a time-varying parameter reflecting the
general level of mortality, and an age-specific component that represents
how rapidly or slowly mortality at each age varies when the general level
of mortality changes. This model is fitted to historical data. The resulting
estimate of the time-varying parameter is then modelled and projected as
a stochastic time series using standard Box-Jenkins or ARIMA methods.
From this forecast of the general level of mortality, the future death rates
are derived using the estimated age effects. The key difference between the
classical generalized linear regression model approach (see Section 4.5.4)
and the method pioneered by Lee and Carter (1992) centers on the interpretation of time, which in the log-bilinear approach is modelled as a factor
and under the generalized linear regression approach is modelled as a known
covariate.
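The two-stage logic just described can be sketched in a few lines. The sketch below is ours (in Python rather than the R tools used later in this book): the κt values are synthetic stand-ins for fitted estimates, and the random walk with drift is the ARIMA specification most often retained for the Lee–Carter time index.

```python
import numpy as np

# Synthetic stand-ins for fitted Lee-Carter mortality-index estimates kappa_t
# over 30 years; real values would come from the calibration in Section 5.2.2.
rng = np.random.default_rng(42)
kappa = -1.5 * np.arange(30) + rng.normal(0.0, 2.0, 30)

# Stage two: model kappa_t as a random walk with drift,
# kappa_t = kappa_{t-1} + theta + xi_t (ARIMA(0,1,0) with constant).
diffs = np.diff(kappa)
drift = diffs.mean()          # estimated drift theta
sigma = diffs.std(ddof=1)     # innovation standard deviation

# Point forecasts are linear in the horizon h; the forecast standard
# error widens like sqrt(h) around them.
h = np.arange(1, 21)
kappa_fc = kappa[-1] + drift * h
se_fc = sigma * np.sqrt(h)

# Future death rates would then follow as exp(alpha_x + beta_x * kappa_fc).
print(round(drift, 2), round(se_fc[-1], 2))
```

Because the drift dominates in the long run, Lee–Carter point forecasts of the general mortality level are essentially linear in the forecast horizon, with uncertainty bands widening like the square root of the horizon.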
The model proposed by Lee and Carter (1992) has now been widely
adopted. However, it is of course not the only candidate for extrapolat-
ing mortality to the future. It should be stressed that some models are
designed to project specific demographic indicators, and that the forecast
horizon may depend on the type of model. In this respect, the model pro-
posed by Lee and Carter (1992) is typically meant for long-term projections
of aggregate mortality indicators like life expectancies. It is not intended
to produce reliable forecasts of series of death rates for a particular age.
This is why this model is so useful for actuaries, who are interested in
life annuity premiums and reserves, which are weighted versions of life
expectancies (the weights being the financial discount factors). Some exten-
sions incorporating features specific to each cohort are proposed in the next
chapter.
In addition to the Lee–Carter model, we also consider a powerful alter-
native mortality forecasting method proposed by Cairns et al. (2006a). It
includes two time factors (whereas only one time factor drives the future
death rates in the Lee–Carter case) with a smoothing of age effects using a
logit transformation of one-year death probabilities. Specifically, the logit
of the one-year death probabilities is modelled as a linear function of age,
with intercept and slope parameters following some stochastic process.
Compared with the Lee–Carter approach, the Cairns–Blake–Dowd model
includes two time factors. This allows the model to capture the imperfect
correlation in mortality rates at different ages from one year to the next. This
approach can also be seen as a compromise between the generalized regres-
sion approach and the Lee–Carter views of mortality modelling, in that age
enters the Cairns–Blake–Dowd model as a continuous covariate whereas
the effect of calendar time is captured by a couple of factors (time-varying
intercept and slope parameters).
The Cairns–Blake–Dowd model is fitted to historical data. The resulting
estimates for the time-varying parameters are then projected using a bivari-
ate time series model. From this forecast of the future intercept and the
slope parameters, the future one-year death probabilities are computed in
combination with the linear age effect.
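As a rough numerical sketch of this structure (not the calibration procedure of Cairns et al. (2006a)): the data below are simulated from a known linear-in-age logit surface, and the two time factors are recovered by a per-year regression of the logit of the death probabilities on age. All parameter values and names are invented for illustration.

```python
import numpy as np

ages = np.arange(60, 91)              # ages 60..90
years = np.arange(2000, 2010)         # ten calendar years
rng = np.random.default_rng(1)

# Synthetic one-year death probabilities with a linear-in-age logit:
# logit q_x(t) = kappa1(t) + kappa2 * x, plus a little noise.
true_k1 = -10.0 + 0.02 * (years - 2000)   # time-varying intercept
true_k2 = 0.10                            # slope in age, held fixed here
logit_q = true_k1[None, :] + true_k2 * ages[:, None] \
          + rng.normal(0.0, 0.02, (ages.size, years.size))
q = 1.0 / (1.0 + np.exp(-logit_q))

# CBD-style fit: for each year, regress logit q on age;
# the intercepts give kappa1(t) and the slopes give kappa2(t).
X = np.column_stack([np.ones(ages.size), ages.astype(float)])
coef, *_ = np.linalg.lstsq(X, np.log(q / (1.0 - q)), rcond=None)
kappa1, kappa2 = coef[0], coef[1]

print(kappa2.round(2))  # each entry rounds to 0.1
```

The pair (κ1(t), κ2(t)) would then be projected with a bivariate time series model, as described above.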
Mortality forecasts performed by demographers are traditionally based
on the forecaster’s subjective judgements, in the light of historical data and
expert opinions. This traditional method has been widely used for official
mortality forecasts, and by international agencies. A range of uncertainty
is indicated by high and low scenarios (surrounding the medium scenario
which is meant to be the best estimate), which are also constructed through
subjective judgements.
In the hands of a skilled and knowledgeable forecaster, the traditional
method has the advantage of drawing on the full range of relevant infor-
mation for the medium forecast and the high–low range. However, it also
has certain deficiencies. First, mortality projections in industrialized coun-
tries have been found to under-predict mortality declines and gains in life
expectancy when compared to subsequent outcomes, as pointed out by Lee
and Miller (2001). Thus, a systematic downward bias has been observed
for this traditional approach during the 20th century. A second difficulty is
that it is not clear how to interpret the high–low range of a variable unless
a corresponding probability for the range is stated. We will come back to
this issue in Section 5.8.
Both the Lee–Carter and the Cairns–Blake–Dowd models greatly reduce
the role of subjective judgement since standard diagnostic and statistical
modelling procedures for time series analysis are followed. Nonetheless,
decisions must be taken about a number of elements of these models: for
example, how far back in history to begin, or exactly what time series model
to use.
It should be noted that the models investigated in this chapter do not
attempt to incorporate assumptions about advances in medical science or
specific environmental changes: no information other than previous history
is taken into account. The (tacit) underlying assumption is that all of the
information about the future is contained in the past observed values of
the death rates. This means that this approach is unable to forecast sudden
improvements in mortality due to the discovery of new medical treatments,
revolutionary cures including antibiotics, or public health innovations. Sim-
ilarly, future deteriorations caused by epidemics, the appearance of new
diseases or the aggravation of pollution cannot enter the model. The actuary has to keep this in mind when using the model and making decisions on the basis of the outputs, for example, in the setting of a reinsurance
programme.
Some authors have severely criticized the purely extrapolative approach
because it seems to ignore the underlying mechanisms of a social, economic
or biological nature. As pointed out by Wilmoth (2000), such a critique
is valid only insofar as such mechanisms are understood with sufficient
precision to offer a legitimate alternative method of prediction. Since our
understanding of the complex interactions of social and biological fac-
tors that determine mortality levels is still imprecise, we believe that the
extrapolative approach to prediction is particularly compelling in the case
of human mortality.
The R software has been found convenient to perform the analysis
described in this chapter (as well as those in Chapter 3). R is a free lan-
guage and environment for statistical computing and graphics. R is a GNU
project which is similar to the S language and environment which was devel-
oped at Bell Laboratories (formerly AT&T, now Lucent Technologies) by
John Chambers and colleagues. For more details, we refer the interested
reader to http://www.r-project.org/.
In addition to our own R code, we have benefitted from the demography
package for R created by Rob J. Hyndman, Heather Booth, Leonie Tickle,
and John Maindonald. This package contains functions for various demo-
graphic analyses. It provides facilities for demographic statistics, modelling
and forecasting. In particular, it implements the forecasting model pro-
posed by Lee and Carter (1992) and several variations of it, as well as the
forecasting model proposed by Hyndman and Ullah (2007).
Following the Crédit Suisse longevity index, based on the expectation of life derived from US data, the more comprehensive JPMorgan LifeMetrics has innovated by producing publicly available indices on population longevity.
LifeMetrics is a toolkit for measuring and managing longevity and mor-
tality risk. LifeMetrics advisors include Watson Wyatt and the Pensions
Institute at Cass Business School. LifeMetrics Index provides mortality rates
and period life expectancy levels across various ages, by gender, for each
national population covered. Currently the LifeMetrics Index publishes
index values for the United States, England & Wales, and The Netherlands.
All of the methodology, algorithms and calculations are fully disclosed and
open. The LifeMetrics toolkit includes a set of computer based models that
can be used in forecasting mortality and longevity. These models have been
evaluated in the research paper ‘A quantitative comparison of eight stochas-
tic mortality models using data from England & Wales and the United
States’ by Cairns et al. (2007). The R source code required to run the forecast
models is available for download along with a user guide.
We also mention two other resources which are available from the web
(but which were not used in the present book). Federico Girosi and Gary
King offer the YourCast software that makes forecasts by running sets of
linear regressions together in a variety of sophisticated ways. This open
source software is freely available from http://gking.harvard.edu/yourcast/.
It implements the methods introduced in Federico Girosi and Gary King’s
manuscript on Demographic Forecasting, to be published by Princeton
University Press.
Further, we note the recent initiative of the British CMIB (Continuous
Mortality Investigation Bureau), that is, the bureau affiliated to the UK actu-
arial profession, with the function of producing mortality tables for use by
insurers and pension plans. The CMIB has made available software running
on R with the aim of illustrating the P-Spline methodology for projecting
mortality. CMIB software now allows the fitting of the Lee-Carter model
as well, but with restricted ARIMA specifications. For more details, please
consult http://www.actuaries.org.uk.
Before embarking on the presentation of the Lee–Carter and the Cairns–
Blake–Dowd approaches, let us say a few words about the material not
included in the present chapter. First, we do not consider possible cohort
effects, and limit our analysis to the age and period dimensions. For coun-
tries like Belgium, cohort effects are weak enough and can be neglected.
However, for countries like the UK, cohort effects are significant and must
be accounted for. Chapter 6 is devoted to the inclusion of cohort effects in
the Lee–Carter and Cairns–Blake–Dowd models discussed here.
We also do not consider continuous-time models for mortality, which are
inherited from the interest rate management and credit risk literature. We
refer the reader to the works by Biffis and Millossovich (2006a), Biffis and
Millossovich (2006b), Biffis and Denuit (2006), and Biffis (2005) for more
information and further references about this approach. See also Chapter 7
in this book.

5.2 Lee–Carter mortality projection model


5.2.1 Specification

Lee and Carter (1992) proposed a simple model for describing the secu-
lar change in mortality as a function of a single time index. Throughout
this chapter, we assume that assumption (3.2) is fulfilled, that is, that the
age-specific mortality rates are constant within bands of age and time, but
allowed to vary from one band to the next. Recall that under (3.2), the force
of mortality µx (t) and the death rate mx (t) coincide.
Lee and Carter (1992) specified a log-bilinear form for the force of
mortality µx (t), that is,
$\ln \mu_x(t) = \alpha_x + \beta_x \kappa_t \quad (5.1)$
The specification (5.1) differs structurally from parametric models given
that the dependence on age is non-parametric, and represented by the
sequences of αx ’s and βx ’s. Interpretation of the parameters is quite simple:
exp αx is the general shape of the mortality schedule and the actual forces
of mortality change according to an overall mortality index κt modulated
by an age response βx (the shape of the βx profile tells which rates decline
rapidly and which slowly over time in response to changes in κt). The param-
eter βx represents the age-specific patterns of mortality change. It indicates
the sensitivity of the logarithm of the force of mortality at age x to varia-
tions in the time index κt . In principle, βx could be negative at some ages
x, indicating that mortality at those ages tends to rise when falling at other
ages. In practice, this does not seem to happen over the long-run, except
sometimes at the very oldest ages. There is also some evidence of negative βx
estimates for males at young adult ages in certain industrialized countries.
This has been attributed to an increase in mortality due to AIDS in the late
1980s and 1990s.
In a typical population, age-specific death rates have a strong tendency
to move up and down together over time. The specification (5.1) uses this
tendency by modelling the changes over time in age-specific death rates as
driven by a scalar factor κt . This strategy implies that the modelled death
rates are perfectly correlated across ages, which is the strength but also the
weakness of the approach. As pointed out by Lee (2000), the rates of decline
in the ln µx (t)’s at different ages are given by βx (κt − κt−1 ) so that they
always maintain the same ratio to one another over time. In practice, the
relative speed of decline at different ages may vary. In such a case, the
extended version of the Lee–Carter model introduced by Booth et al. (2002)
– see equation (5.14) – or the Cairns–Blake–Dowd approach might be
preferable.

Remark Hyndman and Ullah (2007) extend the principal components


approach by adopting a functional data paradigm combined with non-
parametric smoothing (penalized regression splines) and robust statistics.
Univariate time series are then fitted to each component coefficient (or level
parameter). The Lee-Carter method then appears to be a particular case of
this general approach. 

Remark Many models produce projected death rates that tend to 0. Hence,
some constraint should be imposed on the long-term behaviour of the
death rates. In that respect, limit life tables that have been discussed in
Section 4.6.1 may be specified, or we can use a forecast that incorporates
a theoretical maximum achievable life expectancy. This feature implies a
slowdown in the rate of mortality decline as the theoretical maximum life
expectancy is reached. If we denote by $\mu_x^{\infty}$ the limiting force of mortality, the model becomes $\ln(\mu_x(t) - \mu_x^{\infty}) = \alpha_x + \beta_x \kappa_t$.

Remark Considering the global convergence in mortality levels, and the


common trends evidenced in Section 3.5 of Chapter 3, it may seem appro-
priate to prepare mortality forecasts for individual national populations in
tandem with one another. Li and Lee (2005) have modified the original pro-
jection model of Lee and Carter (1992) for producing mortality forecasts
for a group of populations. To this end, the central tendencies for the group
are first identified using a common factor approach, and national historical
particularities are then taken into account.
Note that the most direct application of this approach is to forecast mortality for the two sexes within a single population. The same βx and κt can be used for
both males and females, letting the αx ’s depend on gender, as in Li and Lee
(2005). Alternatively, Carter and Lee (1992) used the same κt ’s for males
and females but allowed the αx ’s and βx ’s to be gender-specific.
Delwarde et al. (2006) have analysed the pattern of mortality decline in
the G5 countries (France, Germany, Japan, UK, and USA). Each G5 country
is viewed as the value of a covariate. This model allows us to analyse the
level and age pattern of mortality by country, the general time pattern of
mortality change, and the speed and age pattern of mortality change by
country. As for the Lee–Carter model, the extrapolation of estimates κ̂t
gives future mortality rates for given gender, age, time, and country. The
main interest of this method lies in the estimation of a unique time series
(or two if each gender is treated separately) which gives mortality rates for
all countries and age-time categories.
As expected, the analysis conducted by Delwarde et al. (2006) reveals
that age is the most important factor determining mortality rate. The time
effect is more relevant than the country effect if weights are taken into
account, which is a sign of convergence. In other words, the time horizon
is more important than the country, but since the country effect is not neg-
ligible, the differences between country-specific death rates increase with
time. These results allow us to compare the mortality experience observed
in the G5 countries through the same model and also to produce forecasts.
An estimated average death rate and a common index of mortality decline
can be obtained from the analysis, which is essential for economists. Most
financial and insurance decisions are taken on the basis of a worldwide
view, rather than a regional or particular one. From this analysis,
one can obtain baseline mortality forecasts from the pooled G5 popu-
lation, but at the same time, one can see the influence of each gender,
age, time trend, and country on the mortality forecast. In this way, the
observed past behaviour of the G5 is summarized in a single model and
the identification and comparison of each country specific effect become
much easier. 

5.2.2 Calibration

5.2.2.1 Identifiability constraints


Let us assume that we have observed data for a set of calendar years t =
t1 , t2 , . . . , tn and for a set of ages x = x1 , x2 , . . . , xm . On the basis of these
observations, we would like to estimate the corresponding αx ’s, βx ’s, and
κt's. However, this is not possible unless we impose additional constraints.
In (5.1), the αx parameters can only be identified up to an additive
constant, the βx parameters can only be identified up to a multiplicative
constant, and the κt parameters can only be identified up to a linear trans-
formation. Precisely, if we replace βx with cβx and κt with κt/c for any c ≠ 0,
or if we replace αx with αx − cβx and κt with κt + c for any c, we obtain
the same values for the death rates. This means that we cannot distinguish
between the two parametrizations: different values of the parameters pro-
duce the same mx (t)’s. To see that two constraints are needed to ensure
identification, note that if (5.1) holds true, we also have

$\ln \mu_x(t) = \tilde{\alpha}_x + \tilde{\beta}_x \tilde{\kappa}_t \quad (5.2)$

with $\tilde{\alpha}_x = \alpha_x + c_1 \beta_x$, $\tilde{\beta}_x = \beta_x / c_2$, and $\tilde{\kappa}_t = c_2(\kappa_t - c_1)$. Therefore, we need to
impose two constraints on the parameters αx , βx , and κt in order to prevent
the arbitrary selection of the parameters c1 and c2 .
A pair of additional constraints are thus required on the parameters for
estimation to circumvent this problem. To some extent, the choice of the
constraints is a subjective one, although some choices are more natural than
others. In the literature, the parameters in (5.1) are usually subject to the
constraints

$\sum_{t=t_1}^{t_n} \kappa_t = 0 \quad \text{and} \quad \sum_{x=x_1}^{x_m} \beta_x = 1 \quad (5.3)$

ensuring model identification. Under this normalization, βx is the propor-


tion of change in the overall log mortality attributable to age x. We also
note that other sets of constraints can be found in the literature, for instance, $\kappa_{t_n} = 0$ or $\sum_{x=x_1}^{x_m} \beta_x^2 = 1$.

Note that the lack of identifiability of the Lee-Carter model is not a real
problem. It just means that the likelihood associated with the model has an
infinite number of equivalent maxima, each of which would produce identi-
cal forecasts. Adopting the constraints (5.3) consists in picking one of these
equivalent maxima. The important point is that the choice of constraints
has no impact on the quality of the fit, or on forecasts of mortality. Some
care is needed, however, in any bootstrap procedures used for simulation
(see Section 5.8).
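The invariance described in (5.2) is easy to verify numerically; the parameter values below are arbitrary, chosen only to show that the two parametrizations produce identical log-rates.

```python
import numpy as np

# Arbitrary parameter values; c1 and c2 are the constants appearing in (5.2).
rng = np.random.default_rng(0)
alpha = rng.normal(-4.0, 1.0, 5)
beta = rng.uniform(0.1, 0.3, 5)
kappa = rng.normal(0.0, 3.0, 8)

log_mu = alpha[:, None] + beta[:, None] * kappa[None, :]

c1, c2 = 2.5, -1.7
alpha_t = alpha + c1 * beta          # alpha-tilde
beta_t = beta / c2                   # beta-tilde
kappa_t = c2 * (kappa - c1)          # kappa-tilde
log_mu_t = alpha_t[:, None] + beta_t[:, None] * kappa_t[None, :]

print(np.allclose(log_mu, log_mu_t))  # → True: the parametrizations coincide
```

Since different parameter vectors yield exactly the same fitted surface, the constraints (5.3) are needed only to pin down one representative, not to improve the fit.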

5.2.2.2 Least-squares estimation


Statistical model The model classically used to estimate the αx ’s, βx ’s, and
κt's is

$\ln \hat{m}_x(t) = \alpha_x + \beta_x \kappa_t + \epsilon_x(t) \quad (5.4)$

for x = x1, x2, . . . , xm and t = t1, t2, . . . , tn, where m̂x(t) denotes the
observed force of mortality at age x during year t computed according to
(3.13), and where the εx(t)'s are homoskedastic centered error terms. The error term εx(t), with mean 0 and variance σ², reflects any particular age-
specific historical influences that are not captured in the model. Note that
the errors have the same variance over age, which is sometimes a question-
able assumption: the logarithm of the observed force of mortality is usually
much more variable at the older ages than at the younger ages because of
the much smaller absolute number of deaths at the older ages. However,
if the mortality surface has been previously completed (i.e. extrapolated to
the oldest ages using a parametric model), the homoskedasticity assump-
tion is not a problem provided that the actuary restricts the age range for
modelling to 50 and over, say, in order to avoid the instability around the
accident hump.
It is worth mentioning that model (5.4) is not a simple regression model,
since there are no observed quantities on the right-hand side. Specifically,
age x and calendar time t are treated as factors and the effect on mortality
is quantified by the sequences αx1 , αx2 , . . . , αxm and βx1 , βx2 , . . . , βxm for
age, and by the sequence κt1 , κt2 , . . . , κtn for calendar time. Note that the
model (5.4) is particularly useful when the actuary has only a set of death rates m̂x(t) at his disposal. In the case where more detailed information
is available, the Poisson approach described in the next section makes an
effective use of observations of death counts and exposure-to-risk.

Objective function The model (5.4) is fitted to a matrix of age-specific
observed forces of mortality using singular value decomposition. Specifically, the α̂x's, β̂x's, and κ̂t's are such that they minimize

$O_{LS}(\alpha, \beta, \kappa) = \sum_{x=x_1}^{x_m} \sum_{t=t_1}^{t_n} \left( \ln \hat{m}_x(t) - \alpha_x - \beta_x \kappa_t \right)^2 \quad (5.5)$

This is equivalent to maximum likelihood estimation provided that the εx(t)'s obey the Normal distribution.
Remark Wilmoth (1993) suggested a weighted least-squares procedure for
estimating the (α, β, κ) parameters. Specifically, the objective function (5.5)
is replaced with

$O_{WLS}(\alpha, \beta, \kappa) = \sum_{x=x_1}^{x_m} \sum_{t=t_1}^{t_n} w_{xt} \left( \ln \hat{m}_x(t) - \alpha_x - \beta_x \kappa_t \right)^2 \quad (5.6)$

Empirical studies reveal that using the observed dxt ’s as weights (i.e.
wxt = dxt) has the effect of bringing the parameter estimates into close
agreement with the Poisson-response-based estimates (discussed below).
However, the choice of the death counts as weights is questionable, and
the Poisson maximum likelihood approach described in the next section
has better statistical properties, and should therefore be preferred for infer-
ence purposes. The reason is that a valid weighted least-squares approach
must use exogenous weights, but obviously the number of deaths is a ran-
dom variable. As such, estimates resulting from the minimization of OWLS
have no known statistical properties and can be strongly biased. 
Effective computation: Singular value decomposition Setting $\partial O_{LS} / \partial \alpha_x$ equal to 0 yields

$\sum_{t=t_1}^{t_n} \ln \hat{m}_x(t) = (t_n - t_1 + 1)\,\alpha_x + \beta_x \sum_{t=t_1}^{t_n} \kappa_t \quad (5.7)$

Since $\sum_{t=t_1}^{t_n} \kappa_t = 0$ by the constraint (5.3), we get

$\hat{\alpha}_x = \frac{1}{t_n - t_1 + 1} \sum_{t=t_1}^{t_n} \ln \hat{m}_x(t) \quad (5.8)$

The minimization of (5.5) thus consists in taking for α̂x the row average of the ln m̂x(t)'s. When the model (5.4) is fitted by ordinary least-squares, the fitted value of αx exactly equals the average of ln m̂x(t) over time t, so that exp α̂x represents the general shape of the mortality schedule. We then obtain the β̂x's and κ̂t's from the first term of a singular value decomposition of the matrix ln m̂x(t) − α̂x.
Specifically, death rates can be combined to form a matrix

$M = \begin{pmatrix} m_{x_1}(t_1) & \cdots & m_{x_1}(t_n) \\ \vdots & \ddots & \vdots \\ m_{x_m}(t_1) & \cdots & m_{x_m}(t_n) \end{pmatrix} \quad (5.9)$
of dimension (xm − x1 + 1) × (tn − t1 + 1). Model (5.1) is then fitted so that
it reproduces M as closely as possible. Now, let us create the matrix

$Z = \ln \hat{M} - \hat{\alpha} = \begin{pmatrix} \ln \hat{m}_{x_1}(t_1) - \hat{\alpha}_{x_1} & \cdots & \ln \hat{m}_{x_1}(t_n) - \hat{\alpha}_{x_1} \\ \vdots & \ddots & \vdots \\ \ln \hat{m}_{x_m}(t_1) - \hat{\alpha}_{x_m} & \cdots & \ln \hat{m}_{x_m}(t_n) - \hat{\alpha}_{x_m} \end{pmatrix} \quad (5.10)$
of dimension (xm −x1 +1)×(tn −t1 +1). Approximating the zxt ’s with their
Lee–Carter expression βx κt amounts to assuming the absence of age-time interactions, that is, the βx's are fixed over time and the κt's are fixed
over ages. Most data sets do not comply with the time-invariance of the
βx ’s, unless the optimal fitting period has been selected as explained below.
Now, the β̂x's and κ̂t's are such that they minimize

$\tilde{O}_{LS}(\beta, \kappa) = \sum_{x=x_1}^{x_m} \sum_{t=t_1}^{t_n} \left( z_{xt} - \beta_x \kappa_t \right)^2 \quad (5.11)$

The solution is given by the singular value decomposition of Z. More precisely, let us define the square matrices $Z^T Z$ of dimension $(t_n - t_1 + 1) \times (t_n - t_1 + 1)$ and $ZZ^T$ of dimension $(x_m - x_1 + 1) \times (x_m - x_1 + 1)$. Let $u_1$ be the eigenvector corresponding to the largest eigenvalue of $Z^T Z$. Let $v_1$ be the corresponding eigenvector of $ZZ^T$. The best approximation of Z in the least-squares sense is known to be

$Z \approx Z^{(1)} = \lambda_1 v_1 u_1^T \quad (5.12)$

from which we deduce

$\hat{\beta} = \frac{v_1}{\sum_{j=1}^{x_m - x_1 + 1} v_{1j}} \quad \text{and} \quad \hat{\kappa} = \lambda_1 \left( \sum_{j=1}^{x_m - x_1 + 1} v_{1j} \right) u_1 \quad (5.13)$

provided that $\sum_{j=1}^{x_m - x_1 + 1} v_{1j} \neq 0$. The constraints (5.3) are then satisfied by the β̂x's and κ̂t's. Note that the second and higher terms of the singular value decomposition together comprise the residuals. Typically, for low mortality populations, the first order approximation (5.12) behind the Lee–Carter model accounts for about 95% of the variance of the ln m̂x(t)'s.
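A minimal sketch of this estimation route on synthetic data (variable names are ours): α̂x as row averages per (5.8), then β̂ and κ̂ from the first term of the SVD per (5.13).

```python
import numpy as np

# Synthetic log death rates: a Lee-Carter surface plus small noise.
rng = np.random.default_rng(7)
n_ages, n_years = 41, 30
alpha_true = np.linspace(-6.0, -2.0, n_ages)
beta_true = np.full(n_ages, 1.0 / n_ages)
kappa_true = np.linspace(15.0, -15.0, n_years)
log_m = alpha_true[:, None] + np.outer(beta_true, kappa_true) \
        + rng.normal(0.0, 0.01, (n_ages, n_years))

# (5.8): alpha_hat is the row average of the observed log rates.
alpha_hat = log_m.mean(axis=1)

# (5.13): beta_hat and kappa_hat from the first term of the SVD of Z.
Z = log_m - alpha_hat[:, None]
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
v1, u1, lambda1 = U[:, 0], Vt[0, :], s[0]

beta_hat = v1 / v1.sum()              # satisfies sum(beta) = 1
kappa_hat = lambda1 * v1.sum() * u1   # satisfies sum(kappa) = 0

# Share of the variance of Z captured by the rank-1 approximation (5.12).
explained = s[0] ** 2 / (s ** 2).sum()
print(explained > 0.99)  # → True on this nearly rank-1 surface
```

Note that numpy's SVD returns the left singular vectors (the book's v1) as columns of U and the right singular vectors (u1) as rows of Vt; the formulas in (5.13) absorb the arbitrary sign of the singular vectors automatically.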
Remark As pointed out by Booth et al. (2002), the original approach by
Lee and Carter (1992) makes use of only the first term of the singular value
decomposition of the matrix of centered log death rates. In principle, the
second-and higher-order terms could be incorporated in the model. The full
expanded model is

$\ln \hat{m}_x(t) = \alpha_x + \sum_{j=1}^{r} \beta_x^{[j]} \kappa_t^{[j]} \quad (5.14)$

where r is the rank of the $\ln \hat{m}_x(t) - \alpha_x$ matrix. In this case, $\beta_x^{[j]} \kappa_t^{[j]}$ is referred
to as the jth order term of the approximation. Any systematic variation
in the residuals from fitting only the first term would be captured by the
second and higher terms. In their empirical illustration, Booth et al. (2002)
find a diagonal pattern in the residuals that was interpreted as a cohort-
period effect. We will come back to the modelling of cohort effects in the
next chapter. Brouhns et al. (2002b) have tested whether the inclusion of
a second log-bilinear term significantly improves the quality of the fit, and
this was not the case in their empirical illustrations.
Renshaw and Haberman (2003a) report on the failure of the first-order
Lee–Carter model to capture important aspects of the England and Wales
mortality experience (despite explaining about 95% of the total variance)
together with the presence of noteworthy residual patterns in the second-
order term. As a consequence, Renshaw and Haberman (2003b) have
investigated the feasibility of constructing mortality forecasts on the basis
of the first two sets of SVD vectors, rather than just on the first set of such
vectors, as in the Lee–Carter approach. Whereas Renshaw and Haberman
(2003b) have applied separate univariate ARIMA processes to the first two
period components, Renshaw and Haberman (2005) have used a bivariate
time series. 
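Retaining r terms as in (5.14) requires only a small change to the SVD step. The function below is an illustrative sketch (names are ours), applied to a synthetic surface built from exactly two components.

```python
import numpy as np

# Illustrative rank-r fit (5.14): keep the first r SVD terms of the centred matrix.
def lee_carter_rank_r(log_m, r):
    alpha = log_m.mean(axis=1)                       # (5.8)
    Z = log_m - alpha[:, None]
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    # The j-th order term beta^[j]_x kappa^[j]_t corresponds to s_j U[:, j] Vt[j, :].
    fit = alpha[:, None] + (U[:, :r] * s[:r]) @ Vt[:r, :]
    explained = (s[:r] ** 2).sum() / (s ** 2).sum()  # share of variance captured
    return fit, explained

# A synthetic surface with two genuine components: a dominant common trend
# and a weaker second term (a stylized residual pattern).
x, t = np.arange(20.0), np.arange(15.0)
log_m = (-5.0 + 0.05 * x)[:, None] + np.outer(np.full(20, 0.05), -t) \
        + np.outer(np.sin(x / 3.0), 0.1 * np.cos(t))

fit1, ex1 = lee_carter_rank_r(log_m, 1)
fit2, ex2 = lee_carter_rank_r(log_m, 2)
print(ex1 < ex2, np.allclose(fit2, log_m))  # → True True
```

On this rank-2 surface the first term alone leaves a systematic residual, while two terms reproduce the data exactly, mirroring the residual-pattern findings of Booth et al. (2002) and Renshaw and Haberman (2003a).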

Effective computation: Newton–Raphson The estimates of the parameters αx, βx, and κt can also be obtained recursively using a Newton–Raphson algorithm, avoiding the singular value decomposition.
The system to solve in order to obtain the estimated values of the param-
eters αx , βx and κt is obtained by equating to 0 the partial derivative of
OLS (α, β, κ) given in (5.5) with respect to αx , κt and βx , that is,

$0 = \sum_{t=t_1}^{t_n} \left( \ln \hat{m}_x(t) - \alpha_x - \beta_x \kappa_t \right), \qquad x = x_1, x_2, \ldots, x_m$

$0 = \sum_{x=x_1}^{x_m} \beta_x \left( \ln \hat{m}_x(t) - \alpha_x - \beta_x \kappa_t \right), \qquad t = t_1, t_2, \ldots, t_n \quad (5.15)$

$0 = \sum_{t=t_1}^{t_n} \kappa_t \left( \ln \hat{m}_x(t) - \alpha_x - \beta_x \kappa_t \right), \qquad x = x_1, x_2, \ldots, x_m$

Each of these equations is of the form f (ξ) = 0, where ξ is one of the
parameters αx , βx , and κt .
The idea is to update each parameter in turn using a univariate Newton-
Raphson recursive scheme. Starting from some initial value ξ (0) , the (k+1)th
iteration gives ξ^{(k+1)} from ξ^{(k)} by

$\xi^{(k+1)} = \xi^{(k)} - \frac{f(\xi^{(k)})}{f'(\xi^{(k)})}$
Each time one of the Lee–Carter parameters αx , βx and κt is updated, the
already revised values of the other parameters are used in the iterative
formulas. The recurrence relations are thus as follows:

$\hat{\alpha}_x^{(k+1)} = \hat{\alpha}_x^{(k)} + \frac{\sum_{t=t_1}^{t_n} \left( \ln \hat{m}_x(t) - \hat{\alpha}_x^{(k)} - \hat{\beta}_x^{(k)} \hat{\kappa}_t^{(k)} \right)}{t_n - t_1 + 1}$

$\hat{\kappa}_t^{(k+1)} = \hat{\kappa}_t^{(k)} + \frac{\sum_{x=x_1}^{x_m} \hat{\beta}_x^{(k)} \left( \ln \hat{m}_x(t) - \hat{\alpha}_x^{(k+1)} - \hat{\beta}_x^{(k)} \hat{\kappa}_t^{(k)} \right)}{\sum_{x=x_1}^{x_m} \left( \hat{\beta}_x^{(k)} \right)^2} \quad (5.16)$

$\hat{\beta}_x^{(k+1)} = \hat{\beta}_x^{(k)} + \frac{\sum_{t=t_1}^{t_n} \hat{\kappa}_t^{(k+1)} \left( \ln \hat{m}_x(t) - \hat{\alpha}_x^{(k+1)} - \hat{\beta}_x^{(k)} \hat{\kappa}_t^{(k+1)} \right)}{\sum_{t=t_1}^{t_n} \left( \hat{\kappa}_t^{(k+1)} \right)^2}$
This alternative to singular value decomposition does not require a rect-
angular array of data (it suffices to let the summation indices range over
the available observations). Further, estimation can proceed in the presence
of empty cells, as these would receive a zero weight and are then simply
excluded from the computations.
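The recurrences (5.16) can be sketched as follows. The vectorized updates below (names are ours) process one parameter block at a time, using the already-revised values of the other blocks; the data are synthetic, and the missing-cell handling mentioned above is not implemented here.

```python
import numpy as np

# Sketch of the elementwise updates (5.16): alpha, then kappa, then beta,
# each computed from the freshest residuals.
def fit_lc_newton(log_m, n_iter=50):
    n_ages, n_years = log_m.shape
    alpha = log_m.mean(axis=1)            # sensible starting values
    beta = np.full(n_ages, 1.0 / n_ages)
    kappa = np.zeros(n_years)
    for _ in range(n_iter):
        resid = log_m - alpha[:, None] - np.outer(beta, kappa)
        alpha = alpha + resid.mean(axis=1)                  # alpha_x update
        resid = log_m - alpha[:, None] - np.outer(beta, kappa)
        kappa = kappa + beta @ resid / (beta ** 2).sum()    # kappa_t update
        resid = log_m - alpha[:, None] - np.outer(beta, kappa)
        beta = beta + resid @ kappa / (kappa ** 2).sum()    # beta_x update
    return alpha, beta, kappa

# On noiseless log-bilinear data the scheme recovers an exact fit.
x, t = np.arange(10), np.arange(8)
log_m = (-4.0 - 0.03 * x)[:, None] + np.outer(0.05 + 0.01 * x, 5.0 - t)
alpha, beta, kappa = fit_lc_newton(log_m)
fit = alpha[:, None] + np.outer(beta, kappa)
print(np.abs(fit - log_m).max() < 1e-8)  # → True
```

The raw estimates produced this way need not satisfy the constraints (5.3); the rescaling described next fixes that without changing the fit.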

Identifiability constraints The estimates for αx, βx, and κt produced by the methods described above (the singular value decomposition or the Newton–Raphson procedure (5.16)) do not satisfy the constraints (5.3). To fulfill the identifiability constraints, we replace α̂x with α̂x + β̂x κ̄, κ̂t with (κ̂t − κ̄)β̂•, and β̂x with β̂x/β̂•, where β̂• is the sum of the β̂x's coming out of the singular value decomposition or the Newton–Raphson procedure (5.16), and κ̄ is the average of the κ̂t's coming out of the same procedure.
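This rescaling can be sketched as follows (function and names are ours); the check confirms that the fitted log-rates are left unchanged while the constraints (5.3) become exact.

```python
import numpy as np

# Rescale raw (alpha, beta, kappa) estimates so that the constraints (5.3)
# hold: sum of kappa is 0 and sum of beta is 1, with an unchanged fit.
def impose_constraints(alpha, beta, kappa):
    beta_dot = beta.sum()       # beta-bullet, the sum of the raw beta_x's
    kappa_bar = kappa.mean()    # average of the raw kappa_t's
    return (alpha + beta * kappa_bar,
            beta / beta_dot,
            (kappa - kappa_bar) * beta_dot)

# Check on arbitrary raw estimates.
rng = np.random.default_rng(0)
alpha = rng.normal(-4.0, 1.0, 6)
beta = rng.uniform(0.05, 0.3, 6)
kappa = rng.normal(0.0, 2.0, 9)
a2, b2, k2 = impose_constraints(alpha, beta, kappa)

same_fit = np.allclose(alpha[:, None] + np.outer(beta, kappa),
                       a2[:, None] + np.outer(b2, k2))
print(same_fit)  # → True
```

This makes concrete the earlier point that the constraints merely select one of the equivalent maxima: the rescaled parameters reproduce exactly the same death rates.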

Adjustment of the κ̂t's by refitting to the total observed deaths Instead of keeping the κ̂t's obtained from the singular value decomposition or the Newton–Raphson algorithm, Lee and Carter (1992) suggested that the κ̂t's (taking the α̂x's and β̂x's as given) be adjusted in order to reproduce the observed number of deaths $\sum_{x=x_1}^{x_m} D_{xt}$ in year t. This avoids discrepancies arising from modelling on the logarithmic scale.
Since it is desirable that the differences between the actual and expected
total deaths in each year are zero, as in the construction and graduation of
period life tables, the adjusted κ̂t's solve the equation

$\sum_{x=x_1}^{x_m} D_{xt} = \sum_{x=x_1}^{x_m} ETR_{xt} \exp(\hat{\alpha}_x + \hat{\beta}_x \zeta) \quad (5.17)$

in ζ. So, the κ̂t's are re-estimated in such a way that the resulting death rates (with the previously estimated α̂x and β̂x), applied to the actual risk expo-
sure, produce the total number of deaths actually observed in the data for
the year t in question. There are several advantages to making this second
stage estimate of the parameters κt . In particular, it avoids sizable discrep-
ancies between predicted and actual deaths (which may occur because the
model (5.4) is specified by means of logarithms of death rates). We note
that no explicit solution is available for (5.17), which has thus to be solved
numerically (using a Newton–Raphson procedure, for instance).
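A Newton–Raphson solution of (5.17) for a single calendar year can be sketched as follows (our function name; the β̂x's are assumed to share a common sign, so the equation has a unique root):

```python
import numpy as np

def adjust_kappa_to_total_deaths(alpha, beta, etr, deaths, zeta0=0.0, tol=1e-10):
    """Solve sum_x D_xt = sum_x ETR_xt * exp(alpha_x + beta_x * zeta) for zeta,
    by Newton-Raphson, for one calendar year t (equation (5.17))."""
    target = deaths.sum()
    zeta = zeta0
    for _ in range(100):
        fitted = etr * np.exp(alpha + beta * zeta)
        # f(zeta) = sum fitted - target, f'(zeta) = sum fitted * beta
        step = (fitted.sum() - target) / (fitted * beta).sum()
        zeta -= step
        if abs(step) < tol:
            return zeta
    raise RuntimeError("Newton-Raphson did not converge")
```

With all β̂x > 0 the left-hand side is increasing and convex in ζ, so the iteration converges from any reasonable starting value.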
It is worth mentioning that more than one solution for (5.17) may arise when the β̂x's do not all have the same sign. A nonuniform sign for the β̂x's
implies that mortality is increasing at some ages and decreasing at others.
This is not normally expected to happen, except sometimes at advanced ages
5.2 Lee–Carter mortality projection model 195

(but the phenomenon disappears when the actuary starts the modelling by
closing the life tables). Therefore, solving (5.17) usually does not pose any
problem.

Adjustment of the κ̂t's by refitting to the observed period life expectancies   Whereas Lee and Carter (1992) have suggested that the κ̂t's be adjusted as
in (5.17) by refitting to the total observed deaths, Lee and Miller (2001)
have proposed an adjustment procedure in order to reproduce the period
life expectancy at some selected age (instead of the total number of deaths
recorded during the year).
In practice, the actuary first selects an age x0 . In population studies, it is
common to take x0 = 0 but in mortality projections for annuitants, taking
x0 = 60 or 65 may be more meaningful. Considering (3.18), the estimated κ̂t is adjusted to match the observed life expectancy at age x0 in year t, given the estimated α̂x's and β̂x's obtained from the singular value decomposition or from the Newton–Raphson algorithm. Thus, the adjusted κ̂t's solve the equation

    e↑x0(t) = [1 − exp(−exp(α̂x0 + β̂x0 ζ))] / exp(α̂x0 + β̂x0 ζ)
              + Σ_{k≥1} { Π_{j=0}^{k−1} exp(−exp(α̂x0+j + β̂x0+j ζ)) }
                × [1 − exp(−exp(α̂x0+k + β̂x0+k ζ))] / exp(α̂x0+k + β̂x0+k ζ)        (5.18)

in ζ.
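To make (5.18) concrete, the right-hand side can be sketched in Python as below (our function name), truncating the sum over k at the highest age available, as with a closed life table:

```python
import numpy as np

def period_life_expectancy(alpha, beta, zeta):
    """Right-hand side of (5.18): period life expectancy at the first age of
    the input vectors, with forces of mortality mu_x = exp(alpha_x + beta_x*zeta)
    held constant within each year of age; the sum over k is truncated at the
    top age of the vectors."""
    mu = np.exp(alpha + beta * zeta)
    p = np.exp(-mu)                                      # one-year survival probabilities
    surv = np.concatenate(([1.0], np.cumprod(p)[:-1]))   # probability of reaching each age
    return float(np.sum(surv * (1.0 - p) / mu))          # expected time lived in each year
```

Solving (5.18) then amounts to finding the ζ for which this function returns the observed e↑x0(t), for example by bisection or Newton–Raphson.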
The advantage of this second adjustment procedure is that it requires neither exposures-to-risk nor death counts, and it is thus generally applicable. Note that, as before, numerical problems may arise when the β̂x's do not have the same sign, but we believe that this problem is unlikely to occur in practice.

Adjustment of the κ̂t's by refitting to the observed age distribution of deaths   Booth et al. (2002) have suggested another procedure for adjusting the κ̂t's. Rather than fitting the yearly total number of deaths Σ_{x=x1}^{xm} Dxt as in (5.17), this variant fits to the age distribution of deaths Dxt, assuming the
Poisson distribution for the age-specific death counts and using the deviance
statistic to measure the goodness-of-fit. Specifically, for a fixed calendar
year t, the Dxt ’s are considered as independent random variables obeying
196 5 : Age-period projection models

the Poisson distribution with respective means ETRxt exp(α̂x + β̂x κt), where the values of the α̂x's and β̂x's are those coming from either the singular value decomposition or the Newton–Raphson iterative method, and where κt has to be determined in order to make the observed Dxt's as likely as possible. This means that κt maximizes the Poisson log-likelihood

    Σ_{x=x1}^{xm} { Dxt ln(ETRxt exp(α̂x + β̂x ζ)) − ETRxt exp(α̂x + β̂x ζ) }        (5.19)

over ζ, or equivalently, minimizes the deviance

    D = 2 Σ_{x=x1}^{xm} { Dxt ln(Dxt/D̂xt) − (Dxt − D̂xt) }        (5.20)

where D̂xt = ETRxt exp(α̂x + β̂x ζ) is the expected number of deaths, keeping the α̂x's and β̂x's unchanged.
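The deviance (5.20) is straightforward to compute; the fragment below (our function name) gives zero-death cells the conventional contribution 2·D̂xt, since the D ln D term vanishes:

```python
import numpy as np

def poisson_deviance(deaths, fitted):
    """Poisson deviance (5.20); cells with zero observed deaths contribute
    2 * fitted to the total (the D ln D term vanishes)."""
    d = np.asarray(deaths, dtype=float)
    f = np.asarray(fitted, dtype=float)
    log_term = np.zeros_like(d)
    pos = d > 0
    log_term[pos] = d[pos] * np.log(d[pos] / f[pos])
    return 2.0 * float(np.sum(log_term - (d - f)))
```

Minimizing this quantity over ζ, for fixed α̂x's and β̂x's, gives the adjusted κ̂t of Booth et al. (2002).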

Identifiability constraints   The identifiability constraints (5.3) are no longer satisfied by the adjusted κ̂t's. Therefore, we replace κ̂t with κ̂t − κ̄ and α̂x with α̂x + β̂x κ̄, where κ̄ is the average of the adjusted κ̂t's. This simple method only works because we are dealing with an identification constraint (not a model restriction).

Poisson maximum likelihood estimation

Statistical model
Let us now assume that the actuary has at his/her disposal observed death counts Dxt and corresponding exposures ETRxt. Then, the least-squares approach can be applied to the ratios of the death numbers to the exposures, i.e. to the m̂x(t) = Dxt/ETRxt's, as explained above. The method presented in this section better exploits the available information, and does not assume that the variability of the m̂x(t)'s is the same whatever the age
x. Specifically, we assume that the number of deaths at age x in year t
has a Poisson random variation. To justify this approach, we prove that
assumption (3.2) is compatible with Poisson modelling for death counts.
To this end, let us focus on a particular pair: age x – calendar year t. We
observe Dxt deaths among Lxt individuals aged x on January 1 of year t.
We assume that the remaining lifetimes of these individuals are independent
and identically distributed. The likelihood function (3.12) is proportional to
the Poisson likelihood, that is, the one obtained under the assumption that
Dxt is Poisson distributed with mean ETRxt µx (t) = ETRxt exp(αx + βx κt )

where the parameters are still subjected to the constraints (5.3). Therefore,
provided that we resort to the maximum likelihood estimation procedure,
working on the basis of the ‘true’ likelihood (3.12) or working on the
basis of the Poisson likelihood are equivalent, once the assumption (3.2)
has been made.

Objective function   The parameters αx, βx, and κt are now estimated by maximizing the log-likelihood based on the Poisson distributional assumption. This is given by

    L(α, β, κ) = Σ_{x=x1}^{xm} Σ_{t=t1}^{tn} { Dxt (αx + βx κt) − ETRxt exp(αx + βx κt) } + constant.        (5.21)
Equivalently, the parameters are estimated by minimizing the associated
deviance defined as
D = −2(L(α, β, κ) − Lf ) (5.22)
where Lf is the log-likelihood of the full or saturated model (characterized
by equating the fitted and actual numbers of deaths).

Effective computation   Because of the presence of the bilinear term βx κt, it is not possible to estimate the proposed model with commercial statistical packages that implement Poisson regression. We can nevertheless easily solve the likelihood equations with the help of a uni-dimensional, or elementary, Newton–Raphson method, analogous to the scheme (5.16) used in the least-squares case.
The updating scheme is as follows: starting with α̂x(0) = 0, β̂x(0) = 1, and κ̂t(0) = 0 (random values can also be used), the sequences of α̂x(k), β̂x(k), and κ̂t(k) are obtained from the formulas

    α̂x(k+1) = α̂x(k) − [ Σ_{t=t1}^{tn} (Dxt − ETRxt exp(α̂x(k) + β̂x(k) κ̂t(k))) ] / [ −Σ_{t=t1}^{tn} ETRxt exp(α̂x(k) + β̂x(k) κ̂t(k)) ]

    κ̂t(k+1) = κ̂t(k) − [ Σ_{x=x1}^{xm} (Dxt − ETRxt exp(α̂x(k+1) + β̂x(k) κ̂t(k))) β̂x(k) ] / [ −Σ_{x=x1}^{xm} ETRxt exp(α̂x(k+1) + β̂x(k) κ̂t(k)) (β̂x(k))² ]        (5.23)

    β̂x(k+1) = β̂x(k) − [ Σ_{t=t1}^{tn} (Dxt − ETRxt exp(α̂x(k+1) + β̂x(k) κ̂t(k+1))) κ̂t(k+1) ] / [ −Σ_{t=t1}^{tn} ETRxt exp(α̂x(k+1) + β̂x(k) κ̂t(k+1)) (κ̂t(k+1))² ]

The criterion used to stop the procedure is a relative increase in the log-
likelihood function that is smaller than a pre-selected sufficiently small fixed
number.
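A compact Python implementation of the updating scheme (5.23), with the relative log-likelihood stopping rule just described and the final rescaling to the constraints (5.3), might look as follows (a sketch with our names; we start κ from a small non-constant vector rather than zero, which the text also permits, to avoid a degenerate first β-update):

```python
import numpy as np

def fit_lee_carter_poisson(D, ETR, max_iter=5000, tol=1e-12):
    """Poisson maximum likelihood Lee-Carter fit via the elementwise
    Newton-Raphson updates (5.23). D and ETR are (ages x years) arrays."""
    n_ages, n_years = D.shape
    alpha = np.zeros(n_ages)
    beta = np.ones(n_ages)
    # a constant kappa would zero the beta-update denominator, hence this start
    kappa = np.linspace(-1.0, 1.0, n_years)

    def loglik():
        eta = alpha[:, None] + beta[:, None] * kappa[None, :]
        return np.sum(D * eta - ETR * np.exp(eta))

    ll_old = loglik()
    for _ in range(max_iter):
        fitted = ETR * np.exp(alpha[:, None] + beta[:, None] * kappa[None, :])
        alpha = alpha + (D - fitted).sum(axis=1) / fitted.sum(axis=1)
        fitted = ETR * np.exp(alpha[:, None] + beta[:, None] * kappa[None, :])
        kappa = kappa + ((D - fitted) * beta[:, None]).sum(axis=0) \
            / (fitted * (beta ** 2)[:, None]).sum(axis=0)
        fitted = ETR * np.exp(alpha[:, None] + beta[:, None] * kappa[None, :])
        beta = beta + ((D - fitted) * kappa[None, :]).sum(axis=1) \
            / (fitted * (kappa ** 2)[None, :]).sum(axis=1)
        ll_new = loglik()
        if abs(ll_new - ll_old) < tol * (1.0 + abs(ll_old)):
            break
        ll_old = ll_new

    # rescale to satisfy the identifiability constraints (5.3)
    kappa_bar, beta_dot = kappa.mean(), beta.sum()
    alpha = alpha + beta * kappa_bar
    kappa = (kappa - kappa_bar) * beta_dot
    beta = beta / beta_dot
    return alpha, beta, kappa
```

Each line of (5.23) is an exact Newton step in one parameter given the current values of the others, so the scheme is a coordinate-wise ascent of the log-likelihood (5.21).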
The maximum likelihood estimates of the parameters coming out of (5.23) have to be adapted in order to fulfil the constraints (5.3): specifically, we replace κ̂t with (κ̂t − κ̄) Σ_{x=x1}^{xm} β̂x, β̂x with β̂x / Σ_{x=x1}^{xm} β̂x, and α̂x with α̂x + β̂x κ̄, where κ̄ is the average of the κ̂t's.

Remark As pointed out by Renshaw and Haberman (2006), the error struc-
ture can be imposed by specifying the second moment properties of the
model, as in the framework of generalized linear modelling. This allows for
a range of options for the choice of the error distribution, including Pois-
son, both with and without dispersion, as well as Gaussian, as used in the
original approach by Lee and Carter (1992). 

Remark   In contrast to the classical least-squares approach to estimating the parameters, the error applies directly to the number of deaths in the Poisson regression approach. There is, thus, no need for a second-stage estimation like (5.17) for the κt's.
Note that differentiating the log-likelihood (5.21) with respect to αx gives
the equation
    Σ_{t=t1}^{tn} Dxt = Σ_{t=t1}^{tn} ETRxt exp(α̂x + β̂x κ̂t)        (5.24)

which is similar to (5.17) except that the sum is now over calendar time
instead of age. So, the estimated κt ’s are such that the resulting death rates
applied to the actual risk exposure produce the total number of deaths
actually observed in the data for each age x. Sizable discrepancies between
predicted and actual deaths are thus avoided. 

5.2.2.3 Alternative estimation procedures for logbilinear models


Brillinger (1986) showed that under reasonable assumptions about the
processes governing births and deaths, the Poisson distribution is a good
candidate to model the numbers of deaths at different ages. This provides a
sound justification for the Poisson model for estimating the (α, β, κ) param-
eters. There are nevertheless (at least) two alternatives for estimating the
parameters.

Binomial maximum likelihood estimation   Cossette and Marceau (2007) have proposed a Binomial regression model for estimating the parameters in logbilinear mortality projection models. The annual number Dxt

of recorded deaths is then assumed to follow a Binomial distribution, with a death probability qx(t) which is expressed as a function of the force of mortality (5.1) via qx(t) = 1 − exp(−µx(t)).
The number of deaths Dxt at age x during year t has a Binomial distribution with parameters Lxt and qx(t). The specification for µx(t) gives

    qx(t) = 1 − exp(−exp(αx + βx κt))        (5.25)

To ensure identifiability, we adhere to the set of constraints (5.3). Assuming independence, the likelihood for the entire data set is the corresponding product of Binomial probability factors. The log-likelihood is then given by

    L(α, β, κ) = Σ_{t=t1}^{tn} Σ_{x=x1}^{xm} { (Lxt − dxt) ln(1 − qx(t)) + dxt ln qx(t) } + constant        (5.26)
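For a single calendar year, the Binomial log-likelihood contribution can be sketched as follows (our notation: `lives` holds the initial cohort sizes Lxt and `d` the observed deaths dxt):

```python
import numpy as np

def binomial_loglik_year(d, lives, alpha, beta, kappa_t):
    """One-year contribution to the Binomial log-likelihood (5.26), with
    death probabilities (5.25): q_x(t) = 1 - exp(-exp(alpha_x + beta_x * kappa_t))."""
    q = 1.0 - np.exp(-np.exp(alpha + beta * kappa_t))
    return float(np.sum(d * np.log(q) + (lives - d) * np.log(1.0 - q)))
```

Summing these contributions over the calendar years, and maximizing over the parameters subject to (5.3), gives the Binomial analogue of the Poisson fit.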

As in the Poisson case, the presence of the bilinear term βx κt makes


commercial statistical packages that implement linear Binomial regression
useless. An iterative procedure has been proposed in Cossette et al. (2007)
for estimating the parameters. A parallel analysis is provided by Haber-
man and Renshaw (2008) with an investigation of a number of alternative
specifications to (5.25).

Overdispersed Poisson and Negative Binomial maximum likelihood estimation   Poisson modelling induces equidispersion. We know from Section 3.3.9 that populations are heterogeneous with respect to mortality. Heterogeneity tends to increase the variance compared to the mean (a phenomenon termed overdispersion), which rules out the Poisson specification and favours a mixed Poisson model. Besides gender, age x, and year t, there are many other exogenous factors affecting mortality. It is, therefore, natural to extend the Lee–Carter model in order to take this feature into account. One approach, advocated by Renshaw and Haberman (2003b, 2003c, 2006), is to postulate that the random number of deaths Dxt has an overdispersed Poisson distribution. Thus, it is suggested that

Var[Dxt ] = φE[Dxt ] (5.27)

where φ is a parameter that measures the degree of overdispersion. Clearly,


φ = 1 reduces to the standard Poisson case.
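A common moment estimator for φ, based on the Pearson chi-square statistic (a quasi-likelihood convention, not spelled out in the text), can be sketched as:

```python
import numpy as np

def dispersion_estimate(observed, fitted, n_params):
    """Moment estimate of phi in Var[D_xt] = phi * E[D_xt]: the Pearson
    chi-square statistic divided by the residual degrees of freedom."""
    observed = np.asarray(observed, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    pearson = np.sum((observed - fitted) ** 2 / fitted)
    return float(pearson / (observed.size - n_params))
```

A value of φ̂ well above 1 signals overdispersion (duplicates, residual heterogeneity), whereas φ̂ ≈ 1 is consistent with the plain Poisson model.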
An alternative approach is to take the exogenous factors into account by adding a random effect εxt super-imposed on the Lee–Carter predictor αx + βx κt, exactly as in (5.4). More precisely, Delwarde et al. (2007b) have suggested the replacement of the Poisson model with a Mixed Poisson one. Given εxt, the number of deaths Dxt is assumed to be Poisson distributed with mean ETRxt exp(αx + βx κt + εxt). Unconditionally, Dxt obeys a mixture of Poisson distributions. The εxt's are assumed to be independent and identically distributed. A prominent example consists in taking the Dxt's to be Negative Binomial distributed. See also Renshaw and Haberman (2008).
Mortality data from the life insurance market often exhibit overdisper-
sion because of the presence of duplicates. It is common for individuals to
hold more than one life insurance or annuity policy and hence to appear
more than once in the count of exposed to risk or deaths. In such a case,
the portfolio is said to contain duplicates, that is, the portfolio contains sev-
eral policies concerning the same lives. It is well known that the variance
becomes inflated in the presence of duplicates. Consequently, even if the
portfolio (or one of its risk classes) is homogeneous, the presence of duplicates
would increase the variance and cause overdispersion. The overdispersed
Poisson and Negative Binomial models for estimating the parameters of log-
bilinear models for mortality projections are thus particularly promising for
actuarial applications.

5.2.3 Application to Belgian mortality statistics

Before embarking on a mortality projection case study, we have to decide


about the type of mortality statistics that will be used. In some countries
(like in the UK), extensive data are available for policyholders, according to
the type of contract. In such a case, we might wonder whether the forecast
should be based on population or market data.
Using market data allows us to take adverse selection into account. How-
ever, basing mortality projections on market data would implicitly assume
that no structural breaks have occurred because of changes to the character
of the market, or modifications in the tax system or in the level of adverse
selection, for instance. Thus, this is not always the best strategy. Assume,
for example, that the government starts offering incentives to individuals
from the lower socio-economic classes to buy life annuities in order to sup-
plement public pensions. Using market data would result in a worsening
in mortality because of a modification in the profile of the insured lives
(as lower socio-economic classes usually experience higher mortality rates).
Hence, this will artificially modify the mortality trends for the market, making it impossible to separate long-term mortality trends from modifications in the structure of the insured population. If, however, we need to
undertake forecasts based on market data, covariates are often helpful, like

the amount of the annuity (reflecting individuals’ socio-economic class), for


instance.
Actuaries sometimes weight their calculations by policy size to account
for socio-economic differentials amongst policyholders. These ‘amount-
based’ measures usually produce lower mortality rates than their ‘lives-
based’ equivalents due to the tendency for wealthier policyholders to live
longer. The pension size is thus used as a proxy of socio-economic group.
However, this approach is somewhat ad hoc; the amount of pension would be better included explicitly as a covariate in the regression models
used for mortality projections.
For the reason given above, we prefer to use general population data for
mortality forecasting. Relational models introduced in Section 3.4.4 would
allow us to take adverse selection into account, and to exploit the long-term
changes in population mortality. Specifically, the overall mortality trend
would be estimated from the general population, and a regression model
then used to switch from the general population to the insurance market.
Proceeding in this way would separate the long-term mortality trends from
the particular features of the insured population.
We begin by fitting the log-bilinear model to the HMD data set by the
least-squares method. We only consider males; the analysis for females is
similar. The calendar years 1920–2005 and ages 0–104 are included in the
analysis. The reason for restricting the highest age to 104 is that the Belgian 2002–2004 population life table that will serve as the basis for the forecast (as explained below) does not extend beyond this age. Note that the data at high ages have been processed in the HMD, so that the independence assumption is no longer valid at these ages and the corresponding results
have to be interpreted with care. Figure 5.1 (top panels) plots the estimated
αx ’s, βx ’s, and κt ’s. The estimated αx ’s exhibit the typical shape of a set
of log death rates with relatively high values around birth, a decrease at
infant ages, the accident hump, and finally the increase at adult ages with
an ultimately concave behaviour. The estimated βx ’s appear to decrease
with age, suggesting that most of the mortality decreases are concentrated
on the young ages. The estimated κt ’s are adjusted to reproduce the observed
period life expectancies at birth. The estimated κt ’s are affected by World
War II, with comparatively higher values in the early 1940s. We note that
the model explains 92.09% of the total variance.
We now restrict ourselves to ages above 60. Figure 5.1 (bottom panels)
plots the estimated αx , βx , and κt . The model now explains 90.18% of the
total variance. Note that compared to the case where all the ages 0–104
were included in the analysis, the adjusted κ̂t's are much more similar to
the initial ones coming from singular value decomposition.
Figure 5.1. Estimated αx, βx, and κt (from left to right), x = 0, 1, ..., 104 (top panels) and x = 60, 61, ..., 104 (bottom panels), t = 1920, 1921, ..., 2005, obtained with HMD data by minimizing the sum of squares (5.5), with the estimated κt's adjusted by refitting to the period life expectancies at birth or at age 60 (for the estimated κt's, the values before adjustment are displayed as broken lines).

It is important to mention that the sole use of the proportion of the total
temporal variance (as measured by the ratio of the first singular value to the
sum of singular values) is not a satisfactory diagnostic indicator. An exam-
ination of the residuals is needed to check for model adequacy (see below).
The fitted mortality surfaces are depicted in Fig. 5.2. These surfaces
should be compared with Fig. 5.3. The mortality experience appears rather
smooth, with some ridges around 1940–1945.
We now fit the log-bilinear model to the HMD data set by the method of
Poisson maximum likelihood. All of the ages 0–104 are included in the anal-
ysis. Figure 5.3 (top panels) plots the estimated αx , βx and κt . The estimated
parameters are compared with those obtained by minimizing the sum of the
squared residuals (5.5). We see that the least-squares and Poisson maximum
likelihood procedures produce very similar sets of estimated parameters αx ,
βx , and κt .
As above, we restrict ourselves to ages above 60. Figure 5.3 (bottom panels) plots the estimated αx, βx, and κt. The estimated parameters are compared with those obtained by least squares. We observe sizeable discrepancies between the β̂x's produced by the least-squares and Poisson maximum likelihood procedures, whereas the α̂x's and κ̂t's remain similar.

5.3 Cairns–Blake–Dowd mortality projection model


5.3.1 Specification

Empirical analyses suggest that ln qx (t)/px (t) is reasonably linear in x for


fixed t (sometimes with a small degree of curvature in the plot of x versus
ln qx (t)/px (t)), except at younger ages. This is why Cairns et al. (2006a)
assume that

    ln(qx(t)/px(t)) = κt[1] + κt[2] x  ⇔  qx(t) = exp(κt[1] + κt[2] x) / (1 + exp(κt[1] + κt[2] x))        (5.28)

where κt[1] and κt[2] are themselves stochastic processes. This specification
does not suffer from any identifiability problems so that no constraints need
to be specified.
We see that age is now treated as a continuous covariate and enters the
model in a linear way on the logit scale. The intercept κt[1] and slope κt[2]
parameters make up a bivariate time series the future path of which governs
the projected life tables. The intercept period term κt[1] is generally declining
Figure 5.2. Fitted death rates (on the log scale) for Belgian males, ages 0–104 (top panel) and ages 60–104 (bottom panel), period 1920–2005.
Figure 5.3. Estimated αx, βx, and κt (from left to right), x = 0, 1, ..., 104 (top panels) and x = 60, 61, ..., 104 (bottom panels), t = 1920, 1921, ..., 2005, obtained with HMD data by maximizing the Poisson log-likelihood (5.21) (the values obtained by least-squares are displayed as broken lines).

over time, which corresponds to the feature that mortality rates have been
decreasing over time at all ages. Hence, the upward-sloping plot of the logit
of death probabilities against age is shifting downwards over time. If during
the fitting period, the mortality improvements have been greater at lower
ages than at higher ages, the slope period term κt[2] would be increasing over
time. In such a case, the plot of the logit of death probabilities against age would become steeper as it shifts downwards over time.
Sometimes, the logit of the death probabilities qx (t) plotted against age x
exhibits a slight curvature after retirement age. This curvature can be mod-
elled by including a quadratic term in age in the Cairns–Blake–Dowd model.
However, the dynamics of the time factor associated with this quadratic
effect often remains unclear and when combined with the quadratic age
term, its contribution to mortality dynamics is highly complex.
The Cairns–Blake–Dowd model has two time series κt[1] and κt[2] which
affect different ages in different ways. This is a fundamental difference com-
pared with the 1-factor Lee–Carter approach where a single time series
induces perfect correlation in mortality rates at different ages from one
year to the next. There is empirical evidence to suggest that changes in the
death rates are imperfectly correlated, which supports the Cairns–Blake–
Dowd model or the 2-factor Lee–Carter model represented by equation
(5.14) with r = 2. Compared to the 1-factor Lee–Carter model, the Cairns–
Blake–Dowd model thus allows changes in underlying mortality rates that
are not perfectly correlated across ages. Also, the longer the run of data that
the actuary uses, the better the 2-factor model performs relative to its 1-factor counterpart. For example, if we consider the entire 20th century, mortality
improvements concentrate on younger ages during the first half of the cen-
tury and on higher ages during the second half. We need a 2-factor model to
capture these two different dynamics. Note, however, that the restriction to
the optimal fitting period in the Lee–Carter case favours recent past history
so that the inclusion of a second factor may not be needed.
Note that the switch from a unique time series to a pair of time-dynamic
factors has far-reaching consequences when we discuss securitization, as the
existence of an imperfect correlation structure implies, for example, that
hedging longevity-linked liabilities would require more than one hedging
instrument.

5.3.2 Calibration

We assume that we have observed data for a set of calendar years t =


t1 , t2 , . . . , tn and for a set of ages x = x1 , x2 , . . . , xm . On the basis of

these observations, we would like to estimate the intercept κt[1] and slope
κt[2] parameters. This can be done by least-squares. This means that the
regression model

    ln(q̂x(t)/p̂x(t)) = κt[1] + κt[2] x + εx(t)        (5.29)

is fitted to the observations of calendar year t, where the q̂x(t)'s are the crude one-year death probabilities, and where the error terms εx(t) are
independent and Normally distributed, with mean 0 and constant variance
σ². The objective function

    Ot(κ) = Σ_{x=x1}^{xm} ( ln(q̂x(t)/p̂x(t)) − κt[1] − κt[2] x )²        (5.30)

has to be minimized for each calendar year t, giving the estimations of


the κt[1] and κt[2] parameters. Note that, in contrast to the Lee–Carter case,
where the estimated time index κt depends on the observation period, the
time indices κt[1] and κt[2] are estimated separately for each calendar year t
in the Cairns–Blake–Dowd model.
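The year-by-year least-squares fit of (5.30) is just an ordinary linear regression of the empirical logits on age; a sketch (our names):

```python
import numpy as np

def fit_cbd_year(q_hat, ages):
    """Least-squares estimates of (kappa1_t, kappa2_t) for one calendar year:
    regress ln(q_hat / (1 - q_hat)) on age, as in (5.29)-(5.30)."""
    ages = np.asarray(ages, dtype=float)
    y = np.log(q_hat / (1.0 - q_hat))
    X = np.column_stack([np.ones_like(ages), ages])   # intercept and slope design
    (k1, k2), *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(k1), float(k2)
```

Running this function once per calendar year reproduces the separate-year estimation described above: no observation from any other year enters the fit for year t.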
The Cairns–Blake–Dowd model can also be calibrated in a number of
alternative ways, as was the case for the Lee–Carter model. For instance,
a Poisson regression model can be specified by assuming that the observed
death counts are independent and Poisson distributed, with a mean equal
to the product of the exposure-to-risk times the population death rate of
the form

    µx(t) = − ln(1 − qx(t)) = ln(1 + exp(κt[1] + κt[2] x))        (5.31)

Estimation based on a Binomial or Negative Binomial error structure can


also be envisaged.

5.3.3 Application to Belgian mortality statistics

As for the implementation of the Lee–Carter approach, we fit the Cairns–Blake–Dowd model by least-squares to the HMD data set, using Belgian
males from the general population. The results of the fit are displayed in
Fig. 5.4.
The top panels of Fig. 5.4 display the results when all of the ages
0–104 are included in the analysis. Note that the Cairns–Blake–Dowd
model was never designed to cover all ages, certainly not down to age 0.
The linearity in x means that this model is not able to capture the level-
ling off around age 30 and the accident hump around age 20. From left
Figure 5.4. Estimated κt[1] and κt[2] parameters together with the values of the adjustment coefficient by calendar year (from left to right), for ages x = 0, 1, ..., 104 (top panels) and x = 60, 61, ..., 104 (bottom panels), t = 1920, 1921, ..., 2005, obtained with HMD data by least-squares.

to right, we see the estimated κt[1] ’s, the estimated κt[2] ’s, and the value
of the adjustment coefficient R²(t) for each calendar year t. The bot-
tom panels give the corresponding results for the restricted age range 60,
61, . . . , 104.
When all of the ages are considered, the estimated κt[1] ’s exhibit a down-
ward trend, which expresses the improvement in mortality rates over time
for all ages. A peak around 1940–1945 indicates a higher mortality experi-
ence during World War II. The estimated κt[2] ’s tend to increase over time,
indicating that mortality improvements have been comparatively greater at
younger ages over the period 1920–2005. We note that World War II also
affected the estimated κt[2] ’s, with a decrease in the early 1940s. The val-
ues of the adjustment coefficient R²(t) indicate that the Cairns–Blake–Dowd
model explains from about 80% of the variance in 1920 to about 95% in
the early 2000s.
If we restrict the age range to 60, 61, . . . , 104, we see that the goodness-
of-fit is greatly increased, with adjustment coefficients larger than 99%. The
Cairns–Blake–Dowd model takes advantage of the approximate linearity in
age (on the logit scale) at higher ages to provide a parsimonious represen-
tation of one-year death probabilities. The adjustment coefficients close to
1 demonstrate the ability of the Cairns–Blake–Dowd model to describe the
mortality experienced in Belgium. The trend in the estimated intercept and
slope parameters is less clear, unless we restrict our interest to the latter part
of the 20th century, where the estimated κt[1] ’s and κt[2] ’s become markedly
linear (with a decreasing trend for the former, and an increasing one for the
latter).

5.4 Smoothing
5.4.1 Motivation

Actuaries use projected life tables in order to compute life annuity prices,
life insurance premiums as well as reserves that have to be held by insurance
companies to enable them to pay the future contractual benefits.
Any irregularities in these life tables would then be passed on to the price
list and to balance sheets, which is not desirable. Therefore, as long as these
irregularities do not reveal particular features of the risk covered by the
insurer, but are likely to be caused by sampling errors, actuaries prefer to
resort to statistical techniques to produce life tables that exhibit a regular
progression, in particular with respect to age.

5.4.2 P-splines approach

Durban and Eilers (2004) have smoothed death rates with P-splines in the
context of a Poisson model. The P-spline approach is an example of a regres-
sion model and is similar to the generalized linear modelling discussed in
Section 4.5.4. But unlike generalized linear models, P-splines allow for more
flexibility in modelling observed mortality.
Regression models take a family of basis functions, and choose a com-
bination of them that best fits the data according to some criterion. The
P-spline approach uses a spline basis, with a penalty function that is intro-
duced in order to avoid oversmoothing. P-splines are related to B-splines
which have been discussed in Section 2.6.3. Recall that univariate, or uni-
dimensional, B-splines are a set of basis functions each of which depends
on the placement of a set of ‘knot’ points providing full coverage of the
range of data. Defining B-splines in two dimensions is straightforward. We
define knots in each dimension, and each set of knots gives rise to a uni-
variate B-spline basis. The two-dimensional B-splines are then obtained by
multiplying the respective elements of these two bases.
Durban and Eilers (2004) have suggested a decomposition of µx (t) as
follows:

ln µx (t) = θij Bij (x, t) (5.32)
i,j

for some prespecified two-dimensional B-splines Bij in age x and calendar


time t, with regularly-spaced knots, and where the θij ’s are parameters to
be estimated from historical data.
If we use a large number of knots in the year and age dimensions, then we
can obtain an extremely accurate fit. However, such a fit does not smooth
the random variations present in the data and the resulting death rates
become less reliable. Switching to P-splines helps to overcome this problem,
because of the presence of the penalty function.
The method of P-splines suggested by Eilers and Marx (1996) is now
well-established as a method of smoothing in Generalized Linear Models.
It consists of using B-splines as the basis for the regression and modifying the log-likelihood by a difference penalty on the regression coefficients. The inclusion of a penalty with appropriate weight means
that the number of knots can be increased without radically altering the
smoothness of the fit. Penalties can be calculated separately in each dimen-
sion, involving sums of (θij − 2θi−1, j + θi−2, j )2 in the age dimension, sums
of (θij − 2θi, j−1 + θi, j−2 )2 in the calendar year dimension, and sums of

(θi+1, j−1 − 2θij + θi−1, j+1 )2 across cohorts. The CMI Bureau in the UK has
suggested the use of age and cohort penalties (see also Chapter 6). Each
of these penalties involves an unknown weight coefficient that has to be
selected from the data.
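Each of these penalties is a quadratic form θ'Pθ with P built from a difference matrix; for one dimension this can be sketched as (our helper name):

```python
import numpy as np

def difference_penalty(n_coef, order=2):
    """Penalty matrix P = D'D built from order-th differences of the
    coefficients, so that theta @ P @ theta equals the sum of squared
    differences such as (theta_i - 2*theta_{i-1} + theta_{i-2})^2 for order=2."""
    Dm = np.diff(np.eye(n_coef), n=order, axis=0)   # (n_coef - order) x n_coef
    return Dm.T @ Dm
```

For the two-dimensional coefficient grid, the age and year penalties are obtained as Kronecker products of such a matrix with an identity matrix of the other dimension, each multiplied by its own smoothing weight.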
Note that there is a difference in the structural assumption behind the P-
spline approach, compared with the Lee–Carter and Cairns–Blake–Dowd
alternative approaches: the P-spline approach assumes that there is smooth-
ness in the underlying mortality surface in the period effects as well as in the
age and cohort effects. Some further extensions have recently been proposed
to account for period shocks.
The P-splines approach is a powerful smoothing procedure for the
observed mortality surface. Using the penalty to project the θij ’s to the
future, it is also possible to use this tool to forecast future mortality rates,
by extrapolating the smooth mortality surface. However, as pointed out
by Cairns et al. (2007), the P-spline approach to mortality forecasting is
not transparent. Its output is a smooth surface fitted to historical data and
then projected into the future. An important difference (compared with the
Lee–Carter and Cairns–Blake–Dowd alternatives) is that forecasting with
the P-splines approach is a direct consequence of the smoothing process.
The choice of the penalty then corresponds to a view of the future pattern
of mortality. In contrast, the two stages of fitting the data and extrapolating
past trends are kept separate in the Lee–Carter and Cairns–Blake–Dowd
approaches. This is an advantage for actuarial applications, since it allows
for more flexibility.
Moreover, the form of the penalty is usually difficult to infer from the
data, whereas it entirely drives the P-spline mortality forecast (a similar
feature occurs in period-based mortality graduation using splines when
mortality rates are extrapolated beyond the data to the oldest ages). The
degree of smoothing in empirical applications depends on the variabil-
ity of the observed death rates. The size of the population under study,
as well as the range of ages considered, thus, both influence the smooth-
ing coefficient and, possibly, the choice of the penalty. In the Lee–Carter
and Cairns–Blake–Dowd approaches, these features of the data do not
directly affect the projection of the time index. As the order of the penalty
has no discernible effect on the smoothness of the observed data, it is
hard to deduce it from the observed data. The choice of the penalty,
in fact, corresponds to a view of the future pattern of mortality: future
mortality continuing at a constant level, future mortality improving at a
constant rate or future mortality improving at an accelerating (quadratic)
rate.
5.4.3 Smoothing in the Lee–Carter model
As can be seen from Fig. 5.1, the estimated βx ’s exhibit an irregular pat-
tern. This is undesirable from an actuarial point of view, since the resulting
projected life tables will also show some erratic variations across ages.
Bayesian formulations assume some sort of smoothness of age and period
effects in order to improve estimation and facilitate prediction. A Bayesian
treatment of mortality projections has been proposed by Czado et al. (2005).
Note that the estimated αx ’s are usually very smooth, since they represent
an average effect of mortality at age x (however, Renshaw and Haberman
(2003a) experiment with different choices for αx , representing different
averaging periods and hence different levels of smoothing, as well as explicit
graduation of the αx estimates). The estimated κt ’s are often rather irregular,
but the projected κt ’s, obtained from some time series model (as explained
below), will be smooth. Hence, we only need to smooth the βx ’s in order
to get projected life tables with mortality varying smoothly across the ages.
This can be achieved by penalized least-squares or maximum likelihood
methods.
The estimated Lee–Carter parameters are traditionally obtained by min-
imizing (5.5). This has produced estimated βx ’s and κt ’s with an irregular
shape in the majority of empirical studies. In order to smooth the estimated
βx ’s we can use the objective function
OPLS (α, β, κ) = Σx=x1..xm Σt=t1..tn ( ln µ̂x (t) − αx − βx κt )² + πβ Σx=x1..xm ( βx+2 − 2βx+1 + βx )²    (5.33)
where πβ is the smoothing parameter. This is the penalized least-squares
approach proposed in Delwarde et al. (2007a). The second term penalizes
irregular βx ’s. The objective function can therefore be seen as a compro-
mise between goodness-of-fit (first term) and smoothness of the βx ’s (second
term). The penalty involves the sum of the squared second-order differences
of the βx ’s, that is, the sum of the squares of βx+2 − 2βx+1 + βx . Second-
order differences penalize deviations from the linear trend. The trade off
between fidelity to the data (governed by the sum of squared residuals) and
smoothness (governed by the penalty term) is controlled by the smoothing parameter πβ . The larger the smoothing parameter, the smoother the resulting fit. In the limit (πβ → ∞) we obtain a linear fit. The choice of the smoothing parameter is crucial, as we may obtain quite different
fits by varying the smoothing parameter πβ . The choice of the optimal
πβ is based on the observed data, using cross-validation techniques. See
Delwarde et al. (2007a) for more details. We note that equation (5.33) is
similar to the objective function used in Whittaker–Henderson graduation
discussed in Section 2.6.2, a non-parametric graduation method that has
been commonly used in the United States.
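To make (5.33) concrete, here is a small self-contained sketch (our own illustration; the helper names and array shapes are assumptions, not the authors' code) that evaluates the penalized objective for given parameter vectors:

```python
import numpy as np

def second_diff(n):
    """(n-2) x n difference matrix whose rows compute b[x] - 2 b[x+1] + b[x+2]."""
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = (1.0, -2.0, 1.0)
    return D

def penalized_ls(lnm_hat, alpha, beta, kappa, pi_beta):
    """Objective (5.33): sum of squared fitting errors plus pi_beta times the
    sum of squared second-order differences of the beta_x's."""
    fit = alpha[:, None] + np.outer(beta, kappa)      # alpha_x + beta_x kappa_t
    D = second_diff(len(beta))
    return np.sum((lnm_hat - fit) ** 2) + pi_beta * np.sum((D @ beta) ** 2)
```

A perfectly linear β incurs no penalty, so the two terms trade off exactly as described in the text: as πβ grows, the minimizing β is pushed towards a straight line.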
If the parameters are estimated using Poisson maximum likelihood,
the penalized least-squares method becomes a penalized log-likelihood
approach. Specifically, following Delwarde et al. (2007a) the log-likelihood
(5.21) is replaced with
L(α, β, κ) − (πβ /2) Σx=x1..xm ( βx+2 − 2βx+1 + βx )²    (5.34)
As above, the selection of the optimal value for the roughness penalty
coefficient πβ is based on cross validation.
Here, we adopt a very simple strategy in our case study: instead of fitting
the Lee–Carter model to the rough mortality surface, we first smooth it
using the methods described in Section 3.4.2 and then we fit the model to
the resulting surface.
Remark An alternative approach to smoothing the βx ’s has also been suggested. It is more ad hoc in nature than the above, in that it introduces an extra
stage in the modelling process. Thus, Renshaw and Haberman (2003a,c)
smooth the Lee–Carter βx estimates using linear regression as well as cubic
B-splines and natural cubic splines and the methods of least-squares. 
Remark An advantage of the Cairns–Blake–Dowd model is that it automatically produces smooth projected life tables, because future death
probabilities depend on age in a linear way, and on the projected time
indices κt[1] and κt[2] . 
5.4.4 Application to Belgian mortality statistics
We fit the Lee–Carter model to the set of smoothed HMD death rates by
least-squares to ages 0–104 and the years 1920–2002. Figure 5.5 (top pan-
els) plots the estimated αx , βx , and κt . The estimated κt ’s are then adjusted
in order to reproduce the observed period life expectancies at birth. The val-
ues obtained without smoothing (i.e. those displayed in Fig. 5.1) are plotted
using a broken line. We see that the prior smoothing of the death rates does
not impact on the estimated αx ’s, except just before the accident hump, nor
on the estimated κt ’s (mainly because of the adjustment procedure). Prior
smoothing does, however, impact on the estimated βx ’s which now appear
to behave very regularly with age. The model explains 93.70% of the total
variance.
We now restrict ourselves to ages above 60. Figure 5.5 (bottom panels)
plots the estimated αx , βx , and κt . We see that prior smoothing has almost no
impact on the estimated αx ’s nor on the estimated κt ’s, whereas the estimated
βx ’s are smoothed in an appropriate way. The model now explains 91.37%
of the total variance.

5.5 Selection of an optimal calibration period
5.5.1 Motivation
Many actuarial studies have based the projections of mortality on the statis-
tics relating to the years from 1950 to the present. The question then
becomes why the post-1950 period better represents expectations for the
future than does the post-1900 period, for example. There are several justi-
fications for the use of the second half of the 20th century. First, the pace of
mortality decline was more even across all ages over the 1950–2000 period
than over the 1900–2000 period. Second, the quality of mortality data, par-
ticularly at the older ages, for the 1900–1950 period is questionable. Third,
infectious diseases were an uncommon cause of death by 1950, while heart
disease and cancer were the two most common causes, as they are today.
This view seems to imply that the diseases affecting death rates from 1900
through 1950 are less applicable to expectations for the future than the
dominant causes of death from 1950 through 2000.
According to Lee and Carter (1992), the length of the mortality time
series was not critical as long as it was more than about 10–20 years.
However, Lee and Miller (2001) obtained better fits by restricting the
start of the calibration period to 1950 in order to reduce structural shifts.
Specifically, in their evaluation of the Lee–Carter method, Lee and Miller
(2001) have noted that for US data the forecast was biased when using
the fitting period 1900–1989 to forecast the period 1990–1997. The main
source of error was the mismatch between fitted rates for the last year
of the fitting period (1989 in their study) and actual rates in that year.
This is why a bias correction is applied. It was also noted that the βx
pattern did not remain stable over the whole 20th century. In order to
obtain more stable βx ’s, Lee and Miller (2001) have adopted 1950 as
the first year of the fitting period. Their conclusion is that restricting the
Figure 5.5. Estimated αx , βx , and κt (from left to right), x = 0, 1, . . . , 104 (top panels) and x = 60, 61, . . . , 104 (bottom panels), t = 1920, 1921, . . . , 2005, obtained with smoothed HMD death rates by minimizing the sum of squares (5.5) with the resulting κt ’s adjusted by refitting to the period life expectancies at birth (corresponding values obtained without smoothing are displayed in broken line).
fitting period to 1950 on avoids outlier data. Similarly, Lundström and
Qvist (2004) have reduced the 1901–2001 period to the past 25 years with
Swedish data.
Baran and Pap (2007) have applied the Lee–Carter method to forecast
mortality rates in Hungary for the period 2004–2040 on the basis of either
mortality data between 1949 and 2003 or on a restricted data set corre-
sponding to the period 1989–2003. The model fitted to the data of the
period 1949–2003 forecasts increasing mortality rates for men between
ages 45 and 55, indicating that the Lee–Carter method may not be appli-
cable for countries where mortality rates exhibit trends as peculiar as in
Hungary. However, models fitted to the data for the last 15 years both
for men and women forecast decreasing trends, which are similar to the
case of countries where the method has been successfully applied. This
clearly shows that the selection of an optimal fitting period is of paramount
importance.
5.5.2 Selection procedure
Booth et al. (2002) have designed procedures for the selection of an optimal calibration period which identify the longest period for which the estimated mortality index parameter κt is linear. Specifically, these authors
seek to maximize the fit of the overall model by restricting the fitting period
in order to maximize the fit to the linearity assumption. The choice of the
fitting period is based on the ratio of the mean deviances of the fit of the
underlying Lee–Carter model to the overall linear fit. This ratio is computed
by varying the starting year (but holding the jump-off year fixed) and the
chosen fitting period is that for which the ratio is substantially smaller than
for periods starting in previous years.
More specifically, Booth et al. (2002) assume, a priori, that the trend in the adjusted κ̂t ’s is linear, based on the ‘universal pattern’ of mortality decline that has been identified by several researchers, including Lee and Carter (1992) and Tuljapurkar and Boe (2000). When the κ̂t ’s depart from linearity, this assumption may be better met by appropriately restricting the fitting period. As noted above, the ending year is kept equal to tn and the fitting period is then determined by the starting year (henceforth denoted as tstart ).
Restricting the fitting period to the longest recent period (tstart , tn ) for which the adjusted κ̂t ’s do not deviate markedly from linearity has several advantages. Since systematic changes in the trend in κ̂t are avoided, the uncertainty in the forecast is reduced accordingly. Moreover, the βx ’s are likely to better satisfy the assumption of time invariance.
Finally, the estimate of the drift parameter more clearly reflects the recent
experience.
An ad hoc procedure for selecting tstart has been suggested in Denuit and Goderniaux (2005). Precisely, the calendar year tstart ≥ t1 is selected in such a way that the series {κ̂t , t = tstart , tstart + 1, . . . , tn } is best approximated by a straight line. To this end, the adjustment coefficient R2 (which is the classical goodness-of-fit criterion in linear regression) is maximized (as a function of the number of observations included in the fit).
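A minimal sketch of this ad hoc selection rule, under our own assumptions (in particular a `min_len` guard on the shortest admissible fitting period), could read:

```python
import numpy as np

def select_start_year(kappa, years, min_len=10):
    """Pick t_start: hold the end year fixed and return the start year whose
    trailing segment of kappa is best fitted by a straight line (highest R^2).
    min_len is an assumed lower bound on the fitting-period length."""
    best_r2, best_start = -np.inf, years[0]
    for s in range(len(years) - min_len + 1):
        t, k = years[s:], kappa[s:]
        slope, intercept = np.polyfit(t, k, 1)
        ss_res = np.sum((k - (slope * t + intercept)) ** 2)
        ss_tot = np.sum((k - k.mean()) ** 2)   # assumes the segment is not flat
        r2 = 1.0 - ss_res / ss_tot
        if r2 > best_r2:
            best_r2, best_start = r2, years[s]
    return best_start, best_r2
```

On a time index that bends away from linearity in its early years, the rule discards those years and keeps the recent, nearly linear segment.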
Note that in Denuit and Goderniaux (2005), the κt ’s are replaced by
a linear function of t and a parametric regression model (using a linear
effect term for the continuous covariate calendar time with an interaction
with the categorical variable age, together with a term for the categor-
ical variable age) is then used. Even if this approach produces almost
the same projections as the Lee–Carter method, it underestimates the
uncertainty in mortality forecasts. The resulting confidence intervals are
then artificially narrow because of the imposition of the linear trend in
the κt ’s.
The situation is slightly different in the Cairns–Blake–Dowd model. As
the time-varying parameters are estimated separately for each calendar year,
they remain unaffected if we modify the range of calendar years under
interest. Considering Fig. 5.4, we clearly see that the slope and intercept
parameters become linear only in the last part of the observation period
(especially for ages 60 and over). Therefore, it is natural to extrapolate their
future path on the basis of recent experience only. The approach suggested
by Denuit and Goderniaux (2005) is easily extended to the Cairns–Blake–
Dowd setting, by selecting the starting year as the maximum of the starting
years for each time factor. The deviance approach proposed by Booth et al.
(2002) can also easily be adapted to the Cairns–Blake–Dowd model.
Note, however, that the selection of the optimal fitting period is subject
to criticisms, in the sense that it could lead to an underestimation of the
uncertainty in forecasts, and artificially favours the Lee–Carter specifica-
tion. The same comment applies in the Cairns–Blake–Dowd approach. We
do not share this view, and we believe that the selection of the optimal
fitting period is an essential part of the mortality forecast.
5.5.3 Application to Belgian mortality statistics
We first consider the Lee–Carter fit. Applying the method of Booth et al.
(2002) gives tstart = 1978. The ad hoc method suggested in Denuit and
Goderniaux (2005) roughly confirms this choice. Restricting the age range
to 60 and over yields tstart = 1974. Again, the ad-hoc method agrees with
this choice.
Whereas the common practice would consist of taking all of the available
data 1920–2005, we discard here observations for the years 1920–1977
when all of the ages are considered, and observations for the years 1920–
1973 when the analysis is restricted to ages 60 and over. Here, short-term
trends are preferred even if long-term forecasts are needed for annuity pric-
ing. The reason is that past long-term trends are not expected to be relevant
to the long-term future. Note that the fact that the optimal fitting period is
selected on the basis of goodness-of-fit criteria to the linear model results
in relatively small deviations from this short-term linear trend, but the
shorter fitting period results in a more rapid widening of the confidence
intervals.
The final estimates based on observations included in the optimal fitting
period are displayed in Fig. 5.6 which plots the estimated αx , βx , and κt .
We see that the estimated αx ’s and κt ’s obtained with and without prior
smoothing closely agree whereas the estimated βx ’s are smoothed in an
appropriate way. The model explains 67.70% of the total variance for males
on the basis of unsmoothed data, 90.57% of the total variance for males
on the basis of smoothed data for ages 0–104. The model explains 92.62%
of the total variance for males on the basis of unsmoothed data, 95.74% of
the total variance for males on the basis of smoothed data for ages 60 and
over.
For the Cairns–Blake–Dowd model, the optimal projection periods now
become 1969–2005 when all of the ages are included in the analysis and
1979–2005 when ages are restricted to the range 60–104. Note that the esti-
mated time indices are not influenced by the restriction of the time period,
so that those displayed in Fig. 5.4 remain valid.
5.6 Analysis of residuals

5.6.1 Deviance and Pearson residuals

Since we work in a regression framework, it is essential to inspect the residuals. Model performance is assessed in terms of the randomness of the
residuals. A lack of randomness would indicate the presence of systematic
variations, such as age–time interactions. We note that the adjustment of
the κ̂t ’s in the Lee–Carter case may have introduced systematic changes to

Figure 5.6. Estimated αx , βx , and κt (from left to right), x = 0, 1, . . . , 104 (top panels) and x = 60, 61, . . . , 104 (bottom panels), obtained by minimizing the sum of squares (5.5) over the optimal fitting period 1978–2005 for ages 0–104 and 1974–2005 for ages 60–104 with smoothed HMD death rates (corresponding values obtained without smoothing are displayed in broken line).
the residuals so that the examination of model performance is in fact based on the residuals computed with the adjusted κ̂t ’s.
When the parameters are estimated by least-squares, Pearson residuals have to be inspected. In the Lee–Carter case, these residuals are given by

rxt = ε̂x (t) / √[ (1/((xm − x1 )(tn − t1 − 1))) Σx=x1..xm Σt=t1..tn ε̂x (t)² ]    (5.35)

where ε̂x (t) = ln m̂x (t) − (α̂x + β̂x κ̂t ). In the Cairns–Blake–Dowd case, these residuals are given by

rxt = ε̂x (t) / √[ (1/((xm − x1 − 1)(tn − t1 + 1))) Σx=x1..xm Σt=t1..tn ε̂x (t)² ]    (5.36)

where ε̂x (t) = ln( q̂x (t)/p̂x (t) ) − ( κ̂t[1] + κ̂t[2] x ).
If the residuals rxt exhibit some regular pattern, this means that the model
is not able to describe all of the phenomena appropriately. In practice,
looking at (x, t) → rxt , and discovering no structure in those graphs ensures
that the time trends have been correctly captured by the model.
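In code, (5.35) amounts to scaling the raw fitting errors by a pooled standard deviation. The helper below is our own sketch, with the divisor (xm − x1 )(tn − t1 − 1) written as (nx − 1)(nt − 2) for nx ages and nt calendar years:

```python
import numpy as np

def lc_pearson_residuals(lnm_hat, alpha, beta, kappa):
    """Pearson residuals (5.35) for a least-squares Lee-Carter fit: the raw
    errors eps_x(t) = ln m_x(t) - (alpha_x + beta_x kappa_t), scaled by the
    pooled standard deviation with divisor (x_m - x_1)(t_n - t_1 - 1)."""
    eps = lnm_hat - (alpha[:, None] + np.outer(beta, kappa))
    nx, nt = eps.shape
    dof = (nx - 1) * (nt - 2)       # (x_m - x_1) * (t_n - t_1 - 1)
    return eps / np.sqrt(np.sum(eps ** 2) / dof)
```

By construction the squared residuals sum to the divisor, so a map of rxt against age and calendar year (as in Fig. 5.7) should show values mostly within ±2 and, above all, no systematic pattern.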
With a Poisson, Binomial, or Negative Binomial random component, it is
more appropriate to consider the deviance residuals in order to monitor the
quality of the fit. These residuals are defined as the signed square root of the
contribution of each observation to the deviance statistics. These residuals
should also be displayed as a function of time at different ages, or as a
function of both age and calendar year.
5.6.2 Application to Belgian mortality statistics
We find that the residuals computed from the model fitted to ages 0–104
reveal systematic patterns and comparatively large values at young ages. In
the Lee–Carter case, the fit around the accident hump is very poor, with
large negative residuals for ages below 20. The residuals are positive for all
of the higher ages. The same phenomenon appears with the Cairns–Blake–
Dowd fit, with huge positive residuals around age 0. Overall, we find that
the inclusion of young ages significantly deteriorates the quality of the fit
at the higher ages. The presence of a trend in the residuals violates the
independence assumption and homoskedasticity does not hold as the graph
presents clustering. The large residuals before the accident hump suggest
that the Lee–Carter and Cairns–Blake–Dowd approaches are not able to
account for the particular mortality dynamics at younger ages. Since older
ages are the most relevant in pension and annuity applications, we restrict
the analysis to ages 60 and over.
Residuals can be displayed as a function of both age and calendar time,
and inspected with the help of maps as displayed in Fig. 5.7 for the Lee–
Carter fit (top panel) and for the Cairns–Blake–Dowd fit (bottom panel),
for ages 60–104. The particular patterns at oldest ages come from the
closing procedure applied to HMD mortality statistics, and do not invali-
date the fit. The residuals are unstructured, except for a moderate cohort
effect for generations reaching age 60 around 1980. Thus, apart from
these cohorts born just after World War I and the 1918–1920 influenza
epidemics, there is no significant diagonal pattern in the residuals. We
find that the cohort effect revealed in the residuals is too weak to reject
the age-period Lee–Carter model. In some countries, the cohort effects
are stronger and need to be included in the mortality modelling. This is
the case for instance in the United Kingdom, as will be seen in the next
chapter where cohort effects will be included in the models discussed in the
present chapter.
We now turn to the residuals for the Cairns–Blake–Dowd model. The
residuals are less dispersed than those for the Lee–Carter fit. The generations
born around 1920 again emerge as a notable feature in the residuals plot.
We now observe a clustering of negative residuals for the generations born
after this particular one, whereas positive residuals are associated with the
older generations. This suggests that the inclusion of a cohort effect could
be envisaged in the Cairns–Blake–Dowd setting. We postpone the analysis
of this kind of effect to the next chapter.
5.7 Mortality projection

5.7.1 Time series modelling for the time indices

An important aspect of both the Lee–Carter and the Cairns–Blake–Dowd methodology is that the time factor (κt in the Lee–Carter case, and (κt[1] , κt[2] )
in the Cairns–Blake–Dowd case) is intrinsically viewed as a stochastic pro-
cess. Box-Jenkins techniques are then used to estimate and forecast the time
factor within an ARIMA time series model. These forecasts in turn yield pro-
jected age-specific mortality rates, life expectancies and single premiums for
life annuities.
In the Lee–Carter model, the estimated κt ’s are viewed as a realization of
a time series that is modelled using the classical autoregressive integrated
moving average (ARIMA) models. Such models explain the dynamics of
a time series by its history and by contemporaneous and past shocks.
Figure 5.7. Residuals for Belgian males, Lee–Carter model, ages x = 60, 61, . . . , 104 (top panel), and Cairns–Blake–Dowd model, ages x = 60, 61, . . . , 104 (bottom panel).
The dynamics of the κt ’s is described by an ARIMA(p, d, q) process if it is stationary and

∇d κt = φ1 ∇d κt−1 + · · · + φp ∇d κt−p + ξt + ψ1 ξt−1 + · · · + ψq ξt−q    (5.37)

with φp ≠ 0, ψq ≠ 0, and where ξt is a Gaussian white noise process with variance σξ² > 0.
There are a few basic steps to fitting ARIMA models to time series data.
The main point is to identify the values of the autoregressive order p, the
order of differencing d, and the moving average order q. If the time index
is not stationary, then a first difference (i.e. d = 1) can help to remove the
time trend. If this proves unsuccessful then it is standard to take further
differences (i.e. investigate d = 2 and so on). Preliminary values of p and
q are chosen by inspecting the autocorrelation function and the partial
autocorrelation function of the κt ’s. More details can be found in standard
textbooks devoted to time series analysis.
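These identification steps can be mimicked in a few lines of numpy (a self-contained sketch on simulated data, rather than any particular software package): the slowly decaying sample ACF of a random walk with drift signals d = 1, and the near-zero ACF of its first difference suggests p = q = 0.

```python
import numpy as np

def sample_acf(x, nlags):
    """Sample autocorrelations r_1, ..., r_nlags of the series x."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.sum(x ** 2)
    return np.array([np.sum(x[:-k] * x[k:]) / denom
                     for k in range(1, nlags + 1)])

rng = np.random.default_rng(0)
kappa = np.cumsum(-1.0 + rng.normal(size=200))  # simulated random walk with drift
acf_level = sample_acf(kappa, 5)    # strong, slow decay: difference once (d = 1)
acf_diff = sample_acf(np.diff(kappa), 5)   # close to zero: suggests p = q = 0
```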
The appropriateness of the Lee–Carter approach has been questioned
by several authors. The rigid structure imposed by the model necessitates
the selection of an optimal fitting period (which is also conservative in
the context of life annuities, that is, it tends to overstate the expected
value of annuities). The Gaussian distributional assumption imposed on
the κt ’s means that large jumps are unlikely to occur. This feature can
be problematic for death benefits, where negative jumps correspond to
events which threaten the financial strength of the insurance company. For
instance, insurers currently are worrying about an avian influenza pan-
demic which could cause the death of many policyholders. On the basis of
vital registration data gathered during the 1918–1920 influenza pandemic,
extrapolations indicate that if the mortality were concentrated in a sin-
gle year, it would increase global mortality by 114%. However, neglecting
such jumps is conservative for life annuities. Positive jumps correspond-
ing to sudden improvements in mortality thanks to the availability of new
medical treatments are considered to be unlikely to occur, since it would
take some time for the population to benefit from these innovative treat-
ments. Hence, the assumptions behind the Lee–Carter model are compatible
with mortality projections for life annuity business, and we do not need to
acknowledge explicitly period shocks in the stochastic mortality model.
We note also that the optimal fitting period, as widely used, has tended to start after the three pandemics of the 20th century (1918–1920, 1957–1958, and 1968–1970).
5.7.2 Modelling of the Lee–Carter time index
5.7.2.1 Stationarity
Time series analysis procedures require that the variables being studied be
stationary. We recall that a time series is (weakly) stationary if its mean
and variance are constant over time, and the covariance for any two time
periods (t and t + k, say) depends only on the length of the interval between
the two time periods (here k), not on the starting time (here t).
Nonstationary series may be the result of two different data-generating
processes:
1. The non-stationarity can reflect the presence of a deterministic compo-
nent. Such a trending series can be rendered stationary by simply setting
up a regression on time and working on the resulting residuals. These
series are said to be trend stationary.
2. The non-stationarity can result from a ‘non-discounted’ accumulation
of stochastic shocks. In this case, stationarity may be achieved by differ-
encing the series one or more times. These series are said to be difference
stationary.
A first check for stationarity consists of displaying the data in graphic
form and in looking to see if the series has an upward or downward trend.
We have observed a gradually decreasing underlying trend in the estimated
κt ’s. The series of the estimated κt ’s is, thus, clearly not stationary: it tends
to decrease over time on average. Figure 5.8 displays the estimated auto-
correlation function of the estimated κt ’s (on the left panel). The classic
signature for a nonstationary series is a set of very strong correlations that
decay slowly as the lag length increases. Specifically, if the time series is
stationary, then its autocorrelation function declines at a geometric rate.
As a result, such processes have short-memory since observations far apart
in time are essentially independent. Conversely, if the time series needs to
be differenced once, then its autocorrelation function declines at a linear
rate and observations far apart in time are not independent. The sample autocorrelation coefficients of the κ̂t ’s in Fig. 5.8 clearly exhibit a linear decay which supports nonstationarity.
In addition to these graphical procedures, several formal tests for
(non)stationarity have been developed. Stationarity tests are for the null
hypothesis that a time series is trend stationary. Taking the null hypothesis
as a stationary process and differencing as the alternative hypothesis is in
accordance with a conservative testing strategy: if we reject the null hypoth-
esis then we can be confident that the series indeed needs to be differenced
(at least once). The Kwiatkowski–Phillips–Schmidt–Shin test with a linear
deterministic trend has a test-statistic equal to 0.168 with 3 lags, and 0.1529
with 9 lags. This leads to rejecting trend stationarity for males (at 5%) and
to the conclusion that the κt ’s need to be differenced.
Since the estimated κt ’s are difference stationary, we compute the first
differences of the estimated κt ’s for males and females. In order to check
whether a second difference is needed, we test the resulting series for
(non)stationarity using unit root tests. The Augmented Dickey–Fuller p-
value is less than 1%, so that we conclude that the first differences of the
κt ’s are stationary and so do not need further differencing.
Figure 5.8. Autocorrelation function (on the left) and partial autocorrelation function (on the right) of the estimated κt ’s obtained with completed data for the ages 60 and over.
5.7.2.2 Random walk with drift model for the time index
As no autocorrelation or partial autocorrelation coefficient of the differenced time index appears to be significantly different from 0, an ARIMA(0,1,0) process seems to be appropriate for the estimated
κt ’s. The Ljung–Box–Pierce test supports this model. Running a Shapiro–
Wilk test yields a p-value of 23.08%, which indicates that the residuals
seem to be approximately Normal. The corresponding Jarque-Bera p-value
equals 48.27%, which confirms that there is no significant departure from
Normality.
The previous analysis suggests that for Belgian mortality statistics, a ran-
dom walk with drift model is suitable for modelling the estimated κt ’s (as
is the case in many of the empirical studies in the literature). In this case,
the dynamics of the estimated κt ’s are given by

κt = κt−1 + d + ξt    (5.38)
where the ξt ’s are independent and Normally distributed with mean 0 and
variance σ 2 , and where d is known as the drift parameter. In this case,

κtn +k = κtn + kd + Σj=1..k ξtn +j    (5.39)
The point forecast of the time index is thus

κ̇tn +k = E[ κtn +k | κt1 , κt2 , . . . , κtn ] = κtn + kd    (5.40)
which follows a straight line as a function of the forecast horizon k, with
slope d. The conditional variance of the forecast is

Var[ κtn +k | κt1 , κt2 , . . . , κtn ] = kσ²    (5.41)
Therefore, the conditional standard errors for the forecast increase with the
square root of the distance of the forecast horizon k.
Using the random walk with drift model for forecasting κt is equivalent
to forecasting each age-specific death rate to decline at its own rate. Indeed,
it follows from (5.38) that the difference in expected log-mortality rates
between times t + 1 and t is

$$\ln \mu_x(t+1) - \ln \mu_x(t) = \beta_x \mathrm{E}[\kappa_{t+1} - \kappa_t] = \beta_x d \qquad (5.42)$$
The ratio of death rates in two subsequent years of the forecast is equal
to exp(βx d) and is thus invariant over time. The product βx d is therefore
equal to the rate of mortality change over time at age x. In such a case, the
parameter βx can be interpreted as a normalized schedule of age-specific
rates of mortality change over time.
It is important to notice that the future mortality age profile produced by
the Lee–Carter model always becomes less smooth over time, as pointed
out by Girosi and King (2007). This explains why this approach has
been designed to forecast aggregate demographic indicators, such as life
expectancies (or actuarial indicators like annuity values), and not future
period or cohort life tables. This comes from the fact that the forecast of
the log-death rates is linear over time from (5.42): as the βx ’s vary with age,
the age profile of log-mortality will eventually become less smooth over
time, since the distance between log-mortality rates in adjacent age groups
can only increase. Each difference in βx is amplified as we forecast further
into the future. Sometimes, the forecast lines converge for a period, but after
converging they cross and the age profile pattern becomes inverted.
The dynamics (5.38) ensures that κt −κt−1 , t = t2 , t3 , . . . , tn , are indepen-
dent and Normally distributed, with mean d and variance σ 2 . The maximum
likelihood estimators of d and σ 2 are given by the sample mean and vari-
ance of the κt − κt−1 ’s, that is, the maximum likelihood estimators of the
model parameters are
$$\hat d = \frac{1}{t_n - t_1} \sum_{t=t_2}^{t_n} \left( \hat\kappa_t - \hat\kappa_{t-1} \right) = \frac{\hat\kappa_{t_n} - \hat\kappa_{t_1}}{t_n - t_1} \qquad (5.43)$$

and

$$\hat\sigma^2 = \frac{1}{t_n - t_1} \sum_{t=t_2}^{t_n} \left( \hat\kappa_t - \hat\kappa_{t-1} - \hat d \right)^2 \qquad (5.44)$$

This gives $\hat d = -0.5867698$ and $\hat\sigma^2 = 0.3985848$ for Belgian males for the
optimal fitting period 1974–2005.
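To make the estimation and forecasting recipe of equations (5.40)–(5.44) concrete, the following Python sketch estimates the drift and variance from the first differences of a fitted time index and produces point forecasts with 90% bounds. The function name and the simulated κt series are ours (the simulation merely mimics the Belgian male estimates reported above), and the bounds ignore parameter uncertainty, as discussed in Section 5.8.

```python
import numpy as np

def rwd_forecast(kappa, horizon, z=1.645):
    """Random walk with drift fitted to an estimated time index.

    Implements (5.43)-(5.44) for the estimates and (5.40)-(5.41) for the
    point forecast and its conditional variance; z = 1.645 gives 90% bounds."""
    diffs = np.diff(kappa)
    d_hat = diffs.mean()                        # (5.43): mean of first differences
    sigma2_hat = ((diffs - d_hat) ** 2).mean()  # (5.44): ML variance estimate
    k = np.arange(1, horizon + 1)
    point = kappa[-1] + k * d_hat               # (5.40): linear in the horizon k
    se = np.sqrt(k * sigma2_hat)                # (5.41): grows like sqrt(k)
    return point, point - z * se, point + z * se

# Illustration on a simulated index with drift -0.59 and variance 0.40
rng = np.random.default_rng(0)
kappa = np.cumsum(-0.59 + np.sqrt(0.40) * rng.standard_normal(32))
point, lower, upper = rwd_forecast(kappa, horizon=25)
```

Note how the interval width grows like the square root of the horizon, reflecting (5.41).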
This approach is known as the ruler method of forecasting, as it connects
the first and last points of the available data with a ruler and then extends
the resulting line further in order to produce a forecast. Considering the
expression for $\hat d$, the actuary has to check the value of $\hat\kappa_{t_n}$ for reasonableness.
For instance, if a summer heat wave occurs during calendar year tn,
producing excess mortality at older ages, then $\hat\kappa_{t_n}$ might be implausibly high,
resulting in too small a value of $\hat d$ and biasing downwards the future improvements
in longevity (as noted by Lee (2000), Renshaw and Haberman (2003a),
and Renshaw and Haberman (2003c)). As noted above, Lee and Carter (1992)
did not prescribe the random walk with drift model for all situations. However,
this model has been judged to be appropriate in very many cases.
For instance, Tuljapurkar et al. (2000) find that the decline in the κt ’s is
in accordance with the random walk with drift model for the G7 coun-
tries. Even when a different model is indicated, the more complex model
is found to give results which are close to those obtained with the random
walk with drift.
Remark Building on the random walk with drift model for the κt's, Girosi
and King (2007) propose that we should model the $\ln \hat m_x(t)$'s directly using
a multivariate random walk with drift model. In this reformulation of
the Lee–Carter model, the drift vector and the covariance matrix of the
innovations are arbitrary. □
Remark Carter (1996) has developed a method in which the drift d in the
random walk forecasting equation for κt is itself allowed to be a random
variable. This is done using state-space methods for modelling time series.
Nevertheless, it is noteworthy that the forecasts and probability intervals
remain virtually unchanged compared to the simple random walk with drift
model. □
Remark Booth et al. (2006) compare the original Lee–Carter method
with the different adjustments for the estimated κt's, as well as the
extensions proposed by Hyndman and Ullah (2007) and De Jong and
Tickle (2006). They find that, from the forecasting point of view,
there are no significant differences between the five methods. See also
Booth et al. (2005). □
5.7.3 Modelling the Cairns–Blake–Dowd time indices
The analysis of each time index in isolation parallels the analysis performed
for the Lee–Carter time index. These preliminary results have now to be sup-
plemented with a bivariate analysis of the time series $\kappa_t = (\kappa_t^{[1]}, \kappa_t^{[2]})^{\mathsf{T}}$ that
goes beyond the scope of this book. When fitted to data, the changes over
time in κ t have often been approximately linear, at least in the recent past.
This suggests that the dynamics of the time factor κ t could be appropriately
described by a bivariate random walk with drift of the form
$$\kappa_t^{[1]} = \kappa_{t-1}^{[1]} + d_1 + \xi_t^{[1]}, \qquad \kappa_t^{[2]} = \kappa_{t-1}^{[2]} + d_2 + \xi_t^{[2]} \qquad (5.45)$$

where $d_1$ and $d_2$ are the drift parameters, and the $\xi_t = (\xi_t^{[1]}, \xi_t^{[2]})^{\mathsf{T}}$ are independent
bivariate Normally distributed random pairs, with zero mean and
variance-covariance matrix

$$\Sigma = \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{pmatrix} \qquad (5.46)$$

The drift parameters are estimated as

$$\hat d_i = \frac{\hat\kappa_{t_n}^{[i]} - \hat\kappa_{t_1}^{[i]}}{t_n - t_1}, \qquad i = 1, 2 \qquad (5.47)$$

the marginal variances are estimated as

$$\hat\sigma_i^2 = \frac{1}{t_n - t_1} \sum_{t=t_2}^{t_n} \left( \hat\kappa_t^{[i]} - \hat\kappa_{t-1}^{[i]} - \hat d_i \right)^2, \qquad i = 1, 2 \qquad (5.48)$$

and the covariance is estimated as

$$\hat\sigma_{12} = \frac{1}{t_n - t_1} \sum_{t=t_1}^{t_n - 1} \left( \hat\kappa_{t+1}^{[1]} - \hat\kappa_t^{[1]} - \hat d_1 \right) \left( \hat\kappa_{t+1}^{[2]} - \hat\kappa_t^{[2]} - \hat d_2 \right) \qquad (5.49)$$
This gives $\hat d_1 = -0.0757558$, $\hat d_2 = 0.0007619443$, $\hat\sigma_1^2 = 0.01563272$,
$\hat\sigma_2^2 = 3.3048 \times 10^{-6}$, and $\hat\sigma_{12} = -0.0002247978$ for Belgian males for the
period 1979–2005.
While a bivariate random walk with drift model has been used in connection
with the Cairns–Blake–Dowd approach to mortality forecasting, mean-reverting
alternatives might have a stronger biological justification. Andrew
Cairns pointed out in a personal communication that negative autocorrelation
coefficients of the $\hat\kappa_t^{[2]} - \hat\kappa_{t-1}^{[2]}$'s indicate that, at higher ages, good
years and bad years alternate. This can be explained as follows: if a flu epidemic
kills many of the frailer older people, it leaves behind the healthier
ones, so that mortality is low in the following year.
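The moment estimators (5.47)–(5.49) are straightforward to compute once the two fitted time indices are available. A minimal Python sketch follows; the function name is ours and the two simulated indices merely stand in for the estimated κt[1] and κt[2]:

```python
import numpy as np

def fit_bivariate_rwd(kappa):
    """Estimate the bivariate random walk with drift of (5.45)-(5.49).

    kappa: (n, 2) array whose columns hold the two time indices on
    consecutive years. Returns drifts, marginal variances, covariance."""
    diffs = np.diff(kappa, axis=0)                  # one-year increments
    d = diffs.mean(axis=0)                          # (5.47): the sum telescopes
    resid = diffs - d
    sigma2 = (resid ** 2).mean(axis=0)              # (5.48)
    sigma12 = (resid[:, 0] * resid[:, 1]).mean()    # (5.49)
    return d, sigma2, sigma12

# Two correlated simulated indices with drifts close to the Belgian estimates
rng = np.random.default_rng(1)
eps = rng.multivariate_normal([0, 0], [[0.016, -2e-4], [-2e-4, 4e-6]], size=27)
kappa = np.cumsum(np.array([-0.076, 0.00076]) + eps, axis=0)
d, sigma2, sigma12 = fit_bivariate_rwd(kappa)
```

Because the sum of increments telescopes, the drift estimator in (5.47) coincides with the sample mean of the one-year differences.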
5.8 Prediction intervals

5.8.1 Why bootstrapping?
The projections made so far, while interesting, reveal nothing about the
uncertainty attached to the future mortality. In forecasting, it is important
to provide information on the error affecting the forecasted quantities. In
the traditional demographic approach to mortality forecasting, a range of
uncertainty is indicated by high and low scenarios, around a medium fore-
cast that is intended to be a best estimate. However, it is not clear how to
interpret this high-low range unless a corresponding probability distribution
is specified.
In this respect, prediction intervals are particularly useful. This section
explains how to get such margins on demographic indicators in the Lee–
Carter setting. The ideas are easily extended to the Cairns–Blake–Dowd
setting.
In the current application, it is impossible to derive the relevant prediction
intervals analytically. The reason for this is that two very different sources
of uncertainty have to be combined: sampling errors in the parameters αx ,
βx , and κt , and forecast errors in the projected κt ’s. An additional compli-
cation is that the measures of interest – life expectancies or life annuity
premiums and reserves – are complicated non-linear functions of the
parameters αx, βx, and κt and of the ARIMA parameters. The key idea behind the
bootstrap is to resample from the original data (either directly or via a fitted
model) in order to create replicate data sets, from which the variability of the
quantities of interest can be assessed. Because this approach involves repeat-
ing the original data analysis procedure with many replicate sets of data, it
is sometimes called a computer-intensive method. Bootstrap techniques are
particularly useful when, as in our problem, theoretical calculation with the
fitted model is too complex.
If we ignore the other sources of errors, then the confidence bounds
on future κt ’s can be used to calculate prediction intervals for demo-
graphic indicators. Even though, for long-run forecasts (over 25 years), the
error in forecasting the mortality index clearly dominates the errors in
fitting the mortality matrix, prediction intervals based on κt alone seri-
ously understate the errors in forecasting over shorter horizons. We know
from Lee and Carter (1992), Appendix B, that prediction intervals based
on κt alone are a reasonable approximation only for forecast horizons
greater than 10–25 years. If there is a particular interest in forecasting over
the shorter term, then we cannot make a precise analysis of the forecast
errors.
Because of the importance of appropriate measures of uncertainty in an
actuarial context, the next sections aim to derive prediction intervals taking
into account all of the sources of errors. To fix the ideas, we will consider
a cohort life expectancy $e_x^{\uparrow}(t)$ as defined in Section 4.4.1 or in (5.57)
below, but the approach is easily adapted to other demographic or actuarial
indicators.
5.8.2 Bootstrap percentiles confidence intervals
To avoid any (questionable) Normality assumption, we use the bootstrap
percentile interval to construct confidence intervals for the predicted life
expectancy. The bootstrap procedure yields B samples of the αx, βx, and κt
parameters, denoted as $\alpha_x^b$, $\beta_x^b$, and $\kappa_t^b$, b = 1, 2, . . . , B. This procedure can
be carried out in several ways:
Monte Carlo simulation Brouhns et al. (2002b) and Brouhns et al. (2002a)
sample directly from the approximate multivariate Normal distribution
of the Poisson maximum likelihood estimators $(\hat\alpha, \hat\beta, \hat\kappa)$, that is, those
obtained by maximizing the log-likelihood (5.21). Invoking the large-sample
properties of maximum likelihood estimators, we know that
$(\hat\alpha, \hat\beta, \hat\kappa)$ is asymptotically multivariate Normally distributed, with mean
(α, β, κ) and covariance matrix given by the inverse of the Fisher information
matrix, whose elements equal minus the expected value of the
second derivatives of the log-likelihood with respect to the parameters
of interest. We refer to Appendix B in Brouhns et al. (2002b) for the
expression of the information matrix and for how to sample values from
the multivariate Normal distribution of interest.
As reported by Renshaw and Haberman (2008), the results of this first
approach rely heavily on the identifiability constraints. Given that the
choice of constraints is not unique and that this choice materially affects
the resulting simulations, this first approach should not be used for risk
assessment purposes unless there are compelling reasons for selecting a
particular set of identifiability constraints.
Poisson bootstrap Starting from the observations (ETRxt, Dxt), Brouhns
et al. (2005b) create B bootstrap samples (ETRxt, $D_{xt}^b$), b = 1, 2, . . . , B,
where the $D_{xt}^b$'s are realizations from the Poisson distribution with mean
ETRxt $\hat\mu_x(t)$ = Dxt. The bootstrapped death counts $D_{xt}^b$ are thus obtained
by applying a Poisson noise to the observed numbers of deaths. For each
bootstrap sample, the αx's, βx's, and κt's are estimated.
Residuals bootstrap Another possibility is to bootstrap from the residuals
of the fitted model, as suggested by Koissi et al. (2006). The residuals should
be independent and identically distributed (provided that the model is
well specified). From these, it is possible to reconstitute bootstrapped
residuals, and then bootstrapped mortality data. A good fit, resulting
in a set of pattern-free random residuals for sampling repeatedly with
replacement, is a basic requirement for this approach. When this is not
the case, distortions can occur in the simulated histogram of the quantity
of interest.
Specifically, we create the matrix R of residuals, with elements rxt
as defined in Section 5.6. Then, we generate B replications $R^b$, b =
1, 2, . . . , B, by sampling with replacement from the elements of the matrix R.
The inverse formula for the residuals is then used to obtain the corresponding
matrices of death counts $D_{xt}^b$; we refer the reader to Koissi
et al. (2006) for further explanation about the inversion of the residuals,
as well as to Renshaw and Haberman (2008) for further comments.
This leads to the computation of B sets of estimated parameters $\hat\alpha_x^b$, $\hat\beta_x^b$,
and $\hat\kappa_t^b$.
We then estimate the time series model using the $\kappa_t^b$'s as data points. This
yields a new set of estimated ARIMA parameters. We can then generate a
projection $\kappa_t^b$, t ≥ tn + 1, using these ARIMA parameters. The future errors
$\xi_t^b$ are sampled from a univariate Normal distribution with mean 0 and
standard deviation $\hat\sigma_\varepsilon^b$. The κt's are thus projected on the basis of the
re-estimated ARIMA model; note that we do not select a new ARIMA
model, but keep the ARIMA model selected on the basis of the original
data. Nevertheless, the parameters of this model are re-estimated with the
bootstrapped data.
The first step is meant to take into account the uncertainty in the parameters
αx, βx, and κt. The second step deals with the fact that the
uncertainty in the ARIMA parameters depends on the uncertainty in the
αx, βx, and κt parameters. The third step ensures that the uncertainty
of the forecasted κt's depends not only on the ARIMA standard error, but
also on the uncertainty of the ARIMA parameters themselves. Finally, in the
computation of the relevant measures in step four, all sources of uncertainty
are taken into account.
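The steps above can be sketched end to end in Python. For brevity this sketch uses the Poisson bootstrap variant and a classical SVD-based Lee–Carter fit in place of the Poisson maximum-likelihood fit of the chapter, and re-estimates only the random walk with drift rather than a general ARIMA model; all function names and the synthetic data are illustrative assumptions of ours.

```python
import numpy as np

def fit_lee_carter(log_m):
    """SVD-based Lee-Carter fit (a simplified stand-in for the Poisson
    maximum-likelihood fit used in the chapter)."""
    alpha = log_m.mean(axis=1)
    U, s, Vt = np.linalg.svd(log_m - alpha[:, None], full_matrices=False)
    beta, kappa = U[:, 0], s[0] * Vt[0]
    beta, kappa = beta / beta.sum(), kappa * beta.sum()   # sum(beta) = 1
    alpha, kappa = alpha + beta * kappa.mean(), kappa - kappa.mean()
    return alpha, beta, kappa

def bootstrap_kappa_paths(D, ETR, B, horizon, rng):
    """Poisson bootstrap: resample deaths, refit the model, re-estimate
    the random walk with drift, and simulate future kappa's, so that
    sampling error and forecast error are both propagated."""
    paths = np.empty((B, horizon))
    k = np.arange(1, horizon + 1)
    for b in range(B):
        Db = rng.poisson(D)                          # noise around observed deaths
        _, _, kap = fit_lee_carter(np.log(np.maximum(Db, 0.5) / ETR))
        diffs = np.diff(kap)
        d_b, s_b = diffs.mean(), diffs.std()         # re-estimated drift and sigma
        shocks = rng.normal(0.0, s_b, horizon)
        paths[b] = kap[-1] + d_b * k + np.cumsum(shocks)
    return paths

# Tiny synthetic example: 30 ages x 25 years of declining mortality
rng = np.random.default_rng(2)
ages, years = np.arange(60, 90), np.arange(25)
true_log_m = -4.0 + 0.05 * (ages[:, None] - 60) - 0.02 * years[None, :]
ETR = np.full((30, 25), 10_000.0)
D = rng.poisson(ETR * np.exp(true_log_m))
paths = bootstrap_kappa_paths(D, ETR, B=50, horizon=20, rng=rng)
```

Each row of `paths` is one bootstrapped future trajectory of the time index; feeding each into the life-table computations yields the bootstrap distribution of the demographic indicator of interest.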
This yields B realizations $\alpha_x^b$, $\beta_x^b$, $\kappa_t^b$ and projected $\kappa_t^b$'s, on the basis of which
we can compute the measure of interest $e_x^{\uparrow}(t)$. Assume that B bootstrap
estimates $e_x^{\uparrow b}(t)$, b = 1, 2, . . . , B, have been computed. The $(1 - 2\alpha)$ percentile
interval for $e_x^{\uparrow}(t)$ is given by $\left( e_x^{\uparrow b(\alpha)}(t), e_x^{\uparrow b(1-\alpha)}(t) \right)$, where $e_x^{\uparrow b(\zeta)}(t)$ is the $100 \times \zeta$th
empirical percentile of the bootstrapped values for $e_x^{\uparrow}(t)$, which is equal
to the $(B \times \zeta)$th value in the ordered list of replications $e_x^{\uparrow b}(t)$, b = 1, 2, . . . , B.
For instance, in the case of B = 1,000 bootstrap samples, the 0.95th and the
0.05th empirical percentiles are, respectively, the 950th and 50th numbers
in the increasing ordered list of 1,000 replications of $e_x^{\uparrow}(t)$.
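In code, the percentile interval is a one-liner once the bootstrap replications are available. Here the Normal sample is only a stand-in for the actual bootstrap output (its mean and standard deviation mimic the Belgian results reported below):

```python
import numpy as np

rng = np.random.default_rng(3)
# Stand-in for B = 10,000 bootstrapped cohort life expectancies e65(2006)
e65_boot = rng.normal(18.05, 0.38, size=10_000)

# 90% percentile interval: the 5th and 95th empirical percentiles,
# i.e. roughly the 500th and 9,500th values in the ordered list
lower, upper = np.quantile(e65_boot, [0.05, 0.95])
```

`np.quantile` sorts and interpolates internally, so no explicit ordering of the replications is needed.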
Note that these bootstrap procedures account for parameter uncertainty
as well as Arrowian uncertainty (also known as risk, in which the set
of future outcomes is known and probabilities can be assigned to each
of the possible outcomes based on a known model with known parame-
ters generating the distribution of future outcomes). Knightian uncertainty,
by comparison, acknowledges the presence of both model uncertainty and
parameter uncertainty. Allowing for model uncertainty would require the
consideration of several mortality projection models and the assignment to
these of probabilities in line with their relative likelihoods.
Remark Empirical studies conducted in Renshaw and Haberman (2008)
reveal varying magnitudes of the Monte Carlo based confidence and pre-
diction intervals under different sets of identifiability constraints. Such
diverse results are attributed by these authors to the over-parametrization
present in the model rather than to the non-linearity of the parametric
structure. □
5.8.3 Application to Belgian mortality statistics
In the approach proposed by Lee and Carter (1992), future age-specific
death rates are obtained using extrapolated κt's and fixed αx's and βx's, that
is, the pointwise projections $\dot\kappa_{t_n+s}$ of the $\kappa_{t_n+s}$'s, s = 1, 2, . . ., are inserted
into the formulas giving the force of mortality and provide

$$\dot\mu_x(t_n + s) = \exp\left( \hat\alpha_x + \hat\beta_x \dot\kappa_{t_n+s} \right) \qquad (5.50)$$
In this case, the jump-off rates (i.e. the rates in the last year of the fitting
period or jump-off year) are fitted rates. The basic Lee–Carter method has
been criticized by Bell (1997) for the fact that a discontinuity is possible
between the observed mortality rates and life expectancies for the jump-
off year and the forecast values for the first year of the forecast period.
The bias arising from this discontinuity would then persist throughout the
forecast.
As suggested by Bell (1997), Lee and Miller (2001), Lee (2000), Renshaw
and Haberman (2003a), Renshaw and Haberman (2003c), the forecast
could be started with observed death rates rather than with fitted ones.
This would help to eliminate a jump between the observed and forecasted
death rates in the first year of the forecast as the model does not fit age-
specific death rates exactly in the last year. If the fitting period is sufficiently
long, then the difference between the observed and the fitted death rates
can be appreciable. Specifically, the forecast mortality rates are aligned to
the latest available empirical mortality rates as
$$\dot\mu_x(t_n + s) = \hat m_x(t_n) \exp\left( \hat\beta_x \left( \dot\kappa_{t_n+s} - \hat\kappa_{t_n} \right) \right) = \hat m_x(t_n)\,\mathrm{RF}(x, t_n + s) \qquad (5.51)$$

Note that here, $\hat m_x(t_n)$ denotes the death rate observed at age x in year tn
and not the fitted one, and RF denotes the reduction factor as introduced
in equation (4.6).
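Aligning the forecast to observed jump-off rates, as in (5.51), amounts to applying a reduction factor to the last observed rates. A minimal sketch (function name and the small numerical inputs are ours):

```python
import numpy as np

def aligned_forecast(m_last, beta, kappa_fcst, kappa_last):
    """Forecast death rates aligned to the observed jump-off rates (5.51):
    RF(x, t_n + s) = exp(beta_x (kappa_{t_n+s} - kappa_{t_n})) is applied
    to the observed, not the fitted, rates of the jump-off year."""
    rf = np.exp(np.outer(beta, kappa_fcst - kappa_last))   # ages x horizons
    return m_last[:, None] * rf

# At horizon 0 (kappa unchanged) the forecast reproduces the observed rates
m_last = np.array([0.010, 0.016, 0.025])        # observed jump-off rates
beta = np.array([0.04, 0.03, 0.02])
kappa_fcst = np.array([-12.0, -12.6, -13.2])    # projected kappa's
rates = aligned_forecast(m_last, beta, kappa_fcst, kappa_last=-12.0)
```

By construction there is no discontinuity between the jump-off year and the first forecast year, which is exactly the point of this alignment.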
If the latest empirical mortality rates were judged to be atypical in level
or shape, an alternative would be to average across a few years at the end
of the observation period, or to resort to a recent population life table,
as advocated by Renshaw and Haberman (2003a,c). In the example here,
we use the Belgian 2002–2004 population life table released in 2006 by
Statistics Belgium as the base for the mortality forecast.
Here, we bootstrap the residuals displayed in Figure 5.7 (top panel). With
10,000 replications, we obtain the histograms displayed in Figure 5.9 for the
cohort life expectancies $e_{65}^{\uparrow}(2006)$ for males. The point estimate is 18.17.
The mean of the bootstrapped values is 18.05, with a standard deviation
of 0.3802. The bootstrap percentile confidence interval at level 90% is
(17.41183, 18.66094).
We have also applied a Poisson bootstrap. The results are shown
in Fig. 5.9, lower panel. The mean and the standard deviations are
Figure 5.9. Histograms for the 10,000 bootstrapped values of the cohort life expectancies at age 65 in year 2006 for the general population, males: residuals bootstrap in the top panel, Poisson bootstrap in the bottom panel.
almost equal to those of the residuals bootstrap (respectively, 18.03 and
0.3795). The bootstrap percentile confidence interval at level 90% is
(17.40993, 18.65580). The histograms obtained with the Poisson bootstrap
and with the residuals bootstrap have very similar shapes, and the
confidence intervals closely agree.
5.9 Forecasting life expectancies
In this section, we consider the computation of projected life expectancies at
retirement age 65, obtained from the Lee–Carter and Cairns–Blake–Dowd
approaches, by replacing the death rates with their forecasted values. More-
over, the results are then compared with other projections performed for
the Belgian population.
5.9.1 Official projections performed by the Belgian Federal Planning Bureau (FPB)

The FPB was asked in 2003 by the Pension Ministry to produce (in collab-
oration with Statistics Belgium) projected life tables to be used to convert
pension benefits into life annuities in the second pillar. A working party was
set up by the FPB with representatives from Statistics Belgium, BFIC, the
Royal Society of Belgian Actuaries and UCL. The results are summarized
in the Working Paper 20-04 available from http://www.plan.be.
The FPB model specifies qx (t) = exp(αx + βx t) where αx = ln qx (0) and
βx is the rate of decrease of qx (t) over time. Thus, each age-specific death
probability is assumed to decline at its own exponential rate. The αx ’s and
βx ’s are first estimated by the least-squares method, that is, minimizing the
objective function
$$\sum_{x=x_1}^{x_m} \sum_{t=t_1}^{t_n} \left( \ln \hat q_x(t) - \alpha_x - \beta_x t \right)^2 \qquad (5.52)$$
Then, the resulting $\hat\beta_x$'s are smoothed using geometric averaging. Finally,
the $\hat\alpha_x$'s are adjusted to represent the recent mortality experience. This
methodology is similar to the generalized linear modelling regression-based
approach proposed by Renshaw and Haberman (2003b).
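The least-squares step in (5.52) is an ordinary regression of ln q̂x(t) on calendar time, carried out age by age. A sketch in Python (the subsequent smoothing and adjustment steps are omitted; the function name and test data are ours):

```python
import numpy as np

def fit_fpb(log_q, years):
    """Fit q_x(t) = exp(alpha_x + beta_x t) by least squares, minimizing
    (5.52) separately for each age x. log_q: one row per age, one column
    per year. Returns the alpha_x's and beta_x's."""
    t = np.asarray(years, dtype=float)
    X = np.column_stack([np.ones_like(t), t])          # intercept and trend
    coef, *_ = np.linalg.lstsq(X, np.asarray(log_q).T, rcond=None)
    return coef[0], coef[1]                            # alpha_x's, beta_x's

# On exact log-linear data the parameters are recovered exactly
years = np.arange(1970, 2006)
alpha_true, beta_true = np.array([-6.0, -5.5]), np.array([-0.02, -0.015])
log_q = alpha_true[:, None] + beta_true[:, None] * years[None, :]
alpha_hat, beta_hat = fit_fpb(log_q, years)
```

A negative βx corresponds to an exponential decline of the death probability at age x, which is exactly the FPB assumption.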
The death rates $\hat m_x(t)$ and the death probabilities $\hat q_x(t)$ are typically very
close to one another in value. This is why we would expect that the FPB
approach would lead to similar projections to the Lee–Carter method once
the optimal fitting period has been selected. However, no such selection is
performed in the FPB analysis, which may result in some differences in the
forecasts.

5.9.2 Andreev–Vaupel projections
The method used by Andreev and Vaupel (2006) is based on Oeppen and
Vaupel (2002). Plotting the highest period female life expectancy attained
for each calendar year from 1840 to 2000, Oeppen and Vaupel (2002) have
noticed that the points fall close to a straight line, starting at about 45 years
in Sweden and ending at about 85 years in Japan. They find that record
female life expectancy has increased linearly by 2.43 years per decade from
1840 to 2000 (with a coefficient of determination R² = 99.2%). The record
male life expectancy has increased linearly from 1840 to 2000 at a rate of
2.22 years per decade (with R2 = 98%). Moreover, there is no indication
of either an acceleration or deceleration in the rates of change. If the trend
continues, they predict that female record life expectancy will be 97.5 by
mid-century and 109 years by 2100. Life expectancy can be forecast for a
given country by considering the gap between national performance and
the best-practice level. See also Lee (2003).
Andreev and Vaupel (2006) combine the approach due to Oeppen and
Vaupel (2002) with the Lee–Carter model to gain stability over the long
run. More precisely, they assume that the linear trend in the best practice
female life expectancy continues into the future and also that the differ-
ence between the life expectancy of a particular country and the general
trend stays constant over time. Then, the life expectancy at birth can be
forecast as
$$e_0^{\uparrow}(t) = e_0^{\uparrow}(t_n) + s(t - t_n) \qquad (5.53)$$

where s is the pace of increase in the best-practice life expectancy over time
that has been estimated by Oeppen and Vaupel (2002) and $e_0^{\uparrow}(t)$ is the life
expectancy at birth in the particular country. Andreev and Vaupel (2006)
do not use separate values of s for males and females but the female value
of 0.243 for both genders.
Andreev and Vaupel (2006) consider ages 50 and over so that they need to
deduce the value of $e_{50}^{\uparrow}(t)$ from $e_0^{\uparrow}(t)$. To do so, they start with a forecast of
death rates by the linear decline model (according to which each age-specific
death rate is forecasted to decline at its own independent rate) along the
lines of
$$\dot\mu_x(t_n + s) = \hat\mu_x(t_n) \exp(-g_x s) \qquad (5.54)$$
µx (tn ) exp(−gx s) (5.54)
where gx is the annual rate of decline for the mortality rate at age x.
Then, the forecasted death rates are multiplied by a constant factor so that
the life expectancy at birth matches the $e_0^{\uparrow}(t)$ values coming from (5.53).
The value of $e_{50}^{\uparrow}(t)$ is then obtained from these adjusted death rates.
Given the estimated value of $e_{50}^{\uparrow}(t)$, we need to calculate the set of mortal-
ity rates at ages over 50 that correspond to this value. Andreev and Vaupel
(2006) use the Kannisto model
$$\mu_x(t) = \frac{a_t \exp(b_t x)}{1 + a_t \exp(b_t x)} \qquad (5.55)$$
which is fitted to data for ages 50 and over by the method of Poisson
maximum likelihood. The at ’s are then projected into the future from
the linear model
$$\ln a_t = \beta_0 + \beta_1 t \qquad (5.56)$$
Then, for each t ≥ tn + 1, the parameter bt is determined to match $e_{50}^{\uparrow}(t)$
given the value of at obtained from (5.56).
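Determining bt so that the Kannisto rates reproduce a target life expectancy is a one-dimensional root-finding problem, because the implied life expectancy at 50 decreases monotonically in bt. The Python sketch below makes simplifying assumptions (period rather than cohort rates, a plain bisection instead of a library solver, and names of our own choosing):

```python
import math

def kannisto_mu(x, a, b):
    """Kannisto force of mortality (5.55): logistic in age."""
    z = a * math.exp(b * x)
    return z / (1.0 + z)

def e50(a, b, max_age=120):
    """Life expectancy at age 50 implied by the Kannisto rates, with
    piecewise-constant forces on one-year age intervals."""
    total, surv = 0.0, 1.0
    for x in range(50, max_age):
        mu = kannisto_mu(x, a, b)
        total += surv * (1.0 - math.exp(-mu)) / mu
        surv *= math.exp(-mu)
    return total

def solve_b(a, target, lo=0.05, hi=0.20, tol=1e-10):
    """Bisection for the b_t matching a target e50, given a_t from (5.56);
    e50 is decreasing in b, so the bracket (lo, hi) shrinks onto the root."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if e50(a, mid) > target:
            lo = mid          # expectancy too high: mortality must steepen
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Round-trip check: recover the b that generated the target
a = 2e-5
target = e50(a, 0.11)
b = solve_b(a, target)
```

Monotonicity guarantees a unique solution inside any bracket that straddles the target, which is why a simple bisection suffices here.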
This method may produce a jump in death rates. To avoid this drawback,
the death rates can be blended with the death rates produced by the Lee–
Carter method over a short period of time. Specifically, the Lee–Carter
model is fitted to data for ages 50 and over, and the estimated κt ’s are
adjusted by refitting to the $e_{50}^{\uparrow}(t)$'s. The bias correction ensures that the
forecasted death rates closely agree with the latest available death rates in
the first years of the forecast. The weight assigned to the Lee–Carter death
rates is 1 − k/n + 1 for years tn + k, k = 1, 2, . . . , n, where n is the length
of the blending period. The value of n ranges from 10 for countries where
the model (5.55) provides a good fit to 40 where this is not the case.

5.9.3 Application to Belgian mortality statistics
Life expectancies are often used by demographers to measure the evolution
of mortality. Specifically, $e_x^{\uparrow}(t)$ is the average number of years that an x-aged
individual in year t will survive, allowing for the evolution of mortality rates
with time after t. We thus expect that this person will die in year $t + e_x^{\uparrow}(t)$
at age $x + e_x^{\uparrow}(t)$. The formula giving $e_x^{\uparrow}(t)$ under (3.2) is

$$e_x^{\uparrow}(t) = \int_{\xi \geq 0} \exp\left( - \int_0^{\xi} \mu_{x+\eta}(t+\eta) \, d\eta \right) d\xi$$

$$= \frac{1 - \exp(-\mu_x(t))}{\mu_x(t)} + \sum_{k \geq 1} \left( \prod_{j=0}^{k-1} \exp\left(-\mu_{x+j}(t+j)\right) \right) \frac{1 - \exp\left(-\mu_{x+k}(t+k)\right)}{\mu_{x+k}(t+k)} \qquad (5.57)$$

It is interesting to compare (5.57) with the expression (3.18) previously
obtained for the period life expectancy. The actual computation of the pro-
jected cohort life expectancies at (retirement) age 65 is made using formula
(5.57) where the future death rates are replaced with their forecast values.
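Once the death rates along the cohort diagonal μx+k(t+k) have been forecast, evaluating (5.57) is a simple recursion over survival probabilities. A minimal sketch (the function name is ours; the rates must be supplied along the diagonal):

```python
import numpy as np

def cohort_life_expectancy(mu_diag):
    """Evaluate formula (5.57): mu_diag[k] = mu_{x+k}(t+k), the forecast
    force of mortality met by the cohort at age x+k in year t+k, assumed
    constant over each one-year age/period square."""
    total, surv = 0.0, 1.0
    for mu in mu_diag:
        total += surv * (1.0 - np.exp(-mu)) / mu   # expected time lived in the square
        surv *= np.exp(-mu)                        # survival to the next birthday
    return total

# Sanity check: with a constant force mu the expectancy tends to 1/mu
e = cohort_life_expectancy([0.1] * 1000)
```

Running the same recursion on each bootstrapped set of projected rates yields the histograms and percentile intervals of Section 5.8.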
First, the cohort life expectancies obtained in the Lee–Carter model are
compared with the values coming from the Cairns–Blake–Dowd forecast.
Then, the Lee–Carter projections are compared with two projections per-
formed for the Belgian population, by the Federal Planning Bureau and by
Andreev and Vaupel (2006).
Figure 5.10 displays the values of the cohort life expectancies at age
65 obtained from the Lee–Carter and Cairns–Blake–Dowd mortality pro-
jections. For the Lee–Carter forecast, we also present a 90% prediction
interval. We see that the values obtained from the two approaches are in
Figure 5.10. Forecast of cohort life expectancies at age 65 for the general population (circle) with 90% confidence intervals (gray-shaded area), together with values obtained from the Cairns–Blake–Dowd model (triangle).
close agreement, with slightly larger values coming from the Lee–Carter
approach.
Figure 5.11 displays the cohort life expectancies at age 65 resulting from
the Lee–Carter forecast for the general population, together with the official
FPB values and the corresponding values obtained by Andreev and Vaupel
(2006). The small differences (of less than 6 months) between the FPB forecasts
and the projections obtained in this chapter remain stable over time. The
official FPB forecasts lie inside the 90% confidence interval for the cohort
life expectancy at age 65. Hence, the FPB forecast is as plausible as the
Lee–Carter projection performed in this chapter. These two projections,
however, significantly differ from the results derived in Andreev and Vaupel
(2006), which are either implausibly small or become rapidly significantly
larger than the present forecasts.
Considering the values obtained by Andreev and Vaupel (2006) using the
Lee–Carter methodology, the differences relative to the forecast obtained
in the present study can be explained as follows. First, Andreev and Vau-
pel (2006) use age groups 50–54, 55–59,…, 100+ and not single years of
ages. Next, the optimal fitting period is not determined by Andreev and
Vaupel (2006) who routinely used 1950–2000. Finally, the forecast starts
with death rates observed in the last year with available data (i.e. 2000 in
their case). We see that the projections obtained in this chapter from the
Lee–Carter model after the optimal fitting period has been selected exceed
those produced by Andreev and Vaupel (2006) by the same methodology
Figure 5.11. Forecast of cohort life expectancies at age 65 for the general population (circle) with 90% confidence intervals (gray-shaded area), together with official FPB values (triangle), with values obtained by Andreev and Vaupel (2006) using the Lee–Carter methodology (square), and with values obtained by Andreev and Vaupel (2006) using the Oeppen and Vaupel (2002) modified methodology (diamond).
from calendar years 1950–2000. The selection of the optimal fitting period
may thus have a dramatic effect on the forecast, and is in line with the
conservative actuarial approach.
Andreev and Vaupel (2006) apply the same rate of decrease 0.243 for
both genders in order to forecast future life expectancies using the Oeppen–
Vaupel line of record life expectancies. The life expectancy at age 50 is
deduced from the projected life expectancy at birth using a forecast of death
rates by the linear decline model (i.e. letting each age-specific death rate
decline at its own independent rate by fitting a random walk with drift model
separately to the log of death rates in each age group). Finally, the Lee–
Carter projection is combined with a Kannisto model to produce projected
life tables. We see from Fig. 5.11 that this method yields a much higher life
expectancy at age 65 than the other approaches. Moreover, the speed of
improvement exceeds the other forecasts.
It is interesting to note that all of the mortality forecasting models con-
sidered in the present chapter (Lee–Carter with optimal fitting period,
Cairns–Blake–Dowd, FPB, and Oeppen-Vaupel) agree about the forecasts
of $e_{65}^{\uparrow}(t)$ in the next few years. Significant differences compared with the
Oeppen–Vaupel approach emerge from 2013, with this approach suggesting sig-
nificantly higher values for the life expectancy at retirement age than its
competitors.
5.9.4 Longevity fan charts
Following Blake and Dowd (2007), we produce longevity fan charts for
$e_{65}^{\uparrow}(2006 + t)$, t = 0, 1, . . . , 20, based on the residuals bootstrap (with B =
10,000). The result is shown in Fig. 5.12. Such charts depict some cen-
tral projection of the forecasted variable, together with bounds around this
showing the probabilities that the variable will lie within specified ranges.
The chart in Fig. 5.12 shows the central 10% prediction interval with the
heaviest shading surrounded by the 20%, 30%, . . ., 90% prediction inter-
vals with progressively lighter shading. The shading becomes stronger as
the prediction interval narrows. We can therefore interpret the degree of
shading as reflecting the likelihood of the outcome: the darker is the shad-
ing, the more likely is the outcome. The fan in Fig. 5.12 consists of 9 grey
bands of varying intensity. The upper and lower boundaries correspond to
paths of the forecast 95% and 5% quantiles, and the inner edges of the
bands in the fan correspond to the 10%, 15%, . . ., 90% quantiles. The
darkest band in the middle is bounded by the 45% and 55% quantiles.
Note that the quantiles are calculated for each year in isolation. The fan
chart in Fig. 5.12 shows that longevity risk is rather low. The question as to
whether these narrow confidence bounds are realistic remains an open one.
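Computationally, a fan chart reduces to taking, year by year, the empirical quantiles of the bootstrapped trajectories. A sketch with stand-in simulated paths (in practice the input would be the 10,000 bootstrapped trajectories of the cohort life expectancy):

```python
import numpy as np

rng = np.random.default_rng(4)
# Stand-in for 10,000 bootstrapped paths of e65(2006 + t), t = 0..20
paths = 18.2 + np.cumsum(rng.normal(0.12, 0.05, size=(10_000, 21)), axis=1)

# Quantiles are taken for each year in isolation; the rows of `bands`
# are the 5%, 10%, ..., 95% quantile paths bounding the shaded bands
qs = np.linspace(0.05, 0.95, 19)
bands = np.quantile(paths, qs, axis=0)
```

Plotting consecutive pairs of rows of `bands` as shaded regions, with darker shading towards the median, reproduces the fan chart layout described above.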

5.9.5 Back testing
Let us now forecast the period life expectancies for calendar years
1981–2005, 1991–2005, and 2001–2005 on the basis of the observations

Figure 5.12. Fan chart for the cohort life expectancies at age 65 e65 (2006 + t), t = 0, 1, . . . , 20,
for the general population.
Table 5.1. Summary of the Lee–Carter fit to the periods 1950–1980, 1950–1990, and 1950–2000.

                       1950–1980     1950–1990     1950–2000
Opt. fitting period    1960–1980     1968–1980     1968–1980
% var. explained       65.34         91.63         94.78
d̂                     −0.3526104    −0.6126091    −0.5541269
σ̂²                    5.509462      0.6228406     0.4072292


Figure 5.13. Observed period life expectancies at age 65 for the general population (circles), together with forecasts based on the 1950–1980, 1950–1990, and 1950–2000 periods (triangles) surrounded by 90% prediction intervals.

relating to calendar years 1950–1980, 1950–1990, and 1950–2000, respectively. We thus investigate the predictive power of the Lee–Carter approach if it were applied in the past to forecast future mortality for ages 60–104.
Table 5.1 summarizes the features of the Lee–Carter fits to each of these
three periods.
We see that the fit is rather poor when the observation period is restricted to 1950–1980, with only 65% of the total variance explained by the Lee–Carter decomposition. Also, the drift parameter indicates a higher yearly improvement in longevity for the two subsequent periods, and the estimated variance of the random walk with drift model is considerably larger for the 1950–1980 period compared with the two subsequent ones.
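The quantities d̂ and σ̂² in Table 5.1 are the usual estimates for a random walk with drift fitted to the period index κt: the mean and the sample variance of the first differences. A minimal sketch (with a synthetic κ series, not the chapter's estimates) follows:

```python
import numpy as np

def rwd_estimates(kappa):
    """Drift and innovation-variance estimates for the random walk with
    drift kappa_t = kappa_{t-1} + d + eps_t: d-hat is the mean of the
    first differences and sigma2-hat is their unbiased sample variance."""
    diffs = np.diff(np.asarray(kappa, dtype=float))
    return diffs.mean(), diffs.var(ddof=1)

# Synthetic declining index with true drift -0.55 (illustrative only):
rng = np.random.default_rng(0)
kappa = np.cumsum(np.concatenate(([0.0], -0.55 + rng.normal(0.0, 0.6, 40))))
d_hat, sigma2_hat = rwd_estimates(kappa)
```

With a longer fitting period the difference series is longer, so both estimates become more stable, which is consistent with the pattern in Table 5.1.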
Figure 5.13 displays the forecast of the period life expectancies at age 65,
together with observed values and 90% prediction intervals (grey areas,
with progressively heavier shading). We see that using 1950–1980 data
gives a point forecast far below the actual life expectancies observed during
1981–2005. Moreover, the prediction intervals are wider compared to the
1950–1990 and 1950–2000 periods. The Lee–Carter model would thus
clearly have underestimated the actual gains in longevity after 1980 on the
basis of the 1950–1980 observation period. The forecasts improve when the 1950–1990 and 1950–2000 periods are considered: the Lee–Carter model captures the trends in the observed period life expectancies, which remain within the prediction intervals.
6 Forecasting mortality: applications and examples of age-period-cohort models

6.1 Introduction
In this chapter, we consider the proposal that the models introduced in
Chapter 5 should be extended to include components that represent a cohort
effect, as well as how this proposal has been implemented. We illustrate
this implementation with a case study based on the UK experience. The
justification for this proposal comes initially from some descriptive studies
of mortality trends in the United Kingdom which demonstrate that there
is a strong birth cohort effect present. Thus, the Government Actuary’s
Department, which was responsible at the time for the official UK popula-
tion projections, has highlighted, in a series of reports (GAD 1995, 2001,
and 2002), the existence of cohorts in the United Kingdom who have expe-
rienced rapid mortality improvement, relative to those born previously or
more recently. The generations (of both sexes) born approximately between
1925 and 1945 (and centered on the generation with year of birth 1931)
seem to have experienced this more rapid mortality improvement.
Further evidence has come also from the Continuous Mortality Investiga-
tion Bureau (CMIB) in the United Kingdom. In an analysis of the mortality
experience of males with life insurance over an extended period, CMIB
(2002) notes the existence of a similar effect, although this seems to be
centered on a slightly earlier cohort, that is, that born in 1926. A similar
cohort effect is also noted in an investigation into the mortality rates of
male pensioners who are members of insured pension schemes – again the
highest rates of mortality decrease are noted for the 1926 cohort.
The reasons for this so-called cohort effect are not precisely understood.
A number of explanatory factors have been suggested in the literature – a
helpful review is provided by Willets (2004). Among the most plausible
factors are the following. First, the diet in the United Kingdom during the
1940s and early 1950s may have had a beneficial effect on the health of
children growing up during this time. Although this was a time of food
rationing, there is evidence that the average consumption of fresh vegeta-
bles, bread, milk, and fish was higher during those years than in a recent
period like the early 1990s – and, at the same time, average consumption of cheese and meat was lower. Second, the introduction of a universal social security system in 1948 (following the Beveridge Report of 1942), the introduction of free secondary school education for all in 1944, and the establishment of the National Health Service in 1948 meant that the social conditions for children growing up in the early 1950s would have been very different from those experienced by earlier generations. Third, there are
strong cohort-related patterns in mortality from diseases that are linked
to smoking, for example, lung cancer and heart disease (ONS 1997). It is
clear that in the United Kingdom (and elsewhere) different generations have
had different smoking histories. Those born around 1920 may have started
smoking during the 1930s; they may have been given free cigarettes during
World War II and would have been smoking for some considerable time
when the deleterious health impact of smoking was first identified in the late
1950s and early 1960s. There is a marked contrast with those born some
20 years later who would have reached adulthood just when these research
findings were being widely discussed.
Given the close association of lung cancer with smoking, Willets (2004)
also examines the patterns of cause-specific mortality rates from lung cancer
in the United Kingdom. He argues that, for males, lung cancer death rates
plotted for different cohorts indicate an upward trend for those born from
1870 onwards with the peak rates for those born between 1900 and 1905
and the greatest average annual improvement for those born in the period
1930–1935.
These findings are supported by published analyses; for example,
Evandrou and Falkingham (2002) have studied smoking prevalence rates.
They estimate that approximately 95% of men born in 1916–1920 had
smoked at some point by the time they reached age 60 – while, for the
cohort born in 1931–1935, the corresponding figure was 25%. Finally,
varying birth cohort sizes may confer benefits in that those born at a time
of low birth rates may acquire social and economic advantages relative to
those born at times of higher birth rates. In this regard, we note that the
period from 1925 to 1945 was a period of falling birth rates sandwiched
between the two post-war ‘baby booms’.
Other hypotheses have been suggested. For example, Catalano and
Bruckner (2006) have tested the ‘diminished entelechy hypothesis’ – this
postulates that cohorts, who experience relatively many or relatively virulent environmental insults (e.g. infectious diseases, extreme weather, poor diet in terms of quality and/or quantity) in their early years, then suffer a reduced subsequent life span. From a thorough time series analysis, they
find a positive association between mortality in the first five years of life and
average lifespan at age 5 for those born in Denmark (in the period 1835–
1913), England and Wales (in the period 1841–1912) and Sweden (in the
period 1751–1912). It is not clear, however, whether this hypothesis would be relevant to cohorts born later in the 20th century in these countries, where early childhood mortality rates have been much reduced through the control of infectious diseases via antibiotics and immunization. Other studies
have also looked at the role played by cohort effects in cause-specific rather
than all-cause mortality.
Similarly, Crimmins and Finch (2006) investigate the association between
exposure to infections and late life health outcomes within the same cohort.
In particular, they consider the relationship between mortality decline
among older persons in a cohort and earlier mortality decline in child-
hood within the same cohort, using childhood mortality as an index of
environmental exposure to infections. The analysis focuses on four west-
ern European countries (Sweden, England, France, and Denmark) where
cohort-based mortality data for cohorts born before 1900 is of high quality. The authors find that mortality declines among older persons tend to
occur in the same cohorts that had experienced mortality decline as children.
The choice of cohorts born before 1900 was made to avoid the confound-
ing influence of smoking, immunization, and antibiotics. Although this may
mean that the results have reduced relevance for developed countries, it is
clear that there are implications for developing countries where childhood
mortality has reduced markedly in the last few decades, suggesting, for
example, that cohort effects may emerge in the future.
Davy Smith et al. (1998) have looked at the association between adverse
social circumstances in childhood and adult mortality from a range of major
causes – they find a positive association between deprivation in childhood
and adult mortality from stroke and stomach cancer (and less strong asso-
ciations with coronary heart disease and respiratory disease). They suggest
that the re-emergence of child poverty in the last 20 years may well lead to
a cohort effect that will be observed in the future.
We find also that there is corresponding evidence from the United States,
Japan, and Germany that there is a cohort effect present in national mor-
tality data during the last 40 years, and also for Sweden over a much longer
period: see the analyses of data in Cairns et al. (2007), Richards et al. (2007),
and Willets (2004).
In the following sections, we investigate the extension of the Lee–Carter (LC) and the Cairns–Blake–Dowd models in order to incorporate a cohort effect. We thus build on the detailed discussions in Chapter 5 – in particular, in Sections 5.2 and 5.3.

6.2 LC age–period–cohort mortality projection model

6.2.1 Model structure

We begin the discussion by presenting the LC model in a wider setting. Given the important role played by the mortality reduction factors in generating mortality projections for actuarial applications, we emphasize the targeting of the mortality reduction factor, as opposed to the force of mortality.
Thus, while the parametric structure is expanded to allow for age-cohort as
well as the familiar LC effects, the error structure is imposed by specifying
the second moment properties of the model. This allows for a range of
options for the choice of error distribution including Poisson, both with
and without dispersion, as well as Gaussian, as used in the original LC
approach. We then review the methods of fitting such models and expand
on them. As in Chapter 5, extrapolation is conducted using the standard
approach advocated by LC of parametric time series forecasting.
As before, we let the random variable Dxt denote the number of deaths
in a population at age x and time t. A rectangular data array (dxt , ETRxt ) is
available for analysis where dxt is the actual number of deaths and ETRxt
is the matching exposure to the risk of death. The force of mortality and
empirical mortality rates are denoted by µx (t) and mx (t)(= dxt /ETRxt )
respectively. Cross-classification is by individual calendar year t ∈ [t1 , tn ]
(range n) and by age x ∈ [x1 , xm ], either grouped into m (ordered) cat-
egories, or by individual year (range m), in which case year-of-birth or
cohort year z = t − x ∈ [t1 − xm , tn − x1 ] (range n + m − 1) is defined. We
assume that this is the case throughout.
In terms of the force of mortality (as opposed to the central rate of
mortality), the LC model structure is
ln µx (t) = αx + βx κt (6.1)
subject to the most commonly used (but non-unique) constraints which are
adopted to ensure the identifiability of the parameters:

Σ_{t=t1}^{tn} κt = 0,    Σ_{x=x1}^{xm} βx = 1. (6.2)

As discussed in Chapter 5, the LC model structure reduces the dimension-


ality of the problem by identifying a single time index, which affects the
force of mortality at time t at all ages simultaneously. The first constraint


under (6.2) has the effect of centring the κt values over the range t ∈ [t1 , tn ].
The model structure is designed to capture age-period effects with the αx
terms incorporating the main age effects, averaged over time, and the bilin-
ear terms βx κt incorporating the age specific period trends (relative to the
main age effects). We re-write equation (6.1) in the following form:

µx (t) = exp{αx + ln RF(x, t)} (6.3)
in general, where specifically the mortality reduction factor RF
ln RF(x, t) = βx κt (6.4)
is defined under LC modelling. We subsequently adjust the constraints (6.2),
so that
ln RF(x, tn ) = 0, for all x (6.5)
when extrapolating mortality rates, as described in Chapter 4.
We now generalize the model structure in order to incorporate an age-
cohort term. Thus, we consider the age–period–cohort (APC) version of the
LC model, first introduced by Renshaw and Haberman (2006),
ln µx (t) = αx + βx(0) ιt−x + βx(1) κt (6.6)
and the related mortality reduction factor
 
RF(x, t) = exp{βx(0) ιt−x + βx(1) κt } (6.7)

with an extra pair of bilinear terms βx(0) ιt−x introduced in order to represent
the cohort effects. We can see that (6.6) is then a natural extension of
equation (4.98) which was introduced in Chapter 4.
It is clear that the structure represented by equations (6.6) and (6.7) gives
rise to a rich sub-structure of models:
Lee–Carter age–period (LC) model: βx(0) = 0
Age–Cohort (AC) model: βx(1) = 0

plus versions where either or both of the βx(j) = 1 for j = 0, 1: where the application of age adjustments to one or both of the main period-effects and cohort-effects terms is not found to be significant.
In formulating these structures, in each of the above cases, we may
partition the force of mortality as follows:
µx (t) = exp(αx )RF(x, t) (6.8)
that is into the product of a static term, representing the age profile
of mortality and incorporating the main age effects αx , and a dynamic
parameterized mortality reduction factor RF(x, t), which contains both the age-specific (κt ) and cohort (ιt−x ) effects.
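The decomposition (6.8), with the APC reduction factor (6.7), can be evaluated directly once the parameters are given. A minimal sketch follows; all parameter values below are made up for illustration and are not fitted estimates.

```python
import numpy as np

def mu_apc(alpha, beta0, beta1, iota, kappa, ages, years, t1):
    """Force of mortality under the APC structure (6.6)/(6.8):
    mu_x(t) = exp(alpha_x) * RF(x, t), with
    ln RF(x, t) = beta0_x * iota_{t-x} + beta1_x * kappa_t.

    iota is indexed by cohort z = t - x, offset so that iota[0]
    corresponds to the earliest cohort year t1 - max(ages)."""
    mu = np.empty((len(ages), len(years)))
    z0 = t1 - max(ages)                       # earliest cohort year
    for i, x in enumerate(ages):
        for j, t in enumerate(years):
            rf = np.exp(beta0[i] * iota[(t - x) - z0] + beta1[i] * kappa[j])
            mu[i, j] = np.exp(alpha[i]) * rf
    return mu

# Tiny illustrative grid (hypothetical values, not fitted estimates):
ages, years = [60, 61], [2000, 2001]
alpha = np.array([-4.0, -3.9])
beta0 = np.array([1.0, 1.0])
beta1 = np.array([1.0, 1.0])
kappa = np.array([0.0, -0.1])
iota = np.zeros(3)                            # cohorts 1939, 1940, 1941
mu = mu_apc(alpha, beta0, beta1, iota, kappa, ages, years, t1=2000)
```

With the cohort effects set to zero, the structure collapses to the plain age–period form, as expected.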

6.2.2 Error structure and model fitting

6.2.2.1 Introduction
In Chapter 5, a number of approaches to specifying the error structure
for the model and to model fitting have been described. In this section,
we set aside the standard least-squares and singular value decomposition
approach, and focus on selecting a Poisson response model and using
maximum likelihood estimation (as in Section 5.2.2.3).
Thus, we model the random number of deaths, Dxt , as a Poisson response
variable. As noted in the previous chapter, direct modelling of Dxt is very
useful in many practical applications where, for example, we might need to
simulate the future cash flows of a life annuity or pensions portfolio. We
allow also for over-dispersion and the allocation of prior weights, which
can be important in the presence of empty data cells. This is formalized by
following the approach of generalized linear models and by specifying the
first two moments of the responses Yxt where

Yxt = Dxt ,
E(Yxt ) = ETRxt µx (t) = ETRxt exp(αx )RF(x, t) (6.9)
Var(Yxt ) = φE(Yxt ) (6.10)

with a scale parameter φ, variance function V(E(Yxt )) = E(Yxt ) and prior


weights wxt = 1 (or 0 if the data cell is empty). Then under the log link, the
non-linear predictor ηxt is defined as

ln E(Yxt ) = ηxt = ln ETRxt + αx + ln RF(x, t) (6.11)

It is also of interest to note a possible alternative error structure: the original LC model with a Gaussian error structure is re-established on replacing (6.9) and (6.10) with
 
Yxt = ln(Dxt /ETRxt ), E(Yxt ) = αx + ln RF(x, t), Var(Yxt ) = φ/wxt . (6.12)
This formulation comprises a free standing scale parameter φ(=σ 2 ), vari-
ance function V(E(Yxt )) = 1 and prior weights wxt = 1. Then, under the
identity link, the non-linear predictor is given by

(E(Yxt ) =) ηxt = αx + βx κt (6.13)

which is the standard LC structure (6.1).


Given the non-linear nature of the parametric predictors (ηxt ), we focus on two alternative model-fitting procedures: Method A, which is based on an unpublished technical report by Wilmoth (1993); and Method B, which is based on a method of mortality analysis incorporating age–year interactions, from the field of medical statistics and attributable to James and Segal (1982), and which predates the LC model. We present Methods A and B in the context of the LC model and then go on to describe the fitting procedures for the new APC and AC versions of the model, which have been described above.

6.2.2.2 Fitting the LC model by Method A


We adapt the approach of Section 5.2.2.3, which is based on Wilmoth (1993) and Brouhns et al. (2002b), and obtain maximum likelihood estimates under the original LC Gaussian error structure given by (6.12), using an iterative process, which can be re-expressed as follows:

Set starting values α̂x , β̂x , κ̂t ; compute ŷxt

  update α̂x ; compute ŷxt
  update κ̂t , adjust s.t. Σ_{t=t1}^{tn} κ̂t = 0; compute ŷxt
  update β̂x ; compute ŷxt
  compute D(yxt , ŷxt )

Repeat the updating cycle; stop when D(yxt , ŷxt ) converges

where
yxt = ln m̂x (t), ŷxt = α̂x + β̂x κ̂t (6.14)

D(yxt , ŷxt ) = Σ_{x,t} dev(x, t) = Σ_{x,t} 2wxt ∫_{ŷxt}^{yxt} (yxt − u)/V(u) du = Σ_{x,t} wxt (yxt − ŷxt )² (6.15)
with weights

wxt = 1 if ETRxt > 0, and wxt = 0 otherwise. (6.16)
The updating of a typical parameter θ proceeds according to

updated(θ) = u(θ) = θ − (∂D/∂θ)/(∂²D/∂θ²) (6.17)
where D is the deviance of the current model. Table 6.1 provides fuller details. Effective starting values, conforming to the usual LC constraints (6.2), are κ̂t = 0, β̂x = 1/k, coupled with the SVD estimate

α̂x = (1/(tn − t1 + 1)) Σ_{t=t1}^{tn} ln mx (t) (6.18)

so that αx is estimated by the logarithm of the geometric mean of the empirical mortality rates. The model has ν = (k − 1)(n − 2) degrees of freedom.

Table 6.1. Parameter updating relationships

LC model, Gaussian:
  u(α̂x ) = α̂x + Σt wxt (yxt − ŷxt ) / Σt wxt
  u(κ̂t ) = κ̂t + Σx wxt (yxt − ŷxt )β̂x / Σx wxt β̂x²
  u(β̂x ) = β̂x + Σt wxt (yxt − ŷxt )κ̂t / Σt wxt κ̂t²

LC model, Poisson:
  u(α̂x ) = α̂x + Σt wxt (yxt − ŷxt ) / Σt wxt ŷxt
  u(κ̂t ) = κ̂t + Σx wxt (yxt − ŷxt )β̂x / Σx wxt ŷxt β̂x²
  u(β̂x ) = β̂x + Σt wxt (yxt − ŷxt )κ̂t / Σt wxt ŷxt κ̂t²

APC model, Gaussian:
  u(ι̂z ) = ι̂z + Σ_{x,t: t−x=z} wxt (yxt − ŷxt )β̂x(0) / Σ_{x,t: t−x=z} wxt (β̂x(0))²
  u(β̂x(0)) = β̂x(0) + Σt wxt (yxt − ŷxt )ι̂t−x / Σt wxt ι̂²t−x
  u(κ̂t ) = κ̂t + Σx wxt (yxt − ŷxt )β̂x(1) / Σx wxt (β̂x(1))²
  u(β̂x(1)) = β̂x(1) + Σt wxt (yxt − ŷxt )κ̂t / Σt wxt κ̂t²

APC model, Poisson: as for the Gaussian case, with an extra factor ŷxt in each denominator sum.

AC model:
  u(α̂x ) computed as for the LC model
  Gaussian:
    u(ι̂z ) = ι̂z + Σ_{x,t: t−x=z} wxt (yxt − ŷxt )β̂x / Σ_{x,t: t−x=z} wxt β̂x²
    u(β̂x ) = β̂x + Σt wxt (yxt − ŷxt )ι̂t−x / Σt wxt ι̂²t−x
  Poisson: as for the Gaussian case, with an extra factor ŷxt in each denominator sum.
This iterative fitting process generates maximum likelihood estimates under the Poisson error structure presented in (6.9) and (6.10) on setting

yxt = dxt , ŷxt = d̂xt = ETRxt exp(α̂x + β̂x κ̂t ) (6.19)

D(yxt , ŷxt ) = Σ_{x,t} dev(x, t) = Σ_{x,t} 2wxt ∫_{ŷxt}^{yxt} (yxt − u)/V(u) du
            = Σ_{x,t} 2wxt [yxt ln(yxt /ŷxt ) − (yxt − ŷxt )] (6.20)

As noted by Renshaw and Haberman (2003a, 2003b, 2006), we can


attribute the iterative method for estimating log-linear models with bilinear
terms to Goodman (1979). Table 6.1 provides full details of the parameter
updating relationships.
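As an illustration, the Poisson column of Table 6.1 for the LC model can be implemented as below. This is a minimal sketch on synthetic data, not the code underlying the analyses in this chapter; the function name and inputs are ours, and the constraints (6.2) are reimposed after each cycle in a way that leaves the fitted values unchanged.

```python
import numpy as np

def fit_lc_poisson(d, etr, tol=1e-8, max_iter=2000):
    """Fit the LC model under the Poisson error structure using the
    univariate Newton updates of Table 6.1 (Method A).  d, etr:
    (ages x years) arrays of deaths and exposures, no empty cells."""
    m, n = d.shape
    alpha = np.log(d.sum(axis=1) / etr.sum(axis=1))   # crude starting values
    beta = np.full(m, 1.0 / m)
    kappa = np.zeros(n)
    fitted = lambda: etr * np.exp(alpha[:, None] + np.outer(beta, kappa))
    dev_old = np.inf
    for _ in range(max_iter):
        yhat = fitted()
        alpha += (d - yhat).sum(axis=1) / yhat.sum(axis=1)
        yhat = fitted()
        kappa += ((d - yhat) * beta[:, None]).sum(axis=0) / \
                 (yhat * beta[:, None] ** 2).sum(axis=0)
        yhat = fitted()
        beta += ((d - yhat) * kappa[None, :]).sum(axis=1) / \
                (yhat * kappa[None, :] ** 2).sum(axis=1)
        # reimpose the constraints (6.2) without changing the fitted values
        kbar, s = kappa.mean(), beta.sum()
        alpha += beta * kbar
        kappa = (kappa - kbar) * s
        beta = beta / s
        yhat = fitted()
        # Poisson deviance (6.20)
        dev = 2.0 * np.sum(d * np.log(np.where(d > 0, d / yhat, 1.0)) - (d - yhat))
        if abs(dev_old - dev) < tol:
            break
        dev_old = dev
    return alpha, beta, kappa

# Synthetic check: data generated exactly from an LC structure.
ages, years = 4, 10
alpha0 = np.array([-4.0, -3.5, -3.0, -2.5])
beta0 = np.array([0.4, 0.3, 0.2, 0.1])          # sums to 1
kappa0 = np.linspace(4.5, -4.5, years)          # sums to 0
etr = np.full((ages, years), 1e5)
d = etr * np.exp(alpha0[:, None] + np.outer(beta0, kappa0))
a_hat, b_hat, k_hat = fit_lc_poisson(d, etr)
```

Because the synthetic deaths are the exact Poisson means of an LC structure whose parameters already satisfy (6.2), the iteration should recover them closely.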

6.2.2.3 Fitting the LC model by Method B


Following James and Segal (1982), we use the iterative procedure:

Set starting values β̂x

  given β̂x , update α̂x , κ̂t
  given κ̂t , update α̂x , β̂x
  compute D(yxt , ŷxt )

Repeat the updating cycle; stop when D(yxt , ŷxt ) converges

Given β̂x or κ̂t , updating is by selecting the desired generalized linear model and fitting the predictor, which is linear in the respective remaining parameters. Thus, log-link Poisson responses yxt = dxt with offsets ln ETRxt are set in order to generate the same results as in the iterative fitting process of Section 5.2.2.3. The respective predictors are declared by accessing the model formulae (design matrices), a feature which is available in GLIM (Francis et al., 1993), for example, and other software packages.
In specifying the model formulae, we impose the constraints

κt1 = 0, Σx βx = 1, (6.21)

and then revert back to the standard LC constraints (6.2) once convergence
is attained.
6.2.2.4 Fitting the APC LC model


It is well known that APC modelling is problematic, since the three factors
are constrained by the relationship
cohort = period − age
To ensure a unique set of parameter estimates, we resort to a two-stage fitting strategy in which αx is estimated first, typically as in (6.18), corresponding to the original LC SVD approach. Then, the remaining parameters, those of the reduction factor RF, may be estimated by suitably adapting Method B: declaring log-link Poisson responses yxt = dxt and the augmented offsets ln ETRxt + α̂x , and adapting the design matrices, together with the constraints

Σx βx(0) = 1, Σx βx(1) = 1, and either ιt1−xk = 0 (or κt1 = 0). (6.22)
Obvious simplifications to the design matrices are needed when fitting the associated sub-models with βx(0) = 1 or βx(1) = 1, while the iterative element in the fitting procedure is redundant when fitting the model with βx(0) = βx(1) = 1 for all x. We note that the APC model has ν = k(n − 3) − 2(n − 2) degrees of freedom (excluding any provision for the first-stage modelling of αx ). We find that effective starting values are βx(0) = βx(1) = 1/k. Fitting is also possible under Method A, once αx has been estimated, using the extended definitions of ŷxt and adapting the core of the iterative cycle in accordance with the relevant updating relationships (Table 6.1). Effective starting values may be obtained by setting βx(0) = βx(1) = 1 and fitting this restricted version of the APC model to generate starting values for ιz and κt .

6.2.2.5 Fitting the AC LC model


Model identification is conveniently achieved by means of the parameter constraints

ιt1−xk = 0, Σx βx(0) = 1 (6.23)

Model fitting is then possible by reformulating Method A in terms of αx , βx(0) , and ιt−x . Thus, ιt−x instead of κt is updated in the core of the iterative cycle (subject to the adjustment ιt1−xk = 0), using the replacement updating relationships of Table 6.1. Fitting is also possible using Method B by replacing κt with ιt−x and modifying the design matrices accordingly. A possible strategy for generating starting values is to set β̂x(0) = 1 and additionally fit the main effects structure αx + ιt−x in accordance with the distributional assumptions under Method A. There are ν = (k − 1)(n − 3) degrees of freedom in this model.
6.2.3 Mortality rate projections

Projected mortality rates

ṁx (tn + s) = m̂x (tn ) RḞ(x, tn + s), s > 0 (6.24)

are computed by alignment with the latest available mortality rates mx (tn ). Here,

RḞ(x, tn + s) = exp{β̂x(0) (ι̃tn−x+s − ι̂tn−x ) + β̂x(1) (κ̇tn+s − κ̂tn )}, s > 0 (6.25)

for which

lim_{s→0} RḞ(x, tn + s) = 1

is based on the parameter estimates β̂x(i) , ι̂z , κ̂t and the time series forecasts

{ι̂z : z ∈ [t1 − xk , tn − x1 ]} → {ι̇tn−x1+s : s > 0}
{κ̂t : t ∈ [t1 , tn ]} → {κ̇tn+s : s > 0} (6.26)

where

ι̃tn−x+s = ι̂tn−x+s for 0 < s ≤ x − x1 , and ι̃tn−x+s = ι̇tn−x+s for s > x − x1 .

As we have seen in Section 5.7, the time series forecasts are typically gen-
erated using univariate ARIMA processes. The random walk with drift
(or ARIMA(0,1,0) process) features prominently in many of the published
applications of the LC model. If no provision for alignment with the latest
available mortality rates is made (as in equation (6.24)), the extrapolated
mortality rates decompose multiplicatively as
ṁx (tn + s) = exp{α̂x + β̂x(0) ι̂tn−x + β̂x(1) κ̂tn } RḞ(x, tn + s), s > 0 (6.27)

which has the same functional form as (6.8), and can be directly compared
with (6.24). This was the approach originally proposed in Lee and Carter
(1992).
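For the plain LC special case (βx(0) = 0), the aligned projection (6.24)–(6.25) under a random-walk-with-drift central forecast for κ reduces to multiplying the latest rates by exp{βx s d}. A minimal sketch follows; the numerical inputs are made up for illustration and are not fitted values from this chapter.

```python
import numpy as np

def project_rates_lc(m_last, beta, d_drift, horizon):
    """Aligned LC projection, the beta(0) = 0 special case of (6.24)-(6.25):
    m_x(tn+s) = m_x(tn) * exp(beta_x * (kdot_{tn+s} - kappa_tn)).
    Under a random walk with drift, kdot_{tn+s} - kappa_tn = s * d,
    so the reduction factor is exp(beta_x * s * d) (central path only)."""
    s = np.arange(1, horizon + 1)
    return m_last[:, None] * np.exp(np.outer(beta, s * d_drift))

# Illustrative inputs (hypothetical, not the chapter's estimates):
beta = np.array([0.25, 0.25, 0.25, 0.25])
m_last = np.array([0.010, 0.015, 0.022, 0.033])
proj = project_rates_lc(m_last, beta, d_drift=-1.0, horizon=10)
```

With a negative drift and positive βx , the projected rates decline monotonically with the forecast horizon, and the reduction factor tends to 1 as s → 0, which is the alignment property noted above.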

6.2.4 Discussion

By specifying the second moment distributional properties when defining


the model error structure, the choice of distribution is not restricted to
the Poisson and Gaussian distributions, and may indeed be expanded by
selecting different variance functions (within the exponential family of dis-
tributions). Empirical evidence suggests that, for all practical purposes,
maximum likelihood estimates obtained for the LC model using the iterative fitting processes under the Gaussian error structure given by (6.12) are the same as those obtained under fitting by SVD. Unlike modelling by SVD,
however, the choice of weights (6.16) means that estimation can proceed,
in the presence of empty data cells, under the Gaussian, Poisson, and any
of the other viable error settings. Wilmoth (1993) uses weights wxt = dxt
in combination with the Gaussian error setting. Empirical studies reveal
that this has the effect of bringing the parameter estimates into close agree-
ment with the Poisson-response-based estimates. When comparing a range
of results obtained under both modelling approaches (with identical model
structures), we have found that the same number of iterations is required
to induce convergence. However, convergence is slow when fitting the APC
model.
As discussed in Section 5.6, diagnostic checks on the fitted model are very
important. For consistency with the model specification, we consider plots
of the standardized deviance residuals
rxt = sign(yxt − ŷxt ) √(dev(x, t)/φ̂) (6.28)

where

φ̂ = D(yxt , ŷxt )/ν (6.29)
The sole use of the proportion of the total temporal variance, as measured
by the ratio of the first singular value to the sum of singular values under
SVD, is not a satisfactory diagnostic indicator. However, this index is widely
quoted in the demographic literature: see, for example, Tuljapurkar et al.
(2000).
The parameters αx are estimated simultaneously with the parameters of the reduction factor RF in both the LC and AC models. A two-stage estimation process is necessary, however, when fitting the APC model (and its substructures): αx is estimated separately to condition the estimation of RF. This two-stage approach can also be applied when fitting the LC and AC models. In the case of the LC model, empirical studies show that this has little material effect in practice, owing to the robust nature of the αx estimate (6.18).
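Under the Poisson error setting, the residuals (6.28) with the scale estimate (6.29) can be computed as in the following minimal sketch; the toy numbers are purely illustrative.

```python
import numpy as np

def deviance_residuals(y, yhat, nu):
    """Standardized deviance residuals (6.28) under the Poisson setting,
    with the scale estimate (6.29): phi-hat = D(y, yhat) / nu."""
    term = np.where(y > 0, y * np.log(np.where(y > 0, y / yhat, 1.0)), 0.0)
    dev = 2.0 * (term - (y - yhat))               # cell deviances, cf. (6.20)
    phi_hat = dev.sum() / nu
    return np.sign(y - yhat) * np.sqrt(dev / phi_hat), phi_hat

# Toy illustration (numbers made up):
y = np.array([[10.0, 12.0], [8.0, 9.0]])
yhat = np.array([[11.0, 11.0], [9.0, 8.0]])
r, phi_hat = deviance_residuals(y, yhat, nu=2)
```

A useful sanity check is that the squared standardized residuals sum to D/φ̂ = ν by construction.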

6.3 Application to United Kingdom mortality data


To explore the potential of the APC model, we present results for the United
Kingdom 1961–2000 mortality experiences for each gender, with cross-
classification by individual year of age from 0 to 99. This data set has been
provided by the Government Actuary’s Department – the availability of


data cross-classified by single year of age and by individual calendar year
facilitates the construction of cohort-based mortality measures. We make a
direct comparison with the standard age–period LC and the AC models. In
this application, all of the models are fitted under the Poisson error setting,
represented by equations (6.9) and (6.10).
The implications of the choice of model structure are immediately
apparent from the respective residual plots, illustrated for the UK female
experience (Fig. 6.1). Here the distinctive ripple effects in the year-of-birth residual plots under LC modelling (Fig. 6.1(a), RH frame) signify a failure of the model to capture cohort effects. This is then transferred to the calendar-year residual plots under AC modelling (Fig. 6.1(b), LH frame), signifying a reciprocal failure to capture period effects. However, these distinctive ripple effects are largely removed under APC modelling (Fig. 6.1(c)) – this feature indicates that the model captures all three main effects relatively successfully and represents a significant improvement

Figure 6.1. Female mortality experience: residual plots for (a) LC model; (b) AC model; and (c) APC model.

over the fitted LC model. Similar patterns are observed in the residual plots
for the UK male experience (not reproduced here but the details are available
from the authors).
Turning first to the parameter estimates for the APC modelling approach
(Fig. 6.2), we believe that it is helpful and informative to compare matching
frames between the sexes. Thus, the main age–effect plots (α̂x vs x) display
the familiar characteristics, including the ‘accident’ humps, of static cross-
sectional life-tables (on the log scale), with a more pronounced accident
hump and heavier mortality for males than for females. We recall that these
effects are estimated separately, by averaging crude mortality rates over t
for each x, to condition for both period and cohort effects.
The main period effects plot (κ̂t vs t) is linear for females but exhibits mild
curvature for males, which can be characterized as piece-wise linear with
a knot or hinge positioned in the first half of the 1970s. This effect is also
present in the separate LC analysis of mortality data of the G7 countries


Figure 6.2. Parameter estimates and forecasts for the APC model: (a) females; (b) males.
6.3 Application to United Kingdom mortality data 257

(Tuljapurkar et al., 2000) and has been discussed further for the United
Kingdom by Renshaw and Haberman (2003a). The forecasts for κt are
based on the auto-regressive time series model

yt = d + φ1 yt−1 + εt where yt = κt − κt−1 (6.30)

which is the equivalent of ARIMA(1, 1, 0) modelling. There are noteworthy
differences in the β̂x(1) patterns, which control the rate of decline by period
of the age-specific rates of mortality in the projections. In particular, the
trough in the male β̂x(1) pattern in the 20–40 age range is consistent with
similar findings from trends in the male England and Wales mortality rates
(Renshaw and Haberman, 2003a).
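The time series step in (6.30) can be made concrete with a small sketch: fit the AR(1) model for the differenced κt series by ordinary least squares and iterate the central forecast forward. The function name `forecast_kappa` and the illustrative κ series are our own, not from the text.

```python
import numpy as np

def forecast_kappa(kappa, horizon):
    """Fit y_t = d + phi1 * y_{t-1} + eps_t to y_t = kappa_t - kappa_{t-1}
    (an ARIMA(1,1,0) for kappa) by least squares, then forecast kappa."""
    y = np.diff(kappa)
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])   # regress y_t on (1, y_{t-1})
    d, phi1 = np.linalg.lstsq(X, y[1:], rcond=None)[0]
    path = list(kappa)
    y_last = y[-1]
    for _ in range(horizon):
        y_last = d + phi1 * y_last           # central (zero-innovation) forecast
        path.append(path[-1] + y_last)
    return np.array(path[len(kappa):]), (d, phi1)

# illustrative kappa series with a roughly linear decline
kappa = -0.5 * np.arange(40) + 0.1 * np.random.default_rng(1).standard_normal(40)
fc, (d, phi1) = forecast_kappa(kappa, horizon=20)
```

For a series with a stable linear decline, the forecast continues the decline, which is the behaviour visible in the κ̂t frames of Fig. 6.2.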
The plots of the main cohort effects (ι̂z vs z = t−x) are particularly reveal-
ing. Thus, noteworthy discontinuities occur corresponding to the ending of
World Wars I and II. While it is possible to identify the first of these with
the 1919 influenza epidemic, we are not aware of the likely cause of the
second discontinuity. (The 1886–1887 discontinuity can be traced to a set
of outliers, and is possibly due to mis-stated exposures for this particular
cohort.) The pronounced decline in the ι̂z profile in the inter-war years is
consistent with the reported rapid mortality improvements experienced by
generations born between 1925 and 1945 (for both sexes) and discussed
at the start of this chapter. The apparent stable linear trends in the ι̂z pro-
files, present since the late 1940s, form the basis of the depicted time series
forecasts, generated using an ARIMA (0, 1, 0) process for females and an
ARIMA(1, 1, 0) process for males. The β̂x(0) patterns, which control the age-
specific cohort contributions to the mortality projections, are similar, for
both sexes, for ages up to 65.
We illustrate the implications of these projections in Fig. 6.3 by plotting
current log crude mortality rates (for the calendar year 2000) against age
for each gender and comparing these with projections for the calendar year
2020 under three different models: the LC (or standard Lee-Carter) model,
AC model, and APC model. In Fig. 6.3(a), we show the LC–AC comparison
and in Fig. 6.3(b) the LC–APC comparison. We note the marked mortality
reductions projected for 2020 under the AC and APC models at ages which
correspond to the cohorts identified at the start of this chapter (based on
descriptive analyses): those born between 1925 and 1945 and hence aged
75–95 in 2020.
In order to illustrate the impact of such diverse projections under age-
period (LC) and age-period-cohort (APC) modelling, we have calculated
complete life expectancies e65 (t) at age 65 (Fig. 6.4) and immediate life
annuity values a65 (t) at age 65 assuming a 5% pa fixed interest rate (Fig. 6.5)
Figure 6.3. Current (2000) and projected (2020) ln µx (t) age profiles: (a) LC and AC models; (b) LC and APC models. [Separate panels for the UK female and male population studies, plotting log mortality rates against age for the latest (2000) experience and the 2020 projections.]

for a range of years t using both the cohort and period method of computing.
(We note that the annuity values represent the expected present value of an
income of one paid annually in arrears while the individual initially aged 65
remains alive.) For the cohort method of computing, we use the following
formulae, which are analogous to (5.57):
ex (t) = Σh≥0 lx+h (t + h){1 − ½ qx+h (t + h)} / lx (t),

ax (t) = Σh≥1 lx+h (t + h) v^h / lx (t)                  (6.31)

where

qx (t) ≈ 1 − exp(−µx (t)),    lx+1 (t + 1) = {1 − qx (t)} lx (t)        (6.32)

with annual discount factor v, where v = 1/(1 + i) is calculated using
a constant interest rate. Thus, in the cohort versions, we allow fully for
the dynamic aspect of the mortality rates, with the summations proceeding
(diagonally) along a cohort. We illustrate values up to the year 2005 calcu-
lated by the cohort method and this requires extrapolation up to the year
Figure 6.4. Projected life expectancies at age 65, computed by period and by cohort methods for age-period (LC) and age-period-cohort (APC) models. [Separate panels for the UK female and male population studies, plotting e(65, t) against period t.]

2040. In contrast, under the period method of calculation, the mortality
rates are treated as a sequence of (annual) static life tables, and computing
proceeds by suppressing the variation in t in expressions (6.31) and (6.32),
with (marginal) summation over age (≥x) for each fixed t, as for example
in (3.18). We illustrate values up to the year 2020 using this method based
on the empirical mortality rates µ̂x (t) = m̂x (t) in the period up to 2000 and
requiring extrapolation for subsequent years up to 2020. The period method
of computation fails to capture the full dynamic impact of the evolving
mortality rates under the modelling assumptions and generates less uncer-
tainty than the cohort method of calculation. The latter necessarily requires
Figure 6.5. Projected life annuity values at age 65 (calculated using a 5% per annum fixed interest rate), computed by period and by cohort under age-period (LC) and age-period-cohort (APC) models. [Separate panels for the UK female and male population studies, plotting a(65, t) against period t.]

more lengthy extrapolations and this contributes a source of increasing
uncertainty. One means of quantifying this uncertainty is through the adop-
tion of boot-strapping simulation methods, as described in Section 5.8, in
the context of LC modelling. This and other methods are currently under
investigation for the case of the APC model. The reserves that insurance
companies selling life annuities and pension funds would have to hold in
order to meet their future contractual liabilities are directly related to terms
like a65 (t); see Booth et al. (2005). The financial implications of the upward
trends in cohort-based life annuity values (which are the most relevant for
pricing and reserving calculations) in Fig. 6.5 are clear and significant and
Figure 6.6. E + W male mortality: comparison of life expectancy predictions using (i) age-period-cohort and (ii) age-period Poisson structures. Predictions with intervals by bootstrapping the time series prediction error in the period (and cohort) components, and selecting the resulting 2.5, 50, 97.5 percentiles. [Upper panel: e(65, t) computed by period for t = 2004, 2008, 2012, 2016, 2020; lower panel: e(x, 2000) computed by cohort for x = 65, 70, 75, 80, 85.]

indicate the burden that increasing longevity may place on such financial
institutions.
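The cohort computations in (6.31)–(6.32) can be sketched as follows, using an illustrative Gompertz-type mortality surface with a constant annual improvement factor; the surface, parameter values, and function name are our own assumptions, not the fitted UK rates.

```python
import numpy as np

def cohort_e_and_a(mu, x0, t0, v=1 / 1.05):
    """e_x(t) and a_x(t) by the cohort method, following (6.31)-(6.32):
    mu[a, s] is the force of mortality at age index a in year index s, and
    the summations run diagonally along the cohort aged x0 in year t0."""
    n = min(mu.shape[0] - x0, mu.shape[1] - t0)                 # diagonal length
    q = 1.0 - np.exp(-mu[x0 + np.arange(n), t0 + np.arange(n)])  # q_{x+h}(t+h)
    l = np.concatenate([[1.0], np.cumprod(1.0 - q)])             # l_{x+h}(t+h), l_x(t)=1
    e = np.sum(l[:-1] * (1.0 - 0.5 * q))     # sum_{h>=0} l_{x+h}(t+h){1 - q/2}
    a = np.sum(l[1:] * v ** np.arange(1, n + 1))   # sum_{h>=1} l_{x+h}(t+h) v^h
    return e, a

# illustrative surface: Gompertz in age, 1% p.a. improvement over calendar time
ages = np.arange(65, 120)
years = np.arange(0, 60)
mu = np.exp(-10.5 + 0.11 * ages)[:, None] * np.exp(-0.01 * years)[None, :]
e65, a65 = cohort_e_and_a(mu, x0=0, t0=0)
```

Suppressing the variation over the year index (holding the column of mu fixed) would reproduce the period method of calculation described above.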
As we have discussed in Section 5.8, it is important to be able to qual-
ify any projections of key mortality indices with measures of the error or
uncertainty present. Because of the complexities of the structure of the APC
LC model, the indices of interest are non-linear functions of the parameters
αx , βx , κt , ιt−x and hence analytical derivation of prediction intervals is not
possible. It is therefore necessary to employ bootstrapping techniques.
In Figs. 6.6 and 6.7, we use the LC and APC models fitted to England and
Wales male mortality rates over the period 1961–2000 in order to compare
estimates of life expectancy and 95% prediction intervals. Specifically, we
show in Fig. 6.6(a) computations of the period life expectancy at age 65 for
various future periods (equivalent to the median of the simulated distribu-
tions) and the corresponding 2.5 and 97.5 percentiles from the simulated
Figure 6.7. E + W male mortality: comparison of 4% fixed rate annuity predictions using (i) age-period-cohort and (ii) age-period Poisson structures. Predictions with intervals by bootstrapping the time series prediction error in the period (and cohort) components, and selecting the resulting 2.5, 50, 97.5 percentiles. [Upper panel: a(65, t) computed by period; lower panel: a(x, 2000) computed by cohort.]

distributions. In this case, the simulated distributions have been calculated
by bootstrapping only the time series prediction error as experiments reveal
that this is the most important component of the uncertainty in the model
(as originally suggested by Lee and Carter, 1992). In this way, we avoid
using the detailed bootstrapping strategies discussed in Section 5.8 which
can be rather slow to converge.
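A simplified version of this bootstrap, in which only the time-series prediction error is resampled, can be sketched as follows. For brevity we use a random walk with drift for κt rather than the ARIMA(1,1,0) of the text; the series and function name are illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)

def bootstrap_percentiles(kappa, horizon, n_sims=2000):
    """Resample the fitted one-step residuals of a random walk with drift for
    kappa, simulate future paths, and read off the 2.5, 50, and 97.5
    percentiles of kappa at the end of the horizon."""
    y = np.diff(kappa)
    d, resid = y.mean(), y - y.mean()
    end_values = np.empty(n_sims)
    for s in range(n_sims):
        shocks = rng.choice(resid, size=horizon, replace=True)
        end_values[s] = kappa[-1] + np.sum(d + shocks)
    return np.percentile(end_values, [2.5, 50.0, 97.5])

kappa = -0.4 * np.arange(40) + 0.2 * rng.standard_normal(40)  # illustrative series
lo, med, hi = bootstrap_percentiles(kappa, horizon=20)
```

Each simulated κ path would then be propagated through the life expectancy or annuity calculation to obtain the intervals shown in Figs. 6.6 and 6.7.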
The results in Fig. 6.6 (upper frame) show that the central estimates for
life expectancy are higher when using the APC model as against using the
LC model (as illustrated in Fig. 6.4) and that the prediction intervals are
wider for the APC models for each year. As we move forward in time from
2004 to 2020, we note that both pairs of prediction intervals become wider,
indicating a greater level of uncertainty present in the estimates for future
years. Thus, the calculations for 2004 involve 4 years of forward projections
whereas the calculations for 2020 involve 20 years of projections.

The results in Fig. 6.6 (lower frame) show the corresponding figures for
cohort life expectancy for five cohorts of males aged 65, 70, 75, 80, and 85 in
2000. The younger cohorts have estimates of cohort life expectancy that are
higher under the APC model than under the LC model (as in Fig. 6.4). The
prediction intervals under the APC model are much wider for the younger
cohorts. As we consider the older cohorts, we note that the central estimates
and the prediction intervals become more similar under the two models
indicating the particular incidence of the cohort effect which affects those
aged 55–75 in 2000. As expected, under both models, the prediction intervals
are wider for the cohorts aged 65 and 70 in 2000 than for the older cohorts,
and the width decreases in stages as age in 2000 increases. This reflects the
underlying level of projection involved in the calculations – if we regard age
110 as approximately the terminal age in the underlying survival model,
then the cohort estimates at age 65 would involve 45 years of projected
quantities while the cohort estimates at age 85 would involve only 25 years
of projections.
Figure 6.7 reproduces the calculations of Fig. 6.6 but for immediate life
annuities calculated using a constant interest rate of 4% per annum.
We can regard Fig. 6.7 as extending the results of Fig. 6.5 by including
prediction intervals and a more detailed comparison. Figure 6.7 shows the
same principal features as Fig. 6.6.

6.4 Cairns–Blake–Dowd mortality projection model: allowing for cohort effects
In Section 5.3, we introduced the Cairns–Blake–Dowd mortality projection
model which is motivated by the empirical observation that logit qx (t) is a
reasonably linear function of x for fixed t. The model introduced by Cairns
et al. (2007) is the following:
ln [qx (t)/px (t)] = κt(1) + κt(2) x                  (6.33)

which can be regarded as a specific example of a more general class of models

ln [qx (t)/px (t)] = βx(1) κt(1) + βx(2) κt(2)                  (6.34)
Responding to the need to consider the cohort effect observed in the historic
mortality data for a number of countries, Cairns et al. (2007) introduce
an AC term into the predictor as follows, in an analogous manner to the
Renshaw and Haberman (2006) enhancement of the original Lee-Carter
model. Thus, Cairns et al. (2007) propose the following family of models:

ln [qx (t)/px (t)] = βx(1) κt(1) + βx(2) κt(2) + βx(3) ιt−x                  (6.35)

where the ιt−x term represents a cohort effect as in (6.6). Having considered
goodness-of-fit of this family of models to historic data from England and
Wales and the USA, Cairns et al. (2007) investigate two specific versions in
some detail.
The special cases are

I.  βx(1) = 1,  βx(2) = x − x̄,  and  βx(3) = 1                  (6.36)

II. βx(1) = 1,  βx(2) = x − x̄,  and  βx(3) = xc − x                  (6.37)

where x̄ is the average age in the data set and xc is a constant parameter that
needs to be estimated. As with the APC version of the Lee Carter model in
Section 6.2, we need to introduce some identifiability constraints to ensure
that the parameters can be uniquely estimated. Version II is motivated by
the observation that in the applications of the APC model of Section 6.2, the
coefficient of the cohort term ιt−x is often found to be a decreasing function
of age: (6.37) incorporates the simplest such specification of βx(3).
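To see how the special cases (6.36)–(6.37) enter the predictor (6.35), the following sketch evaluates logit qx(t) under both; the function name and parameter values are illustrative assumptions, not fitted values from the text.

```python
import numpy as np

def cbd_q(x, ages, k1, k2, iota, case="I", xc=None):
    """Case I : logit q = k1 + k2 (x - xbar) + iota
       Case II: logit q = k1 + k2 (x - xbar) + (xc - x) iota"""
    xbar = np.mean(ages)
    b3 = 1.0 if case == "I" else (xc - x)
    z = k1 + k2 * (x - xbar) + b3 * iota
    return 1.0 / (1.0 + np.exp(-z))      # invert the logit

ages = np.arange(60, 90)
q60 = cbd_q(60, ages, k1=-3.0, k2=0.1, iota=-0.05)
q89 = cbd_q(89, ages, k1=-3.0, k2=0.1, iota=-0.05)
q60_II = cbd_q(60, ages, k1=-3.0, k2=0.1, iota=-0.05, case="II", xc=80.0)
```

With a negative cohort effect, case II amplifies the mortality reduction at the younger ages (where xc − x is largest), which is the decreasing age-profile of the cohort coefficient noted above.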
Cairns et al. (2007) fit the models by the method of maximum likelihood,
assuming that Dxt has a Poisson distribution as assumed in earlier Sections
5.2.2.3 and 6.2.2. For England and Wales data comprising the calendar
years 1961–2004 inclusive and ages 60–89, they find that the best-fitting
model is (6.37). For US data comprising the calendar years 1968–2003
inclusive and ages 60–89 (although only data for ages 85–89 are used for
1980–2003), they find that the best-fitting model is (6.6). When robustness
to the choice of fitting period is considered, the best fits to the historic data
from both countries are obtained for an augmented version of (6.35) viz.
ln [qx (t)/px (t)] = βx(1) κt(1) + βx(2) κt(2) + βx(3) κt(3) + βx(4) ιt−x                  (6.38)

with the specific choices βx(1) = 1, βx(2) = x − x̄, βx(3) = (x − x̄)² − σ̂x², and
βx(4) = 1. Here σ̂x² is the average value of (x − x̄)². This development of
the model is inspired by the observation that there is some curvature in the
age-profile of logit qx (t) in the United States data.
As in Sections 5.3 and 6.2, we could use the Cairns–Blake–Dowd class
of models for projection purposes. This would require models to be pos-
tulated and estimated for the dynamics of the period and cohort effects
terms in (6.35)–(6.38). An obvious approach would be to employ standard
time series methods based on ARIMA models, as discussed earlier. A com-
plication with the models represented by equations (6.35)–(6.38) is that
they involve three or four stochastic (time series) terms. We could follow
Section 6.2 and postulate that the κt(i) (for the different values of i) and ιt−x
are mutually independent. Otherwise, the presence of two or three such terms
would mean that we would need
to consider multivariate time-series modelling to estimate the underlying
dependency structure. This would be likely to involve vector autoregressive
models and co-integration techniques as discussed by Hamilton (1994).
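If independence is not assumed, the simplest joint dynamics are a bivariate random walk with drift and correlated innovations for (κt(1), κt(2)), generated via a Cholesky factor of the innovation covariance. The drift, covariance, and function name below are illustrative assumptions, not estimates.

```python
import numpy as np

def simulate_kappa_pair(k_last, drift, cov, horizon, rng):
    """Simulate (kappa1, kappa2) jointly as a bivariate random walk with
    drift, with correlated innovations built from a Cholesky factor of cov."""
    L = np.linalg.cholesky(np.asarray(cov, float))
    path = [np.asarray(k_last, float)]
    for _ in range(horizon):
        path.append(path[-1] + np.asarray(drift, float) + L @ rng.standard_normal(2))
    return np.array(path[1:])

rng = np.random.default_rng(7)
path = simulate_kappa_pair(k_last=[-3.0, 0.10], drift=[-0.02, 0.0002],
                           cov=[[1e-3, 2e-5], [2e-5, 1e-6]], horizon=30, rng=rng)
```

A full treatment would estimate the drift vector and covariance matrix jointly (a VAR(1) on the differences), which is where the multivariate techniques cited above come in.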

6.5 P-splines model: allowing for cohort effects


As noted in Section 5.4.2, Currie et al. (2004) have introduced a two-
dimensional graduation methodology based on B-splines, which is fitted
to observational data using a regression framework. The two-dimensional
version of univariate B-splines is obtained by multiplying the respective ele-
ments of the univariate B-splines in the age- and time-dimensions. Thus, the
model is

ln µx (t) = Σi,j θij Bij (x, t)                  (6.39)

where Bij (x, t) = Bi (x) · Bj (t), the θij are parameters to be estimated from
the data, and Bi and Bj are the respective univariate B-splines.
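The tensor-product construction in (6.39) can be sketched directly: build the univariate bases by the Cox–de Boor recursion and combine them multiplicatively. This is a self-contained illustration; the knot choices and names are our own.

```python
import numpy as np

def bspline_basis(t, k, x):
    """All B-spline basis functions of degree k on knot vector t, evaluated
    at the points x, via the Cox-de Boor recursion."""
    t, x = np.asarray(t, float), np.asarray(x, float)
    B = ((t[:-1] <= x[:, None]) & (x[:, None] < t[1:])).astype(float)  # degree 0
    for d in range(1, k + 1):
        nxt = np.zeros((len(x), len(t) - d - 1))
        for i in range(len(t) - d - 1):
            den1, den2 = t[i + d] - t[i], t[i + d + 1] - t[i + 1]
            if den1 > 0:
                nxt[:, i] += (x - t[i]) / den1 * B[:, i]
            if den2 > 0:
                nxt[:, i] += (t[i + d + 1] - x) / den2 * B[:, i + 1]
        B = nxt
    return B

# cubic bases in age and calendar year, combined as B_ij(x,t) = B_i(x) B_j(t)
tx = np.r_[[0.0] * 3, np.linspace(0, 90, 9), [90.0] * 3]   # clamped age knots
tt = np.r_[[0.0] * 3, np.linspace(0, 40, 6), [40.0] * 3]   # clamped year knots
ages, years = np.linspace(0, 89.9, 60), np.linspace(0, 39.9, 30)
Bx, Bt = bspline_basis(tx, 3, ages), bspline_basis(tt, 3, years)
theta = np.random.default_rng(0).standard_normal((Bx.shape[1], Bt.shape[1]))
log_mu = Bx @ theta @ Bt.T   # ln mu_x(t) = sum_ij theta_ij B_i(x) B_j(t)
```

With clamped knots the basis functions sum to one over the data range, so the fitted surface inherits the local, piecewise-polynomial character of the univariate splines.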
In reality, B-splines can provide a very good fit to the data if we employ a
large number of knots in the year and age dimensions. But this excellent level
of goodness of fit is achieved by sacrificing smoothness in the resulting fit.
The method of P-splines (or penalized splines) has been suggested by Eilers
and Marx (1996) to overcome this problem: in this case, the log-likelihood
is adjusted by a penalty function, with appropriate weights.
Schematically, the penalized log-likelihood would have the following form
for an LC model:

PL(θ) = L(θ) − λx Px (θ) − λt Pt (θ) (6.40)

where λx and λt are weighting parameters and Px (θ) is a penalty function
in the age dimension and Pt (θ) is a penalty function in the calendar time
dimension. An alternative formulation would involve an AC model:

PL(θ) = L(θ) − λx Px (θ) − λz Pz (θ) (6.41)

where, as in Section 6.2, we use z = t − x to index cohorts. The λ’s are esti-
mated from the data. As noted in Section 5.4.2, typical choices for quadratic
penalty functions would be

Px (θ) = Σi,j (θi,j − 2θi−1,j + θi−2,j )²                  (6.42)

Pt (θ) = Σi,j (θi,j − 2θi,j−1 + θi,j−2 )²                  (6.43)

Pz (θ) = Σi,j (θi+1,j−1 − 2θi,j + θi−1,j+1 )²                  (6.44)

Thus, the B-splines are used as the basis for the underlying regression and
the log likelihood is modified by penalty functions like the above which
depend on the smoothness of the θij parameters.
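The penalization can be sketched in one dimension: with B a basis matrix and D the second-difference operator, the penalty of (6.42) is ||Dθ||² and the penalized fit solves a ridge-type linear system. Names and numbers are ours, and a full implementation would penalize the Poisson log-likelihood rather than a least-squares criterion.

```python
import numpy as np

def penalized_fit(B, y, lam):
    """Minimize ||y - B theta||^2 + lam ||D theta||^2, where D theta gives
    the second differences theta_i - 2 theta_{i-1} + theta_{i-2} of (6.42)."""
    n = B.shape[1]
    D = np.diff(np.eye(n), n=2, axis=0)        # second-difference matrix
    return np.linalg.solve(B.T @ B + lam * (D.T @ D), B.T @ y)

# smoothing noisy log rates directly (B = identity): a large lambda forces
# the fit towards a straight line, since linear sequences have zero penalty
rng = np.random.default_rng(3)
x = np.arange(50, dtype=float)
y = -8.0 + 0.09 * x + 0.05 * rng.standard_normal(50)
theta = penalized_fit(np.eye(50), y, lam=1e8)
```

The weighting parameter λ trades fidelity against smoothness; in practice it is chosen from the data, for example by an information criterion.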
The idea of using P-spline regression not just for graduating mortality
data but also for mortality projections was first suggested by CMIB (2005).
In this application, that is, projecting mortality rates, the choice of the P(θ)
function plays a critical role in extending the mortality surface beyond the
range of the data so that projections are a direct consequence of the smooth-
ing process. Thus, a quadratic penalty function effectively leads to linear
extrapolation – in the age and time dimensions, for (6.40) combined with
the choices (6.42) and (6.43); or in the age and year of birth dimensions
for (6.41) combined with the choices (6.42) and (6.44). Different choices
for P(θ) would be possible, and these may have little impact on the quality
of fit to the historic data and hence would be difficult to infer from the
data. However, the impact on the projected mortality surface is consider-
able. The choice of P(θ) corresponds to a decision on the projected trend.
We have seen the implications of a quadratic penalty. Similarly, a linear
penalty function would lead to constant log mortality rates being projected
in the appropriate dimensions and a cubic penalty function would lead to
quadratic log mortality rates being projected in the appropriate dimensions.
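The quadratic-penalty case can be checked directly in a toy setting: beyond the data, each new coefficient is chosen so that its second difference is zero, which reproduces linear extrapolation. The function name is ours.

```python
import numpy as np

def extrapolate_by_penalty(theta, steps):
    """Extend fitted coefficients beyond the data by zeroing the quadratic
    (second-difference) penalty at each new point:
    theta_{n+1} = 2 theta_n - theta_{n-1}, i.e. linear extrapolation."""
    th = list(theta)
    for _ in range(steps):
        th.append(2 * th[-1] - th[-2])
    return np.array(th)

ext = extrapolate_by_penalty([0.0, 1.0, 2.0], 3)   # continues 3, 4, 5
```

A linear penalty (first differences) would instead freeze the last fitted level, and a cubic penalty would continue a quadratic trend, matching the statement above.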
Detailed applications of the P-spline methodology indicate that it is bet-
ter suited to graduation and smoothing of historic observational data than
to projection: see, for example, Cairns et al. (2007) and Richards et al.
(2007). Further, we should note that P-spline models can be used to gen-
erate percentiles for the measurement of uncertainty but, unlike the LC
and Cairns–Blake–Dowd families of models, P-spline models are not able
to generate sample paths. In many asset–liability modelling applications in
insurance and pensions, the production of sample paths is an important fea-
ture and could be useful elsewhere such as in the pricing of longevity-linked
financial instruments – see Chapter 7.
7 The longevity risk: actuarial perspectives

7.1 Introduction
In this chapter we deal with the mortality risks borne by an annuity provider,
and in particular with the longevity risk originating from the uncertain
evolution of mortality at adult and old ages.
The assessment of longevity risk requires a stochastic representation of
mortality. Possible approaches are described in Section 7.2, which is also
devoted to an analysis of the impact of longevity risk on the risk profile of
the provider. In Sections 7.3 and 7.4 we take a risk management perspective,
and we investigate possible solutions for risk mitigation. In particular, risk
transfers as well as capital requirements for the risk retained are discussed.
Policy design and the pricing of life annuities allowing for longevity risk are
dealt with in Sections 7.5 and 7.6; such aspects, owing to commercial pres-
sure and modelling difficulties, are rather controversial. We do not develop
an in-depth analysis, but we instead remark on the main issues. To reach a
proper arrangement of the policy conditions of a life annuity, the possible
behaviour of the annuitant in respect of the planning of her/his retirement
income has to be considered. In Section 7.7 we describe possible choices
available to the annuitant in this respect.
The topics dealt with in this chapter are rather new and not well-
established either in practice or in the literature. So the chapter is based on
recent research. To give a comprehensive view of the available literature,
most contributions are cited in Section 7.8, which is devoted to comments
on further readings; for some specific issues, however, references are also
quoted in the previous sections.
In this chapter, we usually refer to annuitants and insurers. Such terms
are, however, used in a generic sense. The discussion could also refer
to pensioners, with a proper adjustment of the parameters of the relevant
mortality models, and to annuity providers other than an insurer. Just for
brevity, only annuitants and insurers are mentioned.

7.2 The longevity risk

7.2.1 Mortality risks

Mortality risk may emerge in different ways. Three cases can in particular
be envisaged.

(a) One individual may live longer or shorter than the average lifetime in the
population to which she/he belongs. In terms of the frequency of deaths
in the population, this may result in observed mortality rates higher
than expected in some years, lower than expected in others, with no
apparent trend in such deviations.
(b) The average lifetime of a population may be different from what is
expected. In terms of the frequency of deaths, it turns out that mortality
rates observed in time in the population are systematically above or
below those coming from the relevant mortality table.
(c) Mortality rates in a population may experience sudden jumps, due to
critical living conditions, such as influenza epidemics, severe climatic
conditions (e.g. hot summers), natural disasters and so on.

In all the three cases, deviations in mortality rates with respect to what is
expected are experienced; an illustration is sketched in Fig. 7.1 where, with
reference to a given cohort, in each panel dots represent mortality rates
observed along time, whereas the solid line plots their forecasted level.
Case (a) is the well-known situation of possible deviations around
expected mortality rates; the mortality risk here comes out as a risk of
Figure 7.1. Experienced (dots) vs expected (solid line) mortality rates for a given cohort: Case (a), Case (b), Case (c). [Three panels plotting mortality rates against time.]

random fluctuations, which is traditional in the insurance business, in both
the life and the non-life area (actually, it is the basic grounds of the insurance
business). It is often named process risk or also insurance risk. It concerns
the individual position, and as such its severity reduces as the single posi-
tion becomes negligible in respect of the overall portfolio. The process risk
can be hedged through the realization of a proper pooling effect, since it
reduces as soon as the portfolio is made of similar policies and its size is
large enough, as well as through traditional risk transfer arrangements.
Under case (b), deviations are from expected values, rather than around
them, hence their systematic nature. This may be the result of either a mis-
specification of the relevant mortality model (e.g. because the time-pattern
of actual mortality differs from that implied by the adopted mortality table)
or a biased assessment of the relevant parameters (e.g. due to a lack of data).
The former aspect is referred to as the model risk, the latter as the parameter
risk. The term uncertainty risk is often used to refer to model and parameter
risk jointly, meaning uncertainty in the representation of a phenomenon
(e.g. future mortality). When adult-old ages are concerned, uncertainty risk
may emerge in particular because of an unanticipated reduction in mortality
rates (as is presented in the mid-panel of Fig. 7.1, where the mortality profile
of the cohort is better captured by the dashed line rather than by the solid
line). In this case, the term longevity risk is used instead of uncertainty
risk. It must be stressed that longevity risk concerns aggregate mortality; so
pooling arguments do not apply for its hedging.
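The contrast between diversifiable process risk and non-diversifiable aggregate (longevity) risk can be sketched with a toy simulation: the coefficient of variation of the death count vanishes as the pool grows when q is known, but not when q itself is uncertain. All parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(123)

def cv_of_deaths(n, q, q_sd=0.0, sims=20000):
    """Coefficient of variation of the death count in a pool of n lives.
    q_sd = 0 : process risk only (deaths ~ Binomial(n, q)) - diversifiable.
    q_sd > 0 : q itself is uncertain (aggregate risk) - the CV no longer
    vanishes as n grows."""
    qs = np.clip(q + q_sd * rng.standard_normal(sims), 1e-6, 1 - 1e-6)
    deaths = rng.binomial(n, qs)
    return deaths.std() / deaths.mean()

small = cv_of_deaths(100, 0.02)               # small pool, known q
large = cv_of_deaths(100000, 0.02)            # large pool, known q
large_sys = cv_of_deaths(100000, 0.02, q_sd=0.005)  # large pool, uncertain q
```

Growing the pool shrinks the first kind of variability towards zero but leaves the second essentially untouched, which is why pooling hedges process risk but not longevity risk.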
In case (c), a catastrophe risk emerges, namely the risk of a sudden and
short-term rise in the frequency of deaths. As in case (b), aggregate
mortality is concerned; however, compared with longevity risk, the
time-span over which the risk emerges differs: short-term in case (c),
long-term (possibly permanent) in case (b). Clearly, a
proper hedging of catastrophe risk is required when death benefits are dealt
with (whilst when dealing with life annuities, profit arises because of the
higher mortality experienced). The usual pooling arguments do not apply;
however, diversification effects may be realized and risk transfers can be
conceived as well. Some remarks in this regard are given in Sections 7.3.2
and 7.4.2.
Apart from some short remarks on the management of process and catas-
trophe risk, in the following we focus on longevity risk. Before moving to
the relevant discussion, it is necessary to make a comment on terminology.
The vocabulary introduced above for mortality risks is commonly
acknowledged in the literature. In some risk classification systems, how-
ever, the meaning of some terms may be different, and this may lead to
possible misunderstandings. We mention in particular the evolving Solvency
2 system, where (see CEIOPS (2007)) both the mortality and the longevity
risks are meant to result from uncertainty risk. Mortality risk addresses
possible situations of extra-mortality; concern here is for a business with
death benefits. On the contrary, longevity risk addresses the possible real-
ization of extra-survivorship; clearly, in this case concern is for a business
with living benefits, life annuities in particular. In the following, we disre-
gard this meaning; reference is therefore to what we have described under
items (a)–(c) above and the relevant remarks.

7.2.2 Representing longevity risk: stochastic modelling issues

Whenever we aim at representing a risk, a stochastic valuation is required.


In general terms, a stochastic mortality model should allow for the sev-
eral types of possible deviations in the frequency of death in respect of the
forecasted mortality rate, namely:

(a) random fluctuations (to represent process risk);
(b) deviations due to the shape of the time-pattern implied by the mortality
model, in respect of both age and calendar year (model risk);
(c) deviations due to the level of the parameters of the mortality model
(parameter risk);
(d) shocks due to period effects (catastrophe risk).

As to the shape of the time-pattern of mortality rates in respect of calendar
year, we recall that by longevity risk we mean the risk of an unanticipated
decrease in mortality rates at adult ages (see Section 7.2.1); hence, some
projection must be adopted. Except for the Lee–Carter model, projected
mortality models do not allow explicitly for risk (see Chapter 4). So, given
the purpose of this chapter we need to attack mortality modelling in a new
perspective.
Embedding four sources of randomness in the mortality model is a tricky
business. So some modelling choices are required. In this section, we explore
general aspects of stochastic modelling. Notation is stated in general terms.
More specific examples are then presented in Section 7.2.3.
Let Y be the random number of deaths in a given cohort at a given
age. We assume that Y depends on two input variables, say X1 , X2 ; so
Y = φ(X1 , X2 ). The quantity X1 could be meant to represent the prob-
ability of death or the force of mortality in the cohort in a given year
in the absence of extreme situations. Possible shocks are then represented
by X2 .

Various approaches can be conceived for investigating Y. A graphical
illustration is provided by Fig. 7.2.
Approach 1 is purely deterministic. Assigning specific values x1 , x2 to
the two input variables, the corresponding outcome y of the result vari-
able is simply calculated as y = φ(x1 , x2 ). In our example, x1 is a normal
(projected) probability of death or force of mortality, whilst x2 is a given
shock (possibly set to zero). It is interesting to note that classical actu-
arial calculations follow this approach, replacing random variables with
their expected or best-estimate value. In a more modern perspective, this
approach is adopted for example when performing stress testing (assigning
to some variables ‘extreme’ values), or scenario testing.
Randomness in input variables is, to some extent, acknowledged when
approach 2 is adopted. Reasonable ranges for the outcomes of the input
variables are chosen (e.g. the interval of possible values for a shock in a
given year), and consequently a range (ymin , ymax ) for Y is derived. As far
as X1 is concerned, the range of possible values may represent randomness
due to random fluctuations, as well as to the unknown trend of the cohort.
Note, however, that the valuation is fundamentally deterministic; the main
difference between approach 1 and approach 2 is the number of possible
outcomes which is considered.
Approach 3 provides a basic example of stochastic modelling, typically
adopted for assessing the impact of process risk. The probabilistic structure
assigned to X1 is meant here to represent the intrinsic stochastic nature of
mortality, that is, random fluctuations. Assuming a continuous probability
distribution function, the probability density function fX1 can be obtained,
for example, first assigning the probability distribution function of the life-
time of each individual (based on some projected mortality model with given
parameters), then aggregating the relevant results. Note that setting fixed
parameters for the mortality model, a deterministic assumption for trend is
understood. The probability distribution of Y (and X1 as well) can be found
using only analytical tools just in very simple (or simplified) circumstances.
Numerical methods or stochastic simulation procedures help in most
cases.
Approach 4 addresses, albeit in a naive manner, the risk of systematic
deviations. The three probability distributions assigned to X1 are intended
to be based on alternative models for the lifetime of each individual. In prac-
tical terms, the same mortality projection model may be assumed, but with
alternative sets of values chosen for the relevant parameters to represent
alternative mortality trends. Approach 4 then simply consists of iterating
the procedure implied by approach 3, each iteration corresponding to a
specific assumption about the probability distribution of an input variable.
272 7 : The longevity risk: actuarial perspectives
Figure 7.2. Modelling approaches to stochastic valuations. (Six panels, one per approach, show the input variables X1, X2 and the output Y; the distributions fX1 and fY appear from approach 3, the conditional distributions fX1|Ah and fY|Ah under assumptions A1, A2, A3 from approach 4, and fX2 in approach 6.)
A set of conditional distributions of Y is determined. Note that in respect
of systematic deviations a representation similar to approach 2 is gained;
the difference concerns process risk, which is explicitly addressed under
approach 4.
Under approach 5 a probability distribution is assigned over the set of
trend assumptions. Hence, the unconditional distribution of the output vari-
able Y can be calculated. Note that, this way, both process and uncertainty
risk are allowed for. In the graphical representation of Fig. 7.2 a discrete
setting is considered in respect of uncertainty risk; more complex models
may attack the problem within a continuous framework.
Finally, under approach 6 a probabilistic structure is assigned to all of
the input variables. In this case, either the joint distribution may be directly
addressed or the marginal distributions of the input variables as well as the
relevant correlation assumptions (as is depicted in Fig. 7.2). The problem
can be handled just through stochastic simulation; difficulties arise with
reference to the choice of the probability distribution of the uncertainty
risk, of the catastrophe risk, as well as with regard to the dependencies
among the various sources of randomness.
7.2.3 Representing longevity risk: some examples
We now specifically address longevity risk. Clearly, approach 5 (or 6) in Fig. 7.2 is required, but some insights into the problem may also be gained from approach 4.
Let Γ(x, t) denote a projected mortality quantity, where x is the age attained in calendar year t by the cohort born in year τ = t − x. The projected quantity may be, for example, the probability of death qx(t), the mortality odds qx(t)/px(t), or the force of mortality µx(t).
To develop approach 4, alternative hypotheses about future mortality
evolution must be chosen. Such alternative assumptions may originate from
different sets of the relevant parameters of the projection model; in this
way, parameter risk is addressed. Otherwise, the alternative assumptions
may be given by mortality projections obtained under different procedures;
in this case, also model risk would be addressed. However, it is intrinsically
difficult to perform an explanatory comparison of different models (e.g.
it is not easy to state whether the different outcomes of two models are
mainly due to the implied time-pattern or to the relevant parameters). For
this reason, we focus in the following discussion on parameter risk. In any
case, unless it is explicitly addressed (as depicted in Fig. 7.2), the risk of catastrophic mortality is not considered.
Let A(τ) denote a given assumption about the mortality trend for peo-
ple born in year τ, and A(τ) the set of such assumptions. The notation
Γ(x, τ + x|A(τ)) refers to the projected mortality quantity conditional on the specific assumption A(τ). The set of all mortality projections is denoted by the family {Γ(x, τ + x|A(τ)); A(τ) ∈ A(τ)}.
In principle, the set A(τ) can be either discrete or continuous, the former case being the more practicable. Examples may be found in the projections developed by the CMIB, addressing the cohort effect and assuming three
hypotheses about the persistence in the future of such an effect; see CMI
(2002) and CMI (2006).
Let us then suppose that a discrete set has been designed for A(τ). Scenario testing, and possibly stress testing, can then be performed. In particular, the sensitivity of some quantities, such as reserves, profits, and
so on, in respect of future mortality trends can be investigated. As men-
tioned in Section 7.2.2, process risk can be explicitly appraised through the
probability distribution function of the lifetime of all the individuals in the
cohort, conditional on a given trend assumption. However, the approach
in respect of parameter risk is deterministic. Some examples are described
in Section 7.2.4.
A step forward consists of assigning a (non-negative and normalized) weighting structure to A(τ) (see approach 5 in Fig. 7.2). In this way, unconditional valuations can be performed, thus accounting explicitly for parameter risk. Let

A(τ) = {A1(τ), A2(τ), . . . , Am(τ)}    (7.1)

be the set of alternative mortality assumptions; then, let ρh be the weight attached to assumption Ah(τ), such that 0 ≤ ρh ≤ 1 for h = 1, 2, . . . , m and ∑_{h=1}^{m} ρh = 1. The set

{ρh}h=1,2,...,m    (7.2)
can be intended as a probability distribution on A(τ). Unfortunately, experience providing data for estimating such weights is rarely available and so personal judgement is often required. See Section 7.2.4 for some examples.
We now address possible ways of attacking the problem within a contin-
uous setting. To define A(τ) as a continuous set, a continuous probability
distribution must be assigned to the parameters of the mortality model. Dif-
ficulties, here, concern the appropriate correlation assumptions among the
parameters and hence the complexity of the overall model is clearly greater
than in the discrete case. Because there is likely a paucity of data allowing
us to make a reliable estimate of the correlations, simplifying hypotheses
would have to be accepted. Hence, the setting would not necessarily be
more powerful than the discrete one. For this reason, we do not provide
examples in respect of a continuous approach.
Whatever setting is referred to, either discrete or continuous, the frame-
work discussed above can, to some extent, be classified as a static one.
Actually, the notation indicates that the set A(τ) is fixed. Uncertainty is
expressed in terms of which of the assumptions A(τ) ∈ A(τ) is the better
one for describing the aggregate mortality behaviour of the cohort, that is,
the relevant prevailing trend. Irrespective of the setting, either discrete or
continuous, no future shift from such a trend is allowed for in the proba-
bilistic distribution (see also panels 5 and 6 in Fig. 7.2). A critical aspect is that assumptions about the temporal correlation of changes in the probabilities of death are implicitly involved; see, for example, Tuljapurkar and Boe (1998). Further, we note that mortality shocks are
not embedded into the static representation, which is not a serious problem
given that we are addressing the longevity risk. Finally, we mention that,
while keeping the setting as a static one, possible updates to the weights
(7.2) based on experience could be introduced; an example in this respect
is described in Section 7.2.4.
According to a dynamic probabilistic approach, either the probability of
death or the force of mortality (or possibly some other quantity) is modelled
as a path of a random process. In this context, the probabilistic model
consists of assumptions concerning the random process and the relevant
parameters. In the current literature, many authors have been focussing on
this approach. Most investigations, which are, in particular, motivated by
the problem of setting a pricing structure for longevity securities, move from
assumed similarities between the force of mortality and interest rates or
simply from the assumption that the market for longevity securities should
behave like other aspects of the capital market. The application to mortality
of some stochastic models developed originally for financial purposes is
then tested. In particular, interest rate models and credit risk models have
been considered. However, financial models are not necessarily suitable for
describing mortality; actually, the force of mortality and interest rates do
not necessarily behave in a similar manner. Therefore, the basic building
blocks of the new theory still require careful discussion and investigation.
Some examples are quoted in Section 7.6.
It is important to note that the Lee–Carter model (see Chapters 5 and
6) is an early example of mortality modelled as a stochastic process. In its
original version, deviations originating from sampling errors are in particu-
lar addressed, and hence process risk is considered. Indeed, when stochastic
processes are adopted, certainly the intrinsic stochastic nature of mortality,
that is, random fluctuations, must be acknowledged. To represent also
aggregate mortality risk, a second source of randomness must be intro-
duced. So, in recent proposals, mortality is described as a doubly stochastic
process. In particular when, moving from financial modelling, diffusion
processes are considered for the force of mortality, unexpected movements
in the mortality curve may be accounted for through stochastic jumps. See
Section 7.8 for some references.
7.2.4 Measuring longevity risk in a static framework

In this section we highlight the impact of longevity risk. With reference
to a portfolio comprising one cohort of annuitants, the distribution of
both the present value of future payments and annual outflows is inves-
tigated. What follows also applies to a cohort of pensioners, with a proper adjustment of the parameters of the mortality model; in either case, a homogeneous group is considered. As mentioned in Section 7.1, for brevity explicit reference is made to annuitants only. Similarly, the provider could be an insurer or a pension fund; however, we refer explicitly just to an insurer.
A static representation is considered for evolving mortality and, in par-
ticular, parameter risk is addressed. To understand better the impact of
longevity risk, a comparison is made with process risk.
We assume

qx(t) / px(t) = G(τ) (K(τ))^x    (7.3)
where τ = t − x is the year of birth of the cohort. Hence, the third term
of the first Heligman–Pollard law, that is, the one describing the old-age
pattern of mortality, is adopted to express the time-pattern of mortality
(see Section 2.5.2). Note, in particular, that the relevant parameters are
cohort-specific.
Whilst the age-pattern of mortality for cohort τ is accepted to be logistic, namely

qx(t) = G(τ) (K(τ))^x / (1 + G(τ) (K(τ))^x)    (7.4)

(see (2.85); see also the second Heligman–Pollard law in Section 2.5.2),
uncertainty concerns the level of parameters G(τ), K(τ). Actually, our inves-
tigation focusses on parameter risk and we note that such uncertainty may,
in particular, arise from an underlying unknown cohort effect.
We define five alternative sets of parameters, quoted in Table 7.1, which also shows the expected lifetime E[T65|Ah(τ)] and the standard deviation √Var[T65|Ah(τ)] of the lifetime at age 65 conditional on a given set of
Table 7.1. Parameters for the Heligman–Pollard law

                     A1(τ)      A2(τ)      A3(τ)      A4(τ)      A5(τ)
G(τ)                 6.378E-07  3.803E-06  2.005E-06  1.060E-06  3.149E-06
K(τ)                 1.14992    1.12347    1.13025    1.13705    1.11962
E[T65|Ah(τ)]         20.170     20.743     21.849     22.887     24.187
√Var[T65|Ah(τ)]      7.796      8.780      8.707      8.602      9.910
parameters. It emerges that, in terms of the survival function itself, the alternative assumptions imply different levels of rectangularization (i.e. squaring of the survival function, as witnessed by √Var[T65|Ah(τ)]) and expansion (i.e. forward shift of the adult age at which most deaths occur, reflected in the value of E[T65|Ah(τ)]); see Sections 3.3.6 and 4.1 for the meaning of rectangularization and expansion. The relevant survival functions and curves of deaths are plotted in Fig. 7.3.
Assumption A3(τ) will be referred to as the best-estimate description of the mortality trend for cohort τ; its parameters have been obtained by fitting (7.3) to the current Italian projected market table for immediate annuities (named IPS55). When comparing the values taken by E[T65|Ah(τ)] and √Var[T65|Ah(τ)] (quoted in Table 7.1) under the various assumptions, it turns out that in respect of A3(τ) at age 65:
– assumption A1(τ) implies a lighter expansion (i.e. lower expected lifetime) joint with a stronger rectangularization (i.e. lower standard deviation of the lifetime);
– assumption A2(τ) implies a lighter expansion and a lighter rectangularization as well;
– assumption A4(τ) implies a stronger expansion and a stronger rectangularization as well;
– assumption A5(τ) implies a stronger expansion joint with a lighter rectangularization.
In each case, the maximum attainable age has been set equal to 117,
according to the reference projected table.
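The conditional lifetime moments in Table 7.1 can be reproduced, up to the curtate/complete-lifetime distinction (roughly half a year on the mean), with a short computation based on (7.4). The following Python sketch is our own illustration; the names PARAMS and OMEGA are ours, with the maximum attainable age set to 117 as stated above.

```python
import math

# (G, K) pairs of Table 7.1; maximum attainable age 117 as in the text.
PARAMS = {
    "A1": (6.378e-07, 1.14992),
    "A2": (3.803e-06, 1.12347),
    "A3": (2.005e-06, 1.13025),
    "A4": (1.060e-06, 1.13705),
    "A5": (3.149e-06, 1.11962),
}
OMEGA, X0 = 117, 65

def lifetime_moments(G, K):
    """Mean and standard deviation of the curtate lifetime at age 65 under (7.4)."""
    probs, surv = [], 1.0
    for k in range(OMEGA - X0):
        g = G * K ** (X0 + k)
        q = g / (1.0 + g)          # one-year death probability (7.4)
        probs.append(surv * q)     # P[K_65 = k]: survive k years, die in year k+1
        surv *= 1.0 - q
    probs.append(surv)             # survivors to the maximum age die there
    ev = sum(k * p for k, p in enumerate(probs))
    sd = math.sqrt(sum((k - ev) ** 2 * p for k, p in enumerate(probs)))
    return ev, sd

moments = {h: lifetime_moments(G, K) for h, (G, K) in PARAMS.items()}
```

Under these parameters the expected lifetime increases from A1(τ) to A5(τ), reproducing the expansion ordering of Table 7.1.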
The portfolio we refer to consists of one cohort of immediate life annu-
ities. We assume that all annuitants are aged x0 at the time t0 of issue.
To shorten the notation, time t will be recorded as the time elapsed since
the policy issue, that is, it is the policy duration; hence, at policy dura-
tion t the underlying calendar year is t0 + t. The lifetimes of annuitants
are assumed, conditional on any given survival function, to be indepen-
dent of each other and identically distributed. Since our objective is the
Figure 7.3. Survival functions (top panel) and curves of deaths (bottom panel) under the Heligman–Pollard law. [Both panels plot assumptions A1–A5 against age, from 65 to 115; the vertical axes show the number of survivors and the number of deaths, respectively.]
measurement of longevity risk only, we disregard uncertainty in financial


markets; hence, a given flat yield curve is considered. All of the annuitants
are entitled to a fixed annual amount (participating mechanisms are not
allowed for). Finally, we focus on net outflows; therefore, expenses and
related expense loadings are not accounted for.
Let Nt be the random number of annuitants at time t, t = 0, 1, . . . , with
N0 a specified number (viz the initial size of the portfolio). Whenever the
current size of the portfolio is an observed quantity, we will denote it as nt ;
so N0 = n0 . Quantities relating to the generic annuitant are labelled with
(j) on the top, j = 1, 2, . . . , n0. The in-force portfolio at policy time t is defined as

Πt = {j | Tx0(j) > t}    (7.5)

Quantities relating to the portfolio are then labelled with (Π) on the top.
Annual outflows for the portfolio are defined, for t = 1, 2, . . . , as
Bt(Π) = ∑_{j∈Πt} b(j)    (7.6)
where b(j) is the annual amount to annuitant j.


The present value of future payments at time t, t = 0, 1, . . . , may at first
be defined for one annuitant as
Yt(j) = b(j) a_{Kx0(j) − t |}    (7.7)
(see Section 1.5.1). By summing up in respect of in-force policies, we obtain
the present value of future payments for the portfolio

Yt(Π) = ∑_{j∈Πt} Yt(j)    (7.8)
We are interested in investigating some typical values of Bt(Π) and Yt(Π), as well as the coefficient of variation and some percentiles. We will in particular consider the impact of longevity risk in relation to the size of the portfolio. So, unless otherwise stated, a portfolio homogeneous in respect of annual amounts is considered; that is, we set b(j) = b for all j. Note that in this case
(7.6) may be rewritten as
Bt(Π) = b Nt    (7.9)

whilst the present value of future payments for the portfolio may also be expressed as

Yt(Π) = ∑_{h=t+1}^{ω−x0} Bh(Π) (1 + i)^{−(h−t)} = b ∑_{h=t+1}^{ω−x0} Nh (1 + i)^{−(h−t)}    (7.10)
where i is the annual interest rate. For a homogeneous portfolio, in the following Yt(1) is used to denote the present value of future payments to a generic annuitant.
We first adopt approach 4 described in Section 7.2.2 (see also Fig. 7.2).
All valuations are then conditional on a given mortality assumption. We
have
E[Yt(Π) | Ah(τ), nt] = nt E[Yt(1) | Ah(τ)]    (7.11)
Because we are assuming independence of the annuitants’ lifetimes, condi-
tional on a given mortality trend, the following results hold:

Var[Yt(Π) | Ah(τ), nt] = nt Var[Yt(1) | Ah(τ)]    (7.12)

CV[Yt(Π) | Ah(τ), nt] = (1/√nt) · √Var[Yt(1) | Ah(τ)] / E[Yt(1) | Ah(τ)]    (7.13)
where nt is the size of the in-force portfolio, observed at the valuation policy time t. (Expressions for E[Yt(1)|Ah(τ)] and Var[Yt(1)|Ah(τ)] are straightforward and therefore omitted.)
The coefficient of variation, in particular, allows us to investigate the
effect of the size of the portfolio on the overall riskiness. Expression (7.13)
shows that, in relative terms, the riskiness of the portfolio decreases as nt
increases. Thus, we have
lim_{nt→∞} CV[Yt(Π) | Ah(τ), nt] = 0    (7.14)
This represents the well-known result that the larger the portfolio, the less risky it is, since with high probability the observed values will be close to the expected ones. The quantity CV[Yt(Π) | Ah(τ), nt] is sometimes called the risk index.
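The behaviour of the risk index (7.13)–(7.14) can be checked numerically. The sketch below is our own illustration under the best-estimate parameters of Table 7.1 and the inputs of Example 7.1 (x0 = 65, i = 3%, b = 1, maximum age 117): it computes the mean and variance of Y0(1) from the distribution of the curtate lifetime and then the conditional coefficient of variation for various portfolio sizes.

```python
import math

# Best-estimate assumption A3 from Table 7.1; other inputs as in Example 7.1.
G, K = 2.005e-06, 1.13025
OMEGA, X0, I, B = 117, 65, 0.03, 1.0

def qx(x):
    g = G * K ** x
    return g / (1.0 + g)

# Distribution of the curtate lifetime K_65 and present values b * a_{k|}
v = 1.0 / (1.0 + I)
probs, pv, surv = [], [], 1.0
for k in range(OMEGA - X0):
    probs.append(surv * qx(X0 + k))                      # P[K = k]
    pv.append(B * sum(v ** j for j in range(1, k + 1)))  # b * a_{k|}
    surv *= 1.0 - qx(X0 + k)
probs.append(surv)                                       # death at the maximum age
pv.append(B * sum(v ** j for j in range(1, OMEGA - X0 + 1)))

mean = sum(p * y for p, y in zip(probs, pv))
var = sum(p * (y - mean) ** 2 for p, y in zip(probs, pv))

def risk_index(n):
    """Risk index (7.13): CV of Y_0 conditional on the trend, portfolio size n."""
    return math.sqrt(var) / (math.sqrt(n) * mean)
```

Because of the 1/√nt factor, multiplying the portfolio size by 100 divides the risk index by 10, and the index vanishes as nt → ∞, consistently with (7.14).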
Conditional on a given mortality assumption and because of the independence among the lifetimes of the annuitants and the assumption of homogeneity of annual amounts, the percentiles of Yt(Π) could be assessed through a process of convolution. In practice, however, due to the number of random variables constituting Yt(Π) (i.e. due to the magnitude of nt), analytical calculations are not practicable and so we must resort to stochastic simulation. The ε-percentile of the distribution of Yt(Π) conditional on assumption Ah(τ) and an observed size of the in-force portfolio nt at time t is defined as

yt,ε[Ah(τ), nt] = inf{u ≥ 0 | P[Yt(Π) ≤ u | Ah(τ), nt] > ε}    (7.15)

In particular, we are interested in investigating the right tail of Yt(Π); therefore, high values for ε should be considered.
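A minimal simulation of the conditional percentile (7.15) at time 0 can be sketched as follows: curtate lifetimes are drawn by inverse transform from the distribution implied by the best-estimate assumption of Table 7.1, each is converted into a present value, and the empirical percentiles are read off the simulated sample. The seed and the number of runs are arbitrary choices of this sketch.

```python
import bisect, random

random.seed(12345)
# Best-estimate assumption A3 of Table 7.1; other inputs as in Example 7.1.
G, K = 2.005e-06, 1.13025
OMEGA, X0, I = 117, 65, 0.03
v = 1.0 / (1.0 + I)

# cdf[k] = P[K_65 <= k] under the given trend assumption
cdf, surv = [], 1.0
for k in range(OMEGA - X0):
    surv *= 1.0 / (1.0 + G * K ** (X0 + k))   # multiply by 1 - q_x = 1/(1 + G K^x)
    cdf.append(1.0 - surv)

# a[k] = present value of an annuity-certain paying 1 at times 1, ..., k
a = [0.0]
for k in range(1, OMEGA - X0 + 1):
    a.append(a[-1] + v ** k)

def portfolio_pv(n0):
    """One simulated outcome of Y_0 for n0 annuitants (inverse-transform sampling)."""
    return sum(a[min(bisect.bisect_left(cdf, random.random()), OMEGA - X0)]
               for _ in range(n0))

sims = sorted(portfolio_pv(100) for _ in range(2000))
ybar = sum(sims) / len(sims)
p75, p95 = sims[1500], sims[1900]   # empirical 75th and 95th percentiles
```

With n0 = 100, the simulated 95th percentile exceeds the mean by a few per cent only, in line with the order of magnitude shown in Table 7.5.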
As far as the distribution of annual outflows Bt(Π) is concerned, similar remarks to those for Yt(Π) can be made. Thus, due to independence
Table 7.2. Expected present value of future payments, conditional on a given scenario, per policy in-force at time t: E[Yt(Π)|Ah(τ), nt]/nt = E[Yt(1)|Ah(τ)]

Time t    A1(τ)    A2(τ)    A3(τ)    A4(τ)    A5(τ)
0 14.462 14.651 15.259 15.817 16.413
5 12.004 12.374 12.956 13.500 14.238
10 9.504 10.076 10.599 11.097 11.981
15 7.102 7.862 8.294 8.714 9.724
20 4.962 5.846 6.167 6.484 7.570
25 3.221 4.127 4.336 4.543 5.626
30 1.944 2.766 2.877 2.988 3.980
35 1.099 1.765 1.807 1.849 2.681
and homogeneity, the random variables Bt(Π) have (under the information available at time 0) a binomial distribution, with parameters n0 and the survival probability from issue time to policy duration t calculated under the given mortality assumption. For reasons of space, we omit the relevant expressions (which are straightforward).
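Given the binomial structure just noted, the mean and the coefficient of variation of the annual outflows are available in closed form. The sketch below assumes the best-estimate parameters of Table 7.1 and unit annual amounts; the function names are ours.

```python
import math

# Best-estimate parameters A3 from Table 7.1 (an assumption of this sketch).
G, K = 2.005e-06, 1.13025
X0 = 65

def tp(t):
    """t-year survival probability from age 65: product of (1 - q_x) = 1/(1 + G K^x)."""
    p = 1.0
    for k in range(t):
        p *= 1.0 / (1.0 + G * K ** (X0 + k))
    return p

def mean_outflow(n0, t):
    """E[B_t | A3, n0] for unit annual amounts: binomial mean n0 * tp(t)."""
    return n0 * tp(t)

def cv_outflow(n0, t):
    """CV of the binomial number of survivors: sqrt((1 - p) / (n0 * p))."""
    p = tp(t)
    return math.sqrt((1.0 - p) / (n0 * p))
```

For n0 = 1,000 and t = 5 this reproduces the best-estimate entries of Tables 7.6 and 7.7 below.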
Example 7.1 In the following tables, we provide an example in which the age at entry is x0 = 65, the interest rate is 3% p.a., and the annual amount of each annuity is b(j) = 1. It then follows that Bt(Π) = Nt.
In Table 7.2, the expected present value of future payments is presented,
per annuitant (having set, under each assumption Ah (τ), nt = E[Nt |Ah (τ)]
for each valuation time t, t = 0, 5, . . . , 35). As was clear from the assump-
tions (see also Table 7.1), at the time of issue the five assumptions (ordered
from 1 to 5) imply an increasing expected present value of future payments.
The comparison may change in later years, due to the shape of the survival
function for a given assumption (actually, some survival functions cross
over each other; see Fig. 7.3, top panel). From these results, we get an idea
about the possible range of variation of the current value of liabilities, due
to uncertainty about the mortality trend.
In Table 7.3, we present the variance of the present value of future pay-
ments, per annuitant. The illustrated variability is a consequence of the
rectangularization level implied by the different assumptions. We recall that
only process risk is accounted for in this assessment; so when addressing
longevity risk such information is not of intrinsic interest, but is helpful for
comparison with the impact of longevity risk.
To compare longevity risk with process risk, we make some further
calculations involving process risk only. Thus, Tables 7.4 and 7.5 show,
Table 7.3. Variance of the present value of future payments, conditional on a given scenario, per policy in-force at time t: Var[Yt(Π)|Ah(τ), nt]/nt = Var[Yt(1)|Ah(τ)]

Time t    A1(τ)    A2(τ)    A3(τ)    A4(τ)    A5(τ)
0 20.838 25.301 23.804 22.250 25.315
5 20.858 24.858 23.994 22.985 26.102
10 18.963 22.607 22.375 21.970 25.229
15 15.314 18.780 19.008 19.095 22.581
20 10.777 14.092 14.505 14.838 18.493
25 6.550 9.497 9.855 10.181 13.726
30 3.479 5.771 5.969 6.159 9.198
35 1.677 3.217 3.277 3.337 5.594

Table 7.4. Coefficient of variation of the present value of future payments, conditional on the best-estimate scenario: CV[Yt(Π)|A3(τ), nt]

Time t    n0 = 1    n0 = 100    n0 = 1,000    n0 = 10,000    …    n0 → ∞
0 31.974% 2.982% 0.969% 0.027% … 0%
5 38.514% 3.618% 1.156% 0.030% … 0%
10 47.039% 4.452% 1.397% 0.038% … 0%
15 58.973% 5.626% 1.734% 0.056% … 0%
20 77.647% 7.469% 2.259% 0.103% … 0%
25 111.894% 10.853% 3.218% 0.250% … 0%
30 189.580% 18.541% 5.379% 0.883% … 0%
35 424.200% 41.832% 11.815% 5.202% … 0%
respectively, the coefficient of variation for some initial sizes of the portfo-
lio and some percentiles of the present value of future payments, per unit
of expected value. Only the best-estimate assumption is considered. As far
as the coefficient of variation is concerned, we note that at any given time it
decreases rapidly as the size of the portfolio increases, as we commented on
earlier. For a given initial portfolio size, the coefficient of variation increases
in time; this is due to the decreasing residual size of the portfolio and to
annuitants becoming older as well. A similar result is found when analysing
the right tail of the distribution, as it emerges in Table 7.5.
Tables 7.6 and 7.7 highlight the distribution of annual outflows.
In particular, Table 7.6 quotes the expected value of annual outflows under
the different assumptions; we recall that, having set b(j) = 1, what is shown
is the expected number of annuitants (not rounded, to avoid too many
approximations). Remarks are similar to those discussed for the present
value of future payments. 
Table 7.5. Some percentiles of the present value of future payments, conditional on the best-estimate scenario, per unit of expected value: yt,ε[A3(τ), nt] / E[Yt(Π)|A3(τ), nt]

Time t    ε = 0.75    ε = 0.90    ε = 0.95    ε = 0.99
Initial portfolio size: n0 = 100
0 2.159% 3.995% 4.983% 6.739%
5 2.500% 4.863% 6.266% 8.554%
10 3.074% 6.110% 7.904% 11.289%
15 3.738% 7.161% 9.857% 13.801%
20 5.418% 10.393% 12.898% 17.866%
25 8.319% 15.577% 20.338% 26.503%
30 13.658% 26.115% 33.982% 47.540%
35 32.386% 63.107% 83.067% 130.409%
Initial portfolio size: n0 = 1,000
0 0.635% 1.286% 1.631% 2.286%
5 0.820% 1.531% 1.934% 2.668%
10 0.898% 1.923% 2.423% 3.386%
15 1.131% 2.221% 2.854% 4.472%
20 1.354% 2.692% 3.781% 6.223%
25 2.117% 4.281% 5.443% 7.967%
30 3.638% 7.355% 9.765% 14.334%
35 9.155% 18.426% 22.253% 31.641%
Initial portfolio size: n0 = 10,000
0 0.200% 0.407% 0.523% 0.733%
5 0.238% 0.461% 0.609% 0.850%
10 0.332% 0.622% 0.786% 1.051%
15 0.415% 0.739% 0.967% 1.385%
20 0.518% 0.968% 1.239% 1.761%
25 0.670% 1.414% 1.765% 2.705%
30 1.165% 2.317% 3.090% 4.309%
35 2.323% 4.661% 6.048% 10.430%
We now assign the (naive) probability distribution (7.2) on the set A(τ).
The unknown mortality trend, assumed to lie in A(τ), is denoted by Ã(τ).
For the unconditional expected present value of future payments, the
following relations hold (the suffix ρ denotes that the underlying probability
distribution is given by (7.2)):
E[Yt(Π) | nt] = Eρ[E[Yt(Π) | Ã(τ), nt]] = nt Eρ[E[Yt(1) | Ã(τ)]]
            = nt ∑_{h=1}^{m} E[Yt(1) | Ah(τ)] ρh = nt E[Yt(1)]    (7.16)
Table 7.6. Expected value of annual outflows, conditional on a given scenario: E[Bt(Π)|Ah(τ)]; initial portfolio size: n0 = 1,000

Time t    A1(τ)    A2(τ)    A3(τ)    A4(τ)    A5(τ)
5 963.105 954.252 963.630 971.087 969.636
10 893.255 877.810 900.177 918.558 918.556
15 768.675 756.711 794.479 826.872 835.463
20 570.930 581.960 632.539 678.377 707.957
25 319.516 367.226 418.752 468.784 530.954
30 105.929 165.716 200.690 237.508 323.526
35 14.160 43.221 55.764 70.089 139.572
Table 7.7. Coefficient of variation of annual outflows, conditional on the best-estimate scenario: CV[Bt(Π)|A3(τ)]

Time t    n0 = 100    n0 = 1,000    n0 = 10,000
5 1.943% 0.614% 0.194%
10 3.330% 1.053% 0.333%
15 5.086% 1.608% 0.509%
20 7.622% 2.410% 0.762%
25 11.782% 3.726% 1.178%
30 19.957% 6.311% 1.996%
35 41.150% 13.013% 4.115%
where E[Yt(1)] = ∑_{h=1}^{m} E[Yt(1) | Ah(τ)] ρh.
The unconditional variance of Yt(Π) can be calculated as

Var[Yt(Π) | nt] = Eρ[Var[Yt(Π) | Ã(τ), nt]] + Varρ[E[Yt(Π) | Ã(τ), nt]]
              = nt Eρ[Var[Yt(1) | Ã(τ)]] + nt² Varρ[E[Yt(1) | Ã(τ)]]
              = nt ∑_{h=1}^{m} Var[Yt(1) | Ah(τ)] ρh + nt² ∑_{h=1}^{m} (E[Yt(1) | Ah(τ)] − E[Yt(1)])² ρh    (7.17)
The first term in the expression for the variance reflects deviations around the expected value, so it can be thought of as a measure of process risk. The second term, instead, reflects deviations of the expected value itself (i.e. systematic deviations), and so it may be thought of as a measure of longevity (namely parameter, in our example) risk. Under the unconditional
valuation, the coefficient of variation now takes the following expression:
CV[Yt(Π) | nt] = √Var[Yt(Π) | nt] / E[Yt(Π) | nt]
             = √( (1/nt) · Eρ[Var[Yt(1) | Ã(τ)]] / E²[Yt(1)] + Varρ[E[Yt(1) | Ã(τ)]] / E²[Yt(1)] )    (7.18)
The first term under the square root shows that random fluctuations rep-
resent a pooling risk, since (in relative terms) their effect is absorbed by
the size of the portfolio. This result is similar to that obtained under the
valuation conditional on a given mortality trend (see (7.13)). The second
term, instead, shows that systematic deviations constitute a non-pooling
risk, which is not affected by changes in the portfolio size. In particular, the
asymptotic value of the risk index
lim_{nt→∞} CV[Yt(Π) | nt] = √Varρ[E[Yt(1) | Ã(τ)]] / E[Yt(1)]    (7.19)
can be thought of as a measure of that part of the mortality risk which is not affected by simply changing the size of the portfolio.
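The decomposition (7.17) and the limit (7.19) can be evaluated exactly at time 0, without simulation. The sketch below is our own illustration: it uses the parameter sets of Table 7.1 and anticipates the weights later quoted in Table 7.8, computes the conditional moments of Y0(1) under each assumption, and mixes them into the unconditional risk index (7.18).

```python
import math

# (G, K) for A1..A5 from Table 7.1 and the weights of Table 7.8 (assumptions of this sketch).
PARAMS = [(6.378e-07, 1.14992), (3.803e-06, 1.12347), (2.005e-06, 1.13025),
          (1.060e-06, 1.13705), (3.149e-06, 1.11962)]
RHO = [0.1, 0.1, 0.6, 0.1, 0.1]
OMEGA, X0, I, B = 117, 65, 0.03, 1.0
v = 1.0 / (1.0 + I)

def annuitant_moments(G, K):
    """E and Var of Y_0^(1) = b * a_{K_65|} conditional on one trend assumption."""
    pv = [0.0]
    for k in range(1, OMEGA - X0 + 1):
        pv.append(pv[-1] + B * v ** k)
    probs, surv = [], 1.0
    for k in range(OMEGA - X0):
        g = G * K ** (X0 + k)
        q = g / (1.0 + g)
        probs.append(surv * q)
        surv *= 1.0 - q
    probs.append(surv)
    mean = sum(p * pv[k] for k, p in enumerate(probs))
    var = sum(p * (pv[k] - mean) ** 2 for k, p in enumerate(probs))
    return mean, var

E, V = zip(*(annuitant_moments(G, K) for G, K in PARAMS))
Ebar = sum(r * e for r, e in zip(RHO, E))                      # E[Y_0^(1)]
pooling = sum(r * s2 for r, s2 in zip(RHO, V))                 # E_rho[Var[. | A~]]
nonpooling = sum(r * (e - Ebar) ** 2 for r, e in zip(RHO, E))  # Var_rho[E[. | A~]]

def cv(n):
    """Unconditional risk index (7.18) at time 0 for portfolio size n."""
    return math.sqrt(pooling / n + nonpooling) / Ebar

limit = math.sqrt(nonpooling) / Ebar                           # asymptotic value (7.19)
```

The limit is strictly positive: unlike the conditional index of (7.13), the unconditional one does not vanish as the portfolio grows.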
The ε-percentile of the unconditional probability distribution of Yt(Π) under an observed size of the in-force portfolio nt at time t is defined as

yt,ε[nt] = inf{u ≥ 0 | P[Yt(Π) ≤ u | nt] > ε}    (7.20)
To assess this quantity, stochastic simulation is required: first the mortality trend is randomly drawn from A(τ), and then the lifetimes of annuitants are generated.
As regards annual outflows, similar valuations and comments apply.
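The two-stage simulation just described can be sketched directly: each run first draws a trend assumption from A(τ) with the weights of Table 7.8, and then draws the lifetimes conditional on that trend; empirical percentiles of the simulated sample estimate (7.20). The seed and run counts are arbitrary choices of this sketch.

```python
import bisect, random

random.seed(2024)
# Table 7.1 parameters and Table 7.8 weights (assumptions of this sketch).
PARAMS = [(6.378e-07, 1.14992), (3.803e-06, 1.12347), (2.005e-06, 1.13025),
          (1.060e-06, 1.13705), (3.149e-06, 1.11962)]
RHO = [0.1, 0.1, 0.6, 0.1, 0.1]
OMEGA, X0, I = 117, 65, 0.03
v = 1.0 / (1.0 + I)

a = [0.0]                          # a[k] = annuity-certain of k unit payments
for k in range(1, OMEGA - X0 + 1):
    a.append(a[-1] + v ** k)

def lifetime_cdf(G, K):
    """cdf[k] = P[K_65 <= k] under one trend assumption."""
    cdf, surv = [], 1.0
    for k in range(OMEGA - X0):
        surv *= 1.0 / (1.0 + G * K ** (X0 + k))
        cdf.append(1.0 - surv)
    return cdf

CDFS = [lifetime_cdf(G, K) for G, K in PARAMS]

def one_run(n0=100):
    cdf = random.choices(CDFS, weights=RHO)[0]   # step 1: draw the trend from A(tau)
    return sum(a[min(bisect.bisect_left(cdf, random.random()), OMEGA - X0)]
               for _ in range(n0))               # step 2: draw the lifetimes

sims = sorted(one_run() for _ in range(2000))
ybar = sum(sims) / len(sims)
p75, p90, p95 = sims[1500], sims[1800], sims[1900]
```

The right tail is visibly heavier than under the conditional valuation of (7.15): the systematic component does not pool away.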
Example 7.2 We now describe a numerical example of the results presented above. We consider the same inputs as Example 7.1. We assign to A(τ) the weights quoted in Table 7.8. The best-estimate assumption A3(τ) has been given the highest weight; the residual weight has been spread uniformly over the remaining assumptions.
Table 7.9 shows the unconditional expected value of future payments. Its
magnitude is driven by the best-estimate assumption, as seen by comparison
with the results in Table 7.2.
Table 7.8. Probability distribution on A(τ)

Assumption    Weight ρh
A1 (τ) 0.1
A2 (τ) 0.1
A3 (τ) 0.6
A4 (τ) 0.1
A5 (τ) 0.1
Table 7.9. (Unconditional) expected present value of future payments, per policy in-force at time t: E[Yt(Π)|nt]/nt = E[Yt(1)]

Time t    E[Yt(Π)|nt]/nt
0 15.290
5 12.985
10 10.625
15 8.317
20 6.187
25 4.353
30 2.894
35 1.824

In Table 7.10, the unconditional variance of Yt(Π) for some portfolio
sizes is shown, split into the pooling and non-pooling components. For
comparison with the conditional valuation, also the case n0 = 1 is quoted.
We note the increase in the magnitude of the variance, due to the non-
pooling part, as the portfolio size increases. Whenever the portfolio is large
at policy issue, the non-pooling component remains important relative to
the pooling component even at high policy durations.
The behaviour of the coefficient of variation in respect of the portfolio
size is illustrated in Table 7.11. When compared with the case allowing for
process risk only (see Table 7.4), the risk index decreases more slowly as the
portfolio size increases. We note, in particular, its positive limiting value,
which is evidence of the magnitude of the systematic risk.
In Table 7.12 the right tail of the distribution of the present value of
future payments is investigated, for some portfolio sizes. We note that the
tail is rather heavier than in the case allowing for process risk only (see
Table 7.5).
Finally, in Tables 7.13–7.15 the distribution of annual outflows is inves-
tigated. Similar remarks hold to those made above for the distribution of
future payments. 
Table 7.10. (Unconditional) variance of the present value of future payments per policy in-force at time t, and components

Time t    Variance Var[Yt(Π)|nt]/nt    Pooling part Eρ[Var[Yt(Π)|Ã(τ), nt]] / Var[Yt(Π)|nt]    Non-pooling part Varρ[E[Yt(Π)|Ã(τ), nt]] / Var[Yt(Π)|nt]
Initial portfolio size: n0 = 1
0 23.916 98.90% 1.10%
5 23.303 98.73% 1.27%
10 20.369 98.56% 1.44%
15 15.322 98.43% 1.57%
20 9.331 98.45% 1.55%
25 4.202 98.75% 1.25%
30 1.221 99.30% 0.70%
35 0.187 99.79% 0.21%
Initial portfolio size: n0 = 100
0 50.026 47.28% 52.72%
5 52.493 43.83% 56.17%
10 49.436 40.61% 59.39%
15 39.209 38.46% 61.54%
20 23.670 38.81% 61.19%
25 9.391 44.18% 55.82%
30 2.062 58.81% 41.19%
35 0.226 82.60% 17.40%
Initial portfolio size: n0 = 1,000
0 287.390 8.23% 91.77%
5 317.858 7.24% 92.76%
10 313.680 6.40% 93.60%
15 256.365 5.88% 94.12%
20 154.023 5.96% 94.04%
25 56.568 7.33% 92.67%
30 9.707 12.49% 87.51%
35 0.580 32.20% 67.80%
Initial portfolio size: n0 = 10,000
0 2 661.023 0.89% 99.11%
5 2 971.508 0.77% 99.23%
10 2 956.118 0.68% 99.32%
15 2 427.922 0.62% 99.38%
20 1 457.548 0.63% 99.37%
25 528.334 0.79% 99.21%
30 86.159 1.41% 98.59%
35 4.120 4.53% 95.47%

We finally address the problem of choosing the weights (7.2). As we have already mentioned, data to estimate such weights are rarely available. However, some numerical tests suggest that the weights do not deeply affect the results of the investigation, unless only process risk is allowed for. We show this effect in Example 7.3.
Table 7.11. (Unconditional) coefficient of variation of the present value of future payments: CV[Yt(Π)|nt]

Time t    n0 = 1    n0 = 100    n0 = 1,000    n0 = 10,000    …    n0 → ∞
0 31.985% 4.626% 3.506% 3.374% … 3.359%
5 38.579% 5.790% 4.506% 4.356% … 4.339%
10 47.188% 7.351% 5.856% 5.685% … 5.665%
15 59.241% 9.477% 7.663% 7.457% … 7.434%
20 78.060% 12.432% 10.029% 9.756% … 9.725%
25 112.448% 16.811% 13.048% 12.610% … 12.560%
30 190.275% 24.726% 16.965% 15.983% … 15.870%
35 425.348% 46.751% 23.680% 19.957% … 19.499%

Table 7.12. Some (unconditional) percentiles of the present value of future payments, per unit of expected value: yt,ε[nt] / E[Yt(Π)|nt]

Time t    ε = 0.75    ε = 0.90    ε = 0.95    ε = 0.99
Initial portfolio size: n0 = 100
0 3.473% 6.175% 7.796% 11.465%
5 3.453% 8.010% 10.041% 14.405%
10 4.638% 10.290% 14.492% 19.236%
15 5.293% 13.350% 18.412% 25.884%
20 7.968% 17.498% 25.518% 34.996%
25 14.135% 23.243% 30.693% 52.379%
30 17.964% 37.964% 46.114% 75.150%
35 33.817% 70.810% 92.929% 138.271%
Initial portfolio size: n0 = 1,000
0 1.057% 5.164% 7.357% 8.591%
5 1.378% 6.300% 9.654% 11.276%
10 1.878% 7.501% 12.471% 14.889%
15 2.108% 9.426% 16.712% 19.332%
20 2.797% 11.838% 22.206% 25.939%
25 3.417% 14.771% 29.772% 34.880%
30 5.100% 19.791% 37.774% 47.734%
35 14.933% 27.796% 48.299% 72.891%
Initial portfolio size: n0 = 10,000
0 0.261% 4.124% 7.298% 7.726%
5 0.189% 4.800% 9.695% 10.062%
10 0.254% 5.417% 12.743% 13.304%
15 0.316% 6.102% 16.754% 17.552%
20 0.461% 6.499% 22.162% 23.146%
25 0.799% 6.417% 28.886% 30.326%
30 1.571% 7.292% 36.976% 40.782%
35 2.902% 13.794% 47.056% 52.805%
7.2 The longevity risk 289

Table 7.13. (Unconditional) expected value of annual outflows: E[Bt]; initial portfolio size: n0 = 1,000

Time t    E[Bt]

5 963.986
10 900.924
15 795.459
20 633.446
25 419.899
30 203.682
35 60.162

Table 7.14. Components of the (unconditional) variance of annual outflows

          Pooling part                      Non-pooling part
Time t    Eρ[Var[Bt | Ã(τ)]] / Var[Bt]     Varρ[E[Bt | Ã(τ)]] / Var[Bt]

Initial portfolio size: n0 = 100
5 99.488% 0.512%
10 98.652% 1.348%
15 97.119% 2.881%
20 94.229% 5.771%
25 89.724% 10.276%
30 85.729% 14.271%
35 86.181% 13.819%
Initial portfolio size: n0 = 1,000
5 95.104% 4.896%
10 87.976% 12.024%
15 77.124% 22.876%
20 62.016% 37.984%
25 46.613% 53.387%
30 37.528% 62.472%
35 38.409% 61.591%
Initial portfolio size: n0 = 10,000
5 66.013% 33.987%
10 42.253% 57.747%
15 25.214% 74.786%
20 14.036% 85.964%
25 8.030% 91.970%
30 5.667% 94.333%
35 5.870% 94.130%
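The split shown in Table 7.14 is an application of the law of total variance: Var[Bt] = Eρ[Var[Bt | A(τ)]] + Varρ[E[Bt | A(τ)]]. As a rough sketch of how the two components can be estimated by simulation, consider the following; the flat annual survival probabilities, the weights, and the portfolio size are purely illustrative and are not the book's assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical trend scenarios: (flat annual survival probability, weight).
scenarios = [(0.955, 0.20), (0.965, 0.60), (0.975, 0.20)]
n0, b, t, n_sims = 1000, 1.0, 10, 20000

cond_mean, cond_var, weight = [], [], []
for p, w in scenarios:
    n = np.full(n_sims, n0)
    for _ in range(t):                       # binomial thinning, year by year
        n = rng.binomial(n, p)
    outflows = b * n                         # annual outflow B_t given the trend
    cond_mean.append(outflows.mean())
    cond_var.append(outflows.var(ddof=1))
    weight.append(w)

w = np.array(weight)
pooling = np.dot(w, cond_var)                # E_rho[ Var[B_t | A(tau)] ]
non_pooling = np.dot(w, (np.array(cond_mean) - np.dot(w, cond_mean)) ** 2)
total = pooling + non_pooling                # law of total variance
print(f"pooling share: {pooling/total:.1%}, non-pooling share: {non_pooling/total:.1%}")
```

With more divergent trend scenarios, or a larger initial portfolio, the non-pooling share grows, in line with the pattern visible in the table.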
290 7 : The longevity risk: actuarial perspectives

Table 7.15. (Unconditional) coefficient of variation of annual outflows: CV[Bt]

Time t    n0 = 100    n0 = 1,000    n0 = 10,000
5          6.143%      1.943%        0.614%
10        10.531%      3.330%        1.053%
15 16.084% 5.086% 1.608%
20 24.103% 7.622% 2.410%
25 37.257% 11.782% 3.726%
30 63.110% 19.957% 6.311%
35 130.126% 41.150% 13.013%

Table 7.16. Alternative probability distributions on A(τ)

          Weighting system
Weight    (a)    (b)    (c)     (d)
ρ1        0      0.1    0.15    0.2
ρ2        0      0.1    0.15    0.2
ρ3        1      0.6    0.4     0.2
ρ4        0      0.1    0.15    0.2
ρ5        0      0.1    0.15    0.2

Example 7.3 In this example, we compare the right tail of the distribution of the present value of future payments under the alternative weighting systems for (7.2) presented in Table 7.16. System (a) is the one allowing for process risk only (see Example 7.1). System (b) is the one adopted in Example 7.2. System (c) is similar to (b): the highest weight is still assigned to the best-estimate assumption, but it has been reduced. System (d), finally, consists of a uniform distribution of weights.
We focus on the right tail of the distribution of the present value of future payments (and not on the other risk measures considered previously, such as the risk index) because of its practical importance: reserving or capital allocation could be based on this quantity (see also Section 7.3.3).
From the details presented for this example in Table 7.17, it seems that whenever parameter risk is allowed for, the magnitude of the right tail is not deeply affected by the weighting system (although, of course, the actual figure does depend on the specific weights). By contrast, a clear difference emerges between the results found under system (a), on the one hand, and systems (b)–(d), on the other. This suggests that, when information is poor, allowing for longevity risk at all is more important than the actual choice of the weights. □
7.2 The longevity risk 291

Table 7.17. Some (unconditional) percentiles of the present value of future payments, per unit of expected value: y_{t,ε}[nt] / E[Yt | nt], under alternative weighting systems; n0 = 1,000

Probability
Time t    ε = 0.75    ε = 0.90    ε = 0.95    ε = 0.99

System (a)
0 0.635% 1.286% 1.631% 2.286%
5 0.820% 1.531% 1.934% 2.668%
10 0.898% 1.923% 2.423% 3.386%
15 1.131% 2.221% 2.854% 4.472%
20 1.354% 2.692% 3.781% 6.223%
25 2.117% 4.281% 5.443% 7.967%
30 3.638% 7.355% 9.765% 14.334%
35 9.155% 18.426% 22.253% 31.641%
System (b)
0 1.057% 5.164% 7.357% 8.591%
5 1.378% 6.300% 9.654% 11.276%
10 1.878% 7.501% 12.471% 14.889%
15 2.108% 9.426% 16.712% 19.332%
20 2.797% 11.838% 22.206% 25.939%
25 3.417% 14.771% 29.772% 34.880%
30 5.100% 19.791% 37.774% 47.734%
35 14.933% 27.796% 48.299% 72.891%
System (c)
0 2.912% 6.785% 7.652% 8.539%
5 3.206% 8.850% 9.822% 11.384%
10 3.615% 11.893% 13.119% 15.178%
15 3.881% 15.645% 17.404% 19.643%
20 4.258% 20.997% 23.363% 26.253%
25 5.047% 27.391% 31.364% 35.474%
30 6.192% 35.431% 41.423% 49.504%
35 16.809% 41.965% 54.794% 74.366%
System (d)
0 3.697% 7.142% 7.642% 8.508%
5 4.609% 9.408% 10.362% 11.702%
10 5.079% 12.195% 13.497% 15.037%
15 5.480% 16.397% 17.687% 19.671%
20 5.929% 21.825% 23.725% 26.542%
25 7.025% 29.249% 32.239% 35.218%
30 8.782% 36.966% 42.059% 50.565%
35 19.443% 46.965% 60.689% 74.138%

We have noted that the most important aspect is to allow for parameter
risk by assigning positive weights to trend assumptions alternative to the
best-estimate one. However, the specific weights do affect the magnitude of
quantities of interest (such as the tail of the distribution of future payments).
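The percentiles y_{t,ε}[nt] reported in Tables 7.12 and 7.17 can be approximated by simulating the present value of future payments under the mixture of trend assumptions. A minimal sketch follows, with invented flat survival probabilities and only two of the weighting systems; the figures it produces are illustrative, not those of the tables:

```python
import numpy as np

rng = np.random.default_rng(0)

horizon = 40                          # maximum residual duration (illustrative)
n0, b, v = 1000, 1.0, 1 / 1.03        # cohort size, annual benefit, discount factor

# Invented trend assumptions: flat annual survival probabilities.
p_scenarios = [0.950, 0.960, 0.965, 0.970, 0.980]
systems = {"(a)": [0, 0, 1, 0, 0], "(d)": [0.2] * 5}   # two weighting systems

def pv_paths(p, n_sims):
    """Simulated present values of future annuity payments under one trend."""
    n = np.full(n_sims, n0)
    pv = np.zeros(n_sims)
    for t in range(1, horizon + 1):
        n = rng.binomial(n, p)
        pv += b * n * v ** t
    return pv

for name, weights in systems.items():
    # Approximate the mixture: allocate simulations in proportion to weights.
    sims = np.concatenate([pv_paths(p, max(1, int(4000 * w)))
                           for p, w in zip(p_scenarios, weights) if w > 0])
    q95 = np.percentile(sims, 95)
    print(f"system {name}: 95th percentile / mean = {q95 / sims.mean():.4f}")
```

Under system (a) only process risk is present and the tail stays close to the mean; spreading weight across trends, as in system (d), widens the right tail.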

A Bayesian inferential model could provide an appropriate method for updating the weights. We briefly discuss how one could structure such a procedure.
We still refer to a cohort of annuitants which is homogeneous and whose lifetimes, conditional on a given trend, are independent and identically distributed. The observed number of annuitants at time t is nt. As in the static approach to stochastic mortality, we assume that the trend of the cohort is unknown but fixed (i.e. subject neither to shocks nor to unanticipated shifts). The set of trend assumptions is given by (7.1). In the current context, the set of weights (7.2) will be denoted as

$$ \{\rho(A_h(\tau))\}_{h=1,2,\dots,m} \qquad (7.21) $$

We let f0(t|A(τ)) denote the probability density function (briefly: pdf) of the lifetime at birth of one individual, conditional on assumption A(τ) about the mortality trend. We then let S(t|A(τ)) denote the relevant survival function.
Within the inferential procedure, the sampling pdf is defined as follows:

$$ f_t(z \mid A(\tau)) = \begin{cases} 0 & \text{for } z \le t \\[4pt] \dfrac{f_0(z \mid A(\tau))}{S(t \mid A(\tau))} & \text{for } z > t \end{cases} \qquad (7.22) $$

The multivariate sampling pdf is then given by

$$ f_t(z^{(1)}, z^{(2)}, \dots, z^{(n_t)} \mid A(\tau)) = \prod_{j=1}^{n_t} f_t(z^{(j)} \mid A(\tau)) \qquad (7.23) $$

Note that

$$ f_t(z) = \sum_{h=1}^{m} f_t(z \mid A_h(\tau))\, \rho(A_h(\tau)) \qquad (7.24) $$

represents the (prior) predictive pdf restricted to the age interval [t, ω − x0].
Assume now the observation period [t, t′]. Let d denote the number of deaths observed in such period. With an appropriate renumbering, let

$$ \mathbf{x} = \{x^{(1)}, x^{(2)}, \dots, x^{(d)}\} \qquad (7.25) $$

denote the array of ages at death. We note that the defined observation procedure implies Type I-censored sampling (see, for instance, Namboodiri and Suchindran (1987)).
Using the information provided by the pair (d, x), the (posterior) predic-
tive pdf ft (z|d, x) can be constructed. With this objective in mind, we can
adopt the following procedure (usual in the Bayesian context):

1. Update the initial opinion about the possible evolution of mortality, and hence the probability distribution over the set of trend assumptions A(τ), by calculating the posterior weights

$$ \rho(A_h(\tau) \mid d, \mathbf{x}) \propto \rho(A_h(\tau))\, L(A_h(\tau) \mid d, \mathbf{x}) \qquad (7.26) $$

where L(A_h(τ)|d, x) denotes the likelihood function;
2. Calculate the (posterior) predictive pdf as

$$ f_t(z \mid d, \mathbf{x}) = \sum_{h=1}^{m} f_t(z \mid A_h(\tau))\, \rho(A_h(\tau) \mid d, \mathbf{x}) \qquad (7.27) $$

Step 1 requires the construction of the likelihood function L(A_h(τ)|d, x). We have (see, e.g. Namboodiri and Suchindran (1987)):

$$ L(A_h(\tau) \mid d, \mathbf{x}) \propto \left( \prod_{k=1}^{d} f_t(x^{(k)} - t \mid A_h(\tau)) \right) \left( \frac{S(t' \mid A_h(\tau))}{S(t \mid A_h(\tau))} \right)^{n_t - d} \qquad (7.28) $$

The inferential procedure described above could be adopted within internal solvency models, whenever alternative projected mortality tables are available. Some numerical investigations in this regard are discussed by Olivieri and Pitacco (2002a).
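A toy version of steps 1 and 2 can be sketched as follows. The exponential survival functions below merely stand in for the projected tables A_h(τ) of (7.1), and the death data are invented, so all the numbers produced are purely illustrative:

```python
import numpy as np

# Invented trend assumptions A_h(tau): exponential residual lifetimes with
# different hazards, standing in for the projected tables of (7.1).
hazards = np.array([0.040, 0.035, 0.030, 0.025, 0.020])
prior = np.array([0.1, 0.1, 0.6, 0.1, 0.1])        # weights rho(A_h(tau))

def S(u, mu):                 # survival function under hazard mu
    return np.exp(-mu * u)

def f(u, mu):                 # lifetime density under hazard mu
    return mu * np.exp(-mu * u)

# Observation period [t, t']; d deaths at times since t given by x (invented).
t, t_prime, n_t = 0.0, 5.0, 100
x = np.array([1.2, 2.7, 3.1, 4.4, 4.9])
d = len(x)

# Likelihood (7.28): densities for the d deaths, and a survival factor for
# the n_t - d censored survivors.
lik = np.array([np.prod(f(x, mu) / S(t, mu))
                * (S(t_prime, mu) / S(t, mu)) ** (n_t - d)
                for mu in hazards])

posterior = prior * lik / np.dot(prior, lik)       # (7.26), normalized
print("posterior weights:", np.round(posterior, 3))

# Posterior predictive density (7.27) at a point z > t:
z = 10.0
pred = float(np.dot(posterior, [f(z, mu) / S(t, mu) for mu in hazards]))
print(f"predictive density at z = {z}: {pred:.5f}")
```

With only 5 deaths out of 100 lives over 5 years, the posterior shifts weight towards the lighter-mortality assumptions, exactly the mechanism by which observed longevity improvements would reweight the trend assumptions.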

7.3 Managing the longevity risk

7.3.1 A risk management perspective

Several tools can be developed to manage longevity risk. These tools can be
placed and analysed in a risk management (RM) framework.
As sketched in Fig. 7.4, the RM process consists of three basic steps,
namely the identification of risks, the assessment (or measurement) of the
relevant consequences, and the choice of the RM techniques. In what fol-
lows we refer to the RM process applied to life insurance, in general, and
to life annuity portfolios, in particular.
The identification of risks affecting an insurer can follow, for example, the
guidelines provided by IAA (2004) or those provided within the Solvency 2
project (see CEIOPS, 2007 and CEIOPS, 2008). Mortality/longevity risks
belong to underwriting risks; the relevant components have already been
discussed (see Section 7.2.1). Obviously, for an insurer the importance of
the longevity risk within the class of mortality risks is strictly related to the

[Figure 7.4 is a three-column flowchart. Identification: underwriting risk (mortality/longevity risk, with volatility, level uncertainty, trend uncertainty, and catastrophe components; lapse risk; …), market risk, …. Assessment: deterministic models (sensitivity testing, scenario testing) and stochastic models (risk index, VaR, probability of default, …). Risk management techniques (portfolio strategies): loss control via product design, with loss prevention (frequency control) through pricing (life table, guarantees, options, expense loading, etc.) and loss reduction (severity control) through the participation mechanism; risk mitigation via loss financing and portfolio protection, with hedging (natural hedging), transfer (reinsurance, ART), and retention (capital allocation, or no advance funding).]

Figure 7.4. The risk management process.

relative weight of the life annuity portfolio with respect to the overall life
business.
A rigorous assessment of the longevity risk requires the use of stochastic models (i.e. approach 5 in Fig. 7.2). In Section 7.2.4 we have provided some examples of risk measures, viz. the variance, the coefficient of variation, and the right tail of liabilities; these need to be appropriately defined, and in Section 7.2.4 they were stated in terms of the present value of future payments and of annual outflows. A further example is given by the probability of default (or ruin probability, in the traditional language), which will be considered in Section 7.3.3 when dealing with the solvency problem. As discussed in Section 7.2.2, deterministic models (i.e. approach 4 in Fig. 7.2) can provide useful, although rough, insights into the impact of longevity risk on portfolio results. In particular, as outlined in Sections 7.2.3 and 7.2.4, deterministic models allow us to calculate the range of values that some quantities (present value of future payments, annual outflows, or others) may assume in respect of the outcome of the underlying random quantity.

[Figure 7.5 plots annual outflows against time: the actual outflows fluctuate around the expected values, with a horizontal threshold line above both.]

Figure 7.5. Annual outflows in a portfolio of immediate life annuities (one cohort).
Risk management techniques for dealing with longevity risk include a wide set of tools, which can be interpreted, from an insurance perspective, as portfolio strategies aimed at risk mitigation.
A number of portfolio results can be taken as ‘metrics’ to assess the
effectiveness of portfolio strategies. In what follows, we focus on annual
outflows relating to annuity payments only, which, in any event, consti-
tute the starting point from which other quantities (e.g. profits) may be
derived.
In Fig. 7.5, we present a sequence of outflows, together with a barrier (the
‘threshold’) which represents a maintainable level of benefit payment. The
threshold amount is financed first by premiums via the portfolio technical
provision, and then by shareholders’ capital as the result of the allocation
policy (consisting of specific capital allocations as well as an accumulation
of undistributed profits).

The situation depicted in Fig. 7.5, namely some annual outflows exceeding the threshold level, should clearly be avoided. To lower the probability of such critical situations, the insurer can resort to various portfolio strategies, in the framework of the RM process.
Figure 7.6 illustrates a wide range of portfolio strategies which aim at risk
mitigation, in terms of lowering the probability and the severity of events
like the situation depicted in Fig. 7.5. In practical terms, a portfolio strategy
can have as targets

(i) an increase in the maintainable annual outflow, and thus a higher


threshold level;
(ii) lower (and smoother) annual outflows in the case of unanticipated
improvements in portfolio mortality.

Both loss control and loss financing techniques (according to the RM lan-
guage) can be adopted to achieve targets (i) and (ii). Loss control techniques
are mainly performed via the product design, that is, via an appropriate
choice of the various items which constitute an insurance product. In par-
ticular, loss prevention is usually interpreted as the RM technique which
aims to mitigate the loss frequency, whereas loss reduction aims at lowering
the severity of the possible losses.
The pricing of insurance products provides a tool for loss prevention. This
portfolio strategy is represented by path (1) → (a) in Fig. 7.6. Referring to

(1) Single
Reserve
premiums

(a) Threshold
(2) Allocation
Shareholders'
capital
(3) Undistributed
profits

(4) Profit partic.


Annual
Gross outflow
benefits
(5) [Reduction]

(b) Net outflow


(6) Reinsurance

(7) Swaps Transfers

(8) Longevity
bonds

Figure 7.6. Portfolio strategies for risk mitigation.


7.3 Managing the longevity risk 297

a life annuity product, the following issues, in particular, should be taken


into account.

– Mortality improvements require the use of a projected life table for pricing life annuities.
– Because of the uncertainty in the future mortality trend, a premium formula other than the traditional one based on the equivalence principle (see Section 1.6.1, and formula (1.57) in particular) should be adopted. It should be noted that, by adopting the equivalence principle, the longevity risk can be accounted for only via a (rough) safety loading, which is calculated by increasing the survival probabilities resulting from the projected life table. Indeed, this approach is often adopted in current actuarial practice.
– The presence, in an accumulation product such as an endowment, of an option to annuitize at a fixed annuitization rate (the so-called Guaranteed Annuity Option, briefly GAO – see Section 1.6.2) requires an accurate pricing model accounting for the value of the option itself.

To pursue loss reduction, it is necessary to control the annuity amounts paid out. Hence, some flexibility must be added to the life annuity product.
One action could be the reduction of the annual amount as a consequence
of an unanticipated mortality improvement (path (5) → (b) in Fig. 7.6).
However, in this case the product would be a non-guaranteed life annuity,
although possibly with a reasonable minimum amount guaranteed. A more
practicable tool, consistent with the features of a guaranteed life annuity,
consists of reducing the level of investment profit participation when the
mortality experience is adverse to the annuity provider (path (4) → (b)). It
is worth stressing that undistributed profits also increase the shareholders’
capital within the portfolio, hence increasing the maintainable threshold
(path (3) → (a)).
Loss financing techniques require specific strategies involving the whole
portfolio, and in some cases even other portfolios of the insurer. Risk
transfer can be realized via (traditional) reinsurance arrangements (path
(6) → (b)), swap-like reinsurance ((7) → (b)) and securitization, that is,
Alternative Risk Transfer (ART). In the case of life annuities, ART requires
the use of specific financial instruments, for example, longevity bonds
((8) → (b)), whose performance is linked to some measure of longevity
in a given population.
A comment is required on traditional risk transfer tools. Traditional reinsurance arrangements (e.g. surplus reinsurance, XL reinsurance, and so on) can, at least in principle, be applied also to life annuity portfolios. However, it should be stressed that such risk transfer solutions mainly rely on
the improved diversification of risks when these are taken by the reinsurer,
thanks to a stronger pooling effect. Notably, such an improvement can be
achieved in relation to process risk (i.e. random fluctuations in the num-
ber of deaths), whilst uncertainty risk (leading to systematic deviations)
cannot be diversified ‘inside’ the insurance–reinsurance process. Hence, to
become more effective, reinsurance transfers must be completed with a fur-
ther transfer, that is, a transfer to capital markets. Such a transfer can
be realized via bonds, whose yield is linked to some mortality/longevity
index, so that the bonds themselves generate flows which hedge the pay-
ment of life annuity benefits. While mortality bonds (hedging the risk of
a mortality higher than expected) already exist, longevity bonds (hedg-
ing the risk of a mortality lower than expected) are yet to appear in the
market.
To the extent that mortality/longevity risks are retained by an insurer, the
impact of a poor experience falls on the insurer itself. To meet an unexpected
amount of obligations, an appropriate level of advance funding may provide
a substantial help. To this purpose, shareholders’ capital must be allocated
to the life annuity portfolio (path (2) → (a), as well as (3) → (a) in Fig. 7.6),
and the relevant amount should be determined to achieve insurer solvency.
Conversely, the expression ‘no advance funding’ (see Fig. 7.4) refers to situations where no specific capital allocation is provided in respect of mortality/longevity risks. In the case of adverse experience, the unexpected amount of obligations has to be met (at least partially) by the available residual assets, which are not tied to specific liabilities.
Hedging strategies in general consist of assuming a risk which offsets another risk borne by the insurer. In some cases, hedging
strategies involve various portfolios or lines of business (LOBs), or even
the whole insurance company, so that they cannot be placed in the port-
folio framework as depicted in Fig. 7.6. In particular, natural hedging (see
Fig. 7.4) consists of offsetting risks in different LOBs. For example, writing
both life insurance providing death benefits and life annuities for similar
groups of policyholders may help to provide a hedge against longevity risk.
Such a hedge is usually named across LOBs. A natural hedge can be realized
even inside a life annuity portfolio, allowing for a death benefit (possibly
decreasing as the age at death increases) combined with the life annuity;
see Section 1.6.4. Clearly, in the case of a higher than anticipated mortality
improvement, death benefits which are lower than expected will be paid.
Such a hedge is usually called across time.
Clearly, mortality/longevity risks should be managed by the insurer through an appropriate mix of the tools described above. The choice of the RM tools is also driven by various interrelationships among the tools themselves. For example, the possibility of purchasing profitable reinsurance is strictly related to the features of the insurance product and, in particular, the life tables underlying the pricing, as well as to the availability of ART for the reinsurer.
The following sections are devoted to an in-depth analysis of the RM
tools which currently seem to be the most practicable.

7.3.2 Natural hedging

In the context of life insurance, natural hedging refers to a diversification strategy combining ‘opposite’ benefits with respect to the duration of life. The main idea is that if mortality rates decrease, then life annuity costs increase while death benefit costs decrease (and vice versa). Hence the mortality risk inherent in a life annuity business could be offset, at least partially, by taking a position also on some insurance products providing benefits in the case of death. We discuss two situations, one concerning hedging across time and one across LOBs.
We first consider hedging across time. We assume that at time 0 (i.e.
calendar year x0 + t0 ) an immediate life annuity is issued to a person aged
x0 , with the proviso that at death (e.g. at the end of the year of death) the
mathematical reserve therein set up to meet the life annuity benefit (only) is
paid back to the beneficiaries. Reasonably, the reserving basis concerning
the death benefit should be stated at policy issue so that the death benefit,
although decreasing over time, is guaranteed.
At time 0, the random present value of future (life annuity and death) benefits for an individual (generically, individual j) is defined as follows:

$$ Y_0^{(j)} = b^{(j)}\, a_{\overline{K_{x_0}^{(j)}}|} + (1+i)^{-(K_{x_0}^{(j)}+1)}\, C^{(j)}_{K_{x_0}^{(j)}+1} \qquad (7.29) $$

where C_t^{(j)} is the death benefit payable at time t if death occurs in (t − 1, t), defined as follows:

$$ C_t^{(j)} = b^{(j)}\, a_{x_0+t}^{[A]} = b^{(j)} \sum_{h=1}^{\omega - x_0 - t} (1+i)^{-h}\, {}_h p_{x_0+t}^{[A]} \qquad (7.30) $$

The benefit C_t^{(j)} is therefore the mathematical reserve set up at time t to meet the life annuity benefit, calculated according to the mortality assumption A(τ) and the annual interest rate i. Note that the individual reserve (meeting both the life annuity and the death benefit) to be set up at time t according to the (traditional) equivalence principle is

$$ V_t^{(j)} = b^{(j)}\, a_{x_0+t} + \sum_{h=0}^{\omega - x_0 - t} {}_{h/1}q_{x_0+t}\, (1+i)^{-(h+1)}\, C^{(j)}_{t+h+1} \qquad (7.31) $$

(calculated according to a proper technical basis, possibly other than that assumed in the calculation of C_t^{(j)}). The sum at risk, C_t^{(j)} − V_t^{(j)}, in each year (t − 1, t) is intended to be close to 0.
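Formulas (7.30)–(7.31) can be evaluated directly once a life table is chosen. A sketch under an invented Gompertz-like mortality assumption follows; the parameters below are not the book's basis A(τ), and the same basis is used for both benefits purely for simplicity:

```python
import numpy as np

i, x0, omega, b = 0.03, 65, 110, 1.0
v = 1 / (1 + i)

# Invented mortality: q_x = 0.01 * 1.09**(x - x0), with q at the last age
# set to 1 so that omega is the limiting age (not the book's basis A(tau)).
q = np.minimum(0.01 * 1.09 ** (np.arange(x0, omega) - x0), 1.0)
q[-1] = 1.0
p = 1.0 - q

def a(x):
    """Annuity value a_x = sum over h of v**h times h-year survival from x."""
    surv = np.cumprod(p[x - x0:])
    return float(np.dot(surv, v ** np.arange(1, len(surv) + 1)))

# Death benefit (7.30): the annuity reserve itself, C_t = b * a_{x0+t}.
C = {t: b * a(x0 + t) for t in range(omega - x0)}

def V(t):
    """Individual reserve (7.31) for the annuity-plus-death-benefit package."""
    x = x0 + t
    surv = np.concatenate(([1.0], np.cumprod(p[x - x0:-1])))   # h-year survival
    defer_q = surv * q[x - x0:]                                # h/1 q_{x0+t}
    death_part = sum(dq * v ** (h + 1) * C.get(t + h + 1, 0.0)
                     for h, dq in enumerate(defer_q))
    return b * a(x) + death_part

print(f"C_0 = {C[0]:.3f}, V_0 = {V(0):.3f}, sum at risk C_0 - V_0 = {C[0] - V(0):.3f}")
```

The gap between C_0 and V_0 is the cost of the death cover, i.e. the premium increase discussed in Example 7.4.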
Intuitively, when dealing with both a life annuity and a death benefit the
insurer benefits from a risk reduction, given that the longer is the annuity
payment period, the lower is the amount of the death benefit. However, the
risk reduction cannot be total, because of the definition of the death benefit
(which is in particular guaranteed). The tricky point of this package is the
cost to the annuitant. Intuitively, we expect that the death benefit (7.30)
will be expensive (given that the consequence – which is the insurer’s target
as well – is a strong reduction of the cross-subsidy effect); so commercial
difficulties may arise.
For the sake of brevity we do not give analytical details; for discussion,
we only provide a numerical example.
Example 7.4 We take the assumptions adopted in Examples 7.1 and 7.2.
We assume that the death benefit is calculated according to the annual
interest rate i = 0.03 and the mortality assumption A3 (τ). Table 7.18 quotes
the risk index (i.e. the coefficient of variation of the present value of future
payments), when a given mortality assumption is adopted. The reduction
in the risk profile of the insurer is apparent (compare with Table 7.4). The
reduction of the riskiness can also be noticed in the unconditional case; see Table 7.19, which should be compared with Table 7.11. However, the death benefit requires a 22.730% increase in the single premium at age 65 (according to a pricing basis given by i = 0.03 and the mortality assumption A3(τ)). Indeed, the mutuality effect is weaker in this case than when just a life annuity benefit is involved.
For the sake of brevity, we do not investigate further risk measures. □

From the point of view of the annuitant, the previous policy structure has
the advantage of paying back the assets (in terms of the amount stated under
policy conditions) remaining at her/his death, hence meeting bequest expec-
tations. On the other hand, the death benefit is rather expensive. Further
solutions can be studied, in order to reconcile the risk reduction purposes
of the insurer with the request by the annuitant for a high level of the ratio
between the annual amount and the single premium. However, the lower is
the death benefit, the lower is the risk reduction gained by the insurer. To

Table 7.18. Coefficient of variation of the present value of future payments, conditional on the best-estimate scenario: CV[Yt | A3(τ), nt], in the presence of death benefit (7.30)

Initial portfolio size
Time t    n0 = 1      n0 = 100    n0 = 1,000   n0 = 10,000   …   n0 → ∞
0         10.714%     1.071%      0.339%       0.107%        …   0%
5         13.364%     1.336%      0.423%       0.134%        …   0%
10 16.722% 1.672% 0.529% 0.167% … 0%
15 20.925% 2.093% 0.662% 0.209% … 0%
20 26.105% 2.610% 0.826% 0.261% … 0%
25 32.390% 3.239% 1.024% 0.324% … 0%
30 39.960% 3.996% 1.264% 0.400% … 0%
35 49.174% 4.917% 1.555% 0.492% … 0%

Table 7.19. (Unconditional) coefficient of variation of the present value of future payments: CV[Yt | nt], in the presence of death benefit (7.30)

Initial portfolio size
Time t    n0 = 1      n0 = 100    n0 = 1,000   n0 = 10,000   …   n0 → ∞
0         10.804%     1.764%      1.442%       1.405%        …   1.401%
5         13.489%     2.256%      1.866%       1.822%        …   1.817%
10 16.902% 2.924% 2.455% 2.403% … 2.397%
15 21.193% 3.830% 3.273% 3.213% … 3.206%
20 26.511% 5.043% 4.390% 4.319% … 4.312%
25 33.009% 6.620% 5.858% 5.776% … 5.767%
30 40.884% 8.574% 7.680% 7.585% … 7.575%
35 50.490% 10.859% 9.789% 9.675% … 9.663%

give an example that can be commercially practicable, we consider a death benefit defined as the difference (if positive) between the single premium S funding the life annuity benefit and the number of annual amounts paid up to death (see also Section 1.6.4); so we have

$$ C_t^{(j)} = \max\left( S - (t - 1)\, b^{(j)},\ 0 \right) \qquad (7.32) $$

See Example 7.5.


Example 7.5 With the same inputs as Example 7.4, we quote, in Tables 7.20 and 7.21, the risk index. In Table 7.20 the calculation is conditional on mortality assumption A3(τ) and in Table 7.21 it is based on the unconditional probability distribution. The single premium has been calculated as the expected present value of future payments, conditional on assumption A3(τ); hence $S = b^{(j)}\, \mathrm{E}[a_{\overline{K_{x_0}^{(j)}}|} \mid A_3(\tau)]$. When compared with Tables 7.4 and 7.11, we note a reduction in the risk profile of the insurer in the early policy

Table 7.20. Coefficient of variation of the present value of future payments, conditional on the best-estimate scenario: CV[Yt | A3(τ), nt], in the presence of death benefit (7.32)

Initial portfolio size
Time t    n0 = 1      n0 = 100    n0 = 1,000   n0 = 10,000   …   n0 → ∞
0         18.877%     1.888%      0.597%       0.189%        …   0%
5         26.330%     2.633%      0.833%       0.263%        …   0%
10 37.817% 3.782% 1.196% 0.378% … 0%
15 52.312% 5.231% 1.654% 0.523% … 0%
20 61.755% 6.175% 1.953% 0.618% … 0%
25 72.408% 7.241% 2.290% 0.724% … 0%
30 84.929% 8.493% 2.686% 0.849% … 0%
35 100.172% 10.017% 3.168% 1.002% … 0%

Table 7.21. (Unconditional) coefficient of variation of the present value of future payments: CV[Yt | nt], in the presence of death benefit (7.32)

Initial portfolio size
Time t    n0 = 1      n0 = 100    n0 = 1,000   n0 = 10,000   …   n0 → ∞
0         19.010%     3.129%      2.568%       2.505%        …   2.498%
5         26.497%     4.386%      3.609%       3.522%        …   3.512%
10 38.040% 6.363% 5.263% 5.140% … 5.126%
15 52.659% 9.063% 7.594% 7.431% … 7.413%
20 62.362% 11.512% 9.918% 9.745% … 9.725%
25 73.394% 14.493% 12.766% 12.581% … 12.560%
30 86.413% 18.000% 16.096% 15.893% … 15.870%
35 102.214% 21.929% 19.756% 19.525% … 19.499%

years; of course, when the death benefit is zero, we find again the case of the
stand-alone life annuity benefit. The risk reduction is lower than in Exam-
ple 7.4, due to the lower death benefit. The increase in the single premium
required at age 65 is lower as well; according to the usual pricing basis
(i = 0.03, mortality assumption A3 (τ)), a 7.173% increase is required with
respect to the case of the stand-alone life annuity. 

Death benefits like (7.32) are included in the so-called money-back annuities; see Boardman (2006).
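The money-back benefit (7.32) is simple to tabulate; in the following sketch the single premium S and annual amount b are illustrative figures only, not the book's pricing basis:

```python
# Money-back death benefit (7.32): C_t = max(S - (t - 1) * b, 0).
# The single premium S and annual amount b are illustrative figures only.
S, b = 14.0, 1.0

schedule = {t: max(S - (t - 1) * b, 0.0) for t in range(1, 41)}
for t in (1, 5, 10, 15, 20):
    print(f"t = {t:2d}: death benefit C_t = {schedule[t]:5.2f}")
```

The benefit runs off linearly and vanishes once the annual amounts paid out have repaid the premium, which is why the risk reduction in Example 7.5 is concentrated in the early policy years.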
One further, very well-known, example of natural hedging across time is
given by reversionary annuities (see Section 1.6.3). In this case, the longer is
the payment period to the leading annuitant, the lower should be the num-
ber of payments to the reversionary annuitant. However, some increased
longevity risk arises in this case, due to the fact that two (or more) lives are
involved instead of just one (with a possibly correlated mortality trend).

We now address natural hedging across LOBs. A risk reduction could be pursued by properly mixing positions in life insurances and life annuities. The offset result is unlikely to be as good as those mentioned previously, given that life insurances usually concern a different range of ages than life annuities. Further, we would point out that mortality trends emerge differently within life insurance and life annuity blocks of business.
Some empirical investigations have been performed (see Cox and Lin, 2007), considering a set of whole life insurances and a set of life annuities. Some interesting effects in terms of risk reduction can be gained when, at issue, the magnitude of the costs of the life insurances is similar to that of the life annuities.
A satisfactory offsetting effect between sets of life insurances and life
annuities is difficult to obtain. Only large insurance companies could be
partially effective in this regard. Reinsurers, in particular, could offer proper
support, also through swap-like agreements (see Section 7.3.4).

7.3.3 Solvency issues

Appropriate capital allocation policies should be undertaken to deal with the longevity risk which has been retained by the insurer. In particular, the adoption of internal models addressing longevity risk should be considered. In what follows we investigate some internal models in this regard and compare the main results with the requirements embedded in Solvency 2 for longevity risk (only). We focus mainly on longevity risk and refer to conventional immediate life annuities, so that there is no allowance for participation in financial or other profits. To make the results easier to understand, we further assume that no risk transfer (i.e. neither reinsurance nor ART) has been undertaken. Where not specified, we adopt the notation and assumptions introduced in Section 7.2.4.
With reference to time t, let Wt be the amount of portfolio assets and Vt the portfolio reserve (or technical provision). These quantities are random at the valuation time, because of the risks, mortality and investment risks in particular, facing the portfolio. Let z be the valuation time (z = 0, 1, . . .). The random path of portfolio assets is recursively described as follows:

$$ W_t = W_{t-1}\,(1 + i_t) - B_t; \qquad t = z + 1, z + 2, \dots \qquad (7.33) $$

where i_t is the investment yield in year (t − 1, t) and W_z is given (including both the reserve and capital in the size required according to a chosen solvency rule).
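Recursion (7.33) is straightforward to simulate. In the sketch below the initial assets, the yield distribution, and the mortality basis are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(7)

# One simulated path of recursion (7.33): W_t = W_{t-1}(1 + i_t) - B_t.
# Initial assets, yield distribution, and mortality basis are invented.
W = 15000.0                       # W_z: reserve plus allocated capital
n, b, p = 1000, 1.0, 0.96         # annuitants, annual benefit, survival prob.
T = 35

path = []
for t in range(1, T + 1):
    i_t = rng.normal(0.03, 0.01)  # investment yield in year (t-1, t)
    n = rng.binomial(n, p)        # surviving annuitants at time t
    B_t = b * n                   # annual outflow
    W = W * (1 + i_t) - B_t
    path.append(W)

print(f"portfolio assets after {T} years: {path[-1]:.1f}")
```

Repeating the simulation over many paths yields the distribution of W_t from which the solvency measures discussed below can be read off.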

According to legislation, the portfolio reserve is normally calculated as the expected present value (using an appropriate technical basis) of future payments, increased by an appropriately defined risk margin. If the risk margin is a function of the expected present value of future payments, then (at least in principle) the mathematical reserve can be calculated by aggregating individual reserves. In this case, the reserve at time t is random because it is the sum of a random number of individual reserves. If V_t^{(j)} denotes the individual reserve at time t, we have

$$ V_t = \sum_{j\,:\,j \text{ in force at } t} V_t^{(j)} \qquad (7.34) $$

We will adopt this assumption in the following discussion. However, we
point out that if the risk margin is an appropriate risk measure assessed
for the portfolio as a whole, the reserve must be calculated directly at the
portfolio level, given that the number of in-force policies affects the amount
of the technical provision when pooling risks are present. For example, the
portfolio reserve could be defined as a given percentile (e.g. the 75th) of the
present value of future payments (see Section 1.5.3); in this case, the risk
margin would be implicitly assessed as the difference between the percentile
and the expected value of the distribution of the present value of future
payments.
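As a sketch of this percentile-based definition, the implicit risk margin can be read off a simulated distribution; the present values below are hypothetical placeholders, not output of the portfolio model:

```python
import random
import statistics

def empirical_quantile(sample, p):
    """Simple order-statistic estimator of the p-quantile of a sample."""
    s = sorted(sample)
    return s[min(len(s) - 1, int(p * len(s)))]

random.seed(1)
# Hypothetical simulated present values of future payments for the portfolio.
pv_samples = [random.gauss(1000.0, 50.0) for _ in range(10_000)]

reserve_75th = empirical_quantile(pv_samples, 0.75)        # portfolio reserve
risk_margin = reserve_75th - statistics.fmean(pv_samples)  # implicit risk margin
```

The risk margin is positive because the 75th percentile exceeds the expected value, which is exactly the difference described in the text.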
The quantity
$$M_t = W_t - V_t \qquad (7.35)$$
represents the assets available to meet the residual risks, having allowed
for those risks met by the portfolio reserve; for short, we will refer to $W_t$
as the total portfolio assets and to $M_t$ as the capital assets of the portfolio
(conversely, $W_t - M_t$ represents the assets backing the portfolio reserve).
In line with common practice, we consider solvency to be the ability of
the insurer to meet, with an assigned (high) probability, random liabilities
as they are described by a realistic probabilistic structure. To implement
such a concept, choices are needed in respect of the following items:
1. The quantity expressing the ability of the insurer to meet liabilities; reasonable
choices are either the total portfolio assets $W_t$ or, as is more
usual in practice, the capital assets $M_t$, which (clearly) must be positive
when the insurer is solvent.
2. The time span $T$ to which the above results refer; it may range from
the short-medium term (1–5 years, say) to the residual duration of the
portfolio.
7.3 Managing the longevity risk 305
3. The timing of the results, in particular annual results (e.g. the amount of
portfolio assets at every integer time within T years) versus single figure
results (e.g. the amount of portfolio assets at the end of the time horizon
under consideration, that is, after T years).
Further choices concern how to define the portfolio (just in-force poli-
cies or also future entrants). To make these choices, the point of view from
which solvency is ascertained must be stated. Policyholders, investors and
the supervisory authority represent possible viewpoints in respect of the
insurance business. However, the perspectives of the (current or poten-
tial) policyholders and investors involve profitability requirements possibly
higher than those implied by the need simply to meet current liabilities.
Such requirements would lead to a concept of insurer’s solidity, rather
than solvency. So, we restrict our attention to the supervisory authority’s
perspective.
The supervisory authority is charged mainly with protecting the interests of
current policyholders. So a run-off approach should be adopted (hence
disregarding future entrants). Further, no profit release should be allowed
for within the solvency time-horizon T, nor should any need for capital
allocation be delayed.
Let z be the time at which solvency is ascertained (z = 0, 1, . . . ). The
capital required at time z could be assessed according to one of the following
(alternative) models
$$P\Big[\bigcap_{t=z+1}^{z+T} \{M_t \ge 0\}\Big] = 1 - \varepsilon_1 \qquad (7.36)$$

$$P[M_{z+T} \ge 0] = 1 - \varepsilon_2 \qquad (7.37)$$

$$P\Big[\bigcap_{t=z+1}^{z+T} \{W_t - Y_t \ge 0\}\Big] = 1 - \varepsilon_3 \qquad (7.38)$$
where $\varepsilon_i$ ($i = 1, 2, 3$) is the accepted default probability under the chosen
requirement and $Y_t$ is defined as in (7.8). Clearly, in all the solvency models
above (i.e. (7.36)–(7.38)), the relevant probability is assessed conditional
on the current information at time $z$.
With reference to requirement (7.38), first note that recursion (7.33) can
be rewritten as
$$W_t = \frac{1}{v(z,t)}\, W_z - \sum_{h=z+1}^{t} \frac{1}{v(h,t)}\, B_h \qquad (7.39)$$
where
$$\frac{1}{v(h,k)} = (1 + i_{h+1})(1 + i_{h+2}) \cdots (1 + i_k) \qquad (7.40)$$
is the accumulation factor based on investment returns from time $h$ to time
$k$, and
$$v(h,k) = \big((1 + i_{h+1})(1 + i_{h+2}) \cdots (1 + i_k)\big)^{-1} \qquad (7.41)$$
is the discount factor, based on the annual investment yields, from time $k$ to
time $h$. Referring to one cohort only, the quantity $Y_t$ can also be written
as (see (7.10))
$$Y_t = \sum_{h=t+1}^{\omega - x_0} B_h\, v(t,h) \qquad (7.42)$$
Requirement (7.38) can be rewritten as
$$P\Bigg[\bigcap_{t=z+1}^{z+T} \Bigg\{\frac{1}{v(z,t)}\, W_z - \sum_{h=z+1}^{t} \frac{1}{v(h,t)}\, B_h - \sum_{h=t+1}^{\omega - x_0} B_h\, v(t,h) \ge 0\Bigg\}\Bigg] = 1 - \varepsilon_3 \qquad (7.43)$$
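The step from recursion (7.33) to the accumulated form used in (7.43) can be checked numerically: the recursion and the closed form (7.39), with accumulation factors as in (7.40), agree on any path. A sketch with arbitrary illustrative yields and outgoes:

```python
import math

def assets_recursive(W_z, yields, outgoes):
    """W_t computed through recursion (7.33)."""
    W = W_z
    for i_t, B_t in zip(yields, outgoes):
        W = W * (1.0 + i_t) - B_t
    return W

def assets_closed_form(W_z, yields, outgoes):
    """W_t computed through (7.39): W_z / v(z,t) - sum_h B_h / v(h,t),
    the accumulation factors 1 / v(h,t) being defined as in (7.40)."""
    t = len(yields)
    def acc(h):  # 1 / v(z+h, z+t) = (1 + i_{h+1}) ... (1 + i_t)
        return math.prod(1.0 + i for i in yields[h:t])
    return W_z * acc(0) - sum(B_h * acc(h)
                              for h, B_h in enumerate(outgoes, start=1))

yields, outgoes = [0.03, 0.025, 0.04], [10.0, 9.5, 9.0]
w_rec = assets_recursive(100.0, yields, outgoes)
w_cf = assets_closed_form(100.0, yields, outgoes)
# w_rec and w_cf coincide (up to rounding) on any yield/outgo path
```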
Assume, for brevity, that the annual investment yields are constant, that is,
$i_h = i$ for all $h$. Then we can write (7.43) as
$$P\Bigg[\bigcap_{t=z+1}^{z+T} \Bigg\{W_z\, (1+i)^{t-z} - \sum_{h=z+1}^{\omega - x_0} B_h\, (1+i)^{t-h} \ge 0\Bigg\}\Bigg] = 1 - \varepsilon_3 \qquad (7.44)$$
or also as
$$P\Bigg[\bigcap_{t=z+1}^{z+T} \Bigg\{(1+i)^{t-(\omega+1-x_0)} \Big(W_z\, (1+i)^{\omega+1-x_0-z} - \sum_{h=z+1}^{\omega - x_0} B_h\, (1+i)^{\omega+1-x_0-h}\Big) \ge 0\Bigg\}\Bigg] = 1 - \varepsilon_3 \qquad (7.45)$$
We note that
$$W_z\, (1+i)^{\omega+1-x_0-z} - \sum_{h=z+1}^{\omega - x_0} B_h\, (1+i)^{\omega+1-x_0-h} = W_{\omega+1-x_0} \qquad (7.46)$$
represents the amount of portfolio assets available when the cohort is
exhausted, and so the following result can be easily justified:
$$P\Bigg[\bigcap_{t=z+1}^{z+T} \Big\{(1+i)^{t-(\omega+1-x_0)}\, W_{\omega+1-x_0} \ge 0\Big\}\Bigg] = P\Bigg[\bigcap_{t=z+1}^{z+T} \Big\{W_{\omega+1-x_0} \ge 0\Big\}\Bigg] = P[W_{\omega+1-x_0} \ge 0] = 1 - \varepsilon_3 \qquad (7.47)$$
Hence, requirement (7.38) can be replaced by the following:
$$P[W_{\omega+1-x_0} \ge 0] = 1 - \varepsilon_3 \qquad (7.48)$$
Before commenting on the above results from the perspective of solvency,
it is useful to note that they hold in particular because: (a) the portfolio
is closed to new entrants; (b) the probability in requirement (7.38) (as
well as in (7.36) and (7.37)) is assessed according to the natural probability
distribution of assets and liabilities (so that no risk-adjustment is applied,
for example in a risk-neutral sense) and is implicitly conditional on the
information available at time $z$ on the relevant variables (current number
of survivors, investment yields, and so on). The results described in
(7.44)–(7.47) can then be generalized to the case where more than one cohort
is addressed and the investment yield is not constant.
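In the constant-yield, single-cohort setting, (7.48) says that the required initial assets are the $(1-\varepsilon_3)$-quantile of the present value of future payments. The following sketch estimates that quantile by stochastic simulation; the mortality rates, cohort size, and number of simulations are illustrative assumptions, not the basis used in the examples below:

```python
import random

def simulate_pv(n, qx, v):
    """Present value at the valuation time of payments of 1 per year (in
    arrears) to the survivors of a cohort of n annuitants, for one simulated
    mortality path; qx[k] is the (illustrative) mortality rate in year k+1."""
    alive, pv, disc = n, 0.0, 1.0
    for q in qx:
        alive -= sum(random.random() < q for _ in range(alive))  # deaths
        disc *= v
        pv += alive * disc
        if alive == 0:
            break
    return pv

random.seed(42)
qx = [min(0.98, 0.01 * 1.1 ** k) for k in range(50)]  # illustrative table
pvs = sorted(simulate_pv(100, qx, 1 / 1.03) for _ in range(2000))

# Requirement (7.48) with eps_3 = 0.005: the 99.5% quantile of the PV.
W_required = pvs[int(0.995 * len(pvs))]
```

The required capital assets then follow as this amount minus the portfolio reserve; under (7.36) the same simulation machinery applies, but the non-negativity of $M_t$ is checked year by year along each path.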
Turning back to the solvency requirements (7.36)–(7.38), the differ-
ence between requirement (7.37) and (7.36) is clear. The same quantity
is addressed in both, but whilst under requirement (7.36) it is checked at
every year within the solvency time-horizon, under (7.37) it is checked just
at its end. We note that requirement (7.37) allows, in particular, for tempo-
rary shortages of money within the solvency time-horizon. In the context
of a portfolio of immediate life annuities, possible deficiencies of assets may
be self-financed only by healthy financial profits and, also in this case, when
the participation mechanisms to such profits (when present) are under the
control of the insurer (i.e. if the insurer can reduce the participation in some
years to recover more easily past or future losses). In the case of immediate
life annuities, therefore, the outputs of requirement (7.37) should be close
to those of (7.36). Hence, in the following we will disregard requirement
(7.37).
The apparent difference between (7.36) and (7.38) arises from the way
that the liabilities are defined. In (7.38), the liabilities are stated in terms of
the random present value of future payments, whilst in (7.36) they are stated
as the expected value of such a quantity (plus possibly a risk margin). So
whilst in (7.38) a consistent assessment of assets and liabilities is performed,
under (7.36) some intermediate step is required.
To compare further (7.36) with (7.38), it is useful to note that the capi-
tal assets build up because of specific capital allocations, and also because
of the annual profits which are released according to the reserve profile
and, in our setting, retained within portfolio assets. On the other hand,
the amount of portfolio assets at the natural maturity of the cohort repre-
sents the surplus left to the insurer at the expiry of the cohort itself. Given
that, under (7.38), the maximum available time-horizon is implicitly con-
sidered (see (7.47)), we can argue that such a requirement takes care of
the overall losses possibly deriving from the portfolio. Assume that a time-
horizon T = ω + 1 − x0 − z is chosen in requirement (7.36); the difference
between (7.36) and (7.38) lies in the fact that, under the latter, only the
total amount of the surplus (and loss) is considered (see (7.48)), whilst,
under the former, also the timing of their emergence is taken into consid-
eration. According to valuation terminology, requirement (7.36) is based
on a ‘deferral and matching’ logic, whilst (7.38) on an ‘asset and liability’
approach. Further, whenever a shorter time-horizon is chosen in (7.36),
only the profits (and losses) emerging in the first $T$ years are accounted for.
Note that because of the differences among the three requirements, it is
reasonable that they are implemented with different levels of the accepted
default probability; in particular, we can imagine ε2 ≥ ε1 . The comparison
between ε1 and ε3 is not straightforward in general, given that, in a life
portfolio, short-term losses could be recovered in the long run. Referring
to a portfolio of immediate life annuities, however, we can imagine that
ε1 ≥ ε3 whenever T < ω + 1 − x0 − z. Should T = ω + 1 − x0 − z, then
ε1 = ε3 could be a reasonable choice.
Solving (7.36) through stochastic simulation, one finds the amount of
capital assets required at time $z$; we will denote this amount by $M_z^{[R1]}(T)$.
Then $W_z^{[R1]}(T) = V_z + M_z^{[R1]}(T)$ is the amount of total portfolio
assets required at time $z$. Solving (7.48), again through stochastic simulation,
one finds the amount of total portfolio assets required at time $z$, denoted
by $W_z^{[R3]}$; the required amount of capital assets at time $z$ is then
$M_z^{[R3]} = W_z^{[R3]} - V_z$.
Example 7.6 Let us adopt the inputs of Example 7.2; in particular, we
refer to a homogeneous cohort. To focus on mortality, we disregard financial
risk, setting $i_t = i = 0.03$ for all $t$ ($i = 0.03$ is adopted in the reserving
basis as well). To facilitate comparisons among the results obtained
under the different requirements, we define the individual reserve as the
expected value of future payments, under the best-estimate assumption; then
$$V_t^{(j)} = E\big[Y_t^{(j)} \,\big|\, A_3(\tau)\big] \qquad (7.49)$$

Table 7.22. Individual reserve

Time z    Reserve $V_z^{(1)}$
0         15.259
5         12.956
10        10.599
15        8.294
20        6.167
25        4.336
30        2.877
35        1.807

Further, the same default probability is set for all the requirements, so
ε1 = ε3 = 0.005. Such a level has been chosen to be consistent with the
developing Solvency 2 system (see CEIOPS (2007) and CEIOPS (2008)).
We note that under Solvency 2 a risk margin should be added to (7.49),
calculated according to the Cost of Capital approach; see CEIOPS (2007)
and CEIOPS (2008) for details.
Table 7.22 quotes the individual reserve. Clearly, at any time $z$ the portfolio
reserve is simply $V_z = n_z V_z^{(1)}$, where $V_z^{(1)}$ is the reserve at time $z$
for a generic annuitant.
In Table 7.23, we state the amount of the capital (per unit of portfolio
reserve) required according to (7.36) and (7.38) for several portfolio sizes.
For (7.36), the maximum possible time-horizon has been chosen. As we
would expect from the previous discussion, the two requirements lead to
similar outputs, at least when mortality only is addressed. In this case, at
least, the outputs suggest that requirement (7.36) is to some extent inde-
pendent of the reserve when T takes the maximum possible value for the
time-horizon. It should be stressed that in our investigation no risk margin
is included in $V_z$. Thus, a share of the required capital quoted in
Table 7.23 should be included in the reserve and, possibly, charged to annu-
itants through an appropriate safety loading at the issue of the policy. When
interpreting the size of the required capital per unit of the portfolio reserve,
we also point out that the reserve is lower than what would be required by
the supervisory authority, and so the ratios in Table 7.23 would be higher
than what we would find in practice.
Table 7.23. Required capital based on requirements (7.36) and (7.38), facing longevity risk and
mortality random fluctuations

         Required capital based on (7.36):              Required capital based on (7.38):
         $M_z^{[R1]}(\omega+1-x_0-z)/V_z$                $M_z^{[R3]}/V_z$

Time z   n0 = 100   n0 = 1,000   n0 = 10,000   n0 = 100   n0 = 1,000   n0 = 10,000
0         12.744%      9.243%        8.103%     12.744%      9.241%        8.103%
5         16.510%     11.938%       10.525%     16.492%     11.938%       10.525%
10        21.474%     15.630%       13.890%     21.333%     15.621%       13.890%
15        28.097%     20.372%       18.282%     28.007%     20.372%       18.281%
20        37.722%     27.031%       24.131%     37.456%     27.008%       24.131%
25        53.980%     36.129%       31.832%     53.378%     36.113%       31.832%
30        82.980%     50.605%       42.152%     81.037%     50.476%       42.140%
35       171.782%     79.024%       56.968%    165.842%     77.890%       56.968%
It is worthwhile to comment on the similar magnitude of the ratios
$M_z^{[R3]}/V_z$ and $y_{z,\varepsilon}[n_z]/E[Y_z \mid n_z]$ (see Table 7.12), when the probability
$\varepsilon$ considered for the calculation of the percentile $y_{z,\varepsilon}[n_z]$ is very close
to (or, better, the same as) the non-default probability $1 - \varepsilon_3$ adopted for
calculating $M_z^{[R3]}$. As an example, we can compare the ratio $M_z^{[R3]}/V_z$
in Table 7.23 (where $1 - \varepsilon_3 = 0.995$) with the ratio $y_{z,0.99}[n_z]/E[Y_z \mid n_z]$
in Table 7.12 (thus, we are setting $\varepsilon = 0.99$); the two ratios have a
similar magnitude at each time $z$. First, we note that, as pointed out in
Example 7.2, $V_z$ (given by $n_z\, E[Y_z^{(1)} \mid A_3(\tau)] = E[Y_z \mid A_3(\tau), n_z]$) and
$E[Y_z \mid n_z]$ are very close (compare in particular Tables 7.2 and 7.9). So,
given the similar values of the two ratios, the quantities $M_z^{[R3]}$ and
$y_{z,\varepsilon}[n_z]$ are also likely to be close to one another. Actually, under
requirement (7.38) what is measured is the accumulated value of annual
payments, whilst with $y_{z,\varepsilon}[n_z]$ the relevant present value is accounted
for. Indeed, in Section 7.2.4 we mentioned the practical importance of
investigating the right tail of the distribution of the present value of future
payments; this comes from the fact that the quantity $y_{z,\varepsilon}[n_z]$ may be
taken as a measure of the capital required to meet liabilities under a low
default probability (and according to the maximum possible solvency
time horizon).
In Table 7.24, outputs from requirement (7.36) are investigated for
shorter time-horizons. Comparing Table 7.23 with Table 7.24, the long-term
nature of longevity risk clearly emerges. We note that, in both Tables 7.23
and 7.24, at each valuation time and for each requirement, the size of the
required capital decreases when a larger portfolio is considered. This is
because random fluctuations are also accounted for in the assessment.
Table 7.24. Required capital based on requirement (7.36), per unit of portfolio reserve:
$M_z^{[R1]}(T)/V_z$, facing longevity risk and mortality random fluctuations

         Time-horizon T = 1                             Time-horizon T = 3

Time z   n0 = 100   n0 = 1,000   n0 = 10,000   n0 = 100   n0 = 1,000   n0 = 10,000
0          0.574%      0.473%        0.242%      1.834%      1.076%        0.581%
5          1.058%      0.743%        0.397%      3.358%      1.711%        0.983%
10         1.951%      1.159%        0.649%      5.162%      2.568%        1.738%
15         3.600%      1.903%        1.226%      8.689%      4.463%        3.399%
20         6.639%      3.265%        2.306%     13.796%      8.003%        6.403%
25        12.246%      6.070%        4.465%     22.727%     14.314%       11.790%
30        22.588%     12.168%        8.655%     44.454%     26.438%       21.145%
35        41.664%     26.210%       16.739%    124.167%     51.506%       36.973%
We have obtained Table 7.25 by addressing random fluctuations only.
In particular, the required capital has been calculated adopting only the
best-estimate mortality assumption A3 (τ). In Table 7.26, in contrast, only
longevity risk has been accounted for, by assuming that whatever is the
realized mortality trend, the actual number of deaths in each year coincides
with what has been expected under the relevant trend assumption. We note
that in the latter case the amount of the required capital per unit of portfo-
lio reserve is independent of the size of the portfolio – this occurs because,
as noted previously, longevity risk is systematic. Regarding Table 7.25, we
point out that the random fluctuations accounted for there are not fully com-
parable to those embedded in Tables 7.23 and 7.24. Actually, in Tables 7.23
and 7.24, a mixture of the random fluctuations which can be appraised
under the several mortality assumptions in A(τ) is accounted for. When
comparing Table 7.25 (lower panels) with Table 7.24, we can see that, if
requirement (7.36) is implemented with a short time-horizon, in practice
we are mainly accounting for random fluctuations, rather than systematic
deviations; this is due to the long-term nature of longevity risk. Tables 7.25
and 7.26 do provide us with some useful information. However, it must be
pointed out that implementing an internal model allowing for a component
only of a risk represents an improper use of the model itself. As an illus-
tration, we note that on summing the results in Tables 7.25 and 7.26, for
a given requirement and portfolio size, we do not find the corresponding
results in Table 7.23 or 7.24. Thus, some aspects are missed when work-
ing with marginal distributions only (as is the case when we address either
random fluctuations or systematic deviations only).
Finally, it is interesting to compare the findings described by the previ-
ous Tables with some legal requirements. We refer here to the developing
Solvency 2 system, which is one of the few explicitly considering longevity
Table 7.25. Required capital based on requirements (7.36) and (7.38), facing mortality
random fluctuations only; mortality assumption $A_3(\tau)$

         Required capital based on (7.36):              Required capital based on (7.38):
         $M_z^{[R1]}(\omega+1-x_0-z)/V_z$                $M_z^{[R3]}/V_z$

Time z   n0 = 100   n0 = 1,000   n0 = 10,000   n0 = 100   n0 = 1,000   n0 = 10,000
0          7.813%      2.832%        0.879%      7.031%      2.698%        0.800%
5          9.983%      3.071%        1.067%      9.436%      2.949%        1.040%
10        12.144%      4.040%        1.217%     11.543%      3.759%        1.193%
15        16.153%      5.202%        1.544%     14.982%      4.921%        1.462%
20        22.343%      6.938%        2.091%     21.292%      6.554%        1.936%
25        29.728%     10.388%        3.072%     28.546%      9.642%        2.983%
30        54.183%     16.871%        5.547%     51.253%     16.807%        5.152%
35       155.859%     36.795%       11.715%    144.058%     34.809%       11.207%

         Required capital based on (7.36):              Required capital based on (7.36):
         $M_z^{[R1]}(1)/V_z$                             $M_z^{[R1]}(3)/V_z$

Time z   n0 = 100   n0 = 1,000   n0 = 10,000   n0 = 100   n0 = 1,000   n0 = 10,000
0          0.574%      0.473%        0.171%      1.834%      0.983%        0.378%
5          1.058%      0.743%        0.271%      3.358%      1.443%        0.479%
10         1.951%      0.932%        0.388%      5.162%      1.957%        0.657%
15         3.600%      1.642%        0.583%      8.604%      2.630%        0.932%
20         6.639%      2.458%        0.806%     13.304%      3.775%        1.329%
25        12.246%      4.633%        1.379%     19.609%      7.129%        2.166%
30        22.588%      7.878%        2.804%     41.023%     13.181%        4.168%
35        41.664%     21.058%        7.321%    124.167%     32.954%       10.176%
risk. The capital required to deal with such risk is the expected change in
the net asset value following a permanent 25% reduction in the current
and all future mortality rates (we do not discuss further details, such as
possible reductions of this amount; see CEIOPS (2007) and CEIOPS (2008)).
Under our hypotheses (we are considering just one cohort, there is no profit
participation, we are disregarding risks other than those deriving from mor-
tality, and so on), the requirement reduces to the difference between the
best-estimate reserve and a reserve set up with a mortality table embedding
probabilities of death 25% lower than in the best-estimate assumption. The
relevant results are quoted in Table 7.27, where the required capital at time
$z$ is denoted by $M_z^{[Solv2]}$. It is clear that, in relative terms, such an amount is
independent of the portfolio size. We further recall that, under Solvency 2,
no specific capital allocation is required for the risk of random fluctuations,
since they are treated as hedgeable risks. □
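Under the simplifying hypotheses above, the Solvency 2 requirement can be sketched as a reserve difference; the mortality table and interest rate below are illustrative placeholders, not the basis behind Table 7.27:

```python
def reserve(qx, v):
    """Expected present value, per annuitant, of payments of 1 per year in
    arrears (equivalence principle); qx[k] is the mortality rate in year k+1."""
    epv, p, disc = 0.0, 1.0, 1.0
    for q in qx:
        p *= (1.0 - q)   # survival probability to the end of the year
        disc *= v
        epv += p * disc
    return epv

qx = [min(0.98, 0.01 * 1.1 ** k) for k in range(50)]   # illustrative table
v = 1 / 1.03

best_estimate = reserve(qx, v)
shocked = reserve([0.75 * q for q in qx], v)           # mortality rates -25%
M_solv2 = shocked - best_estimate                      # per unit annual amount
```

Since both reserves are linear in the number of policies, the ratio $M_z^{[Solv2]}/V_z$ is indeed independent of the portfolio size, as noted above.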
Table 7.26. Required capital based on requirements (7.36) and (7.38), facing longevity risk only

         Required capital
Time z   $M_z^{[R1]}(\omega+1-x_0-z)/V_z$   $M_z^{[R3]}/V_z$   $M_z^{[R1]}(1)/V_z$   $M_z^{[R1]}(3)/V_z$
0          7.562%    7.562%    0.125%    0.389%
5          9.895%    9.895%    0.205%    0.651%
10        13.040%   13.040%    0.437%    1.394%
15        17.239%   17.239%    0.922%    2.857%
20        22.745%   22.745%    1.883%    5.621%
25        29.762%   29.762%    3.727%   10.564%
30        38.348%   38.348%    7.110%   18.745%
35        48.330%   48.330%   12.949%   30.875%
Table 7.27. Required capital according to Solvency 2

Time z   $M_z^{[Solv2]}/V_z$
0          7.274%
5          9.080%
10        11.377%
15        14.293%
20        18.000%
25        22.767%
30        29.102%
35        38.065%
Tables 7.26 and 7.27 may suggest that a deterministic approach can be
adopted for allocating capital to deal with longevity risk. In particular, the
assessment of the required capital could be based on a comparison between
the actual reserve and a reserve calculated under a more severe mortality
trend assumption (as turns out to be the case under Solvency 2).
Let $V_z^{[B]}$ be a reserve calculated according to the same valuation principle
adopted for $V_z$ (the equivalence principle, in our implementation), but
based on a worse mortality assumption, so that
$$V_z \le V_z^{[B]} \qquad (7.50)$$
The required capital would be
$$M_z^{[R4]} = V_z^{[B]} - V_z \qquad (7.51)$$
We note that requirement (7.51) would deal with longevity risk only. Fur-
ther, no default probability is explicitly mentioned; however, the mortality
assumption adopted in $V_z^{[B]}$ clearly implies some (not explicit) default
probability. The time-horizon implicitly considered is the maximum resid-
ual duration of the portfolio, given that this is the time-horizon referred
to in the calculation of the reserve. We also point out that, to simplify the
assessment of the required capital and to avoid any duplication of risk mar-
gins as well, it is reasonable that reserves in (7.51) are actually based on the
equivalence principle. So the required capital $M_z^{[R4]}$ turns out to be linear
with respect to the portfolio size $n_z$.
To compare requirements (7.36) and (7.38) with (7.51), let us define the
following ratios:
$$QM_z^{[R1]}(T; n_z) = \frac{M_z^{[R1]}(T)}{V_z} \qquad (7.52)$$

$$QM_z^{[R3]}(n_z) = \frac{M_z^{[R3]}}{V_z} \qquad (7.53)$$

$$QV_z = \frac{M_z^{[R4]}}{V_z} \qquad (7.54)$$
Accounting also for the risk of random fluctuations, the ratios
$QM_z^{[R1]}(T; n_z)$ and $QM_z^{[R3]}(n_z)$ depend on the size of the portfolio while,
in contrast, the ratio $QV_z$, which considers just longevity risk, is independent
of portfolio size. On the other hand, requirements (7.36) and (7.38)
could be implemented considering only the risk of random fluctuations or
the longevity risk, as we have illustrated in the calculations in Tables 7.25
and 7.26, respectively. As noted previously, when addressing longevity risk
only, the ratios $QM_z^{[R1]}(T; n_z)$ and $QM_z^{[R3]}(n_z)$ are independent of the size
of the portfolio (as emerges from Table 7.26). However, we have already
commented on the fact that addressing just one component of the mortality
risk represents an improper use of requirements (7.36) and (7.38). A
further difference between the ratios $QM_z^{[R1]}(T; n_z)$ and $QV_z$ lies in the
possibility of setting a preferred time-horizon; indeed, time-horizons other
than the maximum one may be chosen only when requirement (7.36) is
adopted. It is not possible to derive general conclusions regarding the
comparison between the resulting levels of the ratios $QM_z^{[R1]}(T; n_z)$ and
$QM_z^{[R3]}(n_z)$, on the one hand, and $QV_z$, on the other. However, we
comment further through an example.
Example 7.7 Figure 7.7 plots the ratios (7.53) and (7.54), for several
portfolio sizes, based on calculations performed at time 0. In particular:
[Figure 7.7 plots the required capital, per unit of reserve (vertical axis, 0% to 12%), against the portfolio size (horizontal axis, 0 to 12,000), for the four cases listed below.]

Figure 7.7. Ratios $QM_0^{[R3]}(n_0)$ and $QV_0$. (1): $QM_0^{[R3]}(n_0)$; (2): $QV_0$, with $V_0^{[A_5(\tau)]}$; (3): $QM_0^{[R3]}(n_0)$, with $M_0^{[R3]}$ accounting for random fluctuations only; (4): $QV_0 + QM_0^{[R3]}(n_0)$, with $M_0^{[R3]}$ accounting for random fluctuations only.
– case (1) plots the ratio $QM_0^{[R3]}(n_0)$;
– case (2) plots the ratio $QV_0$, obtained by choosing the mortality trend
$A_5(\tau)$ as an assumption more severe than the best-estimate;
– case (3) plots the ratio $QM_0^{[R3]}(n_0)$ where, in contrast to case (1), the
required capital $M_0^{[R3]}$ has been obtained by addressing random fluctuations
only (the best-estimate assumption has been used to describe
mortality);
– case (4) plots the required capital obtained by summing the results in case
(2) (accounting for longevity risk only) and in case (3) (accounting for
random fluctuations only).
We first note that the outputs found under case (2) are very similar to
(indeed, in our example they coincide with) those found adopting requirement
(7.38), as well as requirement (7.36) with $T = \omega + 1 - x_0$ (the ratio
$QV_0$, with $V_0^{[A_5(\tau)]}$, plotted in Fig. 7.7, amounts to 7.562% for each
portfolio size; compare this outcome with the ratios
$QM_0^{[R1]}(\omega + 1 - x_0; n_0) = M_0^{[R1]}(\omega + 1 - x_0)/V_0$ and
$QM_0^{[R3]}(n_0) = M_0^{[R3]}/V_0$ in Table 7.26).
This is explained by the fact that the (left) tail of the distribution of assets
(addressed in (7.38) and (7.36)) is heavily affected by the worst scenario
($A_5(\tau)$, in our example) when low probabilities (of default) are addressed.
Thus, when allowing for longevity risk only, requirement (7.36) adopted
with the maximum possible time-horizon and requirement (7.38) reduce to
(7.51). This is why a practicable idea could be to split the capital allocation
process into two steps:

– one for longevity risk only, based on a comparison between reserves
calculated according to different mortality assumptions (i.e. adopting
requirement (7.51));
– one for random fluctuations only, adopting an internal model or some
other standard formula.

Case (4) in Fig. 7.7 is intended to represent such a choice. We note, however,
that an unnecessary allocation may result from this procedure; as we have
already commented, working separately on the components of mortality
risk is improper and may lead to an inaccurate capital allocation. □
Undoubtedly, the advantage of requirement (7.51) is its simplicity, and
we note that it seems that this requirement will be adopted by Solvency 2
in respect of many risks. Of course, it is also possible to find the reserv-
ing basis avoiding the situation plotted in Fig. 7.7 (but to be sure, one
should first perform the valuation through an internal model, at least for
some typical compositions of the portfolio). Another possibility supporting
the separate treatment of the mortality risk components is to adopt differ-
ent solvency time-horizons for the different components of mortality risk.
So we could choose the maximum possible value for T for longevity risk
(adopting (7.51)) and a short-medium time-horizon for random fluctua-
tions (if requirement (7.36) is adopted, with say T = 1 to 5 years). For
practical purposes, this approach could represent a good compromise, on
condition that the relevant assumptions are properly disclosed. If valuation
tools other than an internal model are available or are required for the risk
of random fluctuations (as should be the case for Solvency 2), then require-
ment (7.51) is certainly able to capture properly the feature of longevity risk
(only).
Example 7.8 We conclude this section with a final example. So far, just
homogeneous portfolios have been investigated. We now consider the case
of a portfolio with some heterogeneity in annual amounts. A stronger dis-
persion of annual amounts usually leads to a poorer pooling effect. Also
there is the danger that if the annuitants living longer are those with higher
annual amounts, then the impact of longevity risk could be more severe.
Even though it is reasonable to assume that, because of adverse selection,
those with higher annual amounts live longer (as is supported by some
evidence), in this example we do not account for this dependence. The impact
of the dispersion of annual amounts is checked through the calculation of
the capital required to meet mortality risks, assuming zero correlation
between the annual amount and the lifetime of the annuitant.

Table 7.28. Classes of annual amounts in five portfolios

         Portf. 1       Portf. 2       Portf. 3       Portf. 4       Portf. 5
Class    Amount Freq.   Amount Freq.   Amount Freq.   Amount Freq.   Amount Freq.
1        1      100%    0.75   40%     0.25   20%     0.75   90%     0.5625 80%
2                       1      50%     0.75   20%     3.25   10%     2      15%
3                       2      10%     1      20%                    5      5%
4                                      1.25   20%
5                                      1.75   20%

Distribution of the annual amount:
Average value        1       1         1      1       1
Standard deviation   0       0.35355   0.5    0.75    1.0503

We test the five portfolios described in Table 7.28. We note that, to facil-
itate comparisons, the same average annual amount per annuitant has been
assumed. The specific annual amount paid to each annuitant may, however,
be different from the average value, depending on the insurance class (each
class grouping people with the same annual amount). We note that the port-
folios are ordered with respect to the degree of heterogeneity, as measured
by the standard deviation of the distribution of the annual amounts.
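The heterogeneity measures in the bottom rows of Table 7.28 follow directly from the class amounts and frequencies; a quick check for portfolios 2 and 5:

```python
import math

def amount_stats(classes):
    """Mean and standard deviation of the annual-amount distribution;
    classes is a list of (amount, frequency) pairs as in Table 7.28."""
    mean = sum(a * f for a, f in classes)
    var = sum(f * (a - mean) ** 2 for a, f in classes)
    return mean, math.sqrt(var)

m2, s2 = amount_stats([(0.75, 0.40), (1.0, 0.50), (2.0, 0.10)])    # portfolio 2
m5, s5 = amount_stats([(0.5625, 0.80), (2.0, 0.15), (5.0, 0.05)])  # portfolio 5
# m2 = m5 = 1, s2 ≈ 0.35355, s5 ≈ 1.0503, as in Table 7.28
```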
Adopting the inputs of Example 7.6, we have calculated the capital
required based on requirement (7.38). The assessment has been performed
at time 0 only, for several portfolio sizes. The outputs are plotted in Fig. 7.8.
A stronger requirement emerges when portfolios with a wider dispersion of
annual amounts are considered: portfolio 5 versus portfolio 1, for example.
We note that the portfolio reserve at time 0 is the same in all portfolios,
due to the assumption about the average annual amount. It is interesting to
compare Fig. 7.8 with Fig. 7.9, where only random fluctuations have been
considered. It seems that most of the change in the capital required when changing
the portfolio composition is due to random fluctuations. We note, in par-
ticular, the width of the range of variation of the capital required for the
several portfolios when also longevity risk is accounted for relative to what
happens when only random fluctuations are accounted for. Thus, when
comparing Figs. 7.8 and 7.9 in detail, we note, although the scale of the y-
axis is different, that the length of the range is the same. So we can conclude
that, to some extent, longevity risk is independent of the heterogeneity of
the portfolio. It is important to note, again, that this result is also due to the
[Figure 7.8 plots the required capital, per unit of reserve (vertical axis, 8% to 12%), against the portfolio size (horizontal axis, 0 to 12,000), with one curve for each of portfolios 1–5.]

Figure 7.8. Required capital, per unit of reserve: $QM_0^{[R3]}(n_0)$.
model, which does not explicitly account for any dependence of the lifetime
of the individual on her/his annual amount. □

7.3.4 Reinsurance arrangements

Various reinsurance arrangements can be conceived, at least in principle,
to transfer longevity risk. At the time of writing, reinsurers are reluctant
to accept such a transfer, due to the systematic nature of the risk of unan-
ticipated aggregate mortality. Actually, only some slight offset (through
natural hedging) can be gained by dealing with longevity risk just within
the insurance-reinsurance process. Longevity-linked securities, transferring
the risk to the capital market, could back the development of a longevity
reinsurance market (see Section 7.4). So in the following we describe several
arrangements, some of which in particular could be effective when linked
to longevity securities. At the same time, we disregard any arrangement
designed to deal with random fluctuations. To be consistent with the pre-
vious discussion, we refer to immediate life annuities (which in any case
are the most interesting type of annuity when a transfer of longevity risk is
being considered).
It must be pointed out that when mortality risk is reinsured in a life
annuity portfolio, one cannot be sure that just longevity risk is transferred.
Indeed, random fluctuations also contribute to deviations in mortality rates,
7.3 Managing the longevity risk 319

Figure 7.9. Required capital, per unit of reserve: Q_{M_0}^{[R3]}(n_0), facing mortality random fluctuations only; mortality assumption A3(τ). [Plot of the required capital (per unit of reserve, 0% to 4%) against the portfolio size (0 to 12,000), for portfolios 1 to 5.]

as we have highlighted previously. If the reinsurance arrangement is meant


to deal mainly with longevity risk, then before underwriting it the risk of
random fluctuations has to be reduced; for example, some leveling of the
annual amounts has to be achieved through a first-step surplus reinsurance.
For this reason, in the following we will implicitly refer to homogeneous
portfolios in respect of the amount of benefits.
The most natural way for an annuity provider to transfer longevity risk is
to truncate the duration of each annuity. To this purpose, an Excess-of-Loss
(XL) reinsurance can be designed. Under such an arrangement, the reinsurer
would pay to the cedant the ‘final’ part of the life annuity in excess of a given
age xmax . Such an age should be reasonably old, but not too close to the
maximum age (otherwise the transfer would be ineffective); xmax could,
for example, be set equal to the Lexis point in the current projected table.
Note that xmax defines the deductible of the XL arrangement. See Fig. 7.10,
where x0 = 65 and xmax = 85.
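The split of a single annuity between cedant and reinsurer under the XL treaty can be sketched numerically. The following Python fragment is an illustration only; the function name, the unit annual benefit, and the payment timing (annual benefits at ages x0 + 1, . . . up to death) are our assumptions, not part of the treaty description.

```python
# Sketch of the XL split of one life annuity: the cedant pays the benefit up
# to age x_max (the deductible), the reinsurer pays the 'final' part beyond it.

def xl_split(x0, x_max, age_at_death, b=1.0):
    """Total amounts paid by cedant and reinsurer for an immediate life
    annuity issued at age x0, with annual benefit b paid at ages
    x0 + 1, ..., age_at_death."""
    cedant = b * max(0, min(age_at_death, x_max) - x0)
    reinsurer = b * max(0, age_at_death - x_max)
    return cedant, reinsurer

# An annuitant entering at 65 and dying at 92: the cedant pays 20 annual
# benefits (ages 66..85), the reinsurer the remaining 7 (ages 86..92).
print(xl_split(65, 85, 92))  # (20.0, 7.0)
```

For a death before age x_max the reinsurer pays nothing, which is why the treaty leaves the reinsurer with only the 'worst part' of each annuity.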
From the point of view of the cedant, this reinsurance treaty converts
immediate life annuities payable for the whole residual lifetime into imme-
diate temporary life annuities. From the point of view of the reinsurer, a
heavy charge of risk emerges. Actually, the reinsurer takes the ‘worst part’
of each annuity, being involved at the oldest ages only. Therefore, from a
practical point of view, the reinsurance treaty would be acceptable to the
Figure 7.10. An XL reinsurance arrangement. [Diagram: the lifetimes of annuitants 1, 2, . . . , n from age 65; the reinsurer's intervention covers the part of each lifetime beyond age 85.]

reinsurer only if it were compulsory for some annuity providers. This could
be the case, for example, with pension funds, which may be forced by the
supervisory authority to back their liabilities through arrangements with
(re-)insurers.
The XL arrangement is clearly defined on a long-term basis, so imply-
ing a heavy longevity risk charged to the reinsurer. In more realistic terms,
reinsurance arrangements defined on a short-medium period basis could
be addressed. With this objective in mind, stop-loss arrangements could
provide interesting solutions. According to the stop-loss rationale, the rein-
surer’s interventions are aimed at preventing the default of the cedant,
caused by (systematic) mortality deviations.
The effect of mortality deviations can be identified, in particular, by com-
paring the total portfolio assets at a given time with the portfolio reserve
required to meet the insurer’s obligations. A Stop-Loss reinsurance on assets
can then be designed, according to which the reinsurer funds (at least par-
tially) the possible deficiency in assets; Fig. 7.11 sketches this idea (in a
run-off perspective).
Let z be the time of issue (or revision) of the reinsurance arrangement.
Adopting the notation introduced earlier, in practical terms the reinsurer’s
intervention can be limited to the case

  W_{z+k} < (1 − π) V_{z+k},    π ≥ 0                                 (7.55)
Figure 7.11. A Stop-Loss reinsurance arrangement on assets. [Diagram: assets and reserve plotted against time; the reinsurer's intervention covers the shortfall of the available assets with respect to the required portfolio reserve.]

where the amount π V_{z+k} represents the ‘priority’ of the stop-loss treaty and
k is a given number of years. We note that setting π > 0 may contain the
possibility of random fluctuations being transferred. However, thanks to
the fact that the assets and the reserve of a life annuity portfolio have long-
term features, the flows of the arrangement should not be heavily affected
by random fluctuations, at least up to some time. In fact, close to the natural
maturity of the portfolio we may expect that random fluctuations become
predominant relative to systematic deviations; see also Section 7.2.4. Setting
k > 1 (e.g. k = 3 or k = 5) ensures that the reinsurer intervenes in the more
severe situations, and not when the lack of assets may be recovered by the
subsequent flows of the portfolio. However, k should not be set too high,
otherwise the funding to the cedant in the critical cases would turn out to
be too delayed in time.
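The trigger condition (7.55) and one possible funding rule can be sketched as follows. The rule of topping assets up to the full reserve is an illustrative assumption, since the treaty may fund the deficiency only partially; names are ours.

```python
# Minimal sketch of the Stop-Loss-on-assets trigger (7.55): the reinsurer
# intervenes only if the assets W fall below (1 - pi) times the required
# portfolio reserve V at the checking date.

def stop_loss_on_assets(W, V, pi=0.0):
    """Return the reinsurer's payment at the checking date."""
    if W < (1 - pi) * V:      # trigger condition (7.55)
        return V - W          # fund the full deficiency (one possible rule)
    return 0.0

print(stop_loss_on_assets(W=90.0, V=100.0, pi=0.05))  # 10.0 (90 < 95: trigger)
print(stop_loss_on_assets(W=96.0, V=100.0, pi=0.05))  # 0.0 (96 >= 95: no trigger)
```

With π > 0 the small shortfalls, which are more likely to reflect random fluctuations, remain with the cedant.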
A technical difficulty in this treaty concerns the definition of assets and
reserve to be referred to for ascertaining the loss. Further, some control of
the investment policy adopted by the cedant in relation to these assets could
be requested by the reinsurer. For these reasons, the treaty can be conceived
as an ‘internal’ arrangement, that is, within an insurance group (where the
holding company takes the role of the reinsurer of affiliates) or when there
is some partnership between a pension fund and an insurance company (the
latter then acting as the reinsurer, the former as the cedant).
A Stop-Loss reinsurance may be designed on annual outflows, instead of
assets. The rationale, in this case, is that, at a given point in time, longevity
risk is perceived if the amount of benefits to be currently paid to annuitants
is (significantly) higher than expected. A transfer arrangement can then be

designed so that the reinsurer takes charge of such an extra amount, or


‘loss’. As in the previous case, the loss may be due to random fluctuations
– here, this situation is more likely, given that annual outflows are directly
referred to, instead of some accrual of outflows. By setting a trigger level for
the reinsurer’s intervention higher than the expected value of the amount
of benefits, we would reduce the possible transfer of such a random risk
component.
Reinsurance conditions should concern the following items:

– Let z be the time of issue (or revision) of the arrangement. The time
horizon k of the reinsurance coverage should be stated, as well as the
timing of the possible reinsurer’s intervention within it. Within the time
horizon k, policy conditions (i.e. premium basis, mortality assumptions,
and so on) should be guaranteed. As to the timing of the intervention
of the reinsurer, since reference is to annual outflows, it is reasonable to
assume that a yearly timing is chosen. Hence, in the following, we will
make this assumption.
– The mortality assumption for calculating the expected value of the out-
flow, required to define the loss of the cedant. Reasonably, we will adopt
the current mortality table, which will be generically denoted as A(τ) in
what follows.
– The minimum amount Λ_t of benefits (at time t, t = z + 1, z + 2, . . . , z + k)
  below which there is no payment by the reinsurer. For example,

    Λ_t = E[B_t | A(τ), n_z] (1 + r) = b E[N_t | A(τ), n_z] (1 + r)   (7.56)

  with r ≥ 0 and b the annual amount for each annuitant; thus the amount
  Λ_t represents the priority of the Stop-Loss arrangement.
– The Stop-Loss upper limit, that is, an amount Ω_t such that Ω_t − Λ_t is
  the maximum amount paid by the reinsurer at time t. From the point
  of view of the cedant, the amount Ω_t should be set high enough so that
  only situations of extremely high survivorship are charged to the cedant.
  However, the reinsurer reasonably sets Ω_t in connection with the available
  hedging opportunities. We will come back to this issue in Section 7.4.3.
  As to the cedant, a further reinsurance arrangement may be underwritten,
  if available, for the residual risk, possibly with another reinsurer; in this
  case, the amount Ω_t − Λ_t operates as the first layer.

In Fig. 7.12, a typical situation is represented.


When we consider the features of this treaty, especially in relation to the
Stop-Loss arrangement on assets, we note that measuring annual outflows
is relatively easy, since this relies on some direct information about the
Figure 7.12. A Stop-Loss reinsurance arrangement on annual outflows. [Diagram: actual outflows and expected values plotted against time; the reinsurer's intervention covers the part of the annual outflows between the priority and the upper limit.]

portfolio (viz. the number of living annuitants, together with the annual amount
of their benefits). On the other hand, as already pointed out, it is more
difficult to avoid the transfer of random fluctuations as well.
We now define in detail the flows paid by the reinsurer. Let B_t^{(SL)} denote
such a flow at time t, t = z + 1, z + 2, . . . , z + k. We have

  B_t^{(SL)} =  0              if B_t ≤ Λ_t
                B_t − Λ_t      if Λ_t < B_t ≤ Ω_t                     (7.57)
                Ω_t − Λ_t      if B_t > Ω_t

The net outflow of the cedant at time t (gross of the reinsurance premium),
denoted as OF_t^{(SL)}, is then

  OF_t^{(SL)} = B_t − B_t^{(SL)} =  B_t                   if B_t ≤ Λ_t
                                    Λ_t                   if Λ_t < B_t ≤ Ω_t      (7.58)
                                    B_t − (Ω_t − Λ_t)     if B_t > Ω_t

The net outflow of the cedant is clearly random but, unless some ‘extreme’
survivorship event occurs, it is protected with a cap. It is interesting
(especially for comparison with the swap-like arrangement described subsequently)
to comment on this outflow. First of all, it must be stressed that
B_t ≤ Λ_t represents a situation of profit or small loss to the insurer. On
the contrary, the event B_t > Ω_t corresponds to a huge loss. Whenever
Λ_t < B_t ≤ Ω_t a loss results for the insurer, whose severity may range
from small (if B_t is close to Λ_t) to high (if B_t is close to Ω_t). So the
effect of the Stop-Loss arrangement is to transfer to the reinsurer all of the
loss situations, except for the lowest and the heaviest ones; any situation of
profit, on the contrary, is kept by the cedant.
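The payments just defined can be condensed into a single ‘layer’ formula. A minimal numerical sketch, with illustrative names (`lam` for the priority, `omega` for the upper limit):

```python
# Numerical sketch of the Stop-Loss treaty on annual outflows. B is the total
# benefit outflow of the portfolio; lam denotes the priority and omega the
# upper limit. The reinsurer pays the layer between them.

def reinsurer_flow(B, lam, omega):
    """Reinsurer's payment: 0 below the priority, the excess over the
    priority inside the layer, the full layer width beyond it."""
    return min(max(B - lam, 0.0), omega - lam)

def cedant_net_outflow(B, lam, omega):
    """Cedant's outflow, gross of the reinsurance premium."""
    return B - reinsurer_flow(B, lam, omega)

# Priority 110 and upper limit 200 (e.g. 1.1 and 2 times an expected outflow of 100):
assert reinsurer_flow(90.0, 110.0, 200.0) == 0.0         # below the priority
assert reinsurer_flow(150.0, 110.0, 200.0) == 40.0       # inside the layer
assert cedant_net_outflow(150.0, 110.0, 200.0) == 110.0  # capped at the priority
assert cedant_net_outflow(250.0, 110.0, 200.0) == 160.0  # upper limit exceeded
```

The single min/max expression reproduces the three branches of the piecewise definition.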
To reduce further the randomness of the annual outflow, the cedant may be
willing to transfer to the reinsurer not only losses, but also profits. Thus,
a reinsurance-swap arrangement on annual outflows can be designed. Let
B*_t be a target value for the outflows of the insurer at time t, t = z + 1, z +
2, . . . , z + k; for example,

  B*_t = E[B_t | A(τ), n_z]                                           (7.59)

where A(τ) is an appropriate mortality assumption and z is the time of issue
of the reinsurance swap. Under the swap, if B_t > B*_t the cedant receives
money from the reinsurer; otherwise, if B_t < B*_t, then the cedant gives
money to the reinsurer, so that the target outflow is reached.
Let B_t^{(swap)} be the payment from the reinsurer to the cedant, defined as
follows:

  B_t^{(swap)} = B_t − B*_t                                           (7.60)

The annual outflow (gross of the reinsurance premium) for the cedant at
time t is

  OF_t^{(swap)} = B_t − B_t^{(swap)} = B*_t                           (7.61)

The advantage for the cedant is to convert a random flow, B_t, into
a certain flow, B*_t; hence the term ‘reinsurance-swap’ that we have
assigned to this arrangement. Figure 7.13 depicts a possible situation. Note
that, ceteris paribus, this arrangement should be less expensive than the
Stop-Loss treaty on outflows, given that the reinsurer participates not only
in the losses, but also in the profits.
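A sketch of the resulting flows, with illustrative names; note how the cedant's outflow equals the target whatever the actual outflow turns out to be.

```python
# Sketch of the reinsurance-swap on annual outflows: the reinsurer pays the
# difference between the actual outflow B and the target B_star; a negative
# payment means that the cedant pays the reinsurer.

def swap_payment(B, B_star):
    """Payment from the reinsurer to the cedant."""
    return B - B_star

def cedant_outflow_swap(B, B_star):
    """Cedant's outflow, gross of the reinsurance premium."""
    return B - swap_payment(B, B_star)

for B in (80.0, 100.0, 130.0):
    assert cedant_outflow_swap(B, 100.0) == 100.0  # the random flow becomes certain
assert swap_payment(80.0, 100.0) == -20.0          # a profit year: the cedant pays
```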
Although one advantage for the cedant of the reinsurance-swap is a pos-
sible price reduction, the cedant may be unwilling to transfer profits. On the
contrary, the arrangement may be interesting for the reinsurer depending on
the hedging tools available in the capital market (so that it could even be the
only arrangement available on the reinsurance market); see Section 7.4.3 in
this regard.
The design of the reinsurance-swap can be generalized by assigning two
barriers Λ′_t, Λ″_t (with Λ′_t ≤ B*_t ≤ Λ″_t) such that

  B_t^{(swap-b)} =  B_t − Λ′_t    if B_t ≤ Λ′_t
                    0             if Λ′_t < B_t ≤ Λ″_t                (7.62)
                    B_t − Λ″_t    if B_t > Λ″_t
Figure 7.13. A reinsurance-swap arrangement. [Diagram: actual outflow and target outflow (bn_0) plotted against time; payments flow from or to the cedant according to whether the actual outflow falls below or above the target.]

Clearly, when setting Λ′_t = Λ″_t = B*_t in (7.62), one finds (7.60) again. The
net outflow (gross of the reinsurance premium) to the cedant is then

  OF_t^{(swap-b)} = B_t − B_t^{(swap-b)} =  Λ′_t    if B_t ≤ Λ′_t
                                            B_t     if Λ′_t < B_t ≤ Λ″_t          (7.63)
                                            Λ″_t    if B_t > Λ″_t

It is interesting to compare (7.63) with (7.58). We have already com-


mented on the implications of (7.58) for the profit/loss left to the cedant.
Under (7.63), large losses as well as large profits are transferred to the rein-
surer; therefore, both a floor and a cap are now applied to the profits/losses
of the cedant.
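The barrier version can be sketched in the same way; the net outflow reduces to clipping the actual outflow between the two barriers. Names and values are illustrative.

```python
# Sketch of the reinsurance-swap with barriers: inside the barriers the cedant
# keeps its outflow; outside them both large profits and large losses are
# transferred, so the net outflow is floored and capped.

def swap_b_payment(B, lower, upper):
    """Payment from the reinsurer to the cedant (negative below the lower barrier)."""
    if B <= lower:
        return B - lower
    if B <= upper:
        return 0.0
    return B - upper

def cedant_outflow_swap_b(B, lower, upper):
    return B - swap_b_payment(B, lower, upper)  # equals min(max(B, lower), upper)

assert cedant_outflow_swap_b(60.0, 75.0, 125.0) == 75.0    # floored
assert cedant_outflow_swap_b(100.0, 75.0, 125.0) == 100.0  # kept by the cedant
assert cedant_outflow_swap_b(150.0, 75.0, 125.0) == 125.0  # capped
# Barriers collapsing onto a common target recover the plain swap:
assert cedant_outflow_swap_b(137.5, 100.0, 100.0) == 100.0
```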
So far we have not commented on the pricing of the reinsurance arrange-
ments which have been examined. Actually, we will not enter into details
regarding this subject, but just make some remarks.
The critical issue in pricing a reinsurance arrangement involving aggre-
gate mortality risk is the pricing of longevity risk. As already commented
in Section 7.2.3, many attempts have been devoted to this issue, but no
generally accepted proposal is yet available. The stochastic model used
extensively in this chapter, although useful for internal purposes (such as
capital allocation), is not appropriate in general for pricing, due to the wide
set of items to be chosen (alternative mortality scenarios, weights attached

to such scenarios, and so on), as well as to the intrinsic static representation


of stochastic mortality.
As far as the XL arrangement and the Stop-Loss treaty on assets are
concerned, the adoption of traditional actuarial pricing methods, such as
the percentile principle, is reasonable because of the traditional structure
of the arrangement. Due to the context within which they could have to
be realized (such as a compulsory backing of pension fund liabilities by a
life insurer), the stochastic model used so far, with the set A(τ) suggested
by some independent institution, can offer an acceptable representation of
stochastic mortality also for pricing purposes.
The Stop-Loss arrangement on outflows and the reinsurance-swap, in
contrast, have features very close to those of financial derivatives. As we
have already noted, these arrangements can develop if they are properly
backed by longevity-linked securities. So their pricing will depend on the
pricing of the backing securities; attempts in this respect are still at an early
stage.
The choice of a particular reinsurance arrangement clearly depends at first
on what is available in the reinsurance market. In the case that more than
one solution is available, attention should not be paid just to the price,
but also to the benefits obtained in terms of reduction of the required
capital. For the reasons discussed above, we are not going to compare
the arrangements in terms of their price but we conduct some numerical
investigations concerning the capital requirements resulting from various
reinsurance arrangements. Due to the practical interest that they might have,
we consider just the Stop-Loss treaty on outflows and the swap-reinsurance
arrangement. Given the earlier comments about solvency issues, we make
use of an internal model, to account jointly (and consistently) for the risk
of random fluctuations and the longevity risk.

Example 7.9 We refer to the assumptions of Example 7.6. As we have


highlighted in discussing this example, if one wants to deal with a model
recording the overall longevity risk, the proper time-horizon is the maxi-
mum residual duration of the portfolio. So we adopt requirement (7.38). At
any valuation time, we assume that the flows until the end of the reinsurance
period are the net outflows OF_t^{(·)}, whilst after that time they are simply
the annual payments B_t. Therefore, when assessing the required capital,
we do not assume that the reinsurance arrangement will be automatically
renewed.
Policy conditions have to be chosen specifically for the reinsurance
arrangement; in particular, the two bounds must be set differently in the
Stop-Loss arrangement and in the reinsurance-swap with barriers. The
following choices are adopted:

  Λ_t = 1.1 E[B_t | A3(τ), n_z]
  Ω_t = 2 E[B_t | A3(τ), n_z]                                         (7.64)

for the former;

  Λ′_t = 0.75 E[B_t | A3(τ), n_z]
  Λ″_t = 1.25 E[B_t | A3(τ), n_z]                                     (7.65)

for the latter. For the reinsurance-swap we set:

  B*_t = E[B_t | A3(τ), n_z]                                          (7.66)

For all of the arrangements, a 5-year reinsurance period has been chosen.
To allow for some comparisons, we have assumed that at the beginning of
each reinsurance period a premium must be paid by the cedant, assessed as
the (unconditional) expected present value of future reinsurance flows. We
should point out that this pricing principle does not make practical sense,
given that no risk margin is included; however, with this approach, we can
at least take into account the magnitude of the reinsurance premium. We
assume further that the reinsurer and the cedant adopt the same mortality
model, with the same parameters and that the reserve must be fully set up
by the cedant. The possible default of the reinsurer is disregarded when
assessing the required capital.
In Table 7.29, we give the required capital (per unit of reserve) for
the three arrangements, for different portfolio sizes, as well as for the
case of no reinsurance arrangement (these latter results are taken from
Table 7.23). Because of the increased certainty of the outflows during the
reinsurance period, the lowest amount of required capital is found under the
reinsurance-swap (with no barriers); but clearly, in such an arrangement the
premium for the risk (which we have not considered) could be higher than
in other cases. As already noted, due to the different parameter values, the
outflows under the alternative arrangements are not directly comparable.
It is interesting to note that most of the reduction in the required capital
is gained at the oldest ages, roughly after the Lexis point. Indeed, the most
severe part of the longevity risk is expected to emerge after this age. So, we
can argue that the need for reinsurance emerges in particular at the oldest
ages; at earlier ages, the risk could be managed through other RM tools. 
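The effect of the three arrangements on a single year's outflow can be illustrated with the parameter choices (7.64)–(7.66). The sample outflow value below is invented for the sketch and is not produced by the portfolio model of the example.

```python
# Toy illustration of how the three arrangements of Example 7.9 reshape the
# cedant's annual outflow, for an expected outflow of 100.

EB = 100.0                        # stands for E[B_t | A3(tau), n_z]
lam, omega = 1.1 * EB, 2.0 * EB   # Stop-Loss priority and upper limit (7.64)
lo, hi = 0.75 * EB, 1.25 * EB     # swap barriers (7.65)
B_star = EB                       # swap target (7.66)

def net_stop_loss(B):
    """Cedant's net outflow under the Stop-Loss treaty on outflows."""
    return B - min(max(B - lam, 0.0), omega - lam)

def net_swap(B):
    """Cedant's net outflow under the reinsurance-swap (always the target)."""
    return B_star

def net_swap_b(B):
    """Cedant's net outflow under the reinsurance-swap with barriers."""
    return min(max(B, lo), hi)

# A heavy-survivorship year (B = 140): the Stop-Loss caps the outflow at the
# priority, the plain swap fixes it at the target, the barriers cap it at 125.
assert abs(net_stop_loss(140.0) - 110.0) < 1e-9
assert net_swap(140.0) == 100.0
assert net_swap_b(140.0) == 125.0
```

The degree of protection (and hence the premium, which the example deliberately prices without a risk margin) differs across the three arrangements, which is why their required capitals in Table 7.29 are not directly comparable.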

We conclude this section by describing an arrangement which (at least in


principle) could help in realizing natural hedging across LOBs.
Table 7.29. Required capital, per unit of reserve: M_z^{[R3]}/V_z, with and without reinsurance

              No reinsurance                          Stop-loss on outflows
Time z    n0 = 100   n0 = 1,000  n0 = 10,000    n0 = 100   n0 = 1,000  n0 = 10,000
  0        12.744%     9.241%      8.103%        12.744%     9.241%      8.103%
  5        16.492%    11.938%     10.525%        16.492%    11.938%     10.525%
 10        21.333%    15.621%     13.890%        21.333%    15.621%     13.890%
 15        28.007%    20.372%     18.281%        27.603%    20.372%     18.281%
 20        37.456%    27.008%     24.131%        35.246%    26.230%     23.739%
 25        53.378%    36.113%     31.832%        44.356%    31.746%     28.433%
 30        81.037%    50.476%     42.140%        51.771%    35.389%     30.687%
 35       165.842%    77.890%     56.968%        58.926%    30.841%     25.540%

              Reinsurance-swap, no barriers         Reinsurance-swap, with barriers
Time z    n0 = 100   n0 = 1,000  n0 = 10,000    n0 = 100   n0 = 1,000  n0 = 10,000
  0        12.451%     9.088%      8.002%        12.744%     9.241%      8.103%
  5        15.819%    11.571%     10.241%        16.492%    11.938%     10.525%
 10        20.138%    14.731%     13.196%        21.333%    15.621%     13.890%
 15        24.683%    18.440%     16.548%        28.007%    20.372%     18.281%
 20        30.776%    22.168%     19.918%        37.299%    27.008%     24.131%
 25        37.998%    25.280%     22.112%        49.855%    35.183%     31.413%
 30        45.167%    26.260%     21.452%        62.390%    41.945%     36.373%
 35        66.244%    27.762%     17.984%        89.438%    48.414%     37.579%

As was mentioned in Section 7.3.2, an appropriate diversification effect


between life insurance and life annuities may be difficult to obtain by an
insurer on its own. Intervention of a reinsurer can help in reaching the target
and, inter alia, could provide a way for reinsurers to hedge the accepted
longevity risk.
We sketch a simple situation, involving two insurers, labelled IA and IB
respectively, and a reinsurer.
Insurer IA deals with life annuities. At time 0 a (total) single premium S^A
is collected from the issue of immediate life annuities; the overall annual
amount paid at time t is B_t (t = 1, 2, . . . ). Insurer IB deals with whole life
insurances. Let us assume that annual premiums are payable up to the time
of death and the benefit is paid at the end of the year of death; the total
amount of premiums collected at time t is P_t^B (t = 0, 1, . . . ) whilst the
benefits falling due at time t (t = 1, 2, . . . ) over the portfolio are denoted as
C_t. A reinsurance arrangement is underwritten by the two insurers with
the same reinsurer, according to which

– at time t (t = 0, 1, . . . ) the reinsurer pays to insurer IA an amount equal
  to P_t^B, and at time 0 pays an amount equal to S^A to insurer IB;
Figure 7.14. Flows in the swap-like arrangement between life annuities and life insurances. [Diagram: the reinsurer stands between insurer IA and insurer IB: it receives the death benefits from IA and the annuity benefits from IB, and pays the insurance premiums to IA and the annuity premiums to IB; each insurer also exchanges premiums and benefits with its own annuitants or insureds.]

– at each time t (t = 1, 2, . . . ) the reinsurer receives from insurer IA an
  amount equal to C_t, and at each time t (t = 1, 2, . . . ) receives from insurer
  IB an amount equal to B_t.

This would be a swap-like arrangement between life annuities and life


insurances; Fig. 7.14 gives a graphical idea of the overall flows.
Let us assume that the quantities introduced above are defined for each
time t; in particular, S_0^A = S^A and S_t^A = 0 for t = 1, 2, . . . , whilst C_0 = 0
and B_0 = 0. Then it turns out that at any time t, t = 0, 1, . . . , the net
cashflow for both insurer IA and insurer IB is S_t^A + P_t^B − B_t − C_t, whilst for the
reinsurer the net cashflow is B_t + C_t − S_t^A − P_t^B. Each party has both a
position in life annuities and one in life insurances, and therefore gains the
benefit from natural hedging.
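The bookkeeping of the arrangement can be checked with a small sketch; the flow values are illustrative.

```python
# Sketch of the net cashflows in the swap-like arrangement: each insurer's
# net cashflow equals S_A + P_B - B - C and the reinsurer's is the opposite,
# as stated in the text.

def net_insurer_A(S_A, P_B, B, C):
    # receives the annuity premium S_A from annuitants and P_B from the
    # reinsurer; pays annuity benefits B to annuitants and C to the reinsurer
    return S_A + P_B - B - C

def net_insurer_B(S_A, P_B, B, C):
    # receives insurance premiums P_B from insureds and S_A from the
    # reinsurer; pays death benefits C to insureds and B to the reinsurer
    return P_B + S_A - C - B

def net_reinsurer(S_A, P_B, B, C):
    return (B + C) - (S_A + P_B)

flows = dict(S_A=0.0, P_B=30.0, B=45.0, C=20.0)  # some time t >= 1: S_t^A = 0
assert net_insurer_A(**flows) == net_insurer_B(**flows) == -35.0
assert net_reinsurer(**flows) == 35.0            # the three positions balance
```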
Practical difficulties inherent in such an arrangement are self-evident.
Advantages may be weak, especially because of the incomplete hedging
provided. It must also be pointed out that the actual duration of the life
insurance covers may be shortened because of surrenders. Further, some
reward has to be acknowledged to the reinsurer, which can reduce the
advantages gained from the new position. However, this structure could
represent a useful management framework within an insurance group,
where the holding company could play the part of the reinsurer (with
reduced fees charged to the counterparties).
A similar swap arrangement is described by Cox and Lin (2007), however
without explicit intervention of a reinsurer. Consider homogeneous portfolios,
both for insurer IA and IB. Therefore: B_t = b N_t and C_t = c D_t,
where b denotes the annual amount paid to each annuitant, c the death benefit
paid to each whole life policyholder, N_t the number of annuitants at time t in
the portfolio of insurer IA, and D_t the number of deaths in year (t − 1, t)
in the portfolio of insurer IB. Let n*_t and d*_t be two given benchmarks for
the number of annuitants at time t for insurer IA and the number of deaths
in year (t − 1, t) for insurer IB, respectively. Insurers IA and IB agree that
the flow b · max{N_t − n*_t, 0} is paid at time t by insurer IB to insurer IA,
whilst the flow c · max{D_t − d*_t, 0} is paid at the same time by insurer IA
to IB. This way, insurer IA is protected against excess survivorship, whilst
insurer IB is protected in respect of excess mortality. However, insurer IA is
then exposed to excess mortality, whilst insurer IB to excess survivorship.
Cox and Lin (2007) show through numerical assessments that some natural
hedging effects are gained by both insurers, provided that the present values
of future payments for life annuities and for life insurances are the same at
the time of issue.
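The two legs of the Cox–Lin swap can be sketched directly; amounts and benchmarks are illustrative.

```python
# Sketch of the Cox-Lin swap between the two insurers (no reinsurer): IB pays
# IA when survivors exceed the benchmark, IA pays IB when deaths exceed theirs.

def annuity_leg(b, N_t, n_star):
    """Flow b * max(N_t - n*_t, 0), paid by insurer IB to insurer IA at time t."""
    return b * max(N_t - n_star, 0)

def life_leg(c, D_t, d_star):
    """Flow c * max(D_t - d*_t, 0), paid by insurer IA to insurer IB at time t."""
    return c * max(D_t - d_star, 0)

# Excess survivorship: 960 annuitants alive against a benchmark of 940
assert annuity_leg(b=1.0, N_t=960, n_star=940) == 20.0
# Mortality below benchmark: no payment on the life insurance leg
assert life_leg(c=10.0, D_t=18, d_star=25) == 0.0
```

Each leg is a one-sided (option-like) payoff, which is why each insurer remains exposed to deviations in the opposite direction.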

7.4 Alternative risk transfers

7.4.1 Life insurance securitization

Securitization consists in packaging a pool of assets or, more generally, a


sequence of cash flows into securities traded on the market. The aims of a
securitization transaction can be:

– to raise liquidity by selling future flows (such as the recovery of


acquisition costs or embedded profits);
– to transfer risks whenever contingent payments or random cash flows
are involved.

We note that, since new securities are issued, a counterparty risk arises (for
the investor).
The organizational aspects of a securitization transaction are rather com-
plex. Figure 7.15 sketches a simple design for a life insurance deal, focussing
on the main agents involved. The transaction starts in the insurance market,
where policies underwritten give rise to the cash flows which are securi-
tized (at least in part). The insurer then sells the right to some cash flows
to a special purpose vehicle (SPV), which is a financial entity that has been
established to link the insurer to the capital market. Securities backed by
the chosen cash flows are issued by the SPV, which raises monies from the
capital market. Such funds are (at least partially) available to the insurer.
According to the specific features of the transaction, further items may
be added to the structure. For example, a fixed interest rate could be paid
Figure 7.15. The securitization process in life insurance: a simplified structure. [Diagram: policyholders pay premiums to, and receive benefits from, the insurer; the insurer transfers cash flows to the Special Purpose Vehicle (SPV) in exchange for funding; the SPV issues securities to the capital market in exchange for their price.]

Figure 7.16. The securitization process in life insurance: a more composite structure. [Diagram: as in Fig. 7.15, with a credit enhancement mechanism (receiving a premium and providing a guarantee) and a swap counterparty exchanging a fixed interest rate for a floating interest rate with the SPV.]

to investors, so that the intervention by a swap counterparty is required;


see Fig. 7.16.
As has been pointed out above, some counterparty risk is originated by
the securitization transaction. This is due to the possible default of the
insurer with respect to the obligations assumed against the SPV, as well as
of the policyholders in respect of the insurer, in the form of surrenders and
lapses (which may possibly affect the securitized cash flows). To reduce such
default risks, some form of credit enhancement may be introduced, both
internal (e.g. transferring to the SPV higher cashflows than those required
by the actual size of the securities) and external, through the intervention
of a specific entity (issuing, for example, credit insurance, letters of credit,
and so on); see again Fig. 7.16. Further counterparty risk emerges from the
other parties involved, similarly to any financial transaction. We note that
the intervention by a third financial institution may result in an increase of
the rating of the securities.

Further details of the securitization transaction concern services for pay-


ments provided by external bodies, investment banks trading the securities
on the market, and so on. Since we are only interested in the main technical
aspects of the securitization process, we do not go deeper into these topics
(which, nevertheless, do play an important role in the success of the overall
transaction).

7.4.2 Mortality-linked securities

Mortality-linked securities are securities whose pay-off is contingent on the


mortality experienced in a given population; this is obtained, in particu-
lar, by embedding some derivatives whose underlying is a mortality index
assessed on the given population. These securities may serve two oppo-
site purposes: to hedge extra-mortality or extra-survivorship. In the former
case, we will refer to them as mortality bonds, in the latter case as longevity
bonds. We restrict the terminology to ‘bond’, without making explicit refer-
ence (in the name) to the derivative which is included in the security (which
could be option-like, swap-like, or other) because we are more interested
in the hedging opportunities rather than in the organizational aspects of
the deal. We are aware of the importance that such aspects play from a
practical point of view, but their discussion goes beyond the aims of this
book.
Both for mortality and longevity bonds, a reference population is cho-
sen, whose mortality rates are observed during the lifetime of the bond. The
population may consist of a given cohort (as can be the case for longevity
bonds) or a given mix of populations, possibly of different countries (typ-
ically this applies to mortality bonds). A mortality or a survivor index is
defined, whose performance is assessed according to the mortality experi-
enced in the reference population. Possible examples of an index are: the
average mortality rate in one-year’s time (or a longer period) across the
population, the number of survivors relative to the size of the population
at the time of issue of the bond, and so on. The amount of the coupon is
contingent on such index; in particular, the coupon may be higher/lower the
higher is the index, depending on the specific bond design. In some cases,
the principal may vary (in particular, be reduced) according to the mortal-
ity index. Specific cases are discussed below, separately for mortality and
longevity bonds. We point out that, to avoid lack of confidence in the way
that the pay-off of the mortality-linked security is determined, mortality
data should be collected and calculated by independent analysts; so typi-
cally general population mortality data are referred to instead of insurance
data (we will come back later to this aspect).

Mortality bonds are designed as catastrophe bonds. The purpose is to pro-


vide liquidity in the case of mortality being in excess of what is expected,
possibly owing to epidemics or natural disasters. So, typically a short posi-
tion on the bond may hedge liabilities of an insurer/reinsurer dealing with
life insurances.
Mortality bonds are typically short term (3-5 years) and they are linked
to a mortality index expressing the frequency of mortality observed in the
reference population in a given period. Some thresholds are normally set at
bond issue. If the mortality index exceeds a threshold, then either the
principal or the coupon is reduced. Although it is outside the scope of the
discussion to deal with mortality risk in the portfolios of life insurances,
we discuss in some detail possible structures for mortality bonds to give a
comprehensive picture of the developing mortality-linked securities. In what
follows, 0 is the time of issue of the bond and T its maturity. With It we
denote the mortality index after t years from bond issue (t = 0, 1, . . . , T).
Further, St denotes the principal of the bond at time t and Ct the coupon
due at time t.
Mortality bond – example 1. The bond is designed to protect against
high mortality experienced during the lifetime of the bond itself. This is
obtained by reducing the principal at maturity. Although just some ages
could be considered in detecting situations of high mortality, it is reasonable
to address a range of ages. Further, the index should account for mortality
over the whole lifetime of the bond. So the following quantities represent
possible examples of a mortality index
  I_T = max{q(t)}_{t=1,2,...,T}                                       (7.67)

  I_T = (1/T) Σ_{t=1}^{T} q(t)                                        (7.68)
where q(t) is the annual frequency of death averaged across the chosen
population in year t (we stress that although in our notation t is the time
since the issue of the bond, the frequencies of death in (7.67) and (7.68) are
recorded in specific calendar years, namely, in the ‘calendar year of issue
+ t’). It is then reasonable to set I_0 = q(0).
At maturity the principal paid back to investors is

  S_T = S_0 ×  1          if I_T ≤ λ′ I_0
               ψ(I_T)     if λ′ I_0 < I_T ≤ λ″ I_0                    (7.69)
               0          if I_T > λ″ I_0

where λ′, λ″ are two parameters (stated under bond conditions), with 1 ≤
λ′ < λ″, and ψ(I_T) is a decreasing function such that ψ(λ′ I_0) = 1 and
ψ(λ″ I_0) = 0. For example,

  ψ(I_T) = (λ″ I_0 − I_T) / ((λ″ − λ′) I_0)                           (7.70)
Note that λ I0 and λ I0 represent two thresholds for the mortality index.
The coupon is independent of mortality; it could be defined as follows:
\[
C_t = S_0\,(i_t + r)
\tag{7.71}
\]
where i_t is the market interest rate in year t (defined by the bond conditions)
and r is an extra yield rewarding investors for taking mortality risk.
We note that for an insurer/reinsurer dealing with life insurance and
taking a short position in the bond, in the case of high mortality experience,
the high frequency of payment of death benefits is counterbalanced by a
reduced payment to investors.
An example of this security is the mortality bond issued by Swiss Re; see,
for example, Blake et al. (2006a).
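A minimal sketch of the principal-reduction mechanism of (7.69)–(7.70), with the linear reduction function; all threshold parameters and index values below are hypothetical:

```python
# Principal repaid at maturity under (7.69)-(7.70): full principal below
# the lower threshold, linear write-down between the thresholds, total
# loss above the upper one. All parameter values are hypothetical.

def principal_at_maturity(S0, I_T, I0, lam_lo, lam_hi):
    if I_T <= lam_lo * I0:
        return S0                                   # no reduction
    if I_T <= lam_hi * I0:
        # linear write-down from S0 to 0 between the two thresholds
        return S0 * (lam_hi * I0 - I_T) / ((lam_hi - lam_lo) * I0)
    return 0.0                                      # principal wiped out

S0, I0 = 100.0, 0.008
for I_T in (0.0079, 0.0096, 0.0110):
    print(principal_at_maturity(S0, I_T, I0, 1.1, 1.3))
```

The three calls illustrate the three branches: index below the lower threshold, between the thresholds, and above the upper one.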
Mortality bond – example 2. The flows of the bond described in the
previous example try to match the flows in the life insurance portfolio just
at the end of a period of some years. An alternative design of the mortality
bond may provide a match on a yearly basis. This is obtained by letting the
coupon depend on mortality. For example,
\[
C_t = S_0 \times
\begin{cases}
i_t + r & \text{if } I_t \le \xi'_t \\
(i_t + r)\,\phi(I_t) & \text{if } \xi'_t < I_t \le \xi''_t \\
0 & \text{if } I_t > \xi''_t
\end{cases}
\tag{7.72}
\]
where ξ′_t, ξ″_t set two mortality thresholds. For example,
\[
\xi'_t = \lambda'\,E[D_t\,|\,A], \qquad
\xi''_t = \lambda''\,E[D_t\,|\,A], \qquad
1 \le \lambda' < \lambda''
\tag{7.73}
\]
where D_t is the number of deaths in year (t − 1, t) in the reference population
and E[D_t | A] is its expected value according to the mortality assumption A.
Clearly, in this structure the mortality index I_t should measure the number
of deaths in year (t − 1, t). The function φ(·) should then be decreasing; for
example,
\[
\phi(I_t) = \frac{\xi''_t - I_t}{\xi''_t - \xi'_t}
\tag{7.74}
\]
As in (7.71), the rate r in (7.72) is an extra-investment yield rewarding
investors for the mortality risk inherent in the pay-off of the bond.
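The mortality-dependent coupon can be sketched as follows; the thresholds, rates, and death counts below are hypothetical (the thresholds are taken as multiples of an assumed expected number of deaths):

```python
# Coupon of mortality bond example 2 (eqs. 7.72 and 7.74): full coupon
# while observed deaths stay below the lower threshold, scaled down
# linearly between the thresholds, zero above the upper one.
# All figures are hypothetical.

def coupon(S0, i_t, r, I_t, thr_lo, thr_hi):
    if I_t <= thr_lo:
        return S0 * (i_t + r)
    if I_t <= thr_hi:
        phi = (thr_hi - I_t) / (thr_hi - thr_lo)  # decreasing in I_t
        return S0 * (i_t + r) * phi
    return 0.0

# thresholds, e.g. 1.05 and 1.25 times an expected E[D_t | A] = 1000 deaths
thr_lo, thr_hi = 1050.0, 1250.0
for deaths in (1000.0, 1150.0, 1300.0):
    print(coupon(100.0, 0.04, 0.01, deaths, thr_lo, thr_hi))
```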
For longevity bonds the critical situation is a mortality lower than
expected or, in other terms, people outliving their expected lifetimes. In
contrast to the situation of extra-mortality, excess survivorship is not a
sudden phenomenon, but rather a persistent one. So longevity bonds are, by
nature, long term.
Remark It is worthwhile stressing the difference between longevity bonds
and (fixed-income) long-term bonds. While the former are financial secu-
rities whose performance is linked to some longevity index (see below for
details), the latter are traditional bonds with, say, a 20–25 years maturity,
and (usually) a fixed annual interest (or possibly an annual interest linked
to some economic or financial index, as for example an inflation index).
Although not tailored to the specific needs arising from the longevity risk,
long-term bonds can help in meeting obligations related to a life annu-
ity portfolio. Actually, one of the most important problems in managing
portfolios of life annuities (with a guaranteed benefit) consists in mitigat-
ing the investment risk through the availability of fixed-income long-term
assets, to match the long-term liabilities. Clearly, this problem becomes
more dramatic as the expected duration of the life annuities increases. □
Depending on its design, the longevity bond may offer hedging oppor-
tunities to an insurer/reinsurer dealing with life annuities through either a
long or a short position. In the first case, the pay-off of the bond increases
with decreasing mortality; vice versa in the second case. Given the long-
term maturity, it is reasonable that the link is realized through the coupon,
hence providing liquidity on a yearly basis. In the following, we therefore
assume that the principal is fixed.
The reference population should be a given cohort, possibly close to
retirement, that is, with age 60–65 at bond issue. Let Lt be the number of
individuals in the cohort after t years from issue, t = 0, 1, . . . ; viz, L0 = l0
is a known value. A maturity T may be chosen for the bond, with T high
(e.g. T ≥ 85 − initial age). In the following, some possible designs for the
coupons are examined.
Longevity bond – example 1. The easiest way to link the coupon to the
longevity experience in the reference population is to let it be proportional
to the observed survival rate. So
\[
C_t = C \times \frac{L_t}{l_0}
\tag{7.75}
\]
where C is a given amount (linking the size of the coupon to the principal of
the bond). We note that in the case of unanticipated longevity the coupon
is higher than expected; so a long position should be taken by an
insurer/reinsurer dealing with life annuities. A similar bond has been pro-
posed by EIB/BNP Paribas, although it has not been traded on the market;
see Blake, Cairns and Dowd (2006a) for details.
Longevity bond – example 2. In a similar way to the mortality bond
(example 1 or 2), two thresholds may be assigned, expressing survival levels.
If the number of survivors in the cohort exceeds such thresholds, then the
amount of the coupon is reduced, possibly to 0. The following definition
can be adopted:
\[
C_t = C \times
\begin{cases}
\dfrac{l''_t - l'_t}{l_0} & \text{if } L_t \le l'_t \\[4pt]
\dfrac{l''_t - L_t}{l_0} & \text{if } l'_t < L_t \le l''_t \\[4pt]
0 & \text{if } L_t > l''_t
\end{cases}
\tag{7.76}
\]
where l′_t, l″_t are the two thresholds, expressing a given number of survivors.
For example: l′_t = λ′ E[L_t | A(τ)], l″_t = λ″ E[L_t | A(τ)], where 1 ≤ λ′ < λ″ and
A(τ) is a given mortality assumption for the reference cohort (assumed to
be born in year τ). We note that, in this case, the lower the mortality (i.e.
the higher L_t), the lower the amount of the coupon. A short position
should be taken to hedge life annuity outflows. A similar bond is described
by Lin and Cox (2005).
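The survivor-linked coupon of (7.76) can be sketched as follows; the cohort size and the two survivor thresholds below are hypothetical:

```python
# Coupon of longevity bond example 2: the higher the number of survivors
# L_t in the reference cohort, the lower the coupon. The thresholds l_lo
# and l_hi express numbers of survivors; all figures are hypothetical.

def coupon(C, L_t, l0, l_lo, l_hi):
    if L_t <= l_lo:
        return C * (l_hi - l_lo) / l0   # full coupon
    if L_t <= l_hi:
        return C * (l_hi - L_t) / l0    # reduced coupon
    return 0.0                          # survivorship above the upper threshold

l0 = 10000
l_lo, l_hi = 8200, 8800   # e.g. lambda', lambda'' times E[L_t | A(tau)]
for L_t in (8000, 8500, 9000):
    print(coupon(1000.0, L_t, l0, l_lo, l_hi))
```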
Longevity bond – example 3. The coupon can be set proportional to the
number of deaths observed in the reference cohort from issue. For example
\[
C_t = C \times \frac{l_0 - L_t}{l_0}
\tag{7.77}
\]
where l0 − Lt is the observed number of deaths up to time t. In contrast to
the previous case, no target is set for such a number. Clearly, also in this
case a short position should be taken to hedge longevity risk.
We will discuss in more detail how to hedge longevity risk through
longevity bonds in Section 7.4.3. We now address some market issues.
There are many difficulties in developing a market for longevity bonds.
A first issue concerns who might be interested in issuing/investing in bonds
that offer hedging opportunities to insurers/reinsurers. In general terms,
one could argue that such securities may offer diversification opportunities,
in particular because of their low correlation with standard financial mar-
ket risk factors. Further, they may give long-term investment opportunities,
which may be rarely available. From the point of view of the issuer of bonds
like example 1, the possibility of building a longevity bond depends, how-
ever, on the availability of financial securities with an appropriate maturity
to match the payments promised under the longevity bond.
A further issue, already mentioned, concerns the choice of mortality data.
To encourage confidence in the linking mechanism, reference to insurance
data should be avoided. Data recorded and analysed by an independent
body should rather be adopted. This raises an issue of basis risk for hedgers
(see Section 7.4.3). On the other hand, there are many weak points in a
mechanism linking the pay-off of the bond to insurance data; among these, we
mention the following: insurance data may be affected in particular by
insurers/reinsurers with large portfolios, so that some manipulation of the
data may be feared by investors; for commercial reasons, the mix of the insured
population may change over time, whilst reference to the general population
offers more stability.
A final aspect (but not least in terms of importance) concerns the pricing
of the longevity risk transferred to the capital market. Also in this respect
there are many difficulties. First, a generally accepted model for stochastic
mortality is not yet available (see Section 7.2.3). Second, a market is not yet
developed, nor are similar risks traded in the market itself. So, even if there
were common agreement on a pricing model, data to estimate the relevant
parameters are not yet available. Three theoretical approaches have been
proposed in the literature: distortion measures, risk-neutral modelling, and
incomplete markets. Research in this area is still at an early stage, and
open issues remain that require careful investigation. See
Section 7.6 for some examples, and Section 7.8 for references.

7.4.3 Hedging life annuity liabilities through longevity bonds

We refer here to an insurer or to a reinsurer dealing with immediate life
annuities. In the case of an insurer, we refer to the portfolio described in
Section 7.2.4; in the case of a reinsurer, we assume that support is provided
to an insurer with a portfolio as the one described in Section 7.2.4. We
have already noted (see Section 7.3.3) that heterogeneous annual amounts
mainly impact on random fluctuations. Therefore, when managing mortal-
ity risk in a life annuity portfolio, the insurer should first underwrite some
traditional surplus reinsurance to reduce the dispersion of annual amounts
in its portfolio. In the following, we will assume that such action has been
taken; so, unless otherwise stated, we make reference to a homogeneous life
annuity portfolio, where b(j) = b for each annuitant j. We recall that in this
case B_t = b N_t.
The insurer/reinsurer faces the random outflows B_t and counterbalances
them with random flows F_t, such that the net outflows B_t − F_t are close
to some target outflows OF∗_t. If the hedging is pursued by an insurer, then
reference is to the original outflows B_t of the life annuity portfolio. If the
hedging is realized by a reinsurer, then reference is to the outflows B_t^(SL),
B_t^(swap), or B_t^(swap−b), depending on the reinsurance arrangement dealt with.
In the following, we discuss how the target OFt∗ can be set and reached
according to the hedging tools available in the market. For the sake of
brevity, we assume that the longevity bond is issued at the same time as
the life annuities; some comments will follow in this regard. Thus, unless
otherwise stated, time 0 will be the time of issue of the life annuities and
the bond.
We first consider the case of a longevity bond with coupon (7.75). An
insurer dealing with immediate life annuities should buy k units of such a
bond at time 0, so that F_t = k C_t > 0 at time t = 1, 2, . . . . The net outflow
for the insurer at time t, t = 1, 2, . . . , is then
\[
OF_t^{(LB)} = B_t - k\,C_t
\tag{7.78}
\]
which can be rewritten as
\[
OF_t^{(LB)} = b\,n_0\,\frac{N_t}{n_0} - k\,C\,\frac{L_t}{l_0}
\tag{7.79}
\]
We assume that Nt /n0 = Lt /l0 for any time t; this means that mortality of
annuitants is perfectly replicated by mortality in the reference population.
The net outflow to the insurer then becomes
\[
OF_t^{(LB)} = (b\,n_0 - k\,C)\,\frac{L_t}{l_0}
\tag{7.80}
\]
Note that the net outflow is still random because of the dependence on Lt .
However, if k = b n0 /C then the term b n0 − k C reduces to zero, and a
situation of certainty is achieved (i.e. the hedging would be perfect); the
target outflow for this situation is therefore OFt∗ = 0.
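A minimal sketch of this hedge, with k = b n0/C units of the bond with coupon (7.75); the portfolio figures below are hypothetical. When the annuitants' survival rate replicates that of the reference population the net outflow vanishes; otherwise a residual mismatch (basis risk) remains:

```python
# Hedge (7.78)-(7.80): annuity payments b*N_t against coupon income
# k*C*L_t/l0 from a long position of k = b*n0/C bond units.
# All portfolio figures are hypothetical.

b, n0 = 1.0, 10000     # annual amount per head, initial annuitants
C, l0 = 100.0, 100000  # coupon scale, initial reference cohort
k = b * n0 / C         # bond units bought at time 0

def net_outflow(N_t, L_t):
    annuity_payments = b * n0 * (N_t / n0)  # B_t = b * N_t
    coupon_income = k * C * (L_t / l0)      # k * C_t with coupon (7.75)
    return annuity_payments - coupon_income

print(net_outflow(9000, 90000))  # same survival rates: perfect hedge
print(net_outflow(9100, 90000))  # annuitants outlive the reference: basis risk
```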
In practical terms, perfect hedging is difficult to realize. Although we can
rely on some positive correlation between the survival rate in the reference
population, Lt /l0 , and that in the annuitants’ cohort, Nt /n0 , it is unrealistic
that they coincide in each year, due to the fact that usually the annuitants
are not representative of the reference population. In particular, the year of
birth of the reference cohort and of annuitants may differ. This mismatching
leads to basis risk in the strategy for hedging longevity risk.
A second aspect concerns the lifetime of the bond. Typically the bond
is not issued when the life annuity payments start. If it is issued earlier,
the previous relations still hold, just with an appropriate redefinition of the
quantities l0 and Lt ; the problem in this case would consist in the availability
of the bond, in the required size, in the secondary market. If the bond is
issued later than the life annuities, the longevity risk of the insurer would be
unhedged for some years (but in a period when annuitants are still young,
and longevity risk is therefore not too severe). In both cases, the basis risk
may be stronger, due to the fact that it is more likely that the years of
birth of annuitants and the reference population are different. The critical
aspect of the lifetime of the bond is its maturity, T. Realistically, T is a
finite time, so that the hedge in (7.79) can be realized just up to time T (and
not for any time t). The insurer has to plan a further purchase of longevity
bonds after time T; however, the availability of bonds, in particular with
the features required for the hedging, is not certain. In the case that further
longevity bonds are available in the future, the basis risk may worsen in
time, given that for any bond issue a cohort of new retirees is likely to be
referred to.
We now move to longevity bonds with coupon (7.76) and (7.77). As
already mentioned in Section 7.4.2, such bonds require a short position to
hedge longevity risk. This position is, however, difficult for an insurer (or
other annuity provider) to realize on its own, because of the complexity
of the deal. It is reasonable to assume that some form of reinsurance is
purchased by the annuity provider. The reinsurer, who transacts business
on a larger scale than the insurer, then hedges its position through longevity
bonds, typically issued by an SPV (see Fig. 7.17).
Let us assume that a reinsurer is able to issue a bond with coupon (7.76).
The reinsurer should be willing, in this case, to underwrite the Stop-Loss
arrangement on annual outflows, whose reinsurance flows are described
by (7.57). Thus, the longevity bond should offer hedging opportunities
against the liabilities of the reinsurer in respect of the insurer, as we will
demonstrate.

[Diagram: annuitants pay premiums to the annuity provider and receive annual payments; the annuity provider pays a premium to the reinsurer and receives benefits; the reinsurer pays a premium to the SPV and receives benefits; the SPV collects the income from the bond sale from the capital market and pays coupons and principal to investors.]

Figure 7.17. Longevity risk transfer from the annuity provider to the capital market.
Assume that the reinsurer matches the outflow B_t^(SL) arising from the
reinsurance arrangement with a short position on k units of the longevity
bond with coupon (7.76). In this case, F_t = −k C_t < 0 at time t = 1, 2, . . . .
If the underlying life annuity portfolio is homogeneous in respect of annual
amounts, the net outflow of the reinsurer, NF_t^(SL), is
\[
NF_t^{(SL)} = B_t^{(SL)} + k\,C_t =
\begin{cases}
0 & \text{if } b\,N_t \le \Xi'_t \\
b\,N_t - \Xi'_t & \text{if } \Xi'_t < b\,N_t \le \Xi''_t \\
\Xi''_t - \Xi'_t & \text{if } b\,N_t > \Xi''_t
\end{cases}
\;+\; k\,C \times
\begin{cases}
\dfrac{l''_t - l'_t}{l_0} & \text{if } L_t \le l'_t \\[4pt]
\dfrac{l''_t - L_t}{l_0} & \text{if } l'_t < L_t \le l''_t \\[4pt]
0 & \text{if } L_t > l''_t
\end{cases}
\tag{7.81}
\]
Since we are aiming at perfect hedging, the thresholds Ξ′_t, Ξ″_t in the
reinsurance arrangement are reasonably chosen according to the features of the
longevity bond. So we assume that Ξ′_t = (l′_t/l_0) b n_0 and Ξ″_t = (l″_t/l_0) b n_0.
We can rewrite (replacing the relevant quantities and rearranging)
\[
NF_t^{(SL)} = b\,n_0 \times
\begin{cases}
0 & \text{if } \frac{N_t}{n_0} \le \frac{l'_t}{l_0} \\[4pt]
\frac{N_t}{n_0} - \frac{l'_t}{l_0} & \text{if } \frac{l'_t}{l_0} < \frac{N_t}{n_0} \le \frac{l''_t}{l_0} \\[4pt]
\frac{l''_t - l'_t}{l_0} & \text{if } \frac{N_t}{n_0} > \frac{l''_t}{l_0}
\end{cases}
\;+\; k\,C \times
\begin{cases}
\frac{l''_t - l'_t}{l_0} & \text{if } \frac{L_t}{l_0} \le \frac{l'_t}{l_0} \\[4pt]
\frac{l''_t - L_t}{l_0} & \text{if } \frac{l'_t}{l_0} < \frac{L_t}{l_0} \le \frac{l''_t}{l_0} \\[4pt]
0 & \text{if } \frac{L_t}{l_0} > \frac{l''_t}{l_0}
\end{cases}
\tag{7.82}
\]
If N_t/n_0 = L_t/l_0, this reduces to
\[
NF_t^{(SL)} =
\begin{cases}
k\,C\,\frac{l''_t - l'_t}{l_0} & \text{if } \frac{L_t}{l_0} \le \frac{l'_t}{l_0} \\[4pt]
b\,n_0\,\frac{L_t - l'_t}{l_0} + k\,C\,\frac{l''_t - L_t}{l_0} & \text{if } \frac{l'_t}{l_0} < \frac{L_t}{l_0} \le \frac{l''_t}{l_0} \\[4pt]
b\,n_0\,\frac{l''_t - l'_t}{l_0} & \text{if } \frac{L_t}{l_0} > \frac{l''_t}{l_0}
\end{cases}
\tag{7.83}
\]
[Diagram: annual annuity outflows over time; the priority and the upper limit split each annual outflow into the flow to investors and the flow to the insurer.]

Figure 7.18. Flows for a reinsurer dealing with a Stop-Loss arrangement on annual outflows and issuing a longevity bond – example 2.
Further, if k = b n_0/C, then
\[
NF_t^{(SL)} =
\begin{cases}
b\,n_0\,\frac{l''_t - l'_t}{l_0} & \text{if } \frac{L_t}{l_0} \le \frac{l'_t}{l_0} \\[4pt]
b\,n_0\,\frac{l''_t - l'_t}{l_0} & \text{if } \frac{l'_t}{l_0} < \frac{L_t}{l_0} \le \frac{l''_t}{l_0} \\[4pt]
b\,n_0\,\frac{l''_t - l'_t}{l_0} & \text{if } \frac{L_t}{l_0} > \frac{l''_t}{l_0}
\end{cases}
= b\,n_0\,\frac{l''_t - l'_t}{l_0}
\tag{7.84}
\]
which is a non-random situation. A graphical representation is provided in
Fig. 7.18.
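The algebra above can be checked numerically; the sketch below (hypothetical figures, with n0 = l0 for simplicity) reproduces the constant net outflow of (7.84):

```python
# Stop-loss payout plus coupon owed on the short bond position: with
# k = b*n0/C and identical survival in the two populations, the total is
# the constant b*n0*(l_hi - l_lo)/l0. All figures are hypothetical.

b, n0 = 1.0, 10000
C, l0 = 100.0, 10000      # n0 == l0 in this toy example
k = b * n0 / C
l_lo, l_hi = 8200, 8800   # survivor thresholds l'_t, l''_t
xi_lo = b * n0 * l_lo / l0   # stop-loss priority
xi_hi = b * n0 * l_hi / l0   # stop-loss upper limit

def stop_loss_payout(N_t):
    """Reinsurer's payout: excess of b*N_t over the priority, capped."""
    return min(max(b * N_t - xi_lo, 0.0), xi_hi - xi_lo)

def coupon(L_t):
    """Coupon (7.76) of longevity bond example 2."""
    if L_t <= l_lo:
        return C * (l_hi - l_lo) / l0
    if L_t <= l_hi:
        return C * (l_hi - L_t) / l0
    return 0.0

for N_t in (8000, 8500, 9000):
    L_t = N_t                      # perfect replication, since n0 == l0
    print(stop_loss_payout(N_t) + k * coupon(L_t))   # constant net outflow
```

Whatever the realized number of survivors, the higher stop-loss payout in years of high survivorship is exactly offset by the lower coupon owed to investors.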
The assumptions on which such a perfect hedging strategy is based are
the same as those adopted for the longevity bond – example 1, that is,

– the survival rate in the annuitant population, N_t/n_0, is the same as that
observed in the reference population, L_t/l_0;
– the lifetime of the bond coincides with the lifetime of the life annuity
portfolio; in particular, no maturity has been set.

It is clear that such conditions are unrealistic, so that the reinsurer transfers
the longevity risk to investors only partially. In any case, the target outflow
in setting the hedging strategy is OF∗_t = b n_0 (l″_t − l′_t)/l_0. A similar
strategy is described by Lin and Cox (2005), albeit without calling explicitly
for a reinsurance arrangement between an insurer and a reinsurer.
We note that the unavailability of a longevity bond which perfectly
matches the reinsurer's liability suggests that the reinsurance arrangement
should be underwritten just for a finite time, as we have considered in
Section 7.3.4. At any renewal time, the pricing of the arrangement, as well
as the relevant conditions, can be updated to take account of the current
availability of hedging tools.
If the reinsurer is able to issue a bond with coupon (7.77), then the
reinsurance-swap arrangement can be hedged. We assume that the reinsurer
takes a short position on k units of the longevity bond with coupon (7.77)
(note that in this case, similarly to the previous one, Ft = −k Ct < 0).
Underwriting jointly a reinsurance-swap arrangement, the net flow of the
reinsurer is
\[
NF_t^{(swap)} = B_t - B^*_t + k\,C_t
\tag{7.85}
\]
First, we refer to a homogeneous life annuity portfolio and note that the
target outflow (7.59) for the insurer under the reinsurance-swap can be
restated as
\[
B^*_t = b\,E[N_t \,|\, A(\tau), n_z] = b\,n^*_t
\tag{7.86}
\]
and so the net flow for the reinsurer can be rewritten as
\[
NF_t^{(swap)} = b\,n_0\,\frac{N_t}{n_0} - b\,n^*_t + k\,C\,\frac{l_0 - L_t}{l_0}
= k\,C - b\,n^*_t + b\,n_0\,\frac{N_t}{n_0} - k\,C\,\frac{L_t}{l_0}
\tag{7.87}
\]
If N_t/n_0 = L_t/l_0 and k = b n_0/C, then
\[
NF_t^{(swap)} = k\,C - b\,n^*_t = b\,(n_0 - n^*_t)
\tag{7.88}
\]
which is again non-random. We note that the net outflow of the rein-
surer is proportional to the number of deaths assumed as a target in the
reinsurance-swap, namely, n0 − nt∗ . Clearly, OFt∗ = b (n0 − nt∗ ) is the target
outflow for the hedging strategy. A graphical representation is provided in
Fig. 7.19. Remarks on the possibility of realizing a perfect hedging are as
in the previous cases.
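As a numerical check of (7.85)–(7.88), the sketch below (hypothetical figures, with n0 = l0) shows the net flow collapsing to b (n0 − n*t) whenever the reference cohort replicates the annuitants:

```python
# Reinsurance-swap leg b*N_t - b*n_star_t hedged with a short position on
# k = b*n0/C units of the bond with coupon (7.77), proportional to the
# deaths l0 - L_t in the reference cohort. All figures are hypothetical.

b, n0 = 1.0, 10000
C, l0 = 100.0, 10000
k = b * n0 / C

def net_flow(N_t, L_t, n_star_t):
    swap_leg = b * N_t - b * n_star_t        # B_t - B*_t, see (7.85)-(7.86)
    coupon_leg = k * C * (l0 - L_t) / l0     # k * C_t with coupon (7.77)
    return swap_leg + coupon_leg

n_star_t = 8600.0   # expected survivors under the assumption A(tau)
for N_t in (8000, 8500, 9000):
    print(net_flow(N_t, N_t, n_star_t))  # always b*(n0 - n_star_t)
```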
The impossibility of relying on a perfect hedging strategy suggests adopting
the reinsurance-swap arrangement with flows (7.62) instead of (7.60).
The arrangement (7.62) could also be justified by a hedging strategy
involving several positions on longevity-linked securities.
[Diagram: the reinsurer's net outflow over time, with level b n0; each annual outflow splits into the flow to investors and the flow from/to the cedant.]

Figure 7.19. Flows for a reinsurer dealing with a reinsurance-swap arrangement and issuing a longevity bond – example 3.

We conclude this section by recalling that whenever longevity risk is transferred
to some other entity, either to the issuer of a longevity bond or
to a reinsurer, a default risk arises for the insurer. This aspect should be
accounted for when allocating capital for the residual longevity risk borne
by the insurer itself.

7.5 Life annuities and longevity risk

7.5.1 The location of mortality risks in traditional life annuity products

So far in this chapter we have dealt with longevity risk referring to a portfo-
lio of immediate life annuities. The need for taking into account uncertainty
in future mortality trends and hence for a sound management of the impact
of longevity risk has clearly emerged.
However, life annuity products other than immediate life annuities are
sold in a number of insurance markets and, in many products, the severity
of longevity risk can be even higher than has emerged in the previous
investigations. We now introduce some remarks considering cases other
than immediate life annuities.
The technical features of several types of life annuities have already been
examined in Chapter 1, and the relevant traditional pricing tools as well
(see, in particular, Section 1.6). Unsatisfactory features of such models can
be easily understood if one analyses the models themselves from the
perspective of a dynamic mortality scenario. In this section, we develop some
general comments on the pricing of life annuities allowing for longevity
risk; a few examples are then mentioned in Section 7.6.
In Section 1.6, we recalled that in the traditional guaranteed life annuity
product the technical basis is stated when the premiums are fixed. So

(a) a deferred life annuity with (level) annual premiums implies the highest
longevity risk borne by the insurer, as the technical basis is stated at
policy issue (hence, well before retirement);
(b) a single premium immediate life annuity implies the lowest longevity
risk, as the technical basis is stated at retirement time only;
(c) the arrangement with single recurrent premiums represents an interme-
diate solution, given that the technical basis can be stated specifically
for each premium.

It follows that a stronger safety loading is required for solution (a) than
for (b), with solution (c) at some intermediate level. Clearly, in order to
calculate properly the safety loading required for the implied longevity risk,
some pricing model is needed. Alternatively, policy conditions that allow
for a revision of the technical basis should be included in the policy, as will
be commented later.
As recalled in Section 1.6, in case (b) the accumulation of
the amount funding an immediate life annuity can be obtained through
some insurance saving product, for example, an endowment insurance.
A package, in particular, can be offered, in which an endowment for the
accumulation period is combined with an immediate life annuity for the
decumulation period.
Combining an endowment insurance with a life annuity provides the
policyholder with
(a) an insurance cover against the risk of early death during the working
period;
(b) a saving instrument for accumulating a sum at retirement, to be (partly)
converted into a life annuity;
(c) a life annuity throughout the whole residual lifetime.

It is interesting to analyse the risks involved in this product, from the
point of view of the insurance company (see Fig. 7.20); we refer just to
the flows given by net premiums and benefits (hence we disregard risks
connected to expenses and other aspects).

[Diagram: during the accumulation period (0 to n) the insurer bears mortality risk on the sum at risk, and investment risk and surrender risk on the reserve; at time n, annuitization risk; during the post-retirement period, investment risk on the reserve and mortality risk (longevity risk included).]

Figure 7.20. Risks in an endowment combined with a life annuity.

Consistent with the notation in
Section 1.6, we let 0 be the time of issue of the endowment, n the maturity
of the endowment and the retirement time as well, x the age at time 0.
During the accumulation period, that is, throughout the policy duration
of the endowment, the insurer in particular bears:

– the investment risk, related to the mathematical reserve of the endowment,
if some financial guarantee operates, involving for example a
minimum interest rate guarantee;
– the (extra-)mortality risk, related to the sum at risk;
– the risk of surrender, related to the amount of the reserve, if some guarantee
on the surrender price (usually expressed as a share of the reserve)
is given.

During the decumulation period, as the annual amount is usually guaranteed,
the insurer bears:

– the investment risk, related to the mathematical reserve of the annuity,
if a minimum interest rate guarantee is operating;
– the (under-)mortality risk, and in particular the longevity risk.

At retirement time, if some guarantee has been given on the annuitization
rate, the insurer bears the risk connected to the option to annuitize. This
aspect is discussed in more detail in Section 7.5.2.
As regards the longevity risk, the time interval throughout which the
insurer bears the risk itself clearly coincides with the time interval involved
by the immediate life annuity, if the annuity rate 1/ax+n is stated and hence
guaranteed at retirement time only. We recall that the annuity rate converts
the sum at maturity S (used as a single premium) into a life annuity of annual
amount b according to the relation b = S/ax+n (see (1.57)).
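As a sketch of this conversion, a_{x+n} can be computed as the expected present value of a unit annuity from one-year survival probabilities and a flat interest rate; the three-year table and the 3% rate below are purely illustrative:

```python
# Conversion b = S / a_{x+n} (eq. 1.57): the single premium S buys an
# annual amount b, where a_{x+n} is the expected present value of a unit
# annuity paid at times 1, 2, ... . Probabilities and rate are hypothetical.

def annuity_value(p, v):
    """a = sum over h >= 1 of v^h * (probability of surviving h years);
    p[h-1] is the one-year survival probability in year h."""
    a, surv, disc = 0.0, 1.0, 1.0
    for p_year in p:
        surv *= p_year   # cumulative h-year survival probability
        disc *= v        # discount factor v^h
        a += disc * surv
    return a

v = 1 / 1.03                  # annual discount factor at 3%
p = [0.99, 0.98, 0.97]        # toy three-year survival table
a = annuity_value(p, v)
S = 100000.0
b = S / a                     # annual amount purchased by the premium S
print(a, b)
```

A real pricing basis would of course use a full projected table over the whole residual lifetime; the point here is only the mechanics of the annuity rate 1/a_{x+n}.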
Even if the annuity rate is stated at time n only, it is worth noting that
the endowment policy contains an ‘option to annuitize’. Apart from the
severity of the longevity risk implied by the guarantee on the annuity rate,
the presence of this option determines the insurer’s exposure to the risk
of adverse selection, as most of the policyholders annuitizing the maturity
benefit will be in a good health status (see Section 1.6.5).

7.5.2 GAO and GAR
The so-called guaranteed annuity option (GAO) (see Section 1.6.2) entitles
the policyholder to choose at retirement between the current annuity rate
(i.e. the annuity rate applied at time n for pricing immediate life annuities)
and the guaranteed one.
By definition, the GAO condition implies a guaranteed annuity rate
(GAR). In principle, the GAR can be stated at any time t, 0 ≤ t ≤ n. In
practice, the GAR stated at policy issue, that is, at time 0, constitutes a more
appealing feature of the life insurance product. If the GAR is stated at time
n only, the GAO vanishes and the insurance product simply provides the
policyholder with a life annuity with a guaranteed annual amount. What-
ever may be the time at which the GAR is stated, the life annuity provides
a guaranteed benefit, so that it can be referred to as a guaranteed annuity
(see Fig. 7.21).
[Diagram: a GAR stated at some time t (0 ≤ t ≤ n) yields a GAO; a GAR stated at time n only yields no option; in either case the product is a guaranteed annuity.]

Figure 7.21. GAO, GAR and Guaranteed Annuity.

Conversely, the expression non-guaranteed annuity denotes a life annuity
product in which the technical basis (and in particular the mortality basis)
can be changed during the annuity payment period; in practice, this means
that the annual amount of the annuity can be reduced, according to the
mortality experience. Clearly, such an annuity is a rather poor product
from the point of view of the annuitant.
As a consequence of the GAR, the insurer bears the longevity risk
(and the market risk, as the guarantee concerns both the mortality table
and the rate of interest) from the time at which the guaranteed rate is
stated onwards. Obviously, the longevity (and the market) risk borne by
the insurer decreases as the time at which the guaranteed rate is stated
increases.
The importance of an appropriate pricing of a GAO, and therefore of
an appropriate setting of a GAR, is witnessed by the default of Equitable
Life. The unanticipated decrease in interest and mortality rates experienced
during the 1990s caused the GAOs issued by Equitable during the 1980s
to become deeply in the money by the end of the 1990s. As a consequence,
in 2000 Equitable was forced to close to new life and pension
business.
Pricing a life annuity product within the GAR framework requires the use
of a projected mortality table. The more straightforward (and traditional)
approach for pricing the guarantee consists of adopting a table that includes
a safety loading to meet mortality improvements higher than expected. One
should, however, be aware of the fact that the possibility of unanticipated
mortality improvements reduces the reliability of such a safety loading (as
happened to Equitable). A more appropriate approach requires a pricing
model explicitly allowing for the longevity risk borne by the insurer, rather
than a safety loading roughly determined; see Section 7.6.

7.5.3 Adding flexibility to GAR products

A rigorous approach to pricing a GAR product usually leads to high
premium rates, which would not be attractive from the point of view of
the potential clients. Conversely, lower premiums leave the insurer heavily
exposed to unexpected mortality improvements. However, in both cases,
adding some flexibility to the life annuity product can provide interesting
solutions to the problem of pricing guaranteed life annuities. In what follows
we focus on some practicable solutions.
Assume that the insurer decides to set the GAR 1/a^{[1]}_{x+n}(h) at time
h (0 ≤ h < n) for a deferred life annuity to be paid from time n. Suppose
that a^{[1]}_{x+n}(h) is lower than the corresponding output of a rigorous
approach to GAR pricing.

[Diagram: the annual amount is reduced from b[1] to b′[1] when a new projected table becomes available at time r (h < r ≤ n).]

Figure 7.22. Annual amount in a conditional GAR product.

If an amount S is paid at time n as a single premium, the
resulting annual amount of the life annuity is given by
\[
b^{[1]} = \frac{S}{a^{[1]}_{x+n}(h)}
\tag{7.89}
\]
Assume that the insurer promises to pay the annual amount b[1] from
time n on, with the proviso that no dramatic improvement in the mortality
experienced occurs before time n. Conversely, if such an improvement is
experienced (and it results, for example, from a new projected life table
available at time r, h < r ≤ n), then the insurer can reduce the annual
amount to a lower level b′[1] (see Fig. 7.22). So a policy condition must be
added, leading to a conditional GAR product. Some constraints are usually
imposed (e.g. by the supervisory authority); in particular:

(a) the mortality improvement must exceed a stated threshold (e.g. in terms
of the increase in the life expectancy at age 65);
(b) r ≤ n − 2, say;
(c) no more than one reduction can be applied in a given number of years;
(d) whatever the mortality improvements may be, the reduction in the
annual amount must be less than or equal to a given share ρ, that is,
\[
\frac{b^{[1]} - b'^{[1]}}{b^{[1]}} \le \rho
\tag{7.90}
\]
so that, combining (c) and (d), a guarantee of a minimum annual amount
operates. Conversely, from time n the annual amount is guaranteed,
irrespective of any mortality improvement which may be recorded afterwards.
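A sketch of constraint (d): any reduction of the annual amount is floored at (1 − ρ) b[1]; the figures below are hypothetical:

```python
# Conditional-GAR reduction with the cap of constraint (d): the reduction
# (b1 - b1_new)/b1 may not exceed the share rho. Figures are hypothetical.

def reduced_amount(b1, b1_requested, rho):
    """Annual amount after the reduction, floored at (1 - rho) * b1."""
    floor = b1 * (1 - rho)          # minimum guaranteed annual amount
    return max(b1_requested, floor)

b1 = 10000.0
print(reduced_amount(b1, 9500.0, 0.10))  # 5% cut: within the cap, allowed
print(reduced_amount(b1, 8500.0, 0.10))  # 15% cut: floored at 90% of b[1]
```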
[Diagram: the annual amount is raised from b[2] to b′[2] at time s (s > n), when the experienced mortality turns out to be higher than expected.]

Figure 7.23. Annual amount in a participating GAR product.
Let us now turn to the case in which the insurer charges a rigorous (i.e.
lower) annuity rate 1/a^{[2]}_{x+n}(h). Hence, the annual amount is given by
\[
b^{[2]} = \frac{S}{a^{[2]}_{x+n}(h)}
\tag{7.91}
\]
with b^{[2]} < b^{[1]}.
Suppose that, at time s (s > n), statistical observations reveal that
the experienced mortality is higher than expected, because of a mortal-
ity improvement lower than forecasted. Hence, a mortality profit is going
to emerge from the life annuity portfolio. Then, the insurer can decide
to share part of the emerging profit among the annuitants, by raising the
annual amount from the (initial) guaranteed level b[2] to b′[2] (see Fig. 7.23).
This mechanism leads to a with-profit GAR product (or participating GAR
product).
Participation mechanisms work successfully in a number of life insur-
ance and life annuity products as far as distributing the investment profits
is concerned. Conversely, mortality profit participation is less common.
Notwithstanding this, important examples are provided by mortality profit
sharing in group life insurance and, as regards the life annuity business, by
participation mechanisms adopted in the German annuity market. The
critical point is that, in contrast to what happens for products with
participation in investment profits and in mortality profits in life insurance,
the people participating in mortality profits in life annuity portfolios are not
those who have generated such profits; thus, a tontine scheme emerges (see
Section 1.4.3).
[Diagram: the annual amount is reduced from b[3] to b[2] at time s (s > n), when the experienced mortality turns out to be lower than expected.]

Figure 7.24. Annual amount in a product with conditional GAR in the decumulation period.
It is worthwhile to note that, from a technical point of view, a policy condition similar to the conditional GAR may work also during the decumulation period. In this case, the amount of the benefit (possibly assessed at retirement time with an annuity rate higher than the one resulting from a rigorous approach to GAR pricing) may be reduced in the case of strong unanticipated improvements in mortality. It would be reasonable to fix a minimum benefit level in this case.
As an illustration, assume that the amount b^{[2]} resulting from (7.91) is considered the level of benefit that is consistent with a rigorous approach to GAR pricing. However, considering that the implied safety loading could turn out to be too severe according to the actual mortality experienced, the insurer is willing to pay the annual benefit b^{[3]}, with b^{[3]} > b^{[2]}. If, after time n, a strong mortality improvement is recorded, then the insurer will reduce the annual amount down to b^{[2]} (see Fig. 7.24). Constraints similar to (a) and (c) for the conditional GAR in the accumulation period should be applied. From a commercial point of view, care should be taken in making clear to the annuitant that the guaranteed benefit is b^{[2]} and not b^{[3]}. However, a tontine scheme emerges, given that in some sense a participation in losses is realized.

7.6 Allowing for longevity risk in pricing


As already pointed out, we are not going to discuss in detail the problem of pricing long-term living benefits allowing for longevity risk. Indeed, the unsolved issues are too important and complex to allow for a complete description in the present chapter: for example, there are different opinions on evolving mortality and hence on the appropriate stochastic model to allow for uncertain mortality trends, and the data for estimating the main parameters are unavailable.
On the other hand, pricing models for longevity risk are required when
dealing with life annuities and longevity bonds. Therefore, in this section,
we summarize a few of the main proposals which have been described in
literature. However, this is a subject which has been developing in the recent
literature, and we do not aim to give a comprehensive illustration of the
several proposals that have been put forward.
We first address the present value of life annuities. Denuit and Dhaene
(2007) and Denuit (2007) allow for randomness in the probabilities of
death within a Lee–Carter framework. Due to the importance of such a
framework, we briefly describe their approach. Let us adopt the standard
Lee–Carter framework, where the future forces of mortality are decom-
posed in a log-bilinear way (see Section 4.7.2). Specifically, the death rate
at age x in calendar year t is of the form exp(αx + βx κt ), where κt , in
particular, is a time index, reflecting the general level of mortality.
We denote by {}_hP_{x_0}(t_0) the random h-year survival probability for an individual aged x_0 in year t_0, that is, the conditional probability that this individual reaches age x_0 + h in year t_0 + h, given the κ_t's. Adopting assumptions (3.2) (from which (3.13) holds), such probability is formally defined as

\[ {}_hP_{x_0}(t_0) = \exp\left( - \sum_{s=0}^{h-1} m_{x_0+s}(t_0+s) \right) = \exp\left( - \sum_{s=0}^{h-1} \exp\left( \alpha_{x_0+s} + \beta_{x_0+s}\,\kappa_{t_0+s} \right) \right) \tag{7.92} \]
We refer to a basic life annuity contract paying the annual amount b = 1 at the end of each year, as long as the annuitant survives. The present value of such an annuity is the expectation of the payments made to an annuitant aged x_0 in year t_0, conditional on a given time index; it is calculated as

\[ a_{x_0}(t_0) = \sum_{h=1}^{\omega - x_0} {}_hP_{x_0}(t_0)\, v(0,h) = \sum_{h=1}^{\omega - x_0} \exp\left( - \sum_{s=0}^{h-1} \exp\left( \alpha_{x_0+s} + \beta_{x_0+s}\,\kappa_{t_0+s} \right) \right) v(0,h) \tag{7.93} \]

where v(0, h) is the discount factor, that is, the present value at time 0 of a unit payment made at time h. We note that a_{x_0}(t_0) is a random variable, since it depends on the future trajectory of the time index (i.e. on κ_{t_0}, κ_{t_0+1}, κ_{t_0+2}, ...). We note also that (7.93) generalizes (1.27).
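As a sketch of how (7.93) can be evaluated for one simulated trajectory of the time index; the α, β, and κ values below are made up for illustration and are not fitted Lee–Carter parameters:

```python
import math

def annuity_value(alpha, beta, kappa, v):
    """Conditional present value (7.93) of a unit life annuity, given one
    trajectory of the Lee-Carter time index kappa.
    alpha[s], beta[s] refer to age x0+s; kappa[s] to year t0+s;
    v(h) is the discount factor for a payment at time h."""
    omega_minus_x0 = len(alpha)
    total = 0.0
    for h in range(1, omega_minus_x0 + 1):
        # cumulative hazard over the first h years, as in (7.92)
        hazard = sum(math.exp(alpha[s] + beta[s] * kappa[s]) for s in range(h))
        total += math.exp(-hazard) * v(h)
    return total

# illustrative (made-up) parameters for a 30-year horizon
alpha = [-4.0 + 0.1 * s for s in range(30)]
beta = [0.02] * 30
kappa = [-10.0 - 0.5 * s for s in range(30)]  # one simulated trajectory
a = annuity_value(alpha, beta, kappa, v=lambda h: 1.03 ** -h)
```

Repeating the computation over many simulated κ trajectories yields the distribution of a_{x_0}(t_0) discussed next.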
The distribution function of a_{x_0}(t_0) is difficult to obtain. Useful approximations have been proposed by Denuit and Dhaene (2007) and Denuit (2007). Specifically, Denuit and Dhaene (2007) have proposed comonotonic approximations for the quantiles of the random survival probabilities {}_hP_{x_0}(t_0). Since the expression for a_{x_0}(t_0) involves a weighted sum of the {}_hP_{x_0}(t_0) terms, Denuit (2007) supplemented the first comonotonic approximation with a second one. This second approximation is based on the fact that the {}_hP_{x_0}(t_0) terms are expected to be closely dependent for increasing values of h, so that it may be reasonable to approximate the vector of random survival probabilities with its comonotonic version.
Interesting information can be obtained from a further investigation of the distribution of a_{x_0}(t_0). We consider a homogeneous portfolio, made of n_0 annuitants at time t_0. We refer now to the random variable a_{\overline{K^{(j)}_{x_0}}|}, where K^{(j)}_{x_0} is the curtate lifetime of individual j. Given the time index, the K^{(j)}_{x_0}'s are assumed to be independent and identically distributed, with common conditional h-year survival probability {}_hP_{x_0}(t_0).
We recall from Denuit et al. (2005) that a random variable X is said to precede another one Y in the convex order, denoted as X ⪯_cx Y, if the inequality E[g(X)] ≤ E[g(Y)] holds for all the convex functions g for which the expectations exist. Since X ⪯_cx Y implies E[X] = E[Y] and Var[X] ≤ Var[Y], X ⪯_cx Y intuitively means that X is 'less variable', or 'less dangerous', than Y.
Now, since the a_{\overline{K^{(j)}_{x_0}}|}'s are exchangeable, we have from Proposition 1.1 in Denuit and Vermandele (1998) that

\[ a_{x_0}(t_0) = E\Big[ a_{\overline{K^{(j)}_{x_0}}|} \,\Big|\, \kappa_{t_0+k},\ k = 1, 2, \ldots \Big] \preceq_{cx} \cdots \preceq_{cx} \frac{\sum_{j=1}^{n_0+1} a_{\overline{K^{(j)}_{x_0}}|}}{n_0+1} \preceq_{cx} \frac{\sum_{j=1}^{n_0} a_{\overline{K^{(j)}_{x_0}}|}}{n_0} \tag{7.94} \]
Increasing the size of the portfolio makes the average payment per annuity less variable (in the ⪯_cx sense), but this average remains random whatever the number of policies comprising the portfolio, being bounded from below by a_{x_0}(t_0) in the ⪯_cx sense. We note that, despite the positive dependence
existing between the Lee–Carter lifetimes, there is still some diversification effect in the portfolio.
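The diversification effect can be illustrated numerically. In the toy computation below, the time index takes one of two hypothetical scenarios (none of these figures come from the book); the variance of the average payment per policy is decomposed by conditioning on the scenario, decreases with the portfolio size n_0, but never falls below the variance of the conditional expectation, consistent with (7.94):

```python
# Toy two-scenario model: each scenario gives a distribution for the
# number N of annual payments received by an annuitant.
v = 1.03 ** -1  # one-year discount factor

def pv(n):  # present value of n unit payments at times 1..n
    return sum(v ** h for h in range(1, n + 1))

# scenario -> probability distribution of N (hypothetical numbers)
scenarios = {
    "low mortality":  {0: 0.05, 1: 0.10, 2: 0.25, 3: 0.60},
    "high mortality": {0: 0.15, 1: 0.25, 2: 0.35, 3: 0.25},
}
scen_prob = {"low mortality": 0.5, "high mortality": 0.5}

cond_mean, cond_var = {}, {}
for s, dist in scenarios.items():
    m1 = sum(p * pv(n) for n, p in dist.items())
    m2 = sum(p * pv(n) ** 2 for n, p in dist.items())
    cond_mean[s], cond_var[s] = m1, m2 - m1 ** 2

# law of total variance for the average payment in a portfolio of size n0
# (individuals i.i.d. given the scenario)
var_of_cond_mean = (sum(scen_prob[s] * cond_mean[s] ** 2 for s in scenarios)
                    - sum(scen_prob[s] * cond_mean[s] for s in scenarios) ** 2)
mean_of_cond_var = sum(scen_prob[s] * cond_var[s] for s in scenarios)

def var_average(n0):
    return var_of_cond_mean + mean_of_cond_var / n0
```

As n_0 grows, var_average(n_0) tends to var_of_cond_mean, the irreducible variance coming from the common time index: the systematic (longevity) component cannot be pooled away.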
Biffis (2005) calculates the single premium of a life annuity adopting affine
jump-diffusions for modelling the force of mortality and the short interest
rate. In this way, one deals simultaneously with financial and mortality
risks and calculates values based on no-arbitrage arguments. The setting
is also applied for portfolio valuations in Biffis and Millossovich (2006a)
and to the valuation of GAOs in Biffis and Millossovich (2006b). Affine
mortality structures are also addressed by Dahl (2004) and Dahl and Møller
(2006), where, in particular, hedging strategies for life insurance liabilities
are investigated.
Turning to the problem of pricing longevity bonds, Lin and Cox (2005) consider that the market is incomplete, and adopt the Wang transform (see, e.g. Wang (2002) and Wang (2004)). Given the future random flow X with cumulative probability distribution function (briefly, cdf) F(x), the one-factor Wang transform is the distorted cdf F^*(x) such that

\[ F^*(x) = \Phi\big( \Phi^{-1}(F(x)) + \lambda \big) \tag{7.95} \]

where \Phi(·) is the standard normal cdf and λ is the market price of risk (longevity risk included). The fair price of X is the present value of the expected value of X, calculated with the risk-free rate and the distorted cdf F^*(x).
Lin and Cox (2005) take X as the lifetime of an annuitant and calibrate
λ using life annuity quotations in the market (assuming that the price of a
life annuity is the present value of future payments, based on the risk-free
rate and the distorted cdf of the lifetime). They then apply the approach to
price mortality-linked securities.
The one-factor Wang transform assumes that the underlying distribution is known. However, usually F(x) is only the best estimate of the underlying unknown distribution. The two-factor Wang transform is the cdf F^{**}(x) such that

\[ F^{**}(x) = Q\big( \Phi^{-1}(F(x)) + \lambda \big) \tag{7.96} \]

where Q is the cdf of the t-distribution with k degrees of freedom. Lin and Cox (2008) adopt this latter approach for pricing mortality-linked securities, with k = 6.
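A minimal sketch of the one-factor transform (7.95), using only Python's standard library; the input cdf value and λ below are illustrative. The two-factor version (7.96) would simply replace the outer Φ with the cdf of a t-distribution, which is not in the standard library:

```python
from statistics import NormalDist

_phi = NormalDist()  # standard normal distribution

def wang_transform(F_x, lam):
    """One-factor Wang transform (7.95) applied to a cdf value F(x)."""
    return _phi.cdf(_phi.inv_cdf(F_x) + lam)

# distorting an illustrative cdf value with a positive market price of risk
F_star = wang_transform(0.4, lam=0.25)
```

A positive λ shifts probability mass (F^* ≥ F), while λ = 0 leaves the distribution unchanged; the sign convention to use in practice depends on which tail carries the risk being priced.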
Cairns, Blake and Dowd (2006a) assume similarities between the force
of mortality and interest rates and adapt arbitrage-free pricing frameworks
developed for interest-rate derivatives to price mortality-linked securities.
In Cairns, Blake and Dowd (2006b) they introduce the two-factor model
described in Section 5.3 and price longevity bonds with different terms
to maturity referenced to different cohorts. In particular, they develop a
method for calculating the market risk-adjusted price of a longevity bond,
which allows for mortality trend uncertainty and parameter risk as well.
We finally address the problem of the valuation of a GAO. The GAO (see Section 7.5.2) consists of a European call option whose underlying asset is the retail market value of a life annuity at retirement time and whose strike is the GAR set when the GAO was underwritten. The pay-off of the option by itself depends on the comparison between the guaranteed and the current annuity rate. However, the actual exercise of the option depends also on the preference that the holder expresses for a life annuity instead of self-annuitization. The intrinsic structure of the pay-off of the option is, therefore, uncertain because it depends on individual preferences, with possible adverse selection against the insurer. When assessing the value of the GAO, individual preferences are usually disregarded in the current literature. The pricing problem is therefore attacked by assuming that the policyholder will decide whether to exercise the option just by comparing the current market quotes for life annuities and the GAR. Ballotta and Haberman
(2003) address this problem, assuming that the overall mortality risks (and
hence also the longevity risk) are diversified. In Ballotta and Haberman
(2006) the analysis is extended to the case in which mortality risk is incor-
porated via a stochastic model for the evolution over time of the underlying
force of mortality.

7.7 Financing post-retirement income

7.7.1 Comparing life annuity prices

We refer to a person buying an immediate life annuity. Let S be the capital converted into the annuity and b the annual amount. The annuity rate b/S is a function of:

• the discount rate, i;
• the reference mortality table, A(τ);
• a safety loading (possibly explicit) for longevity risk.

The buyer may be interested in comparing the annuity rates applied by different providers, and in explaining the relevant differences. However, it may not be straightforward to understand the reasons for such differences, due to the interaction of the items building up the annuity rate and the complexity of the pricing model for longevity risk.
Typically, the discount rate is disclosed; this is in particular required when participation in investment profits occurs during the annuity payment. The comparison of annuity rates then concerns the incidence of mortality and the relevant interaction with the discount rate. Some equivalent parameters should be produced by the annuity provider (or by some other entity) to provide better information in this regard.
It is reasonable that the comparison among annuity rates makes reference to traditional pricing models. In particular, the actuarial value of a life annuity, a_x = \sum_{t=1}^{\omega-x} (1+i)^{-t}\, {}_tp_x, and the present value of an annuity certain, a_{k|i} = \sum_{t=1}^{k} (1+i)^{-t} = (1-(1+i)^{-k})/i, may be addressed.

Given the discount rate assumed in the annuity rate b/S, we can determine the equivalent number of payments of an annuity certain, that is, the number k such that a_{k|i} = S/b. If i > 0, we easily find

\[ k = - \frac{\ln\!\left(1 - i\,\frac{S}{b}\right)}{\ln(1+i)} \tag{7.97} \]

Conversely, if i = 0 then k is simply given by S/b and, according to a traditional actuarial valuation of the life annuity, it coincides with the expected lifetime assumed by the annuity provider for the annuitant. Clearly, the stronger is the cost of longevity embedded in the annuity rate b/S, the higher is k. Note that if i > 0, k depends also on i.
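Formula (7.97) is a one-liner to evaluate; the figures it produces match the entries of Tables 7.30 and 7.31 later in this section:

```python
import math

def equivalent_payments(S_over_b, i):
    """Number k of payments of an annuity certain with a_k|i = S/b (7.97)."""
    if i == 0:
        return S_over_b
    return -math.log(1 - i * S_over_b) / math.log(1 + i)

# mortality assumption A3(tau), discount rate i = 3% (cf. Table 7.30)
k = equivalent_payments(15.259, 0.03)
```

Note that (7.97) requires i·S/b < 1; otherwise the quoted annuity can never be exhausted by an annuity certain at that rate.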
In the case where there is a prevailing mortality table referred to for the traditional actuarial valuation of life annuities, one can calculate the equivalent entry age x such that, according to this table and having set the discount rate i, 1/a_x coincides with the annuity rate quoted by the annuity provider. Such an age should then be compared with the actual entry age, say x_0.
Similarly, in the case where there is a prevailing mortality table referred to for the traditional actuarial valuation of life annuities, an alternative possibility is to refer to the actual age x_0 and to calculate the equivalent discount rate, that is, the rate i such that 1/a_{x_0} (based on the reference mortality table) coincides with the quoted annuity rate b/S, as is done, for example, by Verrall et al. (2006).
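The equivalent discount rate has no closed form, but a_x is strictly decreasing in i, so the rate can be found by bisection. The sketch below uses a purely hypothetical survival curve; the book's tables rely on the specific assumptions A_1(τ)–A_5(τ), which are not reproduced here:

```python
def life_annuity_value(i, surv):
    """a_x = sum_t (1+i)^-t * tpx, with surv[t-1] holding tpx."""
    return sum(p * (1 + i) ** -(t + 1) for t, p in enumerate(surv))

def equivalent_discount_rate(annuity_rate, surv, lo=-0.5, hi=1.0):
    """Rate i such that 1/a_x equals the quoted annuity rate (bisection)."""
    target = 1.0 / annuity_rate  # target actuarial value a_x
    for _ in range(100):
        mid = (lo + hi) / 2
        if life_annuity_value(mid, surv) > target:
            lo = mid  # annuity value too high -> rate too low
        else:
            hi = mid
    return (lo + hi) / 2

# hypothetical survival curve: tpx = 0.98^t over 40 years
surv = [0.98 ** t for t in range(1, 41)]
```

With a real reference table in place of the hypothetical curve, this reproduces the third column of Table 7.32.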

Example 7.10 With reference to the expected values quoted in Table 7.2 for
time 0, we perform the comparisons discussed above. We assume that the
prevailing mortality table referred to for the traditional actuarial valuation
of the life annuity is given by assumption A3 (τ). All of the other assumptions
are as in Example 7.1; in particular, the actual entry age is x0 = 65.
Table 7.30. Equivalent number of payments of an annuity certain; discount rate: i = 0.03

  Mortality     Annuity rate          Equivalent number
  assumption                          of payments
  A1(τ)         1/14.462 = 0.06915    19.247
  A2(τ)         1/14.651 = 0.06825    19.587
  A3(τ)         1/15.259 = 0.06554    20.707
  A4(τ)         1/15.817 = 0.06322    21.767
  A5(τ)         1/16.413 = 0.06093    22.938

Table 7.31. Equivalent number of payments of an annuity certain; mortality assumption: A3(τ)

  Discount      Annuity rate          Equivalent number
  rate                                of payments
  i = 0         1/21.853 = 0.04576    21.853
  i = 0.01      1/19.238 = 0.05198    21.473
  i = 0.02      1/17.071 = 0.05858    21.091
  i = 0.03      1/15.259 = 0.06554    20.707
  i = 0.04      1/13.733 = 0.07282    20.321

Tables 7.30 and 7.31 give the equivalent number of payments of an annu-
ity certain, for several quoted prices of the life annuity. In particular, in
Table 7.30 the discount rate has been kept fixed, while alternative mor-
tality assumptions have been used; in Table 7.31 the annuity rate is based
on the mortality assumption A3 (τ) while alternative levels of the discount
rate are chosen. Clearly, given the mortality table, the equivalent number
of payments of an annuity certain is higher the lower is the discount rate.
With a fixed discount rate, the equivalent number of payments is higher the
stronger is the mortality improvement implied by the table.
In Table 7.32 the reference mortality assumption is A3 (τ) and the refer-
ence discount rate is i = 0.03. First, the equivalent discount rate relating to
different mortality assumptions is calculated (third column); then the equiv-
alent rounded entry age is quoted (fourth column). We note that a lower
equivalent discount rate and a lower equivalent entry age emerge from a
stronger assumption about mortality improvements.

7.7.2 Life annuities versus income drawdown

When planning post-retirement income, some basic features of the life annuity product should be accounted for. In particular,
Table 7.32. Equivalent discount rate, equivalent entry age; reference parameters: mortality A3(τ), discount rate: i = 0.03

  Mortality     Annuity rate          Equivalent       Equivalent
  assumption                          discount rate    entry age x
  A1(τ)         1/14.462 = 0.06915    3.501%           67
  A2(τ)         1/14.651 = 0.06825    3.379%           66
  A3(τ)         1/15.259 = 0.06554    3%               65
  A4(τ)         1/15.817 = 0.06322    2.673%           64
  A5(τ)         1/16.413 = 0.06093    2.343%           62

(a) a life annuity provides the annuitant with an inflexible income, in the
sense that, if the whole fund available to the annuitant at retirement is
converted into a life annuity, the annual income is stated as defined by
the annuity rate (apart from the effect of possible profit participation
mechanisms);
(b) a more flexible income can be obtained via a partial annuitization of
the fund, or partially delaying the annuitization itself; the part of the
income not provided by the life annuity is then obtained by drawdown
from the non-annuitized fund;
(c) the life annuity product benefits from a mortality cross-subsidy, as
each life annuity in a given portfolio (or pension plan) is annually
credited with ‘mortality interests’, that is, a share of the technical pro-
visions released by the deceased annuitants, according to the mutuality
principle (see Sections 1.4 and 1.4.1 in particular).

Let us start with point (c). We refer to a life annuity issued at age x_0 with annual amount b, whose technical provision (simply denoted by V_t) is calculated according to rule (7.49) (adopting a mortality assumption A(τ)). Recursively, we may express the technical provision as follows:

\[ V_0 = S; \qquad V_{t-1}\,(1+i) = (V_t + b)\; p_{x_0+t-1}, \quad t = 1, 2, \ldots \tag{7.98} \]

where i is the technical interest rate, p_{x_0+t-1} is based on mortality assumption A(τ), and S is the single premium (see (1.28)). According to a traditional pricing structure, we may further assume

\[ S = b\, a_{x_0} \tag{7.99} \]

where a_{x_0} is calculated according to the same assumptions adopted in (7.98).
To be more realistic, we consider a (financial) profit participation mechanism. We denote as b_0 the amount of the benefit set at policy issue (so, b_0 = b, where b comes from (7.99)). Assume that in each policy year a constant (to shorten notation) extra interest rate r is credited to the reserve. As a consequence, the annual amounts b_1, b_2, ..., b_t, ... are paid out, at times 1, 2, ..., t, ..., where b_t is assessed as follows

\[ b_t = b_{t-1}\,(1+r), \quad t = 1, 2, \ldots \tag{7.100} \]

The recursion describing the behaviour of the reserve then becomes

\[ V_{t-1}\,(1+i)\,(1+r) = (V_t + b_t)\; p_{x_0+t-1}, \quad t = 1, 2, \ldots \tag{7.101} \]

or, defining 1 + i' = (1+i)(1+r), so that i' represents the total annual interest rate credited to the reserve,

\[ V_{t-1}\,(1+i') = (V_t + b_t)\; p_{x_0+t-1}, \quad t = 1, 2, \ldots \tag{7.102} \]

Rearranging (7.102), we obtain

\[ V_t - V_{t-1} = -b_t\; p_{x_0+t-1} + V_{t-1}\, i' + V_t\; q_{x_0+t-1} \tag{7.103} \]

which can be rewritten as

\[ V_t - V_{t-1} = V_{t-1}\, i' + (V_t + b_t)\; q_{x_0+t-1} - b_t \tag{7.104} \]

and, replacing V_t + b_t according to (7.102), finally as

\[ V_t - V_{t-1} = V_{t-1}\, i' + V_{t-1}\,(1+i')\,\frac{q_{x_0+t-1}}{p_{x_0+t-1}} - b_t \tag{7.105} \]

We note that (7.105) generalizes (1.13).


Recalling that V_t − V_{t−1} < 0, from (7.103) we find that the variation in the reserve is due to the following contributions:

(i) a positive contribution due to the (total) amount of interest assigned to the reserve;
(ii) a positive contribution due to mutuality;
(iii) a negative contribution due to the payment b_t.

The splitting of the variation of the reserve in a year is sketched in Fig. 1.4.
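The decomposition (7.103) can be checked numerically against the recursion (7.102). The sketch below uses hypothetical mortality rates and interest rates; none of these figures come from the book's tables:

```python
# Reserve recursion (7.102) and the decomposition (7.103) of its variation.
i_tech, r = 0.03, 0.02                 # technical and extra rates (hypothetical)
i_tot = (1 + i_tech) * (1 + r) - 1     # total rate i' credited to the reserve
q = [0.01 * 1.1 ** t for t in range(30)]   # hypothetical one-year death probs

b = [(1 + r) ** t for t in range(1, 31)]   # b_t = b_0 (1+r)^t with b_0 = 1

# run (7.102) backwards from V_30 = 0 to obtain the reserve profile
V = [0.0] * 31
for t in range(30, 0, -1):
    V[t - 1] = (V[t] + b[t - 1]) * (1 - q[t - 1]) / (1 + i_tot)

# interest + mutuality - payment must reproduce V_t - V_{t-1}, as in (7.103)
for t in range(1, 31):
    interest = V[t - 1] * i_tot          # contribution (i)
    mutuality = V[t] * q[t - 1]          # contribution (ii)
    payment = b[t - 1] * (1 - q[t - 1])  # contribution (iii), paid if alive
    assert abs((interest + mutuality - payment) - (V[t] - V[t - 1])) < 1e-9
```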
We now address item (b) in the list at the beginning of this section. As was discussed in Section 1.2.1, the annuitant may decide not to use S to buy a life annuity, but simply to invest it and receive the post-retirement income via a sequence of withdrawals (set at her/his choice). Suppose that the fund is credited each year with annual interest at the rate g. Further assume that the annuitant withdraws from the fund a sequence of amounts set to be a (constant) proportion α of the annual payments she/he would have obtained under the life annuity, that is, of the sequence (7.100).

Let F_t be the fund available at time t. We have

\[ F_0 = S; \qquad F_{t-1}\,(1+g) = \alpha\, b_t + F_t, \quad t = 1, 2, \ldots \tag{7.106} \]

simply generalizing (1.1).
As already noted in Section 1.2.1, there is a time m such that F_m ≥ 0 and F_{m+1} < 0, that is, the withdrawals b_1, b_2, ..., b_m exhaust the fund. If the lifetime of the annuitant, T_{x_0}, turns out to be lower than m, then the amount F_{T_{x_0}} is available at her/his death for bequest. However, if T_{x_0} > m, then at time m the annuitant is unfunded. To avoid early exhaustion, the annuitant should set a low level for α or look for investments with a high yield g. In the former case, however, the annual income may then become insufficient to meet current needs; in the latter case, risky assets could be involved, so that possible losses may then emerge because of fluctuating values.
Example 7.11 Let us assume that the amount S = 15.259 can be used to buy a life annuity with initial benefit b = b_0 = 1, subject to profit participation. The annuity rate b/S = 1/15.259 is based on a traditional calculation of the actuarial value of the life annuity, under the mortality assumption A3(τ) and the annual interest rate i = 0.03 (see Table 7.32, second column). We set the actual annual interest rate gained in each year on investments to be i' = 0.05, so that benefits are yearly increased by the rate r = 1.05/1.03 − 1 = 0.01942.

With the parameters mentioned above, we now refer to the case of drawdown, based on an annual consumption α b_t, t = 1, 2, .... Setting g = i' = 0.05, Fig. 7.25, panel (a), shows the share α as a function of the time m to fund exhaustion. Note that α becomes lower than 1 as soon as the time m is greater than the expected lifetime of the annuitant under scenario A3(τ) (which turns out to be 21.853 years; see also Table 7.31 for i = 0). Alternatively, setting α = 1, in panel (b) of Fig. 7.25 the required annual investment yield g is quoted, again as a function of the time to exhaustion of the fund. We note that, in this case, g exceeds i' = 0.05 as soon as m is greater than the expected lifetime of the annuitant.
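The drawdown recursion (7.106) makes the example easy to check. With α = 1 and g = 0.05, the fund survives exactly 20 full withdrawals (F_20 ≥ 0, F_21 < 0), only the example's stated parameters being used:

```python
S, i_tech, g = 15.259, 0.03, 0.05
r = (1 + g) / (1 + i_tech) - 1        # participation raises benefits at rate r
alpha = 1.0                            # withdraw the full annuity benefit

# (7.106): F_t = F_{t-1}(1+g) - alpha*b_t, with b_t = (1+r)^t and b_0 = 1
F, t = S, 0
while True:
    t += 1
    F = F * (1 + g) - alpha * (1 + r) ** t
    if F < 0:
        break
m = t - 1   # last time with a non-negative fund
```

The result m = 20 is consistent with the equivalent number of payments 20.707 in Table 7.31 for i = 0.03: dividing (7.106) by (1+r)^t turns the growing-benefit drawdown at yield 5% into a level drawdown at yield 1.05/1.0194 − 1 = 3%.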

7.7.3 The ‘mortality drag’

The absence of mutuality in an income drawdown process can be compensated (at least partially) by a higher investment yield (see Section 1.4.1). The extra return required in each year for this purpose has been called the mortality drag. However, it is worth stressing that a fixed drawdown sequence leads in any case to wealth exhaustion in a given number of years (possibly the maximum residual lifetime), whatever the interest rate may be, as was depicted in Fig. 7.25, panel (b).

[Figure: panel (a) plots the share α, and panel (b) the required annual investment yield g, each against the time m to fund exhaustion]
Figure 7.25. Annual withdrawal (panel (a)) and annual investment yield (panel (b)) as a function of the time to fund exhaustion.
Conversely, the concept of mortality drag suggests an alternative arrangement for the post-retirement income. Assume that at time 0 no life annuity is purchased, whereas some amount will be converted into a life annuity at time k, thus with a delay of k years since the retirement time. We suppose that a traditional pricing method is adopted at time k by the insurer and that the mortality assumption for the trend of the cohort is not revised during the delay period. To facilitate a comparison, we assume that the amount to be annuitized at time k must provide the annuitant with the sequence b_{k+1}, b_{k+2}, ..., whose items follow from (7.100) (assuming b_0 = b as given by (7.99)). Hence, the amount to be converted at time k into the life annuity is

\[ b_k\, a_{x_0+k} = V_k \tag{7.107} \]

with V_k originated by (7.105). Therefore, an amount funding the reserve to be set up must be provided at time k.
If the annuitant aims at getting the same income as under the life annuity also during the delay period, then the drawdown process b_1, b_2, ..., b_k must be defined. Because of the absence of mutuality, if the individual investment provides the same yield as that which the insurer is willing to recognize, then the fund available at time k, F_k, is lower than the required amount to annuitize, V_k. However, an extra return may offset the loss of 'mutuality (or mortality) returns', thus leading to F_k = V_k. The size of the extra investment yield required so that F_k = V_k can be obtained from (7.106), considered with α = 1. If i' is the yield on the life annuity product, then intuitively g − i' is an average of the annual quantities θ_{x+t} defined in Section 1.4.1. It is worthwhile stressing that, given the deferment k, the extra yield g − i' must be obtained in each of the k years of delay. Thus, g − i' is like a yield to maturity, measuring the mortality interest over k years, whereas the quantity θ_{x+t} is the extra yield specific to year (t − 1, t) (see also (1.34) and (1.35)).

Example 7.12 Under the assumptions adopted in Example 7.11 for the life annuity, Fig. 7.26 plots the extra yield required on individual investments in each of the k years of delay to compensate the loss of mutuality. Trivially, the higher is k, the higher is the required extra yield. Given that the extra yield must be realized in each of the k years of delay, this target may be very difficult to reach when the annuitization is planned for a distant time in the future.

It is worthwhile to investigate in more detail how the average mortality drag g − i' is affected by the annuity rate. From (7.106), having set α = 1,
[Figure: the extra investment yield required over the delay period, plotted against the delay k and compared with the life annuity yield]
Figure 7.26. Extra investment yield required by mortality drag.

we get

\[ F_t = S\,(1+g)^t - \sum_{h=1}^{t} b_h\,(1+g)^{t-h} \tag{7.108} \]

Let g_k be the rate g such that F_k = V_k for a given k. The rate g_k is therefore defined by the following relation:

\[ S\,(1+g_k)^k - \sum_{h=1}^{k} b_h\,(1+g_k)^{k-h} = V_k \tag{7.109} \]

Note that Fig. 7.26 actually plots the rate g_k for several choices of k. From (7.100), we can express the annual benefit at time t as

\[ b_t = b\,(1+r)^t \tag{7.110} \]

Replacing (7.110), (7.107), and (7.99) into (7.109), we obtain

\[ b\, a_{x_0}\,(1+g_k)^k - b \sum_{h=1}^{k} (1+r)^h\,(1+g_k)^{k-h} = b\,(1+r)^k\, a_{x_0+k} \tag{7.111} \]

or equivalently

\[ a_{x_0}\,(1+g_k)^k - \frac{1+r}{g_k - r}\,(1+g_k)^k + \frac{(1+r)^{k+1}}{g_k - r} = (1+r)^k\, a_{x_0+k} \tag{7.112} \]
which suggests that g_k depends on the annuity rate applied at time k, 1/a_{x_0+k}, but also on that applied at time 0, 1/a_{x_0}. The rate g_k obtained with r = 0 has been named the Implied Longevity Yield (ILY)¹; see Milevsky (2005) and Milevsky (2006).
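Given the two annuity values, g_k can be found numerically: F_k from (7.108) is increasing in g, so (7.109) can be solved by bisection. The sketch below takes a_{x_0} = 15.259 from Example 7.11, while the value assumed for a_{x_0+k} is purely hypothetical:

```python
def fund_after(g, k, r, S):
    """F_k from the drawdown recursion (7.106) with alpha = 1, b_t = (1+r)^t."""
    F = S
    for t in range(1, k + 1):
        F = F * (1 + g) - (1 + r) ** t
    return F

def required_yield(k, r, a_x0, a_x0k, lo=0.0, hi=0.5):
    """Solve (7.109), i.e. F_k = V_k = b_k * a_{x0+k}, by bisection."""
    S = a_x0                       # S = b * a_x0 with b = 1, from (7.99)
    V_k = (1 + r) ** k * a_x0k     # (7.107) with b_k = (1+r)^k
    for _ in range(100):
        mid = (lo + hi) / 2
        if fund_after(mid, k, r, S) < V_k:
            lo = mid               # fund too small -> yield too low
        else:
            hi = mid
    return (lo + hi) / 2

# r = 0 gives the Implied Longevity Yield; a_x0k = 12.0 is hypothetical
g_k = required_yield(k=10, r=0.0, a_x0=15.259, a_x0k=12.0)
```

Solving (7.109) directly through the fund recursion avoids the removable singularity at g_k = r present in the closed form (7.112).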
The delay in the purchase of the life annuity may have some advantages. In particular:

– in the case of death before time k, the fund available constitutes a bequest (which is not provided by a life annuity purchased at time 0, because of the implicit mortality cross-subsidy);
– more flexibility is gained, as the annuitant may change the annual income by modifying the drawdown sequence (with a possible change in the fund available at time k).

Conversely, a disadvantage is due to the risk of a shift to a different mortality assumption, leading to a conversion rate at time k which is less favourable to the annuity purchaser than the one in force at time 0. Further, as already noted, in the case where k is high, it may be difficult to gain the required mortality drag.

7.7.4 Flexibility in financing post-retirement income

Combining an income drawdown with a delay in the life annuity purchase constitutes an example of a post-retirement income arrangement which is more general than one consisting of a life annuity-based income only.
We now summarize what has emerged in the previous sections, thereby
defining a general framework for a discussion of post-retirement income
planning. Our focus will be mostly on mortality issues, to keep the presen-
tation in line with the main scope of the chapter. Nevertheless, important
financial aspects should not be disregarded when assessing and comparing
the several opportunities of meeting post-retirement income needs.
We assume that an accumulation process takes place during the work-
ing period of an individual. After retirement, a decumulation process takes
place and hence income requirements are met using, in some way, the
accumulated fund.
Figure 7.27 illustrates the process consisting of:

1. the accumulation of contributions during the working period;

¹ Registered trademarks and property of CANNEX Financial Exchanges.



[Figure: contributions (before retirement) feed a non-annuitized fund, which earns interest and finances income drawdown after retirement; annuity purchases move money into an annuitized fund, which earns interest and mortality credits and finances the annuity payment after retirement]
Figure 7.27. Accumulation process and post-retirement income.

2. a (possible) annuitization of (part of) the accumulated fund (before or after retirement);
3. receiving a post-retirement income from life annuities or through income drawdown.

The annuitization of (part of) the accumulated fund consists of purchasing a deferred life annuity if annuitization takes place during the accumulation period, and an immediate life annuity otherwise. Hence, at any time, the resources available for financing post-retirement income are shared between a non-annuitized and an annuitized fund. It is reasonable to assume that a higher degree of flexibility in selecting investment opportunities is attached to the non-annuitized fund.
We note that the non-annuitized fund builds up because of contributions
and investment returns. Conversely, the annuitized fund builds up because
of investment returns and mortality, as the fund coincides with the total
mathematical reserve of the life annuities purchased, and hence it benefits
from the cross-subsidy effect.
Figures 7.28 and 7.29 illustrate a possible behaviour of the non-annuitized and the annuitized fund, respectively. Effects of the life annuity purchase (jumps in the processes), of the income drawdown and of the annuity payment are identified.

The slope of the non-annuitized fund depends, while the fund itself is increasing, on both contributions and interest earnings, whereas it depends on the drawdown policy while the fund is decreasing. As regards the annuitized fund, as previously noted, its slope depends on interest and mortality,
[Figure: the non-annuitized fund grows during the accumulation period and is depleted by income drawdown during the post-retirement period, with downward jumps at each life annuity purchase]
Figure 7.28. The non-annuitized fund.

[Figure: the annuitized fund jumps upwards at each life annuity purchase and declines during the post-retirement period as the life annuity is paid]
Figure 7.29. The annuitized fund.

while it is increasing, whereas it also depends on the annuity payment while decreasing.
Let us denote by F_t^{[NA]} and F_t^{[A]} the values of the non-annuitized and the annuitized fund, respectively, at time t. The 'degree' of the annuitization policy can be summarized by the annuitization ratio ar(t), defined as follows:

\[ ar(t) = \frac{F_t^{[A]}}{F_t^{[A]} + F_t^{[NA]}} \tag{7.113} \]

Note that, obviously, 0 ≤ ar(t) ≤ 1; ar(t) = 0 means that up to time t no life annuity has been purchased, whilst ar(t) = 1 means that at time t the whole fund available consists of reserves related to purchased life annuities.
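A toy bookkeeping sketch of the two funds and of the ratio (7.113); all cash flows and rates below are hypothetical, and interest and mortality credits after the purchase are ignored for brevity:

```python
def annuitization_ratio(F_na, F_a):
    """Annuitization ratio (7.113), with ar = 0 for an empty total fund."""
    total = F_na + F_a
    return F_a / total if total > 0 else 0.0

# hypothetical history: contribute 1 per year for 10 years at 4% interest,
# then annuitize 60% of the accumulated fund at retirement
F_na, F_a, history = 0.0, 0.0, []
for year in range(1, 11):
    F_na = F_na * 1.04 + 1.0           # contributions into the free fund
    history.append(annuitization_ratio(F_na, F_a))

moved = 0.6 * F_na                      # annuity purchase: jump between funds
F_na -= moved
F_a += moved
history.append(annuitization_ratio(F_na, F_a))
```

Before the purchase ar(t) stays at 0; the purchase produces the jump to ar = 0.6 (a partial annuitization, between the two extreme profiles of Fig. 7.30).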
[Figure: annuitization ratio over time; arrangement (1), the deferred life annuity, sits at 100% throughout, while arrangement (2), income drawdown only, stays at 0%]
Figure 7.30. Arrangements: (1) deferred life annuity; (2) income drawdown.

Example 7.13 Figures 7.30–7.33 illustrate some strategies for financing


post-retirement income. In most cases, the technical tool provided by the
life annuity is involved. The various strategies are described in terms of the
annuitization ratio profile; thus, the value of ar(t) is plotted against time t.
To improve understanding, we suppose that a specified mortality assump-
tion is adopted when annuitizing (a part of) the accumulated fund and that
the assumption itself cannot be replaced in relation to the purchased
annuity, whatever the mortality trend might be (so that a guaranteed annuity is
involved).
Figure 7.30 illustrates two ‘extreme’ choices. Choice (1) consists of build-
ing up a traditional deferred life annuity. In this case, each amount paid to
the accumulation fund (possibly a level premium, or a single recurrent pre-
mium) is immediately converted into a deferred life annuity; this way, the
accumulated fund is completely annuitized. Post-retirement income require-
ments are met by the life annuity (a flat annuity or, possibly, a rising profile
annuity, viz an escalating annuity or an inflation-linked annuity).
Choice (2) represents the opposite extreme. There is no annuitization
operating, so that income requirements are fulfilled by income draw-
down, which implies spreading the fund accumulated at retirement over the
future life expectation, according to some spreading rule. Sometimes annu-
itants prefer this choice because of the high degree of freedom in selecting
investment opportunities even during the post-retirement period.
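One concrete spreading rule — a hypothetical sketch, not one prescribed in the text — recomputes the payment each year as the current fund divided by the remaining life expectancy at that age; the expectancy figures used below are illustrative only:

```python
def simulate_drawdown(fund, interest, expectancies):
    """Income drawdown under a simple spreading rule: each year draw
    fund / e, where e is the remaining life expectancy at that age
    (the expectancies are assumed, illustrative values)."""
    payments = []
    for e in expectancies:
        payment = fund / e
        payments.append(payment)
        fund = (fund - payment) * (1.0 + interest)  # residual fund earns interest
    return payments, fund

# Fund of 100,000 at retirement, 3% interest, three illustrative years.
payments, residual = simulate_drawdown(100_000.0, 0.03, [20.0, 19.2, 18.4])
```

Under such a rule the mortality risk stays with the annuitant: depending on actual longevity, the fund may be exhausted early or may leave a residual at death.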
Figure 7.31. Immediate life annuity. [Annuitization ratio plotted against time: 0% during the accumulation period, jumping to 100% at retirement.]

It should be stressed that choice (1) leads to an inflexible post-retirement
income, whilst choice (2) allows the annuitant to adopt a spreading rule
consistent with a specific income profile. Conversely, it is worth noting
that arrangement (1) completely transfers the mortality risk (including its
longevity component) to the insurer, whilst according to arrangement (2)
the mortality risk remains completely with the annuitant (see Section 7.7.2).
In more general terms, the process of transferring mortality risk depends
on the annuitization profile: thus, the portion of mortality risk trans-
ferred from the annuitant to the insurer increases as the annuitization ratio
increases. The following arrangements provide practical examples of how
mortality risk can be transferred, as time goes by, to the insurer.
The annuitization of the fund at retirement time only is illustrated in
Fig. 7.31, which depicts the particular case of a complete annuitization of
the fund available at retirement. This arrangement can be realized through
purchasing a single-premium life annuity, and is characterized by flexibility
in the investment choice during the accumulation period. Conversely, it
produces an inflexible post-retirement income profile.
In Fig. 7.32, the annuitization ratio increases during the accumulation
period because of positive jumps corresponding to the purchase of life annu-
ities with various deferment periods. The behaviour of the annuitization
ratio between jumps obviously depends on the contributions and the inter-
est earnings affecting the non-annuitized fund as well as on the financial
and mortality experience of the annuitized fund.
In contrast, Fig. 7.33 illustrates the case in which no annuitization is made
throughout the accumulation period, whereas the fund available after the
retirement date is partially used (with delays) to purchase life annuities;
such a process is sometimes called staggered annuitization or staggered
vesting.

Figure 7.32. Combined life annuities. [Annuitization ratio plotted against time: positive jumps towards 100% during the accumulation period.]

Figure 7.33. Staggered annuitization. [Annuitization ratio plotted against time: positive jumps towards 100% during the post-retirement period.]

The behaviour of the ratio between jumps depends on the interest
earnings and income drawdown as regards the non-annuitized fund as well
as financial and mortality experience of the annuitized fund.
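These dynamics can be sketched numerically. The recursion below is a deliberately stylised sketch (deterministic interest, a crude 'mortality credit' standing in for the mortality experience of the annuitized fund; all parameter values are hypothetical): part of the non-annuitized fund jumps into the annuitized fund in some years, and the annuitization ratio is tracked between jumps.

```python
def step(f_na, f_a, *, contribution=0.0, drawdown=0.0,
         annuitized=0.0, i_na=0.03, i_a=0.03, mortality_credit=0.01):
    """One-year recursion for the two funds (stylised sketch).

    annuitized      : amount jumping from the non-annuitized to the
                      annuitized fund at the start of the year;
    mortality_credit: crude proxy for the mortality experience of the
                      annuitized fund (hypothetical value).
    """
    f_na = (f_na - annuitized + contribution - drawdown) * (1.0 + i_na)
    f_a = (f_a + annuitized) * (1.0 + i_a + mortality_credit)
    return f_na, f_a

f_na, f_a = 100_000.0, 0.0
history = []
# Two annuitization jumps (years 2 and 4), income drawdown throughout.
for jump in [0.0, 30_000.0, 0.0, 30_000.0]:
    f_na, f_a = step(f_na, f_a, drawdown=4_000.0, annuitized=jump)
    history.append(f_a / (f_a + f_na))  # annuitization ratio ar(t)
```

In this toy run the ratio starts at zero, rises at each jump, and drifts between jumps as drawdown shrinks the non-annuitized fund while the mortality credit accrues to the annuitized one.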
Arrangements like those illustrated by Figs. 7.32 and 7.33 are charac-
terized by a high degree of flexibility as regards both the post-retirement
income profile and the choice of investment opportunities available for the
non-annuitized fund. □
The framework proposed above clearly shows the wide range of choices
leading to different annuitization strategies. Thus, suitable investment and
life annuity products can be designed to meet the different needs and
preferences of clients. An example in this regard is given by the solutions
providing natural hedging across time (Section 7.3.2), such as the money-
back annuity with death benefit (7.32), which is designed so that at some
future time the death benefit reduces to zero. We note that, as long as the
death benefit is positive, the fund is only partially annuitized. As soon
as the death benefit reduces to zero, the fund turns out
to be fully annuitized. Thus, an annuitization strategy is embedded in the
structure of money-back annuities.

7.8 References and suggestions for further reading


In this section we summarize the main contributions on the topics dealt
with in this chapter, some of which have already been mentioned while
addressing specific issues. The purpose here, however, is to add further
references to those previously cited.
An informal and comprehensive description of longevity risk, and in par-
ticular of the relevant financial impact on life annuities, is provided by
Richard and Jones (2004). See also Riemer-Hommel and Trauth (2000).
A static framework for representing the longevity risk according to a
probabilistic approach has been used, for example, by Olivieri (2001),
Olivieri and Pitacco (2002a), Olivieri and Pitacco (2003). Olivieri and
Pitacco (2002a) suggest a Bayesian-inferential procedure for updating the
weighting distribution. Marocco and Pitacco (1998) adopt a continuous
probability distribution for weighting the alternative scenarios. A dynamic
probabilistic approach to longevity risk modelling has been proposed,
among others, by Biffis (2005), Dahl (2004), and Cairns et al. (2006b). Biffis
and Denuit (2006) introduce, in particular, a class of stochastic forces of
mortality that generalize the Lee–Carter model. The static and the dynamic
probabilistic approaches to randomness in mortality trend are addressed by
Tuljapurkar and Boe (1998).
The investigation in Section 7.2, and in Section 7.2.3 in particular, is
based on Olivieri (2001). The analysis of the random value of future bene-
fits is addressed also by Biffis and Olivieri (2002), where a pension scheme
(or a group insurance) providing a range of life and death benefits is referred
to. Following Olivieri (2001), Coppola et al. (2000) provide an investigation
also addressing financial risk for life annuity portfolios. In the Lee–Carter
framework, given that the future path of the time index is unknown and
modelled as a stochastic process, the policyholders’ lifetimes become depen-
dent on each other. Consequently, systematic risk is involved. Denuit and
Frostig (2007a) study this aspect of the Lee–Carter model, in particular
considering solvency issues. Denuit and Frostig (2007b) further study the
distribution of the present value of benefits in a run-off perspective. As
the exact distribution turns out to be difficult to compute, various approx-
imations and bounds are derived. Denuit (2008) summarizes the results
obtained in this field.
The literature on risk management in industry and business in general
is very extensive. For an introduction to the relevant topics the reader can
refer, for example, to Harrington and Niehaus (1999), and to Williams,
Smith and Young (1998). Various textbooks address specific phases of
the risk management process. For example, Koller (1999) focuses on the
risk assessment in the risk management process for business and industry,
whereas Wilkinson, Tiller, Blinn, and Kelly (1990) deal with the topic of risk
financing. Pitacco (2007) addresses mortality and longevity risk within a
risk management perspective.
Several investigations have been performed with regard to natural hedg-
ing. As far as portfolio diversification effects are concerned, the reader may
refer to Cox and Lin (2007), where the results of an empirical investigation
concerning the US market are discussed. With regard to arrangements on
a per-policy basis, some possible designs referring to pension schemes with
combined benefits are discussed in Biffis and Olivieri (2002). Gründl et al.
(2006) analyse natural hedging from the perspective of the maximization
of shareholder value and show, under suitable assumptions, that natural
hedging may not be optimal in this regard.
Solvency investigations for portfolios of life annuities are dealt with by
Olivieri and Pitacco (2003). Solvency issues within a Lee–Carter framework
are discussed by Denuit and Frostig (2007a). A review of solvency systems
is provided by Sandström (2006); when the longevity risk is addressed, typ-
ically the required capital in this respect is set as a share of the technical
provision. The most recent regulatory development is the evolving
Solvency 2 system, where the required capital is the expected change in the
net asset value in the case of a permanent shock to survival rates; see, for
example, CEIOPS (2007) and CEIOPS (2008). The idea of assessing the required
capital by comparing assets to the random value of future payments, exam-
ined in Section 7.3.3, has been put forward, for the life business in general,
by Faculty of Actuaries Working Party (1986).
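To make the Solvency 2 idea concrete, the toy calculation below revalues an immediate life annuity after a permanent reduction of mortality rates and reads the required capital off the change in the liability (with assets held fixed, this equals the change in net asset value in size). The life table and the 25% shock are purely illustrative, loosely echoing the QIS specifications.

```python
def annuity_value(qx, v=1.0 / 1.03):
    """Expected present value of a whole-life annuity-due of 1 per year,
    given a vector of one-year death probabilities qx (toy life table)."""
    epv, survival, discount = 0.0, 1.0, 1.0
    for q in qx:
        epv += survival * discount  # payment at the start of each year survived
        survival *= (1.0 - q)
        discount *= v
    return epv

# Purely illustrative mortality rates for ages 65, 66, ... (not a real table).
qx = [0.01 * 1.1 ** k for k in range(45)]

base = annuity_value(qx)
# Permanent longevity shock: mortality rates reduced by 25% at all ages
# (shock size echoing the QIS exercises; illustrative here).
shocked = annuity_value([0.75 * q for q in qx])

required_capital = shocked - base  # increase in liability under the shock
```

Lower mortality lengthens expected survival, so the shocked liability exceeds the base one and the difference is the capital to be held against the shock.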
Reinsurance arrangements for longevity risk have not received much
attention in the literature, due to the practical difficulty of transferring
the systematic risk. A Stop-Loss reinsurance on the assets has been
proposed by Marocco and Pitacco (1998), to which the reader is referred
for some numerical examples, evaluated using both analytical and simulation
methods. Olivieri (2005) deals, in a more formal setting, with both XL
and Stop-Loss treaties, analysing the effectiveness of these arrangements in
terms of the capital the insurer must allocate to face the residual longevity
risk not covered by the reinsurer. Olivieri and Pitacco (2008) refer to a
swap-like arrangement, in the context of the valuation of a life annuity
portfolio. Cox and Lin (2007) also design a swap-like arrangement, based
on natural hedging arguments.
In contrast, considerable attention has been devoted in the recent litera-
ture to longevity bonds. Securitization of risks in general is described by Cox
et al. (2000). The life insurance case is considered by Cowley and Cummins
(2005). A mortality-indexed bond is described in Morgan Stanley-Equity
Research Europe (2003). Various structures for longevity bonds have been
proposed by Lin and Cox (2005), Lin and Cox (2007), Blake and Burrows
(2001), Dowd (2003), Blake et al. (2006a), Blake et al. (2006b), Dowd et al.
(2006), Olivieri and Pitacco (2008), Denuit et al. (2007). Pricing problems
are also dealt with in Cairns et al. (2006b) and Denuit et al. (2007), the
latter, in particular, working within the classical Lee–Carter model.
The pricing of longevity risk has been addressed also in the framework
of portfolio valuation. Biffis and Millossovich (2006a) consider in partic-
ular new business. Olivieri and Pitacco (2008) design a valuation setting,
however without solving the problem of the appropriate stochastic mor-
tality model to use. Friedberg and Webb (2005) analyse the pricing of the
aggregate mortality risk in relation to the cost of capital of the insurance
company. With reference to the problem of pricing a life annuity, Denuit
and Frostig (2008) explain how to determine a conservative life table serv-
ing as first-order mortality basis, starting from a best-estimate of future
mortality.
Many recent papers deal with the pricing and valuation of insurance
products including an option to annuitize; see, for example, Milevsky and
Promislov (2001), O’Brien (2002), Wilkie et al. (2003), Boyle and Hardy
(2003), Ballotta and Haberman (2003), Pelsser (2003), Ballotta and Haber-
man (2006), Biffis and Millossovich (2006b). Some of them, mainly deal in
detail with financial aspects.
Innovative ideas and proposals for structuring post-retirement benefits
are presented and discussed in the reports by the Department for Work and
Pensions (2002) in the United Kingdom, and the Retirement Choice Work-
ing Party (2001). The paper by Wadsworth et al. (2001) suggests a technical
structure for a fund providing annuities. A comprehensive description of
several annuities markets is provided by Cardinale et al. (2002). Piggott
et al. (2005) describe Group Self-Annuitization schemes, which provide an
example of flexible GAR; however, the benefit in this case is not guaranteed.
Money-back annuities in the United Kingdom represent an interesting annu-
itization strategy; see Boardman (2006). Income drawdown issues within
the context of defined contribution pension plans are discussed by Emms
and Haberman (2008), Gerrard et al. (2006). An extensive presentation of
issues concerning financing the post-retirement income is given by Milevsky
(2006). An informal description of private solutions is provided by Swiss
Re (2007).
The reader interested in the impact of longevity risk on living benefits
other than life annuities can refer, for example, to Olivieri and Ferri (2003),
Olivieri and Pitacco (2002c), Olivieri and Pitacco (2002b). See also Pitacco
(2004b), where both life insurance and other living benefits are considered.
References

Alho, J. M. (2000). Discussion of Lee (2000). North American Actuarial Journal, 4(1), 91–93.
Andreev, K. F. and Vaupel, J. W. (2006). Forecasts of cohort mortality
after age 50. Technical report.
Ballotta, L. and Haberman, S. (2003). Valuation of guaranteed annuity
conversion options. Insurance: Mathematics & Economics, 33, 87–108.
Ballotta, L. and Haberman, S. (2006). The fair valuation problem of
guaranteed annuity options: The stochastic mortality environment case.
Insurance: Mathematics & Economics, 38(1), 195–214.
Baran, S., Gall, J., Ispany, M., and Pap, G. (2007). Forecasting Hungarian mortality rates using the Lee–Carter method. Acta Oeconomica, 57, 21–34.
Barnett, H. A. R. (1960). The trends of population mortality and
assured lives’ mortality in Great Britain. In Transactions of the 16th
International Congress of Actuaries, Volume 2, Bruxelles, pp. 310–326.
Beard, R. E. (1952). Some further experiments in the use of the incomplete
gamma function for the calculation of actuarial functions. Journal of the
Institute of Actuaries, 78, 341–353.
Beard, R. E. (1959). Note on some mathematical mortality models. In
CIBA Foundation Colloquia on Ageing (ed. G. E. W. Wolstenholme
and M. O’Connor), Volume 5, Boston, pp. 302–311.
Beard, R. E. (1971). Some aspects of theories of mortality, cause of death
analysis, forecasting and stochastic processes. In Biological aspects of
demography (ed. W. Brass), pp. 57–68. Taylor & Francis, London.
Bell, W. R. (1997). Comparing and assessing time series methods for
forecasting age-specific fertility and mortality rates. Journal of Official
Statistics, 13, 279–303.
Benjamin, B. and Pollard, J. H. (1993). The analysis of mortality and other
actuarial statistics. The Institute of Actuaries, Oxford.
Benjamin, J. and Soliman, A. S. (1993). Mortality on the move. Actuarial
Education Service, Oxford.
Biffis, E. (2005). Affine processes for dynamic mortality and actuarial
valuations. Insurance: Mathematics & Economics, 37(3), 443–468.
Biffis, E. and Denuit, M. (2006). Lee–Carter goes risk-neutral: an application to the Italian annuity market. Giornale dell’Istituto Italiano degli Attuari, 69, 33–53.
Biffis, E. and Millossovich, P. (2006a). A bidimensional approach to
mortality risk. Decisions in Economics and Finance, 29, 71–94.
Biffis, E. and Millossovich, P. (2006b). The fair value of guaranteed
annuity options. Scandinavian Actuarial Journal, 1, 23–41.
Biffis, E. and Olivieri, A. (2002). Demographic risks in pension
schemes with combined benefits. Giornale dell’Istituto Italiano degli
Attuari, 65(1–2), 137–174.
Black, K. and Skipper, H. D. (2000). Life & health insurance. Prentice
Hall, New Jersey.
Blake, D., Cairns, A. J., and Dowd, K. (2007). Facing up to the uncertainty
of life: the longevity fan charts. Technical Report.
Blake, D. and Burrows, W. (2001). Survivor bonds: helping to
hedge mortality risk. The Journal of Risk and Insurance, 68(2),
339–348.
Blake, D., Cairns, A. J. G., and Dowd, K. (2006a). Living with mortality:
longevity bonds and other mortality-linked securities. British Actuarial
Journal, 12, 153–228.
Blake, D., Cairns, A. J. G., Dowd, K., and MacMinn, R. (2006b).
Longevity bonds: financial engineering, valuation, and hedging. The
Journal of Risk and Insurance, 73(4), 647–672.
Blake, D. and Hudson, R. (2000). Improving security and flexibility in
retirement. Retirement Income Working Party, London.
Blaschke, E. (1923). Sulle tavole di mortalità variabili col tempo. Giornale
di Matematica Finanziaria, 5, 1–31.
Boardman, T. (2006). Annuitization lessons from the UK: money-back
annuities and other developments. The Journal of Risk and Insur-
ance, 73(4), 633–646.
Booth, H. (2006). Demographic forecasting: 1980 to 2005 in review.
International Journal of Forecasting, 22(3), 547–581.
Booth, H., Hyndman, R. J., Tickle, L., and De Jong, P. (2006). Lee–Carter mortality forecasting: a multi-country comparison of variants and extensions. Technical Report.
Booth, H., Maindonald, J., and Smith, L. (2002). Applying Lee–Carter
under conditions of variable mortality decline. Population Stud-
ies, 56(3), 325–336.
Booth, H., Tickle, L., and Smith, L. (2005). Evaluation of the variants
of the Lee–Carter method of forecasting mortality: a multi-country
comparison. New Zealand Population Review, 31, 13–34.
Booth, P., Chadburn, R., Haberman, S., James, D., Khorasanee, Z., Plumb, R., and Rickayzen, B. (2005). Modern actuarial theory and practice. Boca Raton: Chapman & Hall/CRC.
Bourgeois-Pichat, J. (1952). Essai sur la mortalité “biologique” de
l’homme. Population, 7(3), 381–394.
Bowers, N. L., Gerber, H. U., Hickman, J. C., Jones, D. A., and Nes-
bitt, C. J. (1997). Actuarial mathematics. The Society of Actuaries,
Schaumburg, Illinois.
Boyle, P. and Hardy, M. (2003). Guaranteed annuity options. ASTIN
Bulletin, 33, 125–152.
Brass, W. (1974). Mortality models and their uses in demography.
Transactions of the Faculty of Actuaries, 33, 123–132.
Brillinger, D. R. (1986). The natural variability of vital rates and associated
statistics. Biometrics, 42, 693–734.
Brouhns, N. and Denuit, M. (2002). Risque de longévité et rentes viagères.
II. Tables de mortalité prospectives pour la population belge. Belgian
Actuarial Bulletin, 2, 49–63.
Brouhns, N., Denuit, M., and Van Keilegom, I. (2005). Bootstrapping the Poisson log-bilinear model for mortality forecasting. Scandinavian Actuarial Journal, (3), 212–224.
Brouhns, N., Denuit, M., and Vermunt, J. K. (2002a). Measuring the
longevity risk in mortality projections. Bulletin of the Swiss Association
of Actuaries, 2, 105–130.
Brouhns, N., Denuit, M., and Vermunt, J. K. (2002b). A Poisson log-
bilinear approach to the construction of projected lifetables. Insurance:
Mathematics & Economics, 31(3), 373–393.
Buettner, T. (2002). Approaches and experiences in projecting mortality
patterns for the oldest-old. North American Actuarial Journal, 6(3),
14–25.
Butt, Z. and Haberman, S. (2002). Application of frailty-based mortality
models to insurance data. Actuarial Research Paper No. 142, Dept. of
Actuarial Science and Statistics, City University, London.
Butt, Z. and Haberman, S. (2004). Application of frailty-based mortality
models using generalized linear models. ASTIN Bulletin, 34(1), 175–
197.
Buus, H. (1960). Investigations on mortality variations. In Transactions
of the 16th International Congress of Actuaries, Volume 2, Bruxelles,
pp. 364–378.
Cairns, A. J. G., Blake, D., and Dowd, K. (2006a). Pricing death: frame-
works for the valuation and securitization of mortality risk. ASTIN
Bulletin, 36(1), 79–120.
Cairns, A. J. G., Blake, D., and Dowd, K. (2006b). A two-factor model for
stochastic mortality with parameter uncertainty: theory and calibration.
The Journal of Risk and Insurance, 73(4), 687–718.
Cairns, A., Blake, D., Dowd, K., Coughlan, G., Epstein, D., Ong, A.
and Balevich, I. (2007) A quantitative comparison of stochastic mor-
tality models using data from England and Wales and the United States.
Pensions Institute Discussion Paper PI-0701, Cass Business School, City
University.
Cardinale, M., Findlater, A., and Orszag, M. (2002). Paying out pensions.
A review of international annuities markets. Research report, Watson
Wyatt.
Carter, L. and Lee, R. D. (1992). Modelling and forecasting US sex
differentials in mortality. International Journal of Forecasting, 8,
393–411.
Carter, L. R. (1996). Forecasting U.S. mortality: a comparison of Box–Jenkins ARIMA and structural time series models. The Sociological Quarterly, 37(1), 127–144.
Catalano, R. and Bruckner, T. (2006). Child mortality and cohort lifespan:
a test of diminished entelechy. International Journal of Epidemiology,
35, 1264–1269.
CEIOPS (2007). QIS3. Technical specifications. Part I: Instructions.
CEIOPS (2008). QIS4. Technical specifications.
Champion, R., Lenard, C. T., and Mills, T. M. (2004). Splines. In Ency-
clopedia of actuarial science (ed. J. L. Teugels and B. Sundt), Volume 3,
pp. 1584–1586. John Wiley & Sons.
CMI (2002). An interim basis for adjusting the “92” series mortality pro-
jections for cohort effects. Working Paper 1, The Faculty of Actuaries
and Institute of Actuaries.
CMI (2005). Projecting future mortality: towards a proposal for a stochas-
tic methodology. Working paper 15, The Faculty of Actuaries and
Institute of Actuaries.
CMI (2006). Stochastic projection methodologies: Further progress and
P-spline model features, example results and implications. Working
Paper 20, The Faculty of Actuaries and Institute of Actuaries.
CMIB (1978). Report no. 3. Continuous Mortality Investigation Bureau,
Institute of Actuaries and Faculty of Actuaries.
CMIB (1990). Report no. 10. Continuous Mortality Investigation Bureau,
Institute of Actuaries and Faculty of Actuaries.
CMIB (1999). Report no. 17. Continuous Mortality Investigation Bureau,
Institute of Actuaries and Faculty of Actuaries.
Coale, A. and Kisker, E. E. (1990). Defects in data on old age mortal-
ity in the United States: new procedures for calculating approximately
accurate mortality schedules and life tables at the highest ages. Asian
and Pacific Population Forum, 4, 1–31.
Congdon, P. (1993). Statistical graduation in local demographic anal-
ysis and projection. Journal of the Royal Statistical Society, A, 156,
237–270.
Coppola, M., Di Lorenzo, E., and Sibillo, M. (2000). Risk sources in a
life annuity portfolio: decomposition and measurement tools. Journal
of Actuarial Practice, 8(1–2), 43–61.
Cossette, H., Delwarde, A., Denuit, M., Guillot, F., and Marceau, E.
(2007). Pension plan valuation and dynamic mortality tables. North
American Actuarial Journal, 11, 1–34.
Cowley, A. and Cummins, J. D. (2005). Securitization of life insurance
assets and liabilities. The Journal of Risk and Insurance, 72(2), 193–
226.
Cox, S. H., Fairchild, J. R., and Pedersen, H. W. (2000). Economic aspects
of securitization of risk. ASTIN Bulletin, 30(1), 157–193.
Cox, S. H. and Lin, Y. (2007). Natural hedging of life and annuity mortality risks. North American Actuarial Journal, 11, 1–15.
Cramér, H. and Wold, H. (1935). Mortality variations in Sweden: a
study in graduation and forecasting. Skandinavisk Aktuarietidskrift, 18,
161–241.
Crimmins, E. and Finch, C. (2006). Infection, inflammation, height
and longevity. Proceedings of the National Academy Sciences, 103,
498–503.
Cummins, J. D., Smith, B. D., Vance, R. N., and VanDerhei, J. L. (1983).
Risk classification in life insurance. Kluwer-Nijhoff Publishing, Boston,
The Hague, London.
Czado, C., Delwarde, A., and Denuit, M. (2005). Bayesian Poisson
log-bilinear mortality projections. Insurance: Mathematics & Eco-
nomics, 36(3), 260–284.
Dahl, M. (2004). Stochastic mortality in life insurance. Market reserves
and mortality-linked insurance contracts. Insurance: Mathematics &
Economics, 35(1), 113–136.
Dahl, M. and Møller, T. (2006). Valuation and hedging of life insur-
ance liabilities with systematic mortality risk. Insurance: Mathematics
& Economics, 39(2), 193–217.
Davidson, A. R. and Reid, A. R. (1927). On the calculation of rates of
mortality. Transactions of the Faculty of Actuaries, 11(105), 183–232.
Davy Smith, G., Hart, C., Blane, D., and Hole, D. (1998) Adverse socio-
economic conditions in childhood and cause specific adult mortality: a
prospective observational study. British Medical Journal, 316, 1631–
1635.
De Jong, P. and Tickle, L. (2006). Extending the Lee–Carter model of mortality projection. Mathematical Population Studies, 13, 1–18.
Delwarde, A. and Denuit, M. (2006). Construction de tables de mortalité
périodiques et prospectives. Ed. Economica, Paris.
Delwarde, A., Denuit, M., and Eilers, P. (2007a). Smoothing the Lee–
Carter and Poisson log-bilinear models for mortality forecasting.
Statistical Modelling, 7, 29–48.
Delwarde, A., Denuit, M., and Partrat, Ch. (2007b). Negative binomial
version of the Lee–Carter model for mortality forecasting. Applied
Stochastic Models in Business and Industry, 23, 385–401.
Delwarde, A., Denuit, M., Guillen, M., and Vidiella, A. (2006). Applica-
tion of the Poisson log-bilinear projection model to the G5 mortality
experience. Belgian Actuarial Bulletin, 6, 54–68.
Delwarde, A., Kachakhidze, D., Olié, L., and Denuit, M. (2004). Mod-
èles linéaires et additifs généralisés, maximum de vraisemblance local
et méthodes relationnelles en assurance sur la vie. Bulletin Français
d’Actuariat, 6, 77–102.
Denuit, M. (2007). Comonotonic approximations to quantiles of life
annuity conditional expected present values. Insurance: Mathematics
& Economics, 42, 831–838.
Denuit, M. (2008). Life annuities with stochastic survival probability: a
review. Methodology and Computing in Applied Probability, to appear.
Denuit, M., Devolder, P., and Goderniaux, A.C. (2007). Securitization of
longevity risk: pricing survivor bonds with Wang transform in the Lee–
Carter framework. The Journal of Risk and Insurance, 74(1), 87–113.
Denuit, M. and Dhaene, J. (2007). Comonotonic bounds on the sur-
vival probabilities in the Lee–Carter model for mortality projections.
Computational and Applied Mathematics, 203, 169–176.
Denuit, M., Dhaene, J., Goovaerts, M. J., and Kaas, R. (2005). Actuarial
theory for dependent risks: measures, orders and models. Wiley, New
York.
Denuit, M. and Frostig, E. (2007a). Association and heterogeneity of
insured lifetimes in the Lee–Carter framework. Scandinavian Actuarial
Journal, 107, 1–19.
Denuit, M. and Frostig, E. (2007b). Life insurance mathematics with random life tables. WP 07-07, Institut des Sciences Actuarielles, Université Catholique de Louvain, Louvain-la-Neuve, Belgium.
Denuit, M. and Frostig, E. (2008). First-order mortality basis for life
annuities. The Geneva Risk and Insurance Review, to appear.
Denuit, M. and Goderniaux, A.-C. (2005). Closing and projecting life
tables using log-linear models. Bulletin of the Swiss Association of
Actuaries (1), 29–48.
Denuit, M. and Vermandele, C. (1998). Optimal reinsurance and stop-loss order. Insurance: Mathematics & Economics, 22, 229–233.
Department for Work and Pensions (2002). Modernising annuities.
Technical Report, Inland Revenue, London.
Dowd, K. (2003). Survivor bonds: A comment on Blake and Burrows. The
Journal of Risk and Insurance, 70(2), 339–348.
Dowd, K., Blake, D., Cairns, A. J. G., and Dawson, P. (2006). Survivor
swaps. The Journal of Risk and Insurance, 73(1), 1–17.
Durban, M., Currie, I. D., and Eilers, P. H. C. (2004). Smoothing and forecasting mortality rates. Statistical Modelling, 4, 279–298.
Eilers, P. H. C. and Marx, B. D. (1996). Flexible smoothing with B-splines and penalties. Statistical Science, 11, 89–121.
Emms, P. and Haberman, S. (2008). Income drawdown schemes for a
defined contribution pension plan. Journal of Risk and Insurance, 75(3),
739–761.
Evandrou, M. and Falkingham, J. (2002). Smoking behaviour and socio-economic class: a cohort analysis, 1974 to 1998. Health Statistics Quarterly, 14, 30–38.
Faculty of Actuaries Working Party (1986). The solvency of life assurance
companies. Transactions of the Faculty of Actuaries, 39(3), 251–340.
Felipe, A., Guillèn, M., and Perez-Marin, A. M. (2002). Recent mortality
trends in the Spanish population. British Actuarial Journal, 8, 757–786.
Finetti, de, B. (1950). Matematica attuariale. Quaderni dell’Istituto per
gli Studi Assicurativi (Trieste), 5, 53–103.
Finetti, de, B. (1957). Lezioni di matematica attuariale. Edizioni Ricerche,
Roma.
Forfar, D. O. (2004a). Life table. In Encyclopedia of Actuarial Science (ed.
J. L. Teugels and B. Sundt), Volume 2, pp. 1005–1009. John Wiley &
Sons.
Forfar, D. O. (2004b). Mortality laws. In Encyclopedia of actuarial sci-
ence (ed. J. L. Teugels and B. Sundt), Volume 2, pp. 1139–1145. John
Wiley & Sons.
Forfar, D. O., McCutcheon, J. J., and Wilkie, A. D. (1988). On graduation
by mathematical formulae. Journal of the Institute of Actuaries, 115,
1–149.
Forfar, D. O. and Smith, D. M. (1988). The changing shape of English
Life Tables. Transactions of the Faculty of Actuaries, 40, 98–134.
Francis, B., Green, M., and Payne, C. (1993). The GLIM system: Release
4 Manual. Clarendon Press, Oxford.
Friedberg, L. and Webb, A. (2005). Life is cheap: using mortality bonds to
hedge aggregate mortality risk. WP No. 2005-13, Center for Retirement
Research at Boston College.
Gerber, H. U. (1995). Life insurance mathematics. Springer-Verlag.
Gerrard, R., Haberman, S., and Vigna, E. (2006). The management
of decumulation risks in a defined contribution environment. North
American Actuarial Journal, 10(1), 84–110.
Girosi, F. and King, G. (2007). Understanding the Lee–Carter mortality
forecasting method. Technical report.
Government Actuary’s Department (1995). National Population Projec-
tions 1992-based. HMSO, London.
Government Actuary’s Department (2001). National Population Projec-
tions: review of methodology for projecting mortality. Government
Actuary’s Department, London.
Government Actuary’s Department (2002). National Population Projec-
tions 2000-based. HMSO, London.
Goss, S. C., Wade, A., and Bell, F. (1998). Historical and projected mortal-
ity for Mexico, Canada and the United States. North American Actuarial
Journal, 2(4), 108–126.
Group Annuity Valuation Table Task Force (1995). 1994 Group annuity
mortality table and 1994 Group annuity reserving table. Transactions
of the Society of Actuaries, 47, 865–913.
Gründl, H., Post, T., and Schulze, R. N. (2006). To hedge or not to hedge:
managing demographic risk in life insurance companies. The Journal of
Risk and Insurance, 73(1), 19–41.
Gupta, A. K. and Varga, T. (2002). An introduction to actuarial mathe-
matics. Kluwer Academic Publishers.
Gutterman, S. and Vanderhoof, I. T. (1998). Forecasting changes in mor-
tality: a search for a law of causes and effects. North American Actuarial
Journal, 2(4), 135–138.
Haberman, S. (1996). Landmarks in the history of actuarial science (up
to 1919). Actuarial Research Paper No. 84, Dept. of Actuarial Science
and Statistics, City University, London.
Haberman, S. and Renshaw, A. (2008). On simulator-based approaches to
risk measurement in mortality with specific reference to binomial Lee–
Carter modelling. Presented to Living to 100. Survival to Advanced Ages
international symposium. Society of Actuaries, Orlando, Florida.
Haberman, S. and Renshaw, A. (2007). Discussion of “Pension plan valuation
and mortality projection: a case study with mortality data”. North
American Actuarial Journal, 11(4), 148–150.
Haberman, S. and Sibbett, T. A. (eds.) (1995). History of actuarial science.
Pickering & Chatto, London.
Hald, A. (1987). On the early history of life insurance mathematics.
Scandinavian Actuarial Journal, (1), 4–18.
Hamilton, J. (1994). Time series analysis. Princeton University Press,
Princeton.
Harrington, S. E. and Niehaus, G. R. (1999). Risk management and
insurance. Irwin/McGraw-Hill.
Heligman, L. and Pollard, J. H. (1980). The age pattern of mortality.
Journal of the Institute of Actuaries, 107, 49–80.
Horiuchi, S. and Wilmoth, J. R. (1998). Deceleration in the age pattern of
mortality at older ages. Demography, 35(4), 391–412.
Hougaard, P. (1984). Life table methods for heterogeneous populations:
distributions describing the heterogeneity. Biometrika, 71, 75–83.
Hyndman, R. J. and Ullah, Md. S. (2007). Robust forecasting of mortality
and fertility rates: a functional data approach. Computational Statistics
and Data Analysis, 51, 4942–4956.
IAA (2004). A global framework for insurer solvency assessment. Research
Report of the Insurer Solvency Assessment Working Party, International
Actuarial Association.
James, I. R. and Segal, M. R. (1982). On a method of mortality analysis
incorporating age–year interaction, with application to prostate cancer
mortality. Biometrics, 38, 433–443.
Kannisto, V., Lauritsen, J., Thatcher, A. R., and Vaupel, J. W. (1994).
Reductions in mortality at advanced ages: several decades of evi-
dence from 27 countries. Population and Development Review, 20,
793–810.
Keyfitz, N. (1982). Choice of functions for mortality analysis: Effec-
tive forecasting depends on a minimum parameter representation.
Theoretical Population Biology, 21, 329–352.
Koissi, M.-C., Shapiro, A. F., and Högnäs, G. (2006). Evaluating and
extending the Lee–Carter model for mortality forecasting: Bootstrap
confidence interval. Insurance: Mathematics & Economics, 38(1),
1–20.
Koller, G. (1999). Risk assessment and decision making in business and
industry. CRC Press.
Kopf, E. W. (1926). The early history of the annuity. Proceedings of the
Casualty Actuarial Society, 13(27), 225–266.
Kotz, S., Balakrishnan, N., and Johnson, N. L. (2000). Continuous
multivariate distributions (2nd edn), Volume 1: Models and applications.
John Wiley & Sons.
Lee, R. D. (2000). The Lee–Carter method for forecasting mortality,
with various extensions and applications. North American Actuarial
Journal, 4(1), 80–93.
Lee, R. D. (2003). Mortality forecasts and linear life expectancy trends.
Technical report.
Lee, R. D. and Carter, L. R. (1992). Modelling and forecasting U.S.
mortality. Journal of the American Statistical Association, 87(419),
659–675.
Lee, R. and Miller, T. (2001). Evaluating the performance of the
Lee–Carter approach to modelling and forecasting. Demography, 38,
537–549.
Li, N. and Lee, R. D. (2005). Coherent mortality forecasts for a group of
populations: an extension of the Lee–Carter method. Demography, 42,
575–594.
Lin, Y. and Cox, S. H. (2005). Securitization of mortality risks in life
annuities. The Journal of Risk and Insurance, 72(2), 227–252.
Lin, Y. and Cox, S. H. (2008). Securitization of catastrophe mortality
risks. Insurance: Mathematics & Economics, 42, 628–637.
Lindbergson, M. (2001). Mortality among the elderly in Sweden 1988–97.
Scandinavian Actuarial Journal, (1), 79–94.
Loader, C. (1999). Local regression and likelihood. Springer, New York.
London, D. (1985). Graduation: the revision of estimates. ACTEX
Publications.
Lundström, H. and Qvist, J. (2004). Mortality forecasting and trend
shifts: an application of the Lee–Carter model to Swedish mortality data.
International Statistical Review, 72, 37–50.
Manton, K. G. and Stallard, E. (1984). Recent trends in mortality analysis.
Academic Press.
Marocco, P. and Pitacco, E. (1998). Longevity risk and life annuity reinsur-
ance. In Transactions of the 26th International Congress of Actuaries,
Birmingham, Volume 6, pp. 453–479.
McCrory, R. T. (1986). Mortality risk in life annuities. Transactions of
the Society of Actuaries, 36, 309–338.
McCutcheon, J. J. (1981). Some remarks on splines. Transactions of the
Faculty of Actuaries, 37, 421–438.
Milevsky, M. A. and Promislow, S. D. (2001). Mortality derivatives and
the option to annuitise. Insurance: Mathematics & Economics, 29(3),
299–318.
Milevsky, M. A. (2005). The implied longevity yield: A note on developing
an index for life annuities. The Journal of Risk and Insurance, 72(2),
301–320.
Milevsky, M. A. (2006). The calculus of retirement income. Financial
models for pension annuities and life insurance. Cambridge University
Press.
Miller, R. T. (2004). Graduation. In Encyclopedia of actuarial science (ed.
J. L. Teugels and B. Sundt), Volume 2, pp. 780–784. John Wiley & Sons.
Morgan Stanley-Equity Research Europe (2003). Swiss Re-Innovative
mortality-based security. Technical report, Morgan Stanley.
Namboodiri, K. and Suchindran, C. M. (1987). Life table techniques and
their applications. Academic Press.
National Statistics-Government Actuary’s Department (2001). National
population projections: Review of methodology for projecting mortal-
ity. National Statistics Quality Review Series, Report No. 8.
Nordenmark, N. V. E. (1906). Über die Bedeutung der Verlängerung
der Lebensdauer für die Berechnung der Leibrenten. In Transactions
of the 5th International Congress of Actuaries, Volume 1, Berlin,
pp. 421–430.
O’Brien, C. D. (2002). Guaranteed annuity options: five issues for
resolution. British Actuarial Journal, 8, 593–629.
Oeppen, J. and Vaupel, J. W. (2002). Broken limits to life expectancy.
Science, 296, 1029–1031.
Office for National Statistics (1997). The health of adult Britain 1841–1994.
HMSO, London.
Olivieri, A. (2001). Uncertainty in mortality projections: an actuarial
perspective. Insurance: Mathematics & Economics, 29(2), 231–245.
Olivieri, A. (2005). Designing longevity risk transfers: the point of view
of the cedant. Giornale dell’Istituto Italiano degli Attuari, 68, 1–35.
Reprinted in: ICFAI Journal of Financial Risk Management, 4 (March
2007), 55–83.
Olivieri, A. (2006). Heterogeneity in survival models: applications to
pensions and life annuities. Belgian Actuarial Bulletin, 6, 23–39.
http://www.actuaweb.be/frameset/frameset.html.
Olivieri, A. and Ferri, S. (2003). Mortality and disability risks in long
term care insurance. IAAHS Online Journal. http://www.actuaries.org/
members/en/IAAHS/OnlineJournal/2003-1/2003-1.pdf.
Olivieri, A. and Pitacco, E. (2002a). Inference about mortality improve-
ments in life annuity portfolios. In Transactions of the 27th Interna-
tional Congress of Actuaries, Cancun (Mexico).
Olivieri, A. and Pitacco, E. (2002b). Managing demographic risks in long
term care insurance. Rendiconti per gli Studi Economici Quantitativi, 2,
15–37.
Olivieri, A. and Pitacco, E. (2002c). Premium systems for post-retirement
sickness covers. Belgian Actuarial Bulletin, 2, 15–25.
Olivieri, A. and Pitacco, E. (2003). Solvency requirements for pension
annuities. Journal of Pension Economics & Finance, 2, 127–157.
Olivieri, A. and Pitacco, E. (2008). Assessing the cost of capital for
longevity risk. Insurance: Mathematics & Economics, 42, 1013–1021.
Olshansky, S. J., Passaro, D., Hershaw, R., Layden, J., Carnes, B. A.,
Brody, J., Hayflick, L., Butler, R. N., Allison, D. B., and Ludwig, D. S.
(2005). A potential decline in life expectancy in the United States in the
21st century. New England Journal of Medicine, 352, 1103–1110.
Olshansky, S. J. (1988). On forecasting mortality. The Milbank Quar-
terly, 66(3), 482–530.
Olshansky, S. J. and Carnes, B. A. (1997). Ever since Gompertz. Demog-
raphy, 34, 1–15.
O’Malley, P. (2007). Development of GMxB markets in Europe. In
Transactions of the 1st IAA Life Colloquium, Stockholm.
Pelsser, A. (2003). Pricing and hedging guaranteed annuity options via
static option replication. Insurance: Mathematics & Economics, 33(2),
283–296.
Petrioli, L. and Berti, M. (1979). Modelli di mortalità. Franco Angeli
Editore, Milano.
Piggott, J., Valdez, E. A., and Detzel, B. (2005). The simple analytics
of a pooled annuity fund. The Journal of Risk and Insurance, 72(3),
497–520.
Pitacco, E. (2004a). From Halley to “frailty”: a review of survival
models for actuarial calculations. Giornale dell’Istituto Italiano degli
Attuari, 67(1–2), 17–47.
Pitacco, E. (2004b). Longevity risks in living benefits. In Developing an
annuity market in Europe (ed. E. Fornero and E. Luciano), pp. 132–167.
Edward Elgar, Cheltenham.
Pitacco, E. (2004c). Survival models in a dynamic context: a survey.
Insurance: Mathematics & Economics, 35(2), 279–298.
Pitacco, E. (2007). Mortality and longevity: a risk management perspec-
tive. In Proceedings of the 1st IAA Life Colloquium, Stockholm.
Pollard, A. H. (1949). Methods of forecasting mortality using Australian
data. Journal of the Institute of Actuaries, 75, 151–182.
Pollard, J. H. (1987). Projection of age-specific mortality rates. Population
Bulletin of the UN, 21–22, 55–69.
Poulin, C. (1980). Essai de mise au point d’un modèle représentatif
de l’évolution de la mortalité humaine. In Transactions of the 21st
International Congress of Actuaries, Volume 2, Zürich-Lausanne,
pp. 205–211.
Renshaw, A. E. and Haberman, S. (2000). Modelling for mortality reduc-
tion factors. Actuarial Research Paper No. 127, Dept. of Actuarial
Science and Statistics, City University, London.
Renshaw, A. E. and Haberman, S. (2003a). Lee–Carter mortality fore-
casting: a parallel generalized linear modelling approach for England &
Wales mortality projections. Applied Statistics, 52, 119–137.
Renshaw, A. E. and Haberman, S. (2003b). Lee–Carter mortality fore-
casting with age specific enhancement. Insurance: Mathematics &
Economics, 33(2), 255–272.
Renshaw, A. E. and Haberman, S. (2003c). On the forecasting of mor-
tality reduction factors. Insurance: Mathematics & Economics, 32(3),
379–401.
Renshaw, A. E. and Haberman, S. (2005). Lee–Carter mortality forecast-
ing incorporating bivariate time series for England and Wales mortality
projections. Technical report.
Renshaw, A. E. and Haberman, S. (2006). A cohort-based extension
to the Lee–Carter model for mortality reduction factors. Insurance:
Mathematics & Economics, 38(3), 556–570.
Renshaw, A. E. and Haberman, S. (2008). On simulation-based
approaches to risk measurement in mortality with specific reference
to Poisson Lee–Carter modelling. Insurance: Mathematics & Eco-
nomics, 42, 797–816.
Renshaw, A. E., Haberman, S., and Hatzopoulos, P. (1996). The modelling
of recent mortality trends in United Kingdom male assured lives. British
Actuarial Journal, 2(II), 449–477.
Retirement Choice Working Party (2001). Extending retirement choices.
Retirement income options for modern needs. The Faculty and Institute
of Actuaries.
Richards, S., Ellam, J., Hubbard, J., Lu, J., Makin, S., and Miller, K.
(2007). Two-dimensional mortality data: patterns and projections.
Presented to the Institute of Actuaries.
Richards, S. J. and Jones, G. L. (2004). Financial aspects of longevity risk.
The Staple Inn Actuarial Society, London.
Riemer-Hommel, P. and Trauth, T. (2000). Challenges and solutions for
the management of longevity risk. In Risk management. Challenge and
opportunity (ed. M. Frenkel, U. Hommel, and M. Rudolf), pp. 85–100.
Springer.
Rotar, V. I. (2007). Actuarial models: the mathematics of insurance.
Chapman & Hall/CRC.
Sandström, A. (2006). Solvency. Models, assessment and regulation.
Chapman & Hall, CRC.
Sithole, T. Z., Haberman, S., and Verrall, R. J. (2000). An investiga-
tion into parametric models for mortality projections, with applications
to immediate annuitants and life office pensioners’ data. Insurance:
Mathematics & Economics, 27(3), 285–312.
Skwire, D. (1997). Actuarial issues in the novels of Jane Austen. North
American Actuarial Journal, 1(1), 74–83.
Smith, D. and Keyfitz, N. (eds.) (1977). Mathematical demography.
Selected papers. Springer-Verlag, Berlin.
Sun, F. (2006). Pricing and risk management of variable annuities with
multiple guaranteed minimum benefits. Actuarial Practice Forum. Soci-
ety of Actuaries.
Sverdrup, E. (1952). Basic concepts in life assurance mathematics. Skan-
dinavisk Aktuarietidskrift, 3–4, 115–131.
Swiss Re (2007). Annuities: a private solution to longevity risk. Sigma, 3.
Tabeau, E., van den Berg Jeths, A., and Heathcote, C. (eds.) (2001). Fore-
casting mortality in developed countries. Kluwer Academic Publishers.
Thatcher, A. R. (1999). The long-term pattern of adult mortality and the
highest attained age. Journal of the Royal Statistical Society, A, 162,
5–43.
Tuljapurkar, S., Li, N., and Boe, C. (2000). A universal pattern of mortality
decline in the G7 countries. Nature, 405, 789–792.
Tuljapurkar, S. and Boe, C. (1998). Mortality change and forecasting: how
much and how little do we know. North American Actuarial Journal, 2,
13–47.
Vaupel, J. W., Manton, K. G., and Stallard, E. (1979). The impact
of heterogeneity in individual frailty on the dynamics of mortality.
Demography, 16(3), 439–454.
Verrall, R., Haberman, S., Sithole, T., and Collinson, D. (2006, Septem-
ber). The price of mortality. Life and Pensions, 35–40.
Wadsworth, M., Findlater, A., and Boardman, T. (2001). Reinventing
annuities. The Staple Inn Actuarial Society, London.
Wang, S. H. (2002). A universal framework for pricing financial and
insurance risks. ASTIN Bulletin, 32(2), 213–234.
Wang, S. H. (2004). Cat bond pricing using probability transforms. The
Geneva Papers on Risk and Insurance: Issues and Practice, 278, 19–29.
Wang, S. S. and Brown, R. L. (1998). A frailty model for projection of
human mortality improvement. Journal of Actuarial Practice, 6(1–2),
221–241.
Wetterstrand, W. H. (1981). Parametric models for life insurance mor-
tality data: Gompertz’s law over time. Transactions of the Society of
Actuaries, 33, 159–175.
Wilkie, A. D., Waters, H. R., and Yang, S. Y. (2003). Reserving, pric-
ing and hedging for policies with guaranteed annuity options. British
Actuarial Journal, 9, 263–425.
Wilkie, A. D. (1997). Mutuality and solidarity: assessing risks and sharing
losses. British Actuarial Journal, 3, 985–996.
Wilkinson Tiller, M., Blinn, J. D., and Kelly, J. J. (1990). Essentials of risk
financing. Insurance Institute of America.
Willets, R. C. (2004). The cohort effect: insights and explanations. British
Actuarial Journal, 10, 833–877.
Williams, C. A. Jr., Smith, M. L., and Young, P. C. (1998). Risk
management and insurance. Irwin/McGraw-Hill.
Wilmoth, J. R. (1993). Computational methods for fitting and extrapolat-
ing the Lee–Carter model of mortality change. Technical report.
Wilmoth, J. R. (2000). Demography of longevity: Past, present, and future
trends. Journal of Experimental Gerontology, 35, 1111–1129.
Wilmoth, J. R. and Horiuchi, S. (1999). Rectangularization revisited: vari-
ability of age at death within human populations. Demography, 36(4),
475–495.
Wong-Fupuy, C. and Haberman, S. (2004). Projecting mortality trends:
recent developments in the United Kingdom and the United States.
North American Actuarial Journal, 8, 56–83.
Yaari, M. E. (1965). Uncertain lifetime, life insurance, and the theory of
the consumer. Review of Economic Studies, 32(2), 137–150.
Yashin, A. I. and Iachine, I. A. (1997). How frailty models can be used
for evaluating longevity limits: Taking advantage of an interdisciplinary
approach. Demography, 34, 31–48.
Index

account value 42
accumulation period 31, 32, 33–6, 344, 350, 364, 367
actuarial value 9–12
additive model 79
adverse selection 41
age at death variability 113–15
age rating models 79
age shifts 79, 127–9, 155–6
age-patterns of mortality 13–14, 159–60, 178
age-period life tables 93–5
age-period-cohort models see APC (Age-Period-Cohort) models
age-specific functions 60, 139–40
aggregate table 51
alternative risk transfer (ART) 297
  see also risk transfer
Andreev–Vaupel life expectancy projections 235–7
annual probability of death 48
  laws for 66
  mortality modelling by extrapolation 141–52, 162
    versus interpolation 165–6
annual survival probability 48
annuities-certain 2–8, 36
  avoiding early fund exhaustion 5–6
  equivalent number of payments 355
  risks in 6–8
  withdrawing from fund 2–5
annuitization 35, 364–9
  staggered 368
annuity in advance 32–3
annuity in arrears 8, 31
APC (Age-Period-Cohort) models 173–5
  application to UK mortality data 254–63
  Lee–Carter APC model 246–54
    error structure and model fitting 248–52
    model structure 246–8
    mortality rate projections 253
apportionable annuity 39
asymptotic mortality 147
autoregressive integrated moving average (ARIMA) models 221–3, 231–2, 253

B-splines 71–2, 210, 265
Balducci assumption 58
Banking, Finance, and Insurance Commission (BFIC) 92–3
Barnett law 66
Beard law 66
Belgium 130–53
  Cairns–Blake–Dowd model application 207–9
  Lee–Carter model application 200–3
    prediction intervals 232–4
    smoothing 213–14
  life expectancy forecasting 237–9
  optimal calibration period selection 217–18
  residuals analysis 220–1
  see also Federal Planning Bureau (FPB), Belgium
Bernoulli model 122
binomial maximum likelihood estimation 198–9
  negative 199–200
bonus rates 39
bootstrapping 229–30
  application to Belgian mortality statistics 232–4
  bootstrap percentiles confidence intervals 230–2
Brass logit transform 167–8

Cairns–Blake–Dowd mortality projection model 183–4, 203–9
  allowing for cohort effects 263–5
  application to Belgian mortality statistics 207–9
Cairns–Blake–Dowd mortality projection model (Cont.)
  calibration 206–7
  optimal calibration period selection 217, 218
  residuals analysis 220–1
  specification 203–6
  time index modelling 228–9
  see also mortality modelling
calibration period selection 214–18
  application to Belgian mortality statistics 217–18
  motivation 214–16
  selection procedure 216–17
capital protection 40
cash-refund annuity 40
catastrophe risk 269
central death rate 57
Coale–Kisker model 76
coefficient of variation 61
cohort effect 243–5
  in Cairns–Blake–Dowd model 263–5
  in P-splines model 265–6
  UK 243–5
  see also APC (Age-Period-Cohort) models
cohort life expectancies 112–13, 153
cohort life table 46, 140
  in projected table 152–3
complete expectation of life 60
complete life annuity 39
conditional GAR products 348, 349
constant-growth annuity 38
Continuous Mortality Investigation Bureau (CMIB), UK 243
  software 185
cross-subsidy 14–20
  mutuality 14–16
  solidarity 16–18
  tontine annuities 18–20
cubic spline 70
curtate expectation of life 59
curtate remaining lifetime 49
curve of deaths 54
curve squaring 105–6

death
  age at, variability 113–15
  annual probability of 48
  curve of deaths 54
  death rates 96–101
    central 57
    observed 116–18
    smoothed 118–22, 209–14
  uniform distribution of deaths 57–8
  see also mortality
decumulation period 31, 32, 36–8, 344, 345, 350
deferred life annuity 32–3
diminished entelechy hypothesis 244–5
distribution function 53–4
dynamic mortality model 139

endowment 33–4, 344–5
endurance 61
England see United Kingdom
enhanced annuities 41
enhanced pensions 41
entropy 61
Equitable Life 135
equity-indexed annuity 38
equivalence principle 12
equivalent discount rate 355
equivalent entry age 355
equivalent number of payments 355
escalating annuities 38
Esscher formula 151
excess-of-loss (XL) reinsurance 319–20, 326
exhaustion time 5
expansion 138, 161, 168, 179
expected lifetime 59, 139, 152, 170
exponential formula 145–6, 149
  alternative approach 146–7
  formulae used in actuarial practice 149–51
  generalization 147
  implementation 148
exposure-to-risk (ETR) 95–6, 97

failure rate 55
fan charts 170, 240
Federal Planning Bureau (FPB), Belgium 91–2
  life expectancy projections 235
financing post-retirement income 354–69, 371–2
  comparing life annuity prices 354–6
  flexibility in 363–9
  life annuities versus income drawdown 356–9
  mortality drag 359–63
first-order basis 12, 13
fixed-rate escalating annuity 38
force of mortality 55–6, 58, 82–3, 94–5
  cumulative 56
  laws for 64–5
frailty 80–3
  models 83–5, 88
    combined with mortality laws 85–7
France 130–5
fund exhaustion
  avoiding 5–6
  exhaustion time 5

Gamma distribution 83–5, 87
Gaussian-Inverse distribution 85
Germany 130
GLB (Guaranteed Living Benefits) 43
GM (Gompertz-Makeham) models 65, 163–4
GMAB (Guaranteed Minimum Accumulation Benefit) 42, 43
GMDB (Guaranteed Minimum Death Benefit) 42
GMIB (Guaranteed Minimum Income Benefit) 42, 43
GMWB (Guaranteed Minimum Withdrawal Benefit) 42–3
GMxBs (Guaranteed Minimum Benefits of type ‘x’) 41–3
Gompertz model 55–6, 64, 85–6
  see also GM (Gompertz-Makeham) models
graduation 67–8, 87–8
  mortality graduation over age and time 163–5
  see also non-parametric graduation
guaranteed annuity 346
guaranteed annuity option (GAO) 35, 297, 346–7
  valuation of 354
guaranteed annuity rate (GAR) 346–7
  adding flexibility 347–50
  conditional GAR products 348, 349
  with-profit GAR products 349
Gyldén, H. 175–6

hazard function 55
  cumulative 56
healthy worker effect 122
hedging 298
  across LOBs 303
  across time 299–302
  life annuity liabilities through longevity bonds 337–43
  natural hedging 298, 299–303, 370
Heligman–Pollard laws 12, 66, 75, 178, 179, 276
highest anniversary value 42
Human mortality database (HMD) 92

impaired-life annuities 41
implied longevity yield (ILY) 15, 363
inception-select mortality 51
index-linked escalating annuity 38
inflation-linked annuity 38
instalment-refund annuity 40
insurance risk 269
insured population 14
internal knots 69, 70
interquartile range 61–2
investment-linked annuities 38–9
issue-select mortality 51
Italy 130

joint-life annuity 37

K-K-K hypothesis 173
knots 69–70
Kwiatkowski–Phillips–Schmidt–Shin test 224

last-survivor annuity 37
Lee–Carter (LC) model 169–73, 178–80, 182–4, 186–203
  age-period-cohort model 246–54
    see also APC (Age-Period-Cohort) models
  application to Belgian mortality statistics 200–3
  application to UK mortality statistics 254–63
  calibration 188–200
    alternative estimation procedures 198–200
    identifiable constraints 188–9
    least-squares estimation 189–98
    optimal calibration period selection 214–18
  extensions 172, 180, 192–200
  life expectancy forecasting 237–9, 241–2
  model tables and 173
  prediction intervals 232–4
  residuals analysis 218–21
Lee–Carter (LC) model (Cont.)
  smoothing in 212–13
  specification 186–8
  time index modelling 221–8
    random walk with drift model 225–8
    stationarity 223–4
  see also mortality modelling
level annuities 38
Lexis diagram 94
Lexis point 60
liability 11
life annuities 2–8
  accumulation period 33–6
  as financial transactions 8
  avoiding early fund exhaustion 5–6
  cross-subsidy in 14–20
  decumulation period 36–8
  deterministic evaluation 8–14
    actuarial value 9–12
    technical bases 12–14
  immediate versus deferred annuities 31–3
  longevity risk and 343–50
  mortality risk location 343–6
  payment profile 38–40
  present value of 351–2
  price comparisons 354–6
  risks in 6–8
  stochastic evaluation 20–30
    focussing on portfolio results 21–4
    random present value 20–1
    risk assessment 24–7
    uncertainty in mortality assumptions 27–30
  temporary life annuity 36
  versus income drawdown 356–9
  whole life annuity 36
  with a guarantee period 37
  withdrawing from fund 2–5
life expectancy 59–60, 89
  Andreev–Vaupel projections 235–7
  Belgian Federal Planning Bureau (FPB) projections 235
  cohort life expectancies 112–13, 153
  forecasting 234–42
    application to Belgian mortality statistics 237–9
    back testing 240–2
    fan charts 240
  heterogeneity 115–16
  observed 122–3
  period life expectancies 62, 111–13
life insurance market 116–29
  age shifts 127–9
  life expectancies 122–3
  observed death rates 116–18
  smoothed death rates 118–22
life insurance securitization 330–2
life tables 46–51, 93
  aggregate table 51
  as probabilistic models 48–9
  closure 101–5
  cohort life table 46, 140
    in projected table 152–3
  limit table 165–6
    optimal 166, 177
  period life table 46–7, 93, 140
    age-period 93–5
  population versus market tables 47–8
  projected life table 47
  projecting transforms of life table functions 167–9
  ultimate life table 51
LifeMetrics 185
lifetime probability distribution 58
limiting age 4
linear spline 70
lines of business (LOBs) 298
  natural hedging across LOBs 303
liquidation period see decumulation period
liquidity risk 7
location measure 60
logit transform of the survival function 73
long-term bonds 335
longevity bonds 332, 335–7, 371
  hedging life annuity liabilities through 337–43
longevity risk 1, 267, 268–93, 369
  life annuities and 343–50
  management 293–330
    natural hedging 299–303
    reinsurance arrangements 318–30, 371
    risk management perspective 293–9
    solvency issues 303–18, 370
    see also risk management
  measurement in a static framework 276–93
  mortality risks 268–70
  pricing and 350–4, 371
  representation 273–6
  stochastic modelling issues 270–3
loss control techniques 296–7
loss financing techniques 297
Makeham laws 64, 67, 76, 159, 176–7, 179
  see also GM (Gompertz-Makeham) models
market risk 7
maximum downward slope 61
median age at death 60
model risk 269
model tables 165–6, 173, 177–8
money-back annuities 302
Monte Carlo simulation 22, 230–1
mortality
  age-patterns 13–14, 159–60, 178
  allowing for uncertainty 27–30
  asymptotic 147
  at very old ages 74–6, 88
  best estimate 29
  by causes 67, 175
  force of 55–6, 58, 64–5, 82–3, 94–5
    cumulative 56
  forecasting see mortality modelling
  graduation over age and time 163–5
  heterogeneity 77–87, 88
    frailty models 83–7
    models for differential mortality 78–80
    observable heterogeneity factors 77–8
    unobservable heterogeneity factors 80–3
  laws 63–7, 179
    combined with frailty models 85–7
    projections and 156–60
  risk of random fluctuation 25
  select 49–51
  trends see mortality trends
  see also death; life tables; survival
mortality bonds 332, 333–4
mortality drag 15, 359–63
mortality modelling 137–9, 175–80
  age-period models 181–242
  age-period-cohort models 243–66
  age-specific functions 139–40
  cohort versus period approach 173–5
  diagonal approach 157–9, 162, 177
  dynamic approach 137–41
  extrapolation of annual probabilities of death 141–52, 162
    versus interpolation 165–6
  horizontal approach 143–4, 162, 176
  life expectancy forecasting 234–42
  model tables 165–6, 173, 177–8
  mortality by causes 175
  mortality projection 221–9
  projection in parametric context 156–65
  prediction intervals 229–34
  projected table use 152–6
  projecting transforms of life table functions 167–9
  relational method 178
  surface approach 163
  vertical approach 157, 159–60, 162, 177
  see also Cairns–Blake–Dowd mortality projection model; Lee–Carter (LC) model
mortality odds 49
mortality profile 138, 140
mortality risks 268–70
  location in traditional life annuity products 343–6
mortality trends 93–116, 176
  age-period life tables 93–5
  closure of life tables 101–5
  death rates 96–101
  exposure-to-risk 95–6
  expression via Weibull’s parameters 160–1
  heterogeneity 115–16
  life expectancies 111–13
  life insurance market 116–29
  mortality surfaces 101
  rectangularization and expansion 105–11
  throughout the EU 129–35
  variability 113–15
  see also mortality
mortality-linked securities 332–7
multiplicative model 79
mutuality 6, 14–16, 17–18, 357
  interest from 15

Nadaraya–Watson kernel estimate 120
natural cubic spline 70
natural hedging 298, 299–303, 370
  across LOBs 303
  across time 299–302
Newton–Raphson procedure 193–4
no advance funding 298
non-guaranteed annuity 346–7
non-parametric graduation 67–72
  splines 69–72
  Whittaker–Henderson model 68–9
non-pooling risk 285
numerical rating system 79–80
option to annuitize 35, 297, 346
overdispersed Poisson and negative binomial maximum likelihood estimation 199–200

P-splines model
  allowing for cohort effects 265–6
  smoothing approach 210–11
parameter risk 269
participating GAR products 349
payout period see decumulation period
Pearson residuals 220
pension annuities 40–1
period life expectancies 62, 111–13
period life table 46–7, 140
  age-period 93–5
Perks laws 65, 75–6, 86–7
Petrioli–Berti model 168–9
Poisson bootstrap 231
Poisson log-bilinear model 172
Poisson maximum likelihood estimation 196–8
  overdispersed 199–200
pooling risk 285
post-retirement income financing see financing post-retirement income
prediction intervals 229–34
  application to Belgian mortality statistics 232–4
premium 8
present value 351–2
pricing
  longevity risk and 350–4, 371
  reinsurance arrangements 325–6
probability density function (pdf) 53–4
probability of default 295
process risk 25, 269
profit participation mechanisms 13, 39
projected life table 47
projected mortality model 139
  extrapolation of annual probabilities of death 141–52, 162
    versus interpolation 165–6
  parametric context 156–65
  see also mortality modelling; projected mortality table
projected mortality table 152–6
  age shifting 155–6
  cohort tables in 152–3
  from double-entry to single-entry projected table 153–5
prudential basis 12

R software 184–5
random present value 20–1, 24, 43
random walk with drift model 225–8
ratchet 42
rating classes 16–17
realistic basis 12
rectangularization 51, 105–11, 138, 161, 168, 179
reduction factors 124, 144–5, 179, 233, 246–8, 252
reinsurance arrangements 318–30, 371
  excess-of-loss (XL) reinsurance 319–20, 326
  pricing 325–6
  reinsurance-swap arrangement on annual outflows 324–5
  stop-loss reinsurance
    on annual outflows 321–4, 326
    on assets 320–1, 326
  swap-like arrangement between life annuities and life insurances 329–30
Renshaw–Haberman model 165
Renshaw–Haberman–Hatzopoulos model 163–4
reserve 6, 27
residuals analysis 218–21
  application to Belgian mortality statistics 220–1
residuals bootstrap 231
resistance function 73–4, 178
return of premiums 35, 42
reversionary annuity 38
Richardt, T. 176
risk 6–8, 78
  assessment 24–7
  exposure-to-risk (ETR) 95–6, 97
  management see risk management
  of mortality random fluctuation 25
  process risk 25, 269
  uncertainty risk 28–9
  see also longevity risk; risk management (RM); risk transfer
risk classes 16–17
risk factors 40
risk index 280
risk management (RM) 293–9, 370
  natural hedging 299–303
  reinsurance arrangements 318–30, 371
  solvency issues 303–18, 370
risk transfer 297–8
  hedging life annuity liabilities through longevity bonds 337–43
risk transfer 297–8 (Cont.)
  life insurance securitization 330–2
  mortality-linked securities 332–7
  see also risk management
roll-up 42
Rueff’s adjustments 127, 155
ruin probability 295

safe-side technical basis 12, 13
safety loading 13
scenario technical basis 12
second-order basis 12
securitization 330
  life insurance 330–2
select mortality 49–51
select period 50
select table 51
self-selection 17, 51
single-entry projected table 153–5
Sithole–Haberman–Verrall model 164–5
smoothing 118–22, 209–14
  application to Belgian mortality statistics 213–14
  in Lee–Carter model 212–13
  motivation 209
  P-splines approach 210–12
solidarity 14, 16–18
solvency 303–18, 370
  assessment 24–7
Spain 130
special-rate annuities 41
splines 69–72
  B-splines 71–2, 210, 265
  P-splines model
    allowing for cohort effects 265–6
    smoothing approach 210–11
staggered annuitization 368
standard annuities 38
standardized mortality ratio (SMR) 116–18
stationarity 223–4
Statistics Belgium 91
stochastic valuation 270–3
  life annuity evaluation 20–30
stop-loss reinsurance
  on annual outflows 321–4, 326
  on assets 320–1, 326
survival, annual probability 48
  see also mortality
survival function 51–3
  expansion 51, 138
  rectangularization 51, 105–11, 138
  transforms of 73–4
Sweden 130

temporary life annuity 36
Thiele law 65
time series modelling 221–3
  Cairns–Blake–Dowd time indices 228–9
  Lee–Carter time index 221–8
    random walk with drift model 225–8
    stationarity 223–4
Tonti, Lorenzo 18–19
tontine annuities 14, 18–20

ultimate life table 51
uncertainty
  in mortality assumptions 27–30
  uncertainty risk 269, 298
uni-sex annuities 40
uniform spline 69
unit-linked life annuity 39
United Kingdom 135, 243–4
  APC model application 254–63
  cohort effect 243–5

value-protected annuities 40
variability measures 60–1
variable annuities 41–3
variance of the random lifetime 61
variation factor 145
voluntary annuities 40

Wales see United Kingdom
Wang transform 353
Weibull law 65, 160–1, 179
Whittaker–Henderson model 68–9
whole life annuity 36
with-profit annuity 39
with-profit GAR products 349

XL (excess-of-loss) reinsurance 319–20, 326

young mortality hump 138
YourCast software 185
