NET/JRF/IAS/ISS/JAM/GATE/STATISTICAL INFERENCE
A. SANTHAKUMARAN
About the Author
A. Santhakumaran received his Ph.D. in Mathematics - Statistics from the Ramanujan Institute for Advanced Study in Mathematics, University of Madras. He has rich experience in teaching and research. He has held positions as Associate Professor and Head of the Department of Statistics at Salem Sowdeswari College, Salem, and Professor of Mathematics at the Indian Institute of Food Processing Technology, Thanjavur, Tamil Nadu. He has published research papers in Queuing Theory, Statistical Quality Control, Neural Networks, Fuzzy Statistics and Food Processing. He is the author of the book Fundamentals of Testing Statistical Hypotheses.
Dedicated to all My Teachers
A.Santhakumaran
PREFACE
Human knowledge and practical activity rest on the systematic study of the behaviour of physical observations and experiments. The purpose of this book differs from that of traditional course books. The objective is to provide basic concepts in an elementary presentation that emphasizes the fundamentals of the mathematical statistics of predictive models. The book has evolved as a versatile, powerful and indispensable instrument for analysing statistical data in real life problems. We have reached a stage where no empirical science can afford to ignore this science.
A mathematical model is a logical description of how a system performs. Mathematical models are based on the systematic study of knowledge, provided the facts exist in the world. Mathematical models involve only manual calculation errors and disclose the facts as they are. Probability models, or predictive models, are based on the physical experimental outcomes of the data. The probability models must be identified, since they are well described by the causes of variation. The predictive models contain experimental and computational errors. Simulation is a process of imitating a real system over time. Simulation affects the logic and contains computational errors. An experimenter is interested in taking a decision about a real system without errors. The evaluation of models definitely depends on the per cent error. The per cent error of a trial is
per cent error = [(True value − Experimental value)/True value] × 100.
The experimenter expects an error-free, good decision with a minimum of observed data, thereby reducing the cost, administrative inconvenience and time. Apart from the reduction of data, the predictive models require ideal statistics for choosing the best probability models. The methods of identification deal with the estimation theory of statistical inference; they are point and confidence interval estimation. The prediction analysis is very important for making a decision. The book is intended to serve for reaching these goals.
Keeping this in mind, the first chapter of the book deals with mathematical models and computational methods. The second chapter deals with the identification of probability models from physical experimental data and also presents some of the well known distributions. Chapter 3 gives the criteria of point estimation. Chapter 4 focuses on the study of optimal estimation. Chapter 5 illustrates the properties of regular families of distributions. Chapter 6 explains the methods of estimation. Chapter 7 discusses interval estimation and Bayesian estimation.
DISTINCTIVE FEATURES
• Care has been taken to provide conceptual clarity, simplicity and up-to-date material for current needs.
• Properly graded and solved problems illustrating each concept and procedure are presented in the text.
• About three hundred solved problems and fifty remarks are provided to induce self-thinking.
• The book is intended to serve as a text book for a one semester course on Statistical Inference for Under Graduate and Post Graduate Statistics students of Indian Universities and for other applicable sciences, allied statistical courses, mathematical sciences and various UGC competitive examinations like IAS, GATE, JAM, JRF, NET, ISS, SLET, etc.
A.Santhakumaran
CONTENTS
1.1 Introduction
1.4.1 Idealization
1.4.2 Formulation
1.4.3 Manipulation
1.4.4 Reformulation
1.4.5 Evaluation
1.4.6 Justification
1.4.7 Validation
2.1 Introduction
Problems
3.1 Introduction
3.2 Estimators
Problems
4.1 Introduction
Problems
5 Optimal Estimation
5.1 Introduction
Problems
6.1 Introduction
Problems
7 Interval Estimation
7.1 Introduction
Problems
Answers
Appendix
Glossary of Notation
Bibliography
Index
1. MATHEMATICAL MODELING AND COMPUTATIONAL METHODS
1.1 Introduction
This chapter aims to motivate students to develop an active capacity for reading, power of understanding and self-thinking, and to lead them to apply the knowledge creatively in the disciplines. For this purpose, a free falling object in vacuum at sea level on the surface of the earth is considered for building the mathematical model and the computational methods. The governing principles of building mathematical models are Idealization, Formulation, Manipulation, Reformulation if necessary, Evaluation, Justification and Validation. Finally the mathematical model results are compared with the predictive model and simulation method results.
Mathematical models are a golden chance for finding the facts from the outcomes of a random experiment. The non-mathematical nature of physical experimental data is the birth of mathematical modeling. A mathematical model is a logical description of how a system performs. It is a symbolic representation of the non-mathematical form of real life problems which tells the system behaviour and helps to understand the system features before conducting experiments. Mathematical models can be classified into deterministic and predictive models. Mathematical or deterministic models are based on assumptions, axioms, principles and statements. Predictive or empirical models are obtained from the outcomes of physical experimental data after conducting experiments. Simulation is the process of designing a model of a real system and arbitrarily building models for the purpose of understanding the behaviour and the operation of the system; simulation methods are used to get an idea of how the system will behave in the future. In the simulation method one arbitrarily generates numerical outcomes of the physical experiments, without complicated integration and differential equations, with the help of computer software. Modeling on the outcomes of physical experimental values reduces time and is less expensive, whereas the non-mathematical form of the physical experimental data is not. Modeling is a value addition and adds scope to the non-mathematical form of the physical experimental observations. Mathematical
models have (i) Analytical solution, (ii) Graphical solution and (iii) Numerical solution.
Numerical solution consists of (a) Finite difference method, (b) Finite element method
and (c) Simulation method or Bootstrap method.
An analytic solution gives how a mathematical model behaves under all circumstances. It is also known as a closed form. It helps to standardize or optimize the outcomes of the physical experimental non-mathematical form. An analytical solution contains only manual errors, a predictive model consists of manual and experimental errors, whereas a simulation method involves manual and logic errors.
Building mathematical models depends upon the objectives for studying a particular problem. Based on the objectives, one lists out the causes of the problem and their effects for building the mathematical models. For example, consider the interest in studying the effect of velocity on a free falling object from a moderate height in vacuum at sea level on the surface of the earth. On the earth the causes are the object's size, shape and mass, the air resistance, air flow and air density, the distance traveled, the time of travel and the gravitational force. The velocity of falling objects depends on these causes and thereby the mathematical model is built.
Science is the systematic study of knowledge provided the facts are existing in the
world. The systematic study of knowledge is based on assumptions, axioms, principles
and hypotheses. Developing scientific understanding starts from these concepts for
building mathematical models.
1.4.1 Idealization
The object is assumed to fall freely in vacuum, so that among the causes, the size, shape, mass, air resistance, air flow and air density have no influence on the velocity of the object. The remaining causes, namely distance traveled, time of travel and gravitational force, alone affect the velocity of the object. Idealization means removing the unimportant or insignificant causes so that the significant causes alone are considered for constructing the mathematical models.
1.4.2 Formulation
Assume that the rate of change of the distance x of the free falling object at any time t is directly proportional to the distance x it has fallen,
i.e., dx/dt ∝ x ⇒ dx/dt = kx
where k is the proportionality constant.
1.4.3 Manipulation
One uses one's mathematical knowledge to manipulate for arriving at the solution of mathematical models by graphical, analytical or numerical methods. The analytical solution of the motion of the free falling object is obtained by the variables separable method of integration with respect to the variables x and t:
∫ dx/x = k ∫ dt + log c
log x = kt + log c
log(x/c) = kt
x = c e^{kt}
Applying the initial condition that x = 0 at t = 0 gives c = 0, so that x = 0 ∀ t ≥ 0, i.e., the object does not move. Here the manipulation is perfect, but the equation x = 0 ∀ t ≥ 0 is not meaningful. Thus the assumption is wrong, which needs reformulation.
1.4.4 Reformulation
The reform is that the rate of change distance x of free falling object at any time
t is directly proportional to the time t, it has been falling, i.e., dx
dt ∝ t ⇒ dx
dt = kt
where k is the proportionality constant. It is same for all objects, there is no matter
what the object is. Weight is the only force acting on the objects when object is
falling. Newton’s second law of motion is F = ma kgm/s2 ( One kilogram force =
F
9.81 Newton) where F - Force, m - mass and a - acceleration. Thus a = m, here force
is the weight W. Therefore
W mg
a= ⇒a= = g m/s2
m m
which is independent of mass such that the object’s mass has no effect on the motion
of falling object. Thus the proportionality constant is k = g. Now the differential
equation becomes
Z Z
dx = gtdt + c
Table 1.1 shows the actual calculation results based on the analytical solution, velocity v(t) = gt m/s where g = 9.81 m/s², for the given values of time t in seconds and distance x = (1/2) g t² m.
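As a computational check, the following Python sketch (an illustration, not from the book) tabulates the analytical solution; the time grid t = 1, ..., 10 seconds is an assumption.

    # Sketch: tabulate the analytical solution v(t) = g*t and x(t) = (1/2)*g*t^2.
    g = 9.81  # acceleration due to gravity, m/s^2
    print("t (s)    v = g*t (m/s)    x = g*t^2/2 (m)")
    for t in range(1, 11):  # assumed time grid 1..10 s
        print(f"{t:5d}    {g * t:12.2f}    {0.5 * g * t**2:14.2f}")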
Experimental Results
Initially a stationary ball is allowed to fall freely under gravity; the distance traveled is directly proportional to the square of the elapsed time, the successive distances being in proportion 1², 2², 3², ..., 10². [Figure: image of the successive positions of the free falling ball]
Predictive models
The predictive model of the free falling object is obtained by the least squares method of fitting the linear curve v(t) on t. The standardized form of the curve is v(t) = 9.751t. (Note: the standardized form is independent of the unit of measurement.) In the scale of measurement,
v(t) = dx/dt = 9.751t m/s, t = 0, 1, 2, ....
Integrating this with the initial condition t = 0 ⇒ c = 0, the estimated distance is
x = 4.875t² m, t = 0, 1, 2, ....
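For a line through the origin, least squares minimizes Σ(vᵢ − btᵢ)² and gives the slope b = Σtᵢvᵢ/Σtᵢ². A minimal Python sketch, with placeholder observations standing in for the book's experimental table:

    # Sketch: least squares slope for the no-intercept model v(t) = b*t.
    # The observed velocities below are placeholders, not the book's data.
    t = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    v = [9.6, 19.6, 29.2, 39.1, 48.9, 58.4, 68.5, 78.1, 87.6, 97.4]
    b = sum(ti * vi for ti, vi in zip(t, v)) / sum(ti * ti for ti in t)
    print(f"fitted curve: v(t) = {b:.3f} t m/s")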
1.4.5 Evaluation
The adjusted coefficient of determination is
R₀² = R² − (1 − R²) p/(n − p − 1)
where n stands for the number of trials and p for the number of independent variables. Adjusted R₀² can be negative, but R² cannot take negative values. An increase in adjusted R₀² indicates that a newly added independent variable is worth including in the future selection of the model building. Table 1.3 shows the observed and estimated values of velocity.
1.4.6 Justification
The predictive model is justified when the coefficient of variation of RMSE lies in the desirable interval 0 - 20%. From Table 1.3, the root mean square error is
RMSE = √( Σᵢ₌₁ⁿ (Oᵢ − Eᵢ)² / n ) = 0.3594.
The calculated RMSE lies in the desirable interval 0 to 1. The predictive model v(t) = 9.751t m/s is justified as appropriate to the experimental (observed) values.
Coefficient of variation of RMSE = (0.3594/53.95) × 100 = 0.6661%, where the average of the experimental values is 53.95, and it lies in the desirable interval 0 - 20%.
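The same two quantities can be computed in Python; the observed values below are placeholders standing in for Table 1.3.

    # Sketch: RMSE and its coefficient of variation for observed vs. estimated values.
    import math
    O = [9.8, 19.5, 29.3, 39.0, 48.8]          # placeholder observed velocities
    E = [9.751 * t for t in [1, 2, 3, 4, 5]]   # estimated by v(t) = 9.751 t
    rmse = math.sqrt(sum((o - e) ** 2 for o, e in zip(O, E)) / len(O))
    cv = 100 * rmse / (sum(O) / len(O))        # coefficient of variation, per cent
    print(f"RMSE = {rmse:.4f}, CV of RMSE = {cv:.4f}%")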
1.4.7 Validation
For the same set of experimental values, more than one predictive model may be suitable. In such a case the χ² statistic is used to test the goodness of fit at the 5% level of error for selecting the best model. The Chi-Square test statistic is
χ² = Σᵢ₌₁ⁿ (Oᵢ − Eᵢ)²/Eᵢ ∼ χ² distribution with (n − k − 1) degrees of freedom
where n is the number of classes and k is the number of parameters estimated. Here the χ² value is equal to 0.019. The critical χ² (table) value is 15.50 with 8 degrees of freedom at the 5% level of error. The hypothesis to test is H₀: the fitted velocity v(t) = 9.751t m/s is the best one, against the alternative H₁: the fitted velocity curve is not the best one. The χ² test is a right-tailed one-sided test. The acceptance region of the hypothesis H₀ is 0 to 15.50 and the rejection region is 15.50 to ∞. The calculated χ² value 0.019 falls in the acceptance region 0 to 15.50 of the hypothesis H₀. This shows that the velocity v(t) = 9.751t m/s from the experimental data is validated at the 5% level of error. Sometimes none of the Chi-Square statistic values of the predictive models is significant at the fixed level of error, so all the models are suitable for the same experimental values. In this case one selects the predictive model which has the least Chi-Square statistic value, since the Chi-Square statistic reflects the experimental error; the minimum Chi-Square statistic value among all the possible models identifies the preferred model.
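A Python sketch of this validation step follows; the observed values are placeholders, and the tabulated critical value for 8 degrees of freedom reproduces the 15.50 quoted above.

    # Sketch: right-tailed chi-square goodness-of-fit test at the 5% level of error.
    from scipy.stats import chi2
    O = [9.8, 19.5, 29.3, 39.0, 48.8, 58.6, 68.3, 78.0, 87.8, 97.5]  # placeholders
    E = [9.751 * t for t in range(1, 11)]   # expected under v(t) = 9.751 t
    stat = sum((o - e) ** 2 / e for o, e in zip(O, E))
    df = len(O) - 1 - 1                     # n - k - 1 with k = 1 parameter estimated
    critical = chi2.ppf(0.95, df)           # about 15.51 for df = 8
    print("accept H0" if stat <= critical else "reject H0")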
Using the probability density function (pdf) of the successive positions, Table 1.4 is constructed.
Figure 1.2 Velocity with acceleration (velocity plotted against time)
Figure 1.3 Constant velocity with zero acceleration (velocity plotted against time)
Figure 1.3 shows the constant velocity of the falling objects with zero acceleration.
Table 1.4 Simulated velocity of experimental data for the free falling object

Velocity      pdf    pdf     Cumulative   Random     Generated ∆v(t)
∆v(t)         f(t)   f(t)%   pdf f(t)%    Interval   (random number)
12            0.1    10      10           00 - 09    180 (70)
36            0.1    10      20           10 - 19    108 (46)
60            0.1    10      30           20 - 29    108 (48)
84            0.1    10      40           30 - 39    132 (57)
108           0.1    10      50           40 - 49    60 (21)
132           0.1    10      60           50 - 59    132 (51)
156           0.1    10      70           60 - 69    180 (71)
180           0.1    10      80           70 - 79    132 (55)
204           0.1    10      90           80 - 89    204 (86)
228           0.1    10      100          90 - 99    60 (26)
Total = 1200  -      -       -            -          1244
The following random numbers are used and given in Table 1.5
Choose arbitrarily a column, a row or a diagonal from the random number table and combine two-digit numbers, because the velocity of the object consists of two digits only. Here the second column is chosen and ten two-digit numbers are considered successively: 70, 46, 48, 57, 21, 51, 71, 55, 86 and 26. The first number 70 falls in the interval 70 - 79 of Table 1.4. The value against this interval gives the velocity, which is 180 cm/s. It is shown in the last column of Table 1.4. Similarly the velocities are simulated for the rest of the random numbers. The simulated velocity of the free falling object in vacuum at sea level is v(t) = 12.44 m/s.
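A minimal Python sketch of this table lookup, using the ten random numbers above:

    # Sketch: simulate velocities by mapping two-digit random numbers 00-99
    # onto the ten equally likely velocities of Table 1.4 (10 numbers per cell).
    velocities = [12, 36, 60, 84, 108, 132, 156, 180, 204, 228]  # cm/s
    random_numbers = [70, 46, 48, 57, 21, 51, 71, 55, 86, 26]
    simulated = [velocities[r // 10] for r in random_numbers]
    print(simulated)  # [180, 108, 108, 132, 60, 132, 180, 132, 204, 60]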
The motion of free falling objects in vacuum from a moderate height at sea level on the surface of the earth is summarized in Table 1.6.
There were three methods of obtaining the velocity and distance of the free falling object in vacuum at sea level on the surface of the earth. The results were near to each other. Among the three methods, the exact result was the mathematical model velocity v(t) = gt m/s and the distance x = (1/2)gt² m, where g = 9.81 m/s², for the motion of the free falling object in vacuum at sea level on the surface of the earth. Differentiating the distance equation x = (1/2)gt² twice with respect to time t, the acceleration is a = d²x/dt² = g m/s².
Galileo’s remarkable observation is that all free falling objects fall at the
same rate of acceleration in a vacuum regardless of their mass.
2. PROBABILITY DISTRIBUTION MODELS
2.1 Introduction
Scientific techniques are inevitable when researchers have to deal with historical or experimental data. The science of Statistics provides a systematic approach to making decisions which aims to resolve real life problems. It originated more than 2000 years ago, but it was recognized as a separate discipline in India only from 1940. From then till now, Statistics has evolved as a versatile, powerful and indispensable instrument for investigation in all fields of real life problems. It provides a wide variety of analytical tools. We have reached a stage where no empirical science can afford to ignore the science of Statistics, since the diagnosis of pattern recognition can be achieved through the science of Statistics.
In India, during the period of Chandra Gupta Maurya, there was an efficient system of collecting official and administrative Statistics. During Akbar's reign (1556 - 1605 A.D.) people maintained good records of land and agricultural Statistics. Statistical surveys were also conducted during his reign. Sir Ronald A. Fisher, known as the father of Statistics, placed Statistics on a very sound footing by applying it to various diversified fields. His contributions in Statistics led to a very responsible position of Statistics among the sciences.
Professor P. C. Mahalanobis is the founder of Statistics in India. He was a Physicist by training, a Statistician by instinct and an Economist by conviction. The Government of India observes 29th June, the birthday of Professor Prasanta Chandra Mahalanobis, as National Statistics Day. Professor C. R. Rao is an Indian legend whose career spans the history of modern Statistics. He is considered by many to be the greatest living Statistician in the world today.
There are many definitions of the term Statistics. Some authors have defined Statistics as statistical data (plural sense) and others as statistical methods (singular sense).
Yule and Kendall define Statistics as quantitative data affected to a marked extent by a multiplicity of causes. Their definition points out the following characteristics:
The best definition of Statistics is given by Croxton and Cowden. They define Statistics as the science which deals with the collection, analysis and interpretation of numerical data. This definition points out the scientific ways of:
• Collection • Analysis
• Presentation • Interpretation
Collection of data is one of the important tasks in finding a solution for real life problems. Even if the statistical pattern of the real life problem is valid, if the data are inaccurately collected, inappropriately analyzed or not representative of the real life problem, then the data will be misleading when used for decision making.
One can learn data collection from actual experience. The following suggestions may enhance and facilitate data collection and analysis.
(i) Before collecting data, planning is very important. It could commence with a practice of pre-observation. Try to collect the data while pre-observing. Forms for recording the data are devised for the purpose. It is very likely that these forms will have to be modified several times before the actual data collection begins. Watch for unusual situations or circumstances and consider how they will be handled. Planning is very important even if the data are collected automatically. After collecting the data, find out whether the collected data are appropriate or not.
(ii) If the data being collected are adequate to diagnose the statistical distribution, then determine the apt distribution. If the data being used are useless for diagnosing the statistical distribution, then there is no need to collect superfluous data.
(iii) Try to combine homogeneous data sets. Check the data for homogeneity in successive time periods and, during the same time period, on successive intervals of time.
(v) One may use scatter diagram which indicates the relationship between the two
variables of interest.
(vi) Consider the possibility that a sequence of observations which appear to be independent
may possess autocorrelation. Autocorrelation may exist in successive time periods.
The methods for selecting families of distributions are possible only if the statistical data are available. The specific distribution within a family is specified by estimating its parameters. Parameter estimation of a distribution leads to the theory of estimation.
The formation of a frequency distribution or histogram is useful in guessing the shape of a distribution. Hines and Montgomery state that the number of class intervals should be chosen approximately equal to the square root of the sample size. If the intervals are too wide, the histogram will be coarse or blocky, and its shape and other details will not smooth the data. So one has to allow the interval sizes to change until a good choice is found. The histogram for continuous data corresponds to the probability density function of a theoretical distribution. If continuous, a line drawn through the centre point of each class interval frequency should result in a shape like that of the probability density function (pdf) (see Figure 2.2).
A histogram for discrete data, where there are a large number of data points, should have a cell for each value in the range of the data. However, if there are only a few data points, it may be necessary to combine adjacent cells to eliminate the ragged appearance of the histogram. If a histogram is associated with discrete data, it should look like a probability mass function (pmf) (see Figure 2.1).
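As a sketch of the square-root guideline, the sixteen failure times of Table 2.5 later in this chapter can be binned as follows:

    # Sketch: number of class intervals ~ sqrt(sample size), then bin the data.
    import math
    data = [19, 12, 16, 1, 15, 5, 10, 1, 46, 7, 33, 25, 4, 9, 1, 10]  # Table 2.5
    k = round(math.sqrt(len(data)))              # 4 intervals for n = 16
    width = (max(data) - min(data)) / k          # common interval width
    counts = [0] * k
    for x in data:
        idx = min(int((x - min(data)) / width), k - 1)  # clamp max into last cell
        counts[idx] += 1
    print(k, width, counts)                      # 4 11.25 [10, 3, 2, 1]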
The set of all possible outcomes of a random experiment is known as the Sample Space S. A single point in the sample space is an elementary event or simple event. A set function defined on the sample space is called a probability measure if
Suppose a coin is tossed twice, one by one successively or both at a time, under uniform conditions; then to each event of the sample space one assigns 0 to the outcome of getting two tails, 1 to the outcome of getting one head and 2 to the outcome of getting two heads. Let X denote the number of heads in tossing the two coins; then a real valued function X(ω) is induced on the sample space S = {(T, H) × (T, H)} = {TT, TH, HT, HH} as a random variable, and ω₁ = {TT}, ω₂ = {TH} or {HT}, ω₃ = {HH} are the simple events. The possible values of the random variable X(ω) are X(ω₁) = 0, X(ω₂) = 1 and X(ω₃) = 2. Table 2.1 shows the associated values of the random variable and their probabilities.
Simple Event ω                 ω₁     ω₂     ω₃
Number of heads X(ω) = x       0      1      2
P{X(ω) = x} = C(2, x)(1/2)²    1/4    1/2    1/4
Thus a random variable X is a finite real valued function defined on the sample space S whose inverse image is an event, i.e., X⁻¹(B) = {ω : X(ω) ∈ B} ∈ S ∀ B ∈ 𝔅, where B is a Borel set and 𝔅 is the Borel field generated by the class of all semi-closed intervals of ℜ. Here B₁ = {0} = (−∞, 0], B₂ = {1} = (0, 1] and B₃ = {2} = (1, 2]. Further X⁻¹(B₁) = ω₁, X⁻¹(B₂) = ω₂ and X⁻¹(B₃) = ω₃ are the events in the sample space S.
Discrete random variables are used to describe the random phenomenon in which
only integer values can occur. The following are some important distributions.
From the above assumptions, in a production process let X denote the quality status of a produced item; then X follows a Bernoulli random variable.
It is the probability that the event {X = x} occurs when there are x failures followed by a success.
A couple decides to have children until they have a male child. If the probability of having a male child in their family is p, they want to know how many children to expect before the first male child is born. Let X denote the number of children of the couple. The probability that there are x female children preceding the birth of the first male child is given by a geometric random variable.
Consider two brands A and B. Each individual in the population prefers brand A to brand B with probability θ₁, prefers B to A with probability θ₂, and is indifferent between brands A and B with probability θ₃ = 1 − θ₁ − θ₂. In a random sample of n individuals, X₁ prefer brand A, X₂ prefer brand B and X₃ prefer some brand other than A and B. Then the three random variables follow a Trinomial distribution, i.e.,
For example, four balls are drawn one at a time, at random and without replacement, from 8 balls in a box, 3 black and 5 red; one asks for the probability that the third ball drawn is black, i.e.,
Sn = X1 + X2 + · · · + Xn
where M is the number of ways of selecting the ith position with an object coded 1 and (N − 1)(N − 2) · · · (N − n + 1) is the number of ways of selecting the remaining (n − 1) places in the sequence from the (N − 1) remaining objects. It does not matter whether the number of successes among the n objects is obtained by drawing one at a time at random or by simultaneously drawing n at random. The probability function of Sn is
P{Sn = k} = C(M, k) C(N − M, n − k)/C(N, n),  k = 0, 1, 2, · · · , min(n, M)
          = 0 otherwise
The random variable Sn with the above probability function is said to have a Hypergeometric distribution. The mean of the random variable Sn is easily obtained from the representation of a Hypergeometric variable as a sum of Bernoulli trials. That is,
E[Sn] = 1 × P{X₁ = 1} + 0 × P{X₁ = 0} + · · · + 1 × P{Xn = 1} + 0 × P{Xn = 0}
      = M/N + · · · + M/N = nM/N
Variance = V[Sn] = n (M/N)((N − M)/N)((N − n)/(N − 1)) if N ∈ I₊   (2.1)
The probability at each trial that the object drawn is of the type of which there are initially M is p = M/N; then
V[Sn] = npq (N − n)/(N − 1) if N ∈ I₊   (2.2)
The variance (2.2) differs from the Binomial variance npq by the extra factor (N − n)/(N − 1). Thus V[Sn] = npq (N − n)/(N − 1) in the no-replacement case and V[Sn] = npq in the replacement case, for fixed p and fixed n, since the factor (N − n)/(N − 1) → 1 as N becomes infinitely
large.
After correcting 50 pages of a book, the proof readers find that there are, on the average, 2 errors per 5 pages. One would like to know the number of pages with 0, 1, 2, 3, · · · errors in 10000 pages of the first print of the book. Let X denote the number of errors per page; then the random variable X follows the Poisson distribution with parameter θ = 2/5 = 0.4.
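A quick sketch of the resulting expected page counts, 10000 × e^{−0.4}(0.4)ˣ/x!:

    # Sketch: expected number of pages carrying x errors among 10000 pages,
    # when errors per page follow a Poisson distribution with theta = 0.4.
    import math
    theta, pages = 0.4, 10000
    for x in range(5):
        p = math.exp(-theta) * theta ** x / math.factorial(x)  # Poisson pmf
        print(f"{x} errors: about {pages * p:.0f} pages")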
If the random variable X follows a power series distribution, then its pmf is
Pθ{X = x} = aₓθˣ/f(θ),  x ∈ S; aₓ ≥ 0, θ > 0
          = 0 otherwise
Particular Cases:
(i) Binomial distribution
Let θ = p/(1 − p), f(θ) = (1 + θ)ⁿ and S = {0, 1, 2, 3, · · · , n}, a set of non-negative integers. Then
f(θ) = Σ_{x∈S} aₓθˣ
(1 + θ)ⁿ = Σ_{x=0}^{n} aₓθˣ
⇒ aₓ = C(n, x)
Pₚ{X = x} = C(n, x)(p/(1 − p))ˣ / [1 + p/(1 − p)]ⁿ
          = C(n, x) pˣ qⁿ⁻ˣ,  x = 0, 1, 2, · · · , n
          = 0 otherwise
(ii) Negative binomial distribution
Let θ = p/(1 + p), f(θ) = [1 − θ]⁻ⁿ and S = {0, 1, 2, · · ·}, so that aₓ = C(n + x − 1, x). Then
P{X = x} = C(n + x − 1, x)(p/(1 + p))ˣ / [1 − p/(1 + p)]⁻ⁿ
         = C(n + x − 1, x)(p/(1 + p))ˣ (1 + p)⁻ⁿ
         = C(n + x − 1, x) pˣ (1 + p)⁻⁽ⁿ⁺ˣ⁾
         = C(−n, x)(−p)ˣ (1 + p)⁻⁽ⁿ⁺ˣ⁾,  x = 0, 1, 2, · · ·
P{X ≥ 15 | X is a multiple of 3} = P{X ≥ 15 ∩ X is a multiple of 3} / P{X is a multiple of 3}
P{X is a multiple of 3} = P{X = 3 or 6 or 9 or · · ·}
  = 1/2³ + 1/2⁶ + 1/2⁹ + · · · = 1/7.
P{X ≥ 15 ∩ X is a multiple of 3} = P{X = 15 or 18 or 21 or · · ·}
  = 1/2¹⁵ + 1/2¹⁸ + 1/2²¹ + · · · = (1/2¹⁵)[1 + 1/2³ + 1/2⁶ + · · ·] = (1/2¹⁵) × (8/7) = (1/2¹²) × (1/7).
Therefore P{X ≥ 15, a multiple of 3 | X is a multiple of 3} = 1/2¹².
Problem 2.2 In 100 sets of ten tosses of an unbiased coin, in how many cases do you expect to get 7 heads and 3 tails?
Solution: There are ten tosses of an unbiased coin, n = 10, P(H) = 1/2 and P(T) = 1/2. Let X denote the number of heads in tossing a coin 10 times; then X ∼ B(10, 1/2).
P{X = 7} = C(10, 7)(1/2)¹⁰ = 120/1024 = 0.1172.
The expected number of cases with 7 heads and 3 tails in 100 sets is 100 × 0.1172 ≈ 12.
X=x 0 1 2 3
p(x) 0.1 0.3 0.4 0.2
Problem 2.4 In 1800 trials of a draw of two fair dice, what is the expected number of times that the sum will be less than 5?
Solution: When tossing two fair dice, the total number of equally likely cases is 6 × 6 = 36. The cases favorable to the event that in a single throw of two dice the sum is less than 5 are 1+1, 1+2, 2+1, 1+3, 2+2 and 3+1, i.e., 6 cases. The probability of getting a sum less than 5 is 6/36 = 1/6. The expected number of times the total will be less than 5 in 1800 trials is (1/6) × 1800 = 300.
Problem 2.5 A and B throw one die for a stake of Rs. 66 which is to be won by the player who first throws 2. If A has the first throw, what are their respective expectations?
Solution: The chance of throwing 2 with one die is 1/6. Since A throws first, A wins with probability (1/6)[1 + (5/6)² + (5/6)⁴ + · · ·] = (1/6)/(1 − 25/36) = 6/11.
A's expectation = 66 × 6/11 = Rs. 36.
B's expectation = 66 × 5/11 = Rs. 30.
Problem 2.6 In a business, a person can make a profit of Rs. 3000 with probability 0.6 or suffer a loss of Rs. 1200 with probability 0.4. Determine his expectation.
Solution: Expectation of profit = Rs. 3000 × 0.6 = Rs. 1800.
Expectation of loss = Rs. 1200 × 0.4 = Rs. 480.
His total expectation in the business = Rs. 1800 − Rs. 480 = Rs. 1320.
Problem 2.7 A coin is tossed until a head appears. What is the expected number of tosses required?
Solution: If X denotes the number of tosses in throwing a coin repeatedly until the first head appears, then x = 1, 2, 3, · · ·.
E[X] = 1 × (1/2) + 2 × (1/2)² + 3 × (1/2)³ + · · ·
     = (1/2)[1 + 2 × (1/2) + 3 × (1/2)² + · · ·]
     = (1/2)[1 − 1/2]⁻² = (1/2) × 4 = 2.
Problem 2.8 A man draws 3 balls from a box containing 5 white and 7 black balls. He gets Rs. 10 for each white ball and Rs. 5 for each black ball. Find his expected amount.
Solution: Let X be the random variable which represents the total amount received, at Rs. 10 for each white ball and Rs. 5 for each black ball. Three balls are drawn out of 12 balls; the possible draws are WWW, WWB, WBB and BBB. The random variable X takes the value Rs. 30 for three whites, Rs. 25 for two whites and one black, Rs. 20 for one white and two blacks, and Rs. 15 for three blacks.
P{X = 30} = C(5, 3)/C(12, 3) = 1/22
P{X = 25} = C(5, 2)C(7, 1)/C(12, 3) = 7/22
P{X = 20} = C(5, 1)C(7, 2)/C(12, 3) = 21/44
P{X = 15} = C(7, 3)/C(12, 3) = 7/44
E[X] = 30 × 1/22 + 25 × 7/22 + 20 × 21/44 + 15 × 7/44 = 21.25.
Trial x      1      2             3              4              · · ·
Event        A      ĀA            ĀĀA            ĀĀĀA           · · ·
P{X = x}     1/6    (5/6)(1/6)    (5/6)²(1/6)    (5/6)³(1/6)    · · ·
The expected number of tosses required to get the face 2 is
E[X] = 1 × (1/6) + 2 × (5/6)(1/6) + 3 × (5/6)²(1/6) + · · ·
     = (1/6)[1 − 5/6]⁻² = 6.
Problem 2.10 A die is thrown repeatedly until the face 2 is obtained. Find the expected number of failures before the face 2 appears for the first time.
Solution: If X denotes the number of failures preceding the first success in throwing a die repeatedly, then x = 0, 1, 2, 3, · · ·. Let A be the event of getting the face 2 and Ā the complement of the event A.
First time face 2 appears   A              ĀA             ĀĀA            ĀĀĀA           · · ·
Number of failures x        0              1              2              3              · · ·
P{X = x}                    (5/6)⁰(1/6)    (5/6)¹(1/6)    (5/6)²(1/6)    (5/6)³(1/6)    · · ·
The expected number of failures before the face 2 appears for the first time is
E[X] = 0 × P{X = 0} + 1 × P{X = 1} + 2 × P{X = 2} + · · ·
     = 1 × (5/6)(1/6) + 2 × (5/6)²(1/6) + · · ·
     = (1/6)(5/6)[1 − 5/6]⁻² = 5.
Problem 2.11 Six dice are thrown 729 times. How many times do you expect at least three dice to show 5 or 6?
Solution: The probability of getting 5 or 6 when a die is thrown is 2/6 = 1/3. If X denotes the number of dice showing 5 or 6 in a throw of six dice, then X ∼ B(6, 1/3). The probability of getting at least 3 dice to show 5 or 6 is
P{X ≥ 3} = Σ_{x=3}^{6} C(6, x)(1/3)ˣ(2/3)⁶⁻ˣ = 233/729.
Six dice are thrown 729 times, so the expected number of times at least 3 dice show 5 or 6 is 729 × 233/729 = 233 times.
Problem 2.12 An irregular six-faced die is such that the probability that it gives 3 even numbers in 5 trials is twice the probability that it gives 2 even numbers in 5 trials. How many sets of exactly 5 trials can be expected to give no even numbers out of 250 sets?
Solution: Let p be the probability of getting an even number with the unfair die and q = 1 − p. The number of trials is n = 5. If X denotes the number of even numbers in 5 trials, then
P{X = 3} = 2 P{X = 2}
C(5, 3) p³q² = 2 × C(5, 2) p²q³
p = 2q = 2(1 − p)
p = 2/3 and q = 1/3
The probability of no even number in a set of 5 trials is q⁵ = (1/3)⁵ = 1/243, so the expected number of sets giving no even number is 250 × 1/243 = 1.028 ≈ 1.
Problem 2.13 The probability of a man hitting a target is 1/3. How many times must he fire so that the probability of hitting the target at least once is more than 90%?
Solution: Let X be the number of times the man hits the target. The probability of hitting the target at least once is more than 90%, i.e., P{X ≥ 1} > 0.9. Given that the probability of hitting the target is p = 1/3, suppose n is the number of shots fired; then X ∼ B(n, 1/3). Thus
P{X ≥ 1} > 0.9
1 − P{X = 0} > 0.9
P{X = 0} < 0.1
C(n, 0)(1/3)⁰(2/3)ⁿ < 0.1
(2/3)ⁿ < 0.1
n log(2/3) < log(0.1) ⇒ n > log(0.1)/log(2/3) = (−1.0000)/(−0.1760) = 5.679
Hence he must fire n = 6 times.
Problem 2.14 Suppose a boy hits a target with probability 1/2 in each trial. What is the probability that his 10th shot is his 5th hit?
Solution: The boy's 10th shot should give his 5th hit, so in his first 9 shots he has to hit the target 4 times; thus there are 4 successes in 9 trials. If X denotes the number of hits in the first 9 shots, then X ∼ B(9, 1/2). The probability of exactly 4 successes in 9 trials is
P{X = 4} = C(9, 4)(1/2)⁹
The probability that the 10th shot is the 5th hit is therefore
P{X = 4} × (1/2) = C(9, 4)(1/2)¹⁰ = 126/1024 = 0.123.
Table 2.2 The probability of two defectives while the last item examined is defective
Trial                                 1              2              3              · · ·
Items examined                        4              5              6              · · ·
Number of failures among
the items examined                    3              4              5              · · ·
Two defectives, the last
one being defective                   C(3,1)pq² × p  C(4,1)pq³ × p  C(5,1)pq⁴ × p  · · ·
P{X ≥ 4} = P{X = 4} + P{X = 5} + · · ·
         = C(3, 1)p²q² + C(4, 1)p²q³ + C(5, 1)p²q⁴ + · · ·
         = p²[3q² + 4q³ + 5q⁴ + · · ·]
         = p²[(1 + 2q + 3q² + · · ·) − 1 − 2q]
         = p²[(1 − q)⁻² − 1 − 2q]
         = p²[p⁻² − 1 − 2q]
         = 1 − p² − 2p²q
Problem 2.16 If the probability that a child is a boy is 0.80, find the expected number of boys in a family with 5 children, given that there is at least one boy.
Solution: Let the probability of a boy child in a family be p, 0 < p < 1, and q = 1 − p. If X denotes the number of boys in a family with n children, then X ∼ B(n, p). The probability of exactly x boys in the family, given that there is at least one boy, is
P{X = x | X ≥ 1} = P{X = x ∩ X ≥ 1}/P{X ≥ 1}
                 = P{X = x}/P{X ≥ 1}
                 = C(n, x)pˣqⁿ⁻ˣ/[1 − P{X = 0}]
                 = C(n, x)pˣqⁿ⁻ˣ/(1 − qⁿ)
The expected number of boys in the family, given that there is at least one boy, is
Σ_{x=1}^{n} x P{X = x | X ≥ 1} = Σ_{x=1}^{n} x C(n, x)pˣqⁿ⁻ˣ/(1 − qⁿ)
  = Σ_{x=1}^{n} x n! pˣqⁿ⁻ˣ/[x!(n − x)!(1 − qⁿ)]
  = [np/(1 − qⁿ)] Σ_{x=1}^{n} (n − 1)! p^{x−1} q^{(n−1)−(x−1)}/[(x − 1)!(n − x)!]
  = [np/(1 − qⁿ)](p + q)ⁿ⁻¹,  since Σ_{x=1}^{n} C(n − 1, x − 1) p^{x−1} qⁿ⁻ˣ = (p + q)ⁿ⁻¹
  = np/(1 − qⁿ)
  = 5 × 0.8/[1 − (0.2)⁵]  where p = 0.8 and n = 5
  = 4/0.99968 = 4.0013 ≈ 4.
The probability that the 5 component system functions is
P{X ≥ 3} = P{X = 3} + P{X = 4} + P{X = 5}
         = C(5, 3)p³q² + C(5, 4)p⁴q + p⁵
         = 10p³q² + 5p⁴q + p⁵
and the probability that the 3 component system functions is
P{X ≥ 2} = P{X = 2} + P{X = 3}
         = C(3, 2)p²q + p³
         = 3p²q + p³
The 5 component system will function more effectively than the 3 component system if
10p³q² + 5p⁴q + p⁵ − 3p²q − p³ = 3p²(p − 1)²(2p − 1) ≥ 0 ⇒ 2p − 1 ≥ 0, i.e., p ≥ 1/2.
where λ is the average number of calls in every 9 minutes. The average number of calls arriving at a telephone booth in one minute is 2/3, so the average number of calls arriving in every 9 minutes is 9 × 2/3 = 6, i.e., λ = 6. Thus
P{X ≥ 6} = 1 − P{X ≤ 5} = 1 − Σ_{x=0}^{5} e⁻⁶6ˣ/x! = 1 − 0.4457 = 0.5543.
Problem 2.19 The number of grasshoppers on a broad bean leaf follows a Poisson probability model with mean λ = 2. A plant inspector, however, records the number of grasshoppers on a leaf only when at least one grasshopper is present.
P{X = x | X ≥ 1} = P{X = x ∩ X ≥ 1}/P{X ≥ 1}
                 = P{X = x}/[1 − P{X = 0}]
                 = e⁻λλˣ/x! / (1 − e⁻λ),  x = 1, 2, 3, · · ·
The probability of one or two grasshoppers on a leaf, given that at least one grasshopper is present, is
P{X = 1 or X = 2 | X ≥ 1} = P{X = 1 | X ≥ 1} + P{X = 2 | X ≥ 1}
  = e⁻λλ/(1 − e⁻λ) + e⁻λλ²/[(1 − e⁻λ)2!]
  = [e⁻λ/(1 − e⁻λ)][λ + λ²/2]
  = 4e⁻²/(1 − e⁻²) = 0.6260  where λ = 2.
The expected number of grasshoppers recorded per leaf, given that at least one grasshopper is present, is
Σ_{x=1}^{∞} x P{X = x | X ≥ 1} = Σ_{x=1}^{∞} x e⁻λλˣ/[(1 − e⁻λ)x!]
  = [λe⁻λ/(1 − e⁻λ)] Σ_{x=1}^{∞} λ^{x−1}/(x − 1)!
  = λe⁻λeλ/(1 − e⁻λ) = λ/(1 − e⁻λ).
For λ = 2 this is 2/(1 − e⁻²) = 2.3130 ≈ 2 or 3.
For a continuous random variable X, P{X = x} = 0 ∀ x in that interval.
Problem 2.20 Find the mean and variance of the distribution that has the cumulative distribution function
F(x) = 0,    x < 10
     = 1/4,  10 ≤ x < 15
     = 3/4,  15 ≤ x < 20
     = 1,    20 ≤ x
Solution: The distribution has jumps at 10, 15 and 20:
X = x       10     15     20
P{X = x}    1/4    1/2    1/4
Mean = E[X] = 10 × 1/4 + 15 × 1/2 + 20 × 1/4 = 2.5 + 7.5 + 5 = 15
E[X²] = 10² × 1/4 + 15² × 1/2 + 20² × 1/4 = 950/4
V[X] = E[X²] − [E(X)]² = 950/4 − 225 = 50/4.
P(X = 1) = F(1) − F(1−)
         = 1/2 − lim_{x→1⁻} x²/4
         = 1/2 − 1/4 = 1/4
The pdf of X is
f(x) = x/2,  0 < x < 1
     = 1/4,  x = 1
     = 1/6,  x = 2
     = 1/3,  2 < x < 3
It satisfies ∫₀¹ (x/2) dx + P{X = 1} + ∫₂³ (1/3) dx + P{X = 2} = 1/4 + 1/4 + 1/3 + 1/6 = 1.
Mean E[X] = ∫ x f(x) dx
          = ∫₀¹ x (x/2) dx + 1 × P(X = 1) + ∫₂³ x (1/3) dx + 2 × P(X = 2)
          = 1/6 + 1 × 1/4 + 5/6 + 2 × 1/6
          = 19/12
(i) Uniform probability distribution model
A random variable X is uniformly distributed on an interval [a, b] if its pdf is given by
p_{a,b}(x) = 1/(b − a),  a ≤ x ≤ b
           = 0 otherwise
Note that P{x₁ < X < x₂} = F(x₂) − F(x₁) = (x₂ − x₁)/(b − a) is proportional to the length of the interval for all x₁ and x₂ satisfying a ≤ x₁ ≤ x₂ ≤ b. If a random phenomenon has complete unpredictability, then it can be described by the Uniform probability distribution model.
Problem 2.22 A passenger arrives at a bus stop at 4 PM, knowing that the bus will arrive at some time uniformly distributed between 4 PM and 5 PM. What is the probability that he will have to wait longer than 10 minutes? If at 4.30 PM the bus has not yet arrived, what is the probability that he will have to wait at least 10 additional minutes?
Solution: Let X be the waiting time in minutes of the passenger. The probability density function of the random variable X is
f(x) = 1/60,  0 < x < 60
     = 0 otherwise
P{X > 10} = ∫₁₀⁶⁰ (1/60) dx = 5/6.
P{X > 40 | X > 30} = P{X > 40}/P{X > 30} = [∫₄₀⁶⁰ (1/60) dx]/[∫₃₀⁶⁰ (1/60) dx] = 2/3.
Problem 2.23 The random variables a, b are independently and uniformly distributed in the intervals (0, 6) and (0, 9) respectively. Find the probability that the roots of the equation x² − ax + b = 0 are real.
Solution: The equation x² − ax + b = 0 has real roots if ∆ = a² − 4b ≥ 0, with a ∼ U(0, 6) and b ∼ U(0, 9), i.e.,
f(a) = 1/6,  0 < a < 6;  0 otherwise
f(b) = 1/9,  0 < b < 9;  0 otherwise
The probability that the roots of the equation x² − ax + b = 0 are real is
P{∆ ≥ 0} = P{a² − 4b ≥ 0}
         = ∫∫ f(a)f(b) da db subject to a² − 4b ≥ 0
         = ∫₀⁶ [∫₀^{a²/4} f(b) db] f(a) da,  0 < a < 6 and 0 < b ≤ a²/4
         = (1/54) ∫₀⁶ (a²/4) da = 1/3.
Problem 2.24 In a college of 100 students, the probability that any student requires a copy of a particular book from the library on any day is 0.05. How many copies of the book should be kept in the library so that the probability may be greater than 0.95 that none of the students requiring a copy from the library has to come back disappointed? Assume the sample is large.
Solution: Given the sample size n = 100 and the probability that a student requires the book from the library is 0.05, choose the probability of success p = 0.05 and q = 0.95. Let X be the number of students requiring the book; then X ∼ B(n, p). Mean µ = np = 100 × 0.05 = 5 and variance σ² = npq = 100 × 0.05 × 0.95 = 4.75, so σ = 2.18. Using the Normal approximation, X ∼ N(µ, σ²). Let x be the required number of copies, which satisfies
P{Z ≤ (x − 5)/2.18} > 0.95.
From the Normal distribution table, the upper ordinate corresponding to the area 0.45 is 1.65, i.e., (x − 5)/2.18 > 1.65 ⇒ x > 5 + 2.18 × 1.65 = 8.597 ≈ 9.
Hence the college library should keep at least 9 copies of the book.
Problem 2.25 In an examination, it is laid down that a student passes if he secures 40 per cent or more marks. He is placed in the first, second or third division according as he secures 60% or more marks, between 50% and 60% marks, or between 40% and 50% marks respectively. He gets distinction in case he secures 80% or more marks. It is noticed from the results that 20% of the students failed the examination, whereas 5% of them obtained distinction. Calculate
Problem 2.27 The fuel per cent specification in a rocket follows a Normal distribution with mean µ = 30 and variance σ² = 25, though the specification of the fuel is that it should lie between 25 and 35. The manufacturer will get a net profit of Rs. 100 per liter of fuel if the fuel specification lies between 25 and 35, and Rs. 40 if the fuel specification lies between 20 and 25 or between 35 and 40. Also, if the fuel specification lies below 20 or above 40, the manufacturer incurs a loss of Rs. 50 per liter. Find the expected profit of the manufacturer. If he wants to increase his expected profit by 50% by raising the net profit on the category of fuel that meets the specification, what should be the new profit per liter of fuel in this category?
Solution: Let X be the fuel per cent specification in the rocket, i.e., X ∼ N(µ, σ²). The probability that the fuel specification lies between 25 and 35 is
P{25 < X < 35} = P{(25 − 30)/5 < (X − 30)/5 < (35 − 30)/5}
              = P{−1 < Z < 1}  where Z = (X − 30)/5 ∼ N(0, 1)
              = 2 × 0.3413 = 0.6826.
P{20 < X < 25} = P{(20 − 30)/5 < (X − 30)/5 < (25 − 30)/5}
              = P{−2 < Z < −1} = P{Z < −1} − P{Z < −2}
The expected profit per liter of the manufacturer = Rs. 72.38. Let the revised net profit per liter of the first category fuel be k. Then the expected revised profit per liter is Rs. [k × 0.6826 + 40 × 0.272 − 50 × 0.0137] = Rs. 0.6826k + Rs. 10.20.
50% of the expected profit = Rs. 72.38 × 0.5 = Rs. 36.19, so the manufacturer's expected revised profit per liter is Rs. 72.38 + Rs. 36.19 = Rs. 108.57.
Therefore 0.6826k + 10.20 = 108.57 ⇒ k = (108.57 − 10.20)/0.6826 = Rs. 144.11. The revised net profit per liter of the first category fuel is Rs. 144.11.
The value of the intercept on the vertical axis is always equal to the value of θ. Note that all the pdf's eventually intersect, since the Exponential distribution has its mode at the origin. The mean and standard deviation are equal in the Exponential distribution. In a random phenomenon, the time between independent events which have the memoryless property may appropriately follow an Exponential random variable. For example, the times between arrivals of a large number of customers who act independently of each other may fit the data adequately to an Exponential distribution.
Problem 2.28 The length of a shower on a tropical island during the rainy season has an Exponential distribution with parameter 2, time being measured in minutes. What is the probability that a shower will last more than 3 minutes? If a shower has already lasted for 2 minutes, what is the probability that it will last one more minute?
Solution: Assume that X denotes the length of a shower in the rainy season in minutes. Given that X follows the Exponential probability model with parameter λ = 2 per minute, the probability that a shower will last more than 3 minutes is
P{X > 3} = ∫₃^∞ 2e⁻²ˣ dx = e⁻⁶ = 0.0025.
The probability that a shower which has already lasted 2 minutes will last at least one more minute is
P{X > 3 | X > 2} = P{X > 3}/P{X > 2} = e⁻⁶/e⁻⁴ = e⁻² = 0.1353.
The expected failure time is E[X] = 1/λ = 5 ⇒ λ = 1/5 per hour.
The probability that a component is still functioning at the end of 10 hours or after, counted as a success, is
p = P{X ≥ 10} = ∫₁₀^∞ λe^{−λx} dx = e^{−10λ} = e^{−10 × (1/5)} = e⁻² = 0.1353.
5 components are installed, and one half or more of the components functioning at the end of 10 hours or after means 3 or more than 3. Thus the probability of x successes in
The probability that a bulb still has life at the end of 1200 hours or after is
P{X ≥ 1200} = ∫₁₂₀₀^∞ λe^{−λx} dx = e^{−1200λ} = e^{−6/5} = 0.3012
since E[X] = 1/λ = 1000 ⇒ λ = 1/1000 per hour, and P{X < 1200} = 0.6988.
Similarly, if unit II is used, then the expected life time is 1500 hours, so E[X] = 1/λ = 1500 hours, λ = 1/1500 per hour, P{X ≥ 1200} = e^{−4/5} = 0.4493 and P{X < 1200} = 0.5507. The cost of production per bulb of unit II is
C₂ = 30 if x ≥ 1200
   = 65 if x < 1200
where β is called the shape parameter and θ is called the scale parameter. Σᵢ₌₁ⁿ Xᵢ ∼ G(n, 1/θ) if each Xᵢ ∼ exp(1/θ). The cumulative distribution function F(x) = P{X ≤ x} of the random variable X is given by
F(x) = 1 − ∫ₓ^∞ [βθ(βθt)^{β−1}/Γβ] e^{−βθt} dt,  x > 0
     = 0 otherwise
The probability that the power supply will be adequate on any given day is 1 − 0.0174 = 0.9826.
Problem 2.33 A company employs n sales persons. Its gross sales in thousands of rupees follow an Erlang probability model with scale parameter λ = 0.5 and shape parameter k = 8000√n. If the sales cost is Rs. 4000 per sales person, how many sales persons should the company employ to maximize the expected profit?
Solution: Let X be the gross sales in rupees; the company has n sales persons. The random variable follows the Erlang distribution with parameters λ and k. The expected gross sales E[X] = k/λ = 16000√n. Let T denote the total expected profit of the company; then T = total expected sales − total sales cost = 16000√n − 4000n. For maximum profit, dT/dn = 0 and d²T/dn² < 0:
dT/dn = 16000 × (1/2)n^{1/2 − 1} − 4000 = 8000/√n − 4000
dT/dn = 0 ⇒ 8000/√n − 4000 = 0, i.e., n = 4, and d²T/dn² < 0 at n = 4.
Hence the company should employ 4 sales persons.
The three parameters of the Weibull distribution are γ (−∞ < γ < ∞), the location parameter, α (α > 0), the scale parameter, and β (β > 0), the shape parameter. When γ = 0 the Weibull pdf becomes
p_{β,α}(x) = (β/α)(x/α)^{β−1} exp[−(x/α)^β],  x ≥ 0
           = 0 otherwise
The pdf of Y = 3X² is
g(y) = [1/(2√(3πy))] e^{−y/12},  0 < y < ∞
     = 0 otherwise
G′(y) dy = f(log y) d(log y) and f(x) = (1/√(2π)) e^{−x²/2}
g(y) dy = (1/√(2π)) e^{−(log y)²/2} × (1/y) dy,  0 < y < ∞
        = 0 otherwise
The times taken to install 100 machines are collected. The data are given in Table 2.3, which gives the number of machines together with the time taken. For example, 30 machines were installed between 0 and 1 hour, 25 between 1 and 2 hours, 20 between 2 and 3 hours, and 25 between 3 and 4 hours. Let X denote the time taken to install a machine.
Duration of Hours   Frequency   p(x)   F(x) = P{X ≤ x}
0 ≤ x ≤ 1           30          .30    .30
1 < x ≤ 2           25          .25    .55
2 < x ≤ 3           20          .20    .75
3 < x ≤ 4           25          .25    1.00
At the end of the day, the numbers of shipments on the loading docks of an export company are observed to be 0, 1, 2, 3, 4 and 5 with frequencies 23, 15, 12, 10, 25 and 15 respectively. Let X be the number of shipments on the loading docks of the company at the end of the day. Then X is a discrete random variable which takes the values 0, 1, 2, 3, 4 and 5 with the distribution given in Table 2.4. Figure 2.1 is the histogram of shipments on the loading docks of the company.
Number of shipments x   Frequency   P{X = x}   F(x) = P{X ≤ x}
0                       23          .23        .23
1                       15          .15        .38
2                       12          .12        .50
3                       10          .10        .60
4                       25          .25        .85
5                       15          .15        1.00
Figure 2.1 Histogram of shipments (frequency vs. number of shipments)
Equipment
Number 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Time
between
failures 19 12 16 1 15 5 10 1 46 7 33 25 4 9 1 10
For the sake of simplicity in processing the data, one can set up the ordered set as given below:
Equipment
Number 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Time
between
failures 1 1 1 4 5 7 9 10 10 12 15 16 19 25 33 46
On this basis, one may construct a histogram to judge the pattern of the data in Table 2.6. An approximate value of the class interval width can be determined from the formula
∆t = (maximum − minimum)/(1 + 3.3 log₁₀ N)
where the maximum and minimum are the values in the ordered set and N is the total number of items of the order statistics. In this case the maximum value is 46, the minimum value is 1 and N is 16. Thus ∆t = 45/(1 + 3.3 log₁₀ 16) = 9.05 ≈ 10 = width of the class interval.
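The same width computation in Python, using the failure times of Table 2.5:

    # Sketch: class interval width = (max - min) / (1 + 3.3 * log10(N)).
    import math
    times = [19, 12, 16, 1, 15, 5, 10, 1, 46, 7, 33, 25, 4, 9, 1, 10]
    width = (max(times) - min(times)) / (1 + 3.3 * math.log10(len(times)))
    print(f"width = {width:.2f}")  # 9.05, rounded up to 10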
Time
interval 0 - 10 10 - 20 20 - 30 30 - 40 40 - 50
Number of
Equipment 9 4 1 1 1
Histogram is drawn based on the frequency distribution in Table 2.7 and is given
in Figure 2.2.
Figure 2.2 Histogram of time to failures (number of equipments vs. time, 0 - 50 hours)
The histogram reveals that the distribution could be Negative Exponential or the right portion of the Normal distribution. Assume the time to failure follows an Exponential distribution of the form
pθ(x) = θe^{−θx},  θ > 0, x > 0
      = 0 otherwise
How far the assumption is valid has to be verified. The validity of the assumption is tested by the χ² test of goodness of fit.
Interval   pᵢ       Expected frequency E   Observed frequency O
0 - 10     0.5262   8.41 ≈ 8               9
10 - 20    0.2493   3.98 ≈ 4               4
20 - 30    0.1181   1.886 ≈ 2              1
30 - 40    0.0559   0.8944 ≈ 1             1
40 - 50    0.0265   0.454 ≈ 1              1
where pᵢ = ∫_{xᵢ}^{xᵢ₊₁} θe^{−θx} dx = e^{−θxᵢ} − e^{−θxᵢ₊₁}, xᵢ = 0, 10, 20, · · · , 50. If the cell frequencies are less than 5, they can be pooled to make them 5 or more. One then gets two classes only, i.e., the expected frequencies are equal to 8 each and the corresponding observed frequencies are 9 and 7 respectively. The χ² test of goodness of fit fails to test the validity of the assumption that the sample data come from an Exponential distribution with parameter θ = 1/13.38 = 0.0747 = failure rate per unit hour, where the mean life time of the equipments is 214/16 = 13.38 hours. To test the validity of the assumption that the time to failure follows an Exponential distribution, consider the likelihood function of the cell frequencies o₁ = 9 and o₂ = 7:
L = [n!/(o₁!o₂!)](e₁/n)^{o₁}(e₂/n)^{o₂},  o₁ + o₂ = n
  = 0 otherwise
Under H₀ the likelihood function follows a Binomial probability law b(16, p) where p = e₁/n. To test the hypothesis H₀: the fit is the best one vs H₁: the fit is not the best one is equivalent to testing H₀: p ≤ 0.5 vs H₁: p > 0.5. The UMP level α = 0.05 test is given by
φ(x) = 1 if x > 11
     = 0.17 if x = 11
     = 0 otherwise
The observed value is 9, which is less than 11. There is no evidence to reject the hypothesis H₀. The data come from an Exponential distribution at the 5% level of significance.
Figure 2.3 Q - Q plot of the repairing times (yⱼ vs. xⱼ)
Note: The diagnosis of statistical distributions of real life problems is not exact; at best they represent reasonable approximations.
Under normal operating conditions it may take too long to obtain a reasonable amount of failure data under classical conditions. Accelerated random variable probability distributions are appropriate to study the time to failure of the experimental units. The probability distribution model is fitted to the accelerated failure times and then extrapolated to estimate the life distribution under experimental conditions. The accelerated life test optimum failure times are obtained by changing the stress of the accelerated random variable. The stresses are constant, progressive and step stress.
A simple step stress considers only two stresses in an experimental test. In a random experiment, a low stress is applied first; if the unit does not fail in a pre-specified time, the stress on it is raised and held for a specified time. The accelerated stress is repeatedly increased until the test unit fails or the censoring time is reached. For example, if the test is a simple step stress, then the cumulative exposure probability distribution model with stresses X₁ and X₂ and pre-specified time τ such that X₁ < X₂ is
F(x) = F₁(x),          0 ≤ x < τ
     = F₂(x − τ + s),  τ ≤ x < ∞        (2.5)
where F₁(x) is the cumulative distribution function of the failure time at stress X₁, τ is the time to change the stress and s is the solution of F₁(τ) = F₂(s). Under constant stress, an accelerated random variable has a Pareto distribution of the second kind, which appears as a mixture of the one parameter exponential distribution.
From equation (2.5), s = τ(θ₂/θ₁). The Lomax pdf of the simple step stress accelerated random variable is
f(x) = (λ/θ₁)(1 + x/θ₁)^{−(λ+1)},               0 ≤ x < τ
     = (λ/θ₂)(1 + (x − τ)/θ₂ + τ/θ₁)^{−(λ+1)},  τ ≤ x < ∞
     = 0 otherwise
where Fᵢ(x) is the cumulative distribution function of the failure time at the ith stress level, i = 1, 2, 3, · · · , k, τᵢ is the time of change from the ith to the (i + 1)th stress level and sᵢ₋₁ is an equivalent start time at the ith stress level which produces the same population cumulative fraction failing. Thus sᵢ₋₁ is the solution of Fᵢ(sᵢ₋₁) = Fᵢ₋₁(τᵢ₋₁ − τᵢ₋₂ + sᵢ₋₂).   (2.6)
From equation (2.6), the cumulative exposure model for a three step stress accelerated random variable is
F(x) = F₁(x),            0 ≤ x < τ₁
     = F₂(x − τ₁ + s₁),  τ₁ ≤ x < τ₂
     = F₃(x − τ₂ + s₂),  τ₂ ≤ x < ∞        (2.7)
     = 0 otherwise
where s₁ is the solution of F₂(s₁) = F₁(τ₁) and s₂ is the solution of F₃(s₂) = F₂(τ₂ − τ₁ + s₁). In the case of three step stress, the following assumptions are made for the accelerated random variable.
(i) Testing is done at stresses X₁, X₂ and X₃, where X₁ < X₂ < X₃.
(ii) The scale parameter is θᵢ at stress level i, i = 1, 2, 3.
(iii) The life times of the test units are independent and identically distributed.
(iv) The constant λ is independent of time and stress.
(v) All n test units are initially placed on the low stress X₁ and run until the pre-specified time τ₁, when the stress is changed to the higher stress X₂ for those remaining test units that have not failed. The test is continued until the pre-specified time τ₂, when the stress is
changed to X₃ and continued until all the remaining units fail. The Lomax cumulative exposure model for a three step stress accelerated random variable is
F(x) = 1 − [1 + x/θ₁]^{−λ},                                0 ≤ x < τ₁
     = 1 − [1 + (x − τ₁)/θ₂ + τ₁/θ₁]^{−λ},                 τ₁ ≤ x < τ₂
     = 1 − [1 + (x − τ₂)/θ₃ + (τ₂ − τ₁)/θ₂ + τ₁/θ₁]^{−λ},  τ₂ ≤ x < ∞
     = 0 otherwise
From equation (2.7), s₁ = τ₁(θ₂/θ₁) and s₂ = {τ₂ − τ₁ + τ₁(θ₂/θ₁)}(θ₃/θ₂). The pdf of the Lomax exposure model for a three step stress accelerated random variable is
f(x) = (λ/θ₁)[1 + x/θ₁]^{−(λ+1)},                                0 ≤ x < τ₁
     = (λ/θ₂)[1 + (x − τ₁)/θ₂ + τ₁/θ₁]^{−(λ+1)},                 τ₁ ≤ x < τ₂
     = (λ/θ₃)[1 + (x − τ₂)/θ₃ + (τ₂ − τ₁)/θ₂ + τ₁/θ₁]^{−(λ+1)},  τ₂ ≤ x < ∞
     = 0 otherwise
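A small Python sketch of this three step stress Lomax cumulative exposure model; the parameter values in the usage line are illustrative placeholders, not from the book.

    # Sketch: the three-step-stress Lomax cumulative exposure F(x) defined above.
    def lomax_step_cdf(x, lam, th1, th2, th3, tau1, tau2):
        if x < 0:
            return 0.0
        if x < tau1:    # first stress level
            base = 1 + x / th1
        elif x < tau2:  # second stress level
            base = 1 + (x - tau1) / th2 + tau1 / th1
        else:           # third stress level
            base = 1 + (x - tau2) / th3 + (tau2 - tau1) / th2 + tau1 / th1
        return 1 - base ** (-lam)

    # F is continuous at the change points tau1 and tau2 (placeholder parameters):
    print(lomax_step_cdf(9.99, 2, 10, 5, 2, 10, 20), lomax_step_cdf(10.0, 2, 10, 5, 2, 10, 20))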
Problems
2.1 The mean and variance of the number of defective items drawn randomly one by one with replacement from a lot are found to be 10 and 6 respectively. The distribution of the number of defective items is
2.2 Let X be the Poisson random variable with mean 3, then P {|X − 3| < 1} will be
2.3 Let U(1) , U(2) , · · · , U(n) be the order statistics of a random sample U1 , U2 , · · · , Un
of size n from the Uniform (0, 1) distribution. Then the conditional distribution
of U1 given U(n) = u(n) is given by:
2.4 A biased coin is tossed 4 times or until a head turns up, whichever occurs earlier. The distribution of the number of tails turning up is
2.5 If X and Y are independent Exponential random variables with the same mean λ, then the distribution of min(X, Y) is
(a) Exponential with mean λ/2
(b) Exponential with mean λ
(c) Not Exponential with mean λ
(d) Exponential with mean 2λ
Ans: (a)
2.6 The χ2 goodness of fit is based on the assumption that a character under study is
2.7 The exact distribution of the χ² goodness of fit statistic, when each experimental unit is classified into one of k categories in a random sample of size n, depends on
2.9 If X1 ∼ b(n1 , θ), X2 ∼ b(n2 , θ) and X1 , X2 are independent, then the sum of
variables X1 + X2 is distributed as
2.11 The skewness of a Binomial distribution with parameters n and p will be zero if:
2.13 (c) F_{n₂,n₁}(1 − α/2) = 1/F_{n₁,n₂}(α/2)
2.14 The distribution of which the moment generating function is not useful in finding
the moments is
2.18 If X₁, X₂, · · · , Xₙ are iid Geometric variables, then Σᵢ₌₁ⁿ Xᵢ follows
2.20 If a random experiment has only two mutually exclusive outcomes of a Bernoulli
trial, then the random variable leads to
2.21 A box contains N balls, M of which are white and N − M are red. If X denotes the number of white balls in a sample of n balls drawn with replacement, then X is a
2.22 The number of independent events that occur in a fixed amount of time may
follow
2.24 The given probability function p(x) = 2/3^{x+1} for x = 0, 1, 2, 3, · · · represents:
2.25 Dinesh receives 2, 2, 4 and 4 telephone calls on 4 randomly selected days. Assuming that the telephone calls follow a Poisson distribution, the estimate of telephone calls in 8 days is
(a) 12  (b) 3  (c) 24  (d) 25
Ans: (c)
2.26 The exact distribution of the χ² goodness of fit test, when each experimental unit is classified into one of two categories in a random sample of size n, depends on
2.27 pθ(x) = Σ_{k=0}^{∞} (−1)ᵏ C(k + x, k) θ^{x+k}/Γ(x + k + 1),  x = 0, 1, · · ·
           = 0 otherwise
It is known as
2.28 A fair coin is tossed repeatedly. Let X be the number of tails before the first head occurs. Let Y denote the number of trials observed between the occurrence of the first and second heads. Let X + Y = N. Then which of the following statements is true?
(a) X and Y are independent random variables with
P{X = k} = P{Y = k} = 2^{−(k+1)},  k = 0, 1, 2, · · ·
         = 0 otherwise
Ans: (a)
2.29 A fair coin is tossed repeatedly. Let X be the number of tails before the first head occurs. Let Y denote the number of trials observed between the occurrence of the first and second heads. Let X + Y = N. Then which of the following statements are true?
(a) Given N = n, the conditional distributions of X and Y are independent.
(b) Given N = n,
P{X = k} = 1/(n + 1),  k = 0, 1, 2, · · · , n
         = 0 otherwise
(c) Given N = n,
P{X = k} = 1/n,  k = 0, 1, 2, · · · , n
         = 0 otherwise
(d) Given N = n,
P{X = k} = 1/(k + 1),  k = 0, 1, 2, · · · , n
         = 0 otherwise
Ans: (b)
2.30 Suppose that (X, Y) has a joint distribution with the marginal distribution of X being N(0, 1) and E[Y | X = x] = x³ ∀ x ∈ ℜ. Then which of the following statements is true?
2.31 An urn has 3 red and 6 black balls. Balls are drawn at random one by one without replacement. The probability that the second red ball appears at the 5th draw is
(a) 1/9!  (b) 4!/9!  (c) 4 × 6!4!/9!  (d) 6!4!/9!
Ans: (c)
X
2.32 Suppose Y is a random vector such that the marginal distribution of X and
the marginal distribution of Y are the same and each is normally distributed
with mean 0 and variance 1. Then which of the following conditions imply
independence of X and Y ?.
1
(a) Cov(X, Y ) = 0 (c)P {X ≥ 0, Y ≤ 0} = 4
1
(b)P {X ≤ 0, Y ≤ 0} = 4 (d) αX + βY ∼ N (0, α2 + β 2 ) ∀ α and β Ans:(d)
X
2.33 Suppose Y is a random vector such that the marginal distribution of X and
the marginal distribution of Y are the same and each is normally distributed
with mean 0 and variance 1. Then which of the following conditions imply
1 − 12 (x2 +y 2 )
independence of X and Y ?. For all t and s ∈ < (a) f (x)(y) = 2π e =
f (x, y)
(b) E[eitX+itY ] = E[eitX ]E[eitY ]
(c) E[eitX+isY ] = E[eitX ]E[eisY ]
(d) f (x + y) = f (x)f (y) Ans: (a),(b),(c) and (d)
2.34 Consider a region R which is a triangle with vertices (0, 0), (0, θ), (θ, 0) where θ >
0 . A sample of size n is selected at random from this region R. Denote the sample
A.Santhakumaran 74
2.35 There are two boxes Box I contains 2 red balls and 4 green balls. Box II contains
4 red balls and 2 green balls. A box is selected at random and a ball is chosen
randomly from the selected box. If the ball turns out to be red. what is the
probability that Box I had been selected?.
1 2
(a) 2 (c) 3
1 1
(b) 3 (d) 6 Ans: (b)
2.36 For any three events A and B which of the following relations always holds?.
1
(a)P 2 (A ∩ B c ) + P 2 (A ∩ B) + P (Ac ) ≥ 3
1
(b)P 2 (A ∩ B) + P 2 (A ∩ B c ) + P 2 (Ac ) = 3
(c) P 2 (A ∩ B c ) + P 2 (A ∩ B) + P 2 (Ac ) = 1
(d) P 2 (A ∩ B c ) + P (A ∩ B) + P 2 (Ac ) = 1 Ans:(a)
2.37 Suppose customers arrive in a shop according to a Poisson process with rate 4
per hour. The shop open at 10 A M.If it is given that the second customer arrives
at 10.40 A M, what is the probability that no customer arrived before 10 A M ?.
1 1
(a) 4 (c) 2
1
(b) e−2 (d) e− 2 Ans:(a)
2.39 A and B play a game of tossing a fair coin. A starts the game by tossing the
coin twice, followed by A tossing the coin once and B tossing the coin twice and
this continues until a head turns up whoever gets the first head wins the game.
Then
2.42 Let X be a random variable with a certain non degenerate distribution. Then
identify the correct statements.
A.Santhakumaran 76
2.43 A random sample (without replacement) of sizen is drawn from a finite popu-
lation of size N (≥ 7). What is the probability that the 4th population unit is
included in the sample but the 6th population unit is not included in the sample?.
n(n−1) (N −n+1)
(a) N (N −1) (c) N (N −1)
n(N −n) n
(b) N (N −1) (d) N Ans:(b)
2.44 A fair die is thrown two times independently . Let X, Y be the outcomes of these
two throws and Z = X + Y . Then which of the following statement is true?
2.45 Suppose the random variable T follow an Exponential distribution with unit
mean. Which of the following statement is true?.
(a) The hazard function of T is a constant function
(b) The hazard function of T 2 is canstant function
(c) The hazard function of T 3 is identity function
(d) The hazard function of T is not constant Ans:(a)
2.46 X and Y are independent random variables each having the pdf is
1 12
−∞ < t < ∞
π 1+t
f (t) =
0
otherwise
X+Y
The density function of Z = 3 where −∞ < z < ∞ is
A.Santhakumaran 77
6 1 3 1
(a) π 4+9z 2 (c) π 1+9z 2
6 1 6 1
(b) π 5+8z 2 (d) π 9+9z 2 Ans: (c)
2.47 Let Nt denote the number of accidents up to time t. Assume that {Nt } is a
Poisson process with intensity 2. Given that there are exactly 5 accidents during
the time period [20, 30]. What is the conditional probability that there is exactly
one accident during the time period [15, 25]?.
15 −10 1
(a) 12 e (c) 5
105 −30
(b) 5! e (d) 20e−20 Ans:(a)
2.48 Suppose (X, Y ) follows a bivariate Normal distribution with E[X] = E[Y ] =
−y 2 /2 dy,
Rx
0, V [X] = V [Y ] = 2 and Cov[X, Y ] = −1 , if φ(x) = √1
2π −∞ e then
P {X − Y > 6} is
√
(a) φ(−1) (c) φ( 6)
(b) φ(−3) (d) φ(−6) Ans: (c)
2.50 Let X and Y be independent and identically distributed random variables such
1 1
that P {X = 0} = P {X = 1} = 2 and P {Y = 0} = P {Y = 1} = 2 Let
Z = X + Y and W = |X − Y |. Then which statement is not correct?.
A.Santhakumaran 78
2.51 Hundred tickets are marked 1, 2, 3 · · · 100 and are arranged at random. Four
tickets are picked from these tickets and are given to four persons A, B, C and
D. What is the probability that A gets the ticket with the largest value( among
A, B, C, D) and D gets the ticket with the smallest value( among A, B, CD) ?.
1 1
(a) 2 (c) 6
1 1
(b) 4 (d) 12 Ans: (d)
2.52 Suppose X and Y are independent and identically distributed random variables
and let Z = X + Y . Then the distribution of Z is in the same family as that of
X and Y , if X
2.54 Let (Ω, F, P ) be a probability space and let A be an event with P (A) > 0. In
which of the following cases does B define a probability measure on (Ω, F )
(a) B(D) = P (A ∪ D) ∀ D ∈ F
(b) B(D) = P (A ∩ D) ∀ D ∈ F
A.Santhakumaran 79
(c)Q(D) = P (D | A) ∀ D ∈ F
(d)
P (A | D)
if D ∈ F with P (D) > 0
B(D) =
0
if P (D) = 0
Ans:(c)
Pn 1
(a) {Zk > 0} > 0 (c) i=1 Zk → +∞ with probability 2
P∞ 1
(b) P {Zk > 0} < 0 (d) i=1 Zk → −∞ with probability 2 Ans:(a)
2.57 Let X denote the Exponential distribution with parameter λ > 0. Fix a > 0.
Define the random variable Y by Y = k if ka ≤ x < (k + 1)a , k = 0, 1, 2, 3 · · ·.
Which of the following statement is true?.
(a)P {4 < Y < 5} = 0
(b) Y ∼ an Exponential form distribution
(c) Y ∼ a Negative Exponential distribution with parameter aλ
(d) Y ∼ a Negative Exponential distribution with parameter a Ans: (b)
2.58 Let X and Y be random variables with joint probability density function
cxy o<x<y<1 c∈<
f (x, y) =
0
otherwise
1
(a) c = 8 (c) X and Y are independent
1
(b) c = 4 (d) P {X = Y } = 0 Ans:(c)
2.59 Let {Xn , n ≥ 1} be iid Uniform (−1, 2) random variables. Which of the following
1
Xi → 0 almost surely
P
statement is true ?. (a) n
Ans:(a)
2.60 Suppose X1 , X2 , X3 , · · · are iid random variables having common density function
Assume f (x) = f (−x) ∀ x ∈ <. Which of the following statement is correct?.
A.Santhakumaran 81
2.61 Let X and Y be iid Uniform (0, 1) random variables. Let Z = max(X, Y ) and
W = min(X, Y ). Then P {(Z − W ) > 21 } is
1 1
(a) 2 (c) 4
3 2
(b) 4 (d) 3 Ans:(c)
2.62 Two students are solving the same problem independently, if the probability
3
that the first one solves the problem is 5 and the probability that the second one
4
solves the problem is 5 . What is the probability that at least one of them solves
the problem?
16 21
(a) 25 (c) 25
18 23
(b) 25 (d) 25 Ans: (d)
2.63 A standard fair die is rolled until some face other than 5 or 6 turns up. Let X
denote the face value of the last roll and A = [ X even] and B = [ X is at most
2]. Then
1
(a) P (A ∩ B) = 0 (c) P (A ∩ B) = 4
1
(b) P (A ∩ B) = 6 (d) P (A ∩ B) = 13 . Ans:(a)
2.65 A box contains 40 numbered red balls and 60 numbered black balls. From the
box balls are drawn one by one at random without replacement till the balls are
drawn. The probability that the last ball drawn black is
1 3
(a) 100 (c) 5
1 2
(b) 60 (d) 5 Ans: (d)
2.67 Ten balls are put in 6 slots at random. Then the expected total number of balls
in the two extreme slots is
10 1
(a) 6 (c) 6
10 6
(b) 3 (d) 10 Ans:(b)
X−Y
2.68 Let X and Y be independent random variables and Z = 2 + 3. If X has
characteristic function ϕ and Y has characteristic function ψ, then Z has char-
acteristic function φ where
2.71 Let X and Y be two random variables with joint probability density function
1
if 0 ≤ x2 + y 2 ≤ 1
π
f (x, y) =
0
otherwise
1
(a) P {X > 0} = 2 (c) Cov(X, Y ) = 0
1
(b) E[Y ] = 0 (d) E[Y ] = 2 Ans:(b)
2.73 A sample random sample of size n will be drawn from a class of 125 students,
and the mean mathematics score of the sample will be computed. If the standard
error of the sample mean for with replacement sampling is twice as much as the
standard error of the sample mean for without replacement sampling, the value
of n is
A.Santhakumaran 84
(a) 32 (c) 79
(b) 63 (d) 94 Ans: (c)
2.74 Let F (t) , h(t) and m(t) be the life time distribution function, the hazard function
and the mean residual lifetime function respectively, defined [0, ∞). Assume that
F (t) is absolutely continuous which of the following statements is true?.
R∞
(a) 0 h(t)dt =1
R ∞
(1−F (y))du
(b) m(t) = t
1−F (t) for t > 0
(c) m(t) is strictly increasing in t if the life time distribution is Exponential with
mean λ > 0.
R∞
(d) 0 h(t)dt 6= 1 Ans(a)
2.75 Let F (t) , h(t) and m(t) be the life time distribution function, the hazard function
and the mean residual lifetime function respectively, defined [0, ∞). Assume that
F (t) is absolutely continuous which of the following statements is true?.
(a) h(t)m(t) = 1 ∀ t > 0 if the life time distribution is Exponential with mean
λ > 0. R∞
(1−F (y))du
(b) m(t) = t
1−F (t) for t > 0
(c) m(t) is strictly increasing in t if the life time distribution is Exponential with
mean λ > 0.
R∞
(d) 0 h(t)dt 6= 1 Ans:(a)
2.76 A parallel system consists of n identical components. The life times of the
components are independently identically distributed unifom random variables
with mean 30 hours and range 60 hours. If the expected life time of the system
is 50 hours, then the value of n
(a) 3 (c) 5
(b) 4 (d) 6 Ans:(b)
1 2
(a) 2 (c) 3
1 3
(b) 3 (d) 4 Ans:(c)
2.81 X and Y are independent Exponential random variables with mean 4 and 5
respectively. Which of the following statements is true?.
(a) X + Y is Exponential distribution with mean 9
(b) XY is Exponential distribution with mean 20
(c) max(X, Y ) is Exponential distribution
(d) min(X, Y ) is Exponential distribution Ans:(d)
2.82 There are five empty boxes. balls are placed independently one after another in
randomly selected boxes. The probability that the fourth ball is the first to be
A.Santhakumaran 86
(a) 54 ( 53 )2 (c) ( 35 )2
(b) ( 35 )3 (d) 54 ( 35 ) Ans:(d)
2.83 From the letters A, B, C, D, E and F are chosen at random with replacement.
What is the probability that either the word BAD or the CAD can be formed
the chosen letter?.
1 6
(a) 216 (c) 216
3 12
(b) 216 (d) 216 Ans:(d)
2.84 Let X be a random variable which is symmetric about 0. Let F be the cumulative
distribution function of X. Which of the following statements is always true?
For all x ∈ <
2.85 Let Xi ’s independent random variables such that Xi ’s are symmetric about 0
and V [Xi ] = 2i − 1, i ≥ 1. Then limn→∞ P {X1 + · · · + Xn > n log}
2.86 Let X1 and X2 be normal random variables with mean 0 and variance 1. Let
U1 and U2 be iid U (0, 1) random variables independent of X1 , X2 . Define Z =
X√
1 U1 +X2 U2
. Then
U12 +U22
2.87 Let X and Y be independent normal random variables with mean 0 and variance
1. Let the characteristic function of XY be denoted by φ. Then
1
(a) φ(2) = 2 (c) φ(t)φ( 1t ) = |t| ∀ t 6= 0
t2
(b) φ(t) is a even function (d) φ(t) = E[e− 2 ] Ans:(b)
2.88 Let X and Y be random variables with joint cumulative distribution F (x, y).
Then which sufficient for (x, y) ∈ <2 to be a point of community of F ?.
(a) P {X = x, Y = y} = 0
(b) Either P {X = x} = 0 or P {Y = y} = 0
(c) P {X = x} = 0 and P {Y = y} = 0
(d) P {X = x, Y ≤ y} =
6 0 and P {X ≤ x, Y = y} = 0 Ans:(c)
2.89 Let X and Y be random variables with joint cumulative distribution F (x, y).
Then which sufficient for (x, y) ∈ <2 to be a point of community of F ?.
(a) P {X = x, Y = y} = 0
(b) Either P {X = x} = 0 or P {Y = y} = 0
(c) P {X = x} =
6 0 and P {Y = y} = 0
(d) P {X = x, Y ≤ y} = 0 and P {X ≤ x, Y = y} = 0 Ans:(d)
1
Find the pdf of Y = X − 5. [ Ans: f (y) = 25 (y + 5) − 5 ≤ y ≤ 0 and
1
25 (5 − y) 0 ≤ x < 5]
2.95 If the random variable X is uniformly distributed over ( -1, 1). Find the density
function of Y = sin( πX 2
2 ). [ Ans:f (y) = 9 (y − 1) 1 < y < 4]
1 1
Find P {X ≤ 4}, P {−5 < X ≤ 4} and P {X = 4}.[Ans: 6, 6 and 0 ]
1
2.97 If X has the pdf f (x) = π, −π < x < π, find the pdf of Y = tan X. [ Ans:
1 1
f (y) = π 1+y 2 − ∞ < x < ∞]
2.99 Telephone calls are being placed a certain exchange at random times on the
average of 4 per minute. Assuming a Poisson law, determine the probability that
in a 15 seconds interval there are 3 or more calls. [ Ans: λ = 1, P {X ≥ 3} =
0.0803.]
2.100 Customer enter a waiting line at random at a rate of 4 per minute. Assuming
that the number enter the line in any given time interval has a Poisson distribu-
tion. Determine the probability that at least one customer enters the line in a
given half-minute interval. [ Ans: λ = 2, P {X ≥ 1} = 0.8647.]
3.1 Introduction
3.2 Estimators
where the expectation is respect to θ. The risk R[θ, δ(T )] is an average loss and it
is assumed that R[θ, δ(T )] < ∞ ∀ ∈ θ. Risk is a measure of accuracy an estimator.
A well defined class of estimators are unbiasedness or equivariance. For obtaining an
optimal estimator for the unknown parameter θ, the risk is minimum. It leads the
successes of estimation theory.
For example, a random variable X is assumed to follow a Normal distribution
with mean θ and variance σ 2 . The parameter space Ω = {(θ, σ); −∞ < θ < ∞, 0 <
σ 2 < ∞}. Suppose a random sample X1 , X2 , X3 , · · · , Xn is taken on X. Here a statistic
T = t(X) from the sample X1 , X2 , · · · , Xn which gives the best value for the parameter
θ. Particular value of the Statistic T = t(x) = x̄ based on the values x1 , x2 , · · · , xn
is called an estimate of θ. If the statistic T = X̄ is used to estimate the unknown
parameter θ, then the sample mean is called an estimator of θ. Thus an estimator is
a rule or a procedure to estimate the value of θ. The numerical value x̄ is called an
estimate of θ.
= 1 − 1 = 0 as n → ∞
P
Thus X̄ → θ as n → ∞. The sample mean X̄ of the normal population is a consistent
estimator of the population mean θ.
Remark 3.1 In general sample mean need not be a consistent estimator of the
population mean.
A.Santhakumaran 93
= 1 − Pθ {θ − < X̄ < θ + }
Z θ+
1 1
= 1− 2
dx̄
θ− π 1 + (x̄ − θ)
since X̄ ∼ Cauchy distribution with parameter θ
Z
1 1
= 1− 2
dz where x̄ − θ = z
− π 1 + z
1
= 1 − [tan−1 (z)]−
π
2
= 1 − tan−1 () since tan−1 (−θ) = − tan−1 (θ)
π
By Chebychev’s inequality
1
Pθ {|Tn − θ| > } ≤ Eθ [Tn − θ]2
2
1 h 2
i
≤ V [T
θ n ] + {E [T
θ n − θ]}
2
→ 0 as n → ∞
a sequence of iid random variables from a population with finite mean θ = Eθ [X], then
X̄ converges to θ in probability for each fixed θ ∈ Ω. It is known as Khintchin’s Weak
Law of Large Numbers, i.e., sample mean X̄ finitely exists, is a consistent estimator
for the population mean θ which does not require the condition Vθ [X̄] → 0 as n → ∞
for every fixed θ ∈ Ω. Thus consistency follows the existence of expectation statistic
and assumption of finite variance is not needed.
For illustration the Cauchy pdf is
1
1
−∞ < x < ∞
π 1+x2
p(x) =
0
otherwise
The Cauchy Principle value 0 is taken as the mean of Cauchy distribution. Thus the
Cauchy distribution has not the mean finitely exist. Hence for the Cauchy population,
the sample mean X̄ is not a consistent estimator of the parameter θ.
A.Santhakumaran 95
n
1 X
Eσ4 [T ] = E 4 [Xk4 ]
3n k=1 σ
n
1 X
= E 4 [Xk − 0]4 since E[Xk ] = 0 ∀ k = 1, 2, · · ·
3n k=1 σ
1 1
= nµ4 = 3nσ 4 = σ 4
3n 3n
= log θ − 1
A.Santhakumaran 96
1
log x x
Since lim x log x = lim 1 = lim =0
x→0 x→0 x→0 − 12
x x
1
Zθ
Eθ [log X]2 = (log x)2 dx
θ 0
Z θ
1 2 θ 1 log x
= [x(log x) ]0 − 2x dx
θ θ 0 x
1 2
= (log θ)2 − lim x(log x)2 − [θ log θ − θ]
θ x→0 θ
= (log θ) − 2 log θ + 2 since lim x(log x)2 = 0
2
x→0
= θ[1 + 2 + · · · + n]
n(n + 1)
= θ
2
n
" #
2 X
Eθ iXi = θ, ∀ θ ∈ Ω
n(n + 1) i=1
" n # n
X X
Vθ iXi = i2 Vθ [Xi ]
i=1 i=1
n
X
= σ2 i2
i=1
n(n + 1)(2n + 1)
= σ2
6
n
" #
2 X 2 (2n + 1) 2
Vθ iXi = σ → 0 as n → ∞
n(n + 1) i=1 3 n(n + 1)
A.Santhakumaran 97
2 Pn
Thus n(n+1) i=1 iXi is a consistent estimator of θ.
As the sample size increases the estimator should get closer to the parameter of
interest. Here closer means convergence. For every > 0, there exists an N where for
all n > N , |Tn − θ| < . Of course the estimators considered are random, i.e., for
every ω ∈ S ( set of all outcomes ) one has a different estimate. The natural question
is, what does convergence mean for random sequences?.
Problem 3.6 Let T = max1≤i≤n {Xi } be the nth order statistic of a random sample
of size n drawn from a population with a uniform distribution on the interval ( 0, θ).
Show that consistent estimator is not unique.
Solution: The pdf of T is
ntn−1
n 0 < t < θ, θ > 0
θ
pθ (t) =
0
otherwise
Z θ
n n
Eθ [T ] = tn dt = θ
θn 0 n+1
nθ2 nθ2
Eθ [T 2 ] = , Vθ [T ] =
(n + 2) (n + 2)(n + 1)2
Thus Eθ [T ] → θ and Vθ [T ] → 0 as n → ∞. T is a consistent estimator of θ. Also
h i
(n+1) θ2
Eθ n T = θ and Vθ [ (n+1)
n T] = n(n+2) → 0 as n → ∞, i.e., (n+1)
n T is also a
(n+1)
consistent estimator of θ. The statistic T and n T are the two consistent estimators
of the same parameter θ. Thus consistent estimator is not unique.
P
Proof: Given Tn = tn (X) is a consistent estimator τ (θ), i.e., Tn → τ (θ) as n → ∞.
Therefore for given > 0, η > 0 , there exist a positive integer n ≥ N (, η) such that
where 0 = 2θ
where T 0 = T 2 − 2
= Pθ {|T 0 − θ2 | < 0 } → 1 as n → ∞
T 0 = T 2 − 2 ⇒ T 2 as n → ∞ since → 0 as n → ∞
.. . Pθ {|T 2 − θ2 | < 0 } → 1 as → ∞. Thus T 2 is a consistent estimator of θ2 .
A.Santhakumaran 99
Otherwise, the statistic g(T ) is said to be a biased estimator of τ (θ). The unbiased
estimator is also called zero bias estimator. A statistic g(T ) is said to be asymptotically
unbiased estimator if Eθ [g(T )] → τ (θ) as n → ∞, ∀ θ ∈ Ω.
Problem 3.8 A random variable X has the pdf
2θx if 0 < x < 1
pθ (x) = (1 − θ) if 1 ≤ x < 2, 0 < θ < 1
0
otherwise
Show that g(X), a measurable function of X is an unbiased estimator of θ if and only
R 1 1 R2
if 0 xg(x)dx = 2 and 1 g(x)dx = 0.
Solution: Assume g(X) is an unbiased estimator of θ, i.e.,
Eθ [g(X)] = θ
Z 1 Z 2
g(x)2θxdx + g(x)(1 − θ)dx = θ
0 1
Z 1 Z 2 Z 2
θ 2xg(x)dx − g(x)dx + g(x)dx = θ
0 1 1
Z 1 Z 2
⇒ 2xg(x)dx − g(x)dx = 1 and
0 1
Z 2
g(x)dx = 0
1
Z 1
1
i.e., xg(x)dx = and
0 2
Z 2
g(x)dx = 0
1
R 1 1
R 2
Conversely, 0
xg(x)dx = 2 and 1
g(x)dx = 0, then g(X) is an unbiased estimator of θ.
Z 1 Z 2
Eθ [g(X)] = 2θxg(x)dx + (1 − θ)g(x)dx
0 1
Z 1 Z 2
= 2θ xg(x)dx + (1 − θ) g(x)dx
0 1
1
= 2θ + (1 − θ) × 0 = θ
2
A.Santhakumaran 100
g(t)cnt = cn−2
t−1
(n − 2)! t!(n − t)!
g(t) =
(t − 1)!(n − t − 1)! n!
(n − 2)!t(t − 1)!(n − t)(n − t − 1)!
=
(t − 1)!n(n − 1)(n − 2)!(n − t − 1)!
t(n − t)
= , if n = 2, 3, · · ·
n(n − 1)
T (n − T )
n = 2, 3, · · ·
n(n − 1)
Eθ [g ∗ (T )] = θ2
n t
θ
g ∗ (t)cnt
X
(1 − θ)n = θ2
t=0
1−θ
n
g ∗ (t)cnt ρt = ρ2 (1 + ρ)n−2
X
t=0
= ρ2 [1 + cn−2
1 ρ + · · · + cn−2
t ρt + · · · + ρn−2 ]
A.Santhakumaran 101
.. .g ∗ (t)cnt = ct−2
n−2
(n − 2)!t!(n − t)!
⇒ g ∗ (t) =
(t − 2)!(n − t)!n!
(n − 2)!t(t − 1)!(t − 2)!
=
(t − 2)!n(n − 1)(n − 2)!
t(t − 1)
= n = 2, 3, · · · · · ·
n(n − 1)
T [T − 1]
g ∗ (T ) = n = 2, 3, · · ·
n(n − 1)
Solution: Consider
1
Eθ [g(X)] =
θ
∞
X 1
g(x)θ(1 − θ)x−1 =
x=1
θ
∞
X (1 − θ)
g(x)(1 − θ)x =
x=1
θ2
Take 1 − θ = ρ ⇒ θ = 1 − ρ
∞
g(x)ρx = ρ(1 − ρ)−2
X
x=1
= ρ(1 + 2ρ + 3ρ2 + · · · + xρx−1 + · · ·)
⇒ g(x) = x ∀ x = 1, 2, 3, · · ·
Problem 3.11 Assume X ∼ b(1, θ), 0 < θ < 1. If a single observation x of X from a
Bernoulli population, then shown that there is no unbiased estimator exist for θ2 .
A.Santhakumaran 102
Eθ [g(X)] = θ2
1
X
g(x)θx (1 − θ)1−x = θ2
x=0
g(0)(1 − θ) + g(1)θ = θ2
1
Eθ [g(X)] =
θ
n x
n! θ 1
X
g(x) (1 − θ)n =
i=0
x!(n − x)! 1−θ θ
n
X n! (1 + ρ)n+1
g(x) ρx =
i=0
x!(n − x)! ρ
θ
where ρ = 1−θ
n! (1+ρ)n+1
ρx → g(0) as θ → 0 and → ∞ as ρ → 0 or θ → 0.
P
g(x) x!(n−x)! ρ
Problem 3.13 illustrates the uniqueness of unbiased estimator. The unbiased es-
timator of the parameter θ is negative. For practically it is not possible but it is
constructed for mathematical interest .
A.Santhakumaran 103
Problem 3.13 A random sample X is drawn from a Bernoulli population b(1, θ), θ =
{ 41 , 12 }. Show that there exists an unique unbiased estimator of θ2 .
Solution: Let g(X) be the unbiased estimator of θ2 , i.e.,
Eθ [g(X)] = θ2
1
X
g(x)θx (1 − θ)1−x = θ2
x=0
1 1
When θ = ⇒ 3g(0) + g(1) = (3.1)
4 4
1 1
When θ = ⇒ g(0) + g(1) = (3.2)
2 2
Solving the equations (3.1) and (3.2) for g(0) and g(1), one gets the values of g(0) = − 81
and g(1) = 58 ,
−1
for x = 0
8
i.e., g(x) =
5
8 for x = 1
θ2
is an unbiased estimator of θ and has variance n.
Pn
Solution: Let T = i=1 Xi ∼ G(n, θ). The pdf of T is
1 − θt n−1
θn Γn e t 0 < t < ∞, θ > 0
pθ (t) =
0
otherwise
A.Santhakumaran 104
Z ∞
1 − 1 t n+1−1
Eθ [T ] = e θ t dt
0 θn Γn
= nθ
" n #
X
Eθ Xi = nθ ∀ θ > 0
i=1
Eθ [nX̄] = nθ ∀ θ > 0
⇒ Eθ [X̄] = θ ∀ θ > 0
Vθ [T ] = nθ2 ∀ θ > 0
Pn
i=1 Xi
.
. . Vθ [X̄] = Vθ
n
1
= Vθ [T ]
n2
1 2 θ2
= nθ =
n2 n
Problem 3.15 Let X1 , X2 , · · · , Xn be a random sample drawn from a normal popu-
Pn
Xi2
lation with mean zero and variance σ 2 , 0 < σ 2 < ∞. Show that i=1
n is an unbiased
2σ 4
estimator of σ 2 and has variance n .
Pn ns2
Solution: Define ns2 = 2
i=1 Xi , then Y = σ2
∼ χ2 distribution with n degrees of
freedom , i.e.,Y ∼ G( n2 , 21 ).
1 n
1
n e− 2 y y 2 −1 0 < y < ∞
2 2 Γn
p(y) = 2
0
otherwise
Z ∞
1 1 n
E[Y ] = 1 e− 2 y y 2 +1−1 dy
n
0 2 Γ2
2
1 Γ( n2 + 1)
= n n =n
2 2 Γ n2 ( 1 ) 2 +1
2
E[Y 2 ] = n + 2n 2
V [Y ] = 2n
ns2
But Y = 2
" σ#
ns2
.. . Eσ2 = n
σ2
⇒ Eσ2 [s2 ] = σ 2
A.Santhakumaran 105
P
Xi2
Thus n is an unbiased estimator of σ 2 .
" #
ns2
Vσ2 = 2n
σ2
n2
V 2 [s2 ] = 2n
σ4 σ
2σ 4
Vσ2 [s2 ] =
n
Problem 3.16 Let Y1 < Y2 < Y3 be the order statistics of a random sample of size
3 drawn from an uniform population with pdf
1
0<x<θ
θ
pθ (x) =
0
otherwise
Show that 4Y1 and 2Y2 are unbiased estimators of θ. Also find the variance of these
estimators.
Solution: The pdf of Y1 is
hR i2
3! 1 θ 1
dx 0 < y1 < θ
1!2! θ y1 θ
pθ (y1 ) =
0
otherwise
3 [1 − y1 ]2
0 < y1 < θ
θ θ
pθ (y1 ) =
0
otherwise
3
Z θ y1
Eθ [Y1 ] = y1 (1 − )2 dy1
θ 0 θ
Z 1
3 y1
= θt(1 − t)2 θdt where θ =t
θ 0
Z 1
= 3θ t2−1 (1 − t)3−1 dt
0
Γ2Γ3 θ
= 3θ = ∀θ>0
Γ5 4
θ2 3θ2
Similarly Eθ [Y12 ] = and Vθ [Y1 ] =
10 15
3θ 2
.. . Vθ [4Y1 ] =
5
The pdf of Y2 is !
Z y2 Z θ
3! 1 1 1
pθ (y2 ) = dx dx
1!1!1! 0 θ θ y2 θ
A.Santhakumaran 106
62 y2 [1 − y2 ] 0 < y2 < θ
θ θ
pθ (y2 ) =
0
otherwise
θ
.˙. Eθ [Y2 ] = 2
3θ2 θ2 θ2
⇒ 2Y2 is an unbiased estimator of θ and Eθ [Y 2 ] = 10 and Vθ [Y2 ] = 20 ⇒ Vθ [2Y2 ] = 5
k1 Eθ [Y1 ] + k2 Eθ [Y2 ] = θ
⇒ k1 + k2 = 1
i.e., k2 = 1 − k1
Consider φ = Vθ [k1 Y1 + k2 Y2 ]
= k12 2σ 2 + (1 − k1 )2 σ 2
= 3k12 σ 2 − 2k1 σ 2 + σ 2
Z ∞
1 1 n−1
E[Y r ] = n−1 e− 2 y y 2
+r−1
dy
0 2 2 Γ n−1
2
n−1
1 Γ 2 +r
= n−1 n−1
2 Γ n−1
2
2 ( 12 ) 2
+r
2r n−1
= n−1 Γ +r
Γ 2 2
When r=1
2 n−1 n−1
E[Y ] = Γ =n−1
Γ n−1
2
2 2
" #
ns2
.. . Eσ2 = n−1
σ2
n−1 2
⇒ Eσ2 [s2 ] σ =
n
2(n − 1) 4
and Vσ2 [s2 ] = σ
n2
Thus Eσ2 [s2 ] → σ 2 and Vσ2 [s2 ] → 0 as n → ∞
1 Pn
.˙. n i=1 (Xi − X̄)2 is a consistent estimator of σ 2 .
1 Pn
But Eσ2 [s2 ] 6= σ 2 . .˙. n i=1 (Xi − X̄)2 is not an unbiased estimator of σ 2 .
Problem 3.19 Illustrate with an example that estimator is both consistent
and unbiased.
A.Santhakumaran 108
1 Pn ns2
and S 2 = n−1
2
i=1 (Xi −X̄) , then Y = σ2
∼ χ2 distribution with (n−1) degrees
2(n−1) 4
of freedom and Y ∼ G( n−1 1 2
2 , 2 ). with Eσ 2 [s ] =
n−1 2
n σ and Vσ2 [s2 ] = n2
σ .
n 2
(n − 1)S 2 = ns2 ⇒ S 2 = s
n−1
n
Eσ2 [S 2 ] = E 2 [s2 ]
n−1 σ
n n−1 2
= σ = σ2
n−1 n
n2
Vσ2 [S 2 ] = E 2 [s2 ]
(n − 1)2 σ
n2 2(n − 1) 4
= σ
(n − 1)2 n2
2σ 4
= → 0 as → ∞
(n − 1)
1 Pn
Thus S 2 = n−1 i=1 (Xi − X̄)
2 is consistent and also unbiased estimator of σ 2 .
Problem 3.20 Give an example that unbiased estimator but not consistent.
Solution: Let X1 , X2 , · · · , Xn be a random sample drawn from a normal pop-
ulation with mean θ and known variance σ 2 , then the estimator X1 ( first
observation) of the sample is unbiased but not consistent. Since Eθ [X1 ] = θ
and Vθ [X1 ] = σ 2 ∀ θ ∈ Ω and
= Pθ {θ − < X1 < θ + }
Z θ+
1 1 2
= √ e− 2σ2 (x1 −θ) dx1
2πσ θ−
6→ 1 as n → ∞
θ
Eθ [Y1 ] = 6 θ ∀ θ ∈ Ω and
=
4
θ θ θ
Pθ Y1 − <
= Pθ − < Y1 < +
4 4 4
θ 2
3 4
Z+ y1
= 1− dy1
θ 4 −
θ θ
6→ 1 as n → ∞
Thus Y1 the first order statistic is not consistent and not unbiased estimator
of θ.
Pθ {X1 = x1 , X2 = x2 , · · · | T = t}
Xi
It is independent of θ. Thus T = min1≤i≤n Yi = min1≤i≤n i is sufficient.
Problem 3.24 Let X1 and X2 be iid Poisson random variables with param-
eter θ. Prove that
Pθ {X1 = x1 , X2 = t − x1 }
Consider Pθ {X1 = x1 , X2 = x2 | T = t} =
Pθ {T = t}
Pθ {X1 = x1 }Pθ {X2 = t − x1 }
=
Pθ {T = t}
e−θ θx1 e−θ θt−x2
x1 ! (t−x2 )!
= e−2θ (2θ)t
t!
t!
= is independent of θ.
(t − x1 )!x1 !2t
+ Pθ {X1 = 2, X2 = 0}
θ2 −2θ
= θe−2θ + e
2
θ
= θe−2θ [1 + ]
2
Pθ {X1 = 0, X2 = 1}
Therefore Pθ {X1 = 0, X2 = 1 | X1 + 2X2 = 2} =
Pθ {X1 + 2X2 = 2}
e−2θ θ
=
θe−2θ [1 + 2θ ]
2
= depends on θ.
2+θ
Let T = X1 + X2 . Consider
Pθ {T = 1} = Pθ {X1 + X2 = 1}
= Pθ {X1 = 0, X2 = 1} + Pθ {X1 = 1, X2 = 0}
= θ(3 − 4θ)
Pθ {X1 = 0 ∩ X1 + X2 = 1}
.˙.Pθ {X1 = 0 | X1 + X2 = 1} =
Pθ {X1 + X2 = 1}
Pθ {X1 = 0, X2 = 1}
=
Pθ {X1 + X2 = 1}
(1 − θ)2θ
=
θ(3 − 4θ)
2(1 − θ)
= is dependent on θ.
(3 − 4θ)
Let T = X1 + X2 ∼ N (2θ, 2)
√ 1 √ e− 14 (t−2θ)2
−∞ < t < ∞
p(t)θ = 2π 2
0
otherwise
. ˙. T = X1 + X2 is a sufficient statistic.
= 1 − θ2
P {Y = 1} = P {X1 = 1, X2 = 1}
= θ2
P {Y + X3 = 1} = P {Y = 0, X3 = 1} + P {Y = 1, X3 = 0}
= (1 − θ2 )θ + θ2 (1 − θ)
Consider
P {Y = 1, T = 1}
P {Y = 1 | T = 1} =
P {T = 1}
P {Y = 1}P {X3 = 0}
=
P {T = 1}
θ2 θ
=
θ(1 − θ)(1 + 2θ)
θ2
=
(1 − θ)(1 + 2θ)
Remark 3.3 The definition of sufficient statistic is not always useful to find
a sufficient statistic, since
(ii) even if it is known in some cases, it is tedious to find the pdf of statistic.
To avoid the above difficulties one may use the Neyman Factorization The-
orem.
pθ (x1 , x2 , · · · , xn ) = pθ (t)h(x1 , x2 , · · · , xn )
pθ (x1 , x2 , · · · , xn ) = pθ (t)h(x1 , x2 , · · · , xn ).
Pθ {X1 = x1 , · · · , Xn = xn } = Pθ {X1 = x1 , X2 = x2 , · · · , Xn = xn , T = t}
= Pθ {T = t}P {X1 = x1 , · · · , Xn = xn | T = t}
Pθ {X1 = x1 , · · · , Xn = xn , T = t}
Pθ {X1 = x1 , · · · , Xn = xn | T = t} =
Pθ {T = t}
0 if T 6= t
=
Pθ {X1 =x1 ,···,Xn =xn }
if T = t
Pθ {T =t}
If T = t, then
Pθ {X1 = x1 , · · · , Xn = xn } pθ (t)h(x1 , x2 , · · · , xn )
=
Pθ {T = t} pθ (t) t(x)=t h(x1 , x2 , · · · , xn )
P
A.Santhakumaran 116
h(x1 , x2 , · · · , xn )
=
t(x)=t h(x1 , x2 , · · · , xn )
P
is independent of θ.
Theorem 3.4 If T = t(X) is a sufficient statistic, then any one to one function of
the sufficient statistic is also a sufficient statistic.
Proof: Let T = t(X) be a sufficient statistic, then by the Neyman Factor-
ization Theorem pθ (x1 , x2 , · · · , xn ) = pθ (t)h(x1 , x2 , · · · , xn ). Let U be any one to
one function of T = t(X), i.e., u = α(t). Since u = α(t) ⇒ t = α−1 (u)
dα−1 (u) 0
dt
= α−1 (u) .
.˙. du = du
h(x1 , x2 , · · · , xn )
pθ (x1 , x2 , · · · , xn ) = pθ (α−1 (u))[α−1 (u)]0
[α−1 (u)]0
= pθ (u)h1 (x1 , x2 , · · · , xn )
Theorem 3.5 Let T (X) be a statistic such that for some θ1 , θ2 ∈ Ω and the
distributions of X, Y have the support of pθ (.), θ ∈ Ω. The statistic T (X) is
not sufficient for θ if
Proof: Define the support of pθ1 (.) by I(θ1 ) and of pθ2 (.) by I(θ2 ) so that
I(θ1 ) = {x : pθ1 (x) > 0} and I(θ2 ) = {x : pθ2 (x) > 0}. Consider the condition
(ii) pθ1 (x)pθ2 (y) 6= pθ2 (x)pθ1 (y) when either x or y is in I(θ1 ) and not in I(θ2 ),
one of its side is zero and the other side is non zero. Further , if T were
sufficient then T (X) = T (Y ) implies both x and y are in I(θ1 ) and I(θ2 ) . This
is not possible. .. . T is not sufficient for θ. Further suppose T is sufficient ,
for single observation X on x by Neyman Factorization Theorem
Using equations (3.3) and ( 3.4) ⇒ pθ1 (x)pθ2 (y) = pθ2 (x)pθ1 (y) if T is sufficient.
By condition (ii) pθ1 (x)pθ2 (y) 6= pθ2 (x)pθ1 (y) and (i) T (X) = T (Y ) show that T
is not sufficient.
Problem 3.28 Let X1 , X2 , · · · , Xn be a random sample drawn from a popu-
lation with pmf
θ x (1 − θ)1−x
x = 0, 1
pθ (x) =
0
otherwise
Find the sufficient statistic.
where t = ni=1 xi
P
t
θ
= (1 − θ)n
1−θ
= pθ (t)h(x1 , x2 , · · · , xn )
t
θ
where pθ (t) = (1 − θ)n and h(x1 , x2 , · · · , xn ) = 1
1−θ
Pn
.˙. T = i=1 Xi is a sufficient statistic.
Remark 3.5 If the range of distribution depends on the parameter, Ney-
man Factorization Theorem is not convenient to find the sufficient statistic.
A.Santhakumaran 118
= e−yn +nθ−t+yn
n
X
where t = xi and Yn = max {Xi }
1≤i≤n
i=1
pθ (x1 , x2 , · · · , xn ) = e−yn +nθ e−t+yn
= pθ (yn )h(x1 , x2 , · · · , xn )
n
Y
Consider pθ (x1 , x2 , · · · , xn ) = pθ (xi )
i=1
A.Santhakumaran 121
n
1
Pn
1 2
= √ e− 2 i=1 (xi −iθ)
2π
n
1
Pn 2 Pn Pn 2 2
1
= √ e− 2 i=1 xi +θ i=1 ixi − i=1 i θ
2π
n
1
Pn 2 Pn
1 n(n+1)(2n+1) 2
= √ e− 2 i=1 xi +θ i=1 ixi − 12
θ
2π
= c(θ)eQ(θ)t(x) h(x)
n Pn
1
x2i
ixi , h(x) = e− 2
X
where t(x) = i=1
i=1
n
1
1 2
and c(θ) = √ e− 12 n(n+1)(2n+1)θ
2π
Pn
Thus T = i=1 iXi is a sufficient statistic.
Problem 3.32 Given n independent observations on a random variable X
with probability density function
1 − xθ
2θ e if x > 0, θ > 0
pθ (x) = θ θx
2e if x ≤ 0
0
otherwise
Since the equation holds for all values of θ, it is also true for θ = 0. So one
can obtain the relation t(x) = k(t) where
∂ log pθ (x)
|θ=0 = t(x) and Q0 (t) = k(t)
∂θ
∂t(x) ∂k(t) ∂t
=
∂x ∂t ∂x
Z Z Z
Qθ (t)dθ = k(t) λ(θ)dθ + c1 (θ)dθ + c(x)
R ∂ log pθ (x) R
dθ dθ = t(x) λ(θ)dθ + B(θ) + c(x)
R
since k(t) = t(x) for θ = 0 and B(θ) = c1 (θ)dθ
R
where Q(θ) = λ(θ)dθ
= c(θ)eQ(θ)t(x) h(x)
1
=
n(n − 1)(yn − y1 )n−2
Y1 = nX(1)
Y2 = (n − 1)[X(2) − X(1) ]
Y3 = (n − 2)[X(3) − X(2) ]
··· ······
1
The Jacobian transformation is |J| = n!. The joint pdf of X(1) , X(2) , · · · , X(n)
Qn
is given by p(x(1) , x(2) , · · · , x(n) ) = n! i=1 p(x(i) ) The joint pdf of Y1 , Y2 , · · · , Yn is
given by
n
Y
pθ,σ (y1 , y2 , · · · , yn ) = n! p(yi ) × |J|
i=1
n
Y
= p(yi )
i=1
1 − 1 (P yi +nθ)
= e σ nθ < y1 < ∞, 0 ≤ y2 < · · · , < yn < ∞
σn
U 1 = Y2
U 2 = Y2 + Y3
A.Santhakumaran 126
U 3 = Y2 + Y3 + Y4
··· ······
Un−2 = Y1 + Y2 + · · · + Yn−1
T = Y2 + Y3 + · · · + Yn
i.e., Y2 = U1
Y3 = U2 − U1
Y4 = U3 − U2
··· ······
Yn = T − Un−2
(n − 2)
p(u1 , u2 , · · · , un−2 | y1 , t) = 0 < u1 < u2 < · · · < un−2 < t
tn−2
A.Santhakumaran 127
Pn
Thus (Y1 , T ) is jointly sufficient statistics, i.e., X(1) , i=1 [X(i) − X(1) ] is
jointly sufficient statistics.
Definition 3.5 Let θ = (θ1 , θ2 , · · · , θk ) is a vector of parameters and T =
(T1 , T2 , · · · , Tk ) is a random vector . The vector T is jointly sufficient statistics
if pθ (x) is expressed of the form
Pk
c(θ)e j=1 Qj (θ)tj (x) h(x)
a<x<b
pθ (x) =
0 otherwise
αnβ −α P xi Y β−1
pα,β (x1 , x2 , · · · , xn ) = e ( xi )
(Γβ)n
A.Santhakumaran 128
αnβ Pn
where c(α, β) = (Γβ)n , Q1 (α, β) = −α, t1 (x) = i=1 xi , Q2 (α, β) = β , t2 (x) =
Pn
Pn − log xi
i=1 log xi and h(x) = e i=1 . It is a two parameter exponential family.
Xi2 ) is jointly sufficient statistic.
P P
Therefore ( Xi ,
The pdf of Y3 is
"Z #2
y3 1 2 θ 1
5! 1
Z
pθ (y3 ) = dx dx
2!1!2! 0 θ θ y3 θ
30 2
= y [θ − y3 ]2 0 < y3 < θ
θ5 3
30 2 y3
= 5
y3 [1 − ]2 0 < y3 < θ
θ θ
Z θ
30 y3
Eθ [Y3 ] = 3
y [1 − ]2 dy3
θ3 0 3 θ
Z 1
30 y3
= θ4 t3 (1 − t)2 dt where t = θ
θ3 0
Z 1
= 30θ t4−1 (1 − t)3−1 dt
0
Γ4Γ3
= 30θ
Γ7
3! × 2! θ
= 30 =
6! 2
Eθ [2Y3 ] = θ
The pdf of Y5 is
55 y 4
0 < y5 < θ
θ 5
pθ (y5 ) =
0
otherwise
The conditional distribution of Y3 given Y5 = y5 is
pθ (y3 , y5 )
pθ (y3 | y5 ) =
pθ (y5 )
60 y32 [y5 − y3 ]
= 0 < y3 < y5
5 y54
Z y5
12
Eθ [Y3 | Y5 ] = y33 [y5 − y3 ]dy3
y54 0
3
= y5
5
6
.. . Eθ [2Y3 | Y5 = y5 ] = y5
5
A.Santhakumaran 130
θ2 2θ2
Vθ [Y3 ] = since Eθ [Y32 ] =
28 7
θ2
Vθ [2Y3 ] =
7
Z θ
5 5
Eθ [Y5 ] = y55 dy5 = θ
θ5 0 6
5θ2
Eθ [Y52 ] =
7
5θ2
Vθ [Y5 ] =
5 × 36
6 θ2
Vθ Y5 =
5 35
3.5 Show that if the bias of an estimator and its variance approach zero,
then the estimator will be consistent.
A.Santhakumaran 131
3.6 When would you say that estimator of a parameter is good? In par-
ticular discuss the requirements of consistency and unbiasedness of an
estimator. Give an example to show that a consistent estimator need
not be unbiased.
3.9 Obtain the unbiased estimator of θ(1 − θ), where θ is the parameter of
Binomial distribution.
3.12 Obtain the sufficient statistic, given a sample of size n from a uniform
distribution ∪(−θ, θ).
3.13 State two equivalent definition of sufficient statistic and obtain their
equivalence.
3
3.19 If T1 = 2 max{X1 , X2 } and T2 = 2(X1 + X2 ) are estimators of θ based
on two independent observations X1 and X2 on a random variable
distributed uniformly over (0, θ). Which one do you prefer and why?
X1 + X2 + X3
T1 =
6
X1 + 2X2 + 3X3
T2 =
7
X1 + X2
T3 =
2
3.23 Discuss whether an unbiased estimator exists for the parametric func-
tion τ (θ) = θ2 of Binomial (1, θ) based on a sample of size one.
kn
(a) if n → 0 as n → ∞ (c) iff kn is bounded as n → ∞
(b) if and if kn = 0 ∀ n (d) whatever {kn } is Ans:(a)
2 X2
(a) X (c) n
n
X
(b) X (d) n2
Ans: (b)
n
3.30 Which of the following statement is not correct for a consistent esti-
mator?
1. If there exists one consistent estimator, then an infinite number of
consistent statistics may be constructed.
2. Unbiased estimators are always consistent.
3. A consistent estimator with finite mean value must tend to be un-
biased in large samples.
Select the correct answer given below
(a) 1 (b) 2 (c) 1 and 3 (d) 1, 2 and 3 Ans: (b)
n(µ1 +µ2 √ q 2
(c) an = 2 and bn = n (σ1 + σ22 )
q
n(µ1 +µ2 σ12 +σ22
(d) an = 2 and bn = n 2 Ans: (a)
3.33 Let (X, Y ) have the joint discrete such that [X | Y = y] ∼ B(y, 0.5)
and Y ∼ Poisson (λ), λ > 0 where λ is an unknown parameter. Let
T = T (X, Y ) be any unbiased estimation of λ. Then
(a) V [T ] ≥ λ ∀ λ (c) V [T ] ≤ V [Y ] ∀ λ
(b) V [T ] ≥ V [Y ] ∀ λ (d) V [T ] = V [Y ] ∀ λ
3.34 Let X1 , X2 , · · · be a random sample from Uniform (0, 3θ), θ > 0. Define
1
T = 3 max(X1 , X2 , · · · , Xn ). Which of the following is not true?.
Ans: (a)
n(N −1)
(a) p(1 − p) (c) N (n−1) p(1 − p)
N −n N (n−1)
(b) N −1 np(1 − p) (d) n(N −1) p(1 − p) Ans: ( c)
3.36 A Statistician has drawn a simple random sample of size 2 with re-
placement heights. Let X̄1 be the sample mean of their heights. Then
another Statistician has drawn a simple random sample size 2 without
replacement from those 4 boys. Let X̄2 be the sample mean their
A.Santhakumaran 136
Ans: (a)
(a) E[ X̄n ] = E[Sn ] for large n (d) V [Sn ] > V [ X̄n ] for sufficiently
(b) X̄n is consistent for µ large n Ans:(b)
(c) X̄n is sufficient for µ
for θ, if θ > 0
x
(a) pθ (x) = e−θ θx! , x = 0, 1, 2, · · · (c) pθ (x) = 1θ e−θx , x > 0
2
− x2θ
(b) pθ (x) = √1 e
2πθ
, −∞ < x < ∞ (d) pθ (x) = 1θ e−θx , x > 0 Ans: (b)
A.Santhakumaran 137
(a) The new mean is necessarily less than or equal to the original mean
(b) The new median is necessarily less than or equal to the original median
(c) The new variance is necessarily less than or equal to the original variance
(d) The new mode will be same as the original mode Ans:(d)
4.1 Introduction
There are several sufficient statistics with varied degrees of data reduction un-
der different statistical probability models.The degree of data reduction by a sufficient
statistic carries the amount of ancillary information.A sufficient statistic has resulted
into maximum data reduction if it contains no ancillary information.
Definition: 4.1 A statistic T = T (X1 , X2 , · · · Xn ) is known as ancillary if the dis-
tribution of statistics T is independent of the parameter θ and first order ancillary if
Eθ [T ] is independent of the parameter θ.
Eθ [θˆ1 (T ) − θˆ2 (T )] = 0 ∀ θ ∈ Ω
⇒ θˆ1 (T ) = θˆ2 (T ) ∀ F
Thus completeness helps for identifying the unique unbiased estimator through com-
plete statistic T . Definitely such estimator reduces the risk which is minimum.
The order statistic obtain from a random sample drawn from a continuous distribution
is complete in the following theorem.
Theorem 4.1 Let F be a class of absolutely continuous distribution functions F so
that F is convex. Also F contains all uniform densities in <. Let X1 , X2 , · · · , Xn be
iid F ∈ F . Then the order statistic T (X) = (X1 , X2 , · · · , Xn ) is complete.
Proof: An estimator T 0 is a function of T ,
i.e., T 0 = g(T ) if and only if Tn (xl ) = T 0 (x) ∀ l where xl = (xl1 , · · · , xln ) and
(l1 , l2 , · · · , ln ) is one of the n! permutations of numbers 1, 2, 3 · · · , n.
Consider cumulative distribution function F1 , F2 , · · · , Fn from F with corresponding
densities f1 (x), f2 (x), · · · , fn (x). For all positive numbers α1 , α2 , · · · , αn , there is some
F ∈ F , the densities Pn
αi fi (x)
f (x) = i=1
Pn
i=1 αi
A.Santhakumaran 142
the left hand side of the Equation (4.1) is a polynomial in α1 , α2 , · · · , αn . This polyno-
mial is identically equal to zero, which implies that corresponding coefficients are also
zero. i.e,
XZ Z n
T 0 (x1 , x2 , · · · xn )
Y
··· fj (xlj )dx1 dx2 · · · dxn = 0
l∈L j=1
XZ Z n
T 0 (xl )
Y
··· fj (xj )dx1 dx2 · · · xn = 0
l∈L j=1
Z Z n
Y
n! ··· g(T (x)) fj (xj )dx1 dx2 · · · dxn = 0
j=1
1
As the function fj (x) = bj −aj , aj < x < bj , then
Z b1 Z bn
··· g(T (x))dx1 dx2 · · · dxn = 0
a1 an
⇒ PF {g(T )} = 0 ∀F ∈F
Show that the family is not complete but the family of distributions Y = |X| is
complete.
A.Santhakumaran 143
Eθ [g(X)] = 0
1
X
g(x)pθ (x) = 0
x=−1
1 1
g(−1) θ(1 − θ) + g(0)[1 − θ(1 − θ)] + g(1) θ(1 − θ) = 0
2 2
Consider Eθ [g(Y )] = 0
1
X
g(y)[θ(1 − θ)]y [1 − θ(1 − θ)]1−y = 0
y=0
1
θ(1−θ)
X
g(y)ρy = 0 where ρ = 1−θ(1−θ)
y=0
g(0) + g(1)ρ = 0 ⇒ g(0) = 0 and g(1) = 0
Eθ [g(T )] = 0
A.Santhakumaran 144
n
X
g(t)cnt (1 − θ)n−t = 0
t=0
n t
θ
X
g(t)cnt (1 − θ)n = 0
t=0
1−θ
Here (1 − θ)n 6= 0
n
X θ
g(t)cnt ρt = 0 where ρ =
t=0
1−θ
g(0)cn0 + g(1)cn1 ρ + · · · + g(n)ρn = 0
g(0) = 0 coefficient of ρ0
⇒ g(1) = 0
g(n) = 0 coefficient of ρn
Thus g(t) = 0 ∀ t = 0, 1, 2, · · · , n.
Pn
Hence T = i=1 Xi is a complete statistic.
Problem 4.3 Let X1 , X2 , · · · , Xn be iid random sample drawn from a Poisson pop-
Pn
ulation with parameter λ > 0. Show that T = i=1 Xi is a complete statistic.
Pn
Solution: Let T = i=1 Xi ∼ P (nλ)
(nλ)t
i.e., pλ (t) = e−nλ , t = 0, 1, 2, · · · , ∞
t!
Eλ [g(T )] = 0
∞
(nλ)t
g(t)e−nλ
X
= 0
t=0
t!
∞
(nλ)t
= 0 since e−nλ 6= 0
X
g(t)
t=0
t!
nλ (nλ)n
g(0) + g(1) + · · · + g(n) + ··· = 0
1! n!
g(0) = 0 coefficient of λ0
A.Santhakumaran 145
ng(1) = 0 coefficient of λ1
⇒ g(1) = 0
Thus g(t) = 0 ∀ t = 0, 1, 2, · · · , ∞
Pn
Hence T = i=1 Xi is a complete statistic.
Problem 4.4 Let X ∼ ∪(0, θ), θ > 0. Show that the family of distributions is
complete.
Solution: For a single observation X, the mathematical expectation of the measurable
function g(X) is
Eθ [g(X)] = 0
Z θ
1
⇒ g(x) dx = 0
0 θ
Z θ
⇒ g(x)dx = 0
0
One can differentiate the above integral with respect to θ on both sides
Z θ
0dx + g(θ) × 1 − g(0) × 0 = 0
0
hR i
b(θ)
d a(θ) pθ (x)dx Z b(θ)
dpθ (x) db(θ)
since = dx + pθ [b(θ)]
dθ dθ a(θ) dθ
da(θ)
−pθ [a(θ)]
dθ
g(θ) = 0 ∀ θ > 0, i.e., g(x) = 0 ∀ 0 < x < θ, θ > 0
since Eθ [X − θ] = 0
Z ∞
1 1 2
⇒ (x − θ) √ √ e− 2θ x dx = 0
−∞ 2π θ
Z ∞
1 1 2
⇒ t √ √ e− 2θ (t+θ) dt = 0 where t = x − θ
−∞ 2π θ
Z ∞
t
1 2 θ
−t − 2θ
e √ e t
dt = 0 since e− 2 6= 0
0 2πθ
R∞ −st f (t)dt.
This is same as the Bilateral Laplace Transform of f (t) as −∞ e By the
uniqueness property of the Laplace Transform
Z ∞
e−st f (t)dt = 0
o
⇒ f (t) = 0 ∀ t ∈ (−∞, ∞)
t 1 2
i.e., √ e− 2θ t = 0
2πθ
⇒ t = 0 i.e., x − θ = 0
⇒x = θ >0
Thus x is not equal to zero for θ > 0. The family X ∼ N (0, θ), θ > 0 is not complete.
Problem 4.6 If X ∼ N (0, θ), θ > 0. Prove that T = X 2 is a complete statistic.
T X2
Solution:Let T = (X − 0)2 , then θ = θ ∼ χ2 distribution with one degree of
T
freedom. θ has the pdf of G( 12 , 21 ).
1 t 1
1
1
e− 2 θ ( θt ) 2 −1 1θ 0<t<∞
pθ (t) = 2 2 Γ 12
0 otherwise
1
√ 1 e− 2θ t 12 −1
t 0<t<∞
= 2πθ
0
otherwise
Eθ [g(T )] = 0
Z ∞
1 t 1
g(t) √ e− 2θ t 2 −1 dt = 0
0 2πθ
Z ∞
1 1
e− 2θ t [g(t)t− 2 ]dt = 0 ∀ θ > 0
0
A.Santhakumaran 147
R ∞ −st
This is same as the Laplace Transform of f (t) as 0 e f (t)dt.
Using the uniqueness property of Laplace Transform
1
g(t)t− 2 = 0 ∀ t > 0
i.e., g(t) = 0 ∀ t > 0. Thus T = X 2 is a complete statistic .
Problem 4.7 Examine whether the family of distributions
2θ if 0 < x < 12 , 0 < θ < 1
pθ (x) =
2(1 − θ) 1
if 2 ≤x<1
is complete.
Solution: Consider the mathematical expectation of the function g(X)
Eθ [g(X)] = 0
Z 1 Z 1
2
⇒ g(x)2θdx + g(x)2(1 − θ)dx = 0
1
0 2
Z 1 Z 1
2
2θ g(x)dx + 2(1 − θ) g(x)dx = 0
1
0 2
Z 1 Z 1 Z 1
2
θ g(x)dx − θ g(x)dx + g(x)dx = 0
1 1
0 2 2
1
"Z Z 1 # Z 1
2
θ g(x)dx − g(x)dx + g(x)dx = 0
1 1
0 2 2
1
"Z Z 1 #
2
θ g(x)dx − g(x)dx = 0
1
0 2
Z 1
and g(x)dx = 0
1
2
Z 1 Z 1
2
g(x)dx = g(x)dx θ 6= 0
1
0 2
Z 1
2
⇒ g(x)dx = 0
0
choose
1
+1 if 0 < x < 4
1 1
−1
if ≤x<
4 2
g(x) =
1 3
+1 if 2 ≤x< 4
−1 3
≤x<1
if 4
Z 1 Z 1
4 2
Eθ [g(X)] = (+1)2θdx + (−1)2θdx
1
0 4
Z 3 Z 1
4
+ (+1)2(1 − θ)dx + (−1)2(1 − θ)dx
1 3
2 4
1 1 1 1
= 2θ − 2θ + 2(1 − θ) − 2(1 − θ)
4 4 4 4
= 0
then g(t) = g + (t) − g − (t) and both g + (t) and g − (t) are non - negative functions
[g + (t) − g − (t)]eθt+s(t) = 0 ∀ θ ∈ Ω
X
t
g − (t)eθt+s(t) ∀ θ ∈ Ω
X X
g + (t)eθt+s(t) =
t t
g + (t)eθt+s(t)
p+ (t) = P + θt+s(t)
t g (t)e
g − (t)eθt+s(t)
p− (t) = P − θt+s(t)
t g (t)e
p− (t)eδt ∀ δ ∈ Ω
X X
p+ (t)eδt =
t t
Pn 2 2
( i=1 Xi ) = n2 X̄ 2 and X̄ ∼ N (θ, θn )
R ∞ 2 √n − n (x̄−θ)2
Eθ [X̄ 2 ] = −∞ x̄
√ e 2θ2
2πθ
dx̄
x̄−θ √
If z = θ n, then x̄ − θ = z √θn and dx̄ = √θ dz
n
√ 2
R∞ n − z √θ
.. . Eθ [X̄ 2 ] = −∞ (θ + z √θn )2 √2πθ e 2 n dz
R ∞h z2
2
i
= θ2 −∞ 1 + zn + √2zn √12π e− 2 dz
2
z
1 R ∞ 2 √1
= θ2 1 + n −∞ z 2π
e− 2 dz + 0
A.Santhakumaran 150
1 1
One can take z 2 = t, then z = t 2 and dz = 21 t 2 −1 dt
t 1
h R ∞ i
i.e., Eθ [X̄ 2 ] = θ2 1 + √2
n 2π 0 te− 2 12 t 2 −1 dt
1 ∞ t 3
Z
= θ2 1 + √ e− 2 t 2 −1 dt
n 2π 0
Γ 32
" #
2 1
= θ 1+ √ 3
n 2π ( 12 ) 2
√ √ #
1 12 π2 2
"
2
= θ 1+ √
n 2π
1 n+1 2
= θ2 1 + = θ
n n
" n #2
.
X n+1
. . Eθ Xi = Eθ [nX̄]2 = n2 Eθ [X̄]2 = n2 θ2
i=1
n
= n(n + 1)θ2
n
X n
X
Consider Xi2 = (Xi − θ + θ)2
i=1 i=1
Xn n
X
= (Xi − θ)2 + nθ2 + 2θ (Xi − θ)
i=1 i=1
Xn
= (Xi − θ)2 + 2θnx̄ − nθ2
i=1
" n #
X
Eθ Xi2 = Eθ [ns2 ] + 2θnEθ [X̄] − nθ2
i=1
= Eθ [ns2 ] + 2nθ2 − nθ2 = Eθ [ns2 ] + nθ2
1X
where s2 = (xi − θ)2
n
Pn
ns2 (Xi −θ)2
Let Y = σ2
= i=1
θ2
∼ χ2 distribution with n degrees of freedom. Y has the
pdf G( n2 , 12 )
1 n
1
1
e− 2 y y 2 −1 0 < y < ∞
p(y) = 22 Γn
2
0 otherwise
Z ∞
1 1 n
E[Y ] = n
n
e− 2 y y 2 +1−1 dy = n
0 2 Γ22
" #
ns2
i.e., Eθ = n
σ2
Eθ [s2 ] = θ2 since σ 2 = θ2
A.Santhakumaran 151
n
X
Eθ [ Xi2 ] = nθ2 + nθ2 = 2nθ2
i=1
" n #2 " n #
X X
Eθ [g(X)] = 2Eθ Xi − (n + 1)Eθ Xi2
i=1 i=1
2 2
= 2n(n + 1)θ − (n + 1)2nθ = 0
is complete.
Solution: Consider the expectation Eθ [g(X)] = 0
Z θ Z 1
θg(x)dx + (1 + θ)g(x)dx = 0 + 0
0 θ
Z θ
⇒ g(x)dx = 0 and
0
Z 1
g(x)dx = 0
θ
One can differentiate the above integrals with respect to θ
Z θ
0dx + g(θ) × 1 − g(0) × 0 = 0 and
0
Z 1
0dx + g(1) × 0 − g(θ) × 1 = 0
θ
V [g(T )]
P {|g(T ) − E[g(T )]| < } ≥ 1 − for every given > 0
2
V [g(T )]
P {|g(T )| < } ≥ 1 − for every given > 0
2
⇒ |g(t)| < ∀ t ∈ <
Now the function g(x) = x is bounded. If the family is bounded complete, then
n
θ
= θ(1 − θ)−2
X
xθx =
x=0
(1 − θ)2
Xn
xθx = θ[1 + 2θ + 3θ2 + · · ·]
x=0
= [θ + 2θ2 + 3θ3 + · · ·]
∞
X ∞
X
= xθx = xθx
x=1 x=0
Xn X∞
= xθx + xθx
x=0 x=n+1
∞
X
⇒ xθx = 0
x=n+1
Eθ [g(X)] = 0
∞
X
i.e., g(x)pθ (x) = 0
x=−1
∞
X
g(−1)θ + g(x)(1 − θ)2 θx = 0
x=0
X∞
g(x)(1 − θ)2 θx = −g(−1)θ
x=0
∞
−g(−1)θ
= −g(−1)θ(1 − θ)−2
X
g(x)θx =
x=0
(1 − θ)2
X∞
g(x)θx = −g(−1)θ[1 + 2θ + 3θ2 + · · · +
x=0
nθn−1 + (n + 1)θn + · · ·]
nθn + (n + 1)θn+1 + · · ·]
∞
X
= −g(−1) xθx
x=1
X∞
= −g(−1) xθx
x=0
⇒ g(x) = −g(−1)x = cx where c = −g(−1) and c ∈ <
Problem 4.11 Examine the family of distributions of the random variable X given
by Pθ {X = −1} = θ2 , Pθ {X = 0} = 1 − θ and Pθ {X = 1} = θ(1 − θ), 0 < θ < 1 is
complete.
Solution: For single observation X , consider
Eθ [g(X)] = 0
⇒ g(−1) = g(0)
⇒ g(0) = g(1)
g(1) = 0 coefficient of θ0
Hence g(−1) = g(1) = g(0) = 0. Thus g(x) = 0 for x = −1, 0 and 1. .˙. The family
of distributions is complete.
Problem 4.12 The random variable X has the following distribution
X=x: 0 1 2
Pθ {X = x} 1 − θ − θ2 θ θ2
Eθ [g(X)] = 0
g(0) = 0 coefficient of θ0
A.Santhakumaran 155
Hence g(0) = g(1) = g(2) = 0, i.e., g(x) = 0 for x = 0, 1 and 2. Thus the family of
distributions is complete.
Problem 4.13 X has the following distribution
X=x: 1 2 3 4 5 6
1 1 1 1 1 1
Pθ {X = x} 6 6 6 6 6 6
Consider E[g(X)] = 0
3c 3c
⇒ − + = 0
6 6
But g(x) 6= 0 for x = 1, 2, 3, 4, 5, 6.
1
pN (x) = , x = 1, 2, · · · , N and ∀ N = 1, 2, · · ·
N
i.e., pN =1 (x) = 1, x = 1
1
pN =2 (x) = , x = 1, 2
2
1
pN =3 (x) = , x = 1, 2, 3
3
······ ··· ···············
Consider EN g(X) = 0 ∀ N ∈ I+
PN 1
i.e., x=1 g(x) N = 0 ⇒ g(x) = 0 ∀ x and ∀ N
When N = 1 ⇒ g(1) = 0
A.Santhakumaran 156
g(x) = 0 ∀ x = 3, 4, · · · , N and ∀ N = 2, 3, 4, · · ·
PN 1
then x=1 g(x) N = 0 when N = 2, 3, · · ·
⇒ g(x) = 0 ∀ x = 1, 2, 3, · · · , N and N = 2, 3, · · · . This means that the family
of distributions is bounded complete. Thus there exist is a class of zero unbiased
estimators, i.e.,U0 = {g(X) | c ∈ <} where
(−1)x−1 c x = 1, 2 and c ∈ <
g(x) =
0
otherwise
If the family of distributions is complete, then the unbiased estimator of zero is unique.
A.Santhakumaran 157
= P {Xn = xn }
= pθ (xn )
Lehmann and Scheffe technique for obtaining a minimal sufficient statistic is par-
tition of the sample space. Once the partition is obtained, a minimal sufficient statistic
can be defined by assigning distinct numbers to distinct partition sets.
In constructing sets of a partition that is to be sufficient for the family of
densities pθ (x), for θ ∈ Ω, there is two sets of sample points X1 = x1 , · · · , Xn = xn and
Y1 = y1 , Y2 = y2 , · · · , Yn = yn will lie on the same partition of the minimal sufficient
partition iff the ratio of x1 , x2 , · · · , xn to its value at y1 , y2 , · · · , yn :
pθ (x1 , x2 · · · , xn )
= k(y1 , · · · , yn ; x1 , x2 , · · · , xn )
pθ (y1 , y2 , · · · , yn )
The reason for writing the definition in terms of a product rather than a
ratio is taken into account the points for which pθ (x1 , x2 , · · · , xn ) = 0, i.e., all points
x1 , x2 , · · · , xn such that pθ (x1 , x2 , · · · , xn ) = 0 ∀ θ ∈ Ω will be equivalent, and every
x1 , x2 · · · , xn will be lie in some partition D, namely in D(x1 , x2 , · · · , xn ) and there
is no overlapping of the D’s, so that they constitute a partition of the sample space.
For if two D’s, say D(x1 , x2 , · · · , xn ) and D(y1 , y2 , · · · , yn ) have a point z1 , z2 , · · · , zn
in common, then z1 , z2 , · · · , zn is equivalent to both x1 , x2 , · · · , xn and y1 , y2 , · · · , yn
which are then equivalent to each other and define the same D. Thus the partition of
the sample space D defines the minimal sufficient partition.
Problem 4.16 Let X1 , X2 , · · · Xn be iid random sample drawn from a Binomial
population b(n, θ). Obtaining the minimal sufficient statistic by partition method.
Solution: The joint pdf at X1 = x1 , X2 = x2 , · · · , Xn = xn is
P P
xi
pθ (x1 , x2 , · · · , xn ) = θ (1 − θ)n− xi
The ratio is
P xi =P yi
pθ (x1 , x2 , · · · , xn ) θ
= .
pθ (y1 , y2 , · · · , yn ) 1−θ
yi . Thus the points x1 , x2 , · · · , xn and
P P
The ratio is independent of θ iff xi =
y1 , y2 , · · · , yn whose coordinates have the same set of minimal sufficient partition.
P
Therefore Xi is a minimal sufficient statistic.
Problem 4.17 Let X1 , X2 , · · · , Xn be iid random sample from N (θ, σ 2 ). Assume θ
and σ 2 are unknown. Prove that ( Xi2 ) is a minimal sufficient statistic.
P P
Xi ,
Solution: Consider the ratio
pθ,σ2 (x1 , x2 , · · · , xn ) 1 hX 2 X 2
X X i
= exp − 2 xi − yi − 2θ xi − yi
pθ,σ2 (y1 , y2 , · · · , yn ) 2σ
Problem 4.18 Determine the minimal sufficient statistic based on a random sample
of size from each of the following:
(i)
θe−θx
θ>0
pθ (x) =
0
otherwise
(ii)
x exp[− x2 ] x > 0
θ 2θ
pθ (x) =
0
otherwise
and (iii) q
x2
2 x2 − σ 2
e x>0
πσ 3
pσ (x) =
0 otherwise
pθ (x1 , x2 , · · · , xn ) h X X i
= exp −θ xi − yi .
pθ (y1 , y2 , · · · , yn )
P P P
The ratio is independent of θ iff xi = yi . Therefore Xi is a minimal sufficient
statistic.
(ii) Consider the ratio
pθ (x1 , x2 , · · · , xn ) Y xi 1 X 2 X 2
= exp − xi − yi .
pθ (y1 , y2 , · · · , yn ) yi 2θ
P 2 P 2 P 2
The ratio is independent of the parameter θ iff xi = yi . Therefore Xi is a
minimal sufficient statistic.
(iii) Consider the ratio
!
pσ (x1 , x2 , · · · , xn ) Y x2i 1 X 2 X 2
= exp − 2 xi − yi
pθ (y1 , y2 , · · · , yn ) yi2 2σ
P 2
xi = yi2 . Therefore Xi2 is a minimal sufficient
P P
The ratio is independent of σ iff
statistic.
Theorem 4.3 The Exponential family of distributions consists of those dis-
tributions with densities or probability functions expressible in the form: pθ (x) =
c(θ)eQ(θ)t(x) h(x), i.e., pθ (x) is a member of exponential family, then there exist is
a minimal sufficient statistic.
A.Santhakumaran 160
Proof: The joint density function of the random sample X1 , X2 , · · · , Xn for a random
variable X is
X Y
pθ (x1 , x2 , · · · , xn ) = [c(θ)]n exp[Q(θ) t(xi )] h(xi ).
and
pθ0 (x1 , x2 · · · , xn ) = pθ0 (t)h(x1 , x2 , · · · , xn )
pθ1 (x1 ,x2 ,···,xn ) pθ1 (t)
respectively. Let the ratio pθ0 (x1 ,x2 ,···,xn ) = pθ0 (t) be a function of u(x), then U =
p (X)
u(X1 , X2 , · · · , Xn ) is a sufficient statistic for pθθ1 (X) iff T is a function of U . This
0
1 2 2
= e 2 [2n(θ−θ0 )x̄−n(θ −θ0 )]
A.Santhakumaran 161
1
P P P
− [ x2i −2 xi yi − yi2 ]
e 2(1−ρ2 )
= 1
P P P
− [ u2i −2 ui vi + vi2 ]
e 2(1−ρ2 )
x 1 2 3 4 5 6
θ1 1/30 1/15 1/10 4/15 2/ 15 2/5
θ2 1/60 1/30 1/20 1/3 1/15 1/2
pθ (x)
For T (x) = T (y) the ratios pθ (y) on the partitions A1 and A2 are free of θ . The
minimal sufficient statistics is
a if x ∈ A1 = {1, 2, 3, 5}
T (X) =
b
if x ∈ A2 = {4, 5}
Problems
X=x: 0 1 2
Pθ {X = x} 1 − θ − θ2 θ2 θ
4.3 Let X1 , X2 , · · · , Xn be iid random variables from ∪(0, θ). Prove that the statistic
YN = max1≤i≤n {Xi } is complete.
is complete.
4.6 Let X1 , X2 , · · · , Xn be a sample from ∪(θ − 12 , θ + 12 ), θ ∈ <. Show that the statistic
T = (min1≤i≤n (Xi ), max1≤i≤n (Xi )) is not complete.
4.9 Prove that a complete sufficient statistics is minimal sufficient whenever minimal
sufficient statistic exists.
where θ is a scale parameter( the pdf is scale density ). Find the ancillary statistic
for the family of distributions.
4.13 Let X be a discrete random variable with the following pmf pθ (x) .
x 1 2 3 4 5 6
θ1 1/14 2/14 3/14 3/14 4/14 1/14
θ2 1/18 2/18 5/18 5/18 4/18 1/18
1
4.15 For a fixed n0 = 1, 2, · · · from the family of densities {pN (x) = N,x =
1, 2, 3, · · · N, N ∈ I+ }. Let F = {pN (x) = 1
N,x = 1, 2, 3 · · · , N, N ∈ I+ and N 6=
n0 } where
1
x = 1, 2, · · · , N, N ∈ I+
N
pN (x) =
0
otherwise
then
4.18 Let X1 , X2 , · · · , Xn be iid uniform (θ1 , θ2 ) variables, where θ1 < θ2 are unknown
parameters. Which of the following is an ancillary statistic?.
X(k) X(k)
(a) X(n) for any k < n (c) X(n) −X(k) for any k < n
X(n) −X(k) X(k) −X(1)
(b) X(n) for any k <n (d) X(n) −X(k) for any k where 1 < k < n
Ans:(b)
A.Santhakumaran 166
1
4.22 The family of distributions {pN (x) = N,x = 1, 2, 3, · · · , N, N = 2, 3, 4, · · ·} is bounded
4.23 Which of the following statements are true?. Random variable X ∼ N (θ, σ 2 ), then
(a) P {X = θ} = 0 (b) P {X > θ} = 0.5 (c) P {X < Median} = 0.5
5.1 Introduction
Let g(T ) be an unbiased estimator of τ (θ) and δ(T ) be an another unbiased estimator
of τ (θ) different from g(T ). Then there always exists an infinite number of unbiased
estimators of τ (θ) such that λg(T ) + (1 − λ)δ(T ), 0 < λ < 1. In this case one can find
the best estimator or optimal estimator among all the unbiased estimators. The following
procedures are used to identify the optimal estimator.
• Uncorrelatedness approach
Theorem 5.1 Let U be the class of all unbiased estimators T = t(X) of a parameter
τ (θ) ∀ θ ∈ Ω with Eθ [T 2 ] < ∞ for all θ. Suppose that U is a non-empty set. Let U0 be
the set of all unbiased estimators of V of zero, i.e.,
U0 = {V | Eθ [V ] = 0, Eθ [V 2 ] < ∞ ∀ θ ∈ Ω}
Eθ [T + λ V ] = τ (θ) + λEθ [V ]
= τ (θ) since Eθ [V ] = 0
Vθ [T ] + λ2 Vθ [V ] + 2λCovθ [V, T ] ≥ Vθ [T ]
It is an quadratic equation in λ and it has two real roots λ = 0 and λ = − 2Covθ [T,V ]
Vθ [V ] . If
λ = 0, trivially T is an UMVUE of τ (θ).
For λ 6= 0, take λ0 = λ
2 = − Covθ [T,V ]
Eθ [V 2 ]
, then one can define T 0 ∈ U where T 0 = T + λ0 V
and Eθ [T + λ0 V ] = Eθ [T ] = τ (θ) and
2
Vθ [T 0 ] = Eθ [T + λ0 V ]2 − Eθ [T + λ0 V ]
= Eθ [T + λ0 V ]2 − τ 2 (θ)
2
= Eθ [T 2 ] − τ 2 (θ) + λ0 Eθ [V 2 ] + 2λ0 Covθ [T, V ]
(Covθ [T, V ])2 2 (Covθ [T, V ])2
= Vθ [T ] + Eθ [V ] − 2
(Eθ [V 2 ])2 Eθ [V 2 ]
(Covθ [T, V ]) 2
= Vθ [T ] −
Eθ [V 2 ]
(Covθ [T, V ])2
Vθ [T 0 ] = Vθ [T ] − ≤ Vθ [T ]
Eθ [V 2 ]
Thus λ0 = − EEθθ[T V]
[V 2 ]
contradicts that T is the UMVUE of τ (θ). If T is the UMVUE of
τ (θ), then Covθ [T, V ] = 0, i.e., Eθ [T V ] = 0 ∀ θ ∈ Ω.
A.Santhakumaran 169
Conversely, assume Covθ [T, V ] = 0 for some θ ∈ Ω. To prove that T is a UMVUE of τ (θ).
Let T 0 be another unbiased estimator of τ (θ) so that T 0 ∈ U, then T 0 − T ∈ U0 . Since
Eθ [T ] = τ (θ) and Eθ [T 0 ] = τ (θ) ⇒
Eθ [T 0 − T ] = 0
⇒ Eθ [T (T 0 − T )] = 0
Eθ [T T 0 ] = Eθ [T 2 ]
n o1 n o1
2
Eθ [T T 0 ] ≤ Eθ [T 2 ] 2
Eθ [T 0 ] 2
Eθ [T 2 ] n
2
o1
1 ≤ Eθ [T 0 ] 2
{Eθ [T 2 ]} 2
Vθ [T ] ≤ Vθ [T 0 ]
Eθ [T 0 ] = τ (θ)
Eθ [T ] = τ (θ)
⇒ Eθ [T 0 − T ] = 0
⇒ Eθ [T (T 0 − T )] = 0
i.e., Eθ [T 2 ] = Eθ [T T 0 ]
Covθ [T, T 0 ] = Vθ [T ]
⇒ Pθ {aT + bT 0 = 0} = 1 ∀ a, b ∈ <
A.Santhakumaran 170
Choose a = 1 and b = −1
⇒ Pθ {T = T 0 } = 1, then T and T 0 are the same. .˙. The UMVUE T is unique.
Theorem 5.3 If UMVUE’s Ti = ti (X), i = 1, 2 exist for real function τ1 (θ) and τ2 (θ)
of θ, then aT1 + bT2 is also UMVUE of aτ1 (θ) + bτ2 (θ).
Proof: Given T1 = t1 (X) is a UMVUE of τ1 (θ), i.e., Eθ [T1 V ] = 0 ∀ θ ∈ Ω and V ∈ U0 .
Again Eθ [T2 V ] = 0, ∀ θ ∈ Ω and V ∈ U0 .
Prove that Eθ {[(aT1 + bT2 )V ]} = 0 ∀ θ ∈ Ω.
= Eθ [aT1 V ] + Eθ [bT2 V ]
= a×0+b×0=0
= Eθ [T V ] − Eθ [Tn V ]
Eθ [T V ] = Eθ [(T − Tn )V ]
But Eθ [(T − Tn )V ] = Eθ [T V ]
1 n o1
.. . |Eθ [T V ]| ≤ Eθ [V 2 ] 2
Eθ [T − Tn ]2 2
n o1
But Eθ [T − Tn ]2 2
≥ 0 and
Eθ [T − Tn ]2 → 0 as n → ∞
.. . Eθ [T V ] → 0 as n → ∞
1 1
Eθ [T ] = Eθ [T1 ] + Eθ [T2 ]
2 2
1 1
= τ (θ) + τ (θ)
2 2
= τ (θ)
1
Vθ [T ] = Vθ [T1 + T2 ]
2
1
= {Vθ [T1 ] + Vθ [T2 ] + 2Covθ [T, T2 ]}
4
1 q
= Vθ [T1 ] + Vθ [T2 ] + 2ρ Vθ [T1 ] + Vθ [T2 ]
4
1
= {2Vθ [T1 ] + +2ρVθ [T1 ]}
4
1
= Vθ [T1 ](1 + ρ)
2
⇒ Vθ [T ] ≥ Vθ [T1 ]
A.Santhakumaran 172
1
Vθ [T1 ](1 + ρ) ≥ Vθ [T1 ]
2
(1 + ρ) ≥ 2
ρ ≥1
Eθ [T ] = θ
Eθ [T 0 ] = θ
Eθ [T − T 0 ] = 0
Eθ [T [T − T 0 ] = 0
Eθ [T 2 − T T 0 ] = 0
Eθ [T 2 ] − Eθ [T T 0 ] = 0
Eθ [T T 2 ] = Eθ [T 2 ]
Problem 5.3 Let T1 , T2 be two unbiased estimates having common variance ασ 2 (α > 1),
where σ 2 is the variance of the UMVUE. Show that the correlation coefficient between T1
2−α
and T2 is greater than or equal to α .
1 1
Vθ [ (T1 + T2 )] = {Vθ [T1 ] + Vθ [T2 ] + 2Covθ (T1 , T2 )}
2 4
1 q
= Vθ [T1 ] + Vθ [T2 ] + 2ρ Vθ [T1 ]Vθ [T2 ]
4
1
= [2Vθ [T1 ] + 2ρVθ [T2 ]]
4
1
= [Vθ [T1 ] + ρVθ [T1 ]
2
where ρ is the correlation coefficient between T1 and T2 . Let T be the UMVUE of τ (θ).
1
Vθ { [T1 + T2 ]} ≥ Vθ [T ]
2
1
i.e., Vθ [T1 ](1 + ρ) ≥ Vθ [T ]
2
1 2
ασ (1 + ρ) ≥ σ2
2
α(1 + ρ) ≥ 2
2
(1 + ρ) ≥
α
2−α
ρ ≥
α
2
{E[δ(T ) | T 1 | T ]} ≤ E[δ 2 (T ) | T ]E[12 | T ]
2
i.e., {E[δ(T ) | T ]} ≤ E[δ 2 (T ) | T ]
i.e., Eθ [g 2 (T )] Eθ E[δ 2 (T ) | T ] = Eθ [δ 2 (T )]
≤
Eθ [δ 2 (T )] = Eθ [g 2 (T )]
2
i.e., Eθ E[δ 2 (T ) | T ]
= Eθ {E[δ(T ) | T ]}
Eθ [V ar[δ(T ) | T ]] = 0
2
V ar[δ(T ) | T ] = 0 iff E[δ 2 (T ) | T ] = {E[δ(T ) | T ]}
If this is the case , then E[δ(T ) | T = t] = g(t) and the statistic g(T ) is a function of T.
Remark 5.1 The Rao - Blackwell Theorem has the following limitations.
(i) If the unbiased estimator T = t(X) is already a function of only one sufficient statis-
tic, then the derived statistic is identical to T = t(X). In this case there is no
improvement in the variance of the statistic T = t(X).
(ii) If more than one sufficient statistic exists, then one can improve the variance of
the unbiased estimator by using minimal sufficient statistics, since the set of jointly
A.Santhakumaran 175
Lehman -Scheffe Theorem states that if a complete sufficient statistic exists, then
the UMVUE of τ (θ) is unique. But it does not mean that only the complete sufficient
statistic has UMVUE. Even if a complete sufficient statistic does not exist, an UMVUE
may still exist.
Theorem 5.6 If T = t(X) is a complete sufficient statistic and there exists an unbiased
estimator δ(T ) of τ (θ), then there exists a unique UMVUE of τ (θ) which is given by
E[δ(T ) | T = t] = g(t).
Proof: Rao - Blackwell Theorem gives E[δ(T ) | T = t] = g(t) and g(T ) is the UMVUE
of τ (θ). It is only to prove that g(T ) is unique. If δ1 (T ) ∈ U and δ2 (T ) ∈ U , then
Eθ [E[δ1 (T ) | T ]] = τ (θ) and Eθ [E[δ2 (T ) | T ]] = τ (θ) ∀ θ ∈ Ω.
Eθ [E[δ1 (T ) | T ] − E[δ2 (T ) | T ]] = 0 ∀ θ ∈ Ω
⇒ E[δ1 (T ) | T ] − E[δ2 (T ) | T ] = 0
E[δ1 (T ) | T ] = E[δ2 (T ) | T ]
.˙. The UMVUE g(T ) is unique, if the sufficient statistic T = t(X) is complete.
From the Theorems 5.5 and 5.6, the UMVUE of τ (θ) is obtained by solving a set
of equations and conditioning on the sufficient statistic.
e−nθ (nθ)t
p(t | θ) = t = 0, 1, 2, · · ·
t!
= 0 otherwise
1
p(t | θ) = e−nθ et log nθ
t!
= c(θ)eQ(θ)t(x) h(x)
Pn
where c(θ) = e−nθ , Q(θ) = log nθ, t(x) = i=1 xi , h(x) = t!1 .
.˙. The statistic is complete and sufficient. Thus the UMVUE g(T ) of θ + 2 is
Eθ [g(T )] = θ + 2
∞
1
g(t)e−nθ (nθ)t
X
= θ+2
t=0
t!
∞
X 1
g(t)nt θt = (θ + 2)enθ
t=0
t!
∞
X (nθ)t
= (θ + 2)
t=0
t!
∞ ∞
X 1 X 1
= nt θt+1 +2 nt θ t
t=0
t! t=0
t!
Equivating the coefficient of θt
on both sides
n t nt−1 nt
g(t) = +2
t! (t − 1)! t!
t
g(t) = +2
n
A.Santhakumaran 177
P
xi
= +2
n
= x̄ + 2
(n − r)t−r+1 t!
g(t) = t
, r = 1, 2, · · · and n > r
n (t − r + 1)!
Thus the UMVUE of θr−1 e−rθ is
(n − r)T −r+1 T!
T
, r = 1, 2, · · · and n > r.
n (T − r + 1)!
t t T
n−1 1 1
Remark 5.2 When r = 1, g(t) = n = 1− n , n = 2, 3, · · · , then 1 − n is
the unbiased estimator of e−θ where T =
P
Xi .
T 2 T
= 3, 4, · · · is the UMVUE of e−2θ θ where T =
P
When r = 2, (n−2) [1− n ] , n Xi .
Problem 5.6 Obtain the UMVUE of θr + (r − 1)θ, r = 1, 2, · · · for the random sample
of size n from Poisson distribution with parameter θ.
Pn
Solution: As in the problem 5.1, T = i=1 Xi is complete and sufficient. There exists
a UMVUE of τ (θ) = θr + (r − 1)θ, r = 1, 2, · · ·
∞
(nθ)t
g(t)e−nθ
X
Eθ [g(T )] = = θr + (r − 1)θ
t=0
t!
∞
X nt θ t
g(t) = [θr + (r − 1)θ]enθ
t=0
t!
A.Santhakumaran 178
= θr enθ + (r − 1)θenθ
∞ ∞
r
X nt θ t X nt θ t
= θ + (r − 1)θ
t=0
t! t=0
t!
∞ ∞
X 1 X 1
= nt θt+r + (r − 1) nt θt+1
t=0
t! t=0
t!
Equivating the coefficient of θt on both sides
nt nt−r nt−1
g(t) = + (r − 1)
t! (t − r)! (t − 1)!
1 t! 1 (r − 1)
= + t!
nr (t − r)! n (t − 1)!
t(t − 1) · · · · · · (t − r + 1) (r − 1)
= + t
nr n
The UMVUE of θr + (r − 1)θ is
T (T − 1) · · · · · · (T − r + 1) (r − 1)
g(T ) = + T, r = 1, 2, · · ·
nr n
Remark 5.3 When r = 1, X̄ is the UMVUE of θ.
X̄(nX̄−1)
When r = 2, n + X̄ is the UMVUE of θ2 + θ.
Problem 5.7 Obtain UMVUE of θ(1 − θ) using a random sample of size n drawn from
a Bernoulli population with parameter θ.
Solution:
θ x (1 − θ)1−x
x = 0, 1
Given pθ (x) =
0
otherwise
n
X
Let T = Xi , then T ∼ b(n, θ)
i=1
i.e., pθ (x) = cnt θt (1 − θ)n−t t = 0, 1, 2, · · · , n
t
θ
= cnt (1 − θ)n
1−θ
θ
= (1 − θ)n et log( 1−θ ) cnt
= c(θ)eQ(θ)t(x) h(x)
θ X
n
where c(θ) = (1 − θ) , Q(θ) = log , t(x) = xi and h(x) = cnt .
1−θ
P
It is an one parameter exponentially family. .˙. The statistic T = Xi is complete and
sufficient. The UMVUE of θ(1 − θ) is
Eθ [g(T )] = θ(1 − θ)
A.Santhakumaran 179
∞
X
g(t)cnt θt (1 − θ)n−t = θ(1 − θ)
t=0
∞ t
θ
= θ(1 − θ)(1 − θ)−n
X
g(t)cnt
t=0
1−θ
θ
One can take ρ = , then
1−θ
θ 1
1+ρ = 1+ =
1−θ 1−θ
1
Thus 1 − θ =
1+ρ
ρ
⇒ θ =
1+ρ
∞
X
g(t)ρt cnt = ρ(1 + ρ)n−2
t=0
= ρ[1 + c1n−2 ρ + · · · + ρn−2 ]
= ρ + c1n−2 ρ2 + · · · + ρn−1
n−1
!
X n-2
= t-1 ρt
t=1
g(t)cnt = cn−2
t
(n − 2)! t!(n − t)!
g(t) =
(t − 1)!(n − t − 1)! n!
(n − 2)!t(t − 1)!(n − t)(n − t − 1)!
=
(t − 1)!(n − t − 1)!n(n − 1)(n − 2)!
t(n − t)
= if n = 2, 3, · · ·
n(n − 1)
T (n−T )
i.e., n(n−1) is the UMVUE of θ(1 − θ).
1
Problem 5.8 Obtain the UMVUE of p of the pmf
pq x
x = 0, 1, · · ·
pp (x) =
0
otherwise
= c(p)eQ(p)t(x) h(x)
X
where c(p) = pn , Q(p) = log(1 − p), t(x) = xi , h(x) = 1.
This is an one parameter exponentially family which is complete and sufficient. Thus
there exist an unique UMVUE of p1 . It is given by Ep [g(T )] = p1 .
Pn
The statistic T = i=1 Xi is the sum of n iid geometric variables with same
parameter p has the Negative Binomial distribution. The pmf of T is
n+t-1
!
n-1 pn q t t = 0, 1, · · ·
pp (t) = P {T = t} =
0 otherwise
∞ n+t-1
!
X 1
g(t) n-1 pn q t =
t=0
p
∞ n+t-1
!
q t = (1 − q)−(n+1)
X
g(t) t
t=0
∞ n+t
!
X
= t qt
t=0
n+t-1
!
Equivating the coefficient of q t on both sidesg(t) t
n+t
!
= t
1
Ep [g(X)] =
p
∞
X 1
g(x)pq x =
x=0
p
∞ ∞
g(x)q x = (1 − q)−2 =
X X
(x + 1)q x
x=0 x=0
→ g(x) = x + 1
1
Thus the UMVUE of p is X + 1.
Problem 5.10 Let X1 , X2 , · · · Xn be iid N(θ, 1). Prove that E[X1 | Y ] = x̄ where
Pn
Y = i=1 Xi .
σX1
E [X1 | Y ] = Eθ [X1 ] + bX1 Y (Y − Eθ [Y ]) where bX1 Y = ρ
σY
A.Santhakumaran 182
Pn
and ρ is the correlation coefficient between X1 and Y = i=1 Xi
Cov[X1 , Y ]
ρ =
σX1 σY
X √
Y = Xi ∼ N (nθ, n) σY = n, σX1 = 1
= nθ2 − θ2 + 1 + θ2
= nθ2 + 1
The pdf of T is
Z t n−1
n! 1 1
pθ (t) = dx 0<t<θ
1!(n − 1)! 0 θ θ
nn tn−1
0<t<θ
θ
pθ (t) =
0
otherwise
A.Santhakumaran 183
Thus T = t(X) is a complete and sufficient statistic. δ(T ) = 2X1 is an unbiased estimator
Rθ
of θ, since Eθ [X1 ] = 0 x1 1θ dx1 = θ
2. The UMVUE of θ is given by g(T ) and g(t) =
E[2X1 | T = t].
When x1 = t the conditional pmf of X1 given T = t is p(x1 | t) = n1 .
When 0 < x1 < t the conditional density of X1 given T = t is
1 (n−1) n−2
pθ (x1 , t) θ θn−1 t
pθ (x1 | t) = = n n−1 0 < x1 < t
pθ (t) nt
θ
n−1 1
0 < x1 < t
n t
=
0
otherwise
Z t
1
E[2X1 | T = t] = 2x1 pθ (x1 | t)dx1 + 2t
0 n
n−11 t Z
2t
= 2 x1 dx1 +
n t 0 n
n−11t 2 2t
= 2 +
n t 2 n
1
= (1 + )t
n
A.Santhakumaran 184
The pdf of T is
Z ∞ n−1
n!
pθ (t) = e−(t−θ) e −(x−θ)
dx
1!(n − 1)! θ
ne−n(t−θ)
θ<t<∞
=
0
otherwise
Eθ [g(T )] = 0
Z ∞
g(t)ne−n(t−θ) dt = 0
θ
Z ∞
g(t)e−n(t−θ) dt = 0
θ
= 0∀ θ<t<∞
Z ∞ Z ∞
= e−z z 2−1 dz + θ e−z z 1−1 dz
0 0
= Γ2 + θΓ1
= 1+θ
Eθ [X1 − 1] = θ
If one can take δ(T ) = X1 − 1, then the UMVUE of θ is given by g(T ) and g(t) =
E[(X1 − 1) | T = t].
When x1 = t, the conditional pmf of X1 given T = t is pθ (x1 | t) = n1 .
When t < x1 < ∞, the conditional density of X1 given T = t is
e−(x1 −θ) (n − 1)e−(n−1)(t−θ)
pθ (x1 | t) =
ne−n(t−θ)
n − 1 −(x1 −t)
= e
n Z ∞
(n − 1) 1
E[(X1 − 1) | T = t] = (x1 − 1)e−(x1 −t) dx1 + (t − 1)
n n
Z t∞ Z ∞
n−1 n−1
= x1 e−(x1 −t) dx1 − e−(x1 −t) dx1
n t n t
1
+ (t − 1)
n Z
n−1 ∞ n − 1 ∞ −z
Z
= (z + t)e−z dz − e dz
n 0 n 0
1
+ (t − 1) where z = x1 − t
n Z
n − 1 ∞ −z 2−1
Z ∞
n−1
= e z dz + t e−z z 1−1 dz
n 0 n 0
n − 1 ∞ −z 1−1
Z
1
− e z dz + (t − 1)
n 0 n
n−1 n−1 n−1 1
= Γ2 + t− Γ1 + (t − 1)
n n n n
n−1 1
= t + (t − 1)
n n
1
= t−
n
1
The UMVUE of θ is T − 1
n and the UMVUE of eθ is e{T − n }.
Problem 5.13 Let X1 and X2 be a random sample drawn from a population with pdf
1 e− xθ
0<x<∞
θ
pθ (x) =
0
otherwise
1 − 1 (x1 +x2 )
pθ (x1 , x2 ) = e θ
θ2
1 −1t
= e θ
θ
= c(θ)eQ(θ)t(x) h(x)
2
1 1 X
where c(θ) = , Q(θ) = − , t(x) = xi , h(x) = 1
θ2 θ i=1
∂x1 ∂x1
∂t ∂t1
J =
∂x2 ∂x2
∂t ∂t1
1 −1
=
0 1
The joint density of T and T1 is
12 e− θ1 t
0 < t1 < t < ∞
θ
p(t, t1 | θ) =
0
otherwise
0 < t1 < t or
1 e− θ1 t
t1 < t < ∞
= θ2
0 otherwise
The pdf of T is
Z t
pθ (t) = p(t, t1 | θ)dt1
0
1 t −1t
Z
= e θ dt1
θ2 0
1 −1t
= e θ t 0<t<∞
θ2
1
1
e− θ t t2−1 0<t<∞
θ2 Γ2
=
0
otherwise
The pdf of T1 is
Z ∞
1 −1t
pθ (t1 ) = 2
e θ dt
t1 θ
− θ1 t ∞
" #
1 e
=
θ − 1θ t1
A.Santhakumaran 187
1 − 1 t1
= e θ 0 < t1 < ∞
θ
The conditional density of T1 given T = t is
1
0 < t1 < t
t
p(t1 | t) =
0
otherwise
Eθ [X2 ] = θ .˙. δ(T ) = X2 = T1
Problem 5.14 The random variables X and Y have the joint pdf
22 e− θ1 (x+y)
0<x<y<∞
θ
p(x, y | θ) =
0
otherwise
Show that
(i) Eθ [Y | X = x] = x + θ
(ii) Eθ [Y ] = Eθ [X + θ] and
(iii) Vθ [X + θ] ≤ Vθ [Y ]
22 e− 2x
θ 0<x<∞
θ
=
0
otherwise
y
2 e− θ − 2 e− θ2 y
0<y<∞
θ θ
=
0
otherwise
The conditional pdf of Y given X = x is
2 − x+y
θ2
e θ
pθ (y | x) =
2 − θ2 x
e
θ
y
1 e xθ e− θ
x<y<∞
θ
=
0
otherwise
Z ∞
Eθ [Y | X = x] = ypθ (y | x)dy
x
x Z ∞
eθ y
= ye− θ dy
θ
Z x∞
x y
= e θ e− θ dy + x
x
= x+θ
Z ∞ ∞
2 2
Z
− yθ 2
Eθ [Y ] = ye dy − ye− θ y dy
θ 0 θ 0
2 Γ2 2 Γ2
= −
θ ( 1θ )2 θ ( 2θ )2
3
= θ
2
7θ2 5
Eθ [Y 2 ] = , Vθ (Y ) = θ2
2
Z ∞
4
2 −2x θ
Eθ [X] = 2
e θ dx =
0 θ 2
θ 3
Eθ [X + θ] = + θ = θ = Eθ [Y ]
2 2
θ2
Vθ [X + θ] = Vθ [X] =
4
Thus Vθ [X + θ] ≤ Vθ [Y ].
PN {X(n) ≤ x} = PN {X1 ≤ x1 , X2 ≤ x2 , · · · , Xn ≤ xn }
A.Santhakumaran 189
= PN {X1 ≤ x1 } · · · PN {Xn ≤ xn }
n
x x x
= ··· =
N N N
x−1 n
PN {X(n) ≤ x − 1} =
N
PN {X(n) = x} = PN {X(n) ≤ x} − PN {X(n) ≤ x − 1}
n n−1
x x−1
= −
N N
N
" n n−1 #
t t−1
X
EN [g(T )] = g(t) − =0
t=1
N N
g(t) = 0 ∀ t = 1, 2, · · · , N
N
X 1
EN [X1 ] = x1
x1 =1
N
1 N (N + 1) N +1
= =
N 2 2
EN [2X1 ] = N +1
EN [2X1 − 1] = N
E[(2X1 − 1) | X(n) = x]
x−1
X
= (2x1 − 1)PN {X1 = x1 | X(n) = x}
x1 =1
+(2x − 1)PN {X1 = x1 | X(n) = x}
x−1
xn−1 − (x − 1)n−1 X
= (2x1 − 1)
xn − (x − 1)n x =1
1
xn−1
+ n (2x − 1)
x − (x − 1)n
x−1
xn−1 X
= (2x1 − 1)
xn − (x − 1)n x =1
1
xn−1
+ n (2x − 1)
x − (x − 1)n
x−1
(x − 1)n−1 X
− (2x1 − 1)
xn − (x − 1)n x =1
1
xn−1
= [1 + 3 + 5 + · · · + (2x − 1)]
xn − (x − 1)n
(x − 1)n−1
− n [1 + 3 + · · · + (2x − 3)]
x − (x − 1)n
−(2 + 4 + · · · + 2x)
2x(2x + 1) x(x + 1)
= −2×
2 2
= x(2x + 1) − x(x + 1) = x2
h i xn−1 2 (x − 1)n−1
E 2X1 − 1 | X(n) = x = x − (x − 1)2
xn − (x − 1)n xn − (x − 1)n
xn+1 (x − 1)n+1
= −
xn − (x − 1)n xn − (x − 1)n
xn+1 − (x − 1)n+1
=
xn − (x − 1)n
X n+1 −(X−1)n+1
Thus the UMVUE of N is X n −(X−1)n .
Remark 5.4 In Chapter 4 , Example 4.15 is not complete, but it is bounded complete.
The class of unbiased estimators of zero is
U0 = {g(X) | c ∈ <}
where
c(−1)x−1
if x = 1, 2
g(x) =
0
x = 3, 4, · · · , N ; N = 2, 3, · · ·
By Theorem 5.7, CovN [δ(T ), g(X)] = 0 for N = 2, 3, · · · implies that δ(T ) is a UMVUE
of N where T = t(X). That is
EN [δ(t(X))g(X)] = 0 N = 2, 3, · · · , ∀ c ∈ <
N
X 1
δ(t(x))g(x) = 0 N = 2, 3, · · · , ∀ c ∈ <
x=1
N
N
X
⇒ δ(t(x))g(x) = 0 N = 2, 3, · · · , ∀ c ∈ <
x=1
i.e., δ(t(1))c − δ(t(2))c = 0 ∀ c ∈ <
.˙. Any estimator δ(T ) such that δ(t(1)) = δ(t(2)) is a UMVUE of N , provided
EN [δ 2 (T )] < ∞, for N = 2, 3, · · · . Thus a family of distributions is bounded complete,
then there is a class of UMVUE’s.
Problem 5.16 Let X1 , X2 , · · · , Xn be a random sample of size n from a distribution
with pdf
1 e− xθ
0 < x < ∞, θ > 0
θ
pθ (x) =
0
otherwise
Obtain the UMVUE of Pθ {X ≥ 2}.
A.Santhakumaran 192
Pθ {X ≥ 2} = 1 − Pθ {X < 2}
Z 2
1 −x
= 1− e θ dx
0 θ
2
= e− θ
Z ∞
1 − x 2−1
Eθ [X1 ] = e θ x1 dx1 = θ
0 θ
n
X
Let T = Xi , thenT ∼ G(n, θ)
i=1
1 − θ1 t n−1
θn Γn e t t>0
pθ (t) =
0
otherwise
n
X
Let y = xi , then
i=2
− θ1 t
1
θn Γ(n−1) e [t − x1 ]n−2
pθ (x1 | t) =
1 − θ1 t n−1
θn Γn e t
1
= (n − 1)[t − x1 ]n−2
tn−1
A.Santhakumaran 193
(n − 1) 1 [1 − x1 ]n−2
0 < x1 < t
t t
=
0
otherwise
The UMVUE of θ is
x1 n−2
Z t
n−1
E[X1 | T = t] = x1 1− dx1
0 t t
x1 n−2
Z t
n−1
= x1 1 − dx1
t 0 t
x1
One can take z = , then dx1 = tdz
t
When x1 = t → z = 1; when x1 = 0 → z = 0
Z 1
n−1
E[X1 | T = t] = (tz)[1 − z]n−2 tdz
t 0
Z 1
= (n − 1)t (1 − z)n−1−1 z 2−1 dz
0
Γ2Γ(n − 1) t nx̄
= (n − 1)t = = = x̄
Γ(n − 1 + 2) n n
2
The UMVUE of Pθ {X ≥ 2} is e− X̄
Problem 5.17 Let X1 , X2 , · · · , Xn be a random sample from N (θ, σ 2 ). Both θ and σ
are unknown. Find the UMVUE of σ and pth quantile.
P
2 (Xi −X̄)2
Solution: Let Y = (n−1)S σ2
= σ2
∼ χ2 distribution with (n − 1) degrees of
freedom. Y ∼ G( 12 , (n−1)
2 ).
1 n−1
n−1
1
e− 2 y y 2
−1
0<y<∞
p(y) = 2 2 Γ n−1
2
0 otherwise
√ Z ∞
1 1 n
E[ Y ] = n−1 e− 2 y y 2 −1 dy
0 2 2 Γ n−1
2
1 Γ n2
= n−1 1 n
2 2 Γ n−12
( 2)
2
Γ n2 √
"r #
n−1 2
i.e., Eσ S = 2
σ2 Γ n−1
2
Γ n2 √ σ
⇒ Eσ [S] = n−1 2
√
Γ 2 n−1
1 Γ n−1
q
2
= σ where k(n) = Γn
2
n −1
k(n) 2
A.Santhakumaran 194
p = Pθ,σ {X ≤ δp }
X −θ δp − θ
= Pθ,σ ≤
σ σ
δp − θ
X−θ
= Pθ,σ Z ≤ where Z = σ ∼ N (0, 1)
σ
Z δ−θ
σ
p = p(z)dz
0
Z ∞
i.e., 1 − p = δp −θ
p(z)dz
σ
δp − θ
⇒ = z1−p ⇒ δp = z1−p σ + θ
σ
Under some regularity conditions Cramer - Rao inequality provides a lower bound for
the variance of unbiased estimators. It may enable us to judge a given unbiased estimator
is an UMVUE or not. That is, the variance of an unbiased estimator coincides with the
Cramer - Rao lower bound, then the estimator is UMVUE.
A.Santhakumaran 195
Covariance inequality
Theorem 5.7 The covariance inequality between two functions T = t(X) and ψ(X, θ)
is defined as
{Covθ [T, ψ(X, θ)]}2
Vθ [T ] ≥ ∀θ∈Ω
Vθ [ψ(X, θ)]
where ψ(X, θ) is a function of X and θ and T = t(X) is a statistic with pdf pθ (t).
Proof: The Cauchy - Schwarz inequality between two variables X and Y is
pθ (x) 2
2 Z 0
∂ log pθ (X)
I(θ) = Eθ = pθ (x)dx
∂θ pθ (x)
Likelihood function
Property 5.1 Let IX (θ) and IY (θ) be the amount of information of two independent
samples (X1 , X2 , · · · , Xn ) and (Y1 , Y2 , · · · Yn ) respectively. Let IXY (θ) be the amount of
information of the joint sample (X1 , Y1 )(X2 , Y2 ), · · · , (Xn , Yn ). Then IXY (θ) = IX (θ) +
IY (θ). This is known as additive property of Fisher measure of information.
Proof:
2
∂ log pθ (X) ∂ log pθ (T )
Consider Eθ − ≥0
∂θ ∂θ
2 2
∂ log pθ (X) ∂ log pθ (T ) ∂ log pθ (X) ∂ log pθ (T )
Eθ + Eθ − 2Eθ ≥0
∂θ ∂θ ∂θ ∂θ
∂ log pθ (T ) 2
IX (θ) + IT (θ) − 2Eθ ≥ 0
∂θ
IX (θ) + IT (θ) − 2IT (θ) ≥ 0
IX (θ) − IT (θ) ≥ 0
IX (θ) ≥ IT (θ)
pθ (x) = pθ (t)h(x)
When a UMVUE does not exist, one may interest on a Locally Minimum Variance
Unbiased Estimator (LMVUE) which gives the smallest variance that an unbiased estima-
tor can achieve at θ = θ0 . This is helpful to measure the performance of a given unbiased
estimator with some lower bounds of the unbiased estimator which are not sharp. The
Cramer - Rao inequality is very simple to calculate the lower bound for the variance of an
unbiased estimator. Also it provides asymptotically efficient estimators. The assumptions
of the Cramer - Rao inequality are
∂pθ (x)
(iii) For any x and θ the derivative ∂θ exists and is finite.
Theorem 5.8 Under the assumptions (i) ,(ii) and (iii) and that I(θ) > 0. Let T = t(X)
be any statistic with Eθ [T 2 ] < ∞ for which the derivative with respect to θ of Eθ [T ] =
R
tpθ (x)dx exists can be obtained by differentiating under the integral sign, then
∂Eθ [T ] 2
h i
∂θ
Vθ [T ] ≥ ∀θ∈Ω
I(θ)
2 " #
∂ log pθ (X) ∂ 2 log pθ (X)
where I(θ) = Eθ = −Eθ
∂θ ∂θ2
R
Proof: Suppose the assumptions hold for a single observation x of X and pθ (x)dx = 1
is differentiated twice under the integral sign with respect to θ, then
∂pθ (x)
Z
dx = 0
∂θ
∂pθ (x) 1
Z
pθ (x)dx = 0
∂θ pθ (x)
∂ log pθ (x)
Z
pθ (x)dx = 0 (5.1)
∂θ
A.Santhakumaran 199
∂ log pθ (X)
⇒ Eθ = 0
∂θ
By covariance inequality
{Covθ [T, ψ(X, θ)]}2
Vθ [T ] ≥ ∀θ∈Ω
Vθ [ψ(X, θ)]
∂ log pθ (x)
Take ψ(x, θ) =
∂θ
∂Eθ [T ] 2
∂θ
then Vθ [T ] ≥ ∀θ∈Ω
Vθ [ ∂ log∂θ
pθ (X)
]
A.Santhakumaran 200
∂Eθ [T ] 2
∂θ
i.e., Vθ [T ] ≥ ∀θ∈Ω
I(θ)
(i) Suppose T = t(X) is a biased estimator of the parameter τ (θ), i.e.,Eθ [T ] = τ (θ)+b(θ),
then the Cramer - Rao inequality becomes
[τ 0 (θ) + b0 (θ)]2
Vθ [T ] ≥ ∀θ∈Ω
I(θ)
[τ 0 (θ)]2
Vθ [T ] ≥ ∀θ∈Ω
nI(θ)
h i
∂ log pθ (X)
where I(θ) = Vθ ∂θ of a single observation x of X.
or
[τ 0 (θ)]2
Vθ [T ] ≥ ∀θ∈Ω
I(θ)
h i
∂ log L(θ) Qn
where I(θ) = Vθ ∂θ and L(θ) = i=1 pθ (xi ).
Eθ [ψ(X, θ)] = 0, ∀ θ ∈ Ω
Z
Eθ [ψ(X, θ)] = ψ(x, θ)pθ (x)dx
pθ+∆ (x)
Z
= − 1 pθ (x)dx
pθ (x)
Z
= [pθ+∆ (x) − pθ (x)]dx
= 1−1=0
= Eθ [T ψ(X, θ)]
pθ+∆ (X)
= Eθ T −1
pθ (X)
pθ+∆ (x) − pθ (x)
Z
= t pθ (x)dx
pθ (x
Z Z
= tpθ+∆ (x)dx − tpθ (x)dx
= τ (θ + ∆) − τ (θ)
By covariance inequality
[τ (θ + ∆) − τ (θ)]2
Vθ [T ] ≥ h
pθ+∆ (X)
i
Vθ pθ (X) −1
It is true for all values of ∆
[τ (θ + ∆) − τ (θ)]2
Vθ [T ] ≥ sup hp i
θ+∆(X)
∆ V −1
θ pθ (X)
Problem 5.18 Using a single observation x of X, obtain the Chapman Robbin - Kiefer
bound for the parameter θ of the pdf
1
0<x<θ
θ
pθ (x) =
0
otherwise
A.Santhakumaran 202
Remark 5.6 Chapman Robbin - Kiefer bound becomes the Cramer - Rao lower bound
by allowing ∆ → 0 and assume the range of the distribution is independent of the
∂ log pθ (x)
parameter, and the derivative ∂θ exists and finite, then
[τ (θ + ∆) − τ (θ)]2
Vθ [T ] ≥ h i2
1
Eθ [pθ+∆ (X) − pθ (X)] pθ (X)
A.Santhakumaran 203
[τ (θ+∆)−τ (θ) 2
h i
lim∆→0 ∆
≥ h i2
[pθ+∆ (X)−pθ (X)] 1
Eθ lim∆ →0 ∆ pθ (X)
0
[τ (θ)]2
≥ h i2
1
Eθ p0 (X | θ) pθ (X)
[τ 0 ]2
≥
∂ log pθ (X) 2
h i
Eθ ∂θ
[τ 0 (θ)]2
≥ ∀ θ∈Ω
I(θ)
Problem 5.19 Obtain the Cramer - Rao lower bound for the variance of the unbiased
estimator of the parameter θ of the Cauchy distribution by considering a sample of size
n.
1 1
π 1+(x−θ)2 −∞ < x < ∞, −∞ < θ < ∞
pθ (x) =
0 otherwise
1 1
L(θ) = pθ (x) =
π 1 + (x − θ)2
log L(θ) = − log π − log[1 + (x − θ)2 ]
∂ log pθ (x) 2(x − θ)
=
∂θ 1 + (x − θ)2
2
4(x − θ)2
∂ log pθ (x)
=
∂θ [1 + (x − θ)2 ]2
2
4(X − θ)2
∂ log pθ (X)
Eθ = Eθ
∂θ [1 + (X − θ)2 ]2
Z ∞
4 (x − θ)2
= dx
π −∞ [1 + (x − θ)2 ]3
Z ∞
4 t2
= dt since t = x − θ
π −∞ (1 + t2 )3
Z ∞
8 t2
= dt
π 0 (1 + t2 )3
Z ∞ 3
4 u 2 −1 2
= 3 3 du since t = u
π 0 (1 + u) 2 + 2
4 Γ 32 Γ 32
=
π Γ3
4 1√ 1√
π 2 π2 π 1
I(θ) = =
2 2
The Cramer - Rao lower bound from the sample of size n for the variance of the unbiased
A.Santhakumaran 204
[τ 0 (θ)]2 1
estimator of the parameter τ (θ) = θ is nI(θ) = n 21
= n2 .
Problem 5.20 Let X1 , X2 , · · · , Xn is a sample from N (θ, 1). Obtain the Cramer - Rao
lower bound for the variance of (i) θ and (ii) θ2 . Also find the unbiased estimator of θ2 .
To verify that the actual variance of the unbiased estimator of θ2 is same as Cramer -
Rao lower bound.
Solution: (i) The likelihood function for θ is
n
Y
L(θ) = pθ (xi )
i=1
n
1
Pn
1
(xi −θ)2
= e− 2 i=1
2π
√ n
1X
log L(θ) = −n log 2π − (xi − θ)2
2 i=1
The Cramer - Rao lower bound for the variance of unbiased estimator of τ (θ) = θ2 is
[τ 0 (θ)]2 4θ2 dτ (θ)
I(θ) = n where τ 0 (θ) = dθ2
= 1.
Consider Eθ [X − θ]2 = 1
Eθ [X 2 ] − 1 = θ2
"P #
n 2
i=1 Xi
Eθ − 1 = θ2
n
Pn
Xi2
.. . i=1
n − 1 is the unbiased estimator of θ2 .
Pn Pn Pn
Xi2 (Xi −θ+θ)2 (Xi −θ)2 2θ Pn
Consider i=1
n = i=1
n = i=1
n + θ2 + n i=1 (Xi − θ)
"P # "P #
Xi2 Xi2
Vθ −1 = Vθ
n n
n
"P # 2 !
(Xi − θ)2 2θ
X
= Vθ + Vθ [Xi ] − 0
n n i=1
"P #
(Xi − θ)2 4θ2
= Vθ + 2n since Vθ [Xi ] = 1 ∀ i = 1 to n
n n
"P #
(Xi − θ)2 4θ2
= Vθ +
n n
ns2 (Xi − θ)2
P
Define Y = 2 = 2
∼ χ2 distribution with n degrees of freedom
σ σ
n 1
The pdf of Y ∼ G ,
2 2
1 n
n
1
e− 2 y y 2 −1 0 < y < ∞
2 2 Γn
p(y) = 2
0
otherwise
Z ∞
1 1 n
E [Y r ] = n
e− 2 y y 2 +r−1 dy
n
0 2 Γ2 2
1 Γ( n2 + r)
= n n
2 2 Γ n2 ( 12 ) 2 +r
2r Γ( n2 + r)
= r = 1, 2, · · ·
Γ n2
Γ( n + 1)
E[Y ] = 2 2 n =n
Γ2
E[Y 2 ] = (n + 2)n and V [Y ] = 2n
A.Santhakumaran 206
ns2
But Y and σ 2 = 1
=
σ 2
Y 2n 2
.. . Vθ [s2 ] = Vθ = 2 =
n n n
"P #
Xi 2 4θ 2 2 4θ2
Vθ − 1 = Vθ [s2 ] + = +
n n n n
P
Xi2 4θ2
The actual variance of n − 1 is n + n2 . Here the Cramer - Rao lower bound
P
Xi2
is less than the actual variance of the unbiased estimator n − 1 of the parameter θ2 .
Note that the UMVUE of θ2 is X̄ 2 − n1 , since Eθ [X̄ 2 ] − {Eθ [X̄]}2 = 1
n
1
⇒ Eθ [X̄ 2 ] − n = θ2
1
i.e., X̄ 2 − n is unbiased estimator of θ2 .
1
Problem 5.21 Given pθ (x) = θ, 0 < x < θ, θ > 0. Compute the reciprocal
h i2
∂ log pθ (X) n+1
nEθ ∂θ . Compare this with the variance of n T where T is the largest ob-
servation of a random sample of size n for this distribution.
Solution: Given pdf of the random variable X is
1
0<x<θ
θ
pθ (x) =
0
otherwise
1
log pθ (x = −
θ
∂ log pθ (x) 1
= −
∂θ θ
∂ log pθ (x) 1
=
∂θ θ2
∂ log pθ (X) 2 1
Eθ =
∂θ θ2
∂ log pθ (X) 2 n
i.e., nEθ =
∂θ θ2
1 θ2
=
∂ log pθ (X) 2 n
h i
nEθ ∂θ
Let T = max {Xi }
1≤i≤n
The pdf of T is
nn tn−1
0<t<θ
θ
p(t | θ) =
0
otherwise
A.Santhakumaran 207
n
Eθ [T ] = θ
n+1
n+1
⇒ T is an unbiased estimator of θ
n
n 2
Eθ [T 2 ] = θ
n+2
2
n 2 n
Vθ [T ] = θ − θ2
n+1 n+1
nθ2
=
(n + 1)(n + 2)
n+1 θ2
Vθ T =
n n(n + 2)
n+1 θ2
The actual variance of the unbiased estimator n T is n(n+2)
Here the actual variance of the unbiased estimator of θ is less than the Cramer
n+1
- Rao lower bound of the estimator n T. Since the distribution is not satisfied the
n+1
assumptions of the Cramer - Rao inequality . Note that n T is the UMVUE of θ.
Problem 5.22 Find the Cramer - Rao lower bound for the variance of the unbiased
estimator Pθ {X > 2} for a single observation x of X with pdf
1 e− xθ
x>0θ>0
θ
pθ (x) =
0
otherwise
Solution:
Z 2
1 −x
Consider τ (θ) = Pθ {X > 2} = 1 − e θ dx
0 θ
#2
− xθ
"
1 e
= 1−
θ − 1θ 0
− θ2 2
= 1+e − 1 = e− θ
1
log pθ (x) = − log θ − x
θ
2
One can take λ = e− θ , then log λ = − 2θ i.e., θ = − log2 λ .
2 x
log pλ (x) = − log − + log λ
log λ 2
∂ log pλ (x) log λ 1 x1
= − (−2)(−1) (log λ)−2 +
∂λ −2 λ 2λ
A.Santhakumaran 208
1 x
= +
λ log λ 2λ
∂ log pθ (x) θ x 2
= + e− θ
2 − θ2
∂ e− θ e 2
2
eθ
= [x − θ]
2
2
4
∂ log pθ (X) eθ
Eθ 2
= Eθ [X − θ]2
∂ e− θ 4
4
eθ 2
= θ since Eθ [X − θ]2 = θ2
4
−2
The Cramer - Rao lower bound for the variance of the unbiased estimator of τ (θ) = e θ
4 − θ4 2
is θ2
e , since τ 0 (θ) = ∂τ−(θ)2 = 1. The unbiased estimator of τ (θ) = e− θ is
∂ e θ
1
if X > 2
T =
0
otherwise
• UMVUE exists even the Cramer - Rao regularity conditions are not satisfied.
A.Santhakumaran 209
• UMVUE exists when the regularity conditions are satisfied but UMVUE’s are not
attained the Cramer - Rao lower bound.
The Cramer - Rao lower bound for the variance of the unbiased estimator of θ is
1 θ2
n 1 = n.
θ2
n
1
X
Let T = Xi , thenT ∼ G n,
i=1
θ
θn e−θt tn−1
0<t<∞
Γn
pθ (t) =
0
otherwise
Z ∞ n
1 θ −θt n−1−1
Eθ = e t dt
T 0 Γn
θn Γ(n − 1)
=
Γn θn−1
θ
=
n−1
n−1
Eθ = θ if n = 2, 3, · · ·
T
n−1
is the unbiased estimator of θ.
T
A.Santhakumaran 210
1 θ2
Eθ = if n = 3, 4, · · ·
T2 (n − 1)(n − 2)
1 θ2
Vθ =
T (n − 1)2 (n − 2)
n−1 θ2
Vθ = , if n = 3, 4, · · ·
T n−2
n−1 θ2 n−1
Actual variance of T is n−2 . Cramer - Rao lower bound of the unbiased estimator T
θ2
of θ is n.
θ2
n−2
Efficiency = θ2
n
n 1
= = 2 ,n = 3, 4, · · ·
n−2 1− n
→ 1 as n → ∞
n−1 n−1
Thus T is the asymptotic efficient estimator of θ. Note that T is the UMVUE of θ.
Theorem 5.10 A necessary and sufficient condition for an estimator to be the most
∂ log pθ (x)
efficient is that T = t(X) is sufficient and t(x) − τ (θ) is proportional to ∂θ where
Eθ [T ] = τ (θ).
∂ log pθ (x)
Proof: Assume T = t(X) is a most efficient estimator of τ (θ) and t(x)−τ (θ) ∝ ∂θ
∂ log pθ (x)
i.e., t(x) − τ (θ) = A(θ)
∂θ
Prove that T = t(X) is a sufficient statistic.
t(x) − τ (θ) ∂ log pθ (x)
=
A(θ) ∂θ
t(x) τ (θ) ∂ log pθ (x)
− =
A(θ) A(θ) ∂θ
t(x) τ (θ)
Z Z Z
dθ − dθ = d log pθ (x) + c(x)
A(θ) A(θ)
Z θ Z θ
1 τ (θ)
Choose dθ = Q(θ) and d(θ) = c1 (θ)
−∞ A(θ) −∞ A(θ)
Then t(x)Q(θ) − c1 (θ) − c(x) = log pθ (x)
∂ log pθ (x)
t(x) − τ (θ) = A(θ)
∂θ
t(x) − τ (θ) ∂ log pθ (x)
=
A(θ) ∂θ
2 2
t(x) − τ (θ) ∂ log pθ (x)
=
A(θ) ∂θ
2
1 ∂ log pθ (X)
2
Eθ [T − τ (θ)]2 = Eθ
[A(θ)] ∂θ
2
Vθ [T ] ∂ log pθ (X)
= Eθ
[A(θ)]2 ∂θ
2
∂ log pθ (X)
Vθ [T ] = [A(θ)]2 Eθ (5.2)
∂θ
∂ log pθ (X)
But Eθ T, = τ 0 (θ)
∂θ
∂ log pθ (x)
i.e, Eθ (T − τ (θ)) , = τ 0 (θ)
∂θ
since Eθ [ ∂ log∂θpθ (x) ] = 0
" 2 #
∂ log pθ (X)
Eθ A(θ) = τ 0 (θ)
∂θ
since t(x) − τ (θ) = A(θ) ∂ log∂θpθ (x)
2
∂ log pθ (X)
A(θ)Eθ = τ 0 (θ)
∂θ
τ 0 (θ)
i.e., A(θ) = h i2
∂ log pθ (X)
Eθ ∂θ
[τ 0 (θ)]2
From equation (5.2) ⇒Vθ [T ] = ∀θ∈Ω
Eθ [ ∂ log∂θ
pθ (X) 2
]
Thus the actual variance of T = t(X) is equal to the Cramer - Rao lower bound.
Remark 5.8 UMVUE may be most efficient estimator. As discussed in problem 5.20,
n−1
T , n = 3, 4, · · · is the UMVUE of θ but not most efficient estimator of θ.
A.Santhakumaran 212
Cramer - Rao inequality has been modified and extended in different directions. Consider
the first case, where θ is a vector. In second case, it may extend the inequality to get better
bounds for the variance of unbiased estimators. Bhattacharya gives a method of having a
whole sequence of non-decreasing lower bounds for the variance of an unbiased estimator
by successive differentiation of the likelihood function with respect to the parametric
function.
Lemma 5.1 For any random variables X1 , X2 , · · · , Xr with finite second moments, the
covariance matrix
C = [Cov(Xi , Xj )]r×r
ν 0 Σ−1 ν
positive definite, then ρ2 = V [Y ] , ρ is the multiple correlation coefficient between Y
and the vector (X1 , X2 , · · · , Xr ).
Proof: Define ρ is the correlation coefficient between a0 X and Y where a0 =
(a1 , a2 , · · · , ar ) and X0 = (X1 , X2 , · · · , Xr ),
{Cov [ ri=1 ai Xi , Y ]}2
P
i.e., ρ2 = .
V [Y ]V [ ri=1 ai Xi ]
P
V [T ] ≥ ν 0 C −1 ν
(iii) For any x and θ ∈ Ω and i = 1, 2, · · · , r the derivative exists and is finite.
h i
∂ log pθ (X)
Theorem 5.12 Suppose that assumptions (i) to (iii) and the relation Eθ ∂θi =
0, i = 1, 2, · · · , r hold and I(θ) is positive definite. Let T = t(X) be any statistic with
Eθ [T 2 ] < ∞ for which the derivative with respect to θi , i = 1, 2, · · · , r of Eθ [T ] =
R
tpθ (x)dx
exists for each i and can be obtained by differentiating under the integral sign. Then
∂Eθ [T ]
Vθ [T ] ≥ α0 I −1 (θ)α, where α0 is the row vector with ith element αi = ∂θi , i = 1, 2, · · · , r.
∂ log pθ (x)
Proof: As in Theorem 5.11, replace ψi (x, θ) = ∂θi ,i = 1, 2 · · · , r and ν = α ,
C = I(θ) ⇒ Vθ [T ] ≥ α0 I −1 (θ)α.
n
1
− 12
2
P
(xi −θ)2
= 2
e 2σ
2πσ
n n 1 X
log L(θ) = − log 2π − log σ 2 − 2 (xi − θ)2
2 2 2σ
∂ log L(θ) 1 X
= 2 (xi − θ)
∂θ 2σ 2
n
= [x̄ − θ]
σ2
2
∂ log L(θ) 1 2
Eθ = n Eθ [X̄ − θ]2
∂θ σ4
n2 σ 2 n
I11 (θ) = =
σ 4 n" σ 2 #
∂ 2 log L(θ)
I12 (θ) = I21 (θ) = −Eθ =0
∂θ∂σ 2
" #
∂ 2 log L(θ) n
since Eθ = − Eθ [X̄ − θ] = 0
∂σ 2 ∂θ σ4
∂ log L(θ) n 1 X
= − + (xi − θ)2
∂σ 2 2σ 2 2(σ 2 )2
∂ 2 log L(θ) n 1 X
= − (xi − θ)2
∂(σ 2 )2 2σ 4 (σ 2 )3
" #
∂ 2 log L(θ) n nσ 2
−Eσ2 = − +
∂(σ 2 )2 2σ 4 σ6
n 1 n
I22 (θ) = 1− = 4
σ4 2 2σ
n σ2
σ2
0 −1 n 0
I(θ) = I (θ) =
2σ 4
n
0 2σ 4
0 n
∂θ
1 ∂θ
α = =
∂σ 2
1 ∂σ 2
σ2
0 1
n
Vθ [T] ≥ 1 1
2σ 4
0 n 1
σ2 2σ 4
i.e.,Vθ [T1 ] ≥ n and Vσ2 [T2 ] ≥ n .
σ2
Remark 5.9 n is the actual variance of the unbiased estimator T1 = X̄ for θ is same
2σ 4
as the Cramer - Rao lower bound of that estimator but n−1 is the actual variance of
1 Pn 2
the unbiased estimator T2 = n−1 i=1 (Xi − X̄) is greater than the Cramer - Rao lower
When the lower bound is not sharp, it can be improved by considering the higher
order derivatives of the likelihood function of the parameter θ.
Assumptions: Let X1 , X2 , · · · , Xn be distributed with pdf p(x | θ), θ ∈ Ω.
Theorem 5.13 Suppose that the assumptions (i) to (iv) hold and that the covariance
matrix K(θ) is positive definite. Let T = t(X) be any statistic with Eθ [T 2 ] < ∞ for
which the higher order derivative τ i1 +i2 +···+is (θ) exists for each i = 1, 2, · · · , s and can be
obtained by differentiating under the integral sign. Then Vθ [T ] ≥ α0 K −1 (θ)α, where α0 is
row vector with elements
!
∂ i1 +i2 +···+is Eθ [T ] ∂ i1 +i2 +···+is log L(θ)
= Covθ T,
∂θ1i1 · · · ∂θsis ∂θ1i1 · · · ∂θsis
= τ i1 +···+is (θ)
then Vθ [T ] ≥ α0 K −1 (θ)α
A.Santhakumaran 218
Problem 5.25 Given that X ∼ b(n, θ), 0 < θ < 1 . Obtain the Bhattacharya bound
for the unbiased estimator of the parameter τ (θ) = θ2 .
Solution:
n θ(1−θ)
θ(1−θ) 0 −1 n 0
K(θ) = , K (θ) =
n2 θ 2 (1−θ)2
0 θ 2 (1−θ)2 0 n
(ii) When s = 2 Bhattacharya inequality gives the non decreasing lower bound for the
variance of an unbiased estimator of τ (θ). The Bhattacharya inequality is
Vθ [T ] ≥ α0 K −1 (θ)α
A.Santhakumaran 220
2 (θ)] ≥ τ 0 (θ)[τ 0 (θ)K (θ) − τ 00 (θ)K (θ)] − τ 00 (θ)[τ 0 (θ)K (θ) − τ 00 (θ)K (θ)]
Vθ [T ][K11 (θ)K22 (θ) − K12 22 12 12 11
1
≥ K11 (θ)
[τ 0 (θ)]2 K22 (θ)K11 (θ) − 2τ 0 (θ)τ 00 (θ)K11 (θ)K12 (θ) + [τ 00 (θ)]2 K11
2 (θ)
1
≥ K11 (θ)
[τ 0 (θ)]2 K12
2 (θ) + [τ 0 (θ)]2 K (θ)K (θ) − 2τ 0 (θ)τ 00 (θ)K (θ)K (θ) + [τ 00 (θ)]2 K 2 (θ) − [τ 0 (θ)]2 K 2 (θ)
22 11 11 12 11 12
1
≥ K11 (θ)
[τ 0 (θ)K12 (θ) − τ 00 (θ)K11 (θ)]2 + [τ 0 (θ)]2 [K11 (θ)K22 (θ) − K12
2 (θ)]
Problems
5.1 Let X1 , X2 , · · · , Xn be a random sample drawn from a normal population with mean
X1 +X2 +···+Xn X1 +X2 +···+Xn
θ. Which among the two estimators T1 = n and T2 = n is
better? Why?
5.2 Show that, under some conditions to be stated there is a lower limit to the variance
of an unbiased estimator. How you modify the lower limit to a biased estimator?
5.3 Let X1 , X2 be independent random variables each having Poisson distribution with
h i
X1 +X2
mean θ. Show that Vθ 2 ≤ Vθ [2X1 − X2 ]. Also justify the inequality by Rao
- Blackwell Theorem.
5.4 Show that Bhattacharya bound is better than Cramer - Rao bound.
A.Santhakumaran 221
5.5 Define Bhattacharya bound of order r. Also obtain Bhattacharya bound of order
2 for estimating θ2 unbiasedly, θ being the mean of a Bernoulli distribution from
which a sample of size n is available.
5.6 Let X and Y have a bivariate normal distribution with mean θ1 and θ2 with positive
variance σ12 and σ22 and with correlation coefficient ρ. Find Eθ2 [Y | X = x] = φ(x)
and variance of φ(X).
5.8 In what way, Lehman - Scheffe’s Theorem different from Rao - Blackwell Theorem.
5.10 Let X1 , X2 · · · , Xn be a random sample from a population with mean θ and finite
Pn
variance and T = t(X) be an estimator of θ of the form T = i=1 αi Xi . If T is
an unbiased estimator of θ that has minimum variance and T 0 = t0 (X) is another
linear unbiased estimator of θ, then Covθ (T, T 0 ) = Vθ [T ].
5.11 Let X1 , X2 , · · · , Xn be a random sample from p(x | θ) = θe−θx , θ > 0, x > 0. Show
that Pn−1
n
X
is the UMVUE of θ.
i=1 i
5.12 Stating the assumptions clearly, derive the Chapman - Robbin lower bound for the
variance of an unbiased estimator of a function of a real valued parameter θ.
5.14 State the Bhattacharya bound of order s. Also prove that it is a non - decreasing
function of s.
5.15 Define Bhattacharya bound. Show that it is sharper than the Cramer - Rao bound.
5.16 On the basis of a random sample of size n, the Cramer - Rao lower bound of variance
of an unbiased estimator of θ in
1
−∞ < x < ∞; −∞ < θ < ∞
π[1+(x−θ)2 ]
pθ (x) =
0
otherwise
is equal to
( a) n1 (b) 1
n2
(c) 2
n (d) 2
n2
Ans:(c)
5.17 T1 = t1 (X) and T2 = t2 (X) are independent unbiased estimators of θ with V [Ti ] =
vi , i = 1, 2. The best linear unbiased estimator (l1 T1 + l2 T2 ) of θ is the one for which
(a) l1 = l2 = 0.5
v2 v1
(b) l1 = (v1 +v2 ) ; l2 = (v1 +v2 )
v1−1
(c) l1 = (v1 +v2−1
−1 )
(d) l1 = 0, l2 = 1 if v1 > v2 and vice versa Ans:(b)
5.19 Which one of the following is not necessary for the UMVU estimation of θ by
T = t(X)?
(a) Eθ [T − θ] = 0
(b) Eθ [T − θ]2 < ∞
(c) Eθ [T − θ]2 is minimum
(d) T is a linear function of observations. Ans:(d)
5.20 If T1 = t1 (X) and T2 = t2 (X) are unbiased estimators of θ and θ2 (0 < θ < 1) and
T is a sufficient statistic, then E[T1 | T ] − E[T2 | T ] is
(a) the minimum variance unbiased estimator of θ
(b) always an unbiased estimator of θ(1 − θ) which has variance not exceeding that
of θ(1 − θ)
(c) always the minimum variance unbiased estimator of θ(1 − θ)
(d) not an unbiased estimator of θ(1 − θ). Ans:(b)
5.21 T 0 = t0 (X) and T = t(X) are two unbiased estimator of τ (θ) with variance Vθ [T ] <
∞ and Vθ [T 0 ] < ∞. The estimator T is said to be an efficient estimator of τ θ), if
(a) Vθ [T ] < Vθ [T 0 ]
(b)Vθ [T ] > Vθ [T 0 ]
(c) Vθ [T ] = Vθ [T 0 ]
(d) none of the above Ans:(d)
5.22 T 0 = t0 (X) and T = t(X) are two unbiased estimator of τ (θ) with variance Vθ [T ] <
∞ and Vθ [T 0 ] < ∞. The estimator T is an efficient estimator relative to T 0 of the
parameter τ (θ), if
(a) Vθ [T ] < Vθ [T 0 ]
(b) Vθ [T ] > Vθ [T 0 ]
(c) Vθ [T ] 6= Vθ [T 0 ]
(d) none of the above Ans:(a)
5.23 Suppose
r X1 , X2 , · · · Xn are iid as N (µ, σ 2 ) , −∞ < µ < ∞, σ 2 > 0. Then
Pn
(Xi −X̄)2
(a) i=1
n−1 is the Minimum Variance Unbiased Estimate of σ
A.Santhakumaran 224
Pn
(Xi −X̄)2
(b) i=1
n−1 is the minimum variance unbiased estimator of σ 2
Pn
(Xi −X̄)
(c) i=1
is the minimum variance unbiased estimator of σ
Pn n−1 2
(X i − X̄)
(d) i=1
n is the minimum variance unbiased estimator of σ 2 Ans: (b)
6 METHODS OF POINT ESTIMATION
6.1 Introduction
∂ 2 log L(θ) n
2
= − < 0 at θ = θ̂(x)
∂θ 2θ̂
Pn
Xi2
The MLE of θ is θ̂(X) = i=1
n .
Problem 6.2 A random sample of size n is drawn from a population having density
function
θxθ−1
0 < x < 1, 0 < θ < ∞
pθ (x) =
0
otherwise
Find the MLE of θ.
Solution: The likelihood function for θ of the sample size n is
n
Y
L(θ) = pθ (xi )
i=1
n
xθ−1
Y
= θn i
i=1
n
X
log L(θ) = n log θ + (θ − 1) log xi
i=1
n
∂ log L(θ) n X
= + log xi
∂θ θ i=1
∂ 2 log L(θ) n
2
= − 2
∂θ θ
∂ log L(θ)
For maximum, = 0
∂θ
n
n X
⇒ + log xi = 0
θ i=1
−n
i.e., θ̂(x) = Pn and
i=1 log xi
∂ 2 log L(θ) −n X 2
θ = θ̂(x) = log x i
∂θ2 n2
A.Santhaakumaran 226
log xi )2
P
(
= − <0
n
−n
Thus the MLE of θ is θ̂(X) = Pn .
i=1 log Xi
Let p = Pθ {X > 2}
1 − Pθ {X ≤ 2}
=
Z 2
1 −x 2
= 1− e θ dx = e− θ
0 θ
2 1 2
log p = − ⇒ log =
θ p θ
2
⇒ θ =
log p1
A sample of size n is taken and it is known that k of the observations are X > 2 and
(n − k) of the observation are X < 2. The likelihood function for p of the sample size
n is
L(p) = pk (1 − p)n−k
k
i.e., p̂ = and
n
2
∂ 2 log L(p) −n nk 2 − k + 2 nk k
= i2
∂p2
h
k
n (1 − nk )
k
p̂= n
h i
k
k 1− n k
= −h i2 < 0 since n < 1 for n = 1, 2, · · ·
k k
n (1 − n)
i=1 2πσ 2
− n 1
P 2
= 2πσ 2 2 e− 2σ2 (xi −θ)
n n 1 X
log L(θ) = − log 2π − log σ 2 − 2 (xi − θ)2
2 2 2σ
∂ log L(θ) 1 X
= (xi − θ)
∂θ σ2
∂ 2 log L(θ) n
= − 2 <0
∂θ2 σ
∂ log L(θ)
For maximum, = 0
∂θ
X ∂ 2 log L(θ)
⇒ (xi − θ) = 0 i.e., θ̂(x) = x̄ and <0
∂θ2
A.Santhaakumaran 228
n n 1 X
log L(σ 2 ) = − log 2π − log σ 2 − 2 (xi − θ)2
2 2 2σ
∂ log L(σ 2 ) (xi − θ)2
P
n 1
= − +
∂σ 2 2 σ2 2(σ 2 )2
2
∂ log L(σ ) 2 n
P
(xi − θ)2
= −
∂(σ 2 )2 2σ 4 σ6
∂ log L(σ 2 )
For maximum, = 0
∂σ 2
n 1 X
⇒ − 2+ 4 (xi − θ)2 = 0
2σ 2σ
(xi − θ)2
P
σ̂ 2 (x) = and
n
∂ 2 log L(σ 2 )
< 0
∂(σ 2 )2 σ2 =σ̂2 (x)
Pn
(xi −θ)2
Thus the value of the MLE of σ2 is σ̂ 2 (x) = i=1
n .
Case (iii) When θ and σ 2 are unknown, the likelihood function for θ and σ 2 is
n n 1 X
log L(θ, σ 2 ) = − log 2π − log σ 2 − 2 (xi − θ)2
2 2 2σ
∂ log L(θ, σ 2 ) 1 X
= (xi − θ)
∂θ σ2
∂ 2 log L(θ,σ 2 ) ∂ 2 log L(θ,σ 2 )
∂θ∂σ 2
= ∂σ 2 ∂θ
since both the partial derivatives exist and are continuous.
∂ 2 log L(θ, σ 2 ) X −1
= (xi − θ) 4
∂θ∂σ 2 σ
∂ log L(θ, σ 2 ) n 1 X
= − + (xi − θ)2
∂σ 2 2σ 2 2σ 4
∂ 2 log L(θ, σ 2 ) n 1 X
= − (xi − θ)2
∂(σ 2 )2 2σ 4 σ 6
∂ 2 log L(θ, σ 2 ) n
2
=− 2
∂θ σ
For maximum of L(θ, σ 2 ),
2
∂ 2 log L(θ,σ 2 ) ∂ 2 log L(θ,σ 2 ) ∂ 2 log L(θ,σ 2 )
and ∂θ 2 ∂(σ 2 )2 − ∂θ∂σ 2 >0
−n −n
at θ = θ̂(x) and σ 2 = σ̂ 2 (x) σ̂ 2 (x) 2σ̂ 4 (x)
− 0 > 0 at θ = θ̂(x) = x̄
P
(xi −x̄)2
and σ 2 = σ̂ 2 (x) = n
∂ 2 log L(θ,σ 2 )
−n
since ∂θ2 = σ̂ 2 (x)
<0
θ=θ̂(x)
∂2 log L(θ,σ 2 )
−n
= <0
∂(σ 2 )2
σ 2 =σ̂ 2 (x) 2(σ̂ 2 (x))2
∂ 2 log L(θ, σ 2 ) X −1
2
= (xi − x̄) 4 =0
θ = θ̂(x)
∂θ∂σ σ̂ (x)
σ 2 = σ̂ 2 (x)
P
(xi −x̄)2
.˙. The MLE value of θ and σ 2 are θ̂(x) = x̄ and σ̂ 2 (x) = n .
Problem 6.5 Find the MLE of the parameter α and λ ( λ being large) from a sample
of n independent observations from the population represented by the following density
function
λ λ
( α ) e− αλ x xλ−1
x > 0, λ > 0, α > 0
Γλ
pα,λ (x) =
0
otherwise
Also obtain the asymptotic form of the covariance for the two parameters for large n.
∂ log Γλ 1
Solution: Given that ∂λ ≈ log λ − 2λ .
P
∂ log L(α, λ) n xi X
= + n − n log α − + log xi
∂λ 2λ α
∂ 2 log L(α, λ) n
2
=− 2
∂λ 2λ
∂ log L(α,λ) ∂ log L(α,λ)
For maximum of log L(α, λ), ∂α =0 and ∂λ =0
P
λ xi
−n +λ 2 = 0 ⇒ α̂(x) = x̄ and
P α α
n xi X
+ n − n log α − + log xi = 0
2λ α
n
⇒ λ̂(x) = Pn
2 i=1 (log x̄ − log xi )
∂ 2 log L(α, λ) n nx̄
Further =− + 2 =0
∂λ∂α
α=α̂(x),λ=λ̂(x)
x̄ x̄
∂ 2 log L(α, λ)
<0 and
∂λ2
λ=λ̂(x)
" #2
∂ 2 log L(α, λ) ∂ 2 log L(α, λ) ∂ 2 log L(α, λ)
2 2
− > 0 at α = α̂(x) and λ = λ̂(x)
∂λ ∂α ∂λ∂α
" #
n nλ̂(x) 2λ̂(x)nx̄ n2 1
i.e., − − − 0 = >0
2λ̂2 (x) x̄2 x̄3 λ̂(x)x̄2 2
Thus the value of the MLE of α and λ are α̂(x) = x̄ and λ̂(x) = 2 P(log nx̄−log x ) .
i
n
" # " #
∂ 2 log L(α, λ) nλ 2λ X
−Eα,λ = − 2 + 3 Eα Xi
∂α2 α α i=1
nλ 2λ
= − + 3 nα
α2 α
nλ
= since Eα [Xi ] = α ∀ i
" # α2
∂ 2 log L(α, λ) n
−Eα,λ =
∂λ2 2λ2
The direct method cannot help to estimate the MLE of α. Since α ≤ x(1) ≤ x(2) ≤
· · · ≤ x(n) < ∞ , i.e., the range of the distribution depends on the parameter α.
If θ̂(x) = x(5) = max1≤i≤5 {xi }, then the value of the MLE of θ is θ̂(x) = x(5) .
Let Y = max1≤i≤5 {X5 }. The pdf of Y is
55 t4
0<y<θ
θ
pθ (y) =
0
otherwise
Z θ
5 5
Eθ [Y ] = t dt
0 θ5
5
= θ 6= θ
6
A.Santhaakumaran 233
Thus any point in [x(n) − 1, x(1) ] is a value of the MLE of θ. Thus the MLE of θ is not
unique and not sufficient statistic.
Problem 6.10 Given an example to show that MLE is not exist
Solution: Let X1 , X2 , · · · , Xn be a random sample drawn from a population with
pmf b(1, θ), 0 < θ < 1 both n and θ are unknown and the only sample values
(0, 0, 0, · · · , 0) or (1, 1, · · · , 1) is available.
The likelihood function for θ of the sample size n is
P P
xi
L(θ) = θ (1 − θ)n− xi
X X
log L(θ) = xi + n − xi log(1 − θ)
xi (n − xi )
P P
∂ log L(θ)
= +
∂θ θ 1−θ
∂ log L(θ)
For maximum , = 0
∂θ
⇒ θ̂(x) = x̄ and
A.Santhaakumaran 234
∂ 2 log L(θ)
<0
∂θ2
θ=x̄
i=1
2πσ 2
1 Pn 1 Pn
log L(µi , σ 2 ) = −n log 2π − n log σ 2 − 2σ 2 i=1 (xi − µi )2 − 2σ 2 i=1 (yi − µi )2
∂ log L(µi , σ 2 )
= 0
∂µi
1 1
⇒ 2 (xi − µi ) + 2 (yi − µi ) = 0
σ σ
xi + yi
⇒ µ̂i = , i = 1, 2, · · · , n
2
n n
" #
∂ log L(µi , σ 2 ) −n 1 X 2
X
= + (x i − µ i ) + (yi − µi )2 = 0
∂σ 2 σ2 2σ 4 i=1 i=1
n n
" #
−n 1 X x i + yi 2 X x i + yi 2
+ x i − + y i − =0
σ2 2σ 4 i=1 2 i=1
2
n n
" #
−n 1 1X 2 1X
+ (xi − yi ) + (xi − yi )2 = 0
σ2 2σ 4 4 i= 4 i=1
n
1 X
⇒ σ̂ 2 (x, y) = (xi − yi )2
4n i=1
A.Santhaakumaran 235
n
2 1 X
Thus σ̂ (X, Y ) = (Xi − Yi )2 is not consistent estimator of σ 2 .
4n i=1
The likelihood equations are often difficult to solve explicitly for θ even in cases
where all the regularity conditions hold and the unique solution exist. Equations in
the exponential cases are very often non-linear and difficult to solve. It may difficult
to locate the global maximum of the likelihood function for the following cases,
(i) the family of distributions under consideration is not of the exponential type.
If the initial solution θ0 was chosen, close to the root of the likelihood equations θ̂(x)
∂ 2 log L(θk )
and if ∂θ2
for k = 0, 1, · · · , is bounded away from zero, there is a good chance
that the sequence generated by equation (6.4) will converge to the root θ̂(x). The
sequence {θk , k = 0, 1, · · · , } generated by equation (6.4) depends on the sample values
X1 , X2 , · · · Xn . If the chosen initial solution θ0 is a consistent estimator of θ, then
A.Santhaakumaran 237
the sequence obtained by the equation (6.4) will faster converge to the root θ̂(x) and
provide the best asymptotically normal estimator of θ.
In small sample situations the sequence {θk , k = 0, 1, · · · , } generated by equation (6.4)
may convey irregularities due to the particular sample values obtained in the experi-
ment. In order to avoid irregularities in the approximating sequence, two methods are
proposed. They are fixed derivative method and method of scoring.
(ii) The method of fixed derivative
∂ 2 log L(θk )
In the fixed derivative method, the term ∂θ2
in equation (6.4) is replaced by
− ank where {ak , k = 0, 1, · · ·} is a suitable chosen sequence of constants and n is the
sample size.
Now the sequence {θk , k = 0, 1, · · ·} is generated by
ak ∂ log L(θk )
θk+1 = θk + , k = 0, 1, 2, · · · (6.5)
n ∂θ
The sequence {θk , k = 0, 1, · · · , } converge to the root θ̂(x) in a more regular fashion
rather than the equation (6.4) by the choice sequence {ak }∞
k=0
Fixed derivative method fails to converge in many cases, the method of scoring
may use to locate the local maximum, since the log likelihood curve is steep in the
neighbour hood of a local maximum equation (6.5).
(iii) The method of scoring
The method of scoring is a special case of the fixed derivative method. The
n
special sequence {ak , k = 0, 1, · · · , } is chosen by Fisher. It is ak = I(θk ) , where I(θk )
is the amount of Fisher Information of n observations x of X and θk is the value of
approximation after the (k − 1)th iteration. Thus Fisher’s scoring method generates
the sequence
1 ∂ log L(θk )
θk+1 = θk +
I(θk ) ∂θ
for the (k − 1)th iteration, k = 0, 1, 2, · · · . The method of iteration continues and stop
when the sequence {θk , k = 0, 1, · · · , } converges on a local maximum.
Problem 6.12 The following data represents a sample from a Cauchy population.
Obtain the maximum likelihood estimate for the parameter involved in the distribution
by the method of successive approximation.
A.Santhaakumaran 238
Arrange the sample values in the increasing order of magnitude. Let the first trial
value of θ is θ̂(x) = t1 = the value of the sample median. The first approximation
value is
n
4X (xi − t1 )
t2 = t1 +
n i=1 1 + (xi − t1 )2
The successive iteration values are t3 , t4 , · · ·. This procedure is continued until any two
successive iterations values are equal. The convergent value is the MLE value of θ.
n n
n X cX
⇒ + log xi − xc−1 = 0
c i=1 b i=1 i
n n
xc−1
X X
i.e., c2 i − cb log xi − nb = 0
i=1 i=1
The estimates of c and b are obtained to solve the above equations for c and b by
iterative method.
(ii) The range of the density functions p(x | θ) are independent of the parameter θ.
(iv) Ω contains an open interval and Ω containing θ0 , the true value of θ as an interior
point in Ω.
n
Y
L(θ0 ) = pθ0 (xi )
i=1
Define Sn = {x : L(θ0 ) > L(θ1 )}
n
1X pθ1 (xi ) pθ1 (X)
P
log → Eθ0 log as n → ∞
n i=1 pθ0 (xi ) pθ0 (X)
By Jensen’s Inequality for the convex function f (X) → E[f (X)] ≤ f (E[X]). Here
p (x) p (x) 1
− log pθθ0 (x) = log pθθ1 (x) is strictly convex.
1 0
pθ1 (x)
For the convex function, log
pθ0 (x)
pθ1 (X) pθ1 (X)
Eθ0 log ≤ log Eθ0
pθ0 (X) pθ0 (X)
pθ1 (X) pθ1 (x)
Z
But Eθ0 = pθ (x)dx = 1
pθ0 (X) pθ0 (x) 0
L(θ0 )
.˙. lim Pθ0 {Sn } = Pθ0 lim >1
n→∞ n→∞ L(θ1 )
n
( )
1X pθ1 (Xi )
= Pθ0 lim log <0
n→∞ n pθ0 (Xi )
i=1
pθ1 (X)
= Pθ0 Eθ0 log <0 → 1 as n → ∞
pθ0 (X)
1 dy 1
y = log x is a concave function and − log x is a convex function, since dx
= x
> 0 ↑ ∀ x > 0 and
d2 y
dx2
= − x12 <0
A.Santhaakumaran 243
pθ1 (X)
= Pθ0 log Eθ0 <0 → 1 as n → ∞
pθ0 (X)
Pθ0 {L(θ0 ) > L(θ1 )} → 1 as n → ∞
MLE is consistent
P
⇒ θ̂(X) → θ0 as n → ∞
⇒ θ̂(X) is a consistent estimator of θ.
∂2 log L(θ)
Proof: Expanding ∂ 2 θ2
as Taylor’s series around θ̂(x) is
∂2 log L[θ̂(x)] ∂2 log L(θ0 ) 3 log L(θ? )
∂θ2
= ∂θ2
+ [θ̂(x) − θ0 ] ∂ ∂θ3
where θ? = θ0 + ν(θ̂(x) − θ0 ), 0 < ν < 1
A.Santhaakumaran 244
3 log L(θ? )
Further, assume ∂ ≤ H(x) ∀ θ ∈ Ω and Eθ0 [H(X)] < ∞ is independent of
∂θ3
θ0 .
∂ 2 log L[θ̂(x)] ∂ 2 log L(θ0 ) ∂ 3 log L(θ ? )
− ≤ |θ̂(x) − θ0 |
∂θ2 ∂θ2 3
∂θ
≤ |θ̂(x) − θ0 |H(x)
P P
|θ̂(X) − θ0 |H(X) → 0 as n → ∞ since θ̂(X) → θ0 as n → ∞
( )
∂ 2 log L[θ̂(X)] ∂ 2 log L(θ0 )
Pθ0 − < → 1 as n → ∞
∂θ2 ∂θ2
n
" #
1X ∂ 2 log pθ (xi ) P ∂ 2 log pθ (X)
→ Eθ0 as n → ∞
n i=1 ∂θ2 ∂θ2
" #
∂ 2 log pθ (X)
Since I(θ0 ) ≥ 0 ⇒ Eθ0 = −I(θ0 ) < 0
∂θ2
n
( )
. 1X ∂ 2 log pθ (X)
. .Pθ0 <0 → 1 as n → ∞
n i=1 ∂θ2
n
Y ∂ 2 log L(θ)
Since L(θ) = pθ (xi ) ⇒ P θ0 <0 → 1 as n → ∞
2
i=1
∂θ
θ=θ̂(x)
∂ 2 log L(θ)
h i h i
∂ log L(θ)
(ii) Eθ0 ∂θ = 0, Eθ0 ∂θ2
= −nI(θ0 ) < 0 ∀ θ ∈ Ω where I(θ0 ) is the
amount of information for a single observation x of X.
3
(iii) ∂ log L(θ)
≤ H(x) and Eθ0 [H(X)] < ∞ is independent of θ0 .
∂θ3
A.Santhaakumaran 245
Theorem 6.3 ( Cramer 1946) Let θ̂(X) be the MLE of θ, then under the regularity
p
conditions (i) to (iii) nI(θ0 )(θ̂(X) − θ0 ) has an asymptotic normal distribution with
mean zero and variance one
∂ log L(θ)
Proof: Let θ̂(X) be the solution of ∂θ = 0 in an interval containing the true
value θ0 of θ.
∂ log L(θ)
Expanding the function ∂θ around θ̂(x) by using Taylor’s series for any fixed x,
2
∂ log L θ̂(x) ∂ log L(θ0 ) ∂ 2 log L(θ )
0
θ̂(x) − θ0 ∂ 3 log L(θ? )
i.e., = + θ̂(x) − θ0 +
∂θ ∂θ ∂θ2 2! ∂θ3
where θ? = θ0 + ν θ̂(x) − θ0 , 0 < ν < 1.
2
∂ log L(θ̂(x)) ∂ log L(θ0 ) ∂ 2 log L(θ0 ) θ̂(x) − θ0 ∂ 3 log L(θ? )
But =0 → + θ̂(x) − θ0 2
+ =0
∂θ ∂θ ∂θ 2 ∂θ3
2
∂ 2 log L(θ )
0
θ̂(x) − θ0 ∂ 3 log L(θ? ) ∂ log L(θ0 )
θ̂(x) − θ0 + =−
∂θ2 2 ∂θ3 ∂θ
∂ 2 log L(θ )
0
θ̂(x) − θ 0 ∂ 3 log L(θ ? ) ∂ log L(θ0 )
θ̂(x) − θ0 2
+ 3
=−
∂θ 2 ∂θ ∂θ
1 ∂ log L(θ0 )
n ∂θ
θ̂(x) − θ0 =
1 ∂ 2 log L(θ )
0 (θ̂(x)−θ0 ) 1 ∂ 3 log L(θ ? )
−n ∂θ2
− 2 n ∂θ3
I(θ0 )
nI(θ0 ) n1 ∂ log∂θL(θ0 )
p
p I(θ0 )
nI(θ0 ) θ̂(x) − θ0 =
2 (θ̂(x)−θ0 ) 1 ∂ 3 log L(θ? )
− n1 ∂ log L(θ0 )
∂θ 2 − 2 n ∂θ 3
1 ∂ log L(θ0 )
q √ ∂θ
nI(θ0 )
nI(θ0 ) θ̂(x) − θ0 =
2 (θ̂(x)−θ0 ) 1 ∂ 3 log L(θ? )
1
I(θ0 ) − n1 ∂ log L(θ0 )
∂θ2
− 2 n ∂θ3
n
1X ∂ 2 log pθ (xi ) P
→ −I(θ0 ) as n → ∞
n i=1 ∂θ2
A.Santhaakumaran 246
P P
Also θ̂(X) → θ0 as n → ∞ ⇒ θ̂(X) − θ0 → 0 as n → ∞ and Eθ0 [H(X)] = k as
h i
∂ log pθ (xi ) ∂ log pθ (Xi )
n → ∞. Denote Zi = ∂θ ,i = 1, 2, · · · , n. Eθ0 [Zi ] = Eθ0 ∂θ =0 ∀i=
1, 2, · · · , n. Let Sn = Z1 + · · · + Zn , then E[Sn ] = 0 and V [Sn ] = I(θ0 ) + · · · + I(θ0 ) =
nI(θ0 )
1 ∂ log L(θ0 )
q √ ∂θ
nI(θ0 )
nI(θ0 ) θ̂(X) − θ0 = h i as n → ∞
1
I(θ0 ) − n1 (−nI(θ0 )) − 0
q ∂ log L(θ0 )
∂θ
nI(θ0 ) θ̂(X) − θ0 = p as n → ∞
nI(θ0 )
n −E[Sn ] d
By Lindeberg - Levey Central Limit Theorem S√ → N (0, 1) as n → ∞.
V [Sn ]
p
.˙. nI(θ0 ) θ̂(X) − θ0 ∼ N (0, 1) as n → ∞.
Remark 6.2 Any consistent estimator θ̂(X) of roots of the likelihood equation satisfies
√
n(θ̂(X) − θ0 ) ∼ N (0, I(θ10 ) ), then θ̂(X) is an efficient likelihood estimator of θ or
asymptotically normal and efficient estimator of θ.
MLE is unique
Aω = [θ | g(θ) = ω]
.. . ∩ ω Aω = Ω
= eθt(x)−A(θ) h(x)
n
X
log L(θ) = θ t(xi ) − nA(θ) + log h(x)
i=1
n
∂ log L(θ)
t(xi ) − nA0 (θ)
X
=
∂θ i=1
n
∂ log L(θ) 1X
For maximum, = 0 ⇒ A0 (θ) = t(xi ) (6.6)
∂θ n i=1
and
∂ 2 log L(θ)
= −nA00 (θ) < 0
∂θ2
Z
Consider eθt(x)−A(θ) h(x)dx = 1
Assume that the integral is continuous and has derivatives of all orders with respect
to θ and it can be differentiated under the integral sign.
Z Z
t(x)eθt(x)−A(θ) h(x)dx − A0 (θ)e−A(θ) eθt(x) h(x)dx = 0
Z
0
Eθ [T ] = A (θ) eθt(x)−A(θ) h(x)dx
A0 (θ) = Eθ [T ] (6.7)
1 Pn
Using equations (6.6) and (6.7), one may get Eθ [T ] = n i=1 t(xi )
Z Z
t(x)eθt(x)−A(θ) h(x)dx − A0 (θ) = 0 since eθt(x) e−A(θ) h(x)dx = 1
[τ 0 (θ)]2
Vθ [T ] = ∀ θ∈Ω
∂ log L(θ) 2
h i
Eθ ∂θ
∂ log L(θ)
Covθ T, = τ 0 (θ), ∀ θ ∈ Ω.
∂θ
A.Santhaakumaran 250
∂ log L(θ)
i.e., Eθ T = τ 0 (θ), ∀ θ ∈ Ω.
∂θ
∂ logL(θ) 0 ∂ log L(θ)
Eθ (T − τ (θ)) = τ (θ), since Eθ =0∀θ∈Ω
∂θ ∂θ
∂ log L(θ)
A(θ)Eθ [T − τ (θ)]2 = τ 0 (θ) since = A(θ)[t(x) − τ 0 (θ)]
∂θ
A(θ)Vθ [T ] = τ 0 (θ)
τ 0 (θ)
A(θ) =
Vθ [T ]
Squaring both sides of (5.8), one can get
2
∂ log L(θ) 2
= A2 (θ) [t(x) − τ 0 (θ)]
∂θ
2
∂ log L(θ)
Eθ = A2 (θ)Vθ [T ]
∂θ
2
[τ 0 (θ)]2 Vθ [T ]
∂ log L(θ)
i.e., Eθ = 2
∂θ {Vθ [T ]}
[τ 0 (θ)]2
i.e., Vθ [T ] = h i2 ∀ θ ∈ Ω
∂ log L(θ)
Eθ ∂θ
T = t(X) attains the Cramer - Rao lower bound, i.e., T = t(X) is a MVBE of τ (θ).
∂ log L(θ)
Conversely, assume T = t(X) is a MVBE of τ (θ). Now to prove ∂θ ∝ [t(x) −
∂ log L(θ) 2 0 2
h i
∂ log L(θ)
τ (θ)], i.e., ∂θ = A(θ)[t(x) − τ (θ)], τ 0 (θ) = A(θ)Vθ [T ] and Eθ ∂θ = [τVθ(θ)]
[T ]
2
∂ log L(θ) A2 (θ)Vθ2 [T ]
.˙. Eθ =
∂θ Vθ [T ]
∂ log L(θ) 2
Eθ = A2 (θ)Vθ [T ]
∂θ
∂ log L(θ) 2
Eθ = A2 (θ)Eθ [T − τ (θ)]2
∂θ
∂ log L(θ)
⇒ = A(θ)[t(x) − τ (θ)]
∂θ
∂ log L(θ)
i.e., ∝ [t(x) − τ (θ)]
∂θ
h(x) and t(x) are functions of observations only and θ1 and θ2 are functions of θ only.
The parametric functions to be estimated is − dθ dθ2 dθ
dθ1 = − dθ dθ1 and the variance of the
2
2
h i
estimator is − ddθθ22 = d
dθ − dθ2
dθ1 dθ
1
1 dθ1
Proof: Let T = t(X) be the MVBE of τ (θ) where θ is the population parameter. For a
single observation x of X, the likelihood function for θ is L(θ) = pθ (x), and t(x)−τ (θ)
∂ log L(θ)
and ∂θ are proportional, i.e.,
∂ log L(θ)
= A(θ)[t(x) − τ (θ)]
∂θ
= eθ1 t(x)+θ2 ec
Further, assuming the differentiation with respect to θ1 under the integral sign is valid
and differentiate twice, one can get
dθ2
Z
h(x)et(x)θ1 t(x)dx = e−θ2 − (6.9)
dθ1
A.Santhaakumaran 253
2
dθ2 d2 θ2
Z
h(x)et(x)θ1 [t2 (x)]dx = e−θ2 − e−θ2 (6.10)
dθ1 dθ12
From equation (6.9), Eθ [T ] = − dθ
dθ1 = τ (θ)
2
2 2
dθ2
− ddθθ22
R 2
From equation (6.10), t (x)et(x)θ1 +θ2 h(x)dx = dθ1 1
2
dθ2 d2 θ2
Eθ [T 2 ] = −
dθ1 dθ12
d2 θ2
Vθ [T ] = Eθ [T 2 ] − (Eθ [T ])2 = −
dθ12
The variance of the MVBE of τ (θ) is
n o
d dθ2
dθ (− dθ1 )
[τ 0 (θ)]2
=
∂ log L(θ) 2 ∂ log L(θ) 2
h i h i
Eθ ∂θ Eθ ∂θ
The variance of the MVBE of τ (θ) = − dθ2
dθ1 is
2
{ ddθθ22 }2 { dθ1 2
dθ } d2 θ2
1
=−
− d2 θ2
( dθ1 2
dθ )
dθ12
dθ12
2
The variance of T = t(X) is − ddθθ22 . Thus T = t(X) attains the MVB of the parametric
1
function τ (θ).
Problem 6.16 Let X1 , X2 , · · · , Xn be a random sample drawn from the population
with pdf
θxθ−1
0 < x < 1, θ > 0
pθ (x) =
0
otherwise
Find the MVBE of θ.
Solution: The likelihood function for θ is
n
!θ−1
Y
n
L(θ) = θ xi
i=1
n
X
log L(θ) = n log θ + (θ − 1) log xi
i=1
P Pn
log xi −
⇒ L(θ) = en log θ+θ i=1
log xi
where θ1 = θ, θ2 = n log θ,
P
h(x) = e− log xi , t(x) =
X
log xi
dθ2 n
τ (θ) = − =−
dθ1 θ
2 d nθ
d θ2 n
Vθ [T ] = − 2 = − = 2
dθ1 dθ θ
−n −n
⇒
P
Since Eθ [T ] = τ (θ) = θ log Xi = θ .
∂ 2 log L(θ)
= A0 (θ)[t(x) − θ] + A(θ)(−1)
∂θ2
2
∂ log L(θ)
= −A(θ̂(x)) < 0 at θ = θ̂(x)
∂θ2
ˆ is MLE of θ.
where θ(X)
Problem 6.17 If T = t(X) is MVBE of τ (θ) and pθ (x1 , x2 , · · · , xn ) the joint density
function corresponding to n independent observations of a random variable X , then
∂ log pθ (x1 ,x2 ,···,xn )
show that correlation between T and ∂θ is unity.
Solution: Given T = t(X) is the MVUE of τ (θ), i.e., T attains the Cramer Rao
lower bound,
[τ 0 (θ)]2
⇒ Vθ [T ] = ] θ∈Ω
Vθ [ ∂ log pθ (x∂θ
1 ,x2 ,···,xn )
∂ log pθ (x1 , x2 , · · · , xn )
i.e., [τ 0 (θ)]2 = Vθ [T ]Vθ [ ]
∂θ
s
∂ log pθ (x1 , x2 , · · · , xn )
τ 0 (θ) = Vθ [T ]Vθ [ ]
∂θ
But τ (θ) = Eθ [T ]
Z
= tpθ (x1 , x2 , · · · , xn )dx
∂pθ (x1 , x2 , · · · , xn )
Z
τ 0 (θ) = t dx
∂θ
∂pθ (x1 , x2 , · · · , xn ) pθ (x1 , x2 , · · · , xn )
Z
= t dx
∂θ pθ (x1 , x2 , · · · , xn )
∂ log pθ (x1 , x2 , · · · , xn )
Z
= t pθ (x1 , x2 , · · · , xn )dx
∂θ
∂ log pθ (x1 , x2 , · · · , xn )
= Eθ T
∂θ
log pθ (x1 , x2 , · · · , xn )
= Covθ T,
∂θ
log pθ (x1 ,x2 ,···,xn )
Correlation coefficient between T and ∂θ is
h i
Covθ T, log pθ (x1∂θ
,x2 ,···,xn )
ρ= r h i
Vθ [T ]Vθ ∂ log pθ (x∂θ
1 ,x2 ,···,xn )
A.Santhaakumaran 256
τ 0 (θ)
ρ = r h i
∂ log pθ (x1 ,x2 ,···,xn )
Vθ [T ]Vθ ∂θ
= 1
r h i
∂ log pθ (x1 ,x2 ,···,xn )
Since τ 0 (θ) = Vθ [T ]Vθ ∂θ
This is not true when the moments of the distribution do not exist. For example in the
case of Cauchy distribution moment estimators do not exist.
Problem 6.18 A random sample of size n is taken from the log normal distribution
1
√1 1 − 2σ2 (log x−θ)2
e x>0
pθ,σ2 (x) = 2πσ x
0
otherwise
1 xr − 12 (log x−θ)2
Z ∞
E[X r ] = e 2σ √ dx
0 2πσ x
Take y = log x, i.e., ey = x ⇒ ey dy = dx
Z ∞
1 1 2
r
E[X ] = √ ery e− 2σ2 (y−θ) dy
0 2πσ
y−θ
Let = z ⇒ y = σz + θ, dy = σdz
σ
A.Santhaakumaran 257
Z ∞
r 1 1 2
E[X ] = √ erθ− 2 z +rσz dz
−∞ 2π
erθ
Z ∞
1 2 −2rσz]
= √ e− 2 [z dz
2π −∞
r2 σ2
erθ+ 2
Z ∞
1 2
= √ e− 2 [z−rσ] dz
2π −∞
2 2
rθ+ r 2σ √ Z ∞ √
0 e 1 2
µr = √ 2π since e− 2 [z−rσ] dz = 2π
2π −∞
r2 σ2
µr0 = erθ+ 2 r = 1, 2, · · ·
σ2 2
when r = 1 log µ10 = θ + , 2 log µ10 = 2θ + σ 2 , log µ10 = 2θ + σ 2
2 !
2 µ20
when r = 2 log µ20 = 2θ + 2σ , 2
log µ20 − log µ10 2
= σ , log = σ2
(µ10 )2
m02
P r
x
2
⇒ σ̂ (x) = log where m0r = i
r = 1, 2, · · ·
(m01 )2 n
m02
log(m01 )2 = 2θ̂(x) + log
(m01 )2
m02
log(m01 )2 − log = 2θ̂(x)
(m01 )2
(m01 )2
i.e., θ̂(x) = log q
m02
Problem 6.19 Find the moment estimates of α and β for the pdf
αβ e−αx xβ−1
x > 0, β > 0, α > 0
Γβ
pα,β (x) =
0
otherwise
µ20 1 (µ0 )2
= 1+ ⇒ β= 1
(µ10 )2 β µ2
P P P 2
(m01 )2 m01 xi x2i xi
Thus β̂(x) = m2 and α̂(x) = m2 where m01 = n and m2 = n − n .
Problem 6.20 Obtain the moment estimate of the parameter θ of the pdf
1 e−|x−θ|
−∞ < x < ∞
2
pθ (x) =
0
otherwise
= −(x − θ) if x ≤ θ
Z θ
x (x−θ)
Z ∞ x −(x−θ)
µ01 = e dx + e dx
−∞ 2 θ 2
when x − θ = t ⇒ x = t + θ
Z 0 Z ∞
2µ01 = (t + θ)et dt + (t + θ)e−t dt
−∞ 0
Z ∞ Z 0 Z ∞
= θ e−|t| dt + tet dt + te−t dt
−∞ −∞ 0
Z ∞ Z ∞ Z ∞
−|t| −t
= θ e dt − θ te dt + θ te−t dt
−∞ 0 0
Z ∞
1 −|t|
= θ since e dt = 1
2 −∞
P
xi
µ01 = θ ⇒ θ̂(x) = m01 where m01 = .
n
Problem 6.21 For a single random observation x of X , obtain the moment estimates
of the parameters a and b of the rectangular distribution
Solution: The pdf of rectangular distribution is
1
a<x<b
b−a
pa,b (x) =
0
otherwise
Z b
x a+b
µ01 = E[X] = dx =
a b−a 2
A.Santhaakumaran 259
!
x2 b3 − a3 b2 + ab + a2
Z b
1
µ02 = E[X ] =2
dx = =
a a−b b−a 3 3
b2
+ 2ab + a2 − ab (2µ01 )2 − ab
µ02 = =
3 3
3µ02 0 2 0
= 4(µ1 ) − ab and b = 2µ1 − a
2
a2 − 2aµ01 + 4µ01 − 3µ02 = 0
q
2µ01 ± 4µ01 2 − 4(4µ01 2 − 3µ02 )
a=
2
√ √ √
â(x) = m01 ± 3m2 . But 2µ01 = µ01 ± 3m2 + b ⇒ b̂(x) = m01 ± 3m2 . Thus the value
√ √
of the moment estimators of a and b are â(x) = m01 − 3m2 and b̂(x) = m01 + 3m2
P P P 2
xi x2i xi
where m01 = n and m2 = n − n
X=x 0 1 2
Pθ {X = x} 1 − θ − θ2 θ θ2
X=x Pθ {X = x} Frequency f
0 1 − θ − θ2 11
1 θ 10
2 θ2 4
P
Total 1 fi = 25
P
fx
Solution: One can get, µ01 = Eθ [X] = (1 − θ − θ2 ) × 0 + θ × 1 + θ2 × 2 = P fi i
i
0+10+8
0+θ+ 2θ2 = 25
50θ2 + 25θ − 18 = 0
√
−25± 625+4×50×18
θ̂(x) = 2×50 = 0.4
pdf
xα
α xα−1 e− β
x, β, α > 0
β
pα,β (x) =
0
otherwise
Obtain the moment estimates of α and β.
Solution: Compute the rth order moment, i.e.,
Z ∞ α
r α − xβ
E[X ] = xα+r−1 e dx
0 β
Z ∞
α 1
α+r−1
− βy 1 1 −1
= yα e y α dy where y = xα
β 0 α
Z ∞
1 −y r
= e β y α +1−1 dy
β 0
1 Γ αr + 1
r
r
µ0r = r = β α Γ + 1
β ( β1 ) α +1 α
1 2
1 2
µ01 = β α Γ + 1 and µ02 = β α Γ +1
α α
2
2 1
2 2
µ2 = βαΓ + 1 − βα Γ +1
α α
2
µ2 Γ( α2 + 1) − Γ( α1 + 1)
=
(µ01 )2
2
Γ( α1 + 1)
P
S 1 Pn Xi
Coefficient of variation = X̄
where S 2 = n−1
2
i=1 (Xi − X̄) and X̄ = n . Equating
2
S2 Γ( α2 + 1) − Γ( α1 + 1)
= 2
x̄2
Γ( α1 + 1)
and using iterative method to estimate the value of α. From the estimate α̂(x) one can
obtain the estimate of β.
e−θ θj
Let πj (θ) = j = 0, 1, · · · ,
j!
dπj (θ) e−θ θj−1 j θj e−θ (−1)
= +
dθ j! j!
−θ
e θ j j
= −1
j! θ
dπj (θ) j
= πj −1
dθ θ
A.Santhaakumaran 262
Iterative method may be used to solve the equation for θ. Alternatively, expand f (θ) =
fj2
h i
j
1−
P
j πj (θ) θ in a Taylor’s series as a function of θ upto first order about the sample
mean x̄ where x̄ is the trial value of θ,
X fj2 X fj2 X fj2
" 2 #
j j j j
1− = 1− + (θ − x̄) 2
+ 1−
j
πj (θ) θ j
mj x̄ j
mj x̄ x̄
e−x̄ x̄j
where πj (x̄) = mj = and f (θ) = f (x̄) + (θ − x̄)f 0 (x̄)
j!
d 1
πj (θ) (1 − θj ) 1
j
j
1 dπj (θ)
since = 0+ 2 − 1−
dθ πj (θ) θ θ πj2 (θ) dθ
" #
1 j j 1 j
= − 1− (−πj (θ)) 1 −
πj (θ) θ2 θ πj (θ) θ
" 2 #
1 j j
= 2
+ 1− = f 0 (θ)
πj (θ) θ θ
X fj2
j
But 1− = 0
j
πj (θ) θ
X fj2 X fj2
" 2 #
j j j
1− + (θ − x̄) + 1− =0
j
mj x̄ j
mj x̄2 x̄
P fj2 h j
i
− j mj 1− x̄
θ − x̄ = P fj2 j j 2
j mj [ x̄2 + (1 − x̄ ) ]
P fj2
− j mj [x̄ − j] x̄1
θ − x̄ = P fj2
j mj [j + (x̄ − j)2 ] x̄12
A.Santhaakumaran 263
P fj2
j mj [j − x̄]
Let θ1 = x̄ + x̄ P fj2
j mj [j + (j − x̄)2 ]
To improve the value of θ from x̄, repeat the process until to get the convergent value
of θ.
Problem 6.25 Show that for large sample size, maximizing the likelihood function
of the χ2 statistic is equal to minimizing the χ2 statistic.
Solution: Let oj be the observed frequency and ej be the theoretical fre-
P (oj −ej )2
quency of the j th class. Then χ2 = j ej . For large fixed sample size
n, the distribution of quantities oj , j = 1, 2, · · · , r is given by the likelihood
function
n! e1 o1 e2 o2 e r or
L = ···
o1 !o2 ! · · · or ! n n n
such that o1 + o2 + · · · + or = n
o1 o2 or o1 or
n! e1 e2 er o1 or
= ··· ···
o1 !o2 ! · · · or ! o1 o2 or n n
r
!
X ej
log L = constant + oj log
j=1
oj
1
For large fixed sample size, ej = oj + aj n1−δ , δ > 0, i.e., ej = oj + aj n 2 for δ = 21 ,
1 1
where aj is finite and |aj n 2 | < and
P P P
j oj = j ej = n so that n 2 j aj = 0 as
1
n → ∞ and if n 6→ ∞, then
aj < 0(n− 2 ).
P P
aj < 1 for every > 0, i.e.,
n2
r
" 1 #
oj + aj n 2
X
log L = constant + oj log
j=1
oj
r
" 1 !#
X aj n 2
= constant + oj log 1 +
j=1
oj
1
a2j n 1
" !#
X aj n 2
= constant + oj − 2 + ···
j
oj oj 2
1 1 X a2j n 1
+ 0(n− 2 )
X
= constant + aj n 2 −
j
2 j oj
1 X (ej − oj )2 1
= constant − + 0(n− 2 )
2 j oj
A.Santhaakumaran 264
P (ej −oj )2
If modified χ2 statistic is defined as χ2mod = j oj , then
1
log L = constant − χ2mod as n → ∞.
2
3 1
3
P a3j n 2 P a3j h 1
i3 1
Since j o2j
< for some n > N ⇒ j o2 <
3 = 3
1 = o(n− 2 = o(n− 2 )
j n2 n2
1
" 1
#4
P a4j (n 2 )4 P a4j 14 1
and j o3j
< 1 for some n > N ⇒ j o3 < 1
1 < 1 = (o(n− 2 ))4 =
j (n 2 )4 n2
1
o(n− 2 ) where > 0 and 1 > 0.
1
χ2 − χ2mod = o(n− 2 ) = 0 as n → ∞
1
Thus log L = constant − χ2 as n → ∞
2
1
max log L = constant + {− max χ2 } as n → ∞
2
1
= constant + min χ2 as n → ∞
2
X 0 Xθ = X 0 Y
θ̂(x) = (X 0 X)−1 X 0 Y
= (X 0 X)−1 X 0 [Xθ + ]
= (X 0 X)−1 X 0 Xθ + (X 0 X)−1 X 0
= θ since Eθ [] = 0
A.Santhaakumaran 266
θ̂(X) − θ = (X 0 X)−1 X 0 Y − θ
= (X 0 X)−1 X 0 [Xθ + ] − θ
= (X 0 X)−1 X 0 Xθ + (X 0 X)−1 X 0 − θ
= θ + (X 0 X)−1 X 0 − θ
= (X 0 X)−1 X 0
= σ 2 (X 0 X)−1
Linear estimation
ρ(X 0 ) = ρ(X 0 , b)
c0 Eθ [Y ] = b0 θ
Eθ [c0 Y ] = b0 θ
⇒ b0 θ is estimable
Remarks 6.5 ρ(X 0 ) = ρ(X 0 , b) ⇒ ρ(X 0 X) = ρ(X 0 X, b)
c0 Xθ = b0 θ → X 0 c = b (6.11)
L(λ) = c0 c − 2λ0 (X 0 c − b)
dL(λ)
= 0
dc
A.Santhaakumaran 268
⇒ c0 − λ0 X 0 = 0
⇒ c0 = λ0 X 0 i.e., c = Xλ
X 0 Xλ = b (6.13)
Thus equation (6.13) is solvable. Let c(1) and c(2) be two solutions for to λ(1)
and λ(2) of equation (6.13).
c(1) = Xλ(1)
c(2) = Xλ(2)
X 0 Xλ(1) = b
X 0 Xλ(2) = b
X 0 X(λ(1) − λ(2) ) = 0
→ c(1) = c(2)
.˙. c = Xλ
−1
= X1 X10 X1 b1
−1 −1
c0 c = b01 X10 X1 X10 X1 X10 X1 b1
−1
= b01 X 0 X1 b1
A.Santhaakumaran 269
= c0 [I − X1 (X10 X1 )−1 X10 ]c + b01 (X10 X1 )−1 (X10 X1 )(X10 X1 )−1 (X10 X1 )](X10 X1 )−1 b1
= c0 [I − X1 (X10 X1 )−1 X10 ][I − X1 (X10 X1 )−1 X10 ]c + b01 (X10 X1 )−1 b1
This indicates that the minimum is actually obtained. The LSE θ̂(X) of θ
is obtained by minimizing (Y − X θ̂)0 (Y − X θ̂). The normal equation is
X 0 Xθ = X 0 Y ⇒ c0 Y = λ0 X 0 Y = λ0 X 0 X θ̂ = b0 θ̂(X) since b0 = λ0 X 0 X.
.˙. Eθ [(Y −X θ̂ )0 (Y −X θ̂)] = Eθ [(Y −X θ̂ )0 (I −X1 (X10 X1 )−1 X10 )(Y −X θ̂)] = (n−r)σ 2 .
By row reduction, the system

$$\begin{pmatrix} 1 & 0 & 1\\ 0 & 1 & 0\\ 0 & 1 & 0 \end{pmatrix}\begin{pmatrix}\theta_1\\ \theta_2\\ \theta_3\end{pmatrix} = \begin{pmatrix} l_1\\ l_2\\ l_3 - l_1\end{pmatrix}$$

reduces to

$$\begin{pmatrix} 1 & 0 & 1\\ 0 & 1 & 0\\ 0 & 0 & 0 \end{pmatrix}\begin{pmatrix}\theta_1\\ \theta_2\\ \theta_3\end{pmatrix} = \begin{pmatrix} l_1\\ l_2\\ l_3 - l_1 - l_2\end{pmatrix}$$
For the data

  i    Y    X1    X2
  1   62     2     6
  2   60     9    10
  3   57     6     4
  4   48     3    13
  5   23     5     2

$$X'X = \begin{pmatrix} 5 & 25 & 35\\ 25 & 155 & 175\\ 35 & 175 & 325 \end{pmatrix},\qquad (X'X)^{-1} = \frac{1}{480}\begin{pmatrix} 790 & -80 & -42\\ -80 & 16 & 0\\ -42 & 0 & 6 \end{pmatrix}$$

$$\hat\theta(X) = (X'X)^{-1}X'Y = \frac{1}{480}\begin{pmatrix} 790 & -80 & -42\\ -80 & 16 & 0\\ -42 & 0 & 6 \end{pmatrix}\begin{pmatrix} 250\\ 1265\\ 1870 \end{pmatrix} = \begin{pmatrix} 37\\ \frac{1}{2}\\ \frac{3}{2} \end{pmatrix} = \begin{pmatrix} \hat a\\ \hat b_1\\ \hat b_2 \end{pmatrix}$$

The estimated linear model is $Y = 37 + \frac{1}{2}X_1 + \frac{3}{2}X_2$.
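The worked example can be verified with a few lines of Python; the design matrix carries a column of ones for the intercept:

```python
import numpy as np

# Reproducing the worked example above via the normal equations.
Y  = np.array([62, 60, 57, 48, 23])
X1 = np.array([2, 9, 6, 3, 5])
X2 = np.array([6, 10, 4, 13, 2])
X  = np.column_stack([np.ones(5), X1, X2])

theta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
print(theta_hat)   # [37.   0.5  1.5]
```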
Problems
6.1 Define LSE. Show that under certain assumptions to be stated, the
LSE’s are minimum variance unbiased estimators.
6.7 Show that under what regularity conditions to be stated, the MLE is
asymptotically normally distributed.
6.9 Derive the formula to calculate the MLE of θ, using a random sample from the distribution with $P_\theta\{X = x\} = \frac{a_x\theta^x}{g(\theta)}$, $x = 1, 2, \cdots$, where $g(\theta) = \sum_x a_x\theta^x$. Also obtain the explicit expression for the truncated case.
6.13 Given a random sample from $N(\theta, 1)$, $\theta = 0, \pm1, \pm2, \cdots$, find the MLE of θ.
6.15 Obtain the MLE of θ based on random samples of sizes n and m from populations with respective frequency functions $\frac{1}{\theta}e^{-x/\theta}$ and $\theta e^{-x\theta}$, $x > 0$, $\theta > 0$.

6.23 A random sample of size n is available from $p_\theta(x) = \theta x^{\theta - 1}$, $0 < x < 1$, $\theta > 0$. Find the function of θ for which an MVBE exists. Also find the MVBE of this function and its variance.

6.24 Derive the MVUE of θ² in $p_\theta(x) = \frac{e^{-\theta}\theta^x}{x!}$, $x = 0, 1, \cdots$, by taking a sample of size n, and show that it is not an MVBE of θ².
6.26 Show that MVBE’s exist for the exponential family of densities.
6.28 Obtain the BLUE of θ for the normal distribution with mean θ and
variance σ 2 based on n observations x1 , x2 , · · · , xn .
6.29 Obtain the MLE for the coefficient of variation from a population
with N (θ, σ 2 ) based on n observations.
6.39 Which one of the following is not necessary for the UMVU estimation of θ by T = t(X)?
(a) E[T − θ] = 0

6.41 LSE and MLE are the same if the sample comes from a population which is
(a) Normal (b) Binomial (c) Cauchy (d) Exponential Ans:(a)
6.44 Consider a series system with two independent components. Let the ...
6.50 Consider a region R which is a triangle with vertices (0, 0), (0, θ), (θ, 0)
where θ > 0. A sample of size n is selected at random from this region
R. Denote the sample as {(Xi , Yi ), i = 1, 2, 3 · · · , n}. Then denoting
X(n) = max(X1 , X2 , · · · , Xn ) and Y(n) = max(Y1 , Y2 , · · · , Yn ) which of the
following statements is true
(a) MLE of θ is $\frac{X_{(n)} + Y_{(n)}}{2}$
(b) MLE of θ is $\frac{\sum(X_i + Y_i)}{n}$
6.55 Let $X_1, X_2, \cdots, X_n$ be iid $N(\mu, \sigma^2)$, $-\infty < \mu < \infty$, $\sigma^2 > 0$. Then which of the following statements are true?
(a) $\frac{\sum_{i=1}^n(X_i - \bar X)^2}{n}$ is the unbiased estimator of σ²
(b) $\frac{\sum_{i=1}^n(X_i - \bar X)^2}{n - 1}$ is the minimum variance unbiased estimator of σ²
(c) $\sqrt{\frac{\sum_{i=1}^n(X_i - \bar X)^2}{n}}$ is the MLE of σ
(d) $\frac{\sum_{i=1}^n(X_i - \bar X)^2}{n}$ is the MLE of σ² Ans:(b), (c) and (d)
6.56 For the set $(x_1, y_1), \cdots, (x_n, y_n)$ the following two models were fitted using the least squares method. Model I: $y_i = \beta_0 + \beta_1 x_i$, $i = 1, 2, \cdots, n$. Model II: $y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2$, $i = 1, 2, \cdots, n$. Let $\hat\beta_0, \hat\beta_1$ be the least squares estimates of $\beta_0$ and $\beta_1$ from Model I, and $\beta_0^*, \beta_1^*, \beta_2^*$ be the least squares estimates from Model II. Let $A = \sum_{i=1}^n[y_i - (\hat\beta_0 + \hat\beta_1 x_i)]^2$ and $B = \sum_{i=1}^n[y_i - (\beta_0^* + \beta_1^* x_i + \beta_2^* x_i^2)]^2$. Then which of the following statements are true?
(a) A ≥ B (b) A ≤ B
(c) It can happen that A = 0 but B > 0
(d) It can happen that B = 0 but A > 0 Ans:(a) and (d)
(d) An unbiased estimator of the variance of T is $\sum_{i=1}^N\big(\frac{Y_i}{p_i} - T\big)^2$
Ans: (b) and (c)
6.62 Twenty identical items are put in a life testing experiment starting at time 0. The failure times of the items are recorded sequentially. The experiment stops when all the items fail or a pre-fixed time T > 0 is reached, whichever is earlier. If the life times of the items are iid exponential random variables with mean θ, where 0 < θ ≤ 10, then which of the following statements are true?
(a) The MLE of θ always exists
(b) The MLE of θ may not exist
(c) The MLE of θ is an unbiased estimator of θ, if it exists
(d) The MLE of θ is bounded with probability 1, if it exists
Ans:(c) and (d)
and $e_2 = \frac{Y_3 - Y_4}{\sqrt{2}}$. An unbiased estimator of σ² is
(a) $\frac{1}{3}(e_1^2 - e_2^2)$ (b) $\frac{1}{2}(e_1^2 + e_2^2)$ (c) $\frac{1}{4}(e_1^2 + e_2^2)$ (d) $e_1^2 + e_2^2$ Ans:(b)
6.65 Suppose that the life time of an electric bulb follows an exponential distribution with mean θ hours. In order to estimate θ, n bulbs are switched on at the same time. After t hours, n − m (> 0) bulbs are found to be in functioning state. If the life times of the other m (> 0) bulbs are noted as $X_1, X_2, \cdots, X_m$ respectively, then the maximum likelihood estimate of θ is given by
(a) $\hat\theta = \frac{t}{\log(\frac{n}{n-m})}$
(b) $\hat\theta = \frac{\sum_{i=1}^m x_i}{m}$
(c) $\hat\theta = \frac{\sum_{i=1}^m x_i + (n-m)t}{m}$
(d) $\hat\theta = \frac{\sum_{i=1}^m x_i + (n-m)t}{n}$ Ans:(a)
Let $(\hat\beta_1, \hat\beta_2, \hat\beta_3)$ be the least squares estimate of $(\beta_1, \beta_2, \beta_3)$, and let $l_1, l_2, l_3 \in \mathbb{R}$. Then which of the following statements are true?
(a) $(\hat\beta_1, \hat\beta_2, \hat\beta_3)$ is unique
(b) $\sum_{i=1}^3 l_i\hat\beta_i$ is the linear unbiased estimate of $\sum_{i=1}^3 l_i\beta_i$
(c) $\sum_{i=1}^3 l_i\hat\beta_i$ is the uniformly minimum variance unbiased estimate of $\sum_{i=1}^3 l_i\beta_i$
(d) $\sum_{i=1}^3 l_i\hat\beta_i$ is BLUE but not UMVUE of $\sum_{i=1}^3 l_i\beta_i$
6.68 Consider the pdf $f(x; \theta, \sigma) = \frac{0.9}{\sigma}\varphi\big(\frac{x-\theta}{\sigma}\big) + 0.1\,\varphi(x - \theta)$, where $-\infty < \theta < \infty$ and $\sigma > 0$ are unknown parameters and $\varphi(x)$ denotes the pdf of N(0, 1). Let $X_1, X_2, \cdots, X_n$ be a random sample from this probability density function. Then which of the following statements are correct?
(a) The method of moments estimators for θ and σ exist
(b) This model is not parametric
(c) An unbiased estimator of θ exists
(d) Consistent estimators of θ do not exist Ans:(a) and (c)
$$Y_1 = \mu_1 - \mu_2 + \epsilon_1$$
$$Y_2 = \mu_2 - \mu_3 + \epsilon_2$$
$$\cdots$$
$$Y_n = \mu_n - \mu_1 + \epsilon_n$$
Ans:(c)
with pdf

$$f_\theta(x) = \begin{cases}\theta\frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}x^2} + (1-\theta)\frac{1}{2}e^{-|x|} & -\infty < x < \infty,\ \theta \in \{0, \frac{1}{2}, 1\}\\ 0 & \text{otherwise}\end{cases}$$

Here $-\infty < x < \infty$ and $\alpha > 0$. Then which of the statements are true?
(a) The method of moments estimators of neither α nor µ exist
(b) The method of moments estimator of α exists and it is a consistent estimator of α
(c) The method of moments estimator of µ exists and it is a consistent estimator of µ
(d) The method of moments estimators of both α and µ exist, but they are not consistent Ans:(b) and (c)
7. INTERVAL ESTIMATION
7.1 Introduction
The problem of interval estimation is that of finding a family of random sets S(X) for the parameter θ such that, for a given α, 0 < α < 1, $P_\theta\{S(X)\ \text{contains}\ \theta\} \ge 1 - \alpha$ for all $\theta \in \Omega$.

Let $\theta \in \Omega \subseteq \mathbb{R}$ and 0 < α < 1. A function $\underline\theta(X)$ satisfying $P_\theta\{\underline\theta(X) \le \theta\} \ge 1 - \alpha$ for all θ is called a lower confidence bound for θ at confidence level (1 − α). The infimum of $P_\theta\{\underline\theta(X) \le \theta\}$ over all possible values of $\theta \in \Omega \subseteq \mathbb{R}$ is (1 − α); the quantity (1 − α) is called the confidence coefficient.

A function $\bar\theta(X)$ satisfying $P_\theta\{\theta \le \bar\theta(X)\} \ge 1 - \alpha$ for all $\theta \in \Omega \subseteq \mathbb{R}$ is called an upper confidence bound for θ at confidence level (1 − α).

If S(x) is of the form $S(x) = (\underline\theta(x), \bar\theta(x))$, then it is called a confidence interval at confidence level (1 − α), provided $P_\theta\{\underline\theta(X) \le \theta \le \bar\theta(X)\} \ge 1 - \alpha$ for all $\theta \in \Omega \subseteq \mathbb{R}$. The confidence coefficient (1 − α) is associated with the random interval $(\underline\theta(X), \bar\theta(X))$.
Let X be a random sample drawn from a population with pdf $p_\theta(x)$, $\theta \in \Omega \subseteq \mathbb{R}$, and let a, b be two given positive numbers such that a < b, $a, b \in \mathbb{R}$.

Problem 7.1 A single observation X is drawn from U(0, θ), i.e., $p_\theta(x) = \frac{1}{\theta}$, 0 < x < θ. Obtain the probability of confidence of the random interval (X, 10X) for θ, θ ∈ Ω.

Solution: The probability of confidence of the interval (X, 10X) for θ is

$$P_\theta\{X < \theta < 10X\} = P_\theta\Big\{1 < \frac{\theta}{X} < 10\Big\} = P_\theta\Big\{\frac{\theta}{10} < X < \theta\Big\} = \int_{\theta/10}^{\theta}\frac{1}{\theta}\,dx = 0.9$$
Problem 7.2 Find the confidence coefficient of the confidence interval $\big(\frac{X}{19}, 19X\big)$ for θ.

For a statistic T with pdf $p_\theta(t) = \frac{2t}{\theta^2}$, 0 < t < θ, the confidence coefficient of the interval (0, 2T) is

$$P_\theta\{0 < \theta < 2T\} = P_\theta\Big\{0 < \frac{\theta}{T} < 2\Big\} = P_\theta\Big\{\frac{\theta}{2} < T < \infty\Big\} = P_\theta\Big\{\frac{\theta}{2} < T < \theta\Big\} = \int_{\theta/2}^{\theta}\frac{2t}{\theta^2}\,dt = \frac{2}{\theta^2}\Big[\frac{t^2}{2}\Big]_{\theta/2}^{\theta} = 0.75$$
Problem 7.5 A sample observation x of X is drawn from a population with pdf

$$p_\theta(x) = \begin{cases}\frac{2}{\theta^2}(\theta - x) & 0 < x < \theta,\ \theta > 0\\ 0 & \text{otherwise}\end{cases}$$

The statistic $Y = X/\theta$ is a pivot with pdf $p(y) = 2(1 - y)$, 0 < y < 1. Choosing equal tail probabilities, $P_\theta\{Y \ge \lambda_2\} = \frac{\alpha}{2}$ gives

$$\int_{\lambda_2}^{1}2(1-y)\,dy = \frac{\alpha}{2} \Rightarrow \lambda_2^2 - 2\lambda_2 + \Big(1 - \frac{\alpha}{2}\Big) = 0 \Rightarrow \lambda_2 = 1 - \sqrt{\alpha/2} = c_2$$

and $P_\theta\{Y \le \lambda_1\} = \frac{\alpha}{2}$ gives

$$\int_0^{\lambda_1}2(1-y)\,dy = \frac{\alpha}{2} \Rightarrow \lambda_1^2 - 2\lambda_1 + \frac{\alpha}{2} = 0 \Rightarrow \lambda_1 = 1 - \sqrt{1 - \alpha/2} = c_1$$
Solution: Let $Y_1 = \min_{1\le i\le n}\{X_i\}$ be the first order statistic of the random sample $X_1, X_2, \cdots, X_n$. The pdf of $Y_1$ is

$$p_\theta(y_1) = \begin{cases}ne^{-n(y_1 - \theta)} & \theta < y_1 < \infty\\ 0 & \text{otherwise}\end{cases}$$

The statistic $T = Y_1 - \theta$ has pdf $ne^{-nt}$, t > 0, so

$$\int_{\lambda_1}^{\lambda_2}ne^{-nt}\,dt = 1 - \alpha,\qquad e^{-n\lambda_1} - e^{-n\lambda_2} = 1 - \alpha$$

This equation has infinitely many solutions. If one chooses $\lambda_1 = 0$, then $1 - e^{-n\lambda_2} = 1 - \alpha$, i.e., $e^{-n\lambda_2} = \alpha \Rightarrow \lambda_2 = \frac{1}{n}\log(\frac{1}{\alpha})$. ∴ The (1 − α) level confidence interval for θ follows from

$$P_\theta\Big\{0 < T < \frac{1}{n}\log\frac{1}{\alpha}\Big\} = 1 - \alpha \Rightarrow P_\theta\Big\{0 < Y_1 - \theta < \frac{1}{n}\log\frac{1}{\alpha}\Big\} = 1 - \alpha \Rightarrow P_\theta\Big\{Y_1 - \frac{1}{n}\log\frac{1}{\alpha} < \theta < Y_1\Big\} = 1 - \alpha$$

$\big(Y_1 - \frac{1}{n}\log(\frac{1}{\alpha}),\ Y_1\big)$ is the (1 − α) level confidence interval for θ.
Problem 7.7 Given a sample of size n from U(0, θ), show that the confidence interval for θ based on the sample range R, with confidence coefficient (1 − α) and of the form $(R, \frac{R}{c})$, has c given as a root of the equation $c^{n-1}[n - (n-1)c] = \alpha$.

Solution: The statistic $Y = R/\theta$ has pdf $p(y) = n(n-1)y^{n-2}(1-y)$, 0 < y < 1. Then

$$P_\theta\Big\{\lambda_1 < \frac{R}{\theta} < \lambda_2\Big\} = 1 - \alpha,\qquad \int_{\lambda_1}^{\lambda_2}n(n-1)y^{n-2}(1-y)\,dy = 1 - \alpha$$

$$n(n-1)\Big[\frac{y^{n-1}}{n-1} - \frac{y^n}{n}\Big]_{\lambda_1}^{\lambda_2} = 1 - \alpha,\qquad n\lambda_2^{n-1} - (n-1)\lambda_2^n - n\lambda_1^{n-1} + (n-1)\lambda_1^n = 1 - \alpha$$

This equation has infinitely many solutions. If one chooses $\lambda_1 = c$ and $\lambda_2 = 1$, then the confidence interval for θ follows from

$$P\{c < Y < 1\} = 1 - \alpha \Rightarrow P\Big\{c < \frac{R}{\theta} < 1\Big\} = 1 - \alpha \Rightarrow P\Big\{R < \theta < \frac{R}{c}\Big\} = 1 - \alpha$$

$\big(R, \frac{R}{c}\big)$ is the (1 − α) level confidence interval for θ, where setting $\lambda_1 = c$, $\lambda_2 = 1$ in the equation above gives $1 - nc^{n-1} + (n-1)c^n = 1 - \alpha$, i.e., $c^{n-1}[n - (n-1)c] = \alpha$. For n = 2, $c = 1 - \sqrt{1 - \alpha}$.
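For n > 2 the equation $c^{n-1}[n - (n-1)c] = \alpha$ has no closed form, but a root finder recovers c; a minimal sketch:

```python
from scipy.optimize import brentq

# Solve c^{n-1} [n - (n-1)c] = alpha for c in (0, 1).  For n = 2 the root
# should agree with the closed form c = 1 - sqrt(1 - alpha).
def root_c(n, alpha):
    g = lambda c: c**(n - 1) * (n - (n - 1) * c) - alpha
    return brentq(g, 1e-12, 1.0 - 1e-12)

print(root_c(2, 0.05), 1 - (1 - 0.05)**0.5)   # both about 0.0253
print(root_c(10, 0.05))                        # c for a sample of size 10
```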
For large or small samples, Chebychev's inequality can be employed to find a confidence interval for a parameter θ, θ ∈ Ω. For a random variable X with $E_\theta[X] = \theta$ and $V_\theta[X] = \sigma^2$,

$$P_\theta\Big\{|X - \theta| < \epsilon\sqrt{V[X]}\Big\} > 1 - \frac{1}{\epsilon^2}\quad\text{where } \epsilon > 1$$

If $\hat\theta(X)$ is an estimate of θ (not necessarily unbiased) with finite variance, then by Chebychev's inequality

$$P_\theta\Big\{|\hat\theta(X) - \theta| < \epsilon\sqrt{E_\theta[\hat\theta(X) - \theta]^2}\Big\} > 1 - \frac{1}{\epsilon^2}$$

$\Rightarrow \Big(\hat\theta(x) - \epsilon\sqrt{E_\theta[\hat\theta(X) - \theta]^2},\ \hat\theta(x) + \epsilon\sqrt{E_\theta[\hat\theta(X) - \theta]^2}\Big)$ is a $(1 - \frac{1}{\epsilon^2})$ level confidence interval for θ.
Problem 7.8 Let $X_1, X_2, \cdots, X_n$ be iid b(1, θ) random variables. Obtain a (1 − α) level confidence interval for θ by using Chebychev's inequality.

Solution: $\sum_{i=1}^n X_i \sim b(n, \theta)$ since each $X_i \sim b(1, \theta)$. $E_\theta[\bar X] = \theta$ and $V_\theta[\bar X] = \frac{V_\theta[X]}{n} = \frac{\theta(1-\theta)}{n}$. Now

$$P_\theta\Big\{|\bar X - \theta| < \epsilon\sqrt{\frac{\theta(1-\theta)}{n}}\Big\} > 1 - \frac{1}{\epsilon^2}$$

Since $\theta(1-\theta) \le \frac{1}{4}$,

$$P_\theta\Big\{|\bar X - \theta| < \frac{\epsilon}{2\sqrt n}\Big\} > 1 - \frac{1}{\epsilon^2},\qquad P_\theta\Big\{\bar X - \frac{\epsilon}{2\sqrt n} < \theta < \bar X + \frac{\epsilon}{2\sqrt n}\Big\} > 1 - \frac{1}{\epsilon^2}$$

If n is kept constant, one can choose $1 - \frac{1}{\epsilon^2} = 1 - \alpha \Rightarrow \epsilon^2 = \frac{1}{\alpha} \Rightarrow \epsilon = \frac{1}{\sqrt\alpha}$. Thus the (1 − α) level confidence interval for θ is

$$\Big(\bar x - \frac{1}{2\sqrt{n\alpha}},\ \bar x + \frac{1}{2\sqrt{n\alpha}}\Big)$$
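A short sketch of this interval in Python, using simulated Bernoulli data for illustration:

```python
import numpy as np

# Chebyshev-based interval xbar +/- 1/(2*sqrt(n*alpha)) for Bernoulli data.
# The sample is simulated here only for illustration.
rng = np.random.default_rng(1)
x = rng.binomial(1, 0.4, size=200)
n, alpha = x.size, 0.05
half = 1 / (2 * np.sqrt(n * alpha))
print(x.mean() - half, x.mean() + half)   # conservative 95% interval
```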
Problem 7.9 Let X be a Binomial random variable with parameters n and θ. Obtain a (1 − α) level confidence interval for θ.

Solution: One can find the largest integer $n_1(\theta)$ such that $P_\theta\{X \le n_1(\theta)\} \le \frac{\alpha}{2}$ and the smallest integer $n_2(\theta)$ such that $P_\theta\{X \ge n_2(\theta)\} \le \frac{\alpha}{2}$.

Because of the discreteness of the Binomial probability, one cannot make these probabilities exactly equal to $\frac{\alpha}{2}$ for all θ except in the symmetric Binomial case. The events $\{X \le n_1(\theta)\}$ and $\{X \ge n_2(\theta)\}$ are mutually exclusive, so

$$P_\theta\{X \le n_1(\theta)\ \text{or}\ X \ge n_2(\theta)\} \le \frac{\alpha}{2} + \frac{\alpha}{2} = \alpha,\quad\text{i.e., } P_\theta\{n_1(\theta) < X < n_2(\theta)\} \ge 1 - \alpha$$

The two functions $n_1(\theta)$ and $n_2(\theta)$ are monotonic non-decreasing discontinuous step functions such that the (1 − α) level confidence interval for θ is

$$P_\theta\{n_2^{-1}(X) < \theta < n_1^{-1}(X)\} \ge 1 - \alpha$$

where $P_\theta\{X \le n_1(\theta)\} \le \frac{\alpha}{2}$, i.e., $\sum_{i=0}^{n_1(\theta)}\binom{n}{i}\theta^i(1-\theta)^{n-i} \le \frac{\alpha}{2}$, and $n_1(n_1^{-1}(x)) = x$, so that

$$\sum_{i=0}^{x}\binom{n}{i}\theta^i(1-\theta)^{n-i} = \frac{\alpha}{2} \tag{7.1}$$

gives the upper confidence limit for θ. Similarly the lower confidence limit for θ is obtained from

$$\sum_{i=x}^{n}\binom{n}{i}\theta^i(1-\theta)^{n-i} = \frac{\alpha}{2} \tag{7.2}$$

Solving the equations (7.1) and (7.2) for θ (when n and α are known) gives the (1 − α) level confidence interval for θ, i.e., $(\underline\theta(X), \bar\theta(X))$ is the (1 − α) level confidence interval, where $\bar\theta(x)$ is the solution of equation (7.1) and $\underline\theta(x)$ is the solution of equation (7.2).
Problem 7.10 Assume there is a constant probability θ that a person entering a supermarket will make a purchase, so that the customers constitute a random sample of a Bernoulli random variable (success = purchase made, failure = no purchase made). If 10 persons were selected at random and it was found that 4 made a purchase, obtain a 90% confidence interval for θ.

Solution: The 90% confidence limits for θ are given by

$$\sum_{i=0}^{4}\binom{10}{i}\theta^i(1-\theta)^{10-i} = 0.05,\qquad \sum_{i=4}^{10}\binom{10}{i}\theta^i(1-\theta)^{10-i} = 0.05$$

Solving these equations for θ gives $\bar\theta(x) = 0.696$ and $\underline\theta(x) = 0.150$. Thus, if a random sample of 10 independent Bernoulli random variables gives x = 4 successes, the 90% confidence interval for θ is (0.150, 0.696).
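Equations (7.1) and (7.2) can be solved without iteration because the Binomial tail sums equal Beta integrals, so the limits are Beta quantiles. A sketch reproducing the interval of Problem 7.10:

```python
from scipy.stats import beta

# The solutions of equations (7.1) and (7.2) expressed via Beta quantiles
# (the classical Clopper-Pearson form of the exact Binomial interval).
def exact_ci(x, n, alpha):
    lower = beta.ppf(alpha / 2, x, n - x + 1) if x > 0 else 0.0
    upper = beta.ppf(1 - alpha / 2, x + 1, n - x) if x < n else 1.0
    return lower, upper

print(exact_ci(4, 10, 0.10))   # about (0.150, 0.696), as in Problem 7.10
```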
Problem 7.11 Let $X_1, X_2, \cdots, X_n$ be a random sample from a Poisson random variable X with parameter θ. Obtain a (1 − α) level confidence interval for θ.

Solution: Let $Y = \sum_{i=1}^n X_i$. Given that each $X_i$ follows P(θ), $Y \sim P(n\theta)$. The (1 − α) level confidence interval satisfies

$$P_\theta\{\lambda_2^{-1}(Y) < \theta < \lambda_1^{-1}(Y)\} = 1 - \alpha$$

Solving the equations (7.3) and (7.4) for θ, the (1 − α) level confidence interval for θ is $(\underline\theta(X), \bar\theta(X))$, where $\bar\theta(x)$ is the solution of equation (7.3) and $\underline\theta(x)$ is the solution of equation (7.4).
Problem 7.12 Let $X_1, X_2, \cdots, X_n$ be a random sample of a Uniform random variable X on (0, θ). Obtain a (1 − α) level confidence interval for θ.

Solution: Let $T = t(X) = \max_{1\le i\le n}\{X_i\}$. The pdf of T is

$$p(t \mid \theta) = \begin{cases}\frac{nt^{n-1}}{\theta^n} & 0 < t < \theta\\ 0 & \text{otherwise}\end{cases}$$

Choose $\lambda_2(\theta)$ such that $P_\theta\{T \le \lambda_2(\theta)\} = 1 - \frac{\alpha}{2}$:

$$\int_0^{\lambda_2(\theta)}\frac{nt^{n-1}}{\theta^n}\,dt = 1 - \frac{\alpha}{2}\quad\text{i.e., } \theta^n\Big(1 - \frac{\alpha}{2}\Big) = [\lambda_2(\theta)]^n,\quad \lambda_2(\theta) = \theta\Big(1 - \frac{\alpha}{2}\Big)^{\frac{1}{n}}$$

Similarly $P_\theta\{T \le \lambda_1(\theta)\} = \frac{\alpha}{2}$:

$$\int_0^{\lambda_1(\theta)}\frac{n}{\theta^n}t^{n-1}\,dt = \frac{\alpha}{2}\quad\text{i.e., } \lambda_1(\theta) = \theta\Big(\frac{\alpha}{2}\Big)^{\frac{1}{n}}$$

$$\Bigg(\frac{T}{(1 - \frac{\alpha}{2})^{\frac{1}{n}}},\ \frac{T}{(\frac{\alpha}{2})^{\frac{1}{n}}}\Bigg)\ \text{provides the } (1 - \alpha) \text{ level confidence interval for } \theta.$$
Case (i) When σ² is known,

$$P\{a < Z < b\} = 1 - \alpha\quad\text{where } Z = \frac{\bar X - \theta}{\sigma/\sqrt n} \sim N(0, 1)$$

$$P\Big\{a < \frac{\bar X - \theta}{\sigma/\sqrt n} < b\Big\} = 1 - \alpha,\qquad P\{\bar X - b\sigma/\sqrt n < \theta < \bar X - a\sigma/\sqrt n\} = 1 - \alpha$$

where a is given by $\int_{-\infty}^{a}\varphi(z)\,dz = \frac{\alpha}{2}$ and b is given by $\int_{b}^{\infty}\varphi(z)\,dz = \frac{\alpha}{2}$.

Case (ii) When σ² is unknown and the sample size n ≤ 30, the statistic

$$t = \frac{\bar X - \theta}{S/\sqrt n} \sim t\ \text{distribution with } n - 1\ \text{d.f.}$$

where $S^2 = \frac{1}{n-1}\sum_{i=1}^n[X_i - \bar X]^2$. In this case the (1 − α) level confidence interval is

$$\Big(\bar X - t_{\frac{\alpha}{2}}(n-1)\frac{S}{\sqrt n},\ \bar X + t_{\frac{\alpha}{2}}(n-1)\frac{S}{\sqrt n}\Big)$$

where $t_{\frac{\alpha}{2}}(n-1)$ is the upper $\frac{\alpha}{2}$ point of the t distribution with n − 1 d.f.
Problem 7.14 A random sample of size 50 taken from N(θ, σ = 5) has mean 40. Obtain a 95% confidence interval for 2θ + 3.

Solution: Given the sample mean $\bar x = 40$ and population standard deviation σ = 5. The 95% confidence interval for θ satisfies

$$P\Big\{\bar X - 1.96\frac{\sigma}{\sqrt n} < \theta < \bar X + 1.96\frac{\sigma}{\sqrt n}\Big\} = 0.95$$

$$P\Big\{2\Big(\bar X - 1.96\frac{\sigma}{\sqrt n}\Big) + 3 < 2\theta + 3 < 2\Big(\bar X + 1.96\frac{\sigma}{\sqrt n}\Big) + 3\Big\} = 0.95$$

The 95% confidence limits for 2θ + 3 are

$$2\bar x + 3 \pm 1.96\frac{2\times 5}{\sqrt{50}} = 83 \pm 1.96\frac{10}{\sqrt{50}}$$

The 95% confidence interval for 2θ + 3 is (80.2281, 85.7719).
For every $T_\theta$, $\lambda_1(\alpha)$ and $\lambda_2(\alpha)$ can be chosen in a number of ways. However, one would like to choose $\lambda_1(\alpha)$ and $\lambda_2(\alpha)$ such that $\bar\theta(X) - \underline\theta(X)$ is minimum; this yields the (1 − α) level shortest confidence interval based on $T_\theta$.

Let $T_\theta = t(X, \theta)$ be a function of a sufficient statistic. A random variable $T_\theta$ which is a function of $(X_1, X_2, \cdots, X_n)$ and θ, and whose distribution is independent of θ, is called a pivot.
Problem 7.15 Let $X_1, X_2, \cdots, X_n$ be a random sample from $N(\theta, \sigma^2)$ where σ² is known. Obtain the (1 − α) level shortest confidence interval for θ.

Solution: Consider the statistic $T_\theta = \frac{\bar X - \theta}{\sigma/\sqrt n}$, which is a pivot: $\bar X$ is sufficient and $T_\theta \sim N(0, 1)$, i.e., the distribution of $T_\theta$ is independent of θ. The (1 − α) level confidence interval for θ follows from

$$P_\theta\{a < T_\theta < b\} = 1 - \alpha,\qquad P_\theta\Big\{\bar X - b\frac{\sigma}{\sqrt n} < \theta < \bar X - a\frac{\sigma}{\sqrt n}\Big\} = 1 - \alpha$$

Minimize $L = \frac{\sigma}{\sqrt n}(b - a)$ subject to

$$\int_a^b\varphi(x)\,dx = 1 - \alpha \tag{7.6}$$

$$\frac{\partial L}{\partial a} = \frac{\sigma}{\sqrt n}\Big(\frac{db}{da} - 1\Big) = 0 \Rightarrow \frac{db}{da} = 1$$

Differentiating equation (7.6) with respect to a,

$$\varphi(b)\frac{db}{da} - \varphi(a) = 0 \Rightarrow \frac{db}{da} = \frac{\varphi(a)}{\varphi(b)}$$

Thus $\frac{\varphi(a)}{\varphi(b)} = 1$, i.e., $\varphi(a) = \varphi(b)$, which holds when a = b or a = −b. If a = b, then $\int_a^b\varphi(x)\,dx = 0$, which does not satisfy $\int_a^b\varphi(x)\,dx = 1 - \alpha$. If a = −b, then $\int_{-b}^{b}\varphi(x)\,dx = 1 - \alpha$. Thus the shortest length confidence interval based on $T_\theta$ is the equal two tails confidence interval. The (1 − α) level confidence interval for θ is

$$\Big(\bar X - \frac{\sigma}{\sqrt n}z_{\frac{\alpha}{2}},\ \bar X + \frac{\sigma}{\sqrt n}z_{\frac{\alpha}{2}}\Big)$$

where $z_{\frac{\alpha}{2}}$ is the upper $\frac{\alpha}{2}$ point of N(0, 1). The shortest length of this interval is $L = 2z_{\frac{\alpha}{2}}\frac{\sigma}{\sqrt n}$.
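The conclusion that equal tails minimize the length can be checked numerically by scanning over the probability mass placed in the lower tail; a sketch:

```python
import numpy as np
from scipy.stats import norm

# Scan the split of tail mass: put `left` below a and alpha - left above b,
# keeping coverage 1 - alpha, and compare interval lengths b - a.
alpha = 0.05
left = np.linspace(1e-4, alpha - 1e-4, 2001)
a = norm.ppf(left)
b = norm.ppf(1 - (alpha - left))
lengths = b - a
i = np.argmin(lengths)
print(left[i], lengths[i], 2 * norm.ppf(1 - alpha / 2))  # minimum at left ~ alpha/2
```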
Problem 7.16 Let $X_1, X_2, \cdots, X_n$ be a sample from U(0, θ). Find the (1 − α) level shortest confidence interval for θ.

Solution: Let $T = \max_{1\le i\le n}\{X_i\}$. The pdf of T is

$$p_\theta(t) = \begin{cases}\frac{nt^{n-1}}{\theta^n} & 0 < t < \theta\\ 0 & \text{otherwise}\end{cases}$$

The pdf of $Y = \frac{T}{\theta}$ is

$$p(y) = \begin{cases}ny^{n-1} & 0 < y < 1\\ 0 & \text{otherwise}\end{cases}$$

The statistic $Y = \frac{T}{\theta}$ is a pivot. The (1 − α) level confidence interval for θ follows from

$$P\{a < Y < b\} = 1 - \alpha \Rightarrow P\Big\{a < \frac{T}{\theta} < b\Big\} = 1 - \alpha \Rightarrow P\Big\{\frac{T}{b} < \theta < \frac{T}{a}\Big\} = 1 - \alpha$$

The length of the interval is $L = \big(\frac{1}{a} - \frac{1}{b}\big)T$, subject to

$$\int_a^b ny^{n-1}\,dy = [y^n]_a^b = 1 - \alpha,\qquad b^n - a^n = 1 - \alpha$$

Since L decreases as a and b increase, the length is minimized by taking b as large as possible, i.e., b = 1; then $a^n = 1 - (1 - \alpha) = \alpha$, $a = \alpha^{\frac{1}{n}}$, and the shortest confidence interval based on T is

$$\Big(T,\ \frac{T}{\alpha^{\frac{1}{n}}}\Big)$$
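A quick Monte Carlo check of the coverage of $(T, T/\alpha^{1/n})$, with hypothetical values of θ and n:

```python
import numpy as np

# Empirical coverage of the interval (T, T / alpha^{1/n}) for U(0, theta);
# theta, n, and the number of replications are hypothetical choices.
rng = np.random.default_rng(2)
theta, n, alpha, reps = 3.0, 5, 0.10, 100_000
T = rng.uniform(0, theta, size=(reps, n)).max(axis=1)
covered = (T < theta) & (theta < T / alpha**(1 / n))
print(covered.mean())   # close to 0.90
```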
Problem 7.17 Let $X_1, X_2, \cdots, X_n$ be a sample drawn from a normal population $N(\theta, \sigma^2)$ where σ² is unknown. Obtain the (1 − α) level shortest confidence interval for θ.

Solution: The statistic $T_\theta = \frac{\bar X - \theta}{S/\sqrt n}$ is a pivot, where $S^2 = \frac{1}{n-1}\sum_{i=1}^n(X_i - \bar X)^2$: $\bar X$ is sufficient and the distribution of $T_\theta$ is independent of the parameter θ. $T_\theta$ follows the t distribution with (n − 1) degrees of freedom, with pdf $p_{n-1}(t)$. The (1 − α) level confidence interval for θ follows from

$$P_\theta\{a < T_\theta < b\} = 1 - \alpha,\qquad P_\theta\Big\{\bar X - b\frac{S}{\sqrt n} < \theta < \bar X - a\frac{S}{\sqrt n}\Big\} = 1 - \alpha$$

Minimizing $L = \frac{S}{\sqrt n}(b - a)$ subject to $\int_a^b p_{n-1}(t)\,dt = 1 - \alpha$,

$$\frac{dL}{da} = \Big(\frac{db}{da} - 1\Big)\frac{S}{\sqrt n}\quad\text{and}\quad p_{n-1}(b)\frac{db}{da} - p_{n-1}(a) = 0$$

$$\Rightarrow \frac{dL}{da} = \Big(\frac{p_{n-1}(a)}{p_{n-1}(b)} - 1\Big)\frac{S}{\sqrt n}$$

As in Problem 7.15, the minimum occurs at a = −b. The (1 − α) level confidence interval is the equal two tails confidence interval for θ:

$$\Big(\bar X - t_{\frac{\alpha}{2}}(n-1)\frac{S}{\sqrt n},\ \bar X + t_{\frac{\alpha}{2}}(n-1)\frac{S}{\sqrt n}\Big)$$

where $b = t_{\frac{\alpha}{2}}(n-1)$ is given by $\int_b^\infty p_{n-1}(t)\,dt = \frac{\alpha}{2}$ and a = −b. The shortest length of this interval is $L = 2t_{\frac{\alpha}{2}}(n-1)\frac{S}{\sqrt n}$.
Problem 7.18 Let $X_1, X_2, \cdots, X_n$ be a random sample drawn from a normal population with mean θ and variance σ². Find the (1 − α) level shortest confidence interval for σ² when (i) θ is known and (ii) θ is unknown.

Solution: Case (i) If θ is known, the statistic

$$T_{\sigma^2} = \frac{\sum_{i=1}^n(X_i - \theta)^2}{\sigma^2} \sim \chi^2\ \text{with } n\ \text{degrees of freedom, with pdf } p_n(\chi^2)$$

From $P\{a < T_{\sigma^2} < b\} = 1 - \alpha$, the confidence interval is $\big(\sum(X_i - \theta)^2/b,\ \sum(X_i - \theta)^2/a\big)$, with length

$$L = \Big(\frac{1}{a} - \frac{1}{b}\Big)\sum(X_i - \theta)^2$$

Minimizing L subject to $\int_a^b p_n(\chi^2)\,d\chi^2 = 1 - \alpha$,

$$\frac{dL}{da} = \Big(-\frac{1}{a^2} + \frac{1}{b^2}\frac{db}{da}\Big)\sum(X_i - \theta)^2\quad\text{and}\quad p_n(b)\frac{db}{da} - p_n(a) = 0,\ \text{i.e., } \frac{db}{da} = \frac{p_n(a)}{p_n(b)}$$

$$\frac{dL}{da} = \Big(-\frac{1}{a^2} + \frac{1}{b^2}\frac{p_n(a)}{p_n(b)}\Big)\sum(X_i - \theta)^2$$

For the minimum, $\frac{dL}{da} = 0$:

$$\frac{1}{a^2} = \frac{1}{b^2}\frac{p_n(a)}{p_n(b)} \Rightarrow b^2p_n(b) = a^2p_n(a)$$

An iterative method is used to solve $b^2p_n(b) = a^2p_n(a)$ for a and b together with the coverage condition $\int_a^b p_n(\chi^2)\,d\chi^2 = 1 - \alpha$, where a < b and a ≠ b. If $\hat a$ and $\hat b$ are the solutions, then the shortest confidence interval for σ² is

$$\Bigg(\frac{\sum_{i=1}^n(X_i - \theta)^2}{\hat b},\ \frac{\sum_{i=1}^n(X_i - \theta)^2}{\hat a}\Bigg)$$

Case (ii) If θ is unknown, then

$$T_{\sigma^2} = \frac{\sum_{i=1}^n(X_i - \bar X)^2}{\sigma^2} = \frac{(n-1)S^2}{\sigma^2} \sim \chi^2(n-1),\quad\text{where } S^2 = \frac{1}{n-1}\sum_{i=1}^n(X_i - \bar X)^2$$

and the shortest confidence interval for σ² is

$$\Bigg(\frac{(n-1)S^2}{\hat b},\ \frac{(n-1)S^2}{\hat a}\Bigg)$$

where $\hat a$ and $\hat b$ now solve the corresponding equations with n − 1 degrees of freedom.
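The pair $(\hat a, \hat b)$ can be computed by solving the coverage condition together with $a^2p_n(a) = b^2p_n(b)$; a sketch using scipy, with the equal-tails points as starting values:

```python
from scipy.optimize import fsolve
from scipy.stats import chi2

# Solve the two conditions for the shortest interval:
# coverage  P{a < chi2(df) < b} = 1 - alpha,  and  a^2 p(a) = b^2 p(b).
def shortest_chi2(df, alpha):
    def eqs(v):
        a, b = v
        return [chi2.cdf(b, df) - chi2.cdf(a, df) - (1 - alpha),
                a**2 * chi2.pdf(a, df) - b**2 * chi2.pdf(b, df)]
    a0, b0 = chi2.ppf(alpha / 2, df), chi2.ppf(1 - alpha / 2, df)  # start equal-tails
    return fsolve(eqs, [a0, b0])

a, b = shortest_chi2(df=19, alpha=0.05)
print(a, b)   # compare with the equal-tails points chi2.ppf([0.025, 0.975], 19)
```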
Problem 7.19 Let X and Y be two independent random variables that are $N(\theta, \sigma_1^2)$ and $N(\theta, \sigma_2^2)$ respectively. Obtain a (1 − α) level confidence interval for the ratio $\frac{\sigma_2^2}{\sigma_1^2}$ by considering a random sample $X_1, X_2, \cdots, X_{n_1}$ of size $n_1 \ge 2$ from the distribution of X and a random sample $Y_1, Y_2, \cdots, Y_{n_2}$ of size $n_2 \ge 2$ from the distribution of Y.

Solution: Let $s_1^2 = \frac{1}{n_1}\sum_{i=1}^{n_1}(X_i - \bar X)^2$ and $s_2^2 = \frac{1}{n_2}\sum_{i=1}^{n_2}(Y_i - \bar Y)^2$ be the variances of the two samples. The independent random variables $\frac{n_1s_1^2}{\sigma_1^2}$ and $\frac{n_2s_2^2}{\sigma_2^2}$ have χ² distributions with $n_1 - 1$ and $n_2 - 1$ degrees of freedom respectively, so

$$F = \frac{n_1s_1^2/[\sigma_1^2(n_1 - 1)]}{n_2s_2^2/[\sigma_2^2(n_2 - 1)]} \sim F(n_1 - 1, n_2 - 1)$$

The (1 − α) level confidence interval for $\frac{\sigma_2^2}{\sigma_1^2}$ follows from

$$P\Big\{a < \frac{n_1s_1^2/[\sigma_1^2(n_1-1)]}{n_2s_2^2/[\sigma_2^2(n_2-1)]} < b\Big\} = 1 - \alpha,\qquad P\Big\{a\frac{n_2s_2^2(n_1-1)}{n_1s_1^2(n_2-1)} < \frac{\sigma_2^2}{\sigma_1^2} < b\frac{n_2s_2^2(n_1-1)}{n_1s_1^2(n_2-1)}\Big\} = 1 - \alpha$$

The (1 − α) level confidence interval for $\frac{\sigma_2^2}{\sigma_1^2}$ is

$$\Bigg(a\frac{n_2s_2^2(n_1-1)}{n_1s_1^2(n_2-1)},\ b\frac{n_2s_2^2(n_1-1)}{n_1s_1^2(n_2-1)}\Bigg)$$
Let $T = \sum_{i=1}^n X_i$; then $T \sim G(n, \frac{1}{\theta})$ with pdf

$$p_\theta(t) = \begin{cases}\frac{\theta^n e^{-\theta t}t^{n-1}}{\Gamma n} & 0 < t < \infty\\ 0 & \text{otherwise}\end{cases}$$

The quantity $Y = 2\theta T$ follows a χ² distribution with 2n degrees of freedom:

$$p(y) = \begin{cases}\frac{1}{2^n\Gamma n}e^{-\frac{y}{2}}y^{\frac{2n}{2} - 1} & 0 < y < \infty\\ 0 & \text{otherwise}\end{cases}$$

Thus $P_\theta\{a < 2\theta T < b\} = 1 - \alpha$ gives the (1 − α) level confidence interval $\big(\frac{a}{2t}, \frac{b}{2t}\big)$ for θ, where a is given by

$$\int_0^a p_{2n}(\chi^2)\,d\chi^2 = \frac{\alpha}{2}$$

and b is given by

$$\int_b^\infty p_{2n}(\chi^2)\,d\chi^2 = \frac{\alpha}{2}$$
Ten electronic components are placed on test and their observed times to failure are 607.5, 1947.0, 37.6, 129.9, 409.5, 529.5, 109.0, 582.4, 499.0, 188.1 hours respectively. Find the 90% confidence interval for θ, the 90% confidence interval for the mean time to failure, and the 90% confidence interval for the probability that a component survives a 100-hour period.

Solution: Using the χ²(2n) pivot derived above, $\sum x_i = 5039.5$ and 2n = 20 degrees of freedom. From the χ² table, $\chi^2_{0.05} = 10.9$ and $\chi^2_{0.95} = 31.4$. The 90% confidence interval for θ is

$$\Big(\frac{10.9}{2\times 5039.5},\ \frac{31.4}{2\times 5039.5}\Big) = (0.00108,\ 0.00312)$$

The mean time to failure is $\frac{1}{\theta}$. The 90% confidence interval for the mean time to failure lies between $\frac{1}{0.00312} = 320.5$ hours and $\frac{1}{0.00108} = 925.9$ hours.

The probability that one of these components will work at least t hours without failure is $P\{X > t\} = e^{-\theta t}$. The 90% confidence interval for this probability over a 100-hour period lies between $e^{-100\times 0.00312} = 0.732$ and $e^{-100\times 0.00108} = 0.898$.
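The same interval follows from exact χ² quantiles; a sketch (the text uses the rounded table values 10.9 and 31.4):

```python
from scipy.stats import chi2

# Exponential life-test interval for theta via the chi-square(2n) pivot.
t, n, alpha = 5039.5, 10, 0.10
lo = chi2.ppf(alpha / 2, 2 * n) / (2 * t)
hi = chi2.ppf(1 - alpha / 2, 2 * n) / (2 * t)
print(lo, hi)              # about (0.00108, 0.00312)
print(1 / hi, 1 / lo)      # mean time to failure limits in hours
```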
Problem 7.22 Explain a method of constructing a large sample confidence interval for θ in Poisson(θ).

Solution: For large samples the variable

$$Z = \frac{\frac{\partial\log L}{\partial\theta}}{\sqrt{V\big[\frac{\partial\log L}{\partial\theta}\big]}} \sim N(0, 1)$$

Hence, using the distribution of Z, one can easily construct the confidence limits for θ for large samples. We have

$$\log L(\theta) = \sum x_i\log\theta - n\theta - \sum\log x_i!,\qquad \frac{\partial\log L(\theta)}{\partial\theta} = \frac{n\bar x}{\theta} - n$$

$$V\Big[\frac{\partial\log L(\theta)}{\partial\theta}\Big] = V\Big[\frac{n\bar X}{\theta} - n\Big] = \frac{1}{\theta^2}V\Big[\sum_{i=1}^n X_i\Big] = \frac{1}{\theta^2}\sum_{i=1}^n V[X] = \frac{1}{\theta^2}n\theta = \frac{n}{\theta}$$

$$\text{Thus } Z = \frac{\frac{n\bar x}{\theta} - n}{\sqrt{n/\theta}}$$

Setting $Z^2 = z_{\frac{\alpha}{2}}^2$ and solving the resulting quadratic in θ gives the large sample confidence limits for θ.
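Squaring, $Z^2 = \frac{n(\bar x - \theta)^2}{\theta}$, so $Z^2 = z^2$ gives the quadratic $n\theta^2 - (2n\bar x + z^2)\theta + n\bar x^2 = 0$, whose roots are the confidence limits. A sketch with hypothetical data:

```python
import numpy as np
from scipy.stats import norm

# Roots of n*theta^2 - (2n*xbar + z^2)*theta + n*xbar^2 = 0 give the
# large-sample score interval for a Poisson mean.  Data are hypothetical.
xbar, n, alpha = 3.2, 40, 0.05
z = norm.ppf(1 - alpha / 2)
roots = np.roots([n, -(2 * n * xbar + z**2), n * xbar**2])
print(np.sort(roots))   # lower and upper limits for theta
```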
In Bayes estimation, the pdf (pmf) π(θ) of θ on Ω ⊆ ℝ is known as the prior distribution. For a fixed θ ∈ Ω, the pdf (pmf) p(x | θ) represents the conditional pdf (pmf) of a random variable X given θ. If π(θ) is the pdf (pmf) of θ on Ω ⊆ ℝ, then the joint pdf (pmf) of θ on Ω and X is given by $p(x, \theta) = \pi(\theta)p(x \mid \theta)$.

The Bayes risk of a decision function d with respect to the loss function L(θ, d) is defined by $R(\pi, d) = E_\theta[R(\theta, d)]$. If θ on Ω is a continuous random variable and X is of the continuous type, then the Bayes risk with respect to the loss function L(θ, d) is

$$R(\pi, d) = \int R(\theta, d)\pi(\theta)\,d\theta = \int E_\theta[L(\theta, d(X))]\pi(\theta)\,d\theta = \int\Big[\int L(\theta, d(x))p(x\mid\theta)\,dx\Big]\pi(\theta)\,d\theta = \int\int L[\theta, d(x)]p(x\mid\theta)\pi(\theta)\,dx\,d\theta$$

If θ on Ω is a discrete variable with pmf π(θ) and X is of the discrete type, then

$$R(\pi, d) = \sum_\theta\sum_x L[\theta, d(x)]p(x\mid\theta)\pi(\theta)$$

The Bayes risk of a decision function d(X) with respect to a loss function L(θ, d(X)), in terms of the posterior p(θ | x), is

$$R(\pi, d) = \int g(x)\Big[\int L(\theta, d(x))p(\theta\mid x)\,d\theta\Big]dx\quad\text{or}\quad R(\pi, d) = \sum_x g(x)\Big[\sum_\theta L(\theta, d(x))p(\theta\mid x)\Big]$$

E[R(θ, d)] is the mean value of the risk R(θ, d). It is evident that a Bayes estimator d*(X) minimizes this mean value of the risk R(θ, d).
Theorem 7.1 Let $X_1, X_2, \cdots, X_n$ be a random sample from the pdf p(x | θ) and π(θ) be a prior pdf of θ on Ω ⊆ ℝ. Let $L(\theta, d) = (\theta - d)^2$ be the loss function for estimating the parameter θ. The Bayes estimator of θ is given by $d^\star(X) = E[\theta \mid X = x]$.

Proof: The Bayes risk of a decision function d(x) with respect to the loss function $L(\theta, d) = [\theta - d]^2$ is

$$R(\pi, d) = \int g(x)\Big[\int[\theta - d(x)]^2p(\theta\mid x)\,d\theta\Big]dx$$

The Bayes estimator is a function d*(X) that minimizes R(π, d). Minimization of R(π, d) is the same as minimization of $\int[\theta - d(x)]^2p(\theta\mid x)\,d\theta$ for each x; differentiating with respect to d(x) and equating to zero gives $\int[\theta - d(x)]p(\theta\mid x)\,d\theta = 0$, i.e.,

$$d^\star(x) = E[\theta\mid X = x]$$

Remark 7.1 If $L(\theta, d) = |\theta - d|$ is the loss function for estimating the parameter θ, then the Bayes estimator of θ is the median of the posterior distribution of θ ∈ Ω ⊆ ℝ, since E|X − a|, as a function of a, is minimized when a* = median of the distribution of X. Also, a Bayes estimator need not be unbiased.
Proof: Let π*(θ) be the prior density corresponding to the Bayes estimator d*(X) with respect to the loss function L(θ, d). Then

$$\sup_{\theta\in\Omega}R(\theta, d^\star) \le \sup_{\theta\in\Omega}R(\theta, d)$$

for any other estimator d(X) of the parameter θ. Thus d*(X) is a minimax estimator. Note also that the mean square error of an estimator decomposes as $E_\theta[d(X) - \theta]^2 = V_\theta[d(X)] + [\text{bias}]^2$.
For X ∼ b(n, θ) with the uniform prior π(θ) = 1 on (0, 1), the marginal pmf of X is

$$g(x) = \binom{n}{x}\frac{\Gamma(x+1)\Gamma(n-x+1)}{\Gamma(n+2)} = \frac{n!}{x!(n-x)!}\frac{x!(n-x)!}{(n+1)!} = \begin{cases}\frac{1}{n+1} & x = 0, 1, 2, \cdots, n\\ 0 & \text{otherwise}\end{cases}$$

The posterior pdf of θ on Ω is

$$p(\theta\mid x) = \frac{p(x, \theta)}{g(x)} = \frac{\pi(\theta)p(x\mid\theta)}{g(x)} = (n+1)\binom{n}{x}\theta^x(1-\theta)^{n-x},\quad 0 < \theta < 1$$

The Bayes estimator is

$$d^\star(x) = E(\theta\mid X = x) = \int_0^1\theta p(\theta\mid x)\,d\theta = (n+1)\binom{n}{x}\int_0^1\theta^{x+2-1}(1-\theta)^{n-x+1-1}\,d\theta = (n+1)\frac{n!}{x!(n-x)!}\frac{(x+1)!(n-x)!}{(n+2)!} = \frac{x+1}{n+2}$$

The Bayes estimator of the parameter θ is $d^\star(X) = \frac{X+1}{n+2}$.
The Bayes risk of the estimator d*(X) with respect to the loss function $L(\theta, d^\star(x)) = [d^\star(x) - \theta]^2$ is

$$R(\pi, d^\star) = \int\int L[\theta, d^\star(x)]\pi(\theta)p(x\mid\theta)\,dx\,d\theta = \int_0^1\Big\{\sum_{x=0}^n[d^\star(x) - \theta]^2p(x\mid\theta)\Big\}d\theta = \int_0^1 E_\theta\Big[\frac{X+1}{n+2} - \theta\Big]^2 d\theta$$

$$= \frac{1}{(n+2)^2}\int_0^1 E_\theta\big[(X+1) - (n+2)\theta\big]^2\,d\theta = \frac{1}{(n+2)^2}\int_0^1\big[n\theta(1-\theta) + (1-2\theta)^2\big]\,d\theta$$

since $E_\theta[X] = n\theta$ and $V_\theta[X] = n\theta(1-\theta)$. Hence

$$R(\pi, d^\star) = \frac{1}{(n+2)^2}\Big[\frac{n}{6} + \frac{1}{3}\Big] = \frac{n+2}{6(n+2)^2} = \frac{1}{6(n+2)}$$
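The value $\frac{1}{6(n+2)}$ can be confirmed by averaging the risk function over a fine grid of θ values; a sketch:

```python
import numpy as np
from scipy.stats import binom

# Numerical check of R(pi, d*) = 1/(6(n+2)) for d*(x) = (x+1)/(n+2)
# under a U(0,1) prior: average the risk function over a theta grid.
n = 12
x = np.arange(n + 1)
d = (x + 1) / (n + 2)
thetas = np.linspace(1e-6, 1 - 1e-6, 20_001)
risk = [np.sum((d - th)**2 * binom.pmf(x, n, th)) for th in thetas]
print(np.mean(risk), 1 / (6 * (n + 2)))   # both about 1/84
```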
Let $X_1, X_2, \cdots, X_n$ be iid b(1, θ) random variables and let the prior pdf of θ on Ω be U(0, 1). Find the Bayes estimate of θ and θ(1 − θ) using the quadratic loss function.

Solution: The marginal pdf of $X_1, X_2, \cdots, X_n$ is

$$g(x_1, x_2, \cdots, x_n) = \int p(x_1, \cdots, x_n, \theta)\,d\theta = \int_0^1\pi(\theta)p(x_1, \cdots, x_n\mid\theta)\,d\theta = \int_0^1\theta^{\sum x_i}(1-\theta)^{n - \sum x_i}\,d\theta$$

$$= \int_0^1\theta^{t+1-1}(1-\theta)^{n-t+1-1}\,d\theta\ \text{ where } t = \sum x_i = \begin{cases}\frac{t!(n-t)!}{(n+1)!} & t = 0, 1, 2, \cdots, n\\ 0 & \text{otherwise}\end{cases}$$

The posterior pdf is

$$p(\theta\mid x_1, \cdots, x_n) = \frac{p(x_1, \cdots, x_n, \theta)}{g(x_1, \cdots, x_n)} = \frac{\pi(\theta)p(x_1, \cdots, x_n\mid\theta)}{g(x_1, \cdots, x_n)} = \begin{cases}\frac{(n+1)!}{t!(n-t)!}\theta^t(1-\theta)^{n-t} & 0 < \theta < 1\\ 0 & \text{otherwise}\end{cases}$$

$$d^\star(x_1, \cdots, x_n) = E[\theta\mid X_1 = x_1, \cdots, X_n = x_n] = \int_0^1\theta\frac{(n+1)!\theta^t(1-\theta)^{n-t}}{t!(n-t)!}\,d\theta = \frac{(n+1)!}{t!(n-t)!}\frac{(t+1)!(n-t)!}{(n+2)!} = \frac{t+1}{n+2} = \frac{\sum x_i + 1}{n+2}$$
Let $X_1, X_2, \cdots, X_n$ be a random sample from P(θ) and suppose the prior pdf $\pi(\theta) = e^{-\theta}$, θ > 0, is used. Find the Bayes estimate of (i) θ and (ii) $e^{-\theta}$.

Solution: The marginal pdf of $X_1, X_2, \cdots, X_n$ is

$$g(x_1, \cdots, x_n) = \int_0^\infty\pi(\theta)p(x_1, \cdots, x_n\mid\theta)\,d\theta = \int_0^\infty e^{-\theta}\frac{e^{-n\theta}\theta^{\sum x_i}}{x_1!\cdots x_n!}\,d\theta = \frac{1}{\prod_{i=1}^n x_i!}\int_0^\infty e^{-(n+1)\theta}\theta^{t+1-1}\,d\theta = \frac{t!}{\prod_{i=1}^n x_i!\,(n+1)^{t+1}}$$

where $t = \sum x_i$. The posterior pdf of θ on Ω is

$$p(\theta\mid x_1, \cdots, x_n) = \frac{\pi(\theta)p(x_1, \cdots, x_n\mid\theta)}{g(x_1, \cdots, x_n)} = \frac{(n+1)^{t+1}e^{-(n+1)\theta}\theta^t}{t!},\quad 0 < \theta < \infty$$

(i) The posterior is a G(t + 1, n + 1) distribution, so the Bayes estimate of θ is $E[\theta\mid x] = \frac{t+1}{n+1}$.

(ii) The Bayes estimate of $e^{-\theta}$ is

$$d^\star(x_1, \cdots, x_n) = \int_0^\infty e^{-\theta}\frac{(n+1)^{t+1}e^{-(n+1)\theta}\theta^t}{t!}\,d\theta = \frac{(n+1)^{t+1}}{t!}\int_0^\infty e^{-(n+2)\theta}\theta^{t+1-1}\,d\theta = \frac{(n+1)^{t+1}}{t!}\frac{\Gamma(t+1)}{(n+2)^{t+1}} = \Big(\frac{n+1}{n+2}\Big)^{t+1}$$
Problem 7.26 X ∼ b(n, θ) and suppose that the prior pdf of θ on Ω is U(0, 1). Find the Bayes estimate of θ. Using the loss function $L(\theta, d) = \frac{(\theta - d)^2}{\theta(1-\theta)}$, find the Bayes minimax estimate of θ.

Solution: As in Problem 7.23, the Bayes estimate of θ is $d^\star(x) = \frac{x+1}{n+2}$. The risk of d* with respect to the loss function L(θ, d*) is

$$R(\pi, d^\star) = \int_0^1\pi(\theta)\Big[\int L(\theta, d^\star(x))p(x\mid\theta)\,dx\Big]d\theta = \int_0^1\Big[\sum_{x=0}^n\frac{[d^\star(x) - \theta]^2}{\theta(1-\theta)}p(x\mid\theta)\Big]d\theta$$

$$= \int_0^1 E_\theta\Big[\frac{X+1}{n+2} - \theta\Big]^2\frac{1}{\theta(1-\theta)}\,d\theta = \frac{1}{(n+2)^2}\int_0^1\frac{n\theta(1-\theta) + (1-2\theta)^2}{\theta(1-\theta)}\,d\theta$$

$$= \frac{1}{(n+2)^2}\int_0^1\Big[(n-4) + \frac{1}{\theta(1-\theta)}\Big]d\theta = \frac{n-4}{(n+2)^2} + \frac{1}{(n+2)^2}\int_0^1\Big[\frac{1}{\theta} + \frac{1}{1-\theta}\Big]d\theta$$
If $X_1, X_2, \cdots, X_n$ is a random sample from an exponential population with pdf $\theta e^{-\theta x}$, x > 0, and the prior pdf of θ is $\pi(\theta) = e^{-\theta}$, then with $t = \sum x_i$ the posterior pdf of θ on Ω is

$$p(\theta\mid x_1, \cdots, x_n) = \frac{\pi(\theta)p(x_1, \cdots, x_n\mid\theta)}{g(x_1, \cdots, x_n)} = \frac{(1+t)^{n+1}}{n!}e^{-\theta(1+t)}\theta^n,\quad 0 < \theta < \infty$$

The Bayes estimate of θ is

$$d^\star(x) = \int_0^\infty\theta\frac{(1+t)^{n+1}}{n!}e^{-\theta(1+t)}\theta^n\,d\theta = \frac{(1+t)^{n+1}}{n!}\frac{(n+1)!}{(1+t)^{n+2}} = \frac{n+1}{1+t} = \frac{n+1}{\sum x_i + 1}$$
With a Beta(a, b) prior for θ and $t = \sum x_i$ successes in n Bernoulli trials, the posterior pdf is

$$p(\theta\mid x_1, \cdots, x_n) = \frac{\Gamma(n+a+b)}{\Gamma(a+t)\Gamma(n+b-t)}\theta^{a+t-1}(1-\theta)^{n+b-t-1},\quad 0 < \theta < 1$$

The Bayes estimate of θ is

$$d^\star(x) = \frac{\Gamma(n+a+b)}{\Gamma(a+t)\Gamma(n+b-t)}\int_0^1\theta^{a+1+t-1}(1-\theta)^{n+b-t-1}\,d\theta = \frac{a+t}{n+a+b} = \frac{\sum x_i + a}{n+a+b}$$
Problem 7.29 Let the prior pdf of θ on Ω be N(0, 1). Let $X_1, X_2, \cdots, X_n$ be a random sample drawn from a normal population with mean θ and variance 1. Find the Bayes estimate of θ and the Bayes risk with respect to the loss function $L[\theta, d] = [\theta - d]^2$.

Solution: The prior pdf of θ on Ω is

$$\pi(\theta) = \begin{cases}\frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}\theta^2} & -\infty < \theta < \infty\\ 0 & \text{otherwise}\end{cases}$$

Completing the square in θ,

$$-\frac{1}{2}\theta^2 - \frac{1}{2}\sum(x_i - \theta)^2 = -\frac{1}{2}\sum x_i^2 + \frac{n^2\bar x^2}{2(n+1)} - \frac{n+1}{2}\Big[\theta - \frac{n\bar x}{n+1}\Big]^2$$

so the marginal pdf of $X_1, \cdots, X_n$ is

$$g(x_1, \cdots, x_n) = \frac{e^{-\frac{1}{2}\sum x_i^2 + \frac{n^2\bar x^2}{2(n+1)}}}{(2\pi)^{\frac{n+1}{2}}}\int_{-\infty}^\infty e^{-\frac{n+1}{2}[\theta - \frac{n\bar x}{n+1}]^2}\,d\theta$$

Putting $t = \sqrt{n+1}\big(\theta - \frac{n\bar x}{n+1}\big)$, so that $(n+1)\big(\theta - \frac{n\bar x}{n+1}\big)^2 = t^2$,

$$g(x_1, \cdots, x_n) = \frac{e^{-\frac{1}{2}\sum x_i^2 + \frac{n^2\bar x^2}{2(n+1)}}}{(2\pi)^{\frac{n+1}{2}}\sqrt{n+1}}\int_{-\infty}^\infty e^{-\frac{1}{2}t^2}\,dt = \frac{e^{-\frac{1}{2}\sum x_i^2 + \frac{n^2\bar x^2}{2(n+1)}}}{\sqrt{n+1}(2\pi)^{\frac{n}{2}}}$$

The posterior pdf of θ on Ω is

$$p(\theta\mid x_1, \cdots, x_n) = \frac{\pi(\theta)p(x_1, \cdots, x_n\mid\theta)}{g(x_1, \cdots, x_n)} = \frac{1}{\sqrt{\frac{2\pi}{n+1}}}e^{-\frac{n+1}{2}[\theta - \frac{n\bar x}{n+1}]^2},\quad -\infty < \theta < \infty,$$

and 0 otherwise; i.e., the posterior is $N\big(\frac{n\bar x}{n+1}, \frac{1}{n+1}\big)$. The Bayes estimate of θ is

$$d^\star(x) = E[\theta\mid X_1 = x_1, \cdots, X_n = x_n] = \int_{-\infty}^\infty\theta\,p(\theta\mid x_1, \cdots, x_n)\,d\theta$$

Putting $t = \sqrt{n+1}\big(\theta - \frac{n\bar x}{n+1}\big)$, i.e., $\theta = \frac{t}{\sqrt{n+1}} + \frac{n\bar x}{n+1}$ and $dt = \sqrt{n+1}\,d\theta$,

$$d^\star(x) = \frac{1}{\sqrt{n+1}}\int_{-\infty}^\infty\frac{te^{-\frac{t^2}{2}}}{\sqrt{2\pi}}\,dt + \frac{n\bar x}{n+1}\int_{-\infty}^\infty\frac{1}{\sqrt{2\pi}}e^{-\frac{t^2}{2}}\,dt = 0 + \frac{n\bar x}{n+1} = \frac{n\bar x}{n+1}$$

Since the posterior variance is $\frac{1}{n+1}$ for every x, the Bayes risk with respect to squared error loss is $R(\pi, d^\star) = \frac{1}{n+1}$.
Bayes confidence interval estimation takes the prior knowledge of the experiment into account to construct a confidence interval for a parameter θ. If the posterior pdf $p(\theta\mid x_1, x_2, \cdots, x_n)$ of θ on Ω is known, then one can easily find functions $l_1(x)$ and $l_2(x)$ such that

$$\int_{l_1(x)}^{l_2(x)}p(\theta\mid x_1, x_2, \cdots, x_n)\,d\theta = 1 - \alpha$$

or, in the discrete case,

$$\sum_{\theta = l_1(x)}^{l_2(x)}p(\theta\mid x_1, x_2, \cdots, x_n) = 1 - \alpha$$
Problem 7.30 Let $X_1, X_2, \cdots, X_n$ be iid b(1, θ) random variables and let the prior pdf π(θ) of θ on Ω be U(0, 1). Find the (1 − α) level Bayes confidence interval for θ.

Solution: As in Example 7.24,

$$p(\theta\mid x_1, x_2, \cdots, x_n) = \begin{cases}\frac{1}{\beta(t+1, n-t+1)}\theta^t(1-\theta)^{n-t} & 0 < \theta < 1,\ t = \sum x_i\\ 0 & \text{otherwise}\end{cases}$$

Choose $l_1(x)$ and $l_2(x)$ with equal tails: $P\{\theta \ge l_2(x)\} = \frac{\alpha}{2}$, i.e.,

$$\int_{l_2(x)}^1\frac{1}{\beta(t+1, n-t+1)}\theta^t(1-\theta)^{n-t}\,d\theta = \frac{\alpha}{2} \tag{7.7}$$

and $P\{\theta \le l_1(x)\} = \frac{\alpha}{2}$, i.e.,

$$\int_0^{l_1(x)}\frac{1}{\beta(t+1, n-t+1)}\theta^t(1-\theta)^{n-t}\,d\theta = \frac{\alpha}{2} \tag{7.8}$$

Solving the equations (7.7) and (7.8) for θ, one gets the (1 − α) level Bayes confidence interval $(\underline\theta(x), \bar\theta(x))$ for θ.
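Since the posterior is Beta(t + 1, n − t + 1), equations (7.7) and (7.8) are solved directly by Beta quantiles; a sketch with hypothetical counts:

```python
from scipy.stats import beta

# Equal-tails Bayes interval from the Beta(t+1, n-t+1) posterior;
# n and t below are hypothetical values for illustration.
n, t, alpha = 10, 4, 0.10
post = beta(t + 1, n - t + 1)
print(post.ppf(alpha / 2), post.ppf(1 - alpha / 2))
```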
Problem 7.31 Let $X_1, X_2, \cdots, X_n$ be a random sample drawn from a normal population N(θ, 1), θ ∈ Ω ⊆ ℝ, and let the prior pdf π(θ) of θ on Ω be N(0, 1). Find the (1 − α) level Bayes confidence interval for θ.

Solution: As in Problem 7.29, the posterior pdf of θ on Ω is

$$p(\theta\mid x_1, x_2, \cdots, x_n) \sim N\Big(\frac{n\bar x}{n+1}, \frac{1}{n+1}\Big)$$

Here θ is a random variable. If one selects the equal tails confidence interval, then

$$P\Big\{-z_{\frac{\alpha}{2}} < \Big(\theta - \frac{n\bar X}{n+1}\Big)\sqrt{n+1} < z_{\frac{\alpha}{2}}\Big\} = 1 - \alpha$$

$$P\Big\{\frac{n\bar X}{n+1} - \frac{z_{\frac{\alpha}{2}}}{\sqrt{n+1}} < \theta < \frac{n\bar X}{n+1} + \frac{z_{\frac{\alpha}{2}}}{\sqrt{n+1}}\Big\} = 1 - \alpha$$

The (1 − α) level Bayes confidence interval for θ is

$$\Big(\frac{n\bar x}{n+1} - \frac{z_{\frac{\alpha}{2}}}{\sqrt{n+1}},\ \frac{n\bar x}{n+1} + \frac{z_{\frac{\alpha}{2}}}{\sqrt{n+1}}\Big)$$
$$P\{\theta \ge l_2(x)\} = \frac{\alpha}{2}:\quad \int_{l_2(x)}^\infty p(\theta\mid x_1, x_2, \cdots, x_n)\,d\theta = \frac{\alpha}{2} \tag{7.9}$$

$$P\{\theta \le l_1(x)\} = \frac{\alpha}{2}:\quad \int_0^{l_1(x)}p(\theta\mid x_1, x_2, \cdots, x_n)\,d\theta = \frac{\alpha}{2} \tag{7.10}$$

Solving the equations (7.9) and (7.10) for θ, one gets the (1 − α) level Bayes confidence interval $(\underline\theta(X), \bar\theta(X))$ for θ.
Problem 7.33 Let $X_1, X_2, \cdots, X_n$ be a sample drawn from a normal population N(θ, 1). Assume that the prior pdf π(θ) on Ω is U(−1, 1). Find the (1 − α) level Bayes confidence interval for θ.

Solution: The pdf of $X_1, X_2, \cdots, X_n$ is

$$p(x_1, x_2, \cdots, x_n\mid\theta) = \begin{cases}\frac{1}{(\sqrt{2\pi})^n}e^{-\frac{1}{2}\sum(x_i - \theta)^2} & -\infty < x < \infty\\ 0 & \text{otherwise}\end{cases}$$

Extending the range of integration over θ to the whole real line and putting $t = (\theta - \bar x)\sqrt n$, the marginal pdf is

$$g(x_1, \cdots, x_n) = \frac{e^{-\frac{1}{2}\sum x_i^2 + \frac{n\bar x^2}{2}}}{2(2\pi)^{\frac{n}{2}}}\int_{-\infty}^\infty e^{-\frac{t^2}{2}}\frac{dt}{\sqrt n} = \frac{e^{-\frac{1}{2}\sum x_i^2 + \frac{n\bar x^2}{2}}\sqrt{2\pi}}{2\sqrt n(2\pi)^{\frac{n}{2}}}$$

$$p(\theta\mid x_1, x_2, \cdots, x_n) = \frac{\pi(\theta)p(x_1, \cdots, x_n\mid\theta)}{g(x_1, \cdots, x_n)} = \frac{\frac{1}{2}e^{-\frac{1}{2}\sum(x_i - \theta)^2}\cdot 2\sqrt n(2\pi)^{\frac{n}{2}}}{(\sqrt{2\pi})^n e^{-\frac{1}{2}\sum x_i^2 + \frac{n}{2}\bar x^2}\sqrt{2\pi}} = \frac{\sqrt n}{\sqrt{2\pi}}e^{-\frac{n}{2}[\theta - \bar x]^2},\quad -\infty < \theta < \infty$$

i.e., $\theta \sim N\big(\bar x, \frac{1}{n}\big)$. Then

$$P\{a < Z < b\} = 1 - \alpha\quad\text{where } Z = \frac{\theta - \bar x}{1/\sqrt n} \sim N(0, 1)$$

$$P\{-z_{\frac{\alpha}{2}} < Z < z_{\frac{\alpha}{2}}\} = 1 - \alpha,\qquad P\Big\{\bar X - \frac{z_{\alpha/2}}{\sqrt n} < \theta < \bar X + \frac{z_{\alpha/2}}{\sqrt n}\Big\} = 1 - \alpha$$

The (1 − α) level Bayes confidence interval for θ is

$$\Big(\bar x - \frac{z_{\alpha/2}}{\sqrt n},\ \bar x + \frac{z_{\alpha/2}}{\sqrt n}\Big)$$
Problems

7.2 Explain the shortest confidence interval. Also obtain the (1 − α) level shortest confidence interval for θ, using a random sample of size n from

$$p(x\mid\theta) = \begin{cases}e^{-(x-\theta)} & x \ge \theta,\ \theta > 0\\ 0 & \text{otherwise}\end{cases}$$

7.3 Let $X_1, X_2, \cdots, X_n$ be a random sample from U(0, θ). Find the shortest-length confidence interval for θ at level (1 − α).

7.6 Obtain a (1 − α) coefficient confidence interval for θ based on a random sample from

$$p(x\mid\theta) = \begin{cases}\frac{1}{\theta}e^{-\frac{x}{\theta}} & x \ge 0,\ \theta > 0\\ 0 & \text{otherwise}\end{cases}$$

7.7 Obtain the (1 − α) level shortest confidence interval for θ using a random sample from N(θ, 1).

7.9 Obtain a confidence interval for the range of a rectangular distribution based on a random sample of size n.

7.10 The numbers of houses sold per week for 15 weeks by the Dinesh real estate firm were 3, 3, 4, 6, 2, 4, 4, 3, 1, 2, 0, 5, 7, 1, 4 respectively. Assuming these are the observed values of a random sample of size 15 from a Poisson random variable with parameter θ, compute 95% confidence limits for θ. Ans. (2.36, 4.18)
7.11 Show that in large samples, the 95% level confidence limits for the mean of a Poisson distribution are given by

$$\bar X + \frac{1.92}{n} \pm \sqrt{\frac{3.84\,\bar X}{n}}$$

where terms of order $n^{-2}$ are negligible.

7.12 Based on a random sample of size n from an exponential population with mean $\frac{1}{\theta}$, show that the 95% level confidence limits for large samples are given by

$$\theta = \frac{1 \pm \frac{1.96}{\sqrt n}}{\bar X}$$
7.13 Obtain the large sample confidence interval with confidence coefficient (1 − α)
for the parameter of Bernoulli distribution.
7.14 Examine the connection between shortest confidence interval and sufficient statis-
tics.
7.15 Given n independent observations from a Poisson distribution with mean λ, find the Bayes estimate of λ, assuming the prior distribution $\pi(\lambda) = e^{-\lambda}$, 0 < λ < ∞.
7.16 If d is a Bayes estimator of θ relative to some prior distributions and the risk
function does not depend on θ, show that d is minimax.
7.17 Define the terms: loss function, risk function and minimax estimator. Explain a
procedure of computing the minimax estimator under squared error loss function.
7.18 Explain Bayes and minimax estimation procedures. Find the Bayes estimate of θ using the quadratic loss function, given a random sample from $p(x\mid\theta) = \theta^x(1-\theta)^{1-x}$, x = 0, 1. The prior distribution of θ is π(θ) = 2θ, 0 ≤ θ ≤ 1.
7.19 Let $X_1, X_2, \cdots, X_n$ be a sample drawn from a normal population N(θ, 1). Assume that the prior pdf π(θ) on Ω is U(−1, 1). Find the (1 − α) level Bayesian confidence interval for θ. Also comment on your confidence interval.
7.22 A 90% confidence interval for θ based on a single observation X from the density function

$$p(x\mid\theta) = \begin{cases}\frac{1}{\theta} & 0 < x < \theta,\ \theta > 0\\ 0 & \text{otherwise}\end{cases}$$

is
(a) [X, 10X] (b) $\big(\frac{20X}{19}, 20X\big)$ (c) $\big(\frac{50X}{49}, 12.5X\big)$ (d) All the above Ans:(a)
7.23 The correct interpretation regarding the confidence interval $(T_1, T_2)$ of the parameter θ for a distribution F(x | θ), θ ∈ ℝ, with confidence coefficient 1 − α is
(a) θ belongs to $(T_1, T_2)$ with probability 1 − α
(b) $(T_1, T_2)$ covers the parameter θ with probability 1 − α
(c) $(T_1, T_2)$ includes the parameter θ with confidence coefficient 1 − α
(d) $\theta_0$ belongs to $(T_1, T_2)$ with confidence α, where θ(≠ θ₀) is the true value.
Ans:(c)
7.25 Let $X_1, X_2, \cdots, X_n$ be a sample from U(0, θ). The equal two tails (1 − α) level confidence interval for θ is
(a) $\Big(\frac{X_{(n)}}{(1-\alpha/2)^{1/n}},\ \frac{X_{(n)}}{(\alpha/2)^{1/n}}\Big)$
(b) $\Big(\frac{X_{(n)}}{(\alpha/2)^{1/n}},\ \frac{X_{(n)}}{(1-\alpha/2)^{1/n}}\Big)$
(c) $\Big(X_{(n)}(1-\alpha/2)^{1/n},\ X_{(n)}(\alpha/2)^{1/n}\Big)$
7.26 The joint pdf p(x, θ) can be expressed, for a given value of θ on Ω ⊆ ℝ and prior density π(θ), as
(a) $p(x, \theta) = p(x\mid\theta)\pi(\theta)$
(b) $p(x, \theta) = g(x)p(x\mid\theta)$
(c) $p(x, \theta) = \frac{g(\theta)}{p(\theta\mid x)}$
(d) $p(x, \theta) = \frac{\pi(\theta)}{p(x\mid\theta)}$ Ans:(a)

7.27 The joint pdf p(x, θ) can be expressed, for the given value X = x, where p(θ | x) is the posterior pdf of θ on Ω ⊆ ℝ and g(x) is the marginal density of X, as
7.36 Let X be a random sample from a Poisson distribution with parameter λ, where λ has a prior distribution f(z), with

$$f(z) = \begin{cases}e^{-z} & z > 0\\ 0 & \text{otherwise}\end{cases}$$

Under the squared error loss function, which of the following statements are correct?
(a) The Bayes estimator of $e^{\lambda}$ is $2^{X+1}$
(b) The posterior mean of λ is $\frac{X+1}{2}$
7.37 Let (X, Y) follow a bivariate normal distribution with mean vector (0, 0) and dispersion matrix

$$\Sigma = \begin{pmatrix}1 & \rho\\ \rho & 1\end{pmatrix}$$

where ρ ≠ 0. Suppose $Z = \frac{X-Y}{X+Y}\sqrt{\frac{1+\rho}{1-\rho}}$. Then which of the following statements are correct?
(a) $\sqrt{\frac{1+\rho}{1-\rho}}\times\frac{X-Y}{\sqrt{X^2+Y^2+2XY}}$ has a Student's t distribution
(b) $\sqrt{\frac{1-\rho}{1+\rho}}\times\frac{X-Y}{\sqrt{X^2+Y^2-2XY}}$ has a Student's t distribution
(c) Z is symmetric about zero
(d) E[Z] exists and is equal to zero Ans:(a) and (c)
7.38 Let X be a random sample from an exponential distribution with mean $\frac{1}{\lambda}$. If λ has a prior distribution with probability density function

$$g(\lambda) = \begin{cases}\lambda e^{-\lambda} & \lambda > 0\\ 0 & \lambda \le 0\end{cases}$$

then the Bayes estimator of $\frac{1}{\lambda}$ with respect to the squared error loss function is
(a) $\frac{2}{X+1}$ (b) $\frac{1}{X}$ (c) X (d) $\frac{X+1}{2}$ Ans:(a)
7.39 Let $X_1, X_2, \cdots, X_n$ for n ≥ 5 be a random sample from the distribution with pdf

$$f(x, \theta) = \begin{cases}e^{-(x-\theta)} & x > \theta\\ 0 & \text{otherwise}\end{cases}$$

7.42 Consider a sample of size one, say X = x, from a population with pdf

$$f_\theta(x) = \begin{cases}\frac{2}{\theta^2}(x - \theta) & \theta < x < 2\theta,\ \theta > 0\\ 0 & \text{otherwise}\end{cases}$$
7.43 Let $X_1, X_2, \cdots, X_n$ be iid N(µ, σ²) variables where µ and σ² are both unknown parameters. Consider a confidence interval for σ² of the form $I_{a,b} = \Big(\frac{\sum(X_i - \bar X)^2}{b}, \frac{\sum(X_i - \bar X)^2}{a}\Big)$, where b > a > 0. Let $G_n$ be the cumulative distribution function of a χ² random variable with n degrees of freedom. Which of the following statements are true?
(a) It is possible to find a 95% confidence interval of the form $I_{a,b}$ with ab = 1
(b) If $G_{n-1}(a) = 1 - G_{n-1}(b) = 0.025$, then it is the shortest 95% confidence interval
(c) If it is the shortest confidence interval, then a and b must satisfy the condition $b - a = (n-3)\log\frac{b}{a}$
(d) If $G_{n-1}(b) - G_{n-1}(a) = 0.95$, then the expected length of a 95% confidence interval of the form $I_{a,b}$ is $(n-1)\big(\frac{1}{a} - \frac{1}{b}\big)\sigma^2$ Ans:(b) and (c)
(d) The Bayes estimate of n has larger variance than the variance of the unbiased estimate $\frac{X}{p}$ Ans:(a), (b) and (d)