
Chapter 1 INTRODUCTION TO STATISTICS

WHAT IS STATISTICS?
Introduction
The word statistics appears to have been derived from the Latin word Status. Statistics was
originally simply the collection of numerical data by kings on different aspects useful to the state.
Today statistics is the scientific study of handling quantitative information. It embodies a
methodology for the collection, classification, description and interpretation of data obtained
through the conduct of surveys and experiments.

Population
The total group under discussion, or the group to which the results will be generalized, is called
the population. For example, the collection of height measurements of all college students is a
population.

Sample
A part of the population, selected in the belief that it will represent all the characteristics of the
population, is called a sample. For example, a sample of 10 students is selected from a population
of 100 students in order to analyse the average height of the students.

Meaning of Statistics
Nowadays the word statistics is used in two senses.
Singular Sense
In its singular sense, the word statistics means the science of statistics, which deals with statistical
methods.
Plural Sense
When used in the plural sense, the word statistics means numerical facts collected in any field of
study by using statistical methods.

Definition Of Statistics
Statistics are numerical statements of facts capable of analysis and interpretation, and the science of
statistics is the study of the principles and methods applied in collecting, presenting, analysing
and interpreting numerical data in any field of inquiry. OR
The science of facts and figures is called statistics. OR
(Croxton and Cowden)
Statistics is the collection, presentation, analysis and interpretation of numerical data.
OR
(Connor)
Statistics are measurements, enumerations or estimates of natural or social phenomena,
systematically arranged so as to exhibit their interrelations. OR
(Boddington)
Statistics is the science of estimates and probabilities. OR
(Achenwall)
Statistics are a collection of noteworthy facts concerning the state, both historical and descriptive. OR
Statistics is defined as the science of collecting, organizing, presenting, analysing and
interpreting numerical data for making better decisions.

Scope of Statistics

Statistics is the branch of mathematics that deals with data. Statistics uses data, collected
through systematic methods of data collection, and statistical theories are employed to arrive at
conclusions.

Main Branches of Statistics / Division of Statistics


The science of statistics may be classified into following two main branches.
1. Statistical Methods
In a statistical inquiry the first step is the collection of data. Most data are complex
and confused, so to obtain a clear picture we reduce the complexity and confusion of the
data; this is done by statistical methods. Statistical methods include all the rules of procedure
and techniques which are used in the collection, classification, tabulation, comparison and
interpretation of data. Simply put, statistical methods simplify complex numerical data.
2. Applied Statistics
Applied statistics deals with the application of statistical methods to specific problems; it has
two types.
a. Descriptive Applied Statistics
Descriptive applied statistics is applied to data which relate to present or past
information, for example a census in Pakistan, for reaching certain conclusions.
b. Scientific Applied Statistics

We apply general rules to quantitative data which are useful for forecasting, and for this
purpose we use scientific applied statistics.

Limitation of Statistics
1. Statistics has a handicap in dealing with qualitative observations or values.
2. Statistical results are true only on the average.
3. Statistics does not study qualitative phenomena.
4. Statistics deals only with facts which can be numerically expressed; for example love, hate,
beauty, poverty and health cannot be measured.
5. Sufficient care needs to be exercised in the collection, analysis and interpretation of data,
otherwise statistical results may be false.

Use or Functions of Statistics


1. Statistics simplifies complicated data.
2. Statistics tests the laws of other sciences.
3. Statistics helps a lot in policy making.
4. Statistics helps in forecasting.
5. Statistics helps in administration.
6. Statistics helps in the proper and efficient planning of a statistical inquiry in any field of study.

Relationship of Statistics with other Sciences

Nowadays statistics, statistical data and statistical methods are being applied increasingly in
agriculture, economics, biology, business, physics, chemistry, astronomy, medicine, administration,
education, mathematics, meteorology and the physical sciences.
1. Statistics and Administration
Statistics plays an important role in the field of administration and management by providing
measures of the performance of employees. Statistical data are widely used in taking
administrative decisions. For example, suppose the authorities want to raise the pay scales of
employees in view of an increase in the cost of living; statistical methods will be used to calculate
the rise in the cost of living.
2. Statistics and Agriculture
Agricultural statistics cover a wide field. They include statistics of land utilization, crop
production, and prices and wages in agriculture. Agriculture is greatly benefited by statistical
methods.
3. Statistics and Medicine
Statistics plays an important role in the field of medicine, for example in testing the effectiveness
of different types of medicines. Vital statistics may be defined as the science which deals with the
application of numerical methods to vital facts. It is a part of the broader field of demography.
Demography is the statistical study of all phases of human life relating to vital facts such as births,
deaths, age, marriages, religion, social affairs, education and sanitation. Vital statistics is a part of
demography and comprises vital data.
4. Statistics and Mathematics
All statistical methods have their foundations in mathematics; no calculation work can be done
without the help of mathematics. Therefore, mathematics is applied widely in statistics. This
branch of statistics is called mathematical statistics, and the two subjects are closely interrelated.
5. Statistics and Physical Sciences
The physical sciences depend greatly upon the science of statistics for analysing data and testing
their significance when drawing results. Statistical methods are used in physical sciences such as
physics, chemistry, geology, etc.
6. Statistics and Economics
Important phenomena in all branches of economics can be described and compared with the help of
statistics. Statistics of production describe the wealth of a nation and compare it year after year,
showing thereby the effect of changing economic policies and other factors on the level of production.
7. Statistics Helps in Forecasting
By estimating the variables that existed in the past, forecasting about times to come can
easily be done. Statistics helps in forecasting future events; the use of statistical techniques
such as extrapolation and time series analysis helps in saying something about the future course of
events. Statistics also plays an important role in the fields of astronomy, transportation,
communication, public health, teaching methods, engineering, psychology, meteorology and weather forecasting.
8. Statistics and Business
Statistics plays an important role in business. It helps businessmen to plan production
according to the tastes of the customers, and the quality of the products can also be checked by
using statistical methods.

Characteristic of Statistics
Statistics have the following characteristics.
1. Statistics are aggregates of facts
Statistics are a number of facts. A single fact, even if numerically expressed, cannot be called
statistics. A single death or accident does not constitute statistics, but a number of deaths or
accidents are statistics.
2. Statistics are affected by many causes
Statistics are aggregates of such facts only as grow out of a variety of circumstances; their size
and shape at any particular moment is the result of the action and interaction of many forces.
3. Statistics are numerically expressed
In statistics, we study quantitative expressions and not qualitative like old, young, good, bad etc.
4. Statistics are estimated according to a reasonable standard of accuracy
What standard of accuracy is to be regarded as reasonable will depend upon the aims and
objects of the inquiry, and whatever standard of accuracy is once adopted must be uniformly
maintained throughout the inquiry.
5. Statistics are collected in a systematic manner
Statistics collected in a haphazard manner can not be accurate.

Statistical Inquiry
An inquiry into any problem which is carried out with the help of statistical principles and methods
is called a statistical inquiry.
Steps in a Statistical Inquiry
The following steps are involved in a statistical inquiry requiring the collection of data.
1. Planning inquiry.
2. Collection of data.
3. Editing the collected data.
4. Tabulating the data.
5. Analyzing the data by calculating statistical measures.

Planning of a Statistical Inquiry


The following factors are considered in planning a statistical inquiry.
1. Object and Scope of Inquiry
2. Nature and Type of Inquiry
i. Primary or Secondary
ii. Census or Samples
iii. Open and Secret
iv. Direct or Indirect
v. Regular or Adhoc
vi. Initial or Receptive
vii. Official, Semi Official or Non Official
3. Statistical Unit
The unit of measurement applied in the collection of data is called the statistical unit. For
example, if we record the rice or wheat crop per acre, the acre is the statistical unit. There are
two types of statistical units.
i. Physical Units
ii. Arbitrary Units
Requisites (advantages) of a Statistical Unit
i. It fulfils the object of the inquiry.
ii. It is stable.
iii. It is homogeneous.
iv. It is expressed in clear (obvious) words.
4. Degree of Accuracy
The decision about the standard of accuracy to be attained, made in view of the nature of the
inquiry and the purpose of the investigation, is called the degree of accuracy.

Variable
A measurable quantity which can vary (differ) from one individual to another or one object to
another object is called variable. For e.g. height of students, weight of children. It is denoted by
the letters of alphabet e.g. x, y, z etc.
Types of Variables
There are several types of variables.
1. Continuous Variable
A variable which can take any value, including fractional values, b/w two limits is called a
continuous variable. Or
A variable which can assume any value within a given range is called a continuous variable, for
example the age of a person, the speed of a car, the temperature at a place, the income of a person,
the height of a plant, the lifetime of a T.V. tube, etc.
2. Discrete Variable

A variable which can assume only some specific values within a given range is called a discrete
variable, for example the number of students in a class, the number of houses in a street, the
number of children in a family, etc. Its values cannot occur in fractions.
3. Quantitative Variable
A characteristic which varies only in magnitude from one individual to another is called a
quantitative variable; it can be measured. Or, a characteristic expressed by means of quantitative
terms is known as a quantitative variable, for example the number of deaths in a country per year,
prices, temperature readings, heights, weights, etc.
4. Qualitative Variable
When a characteristic is expressed by means of qualitative terms, it is known as a qualitative variable
or an attribute, for example smoking, beauty, educational status, colours such as green and blue, etc.
It should be noted that these characteristics cannot be measured numerically.

Domain
The set of values from which a variable takes its value is called its domain.

Constant
A characteristic is called a constant if it assumes a fixed value, e.g. $\pi$ is a constant with a
numerical value of 3.14159, and e is also a constant with a numerical value of 2.71828.

Errors
The difference b/w an actual value and the expected value is called an error. There are two types
of errors.
1. Compensating errors
2. Biased errors

Data
A set of values or number of values is called data.

Quantitative Data

Data described by a quantitative variable, such as the number of deaths in a country per year,
prices, temperature readings, heights, weights, wheat production from different acres, the number
of persons living in different houses, etc., are called quantitative data.

Qualitative Data
Data described by a qualitative variable, e.g. smoking, beauty, educational status, colours such as
green and blue, the marital status of persons (single, married, divorced, widowed, separated), the
sex of persons (male, female), etc., are called qualitative data.

Discrete Data
Data which can be described by a discrete variable is called discrete data. Number of students in
a class, Number of houses in a street, number of children in a family etc

Continuous Data
Data which can be described by a continuous variable is called continuous data. For e.g. age of
persons, speed of car, temperature at a place, income of a person, height of a plant, a life time of
a T.V tube etc

Chronological Data
A sequence of observations, made on the same phenomenon, recorded in relation to their time of
occurrence, is called chronological data. A chronological data is also called a time series.

Geographical Data
A sequence of observations, made on the same phenomenon, recorded in relation to their
geographical region, is called a geographical data.

Statistical Data
When the data are classified on the basis of a numerical characteristic, they are known as statistical
data (classification according to class intervals). Statistical data may be classified into two types.
1. Primary Data
Primary data are the most original data, not compiled by someone else; they are first-hand collected
data and have not undergone any sort of statistical treatment.
2. Secondary Data
Secondary data are data which have already been compiled and analyzed by someone; they may have
been sorted and tabulated and have undergone statistical treatment.

Collection of Data
Following methods are used for collection of data.
1. Methods for Collection of Primary Data
Following are the main methods by which primary data are obtained.
i. Direct Personal Investigation
ii. Indirect Investigation
iii. Local Source
iv. Questionnaire Method
v. Registration

vi. Questionnaire by Post


vii. By Enumerators
viii. By Telephone
ix. Through Internet
2. Methods for Collection of Secondary Data
Secondary data may be obtained from the following sources.
i. Official Source
For e.g. publication of Statistical division, Ministries of food, Agriculture and Railways, Bureaus of
Education, Finance, Provincial Bureaus of Statistics etc.
ii. Semi Official Source
For e.g. State Bank of Pakistan, National Bank of Pakistan, WAPDA, District Councils Economics
Research Institute, P.I.D.C, Central Cotton Committee etc.
iii. Private Source
For e.g. Publications of Trade Association Chambers of Commerce, Market Committee and industry
iv. Research Organization
For e.g. University, other institute of education and Research, Irrigation Research Institute etc.
v. Technical, Trade, Journals and Newspaper

Chapter 3 MEASURES OF DISPERSION, MOMENTS AND SKEWNESS

A quantity that measures how the data are dispersed about their average is called a measure of
dispersion.
Range (R)
The range is the simplest measure of dispersion. It is defined as the difference b/w the
largest and the smallest observation in a set of data. It is denoted by R. This is an absolute
measure of dispersion.
For Ungrouped Data
Range $= R = X_m - X_0$
where $X_m$ = the largest value and $X_0$ = the smallest value.
For Grouped Data
Range $= R =$ (upper class boundary of the highest class) $-$ (lower class boundary of the lowest class)
Or
Range $= R =$ (class mark $X$ of the highest class) $-$ (class mark of the lowest class)

Semi Inter-Quartile Range or Quartile Deviation

The semi inter-quartile range or quartile deviation is defined as half of the difference b/w
the third and the first quartiles. Symbolically it is given by
S.I.Q.R $= Q.D = \dfrac{Q_3 - Q_1}{2}$
where $Q_1$ = first (lower) quartile and $Q_3$ = third (upper) quartile.
This is an absolute measure of dispersion.

Mean Deviation or Average Deviation

The mean deviation is defined as the average of the deviations of the values from an
average (mean or median); the deviations are taken without considering algebraic signs.
1. Mean Deviation from Mean
For Ungrouped Data
$M.D = \dfrac{\sum |X - \bar{X}|}{n}$
For Grouped Data
$M.D = \dfrac{\sum f|X - \bar{X}|}{\sum f}$
2. Mean Deviation from Median
For Ungrouped Data
$M.D = \dfrac{\sum |X - \tilde{X}|}{n}$
For Grouped Data
$M.D = \dfrac{\sum f|X - \tilde{X}|}{\sum f}$
where $\bar{X}$ denotes the mean and $\tilde{X}$ denotes the median.

Standard Deviation (S)

The standard deviation is defined as the positive square root of the mean of the squared
deviations of the values from their mean. For a set of n values $X_1, X_2, X_3, \ldots, X_n$ it is
denoted by S. This is an absolute measure of dispersion.
Methods of Standard Deviation
I. Direct Method
II. Short-Cut Method
III. Coding Method or Step-Deviation Method
1. Direct Method
For Ungrouped Data
$S = \sqrt{\dfrac{\sum X^2}{n} - \left(\dfrac{\sum X}{n}\right)^2} = \sqrt{\dfrac{\sum (X - \bar{X})^2}{n}}$
For Grouped Data
$S = \sqrt{\dfrac{\sum fX^2}{\sum f} - \left(\dfrac{\sum fX}{\sum f}\right)^2} = \sqrt{\dfrac{\sum f(X - \bar{X})^2}{\sum f}}$
2. Short-Cut Method
For Ungrouped Data
$S = \sqrt{\dfrac{\sum D^2}{n} - \left(\dfrac{\sum D}{n}\right)^2}$, where $D = X - A$ and A is an arbitrary (provisional) value.
For Grouped Data
$S = \sqrt{\dfrac{\sum fD^2}{\sum f} - \left(\dfrac{\sum fD}{\sum f}\right)^2}$
3. Coding Method or Step-Deviation Method
For Ungrouped Data
$S = h\sqrt{\dfrac{\sum u^2}{n} - \left(\dfrac{\sum u}{n}\right)^2}$, where $u = \dfrac{X - A}{h} = \dfrac{D}{h}$ and h is the class interval.
For Grouped Data
$S = h\sqrt{\dfrac{\sum fu^2}{\sum f} - \left(\dfrac{\sum fu}{\sum f}\right)^2}$
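As an illustration (not part of the original notes), a minimal Python sketch showing that the direct method and the coding (step-deviation) method give the same standard deviation for a small grouped frequency table; the class marks, frequencies, provisional value A and width h are made-up values:

from math import sqrt

# Made-up grouped data: class marks (X) and frequencies (f)
X = [15, 25, 35, 45, 55]
f = [3, 7, 12, 6, 2]
n = sum(f)

# Direct method: S = sqrt( sum(fX^2)/sum(f) - (sum(fX)/sum(f))^2 )
sum_fx  = sum(fi * xi for fi, xi in zip(f, X))
sum_fx2 = sum(fi * xi**2 for fi, xi in zip(f, X))
S_direct = sqrt(sum_fx2 / n - (sum_fx / n) ** 2)

# Coding (step-deviation) method with provisional value A and class width h
A, h = 35, 10
u = [(xi - A) / h for xi in X]
sum_fu  = sum(fi * ui for fi, ui in zip(f, u))
sum_fu2 = sum(fi * ui**2 for fi, ui in zip(f, u))
S_coded = h * sqrt(sum_fu2 / n - (sum_fu / n) ** 2)

print(S_direct, S_coded)   # the two methods agree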

Combined Standard Deviation ( $S_c$ )

For two sets of values
$S_c = \sqrt{\dfrac{n_1 S_1^2 + n_2 S_2^2}{n_1 + n_2} + \dfrac{n_1 n_2 (\bar{X}_1 - \bar{X}_2)^2}{(n_1 + n_2)^2}}$
For three or more sets of data
$S_c = \sqrt{\dfrac{\sum n_i S_i^2 + \sum n_i(\bar{X}_i - \bar{X}_c)^2}{\sum n_i}}$, where $\bar{X}_c$ is the combined mean.

Variance ( $S^2$ )
The variance is defined as the mean of the squared deviations from the mean. It is denoted by
$S^2$.
Or
The square of the standard deviation is called the variance. It is denoted by $S^2$.
Methods of Variance
1. Direct Method
2. Short-Cut Method
3. Coding Method or Step-Deviation Method
1. Direct Method
For Ungrouped Data
$\mathrm{Var}(X) = S^2 = \dfrac{\sum X^2}{n} - \left(\dfrac{\sum X}{n}\right)^2 = \dfrac{\sum (X - \bar{X})^2}{n}$
For Grouped Data
$\mathrm{Var}(X) = S^2 = \dfrac{\sum fX^2}{\sum f} - \left(\dfrac{\sum fX}{\sum f}\right)^2 = \dfrac{\sum f(X - \bar{X})^2}{\sum f}$
2. Short-Cut Method
With $D = X - A$ (A an arbitrary value),
For Ungrouped Data
$\mathrm{Var}(X) = S^2 = \dfrac{\sum D^2}{n} - \left(\dfrac{\sum D}{n}\right)^2$
For Grouped Data
$\mathrm{Var}(X) = S^2 = \dfrac{\sum fD^2}{\sum f} - \left(\dfrac{\sum fD}{\sum f}\right)^2$
3. Coding Method or Step-Deviation Method
With $u = \dfrac{X - A}{h} = \dfrac{D}{h}$,
For Ungrouped Data
$\mathrm{Var}(X) = S^2 = h^2\left[\dfrac{\sum u^2}{n} - \left(\dfrac{\sum u}{n}\right)^2\right]$
For Grouped Data
$\mathrm{Var}(X) = S^2 = h^2\left[\dfrac{\sum fu^2}{\sum f} - \left(\dfrac{\sum fu}{\sum f}\right)^2\right]$

Combined Variance ( $S_c^2$ )
For two sets of values
$S_c^2 = \dfrac{n_1 S_1^2 + n_2 S_2^2}{n_1 + n_2} + \dfrac{n_1 n_2 (\bar{X}_1 - \bar{X}_2)^2}{(n_1 + n_2)^2}$
For three or more sets of data
$S_c^2 = \dfrac{\sum n_i S_i^2 + \sum n_i(\bar{X}_i - \bar{X}_c)^2}{\sum n_i}$, where $\bar{X}_c$ is the combined mean.

Relative Measures of Dispersion

1. Coefficient of Range
Coefficient of Range $= \dfrac{X_m - X_0}{X_m + X_0}$
2. Coefficient of Quartile Deviation
Coefficient of Q.D $= \dfrac{Q_3 - Q_1}{Q_3 + Q_1}$,
where $Q_1$ = first (lower) quartile and $Q_3$ = third (upper) quartile.
3. Coefficient of Mean Deviation from Mean
Coefficient of M.D from Mean $= \dfrac{\text{Mean Deviation from Mean}}{\text{Mean}} = \dfrac{M.D \text{ from } \bar{X}}{\bar{X}}$
4. Coefficient of Mean Deviation from Median
Coefficient of M.D from Median $= \dfrac{\text{Mean Deviation from Median}}{\text{Median}} = \dfrac{M.D \text{ from } \tilde{X}}{\tilde{X}}$
5. Coefficient of Standard Deviation
Coefficient of S.D $= \dfrac{S.D}{\bar{X}}$
6. Coefficient of Variation (C.V)
The coefficient of variation expresses the standard deviation as a percentage of the
arithmetic mean. It is used as a criterion of consistent performance: the smaller the
coefficient of variation, the more consistent the performance. It is also used to compare
the variability of two or more series.
Coefficient of Variation $= C.V = \dfrac{S.D}{\bar{X}} \times 100$
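A minimal Python sketch (not part of the original notes) comparing the consistency of two series with the coefficient of variation; the scores below are made-up figures:

from statistics import pstdev, mean

def cv(values):
    # Coefficient of variation: (S.D / mean) * 100, using the population S.D
    return pstdev(values) / mean(values) * 100

batsman_a = [42, 45, 40, 44, 43]   # made-up scores
batsman_b = [10, 90, 35, 60, 20]

# The smaller C.V indicates the more consistent performer
print(cv(batsman_a), cv(batsman_b))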

Relationship Between Measures of Dispersion

1. For a Normal Distribution
I. Mean Deviation $= M.D = 0.7979\ S.D$
II. Quartile Deviation $= Q.D = 0.6745\ S.D$
2. For a Moderately Skewed Distribution
I. Mean Deviation $= M.D = \dfrac{4}{5}\ S.D$
II. Quartile Deviation $= Q.D = \dfrac{2}{3}\ S.D$
III. Quartile Deviation $= Q.D = \dfrac{5}{6}\ M.D$

Moments
A moment designates the power to which deviations are raised before averaging them.
Types of Moments
1. Moments about the Mean, or Central Moments
2. Moments about the Origin or Zero
3. Moments about a Provisional Mean or Arbitrary Value (Non-Central Moments)
1. Moments about the Mean or Central Moments
For Ungrouped Data
$m_1 = \dfrac{\sum(x - \bar{x})}{n} = 0$
$m_2 = \dfrac{\sum(x - \bar{x})^2}{n} = \text{Variance}$
$m_3 = \dfrac{\sum(x - \bar{x})^3}{n}$
$m_4 = \dfrac{\sum(x - \bar{x})^4}{n}$
For Grouped Data
$m_1 = \dfrac{\sum f(x - \bar{x})}{\sum f} = 0$
$m_2 = \dfrac{\sum f(x - \bar{x})^2}{\sum f} = \text{Variance}$
$m_3 = \dfrac{\sum f(x - \bar{x})^3}{\sum f}$
$m_4 = \dfrac{\sum f(x - \bar{x})^4}{\sum f}$
(The central moments are also written $\mu_1, \mu_2, \mu_3, \mu_4$.)
2. Moments about the Origin or Zero
For Ungrouped Data
$m'_1 = \dfrac{\sum x}{n},\quad m'_2 = \dfrac{\sum x^2}{n},\quad m'_3 = \dfrac{\sum x^3}{n},\quad m'_4 = \dfrac{\sum x^4}{n}$
For Grouped Data
$m'_1 = \dfrac{\sum fx}{\sum f},\quad m'_2 = \dfrac{\sum fx^2}{\sum f},\quad m'_3 = \dfrac{\sum fx^3}{\sum f},\quad m'_4 = \dfrac{\sum fx^4}{\sum f}$
3. Moments about a Provisional Mean or Arbitrary Value (Non-Central Moments)
Methods of Computing Non-Central Moments
i. Direct Method
ii. Short-Cut Method
iii. Coding Method or Step-Deviation Method
i. Direct Method
For Ungrouped Data (A is an arbitrary constant)
$m'_1 = \dfrac{\sum(x - A)}{n},\quad m'_2 = \dfrac{\sum(x - A)^2}{n},\quad m'_3 = \dfrac{\sum(x - A)^3}{n},\quad m'_4 = \dfrac{\sum(x - A)^4}{n}$
For Grouped Data
$m'_1 = \dfrac{\sum f(x - A)}{\sum f},\quad m'_2 = \dfrac{\sum f(x - A)^2}{\sum f},\quad m'_3 = \dfrac{\sum f(x - A)^3}{\sum f},\quad m'_4 = \dfrac{\sum f(x - A)^4}{\sum f}$
ii. Short-Cut Method
With $D = X - A$,
For Ungrouped Data: $m'_r = \dfrac{\sum D^r}{n}$, $r = 1, 2, 3, 4$
For Grouped Data: $m'_r = \dfrac{\sum f D^r}{\sum f}$, $r = 1, 2, 3, 4$
iii. Coding Method or Step-Deviation Method
With $u = \dfrac{X - A}{h} = \dfrac{D}{h}$,
For Ungrouped Data: $m'_1 = h\dfrac{\sum u}{n},\ m'_2 = h^2\dfrac{\sum u^2}{n},\ m'_3 = h^3\dfrac{\sum u^3}{n},\ m'_4 = h^4\dfrac{\sum u^4}{n}$
For Grouped Data: $m'_1 = h\dfrac{\sum fu}{\sum f},\ m'_2 = h^2\dfrac{\sum fu^2}{\sum f},\ m'_3 = h^3\dfrac{\sum fu^3}{\sum f},\ m'_4 = h^4\dfrac{\sum fu^4}{\sum f}$

Relation Between Central Moments and Non-Central Moments

$m_1 = m'_1 - m'_1 = 0$
$m_2 = m'_2 - (m'_1)^2 = \text{Variance}$
$m_3 = m'_3 - 3m'_1 m'_2 + 2(m'_1)^3$
$m_4 = m'_4 - 4m'_1 m'_3 + 6(m'_1)^2 m'_2 - 3(m'_1)^4$
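A minimal Python sketch (not part of the original notes) verifying these conversion formulas on a small made-up data set, by computing the central moments both directly and from the moments about the origin:

def raw_moment(xs, r):
    # r-th moment about the origin: m'_r = sum(x**r) / n
    return sum(x ** r for x in xs) / len(xs)

def central_moment(xs, r):
    # r-th moment about the mean: m_r = sum((x - mean)**r) / n
    m = sum(xs) / len(xs)
    return sum((x - m) ** r for x in xs) / len(xs)

xs = [2, 3, 7, 8, 10]                      # made-up observations
m1p, m2p, m3p, m4p = (raw_moment(xs, r) for r in (1, 2, 3, 4))

# Conversion formulas stated above
m2 = m2p - m1p ** 2
m3 = m3p - 3 * m1p * m2p + 2 * m1p ** 3
m4 = m4p - 4 * m1p * m3p + 6 * m1p ** 2 * m2p - 3 * m1p ** 4

print(m2, central_moment(xs, 2))           # equal
print(m3, central_moment(xs, 3))           # equal
print(m4, central_moment(xs, 4))           # equal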

Moment Ratios

$b_1 = \dfrac{m_3^2}{m_2^3}$ (also written $\beta_1$)
$b_2 = \dfrac{m_4}{m_2^2}$ (also written $\beta_2$)
Sheppard's Corrections for Moments of Grouped Data
$m_2(\text{corrected}) = m_2(\text{uncorrected}) - \dfrac{h^2}{12}$
$m_3(\text{corrected}) = m_3(\text{uncorrected})$
$m_4(\text{corrected}) = m_4(\text{uncorrected}) - \dfrac{h^2}{2}\, m_2(\text{uncorrected}) + \dfrac{7}{240}h^4$

Charlier's Check
i. $\sum f(u + 1) = \sum fu + \sum f$
ii. $\sum f(u + 1)^2 = \sum fu^2 + 2\sum fu + \sum f$
iii. $\sum f(u + 1)^3 = \sum fu^3 + 3\sum fu^2 + 3\sum fu + \sum f$
iv. $\sum f(u + 1)^4 = \sum fu^4 + 4\sum fu^3 + 6\sum fu^2 + 4\sum fu + \sum f$

Symmetry
In a symmetrical distribution a deviation below the mean exactly equals the corresponding
deviation above the mean. This property is called symmetry.
For a symmetrical distribution the following relations hold.
Mean = Median = Mode
$Q_3 - \text{Median} = \text{Median} - Q_1$
$m_3 = 0$
$b_1 = 0$
Skewness
Skewness is the lack of symmetry in a distribution around some central value, i.e. the mean,
median or mode. It is the degree of asymmetry. For a skewed distribution:
Mean $\ne$ Median $\ne$ Mode
$Q_3 - \text{Median} \ne \text{Median} - Q_1$
$m_3 \ne 0$
$b_1 \ne 0$
There are two types of skewness.
1. Positive Skewness
If the frequency curve has a longer tail to the right, the distribution is said to be positively
skewed.
2. Negative Skewness
If the frequency curve has a longer tail to the left, the distribution is said to be negatively
skewed.

Coefficient of Skewness (SK)

Karl Pearson's Coefficient of Skewness
$SK = \dfrac{\text{Mean} - \text{Mode}}{S.D}$ or $SK = \dfrac{3(\text{Mean} - \text{Median})}{S.D}$
Bowley's Quartile Coefficient of Skewness
$SK = \dfrac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1}$
Moment Coefficient of Skewness
$SK = \sqrt{b_1} = \dfrac{m_3}{m_2^{3/2}}$
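A minimal Python sketch (not part of the original notes) of Karl Pearson's and Bowley's coefficients; the summary figures passed in are made up for illustration:

def pearson_sk(mean, median, sd):
    # Karl Pearson's second coefficient: 3(Mean - Median) / S.D
    return 3 * (mean - median) / sd

def bowley_sk(q1, median, q3):
    # Bowley's quartile coefficient: (Q3 + Q1 - 2*Median) / (Q3 - Q1)
    return (q3 + q1 - 2 * median) / (q3 - q1)

print(pearson_sk(mean=52.0, median=50.0, sd=8.0))   # positive -> longer right tail
print(bowley_sk(q1=44.0, median=50.0, q3=60.0))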

Kurtosis
Kurtosis is the degree of peakedness of a distribution. The moment coefficient
$b_2 = \dfrac{m_4}{m_2^2}$
is an important measure of kurtosis. It is a pure number, independent of the origin and the unit of
measurement.
If $b_2 > 3$ the distribution is leptokurtic.
If $b_2 = 3$ the distribution is normal or mesokurtic.
If $b_2 < 3$ the distribution is platykurtic.
Or, the percentile coefficient of kurtosis is
$K = \dfrac{Q.D}{P_{90} - P_{10}}$
For a normal distribution, K = 0.263.
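A minimal Python sketch (not part of the original notes) computing the moment coefficients $b_1$ and $b_2$ for a small made-up data set and classifying the kurtosis:

def central_moment(xs, r):
    m = sum(xs) / len(xs)
    return sum((x - m) ** r for x in xs) / len(xs)

def moment_coefficients(xs):
    # b1 = m3^2 / m2^3 (skewness), b2 = m4 / m2^2 (kurtosis)
    m2, m3, m4 = (central_moment(xs, r) for r in (2, 3, 4))
    return m3 ** 2 / m2 ** 3, m4 / m2 ** 2

b1, b2 = moment_coefficients([3, 5, 5, 6, 7, 8, 8, 9, 14])   # made-up data
print(b1, b2)
print("leptokurtic" if b2 > 3 else "platykurtic" if b2 < 3 else "mesokurtic")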

Chapter 4 INDEX NUMBERS


Index Numbers

An index number is a relative number which indicates the relative change in a group of variables
collected at different times. Index numbers are a device for estimating trends in prices, wages,
production and other economic variables. An index number is also known as an economic barometer.
Or
An index number is a number that measures a relative change in a variable, or an average
relative change in a group of related variables, with respect to a base. A base may be a
particular time, place or class with reference to which the changes are to be measured.

Types of Index Numbers


There are three types of index numbers which are commonly used.
1. Price Index Numbers
A price index number is a number that measures the relative change in the prices of a
group of commodities with respect to a base.
2. Quantity Index Numbers
These index numbers measure changes in the volume or quantity of goods produced,
sold or consumed.
3. Aggregative Index Numbers
These index numbers are used to measure changes in a composite phenomenon such as the
cost of living, total industrial production, etc.

Classification of Index Numbers


Index numbers are generally classified as
1. Simple index numbers.
2. Composite index numbers.

A composite index number may be un-weighted or weighted, and each of these may be computed
either as an aggregative index or as an average of relatives:

Index Number
  Simple
  Composite
    Un-weighted: aggregative, or average of relatives
    Weighted: aggregative, or average of relatives

1. Simple Index Numbers


A simple index number measures a relative change in a single variable with respect to a
base; the variable may be a price, a quantity, the cost of living, etc.
Or
If an index is based on a single variable only, it is known as a simple index number, for
example the index number of the price of Banaspati Ghee, or the index number of carpets
exported to the Middle East.
1.1. Fixed Base Method
Price Relative
Price relatives are obtained by dividing the price in the current year by the price in a base
year and expressing the result as a percentage.
Price relative $= P_{0n} = \dfrac{\text{Price in the current year}}{\text{Price in the base year}} \times 100 = \dfrac{p_n}{p_0} \times 100$
where $p_n$ = current year price and $p_0$ = base year price.
Quantity Relative
Quantity relatives are obtained by dividing the quantity in the current year by the quantity in a
base year and expressing the result as a percentage.
Quantity relative $= Q_{0n} = \dfrac{\text{Quantity in the current year}}{\text{Quantity in the base year}} \times 100 = \dfrac{q_n}{q_0} \times 100$
where $q_n$ = current year quantity and $q_0$ = base year quantity.
1.2. Chain Base Method


In this method the index number is computed in two steps. As a first step, we
calculate link relatives by dividing the current period price/quantity/value by the
price/quantity/value of the immediately preceding period and expressing this ratio as a
percentage.
Link Relative (Price) $= P_{n-1,n} = \dfrac{\text{Price in the current year}}{\text{Price in the preceding year}} \times 100 = \dfrac{p_n}{p_{n-1}} \times 100$
Link Relative (Quantity) $= Q_{n-1,n} = \dfrac{\text{Quantity in the current year}}{\text{Quantity in the preceding year}} \times 100 = \dfrac{q_n}{q_{n-1}} \times 100$
In the second step, we convert link relatives into chain indices: we multiply the current
period's link relative by the chain index of the immediately preceding period and divide the
product by 100.
Chain Index $= \dfrac{\text{Link relative for current year} \times \text{Chain index of previous year}}{100}$
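A minimal Python sketch (not part of the original notes) of the two chain-base steps for a single made-up price series:

prices = [20.0, 22.0, 23.0, 26.0]        # made-up yearly prices, base year first

# Step 1: link relatives - current price over the preceding year's price, in percent
links = [100.0] + [100 * prices[i] / prices[i - 1] for i in range(1, len(prices))]

# Step 2: chain indices - current link relative times previous chain index, divided by 100
chain = [links[0]]
for link in links[1:]:
    chain.append(link * chain[-1] / 100)

print(links)
print(chain)   # for a single series this equals the fixed-base relatives 100*p_n/p_0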

2. Composite Index Numbers


A composite index number measures an average relative change in a group of related variables
with respect to a base.
Composite index numbers are further classified as
2.1. Un-weighted composite index numbers
2.2. Weighted composite index numbers
2.1. Un-weighted Composite Index Numbers
In un-weighted index numbers, weights are not assigned to the various items. The
following methods are generally used for the construction of un-weighted index
numbers.
a. Simple Aggregative Price Index
In the calculation of a composite index number we are always given the prices (or
quantities) of two or more commodities. In the simple aggregative method we take the
year-wise total of the commodities involved and then adopt the fixed base or chain
base method, as the case may be.
Fixed Base Method
Under this method, to construct a price (or quantity) index, the total of the current year
prices (or quantities) of the various commodities in question is divided by the total of the
base year prices (or quantities) and the result is expressed as a percentage. Symbolically:
$P_{0n} = \dfrac{\sum p_n}{\sum p_0} \times 100$ and $Q_{0n} = \dfrac{\sum q_n}{\sum q_0} \times 100$

Chain Base Method


Under this method, as a first step, we compute a link relative for each year by dividing the
current year total of prices/quantities by the immediately preceding year's total of
prices/quantities and expressing the result as a percentage. To get chain indices we then
multiply each year's link relative by the previous year's chain index and divide this product by 100.
b. Simple Average of Price Relatives
Simple Average of Relatives method is further sub-divided into two methods:
Fixed Base Method
Under simple average of relatives by fixed base method, first of all we find
price/quantity/value relatives for each commodity given in the problem and then
average these relatives by using arithmetic mean, median, and geometric mean.
The resulted averages are known as index numbers by simple average of relative
method.
Chain Base Method
Now, we will discuss simple average of relatives by chain base method. Under this
method, first of all we find link relatives for the given commodities; as a 2nd step, we
take the average (arithmetic mean, median or geometric mean) of the link relatives. In the 3rd
step we find chain indices by adopting the same procedure as for the chain
indices of a single-commodity index number.
2.2. Weighted Composite Index Numbers
In weighted index numbers, the weights are assigned in proportion to the relative
importance of the different commodities included in the index. Weighted index numbers
are of two types.
a. Weighted Aggregative Indices
These indices are just like the simple aggregative indices, with the basic difference
that weights are assigned to the various commodities included in the index.
There are various methods of assigning weights, and various formulas for constructing
index numbers have been devised, of which some of the most important ones are
given below.
Price Index Numbers
(1). Laspeyres' Price Index $= P_{0n} = \dfrac{\sum p_n q_0}{\sum p_0 q_0} \times 100$ (base year weighted)
(2). Paasche's Price Index $= P_{0n} = \dfrac{\sum p_n q_n}{\sum p_0 q_n} \times 100$ (current year weighted)
(3). Marshall-Edgeworth Price Index $= P_{0n} = \dfrac{\sum p_n q_0 + \sum p_n q_n}{\sum p_0 q_0 + \sum p_0 q_n} \times 100 = \dfrac{\sum p_n (q_0 + q_n)}{\sum p_0 (q_0 + q_n)} \times 100$
(4). Fisher's Ideal Price Index $= P_{0n} = \sqrt{L \times P} = \sqrt{\dfrac{\sum p_n q_0}{\sum p_0 q_0} \times \dfrac{\sum p_n q_n}{\sum p_0 q_n}} \times 100$
(5). Walsh Price Index $= P_{0n} = \dfrac{\sum p_n \sqrt{q_0 q_n}}{\sum p_0 \sqrt{q_0 q_n}} \times 100$
Quantity Index Numbers
(1). Laspeyres' Quantity Index $= Q_{0n} = \dfrac{\sum q_n p_0}{\sum q_0 p_0} \times 100$ (base year weighted)
(2). Paasche's Quantity Index $= Q_{0n} = \dfrac{\sum q_n p_n}{\sum q_0 p_n} \times 100$ (current year weighted)
(3). Marshall-Edgeworth Quantity Index $= Q_{0n} = \dfrac{\sum q_n (p_0 + p_n)}{\sum q_0 (p_0 + p_n)} \times 100$
(4). Fisher's Ideal Quantity Index $= Q_{0n} = \sqrt{L \times P} = \sqrt{\dfrac{\sum q_n p_0}{\sum q_0 p_0} \times \dfrac{\sum q_n p_n}{\sum q_0 p_n}} \times 100$
(5). Walsh Quantity Index $= Q_{0n} = \dfrac{\sum q_n \sqrt{p_0 p_n}}{\sum q_0 \sqrt{p_0 p_n}} \times 100$
b. Weighted Average of Relatives Indices


Under this method we attach weights to the price relatives or quantity relatives. Thus,
we first find the price or quantity relatives in the same way as for the simple average of
relatives, and then take a weighted average of the calculated relatives. The important
types of weighted average of relatives indices are given below, where the price relative is
$I = \dfrac{p_n}{p_0} \times 100$ and the quantity relative is $I = \dfrac{q_n}{q_0} \times 100$.
Price Index Numbers
(1). Laspeyres' Price Index $= P_{0n} = \dfrac{\sum I W}{\sum W}$ with $W = p_0 q_0$ (base year weighted)
(2). Paasche's Price Index $= P_{0n} = \dfrac{\sum I W}{\sum W}$ with $W = p_0 q_n$ (current year weighted)
(3). Palgrave's Price Index $= P_{0n} = \dfrac{\sum I W}{\sum W}$ with $W = p_n q_n$
Quantity Index Numbers
(1). Laspeyres' Quantity Index $= Q_{0n} = \dfrac{\sum I W}{\sum W}$ with $W = q_0 p_0$ (base year weighted)
(2). Paasche's Quantity Index $= Q_{0n} = \dfrac{\sum I W}{\sum W}$ with $W = q_0 p_n$ (current year weighted)
(3). Palgrave's Quantity Index $= Q_{0n} = \dfrac{\sum I W}{\sum W}$ with $W = q_n p_n$

Uses of Index Numbers


1. The price index numbers are used to measure changes in the prices of commodities. They
help in comparing the changes in the price of one commodity with another.
2. The quantity index numbers are used to measure changes in the quantities produced,
purchased, sold, etc.
3. The index numbers of industrial production are used to measure changes in the
level of industrial production in the country.
4. Index numbers are used to measure changes in enrolment, performance, etc.
5. The index numbers of import prices and export prices are used to measure the change
in the terms of trade of a country.
6. The index numbers are used to measure seasonal variation and cyclical variation in a
time series.
7. The index numbers measure the purchasing power of money and determine the real
wages.

Limitations of Index Numbers

1. All index numbers are not suitable for all purposes. They are suitable only for the purpose
for which they are constructed.
2. Comparisons of changes in variables over long periods are not reliable.
3. Index numbers are subject to sampling error.
4. It is not possible to take into account all changes in the quality of a product.
5. The index numbers obtained by different methods of construction may give different
results.

Consumer Price Index Numbers (CPI)


Consumer price index numbers are intended to measure the changes in the prices paid
by consumers for purchasing a specified basket of goods and services during the
current year as compared to the base year. The basket of goods and services contains
items such as food, house rent, clothing, fuel and light, education, and miscellaneous items
like washing, transport and newspapers. Consumer price index numbers are also called cost
of living index numbers or retail price index numbers.

Wholesale Price Index Numbers (WPI)


The wholesale price index number is constructed to measure the change in the prices of products
produced by different sectors of an economy and traded in wholesale markets.
The sectors covered under this index are agriculture, industry, etc.
Federal Bureau of Statistics is also engaged in constructing this index by using weighted
average of price relatives. Almost all the steps discussed in the topic of steps involved in
the construction of an index number are taken into consideration for constructing this
index

Construction of Consumer Price Index Numbers


The following steps are involved in the construction of consumer price index numbers.
1. Scope
The first step is to clearly specify the class of people and locality where they reside. As
far as possible a homogeneous group of persons regarding their income and
consumption
pattern are considered. These groups may be school teachers, industrial workers,
Officers, etc residing in a particular well defined area.
2. Household Budget Inquiry and Allocation of Weights
The next step is to conduct a household budget inquiry of the category of people
concerned. The object of conducting a family budget inquiry is to determine the goods
and services to be included in the construction of the index numbers.
This step has many practical problems, as no two households have the same income and
consumption pattern. Therefore, the inquiry should include questions on family size,
number of earners, the quantity and quality of goods and services consumed and the
money spent on them under various headings, such as clothing and footwear, fuel and
lighting, housing, miscellaneous, etc. The weights are then assigned to the various groups in
proportion to the money spent on them.
3. Collection of Consumer Prices
The collection of retail prices is a very important and at the same time very difficult task,
because prices may vary from place to place and from shop to shop. The prices of
the selected items, both for the given and the base period, are obtained from the locality
where the people reside or from where they make their purchases.
4. Method of Compilation of Consumer Price Index Numbers
After the above steps, the consumer price index number is compiled by one of the
following methods.
Aggregative Expenditure Method
In this method the quantities consumed in the base year are taken as weights. If $p_0$ and $q_0$
are the price and quantity of the base period and $p_n$ and $q_n$ are the price and quantity of the
given year, then
$P_{0n} = \dfrac{\sum p_n q_0}{\sum p_0 q_0} \times 100$
where $\sum p_n q_0$ = aggregate expenditure in the given year (at base year quantities) and
$\sum p_0 q_0$ = aggregate expenditure in the base year.
Household Budget Method or Family Budget Method
In this method the amounts of expenditure by the household on the various items in the
base period are taken as weights. If $p_0$ and $p_n$ are the prices of the base and given year
and the weight is $W = p_0 q_0$, where $q_0$ is the quantity of the base period, then
$P_{0n} = \dfrac{\sum I W}{\sum W}$, where the price relative is $I = \dfrac{p_n}{p_0} \times 100$ and $W = p_0 q_0$.
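A minimal Python sketch (not part of the original notes) of the family budget method; the basket of items, base-year prices/quantities and current prices below are all made-up illustration values:

# Made-up base-year prices/quantities and current prices for a small basket
items = {
    "food":     {"p0": 50.0,  "q0": 10.0, "pn": 65.0},
    "rent":     {"p0": 200.0, "q0": 1.0,  "pn": 240.0},
    "clothing": {"p0": 30.0,  "q0": 4.0,  "pn": 33.0},
}

num = den = 0.0
for item in items.values():
    I = 100 * item["pn"] / item["p0"]   # price relative
    W = item["p0"] * item["q0"]         # weight = base-year expenditure
    num += I * W
    den += W

cpi = num / den                          # sum(IW) / sum(W)
print(cpi)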

Theoretical Tests for Index Numbers


From a theoretical viewpoint, a good index number formula is required to satisfy the
following tests given by Irving Fisher (1867-1947).
1. Time Reversal Test
This test may be stated as follows: if the time subscripts of a price (or quantity) index number
formula are interchanged, the resulting price (or quantity) index number formula should be the
reciprocal of the original formula, i.e.
$P_{0n} = \dfrac{1}{P_{n0}}$ or $P_{0n} \times P_{n0} = 1$
Fisher's and Marshall-Edgeworth index numbers satisfy the time reversal test.
Laspeyres' and Paasche's index numbers do not satisfy the time reversal test.
2. Factor Reversal Test
This test may be stated as follows: if the factors (prices and quantities) occurring in a price
(or quantity) index number formula are interchanged, so that a quantity (or price) index formula
is obtained, then the product of the two index numbers should give the true value index number, i.e.
$P_{0n} \times Q_{0n} = \text{Value index} = \dfrac{\sum p_n q_n}{\sum p_0 q_0}$
Only Fisher's index number satisfies the factor reversal test.
Laspeyres', Paasche's and Marshall-Edgeworth index numbers do not satisfy the factor
reversal test.
3. Circular Test
This test may be stated as follows: if the index for year b based upon year a is $P_{ab}$ and that
for year c based upon year b is $P_{bc}$, then the circular test requires that the index for year c
based upon year a, i.e. $P_{ac}$, should be the same as if it were compounded of these two
stages, i.e.
$P_{ac} = P_{ab} \times P_{bc}$
Laspeyres', Paasche's, Fisher's and Marshall-Edgeworth index numbers do not satisfy the
circular test; it is satisfied by the simple aggregative index and by a weighted aggregative
index with fixed weights.
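A minimal Python sketch (not part of the original notes) verifying numerically that Fisher's ideal index passes the time reversal and factor reversal tests; the prices and quantities are made-up values:

from math import sqrt, isclose

p0 = [10.0, 8.0, 5.0];  q0 = [30.0, 15.0, 20.0]   # made-up data
pn = [12.0, 9.0, 7.0];  qn = [28.0, 18.0, 25.0]

def s(xs, ys):
    return sum(x * y for x, y in zip(xs, ys))

def fisher(pa, qa, pb, qb):
    # Fisher's ideal index for period b with base a (as a ratio, not a percentage)
    return sqrt((s(pb, qa) / s(pa, qa)) * (s(pb, qb) / s(pa, qb)))

P_0n = fisher(p0, q0, pn, qn)                     # price index
P_n0 = fisher(pn, qn, p0, q0)                     # time subscripts interchanged
Q_0n = fisher(q0, p0, qn, pn)                     # quantity index (p and q interchanged)

print(isclose(P_0n * P_n0, 1.0))                          # time reversal test holds
print(isclose(P_0n * Q_0n, s(pn, qn) / s(p0, q0)))        # factor reversal test holds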

Chapter 5 REGRESSION & MULTIPLE REGRESSION

Regression
The dependence of one variable upon another variable is called regression. For example,
weight depends upon height.
OR
Regression is a mathematical relationship b/w one dependent and one independent
variable. For example, demand depends upon price; here price is the independent variable and
demand is the dependent variable.

Linear Regression
When the dependence of the variable is represented by a straight line, it is called
linear regression; otherwise it is said to be non-linear or curvilinear regression. For example, if
X is the independent variable and Y is the dependent variable, then the relation Y = a + bX is called
linear regression.

Properties of the Least Squares Regression Line


1. The least squares regression line always passes through the point of means $(\bar{X}, \bar{Y})$.
2. The regression coefficients b and d always have the same sign.
3. The sum of the deviations of the observed values from the estimated values is always equal to zero, i.e.
$\sum(Y - \hat{Y}) = 0$ and $\sum(X - \hat{X}) = 0$.
4. The sum of the squared deviations b/w the observed and estimated values is always a minimum, i.e.
$\sum(Y - \hat{Y})^2 = $ minimum and $\sum(X - \hat{X})^2 = $ minimum.
5. The sum of the trend (estimated) values is equal to the sum of the observed values, i.e.
$\sum \hat{Y} = \sum Y$ and $\sum \hat{X} = \sum X$.

Types of Linear Regression / Regression Equations


Regression equations are the algebraic expressions of the regression lines. There are two
regression equations, because there are two regression lines. These are:
1. Regression equation of Y on X
2. Regression equation of X on Y
1. Regression Equation of Y on X
b (also written $b_{YX}$) is the regression coefficient of the regression line of Y on X.
Linear regression / regression line / least squares regression line:
$\hat{Y} = a + bX$
Or
$\hat{Y} - \bar{Y} = b(X - \bar{X})$, i.e. $\hat{Y} - \bar{Y} = b_{YX}(X - \bar{X})$
General Method
Normal Equations
$\sum Y = na + b\sum X$
$\sum XY = a\sum X + b\sum X^2$
We get the values of a and b by solving the above equations simultaneously.

Alternative Methods
Direct formulas for obtaining the values of a and b
Direct formulas for a
(1). $a = \bar{Y} - b\bar{X}$
(2). $a = a_{YX} = \dfrac{\sum Y \sum X^2 - \sum X \sum XY}{n\sum X^2 - (\sum X)^2}$
Direct formulas for b
(1). $b = b_{YX} = \dfrac{n\sum XY - \sum X \sum Y}{n\sum X^2 - (\sum X)^2}$
(2). $b = b_{YX} = \dfrac{\sum XY - \dfrac{\sum X \sum Y}{n}}{\sum X^2 - \dfrac{(\sum X)^2}{n}}$
(3). $b = b_{YX} = \dfrac{\sum XY - n\bar{X}\bar{Y}}{\sum X^2 - n\bar{X}^2} = \dfrac{\sum XY - n\bar{X}\bar{Y}}{nS_X^2}$
(4). $b = b_{YX} = \dfrac{\sum(X - \bar{X})(Y - \bar{Y})}{\sum(X - \bar{X})^2}$
(5). $b = b_{YX} = \dfrac{\sum D_X D_Y - \dfrac{\sum D_X \sum D_Y}{n}}{\sum D_X^2 - \dfrac{(\sum D_X)^2}{n}}$, where $D_X = X - A$ and $D_Y = Y - B$ (A and B are constants)
(6). $b = b_{YX} = r\dfrac{S_Y}{S_X}$
(7). $b = b_{YX} = \dfrac{S_{XY}}{S_X^2}$
where
$S_{XY} = \dfrac{\sum(X - \bar{X})(Y - \bar{Y})}{n}$,
$S_X = \sqrt{\dfrac{\sum(X - \bar{X})^2}{n}} = \sqrt{\dfrac{\sum X^2}{n} - \left(\dfrac{\sum X}{n}\right)^2}$,
$S_Y = \sqrt{\dfrac{\sum(Y - \bar{Y})^2}{n}} = \sqrt{\dfrac{\sum Y^2}{n} - \left(\dfrac{\sum Y}{n}\right)^2}$

2. Regression Equation of X on Y


d (also written $b_{XY}$) is the regression coefficient of the regression line of X on Y.
Linear regression / regression line / least squares regression line:
$\hat{X} = c + dY$
Or
$\hat{X} - \bar{X} = d(Y - \bar{Y})$, i.e. $\hat{X} - \bar{X} = b_{XY}(Y - \bar{Y})$
General Method
Normal Equations
$\sum X = nc + d\sum Y$
$\sum XY = c\sum Y + d\sum Y^2$
We get the values of c and d by solving the above equations simultaneously.

Alternative Methods
Direct formulas for obtaining the values of c and d
Direct formulas for c
(1). $c = \bar{X} - d\bar{Y}$
(2). $c = a_{XY} = \dfrac{\sum X \sum Y^2 - \sum Y \sum XY}{n\sum Y^2 - (\sum Y)^2}$
Direct formulas for d
(1). $d = b_{XY} = \dfrac{n\sum XY - \sum X \sum Y}{n\sum Y^2 - (\sum Y)^2}$
(2). $d = b_{XY} = \dfrac{\sum XY - \dfrac{\sum X \sum Y}{n}}{\sum Y^2 - \dfrac{(\sum Y)^2}{n}}$
(3). $d = b_{XY} = \dfrac{\sum XY - n\bar{X}\bar{Y}}{\sum Y^2 - n\bar{Y}^2} = \dfrac{\sum XY - n\bar{X}\bar{Y}}{nS_Y^2}$
(4). $d = b_{XY} = \dfrac{\sum(X - \bar{X})(Y - \bar{Y})}{\sum(Y - \bar{Y})^2}$
(5). $d = b_{XY} = \dfrac{\sum D_X D_Y - \dfrac{\sum D_X \sum D_Y}{n}}{\sum D_Y^2 - \dfrac{(\sum D_Y)^2}{n}}$, where $D_X = X - A$ and $D_Y = Y - B$ (A and B are constants)
(6). $d = b_{XY} = r\dfrac{S_X}{S_Y}$
(7). $d = b_{XY} = \dfrac{S_{XY}}{S_Y^2}$
where
$S_{XY} = \dfrac{\sum(X - \bar{X})(Y - \bar{Y})}{n}$,
$S_X = \sqrt{\dfrac{\sum(X - \bar{X})^2}{n}} = \sqrt{\dfrac{\sum X^2}{n} - \left(\dfrac{\sum X}{n}\right)^2}$,
$S_Y = \sqrt{\dfrac{\sum(Y - \bar{Y})^2}{n}} = \sqrt{\dfrac{\sum Y^2}{n} - \left(\dfrac{\sum Y}{n}\right)^2}$
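A minimal Python sketch (not part of the original notes) fitting the regression of Y on X by the direct formulas above and computing the standard error of estimate defined in the next section; the paired observations are made up:

from math import sqrt

# Made-up paired observations
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
Y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9]
n = len(X)

sx, sy = sum(X), sum(Y)
sxy = sum(x * y for x, y in zip(X, Y))
sxx = sum(x * x for x in X)
syy = sum(y * y for y in Y)

# b = (n*SumXY - SumX*SumY) / (n*SumX^2 - (SumX)^2), a = Ybar - b*Xbar
b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
a = sy / n - b * (sx / n)

# Standard error of estimate of Y on X: s_yx = sqrt((SumY^2 - a*SumY - b*SumXY)/(n - 2))
s_yx = sqrt((syy - a * sy - b * sxy) / (n - 2))

print(a, b, s_yx)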

Scatter Diagram

If we plot the paired observations $(X_1, Y_1), (X_2, Y_2), (X_3, Y_3), \ldots$ on a graph, taking X along
the X-axis and Y along the Y-axis, the resulting set of points is called a scatter diagram.

Standard Deviation of Regression or Standard Error of Estimate

The observed values of (X, Y) do not all fall on the regression line; they scatter away from
it. The degree of scatter (or dispersion) of the observed values about the regression line is
measured by what is called the standard deviation of regression, or the standard error of
estimate, of Y on X and of X on Y.
1. Y on X ( $\hat{Y} = a + bX$ )
For Ungrouped Data
$s_{y.x} = \sqrt{\dfrac{\sum Y^2 - a\sum Y - b\sum XY}{n - 2}}$
Or
$s_{y.x} = \sqrt{\dfrac{\sum(Y - \hat{Y})^2}{n - 2}}$, where $\hat{Y}$ are the trend (estimated) values.
For Grouped Data
$s_{y.x} = k\sqrt{\dfrac{\sum fv^2 - a\sum fv - b\sum fuv}{\sum f - 2}}$, where k is the class-interval constant for Y.
2. X on Y ( $\hat{X} = c + dY$ )
For Ungrouped Data
$s_{x.y} = \sqrt{\dfrac{\sum X^2 - c\sum X - d\sum XY}{n - 2}}$
Or
$s_{x.y} = \sqrt{\dfrac{\sum(X - \hat{X})^2}{n - 2}}$, where $\hat{X}$ are the trend (estimated) values.
For Grouped Data
$s_{x.y} = h\sqrt{\dfrac{\sum fu^2 - c\sum fu - d\sum fuv}{\sum f - 2}}$, where h is the class-interval constant for X.

Multiple Regression
A regression which involves two or more independent variables is called a multiple
regression. For example, the yield of a crop depends upon the fertility of the land, the fertilizer
applied, rainfall, the quality of seed, etc.; likewise, the systolic blood pressure of a person
depends upon one's weight, age, etc.
$Y_i = \alpha + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + \varepsilon_i$

Multiple Linear Regression with Two Independent Variables


Multiple regression model:
$Y_i = \alpha + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$
The estimated multiple linear regression based on sample data is
$\hat{Y} = a + b_1 X_1 + b_2 X_2$
The normal equations are
$\sum Y = na + b_1\sum X_1 + b_2\sum X_2$
$\sum X_1 Y = a\sum X_1 + b_1\sum X_1^2 + b_2\sum X_1 X_2$
$\sum X_2 Y = a\sum X_2 + b_1\sum X_1 X_2 + b_2\sum X_2^2$
We get the values of a, $b_1$ and $b_2$ by solving the above equations simultaneously.
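A minimal Python sketch (not part of the original notes) solving the three normal equations above with NumPy for a small made-up data set:

import numpy as np

# Made-up data: Y with two regressors X1, X2
Y  = np.array([10.0, 12.0, 15.0, 18.0, 21.0, 25.0])
X1 = np.array([ 1.0,  2.0,  3.0,  4.0,  5.0,  6.0])
X2 = np.array([ 2.0,  1.0,  4.0,  3.0,  6.0,  5.0])
n = len(Y)

# Coefficient matrix and right-hand side of the three normal equations
A = np.array([
    [n,         X1.sum(),        X2.sum()],
    [X1.sum(), (X1 * X1).sum(), (X1 * X2).sum()],
    [X2.sum(), (X1 * X2).sum(), (X2 * X2).sum()],
])
rhs = np.array([Y.sum(), (X1 * Y).sum(), (X2 * Y).sum()])

a, b1, b2 = np.linalg.solve(A, rhs)    # solve the normal equations simultaneously
print(a, b1, b2)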

Types of Multiple Linear Regression / Multiple Regression Equations


With three variables $X_1$, $X_2$, $X_3$, the multiple regression equations are the algebraic
expressions of the regression planes. There are three such equations:
1. Multiple regression equation of $X_1$ on $X_2$ and $X_3$
2. Multiple regression equation of $X_2$ on $X_1$ and $X_3$
3. Multiple regression equation of $X_3$ on $X_1$ and $X_2$
In the formulas below, $S_1, S_2, S_3$ are the standard deviations of $X_1, X_2, X_3$;
$r_{12}, r_{13}, r_{23}$ are the simple correlation coefficients (with $r_{12} = r_{21}$, $r_{13} = r_{31}$, $r_{23} = r_{32}$);
and the regression coefficients can also be written in terms of the cofactors of the correlation determinant
$\omega = \begin{vmatrix} 1 & r_{12} & r_{13} \\ r_{21} & 1 & r_{23} \\ r_{31} & r_{32} & 1 \end{vmatrix}$, for example $\omega_{11} = 1 - r_{23}^2$.
1. Multiple Regression Equation of $X_1$ on $X_2$ and $X_3$
$b_{12.3}$ and $b_{13.2}$ are the regression coefficients of the multiple regression of $X_1$ on $X_2$ and $X_3$.
Least squares multiple regression equation:
$\hat{X}_1 = a + b_{12.3} X_2 + b_{13.2} X_3$
Or
$(\hat{X}_1 - \bar{X}_1) = b_{12.3}(X_2 - \bar{X}_2) + b_{13.2}(X_3 - \bar{X}_3)$
General Method: Normal Equations
$\sum X_1 = na + b_{12.3}\sum X_2 + b_{13.2}\sum X_3$
$\sum X_1 X_2 = a\sum X_2 + b_{12.3}\sum X_2^2 + b_{13.2}\sum X_2 X_3$
$\sum X_1 X_3 = a\sum X_3 + b_{12.3}\sum X_2 X_3 + b_{13.2}\sum X_3^2$
We get the values of a, $b_{12.3}$ and $b_{13.2}$ by solving the above equations simultaneously.
Alternative Method: Direct Formulas
$a = \bar{X}_1 - b_{12.3}\bar{X}_2 - b_{13.2}\bar{X}_3$
$b_{12.3} = \dfrac{S_1}{S_2}\cdot\dfrac{r_{12} - r_{13}r_{23}}{1 - r_{23}^2}$
$b_{13.2} = \dfrac{S_1}{S_3}\cdot\dfrac{r_{13} - r_{12}r_{23}}{1 - r_{23}^2}$
Direct Method to Write the Multiple Regression Equation of $X_1$ on $X_2$ and $X_3$
$\hat{X}_1 = \bar{X}_1 + \dfrac{S_1}{S_2}\cdot\dfrac{r_{12} - r_{13}r_{23}}{1 - r_{23}^2}(X_2 - \bar{X}_2) + \dfrac{S_1}{S_3}\cdot\dfrac{r_{13} - r_{12}r_{23}}{1 - r_{23}^2}(X_3 - \bar{X}_3)$
2. Multiple Regression Equation of $X_2$ on $X_1$ and $X_3$
$b_{21.3}$ and $b_{23.1}$ are the regression coefficients of the multiple regression of $X_2$ on $X_1$ and $X_3$.
Least squares multiple regression equation:
$\hat{X}_2 = a + b_{21.3} X_1 + b_{23.1} X_3$, or $(\hat{X}_2 - \bar{X}_2) = b_{21.3}(X_1 - \bar{X}_1) + b_{23.1}(X_3 - \bar{X}_3)$
General Method: Normal Equations
$\sum X_2 = na + b_{21.3}\sum X_1 + b_{23.1}\sum X_3$
$\sum X_2 X_1 = a\sum X_1 + b_{21.3}\sum X_1^2 + b_{23.1}\sum X_1 X_3$
$\sum X_2 X_3 = a\sum X_3 + b_{21.3}\sum X_1 X_3 + b_{23.1}\sum X_3^2$
We get the values of a, $b_{21.3}$ and $b_{23.1}$ by solving the above equations simultaneously.
Alternative Method: Direct Formulas
$a = \bar{X}_2 - b_{21.3}\bar{X}_1 - b_{23.1}\bar{X}_3$
$b_{21.3} = \dfrac{S_2}{S_1}\cdot\dfrac{r_{21} - r_{23}r_{13}}{1 - r_{13}^2}$
$b_{23.1} = \dfrac{S_2}{S_3}\cdot\dfrac{r_{23} - r_{21}r_{13}}{1 - r_{13}^2}$
Direct Method to Write the Multiple Regression Equation of $X_2$ on $X_1$ and $X_3$
$\hat{X}_2 = \bar{X}_2 + \dfrac{S_2}{S_1}\cdot\dfrac{r_{21} - r_{23}r_{13}}{1 - r_{13}^2}(X_1 - \bar{X}_1) + \dfrac{S_2}{S_3}\cdot\dfrac{r_{23} - r_{21}r_{13}}{1 - r_{13}^2}(X_3 - \bar{X}_3)$
3. Multiple Regression Equation of $X_3$ on $X_1$ and $X_2$
$b_{31.2}$ and $b_{32.1}$ are the regression coefficients of the multiple regression of $X_3$ on $X_1$ and $X_2$.
Least squares multiple regression equation:
$\hat{X}_3 = a + b_{31.2} X_1 + b_{32.1} X_2$, or $(\hat{X}_3 - \bar{X}_3) = b_{31.2}(X_1 - \bar{X}_1) + b_{32.1}(X_2 - \bar{X}_2)$
General Method: Normal Equations
$\sum X_3 = na + b_{31.2}\sum X_1 + b_{32.1}\sum X_2$
$\sum X_3 X_1 = a\sum X_1 + b_{31.2}\sum X_1^2 + b_{32.1}\sum X_1 X_2$
$\sum X_3 X_2 = a\sum X_2 + b_{31.2}\sum X_1 X_2 + b_{32.1}\sum X_2^2$
We get the values of a, $b_{31.2}$ and $b_{32.1}$ by solving the above equations simultaneously.
Alternative Method: Direct Formulas
$a = \bar{X}_3 - b_{31.2}\bar{X}_1 - b_{32.1}\bar{X}_2$
$b_{31.2} = \dfrac{S_3}{S_1}\cdot\dfrac{r_{31} - r_{32}r_{12}}{1 - r_{12}^2}$
$b_{32.1} = \dfrac{S_3}{S_2}\cdot\dfrac{r_{32} - r_{31}r_{21}}{1 - r_{12}^2}$
Direct Method to Write the Multiple Regression Equation of $X_3$ on $X_1$ and $X_2$
$\hat{X}_3 = \bar{X}_3 + \dfrac{S_3}{S_1}\cdot\dfrac{r_{31} - r_{32}r_{12}}{1 - r_{12}^2}(X_1 - \bar{X}_1) + \dfrac{S_3}{S_2}\cdot\dfrac{r_{32} - r_{31}r_{21}}{1 - r_{12}^2}(X_2 - \bar{X}_2)$

Chapter 6 CORRELATION, MULTIPLE AND PARTIAL CORRELATION

Correlation
The interdependence of two or more variables is called correlation.
Or
The linear relationship b/w two or more variables is called correlation. For example, an
increase in the amount of rainfall will increase the sales of raincoats. Ages and weights of
children are correlated with each other.

Positive Correlation
The correlation in the same direction is called positive correlation: if one variable increases
the other also increases, and if one variable decreases the other also decreases. For example, an
increase in the heights of children is usually accompanied by an increase in their weights, and the
length of an iron bar will increase as the temperature increases.

Negative Correlation
The correlation in the opposite (different) direction is called negative correlation: if one
variable increases the other decreases, and if one variable decreases the other increases. For
example, the volume of a gas will decrease as the pressure increases.

No Correlation Or Zero Correlation

If there is no relationship b/w two variables, it is called no correlation or zero
correlation.

Coefficient of Correlation
It is a measure of the degree of interdependence b/w the variables. It is a pure number
which lies b/w -1 and +1; an intermediate value of zero indicates the absence of correlation. It is
denoted by r.

Properties of Correlation Coefficient


1. The correlation coefficient is symmetrical with respect to X and Y, i.e. $r_{XY} = r_{YX}$.
2. The correlation coefficient is the geometric mean of the two regression coefficients:
$r = \pm\sqrt{b\,d}$ or $r = \pm\sqrt{b_{XY}\, b_{YX}}$.
3. The correlation coefficient is independent of the origin and unit of measurement, i.e.
$r_{XY} = r_{UV}$.
4. The correlation coefficient lies b/w -1 and +1, i.e. $-1 \le r \le +1$.
5. It is a pure number.

Formulas of the Correlation Coefficient


For Ungrouped Data
(1). $r = r_{XY} = r_{YX} = \dfrac{\sum XY - \dfrac{\sum X \sum Y}{n}}{\sqrt{\sum X^2 - \dfrac{(\sum X)^2}{n}}\ \sqrt{\sum Y^2 - \dfrac{(\sum Y)^2}{n}}}$
(2). $r = \dfrac{\sum(X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum(X - \bar{X})^2}\ \sqrt{\sum(Y - \bar{Y})^2}}$
(3). $r = \dfrac{\sum(X - \bar{X})(Y - \bar{Y})}{n S_X S_Y}$
(4). $r = \dfrac{\sum XY - n\bar{X}\bar{Y}}{n S_X S_Y}$
(5). $r = \dfrac{\sum XY - n\bar{X}\bar{Y}}{\sqrt{\sum X^2 - n\bar{X}^2}\ \sqrt{\sum Y^2 - n\bar{Y}^2}}$
(6). $r = r_{UV} = r_{VU} = \dfrac{\sum UV - \dfrac{\sum U \sum V}{n}}{\sqrt{\sum U^2 - \dfrac{(\sum U)^2}{n}}\ \sqrt{\sum V^2 - \dfrac{(\sum V)^2}{n}}}$, where $U = \dfrac{X - A}{h} = \dfrac{D_X}{h}$ and $V = \dfrac{Y - B}{k} = \dfrac{D_Y}{k}$
(7). $r = \dfrac{\sum D_X D_Y - \dfrac{\sum D_X \sum D_Y}{n}}{\sqrt{\sum D_X^2 - \dfrac{(\sum D_X)^2}{n}}\ \sqrt{\sum D_Y^2 - \dfrac{(\sum D_Y)^2}{n}}}$, where $D_X = X - A$ and $D_Y = Y - B$ (A, B constants)
(8). $r = r_{XY} = r_{YX} = \pm\sqrt{b_{XY}\, b_{YX}} = \pm\sqrt{b\, d}$, where $b = b_{YX} = r\dfrac{S_Y}{S_X}$ and $d = b_{XY} = r\dfrac{S_X}{S_Y}$
For Grouped Data
(1). $r = r_{XY} = r_{YX} = \dfrac{\sum fXY - \dfrac{\sum fX \sum fY}{\sum f}}{\sqrt{\sum fX^2 - \dfrac{(\sum fX)^2}{\sum f}}\ \sqrt{\sum fY^2 - \dfrac{(\sum fY)^2}{\sum f}}}$
(2). $r = \dfrac{\sum fD_X D_Y - \dfrac{\sum fD_X \sum fD_Y}{\sum f}}{\sqrt{\sum fD_X^2 - \dfrac{(\sum fD_X)^2}{\sum f}}\ \sqrt{\sum fD_Y^2 - \dfrac{(\sum fD_Y)^2}{\sum f}}}$
(3). $r = r_{UV} = r_{VU} = \dfrac{\sum fUV - \dfrac{\sum fU \sum fV}{\sum f}}{\sqrt{\sum fU^2 - \dfrac{(\sum fU)^2}{\sum f}}\ \sqrt{\sum fV^2 - \dfrac{(\sum fV)^2}{\sum f}}}$
(4). $r = r_{XY} = r_{YX} = \pm\sqrt{b_{XY}\, b_{YX}} = \pm\sqrt{b\, d} = \pm\sqrt{b_{UV}\, b_{VU}}$, where $b_{YX} = \dfrac{k}{h}\, b_{VU}$ and $b_{XY} = \dfrac{h}{k}\, b_{UV}$
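A minimal Python sketch (not part of the original notes) applying formula (1) for ungrouped data to a small set of made-up pairs:

from math import sqrt

X = [1.0, 2.0, 3.0, 4.0, 5.0]          # made-up paired data
Y = [2.0, 1.0, 4.0, 3.0, 5.0]
n = len(X)

sx, sy   = sum(X), sum(Y)
sxy      = sum(x * y for x, y in zip(X, Y))
sxx, syy = sum(x * x for x in X), sum(y * y for y in Y)

# r = (SumXY - SumX*SumY/n) / sqrt((SumX^2 - (SumX)^2/n)(SumY^2 - (SumY)^2/n))
r = (sxy - sx * sy / n) / sqrt((sxx - sx ** 2 / n) * (syy - sy ** 2 / n))
print(r)   # lies between -1 and +1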

Rank Correlation

Sometimes the actual measurements or counts of individuals or objects are either not
available or accurate assessment is not possible. They are then arranged in order
according to some characteristic of interest. Such an ordered arrangement is called a
ranking, and the order given to an individual or object is called its rank. The correlation b/w
two such sets of rankings is known as rank correlation.
Rank Correlation $= r_s = 1 - \dfrac{6\sum d^2}{n(n^2 - 1)}$ (Spearman's formula)
where d = difference b/w the ranks of corresponding values of X and Y, and
n = number of pairs of values (X, Y) in the data.

Rank Correlation for Tied Ranks


Spearman's coefficient of rank correlation applies only when no ties are present. In
case there are ties in ranks, the ranks are adjusted by assigning to the tied objects or
observations the mean of the ranks which they would have if they were ordered. The formula becomes
$r_s = 1 - \dfrac{6\left(\sum d^2 + a\right)}{n(n^2 - 1)}$, where $a = \dfrac{1}{12}(t_1^3 - t_1) + \dfrac{1}{12}(t_2^3 - t_2) + \cdots$
and $t_1, t_2, \ldots$ are the numbers of tied values in each group of ties.
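A minimal Python sketch (not part of the original notes) of Spearman's basic formula applied to mean ranks on made-up marks; note it omits the tie-correction term a shown above, so with ties it is only an approximation:

def ranks(values):
    # Average ranks, with ties given the mean of the ranks they occupy
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1            # ranks are 1-based
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

print(spearman([86, 72, 72, 65, 90], [88, 70, 75, 60, 95]))   # made-up marks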

Multiple Correlation

The multiple correlation coefficient measures the degree of relationship b/w one variable and a
group of other variables, the first variable not being included in that group, e.g. $R_{y.12}$, $R_{1.23}$.
(1). $R_{1.23} = R_{1.32} = \sqrt{\dfrac{r_{12}^2 + r_{13}^2 - 2r_{12}r_{13}r_{23}}{1 - r_{23}^2}}$
Or
$R_{1.23} = R_{1.32} = \sqrt{1 - \dfrac{\omega}{\omega_{11}}}$
(2). $R_{2.13} = R_{2.31} = \sqrt{\dfrac{r_{21}^2 + r_{23}^2 - 2r_{12}r_{13}r_{23}}{1 - r_{13}^2}}$
Or
$R_{2.13} = R_{2.31} = \sqrt{1 - \dfrac{\omega}{\omega_{22}}}$
(3). $R_{3.12} = R_{3.21} = \sqrt{\dfrac{r_{31}^2 + r_{32}^2 - 2r_{12}r_{13}r_{23}}{1 - r_{12}^2}}$
Or
$R_{3.12} = R_{3.21} = \sqrt{1 - \dfrac{\omega}{\omega_{33}}}$
where
$\omega = \begin{vmatrix} 1 & r_{12} & r_{13} \\ r_{21} & 1 & r_{23} \\ r_{31} & r_{32} & 1 \end{vmatrix} = 1 - r_{12}^2 - r_{13}^2 - r_{23}^2 + 2r_{12}r_{13}r_{23}$,
$\omega_{11} = 1 - r_{23}^2$, $\omega_{22} = 1 - r_{13}^2$, $\omega_{33} = 1 - r_{12}^2$, and $r_{12} = r_{21}$, $r_{23} = r_{32}$, $r_{13} = r_{31}$.
$R_{1.23}^2$, $R_{2.13}^2$ and $R_{3.12}^2$ are known as coefficients of multiple determination.
Partial Correlation

The correlation b/w two variables, keeping the effects of all other variables constant, is called
partial correlation, for example $r_{12.3}$, $r_{13.2}$, $r_{23.1}$.
(1). $r_{12.3} = r_{21.3} = \dfrac{r_{12} - r_{13}r_{23}}{\sqrt{1 - r_{13}^2}\ \sqrt{1 - r_{23}^2}}$
Or
$r_{12.3} = r_{21.3} = \pm\sqrt{b_{12.3}\, b_{21.3}}$
(2). $r_{13.2} = r_{31.2} = \dfrac{r_{13} - r_{12}r_{32}}{\sqrt{1 - r_{12}^2}\ \sqrt{1 - r_{32}^2}}$
Or
$r_{13.2} = r_{31.2} = \pm\sqrt{b_{13.2}\, b_{31.2}}$
(3). $r_{23.1} = r_{32.1} = \dfrac{r_{23} - r_{21}r_{31}}{\sqrt{1 - r_{21}^2}\ \sqrt{1 - r_{31}^2}}$
Or
$r_{23.1} = r_{32.1} = \pm\sqrt{b_{23.1}\, b_{32.1}}$
where $r_{12} = r_{21}$, $r_{23} = r_{32}$, $r_{13} = r_{31}$.
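A minimal Python sketch (not part of the original notes) of formula (1) for the partial correlation; the three pairwise correlations are made-up values:

from math import sqrt

def partial_r(r12, r13, r23):
    # r_12.3 = (r12 - r13*r23) / (sqrt(1 - r13^2) * sqrt(1 - r23^2))
    return (r12 - r13 * r23) / (sqrt(1 - r13 ** 2) * sqrt(1 - r23 ** 2))

# Made-up pairwise correlations between X1, X2, X3
r12, r13, r23 = 0.8, 0.6, 0.7
print(partial_r(r12, r13, r23))   # correlation of X1 and X2 with X3 held constant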

Chapter 7 ANALYSIS OF TIME SERIES


Time Series
An arrangement of data by successive time periods is called a time series. Examples are the
total monthly sales receipts in a departmental store, the annual yield of a crop in a country

for a number of years, hourly temperatures recorded at a locality for a period of years, the
weekly prices of wheat in Lahore, the monthly consumption of electricity in a certain town,
the monthly total of passengers carried by rail, the quarterly sales of a certain fertilizer,
the annual rainfall at Karachi for a number of years, the enrolment of students in a college
or university over a number of years, and so forth.

Signal and Noise


Signal: The systematic component of variation in time series is called signal.
Noise: An irregular or random component of variation in the time series is called noise.

Analysis of Time Series


The analysis of a time series consists of the description, measurement and isolation of the
various components present in the series; this analysis helps economists, businessmen,
planners, etc.
The value of the time series (Y) is the product of the effects of four components: trend (T),
cyclical (C), seasonal (S) and irregular (I) movements, i.e. $Y = T \times C \times S \times I$.
But some statisticians consider that the components of a time series follow an additive law:
$Y = T + C + S + I$

Components (Movements) of a Time Series


A typical time series has four types of movements usually called components for a time
series.
a. Secular Trend (T)
b. Seasonal Movements or Seasonal Variation (S)
c. Cyclical Movements or Cyclical Variation or Cyclical Fluctuation (C)
d. Irregular, Accidental or Random Movements (I)
a. Secular Trend (T)
These movements refer to long term variation which shows any tendencies of growth
or decline over a long period approximately 30 to 40 years. These are smooth, steady
and regular in nature for example, a continually increasing for more food due to
population increase, a decline in death rate due to advance in science.
b. Seasonal Movements (S)
These movements refer to short-term variations which generally occur due to seasonal
effects within a period of one year. Climatic conditions including rainfall, heat and wind
directly effect the time series for example, the demand of Woolen cloths increases in
winter, the sale of shoes increase before EID, the price of wheat which fall after
harvesting season and rise before the sowing time, the sales of soft drinks which are
high in the summer and low in the winter, investments in Savings Certificates which
are high in the months of May and June and low in other months, and so forth. The
concept of seasonal variation is customarily broadened to include the more or less
regular fluctuations of shorter duration occurring within a day, a week, a month, a
quarter and so forth. Examples of such variations are the daily variations in
temperature or the monthly variations in Bank deposits.
c. Cyclical Movements or Cyclical Variation or Cyclical Fluctuation (C)
Statistical data in a number of cases show up-and-down movements periodically: there
are swings from prosperity through recession, depression and recovery and back to
prosperity. These are known as the four phases of a business cycle (depression, revival or
recovery, prosperity or boom, contraction or recession) and are a very important example of
cyclical movements. These changes are repeated at intervals ranging from 7 to 10 years.
d. Irregular, Accidental or Random Movements (I)
These movements are irregular and unsystematic in nature and happen as a result of
abnormal events such as floods, earthquakes, wars and strikes etc. For example, Prices
rise during war time, the production of industries goes down due to labour strikes.

Analysis the Secular Trend (T)


There are four methods to measure the secular trend.
1) The Freehand Curve Method

2) The Method of Semi-Averages


3) The Method of Moving Averages
4) The Method of Least Squares
1) The Freehand Curve Method
In this method the data are plotted on a graph, measuring the time units (years, months,
etc.) along the X-axis and the values of the time series variable along the Y-axis. A trend line
or smooth curve is drawn through the graph in such a way that it shows the general
tendency of the values. The trend values for different years (or months) are read from
the trend line or curve.
2) The Method of Semi-Averages
The freehand curve method, as we have seen, depends too much on personal
judgment and gives subjective results. Another simple method for measuring the secular
trend is the method of semi-averages. In this method the data are divided into two equal
parts (if the number of values is odd, either the middle value is left out or the series is
divided unequally). The averages for each part are computed and placed against the
centre of each part. The averages are plotted and joined by a line, and the line is extended
to cover the whole data. Trend values corresponding to different time periods can be
read from this trend line.
3) The Method of Moving Averages
We have seen that the Freehand Curve Method is subjective because it is based too
much on individual judgment, and the Method of Semi Averages is appropriate only when
the trend is linear. Another simple method, which can also be used to eliminate seasonal,
cyclical and irregular movements, is the method of moving averages. In this method,
we find simple averages successively, taking a specific number of values at a time.
For example, to find a 3-year moving average, we find the average of the
first three values, then drop the first value and include the fourth value, and so on. The process
is continued until all the values in the series are exhausted. Each average so
obtained is placed at the middle of the group for which it is calculated.
When we find the moving average taking an even number of values, the middle of the
group lies between two years. In order to make the average coincide with a particular
year, we centre the averages by further calculating a 2-year moving average of the
even-order moving averages. The averages so obtained are called centred moving
averages.
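To make the procedure concrete, here is a minimal Python sketch (not part of the original notes) that computes 3-year and centred 4-year moving averages for a short, hypothetical series of yearly values.

```python
# Minimal sketch: simple and centred moving averages for a short time series.
# The yearly values below are hypothetical, chosen only to illustrate the method.

values = [12, 15, 14, 18, 20, 19, 23, 25]  # one observation per year

def moving_average(data, k):
    """Return k-period moving averages, each placed at the middle of its group."""
    return [sum(data[i:i + k]) / k for i in range(len(data) - k + 1)]

ma3 = moving_average(values, 3)          # 3-year moving averages
ma4 = moving_average(values, 4)          # 4-year moving averages (fall between years)
ma4_centred = moving_average(ma4, 2)     # 2-period average of the 4-year averages

print("3-year moving averages:", [round(v, 2) for v in ma3])
print("Centred 4-year moving averages:", [round(v, 2) for v in ma4_centred])
```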
4) The Method of Least Squares
The principle of least squares states that the sum of squares of the deviations of the
observed values from the corresponding estimated values should be the least. In this
method a straight line Y = a + bX, a second degree parabola Y = a + bX + cX², or a
third degree parabola Y = a + bX + cX² + dX³ is fitted to the observed time series by
the method of least squares.
a. Linear Trend Line
Y = a + bX
Normal Equations:
ΣY = na + bΣX
ΣXY = aΣX + bΣX²
We get the values of a and b by solving the above equations simultaneously.
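A minimal sketch, assuming a short hypothetical yearly series, of how the two normal equations above can be solved for a and b in Python:

```python
# Minimal sketch of fitting the linear trend Y = a + bX by least squares,
# solving the two normal equations given above. The data are hypothetical.

years = [2016, 2017, 2018, 2019, 2020]
Y = [10.0, 12.0, 13.0, 15.0, 18.0]

# Code the time variable so that X is 0, 1, 2, ... (any coding works).
X = [t - years[0] for t in years]

n = len(Y)
sum_X, sum_Y = sum(X), sum(Y)
sum_XY = sum(x * y for x, y in zip(X, Y))
sum_X2 = sum(x * x for x in X)

# Solve  ΣY = na + bΣX  and  ΣXY = aΣX + bΣX²  simultaneously.
b = (n * sum_XY - sum_X * sum_Y) / (n * sum_X2 - sum_X ** 2)
a = (sum_Y - b * sum_X) / n

trend = [a + b * x for x in X]
print(f"a = {a:.3f}, b = {b:.3f}")
print("Trend values:", [round(t, 2) for t in trend])
```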


b. Second Degree Parabola / Second Degree Trend Line
Y = a + bX + cX²
Normal Equations:
ΣY = na + bΣX + cΣX²
ΣXY = aΣX + bΣX² + cΣX³
ΣX²Y = aΣX² + bΣX³ + cΣX⁴
We get the values of a, b and c by solving the above equations simultaneously.


c. Third Degree Parabola / Third Degree Trend Line
Y = a + bX + cX² + dX³
Normal Equations:
ΣY = na + bΣX + cΣX² + dΣX³
ΣXY = aΣX + bΣX² + cΣX³ + dΣX⁴
ΣX²Y = aΣX² + bΣX³ + cΣX⁴ + dΣX⁵
ΣX³Y = aΣX³ + bΣX⁴ + cΣX⁵ + dΣX⁶
We get the values of a, b, c and d by solving the above equations simultaneously.

Chapter 8 SAMPLING AND SAMPLING DISTRIBUTION
Population
A group of all possible elements or objects is called a population, for example, the human
population or the total number of students in a college. The number of elements in the
population is called the size of the population. It is denoted by N.

Finite Population
A population is said to be finite if it consists of a finite or fixed number of elements, for
example, all university students in Pakistan, or the weights of all students enrolled at Punjab
University.

Infinite Population
A population is said to be infinite if there is no limit to the number of elements, for example,
all heights between 2 and 3 meters.

Existent Population
A population which consists of concrete objects is called an existent population.

Hypothetical Population
A population which does not consist of concrete objects or items is called a hypothetical
population.

Sample
A representative small part of a population is called a sample. The number of elements
included in the sample is called the sample size. It is denoted by n.

Sampling
The technique of selecting a representative sample is called sampling. Sampling is broadly
divided into two classes:
a) Probability or Random Sampling
b) Non-probability or Non-random Sampling
a) Probability or Random Sampling
A technique of sampling in which every sampling unit is selected entirely at random, so
that every sampling unit has the same chance of selection in the sample and probability
is involved in the selection of the sampling units, is called probability sampling.
Some important probability sampling methods are:
1. Simple random sampling
2. Stratified sampling
3. Systematic sampling
4. Cluster sampling
5. Multistage and Multiphase sampling
b) Non-probability or Non-random Sampling
In non-probability sampling, the selection of the elements is not based on probability
theory; instead, personal judgment plays a significant role in the selection of the sample.
Examples of non-probability sampling are:
1. Judgment or Purposive Sampling
2. Quota Sampling

Sampling With Replacement (W.R)
Sampling is said to be with replacement if the selected unit is returned to the population
before selecting the next unit; thus a sampling unit can be selected more than once.
Sampling with replacement works just like a prize bond scheme.
The number of possible samples of size n from a population of size N using this
technique is Nⁿ. If we have a population containing 6 elements and wish to draw all
possible samples of size 2 with replacement, the number of possible samples will be
6² = 36.

Sampling Without Replacement (W.O.R)
Sampling is said to be without replacement if the selected unit is not returned to the
population before selecting the next unit; thus a sampling unit can never be selected more
than once. Sampling without replacement works just like a committee system.
The number of possible samples of size n from a population of size N is obtained by
using the following formula:
No. of possible samples = NCn = N! / [(N − n)! n!]
If, for example, we have N = 5 and n = 2, the number of possible samples will be
5C2 = 5! / [(5 − 2)! 2!] = 10.
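The two counting rules can be checked directly by enumeration. The sketch below (not from the notes) uses a small hypothetical population of N = 5 values and n = 2:

```python
# Minimal sketch: count all possible samples of size n drawn with and without
# replacement from a small hypothetical population, and compare the counts
# with the formulas N**n and NCn = N!/((N-n)! n!).
from itertools import product, combinations
from math import comb

population = [2, 4, 6, 8, 10]   # N = 5 hypothetical values
N, n = len(population), 2

with_replacement = list(product(population, repeat=n))      # ordered draws, a unit may repeat
without_replacement = list(combinations(population, n))     # unordered, no repeats

print(len(with_replacement), "==", N ** n)         # 25 == 25
print(len(without_replacement), "==", comb(N, n))  # 10 == 10
```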

Parameter
Numerical values computed from a population are called parameters. These are
fixed numbers. A parameter is usually denoted by a Greek or capital letter, for example, the
population mean μ and the population standard deviation σ.

Statistic
Numerical values computed from a sample are called statistics. A statistic varies from
sample to sample drawn from the same population. It is denoted by a Roman or small letter,
for example, the sample mean x̄ and the sample standard deviation S.

Sampling Units
The basic elements or objects which we select for a sample are called sampling units. For
example, if we want to measure the average height of college students, the individual
students are the sampling units.

Sampling Frame

The complete list of all possible sampling units is called a frame.

Census
Complete enumeration of all the units (similar and dissimilar) of the population is termed a
census.

Sample Survey
In a sample survey, enumeration is limited to only a part, or a sample, selected from the
population.

Preference for a Sample Survey over a Complete Survey

We prefer a sample survey to a complete survey due to:
1) Reduced cost
2) Greater speed in presenting the results
3) Greater scope of inquiry
4) Greater accuracy

Sampling Error
The difference between the statistic and the parameter, which arises because only a sample
rather than the whole population is observed, is called sampling error. It can be reduced by
increasing the sample size to a sufficient level.
Sampling Error = x̄ − μ
where x̄ = sample mean and μ = population mean.

Non-Sampling Error
Non-sampling errors are those errors that arise due to a defective sampling frame or to
information not being provided correctly. For example, income, sales, production and age
are often not quoted correctly.

Sampling Bias
Bias is a cumulative component of error which arises due to defective selection of the
sample or negligence of the investigator. Errors due to bias increase with an increase in
the size of sample.

Standard Error
The standard deviation of the sampling distribution of a statistic is called its standard error
(abbreviated S.E). For the sample mean,
S.E(x̄) = σ / √n

Sampling Distribution
The frequency distribution of a statistic computed from all possible samples is called a
sampling distribution, for example, the sampling distribution of the sample mean or the
sampling distribution of the sample variance.

Simple Random Sampling
A technique of sampling in which every sampling unit is selected at random from a
homogeneous population, so that every sampling unit has an equal chance of selection in
the sample and every part of the population has similar characteristics, is called simple
random sampling. It is carried out, for example, by a random number table or the lottery
method.

Stratified Sampling
When a population contains highly variable material, simple random sampling fails to give
accurate results. In this case the population is heterogeneous, and it is divided into
homogeneous subgroups called strata. A sample is then selected separately from each
stratum at random, and the parts are combined into a single sample. This method is called
stratified random sampling.

Systematic Sampling
Systematic sampling is a method of selecting a sample that calls for taking every Kth
element in the population. The first unit in the sample is selected at random from the first
K units of the population, and then every Kth unit is included in the sample.

Cluster Sampling
Cluster sampling is a method of selecting a sample in which the population is divided into
natural groups, such as households, agricultural farms, etc., which are called clusters;
taking these clusters as sampling units, a sample of clusters is drawn at random.

Quota Sampling
Quota sampling is a method of selecting a sample of convenience with certain controls, to
avoid some of the more serious biases involved in taking those units that are most
conveniently available. In this method quotas are set up, for example, by specifying the
number of interviews from urban and rural areas, males and females, etc.

Sampling Distribution
The frequency distribution of a statistic computed from all possible samples is called a
sampling distribution, for example, the sampling distribution of the sample mean or the
sampling distribution of the sample variance. The notation used below is:
Population Size = N;  Population values = X
Sample Size = n;  Sample values = x
Population Mean = μ = ΣX / N
Population Variance = σ² = Σ(X − μ)² / N
Population Standard Deviation = σ
Population Proportion = P = X / N, where X here represents the number of even, odd or otherwise specified units in the population
Sample Mean = x̄ = Σx / n
Sample Proportion = p̂ = x / n
Biased Sample Variance = S² = Σ(x − x̄)² / n
Biased Sample Standard Deviation = S = √[Σ(x − x̄)² / n]
Unbiased Sample Variance = s² = Σ(x − x̄)² / (n − 1)
Unbiased Sample Standard Deviation = s = √[Σ(x − x̄)² / (n − 1)]
Number of samples drawn with replacement = Nⁿ
Number of samples drawn without replacement = NCn = N! / [(N − n)! n!]

Sampling Distribution of the Mean (x̄)
1) Mean of the sampling distribution of x̄:
   μ_x̄ = E(x̄) = Σ x̄ f(x̄)
2) Variance of the sampling distribution of x̄:
   σ²_x̄ = E(x̄²) − [E(x̄)]² = Σ x̄² f(x̄) − [Σ x̄ f(x̄)]²
3) S.E(x̄) = standard deviation of the sampling distribution of x̄:
   σ_x̄ = √{E(x̄²) − [E(x̄)]²} = √{Σ x̄² f(x̄) − [Σ x̄ f(x̄)]²}
4) Population Mean = μ = ΣX / N
5) Population Variance = σ² = Σ(X − μ)² / N
6) Population Standard Deviation = σ

Verification for With Replacement (W.R)
a. μ_x̄ = μ
b. σ²_x̄ = σ² / n
c. σ_x̄ = S.E(x̄) = σ / √n

Verification for Without Replacement (W.O.R)
a. μ_x̄ = μ
b. σ²_x̄ = (σ² / n) · (N − n)/(N − 1)
c. σ_x̄ = S.E(x̄) = (σ / √n) · √[(N − n)/(N − 1)]
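The with-replacement identities (a) to (c) can be checked by brute force. The following sketch (not part of the notes) enumerates every possible sample of size n = 2 drawn with replacement from a small hypothetical population and compares the mean and variance of the sample means with μ and σ²/n:

```python
# Minimal sketch: verify mu_xbar = mu and var_xbar = sigma^2/n for sampling
# with replacement, by enumerating all possible samples of size n = 2
# from a small hypothetical population.
from itertools import product

population = [2, 4, 6, 8]          # hypothetical values, N = 4
N, n = len(population), 2

mu = sum(population) / N
sigma2 = sum((x - mu) ** 2 for x in population) / N

# All N**n equally likely samples drawn with replacement, and their means.
sample_means = [sum(s) / n for s in product(population, repeat=n)]

mu_xbar = sum(sample_means) / len(sample_means)
var_xbar = sum((m - mu_xbar) ** 2 for m in sample_means) / len(sample_means)

print(mu_xbar, "==", mu)            # 5.0 == 5.0
print(var_xbar, "==", sigma2 / n)   # 2.5 == 2.5
```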
Sampling Distribution of the Difference between Two Means (x̄₁ − x̄₂)

1) Mean of the sampling distribution of x̄₁ − x̄₂:
   μ_{x̄₁−x̄₂} = E(x̄₁ − x̄₂) = Σ (x̄₁ − x̄₂) f(x̄₁ − x̄₂)
2) Variance of the sampling distribution of x̄₁ − x̄₂:
   σ²_{x̄₁−x̄₂} = E[(x̄₁ − x̄₂)²] − [E(x̄₁ − x̄₂)]² = Σ (x̄₁ − x̄₂)² f(x̄₁ − x̄₂) − [Σ (x̄₁ − x̄₂) f(x̄₁ − x̄₂)]²
3) S.E(x̄₁ − x̄₂) = standard deviation of the sampling distribution of x̄₁ − x̄₂:
   σ_{x̄₁−x̄₂} = √{E[(x̄₁ − x̄₂)²] − [E(x̄₁ − x̄₂)]²}
4) Population Mean: μ₁ = ΣX₁ / N₁
5) Population Mean: μ₂ = ΣX₂ / N₂
6) Population Variance: σ₁² = Σ(X₁ − μ₁)² / N₁
7) Population Variance: σ₂² = Σ(X₂ − μ₂)² / N₂
8) Population Standard Deviation: σ₁
9) Population Standard Deviation: σ₂

Verification for With Replacement (W.R)
a. μ_{x̄₁−x̄₂} = μ₁ − μ₂
b. σ²_{x̄₁−x̄₂} = σ₁²/n₁ + σ₂²/n₂
c. σ_{x̄₁−x̄₂} = S.E(x̄₁ − x̄₂) = √(σ₁²/n₁ + σ₂²/n₂)

Verification for Without Replacement (W.O.R)
a. μ_{x̄₁−x̄₂} = μ₁ − μ₂
b. σ²_{x̄₁−x̄₂} = (σ₁²/n₁)·(N₁ − n₁)/(N₁ − 1) + (σ₂²/n₂)·(N₂ − n₂)/(N₂ − 1)
c. σ_{x̄₁−x̄₂} = S.E(x̄₁ − x̄₂) = √[(σ₁²/n₁)·(N₁ − n₁)/(N₁ − 1) + (σ₂²/n₂)·(N₂ − n₂)/(N₂ − 1)]
Sampling Distribution of the Sample Proportion (p̂)

1) Mean of the sampling distribution of p̂:
   μ_p̂ = E(p̂) = Σ p̂ f(p̂)
2) Variance of the sampling distribution of p̂:
   σ²_p̂ = E(p̂²) − [E(p̂)]² = Σ p̂² f(p̂) − [Σ p̂ f(p̂)]²
3) S.E(p̂) = standard deviation of the sampling distribution of p̂:
   σ_p̂ = √{E(p̂²) − [E(p̂)]²}
4) Population Mean = P = X / N, where X represents the number of even, odd or otherwise specified units in the population.

Verification for With Replacement (W.R)
a. μ_p̂ = P
b. σ²_p̂ = Pq / n, where q = 1 − P
c. σ_p̂ = S.E(p̂) = √(Pq / n)

Verification for Without Replacement (W.O.R)
a. μ_p̂ = P
b. σ²_p̂ = (Pq / n) · (N − n)/(N − 1)
c. σ_p̂ = S.E(p̂) = √[(Pq / n) · (N − n)/(N − 1)]

Sampling Distribution of the Difference between Two Proportions (p̂₁ − p̂₂)
1) Mean of the sampling distribution of p̂₁ − p̂₂:

   μ_{p̂₁−p̂₂} = E(p̂₁ − p̂₂) = Σ (p̂₁ − p̂₂) f(p̂₁ − p̂₂)
2) Variance of the sampling distribution of p̂₁ − p̂₂:
   σ²_{p̂₁−p̂₂} = E[(p̂₁ − p̂₂)²] − [E(p̂₁ − p̂₂)]² = Σ (p̂₁ − p̂₂)² f(p̂₁ − p̂₂) − [Σ (p̂₁ − p̂₂) f(p̂₁ − p̂₂)]²
3) S.E(p̂₁ − p̂₂) = standard deviation of the sampling distribution of p̂₁ − p̂₂:
   σ_{p̂₁−p̂₂} = √{E[(p̂₁ − p̂₂)²] − [E(p̂₁ − p̂₂)]²}
4) Population Mean: μ_p̂₁ = P₁ = X₁ / N₁, where X₁ represents the number of even, odd or otherwise specified units in the first population.
5) Population Mean: μ_p̂₂ = P₂ = X₂ / N₂, where X₂ represents the number of even, odd or otherwise specified units in the second population.

Verification for With Replacement (W.R)
a. μ_{p̂₁−p̂₂} = P₁ − P₂
b. σ²_{p̂₁−p̂₂} = P₁q₁/n₁ + P₂q₂/n₂, where q₁ = 1 − P₁ and q₂ = 1 − P₂
c. σ_{p̂₁−p̂₂} = S.E(p̂₁ − p̂₂) = √(P₁q₁/n₁ + P₂q₂/n₂)

Verification for Without Replacement (W.O.R)
a. μ_{p̂₁−p̂₂} = P₁ − P₂
b. σ²_{p̂₁−p̂₂} = (P₁q₁/n₁)·(N₁ − n₁)/(N₁ − 1) + (P₂q₂/n₂)·(N₂ − n₂)/(N₂ − 1)
c. σ_{p̂₁−p̂₂} = S.E(p̂₁ − p̂₂) = √[(P₁q₁/n₁)·(N₁ − n₁)/(N₁ − 1) + (P₂q₂/n₂)·(N₂ − n₂)/(N₂ − 1)]

Sampling Distribution of the Biased Variance (S²)

1) Mean of the sampling distribution of S²:
   μ_{S²} = E(S²) = Σ S² f(S²)
2) Variance of the sampling distribution of S²:
   σ²_{S²} = E[(S²)²] − [E(S²)]² = Σ (S²)² f(S²) − [Σ S² f(S²)]²
3) S.E(S²) = standard deviation of the sampling distribution of S²:
   σ_{S²} = √{E[(S²)²] − [E(S²)]²}
4) Population Mean = μ = ΣX / N
5) Population Variance = σ² = Σ(X − μ)² / N
6) Population Standard Deviation = σ

Verification
μ_{S²} = E(S²) = σ² − σ²/n = ((n − 1)/n) σ²

Sampling Distribution of the Unbiased Variance (s²)
1) Mean of the sampling distribution of s²:
   μ_{s²} = E(s²) = Σ s² f(s²)
2) Variance of the sampling distribution of s²:
   σ²_{s²} = E[(s²)²] − [E(s²)]² = Σ (s²)² f(s²) − [Σ s² f(s²)]²
3) S.E(s²) = standard deviation of the sampling distribution of s²:
   σ_{s²} = √{E[(s²)²] − [E(s²)]²}
4) Population Mean = μ = ΣX / N
5) Population Variance = σ² = Σ(X − μ)² / N
6) Population Standard Deviation = σ

Verification
μ_{s²} = E(s²) = σ²

Chapter 9 Estimation
Confidence Interval for the Population Mean, With Replacement (Z-Test)
When the population standard deviation (σ) is known:
P[ X̄ − Z_{α/2}·σ/√n ≤ μ ≤ X̄ + Z_{α/2}·σ/√n ] = 1 − α
or, equivalently, X̄ ± Z_{α/2}·σ/√n.
When the population standard deviation (σ) is unknown and n > 30:
P[ X̄ − Z_{α/2}·S/√n ≤ μ ≤ X̄ + Z_{α/2}·S/√n ] = 1 − α
or, equivalently, X̄ ± Z_{α/2}·S/√n.
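As an illustration (the summary figures are hypothetical, not from the notes), a 95% confidence interval for μ in the unknown-σ, n > 30 case can be computed as:

```python
# Minimal sketch: 95% confidence interval for the population mean when sigma
# is unknown and n > 30 (with replacement), using hypothetical summary figures.
from math import sqrt
from statistics import NormalDist

x_bar, S, n = 52.0, 4.5, 64        # hypothetical sample mean, S.D. and size
alpha = 0.05
z = NormalDist().inv_cdf(1 - alpha / 2)   # Z_{alpha/2} is about 1.96

half_width = z * S / sqrt(n)
lower, upper = x_bar - half_width, x_bar + half_width
print(f"{(1 - alpha):.0%} CI for mu: ({lower:.2f}, {upper:.2f})")
```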
Confidence Interval for the Population Mean, Without Replacement (Z-Test)
When the population standard deviation (σ) is known:
P[ X̄ − Z_{α/2}·(σ/√n)·√((N − n)/(N − 1)) ≤ μ ≤ X̄ + Z_{α/2}·(σ/√n)·√((N − n)/(N − 1)) ] = 1 − α
or, equivalently, X̄ ± Z_{α/2}·(σ/√n)·√((N − n)/(N − 1)).
When the population standard deviation (σ) is unknown:
P[ X̄ − Z_{α/2}·(S/√n)·√((N − n)/(N − 1)) ≤ μ ≤ X̄ + Z_{α/2}·(S/√n)·√((N − n)/(N − 1)) ] = 1 − α
or, equivalently, X̄ ± Z_{α/2}·(S/√n)·√((N − n)/(N − 1)).

Confidence Interval for the Difference between Population Means (μ₁ − μ₂), With Replacement (Z-Test)
When the population standard deviations (σ₁, σ₂) are known:
P[ (X̄₁ − X̄₂) − Z_{α/2}·√(σ₁²/n₁ + σ₂²/n₂) ≤ μ₁ − μ₂ ≤ (X̄₁ − X̄₂) + Z_{α/2}·√(σ₁²/n₁ + σ₂²/n₂) ] = 1 − α
or, equivalently, (X̄₁ − X̄₂) ± Z_{α/2}·√(σ₁²/n₁ + σ₂²/n₂).
When the population standard deviations are unknown and n₁, n₂ > 30:
P[ (X̄₁ − X̄₂) − Z_{α/2}·√(S₁²/n₁ + S₂²/n₂) ≤ μ₁ − μ₂ ≤ (X̄₁ − X̄₂) + Z_{α/2}·√(S₁²/n₁ + S₂²/n₂) ] = 1 − α
or, equivalently, (X̄₁ − X̄₂) ± Z_{α/2}·√(S₁²/n₁ + S₂²/n₂).

Confidence Interval for the Difference between Population Means (μ₂ − μ₁), With Replacement (Z-Test)
When the population standard deviations (σ₁, σ₂) are known:
P[ (X̄₂ − X̄₁) − Z_{α/2}·√(σ₁²/n₁ + σ₂²/n₂) ≤ μ₂ − μ₁ ≤ (X̄₂ − X̄₁) + Z_{α/2}·√(σ₁²/n₁ + σ₂²/n₂) ] = 1 − α
or, equivalently, (X̄₂ − X̄₁) ± Z_{α/2}·√(σ₁²/n₁ + σ₂²/n₂).
When the population standard deviations are unknown and n₁, n₂ > 30:
P[ (X̄₂ − X̄₁) − Z_{α/2}·√(S₁²/n₁ + S₂²/n₂) ≤ μ₂ − μ₁ ≤ (X̄₂ − X̄₁) + Z_{α/2}·√(S₁²/n₁ + S₂²/n₂) ] = 1 − α
or, equivalently, (X̄₂ − X̄₁) ± Z_{α/2}·√(S₁²/n₁ + S₂²/n₂).

Confidence Interval for the Difference between Population Means (μ₁ − μ₂), Without Replacement (Z-Test)
When the population standard deviations (σ₁, σ₂) are known:
P[ (X̄₁ − X̄₂) − Z_{α/2}·√((σ₁²/n₁)·(N₁ − n₁)/(N₁ − 1) + (σ₂²/n₂)·(N₂ − n₂)/(N₂ − 1)) ≤ μ₁ − μ₂ ≤ (X̄₁ − X̄₂) + Z_{α/2}·√((σ₁²/n₁)·(N₁ − n₁)/(N₁ − 1) + (σ₂²/n₂)·(N₂ − n₂)/(N₂ − 1)) ] = 1 − α
or, equivalently, (X̄₁ − X̄₂) ± Z_{α/2}·√((σ₁²/n₁)·(N₁ − n₁)/(N₁ − 1) + (σ₂²/n₂)·(N₂ − n₂)/(N₂ − 1)).
When the population standard deviations are unknown and n₁, n₂ > 30, σ₁² and σ₂² are replaced by S₁² and S₂²:
(X̄₁ − X̄₂) ± Z_{α/2}·√((S₁²/n₁)·(N₁ − n₁)/(N₁ − 1) + (S₂²/n₂)·(N₂ − n₂)/(N₂ − 1)), with confidence coefficient 1 − α.

Confidence Interval for the Difference between Population Means (μ₂ − μ₁), Without Replacement (Z-Test)
When the population standard deviations (σ₁, σ₂) are known:
P[ (X̄₂ − X̄₁) − Z_{α/2}·√((σ₁²/n₁)·(N₁ − n₁)/(N₁ − 1) + (σ₂²/n₂)·(N₂ − n₂)/(N₂ − 1)) ≤ μ₂ − μ₁ ≤ (X̄₂ − X̄₁) + Z_{α/2}·√((σ₁²/n₁)·(N₁ − n₁)/(N₁ − 1) + (σ₂²/n₂)·(N₂ − n₂)/(N₂ − 1)) ] = 1 − α
or, equivalently, (X̄₂ − X̄₁) ± Z_{α/2}·√((σ₁²/n₁)·(N₁ − n₁)/(N₁ − 1) + (σ₂²/n₂)·(N₂ − n₂)/(N₂ − 1)).
When the population standard deviations are unknown and n₁, n₂ > 30, σ₁² and σ₂² are replaced by S₁² and S₂²:
(X̄₂ − X̄₁) ± Z_{α/2}·√((S₁²/n₁)·(N₁ − n₁)/(N₁ − 1) + (S₂²/n₂)·(N₂ − n₂)/(N₂ − 1)), with confidence coefficient 1 − α.

Confidence Interval for the Population Proportion (Z-Test)
P[ p̂ − Z_{α/2}·√(p̂(1 − p̂)/n) ≤ P ≤ p̂ + Z_{α/2}·√(p̂(1 − p̂)/n) ] = 1 − α
or, equivalently, p̂ ± Z_{α/2}·√(p̂(1 − p̂)/n).
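A minimal sketch of the proportion interval, assuming hypothetical counts of x successes in n trials:

```python
# Minimal sketch: 95% confidence interval for a population proportion,
# using hypothetical counts (x successes out of n trials).
from math import sqrt
from statistics import NormalDist

x, n = 120, 400                     # hypothetical sample counts
p_hat = x / n
alpha = 0.05
z = NormalDist().inv_cdf(1 - alpha / 2)

se = sqrt(p_hat * (1 - p_hat) / n)
print(f"CI for P: ({p_hat - z * se:.4f}, {p_hat + z * se:.4f})")
```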
Confidence Interval for the Difference between Two Population Proportions (P₁ − P₂) (Z-Test)
P[ (p̂₁ − p̂₂) − Z_{α/2}·√(p̂₁(1 − p̂₁)/n₁ + p̂₂(1 − p̂₂)/n₂) ≤ P₁ − P₂ ≤ (p̂₁ − p̂₂) + Z_{α/2}·√(p̂₁(1 − p̂₁)/n₁ + p̂₂(1 − p̂₂)/n₂) ] = 1 − α
or, equivalently, (p̂₁ − p̂₂) ± Z_{α/2}·√(p̂₁(1 − p̂₁)/n₁ + p̂₂(1 − p̂₂)/n₂).

Confidence Interval for the Difference between Two Population Proportions (P₂ − P₁) (Z-Test)
P[ (p̂₂ − p̂₁) − Z_{α/2}·√(p̂₁(1 − p̂₁)/n₁ + p̂₂(1 − p̂₂)/n₂) ≤ P₂ − P₁ ≤ (p̂₂ − p̂₁) + Z_{α/2}·√(p̂₁(1 − p̂₁)/n₁ + p̂₂(1 − p̂₂)/n₂) ] = 1 − α
or, equivalently, (p̂₂ − p̂₁) ± Z_{α/2}·√(p̂₁(1 − p̂₁)/n₁ + p̂₂(1 − p̂₂)/n₂).

Confidence Interval Estimate for the Population Correlation Coefficient (Z-Test)
P[ Z_f − Z_{α/2}·(1/√(n − 3)) ≤ z_ρ ≤ Z_f + Z_{α/2}·(1/√(n − 3)) ] = 1 − α
or, equivalently, Z_f ± Z_{α/2}·(1/√(n − 3)),
where Z_f = 1.1513 log[(1 + r)/(1 − r)] and z_ρ is the corresponding transform of the population correlation coefficient ρ.
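A minimal sketch of this interval using the Fisher transformation; r and n are hypothetical, and the limits are transformed back to the correlation scale at the end:

```python
# Minimal sketch: confidence interval for a population correlation coefficient
# via the Fisher transformation, using a hypothetical r and n.
from math import sqrt, tanh, atanh
from statistics import NormalDist

r, n = 0.60, 28                     # hypothetical sample correlation and size
alpha = 0.05
z = NormalDist().inv_cdf(1 - alpha / 2)

zf = atanh(r)                       # same value as 1.1513 * log10((1 + r) / (1 - r))
half = z / sqrt(n - 3)
lo, hi = zf - half, zf + half
# Back-transform the limits to the correlation scale.
print(f"CI for rho: ({tanh(lo):.3f}, {tanh(hi):.3f})")
```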

Chapter 10 Hypothesis Testing


Testing of Hypotheses concerning the Population Mean (Z-Test)
1. Null & Alternative Hypotheses
Null H0: μ = μ0;  Alternative H1: μ ≠ μ0 (or H1: μ > μ0, or H1: μ < μ0).
2. Significance Level
α = 5% or 1% (i.e., 0.05 or 0.01). If the significance level is not given, we take 5% by default.
3. Critical Region (C.R)
Two-tail test (table area ½ − α/2 gives Z_{α/2}): if H1: μ ≠ μ0, C.R: Z ≥ Z_{α/2} or Z ≤ −Z_{α/2}.
One-tail test (table area 0.5 − α gives Z_α): if H1: μ > μ0, C.R: Z ≥ Z_α; if H1: μ < μ0, C.R: Z ≤ −Z_α.
4. Test Statistic
When the population S.D (σ) is known:  Z = (X̄ − μ0) / (σ/√n)
When the population S.D (σ) is unknown and n > 30:  Z = (X̄ − μ0) / (S/√n)
5. Conclusion
If z-cal is greater than or equal to z-tab, H0 is rejected; if z-cal is less than z-tab, H0 is accepted.
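The five steps can be mirrored directly in code. The sketch below runs a two-tail Z test of H0: μ = 50 with hypothetical sample figures (σ unknown, n > 30):

```python
# Minimal sketch of the five-step Z test for a population mean, with
# hypothetical figures: H0: mu = 50 against H1: mu != 50.
from math import sqrt
from statistics import NormalDist

mu0, x_bar, S, n = 50.0, 52.1, 6.0, 64   # hypothetical values, sigma unknown, n > 30
alpha = 0.05

z_tab = NormalDist().inv_cdf(1 - alpha / 2)   # two-tail critical value
z_cal = (x_bar - mu0) / (S / sqrt(n))

print(f"z-cal = {z_cal:.3f}, z-tab = +/-{z_tab:.3f}")
print("Reject H0" if abs(z_cal) >= z_tab else "Accept H0")
```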
Testing of Hypotheses concerning the Difference between Two Population Means (X̄₁ − X̄₂) (Z-Test)

1. Null & Alternative Hypotheses
Null H0: μ₁ − μ₂ = Δ₀;  Alternative H1: μ₁ − μ₂ ≠ Δ₀ (or H1: μ₁ − μ₂ > Δ₀, or H1: μ₁ − μ₂ < Δ₀),
where Δ₀ denotes the hypothesized value of μ₁ − μ₂.
2. Significance Level
α = 5% or 1% (i.e., 0.05 or 0.01). If the significance level is not given, we take 5% by default.
3. Critical Region (C.R)
Two-tail test (table area ½ − α/2 gives Z_{α/2}): if H1: μ₁ − μ₂ ≠ Δ₀, C.R: Z ≥ Z_{α/2} or Z ≤ −Z_{α/2}.
One-tail test (table area 0.5 − α gives Z_α): if H1: μ₁ − μ₂ > Δ₀, C.R: Z ≥ Z_α; if H1: μ₁ − μ₂ < Δ₀, C.R: Z ≤ −Z_α.
4. Test Statistic
When the population S.Ds (σ₁, σ₂) are known:  Z = [(X̄₁ − X̄₂) − Δ₀] / √(σ₁²/n₁ + σ₂²/n₂)
When the population S.Ds are unknown and n₁, n₂ > 30:  Z = [(X̄₁ − X̄₂) − Δ₀] / √(S₁²/n₁ + S₂²/n₂)
If Δ₀ is not given in the question, we take Δ₀ = 0.
5. Conclusion
If z-cal is greater than or equal to z-tab, H0 is rejected; if z-cal is less than z-tab, H0 is accepted.
Testing of Hypotheses concerning the Difference between Two Population Means (X̄₂ − X̄₁) (Z-Test)

1. Null & Alternative Hypotheses
Null H0: μ₂ − μ₁ = Δ₀;  Alternative H1: μ₂ − μ₁ ≠ Δ₀ (or H1: μ₂ − μ₁ > Δ₀, or H1: μ₂ − μ₁ < Δ₀),
where Δ₀ denotes the hypothesized value of μ₂ − μ₁.
2. Significance Level
α = 5% or 1% (i.e., 0.05 or 0.01). If the significance level is not given, we take 5% by default.
3. Critical Region (C.R)
Two-tail test (table area ½ − α/2 gives Z_{α/2}): if H1: μ₂ − μ₁ ≠ Δ₀, C.R: Z ≥ Z_{α/2} or Z ≤ −Z_{α/2}.
One-tail test (table area 0.5 − α gives Z_α): if H1: μ₂ − μ₁ > Δ₀, C.R: Z ≥ Z_α; if H1: μ₂ − μ₁ < Δ₀, C.R: Z ≤ −Z_α.
4. Test Statistic
When the population S.Ds (σ₁, σ₂) are known:  Z = [(X̄₂ − X̄₁) − Δ₀] / √(σ₁²/n₁ + σ₂²/n₂)
When the population S.Ds are unknown and n₁, n₂ > 30:  Z = [(X̄₂ − X̄₁) − Δ₀] / √(S₁²/n₁ + S₂²/n₂)
If Δ₀ is not given in the question, we take Δ₀ = 0.
5. Conclusion
If z-cal is greater than or equal to z-tab, H0 is rejected; if z-cal is less than z-tab, H0 is accepted.

Testing of Hypotheses concerning the Population Proportion (Z-Test)


1. Null & Alternative Hypotheses
Null H0: P = P0;  Alternative H1: P ≠ P0 (or H1: P > P0, or H1: P < P0).
2. Significance Level
α = 5% or 1% (i.e., 0.05 or 0.01). If the significance level is not given, we take 5% by default.
3. Critical Region (C.R)
Two-tail test (table area ½ − α/2 gives Z_{α/2}): if H1: P ≠ P0, C.R: Z ≥ Z_{α/2} or Z ≤ −Z_{α/2}.
One-tail test (table area 0.5 − α gives Z_α): if H1: P > P0, C.R: Z ≥ Z_α; if H1: P < P0, C.R: Z ≤ −Z_α.
4. Test Statistic
Z = (p̂ − P0) / √(P0(1 − P0)/n) = (p̂ − P0) / √(P0 q0 / n),
where p̂ = X/n and q0 = 1 − P0.
5. Conclusion
If z-cal is greater than or equal to z-tab, H0 is rejected; if z-cal is less than z-tab, H0 is accepted.
Testing of Hypotheses concerning the Difference between Two Population Proportions (P₁ − P₂) (Z-Test)

1. Null & Alternative Hypotheses
Null H0: P₁ − P₂ = Δ₀;  Alternative H1: P₁ − P₂ ≠ Δ₀ (or H1: P₁ − P₂ > Δ₀, or H1: P₁ − P₂ < Δ₀),
where Δ₀ denotes the hypothesized value of P₁ − P₂.
2. Significance Level
α = 5% or 1% (i.e., 0.05 or 0.01). If the significance level is not given, we take 5% by default.
3. Critical Region (C.R)
Two-tail test (table area ½ − α/2 gives Z_{α/2}): if H1: P₁ − P₂ ≠ Δ₀, C.R: Z ≥ Z_{α/2} or Z ≤ −Z_{α/2}.
One-tail test (table area 0.5 − α gives Z_α): if H1: P₁ − P₂ > Δ₀, C.R: Z ≥ Z_α; if H1: P₁ − P₂ < Δ₀, C.R: Z ≤ −Z_α.
4. Test Statistic
Z = [(p̂₁ − p̂₂) − Δ₀] / √(p̂₁q̂₁/n₁ + p̂₂q̂₂/n₂),
where p̂₁ = X₁/n₁, p̂₂ = X₂/n₂, q̂₁ = 1 − p̂₁ and q̂₂ = 1 − p̂₂;
or, using the pooled proportion,
Z = [(p̂₁ − p̂₂) − Δ₀] / √[ p̂c q̂c (1/n₁ + 1/n₂) ],
where p̂c = (n₁p̂₁ + n₂p̂₂)/(n₁ + n₂) and q̂c = 1 − p̂c.
If Δ₀ is not given in the question, we take Δ₀ = 0.
5. Conclusion
If z-cal is greater than or equal to z-tab, H0 is rejected; if z-cal is less than z-tab, H0 is accepted.
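A minimal sketch of the pooled-proportion version of this test, with hypothetical counts and Δ₀ = 0:

```python
# Minimal sketch of the two-proportion Z test with a pooled proportion,
# using hypothetical counts; H0: P1 - P2 = 0 against H1: P1 - P2 != 0.
from math import sqrt
from statistics import NormalDist

x1, n1 = 45, 150          # hypothetical successes / sample size, group 1
x2, n2 = 30, 140          # hypothetical successes / sample size, group 2
alpha = 0.05

p1, p2 = x1 / n1, x2 / n2
pc = (n1 * p1 + n2 * p2) / (n1 + n2)      # pooled proportion
qc = 1 - pc

z_cal = (p1 - p2) / sqrt(pc * qc * (1 / n1 + 1 / n2))
z_tab = NormalDist().inv_cdf(1 - alpha / 2)

print(f"z-cal = {z_cal:.3f}, z-tab = +/-{z_tab:.3f}")
print("Reject H0" if abs(z_cal) >= z_tab else "Accept H0")
```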
Testing of Hypotheses concerning the Difference between Two Population Proportions (P₂ − P₁) (Z-Test)

1. Null & Alternative Hypotheses
Null H0: P₂ − P₁ = Δ₀;  Alternative H1: P₂ − P₁ ≠ Δ₀ (or H1: P₂ − P₁ > Δ₀, or H1: P₂ − P₁ < Δ₀),
where Δ₀ denotes the hypothesized value of P₂ − P₁.
2. Significance Level
α = 5% or 1% (i.e., 0.05 or 0.01). If the significance level is not given, we take 5% by default.
3. Critical Region (C.R)
Two-tail test (table area ½ − α/2 gives Z_{α/2}): if H1: P₂ − P₁ ≠ Δ₀, C.R: Z ≥ Z_{α/2} or Z ≤ −Z_{α/2}.
One-tail test (table area 0.5 − α gives Z_α): if H1: P₂ − P₁ > Δ₀, C.R: Z ≥ Z_α; if H1: P₂ − P₁ < Δ₀, C.R: Z ≤ −Z_α.
4. Test Statistic
Z = [(p̂₂ − p̂₁) − Δ₀] / √(p̂₁q̂₁/n₁ + p̂₂q̂₂/n₂),
where p̂₁ = X₁/n₁, p̂₂ = X₂/n₂, q̂₁ = 1 − p̂₁ and q̂₂ = 1 − p̂₂;
or, using the pooled proportion,
Z = [(p̂₂ − p̂₁) − Δ₀] / √[ p̂c q̂c (1/n₁ + 1/n₂) ],
where p̂c = (n₁p̂₁ + n₂p̂₂)/(n₁ + n₂) and q̂c = 1 − p̂c.
If Δ₀ is not given in the question, we take Δ₀ = 0.
5. Conclusion
If z-cal is greater than or equal to z-tab, H0 is rejected; if z-cal is less than z-tab, H0 is accepted.
Testing of Hypotheses concerning the Population Correlation Coefficient ρ (Rho), when ρ0 = 0 or ρ0 ≠ 0 (Z-Test)

1. Null & Alternative Hypotheses
Null H0: ρ = ρ0;  Alternative H1: ρ ≠ ρ0 (or H1: ρ > ρ0, or H1: ρ < ρ0).
2. Significance Level
α = 5% or 1% (i.e., 0.05 or 0.01). If the significance level is not given, we take 5% by default.
3. Critical Region (C.R)
Two-tail test (table area ½ − α/2 gives Z_{α/2}): if H1: ρ ≠ ρ0, C.R: Z ≥ Z_{α/2} or Z ≤ −Z_{α/2}.
One-tail test (table area 0.5 − α gives Z_α): if H1: ρ > ρ0, C.R: Z ≥ Z_α; if H1: ρ < ρ0, C.R: Z ≤ −Z_α.
4. Test Statistic
Z = (Z_f − z_ρ) / (1/√(n − 3)),
where Z_f = 1.1513 log[(1 + r)/(1 − r)] and z_ρ = 1.1513 log[(1 + ρ0)/(1 − ρ0)].
5. Conclusion
If z-cal is greater than or equal to z-tab, H0 is rejected; if z-cal is less than z-tab, H0 is accepted.
Testing of Hypotheses concerning the Difference between Two Population Correlation Coefficients (ρ₁ − ρ₂, based on r₁ and r₂) (Z-Test)

1. Null & Alternative Hypotheses
Null H0: ρ₁ − ρ₂ = Δ₀;  Alternative H1: ρ₁ − ρ₂ ≠ Δ₀ (or H1: ρ₁ − ρ₂ > Δ₀, or H1: ρ₁ − ρ₂ < Δ₀),
where Δ₀ denotes the hypothesized value of ρ₁ − ρ₂.
2. Significance Level
α = 5% or 1% (i.e., 0.05 or 0.01). If the significance level is not given, we take 5% by default.
3. Critical Region (C.R)
Two-tail test (table area ½ − α/2 gives Z_{α/2}): if H1: ρ₁ − ρ₂ ≠ Δ₀, C.R: Z ≥ Z_{α/2} or Z ≤ −Z_{α/2}.
One-tail test (table area 0.5 − α gives Z_α): if H1: ρ₁ − ρ₂ > Δ₀, C.R: Z ≥ Z_α; if H1: ρ₁ − ρ₂ < Δ₀, C.R: Z ≤ −Z_α.
4. Test Statistic
Z = [(Z_f1 − Z_f2) − Δ₀] / σ_{Z1−Z2},
where Z_f1 = 1.1513 log[(1 + r₁)/(1 − r₁)], Z_f2 = 1.1513 log[(1 + r₂)/(1 − r₂)],
and σ_{Z1−Z2} = √[1/(n₁ − 3) + 1/(n₂ − 3)].
If Δ₀ is not given in the question, we take Δ₀ = 0.
5. Conclusion
If z-cal is greater than or equal to z-tab, H0 is rejected; if z-cal is less than z-tab, H0 is accepted.
Testing of Hypotheses concerning the Difference between Two Population Correlation Coefficients (ρ₂ − ρ₁, based on r₂ and r₁) (Z-Test)

1. Null & Alternative Hypotheses
Null H0: ρ₂ − ρ₁ = Δ₀;  Alternative H1: ρ₂ − ρ₁ ≠ Δ₀ (or H1: ρ₂ − ρ₁ > Δ₀, or H1: ρ₂ − ρ₁ < Δ₀),
where Δ₀ denotes the hypothesized value of ρ₂ − ρ₁.
2. Significance Level
α = 5% or 1% (i.e., 0.05 or 0.01). If the significance level is not given, we take 5% by default.
3. Critical Region (C.R)
Two-tail test (table area ½ − α/2 gives Z_{α/2}): if H1: ρ₂ − ρ₁ ≠ Δ₀, C.R: Z ≥ Z_{α/2} or Z ≤ −Z_{α/2}.
One-tail test (table area 0.5 − α gives Z_α): if H1: ρ₂ − ρ₁ > Δ₀, C.R: Z ≥ Z_α; if H1: ρ₂ − ρ₁ < Δ₀, C.R: Z ≤ −Z_α.
4. Test Statistic
Z = [(Z_f2 − Z_f1) − Δ₀] / σ_{Z2−Z1},
where Z_f1 = 1.1513 log[(1 + r₁)/(1 − r₁)], Z_f2 = 1.1513 log[(1 + r₂)/(1 − r₂)],
and σ_{Z2−Z1} = √[1/(n₁ − 3) + 1/(n₂ − 3)].
If Δ₀ is not given in the question, we take Δ₀ = 0.
5. Conclusion
If z-cal is greater than or equal to z-tab, H0 is rejected; if z-cal is less than z-tab, H0 is accepted.

Chapter 11 Analysis of Variance

Analysis of Variance: One-Way Classification
ANOVA Table
Source of Variation (S.O.V) | Degrees of Freedom (d.f) | Sum of Squares (S.O.S) | Mean Square (M.S) | F-Distribution
Treatment (Column / Sample) | k − 1 | Treatment S.S = Σ(Tj²/nj) − C.F | Treatment M.S = s₁² = Treatment S.S / (k − 1) | F = s₁² / s²
Error | n − k | E.S.S = T.S.S − Treatment S.S | Error M.S = s² = E.S.S / (n − k) |
Total | n − 1 | T.S.S = ΣΣ Xij² − C.F | |
Here Tj is the total of the jth treatment, nj is the number of observations in that treatment, and C.F = (Grand Total)²/n is the correction factor.

Conclusion
If F-cal is greater than or equal to F-tab, H0 is rejected; if F-cal is less than F-tab, H0 is accepted.
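A minimal sketch of the one-way calculation above, applied to three small hypothetical treatment groups:

```python
# Minimal sketch of a one-way ANOVA computed from the formulas above,
# using three small hypothetical treatment groups.
groups = [
    [8, 10, 9, 11],     # treatment 1 (hypothetical data)
    [12, 14, 13, 15],   # treatment 2
    [9, 8, 10, 9],      # treatment 3
]

all_obs = [x for g in groups for x in g]
n, k = len(all_obs), len(groups)

CF = sum(all_obs) ** 2 / n                              # correction factor
TSS = sum(x * x for x in all_obs) - CF                  # total sum of squares
TrSS = sum(sum(g) ** 2 / len(g) for g in groups) - CF   # treatment sum of squares
ESS = TSS - TrSS                                        # error sum of squares

s1_sq = TrSS / (k - 1)      # treatment mean square
s_sq = ESS / (n - k)        # error mean square
F = s1_sq / s_sq

print(f"Treatment S.S = {TrSS:.2f}, Error S.S = {ESS:.2f}, F = {F:.2f}")
```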

Analysis of Variance: Two-Way Classification or Two-Factor Experiment
1. Null & Alternative Hypotheses
Rows:    Null H0: B̄1 = B̄2 = B̄3 = ... = B̄r (all row means are equal);  Alternative H1: at least two row means are not equal.
Columns: Null H0′: Ā1 = Ā2 = Ā3 = ... = Āk (all column means are equal);  Alternative H1′: at least two column means are not equal.
2. Significance Level
α = 5% or 1% (i.e., 0.05 or 0.01). If the significance level is not given, we take 5% by default.
3. Critical Region (C.R)
Rows:    C.R: F ≥ F(α; V1, V2), where V1 = r − 1 and V2 = (r − 1)(k − 1) are the degrees of freedom.
Columns: C.R: F ≥ F(α; V1, V2), where V1 = k − 1 and V2 = (r − 1)(k − 1) are the degrees of freedom.
4. Test Statistics
The observations are arranged in a two-way table with r rows (B1, B2, ..., Br) and k columns (A1, A2, ..., Ak); Xij is the observation in the ith row and jth column, Ti is the total of the ith row, Tj is the total of the jth column, and T0 = ΣΣ Xij is the grand total. (Columns for Ti² and Σj Xij² may be added to ease computation.)

Row / Column | A1   A2   A3   ...  Ak  | Ti
B1           | X11  X12  X13  ...  X1k | T1
B2           | X21  X22  X23  ...  X2k | T2
...          | ...  ...  ...  ...  ... | ...
Br           | Xr1  Xr2  Xr3  ...  Xrk | Tr
Tj           | T.1  T.2  T.3  ...  T.k | T0 = Grand Total

I.   Correction Factor: C.F = T0² / (rk)
II.  Total Sum of Squares: T.S.S = ΣΣ Xij² − C.F
III. Column Sum of Squares: C.S.S = Σ(Tj²/r) − C.F
IV.  Row Sum of Squares: R.S.S = Σ(Ti²/k) − C.F
V.   Error Sum of Squares: E.S.S = T.S.S − Column S.S − Row S.S

ANOVA Table
Source of Variation (S.O.V) | Degrees of Freedom (d.f) | Sum of Squares (S.O.S) | Mean Square (M.S) | F-Distribution
Column | k − 1 | Column S.S = Σ(Tj²/r) − C.F | Column M.S = s₁² = Column S.S / (k − 1) | F_Column = s₁² / s²
Row | r − 1 | Row S.S = Σ(Ti²/k) − C.F | Row M.S = s₂² = Row S.S / (r − 1) | F_Row = s₂² / s²
Error | (r − 1)(k − 1) | E.S.S = T.S.S − Column S.S − Row S.S | Error M.S = s² = E.S.S / [(r − 1)(k − 1)] |
Total | n − 1 | T.S.S = ΣΣ Xij² − C.F | |

5. Conclusion
Rows:    If F_Row-cal is greater than or equal to F_Row-tab, H0 is rejected; if F_Row-cal is less than F_Row-tab, H0 is accepted.
Columns: If F_Column-cal is greater than or equal to F_Column-tab, H0′ is rejected; if F_Column-cal is less than F_Column-tab, H0′ is accepted.
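A minimal sketch of the two-way calculation (one observation per cell), applied to a small hypothetical r × k table:

```python
# Minimal sketch of a two-way (two-factor, single observation per cell) ANOVA
# computed from the formulas above, using a small hypothetical r x k table.
table = [
    [10, 12, 11],   # row B1 (hypothetical observations under columns A1..A3)
    [14, 15, 13],   # row B2
    [ 9, 11, 10],   # row B3
    [12, 14, 15],   # row B4
]
r, k = len(table), len(table[0])
n = r * k

T0 = sum(sum(row) for row in table)                 # grand total
CF = T0 ** 2 / (r * k)                              # correction factor
TSS = sum(x * x for row in table for x in row) - CF
RSS = sum(sum(row) ** 2 for row in table) / k - CF                              # row S.S
CSS = sum(sum(table[i][j] for i in range(r)) ** 2 for j in range(k)) / r - CF   # column S.S
ESS = TSS - CSS - RSS

s1_sq = CSS / (k - 1)                 # column mean square
s2_sq = RSS / (r - 1)                 # row mean square
s_sq = ESS / ((r - 1) * (k - 1))      # error mean square

print(f"F_Column = {s1_sq / s_sq:.2f}, F_Row = {s2_sq / s_sq:.2f}")
```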
