
STATISTICS TERMS & FORMULAS

INTRODUCTION TO STATISTICS



Parameters            Population    Sample
Variable              $X$           $x$
Size                  $N$           $n$
Mean                  $\mu$         $\bar{X}$
Standard deviation    $\sigma$      $s$
Variance              $\sigma^2$    $s^2$


POPULATION
The collection of all the elements under statistical investigation about which we are trying to draw conclusions.

SAMPLE
A small representative portion of the population under study.

SAMPLING
A valid statistical procedure for drawing a sample from a population.

PARAMETER
A descriptive measure of some characteristic of the population.

STATISTIC
A descriptive measure computed from a sample.

FREQUENCY DISTRIBUTION
A tabular summary showing the frequencies of observations in each of several non-overlapping classes.

INDIVIDUAL SERIES or UNGROUPED DATA
Scattered or raw data.

DISCRETE DISTRIBUTION
Raw data grouped with frequencies.

CONTINUOUS DISTRIBUTION
Raw data grouped with frequencies and class intervals.

RANGE
The difference between the values of the highest and the smallest elements in the data.
$= \text{Maximum value} - \text{Minimum value}$

NUMBER OF CLASSES
$= 1 + 3.322 \log_{10} N$, where $N$ is the number of observations


WIDTH OF A CLASS
It is determined by dividing the range of the distribution by the desired number of class intervals.
$= \dfrac{\text{Range}}{\text{Number of class intervals}}$


CLASS MIDPOINT
It is the value halfway between the lower and the upper class limits.
$= \dfrac{\text{Minimum class value} + \text{Maximum class value}}{2}$
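
The three rules above (number of classes, class width, class midpoint) can be sketched in Python. The function names and the `marks` data are illustrative assumptions, and rounding the number of classes up to a whole number is also an assumption, since the sheet does not say how to round.

```python
import math

def sturges_classes(n_obs):
    """Number of classes from 1 + 3.322 * log10(N); rounding up is an assumption."""
    return math.ceil(1 + 3.322 * math.log10(n_obs))

def class_width(data, n_classes):
    """Range of the data divided by the number of class intervals."""
    return (max(data) - min(data)) / n_classes

def class_midpoint(lower, upper):
    """Value halfway between the lower and upper class limits."""
    return (lower + upper) / 2

marks = [12, 25, 31, 38, 44, 47, 52, 58, 63, 90]  # made-up observations
k = sturges_classes(len(marks))
width = class_width(marks, k)
print(k, width, class_midpoint(10, 20))
```

In practice the computed width is usually rounded to a convenient number before the class limits are laid out.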


RELATIVE FREQUENCY
It is the proportion of the total frequency that falls in any given class interval of a frequency distribution.

CUMULATIVE FREQUENCY
It is the total number of observations with values less than or equal to the upper limit of a given class interval. Expressed as a proportion of all observations, it is the cumulative relative frequency.

BAR CHART
A graphical device used to depict data that have been summarized as frequency, relative frequency or percentage frequency.

PIE CHART
A circular representation of data in which a circle is divided into sectors with areas proportional to the corresponding components. These sectors are called slices and represent the percentage breakdown of the corresponding components.

HISTOGRAM
A set of rectangles, each proportional in width to the range of the values within a class and proportional in height to the class frequency of the respective class interval.

FREQUENCY POLYGON
A graphical device for understanding the shape of the distribution. It can also be constructed by connecting the midpoints of the tops of the individual bars of a histogram.

STEM & LEAF PLOT
It is constructed by separating the digits of each number into two groups, one as a stem and the other as a leaf. After separating the data, the leftmost digit is termed the stem and is the higher-value digit. The rightmost digit is termed the leaf and is the lower-value digit.













MEASURES OF CENTRAL TENDENCY

CENTRAL TENDENCY
The tendency of the observations to concentrate around a central point

MEASURES OF CENTRAL TENDENCY
Statistical measures which indicate the location or position of a central value to describe the central tendency of the entire data

TYPES OF CENTRAL TENDENCY
1) Mathematical averages
a. Arithmetic mean (simple / weighted)
b. Geometric mean
c. Harmonic mean
2) Positional averages
a. Median
b. Mode
c. Quartiles
d. Deciles
e. Percentiles

MATHEMATICAL AVERAGES
ARITHMETIC MEAN - Simple
The arithmetic mean of a set of observations is their sum divided by the number of observations
Individual series: $\bar{X} = \dfrac{\sum x}{n}$

Discrete distribution: $\bar{X} = \dfrac{\sum f x}{\sum f}$

Continuous distribution: $\bar{X} = \dfrac{\sum f x}{\sum f}$
Where $x$ is the midpoint of each class interval and $f$ is the class frequency

Shortcut for continuous distribution: $\bar{X} = A + \dfrac{\sum f d}{\sum f} \times i$
Where $A$ = assumed mean,
$i$ = class interval (width),
$d = (x - A)/i$ = deviation from the assumed mean, measured in class intervals
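
A minimal Python sketch comparing the direct and shortcut formulas for a continuous distribution. The function names and the frequency table are illustrative assumptions, and the deviation `d` is taken as `(x - A) / i`, which is what the multiplication by `i` in the shortcut formula implies.

```python
def grouped_mean(midpoints, freqs):
    """Direct method: sum(f * x) / sum(f)."""
    return sum(f * x for x, f in zip(midpoints, freqs)) / sum(freqs)

def grouped_mean_shortcut(midpoints, freqs, assumed_mean, width):
    """Shortcut method: A + (sum(f * d) / sum(f)) * i, with d = (x - A) / i."""
    d = [(x - assumed_mean) / width for x in midpoints]
    return assumed_mean + sum(f * di for f, di in zip(freqs, d)) / sum(freqs) * width

# Classes 0-10, 10-20, ..., 40-50 with made-up frequencies
midpoints = [5, 15, 25, 35, 45]
freqs = [2, 3, 5, 3, 2]
print(grouped_mean(midpoints, freqs))
print(grouped_mean_shortcut(midpoints, freqs, assumed_mean=15, width=10))
```

Both calls give the same mean whichever class midpoint is picked as the assumed mean; the shortcut only makes hand calculation lighter.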

ARITHMETIC MEAN - Weighted
It enables us to calculate an average that takes into account the importance of each value to the overall total
$\bar{X} = \dfrac{\sum w x}{\sum w}$

COMPOSITE MEAN (average of two means)
The combination of two or more arithmetic means
$= \dfrac{n_1 \bar{X}_1 + n_2 \bar{X}_2}{n_1 + n_2}$
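
The weighted and composite means can be sketched as follows. The function names and the numbers are illustrative assumptions, not taken from the sheet.

```python
def weighted_mean(values, weights):
    """sum(w * x) / sum(w)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

def composite_mean(n1, mean1, n2, mean2):
    """(n1 * mean1 + n2 * mean2) / (n1 + n2)."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)

# An exam score of 80 with weight 3 and a quiz score of 90 with weight 1
print(weighted_mean([80, 90], [3, 1]))
# Two groups: 10 items averaging 50 and 30 items averaging 70
print(composite_mean(10, 50, 30, 70))
```

The composite mean is just a weighted mean of the group means, with the group sizes as weights.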


GEOMETRIC MEAN
It is commonly used in the calculation of average rates of growth. Compared with the arithmetic mean, it gives more weight to small items and less weight to large items
$G = \sqrt[n]{x_1 \cdot x_2 \cdot x_3 \cdots x_n}$

Individual series: $G = \operatorname{antilog}\left[\dfrac{1}{n} \sum \log x\right]$

Used to calculate growth rate

Discrete / continuous distribution: $\log G = \dfrac{1}{N} \sum f \log x$, where $N = \sum f$
For positive data the GM never exceeds the AM; the two give the same result only when all values are equal

Calculate the principal of the nth year
$P_n = P_0 (1 + r)^n$
$\log P_n = \log P_0 + n \log(1 + r)$

$P_n$ = principal of the nth year
$P_0$ = original principal
$r$ = rate of interest
$n$ = nth year

Combined geometric mean: $GM = \operatorname{antilog}\left[\dfrac{n_1 \log GM_1 + n_2 \log GM_2}{n_1 + n_2}\right]$
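
A short Python sketch of the geometric mean and the compound-growth formula. The function names and figures are illustrative assumptions; natural logarithms are used here instead of base-10 log tables, which gives the same result because the antilog undoes whichever base is chosen.

```python
import math

def geometric_mean(values):
    """antilog of the mean of the logs; all values must be positive."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def principal_after(p0, rate, years):
    """P_n = P_0 * (1 + r) ** n."""
    return p0 * (1 + rate) ** years

def average_growth_rate(p0, pn, years):
    """Solve P_n = P_0 * (1 + r) ** n for r."""
    return (pn / p0) ** (1 / years) - 1

print(geometric_mean([2, 8]))
print(principal_after(1000, 0.10, 2))
print(average_growth_rate(1000, 1210, 2))
```

The last call recovers the per-period rate from the starting and ending principal, which is the growth-rate use the sheet mentions.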

HARMONIC MEAN
It is the reciprocal of the arithmetic mean of the reciprocals of the variate. It is specifically used in the computation of average speed, average prices, average profits etc. under various conditions.
Individual series: $\dfrac{1}{HM} = \dfrac{1}{n} \sum \dfrac{1}{x}$, so $HM = \dfrac{n}{\sum \frac{1}{x}}$

Discrete / continuous distribution: $HM = \dfrac{\sum f}{\sum \left(f \times \frac{1}{x}\right)}$
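
A minimal sketch of the harmonic mean applied to average speed. The function names and the trip data are illustrative assumptions; the grouped version assumes each value `x` carries a frequency `f`.

```python
def harmonic_mean(values):
    """n / sum(1 / x); all values must be non-zero."""
    return len(values) / sum(1 / v for v in values)

def harmonic_mean_grouped(values, freqs):
    """sum(f) / sum(f / x) for a frequency distribution."""
    return sum(freqs) / sum(f / v for v, f in zip(values, freqs))

# The same distance driven once at 40 km/h and once at 60 km/h
print(harmonic_mean([40, 60]))
print(harmonic_mean_grouped([40, 60], [1, 1]))
```

The result is lower than the arithmetic mean of 50 km/h because more time is spent driving at the slower speed.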


POSITIONAL AVERAGES
MEDIAN
It is defined as the middle or central value of the variable when the values are arranged in order of magnitude
Individual series: $\left(\dfrac{n+1}{2}\right)$th term, where $n$ = number of observations

Discrete distribution: $\left(\dfrac{n+1}{2}\right)$th term, where $n$ = total frequency, located using the cumulative frequencies

Continuous distribution: $L + \dfrac{i}{f} \times \left[\dfrac{N}{2} - CF\right]$

Where $L$ = lower limit of the median class,
$i$ = class interval,
$f$ = frequency of the median class,
$N$ = total frequency,
$CF$ = cumulative frequency of the class preceding the median class
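
The continuous-distribution median can be sketched as below, assuming equal-width classes described by their lower limits. The function name and the frequency table are illustrative assumptions.

```python
def grouped_median(lower_limits, width, freqs):
    """L + (i / f) * (N / 2 - CF) for equal-width classes."""
    total = sum(freqs)
    if total <= 0:
        raise ValueError("frequencies must sum to a positive number")
    half = total / 2
    cumulative = 0  # CF: cumulative frequency of the classes before the current one
    for lower, f in zip(lower_limits, freqs):
        # The median class is the first class whose cumulative frequency reaches N / 2
        if f > 0 and cumulative + f >= half:
            return lower + (width / f) * (half - cumulative)
        cumulative += f
    raise ValueError("median class not found")

# Classes 0-10, 10-20, ..., 40-50 with made-up frequencies
print(grouped_median([0, 10, 20, 30, 40], 10, [2, 3, 5, 3, 2]))
```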

MODE
It is the variate having the maximum frequency in a data series. A distribution that has a single mode is called a unimodal distribution, and one with two modes is called a bimodal distribution
$= L + i \times \left[\dfrac{F_1 - F_0}{2F_1 - F_0 - F_2}\right]$

Where $L$ = lower limit of the modal class,
$i$ = class interval,
$F_1$ = frequency of the modal class,
$F_0$ = frequency of the class preceding the modal class,
$F_2$ = frequency of the class succeeding the modal class
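
A minimal sketch of the grouped-data mode formula for equal-width classes. The function name and data are illustrative assumptions, and treating a missing neighbouring class as frequency 0 is also an assumption, since the sheet does not cover a modal class at either end.

```python
def grouped_mode(lower_limits, width, freqs):
    """L + i * (F1 - F0) / (2 * F1 - F0 - F2)."""
    m = max(range(len(freqs)), key=lambda k: freqs[k])  # first class with the top frequency
    f1 = freqs[m]
    f0 = freqs[m - 1] if m > 0 else 0               # class preceding the modal class
    f2 = freqs[m + 1] if m < len(freqs) - 1 else 0  # class succeeding the modal class
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("mode is not defined when neighbouring frequencies equal F1")
    return lower_limits[m] + width * (f1 - f0) / denominator

# Classes 0-10, 10-20, ..., 40-50 with made-up frequencies; modal class is 20-30
print(grouped_mode([0, 10, 20, 30, 40], 10, [2, 4, 7, 3, 1]))
```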

QUARTILES
When the observations in a data series are arranged in an ordered sequence, quartiles divide the data into 4 equal parts
Ungrouped data: $K \left(\dfrac{n+1}{4}\right)$th term
Where $K$ = index of the quartile,
$n$ = total number of observations (total frequency),
$K = 1$ for $Q_1$, $K = 2$ for $Q_2$, $K = 3$ for $Q_3$ and $K = 4$ for $Q_4$

Grouped data: $L + \dfrac{i}{f} \times \left[\dfrac{KN}{4} - C\right]$, where $C$ is the total (cumulative) frequency below the quartile class

DECILES
When the observations in a data series are arranged in an ordered sequence, deciles divide the data into 10 equal parts
Ungrouped data: $K \left(\dfrac{n+1}{10}\right)$th term

Grouped data: $L + \dfrac{i}{f} \times \left[\dfrac{KN}{10} - C\right]$

PERCENTILES
When the observations in a data series are arranged in an ordered sequence, percentiles divide the data into 100 equal parts
Ungrouped data: $K \left(\dfrac{n+1}{100}\right)$th term

Grouped data: $L + \dfrac{i}{f} \times \left[\dfrac{KN}{100} - C\right]$
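
The ungrouped-data rule is the same for quartiles, deciles and percentiles, with only the number of parts changing. The sketch below assumes linear interpolation when the position is fractional and clamps the position to the data range; both choices, the function names and the data are assumptions rather than something the sheet specifies.

```python
def quantile_position(k, n, parts):
    """1-based position K * (n + 1) / parts (parts = 4, 10 or 100)."""
    return k * (n + 1) / parts

def quantile_value(data, k, parts):
    """Value at the K-th quantile position, interpolating between neighbours."""
    xs = sorted(data)
    pos = quantile_position(k, len(xs), parts)
    pos = min(max(pos, 1), len(xs))  # keep the position inside the data
    lower = int(pos)                 # 1-based index of the term at or below pos
    fraction = pos - lower
    if lower >= len(xs):
        return xs[-1]
    return xs[lower - 1] + fraction * (xs[lower] - xs[lower - 1])

data = [2, 4, 6, 8, 10, 12, 14]  # made-up observations
print(quantile_value(data, 1, 4), quantile_value(data, 2, 4), quantile_value(data, 3, 4))
print(quantile_value(data, 5, 10), quantile_value(data, 50, 100))
```

The second quartile, the fifth decile and the fiftieth percentile all land on the same position, which is the median.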




MEASURES OF DISPERSION

DISPERSION
The degree to which numerical data tends to spread around an average value is called variation or dispersion of data.

METHODS OF MEASURING DISPERSION
i. Range
ii. Interquartile range and quartile deviation
iii. Mean deviation or average deviation
iv. Standard deviation

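Coefficient of range $= \dfrac{\text{Max value} - \text{Min value}}{\text{Max value} + \text{Min value}} = \dfrac{L - S}{L + S}$, where $L$ and $S$ are the largest and smallest values

Quartile deviation $= \dfrac{Q_3 - Q_1}{2}$

Coefficient of quartile deviation $= \dfrac{Q_3 - Q_1}{Q_3 + Q_1}$

Absolute deviation $= |X - \bar{X}|$

Mean absolute deviation $= \dfrac{\sum |X - \bar{X}|}{n}$

Continuous frequency distribution

Mean absolute deviation $= \dfrac{\sum f\,|X - \bar{X}|}{\sum f}$

Coefficient of mean absolute deviation $= \dfrac{\text{Mean absolute deviation}}{\text{Mean}}$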


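Variance
Ungrouped data: $\sigma^2 = \dfrac{\sum (X - \bar{X})^2}{n} = \dfrac{\sum x^2}{n} - (\bar{X})^2$ (shortcut method)

Grouped data: $\sigma^2 = \dfrac{\sum f (X - \bar{X})^2}{\sum f}$

Grouped data (shortcut method): $\sigma^2 = \left[\dfrac{\sum f d^2}{\sum f} - \left(\dfrac{\sum f d}{\sum f}\right)^2\right] \times i^2$
Where $d$ = deviation from the assumed mean measured in class intervals, $i$ = class interval

STANDARD DEVIATION
Ungrouped data: $\sigma = \sqrt{\dfrac{\sum (X - \bar{X})^2}{n}} = \sqrt{\dfrac{\sum x^2}{n} - (\bar{X})^2}$

$\sigma = \sqrt{\dfrac{\sum (X - \mu)^2}{N}}$ for a population

Grouped data: $\sigma = \sqrt{\dfrac{\sum f (X - \bar{X})^2}{\sum f}}$

Grouped data (shortcut method): $\sigma = \sqrt{\dfrac{\sum f d^2}{\sum f} - \left(\dfrac{\sum f d}{\sum f}\right)^2} \times i$

Coefficient of variation $= 100 \times \dfrac{\sigma}{\bar{X}}$, where $\bar{X}$ is the mean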
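
A small Python sketch of the ungrouped variance (definition and shortcut), the standard deviation and the coefficient of variation. The function names and data are illustrative assumptions; the divisor is `n`, matching the formulas in this section.

```python
def mean(data):
    return sum(data) / len(data)

def variance(data):
    """Definition: sum((x - mean) ** 2) / n."""
    m = mean(data)
    return sum((x - m) ** 2 for x in data) / len(data)

def variance_shortcut(data):
    """Shortcut: sum(x ** 2) / n - mean ** 2."""
    return sum(x * x for x in data) / len(data) - mean(data) ** 2

def standard_deviation(data):
    return variance(data) ** 0.5

def coefficient_of_variation(data):
    """100 * standard deviation / mean."""
    return 100 * standard_deviation(data) / mean(data)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up observations
print(variance(data), variance_shortcut(data))
print(standard_deviation(data), coefficient_of_variation(data))
```

The shortcut form avoids computing each deviation but can lose precision in floating point when the mean is large relative to the spread.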

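Combined standard deviation $\sigma_{12} = \sqrt{\dfrac{n_1 \sigma_1^2 + n_2 \sigma_2^2 + n_1 d_1^2 + n_2 d_2^2}{n_1 + n_2}}$

Where $\bar{X}_{12} = \dfrac{n_1 \bar{X}_1 + n_2 \bar{X}_2}{n_1 + n_2}$,
$d_1 = \bar{X}_{12} - \bar{X}_1$,
$d_2 = \bar{X}_{12} - \bar{X}_2$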
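
The combined standard deviation can be checked against the standard deviation of the pooled raw data. The sketch below uses two made-up groups; the function names are illustrative assumptions and the divisor-`n` standard deviation is assumed throughout.

```python
def combined_mean(n1, mean1, n2, mean2):
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)

def combined_sd(n1, mean1, sd1, n2, mean2, sd2):
    """sqrt((n1*sd1^2 + n2*sd2^2 + n1*d1^2 + n2*d2^2) / (n1 + n2))."""
    m12 = combined_mean(n1, mean1, n2, mean2)
    d1 = m12 - mean1
    d2 = m12 - mean2
    total = n1 * sd1 ** 2 + n2 * sd2 ** 2 + n1 * d1 ** 2 + n2 * d2 ** 2
    return (total / (n1 + n2)) ** 0.5

# Group 1 = [1, 3]: mean 2, variance 1. Group 2 = [5, 7, 9]: mean 7, variance 8/3.
sd_all = combined_sd(2, 2.0, 1.0, 3, 7.0, (8 / 3) ** 0.5)
print(sd_all)  # should match the divisor-n standard deviation of [1, 3, 5, 7, 9]
```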



Regression

Slope
$M = \dfrac{\sum xy - n \bar{X} \bar{Y}}{\sum x^2 - n \bar{X}^2}$, where $\bar{X}$ and $\bar{Y}$ are the means of $x$ and $y$
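
A minimal sketch of the slope formula. The function name and the data points are illustrative assumptions.

```python
def regression_slope(xs, ys):
    """(sum(x*y) - n * mean_x * mean_y) / (sum(x^2) - n * mean_x^2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum(x * y for x, y in zip(xs, ys)) - n * mean_x * mean_y
    denominator = sum(x * x for x in xs) - n * mean_x ** 2
    return numerator / denominator

print(regression_slope([1, 2, 3, 4], [2, 4, 6, 8]))
print(regression_slope([1, 2, 3], [1, 3, 2]))
```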



PROBABILITY THEORY

PROBABILITY
It is the likelihood or chance that a particular event will occur. The theory of probability provides a quantitative measure of the uncertainty or likelihood of occurrence of different events resulting from a random experiment, in terms of quantitative measures ranging from 0 to 1. This means that the probability of a certain event is 1 and the probability of an impossible event is 0.

Addition theorem
P(A or B) = P(A) + P(B) - P(A and B), i.e. $P(A \cup B) = P(A) + P(B) - P(A \cap B)$

Mutually exclusive events
P(A or B) = P(A) + P(B), i.e. $P(A \cup B) = P(A) + P(B)$

Rule of multiplication
P(A and B) = P(A) . P(B/A), i.e. $P(A \cap B) = P(A) \cdot P(B/A)$

Rule of multiplication for independent events
P(A and B) = P(A) . P(B), i.e. $P(A \cap B) = P(A) \cdot P(B)$


COMBINATION
$^nC_r = C(n, r) = \dfrac{n!}{r!\,(n-r)!}$
Where $n$ = total number of items in the population, $r$ = number of items selected

PERMUTATIONS
$^nP_r = r! \times C(n, r) = \dfrac{n!}{(n-r)!}$
Where $n$ = total number of items in the population, $r$ = number of items selected

CONDITIONAL PROBABILITY
$P(B/A)$ = probability of B occurring when A has occurred $= \dfrac{P(A \cap B)}{P(A)}$

P(at least 1) = 1 - P(none)
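
The probability rules and counting formulas above can be tried out with exact fractions. The card and dice examples and the helper names are illustrative assumptions; `math.comb` and `math.perm` are standard-library functions available from Python 3.8.

```python
from fractions import Fraction
from math import comb, factorial, perm

def p_union(p_a, p_b, p_a_and_b):
    """Addition theorem: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b

def p_conditional(p_a_and_b, p_a):
    """Conditional probability: P(B/A) = P(A and B) / P(A)."""
    return p_a_and_b / p_a

# Drawing one card from a standard 52-card deck
p_king = Fraction(4, 52)
p_heart = Fraction(13, 52)
p_king_of_hearts = Fraction(1, 52)
p_king_or_heart = p_union(p_king, p_heart, p_king_of_hearts)
p_heart_given_king = p_conditional(p_king_of_hearts, p_king)

# At least one six in two independent rolls of a die: 1 - P(no six)
p_at_least_one_six = 1 - Fraction(5, 6) ** 2

print(p_king_or_heart, p_heart_given_king, p_at_least_one_six)
print(comb(5, 2), perm(5, 2))
```

Using `Fraction` keeps the arithmetic exact, which makes it easy to compare results with hand calculations.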


PROBABILITY DISTRIBUTION
$E(X) = \bar{X} = \sum X \, P(X)$








BINOMIAL DISTRIBUTION
It describes discrete data resulting from an experiment known as the Bernoulli process. Tossing a fair coin a fixed number of times is a Bernoulli process, and the outcomes of such tosses can be represented by the binomial distribution.

The binomial distribution applies when Success + Failure = 1, i.e. $p + q = 1$

Probability of $x$ successes in $n$ trials: $P(x) = {}^nC_x \, p^x q^{n-x}$

Where P(success) = $p$
P(failure) = $1 - p = q$

Mean of binomial distribution: $\mu = E(x) = np$

Variance of binomial distribution: $\sigma^2 = np(1-p) = npq$

Standard deviation of binomial distribution: $\sigma = \sqrt{npq}$
Where $n$ = number of trials
$p$ = P(success)
$q$ = P(failure)
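
A short sketch of the binomial formulas, using a fair coin tossed four times as a made-up example. The function names are illustrative assumptions.

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    """P(x) = C(n, x) * p^x * q^(n - x), with q = 1 - p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binomial_summary(n, p):
    """Return (mean, variance, standard deviation) = (np, npq, sqrt(npq))."""
    variance = n * p * (1 - p)
    return n * p, variance, sqrt(variance)

# Probability of exactly 2 heads in 4 tosses of a fair coin
print(binomial_pmf(2, 4, 0.5))
print(binomial_summary(4, 0.5))
```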

POISSON DISTRIBUTION
It is the limiting form of the binomial distribution, which focuses on the number of discrete occurrences over an interval. It is applied when n is very large and p is small.

Mean of Poisson distribution $= \lambda$ = variance of Poisson distribution; the standard deviation is $\sqrt{\lambda}$

Poisson: $P(x) = \dfrac{e^{-\lambda} \lambda^x}{x!}$, where $\lambda = np$
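
The sketch below evaluates the Poisson formula and compares it with the binomial probability it approximates when n is large and p is small. The function names and the values n = 1000, p = 0.002 are illustrative assumptions.

```python
from math import comb, exp, factorial

def poisson_pmf(x, lam):
    """P(x) = e^(-lambda) * lambda^x / x!."""
    return exp(-lam) * lam ** x / factorial(x)

def binomial_pmf(x, n, p):
    """Binomial probability, used here only for comparison."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 1000, 0.002
lam = n * p  # lambda = np
for x in range(4):
    print(x, poisson_pmf(x, lam), binomial_pmf(x, n, p))
```

The two columns agree to about three decimal places here, which illustrates why the Poisson form is used as a shortcut for rare events.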


NORMAL DISTRIBUTION
The Z score can be defined as the number of standard deviations that a value x is above or below the mean of the distribution. From the formula it is clear that
1. if the value of x is less than the mean, the z score is negative,
2. if the value of x is more than the mean, the z score is positive, and
3. if the value of x is equal to the mean, the z score is 0

$Z = \dfrac{x - \mu}{\sigma}$

Where Z is the normal variate

$f(x) = \dfrac{1}{\sigma \sqrt{2\pi}} \, e^{-z^2/2}$
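
A minimal sketch of the z score and the normal density. The function names and numbers are illustrative assumptions; the cumulative probability via `math.erf` is an addition that replaces looking up a z table and is not a formula given in the sheet.

```python
import math

def z_score(x, mu, sigma):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

def normal_pdf(x, mu, sigma):
    """f(x) = exp(-z^2 / 2) / (sigma * sqrt(2 * pi))."""
    z = z_score(x, mu, sigma)
    return math.exp(-z * z / 2) / (sigma * math.sqrt(2 * math.pi))

def standard_normal_cdf(z):
    """P(Z <= z), computed with the error function instead of a table."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

print(z_score(70, 60, 5))          # above the mean, so positive
print(z_score(55, 60, 5))          # below the mean, so negative
print(normal_pdf(60, 60, 5))
print(standard_normal_cdf(1.96))
```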






HYPOTHESIS TESTING
Hypothesis testing
A well-defined procedure which helps us to decide objectively whether to accept or reject the hypothesis based on the information available from the sample.

Null hypothesis - $H_0: \mu = \mu_0$
Where $\mu$ is the population mean and $\mu_0$ is the hypothetical value of the population mean

It is the hypothesis which is tested for possible rejection under the assumption that it is true. It is set as no difference or status quo and considered true until and unless it is proved wrong by the collected sample data.

Alternative hypothesis - $H_1: \mu \neq \mu_0$, consequently $\mu < \mu_0$ or $\mu > \mu_0$
It is the logical opposite of the null hypothesis. That is, when the null hypothesis is found to be false, the alternative hypothesis is taken to be true.

Level of significance
It is the probability of rejecting the null hypothesis even when it is true. The level of significance is also known as the size of the rejection region or the size of the critical region.

Two-tailed test of hypothesis
It contains the rejection region on both tails of the sampling distribution of the test statistic. This means that the researcher will reject the null hypothesis if the computed sample statistic is significantly higher than or lower than the hypothesized population parameter (considering both the right and left tails).

One-tailed test of hypothesis
It contains the rejection region on one tail of the sampling distribution of a test statistic.
Left-tailed test: the null hypothesis is rejected if the computed statistic is significantly lower than the hypothesized population parameter. $H_1: \mu < \mu_0$
Right-tailed test: the null hypothesis is rejected if the computed statistic is significantly higher than the hypothesized population parameter. $H_1: \mu > \mu_0$

Type I error is committed by rejecting the null hypothesis when it is true.
Type II error is committed by accepting a null hypothesis when it is false.










HYPOTHESIS TESTING FOR SINGLE POPULATION

CENTRAL LIMIT THEOREM
A population has a mean $\mu$ and a standard deviation $\sigma$. If samples of size $n$ are drawn from the population then, for a sufficiently large sample size ($n \geq 30$), the sample means are approximately normally distributed regardless of the shape of the population distribution.

$Z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$

Confidence interval to estimate the population mean (also used to determine the sample size)
$\mu = \bar{X} \pm Z \times \dfrac{\sigma}{\sqrt{n}}$

Sampling distribution of the sample proportion $\hat{p}$

$Z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}}$

Where $\hat{p}$ is the sample proportion, $p$ is the population proportion, $q = 1 - p$ & $n$ is the sample size

The t distribution
The t distribution is a family of similar probability distributions, with a specific t distribution depending on a parameter known as the degrees of freedom

$t = \dfrac{\bar{X} - \mu}{S / \sqrt{n}}$
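
A sketch of the single-population statistics above. The function names and figures are illustrative assumptions. The default `z=1.96` is the usual two-sided 95% critical value, and the sample standard deviation `S` is computed with the n - 1 divisor via `statistics.stdev`, which is an assumption since the sheet does not state the divisor.

```python
import math
import statistics

def z_statistic(sample_mean, mu0, sigma, n):
    """Z = (sample mean - mu) / (sigma / sqrt(n))."""
    return (sample_mean - mu0) / (sigma / math.sqrt(n))

def confidence_interval(sample_mean, sigma, n, z=1.96):
    """sample mean +/- Z * sigma / sqrt(n)."""
    margin = z * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

def proportion_z(p_hat, p, n):
    """Z = (p_hat - p) / sqrt(p * q / n), with q = 1 - p."""
    return (p_hat - p) / math.sqrt(p * (1 - p) / n)

def t_statistic(sample, mu0):
    """t = (sample mean - mu) / (S / sqrt(n)), with n - 1 degrees of freedom."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation, n - 1 divisor
    return (statistics.mean(sample) - mu0) / (s / math.sqrt(n))

print(z_statistic(52, 50, 10, 100))
print(confidence_interval(50, 10, 100))
print(proportion_z(0.55, 0.50, 100))
print(t_statistic([10, 12, 14, 16, 18], 12))
```

The computed statistic is then compared with the critical value for the chosen level of significance and, for t, the degrees of freedom.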















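HYPOTHESIS TESTING FOR TWO POPULATIONS

$\mu_1 - \mu_2$ = Normal distribution
$\bar{x}_1 - \bar{x}_2$ = Normal distribution

TWO-TAILED TEST
Standard error $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$

$Z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$

ONE-TAILED TEST
Standard error $\sigma_{\bar{x}_1 - \bar{x}_2} = S_P \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$

Where $S_P = \sqrt{\dfrac{(N_1 - 1) S_1^2 + (N_2 - 1) S_2^2}{N_1 + N_2 - 2}}$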
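
The two standard-error formulas can be sketched as follows. The function names and sample figures are illustrative assumptions; `hypothesized_diff` stands for the value of μ1 - μ2 under the null hypothesis and defaults to 0.

```python
import math

def standard_error(sigma1, n1, sigma2, n2):
    """sqrt(sigma1^2 / n1 + sigma2^2 / n2)."""
    return math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)

def two_sample_z(mean1, mean2, sigma1, n1, sigma2, n2, hypothesized_diff=0.0):
    """Z = ((mean1 - mean2) - (mu1 - mu2)) / standard error."""
    return ((mean1 - mean2) - hypothesized_diff) / standard_error(sigma1, n1, sigma2, n2)

def pooled_sd(s1, n1, s2, n2):
    """S_P = sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2))."""
    return math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))

def pooled_standard_error(s1, n1, s2, n2):
    """S_P * sqrt(1 / n1 + 1 / n2)."""
    return pooled_sd(s1, n1, s2, n2) * math.sqrt(1 / n1 + 1 / n2)

print(standard_error(3, 9, 4, 16))
print(two_sample_z(12, 10, 3, 9, 4, 16))
print(pooled_sd(2, 5, math.sqrt(6), 5))
print(pooled_standard_error(2, 5, math.sqrt(6), 5))
```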




Chi-square ($\chi^2$) test
It is used for goodness of fit & homogeneity
Goodness of fit: whether a known probability distribution (binomial, Poisson or normal distribution) matches an actual sample distribution.
Homogeneity: used to determine whether two or more independent samples are drawn from the same population or from different populations

$\chi^2 = \sum \dfrac{(O - E)^2}{E}$

Where O is the observed frequencies, E is the expected frequencies

Expected frequency of a cell $= e = \dfrac{RT \times CT}{N}$

Where RT is the row total, CT is the column total & N is the total number of frequencies

To read the chi-square table
$\chi^2_{(df)(\text{significance})}$

Where df is (total rows - 1)(total columns - 1), and significance is the value provided
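
A minimal sketch of the chi-square computation for a contingency table, with expected frequencies taken from the row and column totals. The function names and the 2 x 2 table are illustrative assumptions.

```python
def expected_frequencies(table):
    """Expected count per cell: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]

def chi_square(table):
    """Sum of (O - E)^2 / E over all cells."""
    expected = expected_frequencies(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

def degrees_of_freedom(table):
    """(number of rows - 1) * (number of columns - 1)."""
    return (len(table) - 1) * (len(table[0]) - 1)

observed = [[10, 20], [30, 40]]  # made-up observed frequencies
print(expected_frequencies(observed))
print(chi_square(observed), degrees_of_freedom(observed))
```

The computed value is compared with the chi-square table entry for these degrees of freedom at the chosen significance level.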
