
Statistics Formula Sheet - 2013

Univariate Descriptive Statistics


Frequency Tables

X_i | absolute frequency n_i | relative frequency f_i | cumulative absolute frequency N_i | cumulative relative frequency F_i
... | ...                    | ...                    | ...                               | ...
    | total: n               |                        |                                   |

Division of the sample into classes (continuous variables)

Sturges' rule: $k = 1 + 3.32 \times \log_{10} n$

Class width: $a_i = \dfrac{\max - \min}{k}$
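
As an illustration of Sturges' rule and the class width above, here is a minimal Python sketch. The function names are hypothetical, and rounding k to the nearest integer is an assumption, since the sheet does not state how a non-integer k should be rounded.

```python
import math


def sturges_classes(n):
    """Number of classes k = 1 + 3.32 * log10(n).

    Rounding to the nearest integer is an assumption; the formula
    sheet does not say how to round.
    """
    if n < 1:
        raise ValueError("sample size must be at least 1")
    return round(1 + 3.32 * math.log10(n))


def class_width(values, k):
    """Class width a_i = (max - min) / k."""
    return (max(values) - min(values)) / k
```

For a sample of n = 100 observations the rule gives 1 + 3.32 x 2 = 7.64, which this sketch rounds to 8 classes.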

Measures of Location

Mean
Sample: $\bar{X} = \dfrac{1}{n}\sum_{i=1}^{n} X_i$
Population: $\mu = \dfrac{1}{N}\sum_{i=1}^{N} X_i$
Repeated observations: $\bar{X} = \dfrac{1}{n}\sum_{i} n_i X_i$
Observations grouped in k classes: $\bar{X} = \dfrac{1}{n}\sum_{i=1}^{k} n_i C_i$, where $C_i$ is the midpoint of class i

Median (after sorting the sample)
n even: $Med = \dfrac{X_{n/2} + X_{n/2+1}}{2}$
n odd: $Med = X_{(n+1)/2}$

Mode
After sorting the sample, the observation that is repeated most often.

Empirical quantiles
percentiles: 100; duo-deciles: 12; deciles: 10; quintiles: 5; quartiles: 4; terciles: 3

Quartiles
1st quartile $Q_1$: $k = \frac{1}{4} n$
2nd quartile $Q_2$ = median
3rd quartile $Q_3$: $k = \frac{3}{4} n$
k integer: $Q = \dfrac{X_k + X_{k+1}}{2}$
k not an integer (decimal): $Q = X_{k+1}$, keeping only the integer part of the index; for example, k = 3.5 gives $X_{3.5+1} = X_4$

Percentiles
$k = \dfrac{np}{100}$
k integer: $P_p = \dfrac{X_k + X_{k+1}}{2}$
k decimal: $P_p = X_{k+1}$ (integer part of the index, as for the quartiles)

For ordinal variables:
k integer: $P_p = X_k$
k decimal: $P_p = X_{k+1}$

$Q_1 = P_{25}$; $Q_2 = P_{50}$ = median; $Q_3 = P_{75}$
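
The percentile rule on this sheet (k = np/100, average two neighbours when k is an integer, otherwise take the next observation) can be sketched in Python as follows. The function name `percentile` is hypothetical, the sketch covers only the quantitative case (not the ordinal variant), and it assumes 0 < p < 100 so that both indices exist.

```python
import math


def percentile(sample, p):
    """Percentile P_p using the sheet's rule, for 0 < p < 100.

    k = n*p/100. If k is an integer, return (X_k + X_{k+1}) / 2;
    otherwise return X_{floor(k)+1}. Indices are 1-based in the
    formulas and shifted to 0-based for Python lists here.
    """
    xs = sorted(sample)
    n = len(xs)
    k = n * p / 100
    if float(k).is_integer():
        k = int(k)
        return (xs[k - 1] + xs[k]) / 2   # X_k and X_{k+1}
    return xs[math.floor(k)]             # X_{floor(k)+1}


# Quartiles are special cases: Q1 = P25, Q2 = P50 (median), Q3 = P75.
```

With 14 sorted observations and p = 25, k = 3.5, so the function returns the 4th observation, which agrees with the k = 3.5 example on the sheet.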


2013, Pedro Casquilho

Measures of Dispersion

Ranges
Range (total): $A = Max - Min$
Interquartile range: $AIQ = Q_3 - Q_1 = Q_{3/4} - Q_{1/4}$
The AIQ covers the central 50% of the data.
Relation between AIQ and $\sigma$: $AIQ \approx \frac{4}{3} \times \sigma$

Outliers (atypical or abnormal values)
Lower limit: $L_i = Q_1 - 1.5 \times AIQ$; if $X_i < L_i$, then $X_i$ is an outlier
Upper limit: $L_s = Q_3 + 1.5 \times AIQ$; if $X_i > L_s$, then $X_i$ is an outlier

Variance
Sample: $S^2 = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}$
Population: $\sigma^2 = \dfrac{\sum_{i=1}^{N} (y_i - \mu)^2}{N}$
Repeated observations: $S^2 = \dfrac{\sum_{i} n_i (X_i - \bar{X})^2}{n-1}$
Observations grouped in k classes: $S^2 = \dfrac{\sum_{i=1}^{k} n_i (C_i - \bar{X})^2}{n-1}$, where $C_i$ is the midpoint of class i

Standard Deviation
standard deviation = $\sqrt{\text{variance}}$
Sample: $S = \sqrt{\dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}}$
Population: $\sigma = \sqrt{\dfrac{\sum_{i=1}^{N} (y_i - \mu)^2}{N}}$

Standardized Observations
$Z_i = \dfrac{X_i - \bar{X}}{S}$

Relative dispersion: $CV = \dfrac{S}{\bar{X}}$
Resistant relative dispersion: $CVR = \dfrac{AIQ}{Med}$

Measures of Shape

Skewness
Symmetric distribution: $\bar{X} = Md = Mo$
Positively skewed distribution: $\bar{X} \geq Md \geq Mo$
Negatively skewed distribution: $\bar{X} \leq Md \leq Mo$

Skewness coefficient G1
$G_1 = \dfrac{n^2}{(n-1)(n-2)} \times \dfrac{M_3}{S'^3}$
$G_1 \approx 0$: symmetric distribution
$G_1 > 0$: positively skewed distribution
$G_1 < 0$: negatively skewed distribution

Kurtosis
Kurtosis coefficient G2
$G_2 = \dfrac{n^2 (n+1)}{(n-1)(n-2)(n-3)} \times \dfrac{M_4}{S'^4} - 3 \times \dfrac{(n-1)^2}{(n-2)(n-3)}$
$G_2 \approx 0$: mesokurtic distribution
$G_2 > 0$: leptokurtic distribution
$G_2 < 0$: platykurtic distribution

Kurtosis coefficient K
$K = \dfrac{Q_3 - Q_1}{2 \times (P_{90} - P_{10})}$
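
The outlier limits and the shape coefficients G1 and G2 from this page can be checked numerically with a short Python sketch. All function names are hypothetical. It assumes that M_3 and M_4 are the third and fourth central moments computed with divisor n, and that the standard deviation in the denominator is the sample one (divisor n - 1); the surviving text of the sheet does not spell out either convention.

```python
import math


def outlier_limits(q1, q3):
    """Lower and upper limits: Q1 - 1.5*AIQ and Q3 + 1.5*AIQ."""
    aiq = q3 - q1
    return q1 - 1.5 * aiq, q3 + 1.5 * aiq


def find_outliers(sample, q1, q3):
    """Observations below the lower limit or above the upper limit."""
    low, high = outlier_limits(q1, q3)
    return [x for x in sample if x < low or x > high]


def sample_sd(sample):
    """Sample standard deviation (divisor n - 1)."""
    n = len(sample)
    mean = sum(sample) / n
    return math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))


def central_moment(sample, r):
    """r-th central moment with divisor n (assumed meaning of M_r)."""
    n = len(sample)
    mean = sum(sample) / n
    return sum((x - mean) ** r for x in sample) / n


def skewness_g1(sample):
    """G1 = n^2 / ((n-1)(n-2)) * M3 / S^3; needs n >= 3."""
    n = len(sample)
    s = sample_sd(sample)
    return n ** 2 / ((n - 1) * (n - 2)) * central_moment(sample, 3) / s ** 3


def kurtosis_g2(sample):
    """G2 = n^2 (n+1) / ((n-1)(n-2)(n-3)) * M4 / S^4
    - 3 (n-1)^2 / ((n-2)(n-3)); needs n >= 4."""
    n = len(sample)
    s = sample_sd(sample)
    first = n ** 2 * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    second = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return first * central_moment(sample, 4) / s ** 4 - second
```

The quartiles are passed in explicitly so that the sketch does not commit to one particular quartile rule.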


Bivariate Descriptive Statistics

Covariance
Sample: $COV(X,Y) = \dfrac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}) = \dfrac{1}{n-1} \left( \sum_{i=1}^{n} X_i Y_i - n \bar{X} \bar{Y} \right)$
Population: $COV(X,Y) = \dfrac{1}{N} \sum_{i=1}^{N} (X_i - \mu_x)(Y_i - \mu_y)$
$COV(X,Y) \leq S'_X \cdot S'_Y$

Measures of Association
Correlation Coefficients

Variable X                                 | Variable Y                                            | Correlation coefficient
Quantitative                               | Quantitative                                          | Pearson (R)
Quantitative or qualitative (ordinal)      | Qualitative (at least ordinal)                        | Spearman (R_S)
Qualitative, nominal, dichotomous          | Quantitative                                          | Point-biserial (R_bp)
Qualitative, nominal, dichotomous          | Qualitative, nominal, dichotomous                     | Phi ($\phi$)
Qualitative, polytomous, nominal/ordinal   | Qualitative, dichotomous/polytomous, nominal/ordinal  | Cramér's C (C)

Pearson Correlation Coefficient
$R = \dfrac{COV(X,Y)}{S'_X \cdot S'_Y} = \dfrac{\dfrac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}} \cdot \sqrt{\dfrac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n-1}}}$
$-1 \leq R \leq 1$

Spearman Correlation Coefficient
$R_S = 1 - \dfrac{6 \times \sum_{i=1}^{n} d_i^2}{n^3 - n}$
$-1 \leq R_S \leq 1$

Point-biserial coefficient
$R_{bp} = \sqrt{\dfrac{n_1 \cdot n_2}{n \cdot (n-1)}} \cdot \dfrac{\bar{X}_1 - \bar{X}_2}{S'_X}$
or
$R_{bp} = \dfrac{\bar{X}_1 - \bar{X}_2}{S_X} \cdot \sqrt{pq}$
$-1 \leq R_{bp} \leq 1$

Phi correlation coefficient (only for 2x2 tables)
$\phi = \dfrac{AD - BC}{\sqrt{(A+B)(C+D)(A+C)(B+D)}}$
$-1 \leq \phi \leq 1$
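
To make the Pearson and Spearman formulas concrete, the following Python sketch computes both from two equal-length lists. The function names are hypothetical, and the Spearman part assumes there are no tied values, since the shortcut formula with the squared rank differences d_i is exact only in that case.

```python
import math


def covariance(x, y):
    """Sample covariance with divisor n - 1."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def pearson_r(x, y):
    """R = COV(X, Y) / (S'_X * S'_Y)."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))


def ranks(values):
    """Ranks 1..n in ascending order (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rs(x, y):
    """R_S = 1 - 6 * sum(d_i^2) / (n^3 - n), d_i = rank difference."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n ** 3 - n)
```

For x = [1, 2, 3, 4, 5] and y = [1, 3, 2, 5, 4] the squared rank differences add up to 4, so R_S = 1 - 24/120 = 0.8.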


Cramér's C correlation coefficient

$C = \sqrt{\dfrac{X^2}{n \cdot (m-1)}}$
$0 \leq C \leq 1$
m = the smaller of the number of rows and the number of columns

$X^2 = \sum_{i=1}^{l} \sum_{j=1}^{c} \dfrac{(O_{ij} - E_{ij})^2}{E_{ij}}$

$E_{ij} = \dfrac{L_i \cdot C_j}{n}$

$O_{ij}$ = observed values
$E_{ij}$ = expected values
$L_i$ = marginal total of row i
$C_j$ = marginal total of column j
n = total sample size
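
A small Python sketch of the computation described above: expected values from the marginal totals, the X^2 statistic, and then C. The function names are hypothetical, and the table is assumed to be a list of rows of observed counts with no zero marginal totals.

```python
import math


def chi_square(observed):
    """X^2 = sum over cells of (O_ij - E_ij)^2 / E_ij,
    with E_ij = L_i * C_j / n. Returns (X^2, n)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    x2 = 0.0
    for i, row in enumerate(observed):
        for j, o_ij in enumerate(row):
            e_ij = row_totals[i] * col_totals[j] / n
            x2 += (o_ij - e_ij) ** 2 / e_ij
    return x2, n


def cramer_c(observed):
    """C = sqrt(X^2 / (n * (m - 1))), m = min(rows, columns)."""
    x2, n = chi_square(observed)
    m = min(len(observed), len(observed[0]))
    return math.sqrt(x2 / (n * (m - 1)))
```

A table with perfect association such as [[10, 0], [0, 10]] gives X^2 = 20 and C = 1, while [[5, 5], [5, 5]] gives C = 0.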

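Some Theoretical Distributions

Discrete variables

Binomial Distribution
$X \sim B(n, p)$
$P(X = x) = C^n_x \times p^x \times (1-p)^{n-x}$, $x = 0, 1, 2, ..., n$
$E(X) = np$
$Var(X) = npq$

Properties:
1. Let $X_1, X_2, ..., X_k$ be independent random variables with $X_i \sim B(n_i, p)$ and $i = 1, 2, ..., k$; then $\sum_{i=1}^{k} X_i \sim B\left(\sum_{i=1}^{k} n_i, p\right)$
2. Let $X \sim B(n, p)$; then when $n \to \infty$ and $0.1 < p < 0.9$, approximately $X \sim N(np, \sqrt{npq})$

Continuous variables

Normal Distribution
$X \sim N(\mu, \sigma)$
$E(X) = \mu$
$Var(X) = \sigma^2$

Standard (reduced) Normal Distribution
$X \sim N(\mu, \sigma) \Rightarrow Z = \dfrac{X - \mu}{\sigma} \sim N(0, 1)$
$P(X < x) = P\left(Z < \dfrac{x - \mu}{\sigma}\right)$

Properties:
1. $X \sim N(\mu, \sigma) \Rightarrow Z = \dfrac{X - \mu}{\sigma} \sim N(0, 1)$
2. $Z \sim N(0, 1) \Rightarrow P(Z \leq -z) = 1 - P(Z \leq z)$
3. Let $X \sim N(\mu, \sigma)$ and $Y_i = a \pm b \cdot X_i$; then $Y \sim N(a \pm b \cdot \mu, b\sigma)$
4. Let $X_1 \sim N(\mu_1, \sigma_1)$ and $X_2 \sim N(\mu_2, \sigma_2)$; then $(X_1 \pm X_2) \sim N\left(\mu_1 \pm \mu_2, \sqrt{\sigma_1^2 + \sigma_2^2 \pm 2 \cdot COV(X_1, X_2)}\right)$
5. Let $X_1 \sim N(\mu_1, \sigma_1)$ and $X_2 \sim N(\mu_2, \sigma_2)$ be independent; then $(X_1 \pm X_2) \sim N\left(\mu_1 \pm \mu_2, \sqrt{\sigma_1^2 + \sigma_2^2}\right)$
6. Central Limit Theorem: let $X_1, X_2, ..., X_n$ be independent random variables with mean $\mu$ and standard deviation $\sigma$, $i = 1, 2, ..., n$; then for $n \to \infty$, $\bar{X} \sim N\left(\mu, \dfrac{\sigma}{\sqrt{n}}\right)$

Chi-Square Distribution
$X \sim \chi^2_{(n)}$
$E(X) = n$
$Var(X) = 2n$

Student's t Distribution
$X \sim t_{(n)}$
$E(X) = 0$, $n \geq 2$
$Var(X) = \dfrac{n}{n-2}$, $n \geq 3$

F-Snedecor Distribution
$X \sim F_{(n,d)}$
$E(X) = \dfrac{d}{d-2}$
$Var(X) = \dfrac{2d^2 (n + d - 2)}{n (d-2)^2 (d-4)}$

Properties:
1. $X \sim F_{(n,d)} \Rightarrow \dfrac{1}{X} \sim F_{(d,n)}$
2. $F_{\alpha;(n,d)} = \dfrac{1}{F_{1-\alpha;(d,n)}}$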
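
The binomial probability function and the standardization step P(X < x) = P(Z < (x - mu)/sigma) can be tried out with the Python sketch below. The function names are hypothetical; the standard normal cumulative probability is computed from the error function `math.erf` rather than read from a table, which is a choice made for this sketch.

```python
import math


def binomial_pmf(x, n, p):
    """P(X = x) = C(n, x) * p^x * (1 - p)^(n - x), x = 0, 1, ..., n."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)


def standard_normal_cdf(z):
    """P(Z <= z) for Z ~ N(0, 1), via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def normal_cdf(x, mu, sigma):
    """P(X < x) for X ~ N(mu, sigma), sigma being the standard
    deviation: standardize first, then use the N(0, 1) distribution."""
    return standard_normal_cdf((x - mu) / sigma)
```

Summing x * P(X = x) over x = 0, ..., n reproduces E(X) = np, and the symmetry property P(Z <= -z) = 1 - P(Z <= z) holds for `standard_normal_cdf`.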

KEY FORMULAS

Lind, Marchal, and Wathen

Basic Statistics for Business and Economics, 5th edition

CHAPTER 3
Population mean: $\mu = \dfrac{\sum X}{N}$ [3-1]
Sample mean, raw data: $\bar{X} = \dfrac{\sum X}{n}$ [3-2]
Weighted mean: $\bar{X}_w = \dfrac{w_1 X_1 + w_2 X_2 + ... + w_n X_n}{w_1 + w_2 + ... + w_n}$ [3-3]
Geometric mean: $GM = \sqrt[n]{(X_1)(X_2)(X_3) \cdots (X_n)}$ [3-4]
Geometric mean rate of increase: $GM = \sqrt[n]{\dfrac{\text{Value at end of period}}{\text{Value at start of period}}} - 1.0$ [3-5]
Range: Range = Largest value - Smallest value [3-6]
Mean deviation: $MD = \dfrac{\sum |X - \bar{X}|}{n}$ [3-7]
Population variance: $\sigma^2 = \dfrac{\sum (X - \mu)^2}{N}$ [3-8]
Population standard deviation: $\sigma = \sqrt{\dfrac{\sum (X - \mu)^2}{N}}$ [3-9]
Sample variance: $s^2 = \dfrac{\sum (X - \bar{X})^2}{n-1}$ [3-10]
Sample standard deviation: $s = \sqrt{\dfrac{\sum (X - \bar{X})^2}{n-1}}$ [3-11]

CHAPTER 4
Location of a percentile: $L_p = (n+1) \dfrac{P}{100}$ [4-1]
Pearson's coefficient of skewness: $sk = \dfrac{3(\bar{X} - \text{Median})}{s}$ [4-2]
Software coefficient of skewness: $sk = \dfrac{n}{(n-1)(n-2)} \left[ \sum \left( \dfrac{X - \bar{X}}{s} \right)^3 \right]$ [4-3]

CHAPTER 5
Special rule of addition: $P(A \text{ or } B) = P(A) + P(B)$ [5-2]
Complement rule: $P(A) = 1 - P(\sim A)$ [5-3]
General rule of addition: $P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$ [5-4]
Special rule of multiplication: $P(A \text{ and } B) = P(A)P(B)$ [5-5]
General rule of multiplication: $P(A \text{ and } B) = P(A)P(B|A)$ [5-6]
Multiplication formula: Total arrangements = (m)(n) [5-7]
Number of permutations: $_nP_r = \dfrac{n!}{(n-r)!}$ [5-8]
Number of combinations: $_nC_r = \dfrac{n!}{r!(n-r)!}$ [5-9]

CHAPTER 6
Mean of a probability distribution: $\mu = \sum [xP(x)]$ [6-1]
Variance of a probability distribution: $\sigma^2 = \sum [(x - \mu)^2 P(x)]$ [6-2]
Binomial probability distribution: $P(x) = {}_nC_x \pi^x (1 - \pi)^{n-x}$ [6-3]
Mean of a binomial distribution: $\mu = n\pi$ [6-4]
Variance of a binomial distribution: $\sigma^2 = n\pi(1 - \pi)$ [6-5]
Poisson probability distribution: $P(x) = \dfrac{\mu^x e^{-\mu}}{x!}$ [6-6]

CHAPTER 7
Mean of a uniform distribution: $\mu = \dfrac{a+b}{2}$ [7-1]
Standard deviation of a uniform distribution: $\sigma = \sqrt{\dfrac{(b-a)^2}{12}}$ [7-2]
Uniform probability distribution: $P(x) = \dfrac{1}{b-a}$ if $a \leq x \leq b$ and 0 elsewhere [7-3]
Normal probability distribution: $P(x) = \dfrac{1}{\sigma\sqrt{2\pi}} e^{-\left[\frac{(X-\mu)^2}{2\sigma^2}\right]}$ [7-4]
Standard normal value: $z = \dfrac{X - \mu}{\sigma}$ [7-5]

CHAPTER 8
Standard error of mean: $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$ [8-1]
z-value, $\mu$ and $\sigma$ known: $z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ [8-2]
z-value, population shape and $\sigma$ unknown: $z = \dfrac{\bar{X} - \mu}{s/\sqrt{n}}$ [8-3]

CHAPTER 9
Confidence interval for $\mu$, $n \geq 30$: $\bar{X} \pm z \dfrac{s}{\sqrt{n}}$ [9-1]
Confidence interval for $\mu$, $\sigma$ unknown: $\bar{X} \pm t \dfrac{s}{\sqrt{n}}$ [9-2]
Sample proportion: $p = \dfrac{X}{n}$ [9-3]
Confidence interval for proportion: $p \pm z\sigma_p$ [9-4]
Standard error of proportion: $\sigma_p = \sqrt{\dfrac{p(1-p)}{n}}$ [9-5]
Confidence interval for proportion: $p \pm z\sqrt{\dfrac{p(1-p)}{n}}$ [9-6]
Sample size for estimating mean: $n = \left( \dfrac{zs}{E} \right)^2$ [9-9]
Sample size for proportion: $n = p(1-p)\left( \dfrac{z}{E} \right)^2$ [9-10]

CHAPTER 10
z distribution as a test statistic: $z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ [10-1]
z distribution, $\sigma$ unknown: $z = \dfrac{\bar{X} - \mu}{s/\sqrt{n}}$ [10-2]
Test of hypothesis, one proportion: $z = \dfrac{p - \pi}{\sigma_p}$ [10-3]
Test of hypothesis proportion: $z = \dfrac{p - \pi}{\sqrt{\dfrac{\pi(1-\pi)}{n}}}$ [10-4]
One sample test of mean, small sample: $t = \dfrac{\bar{X} - \mu}{s/\sqrt{n}}$ [10-5]

CHAPTER 11
Test statistic for difference between two large sample means: $z = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$ [11-2]
Two-sample test of proportions: $z = \dfrac{p_1 - p_2}{\sqrt{\dfrac{p_c(1-p_c)}{n_1} + \dfrac{p_c(1-p_c)}{n_2}}}$ [11-3]
Pooled proportion: $p_c = \dfrac{X_1 + X_2}{n_1 + n_2}$ [11-4]
Pooled variance: $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$ [11-5]
Two-sample test of means, small samples: $t = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}$ [11-6]
Paired t test: $t = \dfrac{\bar{d}}{s_d/\sqrt{n}}$ [11-7]

CHAPTER 12
Test for comparing two variances: $F = \dfrac{s_1^2}{s_2^2}$ [12-1]
Sum of squares, total: $SS\ total = \sum (X - \bar{X}_G)^2$ [12-2]
Sum of squares, error: $SSE = \sum (X - \bar{X}_c)^2$ [12-3]
Sum of squares, treatments: $SST = SS\ total - SSE$ [12-4]
Confidence interval for means: $(\bar{X}_1 - \bar{X}_2) \pm t\sqrt{MSE\left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}$ [12-5]

CHAPTER 13
Coefficient of correlation: $r = \dfrac{\sum (X - \bar{X})(Y - \bar{Y})}{(n-1)s_x s_y}$ [13-1]
Correlation test of hypothesis: $t = \dfrac{r\sqrt{n-2}}{\sqrt{1 - r^2}}$ [13-2]
Linear regression equation: $Y' = a + bX$ [13-3]
Slope of the regression line: $b = r\dfrac{s_y}{s_x}$ [13-4]
Intercept of the regression line: $a = \bar{Y} - b\bar{X}$ [13-5]
Standard error of estimate: $s_{y \cdot x} = \sqrt{\dfrac{\sum (Y - Y')^2}{n-2}}$ [13-6]
Confidence interval: $Y' \pm t(s_{y \cdot x})\sqrt{\dfrac{1}{n} + \dfrac{(X - \bar{X})^2}{\sum (X - \bar{X})^2}}$ [13-7]
Prediction interval: $Y' \pm t(s_{y \cdot x})\sqrt{1 + \dfrac{1}{n} + \dfrac{(X - \bar{X})^2}{\sum (X - \bar{X})^2}}$ [13-8]

CHAPTER 14
Multiple regression equation: $Y' = a + b_1 X_1 + b_2 X_2 + ... + b_k X_k$ [14-1]
Multiple standard error: $s_{y \cdot 12...k} = \sqrt{\dfrac{\sum (Y - Y')^2}{n - (k+1)}}$ [14-2]
Coefficient of multiple determination: $R^2 = \dfrac{SSR}{SS\ total}$ [14-3]
Global test of hypothesis: $F = \dfrac{SSR/k}{SSE/(n - (k+1))}$ [14-4]
Testing for a particular regression coefficient: $t = \dfrac{b_i - 0}{s_{b_i}}$ [14-5]

CHAPTER 15
Chi-square test statistic: $\chi^2 = \sum \left[ \dfrac{(f_o - f_e)^2}{f_e} \right]$ [15-1]
Expected frequency: $f_e = \dfrac{(\text{Row total})(\text{Column total})}{\text{Grand total}}$ [15-2]
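
As a worked illustration of the Chapter 13 formulas [13-1], [13-4] and [13-5], the Python sketch below computes the coefficient of correlation and then derives the slope and intercept of the regression line from it. The function name `fit_line` and the sample data in the usage note are made up for this example.

```python
import math


def fit_line(x, y):
    """Return (a, b, r) for the line Y' = a + bX.

    r = sum((X - Xbar)(Y - Ybar)) / ((n - 1) * s_x * s_y)   [13-1]
    b = r * s_y / s_x                                       [13-4]
    a = Ybar - b * Xbar                                     [13-5]
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample standard deviations (divisor n - 1), as in [3-11].
    s_x = math.sqrt(sum((v - mean_x) ** 2 for v in x) / (n - 1))
    s_y = math.sqrt(sum((v - mean_y) ** 2 for v in y) / (n - 1))
    cross = sum((u - mean_x) * (v - mean_y) for u, v in zip(x, y))
    r = cross / ((n - 1) * s_x * s_y)
    b = r * s_y / s_x
    a = mean_y - b * mean_x
    return a, b, r
```

For the made-up points x = [1, 2, 3, 4] and y = [3, 5, 7, 9], which lie exactly on Y = 1 + 2X, the function returns values close to a = 1, b = 2 and r = 1, up to floating-point rounding.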
