TYPES OF UNIVARIATE, BIVARIATE AND MULTIVARIATE ANALYSIS
whisnu.t.
INFERENTIAL STATISTICS
Quantitative Analysis Methods
Methods of analysis by variable and
measurement scale:
1. Univariate analysis: t test, one-way
ANOVA
2. Bivariate analysis: association, differentiation,
correlation and regression
3. Multivariate analysis: elaboration (multiple
correlation), multiple regression, path,
discriminant, factor and cluster analysis


BIVARIATE ANALYSIS:
ASSOCIATION, DIFFERENTIATION,
CORRELATION AND REGRESSION
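Measures by scale of Variable 1 x Variable 2

Nominal x Nominal:   chi-square (χ²), Cramér's V, contingency coefficient,
                     lambda (symmetric), lambda (asymmetric)
Nominal x Ordinal:   lower the ordinal variable by one level, so its scale becomes nominal
Nominal x Interval:  t-test (hypothesis of difference), z-test (hypothesis of difference), eta
Ordinal x Ordinal:   Kendall's tau, Spearman's rho, gamma, Somers' D (asymmetric)
Ordinal x Interval:  lower the interval variable by one level
Interval x Interval: Pearson's r, regression (asymmetric)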
ASSOCIATION - CHI-SQUARE
Association refers to measuring the strength of a relationship
in which one of the variables is dichotomous (it distinguishes
only two nominal categories, for example +/-, good and bad,
two extreme opposites or two poles such as
male and female), nominal or ordinal.

The measuring tool is chi-square

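$\chi^2 = \sum z^2 = \sum \left( \frac{x - \mu}{\sigma} \right)^2$ or $\chi^2 = \frac{(n - 1)\, s^2}{\sigma^2}$
or $\chi^2 = \sum \frac{(O_f - E_f)^2}{E_f}$

where $O_f$ = observed frequency and $E_f$ = expected
frequency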
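
The observed/expected form of the formula can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the slides: the function name `chi_square` and the 2 x 2 table are hypothetical, and expected frequencies are assumed to come from the usual row total x column total / n rule.

```python
def chi_square(observed):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, of in enumerate(row):
            ef = row_totals[i] * col_totals[j] / n  # expected frequency Ef
            stat += (of - ef) ** 2 / ef             # (Of - Ef)^2 / Ef
    df = (len(observed) - 1) * (len(col_totals) - 1)
    return stat, df

# Hypothetical 2 x 2 table (rows: male/female, columns: good/bad).
table = [[30, 20], [20, 30]]
stat, df = chi_square(table)
print(stat, df)
```

Every expected frequency in this made-up table is 25, so each cell contributes (5² / 25) = 1 to the statistic.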

Decision making
Look at the chi-square test result: value 7.433 and sig. .059.
Remember to check the cells with an expected count of less than 5:
they must not exceed 20% of cells, for a 2 X 2 table

Conditions for a significant association:
1. Computed chi-square > tabled chi-square (remember
df = 3 and α = 5%)
2. Sig. < 0.05

The strength of the relationship is read from the measures of association
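
The two conditions can be checked against the reported value with the standard library alone. The sketch below assumes df = 3 as stated on the slide; the closed-form upper-tail probability is specific to 3 degrees of freedom, 7.815 is the usual tabled critical value for df = 3 at 5%, and the function name is hypothetical.

```python
import math

def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

CHI2_CRIT_DF3_5PCT = 7.815  # tabled critical value, df = 3, alpha = 5%

value = 7.433                # chi-square value reported in the output
sig = chi2_sf_df3(value)     # reproduces the reported sig. of about .059
significant = value > CHI2_CRIT_DF3_5PCT and sig < 0.05
print(round(sig, 3), significant)
```

With these numbers neither condition holds, so the association would not be called significant at the 5% level.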
STRENGTH OF RELATIONSHIP
Interpretation:
< 0.20 very weak, almost negligible
0.20-0.40 weak
0.40-0.70 fairly strong/moderate
0.70-0.90 strong
0.90-1.00 very strong
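
As a small illustration, the bands above can be turned into a lookup. The slide does not say which band a boundary value such as 0.40 belongs to, so this sketch assumes the lower bound of each band is inclusive; the function name is hypothetical.

```python
def strength_label(coefficient):
    """Map an association or correlation coefficient to the bands listed above."""
    value = abs(coefficient)  # the sign gives direction, not strength
    if value < 0.20:
        return "very weak, almost negligible"
    if value < 0.40:
        return "weak"
    if value < 0.70:
        return "fairly strong/moderate"
    if value < 0.90:
        return "strong"
    return "very strong"

print(strength_label(0.15))
print(strength_label(-0.77))
```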

...it is possible to have a relationship which displays
strong association but is not significant or a
relationship which displays an extremely weak
association but is very significant
DIFFERENTIATION
Comparing the means of two independent samples
with the t-test

Examples:
Difference in height between men and
women
Difference in income between fields of
work
Difference in job satisfaction scores between
private-sector and government employees
etc.


First test whether the two variances are equal or different:
look at the computed F and its sig.; sig. < 0.05 means
the variances differ significantly

Then look at the computed t and its sig.; the criterion is the same, sig. <
0.05, or equivalently the lower-upper interval of the mean difference
must not cross 0.
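
A minimal sketch of the t statistic for two independent samples, under stated assumptions: the height data are invented, `pooled_t` assumes equal variances, and `welch_t` is the usual alternative when the variance check says they differ. Neither function is taken from the slides.

```python
import math
from statistics import mean, variance

def pooled_t(a, b):
    """t statistic and df assuming equal population variances."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(sp2 * (1 / na + 1 / nb))
    return t, na + nb - 2

def welch_t(a, b):
    """t statistic and approximate df when variances are not assumed equal."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical heights in cm.
men = [170, 172, 168, 175, 171]
women = [160, 158, 162, 165, 159]
t_pooled, df_pooled = pooled_t(men, women)
t_welch, df_welch = welch_t(men, women)
print(t_pooled, df_pooled)
```

For df = 8 the tabled two-sided 5% critical value is 2.306, so a computed t of about 6.13 would count as a significant difference for this made-up sample.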

Comparing the means of two paired samples

Examples:
Body weight before and after
following a diet programme
Job satisfaction scores before and after
attending training
Difference in exam scores before and after
attending a tutorial

Look at sig. and the computed t
Criteria:
Sig. < 0.05
|computed t| > tabled t (two-sided test)
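
The paired comparison can be sketched as a one-sample t-test on the differences. The body-weight figures are hypothetical and the function name is an assumption for illustration.

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """t statistic and df for paired samples, computed on after - before."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Hypothetical body weights in kg before and after a diet programme.
before = [80, 75, 90, 85, 70]
after = [78, 72, 86, 84, 68]
t, df = paired_t(before, after)
print(t, df)
```

With df = 4 the tabled two-sided 5% critical value is 2.776; the absolute value of the computed t is compared against it.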
SIMPLE CORRELATION
A form of association in which both variables
are interval

This is the method commonly used for
bivariate analysis

Correlation is symmetrical, not providing evidence
of which way causation flows

In some cases correlation can also be
applied to variables measured on
an ordinal scale (Spearman's model)

Correlation measurement is often equated with Pearson's
correlation (the Pearson product-moment correlation), an
analysis for examining the strength of the relationship between two
variables

Pearson's correlation = rxy

Coefficient of determination = rxy²
...the percent of the variance in the dependent
variable explained by the independent...
(Garson, 2002)
The proportion of the variability among the y
scores that can be accounted for by the
variability among the x scores (Sprinthall, 1982)

rxy = 0.70, so rxy² = 0.49

49% of the information about y is contained in x
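
A minimal sketch of how rxy and the coefficient of determination relate, using only the standard library. The function name and the toy data are hypothetical; the 0.70 figure is the one quoted above.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Toy data lying exactly on a line.
r_line = pearson_r([1, 2, 3], [2, 4, 6])

r = 0.70
r_squared = r ** 2  # coefficient of determination for the slide's example
print(r_line, round(r_squared, 2))
```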


Correlations


Nonparametric Correlations
Requirements for using Pearson's r (Sprinthall, 1982,
p.193):

1. If a sample is used and inferences are to be drawn
about the population, the sample must be selected
randomly

2. The variables are measured on an interval scale

3. The variation in the distributions of the variables
can be assumed to be similar (homoscedasticity)

4. The distribution of each variable must be unimodal and
fairly symmetric: "the Pearson's r can almost never be
used on income data since the income distribution in
the population is usually skewed". Pearson's coefficient
only indicates the possible presence of a
linear relationship between variables.
SIMPLE REGRESSION
Regression analysis is used for forecasting
and for analysing the form of the relationship between two
variables by developing an estimating
equation (the regression equation)

Regression analysis: Y = a + bX

Widely applied in business, to
predict the relationship between advertising and sales,
attitude tests and employee performance, financial
ratios and stock prices, etc.
Analysis:
The output gives the mean, the standard deviation and the total sample size
There is a strong and significant relationship between political orientation and
media use
The R square of .596 means that 59.6% of political orientation is predicted by
media use
Analysis:
From the ANOVA test and the F test, F =
26.238 with a significance level of .000, far smaller than .050,
so the regression model can be used to predict
political orientation
Regression equation: Y = 1.728 + 0.136 X
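
The reported equation can be used directly for prediction, and a least-squares fit recovers such coefficients from data. In this sketch X is taken to be media use and Y political orientation, as the analysis above suggests; the function names and the three data points are hypothetical.

```python
def predict_orientation(media_use):
    """Apply the reported equation Y = 1.728 + 0.136 X."""
    return 1.728 + 0.136 * media_use

def fit_line(x, y):
    """Ordinary least-squares intercept a and slope b for Y = a + bX."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx
    return a, b

# Hypothetical points generated from the reported equation.
xs = [0, 10, 20]
ys = [predict_orientation(v) for v in xs]
a, b = fit_line(xs, ys)
print(predict_orientation(10), a, b)
```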
How do we check for a normal
distribution?
For the variables to be used, Pearson's coefficient
of skewness can be computed:

Sk = 3(X̄ - Me) / S

Sk = Pearson's coefficient of skewness
X̄ = mean
Me = median
S = standard deviation
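
A short sketch of the coefficient with the standard library. The slide does not say whether S is the sample or the population standard deviation; the sample version (`statistics.stdev`) is assumed here, and both data sets are invented.

```python
from statistics import mean, median, stdev

def pearson_skewness(values):
    """Sk = 3 * (mean - median) / S, using the sample standard deviation."""
    return 3 * (mean(values) - median(values)) / stdev(values)

symmetric = [1, 2, 3, 4, 5]      # mean equals median, so Sk is 0
right_skewed = [1, 2, 2, 3, 12]  # one large value pulls the mean above the median
sk_symmetric = pearson_skewness(symmetric)
sk_skewed = pearson_skewness(right_skewed)
print(sk_symmetric, sk_skewed)
```

A value near 0 is consistent with a symmetric distribution; a clearly positive or negative value points to skew.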
MULTIVARIATE ANALYSIS
ELABORATION, MULTIPLE REGRESSION, PATH
ANALYSIS, DISCRIMINANT ANALYSIS, FACTOR ANALYSIS
AND CLUSTER ANALYSIS
MULTIVARIATE ANALYSIS
Elaboration
Contingency tables
Split correlation analysis
High order partial analysis
Path analysis
Multiple regression prediction
Differentiation
Discriminant analysis
Manova
Exploration
Factor analysis
Cluster analysis
ELABORATION AND PARTIAL
CORRELATION
Partial correlation is the correlation of two
variables while controlling for a third or more
other variables (maximum 3 controlling
variables)

The extended model of partial correlation is path
analysis or structural equation modeling
when data are at or near interval level; use log-
linear modeling for lower-level data
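
For a single control variable, the partial correlation can be computed from the three pairwise correlations with the standard first-order formula. The sketch below uses hypothetical correlation values; the function name is an assumption.

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical case: the x-y relationship weakens but remains.
weakened = partial_r(0.5, 0.5, 0.5)

# Hypothetical case: the x-y relationship vanishes once z is controlled.
vanished = partial_r(0.42, 0.6, 0.7)
print(weakened, vanished)
```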


Statistical requirements for intervening and antecedent
variables
Intervening variable
All the three variables (intervening, independent and
dependent) must be related (theoretically)
When intervening variable is controlled, the relationship
between independent and dependent variable should
vanish
When independent variable is controlled, the relationship
between intervening and dependent variable should not
disappear

Antecedent variable
All the three variables (antecedent, independent and
dependent) must be related (theoretically)
When antecedent variable is controlled, the relationship
between independent and dependent variable should not
vanish
When independent variable is controlled, the relationship
between antecedent and dependent variable should
disappear



PARTIAL CORRELATION
[Diagram of the example variables: level of economic liberalism (antecedent); proportion of independent middle class (independent); intensity of demands for democracy (intervening); level of democratization of the political system (dependent); fragmentation of political elite power (control)]
PATH ANALYSIS
[Path diagram of the example variables: peer group; sociability; media ownership; demographics, socio-economic status and sex; gratification sought (independent); gratification obtained (dependent); intensity; interactivity; gratification deficiency. A label in the diagram reads "indicators for measuring".]
Possible outcomes of an elaboration
Constant
Replication: the third variable has no effect

Weakened
Explanation: the third variable acts as an
antecedent (it explains)
Interpretation: the third variable acts as an
intervening variable (it interprets)

Split
Specification: the third variable acts by specifying (detailing)
the relationship

Strengthened
Suppressor/distorter: the third variable acts as a
distorter/suppressor

Elaboration Techniques
CONTINGENCY TABLES
Independent and dependent variables are nominal/ordinal
Control variable is nominal/ordinal
The control variable should not have too many categories
The more control variables, the larger the sample
required

SPLIT/DIFFERENTIAL ANALYSIS
Independent and dependent variables are interval
Control variable is nominal/ordinal
The control variable should not have too many categories
The more control variables, the larger the sample
required

HIGH ORDER PARTIAL ANALYSIS
Independent and dependent variables are interval
Control variable is interval
The number of control variables does not depend on sample size


The effect of lifestyle on the political orientation of
high-school students, with media use as the
control variable
The results show that media use acts as an
intervening or antecedent variable




MULTIPLE REGRESSION
A technique for predicting the value of
a dependent variable from the values
of several independent variables

There are several methods for computing multiple
regression:
Enter
Backward elimination
Forward selection
Stepwise method
Analysis for two independent variables, enter method:
95.2% of political orientation can be explained by lifestyle
and media use
the sig. value in the ANOVA table is .000, so the regression model
can be used to predict political orientation
Y = 11.046 + 6.857 X1 + 5.047 X2 is the regression equation
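
As a quick illustration, the reported equation can be wrapped in a function. The slide does not say which predictor is X1 and which is X2; this sketch assumes X1 is lifestyle and X2 is media use, following the order in which they are named, and the input scores are hypothetical.

```python
def predict_orientation(lifestyle, media_use):
    """Apply Y = 11.046 + 6.857 X1 + 5.047 X2 (X1/X2 assignment is assumed)."""
    return 11.046 + 6.857 * lifestyle + 5.047 * media_use

baseline = predict_orientation(0, 0)   # the intercept
example = predict_orientation(1, 1)    # hypothetical scores of 1 on both predictors
print(baseline, example)
```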
Backward elimination method, analysis:
Look at adjusted R square in the Model Summary table (for more than 2
independent variables)
Four models are produced, and model 4
has the highest value
94.4% of sales can be explained by number of outlets
and promotion
Analysis:
Model 4 has a sig. of .000 (criterion: < .05), so
the regression model can be used
Collinearity analysis:
Used to examine the relationships among the independent variables, to see whether
collinearity is present
Look at the tolerance; for example, see income in model 1. The tolerance
is 0.750, which means R² is 1 - 0.750 = 0.250. So only
25% of the variance in income can be explained by the other independent variables
Alternatively look at VIF, where VIF = 1/Tolerance; VIF should not be larger
than 5, since a larger value indicates multicollinearity among the independent variables
Y = 54.639 + 2.342 X1 + 0.535 X2
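
The tolerance arithmetic above is short enough to check directly. A minimal sketch, using the tolerance of 0.750 quoted for income and the slide's VIF ceiling of 5; the function name is hypothetical.

```python
def collinearity(tolerance):
    """Return (R-squared against the other predictors, VIF) for a tolerance value."""
    r_squared = 1 - tolerance
    vif = 1 / tolerance
    return r_squared, vif

r_squared, vif = collinearity(0.750)
acceptable = vif <= 5  # threshold stated on the slide
print(r_squared, round(vif, 3), acceptable)
```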



Forward selection method
The data analysis is the same as for
the backward elimination method!
Stepwise method
The analysis is the same as for the
backward method, and this is the method
most often used for
multiple regression analysis!
PATH ANALYSIS
is a causal model for understanding relationships
between variables (Babbie, 1973 p.324)

is a statistical technique that can be used to find out the
differences between two or more groups of objects with
respect to several variables simultaneously (Klecka,
1980)

an explicit hypothesis of cause and effect that is tested
using the method of path analysis (Phil Ender, 2002)

However convincing, respectable, and reasonable a
path diagram may appear, any causal inferences
extracted are rarely more than a form of statistical
fantasy (Everitt and Dunn, 1991)
DISCRIMINANT ANALYSIS
Discriminant function analysis, also known as discriminant
analysis or DA, is used to classify cases into the
values of a categorical dependent, usually a
dichotomy. If discriminant function analysis is
effective for a set of data, the classification table of
correct and incorrect estimates will yield a high
percentage correct. There are several purposes of
DA:
To investigate differences between groups
To determine the most parsimonious way to
distinguish between groups
To discard variables which are little related to group
distinctions
To classify cases into groups
To test theory by observing whether cases are
classified as predicted
Discriminant analysis (Garson, 2002)
shares all the usual assumptions of correlation,
requiring linear and homoscedastic relationships, and
untruncated interval or near-interval data
like multiple regression, it also assumes proper model
specification (inclusion of all important independents
and exclusion of extraneous variables)
assumes the dependent variable is a true dichotomy,
since data which are forced into dichotomous coding
are truncated, attenuating correlation
is an earlier alternative to logistic regression, which is
now frequently used in place of DA, as it usually
involves fewer violations of assumptions, is robust, and
has coefficients which many find easier to interpret

Assumptions for discriminant analysis
The dependent variable is a true dichotomy. One should never
dichotomize a continuous variable simply for the purpose of
applying discriminant analysis
All cases must be independent and must belong to a group
formed by the dependent variable. The groups must be
mutually exclusive
Group sizes of the dependent are not grossly different
Independent variable(s) are interval; dichotomies, dummy
variables and ordinal variables with at least 5 categories are
also commonly used
The maximum number of independent variables is n-2, where
n is the sample size
Homogeneity of variances (homoscedasticity) within each
group formed by the dependent; the variance of each independent
should be similar between groups
Absence of perfect multicollinearity: perfectly collinear independent variables
produce a tolerance value approaching 0, and the matrix
won't have a unique discriminant solution
Low multicollinearity of the independents: to the extent
independents are correlated, the standardized discriminant
function coefficients will not reliably assess the relative
importance of the predictor variables
FACTOR ANALYSIS
is a statistical technique used to identify a relatively small
number of factors that can be used to represent
relationships among sets of many interrelated variables
(Norusis, 1993 p.47)

The goal of factor analysis is to identify the not-directly-
observable factors based on a set of observable
variables

Two models of factor analysis:
1. Exploratory factor analysis (EFA) to uncover the
underlying structure of a relatively large set of
variables. There's no prior theory and one uses factor
loadings to intuit the factor structure of the data
2. Confirmatory factor analysis (CFA) to determine if the
number of factors and the loadings of measured
(indicator) variables on them conform to what is
expected on the basis of pre-established theory
The purposes of factor analysis:
To reduce a large number of variables to a smaller
number of factors for modelling purposes. Factor
analysis is integrated in structural equation modelling
(SEM)
To select a subset of variables from a larger set,
based on which original variables have the highest
correlations with the principal component factors
To create a set of factors to be treated as
uncorrelated variables as one approach to handling
multicollinearity in such procedures as multiple
regression
To validate a scale or index by demonstrating that its
constituent items load on the same factor, and to drop
proposed scale items which cross-load on more than
one factor
To establish that multiple tests measure the same
factor, thereby giving justification for administering
fewer tests
To identify clusters of cases and/or outliers

Testing the concept of liberalism with factor
analysis

Concept: LIBERALISM

Dimension              Sub-dimensions
ECONOMIC LIBERALISM    abolishing monopolies; free trade zones; cutting subsidies;
                       privatization of state-owned enterprises (BUMN)
PERSONAL LIBERALISM    abortion; extramarital sex; freedom of religion;
                       gender equality; racial equality
POLITICAL LIBERALISM   opposition; freedom of association; freedom of expression;
                       multiparty system; extra-parliamentary activity
CLUSTER ANALYSIS
Also called segmentation analysis, classification
analysis or numerical taxonomy analysis, it is similar in
purpose to Q-mode factor analysis: both seek to
identify homogeneous subgroups of cases in a
population. That is, cluster analysis seeks to identify a
set of groups which both minimize within-group
variation and maximize between-group variation

Objects in each cluster tend to be similar to each other
and dissimilar to objects in the other clusters

A statistical technique for grouping units of
analysis into a number of clusters,
based on the similarities (likeness) across
a number of characteristics of those units
Basic concept and the ideal clustering situation
[Scatter plot of cases on two variables; surviving axis label: Variable 2]
Clustering of adolescent lifestyle variables with
the K-Means method
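
The idea behind K-Means can be sketched in pure Python. This is an illustrative version of the basic assign-then-update loop, not the routine that produced the output in these slides: Euclidean distance, caller-supplied starting centers and the six two-variable cases are all assumptions.

```python
import math

def k_means(points, centers, max_iter=100):
    """Assign each case to its nearest center, then move each center to its cluster mean."""
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        new_centers = [
            tuple(sum(v) / len(c) for v in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:  # no center moved: the solution is stable
            break
        centers = new_centers
    return centers, clusters

# Hypothetical cases measured on two variables.
cases = [(1, 1), (1.5, 2), (1, 1.5), (5, 7), (6, 8), (5.5, 7.5)]
centers, clusters = k_means(cases, [(1, 1), (5, 7)])
print(centers)
print([len(c) for c in clusters])
```

Because the result can depend on the starting centers, rerunning with different starting values is a cheap stability check.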

Total Variance Explained

            Initial Eigenvalues                    Extraction Sums of Squared Loadings
Component   Total   % of Variance   Cumulative %   Total   % of Variance   Cumulative %
1           4.100   24.118          24.118         4.100   24.118          24.118
2           1.673   9.842           33.960         1.673   9.842           33.960
3           1.386   8.155           42.115         1.386   8.155           42.115
4           1.084   6.379           48.494         1.084   6.379           48.494
5           .961    5.653           54.147
6           .928    5.459           59.607
7           .869    5.111           64.718
8           .795    4.674           69.392
9           .762    4.483           73.875
10          .738    4.340           78.215
11          .661    3.887           82.102
12          .634    3.728           85.830
13          .602    3.542           89.373
14          .545    3.205           92.578
15          .460    2.706           95.284
16          .436    2.567           97.850
17          .365    2.150           100.000
Extraction Method: Principal Component Analysis.
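
The percentages in the table follow from the eigenvalues: with 17 standardized variables the total variance is 17, so each component's share is its eigenvalue divided by 17. The sketch below recomputes them. Retaining components with an eigenvalue above 1 is an assumption; it is consistent with the four components that have extraction sums in the table, but the slides do not state the rule.

```python
eigenvalues = [4.100, 1.673, 1.386, 1.084, .961, .928, .869, .795, .762,
               .738, .661, .634, .602, .545, .460, .436, .365]

n_variables = len(eigenvalues)                        # 17 components, total variance 17
pct = [100 * e / n_variables for e in eigenvalues]    # % of variance per component
retained = [e for e in eigenvalues if e > 1]          # assumed eigenvalue > 1 rule
cumulative = sum(pct[:len(retained)])                 # variance explained by retained ones

print(len(retained), round(pct[0], 3), round(cumulative, 2))
```

Small differences from the printed cumulative percentage come from the eigenvalues being rounded to three decimals.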

ANOVA

                Cluster               Error
Variable        Mean Square   df      Mean Square   df      F         Sig.
Hobby           17.466        3       .199          1099    87.883    .000
Rekreasi_awal   41.812        3       .259          1099    161.191   .000
AK              21.364        3       .242          1099    88.391    .000
AKS_TOT         57.018        3       .325          1099    175.424   .000
MTT             33.890        3       .328          1099    103.174   .000
MK              17.146        3       .167          1099    102.904   .000
MF              24.689        3       .201          1099    122.563   .000
MMM             41.890        3       .301          1099    138.958   .000
MP              22.126        3       .205          1099    107.903   .000
OD              18.831        3       .130          1099    144.422   .000
OS              9.239         3       .126          1099    73.375    .000
OP              11.107        3       .316          1099    35.168    .000
OE              13.008        3       .253          1099    51.336    .000
OPT             32.711        3       .193          1099    169.109   .000
OF              16.578        3       .127          1099    130.173   .000
OM              23.502        3       .188          1099    124.888   .000
OB_factor       7.433         3       .119          1099    62.336    .000

Four clusters were found for the
lifestyle variable:
1. demanders: adolescents with
high lifestyle scores, who
tend to engage in activities
often and to hold opinions on and
interests in every dimension of
lifestyle (cluster 1)
2. anti-demanders: those with
very low scores on the
lifestyle measures
(cluster 3)
3. escapists: individuals who
tend toward a fun, hedonistic
lifestyle and are
unresponsive to their social
environment (cluster 2)
4. pro-social: characterized by being fairly
responsive to the social
issues arising in
their environment (cluster 4)


Number of Cases in each Cluster

Cluster 1   223.000
Cluster 2   262.000
Cluster 3   237.000
Cluster 4   381.000
Valid       1103.000
Missing     265.000

Final Cluster Centers

                    Cluster
Variable            1      2      3      4
A_Hobby             2.46   2.29   1.90   1.97
A_Rekreasi          2.75   2.57   1.96   2.00
A_Komunitas         3.96   3.45   3.23   3.56
A_Keg sosial        3.61   2.72   3.04   3.65
M_Temp Tinggal      4.63   4.18   3.85   4.56
M_Komunitas         4.08   3.73   3.41   3.79
M_Fashion           4.08   3.97   3.38   3.63
M_Media Massa       4.03   3.45   3.02   3.33
M_Prestasi          4.48   4.15   3.72   4.11
O_Diri              4.55   4.18   3.86   4.27
O_Sosial            4.69   4.39   4.26   4.58
O_Politik           2.79   2.39   2.72   2.82
O_Pendidikan        3.94   3.83   3.42   3.60
O_Produk & tekno    4.08   3.76   3.22   3.48
O_Masa depan        4.56   4.17   3.94   4.37
O_Ekonomi           4.03   3.49   3.27   3.51
O_Budaya            3.82   3.62   3.38   3.59

Checks on the quality of clustering results:
Perform cluster analysis on the same data using
different measures. Compare the results across
measures to determine the stability of the
solutions
Use different methods of clustering and compare
the results
Split data randomly into halves. Perform
clustering separately on each half
Delete variables randomly. Perform clustering
based on the reduced set of variables. Compare
the results with those obtained by clustering
based on the entire set of variables
In non hierarchical clustering, the solution may
depend on the order of cases in the data set.
Make multiple runs using different orders of cases
until the solution stabilizes
