
# MEASUREMENT:

## SCALING, RELIABILITY, VALIDITY

Chapter 7
Objectives

- Know the characteristics and power of the four types of scales: nominal, ordinal, interval, and ratio
- Know how and when to use the different forms of rating scales
- Be able to explain stability and consistency and how they are established
- Be able to explain the difference between reflective and formative scales
- Be conversant with the different forms of validity
- Be able to discuss what goodness of measures means, and why it is necessary to establish it in research
DETAILS OF STUDY, MEASUREMENT, AND DATA ANALYSIS

[Diagram: the research design framework, linking the problem statement to the elements of research design.
- Purpose of the study: exploration, description, hypothesis testing
- Types of investigation: establishing causal relationships, correlations, group differences/ranks
- Extent of researcher interference: minimal (studying events as they normally occur) vs. manipulation and/or control and/or simulation
- Study setting: contrived vs. non-contrived
- Measurement and measures: operational definition, items (measures), scaling, categorizing, coding
- Unit of analysis (population to be studied): individuals, groups, organizations, machines, etc.
- Sampling design: probability / non-probability
- Time horizon: one-shot (cross-sectional) vs. longitudinal
- Data collection method: observation, interview, questionnaire, measurement, unobtrusive
- Data analysis: 1. feel for data; 2. goodness of data; 3. hypotheses testing]
Scales

## Nominal Scale

Allows the researcher to assign subjects to certain categories or groups, e.g. gender (male or female), yes or no, country of origin, race, eye colour, etc.

## Ordinal Scale

Categorizes the variables to denote the differences between them, i.e. rank order, e.g. low income, medium income, high income; ranking of importance; etc. It indicates relative standing among individuals.

Example (ordinal scale):
Rank your preferred subjects, where 1 represents the subject you prefer most and 4 the subject you prefer least.

My preferred subjects:        Rank
Creative Problem Solving      ____
Statistics                    ____
Human Resource Management     ____
Economics                     ____

## Interval Scale

More powerful than the first two scales: it taps the differences, the order, and the equality of the magnitude of the differences on the variable. The origin, or starting point, may be any arbitrary number.

Example
Strongly Disagree 1
Disagree 2
Neither Agree Nor Disagree 3
Agree 4
Strongly Agree 5

## Ratio Scale

Not only measures the magnitude of the differences between points on the scale but also taps the proportions in the differences. It has a unique zero origin (not an arbitrary origin) and all the properties of the other scales.

Example:
Age, weight, height, income, CGPA, number of organizations the individual has worked for.
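As an illustration, here is a minimal Python sketch (the data are made up for the example) of which summary statistics each scale level supports:

```python
from statistics import mode, median, mean

# Hypothetical responses at each level of measurement.
nominal  = ["male", "female", "female", "male", "female"]  # categories only
ordinal  = [1, 3, 2, 1, 2]        # rank order: 1 = low, 2 = medium, 3 = high income
interval = [5, 4, 4, 3, 5]        # Likert-type scores; the zero point is arbitrary
ratio    = [25, 34, 29, 41, 38]   # ages in years; a true zero exists

# Nominal: only counting and the mode are meaningful.
print(mode(nominal))

# Ordinal: the median preserves rank information.
print(median(ordinal))

# Interval: differences are meaningful, so the mean can be used.
print(mean(interval))

# Ratio: proportions are meaningful because the origin is a true zero.
print(ratio[3] / ratio[0])
```

Each scale also permits every operation allowed on the scales below it, which is why the ratio scale is described as the most powerful.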
Rating Scales

## Category Scale

Uses multiple items to elicit a single response, as in the following.

Example (income):
________ Between RM1,500 and RM3,000
________ Between RM3,001 and RM5,000
________ Above RM5,000

## Semantic Differential Scale

Several bipolar attributes are identified at the extremes of the scale, and respondents are asked to indicate their attitudes, on a semantic space, toward a particular object or event on each of the attributes.

Example:
Ugly _ _ _ _ _ _ _ _ Beautiful
Non-academically inclined _ _ _ _ _ _ _ _ Academically inclined
Coward _ _ _ _ _ _ _ _ Brave

## Numerical Scale

Similar to the semantic differential scale, but with numbers instead of a semantic space.

Example:
1. I think this subject is:
a. Very enjoyable       5 4 3 2 1 Least enjoyable
b. Very useful          5 4 3 2 1 Least useful
c. Very difficult       5 4 3 2 1 Least difficult
d. Easy to understand   5 4 3 2 1 Very difficult to understand

## Itemized Rating Scale

Can be a balanced rating scale or an unbalanced rating scale.

## Example: Balanced Rating Scale

5 - Very frequently (used every day)
4 - Frequently (used at least three times a week)
3 - Sometimes (used at least once a week)
2 - Seldom (used once a month)
1 - Not used at all

## Example: Unbalanced Rating Scale

4 - Very frequently (used every day)
3 - Frequently (used at least three times a week)
2 - Sometimes (used at least once a week)
1 - Seldom (used once a month)

## Likert Scale

Examines how strongly subjects agree or disagree with statements on a five-point scale:
1 - Strongly Disagree; 2 - Disagree; 3 - Neither Agree Nor Disagree; 4 - Agree; 5 - Strongly Agree
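Likert items are typically summed into a total score, with negatively worded items reverse-scored first. A minimal sketch, with hypothetical items and responses (item q3 is assumed to be the negatively worded one):

```python
# Hypothetical five-item Likert questionnaire scored 1 (Strongly Disagree)
# to 5 (Strongly Agree). Item q3 is negatively worded, so it must be
# reverse-scored before the item scores are summed.
responses = {"q1": 4, "q2": 5, "q3": 2, "q4": 4, "q5": 3}
reverse_items = {"q3"}

def likert_total(resp, reverse, points=5):
    # Reverse-scoring on a k-point scale: new = (k + 1) - old.
    return sum((points + 1 - v) if item in reverse else v
               for item, v in resp.items())

print(likert_total(responses, reverse_items))  # 4 + 5 + 4 + 4 + 3 = 20
```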
## Fixed or Constant Sum Scale

Respondents are asked to distribute a given number of points across various items. More of an ordinal scale.

Example: Please indicate, in percentage terms, how you chose your preferred university.
ranking of the university _________
the location of the university _________
your friends/relatives are/were there _________
the courses offered _________
the facilities _________
Total: 100
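A constant-sum response is only usable if the allocated points actually add up to the fixed total. A simple validity check (the allocation values are hypothetical):

```python
# Hypothetical constant-sum response: points allocated across the five
# reasons for choosing a university. A usable response must total exactly 100.
allocation = {
    "ranking of the university": 30,
    "location of the university": 20,
    "friends/relatives there": 10,
    "courses offered": 25,
    "facilities": 15,
}

def is_valid_constant_sum(points, total=100):
    # A response is valid only if the allocated points sum to the fixed total.
    return sum(points.values()) == total

print(is_valid_constant_sum(allocation))  # True
```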
## Stapel Scale

Measures both the direction and the intensity of the attitude toward the items under study.

Example: Rate each of the following characteristics by circling the appropriate number.
a. +3 +2 +1 Communication skills -1 -2 -3
b. +3 +2 +1 Use of English language -1 -2 -3
c. +3 +2 +1 Presentation skills -1 -2 -3
## Graphic Rating Scale

An ordinal scale on which respondents indicate their answers by marking the appropriate point on a line.

Example: Please rate the facilities offered at the university.
a. Library
1 _________ 5 _________ 10

b. Labs
1 _________ 5 _________ 10
## Consensus Scale

Developed by consensus: a panel selects certain items which, in its view, measure the relevant concept.
Ranking Scales

Used to tap preferences between two objects, or among more than two objects or items (ordinal in nature).

## Paired Comparison

Respondents are asked to choose between two objects at a time. A good method if the choices are among a small number of objects; otherwise the results will not be reliable, due to respondent fatigue.

## Forced Choice

Enables respondents to rank objects relative to one another, among the alternatives provided.

Example:
Rank your preference for the following types of novels:
Classic ___________
Thriller ___________
Ghost Stories ___________
Romantic ___________

## Comparative Scale

Provides a benchmark or a point of reference to assess attitudes toward the current object, event, or situation under study.
Goodness of Measures

Once an instrument (a questionnaire) has been designed with the appropriate scaling techniques, it is critical to make sure that the instrument is accurately measuring the variable, and that it is in fact measuring the concept it was designed to measure.

Before using a measurement instrument, we need to test the goodness of the measure (item analysis is carried out to see whether the items in the instrument belong there or not) and ensure that the instrument is reliable and valid.
## Validity

A test of how well an instrument measures the particular concept it is intended to measure.

Different types of validity:
- Content validity
- Criterion-related validity
- Construct validity
## Content Validity

Ensures that the measure includes an adequate and representative set of items that tap the concept. How is it established? E.g. through an extensive literature review or a panel of judges.

## Face Validity

Indicates that the items intended to measure a concept do, on the face of it, look like they measure the concept.
## Criterion-Related Validity

Established when the measure differentiates individuals on a criterion it is expected to predict. This can be done by establishing concurrent validity or predictive validity.

- Concurrent validity is established when the scale discriminates between individuals who are known to be different.
- Predictive validity indicates the ability of the measuring instrument to differentiate among individuals with reference to a future criterion.
## Construct Validity

Testifies to how well the results obtained from the use of the measure fit the theories around which the test is designed.

- Convergent validity is established when the scores obtained with two different instruments measuring the same concept are highly correlated.
- Discriminant validity is established when, based on theory, two variables are predicted to be uncorrelated, and the scores obtained by measuring them are indeed empirically found to be so.
Reliability

The reliability of a measure indicates the extent to which it is without bias (error free) and hence ensures consistent measurement across time and across the various items in the instrument.

Reliability is an indication of the stability and consistency with which the instrument measures the concept, and helps to assess the goodness of a measure.

Example: when a measuring instrument used repeatedly gives the same results each time, the instrument is said to be reliable.
Stability of Measures

Test-Retest Reliability
This test generates a reliability coefficient by administering the same measure to the same respondents on a second occasion, at two separate points in time. If their responses are consistent across the two occasions, the questionnaire is said to be reliable.

Parallel-Form Reliability
When responses on two comparable sets of measures tapping the same construct are highly correlated, we have parallel-form reliability.
Internal Consistency of Measures

Internal consistency of measures is indicative of the homogeneity of the items in the measure that tap the construct; the items should hang together as a set. Consistency can be examined through the following:
- Inter-item consistency reliability
- Split-half reliability
Inter-Item Consistency Reliability

This is a test of the consistency of respondents' answers to all the items in a measure. To the degree that the items are independent measures of the same concept, they will be correlated with one another. The most popular test of inter-item consistency reliability is Cronbach's alpha.
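Cronbach's alpha can be computed directly from its definition: alpha = (k/(k-1)) * (1 - sum of item variances / variance of total scores), where k is the number of items. A sketch with hypothetical responses:

```python
from statistics import pvariance

# Hypothetical data: five respondents x four Likert items intended to tap
# a single construct (rows = respondents, columns = items).
items = [
    [4, 5, 4, 4],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [2, 3, 2, 3],
    [4, 4, 5, 4],
]

def cronbach_alpha(rows):
    # alpha = (k / (k - 1)) * (1 - sum of item variances / variance of totals)
    k = len(rows[0])
    item_variances = [pvariance(col) for col in zip(*rows)]
    total_variance = pvariance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

print(round(cronbach_alpha(items), 3))  # 0.919
```

A rule of thumb often cited alongside Sekaran's text is that alpha above about 0.7 indicates acceptable inter-item consistency.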
Split-Half Reliability

Split-half reliability reflects the correlation between two halves of an instrument.
Reflective versus Formative Measurement Scales

## Reflective Scale

The items (all of them!) are expected to correlate; each item in a reflective scale is assumed to have a common basis.

## Formative Scale

Used when a construct is viewed as an explanatory combination of its indicators.