
MEASUREMENT:

SCALING, RELIABILITY, VALIDITY


Chapter 7
Objectives
Know the characteristics and power of the four types of scales: nominal, ordinal, interval, and ratio
Know how and when to use the different forms of rating scales
Be able to explain stability and consistency and how they are established
Be able to explain the difference between reflective and formative scales
Be conversant with the different forms of validity
Be able to discuss what goodness of measures means, and why it is necessary to establish it in research
DETAILS OF STUDY AND MEASUREMENT

[Figure: the research design framework. The problem statement feeds into the details of the study and the measurement decisions, which in turn feed into data analysis.]

Details of study:
- Purpose of the study: exploration, description, hypothesis testing
- Types of investigation: establishing causal relationships, correlations, group differences/ranks
- Extent of researcher interference: minimal (studying events as they normally occur); manipulation and/or control and/or simulation
- Study setting: contrived, non-contrived
- Unit of analysis (population to be studied): individuals, dyads, groups, organizations, machines, etc.
- Time horizon: one-shot (cross-sectional), longitudinal

Measurement:
- Measurement and measures: operational definition, items (measures), scaling, categorizing, coding
- Sampling design: probability/non-probability, sample size (n)
- Data collection method: observation, interview, questionnaire, physical measurement, unobtrusive

Data analysis:
1. Feel for data
2. Goodness of data
3. Hypotheses testing
Scales

Nominal Scale - allows the researcher to assign subjects to certain categories or groups, e.g. gender (male or female), yes/no, country of origin, race, eye colour, etc.

Ordinal Scale - categorizes the variables in a way that denotes the differences between them, i.e. rank order, e.g. low income, medium income, high income; ranking of importance; etc. It indicates relative standing among individuals.

Example:
Please rank your preferences for the following subjects, where 1 represents the subject you prefer most and 6 your least preferred subject.
Scales - example: ordinal scale

My preferred subjects: Rank


Business Research Method
Writing for Business Purposes
Creative Problem Solving
Statistics
Human Resource Management
Economics
Scales

Interval Scale - more powerful than the first two scales: it taps the differences, the order, and the equality of the magnitude of the differences in the variable.
The origin (starting point) may be any arbitrary number.

Example
Strongly Disagree 1
Disagree 2
Neither Agree Nor Disagree 3
Agree 4
Strongly Agree 5
Scales

Ratio Scale - not only measures the magnitude of the differences between points on the scale but also taps the proportions in the differences. It has a unique zero origin (not an arbitrary one) and has all the properties of the other scales.

Example
Age, Weight, Height, Income, CGPA, Number of
organizations the individuals have worked for
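The four scale types differ in which operations on the data are meaningful. A minimal Python sketch (the data below are hypothetical) of what each scale permits:

```python
from collections import Counter

# Hypothetical data illustrating the four scale types.
gender = ["male", "female", "female", "male"]    # nominal
income_band = ["low", "high", "medium", "low"]   # ordinal
likert = [2, 5, 4, 3]                            # interval (arbitrary origin)
age = [21, 34, 28, 45]                           # ratio (true zero)

# Nominal: only counting category membership is meaningful.
print(Counter(gender))

# Ordinal: ranking is meaningful; arithmetic on the ranks is not.
order = {"low": 0, "medium": 1, "high": 2}
print(sorted(income_band, key=order.get))

# Interval: differences are meaningful ("5 is 3 points above 2"),
# but ratios are not ("4 is twice 2" has no meaning here).
print(likert[1] - likert[0])

# Ratio: both differences and proportions are meaningful.
print(age[1] / age[0])
```

Each step down the list adds power: a ratio variable supports everything an interval, ordinal, or nominal variable supports, but not vice versa.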
Rating Scales

Dichotomous Scale - used to elicit a yes or no answer.

Category Scale - uses multiple items to elicit a single response, as in the following example.

Example:
Please state which category your income falls into:
________ Between RM1,500 - RM3,000
________ Between RM3,001 - RM5,000
________ Above RM5,000
Rating Scales

Semantic Differential Scale - several bipolar attributes are identified at the extremes of the scale, and the respondents are asked to indicate their attitudes, on a semantic space, toward a particular object or event on each of the attributes.

Example:
Ugly _ _ _ _ _ _ _ _ Beautiful
Non academically inclined _ _ _ _ _ _ _ _ Academically Inclined
Coward _ _ _ _ _ _ _ _ Brave
Rating Scales

Numerical Scale - similar to the semantic differential scale, but with numbers on the scale instead of a semantic space.

Example:
1. I think this subject is:

a. Very Enjoyable 5 4 3 2 1 Least Enjoyable


b. Very useful 5 4 3 2 1 Least Useful
c. Very Difficult 5 4 3 2 1 Least Difficult
d. Easy to Understand 5 4 3 2 1 Very difficult to understand
Rating Scales

Itemized Rating Scale - can be either a balanced rating scale or an unbalanced rating scale.

Example: Balanced Rating Scale

5 - Very frequently (used every day)
4 - Frequently (used at least three times a week)
3 - Sometimes (used at least once a week)
2 - Seldom (used about once a month)
1 - Not used at all
Rating Scales

Example: Unbalanced Rating Scale

4 - Very frequently (used every day)
3 - Frequently (used at least three times a week)
2 - Sometimes (used at least once a week)
1 - Seldom (used about once a month)

Likert Scale - examines how strongly subjects agree or disagree with statements on a 5-point scale:
1 = Strongly Disagree; 2 = Disagree; 3 = Neither Agree Nor Disagree; 4 = Agree; 5 = Strongly Agree
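Likert items are commonly scored by summing across the items, after reverse-coding any negatively worded item so that a higher score always means stronger agreement with the construct. A sketch with hypothetical responses:

```python
# Hypothetical responses to three 5-point Likert items
# (1 = Strongly Disagree ... 5 = Strongly Agree).
# item3 is negatively worded, so it must be reverse-coded.
responses = {"item1": 4, "item2": 5, "item3_negative": 2}

# On a 5-point scale, reverse-coding maps a response x to 6 - x,
# so "2" on a negative item becomes "4" in the construct's direction.
item3 = 6 - responses["item3_negative"]

total_score = responses["item1"] + responses["item2"] + item3
print(total_score)  # 4 + 5 + 4 = 13
```

On a k-point scale the reverse-coding rule generalizes to (k + 1) - x.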
Rating Scales

Fixed or Constant Sum Scale - respondents are asked to distribute a given number of points across various items. This is more of an ordinal scale.

Example: Please indicate, in percentage terms, how much each of the following influenced your choice of university (the points should total 100):
ranking of the university _________
the location of the university __________
your friends/relatives are/were there _________
the courses offered __________
the facilities __________
Total: 100
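Because a constant-sum response is only interpretable when the allocations exhaust the fixed total, a basic data-cleaning check is worthwhile. A sketch with a hypothetical response (the factor names mirror the example above):

```python
# Hypothetical constant-sum response: 100 points distributed
# across the five reasons for choosing a university.
allocation = {
    "ranking": 30,
    "location": 20,
    "friends_relatives": 10,
    "courses_offered": 25,
    "facilities": 15,
}

# The allocations must total exactly the fixed number of points;
# responses that do not sum to 100 should be queried or discarded.
assert sum(allocation.values()) == 100

# The data are ordinal: the sizes of the allocations rank the reasons.
print(max(allocation, key=allocation.get))  # the most influential reason
```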
Rating Scales

Stapel Scale - measures both the direction and the intensity of the attitude toward the items under study.

Example:
Please rate your lecturer's abilities with respect to each of the following characteristics by circling the appropriate number.
a. +3 +2 +1 Communication skills -1 -2 -3
b. +3 +2 +1 Use of English language -1 -2 -3
c. +3 +2 +1 Presentation skills -1 -2 -3
Rating Scales

Graphic Rating Scale - an ordinal scale that asks respondents to indicate their answer to a particular question by marking the appropriate point on a line.

Example:
Please rate the facilities offered at the university.
a. Library
1 5 10
Very bad Adequate Excellent

b. Labs
1 5 10
Very bad Adequate Excellent
Rating Scales

Consensus Scale - developed by consensus, where a panel selects certain items that, in its view, measure the relevant concept.
Ranking Scales

Used to tap preferences between two objects, or among several objects or items (ordinal in nature).

Paired Comparison - respondents are asked to choose between two objects at a time. A good method when the choices are among a small number of objects; otherwise the results will not be reliable owing to respondent fatigue.

Forced Choice - enables respondents to rank objects relative to one another, among the alternatives provided.
Ranking Scales
Ranking Scales

Example:
Rank your preference for the following types of novels:
Classic ___________
Thriller ___________
Ghost Stories ___________
Romantic ___________

Comparative Scale - provides a benchmark or point of reference for assessing attitudes towards the current object, event, or situation under study.
Goodness of Measures

Once an instrument (questionnaire) has been designed with the appropriate scaling techniques, it is critical to make sure that it accurately measures the variable and that it is in fact measuring the concept it was designed to measure.

Before using a measurement instrument, we need to test the goodness of the measure (item analysis is carried out to see whether the items in the instrument belong there) and ensure that the instrument is reliable and valid.
Goodness of Measures

Validity is a test of how well a developed instrument measures the particular concept it is intended to measure.

Different types of validity:
- Content validity
- Criterion-related validity
- Construct validity
Goodness of Measures Content Validity

Content Validity - ensures that the measure includes an adequate and representative set of items that tap the concept.
How? E.g. an extensive literature review, a panel of judges.

Face Validity - indicates that the items intended to measure a concept do, on the face of it, look like they measure the concept.
Goodness of Measures Criterion Related Validity

Criterion-Related Validity - established when the measure differentiates individuals on a criterion it is expected to predict. This can be done by establishing concurrent validity or predictive validity.

- Concurrent validity is established when the scale discriminates between individuals who are known to be different.

- Predictive validity indicates the ability of the measuring instrument to differentiate among individuals with reference to a future criterion.
Goodness of Measures Construct Validity

Construct Validity - testifies to how well the results obtained from the use of the measure fit the theories around which the test is designed.

- Convergent validity is established when the scores obtained with two different instruments measuring the same concept are highly correlated.

- Discriminant validity is established when, based on theory, two variables are predicted to be uncorrelated, and the scores obtained by measuring them are indeed empirically found to be so.
Reliability
The reliability of a measure/test indicates the extent to which it is without bias (error free) and hence ensures consistent measurement across time and across the various items in the instrument.
Reliability is an indication of the stability and consistency with which the instrument measures the concept, and it helps to assess the goodness of a measure.
Example:
When a measuring instrument, used repeatedly, gives the same results each time, it is said to be a reliable instrument.
Reliability
Stability of Measures
Test-Retest Reliability
This test generates a reliability coefficient by repeating the same measure on a second occasion.
Respondents are asked to answer the same questionnaire at two separate times.
If their responses are consistent on these two occasions, the questionnaire is said to be reliable.
Parallel-Form Reliability
When responses on two comparable sets of measures tapping the same construct are highly correlated, we have parallel-form reliability.
Reliability
Internal Consistency of Measures
Internal consistency of measures is indicative of the homogeneity of the items in the measure that tap the construct.
The items should hang together as a set.
Consistency can be examined through the following:
- Inter-item consistency reliability
- Split-half reliability
Reliability
Internal Consistency of Measures
Inter-Item Consistency Reliability
This is a test of the consistency of respondents' answers to all the items in a measure.
To the degree that the items are independent measures of the same concept, they will be correlated with one another.
The most popular test of inter-item consistency reliability is Cronbach's alpha.
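Cronbach's alpha can be computed directly from its definition, alpha = k/(k-1) * (1 - (sum of item variances) / (variance of total scores)). A self-contained sketch with hypothetical item scores:

```python
# Hypothetical scores: rows are respondents, columns are the k = 3
# items of one measure (e.g. 5-point Likert items).
scores = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 5],
]

def variance(xs):
    """Population variance of a list of numbers."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def cronbach_alpha(rows):
    k = len(rows[0])  # number of items
    item_vars = sum(variance([r[j] for r in rows]) for j in range(k))
    total_var = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Values above roughly 0.7 are conventionally considered acceptable.
print(round(cronbach_alpha(scores), 2))  # 0.93
```

When the items correlate highly, the variance of the total scores is much larger than the sum of the item variances, which pushes alpha toward 1.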
Split-Half Reliability
Split-half reliability reflects the correlation between two halves of an instrument.
Reflective versus formative measurement scales

Reflective Scale - the items (all of them!) are expected to correlate; each item in a reflective scale is assumed to share a common basis.

Formative Scale - used when a construct is viewed as an explanatory combination of its indicators.