
UNIT 3 CONTD.

VALIDITY AND RELIABILITY OF QUESTIONNAIRES
BY JASKARAN SINGH

CONTENTS

Introduction
Steps in questionnaire designing
Validity
Reliability
Concept of validity
Types of validity
Steps in questionnaire validation
Types and measurement of reliability
Conclusion

INTRODUCTION

Questionnaire: an important and extensively used method of data collection

Advantages of a questionnaire
Less expensive
Offers greater anonymity

Disadvantages
Application is limited
Response rate is low
Opportunities to clarify issues are lacking

Ideal requisites of a questionnaire:

Should be clear and easy to understand
Layout should be easy to read and pleasant to the eye
Sequence of questions should be easy to follow
Should be developed in an interactive style
Sensitive questions must be worded carefully


Steps in questionnaire designing


Validity

The concept of validity

Validity is the ability of an instrument to measure what it is intended to measure.

"The degree to which the researcher has measured what he has set out to measure" (Smith, 1991)

"Are we measuring what we think we are measuring?" (Kerlinger, 1973)

"The extent to which an empirical measure adequately reflects the real meaning of the concept under consideration" (Babbie, 1989)

Why validity?

Validation is carried out mainly to answer the following questions:

Is the research investigation providing answers to the research questions for which it was undertaken?

If so, is it providing these answers using appropriate methods and procedures?


Questions to ponder

Validity can be judged by the investigator, by readers of the report, and by experts in the field.

It can be established through logical reasoning and through statistical tests.

Logical thinking

Justification of each question in relation to the objectives of the study

Easy if questions relate to tangible matters

Difficult in situations where we are measuring attitudes, the effectiveness of a programme, satisfaction, etc.

Everybody's logic does not match, and there is no statistical backing

Statistical procedures

Validity can be assessed by calculating coefficients of correlation between the questions and the outcome variables.
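
As a minimal sketch (the item and outcome scores below are hypothetical), the correlation between each question and an outcome variable can be computed with NumPy:

```python
# A minimal sketch of item-outcome correlations for validity screening.
import numpy as np

# Hypothetical responses: rows = respondents, columns = questionnaire items.
items = np.array([
    [4, 3, 5, 2],
    [2, 2, 3, 1],
    [5, 4, 4, 3],
    [3, 3, 2, 2],
    [1, 2, 1, 1],
])
outcome = np.array([14, 8, 16, 10, 5])  # hypothetical outcome scores

# Pearson correlation between each item and the outcome variable.
for i in range(items.shape[1]):
    r = np.corrcoef(items[:, i], outcome)[0, 1]
    print(f"Item {i + 1}: r = {r:.2f}")
```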


Types of validity

Content validity
Face validity
Criterion-related validity
  Concurrent validity
  Predictive validity
Construct validity

CONTENT VALIDITY

Uses logical reasoning and hence is easy to apply

The extent to which a measuring instrument covers a representative sample of the domain of the aspects measured

Whether the items and questions cover the full range of the issue or problem being measured

Example: a test of knowledge of American geography lacks content validity if most of its questions are limited to a single region of the country.

Face Validity

The extent to which a measuring instrument appears valid on its surface

Each question or item on the research instrument must have a logical link with an objective of the study


CRITERION VALIDITY

The extent to which a measuring instrument accurately predicts behaviour or ability in a given area

The external standard against which the instrument is compared is called the criterion

It is of two types:
Predictive validity
Concurrent validity


Predictive validity

The test is used to predict future performance

Eg: Entrance exams; performance on these tests correlates with later performance in professional college

Eg: Written driving test


Concurrent validity

The test is used to estimate present performance or a person's ability at the present time, not attempting to predict future outcomes

Eg: Professional college exam
Eg: Driving test, pilot test


CONSTRUCT VALIDITY

Most important type of validity

Assesses the extent to which a measuring instrument accurately measures the theoretical construct it is designed to measure

Measured by correlating performance on the test with performance on a test for which construct validity has already been established

This can also be done experimentally. For example, to validate a measure of anxiety: if we hypothesize that anxiety increases when subjects are under the threat of an electric shock, then the threat of an electric shock should increase anxiety scores.

Another method is to show that scores on the new test differ across people with different levels of the outcome being measured.
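
A minimal sketch of this correlational approach, assuming hypothetical scores on a new anxiety questionnaire and on an already validated instrument:

```python
# A minimal sketch of convergent evidence for construct validity: correlate
# scores on a new instrument with scores on an established instrument.
# The data below are hypothetical.
from scipy.stats import pearsonr

new_scale = [22, 35, 18, 41, 30, 27, 15, 38]    # new anxiety questionnaire
established = [25, 33, 20, 44, 28, 29, 17, 40]  # previously validated scale

r, p = pearsonr(new_scale, established)
print(f"Convergent correlation: r = {r:.2f} (p = {p:.3f})")
# A strong positive r supports the claim that both instruments
# tap the same underlying construct.
```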


Summary of Validity

CONTENT
What it measures: Whether the test covers a representative sample of the domains to be measured
How it is accomplished: Ask experts to assess the test to establish that the items are representative of the outcome

CRITERION (CONCURRENT)
What it measures: The ability of the test to estimate present performance
How it is accomplished: Correlate performance on the test with a concurrent behaviour

CRITERION (PREDICTIVE)
What it measures: The ability of the test to predict future performance
How it is accomplished: Correlate performance on the test with a behaviour in the future

CONSTRUCT
What it measures: The extent to which the instrument measures a theoretical construct
How it is accomplished: Correlate performance on the instrument with performance on an established instrument

Steps in questionnaire validation

FACE VALIDITY

Evaluate in terms of:
Readability
Feasibility
Layout and style
Clarity of wording

CONTENT VALIDITY

Two phases:

Experts: enhancement of the content of the questionnaire (seven or more experts)

Researcher: conceptualization and domain analysis (finding common and variable parts)


How do experts evaluate validity?

Method 1: Average Congruency Percentage (ACP) [Popham, 1978]

Each expert computes the percentage of questions deemed relevant
Take the average across all experts
If the value is 90% or above, the questionnaire is considered valid

Eg: 2 experts (Expert 1: 100%, Expert 2: 80%), then ACP = 90%
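
A minimal sketch of the ACP calculation, assuming each expert simply marks every item as relevant (1) or not relevant (0):

```python
# A minimal sketch of the Average Congruency Percentage (ACP) calculation,
# using hypothetical expert ratings (1 = relevant, 0 = not relevant).
def average_congruency_percentage(ratings_by_expert):
    """ratings_by_expert: list of lists, one list of 0/1 item ratings per expert."""
    percentages = [100.0 * sum(r) / len(r) for r in ratings_by_expert]
    return sum(percentages) / len(percentages)

# Two experts rating a 5-item questionnaire.
expert_1 = [1, 1, 1, 1, 1]   # 100% of items judged relevant
expert_2 = [1, 1, 1, 1, 0]   # 80% of items judged relevant

acp = average_congruency_percentage([expert_1, expert_2])
print(f"ACP = {acp:.0f}%")   # 90%, so considered valid (Popham, 1978)
```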

CONSTRUCT VALIDITY

Examines empirically the interrelationships among items and identifies clusters of items that share sufficient variation to justify their existence as a factor or construct to be measured by the instrument

Method: factor analysis

Various items are gathered into common factors
Common factors are synthesized into fewer factors, and then the relation between each item and factor is measured
Unrelated items are eliminated
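
A minimal sketch of this approach using exploratory factor analysis from scikit-learn; the response matrix is simulated, and the two-construct structure is an assumption of the example:

```python
# A minimal sketch of grouping questionnaire items into constructs with
# exploratory factor analysis. The response data are simulated.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)

# Simulate 200 respondents answering 6 items driven by 2 latent constructs.
latent = rng.normal(size=(200, 2))
loadings = np.array([
    [0.9, 0.0], [0.8, 0.1], [0.7, 0.0],   # items 1-3 load on construct 1
    [0.0, 0.9], [0.1, 0.8], [0.0, 0.7],   # items 4-6 load on construct 2
])
responses = latent @ loadings.T + rng.normal(scale=0.3, size=(200, 6))

fa = FactorAnalysis(n_components=2, random_state=0)
fa.fit(responses)

# Items with high loadings on the same factor form one construct; items with
# uniformly low loadings would be candidates for elimination.
print(np.round(fa.components_.T, 2))
```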



Reliability

RELIABILITY

Definition: the ability of an instrument to produce reproducible results

Each time it is used, similar scores should be obtained

A questionnaire is said to be reliable if we get the same or similar answers repeatedly

Though it cannot be calculated exactly, it can be estimated using correlation coefficients


Reliability is measured in three aspects:

STABILITY
Ensures that the same results are obtained when the instrument is used on two or more consecutive occasions
Assessed with the test-retest method

INTERNAL CONSISTENCY
Ensures that all subparts of an instrument measure the same characteristic (homogeneity)
Assessed with the split-half method

EQUIVALENCE
Used when two observers study a single phenomenon simultaneously
Assessed with inter-rater reliability

Test-retest reliability (for stability)

The test is administered twice to the same participants at different times

Used for characteristics that are stable over time

Easy and straightforward approach

Useful for questionnaires, checklists, rating scales, etc.

Disadvantages
Practice effect (mainly for tests)
Too short an interval in between (effect of memory)
Some traits may change with time

Statistical calculation

Administer the instrument to a sample on two different occasions

Compare the scores by calculating the Pearson correlation coefficient
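
A minimal sketch of this calculation with SciPy, using hypothetical scores from the same respondents on two occasions:

```python
# A minimal sketch of estimating test-retest reliability with Pearson's r.
from scipy.stats import pearsonr

time_1 = [12, 18, 25, 9, 22, 30, 14, 20]   # first administration
time_2 = [13, 17, 27, 10, 21, 29, 15, 19]  # second administration (same people)

r, _ = pearsonr(time_1, time_2)
print(f"Test-retest reliability: r = {r:.2f}")
# An r of about 0.7 or higher is usually read as a strong relationship
# (see the interpretation table below).
```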


Correlation coefficient

Measures the degree of relationship between two sets of scores
Can range from -1 to +1
0 indicates the absence of any relationship

Correlation coefficient      Strength of relationship
+/- 0.7 to 1.0               Strong
+/- 0.3 to 0.69              Moderate
+/- 0.0 to 0.29              None to weak
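
A small helper that restates the interpretation table above in code:

```python
# A small helper mapping a correlation coefficient to the strength
# categories used on this slide.
def correlation_strength(r):
    magnitude = abs(r)
    if magnitude >= 0.7:
        return "Strong"
    if magnitude >= 0.3:
        return "Moderate"
    return "None to weak"

print(correlation_strength(0.85))   # Strong
print(correlation_strength(-0.45))  # Moderate
print(correlation_strength(0.12))   # None to weak
```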

Split-half reliability (homogeneity)

Especially appropriate when the test is very long

Split the contents of the questionnaire into two equivalent halves, either odd/even items or first/second half

Correlate the scores of one half with the scores of the other half

Formula (Pearson): r = Σ(x - x̄)(y - ȳ) / √[ Σ(x - x̄)² · Σ(y - ȳ)² ]

But this r applies only to half the test, so to estimate the reliability of the entire test, use the Spearman-Brown formula:

R = 2r / (1 + r)
(r = split-half coefficient, R = coefficient of the entire test)
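
A minimal sketch of the split-half procedure with the correction for test length, assuming a hypothetical matrix of item scores:

```python
# A minimal sketch of split-half reliability with the Spearman-Brown correction.
import numpy as np

# Hypothetical item scores: rows = respondents, columns = items.
scores = np.array([
    [4, 3, 5, 2, 4, 3],
    [2, 2, 3, 1, 2, 2],
    [5, 4, 4, 3, 5, 4],
    [3, 3, 2, 2, 3, 3],
    [1, 2, 1, 1, 2, 1],
])

# Odd/even split of the items, then total score per half for each respondent.
odd_half = scores[:, 0::2].sum(axis=1)
even_half = scores[:, 1::2].sum(axis=1)

# Pearson correlation between the two halves (reliability of half the test).
r_half = np.corrcoef(odd_half, even_half)[0, 1]

# Spearman-Brown correction for the full-length test: R = 2r / (1 + r).
r_full = 2 * r_half / (1 + r_half)
print(f"Split-half r = {r_half:.2f}, full-test R = {r_full:.2f}")
```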


Inter-rater reliability (Equivalence)

Inter-rater reliability refers to statistical measurements that determine how similar the data collected by different raters are. A rater is someone who is scoring or measuring a performance, behaviour, or skill in a human or animal.

Used when a single event is measured simultaneously and independently by two or more trained observers

R = Number of agreements / (Number of agreements + Number of disagreements)
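
A minimal sketch of this percent-agreement calculation, using hypothetical ratings from two raters:

```python
# A minimal sketch of inter-rater reliability as percent agreement,
# using hypothetical categorical ratings of the same ten observations.
rater_a = ["yes", "no", "yes", "yes", "no", "yes", "no", "no", "yes", "yes"]
rater_b = ["yes", "no", "yes", "no", "no", "yes", "no", "yes", "yes", "yes"]

agreements = sum(a == b for a, b in zip(rater_a, rater_b))
disagreements = len(rater_a) - agreements

# R = agreements / (agreements + disagreements)
R = agreements / (agreements + disagreements)
print(f"Inter-rater reliability (percent agreement): {R:.2f}")
```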


Summary of Reliability

TEST-RETEST
What it measures: Stability over time
How it is accomplished: Administer the same test to the same people at two different times

SPLIT HALF
What it measures: Equivalency of items
How it is accomplished: Correlate performance for a group of people on two equivalent halves of the same test

INTER-RATER
What it measures: Agreement between raters
How it is accomplished: Have multiple researchers apply the same instrument and determine the percentage of agreement between them

Conclusion

Validated questionnaire

One which has undergone a validation procedure to show that it accurately measures what it aims to, regardless of who responds, when they respond, and to whom they respond or when self-administered, and whose reliability has also been examined, thereby:

Reducing bias and ambiguities
Providing better quality data and credible information


In a nutshell . . .

A questionnaire can be reliable but invalid,
but a valid questionnaire is always reliable.

THANK YOU
