
Session 3: Reliability and Validity
Dr. Ngo Tuyet Mai
Email: maint@hanu.edu.vn

Looking Back before Looking Forward
What did you LEARN from the PREVIOUS Session?

Session Objectives
At the end of the session, students will be familiar with the two most important qualities of assessment: Validity and Reliability.

A Valid Assessment is
A valid assessment is one which
provides information on the ability
we want to assess and nothing else
(Brindley, 2003, p. 310).
A test is said to be valid if it
measures accurately what it is
intended to measure (Hughes,
2003, p. 26).

Construct Validity
Empirical evidence must be provided as part of demonstrating a test's validity.
Forms of evidence: Content Validity and Criterion-related Validity

Note: 'Construct' refers to any underlying ability (or trait) that is hypothesized in a theory of language ability.

Content Validity
A test is said to have content validity if its content constitutes a representative sample of the relevant language skills, structures, etc. (depending on the purpose of the test).
In order to judge whether or not a test has content validity, we need a specification of the skills or structures, etc. that it is meant to cover.

Content Validity (Cont'd)
The greater a test's content validity, the more likely it is to be an accurate measure of the construct it is intended to assess.
Under-representing major areas (identified in the specification) has a harmful backwash effect: areas NOT TESTED are likely to become areas IGNORED in teaching and learning.
FULL TEST SPECIFICATIONS are needed

Criterion-related Validity
Criterion-related validity relates to the degree to which results on the test agree with those provided by some independent and highly dependable assessment of the candidates' ability.

How to make tests MORE VALID?

How to make tests more VALID
1. Write explicit specifications for the test
(include a representative sample of
the content in the test)
2. Use direct testing
3. Make sure the scoring of responses
relates directly to what is being tested.
4. Do everything possible to make the test reliable. If the test is not reliable, it cannot be valid.

A Reliable Assessment
Reliability refers to the consistency
with which our assessment tools
measure language ability. An
assessment is reliable when there is little difference in learners' scores or in judges' ratings across different occasions or different judges.
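One rough way to quantify the "different judges" side of this definition is to correlate two judges' independent ratings of the same candidates. The sketch below is an illustration only, not part of the original slides; the scores and the comparison are invented for the example.

# Minimal sketch (invented data): scorer reliability as the correlation
# between two judges' independent ratings of the same eight candidates.
from statistics import correlation  # available in Python 3.10+

judge_a = [14, 11, 17, 9, 15, 12, 18, 10]  # hypothetical ratings by judge A
judge_b = [13, 12, 16, 9, 14, 13, 17, 11]  # hypothetical ratings by judge B

# A coefficient close to 1.0 suggests the judges rank candidates consistently;
# a noticeably lower value signals a scoring-reliability problem.
print(f"Inter-rater correlation: {correlation(judge_a, judge_b):.2f}")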

Two Components of TEST RELIABILITY
The performance of candidates from
occasion to occasion
The reliability of scoring

How to make tests MORE RELIABLE?

How to make tests MORE RELIABLE
1. Take enough samples of behavior
2. Exclude items which do not discriminate well between weaker and stronger students (a discrimination-index sketch follows point 15 below)
3. Do NOT allow candidates too much
freedom
4. Write unambiguous items
5. Provide clear and explicit
instructions
6. Ensure that tests are well laid out

How to make tests MORE RELIABLE (Cont'd)
7. Make candidates familiar with format and testing
techniques
8. Provide uniform and non-distracting conditions
9. Use items that permit objective scoring
10. Make comparisons between candidates as
direct as possible
11. Provide a detailed scoring key
12. Train scorers
13. Agree acceptable responses and appropriate scores at the outset of scoring
14. Identify candidates by number, not name
15. Employ multiple independent scoring
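The discrimination idea in point 2 above can be made concrete with a simple index: the proportion of the top-scoring group answering an item correctly minus the proportion of the bottom-scoring group. The sketch below is illustrative only; the data and the helper function are invented, not taken from the slides.

# Minimal sketch (invented data): a discrimination index for a single item.
def discrimination_index(item_correct, total_scores, group_fraction=0.27):
    # D = proportion correct in the top group minus proportion correct in the
    # bottom group; items with D near zero (or negative) discriminate poorly.
    ranked = sorted(zip(total_scores, item_correct), reverse=True)
    n = max(1, int(len(ranked) * group_fraction))
    top = [correct for _, correct in ranked[:n]]
    bottom = [correct for _, correct in ranked[-n:]]
    return sum(top) / n - sum(bottom) / n

item_correct = [1, 1, 0, 1, 0, 0, 1, 1, 0, 0]            # 1 = correct, 0 = incorrect
total_scores = [48, 45, 30, 44, 28, 25, 47, 46, 27, 29]  # candidates' total scores
print(f"Discrimination index D = {discrimination_index(item_correct, total_scores):.2f}")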

Reflection
In some educational settings, the quality of language programs is judged solely on the basis of students' test scores. Is this fair?
A teacher wants to assess the listening
comprehension of her advanced ESL/EFL
class. She plays them a radio news bulletin
and asks them to write a short paragraph
outlining the main points. Is this a valid
assessment of listening ability? Why or
why not?
