Essay III Principles of Language Assessment

PRINCIPLES OF LANGUAGE ASSESSMENT
RIAS WITA SURYANI

(16178070)
1. INTRODUCTION
A test can be used as a valuable teaching tool, so teachers should be able to design good
tests for their students. A good test should have a positive effect on learning and teaching
and should result in improved learning habits. Such a test aims to locate the specific
areas of difficulty experienced by the class or the individual student so that
assistance in the form of additional practice and corrective exercises can be given. The test
should enable the teacher to find out which parts of the language program cause difficulty for
the class. In this way, the teacher can evaluate the effectiveness of the syllabus as well as the
methods and materials he/she is using. A good test should also motivate students by measuring
their performance without in any way setting "traps" for them. A well-developed test should
provide an opportunity for students to show their ability to perform certain language tasks. A
test should be constructed with the goal of having students learn from their weaknesses. Thus,
to design good assessments, teachers should pay attention to validity, reliability, practicality,
authenticity, and washback.
2. PRINCIPLES OF LANGUAGE ASSESSMENT
Brown (2010), in his book Language Assessment: Principles and Classroom
Practices, identified five major principles of language assessment:
practicality, reliability, validity, authenticity, and washback.
a. Practicality
A good test is practical. It stays within financial limitations and time
constraints and is easy to administer, score, and interpret. According to
Bachman and Palmer (1996), practicality is the relationship between the resources
that will be required in design, development, and use of the test and the resources
that will be available for these activities. It means that it focuses on how the test is
conducted. Moreover, Bachman and Palmer (1996) classified the addressed
resources into three types: human resources, material resources, and time. Based on
this definition, practicality can be measured by the availability of the resources
required to develop and conduct the test. Therefore, a language test can be
judged as either practical or impractical. A practical test:
- Stays within budgetary limits
- Can be completed by the test-taker within appropriate time constraints
- Has clear directions for administration
- Appropriately utilizes available human resources
- Does not exceed available material resources
- Considers the time and effort involved for both design and scoring
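
One way to read the definition above is as a comparison between the resources a test requires and the resources that are available. The following is only a minimal sketch under that reading: the `Resources` class, its three fields, the `is_practical` function, and all the numbers are hypothetical and are not taken from Brown (2010) or Bachman and Palmer (1996).

```python
from dataclasses import dataclass


@dataclass
class Resources:
    """Resource amounts in hypothetical units chosen for this sketch."""
    human_hours: float    # e.g. teacher and rater hours
    material_cost: float  # e.g. photocopying and equipment
    time_minutes: float   # e.g. class time for administration


def is_practical(required: Resources, available: Resources) -> bool:
    """Treat a test as practical when no required resource exceeds what is available."""
    return (
        required.human_hours <= available.human_hours
        and required.material_cost <= available.material_cost
        and required.time_minutes <= available.time_minutes
    )


# Illustrative numbers only (assumptions, not data from the essay).
required = Resources(human_hours=6, material_cost=40, time_minutes=90)
available = Resources(human_hours=8, material_cost=50, time_minutes=60)
print(is_practical(required, available))  # False: 90 minutes needed, 60 available
```

In this illustration the test fits the budget and the staffing but not the class period, so it would have to be shortened or split before it could be called practical.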
b. Reliability
A good test should be reliable. This means that the results of a test should be
dependable and consistent: they should remain stable and should not differ
when the test is used on different days. If the same or a similar group of
students takes the same test on two occasions and their results are roughly the
same, the test can be called reliable. If the results are very different, the
test is not reliable. A number of sources of unreliability may be identified:
the test itself (known as test reliability), the administration of the test, the
test-taker (known as student-related reliability), and the scoring of the test
(known as rater reliability). A reliable test:
- Is consistent in its conditions across two or more administrations
- Gives clear directions for scoring / evaluation
- Has uniform rubrics for scoring /evaluation
- Lends itself to consistent application of those rubrics by the scorer
- Contains items/tasks that are unambiguous to the test-taker
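
The idea of giving the same test on two occasions and comparing the results can be illustrated with a correlation between the two sets of scores. This is a minimal sketch, assuming that a Pearson correlation is an acceptable way to express "roughly the same" results; the essay's sources do not prescribe this calculation here, and the function name and the scores are hypothetical.

```python
import math


def pearson_r(first: list[float], second: list[float]) -> float:
    """Pearson correlation between two equally long lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    mean_1 = sum(first) / len(first)
    mean_2 = sum(second) / len(second)
    covariance = sum((a - mean_1) * (b - mean_2) for a, b in zip(first, second))
    spread_1 = math.sqrt(sum((a - mean_1) ** 2 for a in first))
    spread_2 = math.sqrt(sum((b - mean_2) ** 2 for b in second))
    if spread_1 == 0 or spread_2 == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return covariance / (spread_1 * spread_2)


# Hypothetical scores for the same five students on two occasions.
occasion_1 = [70, 82, 65, 90, 77]
occasion_2 = [72, 80, 68, 88, 75]
r = pearson_r(occasion_1, occasion_2)
print(round(r, 2))
```

A value close to 1 suggests that students were ranked in much the same way on both occasions; a low value suggests that something in the test, its administration, the test-takers, or the scoring is producing inconsistent results.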
c. Validity
Validity is the degree to which the test actually measures what it is intended to
measure. In other words, test what you teach and how you teach it. Types of validity
include face validity, content validity, criterion-related validity, and construct
validity. For classroom teachers, content validity means that the test assesses the
course content and the outcomes using formats familiar to the students. Construct
validity refers to the 'fit' between the underlying theories and methodology of
language learning and the type of assessment. Face validity means that the test
looks as though it measures what it is supposed to measure. This is an important
factor for both students and administrators. A valid test:
- Measures exactly what it proposes to measure
- Does not measure irrelevant or contaminating variables.
- Relies as much as possible on empirical evidence (performance)
- Involves performance that samples the test's criterion (objective)
- Offers useful, meaningful information about a test-taker's ability
- Is supported by a theoretical rationale or argument
d. Authenticity
A test must be authentic, and it is important for teachers to be able to claim
authenticity for a test. Bachman and Palmer, as cited in Brown (2010), define
authenticity as the degree of correspondence of the characteristics of a given
language test task to the features of a target language task. It means that the
test task is one that is likely to be enacted in the real world.
Several things must be considered in making an authentic test:
- The language used in the test should be natural
- The items should be contextualized
- The topics in the test should be meaningful and interesting for the
learners
- The items should be organized thematically
- The tasks must be based on the real world
e. Washback
Washback is the impact of a test on teaching and learning. When students
take a test, the test can have a positive or a negative effect; these are called
beneficial washback and harmful washback. The challenge to teachers is to create
classroom tests that serve as learning devices through which beneficial washback
is achieved. Washback enhances intrinsic motivation, autonomy, self-confidence,
language ego, interlanguage, and strategic investment in the students. Moreover,
Brown (2010) stated that, instead of giving letter grades and numerical scores,
which give no information about the students' performance, giving generous and
specific comments is a way to enhance washback. A test that provides beneficial
washback:
- Positively influences what and how teachers teach
- Positively influences what and how learners learn
- Offers learners a chance to adequately prepare
- Gives learners feedback that enhances their language development
- Is more formative in nature than summative
- Provides conditions for peak performance by the learner
3. APPLYING PRINCIPLES TO THE EVALUATION OF CLASSROOM TESTS
The five principles (practicality, reliability, validity, authenticity, and
washback) can serve as guidelines for evaluating classroom tests step by step.
Brown (2010) offered eight tips, framed as questions, based on these five
principles:
1) Are the test procedures practical?

- Are administrative details clearly established before the test?
- Can students complete the test reasonably within the set time frame?
- Can the test be administered smoothly, without procedural glitches?
- Are all materials and equipment ready?
- Is the cost of the test within budgeted limits?
- Is the scoring/evaluation system feasible in the teacher's time frame?
- Are methods for reporting results determined in advance?
2) Is the test itself reliable?

- Every student has a cleanly photocopied test sheet
- Sound amplification is clearly audible to everyone in the room
- Video input is equally visible to all
- Lighting, temperature, extraneous noise, and other classroom conditions are
equal (and optimal) for all students
- Objective scoring procedures leave little debate about correctness of an
answer
3) Can you ensure rater reliability?

- Have you used consistent sets of criteria for correct responses?
- Can you give uniform attention to those sets throughout the evaluation
time?
- Can you guarantee that scoring is based only on the established criteria
and not on extraneous or subjective variables?
- Have you read through tests at least twice to check for your consistency?
- If you have made mid-stream modifications of what you consider a
correct response, did you go back and apply the same standards to all?
- Can you avoid fatigue by reading the tests in several sittings, especially if
the time requirement is a matter of several hours?
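
The advice to read through tests at least twice can be checked with a simple comparison of the scores from the two readings. This is a minimal sketch, assuming that exact agreement on a rubric score is the measure of consistency; the function names and the scores are hypothetical and are not part of Brown's checklist.

```python
def agreement_rate(first_reading: list[int], second_reading: list[int]) -> float:
    """Share of scripts that received the same score on both readings."""
    if len(first_reading) != len(second_reading) or not first_reading:
        raise ValueError("need two non-empty score lists of equal length")
    matches = sum(a == b for a, b in zip(first_reading, second_reading))
    return matches / len(first_reading)


def scripts_to_recheck(first_reading: list[int], second_reading: list[int]) -> list[int]:
    """Positions (starting at 0) of scripts whose two scores differ."""
    pairs = zip(first_reading, second_reading)
    return [i for i, (a, b) in enumerate(pairs) if a != b]


# Hypothetical rubric scores (1 to 5) for six essays read twice by one teacher.
first = [4, 3, 5, 2, 4, 3]
second = [4, 3, 4, 2, 4, 2]
print(agreement_rate(first, second))      # 4 of the 6 scripts match
print(scripts_to_recheck(first, second))  # [2, 5]
```

Scripts that receive different scores on the two readings are the ones worth re-examining against the established criteria.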
4) Does the procedure demonstrate content validity?

- Are unit objectives clearly identified?
- Are unit objectives represented in the form of test specifications?
- Do the test specifications include tasks that have already been performed
as part of the course procedures?
- Do the test specifications include tasks that represent all (or most) of the
objectives for the unit?
- Do those tasks involve actual performance of the target tasks?
5) Has the impact of the test been carefully accounted for?

- Have you offered students appropriate review and preparation for the test?
- Have you suggested test-taking strategies that will be beneficial?
- Is the test structured so that, if possible, the best students will be modestly
challenged and the weaker students will not be overwhelmed?
- Does the test lend itself to your giving beneficial washback?
- Are the students encouraged to see the test as a learning experience?
6) Is the procedure biased for best?
According to Swain, as cited in Brown (2010), to give an assessment procedure that
is biased for best, a teacher offers students appropriate review and preparation
for the test, suggests strategies that will be beneficial, and structures the test so
that the best students will be modestly challenged and the weaker students will
not be overwhelmed.
7) Are the test tasks as authentic as possible?

- Is the language in the test as natural as possible?
- Are items as contextualized as possible rather than isolated?
- Are topics and situations interesting, enjoyable, and/or humorous?
- Is some thematic organization provided, such as through a story line or
episode?
- Do tasks represent, or closely approximate, real-world tasks?
8) Does the test offer beneficial washback to the learner?

- Is the test designed in such a way that you can give feedback that will be
relevant to the objectives of the units being tested?
- Have you given students sufficient pre-test opportunities to review the
subject matter of the test?
- In your written feedback to each student, do you include comments that will
contribute to students' formative development?
- After returning tests, do you spend class time going over the test and
offering advice on what students should focus on in the future?
- After returning tests, do you encourage questions from students?
- If time and circumstances permit, do you offer students a chance to discuss
results in an office hour?
Teachers can use these tips and checklists to evaluate test designs for their
own classrooms. Quizzes, tests, final exams, and standardized proficiency tests
can all be examined through the five principles. This overview not only helps
teachers understand what the five principles are but also shows how they are
connected to current classroom assessment design. Thus, teachers need to consider
these principles as part of daily lesson design and evaluate students based on a
variety of assessments during or after class.
4. THE INFORMATION THAT WE NEED IN LANGUAGE TEACHING/LEARNING EVALUATION
There are two types of information that we need in language teaching/learning
evaluation: qualitative information and quantitative information.
a. Qualitative Information
Qualitative information is non-numeric information based on the quality of an item
or object. For example, if you were testing water, you might say that the taste is
either nice or not so nice. This would be very much based on opinion. Testing
the quality of something and forming an opinion on it produces
qualitative information. Qualitative information might be collected by a restaurant
based on what its customers thought of the taste of the food. Customers may be
asked to choose whether the food was bad, average, good, or excellent. Since different
customers have different opinions, the information collected is not fact-based
but is a qualitative opinion on the quality of something. In language
teaching/learning evaluation, an example of qualitative information is that Fanny
has an accent when she speaks English, so a special assignment is planned in
which her friend helps her improve her accent.
b. Quantitative Information
Quantitative information is information that can be directly measured and can be
seen as factual information rather than opinion. For example, when testing water, the
fluoride content of the water might be measured in milligrams. The information
collected is number-based and provides hard facts on the quality of the water. It is
harder to argue against quantitative information. An example of a restaurant
collecting quantitative information would be asking customers how much they would
pay for a meal. This could be seen as a mix of quantitative and qualitative
information, but the end result is a numeric value based on the customers' opinion, so
it is quantitative information. In language teaching/learning evaluation, an
example of quantitative information is that the average reading speed of grade 10
students is 63 words per minute.
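
A figure such as an average reading speed in words per minute is obtained by simple arithmetic on measured values. The sketch below shows one way this could be done; the function name and the three students' readings are hypothetical and were chosen only for illustration, not taken from any class data.

```python
def words_per_minute(words_read: int, seconds: float) -> float:
    """Convert a timed reading into words per minute."""
    if seconds <= 0:
        raise ValueError("reading time must be positive")
    return words_read * 60 / seconds


# Hypothetical timed readings: (words read, seconds taken) for three students.
readings = [(130, 120), (189, 180), (61, 60)]
speeds = [words_per_minute(words, seconds) for words, seconds in readings]
average = sum(speeds) / len(speeds)
print(speeds)   # [65.0, 63.0, 61.0]
print(average)  # 63.0
```

Because every step is a measurement or a calculation, two teachers working from the same timings would arrive at the same number, which is what distinguishes this kind of information from an opinion about a student's accent.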
All information, whether qualitative or quantitative, refers to characteristics of
something: students or teacher, textbook or videotapes, texts or realia, blackboard or
ministries of education. We need to be clear about the information in order to avoid
misunderstandings.
5. CONCLUSION
In conclusion, when evaluating students, it is important to design a test that has the
features and qualities of a good test. A test is good if it has practicality, good validity,
high reliability, authenticity, and positive washback. The five principles provide guidelines
for both designing and evaluating the test. A test should be constructed with the goal of
having students learn from their weaknesses. It will locate the exact areas of difficulties
experienced by the class or the individual student so that assistance in the form of additional
practice and corrective exercises can be given. Thus, teachers should apply these five
principles in designing and evaluating the tests that will be used in assessment activities.
REFERENCES
Bachman, Lyle F. and Adrian S. Palmer. 1996. Language Testing in Practice: Designing and
Developing Useful Language Tests. Oxford: Oxford University Press.
Brown, H. Douglas and Priyanvada Abeywickrama. 2010. Language Assessment: Principles
and Classroom Practices. NY: Pearson Education.
Genesee, Fred and John A. Upshur. 2002. Classroom-based Evaluation in Second Language
Education. Cambridge: Cambridge University Press.
