
CHAPTER 1

BASIC CONCEPTS IN ASSESSMENT

Learning Outcomes

At the end of the chapter, the students should be able to:

1. Define the terms: assessment, evaluation, measurement, test, testing, formative assessment, placement assessment, diagnostic assessment, summative assessment, traditional assessment, portfolio assessment, and performance assessment;
2. Discriminate among the different purposes of assessment;
3. Differentiate the different types of assessment;
4. Identify and discuss the general principles of assessment;
5. Discuss the different guidelines for effective student assessment; and
6. Differentiate norm-referenced from criterion-referenced interpretation.

INTRODUCTION

Assessment of Learning focuses on the development and utilization of assessment tools to improve the teaching-learning process. It emphasizes the use of testing for measuring knowledge, comprehension and other thinking skills. As part of the overall evaluation process, we need specifically to find out if the learners are actually learning (changing their behavior) as a result of the teaching. This will show us whether the teaching has been effective, which is ultimately the most important issue. Assessment is a means of finding out what learning is taking place. As well as specific knowledge and skills, we might also like to measure other changes in behavior related to personality, social skills, interests, and learning styles, among others.

There is a lot of debate about how to assess learning, and especially about how to evaluate performance. Our objectives give us guidance on what to assess, because they are written in terms of what the learners should be able to do. Based on these objectives, it is very useful to identify all the activities and skills which the learners will carry out, the conditions under which they will perform these tasks and activities, the possible results which might be obtained, and the standards by which their performance will be measured.

The assessment itself can be done in different ways:

1. Ask the learners to recall facts or principles (e.g., What is 'x'?).
2. Ask the learner to apply given or recalled facts or principles (e.g., How does 'x' help you solve this problem?).
3. Ask the learner to select and apply facts and principles to solve a given problem (e.g., What do you know that will help you solve this problem?).
4. Ask the learner to formulate and solve her own problem by selecting, generating and applying facts and principles (e.g., What do I see as the problem here and how can I reach a satisfying solution?).
5. Ask the learner to perform tasks that show mastery of the learning outcomes.

Once again, we need to stress the importance of participation, and this is especially important in assessment and evaluation. Learners should be actively involved in both the development of learning objectives and, as much as possible, in their own assessment. In many education systems, assessment is used as a tool for sorting students for selection purposes (progression to a higher level of education, higher rewards, among others). Assessment where students are compared with others is known as norm-referencing. It is much better if learners are aware of what they need to learn and what they have learned, so they can set their own targets and monitor their own progress. Of course, teachers and trainers should advise the learners, and guide them in order to help them learn; this is the key role of the teacher. Assessment of learners in relation to a particular target or level of performance is called criterion-referencing.

DIFFERENT TERMINOLOGIES: ASSESSMENT, TESTING, MEASUREMENT AND EVALUATION

Assessment, measurement and evaluation mean many different things. These terms are sometimes used interchangeably in the field of education. In this section, we shall point out the fundamental differences among the terms assessment, testing, measurement and evaluation.

The term Assessment refers to the different components and activities of different schools. An assessment can be used to examine student learning and to compare student learning with the learning goals of an academic program. Assessment is defined as an act or process of collecting and interpreting information about student learning. Another source expands this statement by adding that it is a systematic process of gathering, interpreting, and using this information about student learning. It is a very powerful tool for education improvement. It focuses on the individual student or groups of individuals and on the academic program of a certain educational institution. There are different purposes of assessment, such as to provide feedback to students and to serve as a diagnostic tool for instruction. For these purposes, assessment usually answers the questions, "Was the instruction effective?" and "Did the students achieve the intended learning outcomes?"

Assessment is a general term that includes the different ways that teachers use to gather information in the classroom: information that helps teachers understand their students, information that is used to plan and monitor classroom instruction, information that is used to build a worthwhile classroom culture, and information that is used for testing and grading. The most common form of assessment is giving a test. Since a test is a form of assessment, it also answers the question, "How does the individual student perform?" A test is a formal and systematic instrument, usually a paper-and-pencil procedure, designed to assess the quality, ability, skill or knowledge of the students by giving a set of questions in a uniform manner. A test is one of the many types of assessment procedures used to gather information about the performance of students. Hence, testing is one of the different methods used to measure the level of performance or achievement of the learners. Testing also refers to the administration, scoring, and interpretation of the procedures designed to get information about the extent of the performance of the students. Oral questioning, observations, projects, performances and portfolios are the other assessment processes that will be discussed later in detail.

Measurement is a process of quantifying or assigning numbers to an individual's intelligence, personality, attitudes and values, and to the achievement of the students. In other words, measurement expresses the assessment data in terms of numerical values and answers the question, "How much?" Common examples of measurement occur when a teacher gives scores to the tests of the students, such as: Renzel got 23 correct answers out of 25 items in a Mathematics test; Princess Mae got 95% in her English first grading periodic test; Ronnick scored 88% in his laboratory test in Biology. In these examples, numerical values are used to represent the performance of the students in different subjects.
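As a simple illustration of how measurement assigns numbers to performance, the minimal sketch below converts a raw score to a percentage; the helper function is hypothetical and not part of the text, and the sample values echo the examples above.

```python
# Minimal sketch: measurement expresses performance as a number.
# percentage_score is a hypothetical helper, not from the text.

def percentage_score(correct: int, items: int) -> float:
    """Express a raw score as a percentage of the total items."""
    return 100.0 * correct / items

# Renzel: 23 correct answers out of 25 items
print(percentage_score(23, 25))  # 92.0
```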

After collecting the assessment data, the teacher will use this to make decisions
or judgment about the performance of the students in a certain instruction.

Evaluation refers to the process of judging the quality of what is good and what is desirable. It is the comparison of data to a set of standards or learning criteria for the purpose of judging worth or quality. For example, in judging the quality of an essay written by the students about their opinion regarding the first State of the Nation Address of Pres. Benigno C. Aquino, evaluation occurs after the assessment data have been collected and synthesized, because it is only at this time that the teacher is in a position to make a judgment about the performance of the students. Teachers evaluate how well or to what extent the students attained the instructional outcomes.

TYPES OF ASSESSMENT PROCEDURES

Classroom assessment procedures can be classified according to the nature of assessment, format of assessment, use in classroom instruction, and methods of interpreting the results (Gronlund and Linn, 2000).

Nature of Assessment

1. Maximum Performance
It is used to determine what individuals can do when performing at their
best. Examples of instruments using maximum performance are aptitude
tests and achievement tests.
2. Typical Performance
It is used to determine what individuals will do under natural conditions. Examples of instruments using typical performance are attitude, interest, and personality inventories, observational techniques and peer appraisal.

Format of Assessment

1. Fixed-choice Test
An assessment used to measure knowledge and skills effectively and efficiently. A standard multiple-choice test is an example of an instrument used in fixed-choice testing.
2. Complex-performance Assessment
An assessment procedure used to measure the performance of the learner in context and on problems valued in their own right. Examples of instruments used in complex-performance assessment are hands-on laboratory experiments, projects, essays, and oral presentations.

Role of Assessment in Classroom Instruction

"Teaching and learning are reciprocal processes that depend on and affect one another (Swearingen, 2002; Kellough, 1999)." The assessment component of the instructional process deals with the learning progress of the students and the teacher's effectiveness in imparting knowledge to the students.

Assessment enhances learning in the instructional process if the result provides feedback to both students and teachers. The information obtained from the assessment is used to evaluate the teaching methodologies and strategies of the teacher. It is also used to make teaching decisions. The result of assessment is used to diagnose the learning problems of the students.

Planning for assessment should start when the teacher plans his instruction, that is, from writing the learning outcomes up to the time when the teacher assesses the extent of achieving the learning outcomes. Teachers make decisions from the beginning of instruction up to the end of instruction. There are four roles of assessment used in the instructional process. The first is placement assessment, a type of assessment given at the beginning of instruction. The second and third are formative assessment and diagnostic assessment, given during instruction, and the last is summative assessment, given at the end of instruction.

1. Beginning of Instruction
Placement Assessment according to Gronlund, Linn, and Miller (2009) is concerned with the entry performance and typically focuses on the questions: Does the learner possess the knowledge and skills needed to begin the planned instruction? To what extent has the learner already developed the understanding and skills that are the goals of the planned objectives? To what extent do the student's interests, work habits, and personality indicate that one mode of instruction might be better than another? The purpose of placement assessment is to determine the prerequisite skills, the degree of mastery of the course objectives, and the best mode of learning.
2. During Instruction
During the instructional process, the main concern of a classroom teacher is to monitor the learning progress of the students. The teacher should assess whether students achieved the intended learning outcomes set for a particular lesson. If the students achieved the planned learning outcomes, the teacher should provide feedback to reinforce learning. Recent research shows that providing feedback to students is the most significant strategy to move students forward in their learning. Garrison and Ehringhaus (2007) stressed in their paper "Formative and Summative Assessment in the Classroom" that feedback provides students with an understanding of what they are doing well and links it to classroom learning. If the outcomes are not achieved, the teacher will give group or individual remediation. During this process we shall consider formative assessment and diagnostic assessment.
Formative Assessment is a type of assessment used to monitor the learning progress of the students during instruction. The purposes of formative assessment are the following: to provide immediate feedback to both student and teacher regarding the successes and failures of learning; to identify the learning errors that are in need of correction; to provide teachers with information on how to modify instruction; and to improve learning and instruction.
Diagnostic Assessment is a type of assessment given at the beginning of instruction or during instruction. It aims to identify the strengths and weaknesses of the students regarding the topics to be discussed. The purposes of diagnostic assessment are to determine the level of competence of the students; to identify the students who already have knowledge about the lesson; to determine the causes of learning problems that cannot be revealed by formative assessment; and to formulate a plan for remedial action.
3. End of Instruction
Summative Assessment is a type of assessment usually given at the end of a course or unit. The purposes of summative assessment are to determine the extent to which the instructional objectives have been met; to certify students' mastery of the intended learning outcomes, as well as to use it for assigning grades; to provide information for judging the appropriateness of the instructional objectives; and to determine the effectiveness of instruction.

Methods of Interpreting the Results

1. Norm-referenced Interpretation
It is used to describe student performance according to relative position in some known group. In this method of interpretation, it is assumed that the level of performance of students will not vary much from one class to another. Example: ranks 5th in a classroom group of 40.
2. Criterion-referenced Interpretation
It is used to describe students' performance according to a specified domain of clearly defined learning tasks. This method of interpretation is used when the teacher wants to determine how well the students have learned specific knowledge or skills in a certain course or subject matter. Examples: divides three-digit whole numbers correctly and accurately; multiplies binomial terms correctly.
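The contrast between the two interpretations can be sketched in a few lines of code (a hypothetical illustration, not from the text; the names, scores, and cutoff are invented): the same raw score is reported once as a rank within the group and once against a fixed mastery criterion.

```python
# Hypothetical sketch: one score, two interpretations.
scores = {"Ana": 48, "Ben": 35, "Carl": 42, "Dina": 45, "Elsa": 40}

def norm_referenced_rank(name: str) -> int:
    """Position of a student relative to the group (1 = highest)."""
    ranking = sorted(scores.values(), reverse=True)
    return ranking.index(scores[name]) + 1

def criterion_referenced_mastery(name: str, cutoff: int = 38) -> bool:
    """Mastery judged against a fixed criterion, ignoring the group."""
    return scores[name] >= cutoff

print(norm_referenced_rank("Carl"))          # 3 -> ranks 3rd in a group of 5
print(criterion_referenced_mastery("Carl"))  # True -> meets the 38-point cutoff
```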

There are different ways of describing classroom tests and other assessment procedures. The following summary of the different types of assessment procedures was adapted and modified from Gronlund, Linn, and Miller (2009).

Nature of Assessment
1. Maximum Performance. Function: It is used to determine what individuals can do when performing at their best. Examples of instruments: aptitude tests, achievement tests.
2. Typical Performance. Function: It is used to determine what individuals will do under natural conditions. Examples of instruments: attitude, interest, and personality inventories; observational techniques; peer appraisal.

Format of Assessment
1. Fixed-choice Test. Function: An assessment used to measure knowledge and skills effectively and efficiently. Examples of instruments: standard multiple-choice tests.
2. Complex-performance Assessment. Function: An assessment procedure used to measure the performance of the learner in contexts and on problems valued in their own right. Examples of instruments: hands-on laboratory experiments, projects, essays, oral presentations.

Use in Classroom Instruction
1. Placement. Function: An assessment procedure used to determine the learner's prerequisite skills, degree of mastery of the course goals, and/or best modes of learning. Examples of instruments: readiness tests, aptitude tests, pretests on course objectives, self-report inventories, observational techniques.
2. Formative. Function: An assessment procedure used to determine the learner's learning progress, provide feedback to reinforce learning, and correct learning errors. Examples of instruments: teacher-made tests, custom-made tests from textbook publishers, observational techniques.
3. Diagnostic. Function: An assessment procedure used to determine the causes of the learner's persistent learning difficulties, such as intellectual, physical, emotional, and environmental difficulties. Examples of instruments: published diagnostic tests, teacher-made diagnostic tests, observational techniques.
4. Summative. Function: An assessment procedure used to determine end-of-course achievement for assigning grades or certifying mastery of objectives. Examples of instruments: teacher-made survey tests, performance rating scales, product scales.

Methods of Interpreting Results
1. Criterion-referenced. Function: It is used to describe student performance according to a specified domain of clearly defined learning tasks. Example: multiplies three-digit whole numbers correctly and accurately. Examples of instruments: teacher-made tests, custom-made tests from textbook publishers, observational techniques.
2. Norm-referenced. Function: It is used to describe a student's performance according to relative position in some known group. Example: ranks 5th in a classroom group of 40. Examples of instruments: standardized aptitude and achievement tests, teacher-made survey tests, interest inventories, adjustment inventories.

OTHER TYPES OF TEST

Other descriptive terms are used to describe tests in contrasting pairs, such as non-standardized versus standardized tests; objective versus subjective tests; supply versus fixed-response tests; individual versus group tests; mastery versus survey tests; and speed versus power tests.

Non-standardized Test versus Standardized Test

1. Non-standardized test is a type of test developed by the classroom teachers.


2. Standardized test is a type of test developed by test specialists. It is administered, scored and interpreted under standard conditions.

Objective Test versus Subjective Test

1. Objective test is a type of test in which two or more evaluators give an examinee the same score.
2. Subjective test is a type of test in which the scores are influenced by the judgment of the evaluators, meaning there is no one correct answer.

Supply Test versus Fixed-response Test

1. Supply test is a type of test that requires the examinees to supply an answer,
such as an essay test item or completion or short answer test item.
2. Fixed-response test is a type of test that requires the examinees to select an answer from given options, such as a multiple-choice test, matching type test, or true/false test.

Individual Test versus Group Test

1. Individual test is a type of test administered to a student on a one-on-one basis using oral questioning.
2. Group test is a type of test administered to a group of individuals or group of
students.

Mastery Test versus Survey Test

1. Mastery test is a type of achievement test that measures the degree of mastery of a limited set of learning outcomes using criterion-referenced interpretation of the results.
2. Survey test is a type of test that measures students' general achievement over a broad range of learning outcomes using norm-referenced interpretation of the results.

Speed Test versus Power Test

1. Speed test is designed to measure the number of items an individual can complete over a certain period of time.

2. Power test is designed to measure the level of performance rather than speed of
response. It contains test items that are arranged according to increasing degree
of difficulty.

MODES OF ASSESSMENT

There are different modes of assessment used by a classroom teacher to assess the learning progress of the students. These are traditional assessment, alternative assessment, performance-based assessment, and portfolio assessment.

Traditional Assessment

It is a type of assessment in which the students choose their answers from a given list of choices. Examples of this type of assessment are the multiple-choice test, standard true/false test, matching type test, and fill-in-the-blank test. In traditional assessment, students are expected to recognize that there is only one correct or best answer for the question asked.

Alternative Assessment

An assessment in which students create an original response to answer a certain question. Students respond to a question using their own ideas, in their own words. Examples of alternative assessment are short-answer questions, essays, oral presentations, exhibitions, demonstrations, performance assessments, and portfolios. Other activities included in this type are teacher observation and student self-assessment.

Components of Alternative Assessment

a. Assessment is based on authentic tasks that demonstrate students' ability to accomplish communication goals.
b. The teacher and students focus on communication, not on right and wrong answers.
c. Students help the teacher to set the criteria for successful completion of
communication tasks.
d. Students have opportunities to assess themselves and their peers.

Performance-based Assessment

Performance assessment (Mueller, 2010) is an assessment in which students are asked to perform real-world tasks that demonstrate meaningful application of essential knowledge and skills.

It is a direct measure of student performance because the tasks are designed to incorporate contexts, problems, and solution strategies that students would use in real life. It focuses on processes and rationales. There is no single correct answer; instead, students are led to craft polished, thorough and justifiable responses, performances and products. It also involves long-range projects, exhibits, and performances that are linked to the curriculum. In this kind of assessment, the teacher is an important collaborator in creating tasks, as well as in developing guidelines for scoring and interpretation.

GUIDELINES FOR EFFECTIVE STUDENT ASSESSMENT

Improvement of student learning is the main purpose of classroom assessment. This can be done if assessment is integrated with good instruction and is guided by certain principles. Gronlund (1998) provided the following general guidelines for using student assessment effectively.

1. Effective assessment requires a clear concept of all intended learning outcomes.
2. Effective assessment requires that a variety of assessment procedures be used.
3. Effective assessment requires that the instructional relevance of the procedures be considered.
4. Effective assessment requires an adequate sample of student performance.
5. Effective assessment requires that the procedures be fair to everyone.
6. Effective assessment requires specification of the criteria for judging successful performance.
7. Effective assessment requires feedback to students emphasizing strengths of performance and weaknesses to be corrected.
8. Effective assessment must be supported by a comprehensive grading and reporting system.

CHAPTER 2

Learning Outcomes

At the end of this chapter, the students should be able to:

1. Define the following terms: goals, objectives, educational objectives/instructional objectives, specific/behavioral objectives, general/expressive objectives, learning outcomes, learning activity, observable outcome, unobservable outcome, cognitive domain, affective domain, psychomotor domain, and educational taxonomy;
2. Write specific and general objectives;
3. Identify learning outcomes and learning activities;
4. Determine observable and non-observable learning outcomes;
5. Identify the different levels of Bloom's taxonomy;
6. Identify the different levels of Krathwohl's 2001 revised cognitive domain;
7. Write specific cognitive outcomes;
8. Write specific affective outcomes;
9. Write specific psychomotor outcomes;

10. Write measurable and observable learning outcomes.

INTRODUCTION

Instructional goals and objectives play a very important role in both the instructional process and the assessment process. They serve as guides for both teaching and learning, communicate the purpose of instruction to other stakeholders, and provide guidelines for assessing the performance of the students. Assessing the learning outcomes of the students is one of the very critical functions of teachers. A classroom teacher should classify the objectives of the lesson because this is very important for the selection of the teaching method and the selection of the instructional materials. The instructional materials should be appropriate for the lesson so that the teacher can motivate the students properly. The objectives can be classified according to the learning outcomes of the lesson that will be discussed.

PURPOSES OF INSTRUCTIONAL GOALS AND OBJECTIVES

The purposes of instructional goals and objectives are as follows:

1. They provide direction for the instructional process by clarifying the intended learning outcomes.
2. They convey instructional intent to other stakeholders such as students, parents, school officials, and the public.
3. They provide a basis for assessing the performance of the students by describing the performance to be measured.

GOALS AND OBJECTIVES

The terms goals and objectives are two different concepts, but they are related to each other. Goals and objectives are very important, most especially when you want to achieve something for the students in any classroom activity. Goals can never be accomplished without objectives, and without goals you cannot determine the objectives that you need in order to accomplish what you want to achieve. Below is a comparison of goals and objectives.

Goals are broad; objectives are narrow.
Goals are general intentions; objectives are precise.
Goals are intangible; objectives are tangible.
Goals are abstract (less structured); objectives are concrete.
Goals cannot be validated as is; objectives can be validated.
Goals are long-term aims (what you want to accomplish); objectives are short-term aims (what you want to achieve).
Goals are hard to quantify or put in a timeline; objectives must be given a timeline to be accomplished more effectively.

11
Goals, General Educational Program Objectives, and Instructional Objectives

Goals. A broad statement of very general educational outcomes that does not include a specific level of performance. Goals tend to change infrequently and in response to societal pressure, e.g., learn problem-solving skills; develop high-level thinking skills; appreciate the beauty of art; be creative; and be competent in the basic skills in the area of grammar.

General Educational Program Objectives. More narrowly defined statements of educational outcomes that apply to a specific educational program; formulated on an annual basis; developed by program coordinators, principals, and other school administrators.

Instructional Objectives. Specific statements of the learner's behavior or outcomes that are expected to be exhibited by the students after completing a unit of instruction. A unit of instruction may mean a two-week lesson on polynomials, a one-week lesson on "parallelism after correlatives," or one class period on "katangian ng wika." "At the end of the lesson the students should be able to add fractions with 100% accuracy" and "the students should be able to dissect a frog following the correct procedures" are examples of instructional objectives.

Typical Problems Encountered When Writing Objectives

1. Too broad or complex. Error type: the objective is too broad in scope or is actually more than one objective. Solution: simplify or break apart.
2. False or missing behavior, condition, or degree. Error type: the objective does not list the correct behavior, condition, and/or degree, or it is missing. Solution: be more specific; make sure the behavior, condition, and degree are included.
3. False given. Error type: describes instruction, not conditions. Solution: simplify; include ONLY the ABCDs.
4. False performance. Error type: no true overt, observable performance listed. Solution: describe what behavior you must observe.

To avoid the different problems encountered in writing objectives, let us discuss the components of instructional objectives and other terms related to constructing a good instructional objective.

Four Main Things That an Objective Should Specify

1. Audience
Who? Who are the specific people the objectives are aimed at?
2. Observable Behavior
What? What do you expect them to be able to do? This should be an overt, observable behavior, even if the actual behavior is covert or mental in nature. If you cannot see it, hear it, touch it, taste it, or smell it, you cannot be sure your audience really learned it.

3. Special Conditions
The third component of an instructional objective is the special conditions under which the behavior must be displayed by the students. How? Under what circumstances will the learning occur? What will the student be given or already be expected to know to accomplish the learning?
4. Stating Criterion Level
The fourth component of an instructional objective is the criterion level. The criterion level of acceptable performance specifies how many of the items the students must answer correctly for the teacher to attain his/her objectives. How much? Must a specific set of criteria be met? Do you want total mastery (100%), do you want them to respond correctly 90% of the time, among others? A common (and totally non-scientific) setting is 90% of the time.
Always remember that the criterion level need not be specified as a percentage of the number of items correctly answered. It can also be stated as: number of items correct; number of consecutive items correct; essential features included (in the case of an essay question or paper); completion within a specified time; or completion with a certain degree of accuracy.
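As a rough illustration of a criterion level stated as a percentage (a hypothetical sketch, not from the text; the function name is invented, and the 90% cutoff merely mirrors the common setting mentioned above):

```python
# Hypothetical sketch: checking a criterion level of acceptable performance.

def meets_criterion(correct: int, items: int, cutoff: float = 0.90) -> bool:
    """True if the student answered at least `cutoff` of the items correctly."""
    return correct / items >= cutoff

print(meets_criterion(18, 20))  # True  (90% of 20 items)
print(meets_criterion(17, 20))  # False (85%)
```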

Types of Educational Objectives

Educational objectives are also known as instructional objectives. There are two types of educational objectives: specific or behavioral objectives, and general or expressive objectives (Kubiszyn and Borich, 2007).

1. Specific or Behavioral Objectives. Precise statements of the behavior to be exhibited by the students; the criterion by which mastery of the objectives will be judged; and the statement of the conditions under which the behavior must be demonstrated.

Examples of behavioral objectives are: (1) Multiply three-digit numbers with 95% accuracy. (2) List the months of the year in proper order from memory, with 100% accuracy. (3) Encode 30 words per minute with at most three (3) errors using a computer. These objectives specify specific educational outcomes.

2. General or Expressive Objectives. Statements wherein the behaviors are not usually specified and the criterion performance level is not stated. They only describe the experience or educational activity to be done. The outcome of the activity is not expressed in specific terms but in general terms such as understand, interpret or analyze. Examples of expressive objectives: (1) Interpret the novel The Lion, the Witch and the Wardrobe; (2) Visit Manila Zoo and discuss what was of interest; (3) Understand the concept of the normal distribution. These examples specify only the activity or experience and a broad educational outcome.
An instructional objective is a clear and concise statement of the skill or skills that students are expected to perform or exhibit after discussing a certain lesson or unit of instruction. The components of an instructional objective are the observable behavior, the special conditions under which the behavior must be exhibited, and the performance level considered sufficient to demonstrate mastery.
When a teacher develops instructional objectives, he must include an action verb that specifies learning outcomes. Some educators and education students often confuse learning outcomes with learning activities. An activity that implies a certain product or end result of an instructional objective is called a learning outcome. If you write an instructional objective as a means or process of attaining the end product, then it is considered a learning activity. Hence, revise it so that the product of the activity is stated.

Examples:

Learning activities: study, read, watch, listen
Learning outcomes: identify, write, recall, list

TYPES OF LEARNING OUTCOMES

After developing learning outcomes, the next step the teacher must consider is to identify whether the learning outcome is stated as a measurable and observable behavior or as a non-measurable and non-observable behavior. If a learning outcome is measurable then it is observable; therefore, always state learning outcomes as observable behavior. Teachers should always develop instructional objectives that are specific, measurable statements of the outcomes of instruction that indicate whether instructional intents have been achieved (Kubiszyn, 2007). The following are examples of verbs stating observable learning outcomes and unobservable learning outcomes.

Observable learning outcomes: draw, build, list, recite, add
Non-observable learning outcomes: understand, appreciate, value, know, be familiar

Examples of observable learning outcomes:

1. Recite the names of the characters in the story MISERY by Anton Chekhov.
2. Add two-digit numbers with 100% accuracy.
3. Circle the initial sounds of words.
4. Change the battery of an engine.
5. List the steps of hypothesis testing in order.

Examples of non-observable learning outcomes:

1. Be familiar with the constitutional provisions relevant to agrarian reforms.


2. Understand the process of evaporation.
3. Enjoy speaking Spanish.
4. Appreciate the beauty of art.
5. Know the concept of normal distribution.

Types of Learning Outcomes to Consider

Below is a list of learning outcomes classified as learning objectives. The more specific outcomes should not be regarded as exclusive; they are merely suggestive of categories to be considered (Gronlund, Linn, and Miller, 2009).

1. Knowledge
1.1 Terminology
1.2 Specific facts
1.3 Concepts and principles
1.4 Methods and procedures
2. Understanding
2.1 Concepts and principles
2.2 Methods and procedures
2.3 Written materials, graph, maps, and numerical data
2.4 Problem situations
3. Application
3.1 Factual information
3.2 Concepts and principles
3.3 Methods and procedures
3.4 Problem-solving skills
4. Thinking skills
4.1 Critical thinking
4.2 Scientific thinking
5. General skills
5.1 Laboratory skills
5.2 Performance skills
5.3 Communication skills
5.4 Computational skills
5.5 Social skills
6. Attitudes

6.1 Social attitudes
6.2 Scientific attitudes
7. Interests
7.1 Personal interests
7.2 Educational interests
7.3 Vocational interests
8. Appreciations
8.1 Literature, art, and music
8.2 Social and scientific achievements
9. Adjustments
9.1 Social adjustments
9.2 Emotional adjustments

TAXONOMY OF EDUCATIONAL OBJECTIVES

Taxonomy of Educational Objectives is a useful guide for developing a


comprehensive list of instructional objectives. A taxonomy is primarily useful in
identifying the types of learning outcomes that should be considered when developing a
comprehensive list of objectives for classroom instruction.

Benjamin S. Bloom (1948, as cited by Gabuyo, 2011), a well-known psychologist and educator, took the initiative to lead in formulating and classifying the goals and objectives of the educational process. Three domains of educational activities were determined: the cognitive domain, the affective domain and the psychomotor domain.

1. Cognitive Domain calls for outcomes of mental activity such as memorizing, reading, problem solving, analyzing, synthesizing and drawing conclusions.
2. Affective Domain describes learning objectives that emphasize a feeling tone, an emotion, or a degree of acceptance or rejection. Affective objectives vary from simple attention to selected phenomena to complex but internally consistent qualities of character and conscience. We found a large number of such objectives in the literature expressed as interests, attitudes, appreciations, values, and emotional sets or biases (Krathwohl et al., 1964, as cited by Esmane, 2011). It refers to a person's awareness and internalization of objects and situations; it focuses on the emotions of the learners.
3. Psychomotor Domain is characterized by progressive levels of behaviors from observation to mastery of physical skills (Simpson, 1972, as cited by Esmane, 2011). It includes physical movements, coordination, and use of the motor-skill areas. Development of these skills requires practice and is measured in terms of speed, precision, distance, procedures, or techniques in execution. It focuses on the physical and kinesthetic skills of the learner.

16
Bloom and other educators worked on the cognitive domain and completed its hierarchy of educational objectives in 1956; it came to be called Bloom's Taxonomy of the cognitive domain. The affective and psychomotor domains were developed later by other groups of educators.

CRITERIA FOR SELECTING APPROPRIATE OBJECTIVES

1. The objectives should include all important outcomes of the course or subject matter.
2. The objectives should be in harmony with the content standards of the state and
with the general goals of the school.
3. The objectives should be in harmony with the sound principles of learning.
4. The objectives should be realistic in terms of the abilities of the students, time
and the available facilities.

CLEAR STATEMENT OF INSTRUCTIONAL OBJECTIVES

To obtain a clear statement of instructional objectives, you should define the objectives in two steps. First, state the general objectives of instruction as intended learning outcomes. Second, list under each objective a sample of the specific types of performance that the students should be able to demonstrate when they have achieved the objective (Gronlund, 2000, as cited by Gronlund, Linn, and Miller, 2009). This procedure should result in a statement of general objectives and specific learning outcomes, as in the example below.

1. Understands the scientific principles
1.1 Describes the principles in their own words.
1.2 Identifies examples of the principles.
1.3 States reasonable hypotheses based on the principles.
1.4 Uses the principles in solving problems.
1.5 Distinguishes between two given principles.
1.6 Explains the relationships between the given principles.

In this example, the expected learning outcome is concerned with the students' understanding of scientific principles. As the verb understands expresses a general objective, the statement immediately starts with the word understands. It is very important to start immediately with the verb so that the statement focuses only on the intended outcomes. There is no need to add phrases such as "the students should be able to demonstrate that they understand," and the like. Beneath the general objective are statements of specific learning outcomes that start immediately with verbs that are specific and indicate definite, observable responses, that is, responses that can be seen and assessed by outside observers or evaluators. The verbs describes, identifies, states, uses, distinguishes, and explains are specific learning outcomes stated in terms of observable student performance.

MATCHING TEST ITEMS TO INSTRUCTIONAL OBJECTIVES

When constructing test items, always remember that they should match the instructional objectives. The learning outcomes and the learning conditions specified in the test items should match the learning outcomes and conditions stated in the objectives. If a test developer follows this basic rule, the test is ensured to have content validity. Content validity is very important; since your goal is to assess the achievement of the students, don't ask tricky questions. To measure the achievement of the students, ask them to demonstrate mastery of the skills specified in the conditions of the instructional objectives.

Consider the following examples of matching test items to instructional objectives, which the author adapted and modified from Kubiszyn and Borich's (2007) instructional objectives. In the list below, items 1 and 3 have learning outcomes that match the test item, while items 2, 4, and 5 have learning outcomes that do not match the test items.

1. Objective: Discriminate fact from opinion in Pres. Benigno C. Aquino's first State of the Nation Address (SONA).
Test item: From the State of the Nation Address (SONA) speech of President Aquino, give five (5) examples of facts and five (5) examples of opinions.
Match: Yes

2. Objective: Recall the names and capitals of all the different provinces of Regions I and II in the Philippines.
Test item: List the names and capitals of two provinces in Region I and three provinces in Region II.
Match: No

3. Objective: List the main events in chronological order, after reading the short story A VENDETTA by Guy de Maupassant.
Test item: From the short story A VENDETTA by Guy de Maupassant, list the main events in chronological order.
Match: Yes

4. Objective: Circle the nouns and pronouns from the given list of words.
Test item: Give five examples of pronouns and five examples of verbs.
Match: No

5. Objective: Make a freehand drawing of Region II using your map as a guide.
Test item: Without using your map, draw the map of Region II.
Match: No

BLOOM’S REVISED TAXONOMY

Lorin Anderson, a former student of Bloom, together with Krathwohl, revised Bloom's taxonomy of the cognitive domain in the mid-90s in order to fit the more outcome-focused modern education objectives. There are two major changes: (1) the names of the six categories were changed from nouns to active verbs, and (2) the order of the last two highest levels was rearranged, as shown below. This new taxonomy reflects a more active form of thinking and is perhaps more accurate.

1956 (nouns, highest to lowest): Evaluation, Synthesis, Analysis, Application, Comprehension, Knowledge

2001 (verbs, highest to lowest): Creating, Evaluating, Analyzing, Applying, Understanding, Remembering

Changes to Bloom's Taxonomy

*Adapted with written permission from Leslie Owen Wilson's Curriculum Pages, Beyond Bloom: A New Version of the Cognitive Taxonomy.

Bloom's Taxonomy in 1956 vs. Anderson/Krathwohl's Revision in 2001

1. Knowledge (1956): Remembering or retrieving previously learned material. Examples of verbs that relate to this function are: identify, relate, list, define, recall, memorize, repeat, record, name, recognize, acquire.
Remembering (2001): Objectives written on the remembering level (the lowest cognitive level) involve retrieving, recalling, or recognizing knowledge from memory. Remembering is when memory is used to produce definitions, facts, or lists, or to recite or retrieve material. Sample verbs appropriate for objectives written at the remembering level: state, tell, underline, locate, match, spell, fill in the blank, identify, relate, list, define, recall, memorize, repeat, record, name, recognize, acquire.

2. Comprehension (1956): The ability to grasp or construct meaning from material. Examples of verbs that relate to this function are: restate, locate, report, recognize, explain, express, identify, discuss, describe, review, infer, conclude, illustrate, interpret, draw, represent, differentiate.
Understanding (2001): Objectives written on the understanding level (a higher level of mental ability than remembering; it requires the lowest level of understanding from the student) involve constructing meaning from different types of functions, be they written or graphic messages, through activities like interpreting, exemplifying, classifying, summarizing, inferring, comparing and explaining. Sample verbs appropriate for objectives written at the understanding level: restate, locate, report, recognize, explain, express, identify, discuss, describe, review, infer, conclude, illustrate, interpret, draw, represent, differentiate.

3. Application (1956): The ability to use learned material, or to implement material in new and concrete situations. Examples of verbs that relate to this function are: apply, relate, develop, translate, use, operate, organize, employ, restructure, interpret, demonstrate, illustrate, practice, calculate, show, exhibit, dramatize.
Applying (2001): Objectives written on the applying level require the learner to implement (use) the information: carrying out or using a procedure through executing or implementing. Applying relates and refers to situations where learned material is used through products like models, presentations, interviews or simulations. Sample verbs appropriate for objectives written at the applying level: apply, relate, develop, translate, use, operate, organize, employ, restructure, interpret, demonstrate, illustrate, practice, calculate, show, exhibit, dramatize.

4. Analysis (1956): The ability to break down or distinguish the parts of the material into their components so that their organizational structure may be better understood. Examples of verbs that relate to this function are: analyze, compare, probe, inquire, examine, contrast, categorize, differentiate, investigate, detect, survey, classify, deduce, experiment, scrutinize, discover, inspect, dissect, discriminate, separate.
Analyzing (2001): Objectives written on the analyzing level require the learner to break the information into component parts and describe the relationships: breaking material or concepts into parts, determining how the parts relate or interrelate to one another or to an overall structure or purpose. Mental actions included in this function are differentiating, organizing and attributing, as well as being able to distinguish between the components or parts. When one is analyzing, he/she can illustrate this mental function by creating spreadsheets, surveys, charts, diagrams, or graphic representations. Sample verbs appropriate for objectives written at the analyzing level: analyze, compare, probe, inquire, examine, contrast, categorize, differentiate, investigate, detect, survey, classify, deduce, experiment, scrutinize, discover, inspect, dissect, discriminate, separate.

5. Synthesis (1956): The ability to put parts together to form a coherent or unique new whole. Examples of verbs that relate to this function are: compose, produce, design, assemble, create, prepare, predict, modify, plan, invent, formulate, collect, set up, generalize, document, combine, propose, develop, arrange, construct, organize, originate, derive, write.
Evaluating (2001): Objectives written on the evaluating level require the student to make a judgment about materials or methods: making judgments based on criteria and standards through checking and critiquing. Critiques, recommendations, and reports are some of the products that can be created to demonstrate the processes of evaluation. In the newer taxonomy, evaluating comes before creating, as it is often a necessary part of the precursory behavior before creating something. Remember that this level has now changed places with the last one in the old taxonomy. Sample verbs appropriate for objectives written at the evaluating level: appraise, choose, compare, conclude, decide, defend, evaluate, give your opinion, judge, justify, prioritize, rank, rate, select, support, value.

6. Evaluation (1956): The ability to judge, check, and even critique the value of material for a given purpose. Examples of verbs that relate to this function are: judge, assess, compare, evaluate, conclude, measure, deduce, argue, decide, choose, rate, select, estimate, validate, consider, appraise, value, criticize, infer.
Creating (2001): Objectives written on the creating level require the student to generate new ideas and ways of viewing things: putting elements together to form a coherent or functional whole; reorganizing elements into a new pattern or structure through generating, planning, or producing. Creating requires users to put parts together in a new way or synthesize parts into something new and different, a new form or product. This process is the most difficult mental function in the new taxonomy. This level used to be No. 5 in Bloom's taxonomy and was known as synthesis. Sample verbs appropriate for objectives written at the creating level: change, combine, compose, construct, create, invent, design, formulate, generate, produce, revise, reconstruct, rearrange, visualize, write, plan.
*Adapted with written permission from Leslie Owen Wilson's Curriculum Pages, Beyond Bloom: A New Version of the Cognitive Taxonomy.

Cognitive Domain

Bloom's taxonomy of the cognitive domain is arranged from the lowest level to the highest level: knowledge is the lowest level, followed by comprehension, application, analysis, and synthesis, with evaluation as the highest level.

1. Knowledge recognizes students' ability to use rote memorization and recall certain facts. Test questions focus on identification and recall of information.

Sample verbs of stating specific learning outcomes:


Cite, define, identify, label, list, match, name, recognize, reproduce, select,
state

Instructional Objectives:
At the end of the topic, the students should be able to identify the different steps in hypothesis testing.

Test Item:
What are the different steps in hypothesis testing?

2. Comprehension involves students' ability to read course content, interpret important information and put others' ideas into their own words. Test questions should focus on the use of facts, rules and principles.

Sample verbs of stating specific learning outcomes:


Classify, convert, describe, distinguish between, give examples, interpret, summarize

Instructional objective:
At the end of the lesson, the students should be able to summarize the main events of the story INVICTUS in grammatically correct English.

Test Item:
Summarize the main events in the story INVICTUS in grammatically
correct English.

3. Application: students take new concepts and apply them to new situations. Test questions focus on applying facts and principles.

Sample verbs of stating specific learning outcomes:

Apply, arrange, compute, construct, demonstrate, discover, extend, operate, predict, relate, show, solve, use

Instructional objective:

At the end of the lesson the students should be able to write a short poem
in iambic pentameter.

Test Item:

Write a short poem in iambic pentameter.

4. Analysis: students have the ability to take new information, break it down into parts, and differentiate between them. The test questions focus on separation of a whole into component parts.

Sample verbs of stating specific learning outcomes:

Analyze, associate, determine, diagram, differentiate, discriminate, distinguish, estimate, point out, infer, outline, separate

Instructional objectives:
At the end of the lesson, the students should be able to describe the statistical tools needed in testing the difference between two means.

Test Item:
What kind of statistical test would you run to see if there is a significant difference between the pre-test and post-test?

5. Synthesis: students are able to take various pieces of information and form a whole, creating a pattern where one did not previously exist. Test questions focus on combining new ideas to form a new whole.

Sample verbs of stating specific learning outcomes:


Combine, compile, compose, construct, create, design, develop, devise,
formulate, integrate, modify, revise, rewrite, tell, write

Instructional objectives:
At the end of the lesson, the students should be able to compare and
contrast the two types of error.

Test Item:
What is the difference between type I and Type II error?

6. Evaluation involves students' ability to look at someone else's ideas or principles and judge the worth of the work and the value of the conclusions.

Sample verbs of stating specific learning outcomes:


Appraise, assess, compare, conclude, contrast, criticize, evaluate, judge,
justify, support

Instructional objectives:
At the end of the lesson, the students should be able to conclude the
relationship between two means.

Test Item:
What should the researcher conclude about the relationship in the
population?

Affective Domain

The affective domain describes learning objectives that emphasize a feeling tone, an emotion, or a degree of acceptance or rejection. Affective objectives vary from simple attention to selected phenomena to complex but internally consistent qualities of character and conscience. We found a large number of such objectives in the literature expressed as interests, attitudes, appreciations, values, and emotional sets or biases (Krathwohl et al., as cited by Esmane, 2011). The affective domain includes objectives pertaining to attitudes, appreciations, values, and emotions.
Krathwohl's affective domain is perhaps the best known of the affective taxonomies. "The taxonomy is ordered according to the principle of internalization." Internalization refers to the process whereby a person's affect toward an object passes from a general awareness level to a point where the affect is internalized and consistently guides or controls the person's behavior. The levels of the affective domain are arranged from lowest to highest, as articulated by Esmane (2011).

Level of Affective Domain

1. Receiving
Definition: Refers to being aware of or sensitive to the existence of certain ideas, materials, or phenomena and being able to tolerate them. The learners are willing to listen.
Example: Listens to the ideas of others with respect.
Sample verbs appropriate for objectives written at the receiving level: asks, chooses, describes, follows, gives, holds, identifies, locates, names, points to, selects, sits erect, replies, uses

2. Responding
Definition: Refers to the commitment in some measure to the ideas, materials, or phenomena involved by actively responding to them. It answers questions about ideas. The learning outcomes emphasize compliance in responding, willingness to respond, or satisfaction in responding. The learners are willing to participate.
Example: Participates in class discussions actively.
Sample verbs appropriate for objectives written at the responding level: answers, assists, aids, complies, conforms, discusses, greets, helps, labels, performs, practices, presents, reads, recites, reports, selects, tells, writes

3. Valuing
Definition: Refers to the willingness to be perceived by others as valuing certain ideas, materials, phenomena or behavior. It is based on the internalization of a set of specified values, while clues to these values are expressed in the learner's overt behavior and are often identifiable. This ranges from simple acceptance to the more complex state of commitment. The learners are willing to be involved.
Examples: Demonstrates belief in the democratic process. Shows the ability to solve problems.
Sample verbs appropriate for objectives written at the valuing level: completes, demonstrates, differentiates, explains, follows, forms, initiates, invites, joins, justifies, proposes, reads, reports, selects, shares, studies, works

4. Organization
Definition: Refers to the ability to relate the value to those already held and bring it into a harmonious and internally consistent philosophy. Commits to using ideas and incorporates them into different activities. It emphasizes comparing, relating, and synthesizing values. The learners are willing to be an advocate.
Examples: Explains the role of systematic planning in solving problems. Prioritizes time effectively to meet the needs of the organization, family, and self.
Sample verbs appropriate for objectives written at the organizing level: adheres, alters, arranges, combines, compares, completes, defends, explains, formulates, generalizes, identifies, integrates, modifies, orders, organizes, prepares, relates, synthesizes

5. Characterization by value or value set
Definition: Incorporates ideas completely into practice, recognized by the use of them. The value system controls the learner's behavior. Instructional objectives are concerned with the student's general patterns of adjustment: personal, social, and emotional. The learners are willing to change their behavior, lifestyle, or way of life.
Examples: Shows self-reliance when working independently. Values people for what they are, not how they look.
Sample verbs appropriate for objectives written at the characterizing level: acts, discriminates, displays, influences, listens, modifies, performs, practices, proposes, qualifies, questions, revises, serves, solves, verifies

Psychomotor Domain

The psychomotor domain is characterized by progressive levels of behaviors from observation to mastery of physical skills. As presented by Esmane (2011), it includes physical movement, coordination, and use of the motor-skill areas. Development of these skills requires practice and is measured in terms of speed, precision, distance, procedures, or techniques in execution. The seven major categories are listed from the simplest behavior to the most complex. The psychomotor domain includes objectives that require basic motor skills and/or physical movement such as construct, kick or ski.

Level of Psychomotor Domain

Level Definition Example


1. Perception The ability to use sensory cues to Examples:
guide motor activity. This ranges Detects nonverbal
from sensory stimulation, communication cues.
through cue selection, to
translation Estimate where a ball will land
after it is thrown and then
moving to the correct locations=
to catch the ball.
Sample verbs appropriate for
objectives written at the
perception level: closes,
describes, detects, differentiates,
distinguishes, identifies, isolates,
relates, selects
2. Set

Readiness to act. It includes mental, physical, and emotional sets.
These three sets are dispositions that predetermine a person's
response to different situations (sometimes called mindsets).

Examples:
Recognizes one's abilities and limitations.
Shows desire to learn a new process (motivation).

Note: This subdivision of the psychomotor domain is closely related
to the "responding to phenomena" subdivision of the affective domain.

Sample verbs appropriate for objectives written at the set level:
begins, displays, explains, moves, proceeds, reacts, shows, states,
volunteers
3. Guided Response

The early stages in learning a complex skill, which include imitation
and trial and error. Adequacy of performance is achieved by
practicing.

Examples:
Performs a mathematical equation as demonstrated.
Follows instructions to build a model.

Sample verbs appropriate for objectives written at the guided
response level: copies, traces, follows, reacts, reproduces, responds
4. Mechanism

This is the intermediate stage in learning a complex skill. Learned
responses have become habitual and the movements can be performed
with some confidence and proficiency.

Examples:
Uses a personal computer.
Repairs a leaking faucet.
Drives a car.

Sample verbs appropriate for objectives written at the mechanism
level: assembles, calibrates, constructs, dismantles, displays,
fastens, fixes, grinds, heats, manipulates, measures, mends, mixes,
organizes, sketches
5. Complex Overt Response

The skillful performance of motor acts that involve complex movement
patterns. Proficiency is indicated by a quick, accurate, and highly
coordinated performance, requiring a minimum of energy. This category
includes performing without hesitation and automatic performance. For
example, players often utter sounds of satisfaction or expletives as
soon as they hit a tennis ball or throw a football, because they can
tell by the feel of the act what the result will produce.

Examples:
Operates a computer quickly and accurately.
Displays competence while playing the piano.

Sample verbs appropriate for objectives written at the complex overt
response level: assembles, builds, calibrates, constructs,
dismantles, displays, fastens, fixes, grinds, heats, manipulates,
measures, mends, mixes, organizes, sketches

Note: The key words are the same as for mechanism, but with adverbs
or adjectives that indicate that the performance is quicker, better,
more accurate, etc.

6. Adaptation

Skills are well developed and the individual can modify movement
patterns to fit special requirements.

Examples:
Responds effectively to unexpected experiences.
Modifies instruction to meet the needs of the learners.

Sample verbs appropriate for objectives written at the adaptation
level: adapts, alters, changes, rearranges, reorganizes, revises,
varies
7. Origination

Creating new movement patterns to fit a particular situation or
specific problem. Learning outcomes emphasize creativity based upon
highly developed skills.

Examples:
Creates a new gymnastic routine.

Sample verbs appropriate for objectives written at the origination
level: arranges, builds, combines, composes, constructs, creates,
designs, initiates, makes, originates

Other Psychomotor Domains

Aside from Simpson's (1972) treatment of the psychomotor domain,
there are two other popular versions commonly used by educators. The
works of Dave (1975), Harrow (1972), and Kubiszyn and Borich (2007)
are discussed below.

Dave (1975)

Imitation
Observing and patterning behavior after someone else. Performance may
be of low quality.
Example: Copying a work of art.

Manipulation
Being able to perform certain actions by following instructions and
practicing.
Example: Creating work on one's own, after taking lessons or reading
about it.

Precision
Refining, becoming more exact. Few errors are apparent.
Example: Working and reworking something, so it will be "just right."

Articulation
Coordinating a series of actions, achieving harmony and internal
consistency.
Example: Producing a video that involves music, drama, color, sound,
etc.

Naturalization
Having high-level performance become natural, without needing to
think much about it.
Example: Michael Jordan playing basketball, Nancy Lopez hitting a
golf ball, etc.

Harrow (1972), Kubiszyn and Borich (2007)

Reflex movements
Reactions that are not learned.
Example: Flexion, extension, stretch, postural adjustment.

Fundamental movements
Inherent movement patterns which are formed by combinations of reflex
movements; the basis for complex skilled movements.
Example: Basic movements such as walking, grasping, twisting,
manipulating.

Perception
Response to stimuli such as visual, auditory, kinesthetic, or tactile
discrimination.
Example: Coordinated movements such as jumping rope, punting,
catching.

Physical abilities
Stamina that must be developed for further development, such as
strength and agility.
Example: Muscular exertion, quick precise movement.

Skilled movements
Advanced learned movements as one would find in sports or acting.
Example: Skilled activities in sports, recreation, and dance.

Nondiscursive communication
Effective body language, such as gestures and facial expressions.
Example: Body postures, gestures, and facial expressions efficiently
executed in skilled dance movements and choreographies.

CHAPTER 3

DEVELOPMENT OF CLASSROOM ASSESSMENT TOOLS

Learning Outcomes

At the end of this chapter, the student should be able to:

1. Define the following terms: clarity of the learning target, appropriateness of
assessment tools, validity, reliability, fairness, objectivity, comprehensiveness,
ease in scoring and administering, practicality and efficiency, table of
specification, matching type of test, multiple-choice test, true or false test,
completion test, objective test, stem, distracters, key option;
2. Discuss the different principles of testing/ assessing;
3. Identify the different qualities of assessment tools;
4. Identify the different steps in developing test items;
5. Discuss the steps in developing a table of specification;
6. Construct a table of specification using the different formats;

7. Discuss the different formats of assessment tools;
8. Determine the advantages and disadvantages of the different formats of test items;
9. Identify the different rules in constructing multiple-choice test, matching type
test, completion test, true or false test; and
10. Construct multiple-choice test, matching type test, completion test, true or false
test.

INTRODUCTION

In the previous chapter, we discussed the process of developing
instructional objectives. As discussed, instructional objectives must be specific,
measurable, and observable. Teachers must develop test items that match
the instructional objectives appropriately and accurately. In this section, we shall
discuss the general principles of testing, the different qualities of assessment tools,
the steps in developing assessment tools, the formats of a table of specification, and
the different types of classroom assessment tools.

GENERAL PRINCIPLES OF TESTING

Ebel and Frisbie (1999), as cited by Garcia (2008), listed five basic principles that
should guide teachers in assessing the learning progress of the students and in
developing their own assessment tools. These principles are discussed below.

1. Measure all instructional objectives. When a teacher constructs test items to
measure the learning progress of the students, the items should match all the learning
objectives posed during instruction. That is why the first step in constructing a
test is for the teacher to go back to the instructional objectives.
2. Cover all the learning tasks. The teacher should construct a test that contains a
wide range of sampling of items. In this case, the teacher can determine the
educational outcomes or abilities such that the resulting scores are representative of
the total performance in the areas measured.
3. Use appropriate test items. The test items constructed must be appropriate to
measure learning outcomes.
4. Make test valid and reliable. The teacher must construct a test that is valid so
that it can measure what it is supposed to measure from the students. The test is
reliable when the scores of the students remain the same or consistent when the
teacher gives the same test for the second time.
5. Use test to improve learning. The test scores should be utilized by the teacher
properly to improve learning by discussing the skills or competencies on the
items that have not been learned or mastered by the learners.

PRINCIPLES OF HIGH QUALITY ASSESSMENT

Assessing the performance of every student is a very critical task for a classroom
teacher. It is very important that a classroom teacher prepare the assessment
tool appropriately. Teacher-made tests are developed by a classroom teacher to assess
the learning progress of the students within the classroom. They have weaknesses and
strengths. The strengths of a teacher-made test lie in its applicability and relevance to
the setting where it is utilized. Its weaknesses are the limited time and resources
available to the teacher in developing and utilizing the test, and some of the
technicalities involved in the development of the assessment tools.

Test constructors believe that every assessment tool should possess good
qualities. Most of the literature considers validity and reliability to be the most
common technical concepts in assessment. Any type of assessment, whether
traditional or authentic, should be carefully developed so that it serves whatever
purpose it is intended for, and the test results must be consistent with the type of
assessment utilized.

In this section, we shall discuss terms such as clarity of the learning
target, appropriateness of an assessment tool, fairness, objectivity, comprehensiveness,
and ease of scoring and administering. Once these qualities of a good test are taken into
consideration in developing an assessment tool, the teacher will have accurate
information about the performance of each individual pupil or student.

Clarity of the Learning Target

When a teacher plans for his classroom instruction, the learning target should be
clearly stated and must be focused on student learning objectives rather than teacher
activity. The learning outcomes must be Specific, Measurable, Attainable, Realistic and
Time-bound (SMART) as discussed in the previous chapter. The performance task of the
students should also be clearly presented so that they can accurately demonstrate what
they are supposed to do and how the final product should be done. The teacher should
also discuss clearly with the students the evaluation procedures, the criteria to be used
and the skills to be assessed in the task.

Appropriateness of Assessment Tool

The type of test used should always match the instructional objectives or
learning outcomes of the subject matter posed during the delivery of the instruction.
Teachers should be skilled in choosing and developing assessment methods appropriate
for instructional decisions. The kinds of assessment tools commonly used to assess the
learning progress of the students will be discussed in detail in this chapter and in the
succeeding chapter.

1. Objective Test. It is a type of test that requires students to select the correct
response from several alternatives or to supply a word or short phrase to answer
a question or complete a statement. It includes true-false, matching type, and
multiple-choice questions. The word objective refers to the scoring; it indicates
that there is only one correct answer.

2. Subjective Test. It is a type of test that permits the student to organize and
present an original answer. It includes either short answer questions or long
general questions. This type of test has no specific answer. Hence, it is usually
scored on an opinion basis, although there will be certain facts and
understanding expected in the answer.
3. Performance Assessment. According to Mueller (2010), it is an assessment in
which students are asked to perform real-world tasks that demonstrate meaningful
application of essential knowledge and skills. It can appropriately measure learning
objectives which focus on the ability of the students to demonstrate skills or
knowledge in real-life situations.
4. Portfolio Assessment. It is an assessment that is based on the systematic,
longitudinal collection of student work created in response to specific known
instructional objectives and evaluated in relation to the same criteria (Ferenz,
2001). A portfolio is a purposeful collection of a student's work that exhibits the
student's efforts, progress, and achievements in one or more areas over a period
of time. It measures the growth and development of students.
5. Oral Questioning. This method is used to collect assessment data by asking oral
questions. It is the most commonly used of all forms of assessment in class, assuming
that the learner hears and shares a common language with the teacher
during instruction. The ability of the students to communicate orally is very
relevant to this type of assessment. It is also a form of formative assessment.
6. Observation Technique. Another method of collecting assessment data is
through observation. The teacher observes how students carry out certain
activities, attending either to the process or to the product. There are two types of
observation techniques: formal and informal observations. Formal observations
are planned in advance, as when the teacher assesses an oral report or presentation
in class, while informal observations are done spontaneously during instruction,
as when observing the working behavior of students performing a laboratory
experiment in a biology class. The behavior of students during instruction is
systematically monitored, described, classified, and analyzed.
7. Self-report. The responses of the students may be used to evaluate both
performance and attitude. Assessment tools could include sentence completion,
Likert scales, checklists, or holistic scales.

Different Qualities of Assessment Tools

1. Validity refers to the appropriateness of score-based inferences or decisions
made based on the students' test results; it is the extent to which a test measures
what it is supposed to measure.
2. Reliability refers to the consistency of measurement; that is, how consistent
test results or other assessment results are from one measurement to another.
We can say that a test is reliable when it yields practically the same
scores when administered twice to the same group of students, with a
reliability index of 0.61 or above (see the computational sketch after this list).
3. Fairness means the test item should not have any biases. It should not be
offensive to any examinee subgroup. A test can only be good if it is fair to all the
examinees.
4. Objectivity refers to the agreement of two or more raters or test administrators
concerning the score of a student. If two raters who assess the same student
on the same test cannot agree on the score, the test lacks objectivity and neither
of the scores from the judges is valid. Lack of objectivity reduces test validity in
the same way that lack of reliability influences validity.
5. Scorability means that the test should be easy to score; directions for scoring
should be clearly stated in the instructions. Provide the students an answer sheet,
and provide an answer key for the one who will check the test.
6. Adequacy means that the test should contain a wide range of sampling of items
to determine the educational outcomes or abilities, so that the resulting scores
are representative of the total performance in the areas measured.
7. Administrability means that the test should be administered uniformly to all
students so that the scores obtained will not vary due to factors other than
differences in the students' knowledge and skills. There should be clear
provisions for instructions for the students, the proctors, and even the one who
will check the test or the test scorer.
8. Practicality and Efficiency refers to the teacher's familiarity with the method
used, the time required for the assessment, the complexity of the administration,
the ease of scoring and of interpreting the test results, and the need to keep the
cost of materials low.
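
To make the test-retest idea in quality 2 concrete, the sketch below estimates reliability as the Pearson correlation between two administrations of the same test. This is a minimal illustration, not a prescribed procedure; the score lists are hypothetical, and the 0.61 cutoff is simply the rule of thumb cited above.

```python
# A minimal sketch: test-retest reliability as the Pearson correlation
# between two administrations of the same test (hypothetical scores).
from statistics import correlation  # available in Python 3.10+

first_admin = [35, 42, 28, 45, 30, 38, 41, 25, 33, 44]
second_admin = [37, 40, 30, 46, 29, 39, 43, 27, 31, 45]

r = correlation(first_admin, second_admin)
print(f"Test-retest reliability estimate: r = {r:.2f}")

# Using the 0.61 rule of thumb cited above: r >= 0.61 suggests the
# scores are consistent enough across the two administrations.
```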

STEPS IN DEVELOPING ASSESSMENT TOOLS

1. Examine the instructional objectives of the topics previously discussed.


2. Make a table of specification (TOS).
3. Construct the test items.
4. Assemble the test items.
5. Check the assembled test items.
6. Write directions.
7. Make the answer key.
8. Analyze and improve the test items.

Let us discuss in detail the different steps needed in developing good assessment
tools. Following the different steps is very important so that the test items developed
will measure the different learning outcomes appropriately. In this case, the test can
measure what it is supposed to measure. Consider the following discussions in each step.

Examine the instructional Objectives of the Topic Previously Discussed

The first step in developing an achievement test is to examine and go back to the
instructional objectives so that you can match them with the test items to be constructed.

Make a Table of Specification (TOS)

A Table of Specification (TOS) is a chart or table that details the content and
cognitive level assessed on a test as well as the types and emphases of test items
(Gareis and Grant, 2008). A table of specification is very important in addressing the
validity and reliability of the test items. The validity of the test means that the
assessment can be used to draw appropriate results, because the assessment guards
against any systematic error.

A table of specification provides the test constructor a way to ensure that the
assessment is based on the intended learning outcomes. It is also a way of ensuring
that the number of questions on the test is adequate to ensure dependable results that
are not likely caused by chance. It is also a useful guide in constructing a test and in
determining the type of test items that you need to construct.

Preparing a Table of Specification

Below are the suggested steps in preparing a table of specification used by the
test constructor. Consider these steps in making a two-way chart table of specification.
See also format 1 of the Table of Specification for the other steps.

a. Select the learning outcomes to be measured. Identify the necessary
instructional objectives needed to answer the test items correctly. The list of
instructional objectives will include the learning outcomes in the areas of knowledge,
intellectual skills or abilities, general skills, attitudes, interests, and appreciation. Use
Bloom's taxonomy or Krathwohl's revised taxonomy of the cognitive domain as a
guide.
b. Make an outline of the subject matter to be covered in the test. The length of the
test will depend on the areas covered in its content and the time needed to answer.
c. Decide on the number of items per subtopic. Use the formula below to determine
the number of items to be constructed for each subtopic covered in the test, so that
the number of items in each topic is proportional to the number of class sessions:

                  Number of class sessions x Desired total number of items
Number of items = ---------------------------------------------------------
                             Total number of class sessions

d. Make the two-way chart as shown in Format 2 and Format 3 of a Table of
Specification.
e. Construct the test items. A classroom teacher should always follow the general
principles of constructing test items. The test item should always correspond with the
learning outcome so that it serves whatever purpose it may have.

If properly prepared, a table of specification will help you limit the coverage of the
test and identify the necessary skills or cognitive level required to answer each test
item correctly.

Different Formats of Table of Specification

Gronlund (1990) lists several examples and formats showing how a table of
specification may be prepared.

a. Format 1 of a Table of Specification

The first format of a table of specification is composed of the specific objectives, the
cognitive level, type of test used, the item number and the total points needed in each
item. Below is the template of the said format.

Specific Objectives | Cognitive Level | Type of Test | Item Number | Total Points
Solve worded problems involving consecutive integers. | Application | Multiple-choice | 1 and 2 | 4 points

Specific Objectives refers to the intended learning outcomes stated as specific
instructional objectives covering a particular test topic.

Cognitive Level pertains to the intellectual skill or ability required to correctly
answer a test item, using Bloom's taxonomy of educational objectives. We sometimes
refer to this as the cognitive demand of a test item. Thus, entries in this column could
be "knowledge, comprehension, application, analysis, synthesis, and evaluation."

Type of Test Item identifies the type or kind of test a test item belongs to. Examples
of entries in this column could be "multiple-choice, true or false, or even essay."

Item Number simply identifies the question number as it appears in the test.

Total Points summarizes the score given to a particular test item.

Example of how to compute the number of items for a topic in a test:

Topic: Synthetic division
Number of class sessions discussing the topic: 3
Desired total number of items: 10
Total number of class sessions for the unit: 10

                  Number of class sessions x Desired total number of items
Number of items = ---------------------------------------------------------
                             Total number of class sessions

Number of items = (3 x 10) / 10 = 30 / 10 = 3

Number of items for the topic synthetic division = 3
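
The same proportional allocation can be automated when a unit has many topics. The sketch below is a minimal illustration of the formula above; the topic names and session counts are hypothetical.

```python
# A minimal sketch of the item-allocation formula for a table of
# specification. Topic names and session counts are hypothetical.
def items_for_topic(topic_sessions, desired_total, total_sessions):
    # Number of items = (class sessions for the topic x desired total
    # number of items) / total number of class sessions
    return round(topic_sessions * desired_total / total_sessions)

sessions = {"Synthetic division": 3, "Remainder theorem": 4, "Factor theorem": 3}
total_sessions = sum(sessions.values())  # 10 class sessions in the unit
desired_total = 10                       # desired test length

for topic, n_sessions in sessions.items():
    n = items_for_topic(n_sessions, desired_total, total_sessions)
    print(topic, "->", n, "items")
# Synthetic division -> 3 items, matching the worked example above.
```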

b. Format 2 of a Table of Specification (one-way table of specification)

Contents | Number of Class Sessions | Number of Items | Cognitive Level (K-C, A, HOTS) | Test Item Distribution
Basic Concepts of Fractions | 1 | 2 | | 1-2
Addition of Fractions | 1 | 2 | | 3-4
Subtraction of Fractions | 1 | 2 | | 5-6
Multiplication and Division of Fractions | 3 | 6 | | 7-12
Application/Problem Solving | 4 | 8 | | 13-20
Total | 10 | 20 | |

c. Format 3 of a Table of Specification (two-way table of specification)

Content | Class Sessions | Krathwohl's Cognitive Level (Remembering, Understanding, Applying, Evaluating, Creating) | Total Items | Item Distribution
Concepts | 1 | | 2 | 1-2
z-score | 2 | | 4 | 3-6
t-score | 2 | | 4 | 7-10
Stanine | 3 | | 6 | 11-16
Percentile rank | 3 | | 6 | 17-22
Application | 4 | | 8 | 23-30
Total | 15 | | 30 |

Note:

The number of items for each level will depend on the skills the teacher wants to
develop in his students. At the tertiary level, the teacher must develop more
higher-order thinking skills (HOTS) questions.

For the elementary and secondary levels, the guidelines in constructing tests
stipulated in DepEd Order No. 33, s. 2004 must be followed. That is, factual
information 60%, moderately difficult or more advanced questions 30%, and
higher-order thinking skills 10% for distinguishing honor students. For a 50-item
test, for example, this means 30 factual items, 15 moderately difficult items, and
5 HOTS items.

Construct the Test Items

In this section, we shall discuss the different formats of the objective type of test
items, the steps in developing objective and subjective tests, and their advantages and
limitations. The different guidelines for constructing the various types of objective and
subjective test items will also be discussed in this section.

General Guidelines for constructing Test Items

Kubiszyn and Borich (2007) suggested some general guidelines for writing test
items to help classroom teachers improve the quality of the test items they write.

1. Begin writing items far enough in advance that you will have time to revise
them.
2. Match items to the intended outcomes at the appropriate level of difficulty to
provide a valid measure of the instructional objectives. Limit the question to the
skill being assessed.
3. Be sure each item deals with an important aspect of the content area and not
with trivia.
4. Be sure the problem posed is clear and unambiguous.
5. Be sure that each item is independent of all other items. The answer to one item
should not be required as a condition for answering the next item. A hint to one
answer should not be embedded in another item.
6. Be sure the item has one correct or best answer on which experts would agree.
7. Prevent unintended clues to the answer in the statement or question.
Grammatical inconsistencies, such as "a" or "an," give clues to the correct answer
to those students who are not well prepared for the test.
8. Avoid replication of the textbook in writing test items; do not quote directly from
the textual materials. You are usually not interested in how well students
memorize the text. Besides, taken out of context, direct quotes from the text are
often ambiguous.
9. Avoid trick or catch questions in an achievement test. Do not waste time testing
how well the students can interpret your intentions.
10. Try to write items that require higher-order thinking skills.

Determining the Number of Test Items

Consider the following average times needed to answer different test item formats.
The length of the testing time and the type of item used are factors to consider in
determining the number of items to be included in an achievement test. These
guidelines are also important in determining an appropriate assessment for college
students.

Assessment Format | Average Time to Answer
True-false | 30 seconds
Multiple-choice | 60 seconds
Multiple-choice of higher-level learning objectives | 90 seconds
Short answer | 120 seconds
Completion | 60 seconds
Matching | 30 seconds per response
Short essay | 10-15 minutes
Extended essay | 30 minutes
Visual image | 30 seconds
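
These averages make it easy to sanity-check a planned test against the class period. The sketch below is a minimal illustration; the planned item mix is hypothetical.

```python
# A minimal sketch: estimate total testing time from the average
# answering times in the table above (hypothetical item mix).
AVERAGE_SECONDS = {
    "true-false": 30,
    "multiple-choice": 60,
    "multiple-choice (higher level)": 90,
    "short answer": 120,
    "completion": 60,
}

planned_test = {"multiple-choice": 20, "true-false": 10, "completion": 5}

total = sum(AVERAGE_SECONDS[fmt] * n for fmt, n in planned_test.items())
print(f"Estimated answering time: {total / 60:.0f} minutes")
# 20(60) + 10(30) + 5(60) = 1800 seconds, about 30 minutes; compare
# this with the length of the class period before finalizing the test.
```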

The number of items included in a given assessment will also depend on the
length of the class period and the type of items utilized. The following guidelines will
assist you in determining an assessment appropriate for college-level students aside
from the previous formula discussed.

Yes No
The item is appropriate to measure a learning objective.
The item format is the most effective means of measuring the desired
knowledge.
The item is clearly worded and can be easily understood by the target
student population.
The items of the same format are grouped together.
There are various item types included in the assessment.
The students have enough time to answer all test items.
The test instructions are specific and clear.
The number of questions targeting each objective matches the weight of
importance of that objective.
The scoring guidelines are discussed clearly and available to students.

Assemble the Test Items

After constructing the test items following the different principles of constructing
test items, the next step is to assemble the test. There are two steps in
assembling the test: (1) packaging the test; and (2) reproducing the test.

In assembling the test, consider the following guidelines:

a. Group all test items with similar format. All items in similar format must be
grouped so that the students will not be confused.
b. Arrange test items from easy to difficult. The test items must be arranged
from easy to difficult so that students will answer the first few items correctly
and build confidence at the start of the test.
c. Space the test items for easy reading.
d. Keep an item and its options on the same page.
e. Place the illustrations near the description.
f. Check the answer key.
g. Decide where to record the answer.

Write Directions

Check the test directions for each item format to be sure that they are clear for the
students to understand. The test directions should indicate the numbers of items to
which they apply, how to record the answers, the basis on which the students should
select their answers, and the criteria for scoring or the scoring system.

Check the Assembled Test Items

Before reproducing the test, it is very important to proofread the test items
for typographical and grammatical errors and make the necessary corrections, if any.
If possible, let others examine the test to validate its content. This can save time
during the examination and avoid distracting the students.

Make the Answer Key

Be sure to check your answer key so that the correct answers follow a fairly
random sequence. Avoid patterns such as TFTFTF or TTFFF for a true or false test,
and A B C D A B C D patterns for a multiple-choice test. The number of true answers
should be roughly equal to the number of false answers, and the correct answers
should likewise be spread evenly among the multiple-choice options (a small sketch
of one way to do this follows).
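
One way to follow this advice is to generate a balanced key and then shuffle it. This is a minimal sketch, not a required procedure:

```python
# A minimal sketch: build a balanced true-false answer key and shuffle
# it so the sequence is random rather than a fixed pattern.
import random

def make_tf_key(n_items):
    # Half true, half false (the extra item goes to F when n is odd).
    key = ["T"] * (n_items // 2) + ["F"] * (n_items - n_items // 2)
    random.shuffle(key)
    return key

print(make_tf_key(10))  # e.g. ['F', 'T', 'T', 'F', 'T', 'F', ...]
```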

Analyze and Improve the Test Items

Analyzing and improving the test should be done after checking, scoring and
recording the test. The details of this part will be discussed in the succeeding chapter.

DIFFERENT FORMATS OF CLASSROOM ASSESSMENT TOOLS

There are different ways of assessing the performance of students. We have the
objective test, subjective test, performance-based assessment, oral questioning,
portfolio assessment, self-assessment, and checklist. Each of these has its own function
and use. The type of assessment tool should always be appropriate to the objectives of
the lesson.

There are two general types of test items used in paper-and-pencil achievement
tests: selection-type items and supply-type items.

Selection Type or Objective Test Item

Selection-type items require students to select the correct response from several
options. These are also known as objective test items. Selection-type items can be
classified as multiple-choice, matching type, true or false, or interpretative exercises.

An objective test item requires only one correct answer per item.

Kinds of Objective Type Test

In this section, we shall discuss the different formats of objective test items: the
general guidelines for constructing the multiple-choice type of test; guidelines for
constructing the stem, options, and distracters; the advantages and disadvantages of
the multiple-choice test; guidelines for constructing the matching type of test and its
advantages and disadvantages; and guidelines for constructing true or false tests and
interpretative exercises, together with their advantages and disadvantages.

a. Multiple-choice Test

A multiple-choice test is used to measure knowledge outcomes and other types
of learning outcomes, such as comprehension and application. It is the most commonly
used format in measuring student achievement at different levels of learning.

A multiple-choice item consists of three parts: the stem, the keyed option, and the
incorrect options or alternatives. The stem represents the problem or question, usually
expressed in completion form or question form. The keyed option is the correct answer.
The incorrect options or alternatives are also called distracters or foils.

General Guidelines in Constructing Multiple-choice Test

1. Make test items practical, with real-world applications for the
students.
2. Use a diagram or drawing when asking questions about application,
analysis, or evaluation.
3. When asking students to interpret or evaluate quotations, present
actual quotations from secondary sources like published books or
newspapers.
4. Use tables, figures, or charts when asking questions that require
interpretation.
5. Use pictures if possible when students are required to apply concepts
and principles.
6. List the choices/ options vertically not horizontally.
7. Avoid trivial questions.
8. Use only one correct answer or best answer format.
9. Use three to five options to discourage guessing.
10. Be sure that distracters are plausible and effective.
11. Increase the similarity of the options to increase the difficulty of the
item.
12. Do not use “none of the above” options when asking for a best answer.
13. Avoid using “all of the above” options. It is usually the correct answer
and makes the item too easy for the examinee with partial knowledge.

Guidelines in Constructing the Stem

1. The stem should be written in question form or completion form. Research
shows that it is more advisable to use the question form.
2. Do not leave a blank at the beginning or in the middle of the stem when using
the completion form of a multiple-choice test.
3. The stem should pose the problem completely.
4. The stem should be clear and concise.
5. Avoid excessive and meaningless words in the stem.
6. State the stem in positive form. Avoid negative phrasing like "not" or
"except." Underline or capitalize the negative word if it cannot be avoided.
Example: "Which of the following does not belong to the group?" should be
written as "Which of the following does NOT belong to the group?"
7. Avoid grammatical clues in the correct answer.

Guidelines in Constructing Options

1. There should be one correct or best answer in each item.


2. List options vertically, not horizontally, beneath the stem.
3. Arrange the options in logical order and use capital letters to indicate each
option, such as A, B, C, D, E.
4. Avoid overlapping options; keep them independent of one another.
5. All options must be homogeneous in content to increase the difficulty of an item.
6. As much as possible, the options should be of equal length.
7. Avoid using the phrase "all of the above."
8. Avoid using the phrase "none of the above" or "I don't know."

Guidelines in Constructing the Distracters

1. The distracters should be plausible.
2. The distracters should be equally popular among the examinees.
3. Avoid using ineffective distracters. Replace distracter(s) that do not attract
any examinees.
4. Each distracter should be chosen by at least 5% of the examinees, but not more
often than the keyed answer (a simple tally like the sketch after this list can
reveal such problems).
5. Revise distracter(s) that are overly attractive to the examinees; they might
make the item ambiguous.
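
Guideline 4 is easy to check after a test administration by tallying how often each option was chosen. The sketch below is a minimal, hypothetical illustration; the response string and the keyed option "B" are made up for the example.

```python
# A minimal sketch: tally option choices to flag ineffective or overly
# attractive distracters (see guideline 4). Data are hypothetical.
from collections import Counter

responses = list("BBABCBBBBABBCBBABBBBD")  # one letter per examinee
key = "B"                                  # assumed keyed option
counts = Counter(responses)
n = len(responses)

for option in "ABCD":
    share = counts.get(option, 0) / n
    flag = ""
    if option != key and share < 0.05:
        flag = "  <- chosen by under 5%: ineffective, consider revising"
    if option != key and counts.get(option, 0) > counts[key]:
        flag = "  <- more popular than the key: possibly ambiguous"
    print(f"Option {option}: {share:.1%}{flag}")
```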

Examples of Multiple-choice Items

1. Knowledge Level
The most stable measure of central tendency is the _______________.
A. Mean
B. Mean and median
C. Median
D. Mode

This kind of question is a knowledge-level item because the students are required
only to recall the properties of the mean. The correct answer is option A.
2. Comprehension Level
Which of the following statements describes a normal distribution?
A. The mean is greater than the median.
B. The mean, median, and mode are equal.
C. The scores are concentrated at one end of the distribution.
D. Most of the scores are high.
This kind of question is a comprehension-level item because the students are
required to describe scores that are normally distributed. The correct answer
is option B.

3. Application Level
What is the standard deviation of the following scores of 10 students in a
mathematics quiz: 10, 13, 16, 16, 17, 19, 20, 20, 20, 25?
A. 3.90
B. 3.95
C. 4.20
D. 4.25
This kind of question is an application-level item because the students are asked to
apply the formula and solve for the standard deviation. The correct answer is option C.
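The keyed answer assumes the sample (n - 1) formula; the following is offered only as a quick verification sketch:

```python
# A minimal sketch verifying the item above. stdev() uses the sample
# (n - 1) formula; pstdev() uses the population (n) formula.
from statistics import stdev, pstdev

scores = [10, 13, 16, 16, 17, 19, 20, 20, 20, 25]
print(f"Sample SD:     {stdev(scores):.2f}")   # 4.20 -> option C
print(f"Population SD: {pstdev(scores):.2f}")  # 3.98, a common pitfall
```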
4. Analysis Level
What statistical test is used when you test the mean difference between pretest
and posttest scores?
A. Analysis of variance
B. t-test
C. Correlation
D. Regression analysis
This kind of question is an analysis-level item because students are
required to distinguish which type of test is used. The correct answer is option B.

Advantages of Multiple-choice Test

1. It measures learning outcomes from the knowledge level to the evaluation level.
2. Scoring is highly objective, easy, and reliable.
3. Scores are more reliable than those of the subjective type of test.
4. It measures a broad sample of content within a short span of time.
5. Distracters can provide diagnostic information.
6. Item analysis can reveal the difficulty of an item and can discriminate between
the good and the poor performing students.

Disadvantages of Multiple-choice Test

1. It is time consuming to construct a good item.
2. It is difficult to find effective and plausible distracters.
3. Scores can be influenced by the reading ability of the examinees.
4. In some cases, there is more than one justifiable correct answer.

5. It is ineffective in assessing the problem-solving skills of the students.
6. It is not applicable when assessing the students' ability to organize and express ideas.

b. Matching Type
A matching type item consists of two columns. Column A contains the
descriptions and is placed on the left side, while Column B contains the options
and is placed on the right side. The examinees are asked to match the options
with the descriptions they are associated with.

Guidelines in Constructing Matching Type Tests

1. The descriptions and options must be short and homogeneous.
2. The descriptions must be written on the left side and marked as Column A,
and the options must be written on the right side and marked as Column B,
to save time for the examinees.
3. There should be more options than descriptions, or the directions should
indicate that each option may be used more than once, to decrease the
chance of guessing.
4. Matching directions should specify the basis for matching. Failure to indicate
how matches should be marked can greatly increase the time the teacher
consumes in scoring.
5. Avoid too many correct answers.
6. When using names, always include the complete name (first name and
surname) to avoid ambiguities.
7. Use numbers for the descriptions and capital letters for the options to avoid
confusing students who have reading problems.
8. Arrange the options in chronological or alphabetical order.
9. The descriptions and options must be written on the same page.
10. Use a minimum of three and a maximum of seven items for the elementary
level, and a maximum of seventeen items for the secondary and tertiary
levels.

Examples of Matching Type Test

Direction: Match the function of the computer part in Column A with its name
in Column B. Write the letter of your choice before the number.

Column A

_____ 1. Stores information waiting to be used
_____ 2. Considered as the brain of the computer
_____ 3. Hand-held device used to move the cursor
_____ 4. An example of an output device
_____ 5. Stores permanent information in the computer
_____ 6. Physical aspect of the computer
_____ 7. Used to display the output
_____ 8. The instructions fed into the computer
_____ 9. Pre-loaded data
_____ 10. Permits a computer to store large amounts of data

Column B

A. Central Processing Unit
B. Hard Drive
C. Hardware
D. Mass Storage Device
E. Mouse
F. Monitor
G. Processor
H. Printer
I. Random Access Memory
J. Read Only Memory
K. Software
L. Universal Serial Bus

Advantages of Matching Type Test

1. It is simpler to construct than a multiple-choice type of test.
2. It reduces the effect of guessing compared to the multiple-choice and true or
false types of test.
3. It is appropriate for assessing associations between facts.
4. It provides easy, accurate, efficient, objective, and reliable test scores.
5. More content can be covered in a given set of tests.

Disadvantages of Matching Type Test

1. It measures only simple recall or memorization of information.
2. It is difficult to construct because of problems in selecting the descriptions
and options.
3. It assesses only the low levels of the cognitive domain, such as knowledge
and comprehension.

c. True or False Type

Another format of the objective type of test is the true or false test item. In
this type of test, the examinees determine whether the statement presented is true or
false. The true or false item is an example of a "forced-choice test" because there are
only two possible choices. The students are required to choose the answer, true or
false, in recognition of a correct or an incorrect statement.

The true or false type of test is appropriate for assessing behavioral objectives such
as "identify," "select," or "recognize." It is also suited to assessing the knowledge and
comprehension levels of the cognitive domain. This type of test is appropriate when
there are only two plausible alternatives or distracters.

Guidelines in Constructing True or False Tests

1. Avoid writing a very long statement. Eliminate unnecessary word(s) in the
statement (be concise).
2. Avoid trick questions.
3. The statement should contain only one idea in each item, except for
statements showing the relationship between cause and effect.
4. It can be used for establishing cause and effect relationships.
5. Avoid using opinion-based statements; if this cannot be avoided, the
statement should be attributed to somebody.
6. Avoid using negatives or double negatives. Construct the statement
positively. If this cannot be avoided, bold or underline the negative words to
call the attention of the examinees.
7. Avoid specific determiners such as "never," "always," "all," and "none," for
they tend to appear in statements that are false.
8. Avoid specific determiners such as "some," "sometimes," and "may," for they
tend to appear in statements that are true.
9. The number of true items should be roughly equal to the number of false
items.
10. Avoid grammatical clues that lead to the correct answer, such as the articles
(a, an, the).
11. Avoid statements directly taken from the textbook.
12. Avoid arranging the statements in a recognizable pattern such as
TTTTT-FFFFF, TFTFTF, or TTFFTTFF.
13. Directions should indicate where or how the students should mark their
answers.

Example of True or False type of test

Direction: Write your answer before the number in each item. Write T if the
statement is true and F if the statement is false.

T F 1. A test constructor should never phrase a test item in the negative.

T F 2. Photosynthesis is the process by which leaves make a plant's food.

T F 3. The equation 3x³ + x³ + 6 = 4x + 6.

T F 4. All parasites are animals.

T F 5. A statement of opinion may be used in a true or false test item.

Advantages of a True or False Test

1. It covers a lot of content in a short span of time.
2. It is easier to prepare compared to multiple-choice and matching types of test.
3. It is easier to score because it can be scored objectively, compared to a test that
depends on the judgment of the rater(s).
4. It is useful when there are only two alternatives.
5. The scores are more reliable than those of an essay test.

Disadvantages of a True or False Test

1. It is limited to low-level thinking skills such as knowledge and comprehension,
or the recognition or recall of information.
2. There is a high probability of guessing the correct answer (50%) compared to a
multiple-choice item with four options (25%).
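
Disadvantage 2 translates directly into expected chance scores. A minimal sketch with a hypothetical 50-item test:

```python
# A minimal sketch: expected scores from blind guessing on a 50-item
# test (hypothetical length), comparing the two formats above.
n_items = 50
p_tf = 1 / 2   # true-false: two choices
p_mc = 1 / 4   # multiple-choice: four options

print(f"True-false:      expect {n_items * p_tf:.1f} correct by chance")
print(f"Multiple-choice: expect {n_items * p_mc:.1f} correct by chance")
# 25.0 vs. 12.5 items correct by pure guessing.
```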

Supply Type or Subjective Type of Test Items

Supply-type items require students to create and supply their own answers or
perform a certain task to show mastery of knowledge or skills. They are also known
as constructed-response tests. Supply-type items or constructed-response tests are
classified as:

a. Short answer or completion type
b. Essay type items (restricted response or extended response)

Another way of assessing the performance of the students is by using performance-
based assessment and portfolio assessment, which are categorized under constructed-
response tests. Let us discuss the details of the selection-type and supply-type test
items in this section, while performance-based assessment and portfolio assessment
will be discussed in the succeeding chapters.

A subjective test item requires the students to organize and present an original
answer (essay test), perform a task to show mastery of learning (performance-based
assessment and portfolio assessment), or supply a word or phrase to answer a certain
question (completion or short answer type of test).

The essay test is a form of subjective test. It measures complex cognitive
skills or processes. This type of test has no one specific answer per student. It is
usually scored on an opinion basis, although there will be certain facts and
understanding expected in the answer. There are two kinds of essay items: extended
response essays and restricted response essays.

Kinds of Subjective Type Test Items

The subjective type of test is another test format in which the students supply the
answer rather than select the correct answer. In this section, we shall consider the
completion type or short answer test and the essay type item. There are two types of
essay items according to the length of the answer: the extended response essay and
the restricted response essay.

The teacher must present and discuss in advance the criteria used in assessing the
answers of the students to help them prepare for the test.

a. Completion Type or Short Answer Test

The completion or short answer type is an alternative form of assessment because
the examinee needs to supply or create the appropriate word(s), symbol(s), or
number(s) to answer the question or complete a statement, rather than selecting the
answer from given options. There are two ways of constructing completion or short
answer items: the question form and the completion-of-a-statement form.

Guidelines in Constructing Completion Type or Short Answer Tests

1. The item should require a single-word answer or a brief and definite
statement. Do not use indefinite statements that allow several answers.
2. Be sure that the language used in the statement is precise and accurate in
relation to the subject matter being tested.
3. Be sure to omit only key words; do not eliminate so many words that the
meaning of the item statement changes.
4. Do not leave a blank at the beginning of or within the statement. It should
be at the end of the statement.
5. Use a direct question rather than an incomplete statement. The statement
should pose the problem to the examinee.
6. Be sure to indicate the units in which the answer is to be expressed when
the statement requires a numerical answer.
7. Be sure that the answer the student is required to produce is factually
correct.
8. Avoid grammatical clues.
9. Do not lift sentences directly from the textbook.

Examples of Completion and Short Answer Items
Direction: Write your answer before the number in each item. Write the word(s),
phrase, or symbol(s) to complete the statement.

Question Form (the answer precedes each item):

Essay item. 1. Which supply type item is used to measure the ability to organize
and integrate material?
Distracters. 2. What are the incorrect options in a multiple-choice item called?
Pentagon. 3. What do you call a polygon that has five sides?
Evaluation. 4. What is the most complex level in Bloom's taxonomy of the
cognitive domain?
Multiple-choice test item. 5. Which test item measures the greatest variety of
learning outcomes?

Completion Form:

Essay item. 1. A supply type item used to measure the ability to organize and
integrate material is called _________.
Distracters. 2. The incorrect options in a multiple-choice test item are called
_________.
Pentagon. 3. A polygon with five sides is called a _________.
Evaluation. 4. The most complex level in Bloom's taxonomy of the cognitive
domain is called _________.
Multiple-choice test item. 5. The test item that measures the greatest variety of
learning outcomes is called _________.

Advantages of a Completion or Short Answer Test

1. It covers a broad range of topics in a short span of time.
2. It is easier to prepare and less time consuming compared to multiple-choice and
matching types of test.
3. It can effectively assess the lower levels of Bloom's taxonomy, since it assesses
recall of information rather than recognition.
4. It reduces the possibility of guessing the correct answer, because it requires
recall, compared to true or false items and multiple-choice items.
5. It covers a greater amount of content than the matching type test.

Disadvantages of a Completion or Short Answer Test

1. It is appropriate only for questions that can be answered with short responses.
2. Scoring is difficult when the questions are not prepared properly and
clearly. The question should be clearly stated so that the answer of the student
is clear.
3. It can assess only the knowledge, comprehension, and application levels of
Bloom's taxonomy of the cognitive domain.
4. It is not adaptable to measuring complex learning outcomes.

5. Scoring is tedious and time consuming.
b. Essay Items
The essay item is appropriate for assessing students' ability to organize and
present their original ideas. It consists of a few questions, in answering which the
examinee is expected to demonstrate the ability to recall factual knowledge, organize
this knowledge, and present it in a logical and integrated answer.

Types of Essay Items

There are two types of essay items: the extended response essay and the
restricted response essay.

b.1. Extended Response Essays

An essay test that allows the students to determine the length and
complexity of the response is called an extended response essay item (Kubiszyn
and Borich, 2007). It is very useful in assessing the synthesis and evaluation
skills of the students. When the objective is to determine whether the
students can organize, integrate, and express ideas and evaluate
information, it is best to use the extended response essay test.

Using the extended response essay item has advantages and disadvantages.
The advantages are: it demonstrates learning outcomes at the synthesis and
evaluation levels; answers can be evaluated with sufficient reliability to provide
useful measures of learning; it gives students more freedom in responding to
the question; and it allows creative integration of ideas. The disadvantages
are: it is more difficult to construct extended response essay questions, and
scoring is more time consuming than for the restricted response essay.

Examples of Extended Response Essay Questions:

1. Present and describe the modern theory of evolution and discuss how
it is supported by evidence from the areas of (a) comparative
anatomy and (b) population genetics.
2. Consider the statement, "Wealthy politicians cannot offer fair
representation to all the people." What do you think is the reasoning
behind the statement? Explain your answer.

b.2. Restricted Response Essay

An essay item that places strict limits on both the content and the response
given by the students is called a restricted response essay item. In this type of
essay, the content is usually restricted by the scope of the topic to be
discussed, and the limitation on the form of the response is indicated in the
question.

When there is a restriction on the form and scope of the answer in an essay
test, there are advantages and disadvantages. The advantages are: it is easier to
prepare the questions; it is easier to score; and it is more directly related to the
specific learning outcomes. The disadvantages are: it provides little opportunity for
the students to demonstrate their abilities to organize ideas, to integrate material,
and to develop new patterns of answers; and it measures learning outcomes at the
comprehension, application, and analysis levels only.

Examples of Restricted Response Essay Questions:

1. List the major facts and opinions in the first State of the Nation Address
(SONA) of Pres. Benigno C. Aquino III. Limit your answer to one page
only. The score will depend on the content, organization, and accuracy of
your answer.
2. Point out the strengths and weaknesses of a multiple-choice type of test.
Limit your answer to five strengths and five weaknesses. Explain each
answer in not more than two sentences.

Guidelines in Constructing Essay Test Items

1. Construct essay questions to measure complex learning outcomes only.
2. Essay questions should relate directly to the learning outcomes to be
measured.
3. Formulate essay questions that present a clear task to be performed.
4. An item should be stated precisely, and it must clearly focus on the desired
answer.
5. All students should be required to answer the same questions.
6. The number of points and the time to be spent in answering the question
must be indicated in each item.
7. Specify the number of words, paragraphs, or sentences for the answer.
8. The scoring system must be discussed or presented to the students.

Examples of Essay Test Items

1. Choose a leader you admire most and explain why you admire him or her.
2. Pick a controversial issue in the Aquino administration. Discuss the issue and
suggest a solution.
3. If you were the principal of a certain school, describe how you would
demonstrate your leadership ability inside and outside of the school.
4. Describe the difference between norm-referenced assessment and criterion-
referenced assessment.
5. Do you agree or disagree with the statement, "Education comes not from
books but from practical experience"? Support your position.

Types of Complex Outcomes and Related Terms

for Writing Essay Questions

Outcome: Comparing
Sample verbs: compare, classify, describe, distinguish between, explain, outline,
summarize
Sample question: Describe the similarities and differences between the Philippine
educational system and the Singaporean educational system.

Outcome: Interpreting
Sample verbs: convert, draw, estimate, illustrate, interpret, restate, summarize,
translate
Sample question: Summarize briefly the content of the second SONA of President
Benigno C. Aquino III.

Outcome: Inferring
Sample verbs: derive, draw, estimate, extend, predict, propose, relate
Sample question: Using the facts presented, what is most likely to happen when ...?

Outcome: Applying
Sample verbs: arrange, compute, describe, illustrate, relate, summarize, solve
Sample question: Find the solution set of the equation x² + 5x - 24 = 0 using the
factoring method.

Outcome: Analyzing
Sample verbs: break down, describe, differentiate, divide, list, outline
Sample question: List and describe the characteristics of a good assessment
instrument.

Outcome: Creating
Sample verbs: compose, design, draw, formulate, list, present, make up
Sample question: Formulate a hypothesis about the problem "Mathematics
attitude and competency levels of the education students of U.E."

Outcome: Synthesizing
Sample verbs: arrange, combine, construct, design, relate, group
Sample question: Design a scoring guide for evaluating portfolio assessment.

Outcome: Generalizing
Sample verbs: construct, develop, explain, formulate, make, state
Sample question: Explain the function of assessment of learning.

Outcome: Evaluating
Sample verbs: appraise, criticize, defend, describe, evaluate, explain, judge, rate,
write
Sample question: Describe the strengths and weaknesses of using performance-
based assessment in evaluating the performance of the students.

Advantages of Essay Test

1. It is easier to prepare and less time consuming compared to other paper and
pencil tests.
2. It measures higher-order thinking skills (analysis, synthesis and evaluation).
3. It allows students’ freedom to express individuality in answering the given
question.
4. It gives the students a chance to express their own ideas and plan their own
answers.
5. It reduces guessing compared to any objective type of test.
6. It presents more realistic task to the students.
7. It emphasizes on the integration and application of ideas.

Disadvantages of Essay Test

1. It cannot provide an objective measure of the achievement of the students.
2. It needs much time to grade and to prepare scoring criteria.
3. The scores are usually not reliable, especially without scoring criteria.
4. It measures only a limited amount of content and objectives.
5. It produces a low variation of scores.
6. It usually encourages bluffing.

Suggestions for Grading Essay Test

Zimmaro (2003) suggested different guidelines for scoring an essay test. These
guidelines are very important in scoring the performance of the students in order to
avoid or lessen the subjectivity of the scoring.

1. Decide on a policy for dealing with incorrect, irrelevant, or illegible responses.
2. Keep the scores of previously read items out of sight.
3. The student's identity should remain anonymous while his/her paper is being
graded.
4. Read and evaluate each student's answer to the same question before grading
the next question.
5. Provide students with the general grading criteria by which they will be
evaluated prior to the examination.
6. Use analytic scoring or holistic scoring.
7. Answer the test question yourself by writing the ideal answer to it, so that you
can develop the scoring criteria from your answer.
8. Write your comments on the students' papers.

Checklists for Evaluating Essay Questions

Yes No
The test item is appropriate for measuring the intended learning outcomes.
The test item task matches the learning task to be measured.
The questions constructed measure complex learning outcomes.
The question states what is being measured and how the answers are to be
evaluated.
The terminology used clarifies and limits the task.
All students are required to answer the same questions.
There is an established time limit for answering each question.
Provisions for scoring answers are given (criteria for evaluating answers).

CHAPTER 4

ADMINISTERING, ANALYZING, AND IMPROVING TESTS

Learning Objectives

At the end of this chapter, the students should be able to:

1. Define the basic concepts regarding item analysis;


2. Identify the steps in improving test items;
3. Solve for the difficulty index and discrimination index;
4. Identify the level of difficulty of an item;
5. Perform item analysis properly and correctly;
6. Identify the item to be rejected, revised, or retained; and
7. Interpret the results of item analysis.

INTRODUCTION

One of the most important functions of a teacher is to assess the performance of
the students. This is a very complicated task because you must consider many factors,
such as the timing of the assessment process, the format of the assessment tools, and
the duration of the assessment procedures.

After designing the assessment tools, package the test, administer it to the students, check the test papers, then score and record the results. Return the test papers and give feedback to the students regarding the result of the test.

PACKAGING AND REPRODUCING TEST ITEMS

Assuming that you have already assembled the test (you have written the instructional objectives, prepared the table of specifications, and written test items that match the instructional objectives), the next thing to do is to package the test and reproduce it as discussed in the previous chapter.

1. Put the items with the same format together.


2. Arrange the test items from easy to difficult.
3. Give proper spacing for each item for easy reading.
4. Keep questions and options in the same page.
5. Place the illustrations near the options.
6. Check the key answer.
7. Check the direction of the test.
8. Provide space for name, date and score.
9. Proofread the test.
10. Reproduce the test.

ADMINISTERING THE EXAMINATION

After constructing the test items and putting them in order, the next step is to administer the test to the students. The administration procedures greatly affect the performance of the students in the test. Test administration does not simply mean giving the test questions to the students and collecting the test papers after the given time. Below are the guidelines in administering the test before, during, and after the examination.

Guidelines Before Administering Examinations

1. Try to induce positive test-taking attitude.


2. Inform the students about the purpose of the test.
3. Give oral directions as early as possible before distributing the tests.
4. Giving test-taking hints about guessing, skipping, and the like is strictly prohibited.
5. Inform the students about the length of time allowed for the test. If possible,
write on the board the time in which they must be finished with answering the
test. Give the students a warning before the end of the time limit.
6. Tell the students how to signal or call your attention if they have a question.
7. Tell the students what to do with their papers when they are done answering the
test (how papers are to collected).
8. Tell the students what to do when they are done with the test, particularly if they
are to go on to another activity (also write these directions on the chalkboard so
they can refer to them).
9. Rotate the method of distributing papers so you don’t always start from the left
or the front row.
10. Make sure the room is well lighted and has a comfortable temperature.
11. Remind students to put their names on their papers (and where to do so).
12. If the test has more than one page, have each student check that all pages are there.

Guidelines During the Examination

1. Do not give instructions or talk while the examination is going on; minimize interruptions and distractions.
2. Avoid giving hints.
3. Monitor to check student progress and discourage cheating.
4. Give time warning if students are not pacing their work appropriately.
5. Make a note of any questions students ask during the test so that items can be
revised for future use.
6. Collect the test papers uniformly to save time and to avoid misplacing them.

Guidelines After the Examination

After the examination, the next activity that the teacher needs to do is to score
the test papers, record the result of the examination; return the test papers and last to
discuss the test items in the class so that you can analyze and improve the test items for
future use.

1. Grade the papers (and add comments if you can); do test analysis (see the
module on test analysis) after scoring and before returning papers to students if
at all possible. If it is impossible to do your test analysis before returning the
papers, be sure to do it at another time. It is important to do both the evaluation
of your students and the improvement of your tests.
2. If you are recording grades or scores, record them in pencil in your class record before returning the papers. If there are errors or adjustments in grading, the grades are easier to change when recorded in pencil.
3. Return papers in a timely manner.
4. Discuss test items with the students. If students have questions, agree to look
over their papers again, as well as the papers of others who have the same
question. It is usually better not to agree to make changes in grades on the spur
of the moment while discussing the tests with the students but to give yourself
time to consider what action you want to take. The test analysis may have
already alerted you to a problem with a particular question that is common to
several students, and you may already have made a decision regarding that question (to disregard the question and reduce the highest possible score accordingly, to give all students credit for that question, among others).

ANALYZING THE TEST

After administering and scoring the test, the teacher should also analyze the
quality of each item in the test. Through this you can identify the item that is good, item
that needs improvement or items to be removed from the test. But when do we consider
that the test is good? How do we evaluate the quality of each item in the test? Why is it
necessary to evaluate each item in the test? Lewis Aiken (1997), an author on psychological and educational measurement, pointed out that a "postmortem" is just as necessary in classroom assessment as it is in medicine.

In this section, we shall introduce the technique to help teachers determine the
quality of a test item known as item analysis. One of the purposes of item analysis is to
improve the quality of the assessment tools. Through this process, we can identify the
item that is to be retained, revised or rejected and also the content of the lesson that is
mastered or not.

There are two kinds of item analysis, quantitative item analysis and qualitative
item analysis (Kubiszyn and Borich, 2007).

Item Analysis

Item analysis is the process of examining the students' responses to each individual item in the test. It consists of different procedures for assessing the quality of the test items given to the students. Through the use of item analysis we can identify which of the given test items are good and which are defective. Good items are to be retained and defective items are to be improved, revised, or rejected.

Uses of Item Analysis

1. Item analysis data provide a basis for efficient class discussion of the test
results.
2. Item analysis data provide a basis for remedial work.
3. Item analysis data provide a basis for general improvement of classroom
instruction.
4. Item analysis data provide a basis for increased skills in test construction.
5. Item analysis procedures provide a basis for constructing a test bank.

Types of Quantitative Item Analysis

There are three common types of quantitative item analysis which provide teachers with three different types of information about individual test items. These are the difficulty index, the discrimination index, and the analysis of response options.

1. Difficulty Index
It refers to the proportion of the number of students in the upper and lower groups who answered an item correctly. The larger the proportion, the more students have learned the subject matter measured by the item. To compute the difficulty index of an item, use the formula:

DF = n / N, where

DF = difficulty index
n = number of students selecting the item correctly in the upper group and in the lower group
N = total number of students who answered the test
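As an illustration, here is a minimal Python sketch of the computation (the function name and variables are ours, not from the text):

```python
def difficulty_index(correct_upper, correct_lower, total_examinees):
    """DF = n / N: proportion of the upper and lower groups answering correctly."""
    n = correct_upper + correct_lower
    return n / total_examinees

# 10 correct in the upper group, 4 in the lower group, 40 examinees in all
print(difficulty_index(10, 4, 40))  # 0.35
```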

Level of Difficulty

To determine the level of difficulty of an item, first find the difficulty index using the formula, then identify the level of difficulty using the range given below.

Index Range      Difficulty Level
0.00 – 0.20      Very Difficult
0.21 – 0.40      Difficult
0.41 – 0.60      Average/Moderately Difficult
0.61 – 0.80      Easy
0.81 – 1.00      Very Easy

The higher the value of the index of difficulty, the easier the item is. Hence, more
students got the correct answer and more students mastered the content measured by
that item.

2. Discrimination Index
The discrimination index is the power of the item to discriminate between the students who scored high and those who scored low in the overall test. In other words, it is the power of the item to discriminate between the students who know the lesson and those who do not.
It is computed as the number of students in the upper group who got the item correctly minus the number of students in the lower group who got the item correctly, divided by the number of students in either the upper group or the lower group (use the higher number if they are not equal).
The discrimination index is the basis for measuring the validity of an item. This index can be interpreted as an indication of the extent to which overall knowledge of the content area or mastery of the skills is related to the response on an item.

Types of Discrimination Index

There are three kinds of discrimination index: positive discrimination, negative discrimination, and zero discrimination.

1. Positive discrimination happens when more students in the upper group than in the lower group got the item correctly.
2. Negative discrimination occurs when more students in the lower group than in the upper group got the item correctly.
3. Zero discrimination happens when the numbers of students in the upper group and in the lower group who answered the item correctly are equal; hence, the test item cannot distinguish between the students who performed well in the overall test and the students whose performance was very poor.

Level of Discrimination

Ebel and Frisbie (1986) as cited by Hetzel (1997) recommended the use of Level
of Discrimination of an Item for easier interpretation.

Index Range      Discrimination Level
0.19 and below   Poor item, should be eliminated or needs to be revised
0.20 – 0.29      Marginal item, needs some revision
0.30 – 0.39      Reasonably good item but possibly for improvement
0.40 and above   Very good item

Discrimination Index Formula

DI = (CUG – CLG) / D, where

DI = discrimination index value

CUG = number of students selecting the correct answer in the upper group

CLG = number of students selecting the correct answer in the lower group

D = number of students in either the lower group or the upper group

Note: Consider the higher number in case the sizes of the upper and lower groups are not equal.
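A minimal Python sketch of the same computation (illustrative names; it assumes the upper and lower groups have already been formed):

```python
def discrimination_index(correct_upper, correct_lower, upper_size, lower_size):
    """DI = (CUG - CLG) / D, with D the larger group size if the sizes differ."""
    d = max(upper_size, lower_size)
    return (correct_upper - correct_lower) / d

print(discrimination_index(10, 4, 20, 20))  # 0.30
```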

Steps in Solving Difficulty Index and Discrimination Index

1. Arrange the scores from highest to lowest.
2. Separate the scores into an upper group and a lower group. There are different methods to do this: (a) if a class consists of 30 students who took an exam, arrange their scores from highest to lowest and divide them into two groups; the upper half of the scores belongs to the upper group and the lower half belongs to the lower group. (b) Other literature suggests using 27%, 30%, or 33% of the students for the upper group and the lower group. In the Licensure Examination for Teachers (LET), the test developers always use 27% of the examinees for the upper and lower groups.
3. Count the number of those who chose the alternatives in the upper and lower
group for each item and record the information using the template:
Options A B C D E
Upper Group
Lower Group

Note: Put an asterisk beside the correct answer.


4. Compute the value of the difficulty index and the discrimination index, and analyze the responses to each of the distracters.
5. Make an analysis for each item.

Checklist for Discrimination Index


It is very important to determine whether a test item will be retained, revised, or rejected. Using the discrimination index we can identify nonperforming test items; just always remember that it seldom indicates what the problem is. Use the checklist given below:

Yes No
1. Does the key discriminate positively?
2. Do the incorrect options discriminate negatively?
If the answers to questions 1 and 2 are both YES, retain the item.

If the answer to one question is YES and to the other is NO, revise the item.

If the answers to questions 1 and 2 are both NO, eliminate or reject the item.

3. Analysis of Response Options

Aside from the difficulty index and the discrimination index, another way to evaluate the performance of the entire test item is through the analysis of the response options. It is very important to examine the performance of each option in a multiple-choice item. Through this, you can determine whether the distracters (the incorrect options) are effective. An incorrect option is attractive when more students in the lower group than in the upper group choose it. Analyzing the incorrect options allows the teachers to improve the test item so that it can be used again in the future.

Distracter Analysis

1. Distracter

A distracter is the term used for an incorrect option in the multiple-choice type of test, while the correct answer represents the key. It is very important for the test writer to know if the distracters are effective or good distracters. Using quantitative item analysis we can determine if the options are good or if the distracters are effective.

Item analysis can identify non-performing test items, but it seldom indicates the error or the problem in the given item. There are several factors to consider why students fail to get the correct answer to a given question:

a. The content was not taught properly in class.
b. The item is ambiguous.
c. The correct answer is not among the given options.
d. The item has more than one correct answer.
e. The item contains grammatical clues that mislead the students.
f. The student is not aware of the content.
g. The students were confused by the logic of the question because it contains double negatives.
h. The student failed to study the lesson.
2. Miskeyed item
The test item is a potential miskey if more students from the upper group choose one of the incorrect options than the key.
3. Guessing item
Students from the upper group have equal spread of choices among the given
alternatives. Students from the upper group guess their answers because of the
following reasons:
a. The content of the test is not discussed in the class or in the text.
b. The test item is very difficult.
c. The question is trivial.
4. Ambiguous item
This happens when more students from the upper group choose equally an
incorrect option and the keyed answer.

Qualitative Item Analysis

Qualitative item analysis (Zurawski, R.M.) is a process in which the teacher or an expert carefully proofreads the test before it is administered, to check if there are typographical errors, to avoid grammatical clues that may give away the correct answer, and to ensure that the level of the reading materials is appropriate. These procedures can also include small-group discussions on the quality of the examination and its items with examinees who have already taken the test. According to Cohen, Swerdlik, and Smith (1992) as cited by Zurawski, students who took the examination are asked to express verbally their experience in answering each item in the examination. This procedure can help the teacher determine whether the test takers misunderstood a certain item, and it can also help in determining why they misunderstood it.

IMPROVING TEST ITEMS

As presented in the introduction of this chapter, item analysis enables the teachers to improve and enhance their skills in writing test items. To improve a multiple-choice test item, we shall consider the stem of the item, the distracters, and the key answer.

How to Improve the Test Item

Consider the following examples in analyzing test items, with some notes on how to improve each item based on the results of the item analysis.

Example 1. A class is composed of 40 students. Divide the group into two. Option B is the correct answer. Based on the given data in the table, as a teacher, what would you do with the test item?

Option         A    B*   C    D    E
Upper Group    3    10   4    0    3
Lower Group    4    4    8    0    4

1. Compute the difficulty index.
n = 10 + 4 = 14
N = 40
DF = n / N = 14 / 40
DF = 0.35 or 35%

2. Compute the discrimination index.
CUG = 10
CLG = 4
D = 20
DI = (CUG – CLG) / D = (10 – 4) / 20 = 6 / 20
DI = 0.30 or 30%
3. Make an analysis about the level of difficulty, discrimination and distracters.
a. Only 35% of the examinees got the answer correctly; hence, the item is difficult.
b. More students from the upper group got the answer correctly; hence, it has a positive discrimination.
c. Retain options A, C, and E because most of the students who did not perform well in the overall examination selected them. Those options attract more students from the lower group.
4. Conclusion: Retain the test item but change option D; make it more realistic so that it becomes effective for the upper and lower groups. As a rule of thumb, at least 5% of the examinees should choose each incorrect option.
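The same analysis can be verified with a short, self-contained Python sketch (option tallies copied from the table above; the variable names are illustrative):

```python
upper = {"A": 3, "B": 10, "C": 4, "D": 0, "E": 3}  # B is the keyed answer
lower = {"A": 4, "B": 4, "C": 8, "D": 0, "E": 4}

df = (upper["B"] + lower["B"]) / 40  # 0.35 -> difficult
di = (upper["B"] - lower["B"]) / 20  # 0.30 -> positive discrimination
print(df, di)

# A distracter works if it attracts more lower-group than upper-group students
for option in "ACDE":
    status = "effective" if lower[option] > upper[option] else "ineffective"
    print(option, status)  # D comes out ineffective, as in the conclusion
```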

Example 2. A class is composed of 50 students. Use 27% to get the upper and the lower groups. Analyze the item given the following results. Option D is the correct answer. What will you do with the test item?

Option              A    B    C    D*   E
Upper Group (27%)   3    1    2    6    2
Lower Group (27%)   5    0    4    4    1

1. Compute the difficulty index.
n = 6 + 4 = 10
N = 28
DF = n / N = 10 / 28
DF = 0.36 or 36%

2. Compute the discrimination index.
CUG = 6
CLG = 4
D = 14
DI = (CUG – CLG) / D = (6 – 4) / 14 = 2 / 14
DI = 0.14 or 14%
3. Make an analysis.
a. Only 36% of the examinees got the answer correctly; hence, the item is difficult.
b. More students from the upper group got the answer correctly; hence, it has a positive discrimination.
c. Modify options B and E because more students from the upper group chose them compared with the lower group. They are not effective distracters, because most of the students who performed well in the overall examination selected them as their answer.
d. Retain options A and C because most of the students who did not perform well in the overall examination selected them as the correct answer. Hence, options A and C are effective distracters.
4. Conclusion: Revise the item by modifying options B and E.

Example 3. A class is composed of 50 students. Use 27% to get the upper and the lower groups. Analyze the item given the following results. Option E is the correct answer. What will you do with the test item?

Option              A    B    C    D    E*
Upper Group (27%)   2    3    2    2    5
Lower Group (27%)   2    2    1    1    8

1. Compute the difficulty index.
n = 5 + 8 = 13
N = 28
DF = n / N = 13 / 28
DF = 0.46 or 46%

2. Compute the discrimination index.
CUG = 5
CLG = 8
D = 14
DI = (CUG – CLG) / D = (5 – 8) / 14 = –3 / 14
DI = –0.21 or –21%
3. Make an analysis.
a. 46% of the students got the answer to the test item correctly; hence, the test item is moderately difficult.
b. More students from the lower group got the item correctly; therefore, it has a negative discrimination. The discrimination index is –21%.
c. There is no need to analyze the distracters because the item discriminates negatively.
d. Modify all the distracters because they are not effective. Most of the students in the upper group chose the incorrect options. The options are effective only if most of the students in the lower group choose the incorrect options.
4. Conclusion: Reject the item because it has a negative discrimination index.

Example 4. Potential Miskeyed Item. Make an item analysis of the table below. What will you do with a test item that is a potential miskey?

Option         A*   B    C    D    E
Upper Group    1    2    3    10   4
Lower Group    3    4    4    4    5

1. Compute the difficulty index.
n = 1 + 3 = 4
N = 40
DF = n / N = 4 / 40
DF = 0.10 or 10%

2. Compute the discrimination index.
CUG = 1
CLG = 3
D = 20
DI = (CUG – CLG) / D = (1 – 3) / 20 = –2 / 20
DI = –0.10 or –10%
3. Make an analysis.
a. More students from the upper group chose option D than option A, even though option A is supposedly the correct answer.
b. Most likely, the teacher has written a wrong answer key.
c. The teacher should check whether the keyed answer is really the correct answer.
d. If the teacher miskeyed it, he/she must recheck and retally the scores on the students' test papers before giving them back.
e. If option A is really the correct answer, revise the item to weaken option D; distracters are not supposed to draw more attention than the keyed answer.
f. Only 10% of the students got the answer to the test item correctly; hence, the test item is very difficult.
g. More students from the lower group got the item correctly; therefore, a negative discrimination resulted. The discrimination index is –10%.
h. There is no need to analyze the distracters because the test item is very difficult and discriminates negatively.
4. Conclusion: Reject the item because it is very difficult and has a negative discrimination.

Example 5. Ambiguous Item. Below is the result of the item analysis of a test with an ambiguous test item. What can you say about the item? Are you going to retain, revise, or reject it?

Option         A    B    C    D    E*
Upper Group    7    1    1    2    8
Lower Group    6    2    3    3    6

1. Compute the difficulty index.
n = 8 + 6 = 14
N = 39
DF = n / N = 14 / 39
DF = 0.36 or 36%

2. Compute the discrimination index.
CUG = 8
CLG = 6
D = 20
DI = (CUG – CLG) / D = (8 – 6) / 20 = 2 / 20
DI = 0.10 or 10%
3. Make an analysis.
a. Only 36% of the students got the answer to the test item correctly; hence, the test item is difficult.
b. More students from the upper group got the item correctly; hence, it discriminates positively. The discrimination index is 10%.
c. About equal numbers of top students went for option A and option E; this implies that they could not tell which is the correct answer. The students do not know the content of the test; thus, reteaching is needed.
4. Conclusion: Revise the test item because it is ambiguous.

Example 6. Guessing Item. Below is the result of the item analysis for a test item that students answered mostly by guessing. Are you going to reject, revise, or retain the test item?

Option         A    B    C*   D    E
Upper Group    4    3    4    3    6
Lower Group    3    4    3    4    5

1. Compute the difficulty index.
n = 4 + 3 = 7
N = 39
DF = n / N = 7 / 39
DF = 0.18 or 18%

2. Compute the discrimination index.
CUG = 4
CLG = 3
D = 20
DI = (CUG – CLG) / D = (4 – 3) / 20 = 1 / 20
DI = 0.05 or 5%
3. Make an analysis.
a. Only 18% of the students got the answer to the test item correctly; hence, the test item is very difficult.
b. More students from the upper group got the correct answer; therefore, the test item has a positive discrimination. The discrimination index is 5%.
c. Students responded about equally to all the alternatives, an indication that they were guessing.
There are three possible reasons why students guess the answer on a test item:
• The content of the test item had not yet been discussed in class because the test was designed in advance;
• The test item was so badly written that the students had no idea what the question was really about; and
• The test item was very difficult, as shown by the low difficulty index and low discrimination index.
4. Conclusion: Reject the item because it is very difficult; reteach the material to the class.

Example 7. Ineffective Distracters. The table below shows the item analysis of a test item with ineffective distracters. What can you conclude about the test item?

Option         A    B    C*   D    E
Upper Group    5    3    9    0    3
Lower Group    6    4    6    0    4

1. Compute the difficulty index.
n = 9 + 6 = 15
N = 40
DF = n / N = 15 / 40
DF = 0.38 or 38%

2. Compute the discrimination index.
CUG = 9
CLG = 6
D = 20
DI = (CUG – CLG) / D = (9 – 6) / 20 = 3 / 20
DI = 0.15 or 15%
3. Make an analysis.
a. Only 38% of the students got the answer to the test item correctly; hence, the test item is difficult.
b. More students from the upper group answered the test item correctly; as a result, the test item has a positive discrimination. The discrimination index is 15%.
c. Options A, B, and E are attractive distracters.
d. Option D is ineffective; therefore, replace it with a more realistic one.
4. Conclusion: Revise the item by changing option D.

CHAPTER 5

UTILIZATION OF ASSESSMENT DATA

Learning Outcomes

At the end of this chapter, the students should be able to:

1. Apply statistics in research and in any systematic investigation;


2. Construct frequency distribution for a given set of scores;
3. Graph the scores using histogram and frequency distribution;
4. Calculate the mean, median, mode, decile, quartile, and percentile of the students' scores;
5. Identify the different properties of the measure of central tendency;
6. Identify the uses of the different measures of variability;
7. Calculate the value and make an analysis of range, mean deviation, quartile
deviation, variance and standard deviation of given scores;
8. Differentiate standard deviation from coefficient of variation;
9. Identify the properties of the different measures of variability;
10. Apply the concept of skewness in identifying the performance of the students;
11. Determine the spread of scores using the measures of variation;
12. Compare the performance of the students using measures of central tendency
and measure of variability;
13. Convert raw scores to standard scores;
14. Determine the relationship of two groups of scores; and
15. Compute the r and ρ values of scores and make an analysis.

INTRODUCTION

Statistics is a very important tool in the utilization of assessment data, most especially in describing, analyzing, and interpreting the performance of the students in the assessment procedures. The teachers should have the necessary background in the statistical procedures used in the assessment of student learning in order to give a correct description and interpretation of the achievement of the students in a certain test, whether it is a classroom assessment conducted by the teacher or a division or national assessment conducted by the Department of Education.

In this chapter, we shall discuss the important tools in analyzing and interpreting
assessment results. These statistical tools are measures of central tendency, measures
of variation, skewness, correlation, and different types of converted scores.

DEFINITION OF STATISTICS

Statistics is a branch of science, which deals with the collection, presentation,


analysis and interpretation of quantitative data.

Branches of Statistics

Descriptive statistics is a method concerned with collecting, describing, and analyzing a set of data without drawing conclusions (or inferences) about a larger group.

Inferential statistics is a branch of statistics, concerned with the analysis of a


subset of data leading to predictions or inferences about the entire set of data.

FREQUENCY DISTRIBUTION

Frequency distribution is a tabular arrangement of data into appropriate categories showing the number of observations in each category or group. It has two major advantages: (a) it condenses the data into a more compact form; and (b) it makes the data easier to interpret.

Parts of Frequency Table

1. Class Limit is the grouping or categories defined by the lower and upper limits.
Examples: LL – UL
10 – 14
15 – 19
20 – 24
Lower class limit (LL) represents the smallest number in each group.
Upper class limit (UL) represents the highest number in each group.
2. Class size (c.i) is the width of each class interval.
Examples: LL – UL
10 – 14
15 – 19
20 – 24

The class size in this score distribution is 5.

3. Class boundaries are the numbers used to separate each category in the frequency distribution without the gaps created by the class limits. The scores of the students are discrete. Add 0.5 to the upper limit to get the upper class boundary and subtract 0.5 from the lower limit to get the lower class boundary of each group or category.
Examples: LL – UL LCB - UCB
10 – 14 9.5 – 14.5
15 – 19 14.5 – 19.5
20 – 24 19.5 – 24.5
4. Class marks are the midpoints of the lower and upper class limits. The formula is XM = (LL + UL) / 2.
Examples: LL – UL    XM
          10 – 14    12
          15 – 19    17
          20 – 24    22

Steps in Constructing Frequency Distribution

1. Compute the value of the range (R). The range is the difference between the highest score and the lowest score: R = HS – LS.
2. Determine the class size (c.i). The class size is the quotient when you divide the range by the desired number of classes or categories. The desired number of classes is usually 5, 10, or 15, depending on the number of scores in the distribution. If the desired number of classes is not identified, use k = 1 + 3.3 log n and c.i = R / k.
3. Set up the class limits of each class or category. Each class is defined by a lower limit and an upper limit. Use the lowest score as the lower limit of the first class.
4. Set up the class boundaries if needed. The boundary between two consecutive classes is (UL of the first class + LL of the second class) / 2.
5. Tally the scores in the appropriate classes.
6. Find the other parts if necessary, such as the class marks, among others.

Example: Raw scores of 40 students in a 50-item mathematics quiz. Construct a frequency distribution following the steps given previously.

17  25  30  33  25  45  23  19
27  35  45  48  20  38  39  18
44  22  46  26  36  29  15 (LS)  21
50 (HS)  47  34  26  37  25  33  49
22  33  44  38  46  41  37  32

R = HS – LS = 50 – 15 = 35
n = 40

Solve for the value of k:
k = 1 + 3.3 log n
k = 1 + 3.3 log 40
k = 1 + 3.3(1.602059991)
k = 1 + 5.286797971
k = 6.286797971, or approximately 6

Find the class size:
c.i = R / k
c.i = 35 / 6
c.i = 5.833, or approximately 6

Construct the class limits starting with the lowest score as the lower limit of the first category. The last category should contain the highest score in the distribution. Each category (X) should have a width of 6. Count the number of scores that fall in each category (f).

X         Tally         Frequency (f)
15 – 20   ////          4
21 – 26   /////////     9
27 – 32   ///           3
33 – 38   //////////    10
39 – 44   ////          4
45 – 50   //////////    10
n = 40

Find the class boundaries and class marks of the given score distribution.

X         f    Class Boundaries   XM
15 – 20   4    14.5 – 20.5        17.5
21 – 26   9    20.5 – 26.5        23.5
27 – 32   3    26.5 – 32.5        29.5
33 – 38   10   32.5 – 38.5        35.5
39 – 44   4    38.5 – 44.5        41.5
45 – 50   10   44.5 – 50.5        47.5
n = 40
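The whole construction can be automated. The following Python sketch (illustrative, not part of the text) computes the range, the number of classes by k = 1 + 3.3 log n, and the class size, then tallies the scores; note that a machine tally may differ slightly from a hand tally if any score was transcribed differently:

```python
import math

scores = [17, 25, 30, 33, 25, 45, 23, 19, 27, 35, 45, 48, 20, 38, 39, 18,
          44, 22, 46, 26, 36, 29, 15, 21, 50, 47, 34, 26, 37, 25, 33, 49,
          22, 33, 44, 38, 46, 41, 37, 32]

R = max(scores) - min(scores)               # 50 - 15 = 35
k = int(1 + 3.3 * math.log10(len(scores)))  # 6.29 truncated to 6 classes
ci = math.ceil(R / k)                       # 35 / 6 = 5.83, rounded up to 6

lower = min(scores)
while lower <= max(scores):
    upper = lower + ci - 1                        # class limits LL - UL
    f = sum(lower <= s <= upper for s in scores)  # frequency
    xm = (lower + upper) / 2                      # class mark
    print(f"{lower}-{upper}  boundaries {lower - 0.5}-{upper + 0.5}  "
          f"Xm={xm}  f={f}")
    lower += ci
```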

Graphical Representation of Scores in Frequency Distribution

The scores expressed in a frequency distribution can be meaningful and easier to interpret when they are graphed. There are three methods of graphing a frequency distribution: the bar graph or histogram, the frequency polygon, and the smooth curve. The histogram and the frequency polygon will be discussed in this section, while the smooth curve will be discussed later under skewness.

A histogram consists of a set of rectangles having bases on the horizontal axis which center at the class marks. The base widths correspond to the class size and the heights of the rectangles correspond to the class frequencies. The histogram is best used for the graphical representation of discrete or non-continuous data.

A frequency polygon is constructed by plotting the class marks against the class frequencies: the x-axis corresponds to the class marks and the y-axis to the class frequencies. Connect the points consecutively using straight lines. The frequency polygon is best used for representing continuous data such as the scores of students in a given test.

Construct a histogram and a frequency polygon using the frequency distribution of the 40 students in the 50-item mathematics quiz previously discussed.

X         Frequency (f)
15 – 20   4
21 – 26   9
27 – 32   3
33 – 38   10
39 – 44   4
45 – 50   10
n = 40
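One common way to draw both graphs is with matplotlib (an assumption of this sketch; the library is not mentioned in the text):

```python
import matplotlib.pyplot as plt

class_marks = [17.5, 23.5, 29.5, 35.5, 41.5, 47.5]
frequencies = [4, 9, 3, 10, 4, 10]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Histogram: bars centered at the class marks, widths equal to the class size
ax1.bar(class_marks, frequencies, width=6, edgecolor="black")
ax1.set(title="Histogram", xlabel="Class mark", ylabel="Frequency")

# Frequency polygon: class marks on the x-axis joined by straight lines
ax2.plot(class_marks, frequencies, marker="o")
ax2.set(title="Frequency polygon", xlabel="Class mark", ylabel="Frequency")

plt.tight_layout()
plt.show()
```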

DESCRIBING GROUP PERFORMANCE

There are two major concepts in describing the assessed performance of a group: measures of central tendency and measures of variability. Measures of central tendency are used to determine the average score of a group, while measures of variability indicate the spread of the scores in the group. These two concepts are very important and helpful in understanding the performance of the group.

Measure of Central Tendency

A measure of central tendency provides a very convenient way of describing a set of scores with a single number that describes the performance of the group. It is also defined as a single value that is used to describe the "center" of the data, and is thought of as a typical value in a given distribution. There are three commonly used measures of central tendency: the mean, the median, and the mode. In this section, we shall discuss how to compute their values and some of their properties as applied in a classroom setting.

1. Mean
The mean is the most commonly used measure of the center of data and is also referred to as the "arithmetic average."

Computation of the Population Mean
μ = ΣX / N = (x1 + x2 + x3 + ... + xN) / N

Computation of the Sample Mean
x̄ = ΣX / n = (x1 + x2 + x3 + ... + xn) / n

Computation of the Mean for Ungrouped Data

1. x̄ = ΣX / n
2. x̄ = Σfx / n

Example 1: The scores of 15 students in a Mathematics I quiz consisting of 25 items are 25, 20, 18, 18, 17, 15, 15, 15, 14, 14, 13, 12, 12, 10, 10. The highest score is 25 and the lowest score is 10. Find the mean of the scores.

Σx = 228
n = 15

x̄ = ΣX / n = 228 / 15 = 15.2

Analysis:

The average performance of the 15 students who participated in the mathematics quiz consisting of 25 items is 15.2. The implication is that students who got scores below 15.2 did not perform well in the said examination, while students who got scores higher than 15.2 performed well compared to the performance of the whole class.
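In Python the same result follows directly (a sketch, not part of the text):

```python
scores = [25, 20, 18, 18, 17, 15, 15, 15, 14, 14, 13, 12, 12, 10, 10]
mean = sum(scores) / len(scores)
print(mean)  # 228 / 15 = 15.2
```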

Example 2: Find the Grade Point Average (GPA) of Ritz Glenn for the first
semester of the school year 2010 – 2011. Use the table below:

Subject Grade (xi) Units (wi) (wi) (xi)


BM 112 1.25 3 3.75
BM 101 1.00 3 3.00
AC 103N 1.25 6 7.50
BEC 111 1.00 3 3.00
MGE 101 1.50 3 4.50
MKM 101 1.25 3 3.75
FM 111 1.50 3 4.50
PEN 2 1.00 2 2.00
Ʃ(wi) = 26 Ʃ(wi) (xi) = 32.00

x̄ = Σ(wi)(xi) / Σwi
x̄ = 32 / 26
x̄ = 1.23

The Grade Point Average of Ritz Glenn for the first semester SY 2010 – 2011 is 1.23.
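A short sketch of the weighted-mean computation (grades and units copied from the table; the names are illustrative):

```python
grades = [1.25, 1.00, 1.25, 1.00, 1.50, 1.25, 1.50, 1.00]
units = [3, 3, 6, 3, 3, 3, 3, 2]

# GPA = sum of (grade x units) divided by the total units
gpa = sum(g * u for g, u in zip(grades, units)) / sum(units)
print(round(gpa, 2))  # 32 / 26 = 1.23
```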

Mean for Grouped Data

Grouped data are the data or scores that are arranged in a frequency
distribution. Frequency distribution is the arrangement of scores according to
category of classes including the frequency. Frequency is the number of observations
falling in a category.

For this particular lesson we shall discuss only one formula for solving the mean of grouped data, the midpoint method. The formula is:

x̄ = ΣfXm / n

where x̄ = mean value
f = frequency of each class or category
Xm = midpoint of each class or category
ΣfXm = sum of the products of f and Xm

Steps in Solving the Mean for Grouped Data

1. Find the midpoint or class mark (Xm) of each class or category using the formula Xm = (LL + UL) / 2.
2. Multiply the frequency by the corresponding class mark (fXm).
3. Find the sum of the results in step 2.
4. Solve for the mean using the formula x̄ = ΣfXm / n.

Example 3: The scores of 40 students in a science class consisting of 60 items are tabulated below.

X F Xm 𝐟𝑿𝒎
10 – 14 5 12 60
15 – 19 2 17 34
20 – 24 3 22 66
25 – 29 5 27 135
30 – 34 2 32 64
35 – 39 9 37 333
40 – 44 6 42 252
45 – 49 3 47 141
50 – 54 5 52 260
n = 40 Ʃf𝑋𝑚 = 1 345

x̄ = ΣfXm / n
x̄ = 1,345 / 40
x̄ = 33.63

Analysis:

The mean performance of the 40 students in the science quiz is 33.63. Those students who got scores below 33.63 did not perform well in the said examination, while those who got scores above 33.63 performed well.
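A sketch of the midpoint method in Python, with each class written as an (LL, UL, f) tuple (an illustrative representation, not from the text):

```python
classes = [(10, 14, 5), (15, 19, 2), (20, 24, 3), (25, 29, 5), (30, 34, 2),
           (35, 39, 9), (40, 44, 6), (45, 49, 3), (50, 54, 5)]

n = sum(f for _, _, f in classes)                          # 40
sum_fxm = sum(f * (ll + ul) / 2 for ll, ul, f in classes)  # sum of f*Xm = 1345
print(sum_fxm / n)  # 33.625, reported as 33.63 in the text
```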

Properties of the Mean

1. It measures stability. Mean is the most stable among other measures of


central tendency because every score contributes to the value of the mean.
2. The sum of each score’s distance from the mean is zero.
3. It is easily affected by the extreme scores.

4. It may not be an actual score in the distribution.
5. It can be applied to interval level of measurement.
6. It is very easy to compute.

When to Use the Mean

1. Sampling stability is desired.


2. Other measures are to be computed such as standard deviation, coefficient of
variation and skewness.

2. Median

The median is the second type of measure of central tendency. It is the value that divides the scores in a distribution into two equal parts: fifty percent (50%) of the scores lie below the median value and 50% lie above it. It is also known as the middle score or the 50th percentile. For classroom purposes, the first thing to do is to arrange the scores in proper order, that is, from the lowest score to the highest or from the highest score to the lowest. When the number of cases is odd, the median is the score that has the same number of scores below and above it. When the number of cases is even, the median is the average of the two middlemost scores.

Median of Ungrouped Data

1. Arrange the scores (from lowest to highest or highest to lowest).


2. Determine the middle most score in a distribution if n is an odd
number and get the average of the two middle most scores if n is an
even number.

Example 1: Find the median score of 7 students in an English class.

X (score)
19
17
16
15
10
5
2

Analysis:

The median score is 15. Fifty percent (50%) or three of the scores are above 15
(19,17,16) and 50% or three of the scores are below 15 (10,5,2).

Example 2: Find the median score of 8 students in an English class.

X (score)
30
19
17
16
15
10
5
2

x̃ = (16 + 15) / 2
x̃ = 15.5

Analysis:

The median score is 15.5, which means that 50% of the scores in the distribution are lower than 15.5 (15, 10, 5, and 2) and 50% are greater than 15.5 (30, 19, 17, 16); that is, four scores are below 15.5 and four scores are above 15.5.
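Both cases can be handled by one small helper (a Python sketch; the function name is ours):

```python
def median(scores):
    s = sorted(scores)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]                 # odd n: the middlemost score
    return (s[mid - 1] + s[mid]) / 2  # even n: average of the two middlemost

print(median([19, 17, 16, 15, 10, 5, 2]))      # 15   (Example 1)
print(median([30, 19, 17, 16, 15, 10, 5, 2]))  # 15.5 (Example 2)
```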

Median of Grouped Data

Formula:

x̃ = LB + [(n/2 − cfp) / fm] c.i

x̃ = median value
MC = median class, the category containing the (n/2)th score
LB = lower boundary of the median class (MC)
cfp = cumulative frequency before the median class when the scores are arranged from lowest to highest
fm = frequency of the median class
c.i = size of the class interval

Steps in Solving Median for Grouped Data

1. Complete the table for cf< (less-than cumulative frequency).
2. Get n/2 of the scores in the distribution so that you can identify the median class (MC).
3. Determine LB, cfp, fm, and c.i.
4. Solve for the median using the formula.

Example 3: The scores of 40 students in a science class consisting of 60 items are tabulated below. The highest score is 54 and the lowest score is 10.

X F cf<
10 – 14 5 5
15 – 19 2 7
20 – 24 3 10
25 - 29 5 15
30 – 34 2 17 (cfp)
35 – 39 9 (fm) 26
40 - 44 6 32
45 – 49 3 35
50 - 54 5 40
n = 40

Solution:

n/2 = 40/2 = 20

The category containing the (n/2)th score is 35 – 39, so MC = 35 – 39.

LL of the MC = 35
LB = 34.5
cfp = 17
fm = 9
c.i = 5

x̃ = LB + [(n/2 − cfp) / fm] c.i
  = 34.5 + [(20 − 17) / 9] 5
  = 34.5 + [3/9] 5
  = 34.5 + 15/9
  = 34.5 + 1.67
x̃ = 36.17

Analysis:

The median value is 36.17, which means that 50% or 20 scores are less
than 36.17.
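The interpolation can be coded directly from the formula; a sketch using the same (LL, UL, f) representation as earlier (illustrative only):

```python
classes = [(10, 14, 5), (15, 19, 2), (20, 24, 3), (25, 29, 5), (30, 34, 2),
           (35, 39, 9), (40, 44, 6), (45, 49, 3), (50, 54, 5)]

n = sum(f for _, _, f in classes)
half = n / 2         # 20
cum = 0              # running cf<
for ll, ul, f in classes:
    if cum + f >= half:   # the median class contains the (n/2)th score
        lb = ll - 0.5     # lower boundary of the median class
        ci = ul - ll + 1  # class size
        print(round(lb + (half - cum) / f * ci, 2))  # 34.5 + (3/9)*5 = 36.17
        break
    cum += f
```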

Properties of the Median

1. It may not be an actual observation in the data set.


2. It can be applied in ordinal level.
3. It is not affected by extreme values because median is a positional measure.

When to Use the Median

1. The exact midpoint of the score distribution is desired.


2. There are extreme scores in the distribution.

3. Mode

The mode is the third measure of central tendency. The mode or modal score is the score or scores that occur most frequently in the distribution. A distribution is classified as unimodal, bimodal, trimodal, or multimodal. Unimodal is a distribution of scores that consists of only one mode. Bimodal is a distribution of scores that consists of two modes. Trimodal is a distribution of scores that consists of three modes; a distribution with more than two modes is also called multimodal.

Example 1. Scores of 10 students of Section A, Section B, and Section C

Scores of Section A Scores of Section B Scores of Section C


25 25 25
24 24 25
24 24 25
20 20 22
20 18 21
20 18 21
16 17 21
12 10 18
10 9 18
7 7 18

The score that appeared most in Section A is 20; hence, the mode of Section A is 20. There is only one mode, therefore the score distribution is called unimodal. The modes of Section B are 18 and 24, since both scores appeared twice. There are two modes in Section B; hence, the distribution is a bimodal distribution. The modes for Section C are 18, 21, and 25. There are three modes for Section C; therefore, it is called a trimodal or multimodal distribution.

Mode for Grouped Data

In solving for the mode using grouped data, use the formula:

x̂ = LB + [d1 / (d1 + d2)] c.i

LB = lower boundary of the modal class

Modal class (MC) = the category containing the highest frequency

d1 = difference between the frequency of the modal class and the frequency above it, when the scores are arranged from lowest to highest

d2 = difference between the frequency of the modal class and the frequency below it, when the scores are arranged from lowest to highest

c.i = size of the class interval

Example 2. The scores of 40 students in a science class consisting of 60 items are tabulated below.

x f
10 – 14 5
15 – 19 2
20 – 24 3
25 – 29 5
30 – 34 2
35 – 39 9
40 – 44 6
45 – 49 3
50 – 54 5
n = 40

Modal Class = 35 – 39

LL of MC = 35
LB = 34.5
d1 = 9 – 2 = 7
d2 = 9 – 6 = 3
c.i = 5

x̂ = LB + [d1 / (d1 + d2)] c.i
  = 34.5 + [7 / (7 + 3)] 5
  = 34.5 + 35/10
  = 34.5 + 3.5
x̂ = 38

The mode of the score distribution that consists of 40 students is 38.
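A corresponding Python sketch for the grouped mode (same table representation; d1 and d2 are taken against the rows above and below the modal class):

```python
classes = [(10, 14, 5), (15, 19, 2), (20, 24, 3), (25, 29, 5), (30, 34, 2),
           (35, 39, 9), (40, 44, 6), (45, 49, 3), (50, 54, 5)]

freqs = [f for _, _, f in classes]
i = freqs.index(max(freqs))  # modal class 35-39 (highest frequency, 9)
ll, ul, f = classes[i]
lb = ll - 0.5                # lower boundary of the modal class
ci = ul - ll + 1             # class size
d1 = f - freqs[i - 1]        # 9 - 2 = 7, against the class above it in the table
d2 = f - freqs[i + 1]        # 9 - 6 = 3, against the class below it
print(lb + d1 / (d1 + d2) * ci)  # 34.5 + 3.5 = 38.0
```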

Properties of the Mode

1. It can be used when the data are qualitative as well as quantitative.


2. It may not be unique.
3. It is not affected by extreme values.
4. It may not exist.

When to Use the Mode

1. When the “typical” value is desired.


2. When the data set is measured on a nominal scale.

4. Quantiles

A quantile is a score point that divides a score distribution into equal parts. There are three kinds of quantiles. The quartile is a score point that divides the scores in the distribution into four (4) equal parts. The decile is a score point that divides the scores in the distribution into ten (10) equal parts. The percentile is a score point that divides the scores in the distribution into one hundred (100) equal parts.

[Diagram: the distribution from the lowest score (LS) to the highest score (HS) divided into quartiles (Q1, Q2, Q3), deciles (D1 to D9), and percentiles (P1 to P99).]

Diagram for Quartile, Decile and Percentiles

Quantiles for Ungrouped Data

a. Quartile for Ungrouped Data

Qk = [(k/4)n + (1 − k/4)]th score

Q1 = [(1/4)n + (1 − 1/4)]th score
Q2 = [(2/4)n + (1 − 2/4)]th score
Q3 = [(3/4)n + (1 − 3/4)]th score

where
Qk = the indicated quartile
k = 1, 2, 3
n = number of cases

b. Decile for Ungrouped Data

Dk = [(k/10)n + (1 − k/10)]th score

D1 = [(1/10)n + (1 − 1/10)]th score
D2 = [(2/10)n + (1 − 2/10)]th score
D3 = [(3/10)n + (1 − 3/10)]th score
...
D9 = [(9/10)n + (1 − 9/10)]th score

where
Dk = the indicated decile
k = 1, 2, 3, 4, 5, 6, 7, 8, 9
n = number of cases

c. Percentile for Ungrouped Data

Pk = [(k/100)n + (1 − k/100)]th score

P1 = [(1/100)n + (1 − 1/100)]th score
P2 = [(2/100)n + (1 − 2/100)]th score
P3 = [(3/100)n + (1 − 3/100)]th score
P25 = [(25/100)n + (1 − 25/100)]th score
P50 = [(50/100)n + (1 − 50/100)]th score
P77 = [(77/100)n + (1 − 77/100)]th score
P90 = [(90/100)n + (1 − 90/100)]th score
P98 = [(98/100)n + (1 − 98/100)]th score
P99 = [(99/100)n + (1 − 99/100)]th score

where
Pk = the indicated percentile
k = 1, 2, 3, 4, ..., 97, 98, 99
n = number of cases

Example:

Using the given data 6, 8, 10, 12, 12, 14, 15, 16, 20, find Q1, Q3, D6, D9, P65, and P99.

x (score)
6
8
10
12
12
14
15
16
20
1. Solve for the value of Q1.
n = 9
Q1 = [(1/4)(9) + (1 − 1/4)]th score
   = [9/4 + 3/4]th score
   = [12/4]th score
Q1 = 3rd score

The value of Q1 is 10, which is the 3rd score in the distribution.

Therefore, 25% of the scores are below 10.

2. Solve for the value of Q3.
Q3 = [(3/4)(9) + (1 − 3/4)]th score
   = [27/4 + 1/4]th score
   = [28/4]th score
Q3 = 7th score

The 7th score in the distribution is 15, so Q3 = 15.

Hence, 75% of the scores in the distribution are less than 15.
3. Solve for the value of D6.
D6 = [(6/10)(9) + (1 − 6/10)]th score
   = [54/10 + 4/10]th score
   = [58/10]th score
D6 = 5.8th score

The value of D6 is the sum of the 5th score and 80% of the difference between the 6th and 5th scores.
D6 = 5th score + 0.80(6th score – 5th score)
   = 12 + 0.80(14 – 12)
   = 12 + 0.80(2)
   = 12 + 1.60
D6 = 13.60

Therefore, 60% of the scores in the distribution are less than 13.60.

4. Solve for the value of D9.
D9 = [(9/10)(9) + (1 − 9/10)]th score
   = [81/10 + 1/10]th score
   = [82/10]th score
D9 = 8.2th score

The value of D9 lies between the 8th and 9th scores; it is the sum of the 8th score and 20% of the difference between the 9th and 8th scores.
D9 = 8th score + 0.20(9th score – 8th score)
D9 = 16 + 0.20(20 – 16)
D9 = 16 + 0.80
D9 = 16.80

Therefore, 90% of the scores in the distribution are less than 16.80.
5. Solve for the value of P65.
P65 = [(65/100)(9) + (1 − 65/100)]th score
    = [585/100 + 35/100]th score
    = [620/100]th score
P65 = 6.20th score

P65 lies between the 6th and 7th scores; it is the sum of the 6th score and 20% of the difference between the 7th and 6th scores.
P65 = 6th score + 0.20(7th score – 6th score)
    = 14 + 0.20(15 – 14)
    = 14 + 0.20(1)
    = 14 + 0.20
P65 = 14.20

Therefore, 65% of the scores in the distribution are less than 14.20.

6. Solve for the value of P99.
P99 = [(99/100)(9) + (1 − 99/100)]th score
    = [891/100 + 1/100]th score
    = [892/100]th score
P99 = 8.92th score

P99 lies between the 8th and 9th scores; it is the sum of the 8th score and 92% of the difference between the 9th and 8th scores.
P99 = 8th score + 0.92(9th score – 8th score)
P99 = 16 + 0.92(20 – 16)
P99 = 16 + 3.68
P99 = 19.68

Hence, 99% of the scores in the distribution are less than 19.68.
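All six computations follow one pattern, so a single helper suffices. A Python sketch of the interpolation rule used above (the function name is ours):

```python
def quantile(scores, k, divisions):
    """Position = (k/divisions)*n + (1 - k/divisions); the whole part picks a
    score (1-based) and the fractional part interpolates toward the next one."""
    s = sorted(scores)
    pos = (k / divisions) * len(s) + (1 - k / divisions)
    i = int(pos)
    frac = pos - i
    if frac == 0 or i >= len(s):
        return s[i - 1]
    return round(s[i - 1] + frac * (s[i] - s[i - 1]), 2)

data = [6, 8, 10, 12, 12, 14, 15, 16, 20]
print(quantile(data, 1, 4))     # Q1  = 10
print(quantile(data, 3, 4))     # Q3  = 15
print(quantile(data, 6, 10))    # D6  = 13.6
print(quantile(data, 9, 10))    # D9  = 16.8
print(quantile(data, 65, 100))  # P65 = 14.2
print(quantile(data, 99, 100))  # P99 = 19.68
```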

Quantiles for Grouped Data

a. Quartile for Grouped Data

The general formula for the quartile is Qk = LB + [(kn/4 − cfp) / fq] c.i, where

Qk = the indicated quartile
k = 1, 2, and 3
LB = lower boundary of the quartile class
cfp = cumulative frequency before the quartile class when scores are arranged from lowest to highest
fq = frequency of the quartile class
c.i = size of the class interval
QC = the quartile class, the class or category containing n/4 for Q1, 2n/4 for Q2, and 3n/4 for Q3

Q1 = LB1 + [(n/4 − cfp1) / fq1] c.i
Q2 = LB2 + [(2n/4 − cfp2) / fq2] c.i
Q3 = LB3 + [(3n/4 − cfp3) / fq3] c.i

Sample Computations of Quartile Using Grouped Data

Example 1: The data for the scores of fifty (50) students in a Filipino class are given below. Solve for the value of Q1.

x f cf<
25 – 32 3 3
33 – 40 7 10
41 – 48 5 15
49 – 56 4 19
57 – 64 12 31
65 – 72 6 37
73 –80 8 45
81 – 88 3 48
89 – 97 2 50
n = 50
Solution:
n/4 = 50/4 = 12.5
Q1C = 41 – 48
LL = 41
LB = 40.5
cfp = 10
fq = 5
c.i = 8

Q1 = LB1 + [(n/4 − cfp1) / fq1] c.i
Q1 = 40.5 + [(12.5 − 10) / 5] 8
Q1 = 40.5 + [2.5/5] 8
Q1 = 40.5 + 4
Q1 = 44.50

Therefore, 25% of the scores of the 50 students who participated in the test are less than 44.50.

Example 2: The data for the scores of fifty (50) students in a Filipino class are given below. Solve for the value of Q3.

x f cf<
25 – 32 3 3
33 – 40 7 10
41 – 48 5 15
49 – 56 4 19
57 – 64 12 31
65 – 72 6 37
73 – 80 8 45
81 – 88 3 48
89 – 97 2 50
n = 50
Solution:
3n/4 = 3(50)/4 = 37.5
Q3C = 73 – 80
LL = 73
LB = 72.5
cfp = 37
fq = 8
c.i = 8

Q3 = LB3 + [(3n/4 − cfp3) / fq3] c.i
Q3 = 72.5 + [(37.5 − 37) / 8] 8
Q3 = 72.5 + [0.5/8] 8
Q3 = 72.5 + 0.5
Q3 = 73.00

Therefore, 75% of the scores in the distribution are less than 73.
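Since the quartile, decile, and percentile formulas differ only in the divisor, one Python helper can reproduce all the grouped examples in this section (a sketch; names are ours):

```python
def grouped_quantile(classes, k, divisions):
    """LB + ((k*n/divisions - cfp) / f) * c.i over (LL, UL, f) classes arranged
    from lowest to highest; divisions = 4, 10, or 100."""
    n = sum(f for _, _, f in classes)
    target = k * n / divisions
    cum = 0
    for ll, ul, f in classes:
        if cum + f >= target:
            return (ll - 0.5) + (target - cum) / f * (ul - ll + 1)
        cum += f

filipino = [(25, 32, 3), (33, 40, 7), (41, 48, 5), (49, 56, 4), (57, 64, 12),
            (65, 72, 6), (73, 80, 8), (81, 88, 3), (89, 97, 2)]

print(grouped_quantile(filipino, 1, 4))     # Q1  = 44.5
print(grouped_quantile(filipino, 3, 4))     # Q3  = 73.0
print(grouped_quantile(filipino, 5, 10))    # D5  = 60.5 (next subsection)
print(grouped_quantile(filipino, 82, 100))  # P82 = 76.5 (next subsection)
```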

b. Decile for Grouped Data

The general formula for deciles of grouped data is Dk = LB + [(kn/10 − cfp) / fd] c.i, where

Dk = the indicated decile
k = 1, 2, 3, 4, 5, 6, 7, 8, 9
LB = lower boundary of the indicated decile class
DC = the decile class, the class or category containing n/10 for D1, 2n/10 for D2, 3n/10 for D3, ..., 9n/10 for D9
cfp = cumulative frequency before the indicated decile class when scores are arranged from lowest to highest
fd = frequency of the indicated decile class
c.i = size of the class interval

D1 = LB + [(1n/10 − cfp) / fd] c.i
D2 = LB + [(2n/10 − cfp) / fd] c.i
D3 = LB + [(3n/10 − cfp) / fd] c.i
...
D9 = LB + [(9n/10 − cfp) / fd] c.i

Sample Computation of Decile Using Grouped Data

Example 3: The data for the scores of fifty (50) students in a Filipino class are given below. Solve for the value of D5.

x f cf<
25 – 32 3 3
33 – 40 7 10
41 – 48 5 15
49 – 56 4 19
57 – 64 12 31
65 – 72 6 37
73 – 80 8 45
81 – 88 3 48
89 – 97 2 50
n = 50
Solution:
5n/10 = 5(50)/10 = 250/10 = 25
D5C = 57 – 64
LL = 57
LB = 56.5
cfp = 19
fd = 12
c.i = 8

D5 = LB + [(5n/10 − cfp) / fd] c.i
D5 = 56.5 + [(25 − 19) / 12] 8
D5 = 56.5 + [6/12] 8
D5 = 56.5 + 4
D5 = 60.5

Hence, 50% of the scores of the 50 students are less than 60.5.

Example 4: The data for the scores of fifty (50) students in a Filipino class are given below. Solve for the value of D7.

x f cf<
25 – 32 3 3
33 – 40 7 10
41 – 48 5 15
49 – 56 4 19
57 – 64 12 31
65 – 72 6 37
73 – 80 8 45
81 – 88 3 48
89 – 97 2 50
n = 50
Solution:
7n/10 = 7(50)/10 = 350/10 = 35
D7C = 65 – 72
LL = 65
LB = 64.5
cfp = 31
fd = 6
c.i = 8

D7 = LB + [(7n/10 − cfp) / fd] c.i
D7 = 64.5 + [(35 − 31) / 6] 8
D7 = 64.5 + [4/6] 8
D7 = 64.5 + 32/6
D7 = 64.5 + 5.33
D7 = 69.83

Therefore, 70% of the scores of the students are less than 69.83.


c. Percentile for Grouped Data

The general formula for percentiles of grouped data is Pk = LB + [(kn/100 − cfp) / fd] c.i, where

Pk = the indicated percentile
k = 1, 2, 3, 4, ..., 97, 98, 99
LB = lower boundary of the indicated percentile class
PC = the percentile class, the class containing n/100 for P1, 2n/100 for P2, 3n/100 for P3, ..., 98n/100 for P98, 99n/100 for P99
cfp = cumulative frequency before the indicated percentile class when scores are arranged from lowest to highest
fd = frequency of the indicated percentile class
c.i = size of the class interval

To derive the formula for a particular percentile, just change the value of k to the indicated percentile; there are 99 such formulas. Some of them are the following:

P1 = LB + [(1n/100 − cfp) / fd] c.i
P2 = LB + [(2n/100 − cfp) / fd] c.i
P10 = LB + [(10n/100 − cfp) / fd] c.i
P25 = LB + [(25n/100 − cfp) / fd] c.i
P50 = LB + [(50n/100 − cfp) / fd] c.i
P75 = LB + [(75n/100 − cfp) / fd] c.i
P90 = LB + [(90n/100 − cfp) / fd] c.i
P95 = LB + [(95n/100 − cfp) / fd] c.i
P99 = LB + [(99n/100 − cfp) / fd] c.i

Sample Computations of Percentile Using Grouped Data


Example 5: The data for the scores of fifty (50) students is Fillipino class. Data
are given below. Solve the value of P82.
x f cf<
25 – 32 3 3
33 – 40 7 10
41 – 48 5 15
49 – 56 4 19
57 – 64 12 31
65 – 72 6 37
73 – 80 8 45
81 – 88 3 48
89 – 97 2 50
n = 50
Solution:
(82)𝑛 72(50) 4100
= = = 41
100 100 100
P82C = 73 – 80
LL = 73
LB = 72.5
cfp = 37
fd = 8
c.i = 8
(82)𝑛
− cfp
𝑃7 = 𝐿𝐵 + [ 100 ]c.i
fd
41 −37
𝑃82 = 72.5 + [ ]8
8
4
𝑃82 = 72.5 + [8]8
32
𝑃82 = 72.5 +
8
𝑃82 = 72.5 + 4
𝑃82 = 76.5
Therefore, 82% of the scores of 50 students are less than 76.5

Example 6: The data for the scores of fifty (50) students in a Filipino class are given below. Solve for the value of P91.

x f cf<
25 – 32 3 3
33 – 40 7 10
41 – 48 5 15
49 – 56 4 19
57 – 64 12 31
65 – 72 6 37
73 – 80 8 45
81 – 88 3 48
89 – 97 2 50
n = 50
Solution:
91n/100 = 91(50)/100 = 4550/100 = 45.50
P91C = 81 – 88
LL = 81
LB = 80.50
cfp = 45
fd = 3
c.i = 8

P91 = LB + [(91n/100 − cfp) / fd] c.i
P91 = 80.50 + [(45.50 − 45) / 3] 8
P91 = 80.50 + [0.50/3] 8
P91 = 80.50 + 4/3
P91 = 80.50 + 1.33
P91 = 81.83

Hence, 91% of the scores of the 50 students are less than 81.83.

Measures of Variation

In the previous section, we discussed the measures of central tendency used to describe a distribution of scores. However, a measure of central tendency does not uniquely describe a distribution, most especially if we want to know how close to or how far from the average performance of the group the scores of the students in a certain test are. It is in this line that we make use of the measures of variation. A measure of variation is a single value that is used to describe the spread of the scores in a distribution. The term variation is also known as variability or dispersion. There are several ways of describing the variation of scores: absolute measures of variation and relative measures of variation.

Let us consider the scores of students in three sections of a mathematics class. We shall consider the spread of the scores based on a graphical presentation.

Section A Section B Section C


12 12 12
12 12 12
14 12 12
15 13 12
17 13 12
18 14 12
18 17 13
18 20 26
19 20 26
23 28 26
23 28 26
30 30 30
x̅ = 18.25 x̅ = 18.25 x̅ = 18.25
S = 5.15 S = 6.92 S = 7.63

[Diagram: the scores of Sections A, B, and C plotted on a number line from 12 to 30; all three sections have a mean of 18.25, with standard deviations of 5.15, 6.92, and 7.63, respectively.]

What can you observe about the mean and the standard deviation of the three
groups of scores?

Which group of students performed well in the class?

Which group of scores is most widespread? Less scattered?

Before answering such questions, let us first discuss the different types of
measures of variation.

Types of Absolute Measures of Variation

There are four kinds of absolute measures of variation: the range; the inter-quartile range and quartile deviation; the mean deviation; and the variance and standard deviation.

1. Range
Range (R) is the difference between the highest score and the lowest score in a
distribution. Range is the simplest and the crudest measure of variation, simplest
because we shall only consider the highest score and the lowest score.
a. Range for Ungrouped Data
R = HS – LS

Where,

R = range value

HS = Highest score

LS = Lowest score

Example 1: Find the range of the two groups of score distribution.

Group A Group B
10(LS) 15(LS)
12 16
15 16
17 17
25 17
26 23
28 25
30 26
35(HS) 30(HS)
RA = HS – LS
RA = 35 – 10
RA = 25
RB = HS – LS
RB = 30 - 15
RB = 15
Analysis:

The range of Group A = 25 is greater than the range of Group B = 15. The
implication of this is that scores in group A are more spread out than the scores in
group B or the scores in Group B are less scattered than the scores in group A.

b. Range for Grouped Data
R = HSUB – LSLB

Where,

R = range value

HSUB = upper boundary of the highest score

LSLB = lower boundary of the lowest score

Example 2: Find the value of the range of the scores of 50 students in a Mathematics achievement test.

X F
25 – 32 3
33 – 40 7
41 – 48 5
49 – 56 4
57 – 64 12
65 – 72 6
73 – 80 8
81 – 88 3
89 – 97 2
n = 50
LL of LS = 25
LSLB = 24.5
UL of the HS = 97
HSUB = 97.5
R = HSUB – LSLB
R = 97.5 – 24.5
R = 73

Properties of Range

1. It is quick and easy to understand.


2. It is a rough estimation of variation.
3. It is easily affected by the extreme scores.

Interpretation of Range Value

When the range value is large, the scores in the distribution are more dispersed,
widespread or heterogeneous. On the other hand, when the range value is small the
scores in the distribution are less dispersed, less scattered, or homogeneous.

2. Inter-quartile Range (IQR) and Quartile Deviation (QD)
Inter-quartile range is the difference between the third quartile and the first
quartile.
IQR = Q3 – Q1

Properties of Inter-quartile Range

1. It reduces the influence of extreme values.
2. It is not as easy to calculate as the range.
3. It considers only the middle 50% of the scores in the distribution.
4. The point of dispersion is the median value.

The quartile deviation indicates the distance we need to go above and below the median to include the middle 50% of the scores. It is based on the range of the middle 50% of the scores, instead of the entire set.

The formula for computing the value of the quartile deviation is QD = (Q3 − Q1) / 2, where QD is the quartile deviation value, Q1 is the value of the first quartile, and Q3 is the value of the third quartile.

Steps in Solving Quartile Deviation

1. Solve for the value of Q1.
2. Solve for the value of Q3.
3. Solve for the value of QD using the formula QD = (Q3 − Q1) / 2.

a. Quartile Deviation of Ungrouped Data

QD = (Q3 − Q1) / 2

Example 1: Using the given data 6, 8, 10, 12, 12, 14, 15, 16, 20, find the quartile deviation.
X(score)
6
8
10
12
12
14
15
16
20

Solve for Q1.
n = 9
Q1 = [(1/4)n + (1 − 1/4)]th score
Q1 = [(1/4)(9) + (1 − 1/4)]th score
Q1 = [9/4 + 3/4]th score
Q1 = [12/4]th score
Q1 = 3rd score
Q1 = 10

Solve for Q3.
Q3 = [(3/4)n + (1 − 3/4)]th score
Q3 = [(3/4)(9) + (1 − 3/4)]th score
Q3 = [27/4 + 1/4]th score
Q3 = [28/4]th score
Q3 = 7th score
Q3 = 15

IQR = Q3 – Q1
IQR = 15 – 10
IQR = 5

QD = (Q3 − Q1)/2
QD = (15 − 10)/2
QD = 5/2
QD = 2.5

Analysis:

The quartile deviation is 2.5: on the average, we need to go 2.5 points above and below the median to include the middle 50% of the scores.
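The steps above translate directly into code. Here is a minimal Python sketch using the textbook's position formula Qk = [(k/4)n + (1 − k/4)]th score; it assumes, as in this example, that the position comes out as a whole number (the function name quartile is ours):

def quartile(scores, k):
    """k-th quartile of ungrouped data via the position formula
    Qk = [(k/4)n + (1 - k/4)]th score (1-based position)."""
    data = sorted(scores)
    pos = (k / 4) * len(data) + (1 - k / 4)  # 1-based position
    return data[int(pos) - 1]                # assumes a whole-number position

scores = [6, 8, 10, 12, 12, 14, 15, 16, 20]
q1, q3 = quartile(scores, 1), quartile(scores, 3)
print(q1, q3)          # 10 15
print(q3 - q1)         # IQR = 5
print((q3 - q1) / 2)   # QD = 2.5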

b. Quartile Deviation of Grouped Data

QD = (Q3 − Q1)/2
Example 2: The data given below are the scores of fifty (50) students in a Filipino class. Solve for the value of the quartile deviation (QD).

X          F     cf<
25 – 32    3      3
33 – 40    7     10
41 – 48    5     15
49 – 56    4     19
57 – 64   12     31
65 – 72    6     37
73 – 80    8     45
81 – 88    3     48
89 – 97    2     50
n = 50
Solve for the value of Q1.
n/4 = 50/4 = 12.5
Q1 class: 41 – 48
LL = 41
LB = 40.5
cfp = 10
fq = 5
c.i = 8

Q1 = LB1 + [(n/4 − cfp1) / fq1] c.i
Q1 = 40.5 + [(12.5 − 10) / 5] 8
Q1 = 40.5 + [2.5 / 5] 8
Q1 = 40.5 + 20/5
Q1 = 40.5 + 4
Q1 = 44.5

Solve for the value of Q3.
3n/4 = 3(50)/4 = 37.5
Q3 class: 73 – 80
LL = 73
LB = 72.5
cfp = 37
fq = 8
c.i = 8
Q3 = LB3 + [(3n/4 − cfp3) / fq3] c.i
Q3 = 72.5 + [(37.5 − 37) / 8] 8
Q3 = 72.5 + [0.5 / 8] 8
Q3 = 72.5 + 4/8
Q3 = 72.5 + 0.5
Q3 = 73.00

Solve for the value of QD.

QD = (Q3 − Q1)/2
QD = (73 − 44.5)/2
QD = 28.5/2
QD = 14.25
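For grouped data, the same interpolation can be expressed in a short program. The following Python sketch is an illustration under our own conventions: each class is a (lower limit, upper limit, frequency) triple ordered from lowest to highest, and grouped_quartile is our own name:

def grouped_quartile(classes, k):
    """k-th quartile for grouped data: Qk = LB + [(kn/4 - cfp) / fq] * c.i."""
    n = sum(f for _, _, f in classes)
    target = k * n / 4
    cum = 0
    for low, high, f in classes:
        if cum + f >= target:        # quartile class found
            lb = low - 0.5           # lower class boundary
            ci = high - low + 1      # class size
            return lb + (target - cum) / f * ci
        cum += f

rows = [(25, 32, 3), (33, 40, 7), (41, 48, 5), (49, 56, 4), (57, 64, 12),
        (65, 72, 6), (73, 80, 8), (81, 88, 3), (89, 97, 2)]
q1 = grouped_quartile(rows, 1)   # 44.5
q3 = grouped_quartile(rows, 3)   # 73.0
print((q3 - q1) / 2)             # QD = 14.25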

Interpretation of IQR and QD

The larger the value of the IQR or QD, the more dispersed the scores at the middle
50% of the distribution. On the other hand, if the IQR or QD is small, the scores are less
dispersed at the middle 50% of the distribution. The point of dispersion is the median
value.

Analysis for Inter-quartile Range and Quartile Deviation

When the value of the IQR and QD is small, the scores are clustered within the middle 50% of the score distribution. On the other hand, when the value of the IQR and QD is large, the scores are dispersed in the middle 50% of the distribution. To determine which group of distribution is more clustered or dispersed, you should compare it with another group of distribution, since there is no standard value for a small or large IQR and QD.

3. Mean Deviation (MD)

Mean deviation measures the average deviation of the values from the arithmetic mean. It gives equal weight to the deviation of every score in the distribution.

a. Mean Deviation for Ungrouped Data

MD = Ʃ|x − x̅| / n
Where,

MD = mean deviation value

x = individual score

x̅ = sample mean

n = number of cases

Steps in Solving Mean Deviation for Ungrouped Data

1. Solve the mean value.


2. Subtract the mean value from each score.
3. Take the absolute value of the difference in step 2.
4. Solve the mean deviation using the formula MD = Ʃ|x − x̅| / n.

Example 1: Find the mean deviation of the scores of 10 students in a mathematics test. Given the scores: 35, 30, 26, 24, 20, 18, 18, 16, 15, 10.

x         x − x̅     |x − x̅|
35         13.8       13.8
30          8.8        8.8
26          4.8        4.8
24          2.8        2.8
20         −1.2        1.2
18         −3.2        3.2
18         −3.2        3.2
16         −5.2        5.2
15         −6.2        6.2
10        −11.2       11.2
Ʃx = 212          Ʃ|x − x̅| = 60.4
x̅ = Ʃx / n
x̅ = 212 / 10
x̅ = 21.2

MD = Ʃ|x − x̅| / n
MD = 60.4 / 10
MD = 6.04

Analysis:

The mean deviation of the 10 scores of students is 6.04. This means that, on the average, the scores deviate from the mean of 21.2 by 6.04 points.
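A minimal Python sketch of the same computation (an illustration; mean_deviation is our own name):

def mean_deviation(scores):
    """MD = sum(|x - mean|) / n for ungrouped data."""
    mean = sum(scores) / len(scores)
    return sum(abs(x - mean) for x in scores) / len(scores)

scores = [35, 30, 26, 24, 20, 18, 18, 16, 15, 10]
print(mean_deviation(scores))  # 6.04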

b. Mean Deviation for Grouped Data

MD = Ʃf|Xm − x̅| / n
Where,

MD = mean deviation value

f = class frequency

Xm = class mark or midpoint of each category

x̅ = mean value

n = number of cases

Steps in solving Mean Deviation for Grouped Data

1. Solve for the value of the mean.


2. Subtract the mean value from each midpoint or class mark.
3. Take the absolute value of each difference.
4. Multiply the absolute value and the corresponding class frequency.
5. Find the sum of the results in step 4.
6. Solve for the mean deviation using the formula for grouped data.

Example 2: Find the mean deviation of the given scores below.

X          f    Xm     fXm     Xm − x̅    |Xm − x̅|    f|Xm − x̅|
10 – 14    5    12      60     −21.63      21.63       108.15
15 – 19    2    17      34     −16.63      16.63        33.26
20 – 24    3    22      66     −11.63      11.63        34.89
25 – 29    5    27     135      −6.63       6.63        33.15
30 – 34    2    32      64      −1.63       1.63         3.26
35 – 39    9    37     333       3.37       3.37        30.33
40 – 44    6    42     252       8.37       8.37        50.22
45 – 49    3    47     141      13.37      13.37        40.11
50 – 54    5    52     260      18.37      18.37        91.85
n = 40          ƩfXm = 1345               Ʃf|Xm − x̅| = 425.22

x̅ = ƩfXm / n
x̅ = 1345 / 40
x̅ = 33.63
MD = Ʃf|Xm − x̅| / n
MD = 425.22 / 40
MD = 10.63

Analysis:

The mean deviation of the 40 scores of students is 10.63. This means that, on the average, the scores deviate from the mean of 33.63 by 10.63 points.
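The grouped version works from class midpoints. A minimal Python sketch (our own naming; classes are (lower limit, upper limit, frequency) triples):

def grouped_mean_deviation(classes):
    """MD = sum(f * |Xm - mean|) / n, where Xm is each class midpoint."""
    n = sum(f for _, _, f in classes)
    mean = sum(f * (low + high) / 2 for low, high, f in classes) / n
    return sum(f * abs((low + high) / 2 - mean) for low, high, f in classes) / n

rows = [(10, 14, 5), (15, 19, 2), (20, 24, 3), (25, 29, 5), (30, 34, 2),
        (35, 39, 9), (40, 44, 6), (45, 49, 3), (50, 54, 5)]
print(round(grouped_mean_deviation(rows), 2))  # 10.63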

4. Variance and Standard Deviation

Variance is one of the most important measures of variation. It shows variation


about the mean.

Population Variance

σ² = Ʃ(x − µ)² / N

Sample Variance

s² = Ʃ(x − x̅)² / (n − 1)
Steps in Solving Variance of Ungrouped Data

1. Solve for the mean value.


2. Subtract the mean value from each score.
3. Square the difference between the mean and each score.
4. Find the sum of the results in step 3.
5. Solve for the population variance or sample variance using the
formula of ungrouped data.

Example 1: Using the data below, find the variance and standard deviation of the
scores of 10 students in a science quiz. Interpret the result.

x        x − x̅    (x − x̅)²
19         4.4      19.36
17         2.4       5.76
16         1.4       1.96
16         1.4       1.96
15         0.4       0.16
14        −0.6       0.36
14        −0.6       0.36
13        −1.6       2.56
12        −2.6       6.76
10        −4.6      21.16
Ʃx = 146          Ʃ(x − x̅)² = 60.40
x̅ = 14.6

a. Variance of Ungrouped Data

Population Variance of Ungrouped Data

σ² = Ʃ(x − µ)² / N
σ² = 60.40 / 10
σ² = 6.04

Sample Variance of Ungrouped Data

s² = Ʃ(x − x̅)² / (n − 1)
s² = 60.4 / 9
s² = 6.71

Note: If the standard deviation is already solved, square the value of the standard
deviation to get the variance.
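Both versions, and the standard deviation, can be computed in a few lines. A minimal Python sketch (an illustration; the score list includes a second 14, which we infer from the printed totals Ʃx = 146 and Ʃ(x − x̅)² = 60.40):

def variance(scores, sample=True):
    """Variance of ungrouped data: divide by n - 1 for a sample, by N for a population."""
    mean = sum(scores) / len(scores)
    ss = sum((x - mean) ** 2 for x in scores)
    return ss / (len(scores) - 1) if sample else ss / len(scores)

scores = [19, 17, 16, 16, 15, 14, 14, 13, 12, 10]
print(round(variance(scores, sample=False), 2))        # 6.04 (population)
print(round(variance(scores, sample=True), 2))         # 6.71 (sample)
print(round(variance(scores, sample=True) ** 0.5, 2))  # 2.59 (sample SD)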

b. Variance of Grouped Data

Population Variance

σ² = Ʃf(Xm − µ)² / N

Sample Variance

s² = Ʃf(Xm − x̅)² / (n − 1)
Steps in Solving the Variance of Grouped Data

1. Solve for the mean value.


2. Subtract the mean value from each midpoint or class mark.
3. Square the difference between the mean value and midpoint or class mark.
4. Multiply the squared difference and the corresponding class frequency.
5. Find the sum of step 4.
6. Solve the population variance or sample variance using the formula of grouped
data.

Example 2: The table below shows the score distribution of the test results of 40 students in a 50-item Filipino test. Solve for the variance and standard deviation and interpret the result.

X          f    Xm      fXm      x̅      Xm − x̅    (Xm − x̅)²    f(Xm − x̅)²
15 – 20    3   17.5    52.5    33.7     −16.2      262.44       787.32
21 – 26    6   23.5     141    33.7     −10.2      104.04       624.24
27 – 32    5   29.5   147.5    33.7      −4.2       17.64        88.2
33 – 38   15   35.5   532.5    33.7       1.8        3.24        48.6
39 – 44    8   41.5     332    33.7       7.8       60.84       486.72
45 – 50    3   47.5   142.5    33.7      13.8      190.44       571.32
n = 40         ƩfXm = 1348              Ʃf(Xm − x̅)² = 2 606.4

Population Variance

σ² = Ʃf(Xm − µ)² / N
σ² = 2 606.4 / 40
σ² = 65.16

Sample Variance

s² = Ʃf(Xm − x̅)² / (n − 1)
s² = 2 606.4 / 39
s² = 66.83

Standard deviation is the most important measure of variation. It is the square root of the variance, and it represents the average distance of the scores from the mean value.

Population Standard Deviation

σ = √[Ʃ(x − µ)² / N]

Sample Standard Deviation

s = √[Ʃ(x − x̅)² / (n − 1)]

Steps in Solving Standard Deviation of Ungrouped Data

1. Solve for the mean value.


2. Subtract the mean value from each score.
3. Square the difference between the mean and each score.
4. Find the sum of step 3.
5. Solve for the population standard deviation or sample standard deviation using the formula for ungrouped data.

Note: If the variance is already solved, take the square root of the variance to get
the value of the standard deviation.

Example: Using the data in Example 1, solve for the population and sample standard deviation.

Population Standard Deviation

σ = √[Ʃ(x − µ)² / N]
σ = √(60.40 / 10)
σ = √6.04
σ = 2.46

Sample Standard Deviation

s = √[Ʃ(x − x̅)² / (n − 1)]
s = √(60.40 / 9)
s = √6.71
s = 2.59

Steps in Solving the Standard Deviation of Grouped Data

1. Solve for the mean value.
2. Subtract the mean value from each midpoint or class mark.
3. Square the difference between the mean value and each midpoint or class mark.
4. Multiply the squared difference and the corresponding class frequency.
5. Find the sum of the results in step 4.
6. Solve for the population standard deviation or sample standard deviation using the formula for grouped data.

Population Standard Deviation

σ = √[Ʃf(Xm − µ)² / N]
σ = √(2 606.4 / 40)
σ = √65.16
σ = 8.07

Sample Standard Deviation

s = √[Ʃf(Xm − x̅)² / (n − 1)]
s = √(2 606.4 / 39)
s = √66.83
s = 8.18
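The grouped variance and standard deviation follow the same midpoint logic as the grouped mean deviation. A minimal Python sketch (an illustration; grouped_variance is our own name):

def grouped_variance(classes, sample=True):
    """Grouped variance: sum(f * (Xm - mean)^2) over n - 1 (sample) or N (population)."""
    n = sum(f for _, _, f in classes)
    mean = sum(f * (low + high) / 2 for low, high, f in classes) / n
    ss = sum(f * ((low + high) / 2 - mean) ** 2 for low, high, f in classes)
    return ss / (n - 1) if sample else ss / n

rows = [(15, 20, 3), (21, 26, 6), (27, 32, 5),
        (33, 38, 15), (39, 44, 8), (45, 50, 3)]
print(round(grouped_variance(rows, sample=False), 2))        # 65.16
print(round(grouped_variance(rows, sample=True), 2))         # 66.83
print(round(grouped_variance(rows, sample=True) ** 0.5, 2))  # 8.18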

Interpretation of Standard Deviation

1. If the value of the standard deviation is large, on the average, the scores in the distribution will be far from the mean. Therefore, the scores are spread out around the mean value. The distribution is also known as heterogeneous.
2. If the value of the standard deviation is small, on the average, the scores in the distribution will be close to the mean. Hence, the scores are less dispersed, or the scores in the distribution are homogeneous.

Going back to the diagram presented earlier, let us answer the questions posed using the concept of the mean and standard deviation, with the help of the diagram below.

[Diagram: the three score distributions plotted on the same number line from 12 to 30. Section A: Mean = 18.25, S = 5.15; Section B: Mean = 18.25, S = 6.92; Section C: Mean = 18.25, S = 7.63.]

1. What can you observe about the mean and the standard deviation of the
three groups of scores?
Answer: The mean of the three groups of scores is the same which is equal to
18.25 and the standard deviation of section A = 5.15, section B = 6.92, and
section C = 7.63.
2. Which group of students performed well in the class?
Answer: In terms of performance, the three sections of students perform the
same because they have the same mean value of 18.25.
3. Which group of scores is most widespread? Less scattered?
Answer: The standard deviation of section A = 5.15, section B = 6.92 and
section C = 7.63. The scores that are most scattered are those in section C
because they have the largest value of standard deviation which is equal to
7.63. On the other hand, the less scattered group of scores is in section A
which has the smallest value of the standard deviation which is equal to 5.15.
Therefore, the smaller the value of the standard deviation, the closer the scores are, on the average, to the mean value; and the larger the value of the standard deviation, the more scattered the scores are from the mean value.
Using the diagram, there are more scores that are closer to the mean
value in section A than in section B and section C.

Properties of Variance and Standard Deviation

1. They are the most commonly used measures of variation, especially in research.
2. They show the variation of the individual scores about the mean.

Relative Measure of Variation

Coefficient of variation shows variation relative to the mean. It is used to compare two or more groups of distributions of scores. It is usually expressed in percent: the smaller the value of the coefficient of variation, the more homogeneous the scores in that particular group. On the other hand, the higher the value of the coefficient of variation, the more dispersed the scores in that particular distribution.

The formula in computing the coefficient of variation is:

CV = (s / x̅) × 100%

Where,

𝑥̅ = mean value

s = standard deviation

Example:

Find the coefficient of variation of the given data below:

Section A Section B Section C


12 12 12
12 12 12
14 12 12
15 13 12
17 13 12
18 14 12
18 17 13
18 20 26
19 20 26
23 28 26
23 28 23
30 30 30
𝑥̅ = 18.25 𝑥̅ = 18.25 𝑥̅ = 18.25
s = 5.15 s = 6.92 s = 7.63

CVA = (s / x̅) × 100%
CVA = (5.15 / 18.25) × 100%
CVA = 28.22%

CVB = (s / x̅) × 100%
CVB = (6.92 / 18.25) × 100%
CVB = 37.92%

CVC = (s / x̅) × 100%
CVC = (7.63 / 18.25) × 100%
CVC = 41.81%

Analysis:

The scores in Section A are less scattered than the scores in Section B and Section C. In other words, the scores in Section A are more homogeneous than the scores in Section B and Section C. Another way to interpret this is that the scores in Section C are more spread out than the scores in Section A and Section B, or the scores in Section C are more heterogeneous than the scores in Section A and Section B.
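The three coefficients of variation can be reproduced with a short loop. A minimal Python sketch (an illustration using the means and standard deviations from the table):

def coefficient_of_variation(mean, sd):
    """CV = (s / mean) * 100%."""
    return sd / mean * 100

for section, sd in [("A", 5.15), ("B", 6.92), ("C", 7.63)]:
    print(f"Section {section}: CV = {coefficient_of_variation(18.25, sd):.2f}%")
# Section A: CV = 28.22%
# Section B: CV = 37.92%
# Section C: CV = 41.81%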

Measures of Skewness

Measures of skewness describe the degree of departure of the scores from symmetry. The skewness coefficient SK can be solved using the formula:

SK = 3(x̅ − x̃) / s

where x̅ = mean value, x̃ = median value, and s = standard deviation.

Skewness can be classified according to the skewness coefficient. If SK > 0, the distribution is called positively skewed. When SK < 0, it is a negatively skewed distribution. However, if SK = 0, the scores are normally distributed. The skewness of a score distribution indicates only the performance of the students, not the reasons for their performance.

Positively skewed or skewed to the right is a distribution where the thin end tail
of the graph goes to the right part of the curve. This happens when most of the scores of
the students are below the mean.

Negatively skewed or skewed to the left is a distribution where the thin end tail of the graph goes to the left part of the curve. This happens when most of the scores obtained by the students are above the mean.

[Figure: a right-skewed curve; the mode x̂ lies below the median x̃, which lies below the mean x̅.]

Graphical Representation of a Positively Skewed Distribution (SK > 0)

In classroom testing, a positively skewed distribution means that the students who took the examination did very poorly. Most of the students got very low scores and only a few students got high scores. A positively skewed distribution tells you only about the poor performance of the test takers, not the reasons why the students did poorly in the said examination. Poor performance of the students could be attributed to the following: ineffective methods of teaching and instruction, students' unpreparedness to take the examination, very difficult test items, and not enough time to answer the test items.

[Figure: a left-skewed curve; the mean x̅ lies below the median x̃, which lies below the mode x̂.]

Graphical Representation of a Negatively Skewed Distribution (SK < 0)

A negatively skewed distribution means that the students who took the examination performed well. Most of the scores are high and there are only a few low scores. The shape of the score distribution indicates the performance of the students but not the reasons why most of the students got high scores. The possible reasons why students got high scores are: the group of students is smart, there is enough time to finish the examination, the test items are very easy, the instruction is effective, and the students have prepared themselves for the examination.

Example 1: Find the coefficient of skewness of the scores of 40 grade 6 pupils in
a 100-item test in Mathematics if the mean is 82 and the median is 90 with standard
deviation of 15.

Given:

𝑥̅ = 82

𝑥̃ = 90

s = 15

SK = 3(x̅ − x̃) / s
SK = 3(82 − 90) / 15
SK = 3(−8) / 15
SK = −24 / 15
SK = −1.60

Analysis:

SK = −1.60; the value of SK is negative, meaning the score distribution is negatively skewed. Most of the scores are high; this means that the students performed very well in the said examination.

Example 2: Find the coefficient of skewness of the scores of 45 grade 6 pupils in


a 50-item test in Biology, if the mean is 46 and the median is 40 with standard deviation
of 7.5.

Given:

𝑥̅ = 46

𝑥̃ = 40

s = 7.5

SK = 3(x̅ − x̃) / s
SK = 3(46 − 40) / 7.5
SK = 3(6) / 7.5
SK = 18 / 7.5
SK = 2.40

Analysis:

SK = 2.40; the value of SK is positive, meaning the score distribution is positively skewed. Most of the scores are below the mean. This means that the students did not perform well in the said examination.
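Both examples reduce to one formula. A minimal Python sketch (an illustration; skewness_coefficient is our own name):

def skewness_coefficient(mean, median, sd):
    """SK = 3(mean - median) / s."""
    return 3 * (mean - median) / sd

print(skewness_coefficient(82, 90, 15))   # -1.6 (negatively skewed)
print(skewness_coefficient(46, 40, 7.5))  #  2.4 (positively skewed)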

Normal Distribution

The normal distribution is a special kind of symmetric distribution with important mathematical properties. It is very important when comparing scores and making statistical decisions. It is determined by the values of the mean and standard deviation. It is centered at the mean of the variable, and its spread depends on the value of its standard deviation: the smaller the value of the standard deviation, the steeper and less dispersed the score distribution.

Properties of Normal Distribution

1. The curve has a single peak, meaning the distribution is unimodal.


2. It is a bell-shaped curve.
3. It is symmetrical to the mean.
4. The end tails of the curve can be extended indefinitely on both sides and are asymptotic to the horizontal line.
5. The shape of the curve will depend on the value of the mean and
standard deviation.
6. The total area under the curve is 1.0. Hence, the area of the curve in
each side of the mean is 0.5.
7. The probability between two given points in the curve is equal to the
area between the two points.

[Figure: the normal curve, in which the mean, median, and mode coincide (x̅ = x̃ = x̂); the baseline runs from −3S to +3S, with 2.14% of the cases between ±2S and ±3S on each side and 0.13% beyond ±3S.]

Graphical Representation of Scores That Are Normally Distributed

Area Under the Normal Curve

When you add up the percentages of the baseline between three s units above
and three s units below, you come up with 99.98%. Let us evaluate the area under the
normal curve between the mean and the standard deviation as indicated in the diagram.
The percentage of cases that falls between the mean value and the value of the mean
plus the value of one standard deviation unit in the normal distribution of scores is
34.13%. And the percentage of cases that falls between the mean value and the value of
the mean minus the value of one standard deviation is 34.13%.

For a score distribution with a mean of 74 and a standard deviation of 4, the normal curve model tells us that about 34.13% of the scores in the distribution fall between 74 and 78, as shown in the given illustration.

[Figure: a normal curve with baseline marked 58, 62, 66, 70, 74, 78, 82, 86, 90; the area between 74 and 78 (34.13%) is shaded.]

From the given illustration, with a mean equal to 74 and a standard deviation of 4, four points are added to the mean for each standard deviation unit above the mean (78, 82, 86, 90) and four points are subtracted from the mean for each standard deviation unit below the mean (70, 66, 62, 58). Approximately 68.26%, or 68%, of the scores in the distribution fall between 70 and 78, as shown in the following figures.

[Figure: a normal curve with the two middle areas (34.13% each) between 70 and 78 shaded, totaling 68.26%.]

[Figure: a normal curve with the areas between 66 and 82 shaded (13.59% + 34.13% on each side of the mean).]

Using the figure above, about 95.44% of the students got scores from 66 to 82.

[Figure: a normal curve with the area from 74 to 82 shaded (34.13% + 13.59%).]

Using the figure above, 47.72%, or about 48%, of the students got a score from 74 to 82.

We can also use the normal curve to determine the percentage of the scores of students below or above a certain score: 15.86%, or 16%, of the students got a score below 70. This can also be interpreted as a score of 70 being at the 16th percentile.

[Figure: a normal curve with the area below 70 shaded (about 16% of the cases).]

[Figure: a normal curve with the area below 78 shaded (about 84% of the cases).]

About 84.12% or 84% of the scores are below 78. This can be written as P84 = 78.

DESCRIBING INDIVIDUAL PERFORMANCE

Standard Scores

In this section, we shall discuss the different kinds of converted scores. The
procedures for converting raw scores to standard scores are presented in this section.
There are four (4) types of standard scores: z-score, t-score, standard nine (stanines),
and percentile ranks.

Scores directly obtained from the test are known as actual scores or raw scores. Such scores cannot be interpreted as low, average, or high by themselves. Scores must be converted or transformed so that they become meaningful and allow some kind of interpretation and direct comparison of two scores. Consider the two figures below:

[Figure A: a normal curve with baseline marked 20, 50, 80. Figure B: a normal curve with baseline marked 65, 80, 95.]

Figure A represents a score distribution with a mean of 50 and a standard deviation of 10. Figure B represents a score distribution with a mean of 80 and a standard deviation of 5.

The shapes of the two score distributions above are the same; however, the means and standard deviations are different. This happens because the range of the scores in Figure A differs from that in Figure B. In this case, the scores in these figures cannot be compared directly because they belong to two different groups.

Example: Ritz Glenn obtained a score of 92 in Business Calculus and 88 in Production Management. In which subject did he perform well? Are we correct if we say that Ritz Glenn performed well in Business Calculus? This may be true, but how certain are we? If we say that he is better in Business Calculus, then we are treating 92 as a percentage. In most cases, scores are converted to percentages before the teacher returns the test papers to the students, but not always. Ritz Glenn's score of 92 in Business Calculus might mean he answered 92 items correctly out of 100 items, or 92 items correctly out of 92, or it can be interpreted as 92 items correct out of 150 items. The same can be said of his Production Management score of 88: he might have answered 88 items correctly out of 100, or 88 correct answers out of 88 items, or 88 items correctly out of 150 items. In other words, raw scores cannot be interpreted directly, so we need additional information about the scores in the distribution.

The raw scores of all students in Business Calculus and Production Management are very important so that we can get the information that describes both score distributions. Based on our previous discussion, the mean value and the standard deviation are necessary to describe a distribution of scores. Let us add the mean values and standard deviations of the scores of the students in Business Calculus and Production Management, as shown:

Business Calculus Production Management
x = 92 x= 88
𝑥̅ = 95 𝑥̅ = 80
s=3 s=4
Ritz Glenn's score in Business Calculus is three (3) points below the class mean performance, and his score in Production Management is eight (8) points above the class mean performance. Using the mean value, we can say that Ritz Glenn performed better in Production Management than in Business Calculus compared with the performance of the rest of his classmates. How about the standard deviation? The standard deviation enables us to know what percentage of the scores lies above or below each score in the distribution.

Assuming that the scores in Business Calculus and Production Management are normally distributed, let us construct a curve that represents the given data. The normal curve model is used as a basis to compare distributions with different means and different standard deviations.

[Figure: two normal curves. Business Calculus: baseline marked 83, 86, 89, 92, 95, 98, 101, 104, 107. Production Management: baseline marked 64, 68, 72, 76, 80, 84, 88, 92, 96.]

The shaded area represents the percentage of the scores lower than the score of Ritz Glenn. In Business Calculus, the score of Ritz Glenn is one standard deviation unit below the mean, and in Production Management his score is two standard deviation units above the mean. To determine the exact percentage of the scores below the score of Ritz Glenn in Business Calculus and Production Management, use the normal curve model.

[Figure: a normal curve with the area below −1s shaded.]

15.86%, or approximately 16%, of the scores are below the score of Ritz Glenn in Business Calculus; that is, his score is at the 16th percentile.

[Figure: a normal curve with the area below +2s shaded.]

97.72%, or approximately 98%, of the students' scores in Production Management are lower than Ritz Glenn's score; his score in Production Management is at the 98th percentile.

1. z-scores

To get more exact information about the performance of Ritz Glenn, collect the raw score, mean, and standard deviation, and determine how far below or above the mean, in standard deviation units, the obtained raw score lies.

To determine the exact position of each score in the normal distribution use z-
score formula. The z-score is used to convert a raw score to standard score to
determine how far a raw score lies from the mean in standard deviation units. From this
we can also determine whether an individual student performs well in the examination
compared to the performance of the whole class.

The z-score value indicates the distance between the given raw score and the
mean value in units of the standard deviation. The z-value is positive when the raw
score is above the mean while the z is negative when the raw score is below the mean.

The formula of the z-score is:

z = (x − µ) / σ  or  z = (x − x̅) / s, where

z = z-value

x = raw score

s = sample standard deviation

𝑥̅ = sample mean

σ = population standard deviation

µ = population mean

The z-score formula is very essential when we compare the performance of a student across his subjects or the performance of two students who belong to different groups. It can determine the exact location of a score, whether above or below the mean, and how many standard deviation units it is from the mean.

Example: Using the data about Ritz Glenn’s scores in Business Calculus and
Production Management, solve the z-score value.

Business Calculus Production Management


x = 92 x= 88
𝑥̅ = 95 𝑥̅ = 80
s=3 s=4
z-score in Business Calculus (BC)

z = (x − x̅) / s
zBC = (92 − 95) / 3
zBC = −3 / 3
zBC = −1
z-score in Production Management (PM)

zPM = (88 − 80) / 4
zPM = 8 / 4
zPM = +2

Analysis:

The score of Ritz Glenn in Business Calculus is one standard deviation unit below the mean. His score in Production Management is two standard deviation units above the mean. Therefore, we can conclude that Ritz Glenn performed better in Production Management than in Business Calculus.
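The two z-scores can be reproduced with one small function. A minimal Python sketch (an illustration; z_score is our own name):

def z_score(raw, mean, sd):
    """z = (x - mean) / s: distance from the mean in standard deviation units."""
    return (raw - mean) / sd

print(z_score(92, 95, 3))  # -1.0 (Business Calculus)
print(z_score(88, 80, 4))  #  2.0 (Production Management)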

2. T-scores

There are two possible values of z-score, positive z if the raw score is above the
mean and negative z if the raw score is below the mean. To avoid confusion between
negative and positive value, use T-score to convert raw scores. T-score is another type
of standard score where the mean is 50 and the standard deviation is 10. In z-score the
mean is 0 and the standard deviation is one (1). To convert raw score to T-score, find
first the z-score equivalent of the raw score and use the formula T-score = 10z + 50.

Business Calculus Production Management


x = 92 x = 88
x̅ = 95 x̅ = 80
S=3 s=4

From the above discussion, the z-score in Business Calculus is −1 and the z-score in Production Management is +2. Solve for the T-score equivalents:

T-scoreBC = 10z + 50
T-scoreBC = 10(−1) + 50
T-scoreBC = −10 + 50
T-scoreBC = 40

T-scorePM = 10z + 50
T-scorePM = 10(2) + 50
T-scorePM = 20 + 50
T-scorePM = 70

Analysis:

A z-score of −1 is equivalent to a T-score of 40, and a z-score of +2 is equivalent to a T-score of 70. The negative value is eliminated in the T-score equivalent. Therefore, Ritz Glenn performed better in Production Management than in Business Calculus, owing to the higher T-score of 70.
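The T-score conversion is a one-line rescaling of z. A minimal Python sketch (an illustration; t_score is our own name):

def t_score(z):
    """T = 10z + 50: rescales z-scores to a mean of 50 and SD of 10."""
    return 10 * z + 50

print(t_score(-1))  # 40 (Business Calculus)
print(t_score(2))   # 70 (Production Management)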

[Figure: parallel number lines showing z-scores from −4 to 4 aligned with T-scores from 10 to 90.]

Relationship Between z-score and T-score

3. Standard Nine

The third type of standard score is the standard nine-point scale, also known as stanine, from sta(ndard) + nine. A stanine is a nine-point grading scale ranging from 1 to 9, 1 being the lowest and 9 the highest. Stanine grading is easier to understand than the other standard score models. The descriptive interpretation of stanines 1, 2, and 3 is below average; stanines 4, 5, and 6 are interpreted as average; and the descriptive interpretation of stanines 7, 8, and 9 is above average. Use the graph below as a basis for analyzing stanine results.

[Figure: the stanine scale under the normal curve, with boundaries at z = −1.75, −1.25, −0.75, −0.25, 0.25, 0.75, 1.25, and 1.75 dividing the distribution into the 1st through 9th stanines; the central (5th) stanine contains 20% of the scores.]

From the given figure, the stanine scale is centered at the mean; the central interval extends 0.25 standard deviation on each side of the mean, and each other interval is 0.5 standard deviation wide, except for the end tails of the normal curve.

Stanines are used to compare two or more distributions of data, particularly test scores; to estimate or compute probabilities of events involving normal distributions; and to facilitate the use of words rather than numbers in presenting statistical data.

The table below indicates the percentage of scores in each stanine and the corresponding descriptions.

Stanine    Percentage of Scores    Description
1          4%                      Very Poor
2          7%                      Poor
3          12%                     Below Average
4          17%                     Slightly Below Average
5          20%                     Average
6          17%                     Slightly Above Average
7          12%                     Considerably Above Average
8          7%                      Superior
9          4%                      Very Superior
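Assigning a stanine from a z-score is a matter of checking the cut points shown in the figure above. A minimal Python sketch (an illustration; stanine is our own name):

def stanine(z):
    """Assign a stanine (1-9) from a z-score using the cut points
    -1.75, -1.25, -0.75, -0.25, 0.25, 0.75, 1.25, 1.75."""
    cuts = [-1.75, -1.25, -0.75, -0.25, 0.25, 0.75, 1.25, 1.75]
    for i, cut in enumerate(cuts, start=1):
        if z < cut:
            return i
    return 9

print(stanine(0.0))   # 5 (average)
print(stanine(-1.5))  # 2 (poor)
print(stanine(2.3))   # 9 (very superior)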

[Figure: the normal bell-shaped curve annotated with the percentage of cases in eight portions of the curve (0.13%, 2.14%, 13.59%, 34.13%, 34.13%, 13.59%, 2.14%, 0.13%), standard deviations from −4σ to +4σ, cumulative percentages (0.1%, 2.3%, 15.9%, 50%, 84.1%, 97.7%, 99.9%), percentiles from 1 to 99, and normal curve equivalents from 10 to 90.]

Relationship Between Percentile Rank and Normal Curve

4. Percentile Rank

Another way of converting a raw score to a standard score is the percentile rank. A percentile rank indicates the percentage of scores that lie below a given score. For example, a test score which is greater than 95% of the scores of the examinees is said to be at the 95th percentile. If the scores are normally distributed, the percentile rank can be inferred from the standard score. In solving for the percentile rank, use the formula:

PR = [(CFb + 0.5Fg) / n] × 100

Where,

PR = percentile rank

CFb = cumulative frequency below the given score

Fg = frequency of the given score

n = number of scores in the distribution

Solving for percentile ranks by hand is tedious and needs a very long process; we can shorten the solution using the SPSS or Excel programs, which are easier to use and cheaper than other software.

Steps in Solving Percentile Rank

1. Arrange the test scores (TS) from highest to lowest.
2. Make a frequency distribution of each score and the number of students
obtaining each score. (F)
3. Find the cumulative frequency (CF) by adding the frequency in each score from
the bottom upward.
4. Find the percentile rank (PR) in each score using the formula and the result as
indicated in column 4.

Example: The table below shows a summary of the scores of 40 students in a 45-item multiple-choice test. Find the percentile rank of each score in the distribution.

TS F
45 1
43 2
42 2
41 1
40 1
39 2
37 3
36 2
34 1
33 2
32 2
30 3
29 4
28 1
27 1
25 2
24 1
22 2
21 2
19 1
18 2
16 1
15 1
40
Find the cumulative frequency of the frequency distribution. The third column
represents the cumulative frequency.

TS F CF
45 1 40
43 2 39
42 2 37
41 1 35

40 1 34
39 2 33
37 3 31
36 2 28
34 1 26
33 2 25
32 2 23
30 3 21
29 4 18
28 1 14
27 1 13
25 2 12
24 1 10
22 2 9
21 2 7
19 1 5
18 2 4
16 1 2
15 1 1
40
Find the percentile rank of each score.

a. Solution:
Score = 45
CFb = 39
Fg = 1
n = 40

PR = [(CFb + 0.5Fg) / n] × 100
PR = [(39 + 0.5(1)) / 40] × 100
PR = [(39 + 0.5) / 40] × 100
PR = (39.5 / 40) × 100
PR = 0.9875 × 100
PR = 98.75
PR ≈ 99

Analysis:

A raw score of 45 is equal to a percentile rank of 99. This means that 99% of the students who took the examination had raw scores equal to or lower than 45. This can be written as PR99 = 45.
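The whole column of percentile ranks can be produced at once. A minimal Python sketch (an illustration; percentile_ranks is our own name, and the frequency table is copied from the example above):

def percentile_ranks(freq_table):
    """PR = ((CFb + 0.5 * Fg) / n) * 100 for every score in a frequency
    table given as (score, frequency) pairs."""
    n = sum(f for _, f in freq_table)
    ranks, cum_below = {}, 0
    for score, f in sorted(freq_table):  # lowest score first
        ranks[score] = round((cum_below + 0.5 * f) / n * 100)
        cum_below += f
    return ranks

freq = [(45, 1), (43, 2), (42, 2), (41, 1), (40, 1), (39, 2), (37, 3),
        (36, 2), (34, 1), (33, 2), (32, 2), (30, 3), (29, 4), (28, 1),
        (27, 1), (25, 2), (24, 1), (22, 2), (21, 2), (19, 1), (18, 2),
        (16, 1), (15, 1)]
pr = percentile_ranks(freq)
print(pr[45], pr[43], pr[42])  # 99 95 90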

b. Solution:
Score = 43
CFb = 37
Fg = 2
n = 40

PR = [(CFb + 0.5Fg) / n] × 100
PR = [(37 + 0.5(2)) / 40] × 100
PR = [(37 + 1) / 40] × 100
PR = (38 / 40) × 100
PR = 0.95 × 100
PR = 95

Analysis:

A raw score of 43 is equal to a percentile rank of 95. This means that 95% of the
students who took the examination had raw scores equal to or lower than 43. This can
be written also as PR95 = 43.

c. Solution:
Score = 42
CFb = 35
Fg = 2
n = 40
PR = [(CFb + 0.5Fg) / n] × 100
PR = [(35 + 0.5(2)) / 40] × 100
PR = [(35 + 1) / 40] × 100
PR = (36 / 40) × 100
PR = 0.9 × 100
PR = 90

Analysis:

A raw score of 42 is equal to a percentile rank of 90. This means that 90% of the students who took the examination had raw scores equal to or lower than 42. This can also be written as PR90 = 42.

Note: continue solving the percentile ranks of each score in the distribution in
the exercise and compare the answers in the percentile ranks distribution in the
succeeding page.

When converting the raw scores to a percentile rank, the raw scores are put on a
scale that has the same meaning with different number of groups and for different
lengths of tests.

Frequency and percentile rank distribution of a 45-item multiple-choice test conducted with 40 students

TS F CF PR
45 1 40 99
43 2 39 95
42 2 37 90
41 1 35 86
40 1 34 84
39 2 33 80
37 3 31 74
36 2 28 68
34 1 26 64
33 2 25 60
32 2 23 55
30 3 21 49
29 4 18 40
28 1 14 34
27 1 13 31
25 2 12 28
24 1 10 24
22 2 9 20
21 2 7 15
19 1 5 11
18 2 4 8
16 1 2 4
15 1 1 1
n = 40

[Figure: the normal curve with parallel scales showing the relationship between different standard scores. Raw scores run from 40 to 80 (Mean = 60, S = 5), z-scores from −4 to 4, T-scores from 10 to 90, and stanines from 1 to 9.]

Relationship between Different Standard Scores

This figure shows the relationship between the raw scores and the converted scores, assuming that the distribution is normally distributed. The score distribution has a mean of 60 and a standard deviation of 5. Using these parameters, consider a raw score of 75: this raw score lies three standard deviations above the mean, which is equivalent to a z-score of 3, a T-score of 80, and a stanine of 8. This can be verified using the different processes that we discussed in the previous sections.

DESCRIBING RELATIONSHIPS

Correlation

Another statistical method used in analyzing test results is correlation. This is the tool that we are going to utilize if we want to determine the relationship or association between the scores of students in two different subjects. Is there a relationship between the Mathematics scores and the Science scores of 15 students? What type of linear relationship exists between the two sets of scores? Such questions can be answered using the concepts of correlation. In this section, the different ways of computing the correlation coefficient, when raw scores or ordinal levels of measurement are given, are presented. The graphical method, or scattergram, of determining the relationship between two groups of scores is also discussed, but only for linear relationships.

Correlation refers to the extent to which two distributions are linearly related or associated. The extent of correlation is indicated numerically by the correlation coefficient (rxy), also known as the Pearson Product Moment Correlation Coefficient in honor of Karl Pearson, who developed the formula. The correlation coefficient ranges from −1 to +1. There are three kinds of correlation based on the correlation coefficient: (1) positive correlation; (2) negative correlation; and (3) zero correlation. There are two ways of identifying the correlation between two variables: (1) using the formula; and (2) using a scatter plot or scattergram.

Kinds of Correlation

1. Positive Correlation
High scores in distribution x are associated with high scores in distribution y. Low scores in distribution x are associated with low scores in distribution y. This means that as the value of x increases, the value of y increases too, or as the value of x decreases, the y values also decrease. The line that best fits the given points slopes upward to the right, as shown in the scattergram of positive correlation. The slope of the line is positive.

2. Negative Correlation
High scores in distribution x are associated with low scores in distribution y. Low scores in distribution x are associated with high scores in distribution y. This means that as the values of x increase, the values of y decrease, or when the values of x decrease, the values of y increase. The line that best fits the given points slopes downward to the right, as shown in the scattergram of negative correlation. The slope of the line is negative.
3. Zero Correlation
There is no association between scores in distribution x and scores in distribution y. No single line can be drawn that best fits all the points, as shown in the scattergram of zero correlation. No discernible pattern can be formed.

The formula in computing the correlation coefficient using the Pearson Product Moment Correlation is:

rxy = [(n)(Ʃxy) − (Ʃx)(Ʃy)] / √{[(n)(Ʃx²) − (Ʃx)²][(n)(Ʃy²) − (Ʃy)²]}

Example: Find the correlation coefficient of the scores of 10 students in a mathematics quiz and a science quiz, as given below.

Students    Scores in Math (x)    Scores in Science (y)
1 35 41
2 15 25
3 11 19
4 35 39
5 45 40
6 28 30
7 30 26
8 15 23
9 45 48
10 40 42

Scattergram of Correlation

1. Scattergram of Positive Correlation

Another way of determining the correlation of paired scores is through the use of graphing. The graphical representation is called a scattergram. Using your knowledge of graphing ordered pairs in the coordinate plane, graph the scores of 8 students in mathematics and science.

Mathematics Scores    Science Scores
1                     6
2                     8
3                     10
4                     11
5                     13
6                     16
7                     20
8                     21

Analysis:

As the math score increases, there is a corresponding increase in the science score. Using the given points in the coordinate plane, a straight line sloping upward to the right can be drawn that best fits all the points. Hence, the slope of the line is positive.

2. Scattergram of Negative Correlation

Graph the scores of 8 students in mathematics and science in the coordinate plane.

Mathematics Scores    Science Scores
1 20
2 17
3 15
4 13
5 12
6 9
7 7
8 4

Analysis:

As math scores increase, science scores decrease. Using the given points in the coordinate plane, a straight line sloping downward to the right can be drawn that best fits all the points. Hence, the slope of the line is negative.

3. Scattergram of Zero Correlation

Graph the scores of 11 students in mathematics and science in the coordinate plane.

Mathematics Scores    Science Scores
3 17
4 17
6 11
7 4
7 6
8 15
10 12
10 19
14 13
16 7
17 19

Make a scattergram of the scores of 11 students in mathematics and science.

Analysis:

No discernible pattern can be formed from the given set of points. No single line can be drawn that best fits all the points in the plane.

Computation of Correlation

Steps in Solving the Correlation Coefficient Using Pearson r

1. Complete the necessary data in the table: the xy column, x² column, and y² column.
2. Find Ʃx, Ʃy, Ʃxy, Ʃx², and Ʃy².
3. Compute the correlation coefficient (rxy) using the formula:

rxy = [(n)(Ʃxy) − (Ʃx)(Ʃy)] / √{[(n)(Ʃx²) − (Ʃx)²][(n)(Ʃy²) − (Ʃy)²]}

Student    Scores in Math (x)    Scores in Science (y)    xy      x²      y²
1          35                    41                       1435    1225    1681
2          15                    25                        375     225     625
3          11                    19                        209     121     361
4          35                    39                       1365    1225    1521
5          45                    40                       1800    2025    1600
6          28                    30                        840     784     900
7          30                    26                        780     900     676
8          15                    23                        345     225     529
9          45                    48                       2160    2025    2304
10         40                    42                       1680    1600    1764
           Ʃx = 299              Ʃy = 333                 Ʃxy = 10 989    Ʃx² = 10 355    Ʃy² = 11 961

rxy = [(n)(Ʃxy) − (Ʃx)(Ʃy)] / √{[(n)(Ʃx²) − (Ʃx)²][(n)(Ʃy²) − (Ʃy)²]}
rxy = [(10)(10 989) − (299)(333)] / √{[(10)(10 355) − (299)²][(10)(11 961) − (333)²]}
rxy = (109 890 − 99 567) / √[(103 550 − 89 401)(119 610 − 110 889)]
rxy = 10 323 / √[(14 149)(8 721)]
rxy = 10 323 / √123 393 429
rxy = 10 323 / 11 108.25949
rxy = 0.929308503
rxy = 0.93

Analysis:

The value of the correlation coefficient is rxy = 0.93, which means that there is a
very high positive correlation between the scores of 10 students in mathematics and in
science. This means that students who are good in mathematics are also good in science.

Spearman rho Coefficient

Another way of finding the correlation between two variables is the Spearman rho correlation coefficient, denoted by the Greek letter rho (ρ). The Spearman rho correlation coefficient (ρ) is a measure of correlation used when the given sets of data are expressed in an ordinal level of measurement rather than as raw scores, as in Pearson r. The formula was first derived by a British psychologist named Spearman; in honor of him, it was named Spearman's rho.

ρ = 1 − 6ƩD² / [N(N² − 1)], where

ρ = Spearman rho correlation coefficient value
D = difference between a pair of ranks
N = number of students/cases

Steps in Solving Spearman's rho Correlation Coefficient

1. Rank the scores in the distribution if raw scores are given.
2. Find the difference between each pair of ranks.
3. Square each difference.
4. Find the summation of the squared differences.
5. Solve for the value of Spearman's rho coefficient using the formula ρ = 1 − 6ƩD² / [N(N² − 1)].

Example: Ten (10) aspirants for the Gabuyo Scholarship at YAG University were ranked on their mathematics scores and science scores. Solve for the value of ρ to the nearest hundredths. The data are tabulated below:

Student    Mathematics Score    Science Score
Ritz Glenn 45 50
James Vincent 47 45
John Michael 39 35
Paul John 37 41
Raphael Carlo 33 38
John Rey 40 39
ShejRoi 15 25
Fitch Peter 46 49
Kristle Anne 25 40
Cloe Grace 44 42

Rank the scores in mathematics and the scores in science, find the difference between each pair of ranks, and square the difference. Find the summation of D² and solve for the ρ value.

Student          Mathematics Rank    Science Rank    D     D²
Ritz Glenn       3                   1                2    4
James Vincent    1                   3               −2    4
John Michael     6                   9               −3    9
Paul John        7                   5                2    4
Raphael Carlo    8                   8                0    0
John Rey         5                   7               −2    4
ShejRoi          10                  10               0    0
Fitch Peter      2                   2                0    0
Kristle Anne     9                   6                3    9
Cloe Grace       4                   4                0    0
                                                     ƩD² = 34

Solution:

ρ = 1 − 6ƩD² / [N(N² − 1)]
ρ = 1 − 6(34) / [10(10² − 1)]
ρ = 1 − 204 / [10(100 − 1)]
ρ = 1 − 204 / [10(99)]
ρ = 1 − 204 / 990
ρ = 1 − 0.21
ρ = 0.79

Analysis:

The ρ value is 0.79, which indicates a high positive correlation between the mathematics scores and science scores of the ten aspirants for the Gabuyo Scholarship. The students who are good in mathematics are also good in science.

CHAPTER 6

ESTABLISHING VALIDITY AND RELIABILITY OF A TEST

Learning Outcomes

At the end of this chapter, the students should be able to:

1. Define the following terms: validity, reliability, content validity, construct validity, criterion-related validity, predictive validity, concurrent validity, test-retest method, equivalent/parallel form method, split-half method, Kuder-Richardson formula, validity coefficient, reliability coefficient;
2. Discuss the different approaches to validity;
3. Present and discuss the different methods of solving the reliability of a test;
4. Identify the different factors affecting the validity of a test;
5. Identify the factors affecting the reliability of a test;
6. Compute the validity coefficient and reliability coefficient; and
7. Interpret the reliability coefficient and validity coefficient of a test.

INTRODUCTION

Test constructors believe that every assessment tool should possess good qualities. Most of the literature considers validity and reliability the most common technical concepts in assessment. Any type of assessment, whether traditional or authentic, should be carefully developed so that it may serve whatever purpose it is intended for. In this chapter, we shall discuss the different ways of establishing validity and reliability.

VALIDITY OF A TEST

Validity (Airasian, 2000) is concerned with whether the information obtained from an assessment permits the teacher to make a correct decision about a student's learning. It refers to the appropriateness of score-based inferences or decisions made on the basis of the students' test results. It is the extent to which a test measures what it is supposed to measure.

When the assessment tool provides information that is irrelevant to the learning objectives it was intended to assess, it makes the interpretation of the test result invalid. Teachers must select and use procedures, performance criteria, and settings for all forms of assessment, most especially performance-based assessment, so that fairness to all students is maintained. Assessing a student's performance on the basis of personal characteristics rather than on the performance of the student lowers the validity of the assessment.

Types of Validity

1. Content Validity. A type of validation that refers to the relationship between a test and the instructional objectives; it establishes content so that the test measures what it is supposed to measure. Things to remember about content validity:
a. The evidence of the content validity of a test is found in the Table of Specifications.
b. This is the most important type of validity for a classroom teacher.
c. There is no coefficient for content validity. It is determined judgmentally by experts, not empirically.
2. Criterion-related Validity. A type of validation that refers to the extent to which scores from a test relate to theoretically similar measures. It is a measure of how accurately a student's current test score can be used to estimate a score on a criterion measure, like performance in courses or classes, or on another measurement instrument. For example, classroom reading grades should indicate similar levels of performance as Standardized Reading Test scores.
a. Concurrent validity. The criterion and the predictor data are collected at the same time. This type of validity is appropriate for tests designed to assess a student's current criterion status or when you want to diagnose a student's status; it is a good diagnostic screening test. It is established by correlating the criterion and the predictor using the Pearson product-moment correlation coefficient and other statistical tools.
b. Predictive validity. A type of validation that refers to a measure of the extent to which a student's current test result can be used to estimate accurately the outcome of the student's performance at a later time. It is appropriate for tests designed to predict a student's future status on a criterion.
Predictive validity is very important in psychological testing, as when the psychologist wants to predict responses, behaviors, outcomes, performances, and others. These scores will be used in the assessment process. Regression analysis can be used to predict the criterion from a single predictor or multiple predictors.

3. Construct Validity. A type of validation that refers to a measure of the extent to which a test measures theoretical, unobservable qualities such as intelligence, math achievement, performance anxiety, and the like, over a period of time and on the basis of gathered evidence. It is established through intensive study of the test or measurement instrument using convergent/divergent validation and factor analysis.
a. Convergent validity is a type of construct validation wherein a test has a high correlation with another test that measures the same construct.
b. Divergent validity is a type of construct validation wherein a test has a low correlation with a test that measures a different construct. In this case, high validity occurs only when there is a low correlation coefficient between the tests that measure different traits.
c. Factor analysis is another method of assessing the construct validity of a test using complex statistical procedures. There are other ways of assessing construct validity, like the test's internal consistency, developmental change, and experimental intervention.

Important Things to Remember about Validity

1. Validity refers to the decisions we make, and not to the test itself or to the measurement.
2. Like reliability, validity is not an all-or-nothing concept; it is never totally absent or absolutely perfect.
3. A validity estimate, called a validity coefficient, refers to a specific type of validity. It ranges between 0 and 1.
4. Validity can never be finally determined; it is specific to each administration of the test.

Factors Affecting the Validity of a Test Item

1. The test itself.


2. The administration and scoring of a test.
3. Personal factors influencing how students respond to the test.
4. Validity is always specific to a particular group.

Reasons That Reduce the Validity of the Test Item

1. Poorly constructed test items


2. Unclear directions
3. Ambiguous test items
4. Too difficult vocabulary
5. Complicated syntax
6. Inadequate time limit
7. Inappropriate level of difficulty

8. Unintended clues
9. Improper arrangement of test items

Guide Questions to Improve Validity

1. What is the purpose of the test?


2. How well do the instructional objectives selected for the test represent the
instructional goal?
3. Which test item format will best measure the achievement of each objective?
4. How many test items will be required to measure the performance adequately for each objective?
5. When and how will the test be administered?

VALIDITY COEFFICIENT

The validity coefficient is the computed value of rxy. In theory, the validity coefficient, like the correlation coefficient, has values that range from 0 to 1. In practice, most validity coefficients are usually small: they range from 0.3 to 0.5, and few exceed 0.6 to 0.7. Hence, there is a lot of room for improvement in most of our psychological measurements.

Another way of interpreting the findings is to consider the squared correlation coefficient (rxy)², called the coefficient of determination. The coefficient of determination indicates how much variation in the criterion can be accounted for by the predictor (the teacher's test). For example, if the computed value of rxy = 0.75, the coefficient of determination is (0.75)² = 0.5625, so 56.25% of the variance in student performance can be attributed to the test, while 43.75% of the student performance cannot be attributed to the test results.

Example: Teacher Benjamin James develops a 45-item test and he wants to determine if his test is valid. He takes another test that is already acknowledged for its validity and uses it as the criterion. He administered these two tests to his 15 students. The following table shows the results of the two tests. Is the test developed by Mr. Benjamin James valid? Find the validity coefficient using Pearson r and the coefficient of determination.

Teacher Benjamin James Test (x)    Criterion Test (y)    xy        x²        y²
12                                 16                      192       144       256
22                                 25                      550       484       625
23                                 31                      713       529       961
25                                 25                      625       625       625
28                                 29                      812       784       841
30                                 28                      840       900       784
33                                 35                    1 155     1 089     1 225
42                                 40                    1 680     1 764     1 600
41                                 45                    1 845     1 681     2 025
37                                 40                    1 480     1 369     1 600
26                                 33                      858       676     1 089
44                                 45                    1 980     1 936     2 025
36                                 40                    1 440     1 296     1 600
29                                 35                    1 015       841     1 225
37                                 41                    1 517     1 369     1 681
Ʃx = 465                           Ʃy = 508              Ʃxy = 16 702    Ʃx² = 15 487    Ʃy² = 18 162

rxy = [(n)(Ʃxy) − (Ʃx)(Ʃy)] / √{[(n)(Ʃx²) − (Ʃx)²][(n)(Ʃy²) − (Ʃy)²]}
rxy = [(15)(16 702) − (465)(508)] / √{[(15)(15 487) − (465)²][(15)(18 162) − (508)²]}
rxy = (250 530 − 236 220) / √[(232 305 − 216 225)(272 430 − 258 064)]
rxy = 14 310 / √[(16 080)(14 366)]
rxy = 14 310 / √231 005 280
rxy = 14 310 / 15 198.85785
rxy = 0.941518
rxy = 0.94

Coefficient of determination = (0.94)² = 0.8836 or 88.36%

Interpretation:

The correlation coefficient is 0.94, which means that the validity of the test is
high, or 88.36% of the variance in the students’ performance can be attributed to the
test.
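The validity coefficient and coefficient of determination can be checked in a few lines, reusing a Pearson r routine like the one sketched earlier. The following Python sketch is an illustration (the data are copied from the table above, and the rounded r is squared to match the worked example):

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    num = n * sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y)
    den = ((n * sum(a * a for a in x) - sum(x) ** 2)
           * (n * sum(b * b for b in y) - sum(y) ** 2)) ** 0.5
    return num / den

teacher   = [12, 22, 23, 25, 28, 30, 33, 42, 41, 37, 26, 44, 36, 29, 37]
criterion = [16, 25, 31, 25, 29, 28, 35, 40, 45, 40, 33, 45, 40, 35, 41]
r = pearson_r(teacher, criterion)
r2 = round(r, 2) ** 2          # square the rounded r, as in the worked example
print(round(r, 2))             # 0.94
print(f"{r2:.2%}")             # 88.36%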

RELIABILITY OF A TEST

Reliability refers to the consistency with which a test yields the same rank for individuals who take the test more than once (Kubiszyn and Borich, 2007); that is, how consistent test results or other assessment results are from one measurement to another. We can say that a test is reliable when it can be used to predict practically the same scores when the test is administered twice to the same group of students, and when it has a reliability index of 0.60 or above.

The reliability of a test can be determined by means of the Pearson product-moment correlation coefficient, the Spearman-Brown formula, and the Kuder-Richardson formulas.

Factors Affecting Reliability of a Test

1. Length of the test
2. Moderate item difficulty
3. Objective scoring
4. Heterogeneity of the student group
5. Limited time

Four Methods of Establishing Reliability of a Test

1. Test-retest Method. A type of reliability determined by administering the same test twice to the same group of students, with any time interval between the tests. The results of the two administrations are correlated using the Pearson product-moment correlation coefficient (r), and this correlation coefficient provides a measure of stability. It indicates how stable the test results are over a period of time. The formula is:

rxy = [(n)(Ʃxy) − (Ʃx)(Ʃy)] / √{[(n)(Ʃx²) − (Ʃx)²][(n)(Ʃy²) − (Ʃy)²]}
2. Equivalent Form. A type of reliability determined by administering two different but equivalent forms of the test (also called parallel or alternate forms) to the same group of students in close succession. The equivalent forms are constructed to the same set of specifications, that is, similar in content, type of items, and difficulty. The results of the test scores are correlated using the Pearson product-moment correlation coefficient (r), and this correlation coefficient provides a measure of the degree to which generalization about the performance of students from one assessment to another is justified. It measures the equivalence of the tests.
3. Split-half Method. Administer the test once and score two equivalent halves of the test. To split the test into halves that are equivalent, the usual procedure is to score the even-numbered and the odd-numbered test items separately. This provides two scores for each student. The results of the test scores are correlated using the Spearman-Brown formula, and this correlation coefficient provides a measure of internal consistency. It indicates the degree to which consistent results are obtained from the two halves of the test. The formula is rot = 2roe / (1 + roe). The details of this formula will be discussed in later lessons.
4. Kuder-Richardson Formula. Administer the test once, score the total test, and apply the Kuder-Richardson formula. The Kuder-Richardson 20 (KR-20) formula is applicable only in situations where students' responses are scored dichotomously, and therefore is most useful with traditional test items that are scored as right or wrong, true or false, or yes or no. KR-20 reliability estimates provide information on the degree to which the items in the test measure the same characteristic, under the assumption that all items are equal in difficulty. (It is a statistical procedure used to estimate coefficient alpha; a correlation coefficient is given.) Another formula for testing the internal consistency of a test is the KR-21 formula, which is not limited to test items that are scored dichotomously.

RELIABILITY COEFFICIENT

Reliability coefficient is a measure of the amount of error associated with the test scores.

Description of Reliability Coefficient

a. The range of the reliability coefficient is from 0 to 1.0.
b. The acceptable range of values is 0.60 or higher.
c. The higher the value of the reliability coefficient, the more reliable the overall test scores.
d. Higher reliability indicates that the test items measure the same thing, for example, knowledge of solving number problems in algebra.
1. Pearson Product Moment Correlation Coefficient (rxy)

rxy = [(n)(Ʃxy) − (Ʃx)(Ʃy)] / √{[(n)(Ʃx²) − (Ʃx)²][(n)(Ʃy²) − (Ʃy)²]}
2. Spearman-Brown Formula

rot = 2roe / (1 + roe)

Where, rot = reliability of the original (whole) test
roe = correlation coefficient between the odd and even items
3. KR-20 and KR-21 Formulas
The KR-20 formula is also known as the Kuder-Richardson formula.

KR20 = [k / (k − 1)] [1 − (Ʃpq / s²)]

k = number of items

p = proportion of the students who got the item correctly (index of difficulty)

q = 1 − p

s² = variance of the total scores

KR21 = [k / (k − 1)] [1 − x̅(k − x̅) / (ks²)]

k = number of items

x̅ = mean value

s² = variance of the total scores
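These reliability formulas are straightforward to program. The following Python sketch is an illustration only: the function names are ours, and the 40-item test with a mean of 26 and variance of 36 used in the KR-21 call is a hypothetical example, not data from this chapter. The Spearman-Brown call uses the odd-even correlation of 0.33 obtained in the split-half example below.

def spearman_brown(r_oe):
    """Whole-test reliability from the odd-even correlation: 2*r_oe / (1 + r_oe)."""
    return 2 * r_oe / (1 + r_oe)

def kr20(item_p, total_variance):
    """KR-20 = [k/(k-1)] * (1 - sum(p*q) / s^2), with q = 1 - p per item."""
    k = len(item_p)
    spq = sum(p * (1 - p) for p in item_p)
    return k / (k - 1) * (1 - spq / total_variance)

def kr21(k, mean, total_variance):
    """KR-21 = [k/(k-1)] * (1 - mean*(k - mean) / (k * s^2)),
    assuming all items are of equal difficulty."""
    return k / (k - 1) * (1 - mean * (k - mean) / (k * total_variance))

print(round(spearman_brown(0.33), 2))  # 0.50
print(round(kr21(40, 26, 36), 2))      # 0.77 (hypothetical test data)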

Interpreting Reliability Coefficient

1. The group variability will affect the size of the reliability coefficient. Higher coefficients result from heterogeneous groups than from homogeneous groups. As group variability increases, reliability goes up.
2. Scoring reliability limits test score reliability. If tests are scored unreliably, error is introduced. This will limit the reliability of the test scores.
3. Test length affects test score reliability. As the test length increases, the test's reliability tends to go up.
4. Item difficulty affects test score reliability. As test items become very easy or very hard, the test's reliability goes down.

Level of Reliability Coefficient

Reliability Coefficient    Interpretation
Above 0.90                 Excellent reliability
0.81 – 0.90                Very good for a classroom test
0.71 – 0.80                Good for a classroom test; there are probably a few items that need to be improved
0.61 – 0.70                Somewhat low; the test needs to be supplemented by other measures (more tests) for grading
0.51 – 0.60                Suggests a need for revision of the test, unless it is quite short (ten or fewer items); needs to be supplemented by other measures (more tests) for grading
0.50 and below             Questionable reliability; this test should not contribute heavily to the course grade, and it needs revision
Let us discuss the steps in solving the reliability coefficient using the different methods of establishing the reliability of a test, with the following examples.

Example 1: Prof. Henry Joel administered a test to his 10 students in an Elementary Statistics class twice, with a one-day interval. The test given after one day is exactly the same test given the first time. The scores below were gathered in the first test (FT) and second test (ST). Using the test-retest method, is the test reliable? Show the complete solution.

Student   FT   ST
1         36   38
2         26   34
3         38   38
4         15   27
5         17   25
6         28   26
7         32   35
8         35   36
9         12   19
10        35   38

Using the Pearson r formula, find the Ʃx, Ʃy, Ʃxy, Ʃx², Ʃy².

Solution:

Student   FT (x)    ST (y)    xy       x²       y²
1         36        38        1 368    1 296    1 444
2         26        34        884      676      1 156
3         38        38        1 444    1 444    1 444
4         15        27        405      225      729
5         17        25        425      289      625
6         28        26        728      784      676
7         32        35        1 120    1 024    1 225
8         35        36        1 260    1 225    1 296
9         12        19        228      144      361
10        35        38        1 330    1 225    1 444
n = 10    Ʃx = 274  Ʃy = 316  Ʃxy = 9 192   Ʃx² = 8 332   Ʃy² = 10 400

rxy = [n(Ʃxy) − (Ʃx)(Ʃy)] / √{[n(Ʃx²) − (Ʃx)²][n(Ʃy²) − (Ʃy)²]}
rxy = [(10)(9 192) − (274)(316)] / √{[(10)(8 332) − (274)²][(10)(10 400) − (316)²]}
rxy = 5 336 / √[(8 244)(4 144)]
rxy = 5 336 / 5 844.92
rxy = 0.91

Analysis:

The reliability coefficient using the Pearson r is 0.91, which means that the
test has a very high reliability. The scores of the 10 students on the test
conducted twice with a one-day interval are consistent. Hence, the test has a
very high reliability.

Note: Compute the reliability coefficient of the same data using the Spearman
rho formula. Is the test reliable?
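To check the arithmetic, here is a minimal Python sketch of the test-retest
computation above; the helper name pearson_r is ours, not from the text:

def pearson_r(x, y):
    """Pearson product-moment correlation from two lists of raw scores."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sx2 = sum(a * a for a in x)
    sy2 = sum(b * b for b in y)
    num = n * sxy - sx * sy
    den = ((n * sx2 - sx ** 2) * (n * sy2 - sy ** 2)) ** 0.5
    return num / den

ft = [36, 26, 38, 15, 17, 28, 32, 35, 12, 35]  # first test
st = [38, 34, 38, 27, 25, 26, 35, 36, 19, 38]  # second test, one day later
print(round(pearson_r(ft, st), 2))  # 0.91

The same function can be reused for Example 2 by substituting the FT and PT
score lists.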

Example 2: Prof. Vinci Glenn conducted a test to his 10 students in his Biology
class two times, with a one-week interval. The test given after one week is the
parallel form of the test given the first time. The scores below were gathered
in the first test (FT) and second test or parallel test (PT). Using the
equivalent or parallel form method, is the test reliable? Show the complete
solution using the Pearson r formula.

Student   FT   PT
1         12   20
2         20   22
3         19   23
4         17   20
5         25   25
6         22   20
7         15   19
8         16   18
9         23   25
10        21   24

Using the Pearson r formula, find the Ʃx, Ʃy, Ʃxy, Ʃx², Ʃy².

Student   FT (x)    PT (y)    xy      x²      y²
1         12        20        240     144     400
2         20        22        440     400     484
3         19        23        437     361     529
4         17        20        340     289     400
5         25        25        625     625     625
6         22        20        440     484     400
7         15        19        285     225     361
8         16        18        288     256     324
9         23        25        575     529     625
10        21        24        504     441     576
n = 10    Ʃx = 190  Ʃy = 216  Ʃxy = 4 174   Ʃx² = 3 754   Ʃy² = 4 724

rxy = [n(Ʃxy) − (Ʃx)(Ʃy)] / √{[n(Ʃx²) − (Ʃx)²][n(Ʃy²) − (Ʃy)²]}
rxy = [(10)(4 174) − (190)(216)] / √{[(10)(3 754) − (190)²][(10)(4 724) − (216)²]}
rxy = 700 / √[(1 440)(584)]
rxy = 700 / 917.04
rxy = 0.76

Analysis:

The reliability coefficient using the Pearson r is 0.76, which means that the
test has a high reliability. The scores of the 10 students on the two forms
given one week apart are consistent. Hence, the test has a high reliability.

Note: Compute the reliability coefficient of the same data using Spearman rho
formula. Is the test reliable?

Example 3: Prof. Glenn Lord conducted a test to his 10 students in his Chemistry
class. The test was given only once. The scores of the students on the odd items
(O) and even items (E) were gathered below. Using the split-half method, is the
test reliable? Show the complete solution.

Odd (x)   Even (y)
15        20
19        17
20        24
25        21
20        23
18        22
19        25
26        24
20        18
18        17

Use the formula rot = 2roe / (1 + roe) to find the reliability of the whole
test. Find the Ʃx, Ʃy, Ʃxy, Ʃx², Ʃy² to solve the correlation between the odd
and even test items.

Odd (x)   Even (y)   xy     x²     y²
15        20         300    225    400
19        17         323    361    289
20        24         480    400    576
25        21         525    625    441
20        23         460    400    529
18        22         396    324    484
19        25         475    361    625
26        24         624    676    576
20        18         360    400    324
18        17         306    324    289
Ʃx = 200  Ʃy = 211   Ʃxy = 4 249   Ʃx² = 4 096   Ʃy² = 4 533

rxy = [n(Ʃxy) − (Ʃx)(Ʃy)] / √{[n(Ʃx²) − (Ʃx)²][n(Ʃy²) − (Ʃy)²]}
rxy = [(10)(4 249) − (200)(211)] / √{[(10)(4 096) − (200)²][(10)(4 533) − (211)²]}
rxy = 290 / √[(960)(809)]
rxy = 290 / 881.27
rxy = 0.33

Find the reliability of the whole test using the formula:

rot = 2roe / (1 + roe)
rot = 2(0.33) / (1 + 0.33)
rot = 0.66 / 1.33
rot = 0.50

Analysis:

The reliability coefficient using the Spearman-Brown formula is 0.50, which is a
questionable reliability. Hence, the test items should be revised.
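The split-half computation can be verified with a short Python sketch. It uses
statistics.correlation (available in Python 3.10+) for the odd-even correlation
and then applies the Spearman-Brown step-up; the variable names are ours:

from statistics import correlation

odd = [15, 19, 20, 25, 20, 18, 19, 26, 20, 18]   # scores on the odd items
even = [20, 17, 24, 21, 23, 22, 25, 24, 18, 17]  # scores on the even items

r_halves = correlation(odd, even)        # correlation of the two halves, ≈ 0.33
r_whole = 2 * r_halves / (1 + r_halves)  # Spearman-Brown step-up, ≈ 0.50
print(round(r_halves, 2), round(r_whole, 2))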

Example 4: Ms. Gauat administered a 40-item test in English to her Grade VI
pupils in Malanao Elementary School. Below are the scores of 15 pupils. Find the
reliability using the Kuder-Richardson (KR-21) formula.

Student Score (x)


1 16
2 25
3 35
4 39
5 25
6 18
7 19
8 22
9 33
10 36
11 20
12 17
13 26
14 35
15 39
Solve the mean and the variance of the scores using the table below.

Student   Score (x)   x²


1 16 256
2 25 625
3 35 1 225
4 39 1 521
5 25 625
6 18 324
7 19 361
8 22 484
9 33 1 089
10 36 1 296
11 20 400
12 17 289
13 26 676
14 35 1 225
15 39 1 521
n = 15    Ʃx = 405    Ʃx² = 11 917

s² = [n(Ʃx²) − (Ʃx)²] / [n(n − 1)]
s² = [15(11 917) − (405)²] / [15(14)]
s² = (178 755 − 164 025) / 210
s² = 14 730 / 210
s² = 70.14

Mean = Ʃx/n = 405/15 = 27

Solve for the reliability coefficient using the Kuder-Richardson (KR-21)
formula.

KR21 = [k/(k − 1)][1 − x̄(k − x̄)/(ks²)]
KR21 = [40/(40 − 1)][1 − 27(40 − 27)/(40 × 70.14)]
     = (40/39)[1 − 27(13)/(40 × 70.14)]
     = 1.03[1 − 351/2 805.60]
     = 1.03[1 − 0.1251]
     = 1.03[0.8749]
KR21 = 0.90

Analysis:

The reliability coefficient using the KR-21 formula is 0.90, which means that
the test has a very good reliability; that is, the test is very good for a
classroom test.
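Here is a minimal Python sketch of the KR-21 computation above (the function
name kr21 is ours); it reproduces the book's result up to rounding:

def kr21(k, scores):
    """KR-21 reliability from the number of items k and the pupils' total scores."""
    n = len(scores)
    mean = sum(scores) / n
    # sample variance, matching the book's s2 = [n*sum(x^2) - (sum x)^2] / [n(n-1)]
    var = (n * sum(s * s for s in scores) - sum(scores) ** 2) / (n * (n - 1))
    return (k / (k - 1)) * (1 - mean * (k - mean) / (k * var))

scores = [16, 25, 35, 39, 25, 18, 19, 22, 33, 36, 20, 17, 26, 35, 39]
print(round(kr21(40, scores), 2))  # ≈ 0.90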

The KR-20 formula, applied in the next example, is:

KR20 = [k/(k − 1)][1 − Ʃpq/s²]

Steps in Solving the Reliability Coefficient Using KR-20

1. Solve the difficulty index of each item (p).


2. Solve the value of q in each item.
3. Find the product of p and q columns.
4. Find the summation of pq.
5. Solve the variance of the scores.
6. Solve the reliability coefficient using KR-20 formula.

The first thing to do is to solve the difficulty index of each item and the
variance of the total scores.

p = n/N, where
n = number of students who got the correct answer in each item
N = number of students who answered each item
q = 1 − p

Example 5: Mr. Mark Anthony administered a 20-item true or false test in his
English IV class of 40 students. Below are the results for each item, where x is
the number of students who answered the item correctly. Find the reliability
coefficient using the KR-20 formula and interpret the computed value; solve also
the coefficient of determination.

Item Number   x    p      q      pq         x²
1             25   0.625  0.375  0.234375   625
2             36   0.9    0.1    0.09       1 296
3             28   0.7    0.3    0.21       784
4             23   0.575  0.425  0.244375   529
5             25   0.625  0.375  0.234375   625
6             33   0.825  0.175  0.144375   1 089
7             38   0.95   0.05   0.0475     1 444
8             15   0.375  0.625  0.234375   225
9             23   0.575  0.425  0.244375   529
10            25   0.625  0.375  0.234375   625
11            36   0.9    0.1    0.09       1 296
12            35   0.875  0.125  0.109375   1 225
13            19   0.475  0.525  0.249375   361
14            39   0.975  0.025  0.024375   1 521
15            28   0.7    0.3    0.21       784
16            33   0.825  0.175  0.144375   1 089
17            19   0.475  0.525  0.249375   361
18            37   0.925  0.075  0.069375   1 369
19            36   0.9    0.1    0.09       1 296
20            25   0.625  0.375  0.234375   625
              Ʃx = 578          Ʃpq = 3.38875   Ʃx² = 17 698

p of item 1 = 25/40 = 0.625
q of item 1 = 1 − 0.625 = 0.375
pq of item 1 = (0.625)(0.375) = 0.234375

Note: Continue the same procedures up to the last item.

Solve for the variance of the scores.


s² = [nƩx² − (Ʃx)²] / [n(n − 1)]

where: n = number of items
       Ʃx² = summation of the squares of x
       Ʃx = summation of x

s² = [20(17 698) − (578)²] / [20(19)]
   = (353 960 − 334 084) / 380
   = 19 876 / 380
s² = 52.31

Solve the reliability coefficient using the KR-20 formula.

KR20 = [k/(k − 1)][1 − Ʃpq/s²]
KR20 = [20/(20 − 1)][1 − 3.38875/52.31]
KR20 = (20/19)[1 − 0.06478]
KR20 = (20/19)[0.93522]
KR20 = (1.05263)(0.93522)
KR20 = 0.9844
KR20 = 0.98

Interpretation:

The reliability coefficient using the KR-20 formula is 0.98, which means that
the test has a very high or excellent reliability.

Coefficient of determination = (0.98)² = 0.9604 = 96.04%

Interpretation:

96.04% of the variance in the students’ performance can be attributed to the test.
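Here is a minimal Python sketch of the KR-20 computation, following the book's
worked example (where s² is computed from the per-item totals x with n equal to
the number of items); the function name kr20 is ours:

def kr20(k, p_list, s2):
    """KR-20 from the number of items k, the item difficulty indices p,
    and the variance s2 of the scores."""
    pq_sum = sum(p * (1 - p) for p in p_list)   # sum of pq over all items
    return (k / (k - 1)) * (1 - pq_sum / s2)

x = [25, 36, 28, 23, 25, 33, 38, 15, 23, 25,
     36, 35, 19, 39, 28, 33, 19, 37, 36, 25]    # students correct per item
p = [xi / 40 for xi in x]                       # difficulty index of each item
s2 = (20 * sum(xi * xi for xi in x) - sum(x) ** 2) / (20 * 19)  # 52.31, as above

r = round(kr20(20, p, s2), 2)
print(r, round(r ** 2, 4))  # 0.98 and 0.9604 (the coefficient of determination)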

CHAPTER 7

SCORING RUBRICS FOR PERFORMANCE AND PORTFOLIO ASSESSMENT

INTRODUCTION

One of the alternative methods of rating the performance of the students, aside
from the paper and pencil test, is the use of scoring rubrics. Scoring rubrics
are used when judging the quality of the work of the learners on performance
assessments. A rubric is a form of scoring guide that is used in evaluating the
performance of students or the products resulting from a performance task.
Scoring rubrics are very important in assessing the performance of students
using performance-based assessment and portfolio assessment. In this chapter we
shall discuss scoring rubrics, performance-based assessment, and portfolio
assessment.

SCORING RUBRICS

Scoring rubrics (Brookhart, 1999 as cited by Moskal, 2000) are descriptive
scoring schemes that are developed by teachers or other evaluators to guide the
analysis of the products or processes of students' efforts.

Another definition of a rubric is a rating system by which teachers can
determine at what level of proficiency a student is able to perform a task or
display knowledge of a concept, and by which the different levels of proficiency
can be defined for each criterion (Airasian, 2000).

One common use of rubrics is when teachers evaluate the quality of an essay.
Without criteria to follow, the judgment of one evaluator differs from that of
another: one evaluator might put much weight on the content of the topic, while
another might give a high mark for the organization of the paper. If we are
going to evaluate the quality of an essay, the rating must combine these
factors. Otherwise the evaluators judge the paper subjectively; to avoid this,
the evaluator must develop predetermined criteria for evaluation so that the
subjectivity of evaluating is lessened and the rating becomes more objective.

Types of Rubrics

In this section, we shall discuss the two types of rubrics: the holistic rubric
and the analytic rubric.

A holistic rubric is a type of rubric that requires the teacher to score the
overall process or product as a whole (Nitko, 2001; Mertler, 2001). In this
case, the evaluator views the final product as a set of interrelated tasks
contributing to the whole. Using a holistic rubric in scoring the performance or
product of the students provides an overall impression of the quality of any
given product. Among its advantages are quick scoring and an overview of
students' performance. However, it does not provide detailed feedback about the
performance of the students on specific criteria.

A teacher can use a holistic rubric when he wants a quick snapshot of the
performance of the students and when a single dimension is adequate to define
the quality of the performance.

An analytic rubric is a type of rubric that provides information regarding
performance on each component part of a task, making it useful for diagnosing
specific strengths and weaknesses of the learners (Gareis and Grant, 2008). In
this type of rubric, the evaluator breaks the final product into its component
parts, and each part is scored independently. Hence, the total score of the
product or performance of the students is the sum of the ratings for all the
parts being evaluated. When using an analytic rubric, it is very important for
the evaluator to treat each part separately to avoid any biased result for the
whole product or performance of the students.

The teacher can use an analytic rubric when he wants to see the relative
strengths and weaknesses of the students' performance on each criterion, wants
detailed feedback, needs to assess a complicated performance, or wants the
students to conduct self-assessment of their understanding of their own
performance.

Advantages of Using Rubrics

When assessing the performance of the students using performance-based
assessment, it is very important to use scoring rubrics. The advantages of using
rubrics in assessing students' performance are:

1. Allows assessment to become more objective and consistent;
2. Clarifies the criteria in specific terms;
3. Clearly shows the student how the work will be evaluated and what is
expected;
4. Promotes students' awareness of the criteria to be used in assessing peer
performance;
5. Provides useful feedback regarding the effectiveness of the instruction; and
6. Provides benchmarks against which to measure and document progress.

Development of Scoring Rubrics

Mertler (2001), in his article "Designing Scoring Rubrics for Your Classroom,"
suggested the following steps in developing rubrics used in the assessment of
performances, processes, products, or both process and product, for classroom
use. The information for these procedures was compiled from various sources
(Airasian, 2000 & 2001; Montgomery, 2001; Nitko, 2001; Tombari & Borich, 1999).
The steps are summarized and discussed below, followed by presentations of two
sample scoring rubrics.

1. Reexamine the learning objectives to be addressed by the task. This allows
you to match your scoring guide with your objectives and actual instruction.
2. Identify specific observable attributes that you want to see (as well as
those you don't want to see) your students demonstrate in their product,
process, or performance. Specify the characteristics, skills, or behaviors that
you will be looking for, as well as common mistakes you do not want to see. The
teacher must carefully identify the qualities that need to be displayed in the
student's work to demonstrate proficient performance.
3. Brainstorm characteristics that describe each attribute. Identify ways to
describe above average, average, and below average performance for each
observable attribute identified in step 2.
For holistic rubrics, write thorough narrative descriptions for excellent
work and poor work, incorporating each attribute into the description. Describe
the highest and lowest levels of performance, combining the descriptions for all
attributes.
For analytic rubrics, write thorough narrative descriptions for excellent
work and poor work for each individual attribute. Describe the highest and
lowest levels of performance using the descriptors for each attribute
separately.
For holistic rubrics, complete the rubric by describing other levels on the
continuum that ranges from excellent to poor work for the collective attributes.
Write descriptions for all intermediate levels of performance.
For analytic rubrics, complete the rubric by describing other levels on the
continuum that ranges from excellent to poor work for each attribute. Write
descriptions for all intermediate levels of performance for each attribute
separately.
4. Collect samples of student work that exemplify each level. These will help
you score in the future by serving as benchmarks.
5. Revise the rubric, as necessary. Be prepared to reflect on the effectiveness
of the rubric and revise it prior to its next implementation.

[Figure: Types of Scoring Instruments for Performance Assessments: checklists,
rating scales, and rubrics, with rubrics subdivided into analytic and holistic.]
Mertler (2001), in the same article, suggested the following templates for
holistic rubrics and analytic rubrics.

Template for Holistic Rubrics

Score   Description
5       Demonstrates complete understanding of the problem. All requirements of
        the task are included in the response.
4       Demonstrates considerable understanding of the problem. All requirements
        of the task are included.
3       Demonstrates partial understanding of the problem. Most requirements of
        the task are included.
2       Demonstrates little understanding of the problem. Many requirements of
        the task are missing.
1       Demonstrates no understanding of the problem.
0       No response/task not attempted.

Template for Analytic Rubrics

Each criterion (Criteria #1 to #4) is rated on the same four-point scale, with a
Score column for the rating given:

Beginning (1) – description reflecting beginning level of performance
Developing (2) – description reflecting movement toward mastery level of
performance
Accomplished (3) – description reflecting achievement of mastery level of
performance
Exemplary (4) – description reflecting highest level of performance
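To illustrate how an analytic rubric yields a total score (the sum of the
independently rated criteria, as described above), here is a minimal Python
sketch; the criterion labels and ratings below are illustrative only:

LEVELS = {1: "Beginning", 2: "Developing", 3: "Accomplished", 4: "Exemplary"}

# Hypothetical ratings a teacher might assign on a four-criteria analytic rubric.
ratings = {"Criteria #1": 3, "Criteria #2": 4, "Criteria #3": 2, "Criteria #4": 3}

for criterion, score in ratings.items():
    print(f"{criterion}: {score} ({LEVELS[score]})")

total = sum(ratings.values())
print(f"Total: {total} out of {4 * len(ratings)}")  # Total: 12 out of 16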

Sample rubrics, adapted from various authors and websites, are presented below.

The following are examples of analytic rubrics for assessing a persuasive essay
and an invention report, adapted from a leading author on rubrics, Heidi
Goodrich Andrade (1997).

Analytic Rubric for Persuasive Essay

Make a claim
4 – I make a claim and explain why it is controversial.
3 – I make a claim but don't explain why it is controversial.
2 – I make a claim but it is buried, confused, or unclear.
1 – I do not make a claim.

Give reasons in support of the claim
4 – I give clear and accurate reasons in support of the claim.
3 – I give reasons in support of the claim, but overlook important reasons.
2 – I give 1 or 2 reasons which don't support the claim well, and/or confusing
reasons.
1 – I do not give convincing reasons in support of the claim.

Consider reasons against the claim
4 – I thoroughly discuss reasons against the claim and explain why the claim is
valid anyway.
3 – I discuss reasons against the claim, but leave out important reasons and/or
don't explain why the claim still stands.
2 – I acknowledge that there are reasons against the claim but don't explain
them.
1 – I do not discuss reasons against the claim.

Relate the claim to democracy
4 – I discuss how democratic principles and democracy can be used both in
support of and against the claim.
3 – I discuss how democratic principles and democracy can be used to support the
claim.
2 – I say that democracy and democratic principles are relevant but do not
explain how or why clearly.
1 – I do not mention democratic principles or democracy.

Organization
4 – My writing is well organized, has a compelling opening, a strong informative
body and a satisfying conclusion. Has appropriate paragraph format.
3 – My writing has a clear beginning, middle and end. I generally use
appropriate paragraph format.
2 – My writing is usually organized but sometimes gets off topic. Has several
errors in paragraph format.
1 – My writing is aimless and disorganized.

Word choice
4 – The words I use are striking but natural, varied and vivid.
3 – I use mostly routine words.
2 – My words are dull, uninspired or they sound like I am trying too hard to
impress.
1 – I use the same words over and over. Some words may be confusing.

Sentence fluency
4 – My sentences are clear, complete and of different lengths.
3 – I wrote well-constructed but routine sentences.
2 – My sentences are often flat or awkward. Some run-ons and fragments.
1 – Many run-ons, fragments and awkward phrasings make my essay hard to read.

Conventions
4 – I use first person form, and I use correct sentence structure, grammar,
punctuation and spelling.
3 – My spelling is correct on common words. Some errors in grammar and
punctuation. I need to revise it again.
2 – Frequent errors are distracting to the reader but do not interfere with the
meaning of my paper.
1 – Many errors in grammar, capitalization, spelling and punctuation make my
paper hard to read.

Source: Understanding Rubrics by Heidi Goodrich Andrade, originally published in
Educational Leadership, 1997, with written permission by the author.

Analytic Rubric for an Invention Report

Purposes
4 – The report explains the key purposes of the invention and points out less
obvious ones as well.
3 – The report explains all of the key purposes of the invention.
2 – The report explains some of the purposes of the invention but misses key
purposes.
1 – The report does not refer to the purposes of the invention.

Features
4 – The report details both key and hidden features of the invention and
explains how they serve several purposes.
3 – The report details the key features of the invention and explains the
purposes they serve.
2 – The report neglects some features of the invention or the purposes they
serve.
1 – The report does not detail the features of the invention or the purposes
they serve.

Critique
4 – The report discusses the strengths and weaknesses of the invention, and
suggests ways in which it can be improved.
3 – The report discusses the strengths and weaknesses of the invention.
2 – The report discusses either the strengths or the weaknesses of the invention
but not both.
1 – The report does not mention the strengths or the weaknesses of the
invention.

Connections
4 – The report makes appropriate connections between the purposes and features
of the invention and many different kinds of phenomena.
3 – The report makes appropriate connections between the purposes and features
of the invention and one or two phenomena.
2 – The report makes unclear or inappropriate connections between the invention
and other phenomena.
1 – The report makes no connections between the invention and other things.

Source: Understanding Rubrics by Heidi Goodrich Andrade, originally published in
Educational Leadership, 1997, with written permission by the author.

Rubric for an Oral Presentation

Gains attention of audience
4 – Gives details or an amusing fact, a series of questions, a short
demonstration, a colorful visual or a personal reason why they picked this
topic.
3 – Does a two-sentence introduction, then starts speech.
2 – Gives a one-sentence introduction, then starts speech.
1 – Does not attempt to gain attention of audience, just starts speech.

Source: Understanding Rubrics by Heidi Goodrich Andrade, originally published in
Educational Leadership, 1997, with written permission by the author.

PERFORMANCE-BASED ASSESSMENT

Performance-based assessment is a direct and systematic observation of the
actual performances of the students based on predetermined performance criteria
(Zimmaro, 2003). It is an alternative form of assessing the performance of the
students that represents a set of strategies for the application of knowledge,
skills, and work habits through the performance of tasks that are meaningful and
engaging to students (Hibbard, 1996; Brualdi, 1998, in her article "Implementing
Performance Assessment in the Classroom"). From the definitions of these two
well-known authors, students are required to perform a task rather than select
an answer from a given list of options. Performance-based assessment also
provides the teacher information about how the students understand and apply
knowledge, and allows the teacher to integrate performance assessment into the
instructional process to provide additional learning activities for the students
in the classroom.

Paper and Pencil Test vs. Performance-based Assessment

A paper and pencil test measures learning indirectly. When measuring factual
knowledge or the solving of well-structured mathematical problems, it is better
to use a paper and pencil test. In this case, the teacher asks questions that
indicate which skills have been learned or mastered; such tests usually assess
low-level thinking skills, seldom going beyond the recall level. Performance-
based assessment, on the other hand, is a direct measure of learning or
competence. It indicates that cognitively complex outcomes and affective and
psychomotor skills have been mastered. Examples of performances that can be
judged or rated directly by the evaluators are preparing a microscope slide in a
laboratory class, performing gymnastics or a dance in a physical education
class, a cooking demonstration, and diving in a swimming class. In these kinds
of activities, the teacher observes and rates the students based on their
performances. The teacher or evaluator provides feedback immediately on how the
students performed in carrying out their performance task.

PORTFOLIO ASSESSMENT

Portfolio assessment is the systematic, longitudinal collection of student work
created in response to specific, known instructional objectives and evaluated in
relation to the same criteria (Ferenz, 2001). A student portfolio is a
purposeful collection of student work that exhibits the student's effort,
progress and achievements in one or more areas. The collection must include
student participation in selecting contents, and reflection (Paulson, Paulson &
Meyer, 1991 as cited by Ferenz, 2001 in her article "Using Student Portfolios
for Outcomes Assessment").

The portfolio should represent a collection of the students' best work or best
efforts, student-selected samples of work experiences related to the outcomes
being assessed, and documentation of growth and development toward mastering the
identified outcomes.

A portfolio (Vavrus, 1990) is more than just a container full of stuff. It is a
systematic and organized collection of evidence used by the teacher and student
to monitor growth of the student's knowledge, skills, and attitudes in a
specific subject area.

A portfolio (National Education Association, 1993) is a record of learning that
focuses on the student's work and her/his reflection on that work. Material is
collected through a collaborative effort between the student and staff members
and is indicative of progress toward the essential outcomes.

Comparison of Portfolio and Traditional Forms of Assessment (Ferenz, 2001)

Traditional Assessment                        Portfolio Assessment
Measures student's ability at one time       Measures student's ability over time
Done by the teacher alone; students are      Done by the teacher and the students;
not aware of the criteria                    the students are aware of the criteria
Conducted outside instruction                Embedded in instruction
Assigns student a grade                      Involves student in own assessment
Does not capture the student's language      Allows expression of the teacher's
ability                                      knowledge of the student as a learner
Does not give student responsibility         Student learns how to take
                                             responsibility

PART II

Reviewer in Assessment of Learning

The second part of this book is a summative assessment. The questions serve as a
reviewer in preparation for the Licensure Examination for Teachers (LET), and
are all applications of the concepts in "Assessment of Learning."

Direction: Write the letter of the correct answer before the number. Write the
letter E if the correct answer is not among the options. No erasures.

____1. Teacher Marivic discovered that her students are weak in sentence
construction. Which test should Teacher Marivic administer to determine in which
other skill(s) her pupils are weak?

A. Placement Test
B. Formative Test
C. Diagnostic Test
D. Summative Test

____2. Teacher Christopher will construct a periodic exam for his Algebra subject. Which
of the following should he consider first?

A. Prepare a table of specification.


B. Go back to his instructional objectives.
C. Study the content of his discussed lessons.
D. Identify the format of the test item.

____3. Which test item is most appropriate to attain Teacher Karl's lesson
objective "multiply fractions and reduce the product to lowest terms"?

A. What are the rules in multiplication of fractions?
B. Reduce 8/12, 6/8, 6/8, 6/10, and 8/18 to their lowest terms.
C. The product of 3/5 and 15/18 is ____.
   a. 30/90   b. 1/3   c. 15/45   d. 2/6
D. The sum of 3/5 and 2/3 is ____.
   a. 4/15   b. 19/15   c. 5/15   d. 5/8

____4. “Group the following items according to order” can be classified as what type of
question _____?

A. Evaluating
B. Generalizing
C. Classifying
D. Inferring

____5. Which of the following test format does NOT belong to the group?

A. Short answer
B. Multiple choice
C. True or false
D. Matching type

____6. The results of the National Achievement Test (NAT) are interpreted
against a set of mastery levels. This means that the NAT is categorized as a
____ test.

I. Criterion-referenced
II. Norm-referenced
A. Criterion-referenced only
B. Norm-referenced only
C. Either criterion-referenced or norm-referenced

D. Neither criterion-referenced nor norm-referenced

____7. Using statements I to IV, which of the following is NOT true about the
matching type of test?

I. The descriptions and options are not necessarily homogeneous.
II. The options are at the first column and the descriptions at the second
column.
III. The number of options must be greater than the number of
descriptions.
IV. There must be at least three items.
A. I only
B. I and II
C. II and III
D. II, III and IV

____8. ____ is an example of a vegetable.

The question above is an example of a poorly constructed test item.

What makes the test item poor?

A. It is not a significant test item.


B. The blank at the beginning of the sentence.
C. It is a very easy test item.
D. It is a short question.

____9. Teacher Ace constructed a matching type test. In his column of descriptions are
combinations of presidents, senators, cabinet members, current issues, and
sports. Which rule of constructing a matching type of test was NOT followed?

A. The options must be greater than the descriptions.


B. The descriptions must be heterogeneous.
C. The descriptions must be homogeneous.
D. Arrange the options according to order.

____10. Which of the following statements is TRUE when the standard deviation is large?

A. Scores are concentrated around the mean.


B. The scores are normally distributed.
C. Scores are widely spread around the mean.
D. The mean and median are equal.

____11. When Teacher Gerald used the table of specifications in constructing his
periodic test, which of the following characteristics of a good test is assured?

A. Administrability

B. Construct Validity
C. Content Validity
D. Reliability

____12. Teacher Luis wants to test his students’ ability to speak extemporaneously,
which of the following is the most valid assessment tool?

A. Let his students construct a speech


B. Written test on the guidelines on delivering extemporaneous speech.
C. Let them make their portfolio on speeches delivered.
D. Performance test in extemporaneous speaking.

____13. In the parlance of test development, what does TOS mean?

A. Table of Skewness
B. Table of Specifics
C. Table of Species
D. Table of Specifications

____14. Which of the following is the highest level of Bloom’s taxonomy?

A. Identify the kinds of Measures of Variation.


B. Compute the mean value: 86, 91, 75, 96, and 88.
C. Compare and contrast standard deviation and coefficient of variation.
D. Explain the concept of variability.

____15. Given the scores: 94, 83, 83, 91, 94, 86, 80, 82, 81, 83, 85. What
measure(s) of central tendency does the score 83 in the distribution represent?

A. Median
B. Mean and mode
C. Mode only
D. Median and mode

____16. Read the sample test item below and answer the question that follows:

During what age period is thumb-sucking likely to produce the greatest


psychological trauma?

A. Infancy
B. Preschool period
C. Before adolescence
D. During adolescence
E. After adolescence

What makes the test item poor?

A. The stem does not pose the problem completely.


B. There is a grammatical clue to answer the question.
C. Overlapping among options.
D. Cannot be determined due to insufficient information.

____17. Which of the following statement is TRUE about portfolio assessment?

A. Can determine the growth and development of the students.


B. It is valid and reliable traditional assessment.
C. Consider the suggestions of students in assessment.
D. Involves students in developing test questions.

____18. Which of the following can diagnose more weaknesses of the students?

A. Portfolio assessment
B. Traditional assessment
C. Performance assessment
D. Analytic rubric

____19. Which of the following statements is/are NOT true about rubrics?

I. Rubric is not developmental.


II. Rubric can be used for summative and formative assessment.
III. Rubric can provide both grade and detailed feedback to improve
future performance.
IV. Students should not be involved in the rubric construction.
A. I only
B. II and III
C. I and IV
D. IV only

____20. If the computed range is small, this means that ____.

A. The students performed very well in the test.


B. The difference between the highest and the lowest score is high.
C. The students performed very poorly in the test.
D. The difference between the highest and the lowest score is low.

____21. Which of the following items represents a norm-referenced statement?

A. Peter was able to get 90 items correctly out of 100 items in mathematics.
B. Fitch performed better in the test in mathematics than 88% of his classmates.
C. Fitch was able to solve **% of the problems correctly.
D. Glenn solved 9 problems out of 15 problems correctly.

____22. Scores of 8 students were: 86, 78, 89, 90, 88, 98, 95, 88. What is the mean value?

A. 87
B. 88
C. 89
D. 90

____23. What does a positively skewed score distribution mean?

A. The mean, median and mode are equal.


B. Most of the scores are low.
C. The scores are normally distributed.
D. Most of the scores are above the mean.

____24. Which of the following statements best describes the performance of the
students when their scores are negatively skewed?

A. Most students got very high scores.


B. The scores are equally distributed from left and right of its mean.
C. A few students performed above the mean.
D. Most students did not perform well.

____25. Teacher Renzel conducted an item analysis of his examination in
Mathematics. The facility index of item number 6 is 0.65. What does item number
6 mean?

A. Moderately difficult
B. Easy
C. Difficult
D. Very difficult

____26. The discrimination index of a test item is -0.25. What does this mean?

A. More students in the lower group got the item correctly than those in the
upper group.
B. More students in the upper group got the item correctly than those in the
lower group.
C. The number of students in the lower and upper groups who got the item is
equal.
D. More students from the upper group got the item incorrectly.

____27. Teacher Jhonson gave a test in English. Most of the students got scores
above the mean. What is the graphical representation of their scores?

A. Skewed to the right

B. Skewed to the left
C. Mesokurtic
D. Normally distributed

____28. Teacher Dominic gave a 50-item test in English. The mean performance of
the group is 27 and the standard deviation is 5. Franz obtained a score of 31.
Which of the following best describes his performance?

A. Below average
B. Average
C. Above average
D. Outstanding

____29. The supervisor is talking about "grading on the curve" in a district
meeting. What does this expression mean?

A. A student's grade determines whether or not the student attains the
standard of achievement.
B. A student's grade tells how closely he is achieving his potential.
C. A student's grade is equivalent to his effort.
D. A student's grade will depend on the achievements of the students.

____30. Joseph’s score is within 𝑥̅ ± 1 𝑆𝐷. To which of the following groups does he
belong?

A. Below average
B. Average
C. Needs improvement
D. Above average

____31. The computed r = 0.93 for scores in English and Math. What does this mean?

A. Math scores are slightly related to the English scores.
B. The higher the scores in English, the lower the scores in Math.
C. Scores in English are positively related to the scores in Math.
D. English scores are not related to the Math scores.

____32. Teacher Kristy conducted an item analysis for her test questions in English. She
found out that item number 10 has a difficulty index of 0.45 and a discrimination index
of 0.37. What should teacher Kristy do with item number 10?

A. Revise the item.


B. Retain the item.
C. Reject the item.
D. Make the item bonus.

____33. About how many percent of the scores fall between -2SD and +2SD of the
mean?

A. 34%
B. 68%
C. 95%
D. 99%

____34. Which of the following statements best describes a skewed score distribution?

A. The scores are normally distributed.


B. Most of the scores lie at one end or at the other end of the curve.
C. Most of the scores lie at the left end tail of the curve.
D. Most of the scores lie at the right end tail of the curve.

____35. Which of the following groups of score distributions is least spread?

A. sd = 1.5
B. sd = 1.65
C. sd = 1.75
D. sd = 2.0

____36. Mark’s raw score in the TLE class is 93 which equals to 96th percentile. What
does this imply?

A. 96% of Mark’s classmates got a score higher than 93.


B. 96% of Mark’s classmates got a score lower than 93.
C. Mark’s score is less than 93% of his classmates.
D. Mark is higher than 96% of his classmates.

____37. Which type of assessment is most appropriate for assessing learning difficulties?

A. Formative assessment
B. Placement assessment
C. Summative assessment
D. Diagnostic assessment

____38. Which statement about performance-based assessment is TRUE?

A. They emphasize merely the process.


B. They stress on doing, not on knowing.
C. Essay tests are not performance-based assessment.
D. They accentuate on process as well as product.

____39. Which assessment tool will be most authentic?

A. Portfolio
B. Completion test

C. True or false test
D. Multiple-choice test

____40. Which is the most important about portfolio and performance-based


assessment?

A. Authentic assessment
B. Numerical grading
C. Grading sheet
D. Scoring rubric

Situation 1. Study the table on the item analysis for non-attractiveness of
distracters based on the result of a multiple choice test in Mathematics
conducted by Teacher Fitch. The letter with an asterisk is the correct answer.
Answer items 41-45.

A* B C D
Upper 27% 12 5 8 10
Lower 27% 9 6 12 8

____41. Based on the table, which is the most effective distracter?

A. Option B
B. Option C
C. Option D
D. Option A

____42. The table shows that as a result of the analysis the test item ____.

A. Was very easy


B. Has a negative discriminating power.
C. Has a positive discriminating power.
D. Could not be clearly determined because of the insufficient data.

____43. Based on the table in situation 1, which of the options should be revised?

A. Options B
B. Option C
C. Option D
D. Option A

____44. What is the level of difficulty of the given test item?

A. Very easy
B. Easy
C. Moderately easy

D. Difficult

____45. Which group got more correct answers?

A. Lower group
B. Upper group
C. Could not be determined, data are insufficient
D. None of the above

Situation 2. The table below shows the different tests administered to a class
to which Angel belongs. Answer questions 46-48.

Subjects Mean SD Angel’s Scores


Mathematics 78 7 75
English 80 7.3 82
Music 90 7.2 89
PE 88 7.5 90

____46. In which subject(s) did Angel perform most poorly in relation to the group’s
mean performance?

A. English
B. PE
C. Music
D. Mathematics

____47. What type of learner is Angel?

A. Bodily kinesthetic
B. Logical
C. Linguistic
D. Musical

____48. In which test or subject are the scores most widespread?

A. English
B. PE
C. Music
D. Mathematics

Situation 3. For items 49-50, read and analyze the matching type of test given
below.

Direction: Match Column A with Column B. Write only the letter of your answer on
the line at the left.

Column A Column B

____1. December 25   A. Considered the 8th wonder of the world

____2.Ferdinand Marcos B. The founder of Katipunan

____3. Corazon Aquino C. Christmas day

____4.Baguio City D. The first woman President of the Philippines

____5. Andres Bonifacio E. The summer capital of the Philippines

____6. Banaue Rice Terraces   F. The President of the Philippines who served the
longest

____7.Benigno Aquino G. Former senator of the Philippines

____49. What is the main defect of the test item?

A. It does not measure what it is intended to measure.


B. Consists of 7 items only.
C. It is not reliable.
D. The descriptions and options are not homogeneous.

____50. How would you improve the test item?

A. Column A should be in Column B and Column B should be in Column A.


B. Increase the number of items in Column A.
C. Capitalize items in Column A.
D. Remove letter G in Column B.

____51. Which of the following is NOT a factor in errors in assessment?

A. The test-retest may increase a student’s score.


B. The test questions may get outdated.
C. The student may take the test for granted.
D. Administering the test twice may measure different attribute.

____52. Teacher Vinci wants to test his student’s ability to formulate ideas. Which type of
test should he develop?

A. Problem solving type


B. Essay question
C. Completion test
D. Matching type

____53. In group norming, the percentile rank of the examinees is:

A. Independent of the batch of examinees

B. Dependent on the batch of examinees
C. Affected by skewed distribution
D. Not affected by skewed distribution

____54. Which is true about norm-referenced statement?

A. Mark performed better in spelling than 60% of his classmates


B. Mark was able to spell 90% of the words correctly
C. Mark was able to spell 90% of the words correctly or spelled 45 words out of
50 correctly
D. Mark spelled 35 words out of 50 correctly

____55. Which holds true to norm-referenced testing?

A. Constructing test items in terms of instructional objectives


B. Identifying an acceptable level of mastery in advance
C. Determining task that reflects instructional objectives
D. Identifying average performance of a group

____56. A positive discrimination index means that:

A. The test item could not discriminate between the lower and upper groups.
B. More from the upper group got the item correctly
C. More from the lower group got the item correctly
D. The test item has low reliability

____57. Teacher Vince is conducting a test; not one examinee approached him for
clarification on what to do. Which characteristic of a good test is applied?

A. Fairness
B. Objectivity
C. Administrability
D. Clarity

____58. Teacher Marie wanted to teach her pupils folk dancing. Her check-up test
was a written test on the steps of a folk dance. What characteristic of a good
test does it lack?

A. Objectivity
B. Comprehensiveness
C. Validity
D. Reliability

____59. Teacher Mark Angelo used the table of specifications when he constructed
his periodic test. It can be assumed that the test has ____.

A. Clarity
B. Content validity

C. Relevance
D. Reliability

____60. Which is the most reliable tool for seeing the development in a student’s ability
to sing?

A. Performance assessment
B. Self-assessment
C. Scoring rubric
D. Portfolio assessment

____61. In which competency did Teacher Grace's students find it easiest? In the
item with a difficulty index of ____.

A. 0.31
B. 0.91
C. 0.55
D. 1.0

____62. The criterion of success in Teacher Harold’s objectives is that “the students must
be able to get 85% of the items correctly.” Ana and 24 others got 36 items correctly out
of 50. This means that Teacher Harold:

A. Attained his objective because of his appropriate teaching materials


B. Failed to attain his lesson objective as far as the 25 pupils are concerned
C. Attained his lesson objective
D. Did not attain his lesson objective because the student’s failed to study the
material

____63. The discrimination index of item #16 is -0.25. What does this imply?

A. An equal number from the lower and upper group got the item correctly.
B. More from the upper group got the item correctly.
C. More from the lower group got the item correctly.
D. More from the upper group got the item wrong.

____64. The discrimination index of item #18 is +0.35. What does this mean?

A. More from the lower group got the item correctly.


B. An equal number from the lower and upper group got the item correctly.
C. More from the upper group got the item wrong.
D. More from the upper group got the item correctly.

____65. The discrimination index of item #20 is 0. What does this mean?

A. More from the lower group got the item correctly.


B. An equal number from the lower and upper group got the item correctly.

C. More from the upper group got the item correctly.
D. More from the upper group got the item wrong.

____66. Which is correct about MEDIAN?

A. It is a measure of variability.
B. It is the most stable measure of central tendency.
C. It is appropriate when there are extreme scores.
D. It is significantly affected by extreme values.

____67. Which measure(s) of central tendency can be determined by mere inspection?

A. Median
B. Mode
C. Mean
D. Mode and median

____68. Here is a score distribution: 88, 85, 84, 83, 80, 75, 75, 73, 56, 55, 51, 51, 51, 34,
34, 20. Which of the following best describes the distribution?

A. Bimodal
B. Multimodal
C. Unimodal
D. Cannot be determined

____69. Which is true of unimodal score distribution?

A. The group tested has one mode.


B. The scores are either high or low.
C. The scores are high.
D. The scores are low.

____70. A test item has a difficulty index of 0.85 and discrimination index of -0.10. What
should the teacher do?

A. Make a bonus item.


B. Reject the item.
C. Retain the item.
D. Reject the item and make it a bonus.

____71. Which measure(s) of central tendency is (are) most appropriate when the score
distribution is skewed?

A. Mode
B. Mean and median
C. Median
D. Mean

____72. In a one hundred-item test, what does Gil’s score of 70 mean?

A. He surpassed 70 of his classmates in terms of score.


B. He surpassed 30 of his classmates in terms of score.
C. He got a score above mean.
D. He got 70 items correctly.

____73. Which of the following measures is more affected by an extreme score?

A. Mean deviation
B. Median
C. Mode
D. Mean

____74. The sum of all the scores in a distribution is always equal to:

A. The mean times the interval size


B. The mean divided by N
C. The mean times N
D. The mean divided by the sum of all scores.

____75. Teacher Marc is researching on family income distribution which is symmetric.


Which measure/s of central tendency will be most appropriate?

A. Mode
B. Mean
C. Median
D. Mean and median

____76. Study the table below then answer the question that follows.

Scores Percent of Students


10-19                3%
20-29 7%
30-39 37%
40-49 39%
50-59 14%

In which scores interval is the median?

A. In the interval 40 to 49
B. In between the intervals of 10-19 and 20-29
C. In the interval 30-39
D. In the interval 50-59

____77. Using data in #76, how many percent of the students got a score above 39?

A. 10%
B. 13%
C. 39%
D. 53%

____78. Robert’s raw score in the mathematics class is 45 which equals to 96th percentile.
What does this mean?

A. 96% of Robert’s classmates got a score higher than 45.


B. 96% of Robert’s classmates got a score lower than 45.
C. Robert’s score is less than 45% of his classmates.
D. Robert’s is higher than 96% of his classmates.

____79. Which one describes the percentile rank of a given score?

A. The percent of cases of a distribution below and above a given score.


B. The percent of cases of a distribution below the given score.
C. The percent of cases of a distribution above the given score.
D. The percent of cases of a distribution within the given score.

____80. Marc obtained a score of 85 in Mathematics multiple-choice test. What does it


mean?

A. He has a rating of 85.


B. He answered 85 items in the test correctly.
C. He answered 85% of the test item correctly.
D. His performance is 15% better than the group.

____81. Median is to the 50th percentile as Q3 is to:

A. 25th percentile
B. 45th percentile
C. 70th percentile
D. 75th percentile

____82. Karla Marie obtained a NEAT percentile rank of 98. This means that:

A. They have a zero reference point.


B. They have scales of equal units.
C. They indicate an individual’s relative standing in a group.
D. They indicate specific points in the normal curve.

____83. Markie obtained a NEAT percentile rank of 95. This means that:

A. He got a score of 95.


B. He answered 95 items correctly.

C. He surpassed in performance of 95% of his fellow examinees.
D. He surpassed in performance of 5% of his fellow examinees.

____84. Mark Erick is 2.5 standard deviation above the mean of his group in Math and 1.5
standard deviation above in English. What does this imply?

A. He excels in both English and Math.


B. He is better in Math than in English.
C. He does not excel in English nor in Math.
D. He is better in English than in Math.

____85. Which statement about the standard deviation is CORRECT?

A. The lower the standard deviation the more spread the scores are.
B. The higher the standard deviation the less the scores spread.
C. The higher the standard deviation the more the spread the scores.
D. It is a measure of central tendency.

____86. Which group of scores is most varied? The group with ____.

A. sd = 1
B. sd = 2
C. sd = 3
D. sd = 4

____87. Mean is to measure of central tendency as ____ is to measure of variability.

A. Quartile deviation
B. Quartile
C. Correlation
D. Skewness

____88. Study the two sets of scores below and answer the question that follows:

SET A: 11, 12, 23, 24, 35, 36, 47, 48, 59

SET B: 13, 14, 24, 25, 35, 46, 46, 47, 59

Which statement correctly applies to the two sets of score distribution?

A. The scores in set A are more spread out than those in set B.
B. The range for set B is 46.
C. The range for set A is 47.
D. The scores in set B are more spread out than those in set A.

____89. Skewed score distribution means:

A. The scores are normally distributed.


B. The mean and the median are equal.
C. Consist of academically poor students.
D. The scores are concentrated more at one end or the other end.

____90. What is the graphical representation of the distribution if a class is composed of


bright students?

A. Mesokurtic
B. Skewed to the right
C. Skewed to the left
D. Normally distributed

____91. Most students who took the examination got scores above the mean.

What is the graphical representation of the score distribution?

A. Normal curve
B. Platykurtic
C. Positively skewed
D. Negatively skewed

____92. If a class is composed of academically poor students, what is the graphical


representation of the score distribution?

A. Skewed to the right


B. A bell curve
C. Leptokurtic
D. Skewed to the left

____93. Which of the following methods is questionable, due to practice and
familiarity, in establishing the reliability of a test?

A. Split-half
B. Parallel form
C. Test-retest
D. Kuder-Richardson

____94. Which assessment activity is most appropriate to measure the objective “to
explain the meaning of molecular bonding” for the group with strong interpersonal
intelligence?

A. Write down chemical formulas and show how they were derived.
B. Build several molecular structures with multicolored pop beads.
C. Draw diagram that show different bonding patterns.

D. Demonstrate molecular bonding using students as atoms.

____95. Which is the most reliable tool for detecting the development in your pupils’
ability to write?

A. Objective assessment
B. Self-assessment
C. Scoring rubric
D. Portfolio assessment

____96. Which characteristic of a good test is questionable when a significantly
greater number from the lower group gets the test items correctly?

A. Objectivity
B. Scorability
C. Administrability
D. Reliability

____97. In which competency did my students find the greatest difficulty? In the item
with a difficulty index of ____.

A. 0.1
B. 0.9
C. 1.0
D. 0.5

____98. Which is correct about norm-referenced testing?

A. Constructing test items in terms of instructional objectives.


B. Identifying an acceptable level of mastery in advance.
C. Determining task that reflect instructional objectives.
D. Identifying average performance of a group.

____99. The discrimination index of a test item is +.45. What does this mean?

A. More from the lower group got the item correctly.


B. An equal number from the lower and upper group got the item correctly.
C. More from the upper group got the item correctly.
D. More from the upper group got the item wrongly.

____100. A test item has a difficulty index of .60 and a discrimination index of .40.

What should the teacher do?

A. Make a bonus item.


B. Reject the item.

C. Retain the item.
D. Make it a bonus item and reject it.

____101. When Teacher Grace conducted an item analysis, she found out that a
significantly greater number from the upper group of the class got test item
number 5 correctly. This means the test item ____.

A. Has a negative discriminating power


B. Is valid
C. Is easy
D. Has a positive discriminating power

____102. Which of the following statements describes a norm-referenced interpretation?

A. Mark performed better in spelling than 60% of his classmates


B. Mark was able to spell 90% of the words correctly
C. Mark was able to spell 90% of the words correctly and spelled 35 words out
of 50 correctly.
D. Mark spelled 35 words out of 50 correctly.

____103. The discrimination index of a test item is 0. What does this mean?

A. More from the lower group got the item correctly.


B. An equal number from the lower and upper group got the item correctly.
C. More from the upper group got the item correctly.
D. More from the upper group got the item wrong.

____104. A positive discrimination index means that:

A. The test item could not discriminate between the lower and upper groups.
B. More from the upper group got the item correctly.
C. More from the lower group got the item correctly.
D. The test item has low reliability.

____105. A test item has difficulty index of 0.91 and a discrimination index of -0.20. What
should the teacher do?

A. Make a bonus item.


B. Reject the item.
C. Retain the item.
D. Revise the item.

____106. The computed r for scores in Math and Filipino is -0.43. What does this mean?

A. Math scores are positively related to Filipino scores.


B. The higher the Math scores, the lower the Filipino scores.
C. Math scores are not related to Filipino scores.

D. Filipino scores are slightly related to Math scores.

____107. The computed r for scores in English and Science is 0.66. What does this mean?

A. English scores are positively related to Science scores.


B. As the scores in English increase, the scores in Science decrease.
C. As the scores in English decrease the scores in Science increase.
D. English scores are related to Science scores.

____108. The points in the scattergram of two variables are spread evenly in all
directions. This means that:

A. There is a high positive correlation between the two variables.


B. There is a low negative correlation between the two variables.
C. There is no correlation between the two variables.
D. There is a high negative correlation between the two variables.

____109. Teacher Renzel found out that there is a negative correlation between the
scores in English and in Mathematics. What does this mean?

A. Students’ scores in English are inversely related to their scores in


Mathematics.
B. Students’ score in English are directly related to their scores in mathematics.
C. Students who are good in English are not necessarily good in Mathematics.
D. Students who are good in Mathematics are not necessarily good in English.

____110. Which of the following is NOT a characteristic of authentic assessment?

A. Direct evidence
B. Performing a task
C. Contrived
D. Real-life

____111. A short quiz conducted by Teacher Benjamin James to get feedback on how
much the students learned but will not be used for grading purposes is classified as a
____.

A. Diagnostic assessment
B. Placement assessment
C. Summative assessment
D. Formative assessment

____112. Teacher BJ set a 90% accuracy in a 25-item spelling test. Nike obtained a score
of 88% and this can be interpreted as ____.

A. He obtained 88% percentile score.

B. He did not meet the set criterion by 2%.
C. He is higher than 88% of the group.
D. He is 2% short of the set percentile score.

____113. Teacher A conducted a test at the end of a lesson to find out if the
objectives of her lesson have been attained. Which of the following types of
assessment must be administered?

A. Formative assessment
B. Diagnostic assessment
C. Norm- assessment
D. Criterion-referenced

____114. The test is administered by the Professional Regulation Commission aimed at


measuring the proficiency of teachers in developing a set of instructional skills and
methodologies. The examination is given every month of April and September of the
year. The test can be classified as ____.

A. Performance test
B. Norm-referenced test
C. Professional test
D. Criterion-referenced test

____115. A certain university wanted an entrance examination that can identify future
outcomes or differences such as who will graduate from college or who will drop out.
The test has ____.

A. Predictive validity
B. Content validity
C. Construct validity
D. Concurrent validity

____116. Which are characteristics of a good assessment instrument?

I. Objectivity
II. Validity
III. Scorability
IV. Reliability
A. I, II, IV
B. II, IV
C. I, II, III
D. I, II, III, IV

____117. If one wants to establish the reliability of his test, which of the
following will he do?

I. Administer a parallel test


179
II. Split the test
III. Develop a very difficult test
IV. Administer the same test twice
A. I, III, IV
B. I, II, IV
C. I, II
D. I, IV

____118. Which of the following is the main purpose of administering a pre-test and post-test to the students?

A. Measure the effectiveness of the instructional materials
B. Measure gains in learning
C. Measure the effectiveness of instruction
D. Train students for government examinations

____119. Which statement is true in a bell-shaped curve?

A. There are more high scores than low scores.
B. Most of the scores are high.
C. The scores are normally distributed.
D. There are more low scores than high scores.

____120. Teacher Benjie gave a 50-item test where the mean performance of the group is 40 and the standard deviation is 4. James obtained a score of 37. Which of the following best describes his performance?

A. Below average
B. Average
C. Above average
D. Outstanding
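
Item 120 reduces to a z-score: z = (37 - 40) / 4 = -0.75, so James scored three-quarters of a standard deviation below the mean, a below-average performance. A minimal Python sketch of that arithmetic:

# Illustrative sketch only: locating a raw score relative to the mean.
mean = 40
sd = 4
score = 37

z = (score - mean) / sd
print(z)  # -0.75: the score lies 0.75 SD below the mean, i.e., below average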

____121. The discrimination index of a test item is 0.39. What does this imply?

A. More students in the lower group got the item correctly than those students
in the upper group.
B. More students in the upper group got the item correctly than those students
in the lower group.
C. The number of students in the lower group and upper group who got the
item is equal.
D. More students from the upper group got the item incorrectly.

____122. Teacher Nike constructed a matching type test. In his column of descriptions are combinations of dates of events, current issues, and sports. Which rule of constructing a matching type of test was NOT followed?

A. The options must be greater in number than the descriptions.
B. The descriptions must be heterogeneous.
C. The descriptions must be homogeneous.
D. The options must be arranged in order.

____123. Which of the following is/are true about the matching type of test?

I. The descriptions and options are not necessarily homogeneous.
II. The options are placed in the first column and the descriptions in the second column.
III. The numbers are placed in the first column and the descriptions in the second column.
IV. There must be at least three items.
A. I only
B. II and III
C. IV
D. II, III and IV

____124. Teacher X discovered that his students are weak in solving age problems.
Which test should Teacher X administer to further determine in which other skill(s) his
pupils are weak?

A. Placement assessment
B. Diagnostic assessment
C. Formative assessment
D. Summative assessment

____125. Teacher May conducted an item analysis of her periodic test. She found out that item number 6 is non-discriminating. What does this imply?

I. The item is very difficult and nobody got the correct answer.
II. The instruction is effective.
III. The item is very easy and everybody got the correct answer.
A. I only
B. I and II
C. I and III
D. III only

____126. A portfolio assessment requires a presentation of a collection of student's work. What is its purpose?

I. To showcase the current abilities and skills of the learners.
II. To show growth and development of the learners.
III. To evaluate the cumulative achievement of the learners.
A. I, II and III
B. I and II
C. I and III
D. II and III

____127. Teacher JR conducted an item analysis of his periodic test. He found out that item number 16 has a difficulty index of 0.41 and a discrimination index of 0.36. What should Teacher JR do with item number 16?

A. Reject the item.
B. Retain the item.
C. Revise the item.
D. Make it a bonus item.
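
Items 105, 121, and 127 all rest on item-analysis statistics: the difficulty index is the proportion of examinees who answered the item correctly, and the discrimination index is the difference between the proportions correct in the upper and lower scoring groups. In the sketch below the counts are hypothetical and the closing comment reflects common rules of thumb, not a fixed standard.

# Illustrative sketch only: difficulty and discrimination from upper/lower groups.
def item_indices(upper_correct, lower_correct, group_size):
    """Difficulty = overall proportion correct; discrimination = upper minus lower."""
    difficulty = (upper_correct + lower_correct) / (2 * group_size)
    discrimination = (upper_correct - lower_correct) / group_size
    return difficulty, discrimination

# Hypothetical item: 20 examinees per group, 11 correct in the upper group, 5 in the lower.
difficulty, discrimination = item_indices(upper_correct=11, lower_correct=5, group_size=20)
print(difficulty, discrimination)
# Prints 0.4 0.3: moderate difficulty with positive discrimination, the profile
# that typically leads to retaining an item (compare item 127's 0.41 and 0.36).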

____128. An admission officer of a certain university conducted four batches of entrance examination for scholarship. The results are as follows: Batch I: average = 85.75 with 15 examinees; Batch II: average = 90.25 with 15 examinees; Batch III: average = 88.75 with 20 examinees; and Batch IV: average = 89.25 with 10 examinees. What is the overall average of the examinees?

A. 88.46
B. 88.50
C. 88.80
D. 89.00
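
Item 128 is a weighted mean: each batch average is weighted by its number of examinees rather than averaged directly. A minimal Python sketch of the computation:

# Illustrative sketch only: overall average as a weighted mean of batch averages.
batches = [(85.75, 15), (90.25, 15), (88.75, 20), (89.25, 10)]  # (average, examinees)

total_score = sum(avg * n for avg, n in batches)
total_examinees = sum(n for _, n in batches)
print(round(total_score / total_examinees, 2))  # 88.46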

____129. About how many percent of the cases fall between -2 SD and +2 SD units from the mean?

A. 68.26%
B. 95.44%
C. 99.72%
D. 99.98%
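
Item 129 invokes the areas under the normal curve: about 68.26% of cases fall within 1 SD of the mean, 95.44% within 2 SD, and 99.72% within 3 SD (tables round these slightly differently). These proportions can be recovered from the standard normal CDF, as in this minimal Python sketch:

# Illustrative sketch only: proportion of a normal distribution within k SDs of the mean.
import math

def within_k_sd(k):
    """P(-k <= Z <= k) for a standard normal variable Z, via the error function."""
    def cdf(z):
        return 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return cdf(k) - cdf(-k)

for k in (1, 2, 3):
    print(k, round(within_k_sd(k) * 100, 2))
# Prints approximately 68.27, 95.45, and 99.73 percent.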

____130. Which type of assessment is most appropriate for assessing learning difficulties?

A. Formative assessment
B. Placement assessment
C. Summative assessment
D. Diagnostic assessment

____131. Most of the students got scores above the mean. What would be the graphical
representation of their scores?

A. Normally distributed
B. Skewed to the right
C. Negatively skewed
D. Positively skewed

____132. Which of the following measures of variation is the most stable?

A. Range
B. Quartile deviation
C. Mean deviation
D. Standard deviation

____133. Which of the following instructional objectives is at the lowest level of Krathwohl's cognitive taxonomy?

A. Identify the kinds of measures of variability.
B. Compute the variance of the values: 25, 27, 30, 33, and 36.
C. Compare and contrast quartile deviation and standard deviation.
D. Explain the concept of variability.
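
For reference on option B of item 133, the values 25, 27, 30, 33, and 36 have a mean of 30.2 and squared deviations summing to 78.8, giving a population variance of 15.76 (or 19.7 if the sample formula with n - 1 is used). A minimal Python check:

# Illustrative sketch only: variance of the values in item 133, option B.
import statistics

values = [25, 27, 30, 33, 36]
print(statistics.pvariance(values))  # 15.76 (population variance, dividing by n)
print(statistics.variance(values))   # 19.7  (sample variance, dividing by n - 1)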

____134. Which is true when the standard deviation is large?

A. Scores are concentrated.
B. Scores are not extreme.
C. Scores are spread apart.
D. The bell-shaped curve is steep.

____135. Teacher A is talking about "grading on the curve" in a district meeting. What does "grading on the curve" mean?

A. A student's grade is compared with an established standard.
B. A student's grade is compared with his achievement and his improvement.
C. A student's grade is compared with his achievement relative to his effort.
D. A student's grade is compared with those of other students.

____136. Meryll's raw score in the English class is 95, which is equal to the 98th percentile. What does this mean?

A. 98% of Meryll's classmates got a score higher than 95.
B. 98% of Meryll's classmates got a score lower than 95.
C. Meryll's score is less than 98% of his classmates.
D. Meryll is higher than 98% of his classmates.

____137. Which of the following statements is/are important in developing a scoring rubric?

I. Description of each criterion to serve as standard
II. Very clear descriptions of performance at each level
III. Rating scale
IV. Mastery levels of achievement
A. I only
B. I and II
C. I, II, III
D. I, II, III, IV

____138. Which of the given statements best describes scoring rubrics?

A. It is analytical.
B. It is developmental.
C. It is holistic.
D. Neither analytical nor holistic.

____139. Which is the most reliable tool for seeing the development in your pupils’
ability to write?

A. Summative assessment
B. Performance-based assessment
C. Self-evaluation
D. Portfolio assessment

____140. The most appropriate tool to measure performance in terms of how far a score is above or below the mean or average is ____.

A. Standard scores
B. Norm-reference
C. Criterion-reference
D. Raw scores

____141. Which of the following is the main purpose when a teacher uses a standardized test?

A. To compare the performance of the students with each other
B. To serve as a final examination
C. To serve as a unit test
D. To engage in easy scoring

____142. Which of the following statements is NOT true about rubrics?

I. A rubric is not developmental.
II. A rubric can be used for summative and formative assessment.
III. A rubric can provide both a grade and detailed feedback to improve future performance.
IV. Students should not be involved in the rubric construction.
A. I only
B. II and III
C. I and IV
D. IV only

____143. Which is the most important in portfolio and performance-based assessment?

A. Authentic assessment
B. Numerical grading
C. Letter grading
D. Scoring rubric

____144. Which of the following methods of establishing the reliability of a test is questionable due to practice and familiarity effects?

A. Split-half
B. Parallel form
C. Test-retest
D. Kuder-Richardson

____145. Which of the following is NOT a characteristic of an objective test?

A. It can cover a large sampling of content areas.
B. It is time-consuming to prepare a good objective test.
C. There is a single or best answer.
D. It can measure higher-order thinking skills such as organizing original ideas.

____146. This is the preplanned collection of samples of student work, assessment results, and other outputs produced by the students.

A. Diary
B. Portfolio
C. Observation report
D. Anecdotal report

____147. Teacher Fitch Peter will construct a periodic test for his Biology subject. Which of the following will he need to accomplish first?

A. Prepare a table of specifications.
B. Go back to his instructional objectives.
C. Study the content of his discussed lessons.
D. Identify the format of the test item.

____148. What type of error is committed by a researcher if he commits a Type I error?

A. Accepting a null hypothesis that is not true
B. Rejecting a null hypothesis that is true
C. Severity error
D. Generosity error

____149. Teacher Luis wants to test his students' ability to speak extemporaneously. Which of the following is the most valid assessment tool?

A. Let his students construct a speech.
B. Written test on the guidelines in delivering an extemporaneous speech.
C. Let them make their portfolio of speeches delivered.
D. Performance test in extemporaneous speaking.

____150. Which of the following statements DOES NOT describe the present grading system of elementary and secondary public schools?

A. The lowest possible failing grade that appears in the report card is 65%.
B. Students must master at least 75% of the competencies per subject.
C. A transmutation table is used in the computation of percentage grades.
D. The averaging method is utilized in the computation of the final grade.
