Because teaching aims to bring about learning, teachers need to be thoroughly aware of the processes for determining how successful they are in that task. They need to know whether their students are successfully acquiring the knowledge, skills, and values inherent in their lessons. For this reason, it is critical for beginning teachers to build a repertoire of techniques in the measurement and evaluation of student learning. This chapter is geared towards equipping you with the basic concepts in educational assessment, measurement, and evaluation.
Measurement, Assessment, and Evaluation
Measurement as used in education is the quantification of what students learned through the use of tests, questionnaires, rating scales, checklists, and other devices. A teacher, for example, who gives his class a 10-item quiz after a lesson on subject-verb agreement is undertaking measurement of what the students learned in that particular lesson.
Assessment, however, refers to the full range of information gathered and synthesized by teachers about their students and their classrooms (Arends, 1994). This information can be gathered in informal ways, such as through observation or verbal exchange. It can also be gathered through formal ways, such as assignments, tests, and written reports or outputs. While measurement refers to the quantification of students' performance and assessment to the gathering and synthesizing of information, evaluation is a process of making judgments, assigning value, or deciding on the worth of students' performance. Thus, when a teacher assigns a grade to the score you obtained in a chapter quiz or term examination, he is performing an evaluative act. This is because he places value on the information gathered from the test. Measurement answers the question, how much does a student learn or know? Assessment looks into how much change has occurred in the student's acquisition of a skill, knowledge, or value before and after a given learning experience. Since evaluation is concerned with making judgments on the worth or value of a performance, it answers the question, how good, adequate, or desirable is it? Measurement and assessment are, therefore, both essential to evaluation.
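The distinction among the three terms can be illustrated with a minimal sketch. The scores, the function names, and the 75 percent cutoff below are all hypothetical and are not taken from the text; they only show that measurement yields a number, assessment looks at change before and after a learning experience, and evaluation attaches a judgment to that number.

```python
# Hypothetical data: one student's scores on a 10-item quiz
# taken before and after a lesson.
TOTAL_ITEMS = 10
pre_score = 4
post_score = 8


def measure(score, total):
    """Measurement: quantify performance as a percentage."""
    return 100 * score / total


def assess_gain(pre, post):
    """Assessment: how much change occurred across the learning experience."""
    return post - pre


def evaluate(percent, passing=75):
    """Evaluation: judge the worth of the performance.

    The passing cutoff is an assumed standard, chosen only for illustration.
    """
    return "satisfactory" if percent >= passing else "needs improvement"


percent = measure(post_score, TOTAL_ITEMS)
gain = assess_gain(pre_score, post_score)
judgment = evaluate(percent)
print(percent, gain, judgment)
```

The same score can lead to a different judgment if the teacher adopts a different cutoff, which is why evaluation depends on, but is not the same as, measurement.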
Educational Assessment: Context for Educational Measurement and Evaluation
Bloom (1970) has this to say on the process of educational assessment: Assessment characteristically starts with an analysis of the criterion and the environment in which an individual lives, learns, and works. It attempts to determine the psychological pressures the environment creates, the role expected, and the demands and pressures, their hierarchical arrangement, consistency, as well as conflict. It then proceeds to the determination of the kinds of evidence that are appropriate about the individuals who are placed in this environment, such as their relevant strengths and weaknesses, their needs and personality characteristics, their skills and abilities.
From the foregoing description of the process of educational assessment, it is very clear that educational assessment concerns itself with the total educational setting and is a more inclusive term. This is because it subsumes measurement and evaluation. It focuses not only on the nature of the learner but also on what is to be learned and how it is to be learned. In a real sense, it is diagnostic in intent or purpose. This is because, through educational assessment, the strengths and weaknesses of an individual learner can be identified and, at the same time, the effectiveness of the instructional materials used and the curriculum can be ascertained.
Assessments are continuously being undertaken in all educational settings. Decisions are made about content and specific objectives, nature of students and faculty, faculty morale and satisfaction, and the extent to which student performances meet standards. Payne (2003) describes a typical example of how assessments can be a basis for decision making:
1. The teacher reviews a work sample, which shows that some column additions are in error and that there are frequent carrying errors.
2. He / She assigns simple problems on proceeding pages, with consistent addition errors in some number combinations, as well as repeated errors in carrying from one column to another.
3. He / She gives instruction through verbal explanation, demonstration, trial, and practice.
4. The student becomes successful in the calculations made in each preparation step after direct teacher instruction.
5. The student returns to the original pages, completes them correctly, and is monitored closely when new processes are introduced.
From the foregoing example, it can be seen that there is a very close association between assessment and instruction. The data useful in decision making may be derived from informal assessments, such as observations from interactions, or from teacher-made tests. Informed decision making in education is very important owing to the obvious benefits it can bring about (Linn, 1999). Foremost among these benefits is the evaluation of feelings of competence in the area of academic skills. A person's perception of being able to function effectively in society is likewise obligatory. Finally, the affective side of development is equally important. Personal dimensions, like feelings of self-worth and being able to adjust to people and cope with various situations, lead to better overall life adjustment.
Improvement of Student Learning. Knowing how well students are performing in class can lead teachers to devise ways and means of improving student learning.
Identification of Students' Strengths and Weaknesses. Through measurement, assessment, and evaluation, teachers can single out their students' strengths and weaknesses. Data on these strengths and weaknesses can serve as bases for undertaking reinforcement and / or enrichment activities for the students.
Assessment of the Effectiveness of a Particular Teaching Strategy. Accomplishment of an instructional objective through the use of a particular teaching strategy is important to teachers. Competent teachers continuously evaluate their choice of strategies on the basis of student achievement.
Appraisal of the Effectiveness of the Curriculum. Through educational measurement, assessment, and evaluation, various aspects of the curriculum are continuously evaluated by curriculum committees on the basis of achievement test results.
Assessment and Improvement of Teaching Effectiveness. Results of testing are used as a basis for determining teaching effectiveness. Knowledge of the results of testing can provide school administrators with inputs on the instructional competence of teachers under their charge. Thus, intervention programs to improve teaching effectiveness can be undertaken by principals or even supervisors on account of the results of educational measurement and evaluation.
Communication with and Involvement of Parents in Their Children's Learning. Results of educational measurement, assessment, and evaluation are utilized by school teachers in communicating to parents their children's learning difficulties. Knowing how well their children are performing academically can lead parents to forge a partnership with the school in improving and enhancing student learning.
The second method teachers utilize is observation. This method involves watching the students as they perform certain learning tasks like speaking, reading, performing laboratory investigation and participating in group activities.
Personal Contact. It refers to the teacher's daily interactions with his / her students. A teacher's observation of students as they work and relax, as well as daily conversation with them, can provide valuable clues that will be of great help in planning instruction. Observing students not only tells the teacher how well students are doing but also allows him / her to provide them with immediate feedback. Observational information is available in the classroom as the teacher watches and listens to students in various situations. Examples of these situations are as follows:
1. Oral Reading. Can the student read well or not?
2. Answering Questions. Does the student understand concepts?
3. Following Directions. Does the student follow specified instruction?
4. Seatwork. Does the student stay on task?
5. Interest in the Subject. Does the student participate actively in learning activities?
6. Using Instructional Materials. Does the student use the material correctly?
Through accurate observations, a teacher can determine whether the students are ready for the next lesson. He / She can also identify those students who are in need of special assistance.
Analysis. Through a teacher's analysis of the errors committed by students, he / she can obtain much information about their attitude and achievement. Analysis can take place either during or following instruction. Through analysis, the teacher will be able to identify students' learning difficulties immediately. Thus, teachers have to file samples of students' work for discussion during parent-teacher conferences.
Open-ended Themes and Diaries. One technique that can be used to obtain information about students is to ask them to write about their lives in and out of school. Some questions that students can be asked to react to are as follows:
1. What things do you like and dislike about school?
2. What do you want to become when you grow up?
3. What things have you accomplished which you are proud of?
4. What subjects do you find interesting? Uninteresting?
5. How do you feel about your classmates?
The use of diaries is another method for obtaining data for evaluative purposes. A diary can consist of a record, written every 3 or 4 days, in which students write about their ideas, concerns, and feelings. An analysis of students' diaries often gives valuable evaluative information.
Conferences. Conferences with parents and the students' previous teachers can also provide evaluative information. Parents often have information which can explain why students are experiencing academic problems. Previous teachers can also describe students' difficulties and the techniques they employed in correcting them. Guidance counselors can also be an excellent source of information. They can also shed light on test results and personality factors, which might affect students' performance in class.
Testing. Through testing, teachers can measure students' cognitive achievement, as well as their attitudes, values, feelings, and motor skills. It is probably the most common measurement technique employed by teachers in the classroom.
Types of Evaluation
Teachers need continuous feedback in order to plan, monitor, and evaluate their instruction. Obtaining this feedback may take any of the following types: diagnostic, formative, and summative.
Diagnostic evaluation is normally undertaken before instruction, in order to assess students' prior knowledge of a particular topic or lesson. Its purpose is to anticipate potential learning problems and group / place students in the proper course or unit of study. Placement of some elementary school children in special reading programs based on a reading comprehension test is an example of this type of evaluation. Requiring entering college freshmen to enroll in Math Plus based on the results of their entrance test in Mathematics is another example. Diagnostic evaluation can also be called pre-assessment, since it is designed to check the ability levels of the students in some areas so that instructional starting points can be established. Through this type of evaluation, teachers can obtain valuable information concerning students' knowledge, attitudes, and skills when they begin studying a subject, and this information can be employed as a basis for remediation or special instruction. Diagnostic evaluation can be based on teacher-made tests, standardized tests, or observational techniques.
Formative evaluation is usually administered during the instructional process to provide feedback to students and teachers on how well the former are learning the lesson being taught. Results of this type of evaluation permit teachers to modify instruction as needed. Remedial work is normally done to remedy deficiencies noted and bring the slow learners to the level of their classmates or peers. Basically, formative evaluation asks, how are my students doing? It uses pretests, homework, seatwork, and classroom questions. Results of formative evaluation are neither recorded nor graded but are used for modifying or adjusting instruction.
Summative evaluation is undertaken to determine students' achievement for grading purposes. Grades provide teachers with the rationale for passing or failing students, based on a wide range of accumulated behaviors, skills, and knowledge. Through this type of evaluation, students' accomplishments during a particular marking term are summarized or summed up. It is frequently based on cognitive knowledge, as expressed through test scores and written outputs. Examples of summative evaluation are chapter tests, homework grades, completed project grades, periodical tests, unit tests, and achievement tests.
This type of evaluation answers the question, how did my students fare? Results of summative evaluation can be utilized not only for judging student achievement but also for judging the effectiveness of the teacher and the curriculum.
Approaches to Evaluation
According to Escarilla and Gonzales (1990), there are two approaches to evaluation, namely: norm-referenced and criterion-referenced. Norm-referenced evaluation is one wherein the performance of a student in a test is compared with the performance of the other students who took the same examination. The following are examples of norm-referenced evaluation:
1. Karl's score in the periodical examination is below the mean.
2. Cynthia ranked fifth in the unit test in Physics.
3. Rey's percentile rank in the Math achievement test is 88.
Criterion-referenced evaluation, on the other hand, is an approach to evaluation wherein a student's performance is compared against a predetermined or agreed upon standard. Examples of this approach are as follows:
1. Sid can construct a pie graph with 75% accuracy.
2. Yves scored 7 out of 10 in the spelling test.
3. Lito can encode an article with no more than 5 errors in spelling.
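The contrast between the two approaches can be sketched in a few lines of code. The class scores, the cutoff of 7 out of 10, and the function names below are hypothetical. The percentile rank is computed here as the percent of scores in the group that fall strictly below the student's score, which is only one of several conventions in use.

```python
from statistics import mean

# Hypothetical scores of a small class on the same 10-item test.
scores = {"Karl": 4, "Cynthia": 8, "Rey": 9, "Sid": 7, "Yves": 7, "Lito": 5}


# Norm-referenced: each result depends on how the other students performed.
def below_mean(name):
    """True if the student's score is below the class mean."""
    return scores[name] < mean(scores.values())


def rank(name):
    """Rank in the class, 1 being the highest; tied scores share a rank."""
    own = scores[name]
    return 1 + sum(1 for s in scores.values() if s > own)


def percentile_rank(name):
    """Percent of scores in the group strictly below the student's score."""
    own = scores[name]
    below = sum(1 for s in scores.values() if s < own)
    return 100 * below / len(scores)


# Criterion-referenced: the result depends only on a predetermined standard.
def meets_criterion(name, cutoff=7):
    """True if the student reached the assumed cutoff score."""
    return scores[name] >= cutoff


print(below_mean("Karl"), rank("Cynthia"), round(percentile_rank("Rey"), 1))
print(meets_criterion("Yves"), meets_criterion("Karl"))
```

Changing the scores of the other students changes the output of the first three functions but never the output of `meets_criterion`, which is the essential difference between the two approaches.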
REFERENCES
Airisian, P. W. (1994). Classroom Assessment, 2nd Ed. New York: McGraw Hill, Inc.
Bloom, B. S. (1970). The Evaluation of Instruction: Issues and Problems. New York: Holt, Rinehart & Winston.
Clark, J. & I. Starr (1977). Secondary School Teaching Methods. New York: Macmillan Publishing Company.
Escarilla, E. R. & E. A. Gonzales (1990). Measurement and Evaluation in Secondary Schools. Makati: Fund for Assistance to Private Education (FAPE).
Jaeger, R. M. (1997). Educational Assessment: Trends and Practices. New York: Holt, Rinehart & Winston.
Kellough, R. D., et al. (1993). Middle School Teaching Methods and Resources. New York: Macmillan Publishing Company.
Payne, D. A. (2003). Measuring and Evaluating Educational Outcomes. New York: Macmillan Publishing Company.
Test Defined
A test is a systematic procedure for measuring an individual's behavior (Brown, 1991). This definition implies that it has to be developed following specific guidelines. It is a formal and systematic way of gathering information about the learner's behavior, usually through a paper and pencil procedure (Airisian, 1989). Through testing, teachers can measure students' acquisition of knowledge, skills, and values in any learning area in the curriculum. While testing is the most common measurement technique teachers use in the classroom, there are certain limitations in its use. As pointed out by Moore (1992), tests cannot measure student motivation, physical limitations, and even environmental factors. The foregoing indicates that testing is only one of the measures of students' learning and achievement.
Uses of Tests
Tests serve many functions for school administrators, supervisors, teachers, and parents as well (Arends, 1994; Escarilla & Gonzales, 1990). School administrators utilize test results for making decisions regarding the promotion or retention of students; improvement or enrichment of the curriculum; and conduct of staff development programs for teachers. Through test results, school administrators can also have a clear picture of the extent to which the objectives of the school's instructional program are achieved. Supervisors use test results in discovering learning areas needing special attention and identifying teachers' weaknesses and learning competencies not mastered by the students. Test results can also provide supervisors with baseline data on curriculum revision. Teachers, on the other hand, utilize tests for numerous purposes. Through testing, teachers are able to gather information about the effectiveness of instruction; give feedback to students about their progress; and assign grades.
Parents, too, derive benefits from tests administered to their children. Through test scores, they are able to determine how well their sons and daughters are faring in school and how well the school is doing its share in educating their children.
Types of Tests
Numerous types of tests are used in school. There are different ways of categorizing tests, namely: ease of quantification of response, mode of preparation, mode of administration, test constructor, mode of interpreting results, and nature of response (Manarang & Manarang, 1983; Louisell & Descamps, 1992). As to mode of response, tests can be oral, written, or performance.
1. Oral Test. It is a test wherein the test taker gives his answer orally.
2. Written Test. It is a test where answers to questions are written by the test taker.
3. Performance Test. It is one in which the test taker creates an answer or a product that demonstrates his knowledge or skill, as in cooking and baking.
As to ease of quantification of response, tests can either be objective or subjective.
1. Objective Test. It is a paper and pencil test wherein students' answers can be compared and quantified to yield a numerical score. This is because it requires a convergent or specific response.
2. Subjective Test. It is a paper and pencil test which is not easily quantified, as students are given the freedom to write their answer to a question, such as an essay test. Thus, the answer to this type of test is divergent.
As to mode of administration, tests can either be individual or group.
1. Individual Test. It is a test administered to one student at a time.
2. Group Test. It is one administered to a group of students simultaneously.
As to test constructor, tests can either be standardized or unstandardized.
1. Standardized Test. It is a test prepared by a specialist. This type of test samples behavior under uniform procedures. Questions are administered to students with the same directions and time limits. Results in this kind of test are scored following a detailed procedure based on its manual and interpreted based on specified norms or standards.
2. Unstandardized Test. It is one prepared by teachers for use in the classroom, with no established norms for scoring and interpretation of results. It is constructed by a classroom teacher to meet a particular need.
As to the mode of interpreting results, tests can either be norm-referenced or criterion-referenced.
1. Norm-referenced Test. It is a test that evaluates a student's performance by comparing it to the performance of a group of students on the same test.
2. Criterion-referenced Test. It is a test that measures a student's performance against a predetermined or agreed upon standard.
As to the nature of the answer, tests can be categorized into types that include personality, intelligence, aptitude, and achievement tests, among others.
1. Personality Test. It is a test that measures aspects of an individual's personality. Some areas tested in this kind of test include the following: emotional and social adjustment; dominance and submission; value orientation; disposition; emotional stability; frustration level; and degree of introversion or extroversion.
2. Intelligence Test. It is a test that measures the mental ability of an individual.
3. Aptitude Test. It is a test designed for the purpose of
identify their specific strengths and weaknesses in past and present learning.
7. Formative Test. It is a test given to improve teaching and learning while it is going on. A test given after teaching the lesson for the day is an example of this type of test.
8. Sociometric Test. It is a test used in discovering learners' likes and dislikes, preferences, and their social acceptance, as well as social relationships existing in a group.
9. Trade Test. It is a test designed to measure an individual's
1. Knowledge Level: remembering facts, concepts, and other important data on any topic or subject.
2. Comprehension Level: clarification and articulation of the main idea of what students are learning.
3. Application Level: behaviors that have something to do with problem solving and expression, which require students to apply what they have learned to other situations or cases in their lives.
4. Analysis Level: behaviors that require students to think critically, such as looking for motives, assumptions, cause-effect relationships, differences and similarities, hypotheses, and conclusions.
5. Synthesis Level: behaviors that call for creative thinking, such as combining elements in new ways, planning original experiments, creating original solutions to a problem, and building models.
6. Evaluation Level: behaviors that necessitate judging the
How Long the Test Should Be. The answer to this question depends on the following factors: the age and attention span of the students, and the type of questions to be used.
How Best to Prepare Students for Testing. To prepare students for testing, Airisian (1994) recommends the following measures: (1) providing learners with good instruction; (2) reviewing students before testing; (3) familiarizing students with question formats; (4) scheduling the test; and (5) providing students information about the test.
Completion Drawing Type. An incomplete drawing is presented which the student has to complete.
Example: In the following food web, draw arrow lines indicating which organisms are consumers and which are producers.
Completion Statement Type. An incomplete sentence is presented and the student has to complete it by filling in the blank.
Example: The capital city of the Philippines is __________________.
Correction Type. A sentence with an underlined word or phrase is presented, which the student has to replace to make the sentence right.
Example: Change the underlined word / phrase to make each of the following statements correct. Write your answer on the space before each number.
__________ 1. The theory of evolution was popularized by Gregor Mendel.
__________ 2. Hydrography is the study of oceans and ocean currents.
Identification Type. A brief description is presented and the student has to identify what it is.
Example: To what does each of the following refer? Write your answer on the blank before each number.
__________ 1. A flat representation of all curved surfaces of the earth.
__________ 2. The transmission of parents' characteristics and traits to their offspring.
Simple Recall Type. A direct question is presented for the student to answer using a word or phrase.
Example: What is the product of two negative numbers? Who is the national hero of the Philippines?
Short Explanation Type. It is similar to an essay test but requires a short answer.
Example: Explain in a complete sentence why the Philippines was not really discovered by Magellan.
Selection Types of Objective Test. Included in the category of selection type are the grouping type, matching type, multiple choice type, alternate response type, key list test, and interpretive exercise.
the students in a specified order.
Example 1: Arrange the following events chronologically by writing the letters A, B, C, D, E on the spaces provided.
_______ Glorious Revolution
_______ Russian Revolution
_______ American Revolution
_______ French Revolution
_______ Puritan Revolution
Example 2: Arrange the following planets according to their nearness to the sun, by using numbers 1, 2, 3, 4, 5.
_______ Pluto
_______ Venus
_______ Jupiter
_______ Mars
_______ Saturn
of lettered choices.
Example: Match the country in Column 1 with its capital city in Column 2. Write letters only.
Column 1
________ 1. Philippines
________ 2. Japan
________ 3. United States
________ 4. Great Britain
________ 5. Israel
Column 2
a. Washington D. C.
b. Jeddah
c. Jerusalem
d. Manila
e. London
f. Tokyo
g. New York
problem or unfinished sentence followed by several responses.
Example: The study of value is
(a) axiology
(b) logic
(c) epistemology
(d) metaphysics
two possible answers to the question. The true-false format is a form of the alternative response type. Variations on true-false include yes-no, agree-disagree, and right-wrong.
Example: Write True if the statement is true; False if it is false.
_________ 1. Lapulapu was the first Asian to repulse European colonizers in Asia.
_________ 2. Magellan's expedition to the Philippines led to the first circumnavigation of the globe.
_________ 3. The early Filipinos were uncivilized before the Spanish conquest of the archipelago.
_________ 4. The Arabs introduced Islam in Southern Philippines.
paired concepts based on a specified set of criteria (Olivia, 1998).
Example: Examine the paired items in Column 1 and Column 2. On the blank before each number, write:
A = If the item in Column 1 is an example of the item in Column 2;
B = If the item in Column 1 is a synonym of the item in Column 2;
C = If the item in Column 2 is the opposite of the item in Column 1; and
D = If the items in Columns 1 and 2 are not related in any way.
Column 1                       Column 2
_____ 1. capitalism            economic system
_____ 2. labor intensive       capital intensive
_____ 3. planned economy       command economy
_____ 4. opportunity cost      demand and supply
_____ 5. free goods            economic goods
of test that can assess higher cognitive behaviors. According to Airisian (1994) and Mitchell (1992), an interpretive exercise provides students with some information or data followed by a series of questions on that information. In responding to the questions in an interpretive exercise, the students have to analyze, interpret, or apply the material provided, like a map, excerpt of a story, passage of a poem, data matrix, table, or cartoon.
Example: Examine the data on child labor in Europe during the period immediately after the Industrial Revolution in the continent. Answer the questions given below by encircling the letter of your choice.
TABLE 1
Child Labor in the Years Right After the Industrial Revolution in Europe
Year     Number of Child Laborers
1750     1800
1760     3000
1770     5000
1780     3400
1790     1200
1800     600
1820     150
1.
sudden increase in the number of child laborers?
a. 1760     b. 1770     c. 1780     d. 1790
3.
in addressing the problems of child labor. In what year is this evident?
a. 1780     b. 1790     c. 1800     d. 1820
Essay Test
This type of test presents a problem or question, and the student is to compose a response in paragraph form, using his or her own words and ideas. There are two forms of the essay test: brief or restricted, and extended.
Brief or Restricted Essay Test. It requires a limited amount of writing or requires that a given problem be solved in a few sentences.
Example: Why did early Filipino revolts fail? Cite and explain 2 reasons.
Extended Essay Test. It requires the student to present his answer in several paragraphs or pages of writing. It gives students more freedom to express ideas and opinions and to use synthesizing skills to change knowledge into a creative idea.
Example: Explain your position on the issue of charter change in the Philippines.
According to Reyes (2000) and Gay (1985), the essay test is appropriate to use when learning outcomes cannot be adequately measured by objective test items. Nevertheless, all levels of cognitive behaviors can be measured with the use of the essay test as shown below.
Comprehension Level: What does it mean when a person has crossed the Rubicon?
Application Level: Cite three instances showing the application of the Law of Supply and Demand.
Analysis Level: Analyze the annual budget of your college as to categories of funds, sources of funds, major
Synthesis Level: Discuss the significance of the People Power Revolution in the restoration of democracy in the Philippines.
Evaluation Level: Are you in favor of the political platform of the People's Reform Party? Justify your answer.
Choosing the type of test depends on the teacher's purpose and the amount of time to be spent on the test. As a general rule, teachers must create specific tests that will allow students to demonstrate targeted learning competencies.
CHAPTER 4 An Introduction to the Assessment of Learning in the Psychomotor and Affective Domains
As pointed out in the previous chapter, there are three domains of learning objectives that teachers have to assess. While it is true that achievement in the cognitive domain is the one teachers measure frequently, students' growth in the non-cognitive domains of learning should be given equal emphasis. This chapter expounds different ways by which learning in the psychomotor and affective domains can be assessed and evaluated.
Levels of Learning in the Psychomotor Domain
The psychomotor domain of learning is focused on processes and skills involving the mind and the body (Eby & Kujawa, 1994). It is the domain of learning which classifies objectives dealing with physical movement and coordination (Arends, 1994; Simpson, 1966). Thus, objectives in the psychomotor domain require significant motor performance. Playing a musical instrument, singing a song, drawing, dancing, putting a puzzle together, reading a poem, and presenting a speech are examples of skills developed in the aforementioned domain of learning. There are three levels of psychomotor learning: imitation, manipulation, and precision.
Imitation is the ability to carry out the basic rudiments of a skill when given directions and under supervision. At this level, the total act is not performed skillfully. Timing and coordination of the act are not yet refined.
Manipulation is the ability to perform a skill independently. The entire skill can be performed in sequence. Conscious effort is no longer needed to perform the skill, but complete accuracy has not been achieved yet.
Precision is the ability to perform an act accurately, efficiently, and harmoniously. Complete coordination of the skill has been acquired. The skill has been internalized to such an extent that it can be performed unconsciously.
Based on the foregoing list of objectives, it can be noted that these objectives range from simple reflex reactions to complex actions, which communicate ideas or emotions to others. Moreover, these objectives serve as a reminder to every teacher that students under his charge have to learn a variety of skills and be able to think and act in simple and complex ways.
Measuring the Acquisition of Motor and Oral Skills
There are two approaches that teachers can use in measuring the acquisition of motor and oral skills in the classroom: observation of student performance and evaluation of student projects (Gay, 1990).
Observation of Student Performance is an assessment approach in which the learner does the desired skill in the presence of the teacher. For instance, in a Physical Education class, the teacher can directly observe how male students dribble and shoot the basketball. In this approach, the teacher observes the performance of a student, gives feedback, and keeps a record of his performance, if appropriate. Observation of student performance can either be holistic or atomistic (Louisell & Descamps, 1992). Holistic observation is employed when the teacher gives a score or feedback based on pre-established prototypes of how an outstanding, average, or deficient performance looks. Prior to the observation, the teacher describes the different levels of performance.
A teacher, for example, who requires his students to make an oral report on a research they undertook describes the factors which go into an ideal presentation. What the teacher may consider in grading the report includes the following: knowledge of the topic; organization of the presentation of the report; enunciation; voice projection; and enthusiasm. The ideal presentation has to be described, and the teacher has to comment on each of these factors. A student whose presentation closely matches the ideal described by the teacher would receive a perfect mark.
The second type of observation that can be utilized is atomistic or analytic. This type of observation requires that a task analysis be conducted in order to identify the major subtasks involved in the student performance. For example, in dribbling the ball, the teacher has to identify the movements necessary to perform the task. Then, he has to develop a checklist which enumerates the movements necessary to the performance of the task. These movements are demonstrated by the teacher. As students perform the dribbling of the ball, the teacher assigns checkmarks for each of the various subtasks. After the student has performed the specified action, all checkmarks are considered and an assessment of the performance is made.
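Atomistic observation can be pictured as a simple checklist tally. The subtasks listed below are hypothetical and are not taken from the text; a teacher would derive the actual list from his own task analysis. The sketch only shows how checkmarks for each subtask can be counted to arrive at an overall assessment.

```python
# Hypothetical subtasks from a task analysis of dribbling a basketball.
SUBTASKS = [
    "keeps eyes up",
    "uses fingertips",
    "keeps ball below waist",
    "protects ball with body",
]


def score_checklist(checks):
    """Return the number and the proportion of subtasks that were checked.

    `checks` maps each subtask to True (checkmark) or False (no checkmark);
    a subtask missing from the record counts as not checked.
    """
    done = sum(1 for task in SUBTASKS if checks.get(task, False))
    return done, done / len(SUBTASKS)


# One student's observation record, filled in while the student performs.
observed = {
    "keeps eyes up": True,
    "uses fingertips": True,
    "keeps ball below waist": False,
    "protects ball with body": True,
}

done, proportion = score_checklist(observed)
print(done, proportion)
```

A holistic observation, by contrast, would assign one overall mark by comparing the whole performance with a described prototype, without tallying individual subtasks.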
Evaluation of Student Products is another approach that teachers can use in the assessment of students' mastery of skills. For example, projects in different learning areas may be utilized in assessing students' progress. Student products include drawings, models, construction paper products, etc. The same principles involved in holistic and atomistic observations apply to the evaluation of projects. The teacher has to identify prototypes representing different levels of performance for a project or do a task analysis and assign scores by subtasks. In either case, the students have to be informed of the criteria and procedures to be used in the assessment of their work.
According to Airasian (1994), there are four steps to consider in making use of this type of performance assessment: (1) establishing a clear purpose; (2) setting performance criteria; (3) creating an appropriate setting; and (4) forming scoring criteria or predetermined rating. Purpose is very important in carrying out portfolio assessment. Thus, there is a need to determine beforehand the objective of the assessment and the guidelines for the student products that will be included in the portfolio. While teachers need to collaborate with their colleagues in setting common criteria, it is crucial that they involve their students in setting standards of performance. This will enable the latter to claim ownership over their performance. Portfolio assessment also needs to consider the setting in which students' performance will be gathered. Shall it be a written portfolio? Shall it be a portfolio of oral or physical performances, science experiments, artistic productions and the like? Setting has to be looked into since arrangements have to be made on how the desired performance can be properly collected.
Lastly, scoring methods and judging students performance are required in portfolio assessment. Scoring students portfolio, however, is time consuming as a series of documents and performances has to be scrutinized and summarized. Rating scales, anecdotal records, and checklists can be used in scoring students portfolios. The content of a portfolio, however, can be reported in the form of a narrative.
Arrange the scales either from positive to negative or vice versa. Write directions for accomplishing the rating scale. Following is an example of a rating scale for judging a student teacher's presentation of a lesson.
Use of questions: 5 4 3 2 1
Directions and refocusing: 5 4 3 2 1
Use of reinforcement: 5 4 3 2 1
Use of teaching aids and instructional materials: 5 4 3 2 1
A checklist differs from a rating scale as it indicates the presence or absence of specified characteristics. It is basically a list of criteria upon which a students performance or end product is to be judged. The checklist is used by simply checking off the criteria items that have been met. Response on a checklist varies. It can be a simple check mark indicating that an action took place. For instance, a checklist for observing student participation in the conduct of a group experiment may appear like this: 1. Displays interest in the experiment. 2. Helps in setting up the experiment. 3. Participates in the actual conduct of the experiment. 4. Makes worthwhile suggestions.
The rater would simply check the items that occurred during the conduct of the group experiment. Another type of checklist requires a yes or no response. The yes is checked when the action is done satisfactorily; the no is checked when the action is done unsatisfactorily. Below is an example of this type of checklist.
_______________ ______________ _______________ ______________ _______________ ______________ _______________ ______________ _______________ ______________ _______________ ______________ _______________ ______________
Receiving involves being aware of and being willing to freely attend to a stimulus.
Responding involves active participation. It involves not only freely attending to a stimulus but also voluntarily reacting to it in some way. It requires physical, active behavior.
Valuing refers to voluntarily giving worth to an object, phenomenon or stimulus. Behaviors at this level reflect a belief, appreciation, or attitude.
Commitment involves building an internally consistent value system and freely living by it. A set of criteria is established and applied in making choices.
encourage students to express feelings, attitudes, and values about topics discussed in class. They can observe students and may find evidence of some affective learning. Although it is difficult to assess learning in the affective domain, there are some tools that teachers can use in assessing learning in this area. Some of these tools are the following: attitude scale; questionnaire; simple projective techniques; and self expression techniques (Escarilla & Gonzales, 1990; Ahmann & Glock, 1991). Attitude Scale is a form of rating scale containing statements designed to gauge students' feelings on an attitude or behavior. An example of an attitude scale is shown below.
Response to the items is based on the response code provided in the attitude scale. A value ranging from 1 to 5 is assigned to the options provided. The value of 5 is usually assigned to the option strongly agree and 1 to the option strongly disagree. When a statement is negative, however, the assigned values are usually reversed. The composite score is determined by adding the scale values and dividing the sum by the number of statements or items. Questionnaire can also be used in evaluating attitudes, feelings, and opinions. It requires students to examine themselves and react to a series of statements about their attitudes, feelings, and opinions. The response style for a questionnaire can take any of the following forms: checklist type, semantic differential, and Likert scale.
The Checklist type of response provides the students a list of adjectives for describing or evaluating something and requires them to check those that apply. For example, a checklist questionnaire on students attitudes in a science class may include the following: This class is ________________ boring. ________________ exciting. ________________ interesting. ________________ unpleasant. ________________ highly informative. I find Science ________________ fun. ________________ interesting. ________________ very tiring. ________________ difficult. ________________ easy. The scoring of this type of test is simple. Subtract the number of negative statements checked from the number of positive statements checked.
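The scoring rule just described can be sketched in a few lines of Python. This is only an illustration: the function name `checklist_score` and the sorting of the sample adjectives into positive and negative sets are assumptions made here, not part of the original text.

```python
# Hypothetical classification of the adjectives in the sample checklist.
POSITIVE = {"exciting", "interesting", "highly informative", "fun", "easy"}
NEGATIVE = {"boring", "unpleasant", "very tiring", "difficult"}


def checklist_score(checked, positive=POSITIVE, negative=NEGATIVE):
    """Return positive items checked minus negative items checked."""
    n_positive = sum(1 for adjective in checked if adjective in positive)
    n_negative = sum(1 for adjective in checked if adjective in negative)
    return n_positive - n_negative


# A student who checks two positive adjectives and one negative adjective.
print(checklist_score(["exciting", "interesting", "boring"]))
```

A score above zero suggests a generally favorable attitude, while a score below zero suggests an unfavorable one.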
Semantic differential is another type of response on a questionnaire. It is usually a five point scale showing polar or opposite adjectives. It is designed so that attitudes, feelings, and opinions can be measured by degrees from very favorable to very unfavorable. Given below is an example of a questionnaire employing the aforementioned response type.
Working with my group members is:
Interesting _____ : _____ : _____ : _____ : _____ Boring
Challenging _____ : _____ : _____ : _____ : _____ Difficult
Fulfilling _____ : _____ : _____ : _____ : _____ Frustrating
The composite score on the total questionnaire is determined by averaging the scale values given to the items included in the questionnaire. Likert scale is one of the frequently used styles of response in attitude measurement. It is oftentimes a five point scale that links the options strongly agree and strongly disagree. An example of this kind of response is shown below.
A Likert Scale for Assessing Students Attitude Towards Leadership Qualities of Student Leaders
Name ____________________________________ Date _____________
Read each statement carefully. Decide whether you agree or disagree with each of them. Use the following response code: 5 = Strongly Agree; 4 = Agree; 3 = Undecided; 2 = Disagree; 1 = Strongly Disagree. Write your response on the blank before each item.
Student leaders:
1. Have to work for the benefit of the students.
2. Should set an example of good behavior to the members of the organization.
3. Need to help the school in implementing campus rules and regulations.
4. Have to project a good image of the school in the community.
5. Must speak constructively of the school's teachers and administrators.
Scoring of a Likert scale is similar to the scoring of an attitude scale earlier presented in this chapter.
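As a minimal sketch of that scoring procedure, the Python function below adds the scale values, reverses the values of negatively worded statements, and divides by the number of items. The function name `composite_score` and its parameters are hypothetical, chosen only for this illustration.

```python
def composite_score(responses, negative_items=(), scale_max=5):
    """Average the scale values of an attitude or Likert scale.

    responses      -- one scale value (1..scale_max) per item, in item order
    negative_items -- 1-based numbers of negatively worded items, whose
                      values are reversed (5 becomes 1, 4 becomes 2, ...)
    """
    if not responses:
        raise ValueError("at least one response is required")
    total = 0
    for number, value in enumerate(responses, start=1):
        if not 1 <= value <= scale_max:
            raise ValueError(f"item {number}: value {value} is out of range")
        if number in negative_items:
            value = scale_max + 1 - value  # reverse the assigned value
        total += value
    return total / len(responses)


# Five positively worded items, as in the leadership scale above.
print(composite_score([5, 4, 3, 4, 5]))
```

Items are numbered from 1 so that `negative_items` can be read straight off the printed instrument.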
Simple projective techniques are usually used when a teacher wants to probe deeper into the students' feelings and attitudes. Escarilla and Gonzales (1990) say that there are three types of simple projective techniques that can be used in the classroom, namely: word association, unfinished sentences, and unfinished story. In word association, the student is given a word and asked to mention what comes to his / her mind upon hearing it. For example, what comes to your mind upon hearing the word corruption? In an unfinished sentence, the students are presented partial sentences and are asked to complete them with words that best express their feelings, for instance:
Given the chance to choose, I _____________________________.
I am happy when _______________________________________.
My greatest failure in life was ______________________________.
In an unfinished story, a story with no ending is deliberately presented to the students, which they have to finish or complete. Through this technique, the teacher will be able to sense students' worries, problems, and concerns.
Another way by which affective learning can be assessed is through the use of self expression techniques. Through these techniques, students are provided the opportunity to express their emotions and views about issues, themselves, and others. Self expression techniques may take any of the following forms: log book of daily routines or activities, diaries, essays and other written compositions or themes, and autobiographies.
CHAPTER REVIEW
1. What is meant by psychomotor learning? What are the levels of learning under the psychomotor domain? Explain each. 2. What are the two general approaches in measuring the acquisition of motor and oral skills? Differentiate each. 3. What are the guidelines to observe in undertaking atomistic and holistic observation? 4. What is portfolio assessment? What are the advantages of using this type of assessment in evaluating student performance and student products? 5. What are the guidelines to observe in using portfolio assessment in the classroom?
6. What are the tools teachers can use in measuring students acquisition of motor and oral skills? Briefly define each. 7. What do we mean by affective learning? What are the different levels of affective learning? Describe each briefly. 8. What are the techniques teachers can employ in evaluating affective learning? Discuss each very briefly.
Constructing paper and pencil tests is a professional skill. Becoming proficient at it takes study and practice. Owing to the recognized importance of a testing program, a prospective teacher has to assume this task seriously and responsibly. He / She needs to be familiar with the different types of test items and how best to write them. This chapter seeks to equip prospective teachers with the skill in constructing objective paper and pencil tests.
Measure all instructional objectives. The test a teacher writes should be congruent with all the learning objectives focused in class.
Cover all learning tasks. A good test is not focused only on one type of objective. It must be truly representative of all targeted learning outcomes.
Use appropriate test items. Test items utilized by a teacher have to be in consonance with the learning objectives to be measured.
Make test valid and reliable. Teachers have to see to it that the test they construct measures what it purports to measure. Moreover, they need to ensure that the test will yield consistent results for the students taking it for the second time.
Use test to improve learning. Test scores obtained by the students can serve as springboards for the teachers to re-teach concepts and skills that the former have not mastered.
Validity It is the degree to which a test measures what it seeks to measure. To determine whether a test a teacher constructed is valid or not, he / she has to answer the following questions: 1. Does the test adequately sample the intended content? 2. Does it test the behaviors / skills important to the content being tested? 3. Does it test all the instructional objectives of the content taken up in class?
Reliability It is the accuracy with which a test consistently measures that which it does measure. A test, therefore, is reliable if it produces similar results when used repeatedly. A test may be reliable but not necessarily valid. On the other hand, a valid test is always a reliable one.
Objectivity It is the extent to which personal biases or subjective judgment of the test scorer is eliminated in checking the student responses to the test items, as there is only one correct answer for each question. For a test to be considered objective, experts must agree on the right or the best answer. Thus, objectivity is a characteristic of the scoring of the test and not of the form of the test questions.
Scorability The test is easy to score or check as an answer key and answer sheet are provided.
Administrability It is easy to administer as clear and simple instructions are provided to students, proctors, and scorers.
Relevance It is the correspondence between the behavior required to respond correctly to a test item and the purpose or objective in writing the item. The test item should be directly related to the course objectives and actual instruction. When used in relation to educational assessment, relevance is considered a major contributor to test validity.
Balance Balance in a test refers to the degree to which the proportion of items testing particular outcomes corresponds to the ideal test. The framework of the test is outlined by a table of specifications.
Efficiency It refers to the number of meaningful responses per unit of time. A compromise has to be made among the available time for testing, scoring, and relevance.
Difficulty The test items should be appropriate in difficulty level to the group being tested. In general, for a norm referenced test, a reliable test is one in which each item is passed by half of the students. For a criterion referenced test, difficulty can be judged relative to the percentage passing before and after instruction. Difficulty will ultimately depend on the skill and knowledge measured and on the students' ability.
Discrimination For a norm referenced test, the ability of an item to discriminate is generally indexed by the difference between the proportion of good and poor students who respond correctly. For a criterion referenced test, discrimination refers to the ability of the test or item to distinguish competent from less competent students.
Fairness To ensure fairness, the teacher should construct and administer the test in a manner that allows students an equal chance to demonstrate their knowledge or skills.
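The ideas of difficulty and discrimination described above can be illustrated with a short Python sketch. It assumes that each response to an item is coded 1 for correct and 0 for incorrect, and that the class has already been split into a group of good (upper) and poor (lower) students. The function names are hypothetical and used only for this illustration.

```python
def difficulty_index(item_responses):
    """Proportion of students who answered the item correctly (1 = correct)."""
    if not item_responses:
        raise ValueError("no responses given")
    return sum(item_responses) / len(item_responses)


def discrimination_index(upper_group, lower_group):
    """Proportion correct among good students minus that among poor students."""
    return difficulty_index(upper_group) - difficulty_index(lower_group)


# One item answered by four good students and four poor students.
upper = [1, 1, 1, 0]
lower = [1, 0, 0, 0]
print(difficulty_index(upper + lower))     # half of the students pass the item
print(discrimination_index(upper, lower))  # good students do better on the item
```

An item passed by half of the group has a difficulty index of 0.5, and a positive discrimination index means that the good students outperformed the poor students on that item.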
Identification of instructional objectives and learning outcomes. This is the first step a teacher has to undertake when constructing classroom tests. He / She has to identify instructional objectives and learning outcomes, which will serve as his / her guide in writing test items.
Listing of the Topics to be Covered by the Test. After identifying the instructional objectives and learning outcomes, a teacher needs to outline the topics to be included in the test.
Preparation of Table of Specification (TOS). The table of specifications is a two way table showing the content coverage of the test and the objectives to be tested. It can serve as a blueprint in writing the test items later.
Selection of the Appropriate Types of Tests. Based on the TOS, the teacher has to select test types that will enable him / her to measure the instructional objectives in the most effective way. Choice of test type depends on what shall be measured.
Writing Test Items. After determining the type of test to use, the teacher proceeds to write the suitable test items.
Sequencing the Items. After constructing the test items, the teacher has to arrange them based on difficulty. As a general rule, items have to be sequenced from the easiest to the most difficult for psychological reasons.
Writing the Directions or Instructions. After sequencing items, the teacher has to write clear and simple directions, which the students will follow in answering the test questions.
Preparation of the Answer Sheet and Scoring Key. To facilitate checking of students' answers, the teacher has to provide answer sheets and prepare a scoring key in advance.
topic covered. The formula to be applied is as follows: % for a topic = (number of days or hours spent teaching the topic ÷ total number of days or hours spent teaching the unit) × 100. Example: Mrs. Sid Garcia utilized 10 hours for teaching the unit on Pre Spanish Philippines. She spent 2 hours in teaching the topic, Early Filipinos and their Society. What percentage of test items should she allocate for the aforementioned topic? Solution: (2 ÷ 10) × 100 = 20%
5. Determine the number of items to construct for each topic.
This can be done by multiplying the percentage allocation for each topic by the total number of items to be constructed. Example: Mrs. Sid Garcia decided to prepare a 50 item test on the unit, Pre Spanish Philippines. How many items should she write for the topic mentioned in step number 4? Solution: 50 items x 0.20 (20%) = 10 items
6. Distribute the number of items to the objectives to be tested. The number of items allocated for each objective depends on the degree of importance attached by the teacher to it. After going through the six steps, the teacher has to write the TOS in a grid or matrix.
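The arithmetic in steps 4 and 5 can be checked with a small Python sketch that reproduces Mrs. Garcia's example. The function names are hypothetical, and rounding the item count to the nearest whole number is an assumption made here, since the text does not say how fractional items should be handled.

```python
def topic_percentage(hours_on_topic, total_hours):
    """Percentage of test items to allocate to a topic."""
    return hours_on_topic * 100 / total_hours


def items_for_topic(percentage, total_items):
    """Number of items to write for a topic (rounded: an assumption)."""
    return round(percentage * total_items / 100)


# 2 hours on the topic out of 10 hours for the unit, on a 50 item test.
percentage = topic_percentage(2, 10)
print(percentage)                       # 20.0
print(items_for_topic(percentage, 50))  # 10
```

Repeating the two calls for every topic in the unit gives the item counts that go into the TOS grid.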
In writing directions, use a clear, succinct style. Be as explicit as possible but avoid long drawn out explanations. Emphasize the more important directions and key activities through the use of underlining, italics, or different type size or style. Field test or pretest the directions with a sample of both examinees and examiners to identify possible misunderstandings and inconsistencies and to gather suggestions for improvement. Keep directions for different forms, subsections or booklets as uniform as possible.
Where necessary or helpful, give practice items before each regular section. This is very important when testing young children or those unfamiliar with objective tests or separate answer sheets.
The correct answer depends on what is meant by serious. Considering that heart disease leads to more deaths, mental illness affects a number of people, and AIDS is a world wide problem nowadays, there are three possible answers. Nevertheless, the question can be reworded as follows, for example: Improved: The leading cause of death in the world today is: (C) Heart disease (D) Cancer
To be able to write effective multiple choice items, the following guidelines should be followed:
1. Each item should be clearly stated, in the form of a question or an incomplete statement.
2. Do not provide grammatical or contextual clues to the correct answer. For instance, the use of an before the options indicates that the answer begins with a vowel.
3. Use language that even the poorest readers will understand.
5. Each alternate response should fit the stem in order to avoid giving clues to its correctness.
6. Refrain from using negatives or double negatives. They tend to make the items confusing and difficult.
7. Use all of the above and none of the above only when they will contribute more than another plausible distractor.
8. Do not use items directly from the textbook. Test for understanding, not memorization.
Examine the following multiple choice items.
Sample 1: A two way grid summarizing the relationship between test scores and criterion scores is sometimes referred to as an:
(A) Correlation coefficient
(B) Expectancy table
(C) Probability histogram
(D) Bivariate frequency distribution
Sample 1 is faulty because of its use of the article an. This article can lead the student to the correct answer, B, which is the only option that begins with a vowel.
Improved: Two way grids summarizing test criterion relationships are sometimes called:
(A) Correlation coefficients
(B) Expectancy tables
(C) Probability histograms
(D) Bivariate frequency distributions
Sample 2: Which of the following descriptions makes clear the meaning of the word electron?
(A) An electronic gadget
(B) Neutral particles
(C) Negative particles
(D) A voting machine
(E) The nuclei of atoms
Sample 2 is poorly written owing to its use of distractors that are not plausible or closely related to each other. Options A and D are not in any way associated with the remaining choices or alternatives.
Improved: Which of the following phrases is a description of an electron?
(A) Neutral particle
(B) Negative particle
(C) Neutralized proton
(D) Related particle
(E) Atom nucleus
Sample 3: What is the area of a right triangle whose sides adjacent to the right angle are 4 inches and 3 inches, respectively?
Sample 3 is also erroneously written as it used the option none of the above without caution. Why? This is because the answer is 6 square inches and the bright student will definitely choose option D. On the other hand, the student who solved the problem incorrectly and came up with an answer not found among the choices would also choose D, thereby getting the correct answer for the wrong reason. The option none of the above can be a good alternative if the correct answer is included among the options or choices.
Improved: What is the area of a right triangle whose sides adjacent to the right angle are 4 inches and 3 inches, respectively?
(A) 6 square inches
(B) 7 square inches
(C) 12 square inches
(D) 13 square inches
(E) none of the above
Using Multiple Choice Items in Assessing Problem Solving and Logical Thinking
Schools today are stressing problem solving skills owing to society's pressure on them to produce individuals with significant skills in this area. A number of terms have been used to describe the basic operations of application. Terms like critical thinking and logical reasoning are used as rubrics under which the basic processes of problem identification, specification of alternative solutions, evaluation of consequences, and solution selection are grouped.
Creating problem solving measures follows a step by step procedure (Haladyna & Downing, 1999).
Step 1. Decide on the principle / s to be tested. The principles and the problem situation should:
Be known principles, but the situation in which the principles are to be applied should be new.
Involve significantly important principles.
Be pertinent to a problem or situation common to all students.
Be within the range of comprehension of all students.
Use only valid and reliable sources from which to draw data.
Be interesting to students.
Step 2. Determine the phrasing of the problem situation so as to require the students, in drawing their conclusion, to do one of the following:
Make a prediction.
Choose a course of action.
Offer an explanation for an observed phenomenon.
Criticize a prediction or explanation made by others.
Step 3. Set up the problem situation in which the principle or principles selected operate. Present the problem to the class with directions to draw a conclusion or conclusions and give several supporting reasons for their answer.
Step 4. Edit the students' answers, selecting those that are most representative of their thinking. These will include both conclusions and supporting reasons that are acceptable and unacceptable.
Step 5. To the conclusions and reasons obtained from the students, the teacher now adds any others that he or she feels are necessary to cover the salient points. The total number of items should be at least 50% more than is desired in the final form to allow for elimination of poor items. Some types of statements that can be used are as follows:
True statements of principles and facts
False statements of principles and facts
Acceptable and unacceptable analogies
Appeal to acceptable or unacceptable authority
Ridicule
Assumption of the conclusion
Teleological explanations
Step 6. Submit tests to colleagues or evaluators for criticisms. Revise the test based on these criticisms.
Step 7. Administer the test. Follow with thorough class discussion.
Step 8. Conduct an item analysis.
Step 9. In the light of steps 7 and 8, revise the test.
Following are some examples of problem solving items.
1. Ulysses wanted to go to the US. But Ulysses' father, who is quite strict with him, stated emphatically that he could not go unless he got a grade of 1.25 in both his freshman English courses. Ulysses' father always keeps his promises. When summer came, Ulysses went to the US. If from this information, you conclude that Ulysses earned 1.25, you must be assuming that:
(A) Ulysses had never obtained a grade of 1.25 before.
(B) Ulysses had no money of his own.
(C) Ulysses' father was justified in saying what he did.
(D) Ulysses went to the US with his father's consent.
(E) Ulysses was very sure that he would be able to go.
2. Consider these facts about the coloring of animals: Plant lice, which live on the stems of green plants,
the trees on which it lives. Insects, birds, and mammals that live in the desert
region are white. Which one of the following statements do these facts tend to support? Animals that prey on others use colors as disguise. Some animals imitate the color and shape of other natural objects for protection. The coloration of animals has to do with their surroundings.
Protective coloration is found more among insects and birds than among mammals. Many animals and insects have protective coloring.
6. Refrain from creating a pattern of response.
7. Present a similar number of true and false statements.
8. Be sensitive to the use of specific determiners. Words such as always, all, never, and none indicate sweeping generalizations, which are associated with false items. Conversely, words like usually and generally are associated with true items.
9. A statement must only have one central idea.
10. Avoid quoting exact statements from the textbooks.
Let us go over examples of alternate response test items.
Sample 1. The raison d'être for capital punishment is retribution according to some peripatetic politicians.
This sample alternate response item is poorly written for it uses words that are very unfamiliar or difficult for an average student to understand.
Improved: According to some politicians, the justification for the existence of capital punishment can be traced to the biblical statement, an eye for an eye, a tooth for a tooth.
Sample 2. From time to time efforts have been made to explain the notion that there may be a cause and effect relationship between arboreal life and primate anatomy.
Sample 2 is again faulty as it was copied exactly from the textbook.
Improved: There is a known relationship between primate anatomy and arboreal life.
Sample 3. Many people voted for Gloria Macapagal Arroyo in the last presidential election.
Sample 3 also violates the rules on writing alternate response items owing to its use of imprecise language. As such, it is open to numerous and ambiguous interpretations.
Improved: Gloria Macapagal Arroyo received more than 50% of the votes cast in the last presidential election.
Alternate response items allow teachers to sample a number of cognitive behaviors in a limited amount of time. Even the scoring of alternate response items tends to be simple and easy. Nonetheless, there are content and learning outcomes that cannot be adequately measured by alternate response items, like problem solving and complex learning.
responses. Sound testing practice dictates that the directions spell out the nature of the task. It is unfair and unreasonable that the student should have to read through the stimulus and response list in order to discern the basis for matching.
2. Be sure that the whole matching exercise is found on one page only. Splitting the exercise is confusing, distracting, and time consuming for the student.
3. If a matching exercise is too long, the task becomes tedious and the discrimination too fine.
4. Both the premises and responses should be in the same general category or class (e.g., inventors and inventions; authors and literary works; objects and characteristics).
5. Premises or responses composed of one or two words should be arranged alphabetically.
Analyze the following matching exercise. Does it follow the suggestions on writing a matching exercise?
Directions: Match Column A with Column B. You will be given one point for each correct match.
Column A
1. Execution of Rizal
2. Pseudonym of Ricarte
3. Hero of Tirad Pass
4. Arrival of the Spaniards in the Philippines
5. Masterpiece of Juan Luna
Column B
a. 1521
b. 1896
c. Gregorio del Pilar
d. Spoliarium
e. Vibora
The matching exercise is poorly written as the premises in Column A do not belong to the same category. Thus, answers can easily be guessed by the student. Below is the improved version of the above matching exercise. Column A
1. National Hero of the Philippines 2. Hero of Tirad Pass 3.
1.
completion item. 2. The blank should be placed near or at the end of the
correct and whether spelling will be a factor in scoring.
4. Be definite enough in the incomplete statement so that only one correct answer is possible.
5. Avoid using direct statements from the textbooks with a word or two missing.
6. All blanks for all items should be of equal length and long enough to accommodate the longest response.
Go over the following sample items:
Directions: On your answer sheet, write the expression that completes each of the following sentences.
1. __________ is money earned from the use of money.
2. The Philippines is at the _________ and ________ of ________.
Sample 1 is poorly written as a well written completion item should have its blank either near or at the end of the sentence. In like manner, Sample 2 is also poorly written as the statement is over mutilated. Following are the improved versions of these sample items.
1. Money earned from the use of money is called _________. 2. The Philippines is located in the continent of _________.
Sample 2 Directions: The following words are arranged at random. On your answer sheet, rearrange the words so that they will form a sentence. much the costs rose
Sample 3 Directions: Each group of letters below spells out a word if the letters are properly arranged. On your answer sheet, rearrange the letters in each group to form a word. ybo ebul swie atgo
2.
prices.
3.
total revenue.
Following are examples of identification items written following the guidelines in constructing this type of test item. Directions: Following are phrase definitions of terms. Opposite each number, write the term defined. 1. Weight divided by volume. 2. Degree of Hotness or coldness of a body
3. Changing speed of a moving body
2. Spaces for the writing of answers have to be provided and should be of the same length. Below are examples of enumeration items. Directions: List down or enumerate what are asked for in each of the following.
Underlying Causes of World War I and II
1. ______________________
2. ______________________
3. ______________________
4. ______________________
5. ______________________
Factors Affecting the Demand for a Product
6. ______________________
7. ______________________
8. ______________________
9. ______________________
10. _____________________
Example 2: Bonifacio is for the Philippines, while ______________ is for the United States of America. (a) Jefferson (b) Lincoln (c) Madison (d) Washington
The following guidelines have to be considered in constructing analogy items (Calmorin, 1994):
1. The pattern of relationship in the first pair of words must be the same pattern in the second pair.
2. Options must be related to the correct answer.
3. The principle of parallelism has to be observed in writing the options.
4. More than three options have to be included in each analogy item to lessen guessing.
5. All items must be grammatically consistent.
charts or even comprehension of written passages. Airasian (1994) suggested the following guidelines in writing this kind of test item:
1. The interpretative exercise must be related to the instruction provided to the students.
2. The material presented to the students should be new to them but similar to what was presented during instruction.
3. Written passages should be as brief as possible. The exercise should not be a test of general reading ability.
4. The students have to interpret, apply, analyze and comprehend in order to answer a given question in the exercise.
CHAPTER REVIEW
1. What are the basic principles of testing that teachers must consider in constructing classroom tests? Explain each briefly.
2. What are the steps or procedures teachers have to follow in writing their own tests? Explain the importance of each of them.
3. What is the table of specification (TOS)? How is it prepared?
4. What are the general guidelines in writing test items?
5. What are the specific guidelines to be observed in writing the following types of test item:
5.1 Multiple choice;
5.2 True-false;
5.3 Matching item;
5.4 Arrangement item;
5.5 Identification item;
5.6 Correction item;
5.7 Analogy;
5.8 Interpretative exercise;
5.9 Short explanation item?
criterion for answering the question. It follows, therefore, that a more restricted response essay item is, in general, preferable. An instruction such as "discuss the relative advantages and disadvantages of essay tests with respect to (1) reliability, (2) objectivity, (3) content validity, and (4) usability" presents a better-defined task, one more likely to lend itself to reliable scoring, and yet allows examinees sufficient opportunity or freedom to organize and express their ideas creatively.
State necessary assumptions; Describe the limitations of data; Explain methods and procedures; Produce, organize, and express ideas; Integrate learning in different areas; Create original forms; and Evaluate the worth of ideas.
1. What is the chemical formula for sodium bicarbonate?
2. Who wrote the novel The Last of the Mohicans?
B. Selective Recall in which a basis for evaluation or judgment is suggested
1. Who among the Greek philosophers affected your thinking as a student?
2. Which method of recycling is the most appropriate to use at home?
II. Understanding
A. Comparison of two phenomena on a single designated basis
1. Compare 19th century and present day Filipino writers
B.
1. Compare the Philippine Revolution of 1896 with the People Power Revolution of 1986.
C. Explanation of the use or exact meaning of a phrase or statement
1. The legal system of the Mesopotamians was anchored on the principle of "an eye for an eye, a tooth for a tooth." What does this principle mean?
D.
1. What is the central idea of communism as an economic system?
E. Statement of an artist's purpose in the selection or organization
1. Why did Hemingway describe in detail the episode in which Gordon, lying wounded, engages the oncoming enemy?
III. Application. It should be clearly understood that whether or not a question requires application depends on the preliminary educational experience. If an analysis has been taught explicitly, a question involving that analysis is but simple recall.
A. Causes or Effects
1. Why did Fascism prevail in Germany and Italy but not in Great Britain and France?
2. Why does frequent dependence on penicillin for the treatment of minor ailments result in its reduced
B. Analysis
1. Why was Hamlet torn by conflicting desires?
2. Why was the Propaganda Movement a successful failure?
C. Statement of Relationship
1. A researcher reported that teaching style correlates with student achievement at about 0.75. What does this correlation mean?
D. Illustrations or examples of principles
1. Identify three examples of the uses of the hammer in a typical Filipino home.
E. Application of rules or principles in specified situations
1. Would you weigh more or less on the moon? Why or why not?
F. Reorganization of facts
1. Some radical Filipino historians assert that the Filipino revolution against Spain was a revolution from the top, not from below. Using the same observation, what other conclusion is possible?
IV.
A.
1. Should members of the Communist Party of the Philippines be allowed to teach in colleges and universities? Why or why not?
2. "Nature is more influential than the environment in shaping an individual's personality." Prove or disprove this statement.
B. Discussion
1. Trace the events that led to the downfall of the dictatorial regime of Ferdinand Marcos.
C. Criticism of the adequacy, correctness, or relevance of a statement
1. Former President Joseph Estrada was convicted of plunder by the Sandiganbayan. Comment on the adequacy of the evidence used by the said tribunal in reaching its decision on the case filed against the former chief executive of the country.
D.
1. What should be the focus of research in education to explain the incidence of failure among students with high intelligence quotients?
2. What questions should parents ask their children in order to determine the reasons why they join fraternities and sororities?
Following are examples of essay questions based on Bloom's Taxonomy of Cognitive Objectives.
A. Knowledge
Explain how Egypt came to be called the "gift of the Nile."
B. Comprehension
What is meant when a person says, "I had just crossed the bridge"?
C. Application
Give at least three examples of how the law of supply operates in our economy today.
D. Analysis
Explain the causes and effects of the People Power Revolution on the political and social life of the Filipino people.
E. Synthesis
Describe the origin and significance of the celebration of Christmas the world over.
There are four sources of difficulty that are likely to be encountered by teachers in the use of essay tests (Greenberg, et al, 1996). Let us go over each of these difficulties and look into ways to minimize them.

Question Construction. The preparation of the essay item is the most important step in the development process. Language usage and word choice are particularly important during construction. The language dimension is critical not only because it controls the comprehension level of the item for the examinee, but also because it specifies the parameters of the task. As a test constructor, you need to narrowly specify, define, and clarify what it is that you want from the examinees. Examine this sample essay question: "Comment on the significance of Darwin's Origin of Species." The question is quite broad, considering that there are several ways of responding to it. While the intention of the teacher who wrote this item was to provide an opportunity for the students to display their mastery of the material, students could write for an hour and still not discover what their teacher really wants from them on the topic. An improved version of the same question follows: "Do you agree with Darwin's concept of natural selection resulting in the survival of the fittest and the elimination of the unfit? Why or why not?"

Reader Reliability. A number of studies have been conducted over the years on the reliability of grading free response test items. The results of these studies failed to demonstrate consistently satisfactory agreement among essay raters (Payne, 2003). Some of the specific factors contributing to the lack of reader reliability include the following: quality of composition and penmanship; item readability; racial or ethnic prejudice in essay scoring; and the subjectivity of human judgment.

Instrument Reliability. Even if an acceptable level of scoring reliability is attained, there is no guarantee that measurement of the desired behaviors will be consistent. There remains the issue of the sampling of objectives or behaviors represented by the test. One way to increase the reliability of an essay test is to increase the number of questions and restrict the length of the answers. The more specific and narrowly defined the questions, the less likely they are to be ambiguous to the examinee. This procedure should result in more uniform understanding and performance of the assigned task and in more uniform scoring. It also helps ensure better coverage of the domain of objectives.

Instrument Validity. The number of test questions influences both the validity and reliability of essay questions. As commonly constructed, an essay test contains a small number of items; thus, the sampling of desired behaviors represented in the table of specification will be limited, and the test will suffer from lowered content validity.
There is another sense in which the validity of an essay test may be questioned. Theoretically, the essay test allows the examinees to construct a creative, organized, unique and integrated communication. Nonetheless, examinees very frequently spend most of their time simply recalling and organizing information, rather than integrating it. The behavior elicited by the test, then, is not that hoped for by the teacher or dictated by the table of specifications. Again, one way of handling the problem is by increasing the number of items on the test.
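The point made under Instrument Reliability, that adding questions tends to raise reliability, can be illustrated with the Spearman-Brown formula. This formula is not cited in the chapter; it is brought in here only as a rough sketch, and it assumes that the added questions are comparable in quality and content to the existing ones. The function name and the sample figures are hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predict test reliability after lengthening the test.

    reliability   -- reliability of the current test (between 0 and 1)
    length_factor -- new number of items divided by the current number
                     (2 means the test is doubled in length)

    Assumes the added items are parallel to the existing ones.
    """
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must lie between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    k = length_factor
    return (k * reliability) / (1 + (k - 1) * reliability)


# Hypothetical case: a three-question essay test with reliability 0.50
# is lengthened to six comparable questions (length factor of 2).
predicted = spearman_brown(0.50, 2)
print(round(predicted, 2))
```

Under these assumptions the predicted reliability rises as the test is lengthened, which is consistent with the advice to ask more questions and restrict the length of each answer.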
Limit the problem that the question poses so that it will have a clear or definite meaning to most students.
Use simple words which will convey clear meaning to the students.
Prepare enough questions to sample the material of the subject area broadly, within a reasonable time limit.
Use the essay question for the purposes it best serves, like organization, handling complicated ideas and writing.
Prepare questions which require considerable thought, but which can be answered in relatively few words.
Determine in advance how much weight will be accorded each of the various elements expected in a complete answer.
Without knowledge of students' names, score each question for all students.
Require all students to answer all questions on the test.
Write questions about materials immediately relevant to the subject.
Study past questions to determine how students performed.
Make gross judgments of the relative excellence of answers as a first step in grading.
Word a question as simply as possible in order to make the task clear.
Do not judge papers on the basis of external factors unless they have been clearly stipulated.
Do not make a generalized estimate of an entire paper's worth.
Do not construct a test consisting of only one question.
Before focusing on the specific methods of scoring essay tests, let us consider the following guidelines. First, it is critical that the teacher prepare in advance a detailed ideal answer. This is necessary as it will serve as the criterion by which each student's response will be judged. If this is not done, the subjectivity of the teacher could seriously prevent consistent scoring, and it is also possible that student responses might dictate what constitutes a correct answer. Second, student papers should be scored anonymously, and all answers to a given item should be scored at one time, rather than grading each student's total test separately.

As already pointed out, essay questions are the most difficult to check owing to the absence of uniformity of response on the part of the students who took the test. Moreover, there are a number of distractors in students' responses that can contribute to subjective scoring of an essay item (Hopkins et al, 1990). These distractors include the following: handwriting, style, grammar, neatness, and knowledge of the students. There are two ways of scoring an essay test: holistic and analytic (Kubiszyn & Borich, 1990).
Holistic Scoring. In this type of scoring, a total score is assigned to each essay question based on the teacher's general impression or overall assessment. Answers to an essay question are classified into any of the following categories: outstanding; very satisfactory; fair; and poor. A score value is then assigned to each of these categories. An outstanding response gets the highest score, while a poor response gets the lowest score.

Analytic Scoring. In this type of scoring, the essay is scored in terms of its components. An essay scored in this manner has separate points for organization of ideas; grammar and spelling; and supporting arguments or proofs.

As an essay test is difficult to check, there is a need for teachers to ensure objectivity in scoring students' responses (Hopkins et al, 1990). To minimize subjectivity in scoring an essay test, the following guidelines have to be considered by the teacher (Airasian, 1994):
Decide what factors constitute a good answer before administering an essay question.
Explain these factors in the test item.
Read all answers to a single essay question before reading other questions.
Reread each essay answer a second time after the initial scoring.
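The two scoring methods described above can be sketched in a few lines of code. This is only an illustration: the category labels and component names come from the text, but the score values and point weights are assumptions made for the example, and the function names are hypothetical.

```python
# Holistic scoring: one overall category per answer.
# The numeric values are assumed; the text only says that outstanding
# gets the highest score and poor gets the lowest.
HOLISTIC_SCALE = {
    "outstanding": 4,
    "very satisfactory": 3,
    "fair": 2,
    "poor": 1,
}

# Analytic scoring: separate points per component.
# The maximum points per component are assumed for illustration.
ANALYTIC_RUBRIC = {
    "organization of ideas": 10,
    "grammar and spelling": 5,
    "supporting arguments": 10,
}


def holistic_score(category):
    """Return the score value assigned to an overall impression."""
    return HOLISTIC_SCALE[category]


def analytic_score(ratings):
    """Add up the points awarded for each component of the essay."""
    total = 0
    for component, max_points in ANALYTIC_RUBRIC.items():
        points = ratings[component]
        if not 0 <= points <= max_points:
            raise ValueError(f"{component}: {points} is outside 0..{max_points}")
        total += points
    return total


# A hypothetical answer rated both ways.
print(holistic_score("very satisfactory"))
print(analytic_score({
    "organization of ideas": 8,
    "grammar and spelling": 4,
    "supporting arguments": 7,
}))
```

Fixing the rubric before any paper is read corresponds to the first guideline above: the factors that constitute a good answer are decided in advance rather than inferred from the students' responses.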
While it is true that test formats and content coverage are important ingredients in constructing paper and pencil tests, the conditions under which students shall take the test are equally essential. This chapter is focused on how tests should be administered and scored.
Short answer items should be placed before essay items. Specify directions that students have to follow in responding to each set of grouped items.
6. Avoid cramming items too close to each other. Leave enough space for the students to write their answers.
7. Avoid splitting multiple choice or matching items across two different pages.
8.
Provide a quiet and comfortable setting. This is essential as interruptions can affect students' concentration and their performance in the test.
Anticipate questions that students may ask. This is also necessary as students' questions can interrupt test taking. In order to avoid questions, teachers have to proofread their test questions before administering them to the class.
Set the proper atmosphere for testing. This means that students have to know in advance that they will be given a test. Such information can lead them to prepare for the test and can reduce test anxiety.
Discourage cheating. Students cheat for a variety of reasons. Some of these are pressures from parents and teachers, as well as intense competition in the classroom. To prevent and discourage cheating, Airasian (1994) recommends two sets of strategies: strategies before testing and strategies during testing.
Define to the students what is meant by cheating.
Explain the disciplinary action to be imposed on students caught cheating.
Require students to remove unnecessary materials from their desks.
Have students sit in alternating seats.
Go around the testing room and observe students during testing.
Prohibit the borrowing of materials like pens and erasers.
Prepare alternate forms of the test.
Implement established cheating rules.
Help students keep track of time.
Scoring Test
After the administration of a test, the teacher needs to check the students' test papers in order to summarize their performance on the test. The difficulty of checking a test differs with the kind of test items used. Selection items are the easiest to score, followed by short answer response and completion items. The most difficult to score, however, is the essay item.

Scoring Objective Tests. The following guidelines have to be considered by a teacher in scoring an objective test:
A key to correction has to be prepared in advance for use in scoring the test papers.
Apply the same rules to all students in checking students' responses to the test questions.
Score each part of the test to have a clear picture of how students fared in order to determine areas they failed to master.
Sum up the scores for grading purposes.
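The guidelines for scoring objective tests can be sketched as a short routine: a key prepared in advance, one rule applied to every paper, a score for each part of the test, and a total. The key, the item numbers, and the function name below are hypothetical and serve only to illustrate the procedure.

```python
# Key to correction, prepared in advance and grouped by test part.
# Item numbers and answers are invented for this example.
ANSWER_KEY = {
    "multiple choice": {1: "b", 2: "d", 3: "a"},
    "true-false": {4: "T", 5: "F"},
}


def score_objective_test(responses, key=ANSWER_KEY):
    """Score one student's paper against the key.

    responses -- dict mapping item number to the student's answer
    Returns (scores per part, total score).
    """
    part_scores = {}
    for part, items in key.items():
        correct = 0
        for item, answer in items.items():
            # Same rule for every student: ignore case and stray spaces;
            # an unanswered item simply earns no point.
            given = responses.get(item, "")
            if given.strip().lower() == answer.lower():
                correct += 1
        part_scores[part] = correct
    return part_scores, sum(part_scores.values())


# A hypothetical student's paper.
parts, total = score_objective_test({1: "b", 2: "c", 3: "A", 4: "t", 5: "F"})
print(parts)
print(total)
```

Keeping the scores per part, and not only the total, is what lets the teacher see which areas the students failed to master.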