
UNIVERSITI TEKNOLOGI MALAYSIA PENGENALAN PENGUKURAN DAN PENILAIAN (MPF 1213)

ASSESSMENT PROCEDURE AND GRADING & REPORTING REPORT

NOR SAHIDAH BINTI MOHAMAD ALI 880522-01-5516 JB MP121197

DR. HAMIMAH BTE ABU NAIM FAKULTI PENDIDIKAN

SEMESTER 1 SESI 2012/2013 21 DECEMBER 2012

ASSESSMENT PROCEDURE

What is Assessment?

To many teachers (and students), assessment simply means giving students tests and assigning them grades. This conception of assessment is not only limited, but also limiting (see section below on Assessment versus Grading). It fails to take into account both the utility of assessment and its importance in the teaching/learning process.

In the most general sense, assessment is the process of making a judgment or measurement of worth of an entity (e.g., person, process, or program). Educational assessment involves gathering and evaluating data evolving from planned learning activities or programs. This form of assessment is often referred to as evaluation (see section below on Assessment versus Evaluation). Learner assessment represents a particular type of educational assessment normally conducted by teachers and designed to serve several related purposes (Brissenden and Slater, n.d.). These purposes include:

- motivating and directing learning
- providing feedback to students on their performance
- providing feedback on instruction and/or the curriculum
- ensuring standards of progression are met

Learner assessment is best conceived as a form of two-way communication in which feedback on the educational process or product is provided to its key stakeholders (McAlpine, 2002). Specifically, learner assessment involves communication to teachers (feedback on teaching), students (feedback on learning), curriculum designers (feedback on curriculum), and administrators (feedback on use of resources).

For teachers and curriculum/course designers, carefully constructed learner assessment techniques can help determine whether or not the stated goals are being achieved. According to Brissenden and Slater (n.d.), classroom assessment can help teachers answer the following specific questions:

- To what extent are my students achieving the stated goals?
- How should I allocate class time for the current topic?
- Can I teach this topic in a more efficient or effective way?
- What parts of this course/unit are my students finding most valuable?
- How will I change this course/unit the next time I teach it?
- Which grades do I assign my students?

For students, learner assessment answers a different set of questions (Brissenden and Slater, n.d.):

- Do I know what my instructor thinks is most important?
- Am I mastering the course content?
- How can I improve the way I study in this course?
- What grade am I earning in this course?

Why Assessment is Important

First and foremost, assessment is important because it drives student learning (Brissenden and Slater, n.d.). Whether we like it or not, most students tend to focus their energies on the best or most expeditious way to pass their tests. Based on this knowledge, we can use our assessment strategies to shape the kinds of learning that take place. For example, assessment strategies that focus predominantly on recall of knowledge will likely promote superficial learning. On the other hand, if we choose assessment strategies that demand critical thinking or creative problem-solving, we are likely to realize a higher level of student performance or achievement. In addition, good assessment can help students become more effective self-directed learners (Angelo and Cross, 1993).

As indicated above, motivating and directing learning is only one purpose of assessment. Well-designed assessment strategies also play a critical role in educational decision-making and are a vital component of ongoing quality improvement processes at the lesson, course and/or curriculum level.

Types and Approaches to Assessment

Numerous terms are used to describe different types and approaches to learner assessment. Although somewhat arbitrary, it is useful to think of these various terms as representing dichotomous poles (McAlpine, 2002).

Formative   <--------------------------------->  Summative
Informal    <--------------------------------->  Formal
Continuous  <--------------------------------->  Final
Process     <--------------------------------->  Product
Divergent   <--------------------------------->  Convergent

Formative vs. Summative Assessment

Formative assessment is designed to assist the learning process by providing feedback to the learner, which can be used to identify strengths and weaknesses and hence improve future performance. Formative assessment is most appropriate where the results are to be used internally by those involved in the learning process (students, teachers, curriculum developers).

Summative assessment is used primarily to make decisions for grading or to determine readiness for progression. Typically, summative assessment occurs at the end of an educational activity and is designed to judge the learner's overall performance. In addition to providing the basis for grade assignment, summative assessment is used to communicate students' abilities to external stakeholders, e.g., administrators and employers.

Informal vs. Formal Assessment

With informal assessment, the judgments are integrated with other tasks, e.g., lecturer feedback on the answer to a question or preceptor feedback provided while performing a bedside procedure. Informal assessment is most often used to provide formative feedback. As such, it tends to be less threatening and thus less stressful to the student. However, informal feedback is prone to high subjectivity or bias.

Formal assessment occurs when students are aware that the task they are doing is for assessment purposes, e.g., a written examination or OSCE. Most formal assessments are also summative in nature and thus tend to have greater motivational impact and to be associated with increased stress. Given their role in decision-making, formal assessments should be held to higher standards of reliability and validity than informal assessments.

Continuous vs. Final Assessment

Continuous assessment occurs throughout a learning experience (intermittent is probably a more realistic term). Continuous assessment is most appropriate when student and/or instructor knowledge of progress or achievement is needed to determine the subsequent progression or sequence of activities. Continuous assessment provides both students and teachers with the information needed to improve teaching and learning in process. Obviously, continuous assessment involves increased effort for both teacher and student.

Final (or terminal) assessment is that which takes place only at the end of a learning activity. It is most appropriate when learning can only be assessed as a complete whole rather than as constituent parts. Typically, final assessment is used for summative decision-making. Obviously, due to its timing, final assessment cannot be used for formative purposes.

Process vs. Product Assessment

Process assessment focuses on the steps or procedures underlying a particular ability or task, e.g., the cognitive steps in performing a mathematical operation or the procedure involved in analyzing a blood sample. Because it provides more detailed information, process assessment is most useful when a student is learning a new skill and for providing formative feedback to assist in improving performance.

Product assessment focuses on evaluating the result or outcome of a process. Using the above examples, we would focus on the answer to the math computation or the accuracy of the blood test results. Product assessment is most appropriate for documenting proficiency or competency in a given skill, i.e., for summative purposes. In general, product assessments are easier to create than process assessments, requiring only a specification of the attributes of the final product.

Divergent vs. Convergent Assessment

Divergent assessments are those for which a range of answers or solutions might be considered correct. Examples include essay tests and solutions to the typical types of indeterminate problems posed in PBL. Divergent assessments tend to be more authentic and most appropriate in evaluating higher cognitive skills. However, these types of assessment are often time consuming to evaluate and the resulting judgments often exhibit poor reliability.

A convergent assessment has only one correct response (per item). Objective test items are the best example and demonstrate the value of this approach in assessing knowledge. Obviously, convergent assessments are easier to evaluate or score than divergent assessments. Unfortunately, this ease of use often leads to their widespread application even when contrary to good assessment practices. Specifically, the familiarity and ease with which convergent assessment tools can be applied leads to two common evaluation fallacies: the Fallacy of False Quantification (the tendency to focus on what's easiest to measure) and the Law of the Instrument Fallacy (molding the evaluation problem to fit the tool).

Assessment versus Evaluation

Depending on the authority or dictionary consulted, assessment and evaluation may be treated as synonyms or as distinctly different concepts. As noted above, if a distinction exists, it probably involves what is being measured and why and how the measurements are made. In terms of what, it is often said that we assess students and we evaluate instruction. This distinction derives from the use of evaluation research methods to make judgments about the worth of educational activities. Moreover, it emphasizes an individual focus of assessment, i.e., using information to help identify a learner's needs and document his or her progress toward meeting goals.

In terms of why and how the measurements are made, the following table (Apple & Krumsieg, 1998) compares and contrasts assessment and evaluation on several important dimensions, some of which were previously defined.

Dimension                                          Assessment              Evaluation
Timing                                             Formative               Summative
Focus of Measurement                               Process-Oriented        Product-Oriented
Relationship Between Administrator and Recipient   Reflective              Prescriptive
Findings and Uses                                  Diagnostic              Judgmental
Modifiability of Criteria, Measures                Flexible                Fixed
Standards of Measurement                           Absolute (Individual)   Comparative
Relation Between Objects of A/E                    Cooperative             Competitive

From: Apple, D.K. & Krumsieg, K. (1998). Process education teaching institute handbook. Pacific Crest.

The bottom line? Given the different meanings ascribed to these terms by some educators, it is probably best that whenever you use these terms, you make your definitions clear.

Assessment versus Grading

Based on the above discussion, grading could be considered a component of assessment, i.e., a formal, summative, final and product-oriented judgment of the overall quality or worth of a student's performance or achievement in a particular educational activity, e.g., a course. Generally, grading also employs a comparative standard of measurement and sets up a competitive relationship between those receiving the grades. Most proponents of assessment, however, would argue that grading and assessment are two different things, or at least opposite poles on the evaluation spectrum. For them, assessment measures student growth and progress on an individual basis, emphasizing informal, formative, process-oriented reflective feedback and communication between student and teacher. Ultimately, which conception you support probably depends more on your teaching philosophy than anything else.

Observation Techniques

There are several observation techniques that are used within schools to record student performance or behavior. This lesson will describe six such observation techniques, ways to report the information, and the role that the paraeducator can play in the observation and recording of students' performance and behavior.

These techniques include:


- Frequency
- Rate
- Duration
- Interval Recording
- Time Sampling
- Anecdotal Records

All of these techniques rely on precisely identifying the behaviors in observable and measurable performance terms (as discussed in the previous lesson) to make the results meaningful and reliable.

The Paraeducators' Role in Observations

As long as the planning for the observation has been done by a teacher, anyone who is able to make accurate observations can perform the actual observation of the behavior. This can include paraeducators who have training in the observation technique and knowledge of the behavior being observed.

The Observations

When developing an observation period, the teacher will take the following considerations into account. A paraeducator should be aware of these considerations in order to make consistent and accurate measurements during the observation.

Defining the Behavior

The target behavior will need to be defined in a way that is observable and measurable to anyone who may be observing that student. Without such a definition, the teacher and the paraeducator could observe the same student at the same time and note different behaviors. Clearly identifying the specific behaviors being observed makes communicating and interpreting the results of the observation more accurate.

The teacher should be the one to identify and define the behavior. However, the paraeducator needs to have a clear understanding of the specific behavior.

Where the Observation is to Take Place

Certain behaviors occur in specific locations throughout the day. It is up to the teacher to determine where behaviors are occurring so that the place of observation will coincide with the behavior. If a student is kicking other students on the playground, then observing them in the classroom will not provide an accurate observation. However, if a student is talking out in class, the classroom would be an appropriate location. The teacher needs to establish the location in order for the observer to collect accurate information.

When the Observation is to Take Place

The target behavior will also determine the time of the observation. The teacher should schedule the observation during a time in which the behavior is likely to occur and for a length of time that will allow opportunity for the behavior to occur.

What Observation Technique is to be Used

In determining the observation technique to use, the teacher will take into consideration the specific behavior and the information that they want to gather from the observation. A paraeducator will need to have an understanding of these techniques and practice them before they can use them in an observation.

Observation Techniques

Frequency

Frequency counts are a record of the number of times a specific behavior occurs within a specific time period. Frequency counts are useful for recording behaviors which have a clear beginning and ending, are of relatively short duration, and tend to occur a number of times during the specified time period. In order to perform a frequency count, the following are required:

- a specific time period,
- a specific behavior, and
- a method for tallying the number of events.

A tally sheet is usually used to identify the behavior being observed and to record the frequency, or the number of times the behavior occurs. Below is an example of a tally sheet and how the frequency of a behavior might be recorded.

Sample of Frequency Record Form

Student: Billy Smith
Behavior: Leaving seat during science class

Date      Time Start / Stop      Tally of Observations   Total Count
2/14/97   10:50 am / 11:50 am    xxxxx xxxxx xxxxx       15

Some examples of a frequency count are the number of math problems completed on a math worksheet within 15 minutes, the number of times a preschooler intentionally communicates in an hour, the number of times a student raises their hand during a 10 minute class discussion, and the number of times a student leaves their seat during science class. A frequency count would NOT be used for behaviors that occur at a high rate, such as tapping a pencil on a desk, or for behaviors that continue for an extended period of time, such as when a student sucks their thumb.
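The tally on the sample frequency form can be totalled mechanically. The following is a minimal sketch in Python, assuming each observed event is written as one "x" mark as on the sample form; the function name `total_count` is hypothetical and not part of the original lesson.

```python
def total_count(tally: str) -> int:
    """Count the tally marks ('x') recorded during one observation period."""
    return tally.count("x")


# Tally copied from the sample form: Billy Smith leaving his seat
# during science class, 10:50 am to 11:50 am.
billy_tally = "xxxxx xxxxx xxxxx"
billy_total = total_count(billy_tally)
print(billy_total)
```

Grouping the marks in fives, as on the form, only helps the observer read the sheet; the spaces are ignored when counting.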

Rate

Rate is very similar to frequency. Recording rates of behavior involves gathering information on both the frequency of the observed behavior and the length of the observation time. Rate is the ratio of the number of times a behavior occurs within a specific time period to the length of that time period. The ratio is computed by dividing the number of events by the number of minutes, hours, or days over which the observation occurred.

The frequency, or number of times a student leaves their seat during math class, may be reported as a rate if the length of the class or the length of the observation period is known. The rate of a behavior can also be averaged across a number of observation periods to report an average rate. From a series of observations it may be determined that a student's average rate of "out of seat" behavior is twelve times per hour. Rate also applies to academic work: for example, if a word list contains 20 words and the student requires five minutes to write the list, the rate would be four words per minute. An example follows of how one might record "out of seat" behavior as a rate.

Sample of Rate Record Form

Student: Billy Smith
Behavior: Leaving seat during science class

Date      Time Start / Stop      Tally of Observations   Total Count
2/14/97   10:50 am / 11:50 am    xxxxx xxxxx xxxxx       15

Rate (Count/Length of time) = 15/1 hour = 15 times per hour
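The rate calculation is a single division. As a rough illustration, the sketch below reproduces the two examples from the text; the function name `rate` is an assumption made for this example, and the caller is responsible for choosing the time unit (minutes, hours, or days).

```python
def rate(count: int, length: float) -> float:
    """Return events per unit of time: count divided by observation length."""
    if length <= 0:
        raise ValueError("observation length must be positive")
    return count / length


# 15 "out of seat" events in a 1-hour science class.
out_of_seat_per_hour = rate(15, 1)

# 20 words written in 5 minutes.
words_per_minute = rate(20, 5)

print(out_of_seat_per_hour, words_per_minute)
```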

Duration

Recording the duration of a behavior is done by recording the starting and ending time of a behavior and computing the length of time that the behavior occurs. This technique is usually used to observe behaviors which occur less frequently and continue for a period of time.
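The subtraction of starting time from ending time can be sketched as follows. This is only an illustration: the episode times are hypothetical, and the sketch assumes times are written in 24-hour "HH:MM" form and that an episode does not run past midnight.

```python
from datetime import datetime


def duration_minutes(start: str, end: str) -> float:
    """Return the length of one episode in minutes, given 'HH:MM' times."""
    fmt = "%H:%M"
    elapsed = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return elapsed.total_seconds() / 60


# Hypothetical start and end times of two episodes recorded on one day.
episodes = [("09:05", "09:12"), ("10:30", "10:34")]
durations = [duration_minutes(start, end) for start, end in episodes]
print(durations)
```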

An example of duration recording could be for a student who has crying episodes in class. Every time the student cries in class, you would record the beginning and ending times, and then calculate the duration of the crying episode. A few other examples of when duration recording could be used include how long it takes a student to finish a math assignment, the length of time a student takes cleaning up, or how long a student spends continuously tapping their pencil on the desk.

Sample of Duration Record Form

Tally Sheet for Duration of Behavior
Student Name:
Date of Observation:
Observed Behavior:

Starting Time      Ending Time      Duration

Interval Recording

Interval recording is a technique that measures whether or not a behavior occurs within a specific time interval. The total observation time is divided into smaller intervals, and the observer records whether or not the behavior occurs within each interval. The observer marks an interval only once if the behavior occurred at any time within it. By using the interval recording technique, the teacher can get an estimate of both the frequency and duration of the behavior. Interval recording requires the observer's undivided attention, since the observation is continuous for a set period of time.

An example of interval recording could be for a child who throws their toys during free-time. If the free-time lasts for 15 minutes, then that time could be broken into 1 minute intervals. If the child throws a toy in the first minute, then the interval is marked. If they do not throw a toy in the next minute, then that interval is not marked. However, if the child throws three different toys in the third minute, that interval is still marked only once.

Interval Recording Sheet

Interval Recording
Student Name:
Date of Observation:
Observed Behavior:
Starting Time:
Ending Time:
Total Observation Time:

Other examples of when interval recording may be used include a student who talks to other students around them during work time, the amount of socializing that a student does at recess, or whether a student is attending to a book during personal reading time. Interval recording will work for any behavior that can be observed. However, it places a strong time demand upon the observer, which may make this technique inappropriate or undesirable to use.
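The marking rule for interval recording (one mark per interval, however many times the behavior occurs in it) can be sketched in a few lines of Python. The throw times below are hypothetical values chosen to match the toy-throwing example, and `interval_record` is an illustrative name rather than an established tool.

```python
def interval_record(event_minutes, total_minutes, interval_minutes=1):
    """Return one True/False mark per interval.

    event_minutes: times of each event, in minutes from the start.
    An interval is marked True if at least one event falls inside it.
    """
    n_intervals = int(total_minutes // interval_minutes)
    marks = [False] * n_intervals
    for t in event_minutes:
        index = int(t // interval_minutes)
        if 0 <= index < n_intervals:
            marks[index] = True  # marked once, even for repeated events
    return marks


# Hypothetical throws: one in the first minute, none in the second,
# three in the third minute of a 15-minute free-time period.
throws = [0.4, 2.1, 2.5, 2.9]
marks = interval_record(throws, total_minutes=15)
print(marks[:3], sum(marks))
```

Because three throws in the third minute produce a single mark, the sheet gives only an estimate of frequency, which is consistent with the description above.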

Time Sampling

Time sampling is similar to interval recording in that the observation time is divided into intervals. In time sampling, however, the behavior is recorded only if it is occurring at the end of the interval. When the specified amount of time has expired, the observer looks at the student and determines whether or not the behavior is occurring. In general, this technique is used for behaviors which are longer in duration. For example, if the behavior is identified as "being out of seat", the observation time might be 15 minutes with intervals of 1 minute. The paraeducator would mark at one minute intervals whether the student being observed was out of his or her seat.

Sample of Time Sampling Record Form

Since with time sampling the observation is done intermittently, the observer, such as the teacher or paraeducator, is able to observe a behavior without having to set aside time to observe continually. Thus time sampling is a practical way of getting an estimate of the overall occurrence of a behavior. Some other examples of behaviors that time sampling can be used with include a student reading a book, nail biting, participation in a game during recess, or working on math assignments.

Time sampling would generally NOT be used with behaviors of short duration such as hitting, kicking or spitting. If the behavior does not last long enough, then it may not be observed at the specified intervals.

The observer may use a timer or a tape recorder with beeps to determine when to record whether the behavior is occurring. In a variation of this technique, tapes with random beeps are sometimes used to record observations at random times during the observation period. With this variation the observer and the student do not know ahead of time when the recording will occur.
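To make the difference from interval recording concrete, the sketch below checks the behavior only at the instant each interval ends. The episode times are hypothetical, and the representation of a behavior as (start, end) pairs in minutes is an assumption made for this example.

```python
def time_sample(episodes, total_minutes, interval_minutes=1):
    """Return one True/False mark per check point.

    episodes: (start, end) pairs, in minutes from the start of the
    observation, during which the behavior was occurring.
    The behavior is marked only if it is occurring at the moment an
    interval ends.
    """
    check_points = range(interval_minutes, total_minutes + 1, interval_minutes)
    return [
        any(start <= t < end for start, end in episodes)
        for t in check_points
    ]


# Hypothetical "out of seat" episodes during a 15-minute observation.
out_of_seat = [(0.5, 3.2), (7.0, 10.5)]
marks = time_sample(out_of_seat, total_minutes=15)

# A brief behavior (a few seconds of hitting) falls between check points.
brief = time_sample([(4.2, 4.3)], total_minutes=15)
print(sum(marks), sum(brief))
```

The second call illustrates why the technique is generally not used for short behaviors: the episode is over before the next check point arrives, so nothing is recorded.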

Anecdotal Records

Anecdotal records are written notes describing events or incidents that occur. These notes usually become part of a student's file. Anecdotal records may be used to document:

- a significant event which occurs unexpectedly or infrequently;
- the settings or conditions in which a behavior occurred;
- the antecedents (what happens before) and the consequences (what happens after) of a problem behavior; or
- a conversation with parents.

If a paraeducator is working with the student at the time of the incident, they may be asked to assist in completing the anecdotal record.

Effective Anecdotal Records

The purpose of the anecdotal record is to document the event as clearly and accurately as possible. The following guidelines should be observed when writing the record:

1. Record the observation at the time the behavior is observed rather than at a later time.
2. Use a standardized anecdotal record form to help ensure that all relevant information is included.
3. Record what is actually observed rather than your feelings about the incident.
4. Use performance terms to describe behavior.
5. Be careful about including information about other students (by name) in the record.
6. Be aware that parents and other professionals will have access to the record.

What should be included in an anecdotal record?

Anecdotal records are usually recorded on preprinted forms to ensure that all relevant information is included. An anecdotal record usually includes the following:

1. Name of the observer
2. Date of the incident
3. Time when the incident occurred
4. Name of the student involved
5. A description of the incident
6. Location/setting where the incident occurred
7. Notes/Recommendations/Actions taken (be careful here)
8. Signature

Sample Anecdotal Record Form

Reporting Information

The following are not specific techniques for observing behaviors. However, they do allow the observer to interpret the information that is gathered during the observation. By calculating the percentage and average, a large amount of information about the behavior's occurrence can be summarized briefly.

Percentage

Percentage is the ratio of the number of times an event occurs to the number of possibilities for that event to occur, times 100. For example, if we are interested in determining the percent of math problems a student does correctly while completing a math worksheet, and the student gets fifteen of twenty items on the sheet correct, then the percentage would be the ratio of the number correct (15) to the number possible (20) times 100, or 75 percent. You may be familiar with using percentage in recording academic work, but percentages are also used with observing behaviors. Following are some of the observation techniques presented in this lesson, and how a percentage can be calculated with the information gathered in the observation.

Time Sampling Reported as Percentage

Time sampling is a technique which relies on observing behavior at specific intervals during a predetermined time period. A specific time period such as ten minutes might be divided into 10 equal intervals of one minute. At the end of each one minute interval the paraeducator would record whether a specific identified behavior was occurring. At the end of the ten minute period, the number of intervals at which the behavior was occurring divided by the total number of intervals, times 100, will give the percentage of time that the behavior was occurring. Using the same "being out of seat" behavior, the paraeducator would mark on a recording sheet at each one minute interval whether the student being observed was in his/her seat or out of his/her seat. If the student was out of their seat at six intervals during the ten observations, then it would be determined that the student was "out of seat" 60 percent of the time.
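Both percentage examples above follow the same formula. A minimal sketch, assuming a hypothetical helper named `percentage`, is:

```python
def percentage(occurrences: int, opportunities: int) -> float:
    """Return occurrences as a percentage of the possible opportunities."""
    if opportunities <= 0:
        raise ValueError("number of opportunities must be positive")
    return 100 * occurrences / opportunities


# 15 of 20 worksheet items correct.
worksheet_percent = percentage(15, 20)

# Out of seat at 6 of 10 one-minute check points.
out_of_seat_percent = percentage(6, 10)

print(worksheet_percent, out_of_seat_percent)
```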

Percentage may also be determined when observing behaviors of longer duration. If we observe a student for ten minutes and record whether the behavior is occurring at each minute, we can compute the percentage of observations (out of a possible ten) at which the behavior occurs. This is discussed further in Time Sampling. Percentage might be a more effective method for reporting the extent of behaviors which are of a longer duration, such as writing, thumb-sucking, or crying.

Duration Reported as Percentage

If the observation using a duration technique is done during a specific period of time, the percentage of time that the behavior occurs may also be computed. All occurrences and the length of time each lasted are recorded. For example, if the behavior being observed was "being out of seat", the paraeducator could use a stop watch to measure the number of minutes and seconds during a 30 minute period in which the student was out of his/her seat. If the number of minutes and seconds is divided by 30 minutes and multiplied by 100, the percentage of time that the student was out of his/her seat can be determined.

Again, recording the percentage requires that the observer divide the number of times that the student meets the criteria by the number of possible attempts or opportunities. The result is then multiplied by 100.

Average

Averaging Frequency/Rate

The frequency/rate of behaviors can be averaged across a number of observation periods to determine the average. For example, if one observes a student who calls out without raising their hand during a class for a week, one can calculate an average frequency. If one tallies 17 times on Monday, 5 times on Tuesday, 8 times on Wednesday, 9 times on Thursday, and 11 times on Friday, then the average frequency is calculated as follows:

Average Frequency:
17 + 5 + 8 + 9 + 11 = 50 times total
50 times / 5 observations = an average of 10 times per observation
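The two calculations in this part of the lesson can be sketched together. The weekly tallies are taken from the example above; the stopwatch total of 7 minutes 30 seconds is a hypothetical figure, since the lesson does not give one, and both function names are illustrative.

```python
from statistics import mean


def percent_of_time(behavior_seconds: float, period_minutes: float) -> float:
    """Return the share of the observation period taken up by the behavior."""
    return 100 * behavior_seconds / (period_minutes * 60)


# Hypothetical stopwatch total: 7 min 30 s out of seat in a 30-minute period.
out_of_seat_share = percent_of_time(7 * 60 + 30, 30)

# Tallies of calling out, Monday to Friday, from the example in the text.
daily_counts = [17, 5, 8, 9, 11]
average_frequency = mean(daily_counts)

print(out_of_seat_share, average_frequency)
```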

The following form can also be used for recording and computing the average rate of behavior over a number of observation periods.

Average Rate Calculation Sheet

Observations of Behavior   Total Count   Length   Rate (Count/Length)
1
2
3
4
5

Average Rate = Total Count / Total Length
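Read this way, the sheet divides the total count by the total length rather than averaging the individual rates, and the two can differ when observation periods are of unequal length. The sketch below illustrates this with hypothetical observations; it assumes the reading of the form given above.

```python
def average_rate(observations):
    """Return total count divided by total length.

    observations: (count, length_in_hours) pairs, one per observation.
    """
    total_count = sum(count for count, _ in observations)
    total_length = sum(length for _, length in observations)
    return total_count / total_length


# Hypothetical observations: (count, length in hours).
observations = [(15, 1.0), (3, 0.5), (12, 1.0)]

overall = average_rate(observations)
mean_of_rates = sum(c / l for c, l in observations) / len(observations)

print(overall, mean_of_rates)
```

Here the overall rate is 30 events over 2.5 hours, while the simple mean of the three individual rates gives more weight to the short second observation.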

Averaging Duration

The duration of behaviors can be averaged across a number of observation periods to determine the average. For example, if we observe a student who sucks his or her thumb during school for a day, we can calculate the average duration for the time they are observed. If the student sucks their thumb for 10 minutes, 7 minutes, 4 minutes, and 3 minutes, then one calculates the average duration of thumb sucking as follows:

Average Duration:
10 + 7 + 4 + 3 = 24 minutes total
24 minutes / 4 individually observed incidents = an average of 6 minutes

One can summarize that the student sucks their thumb for an average of six minutes at a time.

Summary

Although the techniques and strategies for recording behavior are not difficult, carefully developed procedures and practice are essential in gathering accurate data. The following guidelines may be helpful:

1. Describe as precisely as possible the behavior you are recording before you begin to record it. Discuss examples of the behavior to make sure that you have the same understanding of the behavior as the teacher.
2. Prepare the recording technique ahead of time. Make sure you are familiar with the form and the method for recording.
3. Carefully observe the time limits and time intervals used in recording.
4. Try to prepare so that you need to make the fewest judgments while recording. Record the behavior every time it occurs, regardless of how much it occurs. For example, if you are recording how often a student touches other students, you should record all touches whether they are gentle or hard. If you can't tell whether a behavior fits the criteria, you and the teacher need to further refine the criteria so that they match the intent of the observation and are observable and measurable.

ANECDOTAL RECORDS

1.1) Definition: Factual descriptions of the meaningful incidents and events that the teacher has observed in the pupil's life.
- Each incident should be written down shortly after it happens.
- The descriptions may be recorded on separate cards like

1.2) The Uses of Anecdotal Records: Obtaining data pertinent to a variety of learning outcomes and to many aspects of personal and social development.

1.3) Advantages of Anecdotal Records:
a) They depict actual behavior in natural situations.
b) Records of actual behavior provide a check on other evaluation methods and enable us to determine the extent of change in the pupil's typical patterns of behavior.
c) They enable gathering evidence on events that are exceptional but significant.
d) They make us more diligent in observation and increase our awareness of such behaviors.

1.4) Limitations of Anecdotal Records:
a) Keeping them is a time consuming task.
b) It is difficult to be objective when observing and reporting pupil behavior.

1.5) Effective Use of Anecdotal Records:
a) Determine in advance what to observe, but be alert for unusual behavior.
b) Observe and record enough of the situation to make the behavior meaningful.
c) Make a record of the incident as soon after the observation as possible.
d) Limit each anecdote to a brief description of a single incident.
e) Keep the factual description of the incident and your interpretation of it separate.
f) Record both positive and negative behavioral incidents.
g) Collect a number of anecdotes on a pupil before drawing inferences concerning typical behavior.
h) Obtain practice in writing anecdotal records.

RATING SCALES

2.1) Definition: A set of characteristics or qualities to be judged and some type of scale for indicating the degree to which each attribute is present.

2.2) The Uses:
a) It directs observation towards specific aspects of behavior.
b) It provides a common frame of reference for comparing all pupils on the same set of characteristics.
c) It provides a convenient method for recording the observer's judgments.

2.3) Types of Rating Scales:
a) Numerical rating scale: the simplest type of rating scale, in which the rater checks or circles a number to indicate the degree to which a characteristic is present.
b) Descriptive graphic rating scale: uses descriptive phrases to identify the points on a graphic scale.

2.4) The Uses of Rating Scales:
- Rating scales can be used to evaluate a wide variety of learning outcomes and aspects of development. These uses can be classified into three areas:
a) Procedure evaluation
b) Product evaluation
c) Evaluating personal-social development

2.5) Common Errors in Rating:
- Certain types of errors occur so often in rating that special efforts are needed to counteract them. These errors are due to:
a) Personal bias
b) Halo effect
c) Logical error

2.6) Principles of Effective Rating:
a) Characteristics should be educationally significant.
b) Characteristics should be directly observable.
c) Characteristics and points on the scale should be clearly defined.
d) Between three and seven rating positions should be provided and raters should be permitted to mark at intermediate points.
e) Raters should be instructed to omit ratings when they feel unqualified to judge.
f) Ratings from several observers should be combined whenever possible.

CHECKLIST

3.1) Def: A checklist is similar in appearance to a rating scale but, unlike a rating scale, calls for a simple yes-no judgement.

3.2) Uses of Checklist:
a) A method of recording whether a characteristic is present or absent, or whether an action was or was not taken.
b) Useful at the primary level such as
c) Useful in evaluating those performance skills that can be divided into a series of specific actions.

PEER APPRAISAL An approach to the problem of studying interpersonal relationships and the socio-emotional climate of a classroom. Plays an important role in revealing and evaluating the social structure of the group through the measurement of the frequency of acceptance or non-acceptance among the individuals who constitute the group.

Criteria Of Peer Appraisal
Peer ratings (sociometric tests) may be devised for many types of groups and situations. The main consideration is that each one must be relevant to a specific life situation of the group. Each item or question must require each person in the group to make one or more definite selections revealing a certain personal preference, rejection or value. The technique allows analysis of each person's position and status within the group with respect to a particular criterion.

How Does Peer Appraisal Work?
1. Guess who is the best liked boy in the class? Who is the most generous boy? Who is the most selfish boy? etc.
2. Select one of your colleagues you would most like as a friend or partner in a particular activity.
3. Name the pupil in your class with whom you would most like to sit at lunch; name a second choice; name the two persons in order of preference, etc.
4. Identification of persons possessing certain specified traits, such as the opposites "talkative - silent" or "neat - unkempt".
5. Identification of dominant individuals, cliques and cleavages (sex, racial, economic, etc.).
6. Patterns of social attraction and rejection.
7. Opinion test through "word pictures".

Peer Appraisal Technique

"Guess Who?" Technique
In the "Guess Who?" technique, respondents are asked to write a name against each question, such as:
1. Guess who is the best liked boy in the class.
2. Guess who starts the most arguments.
3. Guess who is the most cooperative boy in the class, etc.
4. There is a boy who is a) tall and witty, b) interested in cricket, c) most regular in class. Guess who this boy is.

The simplest way to analyse the results is to count the number of times each student's name appears in the blanks. Such findings may be used to help individual students and to understand the pattern of existing interpersonal relationships in the group.
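The counting step for the "Guess Who?" technique can be sketched in a few lines of Python. This is only an illustration: the pupil names, the question labels and the `tally_guess_who` helper are hypothetical, and blank answers are assumed to be skipped.

```python
from collections import Counter, defaultdict

def tally_guess_who(sheets):
    """Count how often each name is written against each question.

    `sheets` is a list of answer sheets; each sheet maps a question
    label to the name the respondent wrote (empty string if left blank).
    """
    tallies = defaultdict(Counter)
    for sheet in sheets:
        for question, name in sheet.items():
            if name:  # assumption: blank answers are ignored
                tallies[question][name] += 1
    return tallies

# Hypothetical answer sheets from four respondents.
sheets = [
    {"best liked": "Chen", "starts most arguments": "Farid"},
    {"best liked": "Chen", "starts most arguments": "Bala"},
    {"best liked": "Ali", "starts most arguments": "Farid"},
    {"best liked": "Chen", "starts most arguments": ""},
]

tallies = tally_guess_who(sheets)
for question, counts in tallies.items():
    print(question, counts.most_common())
```

The per-question counts can then be read alongside the teacher's own observations rather than treated as a verdict on any pupil.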

SOCIOGRAM Peer ratings about students they would most like as friends or partners in particular activities may be identified through a sociogram. The obtained results form a set of choices and these are plotted as a diagram (sociogram) showing the pattern of choices.

In the above figure, which depicts a sociogram showing choices of work partners, student 7 is an isolate, chosen by none of the students. Student 10 is especially popular in the group, receiving first-choice nominations from four other students. Students 2, 4 and 8 (and also 3, 5, 9 and 10) form close-knit cliques. The student liked by most students is known as a 'star'; here no. 10 is a star, as he has the most first choices. Sociometry also helps us study the reciprocity of relationships among the members of the group. Between nos. 3 and 10 the choice is reciprocal, while between nos. 1 and 10 it is one-sided only. Many other such interesting relations can be seen in the patterns of choices. Usually, an individual's sociometric score is simply the number of mentions he receives, or the percentage of mentions he receives, from others in the group.
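The quantities read off a sociogram (sociometric score, isolates, stars and reciprocal choices) can also be computed directly from the raw choices. The sketch below uses hypothetical data, not the data behind the figure, and the function name `analyse_choices` is an assumption made for illustration. The percentage score is taken here as mentions received divided by the number of other pupils.

```python
from collections import Counter

def analyse_choices(choices):
    """`choices` maps each pupil to an ordered list of chosen peers (first choice first)."""
    pupils = list(choices)

    # Sociometric score: number of mentions each pupil receives.
    mentions = Counter({p: 0 for p in pupils})
    for picks in choices.values():
        mentions.update(picks)

    # Same score as a percentage of the other pupils in the group.
    others = len(pupils) - 1
    percent = {p: 100 * mentions[p] / others for p in pupils}

    # Isolates are chosen by nobody.
    isolates = [p for p in pupils if mentions[p] == 0]

    # Stars receive the largest number of first choices.
    first = Counter(picks[0] for picks in choices.values() if picks)
    top = max(first.values())
    stars = [p for p in pupils if first[p] == top]

    # Reciprocal pairs: a chose b and b chose a.
    mutual = sorted({tuple(sorted((a, b)))
                     for a, picks in choices.items()
                     for b in picks
                     if a in choices.get(b, [])})

    return {"mentions": mentions, "percent": percent,
            "isolates": isolates, "stars": stars, "mutual": mutual}

# Hypothetical class of six pupils, two choices each.
choices = {
    "Ali": ["Chen", "Devi"],
    "Bala": ["Chen", "Ali"],
    "Chen": ["Ali", "Bala"],
    "Devi": ["Chen", "Ali"],
    "Eva": ["Chen", "Devi"],
    "Farid": ["Eva", "Bala"],
}

result = analyse_choices(choices)
print(result["isolates"], result["stars"], result["mutual"])
```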

PRINCIPLES IN OBTAINING PEER RATINGS
The variable to be rated or ranked must be simple and directly understandable to students.
Ratings should pertain to the students' world and be asked for in very simple language.
Raters must be assured of anonymity, that is, that no one in the class will see their ratings. This will help elicit honest responses and will protect the feelings of students receiving bad ratings.
Raters should be knowledgeable, i.e., have valid evidence for what they say.

SELF REPORT METHOD
Requires the respondent to react to items concerning his own behaviour or characteristics.
The items generally require expression regarding likes, dislikes, fears, hopes, religious beliefs, ideas about sex and many other matters that reflect the way in which the person copes with his own needs and the demands of his environment.
Commonly used for measuring traits pertaining to interest, adjustment, attitude, personality, etc.
Sometimes a self-report test measures only one trait, such as introversion - extroversion, security - insecurity or high anxiety - low anxiety. Such tests can also be developed to measure a number of traits simultaneously. For example, Cattell's Sixteen Personality Factor Questionnaire yields 16 different scores.
Self-reporting is obtained through a checklist, questionnaire or a rating scale.

WELL KNOWN SELF-REPORTING INSTRUMENTS
Woodworth Personal Datasheet
The Minnesota Multiphasic Personality Inventory (MMPI)
Edwards Personal Preference Schedule
Minnesota Teacher Attitude Inventory (MTAI)

PRECAUTION WHILE USING SELF-REPORTING METHOD Administering twice to the same individuals after a short interval of time with rearranged items on second administration Introducing various 'lie' scales to check deceiving tendency.

EVALUATION THROUGH SELF-REPORT
Suggestions
Use standardized inventories.
Use more than one questionnaire/inventory.
Administer twice with a changed sequence of items.
Insert 'lie' scales.
Establish norms for the local population.

Precautions
Place only due faith in tools of this type.
Do not use a technique for which you are not well trained; e.g., use of the MMPI by teachers is not advisable.
Seek the help of trained professionals in administration and interpretation.

QUESTIONNAIRES A set of written questions which are usually highly structured. Normally assemble a number of questions which are then posed to a representative sample of the relevant population. It can either be highly structured, with fixed alternative responses which can then be collated and analysed, or more open-ended, with the respondents able to express themselves in their own words.

WAYS OF ADMINISTERING QUESTIONNAIRES Face-to-face interviewing Hand-out questionnaires Postal questionnaires Telephone questionnaires

STRENGTHS
1. Surveys are able to study large samples of people fairly easily.
2. Surveys are able to examine a large number of variables.
3. Survey research can ask people to reveal behaviour and feelings which have been experienced in real situations.
4. If samples of people are selected at random and are large enough, it should be possible to generalise the results to a larger population.
5. Questionnaire surveys can be carried out relatively cheaply.

WEAKNESSES
1. People may not respond truthfully, either because they cannot remember or because they wish to present themselves in a socially acceptable manner.
2. We cannot establish cause-and-effect relationships from survey data, as other variables which could have had an effect may not have been considered in the questionnaire or interview.
3. It may be difficult to obtain a random sample of the population, because some people who are selected refuse to answer questions, or because it may be difficult to obtain a full list of the population from which to select a random sample.

PSYCHOMETRICS Instruments which have been developed for measuring mental characteristics. Developed to measure a wide range of things, including creativity, job attitudes and skills, brain damage and, of course, 'intelligence'. It is usually used to describe specific tests for personality, aptitude, intelligence or some kind of attitude measurement. This technique, of course, provides lots of quantitative data which is easy to analyse statistically. Psychometric tests are usually easy to administer. Constructing valid and reliable tests is very difficult. Tests usually contain culture bias, especially intelligence tests. Most tests will contain designer bias, in the sense that any test is biased in the direction of the author's view.

Most tests make the assumption that characteristics to be measured are fixed and invariant, both in relation to time and also in relation to circumstance or situation. Many studies in psychology, especially social psychology, demonstrate that this is not so.

INTERVIEWS There are many different ways to conduct an interview, ranging from casual chats to formal, standardised, set questions which have to be asked in a particular way. Clinical interviews are lengthy interviews aimed at a detailed understanding of a person's mental processes. There are no set questions; the questions depend on the last answers given. Interviews conducted in a casual manner provide information that is more spontaneous and realistic than those obtained in a formal interview. If we use standardised interviews it is easier to generalise (as long as the sample is large enough). Clinical interviews provide insight into the thoughts of individual children or adults which a standardised format would not allow. LIMITATION 1. Sampling of subjects is a problem (see section on sampling for more detail). 2. Informal interviews do not allow generalisation. One person may talk about something so differently from the way that another person does that it becomes almost impossible to compare what two people said. This applies to some extent to clinical interviews. 3. In formal interviews, if people feel that they are being asked a set of routine and automatic questions from a list they often do not talk as freely as they would in a casual conversation. The interviewer needs to be thoroughly skilled and trained to make it seem a natural and not an awkward situation. This means that a formal interview study is quite difficult (and expensive) to conduct well. 4. A major problem with interviews is demand characteristics. This includes interviewer biases and response biases. An interviewer may influence the respondent through, for example, leading questions or subtle reinforcements of 'right' or 'wrong' answers. Response bias may happen when, for example, respondents give socially acceptable answers.

PROS AND CONS OF SELF-REPORT Advantage Gives you the respondents views directly

Disadvantage Validity problems: Deception (of self or others) Lack of conscious awareness Attribution biases

REFERENCE :

Angelo, T. A., & Cross, K. P. (1993). Classroom assessment techniques: A handbook for college teachers. San Francisco: Jossey-Bass.

Apple, D. K., & Krumsieg, K. (1998). Process education teaching institute handbook. Corvallis, OR: Pacific Crest Software.

Brissenden, G., & Slater, T. (n.d.). Assessment primer. In College Level One (CL-1) Team. Field-tested learning assessment guide. Available at http://www.flaguide.org.

Linn, R. L. (1995). Measurement and assessment in teaching (7th ed.). Englewood Cliffs, NJ: Merrill.

McAlpine, M. (2002). Principles of assessment. Glasgow: University of Glasgow, Robert Clark Center for Technological Education. Available at: http://www.caacentre.ac.uk/dldocs/Bluepaper1.pdf

Wiggins, G. P. (1998). Educative assessment: Designing assessments to inform and improve student performance. San Francisco: Jossey-Bass.

Wass, V., Van der Vleuten, J., & Shatzer, R.J. (2001). Assessment of clinical competence. The Lancet, 357, 945-949.

GRADING AND REPORTING

Primary Purpose of Grades
Officially - The primary purpose of grades is to communicate student achievement to students, parents, school administrators, post-secondary institutions and employers. - from Bailey, J. and McTighe, J., Reporting Achievement at the Secondary School Level: What and How?, in Thomas R. Guskey (Ed.), Communicating Student Learning: ASCD Yearbook 1996, ASCD, Alexandria, VA, 1996.
Some would argue that grades also serve to motivate student learning. We will discuss that later. For now, let's look at the various grading approaches and systems currently in vogue.

Assessment Concepts in the Grading Process
Assessment starts with the STANDARD.
o Reliability - Accuracy and Consistency
o Validity - Meaningfulness and Appropriateness
Formative Assessment - Data collected from pre-assessments, homework, practice, and learning tasks to determine future instruction. Data collected here is not put in the grade book.
Summative Assessment - Data collected to determine level of mastery. It is the data collected here that is used in the grading system.

The Combined and Translated Process
This part is not as obvious as you might think. The way you choose to combine/translate separate scores into one grade is one of the most important decisions you will make. You may literally hold the student's future in your hands based on your decisions.

Rationales for Assigning Grades
Relative to a fixed standard
Pro: focus on achievement (e.g., 90%); often mandated by state or by school district policy
Con: the standard is really an opinion
Relative to group performance
Pro: real-world orientation; always clear to determine
Con: the grade depends on others; who is the relevant group?
Relative to ability, effort, or personal improvement
Pro: focus is on the student; often used by teachers who care about their students
Con: not recommended by experts, as these make any conclusions about learning murky to others
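The difference between the first two rationales can be made concrete with a small sketch. Everything specific here is an assumption for illustration: the cutoffs (90/80/70/60), the rank quotas (20/30/30/10/10 percent), the function names and the class scores. Ties in rank are not handled.

```python
def grade_fixed(score, cutoffs=((90, "A"), (80, "B"), (70, "C"), (60, "D"))):
    """Grade relative to a fixed standard (hypothetical cutoffs)."""
    for minimum, letter in cutoffs:
        if score >= minimum:
            return letter
    return "F"

def grade_by_rank(scores, quotas=(("A", 20), ("B", 30), ("C", 30), ("D", 10), ("F", 10))):
    """Grade relative to group performance: letters are handed out by rank.

    `quotas` gives the hypothetical percentage of the group that
    receives each letter, from the top of the ranking down.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    grades, start, cumulative = {}, 0, 0
    for letter, share in quotas:
        cumulative += share
        end = round(cumulative * len(ranked) / 100)
        for name in ranked[start:end]:
            grades[name] = letter
        start = end
    return grades

# Hypothetical high-achieving class: scores run from 95 down to 86.
scores = {f"s{i}": 96 - i for i in range(1, 11)}

fixed = {name: grade_fixed(score) for name, score in scores.items()}
ranked = grade_by_rank(scores)
for name in scores:
    print(name, scores[name], fixed[name], ranked[name])
```

In this made-up class every pupil earns an A or B against the fixed standard, while the rank-based scheme gives the pupil scoring 86 an F, which illustrates the "grade depends on others" objection.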

Coding Systems: The Actual Grades
Optional coding systems:
Letter grades
Percentage grades
Checklists
Narrative reports
BUT . . . The letter grade is the most widely used coding system. It is even used in the general culture (A-list actors, A-number-1 used car, etc.). So let's focus here.

Different Grading Systems
Five-point system - Most high schools use a five-point system. Numerical values are applied to grades as follows: A = 4, B = 3, C = 2, D = 1, F = 0.
Thirteen-point system - A few high schools in the United States use a thirteen-point system. Numerical values are applied to grades as follows: A+ = 4.33, A = 4.0, A- = 3.67, B+ = 3.33, B = 3.0, B- = 2.67, C+ = 2.33, C = 2.0, C- = 1.67, D+ = 1.33, D = 1.0, D- = 0.67, F = 0.0.
Grade-rationing system - Grade-rationing is a euphemism for rank-based grading and is a popular approach among some educators. The argument for grade-rationing is that grade inflation represents a serious problem in education that can only be counteracted by the enforcement of rank-based standards.
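As a minimal sketch of how letter grades on these scales become a grade point average, the lookup below assumes the conventional thirteen-point values for the minus grades (A- = 3.67, B- = 2.67, C- = 1.67, D- = 0.67) and treats every course as carrying equal credit. The `gpa` helper is hypothetical.

```python
FIVE_POINT = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}

THIRTEEN_POINT = {
    "A+": 4.33, "A": 4.0, "A-": 3.67,
    "B+": 3.33, "B": 3.0, "B-": 2.67,
    "C+": 2.33, "C": 2.0, "C-": 1.67,
    "D+": 1.33, "D": 1.0, "D-": 0.67,
    "F": 0.0,
}

def gpa(letters, scale):
    """Unweighted mean of grade points (assumes equal-credit courses)."""
    return round(sum(scale[g] for g in letters) / len(letters), 2)

print(gpa(["A", "B", "C"], FIVE_POINT))
print(gpa(["A-", "B+", "C"], THIRTEEN_POINT))
```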

Since many large companies and corporations use rank-based evaluation measures (referred to as "rank-and-yank" or "up-or-out" approaches to evaluation), rank-based grading is said to prepare students for the real-world situation. Students learn to compete academically with peers who will later be their competitors in the job market. A vitality curve is a leadership construct that credits certain proportions of production to certain proportions of a producing population. For example, there is an often cited "20/80 rule", or "Law of the Vital Few". This law posits that the top 20% of criminals commit 80% of the crimes, the top 20% of academics produce 80% of useful results, and so on. The concept of a "vitality curve" has been used to justify the "rank-and-yank" system of performance management, whereby the bottom-ranking 10% of workers are fired at each evaluation. Rank-based performance evaluations (in education and employment) foster cutthroat and unethical behavior.

Rank-and-yank contrasts with the management philosophies of W. Edwards Deming. Deming's influence in Japan has been credited with Japan's world leadership in many industries, particularly the automotive industry. While rank-and-yank puts success or failure of the organization on the shoulders of the individual worker, Deming stresses the need to understand organizational performance as fundamentally a function of the corporate systems and processes created by management. Workers need to feel valued, supported and part of a team doing important work. He sees so-called performance evaluation, annual review of performance, and merit-based evaluation as misguided and destructive.

Weighted GPA
Some high schools, to reflect the varying skill required for different level courses and to discourage students from selecting easy 'A's, will give higher numerical grades for difficult courses, often referred to as a weighted GPA. For example, two common conversion systems used in honors and advanced placement courses are: A = 5 or 4.6, B = 4 or 3.5, C = 3 or 2.1, D = 1, F = 0. Another policy commonly used by 4.0-scale schools is to mimic the eleven-point weighted scale by adding .33 (one third of a letter grade) to the grade in an honors or advanced placement class. (For example, a B in a regular class would be a 3.0, but in an honors or AP class it would become a B+, or 3.33.)
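The "add .33 for honors or AP" policy for a weighted GPA can be sketched as follows. The helper names are hypothetical, courses are assumed to carry equal credit, and withholding the bonus from an F is an assumption the text does not state.

```python
BASE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}
HONORS_BONUS = 0.33  # one third of a letter grade

def weighted_points(letter, honors=False):
    """Grade points for one course; honors/AP courses earn the bonus.

    Assumption: a failing grade earns no bonus.
    """
    points = BASE_POINTS[letter]
    if honors and letter != "F":
        points += HONORS_BONUS
    return points

def weighted_gpa(courses):
    """`courses` is a list of (letter, is_honors) pairs of equal credit."""
    return sum(weighted_points(letter, honors) for letter, honors in courses) / len(courses)

# A B in a regular class versus a B in an honors class.
print(weighted_points("B"), weighted_points("B", honors=True))
print(weighted_gpa([("A", True), ("B", False)]))
```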

Communicating Grades and Scores to Parents/Guardians Face to Face

BEFORE THE SHOW BEGINS
Be organized. Have a folder containing the student's grades, examples of work, standardized test scores, and behavior notes.
Know this material. Know the grading system; know how to read the standardized score report; know the nature of the norm group(s) used. Know the potential incongruence among the grades, test scores and behavior evidence found in the folder, and be ready to discuss them.
Have an agenda. Example: point out strengths (grades & test scores), suggest areas for improvement (grades & test scores), comment on behavior (never begin with behavior, especially if it is a concern), solicit questions, and close with a look to the future.

SHOWTIME
Be honest. Don't sugarcoat. Don't go beyond your competence in answering a question. Say you will get back to them.
Be professional. Don't dismiss or prejudge any result as unimportant. Any result is important to the parent.
Be calm. Don't be surprised if your assessment differs from the parents'; students may behave differently at home and in the classroom.
Be geared up with specific suggestions for the parents on how they might help improve the performance of their student.
Be confidential. Do not refer to any other student's performance.
Be ready. Know who to call if you encounter an obnoxious parent.
Be upbeat. Close on a vision of a positive future.

Effect of Variability on Weights
The most variable element will have the greatest weight in determining the grade, not the element with the highest numerical value.

Regression to the mean
The composite formed by adding grades together will show less variability than the grade ranges of the subscores used to create it.

Legal Considerations
It is your responsibility to keep accurate records. Issues: hard copy and electronic grade books; security.
LEGISLATION - Family Educational Rights and Privacy Act (FERPA). Two main points:
o Parents have a right to see grading and test score information for their children.
o Schools may not reveal grades and scores to a third party without the individual's consent.
COURTS - Two main points:
o Deference is given to the educator's judgment, as long as
o Grades are assigned in an even-handed, rational manner.
SCHOOL ADMINISTRATION - a surprise, perhaps:
o Final authority for grades is the school administration. In rare circumstances an administrator may change a grade and has the legal responsibility to do so.
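To return to the effect of variability on weights noted in this section, a small numerical sketch may help. The pupils and scores are hypothetical. Converting each component to z-scores before adding is one common remedy, but it is an assumption added here for illustration and not something the source prescribes.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Rescale a list of scores to mean 0 and standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def ranking(names, totals):
    """Names ordered from highest total to lowest."""
    return [name for _, name in sorted(zip(totals, names), reverse=True)]

names = ["P", "Q", "R", "S"]
exam = [92, 90, 89, 88]   # out of 100, but scores barely vary
quiz = [4, 10, 16, 20]    # out of 20, but scores vary widely

# Simply adding the raw scores lets the quiz decide the order,
# even though the exam carries far more points.
raw_totals = [e + q for e, q in zip(exam, quiz)]
raw_order = ranking(names, raw_totals)

# Standardising first gives both components the same spread,
# so each contributes equally to the composite.
std_totals = [e + q for e, q in zip(z_scores(exam), z_scores(quiz))]
std_order = ranking(names, std_totals)

print("raw order:", raw_order)
print("standardised order:", std_order)
```

With the raw totals, the ranking follows the quiz exactly and reverses the exam ranking, which is the sense in which the most variable element carries the most weight.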

Practical Advice
1. First, have a reasonable and fair assessment plan.
2. Check for school policies on grading; if the school has a policy, study it carefully.
3. Learn to use an electronic spreadsheet or purchase a Teacher Gradebook program (some schools have a centralized system).
4. Consider creatively combining formative and summative assessment.
5. Review suggestions for parent-teacher conferences.
6. Use various sources to provide feedback to parents and to solicit their help.

Over the course of an academic career the average student will be exposed to a variety of grading systems and procedures. Although some of these systems may be qualitative in nature, such as an annual or semi-annual written narrative, the vast majority are quantitative and depend upon numerical or alphanumerical metrics. Perhaps the most familiar of these involves the letters "A" through "F," where "A" is usually given a value of 4.0 and is characterized in words as outstanding or excellent and "F" is given a value of 0.0 and is described as unsatisfactory or failing. The grades of A through F are usually derived from some more differentiated quantitative value such as a test score, in which the specific nature of the relationship between grade and test score may take a variety of different forms (e.g., an A is defined by a score of 90% or better, or by a value that falls in the top 5–10% of scores independent of absolute value, and so on). Regardless of the specific translation of test performance into letter grade, the point to keep in mind is that the A–F scale defines the most frequent grading system used in higher education over the past half century or more.

Variations in the Grading System
Like all prototypes, the A–F system admits many variations. These often take the form of plusses and minuses, thereby producing a scale having the possibility of fifteen distinct units: A+, A, A-, B+, B, and so on down to F-. In actual practice, the grade of A+ is scarcely ever used, and the same is true for D+ and D- and for F+ and F-, thereby yielding a scale of between eight and ten units. Generally speaking, the greater the number of units in the grading system, the more precisely it hopes to quantify student performance. What is interesting in this regard are fluctuations in the actual number of units used in different historical eras. Without going too deeply into the relevant historical facts, it is clear that certain historical periods, such as the 1960s, reduced the grading system to two or so units (Pass, No Credit, or P/NC), whereas other periods, such as the 1980s, expanded it to ten, eleven or twelve units.

Variations in the breadth of the grading system would seem to have significant educational implications. At a minimum, these differences may be taken to imply that scales having a large number of units indicate a relative comfort in making precise distinctions, whereas those having fewer units suggest a relative discomfort in making such distinctions. In the case of more differentiated systems, distinctions and rankings are significant, and individual achievement is emphasized; in the case of less differentiated systems, distinctions and rankings are de-emphasized and inter-student competition is minimized. To some degree, it is possible to view fluctuations in American grading systems as reflecting a more general ambivalence the society has in regard to competition and cooperation, and between individual recognition and social equity. Educational institutions sometimes emphasize strict evaluation, competition, and individual achievement, whereas at other times they emphasize less precise evaluation, cooperation, and sympathetic understanding for students of all achievement levels.

Another property of grading systems is that individual class grades often are combined to produce an overall metric called the grade point average or GPA.
Unlike its constituent values, which usually are carried to only one (or no) numerically significant places, the GPA presents a metric of 400 units, yielding the possibility that a GPA of 3.00 will locate the student in the category of "good" whereas a value of 2.99 will exclude him or her from this category. In the same way, honors, admission to graduate school, preliminary selection for interviews by a desirable company, and so forth, may be defined by a single point difference on the GPA scale (e.g., 3.50 versus 3.49 for Phi Beta Kappa, etc.).

Because GPAs are significant in categorizing student performance, a number of evaluations have been made of their reliability and validity. One issue to be addressed here concerns field of study, where it is well documented that classes in the natural sciences and business produce lower overall grades than those in the humanities or social sciences. What this means is that it is unreasonable to equate grade values across disciplines. It also suggests that the GPA is composed of unequal components and that students may be able to secure a higher GPA by a judicious selection of courses. Although other factors may be mentioned aside from academic discipline (such as SAT level of school, quality and nature of tests, etc.), the conclusion must be that the GPA is a poor measure and should not be used by itself in coming to significant decisions about the quality of student performance or differences between departments and/or educational institutions. The GPA is also a relatively poor basis on which to predict future performance, which perhaps explains why such attempts are never very impressive. In fact, a number of meta-analyses of this relationship, conducted every ten years or so since 1965, reveal that the median correlation between GPA and future performance is 0.18, a value that is neither very useful nor impressive. The strongest relationship between GPA and future achievement is usually found between undergraduate GPA and first-year performance in graduate or professional school.

Despite such difficulties in understanding the exact meanings of grades and the GPA, they remain important social metrics and sometimes yield heated discussions over issues such as grade inflation. Although grade inflation has many different meanings, it usually is defined by an increase in the absolute number of As and Bs over some period of years. The tacit assumption here seems to be that any continuing increase in the overall percentage of "good grades" or in the overall GPA implies a corresponding decline in academic standards.
Although historically there have been periods in which the number of good grades decreased (so-called grade deflation), significant social concerns usually only accompany the grade inflation pattern. This one-sided emphasis suggests that grade inflation is as much a sociopolitical issue as an educational one and depends upon the dubious equating of grades with money. What really seems of concern here is a value issue, not a cogent analogy that reveals anything significant about grades or money.

How Grades Are Produced
Grading systems represent just one aspect of an interconnecting network of educational processes, and any attempt to describe grading systems without considering other aspects of this network must necessarily be incomplete. Perhaps the most important of these processes concerns the procedures used to produce grades in the first place, namely, the classroom test. Here, of course, there are purely formal differences; for example, between multiple-choice and essay tests, or between in-class and take-home tests or papers. Also to be included is the quality of the test items themselves, not only in terms of content but also in terms of the clarity of the question and, in the case of multiple-choice tests, of the distractors. One way to capture the complexity of possible ways in which grades are produced is to consider the set of implicit choices that lie behind an instructor's use of a specific testing and/or grading procedure. Included here are such questions as:
What evaluation procedure should I use? Term papers, classroom discussions, or in-class tests?
If I choose tests, what kind(s)? Essay, true/false, fill-in-the-blank, matching, or multiple-choice?
If I choose multiple-choice, what grading model should I use? Normal curve, percent-correct, improvement over preceding tests?
If I choose percent-correct, how many tests should I give? Final only, two in-class tests and a final, one midterm and one final?
How should I weight each test if I choose the midterm-final pattern? Midterm equals final, midterm counts twice the final exam grade, final counts twice the midterm grade?
What grade report system should I use? P/F; A, B, C, D, F; or A+, A, A-, B+, and so on down to F?
An examination of this collection of possible choices suggests that instructors have a large number of options as to how to go about testing and grading their students.
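The weighting question for the midterm-final pattern can be illustrated with a short sketch. The `composite` function, the scheme labels and the example scores are hypothetical; the point is only that the same two test scores yield different composites under each weighting.

```python
def composite(midterm, final, midterm_weight, final_weight):
    """Weighted average of two percent-correct scores."""
    total_weight = midterm_weight + final_weight
    return (midterm * midterm_weight + final * final_weight) / total_weight

# The three weighting patterns, as (midterm weight, final weight).
SCHEMES = {
    "midterm equals final": (1, 1),
    "midterm counts twice the final": (2, 1),
    "final counts twice the midterm": (1, 2),
}

# Hypothetical student who improved from 60% to 90%.
for label, (mw, fw) in SCHEMES.items():
    print(label, composite(60, 90, mw, fw))
```

For a student whose scores improved over the term, the choice of weights alone moves the composite by ten percentage points, which on a typical fixed-cutoff scale would be a full letter grade.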
Any consideration of the ways in which testing and grading relate to one another must also deal with the ways in which one or both of these activities relate to learning and teaching. The relationship between learning and testing is a fairly direct (if neglected) one, especially if tests are used not only to evaluate student achievement but also to reinforce or promote learning itself. Thus it is easy to develop a classroom question or exercise that requires the student to read some material before being able to answer the question or complete the exercise. Teaching, on the other hand, would seem to be somewhat further removed from issues of testing and grading, although the specific testing and grading plan used by the instructor does inform the student as to what constitutes relevant knowledge, as well as what attitude the instructor holds toward precise evaluation and academic competition.

Students are not immune to testing and grading procedures, and educational researchers have made the distinction between students who are grade oriented and those who are learning oriented. Although this distinction is surely too one-dimensional, it does suggest that for some students the classroom is a place where they experience and enjoy learning for its own sake. For other students, however, the classroom is experienced as a crucible in which they are tested and in which the attainment of a good grade becomes more important than the learning itself. When students are asked how they became grade (or learning) oriented, they usually point to the actions of their teachers in emphasizing grades as a significant indicator of future success; alternatively, they describe instructors who are excited by promoting new learning in their classrooms. When college instructors are asked about the reason(s) for their emphasis on grades, they report that student behaviors, such as arguing over the scoring of a single question, make it necessary for them to maintain strict and well-defined grading standards in their classrooms. The ironic point is that both the student and the instructor see the "other" as emphasizing grades over learning, and neither sees this as a desirable state of affairs.

What seems missing in this context is a clear recognition by both the instructor and the student that grades are best construed as a type of communication. When grades (and tests) are thought about in this way, they can be used to improve learning.
As it now stands, however, the communicative purpose of grading is ordinarily submerged in their more ordinary use as a means of rating and sorting students for social and institutional purposes not directly tied to learning. Only when grades are integrated into a coherent teaching and learning strategy do they serve the purpose of providing useful and meaningful feedback not only to the larger culture but to the individual student as well.

REFERENCE

BAIRD, LEONARD L. 1985. "Do Tests and Grades Predict Adult Achievement?" Research in Higher Education 23:385.

CURRETON, LOUISE W. 1971. "The History of Grading Practices." Measurement in Education 2:19.

DUKE, J. D. 1983. "Disparities in Grading Practice: Some Resulting Inequities and a Proposed New Index of Academic Achievement." Psychological Reports 53:1023–1080.

GOLDMAN, ROY D.; SCHMIDT, DONALD E.; HEWITT, BARBARA N.; and FISHER, RONALD. 1974. "Grading Practices in Different Major Fields." American Education Research Journal 11:343–357.

MILTON, E. OHMER; POLLIO, HOWARD R.; and EISON, JAMES A. 1986. Making Sense of College Grades. San Francisco: Jossey-Bass.

POLLIO, HOWARD R.; and BECK, HALL P. 2000. "When the Tail Wags the Dog: Perceptions of Learning and Grade Orientation in and by Contemporary College Students and Faculty." The Journal of Higher Education 71:84–102.
