
Sr. no

Author's Name & Year

Problem Statement/ Questions Asked

Sample

1. H. John Bernardin (Florida Atlantic University), Jeffrey S. Kane (University of North Carolina at Greensboro), Peter Villanova (Appalachian State University), and Joseph Peyrefitte (Florida Atlantic University)

Problem: To measure the stability of rater leniency.

Sample: A survey of Fortune 500 companies found that most of the sampled companies used ratings to determine the amount of merit pay awarded to employees.

2. Study by Kane

Problem: Stable rater leniency for police sergeants rating patrol officers.

Sample: Raters and ratees in this study were 328 patrol officers, 38 sergeants, and 14 lieutenants in a large police department in the southeastern United States. The retention rate for all personnel throughout the length of the study was over 94 percent.


3. Study by Kane and Bernardin

Problem: Supervisor ratings of social workers.

Sample: Respondents were 44 supervisors of 376 social workers employed in a large state agency in the southeastern United States. Six different job titles were represented among the social workers, but all the jobs belonged to the same job family. The retention rate for all relevant personnel throughout the length of the study was 71 percent.

4. Richard J. Klimoski and Manuel London

Role of the Rater in Performance Appraisal

One hundred and fifty-three registered nurses (RNs) from four Columbus, Ohio area hospitals served as subjects. No attempt was made to split the sample according to hospital or functional unit. All RNs working on the day shift in each hospital were required to fill out the questionnaires as part of a research study carried out in conjunction with the School of Nursing at the Ohio State University. In all, 202 RNs participated; however, 49 questionnaires were eliminated from the sample due to a lack of self-ratings, supervisor ratings, and peer ratings for each subject. The questionnaire of interest to the present study consisted of 19 measures of effectiveness.


Methodology

Major Findings

Limitations

Hypothesis-based technique.

Guilford (1954) presented the hypothesis that rater leniency is stable more than 40 years ago. The analysis proceeded from the same hypothesis.

1. The validity of both these leniency indexes is somewhat suspect. For example, Taylor and Hastman (1956) defined leniency as a mean rating above the scale midpoint. However, the validity of this method rests on the dubious assumption that the true mean of ratings is located at the scale's midpoint (Sharon & Bartlett, 1969). The use of skewness measures of leniency is also difficult to justify, in that a negatively skewed distribution does not necessarily imply that ratings are lenient. One can assume that the mean of most rating distributions will fall above the scale midpoint and that the distributions will be negatively skewed.

Leniency can lead to rapid depletion of the merit budget and proportionately reduce the amount of merit increase available to superior workers. Systematically lenient ratings deny organizations accurate information regarding the effectiveness of their operations, potentially jeopardizing their success in today's competitive business environment.

2. All of the above indexes of leniency can be termed absolute measures, meaning that they seek to express the degree of leniency as a difference relative to some absolute standard of zero leniency.

Two types of leniency measures are commonly used: (1) those that express leniency as a comparison between the mean rating level and a purported representation of the true mean rating, and (2) those that assess leniency through the skewness of the distribution of ratings.
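The two absolute leniency indexes described above can be sketched in code. The function names, the nine-point scale, and the sample ratings below are illustrative assumptions, not taken from any of the studies reviewed here:

```python
# Sketch of the two "absolute" leniency indexes: mean-above-midpoint
# and distribution skewness. Scale and data are invented for illustration.
from statistics import mean

def mean_above_midpoint(ratings, scale_min=1, scale_max=9):
    """Index 1 (Taylor & Hastman style): how far the mean rating sits
    above the scale midpoint. Positive values are read as leniency,
    under the (dubious) assumption that the true mean is the midpoint."""
    midpoint = (scale_min + scale_max) / 2
    return mean(ratings) - midpoint

def skewness(ratings):
    """Index 2: sample skewness of the rating distribution. Negative
    skew (long left tail, ratings piled at the top) is read as leniency,
    though negative skew does not necessarily imply lenient ratings."""
    m = mean(ratings)
    n = len(ratings)
    s2 = sum((x - m) ** 2 for x in ratings) / n  # second central moment
    s3 = sum((x - m) ** 3 for x in ratings) / n  # third central moment
    return s3 / (s2 ** 1.5)

lenient = [7, 8, 9, 8, 9, 7, 9, 8]   # ratings piled at the top of a 1-9 scale
print(mean_above_midpoint(lenient))  # positive: mean well above midpoint 5
print(skewness(lenient))             # negative: long left tail
```

Both indexes compare the observed distribution against a fixed standard (the midpoint, or zero skew), which is exactly why the text above calls their validity into question.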

Performance appraisals generated on two successive occasions, in 1981 (time 1) and 1982 (time 2), were retrieved from a database. The procedure was designed to ensure no overlap between the time 1 and time 2 samples of patrol-officer ratings within raters. The performance appraisal forms for patrol officers contained 20 items representing 11 dimensions, rated on a nine-point scale ranging from "never" (1) to "always" (9).

Although the number of ratees rated at time 1 was fixed at four, as stated above, the number of unique ratees at time 2 ranged from three to six.

The independence of the time 1 and time 2 samples within each rater that resulted from this procedure eliminated consistency in performance as an explanation for any significant correlation that might be found between ratings over the two rating points, leaving stable rater leniency as the most plausible explanation.

All social workers were rated using the same appraisal forms. One form called for ratings of six performance factors (ratee traits) on a five-point graphic rating scale; it also called for a rating of summary job effectiveness on a five-point scale ranging from "very ineffective" (1) to "very effective" (5). The other appraisal system used was a work planning and review system: individualized, written performance standards were established for each employee, and the extent to which the standards had been achieved was assessed and self-assessed at the conclusion of the appraisal period. Performance-standard attainment was evaluated on a three-point scale (failed to achieve, achieved, and exceeded), and overall performance was evaluated on a seven-point scale ranging from "not effective at all" (1) to "one of the most effective employees with whom I have ever worked" (7). A rating of 4 on this summary rating was required to render an employee eligible for merit pay.

For each measure, RNs were asked to simultaneously rate and rank their peers and themselves (head nurses had to rate and rank their subordinate RNs on the same items) on a 20-point continuum.

Strong rater-group-specific bias differentiates self-ratings, supervisor ratings, and peer ratings from each other. There is considerable halo common to supervisors and peers, while strong rater-specific biases also exist. There was, however, no strong evidence to prove the hypothesis.

Supervisors and peers overlap in the way they make their ratings. The measures of effectiveness used in this questionnaire were obtained from performance appraisal forms used by six different hospitals, including those sampled. The investigators tried to select characteristics (both behaviors and traits) that represented the range of effectiveness covered in the appraisal forms. Some criteria, however, did not show agreement.

(For fewer than 40 variables, multiple R-squares were used, and factor analysis was carried out.)

This bias is due not only to systematic halo error, which may be common to all raters, but also to rater-group-specific bias. Yet it seems, as Guion (1965, p. 473) suggested, that raters in different positions use different dimensions. For example, supervisors may be less able to distinguish items related to competence from those related to effort, whereas nurses rating themselves and their peers can make this distinction.

5 Farahman Farrokhi, Rajab Esfandiari and Mehdi Vaez Dalili

Applying the Many-Facet Rasch Model to Detect Centrality in Self-Assessment, Peer-Assessment and Teacher-Assessment

194 raters, who were subdivided into student and teacher raters. The student raters were labelled either self-assessors or peer-assessors. The teacher assessors were six Iranian teachers of English.

6 Michael M. Harris

Rater Motivation in the Performance Appraisal Context: A Theoretical Framework

Sample: the supervisor of a particular ratee.

7. Jai Ghorpade & Milton M. Chen

Creating Quality-Driven Performance Appraisal Systems

Sample: American industry.

Analytic rating scale.

The findings are promising in that they provide evidence for the viability of both self-assessors and peer-assessors for rating purposes, along with teachers. Two limitations apply: first, Rasch analysis is more computationally complex than traditional methods based on classical test theory; second, the Rasch model belongs to the more restricted measurement models and must fit the data before it can be applied.

The finding is that raters in a positive mood more readily recall positive information and exhibit greater integration of diverse information. However, empirical research indicates that decision makers who are slightly depressed engage in more thorough and careful information processing, whereas decision makers who are elated are likely to use much more superficial information processing and judgment procedures. The present paper argues that a non-motivated rater will use less thorough and deliberate information-processing techniques than a motivated rater, and therefore may have limited information about a particular ratee.

Model of Rater Motivation

Hypothesis-based technique. At least two factors make performance appraisal one of the sensitive areas of organizational activity and change. First, people have a direct stake in the appraisals that affect their pay and prospects. Second, organizations increasingly view performance appraisal as a system that can hinder or promote initiatives such as reengineering or teamwork.

7 Robert L. Holzbach

Rater Bias in Performance Ratings: Superior, Self-, and Peer Ratings

Analysis of superior, self-, and peer performance ratings of 107 managerial and 76 professional employees in a medium-sized manufacturing location, representing 95% of the managerial and professional staff.

8. Farahman Farrokhi, Rajab Esfandiari and Mehdi Vaez Dalili

Problem: How do three rater types (self-assessors, peer-assessors and teacher assessors) show variability in terms of the central-tendency effect, if any, in relation to each other?

Sample: 194 assessors (188 self- and peer-assessors and 6 teacher assessors) were employed to assess 188 essays written by Iranian English majors at universities in Iran.

9 William T Hoyt

Rater Bias in Psychological Research: When Is It a Problem and What Can We Do About It?

Collected data on a large number of targets, each of whom has been rated by a large number of observers.

10. Paul O. Kingstrom, Larry E. Mainstone

An Investigation of the Rater-Ratee Acquaintance and Rater Bias

Sample: Questionnaires; 150 male, first-level sales supervisors from 78 branch offices of a large, international manufacturer.

11 John F. Binning, Andrea J. Zaba and John C. Whattam

Explaining the Biasing Effects of Performance Cues in Terms of Cognitive Categorization

Students observed a videotape of a problem-solving group and were then asked to complete a questionnaire describing the group leader's behaviour, immediately after they had received bogus feedback on the group's performance.

Multitrait-Multimethod (MTMM) dimensional analysis. The research clarified the dimensional structure of ratings by superiors.

The procedure did not reduce the significant halo effect, nor did it improve the nonsignificant discriminant validity in the MTMM analysis.

Many-Facet Rasch Model

The results of the Facets analysis showed that the three types of assessor (self-assessor, peer-assessor and teacher assessor) did not exhibit any sign of centrality, either at the group level or at the individual level.

Multivariate analysis

Four biases were analyzed, two of which are already well known: the leniency effect and the halo effect. The other two are dyadic variation and rater covariation. Dyadic variance bias is prevalent in most of the studies, more so than the leniency effect. Rater covariance also distorts findings in some designs. Halo error acts to inflate observed effect sizes whenever observations are linked.

There is a need for empirical work on bias covariance, as the few published multivariate generalizability studies provide limited guidance for researchers using linked designs.

Means, standard deviations and intercorrelations for rating favourability, halo error, sales productivity, and acquaintance measures; hierarchical regression; multiple regression.

The study found that subordinates with whom supervisors had established relatively high task and personal acquaintance received significantly more favourable overall performance ratings.

It is not known to what extent the results can be generalized, as the study was conducted in a single organization among employees holding the same job.

7-point Likert format; mean ratings of specific behaviours; variance explained by performance cues; MANOVA; univariate ANOVA.

Observers' ratings of various phenomena related to group processes were significantly biased when raters received bogus feedback about the group's performance before the rating task.
