
Update:

Performance on FCAT 2.0


Since the release of this year's third-grade FCAT 2.0 Reading and Mathematics scores yesterday, there has been much confusion about what the scores mean relative to achievement in previous years. Considering that over the past two years Florida has (at different times) changed both the test itself (FCAT to FCAT 2.0) and the scale and passing benchmarks the test is measured on, confusion and frustration are understandable when the stakes involved are so high.

What Happened

Much of the confusion following yesterday's release stems from the difference between the proficiency levels reported last year and the proficiency levels retroactively applied to last year's data using the new scale, which the DOE is now using to compare year-to-year performance in this year's release.

As a quick recap, up until 2011 students took the FCAT, which had its own measurement scale, with established proficiency levels and passing scores appropriate to that test. In 2011, the test itself changed to the new FCAT 2.0, a more rigorous exam aligned with the new Next Generation Sunshine State Standards (implemented in anticipation of getting students ready to meet the expectations of the Common Core Standards coming in 2014). Still with me?

However, when the new FCAT 2.0 was administered last year, it did not yet have its own measurement scale. So, in order to maintain year-to-year reporting consistency, the measurement scale from the old test (FCAT) was applied to the results of the new test (FCAT 2.0). As a result, reported performance on the new test last year looked very similar to performance on the old test from previous years. Earlier this year, the state finished establishing a new measurement scale tailored to the demands of the FCAT 2.0, to be used moving forward. That scale is what is being used now, and what the state was able to go back and retroactively apply to last year's data for comparison's sake.
With the new scale and standards applied to last year's data, the results show little to no change in performance levels between the two years. While this picture is more accurate, it is no less confusing to those who remember being told that 72% of students were proficient statewide last year and are now being told that only 57% actually were. In Table 1 below, we see this year's performance levels compared to what was reported last year (the "previously reported" columns) and last year's actual performance as reported this year (the "retroactively rescaled" columns).

Table 1: Third-grade students scoring Level 3 and above, 2011 and 2012

| Subject | Area | 2012 % Level 3 and Above | 2011 Level 3 and Above (PREVIOUSLY REPORTED) | Change between 2011 (PREVIOUSLY REPORTED) and 2012 Performance | 2011 Level 3 and Above (RETROACTIVELY RESCALED) | Change between 2011 (RETROACTIVELY RESCALED) and 2012 Performance |
|---|---|---|---|---|---|---|
| Reading | Duval County | 51% | 69% | -18% | 55% | -4% |
| Reading | State | 56% | 72% | -16% | 57% | -1% |
| Math | Duval County | 56% | 79% | -23% | 56% | 0% |
| Math | State | 58% | 78% | -20% | 56% | +2% |

To be clear, the confusion occurring this year is largely the result of what the DOE did and reported last year: applying the prior FCAT scale to the new FCAT 2.0 test results to try to maintain some kind of year-to-year consistency. Last year's results were artificially scaled to match student performance levels from the previous year rather than measured against an independent scale, and thus were not a true reflection of performance on the FCAT 2.0 at the time. While there is much anger about the scale being changed this year, changing it was the correct and necessary thing to do; it just should have been done last year.

How the Scale Changed

Aside from being more appropriately aligned to the content of the new test, the new measurement scale is an improvement over the old scale in a few key ways. In Figure 1 below, we see the achievement level cutoff scores for grades 3-10 on the old scale for the previous FCAT.

Figure 1: Previous FCAT Developmental Scale Score Cut Scores

[Chart: grade levels 3 to 10 on the horizontal axis; developmental scale scores from 86 to 2586 on the vertical axis]

The lowest line represents the cut scores between Level 1 and Level 2 achievement across grade levels, the line above it represents the cut scores between Level 2 and Level 3, and so on up. The diamond line represents the cutoff for Level 3 and above. At face value, the cutoffs are not very consistent across grade levels and grow closer together in the higher grades, making it more difficult to score in Levels 2, 3 or 4 at higher grade levels.

Compare that now with the achievement level cutoff scores on the new FCAT 2.0 measurement scale in Figure 2 below. In the new scale, expectations for each level are much more consistent between the lower and upper grade levels.

Figure 2: New FCAT 2.0 Scale Score Cut Scores
[Chart: grade levels 3 to 10 on the horizontal axis; scale scores from 150 to 290 on the vertical axis]

This simple comparison is useful as a basic orientation to how the previous (FCAT) and new (FCAT 2.0) achievement level standards operate relative to one another, but it is insufficient for drawing final conclusions about how the two sets of standards compare, for a number of reasons, including that they reflect growth on two different scales. To account for these scale differences, another way to compare the two sets of standards is to look at the percentage of total possible points at each grade level that students must earn to meet each cutoff score under the two systems.

For example, the previous FCAT developmental scale ranged from 86 to 3008 points. A 3rd grader could score between 86 and 2514 on that scale (a 2,428 point range) and needed to score at least 1198 to reach the Level 3 cutoff. Given the possible range of points a 3rd grader could score in, a score of 1198 would mean earning 46% of the total possible points a 3rd grader could earn (or, in effect, demonstrating mastery of 46% of the tested skills for 3rd grade Reading).

In comparison, the new FCAT 2.0 scale ranges from 140 to 302 points. A 3rd grader can score between 140 and 260 on that scale (a 120 point range) and needs to score at least 198 to reach the Level 3 cutoff. Given the possible range of points a 3rd grader could score in, a score of 198 would mean demonstrating mastery of 48% of the tested skills for 3rd grade Reading.

Figure 3: Previous FCAT Developmental Scale Cut Scores as a % of Total Possible Points
[Chart: grade levels 3 to 10 on the horizontal axis; percent of total possible points from 20% to 80% on the vertical axis]
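
The percent-of-possible-points calculation in the worked example above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic described in the text: the function name and its parameters are illustrative, and the inputs are the grade 3 Reading figures quoted above.

```python
def percent_of_possible_points(cut_score, lowest, highest):
    """Share of a grade's score range that must be earned to reach cut_score.

    lowest and highest are the minimum and maximum scores a student in that
    grade can receive; the names are illustrative, not official terms.
    """
    if highest <= lowest:
        raise ValueError("highest must be greater than lowest")
    return (cut_score - lowest) / (highest - lowest)


# Grade 3 Reading, Level 3 cutoff, using the figures quoted in the text.
old_fcat = percent_of_possible_points(1198, 86, 2514)  # previous FCAT scale
new_fcat = percent_of_possible_points(198, 140, 260)   # FCAT 2.0 scale

print(f"Previous FCAT: {old_fcat:.0%}")  # 1112 of 2428 points, about 46%
print(f"FCAT 2.0:      {new_fcat:.0%}")  # 58 of 120 points, about 48%
```

Subtracting the lowest possible score before dividing is what makes the two scales comparable, since neither scale starts at zero.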

As we see in Figure 3, the percent of total possible points that students needed to score for each level fluctuated significantly from grade to grade under the previous FCAT standards. For example, in order to be considered a Level 3 reader under the previous system, a student would need to demonstrate mastery of 46% of the material in 3rd grade, then 50% of the material in 4th grade, back to 46% of the material in 5th grade, and then steadily increasing amounts each year, up to almost 60% of the material in 10th grade.

Figure 4: Proposed New FCAT 2.0 Scale Cut Scores as a % of Total Possible Points
[Chart: grade levels 3 to 10 on the horizontal axis; percent of total possible points from 20% to 80% on the vertical axis]

If we compare that to the new FCAT 2.0 scales in Figure 4, we see that the new scale is much more consistent in terms of what is expected of students across grade levels:
- A Level 1 reader is a student who demonstrates mastery of less than about 35% of grade-level material every year.
- A Level 2 reader is one who demonstrates mastery of roughly 35% to 50% of grade-level material every year.
- A Level 3 reader is one who demonstrates mastery of roughly 50% to 60% of grade-level material every year.
- A Level 4 reader is one who demonstrates mastery of roughly 60% to 73% of grade-level material every year.
- A Level 5 reader is one who demonstrates mastery of more than about 73% of grade-level material every year.
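
The approximate bands above can be expressed as a simple lookup, sketched below in Python. The thresholds are the rounded figures from the list, not official cut scores, so an actual cutoff may sit a few points away from them (the grade 3 Reading Level 3 cutoff, for instance, works out to about 48% rather than 50%). The function name and the handling of scores that fall exactly on a boundary are assumptions made for illustration.

```python
# Approximate lower bounds (share of grade-level material mastered) for
# Levels 5 down to 2, taken from the rounded figures in the list above.
# Assumption: a score exactly on a boundary is placed in the higher level;
# the source does not say how boundary scores are treated.
LEVEL_LOWER_BOUNDS = [(0.73, 5), (0.60, 4), (0.50, 3), (0.35, 2)]


def approximate_level(mastery):
    """Map a mastery share between 0 and 1 to an approximate level (1 to 5)."""
    if not 0.0 <= mastery <= 1.0:
        raise ValueError("mastery must be between 0 and 1")
    for lower_bound, level in LEVEL_LOWER_BOUNDS:
        if mastery >= lower_bound:
            return level
    return 1  # below roughly 35%


for share in (0.20, 0.40, 0.55, 0.65, 0.80):
    print(f"{share:.0%} of material -> Level {approximate_level(share)}")
```

Because the same thresholds apply at every grade, one lookup covers grades 3 through 10, which is the consistency the new scale is meant to provide.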

This consistency is important for a number of reasons. First, it establishes a system that holds students, teachers and faculty to the same standards of mastery across all grade levels. Second, and perhaps just as importantly, it makes those standards meaningful by creating a common language across all grade levels about what it means to perform at each achievement level, so that teachers and administrators at each successive grade level have a concrete understanding of what an incoming student's previous performance indicates about that student's needs. This common language is also important from a research and reporting perspective, because it allows more direct comparisons across grade levels, such as 3rd grade reading performance vs. 10th grade reading performance across the district.
