
EFFICIENCY OF ITEM-ANALYSIS TECHNIQUES IN A MULTIPLE-CHOICE EXAMINATION

A Project Presented to Miraluna L. Herrera, Ph.D.

In Partial Fulfillment of the Requirements in Statistical Seminar (Stat. 196)

Junel B. Ambajic

October 2011

Abstract
This study analyzed the items of a teacher-made test in Math 1.7 (College Algebra) at Caraga State University, Ampayon, Butuan City. A secondary analysis was also performed to explore which method is more convenient for computing the two parameters of the Multiple-Choice Question (MCQ) type of examination: the Item Difficulty Index and the Item Discrimination Index. The findings show that most items are acceptable only when the method of Amir Zaman [et.al] (2010) is used, although some items were eliminated because of their poor discrimination index values. Further analysis showed that the method of Amir Zaman [et.al] (2010) produces less error in computing the two parameters of an MCQ examination, and the findings therefore indicate that it is the more convenient method. The researcher was also able to determine why many researchers use the upper and lower 27% of the sample when analyzing the two parameters: the 27% split is more reliable, since its reliability statistic is higher (0.72) and therefore acceptable. The book of Michael J. Miller, Ph.D., entitled Reliability and Validity, states that a test is accepted and valid when the reliability statistic is greater than or equal to 0.70.

Chapter I

INTRODUCTION

1.1 Background of the Study

Today there is a variety of Multiple-Choice Question (MCQ) methods for assessing subject knowledge, and the Multiple-Choice Question is the format most commonly used these days for evaluating students. The researcher would like to determine which tool is more appropriate for analyzing the Multiple-Choice Question (MCQ). Educators today are increasingly concerned with the academic progress of their students, and for this reason item analysis is often used to evaluate students' performance as well as their capabilities. This paper therefore focuses on which tool is more efficient for analyzing the MCQ type of examination, and on analyzing the reliability at different percentages, given that the data gathered by the researcher contain items with four (4) and five (5) options. The researcher conducted this study to see how efficient the tools are and to observe why many researchers commonly use the 27% cut from the top and bottom of the group. The results of the Math 1.7 (College Algebra) final examinations are used as the example data.

1.2 Statement of the Problem


This study aims to determine and observe the more convenient tool for analyzing the following:
i. Item Discrimination Index;
ii. Item Difficulty Index; and
iii. reliability statistics at different percentages.

1.3 Significance of the Study


The results of this study serve as a basis for statistics instructors/teachers in evaluating whether their students are doing well in their class/course. This study also identifies the more convenient tool for evaluating students. Moreover, it analyzes whether the Multiple-Choice Question (MCQ) type of examination is reliable and valid for evaluating students' capabilities.

1.4 Scope and Delimitation


This study deals only with the Multiple-Choice Question (MCQ) type of examination, and focuses on finding the more convenient tool and the more efficient percentage to use in analyzing the results of Math 1.7 (College Algebra), the example data in this research. The study was undertaken during the first semester of A.Y. 2010-2011. The respondents were those enrolled in Math 1.7 (College Algebra) in the first and second semesters of A.Y. 2009-2010 at the Caraga State University Main Campus.

1.5 Basic Definitions


The following terms are defined as they are used in this study.
Index of Difficulty - A convenient measure of the easiness or difficulty of an item. The higher the percentage of students who answered the item correctly, the easier the item; the lower the percentage, the more difficult the item.
Index of Discrimination - A measure of the ability of a test item to separate the students who perform well from those who perform poorly.
Multiple-Choice Question (MCQ) - A type of examination in which the answer options are offered to the candidate, i.e., the person taking the examination.
Reliability - The degree to which an instrument (questionnaire) measures the same way each time it is used.

Validity - The extent to which a test measures what it was intended to measure.

1.6 Review of Literature


There are different methods of analyzing the Multiple-Choice Question (MCQ) type of examination, particularly for the Item Discrimination Index, the Item Difficulty Index, and the reliability statistic. The most common percentage used in analyzing the Difficulty and Discrimination Indices is 27% for both the upper and lower groups. Researchers found to have used this method include Mitra N K [et.al] (2009), Graham Barrow (2005), and Jan Patock (May 2004). They use the 27% cut to form the upper and lower groups, separating the high scorers from the low scorers, in order to compute the Item Discrimination Index. When it comes to computing the Item Difficulty Index, however, these researchers have their own methods. Mitra N K [et.al] (2009) compute the Item Difficulty Index as DI = R/T, where DI is the Difficulty Index, R is the number of correct responses, and T is the total number of responses (both correct and incorrect). An item is considered difficult when the DI is less than 30% and easy when the DI is greater than 80%. The Item Discrimination Index is computed as DI = (UG - LG)/n, where n is 27% of the entire sample size, UG (Upper Group) is the number of correct responses among the top 27% of examinees (counted from the top toward the middle), and LG (Lower Group) is the number of correct responses among the bottom 27% (counted from the bottom toward the middle). The higher the DI value, the better the item discriminates between students with higher and lower test scores. Based on Ebel's (1972) guidelines on classical test theory item analysis, items were categorized by their discrimination indices as stated in Table 1.6.1.

Discrimination Index    Remarks
0.00 - 0.19             Poor item; must be revised or eliminated
0.20 - 0.29             Item revision is necessary
0.30 - 0.39             Good item; must be retained
DI >= 0.40              Excellent item; must be retained

Table 1.6.1: Ideal percentages for the Discrimination Index (DI).

Jan Patock (May 2004) computed the Item Difficulty Index by a different method: DI = c/n, where DI is the Difficulty Index, c is the total number of correct responses, and n is the total number of respondents (the sample size, as in Mitra N K [et.al], 2009). The higher the DI value, the easier the question: when the value is less than 30% the item is difficult, and when the DI value is greater than 70% the item is easy. The formula for computing the Item Discrimination Index is DI = (a - b)/n, where DI is the Discrimination Index, a is the number of correct responses in the 27% upper group (counted from the top toward the middle), b is the number of correct responses in the 27% lower group (counted from the bottom toward the middle), and n is the size of the 27% group. Items which discriminate well are those with difficulties between 30% and 70%.

There is one more existing method of analyzing the MCQ type of examination: a general method in which the percentage can be set freely, researched by Amir Zaman [et.al] (2010) in the paper entitled "Analysis of Multiple-Choice Items and the Effect of Item Sequencing on Difficulty Level in the Test of Mathematics." The formula for the Difficulty Index (DI) is given by

DI = (A + B) / 2

where DI is the Difficulty Index, A is the percentage (%) of high achievers answering correctly, and B is the percentage (%) of low achievers answering correctly. The formula for the Discrimination Index is given by

DI = UG - LG

where DI is the Discrimination Index, UG (Upper Group) is the proportion of students in the upper group who responded correctly, and LG (Lower Group) is the proportion of students in the lower group who responded correctly. The value of DI must lie in the interval 20% - 90% for an item to be accepted.

Moreover, the reliability was analyzed at different percentages so that the researcher could determine why the published papers found mostly use 27%. The researcher also found that many researchers have their own ideal percentages for the Item Difficulty Index and Item Discrimination Index. Dawn M. Zimmaro, Ph.D. (2003), in Test Item Analysis and Decision Making, computed the ideal difficulty of the Item Difficulty Index as shown in Table 1.6.2, and an ideal Item Discrimination Index (shown in Table 1.6.3) similar to that of Mitra N K (2009); these ideal percentages can therefore be used for the remarks at the end of the computations.

Number of answer options (distracters)                 Ideal difficulty
5-response multiple-choice question                    .60
4-response multiple-choice question                    .62
3-response multiple-choice question                    .66
2-response (true or false) multiple-choice question    .75

Table 1.6.2: Ideal difficulty for the Item Difficulty Index. Values below the ideal difficulty must be rejected or revised, for whatever reasons the data may reveal.

Discrimination Index (DI)    Remarks
DI <= 0.19                   Poor item; reject
0.20 - 0.29                  Fairly good item; revise
0.30 - 0.39                  Good item; retain
DI >= 0.40                   Very good item; retain

Table 1.6.3: Ideal percentages for the Item Discrimination Index. As for the reliability coefficient (alpha): it is calculated as a measure of the amount of measurement error associated with a test score, and is interpreted using Table 1.6.4. Its range is from 0% to 100%.

The higher the value, the more reliable the overall test scores are. Typically, internal consistency reliability is measured, which indicates how well the items are correlated with one another.

High reliability indicates that the items are all measuring the same thing, or general construct.

Two ways to improve the reliability of a test are: 1) increase the number of questions in the test, or 2) use items that have high discrimination values.
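The first of these two suggestions can be quantified with the Spearman-Brown prophecy formula. This formula is not used in the present paper; the sketch below is only an illustration of why a longer test is more reliable.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability after multiplying the test length by
    length_factor (Spearman-Brown prophecy formula)."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# Doubling a test whose reliability is 0.60 is predicted to raise it to 0.75.
assert abs(spearman_brown(0.60, 2) - 0.75) < 1e-9
```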

Reliability        Interpretation
0.49 or below      This test should not contribute heavily to the course grade, and it needs revision.
0.50 - 0.59        Subject to revision; needs to be supplemented by other measures (e.g., more tests) for grading.
0.60 - 0.69        There are probably some items which could be improved.
0.70 - 0.79        Good for a classroom test; in the range of most. There are probably a few items which could be improved.
0.80 - 0.89        Very good for a classroom test.
0.90 and above     Excellent reliability; at the level of the best standardized tests.

Table 1.6.4: Standard reliability coefficient (alpha) interpretation
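The alpha values interpreted in Table 1.6.4 can be computed directly from a 0/1 item-score matrix. The sketch below is a generic implementation with invented data, not the Math 1.7 scores.

```python
def cronbach_alpha(scores):
    """Cronbach's alpha for a score matrix: one row per student,
    one 0/1 column per item."""
    k = len(scores[0])                       # number of items
    def var(xs):                             # sample variance (n - 1 denominator)
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    item_vars = [var([row[j] for row in scores]) for j in range(k)]
    total_var = var([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Invented 4-student, 3-item example.
data = [[1, 1, 1], [1, 1, 0], [0, 1, 0], [0, 0, 0]]
assert abs(cronbach_alpha(data) - 0.75) < 1e-9
```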


Chapter 2

METHODOLOGY
This chapter presents the procedure, beginning from the data set: the results of the Math 1.7 (College Algebra) final examinations at Caraga State University Main Campus.

2.1 Data Gathering


The data were gathered from the test results of students at Caraga State University who took the course Math 1.7 (College Algebra), during the final examinations of the first and second semesters, A.Y. 2010-2011.

2.2 Data Processing in MS Excel: Getting the Upper and Lower Groups at 27%, 30%, and 50%

These are the steps for recording the data of the students who took the final examination of Math 1.7 (College Algebra).
1. Record all the test answers and per-item scores of the students in MS Excel.
2. Highlight the scores from student 1 to student 647, then sort from Z to A (highest to lowest).
3. Compute 27% of N (the total sample size) to get n (the number of students in each of the Upper and Lower Groups).
4. From the sorted scores, count n students from the top toward the middle to get the Upper Group.
5. Likewise, count n students from the bottom toward the middle to get the Lower Group.
6. For 30% and 50%, do the same as for 27%. See Figure 2.2.

Figure 2.2: Steps in getting the Upper and Lower Groups.
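The steps above reduce to sorting and slicing. The sketch below uses invented scores, and the helper name is mine, not from the paper.

```python
def upper_lower_groups(scores, pct=0.27):
    """Sort scores from highest to lowest and return the top and
    bottom pct slices (the Upper and Lower Groups)."""
    n = round(len(scores) * pct)
    ordered = sorted(scores, reverse=True)
    return ordered[:n], ordered[-n:]

# Invented scores for 10 students; at 30% each group holds 3 students.
scores = [25, 12, 30, 8, 19, 22, 15, 27, 10, 17]
upper, lower = upper_lower_groups(scores, pct=0.30)
assert upper == [30, 27, 25]
assert lower == [12, 10, 8]
```

With the paper's N = 647 and pct = 0.27, each group would hold about 175 students.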

2.2.3 Importing data from Excel to SPSS


These are the steps for importing data from Excel to SPSS.
1. Highlight the original data in Excel, right-click, and choose COPY.
2. Load SPSS, click the DATA VIEW tab at the bottom left, then right-click and choose PASTE.
3. Give the data variable names:
   3.1 Click VARIABLE VIEW at the bottom left.
   3.2 Input variable names suited to the data.
   3.3 Code the data as desired for more convenient viewing.
4. The data are then ready for analysis. See Figure 2.2.3.


Figure 2.2.3: Steps in importing the data from MS-Excel to SPSS


2.3 Results and Discussions


The results shown below were computed using the different methods.
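Before the tables, the three sets of formulas can be sketched in Python. The function names are mine, not from the paper; the sample values come from item Q3 of the data (442 of 647 correct overall; upper/lower 27%-group proportions 0.83908 and 0.522989).

```python
def difficulty_mitra(correct: int, total: int) -> float:
    """Mitra N K et al. / Jan Patock: DI = R/T (or c/n), the proportion
    of all examinees answering the item correctly."""
    return correct / total

def discrimination_groups(upper_correct: int, lower_correct: int, n: int) -> float:
    """Mitra / Patock: DI = (UG - LG)/n, with n the size of one 27% group."""
    return (upper_correct - lower_correct) / n

def difficulty_zaman(upper_prop: float, lower_prop: float) -> float:
    """Amir Zaman et al.: DI = (A + B)/2, the mean of the upper- and
    lower-group proportions answering correctly."""
    return (upper_prop + lower_prop) / 2

def discrimination_zaman(upper_prop: float, lower_prop: float) -> float:
    """Amir Zaman et al.: DI = UG - LG."""
    return upper_prop - lower_prop

# Item Q3: 442 of 647 correct; group proportions 0.83908 and 0.522989.
assert abs(difficulty_mitra(442, 647) - 0.6832) < 0.0005
assert abs(difficulty_zaman(0.83908, 0.522989) - 0.6810345) < 1e-6
assert abs(discrimination_zaman(0.83908, 0.522989) - 0.316091) < 1e-6
```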

Method 1:
Item   Correct responses (R)   DI = R/T   Remarks
Q1     32                      0.0495     Difficult
Q2     381                     0.5889     Moderate
Q3     442                     0.6832     Moderate
Q4     62                      0.0958     Difficult
Q5     231                     0.357      Moderate
Q6     272                     0.4204     Moderate
Q7     99                      0.153      Difficult
Q8     293                     0.4529     Moderate
Q9     182                     0.2813     Difficult
Q10    129                     0.1994     Difficult
Q11    314                     0.4853     Moderate
Q12    403                     0.6229     Moderate
Q13    267                     0.4127     Moderate
Q14    201                     0.3107     Moderate
Q15    215                     0.3323     Moderate
Q16    219                     0.3385     Moderate
Q17    285                     0.4405     Moderate
Q18    146                     0.2257     Difficult
Q19    150                     0.2318     Difficult
Q20    217                     0.3354     Moderate
Q21    194                     0.2998     Difficult
Q22    320                     0.4946     Moderate
Q23    157                     0.2427     Difficult
Q24    159                     0.2457     Difficult
Q25    69                      0.1066     Difficult
Q26    306                     0.473      Moderate
Q27    128                     0.1978     Difficult
Q28    207                     0.3199     Moderate
Q29    272                     0.4204     Moderate
Q30    192                     0.2968     Difficult
Q31    313                     0.4838     Moderate
Q32    111                     0.1716     Difficult

(R is the number of students who responded correctly; T is the total number of respondents.)

Table 2.3.1: Item Difficulty Index (Mitra N K [et.al] method)


In Table 2.3.1, 44% of the items are Difficult (DI <= 0.30) and 56% are Moderate (0.30 < DI <= 0.80).


Item   Upper Group   Lower Group   DI = (UG - LG)/n   Remarks
Q1     0.051724      0.057471      -0.0033            Poor item/Eliminate
Q2     0.764368      0.41954       0.19818            Poor item/Eliminate
Q3     0.83908       0.522989      0.18166            Poor item/Eliminate
Q4     0.201149      0.011494      0.109              Poor item/Eliminate
Q5     0.45977       0.241379      0.12551            Poor item/Eliminate
Q6     0.609195      0.264368      0.19818            Poor item/Eliminate
Q7     0.034483      0.218391      -0.10569           Poor item/Eliminate
Q8     0.58046       0.37931       0.1156             Poor item/Eliminate
Q9     0.5           0.166667      0.19157            Poor item/Eliminate
Q10    0.091954      0.258621      -0.09579           Poor item/Eliminate
Q11    0.672414      0.270115      0.23121            Poor item/Eliminate
Q12    0.609195      0.58046       0.01651            Poor item/Eliminate
Q13    0.344828      0.402299      -0.03303           Poor item/Eliminate
Q14    0.505747      0.16092       0.19818            Poor item/Eliminate
Q15    0.356322      0.310345      0.02642            Poor item/Eliminate
Q16    0.609195      0.12069       0.28075            Accept/Retain
Q17    0.666667      0.235632      0.24772            Accept/Retain
Q18    0.344828      0.132184      0.12221            Poor item/Eliminate
Q19    0.436782      0.201149      0.13542            Poor item/Eliminate
Q20    0.632184      0.143678      0.28075            Accept/Retain
Q21    0.413793      0.195402      0.12551            Poor item/Eliminate
Q22    0.724138      0.281609      0.25433            Accept/Retain
Q23    0.41954       0.16092       0.14863            Poor item/Eliminate
Q24    0.114943      0.304598      -0.109             Poor item/Eliminate
Q25    0.137931      0.086207      0.02973            Poor item/Eliminate
Q26    0.396552      0.54023       -0.08257           Poor item/Eliminate
Q27    0.224138      0.206897      0.009909           Poor item/Eliminate
Q28    0.522989      0.212644      0.17836            Poor item/Eliminate
Q29    0.563218      0.33908       0.12881            Poor item/Eliminate
Q30    0.287356      0.275862      0.006606           Poor item/Eliminate
Q31    0.637931      0.344828      0.16845            Poor item/Eliminate
Q32    0.091954      0.189655      -0.05615           Poor item/Eliminate

Table 2.3.2: Discrimination Index (Mitra N K [et.al] method)

In Table 2.3.2, only 9% of the items are accepted/retained (DI > 0.20); the remaining 91% are poor items and eliminated (DI <= 0.20).


Method 2:
Item   Correct responses (c)   DI = c/n   Remarks
Q1     32                      0.0495     Difficult
Q2     381                     0.5889     Moderate
Q3     442                     0.6832     Moderate
Q4     62                      0.0958     Difficult
Q5     231                     0.357      Moderate
Q6     272                     0.4204     Moderate
Q7     99                      0.153      Difficult
Q8     293                     0.4529     Moderate
Q9     182                     0.2813     Difficult
Q10    129                     0.1994     Difficult
Q11    314                     0.4853     Moderate
Q12    403                     0.6229     Moderate
Q13    267                     0.4127     Moderate
Q14    201                     0.3107     Moderate
Q15    215                     0.3323     Moderate
Q16    219                     0.3385     Moderate
Q17    285                     0.4405     Moderate
Q18    146                     0.2257     Difficult
Q19    150                     0.2318     Difficult
Q20    217                     0.3354     Moderate
Q21    194                     0.2998     Difficult
Q22    320                     0.4946     Moderate
Q23    157                     0.2427     Difficult
Q24    159                     0.2457     Difficult
Q25    69                      0.1066     Difficult
Q26    306                     0.473      Moderate
Q27    128                     0.1978     Difficult
Q28    207                     0.3199     Moderate
Q29    272                     0.4204     Moderate
Q30    192                     0.2968     Difficult
Q31    313                     0.4838     Moderate
Q32    111                     0.1716     Difficult

(c is the number of students who responded correctly; n is the total number of respondents.)

Table 2.3.3: Difficulty Index (Jan Patock method)


Table 2.3.3 shows that 44% of the items are Difficult (DI <= 0.30) and the remaining 56% are Moderate (0.30 < DI <= 0.70).


Item   Upper Group   Lower Group   DI = (a - b)/n   Remarks
Q1     0.051724      0.057471      -0.00089         Poor item/Eliminated
Q2     0.764368      0.41954       0.053296         Poor item/Eliminated
Q3     0.83908       0.522989      0.048855         Poor item/Eliminated
Q4     0.201149      0.011494      0.029313         Poor item/Eliminated
Q5     0.45977       0.241379      0.033754         Poor item/Eliminated
Q6     0.609195      0.264368      0.053296         Poor item/Eliminated
Q7     0.034483      0.218391      -0.02842         Poor item/Eliminated
Q8     0.58046       0.37931       0.03109          Poor item/Eliminated
Q9     0.5           0.166667      0.05152          Poor item/Eliminated
Q10    0.091954      0.258621      -0.02576         Poor item/Eliminated
Q11    0.672414      0.270115      0.062179         Poor item/Eliminated
Q12    0.609195      0.58046       0.004441         Poor item/Eliminated
Q13    0.344828      0.402299      -0.00888         Poor item/Eliminated
Q14    0.505747      0.16092       0.053296         Poor item/Eliminated
Q15    0.356322      0.310345      0.007106         Poor item/Eliminated
Q16    0.609195      0.12069       0.075503         Poor item/Eliminated
Q17    0.666667      0.235632      0.066621         Poor item/Eliminated
Q18    0.344828      0.132184      0.032866         Poor item/Eliminated
Q19    0.436782      0.201149      0.036419         Poor item/Eliminated
Q20    0.632184      0.143678      0.075503         Poor item/Eliminated
Q21    0.413793      0.195402      0.033754         Poor item/Eliminated
Q22    0.724138      0.281609      0.068397         Poor item/Eliminated
Q23    0.41954       0.16092       0.039972         Poor item/Eliminated
Q24    0.114943      0.304598      -0.02931         Poor item/Eliminated
Q25    0.137931      0.086207      0.007994         Poor item/Eliminated
Q26    0.396552      0.54023       -0.02221         Poor item/Eliminated
Q27    0.224138      0.206897      0.002665         Poor item/Eliminated
Q28    0.522989      0.212644      0.047967         Poor item/Eliminated
Q29    0.563218      0.33908       0.034643         Poor item/Eliminated
Q30    0.287356      0.275862      0.001777         Poor item/Eliminated
Q31    0.637931      0.344828      0.045302         Poor item/Eliminated
Q32    0.091954      0.189655      -0.0151          Poor item/Eliminated

Table 2.3.4: Discrimination Index (Jan Patock method)


Table 2.3.4 shows that 100% of the items are poor and eliminated (DI < 0.30).


Method 3:
Item   Upper Group   Lower Group   DI = (UG + LG)/2   Remarks
Q1     0.051724      0.057471      0.0545975          Difficult
Q2     0.764368      0.41954       0.591954           Moderate
Q3     0.83908       0.522989      0.6810345          Moderate
Q4     0.201149      0.011494      0.1063215          Difficult
Q5     0.45977       0.241379      0.3505745          Moderate
Q6     0.609195      0.264368      0.4367815          Moderate
Q7     0.034483      0.218391      0.126437           Difficult
Q8     0.58046       0.37931       0.479885           Moderate
Q9     0.5           0.166667      0.3333335          Moderate
Q10    0.091954      0.258621      0.1752875          Difficult
Q11    0.672414      0.270115      0.4712645          Moderate
Q12    0.609195      0.58046       0.5948275          Moderate
Q13    0.344828      0.402299      0.3735635          Moderate
Q14    0.505747      0.16092       0.3333335          Moderate
Q15    0.356322      0.310345      0.3333335          Moderate
Q16    0.609195      0.12069       0.3649425          Moderate
Q17    0.666667      0.235632      0.4511495          Moderate
Q18    0.344828      0.132184      0.238506           Moderate
Q19    0.436782      0.201149      0.3189655          Moderate
Q20    0.632184      0.143678      0.387931           Moderate
Q21    0.413793      0.195402      0.3045975          Moderate
Q22    0.724138      0.281609      0.5028735          Moderate
Q23    0.41954       0.16092       0.29023            Moderate
Q24    0.114943      0.304598      0.2097705          Difficult
Q25    0.137931      0.086207      0.112069           Difficult
Q26    0.396552      0.54023       0.468391           Moderate
Q27    0.224138      0.206897      0.2155175          Moderate
Q28    0.522989      0.212644      0.3678165          Moderate
Q29    0.563218      0.33908       0.451149           Moderate
Q30    0.287356      0.275862      0.281609           Moderate
Q31    0.637931      0.344828      0.4913795          Moderate
Q32    0.091954      0.189655      0.1408045          Difficult

Table 2.3.5: Difficulty Index (Amir Zaman [et.al] method)

In Table 2.3.5, 22% of the items are Difficult (DI <= 0.20) and the remaining 78% are Moderate (0.20 < DI <= 0.90).


Item   Upper Group   Lower Group   DI = UG - LG   Remarks
Q1     0.051724      0.057471      -0.005747      Poor item/Eliminated
Q2     0.764368      0.41954       0.344828       Accept/Retain
Q3     0.83908       0.522989      0.316091       Accept/Retain
Q4     0.201149      0.011494      0.189655       Poor item/Eliminated
Q5     0.45977       0.241379      0.218391       Accept/Retain
Q6     0.609195      0.264368      0.344827       Accept/Retain
Q7     0.034483      0.218391      -0.183908      Poor item/Eliminated
Q8     0.58046       0.37931       0.20115        Poor item/Eliminated
Q9     0.5           0.166667      0.333333       Accept/Retain
Q10    0.091954      0.258621      -0.166667      Poor item/Eliminated
Q11    0.672414      0.270115      0.402299       Accept/Retain
Q12    0.609195      0.58046       0.028735       Poor item/Eliminated
Q13    0.344828      0.402299      -0.057471      Poor item/Eliminated
Q14    0.505747      0.16092       0.344827       Accept/Retain
Q15    0.356322      0.310345      0.045977       Poor item/Eliminated
Q16    0.609195      0.12069       0.488505       Accept/Retain
Q17    0.666667      0.235632      0.431035       Accept/Retain
Q18    0.344828      0.132184      0.212644       Accept/Retain
Q19    0.436782      0.201149      0.235633       Accept/Retain
Q20    0.632184      0.143678      0.488506       Accept/Retain
Q21    0.413793      0.195402      0.218391       Accept/Retain
Q22    0.724138      0.281609      0.442529       Accept/Retain
Q23    0.41954       0.16092       0.25862        Accept/Retain
Q24    0.114943      0.304598      -0.189655      Poor item/Eliminated
Q25    0.137931      0.086207      0.051724       Poor item/Eliminated
Q26    0.396552      0.54023       -0.143678      Poor item/Eliminated
Q27    0.224138      0.206897      0.017241       Poor item/Eliminated
Q28    0.522989      0.212644      0.310345       Accept/Retain
Q29    0.563218      0.33908       0.224138       Accept/Retain
Q30    0.287356      0.275862      0.011494       Poor item/Eliminated
Q31    0.637931      0.344828      0.293103       Accept/Retain
Q32    0.091954      0.189655      -0.097701      Poor item/Eliminated

Table 2.3.6: Discrimination Index (Amir Zaman [et.al] method)

In Table 2.3.6, 56% of the items are accepted/retained (DI >= 0.20) and the remaining 44% are poor and eliminated (DI < 0.20). Table 2.3.7 shows the differences between the reliability statistics.

Percentage tested   Reliability statistic (Cronbach's alpha)
20%                 0.43
27%                 0.72
30%                 0.69
50%                 0.49

Table 2.3.7: Reliability statistics at different percentages


Furthermore, the method of Amir Zaman [et.al] (2010) proved the more efficient way of computing the Item Difficulty Index and Item Discrimination Index. The researcher found this method more efficient than the other two because only the method of Amir Zaman [et.al] (2010) yielded the lower percentage of eliminated items; in other words, it commits less error. Table 2.3.7 also shows the highest reliability statistic (0.72) at 27%, which implies that using the upper and lower 27% of the sample/group is reliable and efficient. The book of Michael J. Miller, Ph.D. (2010), entitled Reliability and Validity, states that a test is accepted and valid when the reliability statistic is greater than or equal to 0.70.


REFERENCES CITED
[1] Amir Zaman, [et.al]. 2010. Analysis of Multiple Choice Items and the Effect of Items Sequencing on Difficulty Level in the Test of Mathematics. European Journal of Social Sciences, Volume 17, pg. 1 (2010).
[2] Annie W.Y. Ng and Alan H.S. Chan. 2009. Different Methods of Multiple-Choice Test: Implications and Design for Further Research. IMECS 2009, March 18-20, 2009, Hong Kong.
[3] Dawn M. Zimmaro, Ph.D. 2003. Test Item Analysis and Decision Making. Center for Teaching and Learning.
[4] David G. Hamill and Paul D. Usala. 2002. Multiple-Choice Test Item Analysis: A New Look at the Basics. U.S. Immigration and Naturalization Services.
[5] Geoffrey T. Crisp and Edward J. Palmer. 2007. Engaging Academics with a Simplified Analysis of their Multiple-Choice Question (MCQ) Assessment Results. Journal of University Teaching and Learning Practice, Vol. 4/2, 2007.
[6] Graham Barrow. 2005. The Role of Analysis in Multiple Choice Question Tests. GR Business Process Solutions.
[7] Jan Patock. May 2004. How to Interpret Your Statistical Analysis Reports. University Testing Services, Payne Hall, Room 301.
[8] Lord, F.M. 1952. The Relationship of the Reliability of Multiple-Choice Tests to the Distribution of Item Difficulties. Psychometrika, 1952, 18, 181-194.


[9] Mitra N K, [et.al]. 2009. The Levels of Difficulty and Discrimination Indices in Type A Multiple Choice Questions of Pre-clinical Semester 1 Multidisciplinary Summative Tests. IeJSME 2009: 3(1): 2-7.
[10] Seema Varma, Ph.D. 2007. Preliminary Item Statistics Using Point-Biserial Correlation and P-Values. Educational Data Systems, Inc., 15850 Concord Circle, Suite A, Morgan Hill.
[11] Steven J. Burton, [et.al]. 1991. How to Prepare Better Multiple-Choice Test Items: Guidelines for University Faculty. Brigham Young University Testing Services.
[12] http://www.washington.edu/oea/score1.htm
[13] http://www.michaeljmillerphd.com/res500_lecturenotes/reliability_and_validity.pdf

