Learner Control and Error Correction in ICALL: Browsers, Peekers, and Adamants
Trude Heift
Simon Fraser University
ABSTRACT

This article reports the findings of a study on the impact of learner control on the error correction process within a web-based Intelligent Language Tutoring System (ILTS). During three one-hour grammar practice sessions, 33 students used an ILTS for German that provided error-specific and individualized feedback. In addition to receiving detailed error reports, students had the option of peeking at the correct answer, even before submitting a sentence (browsing). The results indicate that the majority of students (85%) sought to correct errors on their own most of the time, and that 18% of students abstained entirely from looking up answers. Furthermore, the results identify language skill as a predictor for students belonging to the group of Browsers, Frequent Peekers, Sporadic Peekers, and Adamants.

KEYWORDS

Intelligent Language Tutoring Systems, Intelligent and Individualized Feedback, Learner Control in CALL, Web-Based Language Instruction, Grammar Practice
INTRODUCTION

In early CALL programs, based on behaviorist principles, students worked within a strict framework: navigation was generally hard-wired into the program (students were often trapped in an exercise unless they provided the correct answer), and help options were limited or nonexistent. In contrast, modern CALL programs emphasize student control or, following Higgins (1987), the Pedagogue Role of the computer. In practical terms, users navigate more freely through the program, can terminate
2002 CALICO Journal
Volume 19 Number 2
295
(Labrie & Singh, 1991; Levin & Evans, 1995; Loritz, 1995; Hagen, 1994; Holland, Kaplan, & Sama, 1995; Sanders, 1991; Schwind, 1995; Wang & Garigliano, 1992; Yang & Akahori, 1997, 1999; Heift & Nicholson, 2000a). Additionally, a number of studies have focused on comparisons of CALL programs. For example, Nagata (1993, 1995, 1996) compared the effectiveness of error-specific (or metalinguistic) versus traditional feedback with students learning Japanese. In all studies, Nagata found that intelligent computer feedback based on NLP can explain the source of an error and, thus, is more effective than traditional feedback (see also Yang & Akahori, 1997, 1999; Brandl, 1995). The studies above focused on students' learning outcomes (results) and confirm Holland's conclusion that ICALL is effective and useful, in particular, for form-focused instruction. However, it would be equally instructive to examine the learning process while students work with such systems (Heift, 2001). As Chapelle and Mizuno (1989) state, when low-ability students perform poorly on a criterion measure, it remains unclear how their work with the courseware may have failed to facilitate their eventual achievement. In terms of error correction, van der Linden (1993) found that, when comparing learner strategies in programs with different levels of feedback, feedback about the type of error encouraged students to correct their work themselves. The question arises whether such a feedback strategy would apply in a learner-controlled grammar practice environment in which the student can access correct answers or even skip exercises. A study by Cobb and Stevens (1996) showed that students who rely excessively on program-supplied help do not learn as much as those who try to solve problems through their own self-generated trial-and-error feedback.
For this reason, while CALL programs should provide a degree of learner control (Steinberg, 1989), it is important that students not overuse quick routes to correct answers. The current study focuses on whether students correct themselves in the learner-controlled practice environment of an ILTS. Moreover, it examines whether language skill level influences students' error correction behavior. In the following sections, we will describe the German Tutor, the web-based ILTS for German which was used for this study. We will then describe the participants and outline the tasks and methodology used. Finally, we will summarize the results by providing examples of students' output during the practice sessions, and conclude with suggestions for further research.

AN INTELLIGENT LANGUAGE TUTORING SYSTEM FOR GERMAN

The German Tutor contains a grammar and a parser that analyzes sentences entered by students and detects grammatical and other errors.
PARTICIPANTS AND PROCEDURE

During the spring semester of 2000, the ILTS was used with 33 students from two introductory German classes. The data were collected during three one-hour class sessions. For the study described here, students worked on the Build a Sentence exercise, in which words are provided in their base forms and students are asked to construct a sentence (see Figure 1).

Figure 1
Build a Sentence Exercise
In the event of an error, students have a number of options in the exercise. They can correct the error and resubmit the sentence by clicking the Prüfen (check) button, peek at the correct answer(s) with the Lösung (answer) button, or go on to the next exercise with the Weiter (next) button. If students choose to correct the sentence, it is checked again for further errors. This iterative correction process continues until the sentence is correct. During the three one-hour sessions, students worked on six chapters with a total of 120 exercises. Each practice session covered two chapters, but not all students finished all 40 exercises of the two chapters during the given practice time. Also, not all students were present at all three practice sessions. The grammatical structures covered in the exercises were: gender and number agreement of noun phrases, subject-verb agreement, present tense of regular and irregular verbs, accusative and dative objects/prepositions, two-way prepositions, present perfect, auxiliaries, word order of finite and nonfinite verbs, modals, and separable prefix verbs.
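The check/peek/next cycle described above can be sketched as a simple loop. This is a minimal illustration only: `check` is a toy stand-in for the system's parser-based error checker, and all names are assumptions rather than the actual implementation.

```python
# Minimal sketch of the German Tutor's check/peek/next cycle.
# `check` is a toy stand-in for the system's parser; it only knows
# the single direct-object error used in the examples later on.

def check(sentence):
    """Toy checker: flags an error unless 'kein' is correctly inflected."""
    return [] if "kein Fleisch" in sentence else ["direct object"]

def practice(submissions, peek_at_end=False):
    """Walk through a student's successive submissions for one exercise,
    rechecking after each resubmission (the 'Pruefen' button)."""
    for retries, sentence in enumerate(submissions):
        if not check(sentence):                  # no errors flagged
            return {"outcome": "correct", "retries": retries}
    # Student stops: peek at the answer ('Loesung') or skip ('Weiter').
    return {"outcome": "peek" if peek_at_end else "skip",
            "retries": len(submissions)}
```

For example, a student who submits an incorrect sentence once and then corrects it ends the exercise with outcome "correct" after one retry.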
RESULTS

Table 1 provides a general summary of student interactions with the program.

Table 1
Submission Types
Submission type                    Number   % of subtotal   % of total
Peeks without input                    51                        1.15%
Correct on first submission         1,791                       40.19%
Total retries                       2,614                       58.66%
   Peeks during retries               284          10.86%
   Retries and correct submissions  2,330          89.14%
Total server requests               4,456                         100%
A total of 4,456 server requests were made during the three one-hour practice sessions, that is, an average of 135 requests per student during their total practice time. Students did not provide any input for 51 sentences (1%); they simply requested the correct answer(s) and moved on to the next exercise. Forty per cent of the submitted sentences were correct on first submission, while 59% required retries. For 11% of the retried exercises, students peeked at the correct answer at some point during the error correction process, while in the remaining 89% students corrected their mistakes and eventually submitted a correct answer. Analyzing the data with respect to learner-system interaction, we found four distinct interaction types: (a) Browsers, (b) Frequent Peekers, (c) Sporadic Peekers, and (d) Adamants (see Table 2).
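As a quick sanity check, the counts reported above are internally consistent; the sketch below recomputes the quoted shares from the raw Table 1 figures:

```python
# Cross-check of the Table 1 submission counts quoted in the text.
peeks_no_input = 51
correct_first = 1791
total_retries = 2614
peeks_during_retries = 284
retries_corrected = 2330
total_requests = 4456

# No-input peeks, first-try successes, and retried exercises together
# account for every server request.
assert peeks_no_input + correct_first + total_retries == total_requests

# A retried exercise ends either with a peek or a correct resubmission.
assert peeks_during_retries + retries_corrected == total_retries

# The shares quoted in the text, to the nearest per cent:
print(round(100 * correct_first / total_requests))        # 40
print(round(100 * total_retries / total_requests))        # 59
print(round(100 * peeks_during_retries / total_retries))  # 11
```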
Table 2 Interaction Types
Browsers (peek without any input): D21 (11), D16 (9), D33 (9), D8 (8), D18 (8), D7 (6) — 6 students (18.2%)
Frequent Peekers (peek after one or two tries): D7, D8, D12, D16, D18 — 5 students (15.1%)
Sporadic Peekers (peek once in a while but mostly correct errors): D2, D3, D4, D5, D6, D10, D11, D13, D14, D15, D17, D19, D20, D21, D22, D23, D25, D28, D29, D31, D32, D33 — 22 students (66.7%)
Adamants (peek once or never): D1, D9, D24, D26, D27, D30 — 6 students (18.2%)
Table 2 shows that 18% of the students browsed through the exercises without providing any input at some point during the three practice sessions, that is, they did not attempt to answer an exercise. The remaining three interaction types were determined by two factors: (a) the number of retries for an exercise and (b) the number of peeks. Fifteen per cent of the students were Frequent Peekers, who requested the correct answer(s) from the system more often than they corrected their errors. Sixty-seven per cent, the Sporadic Peekers, used system help options less often than they corrected themselves. Eighteen per cent, the Adamants, corrected their errors and peeked at the correct answer not more than once during total practice time. The four distinct interaction types will be discussed in the following sections.
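The grouping can be expressed as a simple rule over the per-student log counts. The sketch below is a reconstruction: the Browser cutoff is an assumption inferred from Table 2 (the six Browsers skipped between 6 and 11 exercises), not a rule stated by the study.

```python
def interaction_types(no_input_peeks, peeks, self_corrections):
    """Assign the (overlapping) interaction types from log counts.
    Browsers are flagged independently of the other three mutually
    exclusive types, since the groups overlap in Table 2."""
    labels = set()
    if no_input_peeks >= 6:          # assumed cutoff, see Table 2
        labels.add("Browser")
    if peeks <= 1:                   # peeked once or never overall
        labels.add("Adamant")
    elif peeks > self_corrections:   # peeks outnumber self-corrections
        labels.add("Frequent Peeker")
    else:
        labels.add("Sporadic Peeker")
    return labels
```

Under this rule, a student with eight no-input peeks and more peeks than self-corrections would be classified as both a Browser and a Frequent Peeker, mirroring the overlap reported below.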
Browsers
Table 2 indicates that six students (D21 [11], D16 [9], D33 [9], D8 [8], D18 [8], and D7 [6]) tended to browse through the exercises, sometimes requesting the answer without providing any input. Student D21 skipped the most exercises (11), while D7 browsed through six exercises. There are a number of possible reasons why students might have chosen this strategy during the practice sessions. First, students might have thought that they knew the answers to the exercises they skipped and chose not to type them in. Second, students may have been curious to see all possible answers for certain exercises. (If students type in an answer, they are informed whether or not their specific answer is correct, but they do not get to see other possible answers.) Third, students may have wanted to complete the two chapters of each practice session in the time allotted and decided to skip some exercises.
Table 3
Skill Levels of the Browsers
With respect to student assessment during practice, the system keeps a detailed record of student performance. When a sentence is submitted, the value for each linguistic element in the student input (e.g., direct object, gender, subject-verb agreement, etc.) is incremented or decremented depending on whether it was correct. In subsequent retries of the same exercise, only the values of the linguistic structures which are still incorrect are updated. The values correspond to one of three learner levels: beginner, intermediate, or advanced. As a result, the student is assessed over time, and the values reflect cumulative performance for each
linguistic structure of the exercises completed (see Heift & Nicholson, 2000a). Table 3 above shows two distinct profiles for Browsers: predominantly intermediate to advanced and beginner to intermediate. However, the Browsers also overlap with the three remaining groups: D7, D8, D16, and D18 were also Frequent Peekers, while D21 and D33 belonged to the group of Sporadic Peekers. These groups are discussed in the following sections.
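The assessment scheme just described can be sketched as follows. The step sizes, level boundaries, and class interface are illustrative assumptions, since the text does not give the system's actual values:

```python
# Sketch of the cumulative per-structure student model described above.
# Step size (+/-1) and level boundaries are assumptions for illustration.

class StudentModel:
    def __init__(self):
        self.scores = {}                 # linguistic structure -> value

    def update(self, results, still_wrong=None):
        """`results` maps structures to True/False. On a retry, pass the
        set of structures still incorrect; only those are updated."""
        for structure, correct in results.items():
            if still_wrong is not None and structure not in still_wrong:
                continue
            step = 1 if correct else -1
            self.scores[structure] = self.scores.get(structure, 0) + step

    def level(self, structure):
        value = self.scores.get(structure, 0)
        if value < 0:
            return "beginner"
        if value < 3:                    # assumed boundary
            return "intermediate"
        return "advanced"
```

For instance, after three correct uses of subject-verb agreement, the model would assess the learner as advanced for that structure, while a single error in gender agreement would pull that structure down to beginner.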
Frequent Peekers
Students in the group of Frequent Peekers are characterized by a very low number of resubmissions within the same exercise. They request correct answers more often than they revise and resubmit sentences. That is, they take advantage of the learner-controlled environment, using system help options more frequently than trying to correct their errors themselves. Table 4 summarizes the number of retries of the Frequent Peekers.

Table 4
Frequent Peekers
ID      Initially     Correct after   Peek after   Correct after   Peek after   Total peeks   Total exercises
        correct       1 retry         1 retry      2 retries       2 retries                  completed
D7      88 (74.6%)    10              14           2               4            18            118
D8      34 (35.1%)    25              30           0               8            38            97
D12     30 (33.3%)    21              27           3               9            36            90
D16     7 (10.4%)     19              28           0               13           41            67
D18     53 (72.6%)    7               10           0               3            13            73
Total   212 (47.6%)   82              109          5               37           146 (32.8%)   445
In contrast, while student D8 was at the advanced level twice, D12 and D16 were always at the beginner or intermediate level. The group average was 29.9% at the beginner and 64.6% at the intermediate level. It is possible that mid to high performers (D7 and D18) had more confidence in their own work than in the accuracy of a computer program. Consequently, if the system reported an error, they may have tended to look up the correct answer. Moreover, these students might have felt that they could learn more from reading the correct answer than from the iterative error correction process. As for the weaker students (D8, D12, and D16), they probably found it more frustrating to work through their errors due to the number of mistakes they made and preferred to look up the correct answer. While the Frequent Peekers habitually peeked at the answers, the Sporadic Peekers and Adamants corrected their mistakes and resubmitted their answers. In fact, they either requested the correct answer only very rarely or worked through an exercise until the bitter end. These two groups are discussed in the following sections.
Sporadic Peekers
The majority of students (66.7%) were Sporadic Peekers. These students generally corrected their errors, requesting the correct answer once in a while but significantly less often than the Frequent Peekers. Table 6 shows the error correction pattern for student D2, a typical error correction pattern for students belonging to this group.

Table 6
Error Correction Pattern for Sporadic Peeker D2

ID    Initially   1 Correct   1 Peek   2 Correct   2 Peek   3 Correct   3 Peek   4 Correct   4 Peek
      correct
D2    74          16          2        12          2        –           0        –           2
In contrast to the Frequent Peekers, Sporadic Peekers corrected their errors far more often than they peeked at the correct answer. They also repeated an exercise more often than the Frequent Peekers: up to six iterations for a single exercise in some cases. We also considered the language skill level of the Sporadic Peekers and found that the percentages for initially correct submissions ranged between 32% and 87%, with a group average of 62% (see Table 7).
The computer log further showed that the majority of these students were predominantly at an intermediate level during practice. The percentages for beginning and advanced students were nearly balanced at 13.6% and 10.7%, respectively. From a pedagogical point of view, the correction strategy employed by the Sporadic Peekers seemed very favorable. While students generally corrected their mistakes, they did not work to the point of frustration nor let the correction process turn into a guessing game. In the group discussed below, the Adamants, students tended to correct their answers to the bitter end, even after the corrections turned into what amounted to random guesses.
Adamants
The Adamants were similar to the Sporadic Peekers in that they generally preferred to correct their errors, but they were even more persistent. They were the users who requested the correct answer only once or never during all three practice sessions and made little use of the help options of the ILTS. Table 8 shows the number of total exercises completed and the number of peeks for the six Adamants.

Table 8
Adamants
Student            D1     D9     D24    D26    D27    D30
Total exercises    119    77     80     109    89     111
Peeks              1      0      0      1      0      1
The data demonstrate that the six students requested the correct answer once or not at all during the total practice time. It is, therefore, not surprising that students in this group submitted the greatest number of retries: up to 10 times in several cases. Moreover, this group accounted for all instances exceeding six retries.
Considering the number of retries, it is also not surprising that some of the corrections became random; students possibly did not remember which changes they had already made. For example, we noticed that in some instances students resubmitted an identical sentence. Consider (2a)-(2j) below, which illustrate the corrections a student applied before attaining the correct answer. The error types flagged by the system are given in parentheses:

(2a) Ich esse keinem Fleisch. (direct object)
(2b) Ich esse keinen Fleisch. (direct object)
(2c) Ich esse keinen\s Fleisch. (spelling)
(2d) Ich esse keinens Fleisch. (spelling)
(2e) Ich esse keinenes Fleisch. (spelling)
(2f) Ich esse keinenen Fleisch. (spelling)
(2g) Ich esse keinen Fleisch. (direct object)
(2h) Ich esse keine Fleisch. (direct object)
(2i) Ich esse keinem Fleisch. (direct object)
(2j) Ich esse kein Fleisch. (correct)
The sentence submissions given in (2a)-(2j) indicate that, in all instances, the errors occurred with the inflection of the negation kein. It should also be noted that sentences (2g) and (2i) are identical to the earlier submissions (2b) and (2a), respectively. We also considered the language skill level of the Adamants and found that they were mid to high performers. The data show that all six students scored above average in entering the correct answer at initial submission. For example, the score for the correct answers entered by student D30 on the first try was 85.7%. The means for the remaining five students ranged between 70% and 82.5%, with a group average of 75.6%. With respect to the students' language skill level during practice, Table 9 shows that students were at the intermediate and advanced levels across most grammatical constructs (92.8%). In a few instances (7.2%), students were assessed at the beginning level (see Table 9).
Table 9
Skill Levels of the Adamants across Grammatical Constructs

Skill level     D1    D9    D24   D26   D27   D30   Total
Beginner        0     9     3     6     0     2     20 (7.2%)
Intermediate    33    28    27    29    28    20    165 (59.4%)
Advanced        17    11    12    16    20    17    93 (33.4%)
It could have been expected that the Adamants were mid to high performers. Students at the beginning level may have found it too frustrating to correct sentences without any expectation of success. However, individual learner differences may have also played a role: some students may have simply refused to give up.
In comparing the language skill levels of all four interaction types, Figure 2 shows that low to mid performers tended to be Browsers and/or Frequent Peekers.
Figure 2
Skill Profile across All Constructs for Each Interaction Type
(bar chart comparing the percentages of beginner, intermediate, advanced, and initially correct submissions for Browsers, Frequent Peekers, Sporadic Peekers, and Adamants)
In contrast, mid to high performers tended to be Adamants, while Sporadic Peekers consisted mainly of students with intermediate language skills. The numbers of beginning and advanced students among the Sporadic Peekers are fairly balanced at 13.5% and 10.6%, respectively. Given these results, we speculate that beginning learners take more advantage of system help options. First, they make more errors than learners at other levels and thus find it more frustrating to correct exercises independently. Second, students who make a lot of errors accomplish fewer exercises in the time allotted; peeking at the answer is an expedient way to advance through an exercise set. Intermediate students achieve a higher number of initially correct responses and, even in the case of errors, require fewer tries. Finally, high performers get more sentences initially correct and find working through many retries once in a while more of a challenge than a nuisance.

CONCLUSIONS AND FURTHER RESEARCH

In this article, we investigated learner control and error correction in a web-based ILTS for German. The data show that 85% of the participants sought to correct errors on their own most of the time.
REFERENCES
Bland, S. K., Noblitt, J. S., Armington, S., & Gray, G. (1990). The naive lexical hypothesis: Evidence from computer-assisted language learning. Modern Language Journal, 74, 440-450.
Brandl, K. K. (1995). Strong and weak students' preferences for error feedback options and responses. Modern Language Journal, 79, 194-211.
Chapelle, C., & Mizuno, S. (1989). Students' strategies with learner-controlled CALL. CALICO Journal, 7 (2), 25-47.
Chapelle, C., Jamieson, J., & Park, Y. (1996). Second language classroom traditions: How does CALL fit? In M. Pennington (Ed.), The power of CALL (pp. 33-52). Houston, TX: Athelstan Publications.
Cobb, T., & Stevens, V. (1996). A principled consideration of computers and reading in a second language. In M. Pennington (Ed.), The power of CALL (pp. 115-137). Houston, TX: Athelstan Publications.
Elsom-Cook, M. (1988). Guided discovery tutoring and bounded user modelling. In J. Self (Ed.), Artificial intelligence and human learning (pp. 165-178). Bristol, UK: J. W. Arrowsmith Ltd.
Hagen, L. K. (1994). Unification-based parsing applications for intelligent foreign language tutoring systems. CALICO Journal, 12 (2), 5-31.
Hegelheimer, V., & Chapelle, C. (2000). Methodological issues in research on learner-computer interactions in CALL. Language Learning & Technology [Online], 4 (1), 41-59. Available: llt.msu.edu
Heift, T. (2001). Error-specific and individualized feedback in a web-based language tutoring system: Do they read it? ReCALL, 13 (2), 129-142.
Heift, T., & Nicholson, D. S. (2000a). Theoretical and practical considerations for web-based intelligent language tutoring systems. In G. Gauthier, C. Frasson, & K. VanLehn (Eds.), Intelligent Tutoring Systems, 5th International Conference, ITS 2000 (pp. 354-362). Montreal, Canada: ITS.
Heift, T., & Nicholson, D. (2000b). Enhanced server logs for intelligent, adaptive web-based systems. In Proceedings of the Workshop on Adaptive and Intelligent Web-based Educational Systems, ITS 2000 (pp. 23-28). Montreal, Canada.
Heift, T., & McFetridge, P. (1999). Exploiting the student model to emphasize language teaching pedagogy in natural language processing. In Proceedings of the Workshop on Computer-Mediated Language Assessment and Evaluation in Natural Language Processing, ACL/IALL 1999 (pp. 55-62). College Park, MD.
Higgins, J. (1987). Artificial unintelligence. TESOL Quarterly, 21 (1), 159-165.
Holland, M. (1991). Parsers in tutors: What are they good for? CALICO Journal, 11 (1), 28-47.
Holland, M. V., Kaplan, J. D., & Sama, M. R. (Eds.). (1995). Intelligent language tutors: Theory shaping technology. Mahwah, NJ: Lawrence Erlbaum.
Hubbard, P. L. (1996). Elements of CALL methodology: Development, evaluation, and implementation. In M. Pennington (Ed.), The power of CALL (pp. 15-33). Houston, TX: Athelstan Publications.
Labrie, G., & Singh, L. P. S. (1991). Parsing, error diagnostics, and instruction in a French tutor. CALICO Journal, 9, 9-25.
Levin, L. S., & Evans, D. A. (1995). ALICE-chan: A case study in ICALL theory and practice. In M. V. Holland, J. D. Kaplan, & M. R. Sama (Eds.), Intelligent language tutors: Theory shaping technology (pp. 77-99). Mahwah, NJ: Lawrence Erlbaum.
AUTHOR'S BIODATA

Dr. Trude Heift is an Assistant Professor in the Linguistics Department at Simon Fraser University. Her research areas are CALL, computational linguistics, and applied linguistics. She has developed web-based Intelligent Language Tutoring Systems for German, Greek, and ESL. She is also the director of the Language Learning Centre at Simon Fraser University.
AUTHOR'S ADDRESS

Dr. Trude Heift
Linguistics Department
Simon Fraser University
Burnaby, British Columbia
Canada V5A 1S6
Phone: 604/291-3369
Fax: 604/291-5659
Email: heift@sfu.ca