Davies Textbook Trends in Teaching Language Testing

Language Testing
http://ltj.sagepub.com Textbook trends in teaching language testing

Alan Davies Language Testing 2008; 25; 327 DOI: 10.1177/0265532208090156 The online version of this article can be found at: http://ltj.sagepub.com/cgi/content/abstract/25/3/327
Published by:
http://www.sagepublications.com
Language Testing 2008 25 (3) 327-347
Textbook trends in teaching language testing

Alan Davies University of Edinburgh, UK
The article examines changes in language testing textbooks in English since Lado (1961) and proposes that two trends may be discerned. The first shows how the growing professionalism of the field has required an expansion in teaching materials to meet the need for new training programmes. What the expansion also shows is the desire, again a mark of increasing professionalism, to provide all teaching resources from within the profession so that for needed skills (e.g. statistics and measurement) it is now less necessary to appeal to outsiders such as statisticians and psychometricians. The second trend explains the need for the profession to expand its view of the skills needed by its members. From Lado onwards, skills were always conjoined with knowledge about language and about testing. More recently, the profession has explicitly declared a concern for principles with regard, for example, to validity and to ethics. The increasing professionalism comes at a cost, and that cost is twofold. First, in-housing all resources means that language testers are increasingly insulated from other potentially rewarding disciplines. Second, the complete resource offerings in the later teaching materials mean that students may be denied empirical encounters with real language learners, spending all (or much of) their training within the resource material. The article also questions how far research has informed the changes in training materials.
Keywords: informed by research, knowledge, language testing textbooks, practical manuals, principles, professionalism, skills, teachers' resources
In writing about the teaching of language testing, we can make use of any of the materials (printed, audio, video, DVD, etc.) that have been developed. But it will not be very helpful to do so, first because our critique becomes an indiscriminate survey of the literature, and, second, because teaching ceases to be a deliberate proactive presentation and becomes an exposure to the whole field. It is more useful, both for our understanding of teaching and in order to put limits on
Address for correspondence: Alan Davies, Linguistics and English Language, School of Philosophy, Psychology and Language Sciences, University of Edinburgh, Adam Ferguson Building, George Square, Edinburgh EH8 9LL, UK; email: a.davies@ed.ac.uk
© 2008 SAGE Publications (Los Angeles, London, New Delhi and Singapore) DOI: 10.1177/0265532208090156
the material discussed in this paper, to consider teaching as deliberate pedagogy. By deliberate pedagogy, I mean the work that teachers do in their professional pursuit of teaching: they plan and organize their area of expertise, which may be a language, a science or, in our case, language testing, in order to facilitate learning. Looking back over the last 50 years, I discern two trends. The first trend charts the growing professionalism and expansion of the field alongside the attempt to develop all-in material, thereby relieving the student of the need in the teaching context to draw on material outside the textbook. As we shall see, psychometric issues are still very important today (but see Fulcher and Davidson, 2007). The second trend reveals the move from the skills + knowledge approach to the current attempt to take account also of principles. Skills provide the training in necessary and appropriate methodology, including item writing, statistics, test analysis and, increasingly, software programmes for test delivery, analysis and reportage. Knowledge offers relevant background in measurement and language description, as well as in context setting, and may involve an examination of different models of language learning, of language teaching and of language testing, such as communicative language testing, performance testing and, nowadays, socio-cultural theory. Principles concern the proper use of language tests, their fairness and impact, including questions of ethics and professionalism. They thus involve a consideration of the growing professionalism of language testing, of the responsibilities of language testers, of the impact of their work on a range of stakeholders and of the ethical choices they must make. In what follows, I reflect on key publications over the period and later return to a selection of representative texts from among them, considered in terms of the two trends I have adumbrated.
These representative texts are British, American and Australian and they span the whole period under discussion. They are not intended to be any more celebrated than any of the others referred to but are selected as representative largely because they illustrate my argument of the move over these 40 years from skills + knowledge to knowledge-informed skills and then to principles-informed skills.
I First trend: Expansion
A three-way distinction can be made of materials produced for teachers. At the most discursive end we have (1) teachers' resources,
including books, videos, DVDs and computer software. These provide a library for teachers, there to inform them and be made available, where appropriate, to their students. Next are (2) textbooks which provide a deliberately pedagogic approach, again aimed largely at teachers and intended to help them professionally. Then at the how-to end we have (3) practical manuals. Sometimes two of these elements may be combined. Robert Lado, whom Wood calls 'luminary' (Wood, 1991, p. 238), gave language testing early credibility. It is instructive to consult the list of references in Language Testing (Lado, 1961). He is even-handed: he cites such influential psychometric texts as Anastasi (1954/1961), Cronbach (1949/1961) and Buros (1959), as well as significant linguistic texts (Bloomfield, 1933; Gleason, 1955; Hockett, 1958; Sapir, 1921; Fries, 1945). There are no references to any other language testing authors and Lado's linguistics references are all to theoretical and descriptive linguists, not to applied linguists. It is as though Lado is making his contribution to establishing the field by maintaining that applied linguistics needs language and that language testing needs applied linguistics + measurement. Lado's book is thus firmly in the middle of my three-way distinction, among the textbooks. Lado introduces his book thus: 'a comprehensive introduction to the construction and use of foreign language tests. It incorporates modern linguistic knowledge into language testing as one of its chief contributions. The material is primarily intended for teachers of foreign languages and of English as a foreign language' (Lado, 1961, p. vii). While the book may be a textbook, in my use of the term here, there are some parts of the book which approximate a practical manual, notably Part 2: Testing the elements of language. There are those who do not value Lado's contribution to language testing. But that is unjust. The book is a triumph of combining issues.
McNamara, 40 years later, writes that Lado's recommendations about testing dominated practice for nearly twenty years and are still influential in powerful tests such as TOEFL (McNamara, 2000, p. 89). Bernard Spolsky calls Lado's 1961 volume a pioneering book (Spolsky, 1995, p. 353). He praises Lado's work thus: 'Lado's explicit appeal to theory was a crucial step to the professionalization of the field. With Lado, and with the students and colleagues he gathered (in the 1950s) at Michigan, like Harris and Palmer the language testing profession had taken a major first step' (Spolsky, 1995, p. 150). And he urges us to remember our pioneers, such as Lado: 'Our field has been remarkably ahistorical: we have too often satisfied ourselves with patricidal fury on a named or unnamed predecessor before launching ourselves into our own rediscovery of a slightly circular wheel of our own' (Spolsky, 1995, p. 352). No later publication comes near the breadth of Lado (1961), until perhaps Fulcher and Davidson (2007). What Lado was keenly aware of was that language teachers need to know about language as well as about language testing. Lado's successors have been less concerned with providing knowledge about language, perhaps because in the last half century applied linguistics has been more widely available. Here is part of Lado's commentary: 'As language yields its secrets to linguistic analysis, lexicographic study, and quantitative research, it is more and more feasible to define specifically the task of learning a foreign language. As we identify more precisely the elements and patterns to be acquired by the speakers of a language in learning another, we will be able to test more precisely the progress made by the student under given conditions' (Lado, 1961, pp. 338-339). Lado may have been over-optimistic about the future of science, but his understanding of what was needed was just. Later in the 1960s, Harris (1969) and Valette (1967, 2nd ed., 1977) followed Lado's example, Harris for ESL, Valette for modern foreign languages, amplifying Lado in a specialist area and at the same time dependent on him. Valette (1967, p. v) claimed that her intention was to introduce teachers to a diversity of testing techniques, while Harris (1969) offered his book as a short concise text on the testing of ESL, a subject about which both classroom teachers and trainers of teachers have shown an increasing concern (Harris, 1969, p. vii).
Like Valette, Harris modelled himself on Lado, thus combining an analytic approach to language and its uses in such sections as 'what is meant by reading comprehension', 'what is meant by writing' and 'what is meant by speaking a second language' along with discussion of test characteristics (reliability, validity, practicality), test construction, test administration and analysis of test results, followed by a separate section on the statistics needed to complete the task. Valette is narrower, understandably so since she offers a range of different language examples, but in the main both the Harris and the Valette are largely concerned with how to develop tests, with Harris going further in how to analyse the results. While Lado combined the resource and the textbook and Valette and Harris the textbook and the practical manual, all three primarily offer a textbook approach. CAL (1961) and Davies (1968), in their provision of historical accounts and views of language testing issues, provided texts which were resource-based and as such offered information and ideas to teachers and graduate students which could be followed up. Applied linguistics in the 1960s was still in its early
days and therefore language teachers (who might or might not follow graduate courses in applied linguistics at some point) were the necessary audience for applied linguistics developments. In his important CAL paper, J. B. Carroll (1961) does not attempt to offer a language testing blueprint. Instead, he sets out to readdress the attention of the audience to certain basic and fundamental problems and points of view, some of which may have been lost sight of in the heat of enthusiasm for technical detail (Carroll, 1961, p. 31). And in his edited volume, Davies (1968) offers a range of views that attempt to bring together the three strands of language testing: language, learning and evaluation (Davies, 1968, p. 1), examining the basic disciplines and their relevance to language testing, uses and types of test, the influence of tests on education and the item analysis needed (Davies, 1968, p. 13). Again, as with the Carroll paper, the focus here is that of resource (with a glance at textbook material). Whether we locate them as resource materials or as textbooks, it seems the case that while the Lado and the Valette and the Harris combine the 'what' with the 'how to', with Valette and Harris more on the 'how to' side, the Carroll and the Davies are both very much on the 'what' side. Through the 1960s and the 1970s, with the publication of the textbooks of Clark (1972) and Allen and Davies (1977), along with the publication of the Peace Corps Manual of language testing (published later as Anderson, 1993), a deliberately practical field-guide, there was always the recognition that while these materials provided textbooks and practical manuals, supporting psychometric and statistical back-ups were necessary. These were not specific to language or applied linguistics but were generic to all testing, such as Cronbach (1949), Anastasi (1954) and so on. And for statistical work there were generic programmes such as SPSS.
The Edinburgh course in applied linguistics appeared in four volumes between 1973 and 1977 (Allen & Corder, 1973, 1974, 1975; Allen & Davies, 1977). Volume 4 (Allen & Davies, 1977), with the title Testing and experimental methods, argued that ideas in applied linguistics needed to be submitted to the rigour of hypothesis and experimentation. Experiments, it was suggested, need tests while tests are themselves kinds of experiment: 'This book is an attempt to demonstrate our belief in the importance of this link' (Allen & Davies, 1977, p. 10). After the Introduction by Davies, two chapters (by Davies and Ingram) were devoted to testing, two (by Ruth Clark) to experimental design and computation and one to statistical inference. There
were also two appendices on statistical inference and tables. Practical work was provided at the end of each chapter. Ingram (1977) contributed a theoretical chapter on 'Basic concepts in testing', while Davies contributed a more practical chapter on 'The construction of language tests' (Davies, 1977). The book fits very neatly into our second (textbook) category. The emphasis throughout is very much on the connection between skills and knowledge, both measurement knowledge and language knowledge being presented as part of the skills that the language tester (and researcher) needed to acquire. This volume, The Edinburgh course in applied linguistics, Volume 4, was a serious attempt to locate language testing firmly within applied linguistics and was possibly the first such attempt. All four volumes were much used over the subsequent 10 years to teach applied linguistics and, in the case of Volume 4, to teach both language testing and research methodology. In the 1980s, we see both an expansion and an enriching in language testing publications. This development was paralleled in other areas of applied linguistics, where an increasing number of research specialists in fields such as second language acquisition and discourse analysis expanded their research base and, necessarily, their teaching provision to take account of the expansion. Thus in language testing we see the explosion of communicative language teaching with Carroll and Hall's (1985) teachers' guide alongside a growth in more general textbooks such as Madsen (1983), Hughes (1989) and the earlier Heaton (1975). B. J. Carroll (1985) promoted communicative language testing procedures. 'The purpose of this book', Carroll maintained, 'is to outline principles and techniques for specifying the communicative needs of a language learner and for assessing his language performance in terms of those needs' (Carroll, 1985, p. 5).
With hindsight it would be more appropriate to place Carroll's Testing Communicative Performance in our skills + knowledge category. Carroll's 'principles' are more properly considered knowledge in my sense, with the proviso that this knowledge is quite ideologically driven. What we also see in the teaching programmes at graduate level is that empirical work was still part of the core requirement, so that students following language testing courses were required to carry out a small-scale testing project. For this they relied on the growing number of textbooks which were beginning to relate the necessary research and analytic techniques to language and applied linguistics. Among these were Hatch and Farhady (1982), followed in a second edition by Hatch and Lazaraton (1991) for statistics and research
design and Henning (1987), which began to make IRT techniques meaningful to the field of language testing and applied linguistics. The 1980s also saw the start of the new journal Language Testing, a sure sign of the field's growing research capacity. This journal was very deliberately not a teaching outlet. In the 1990s we see the normal academic development of an emerging discipline, now maturing and showing that maturity by publishing research monographs and regular surveys of the field (Davies, 1982; Skehan, 1988, 1989; Alderson & Banerjee, 2001, 2002), covering both its development and its future trends. These developments increased the coming together of the contributing disciplines so that Bachman (1990) and Bachman and Palmer (1996) brought together research design, statistics, computer programmes, test preparation and analyses, while Davies (1990) and Wood (1991) offered critiques of language testing which can at best be regarded as resource materials for teaching, but are not easily put directly to use in a training programme. Since so much writing about language testing, up until the 1990s, and perhaps even today, concerns large-scale testing, Genesee and Upshur (1996) was very much to be welcomed, dealing as it did with the very real, and very difficult, context of classroom assessment. Genesee and Upshur termed their book 'practical' and it belongs to the practical manual end of my textbook category, concerned primarily with skills and offering the knowledge necessary for employing those skills but less concerned with principles. As the field of language testing has grown, as courses have developed, at the undergraduate and graduate as well as at the PhD level, and as those courses have specialized so that now there are a number of master's degrees in language testing itself, while formerly these were normally part of degrees in applied linguistics or in applied language studies, TESOL, and so forth, different teaching needs have shown themselves.
Extended resources deliberately designed for teaching are represented by the publication of self-help teaching materials in the shape of the University of Melbourne video series Mark My Words (1997) and the ILTA Web-based interviews on language testing Video FAQs (Fulcher & Thrasher, 1999, 2000). A parallel development is represented by the publication of the Dictionary of language testing (Davies, Brown, Elder, Hill, Lumley & McNamara, 1999) and the Encyclopedic dictionary of language testing (Mousavi, 2002). Maturing disciplines display and develop their maturity through the development of teaching materials, such as the videos and through the defining descriptive work of specific dictionaries. This
descriptive work provides the self-help teaching resources that teachers and students rely on. Davies et al. (1999) labelled their Dictionary of Language Testing a 'segmental dictionary': as such, it retains its professional/vocational/registral association and at the same time its normative/pedagogic purpose (Davies, 1996, p. 231). A more traditional development, also in the 2000s, is represented by the Cambridge University Press (CUP) Language Assessment series which, beginning in 1999, has to date published 10 volumes. Seven are of particular relevance (Douglas, 1999; Alderson, 2000; Read, 2000; Buck, 2001; Cushing, 2002; Purpura, 2004; Luoma, 2004) inasmuch as they replicate within their properly narrow confines the operation of the Bachman (1990) model of language testing. This series is deliberately pedagogic. The Series Editors' Preface to Alderson (2000) concludes thus: 'this book offers a principled approach to the design, development and use of reading tests and thus exemplifies the purpose of this series to bring together theory and research in applied linguistics in a way that is useful to language testing practitioners' (Alderson, 2000, p. xi). A separate but contemporary development may be found in the work of Pennycook (2001), Shohamy (2001), Hawkey (2006) and McNamara and Roever (2006). Although none of these publications is a textbook, all are being widely used and excerpted in the teaching of language testing and in training programmes. Basing themselves on a teleological foundation, on the judgement of test use (what Bachman & Palmer (1996) term 'test usefulness') and on a concern for a professional attachment to ethics (Davies, 1997), they all insist, in somewhat different ways, that test validity must take account of how and where a test is used. Such critiques, based as they are on an essentialist, relativist belief, may or may not be tenable or indeed practical.
But there is no doubt that their critical attacks have penetrated into teaching programmes, giving pause to the perhaps overly confident view that a language test is a language test, no matter where or for whom. This critical stand-off is linked also to the social constructivist critiques of positivist philosophies (Lantolf, 2000). In all cases, what we see is a genuine and worthwhile attempt to reflect on what Shohamy (2001) calls 'the power of tests'. Students and teachers who are working in and studying language testing need to know about these critiques so that they are aware that what they are involved in, language testing, has the potential to harm, indeed destroy, people, even though, of course, the critiques may not change what the students and teachers think or do.
II Second trend: Skills, knowledge and principles
The second trend reveals the move from the skills + knowledge approach to the current attempt to take account also of principles. Skills provide the training in necessary and appropriate methodology, including item writing, statistics, test analysis and, increasingly, software programmes for test delivery, analysis and reportage. Knowledge offers relevant background in measurement and language description as well as in context setting. Principles concern the proper use of language tests, their fairness and impact, including questions of ethics and professionalism. The movement over the last 40-odd years seems to be from skills to skills + knowledge to skills + knowledge + principles. The trend is not consistent but overall seems to hold. We can argue as follows: what a new (applied) activity needs quickly is to disseminate skills. But it becomes apparent quite soon that skills are not sustainable without knowledge since knowledge provides the context in which skills operate: if skills represent 'how?', then knowledge represents 'what?'. And then over time, as the activity becomes more confident and as a profession practising the activity grows, it is inevitable that the activity itself comes into question: externally of course, though that is an old critique (testing has always had its critics), but now also internally, as the language testing professionals themselves begin to query their own professionalism, their ethical foundations. What then happens is that what had been a skill, such as item writing, incorporates knowledge and so becomes skill + knowledge, since item writing requires understanding of the context and purpose for which the items are being written. Thus a test of LSP necessarily requires that item writers have the relevant knowledge of the language description of their area of special purpose.
And further, as knowledge becomes more widely available in the profession, so the need to explain, to justify and to judge becomes important. Thus the concern for, let us say, validity moves from principles to knowledge as validity itself takes on more than a concern to represent the ideal domain and becomes a recognition of the practical impact on the test in all its singular settings. In its turn, the new principles-informed knowledge is operationalized and incorporated into skills. Skills, meaning techniques and methodologies, on their own are no longer enough; skills + knowledge are inadequate without the addition of principles. For teaching, as for learning, there is a need for careful balancing of the practical (the skills) with the descriptive (the knowledge)
and the theoretical (the principles). All are necessary but one without the other(s) is likely to be misunderstood and/or trivialized. The survey of language testing courses, reported in Bailey and Brown (1996), has now been updated for this volume by Brown and Bailey who find that in the 10 years since their 1996 survey little has changed apart from the choice of textbooks. They report the presence of a stable knowledge base that is evolving and expanding rather than shifting radically (Brown & Bailey, this volume: p. 371). In 1996, Bailey and Brown reported that there is a great deal of diversity in the sorts of language testing preparation provided to teachers (Bailey & Brown, 1996, p. 250). This diversity is revealed in the list of required and optional textbooks supplied to them by the 84 language testing teachers who returned their questionnaires. Bailey and Brown list 32 textbooks, half of which were listed by only one respondent. The most common textbooks were as follows:
Henning, 1987
Madsen, 1983
Hughes, 1989
Bachman, 1990
Oller, 1979
Shohamy, 1985
Bailey and Brown (1996, p. 247) comment that there is a wide range of emphasis, from the very theoretical to the very practical in the assessment preparation language teachers receive. However, of the six textbooks listed above, four (Henning, Hughes, Bachman and Oller) were very much on the theoretical side. It appears that there was a widely held view that a language testing textbook should be inclusive, combining knowledge and skills. The high ranking given to Oller (1979) confirms that choice of theory plus practical textbook; indeed, we might speculate that the attraction of the Oller is that it not only offered knowledge and skills but also broached principles in the discussion of the nature of validity: 'What is the ultimate criterion of validity for language tests?' (Oller, 1979, p. 404). No doubt Oller's ideological adherence at that time to his expectancy grammar and to the indivisibility hypothesis (or unifactorial structure of language proficiency) may have made for a one-sided approach, but principles they are, nonetheless. In their 2007 Survey, reported in this volume, Brown and Bailey noted that 29 textbooks were listed as in use, compared with the 32 in the 1996 Survey. They write: 'Interestingly, only six of the books were common to both studies and of those, four were in new editions, while only two were in their original editions' (Brown & Bailey, this volume: p. 371).
The six textbooks common to both surveys were as follows:

Hughes (1989, 2002)
Bachman (1990)
Brown (new ed. 2005)
Cohen (1994)
Alderson, Clapham and Wall (1995)
Bachman and Palmer (1996)
(There is, of course, a natural delay before a new textbook is taken up and an existing one laid down.) The 2007 list of frequency of use placed five of these (Hughes, Bachman, Brown, Alderson et al., Bachman and Palmer) at the head of its list. I noted above that of the six most commonly used textbooks listed in 1996 four were on the theoretical side. These include the Hughes and the Bachman, the only textbooks that appear as most commonly used in both lists. And two of those moving into the top position for the first time in 2007, the Alderson, Clapham and Wall and the Bachman and Palmer, also take up a theoretical approach, thereby combining, as I have argued, knowledge and skills. This combination of knowledge and skills is, it would appear, more likely to endure than the somewhat ephemeral practical manuals. I now consider a small number of celebrated, perhaps iconic, texts published over the last 50 years, to illustrate what I regard as the skills, knowledge and principles trend in the concept of the teaching of language testing. What I propose is, as foreshadowed earlier, that over this period there has been an expansion from skills to skills + knowledge and then to skills + knowledge + principles. My illustrative texts are:
Lado (1961)
Allen and Davies (1977)
Hughes (1989)
Bachman (1990)
Alderson, Clapham and Wall (1995)
Bachman and Palmer (1996)
Mark My Words (1997), the ILTA Video FAQs (Fulcher & Thrasher, 1999, 2000), the Dictionary of language testing (Davies et al., 1999), these three taken together
McNamara (2000)
Davidson & Lynch (2002)
Weir (2005)
McNamara and Roever (2006)
Fulcher and Davidson (2007)
While there are indeed texts that deal entirely or perhaps mainly with skills (for instance, Madsen, 1983; Carroll & Hall, 1985; Heaton, 1975), all the examples I want to discuss are more comprehensive. Thus Lado (1961), Allen and Davies (1977) and Hughes (1989) deal with both skills and knowledge.
Lado (1961) begins his book with knowledge: his Part 1 consists of discussions of language, language learning, language testing, variables and strategy of language testing, and critical evaluation of tests. The remaining 90% of the book examines the skills needed for developing tests and for experiments using language tests. Similarly, Allen and Davies (1977) consider both skills and knowledge, with chapters on 'Basic concepts in testing' and on 'The construction of language tests'. There are then two chapters on experiments plus a further chapter (and appendices) on the meaning and working of statistics used in experiments and testing. Hughes (1989, now in its second edition, 2003) again deals with the background knowledge needed in language testing and with the statistical and item writing skills. What is of interest to us in this discussion is that Hughes's second edition does not move beyond the skills + knowledge position he took up in 1989 in spite of the pressures exercised by discussions in the language testing community on principles. This may suggest that there is less demand among teachers, Hughes's target audience, for principles than I had assumed. Bachman and Palmer (1996) and Davidson and Lynch (2002) follow much the same pattern. However, Bachman and Palmer's examination of the conceptual basis of test development introduces what they term 'test usefulness', a kind of metric by which we can evaluate not only the tests that we develop and use, but also all aspects of test development and use (Bachman & Palmer, 1996, p. 17). This formulation I consider to be an incorporation of skills + knowledge such that the knowledge of test development and use now becomes a learnt skill. But the main purpose of their book is, they contend, to enable the reader to become competent in the design, development and use of language tests (Bachman & Palmer, 1996, p. 3). That is its primary purpose and that is why I place the book in the skills + knowledge category.
Davidson and Lynch's book (2002) also belongs in this category. The subtitle is: A teacher's guide to writing and using language test specifications. Davidson and Lynch maintain that existing language testing textbooks assume knowledge of testing, whereas they themselves aim to provide an introduction to newcomers, focussing on test specifications. And so their book offers guidance on the skills needed to write test specifications but also shows how the knowledge behind those specifications can be viewed as and taught as skills in their own right. Bachman (1990), the text on which Bachman and Palmer (1996) is based (as well as the Cambridge Language Assessment series; see above), does not deal with the skills of item writing and test analysis.
These matters are, after all, taken up in the later Bachman and Palmer (1996). But what Bachman does in his 1990 text is to treat knowledge as a form of skill and at the same time to move on to begin an examination of principles. Hence the discussion of validity as a unitary concept, of the evidential nature of validity and of the consequential and ethical basis of validity. Such discussions look forward to the Bachman and Palmer (1996) concept of test usefulness.

Alderson, Clapham and Wall (1995) is much more rounded and less practical: 'Something we do not do in this book is to describe language testing techniques in detail' (Alderson, Clapham & Wall, 1995, pp. 2–3). What they do deal with is the examination of validation in all its aspects. Indeed, short on techniques though this volume may be, through its in-depth discussion of validity and standards its scope as a textbook is as wide-ranging as that of Lado's before them and that of Fulcher and Davidson (2007) over ten years later. Alderson, Clapham and Wall (1995) ground much of their discussion by reference to the work of the UK EFL examination boards. This has the advantage of realism and makes a powerful argument for the different concerns of academic language testing on the one hand and public or institutional (or indeed commercial) language testing on the other. It is distressing for academics to learn that 'not all boards understand what is meant by validation, validity and reliability' (Alderson et al., 1995, p. 257). But the very context specificity of the book and its strong critique, almost ideological, of the boards may detract from the overall concern of the learner, who is unlikely to have the same critical view as the authors.

The two video projects, Mark My Words (1997) and the ILTA Video FAQs (Fulcher & Thrasher, 1999, 2000), are not primarily concerned with skills. Both deal largely with knowledge.
Thus, Mark My Words has the following topics in its series:
Language proficiency assessment
Principles of test development ('principles' is used somewhat differently in the present article)
Objective and subjective assessment
Stages of test analysis
Performance assessment
Classroom-based assessment
In this video series, knowledge is presented less as background than as part of the necessary skills behaviour in developing language tests. The Dictionary of Language Testing (Davies et al., 1999) goes further by incorporating, as is the nature of dictionaries,
topics such as ethics, ethicality and impact and thus taking some account of the principles of language testing. Like Alderson, Clapham and Wall (1995), Weir (2005) is less concerned with skills than with knowledge and in particular with validation. What he very carefully does is to explain that validation evidence is required to demonstrate validity: in other words, while others have shaped knowledge into a kind of skill, what Weir does is to convert principles first into knowledge and then into a skill. No doubt there is a case for retaining a separation between knowledge and skills and between principles and skills so that implementing them requires thought, not just automaticity. But for teaching purposes, which is our concern here, the demonstration of how to make both knowledge and principles operational, that is skill-like, is pedagogically very appealing.

McNamara (2000) and McNamara and Roever (2006) both elaborate the knowledge needed for professional language testers and deal in some depth with principles. Thus, in both texts there is concern with ethics and social policy and responsibilities: indeed, the whole of McNamara and Roever (2006) is concerned, as the title indicates, with social issues. McNamara and Roever argue that 'language testing is ripe for a broader view of assessment and its social aspects (while) testers need to reflect on test use' (McNamara & Roever, 2006, p. 8).

If McNamara (2000) and McNamara and Roever (2006) are concerned wholly with knowledge and principles in such a way that principles become part of the knowledge needed, Fulcher and Davidson (2007) is yet more all-embracing, offering in one volume what Bachman (1990) and Bachman and Palmer (1996) offer in two. Fulcher and Davidson (2007) has the title 'Language testing and assessment: An advanced resource book' and is part of an Applied Linguistics series of resource books in different areas. Fulcher and Davidson (2007, p. xix) consider that their discussion is set within 'a new approach that we believe brings together testing practice, theory, ethics and philosophy. At the heart of our new approach is the concept of effect-driven testing. This is a view of test validity that is highly pragmatic. Our emphasis is on the outcome of testing activities.' The integrative nature of their text, comprising:
A) Introduction: 10 units dealing with the central concepts of language testing and assessment
B) Extension: readings from books and articles linked to the concepts introduced in Section A
C) Exploration: extended activities building on both A and B
situates the learner within the language testing enterprise. Of all the texts examined in this paper, Fulcher and Davidson (2007) does seem to provide the most complete coverage of skills, knowledge and principles.

The development proposed above can be summarized as follows. In the 1960s (and earlier) language testing relied on external sources, particularly psychometric ones (Cronbach, 1949; Anastasi, 1961; Tyler, 1963; Anstey, 1966). From the 1970s onwards, the attempt was made internally to nativize the necessary skills and knowledge, but in separate texts: thus Hatch and Farhady (1982) and Hatch and Lazaraton (1991), dealing with statistics and research design; Shohamy (2001) and possibly the external Pennycook (2001), handling critical approaches; the ILTA Code of Ethics (2000), presenting the profession's ethical principles; and Bachman (2004), again dealing with statistics. Meanwhile, we have the internal sequence discussed above, from Lado (1961) to Fulcher and Davidson (2007), moving gradually beyond the skills-and-knowledge scenario to the combination of skills, knowledge and principles. We present this array in Table 1.
Table 1 Changes in English language testing textbooks
Columns: External; Internal separate; Internal combined; Skills; Knowledge; Principles
External (1960s): Cronbach; Anastasi; Tyler; Anstey
Internal separate: Hatch & Farhady; Hatch & Lazaraton; 2000s: Code of Ethics; Shohamy; Pennycook; Bachman
Internal combined: Lado; Allen & Davies (1970s); Hughes; Bachman; Alderson, Clapham & Wall; Bachman & Palmer; Mark My Words; McNamara; Davidson & Lynch; Weir; McNamara & Roever; Fulcher & Davidson

III Conclusion
The development in teaching materials examined in this paper comes as a result of the increasing professionalism of the field of language testing. That increasing professionalism has a cost, and the cost is twofold. First, in-housing all resources means that language testers are increasingly excluded from other potentially rewarding disciplines. Second, the complete resource offerings in the later teaching materials mean that students are over-protected from exposure to empirical encounters with real language learners, spending all (or much of) their training within the resource material. This exclusion from external influences leads to an insularity, a reluctance to take up new ideas, as McNamara and Roever (2006) argue. They remind us that in spite of the social turn in the last two decades the teaching of language testing is still largely psychometric: 'In terms of academic training, we stress the importance of a well-rounded training for language testers that goes beyond applied psychometrics, a training that includes a critical view of testing and social consequences' (McNamara & Roever, 2006, p. 255).

A similar resistance may explain the reluctance to grapple with recent research findings. Wood's (1991) is an extreme view: 'it is clear that innovation is not driven by research (but) it is important to understand how (innovations) happened, and whether they were actually necessary, if only to appreciate how marginal the part research evidence plays in the decision' (Wood, 1991, p. 248). Wood's scepticism about the influence of research relates to innovation: 'it is clear that innovation is not driven by research', a view he exemplifies in his comments on the English Language Testing Service test (Wood, 1991, pp. 235–236).

In this paper, I have not, in any direct way, discussed to what extent language testing textbooks make use of language testing research. This omission is deliberate. It is not, of course, that there is a dearth of research in language testing: on the contrary there is a great deal, reported in journals such as Language Testing and Language Assessment Quarterly, in encyclopedias such as Shohamy and Hornberger (2008) and Hinkel (2005), in the many monographs, notably the Alderson and Bachman series (see the reference above, p. 334, to the CUP Language Assessment Series: 2000–2004), and in the regular reports of
Cambridge ESOL and Educational Testing Service on TOEFL. But while a textbook is properly informed by research, its primary purpose is not, as is that of a monograph, to report recent research. Textbooks consolidate while monographs are dynamic, reporting developments in current research. There is an inevitable gap, a time lag between the publication of research and its incorporation in a textbook, by which time there may be a very different research need, as, for example, McNamara (2005), Rea-Dickins (2008) and Leung and Lewkowicz (2008) note in their references to classroom assessment. McNamara's labelling of this gap as 'unbridgeable' (McNamara, 2005, p. 778) is apt and mistaken. Yes, there is a gap, but it is a necessary gap. Worthwhile training needs to be informed by mature understanding of research and not by the latest news from the PhD and the research project.
IV References
Alderson, C. (2000). Assessing reading. Cambridge: Cambridge University Press.
Alderson, C. & Banerjee, J. (2001). Language testing and assessment (Survey Article). Language Teaching, 34, 213–236.
Alderson, C. & Banerjee, J. (2002). Language testing and assessment (Survey Article). Language Teaching, 35, 79–113.
Alderson, C., Clapham, C. & Wall, D. (1995). Language test construction and evaluation. Cambridge: Cambridge University Press.
Alderson, C. & Bachman, L. F. (Eds.). (2000–2005). The Cambridge language assessment series. Cambridge: Cambridge University Press.
Allen, P. & Davies, A. (Eds.). (1977). Testing and experimental methods: Volume 4. The Edinburgh course in applied linguistics. London: Oxford University Press.
Allen, P. & Corder, S. P. (Eds.). (1973). Readings for applied linguistics: Volume 1. The Edinburgh course in applied linguistics. London: Oxford University Press.
Allen, P. & Corder, S. P. (Eds.). (1974). Techniques in applied linguistics: Volume 3. The Edinburgh course in applied linguistics. London: Oxford University Press.
Allen, P. & Corder, S. P. (Eds.). (1975). Papers in applied linguistics: Volume 2. The Edinburgh course in applied linguistics. London: Oxford University Press.
Anastasi, A. (1954). Psychological testing (1st ed.). New York: Macmillan.
Anastasi, A. (1961). Psychological testing (2nd ed.). New York: Macmillan.
Anderson, N. (1993). Handbook for classroom teachers in Peace Corps language programs. Manual. Washington, DC: The Peace Corps.
Anstey, E. (1966). Psychological tests. London: Macmillan.
Bachman, L. (1990). Fundamental considerations in language testing. Oxford: Oxford University Press.
Bachman, L. (2004). Statistical analyses for language assessment. Cambridge: Cambridge University Press (also with Antony Kunnan: Handbook and CD).
Bachman, L. & Palmer, A. (1996). Language testing in practice. Oxford: Oxford University Press.
Bailey, K. M. & Brown, J. D. (1996). Language testing courses: What are they? In A. Cumming & R. Berwick (Eds.), Validation in language testing (pp. 236–256). Clevedon: Multilingual Matters.
Bloomfield, L. (1933). Language. New York: Henry Holt.
Brown, J. D. & Bailey, K. M. (2008). Language testing courses: What are they in 2007? Language Testing (this issue).
Buck, G. (2001). Assessing listening. Cambridge: Cambridge University Press.
Buros, O. (1959). Fifth mental measurements yearbook. Highland Park, NJ: Gryphon Press.
CAL (Center for Applied Linguistics) (1961). Testing the English proficiency of foreign students. Washington, DC: CAL.
Carroll, B. J. (1985). Testing communicative performance. Oxford: Pergamon.
Carroll, B. J. & Hall, P. (1985). Make your own language tests: A practical guide to writing language performance tests. Oxford: Pergamon.
Carroll, B. J. (1961). Fundamental considerations in testing for English Language proficiency of foreign students. In CAL, Testing the English proficiency of foreign students (pp. 30–40). Washington, DC: Center for Applied Linguistics.
Clark, J. (1972). Foreign language testing: Theory and practice. Philadelphia, PA: Center for Curriculum Inc.
Cobuild (1987). Collins Cobuild English language dictionary. London: Collins.
COE (2000). ILTA Code of Ethics. http://www.iltaonline.com/code.pdf
Cohen, A. D. (1994). Assessing language ability in the classroom (2nd ed.). New York: Heinle and Heinle.
Cronbach, L. (1949). Essentials of psychological testing (1st ed.). New York: Harper and Row International.
Cronbach, L. (1961). Essentials of psychological testing (2nd ed.). New York: Harper and Row International.
Cumming, A. & Berwick, R. (Eds.). (1996). Validation in language testing. Clevedon: Multilingual Matters.
Cushing, S. W. (2002). Assessing writing. Cambridge: Cambridge University Press.
Davidson, F. & Lynch, B. (2002). Testcraft: A teacher's guide to writing and using language test specifications. New Haven, CT and London: Yale University Press.
Davies, A. (Ed.). (1968). Language testing symposium: A psycholinguistic approach. London: Oxford University Press.
Davies, A. (1977). The construction of language tests. In P. Allen & A. Davies (Eds.), Testing and experimental methods: Volume 4. Edinburgh course in applied linguistics (pp. 38–104). London: Oxford University Press.
Davies, A. (1982). Language testing survey. Parts 1 and 2. In V. Kinsella (Ed.), Surveys (pp. 127–159). Cambridge: Cambridge University Press.
Davies, A. (1990). Principles of language testing. Oxford: Oxford University Press.
Davies, A. (1996). The role of the segmental dictionary in professional validation: Constructing a dictionary of language testing. In A. Cumming & R. Berwick (Eds.), Validation in language testing (pp. 222–235). Clevedon: Multilingual Matters.
Davies, A. (1997). Demands of being professional in language testing. Language Testing, 14(3), 328–339.
Davies, A., Brown, A., Elder, C., Hill, K., Lumley, T., & McNamara, T. (1999). Dictionary of language testing. Cambridge: Cambridge University Press and UCLES.
Douglas, D. (1999). Assessing languages for specific purposes. Cambridge: Cambridge University Press.
Fries, C. (1945). Teaching and learning English as a foreign language. Ann Arbor, MI: University of Michigan Press.
Fulcher, G. & Davidson, F. (2007). Language testing and assessment: An advanced resource book. London: Routledge.
Fulcher, G. & Thrasher, R. (1999/2000). Video FAQs. Introducing topics in language testing. ILTA (online) http://www.le.ac.uk/education/ilta/faqs/main.html
Genesee, F. & Upshur, J. (1996). Classroom evaluation in second language education. Cambridge: Cambridge University Press.
Gleason, H. (1955). An introduction to descriptive linguistics. New York: Henry Holt.
Harris, D. (1969). Testing English as a second language. New York: McGraw Hill.
Hatch, E. & Lazaraton, A. (1991). The research manual: Design and statistics for applied linguistics. New York: Newbury House.
Hatch, E. & Farhady, H. (1982). Research design and statistics for applied linguistics. New York: Newbury House.
Hawkey, R. (2006). Impact theory and practice. Studies in language testing. Cambridge: Cambridge University Press and UCLES.
Heaton, B. (1975). Writing English language tests. London: Longman.
Henning, G. (1987). A guide to language testing. Cambridge, MA: Newbury House.
Hinkel, E. (Ed.). (2005). Handbook of research in second language teaching and learning. Mahwah, NJ: Lawrence Erlbaum.
Hockett, C. (1958). A course in modern linguistics. New York: Macmillan.
Hughes, A. (1989). Testing for language teachers (1st ed.). Cambridge: Cambridge University Press.
Hughes, A. (2003). Testing for language teachers (2nd ed.). Cambridge: Cambridge University Press.
Ingram, E. (1977). Basic concepts in testing. In J. P. B. Allen & A. Davies (Eds.), Testing and experimental methods: Volume 4. Edinburgh course in applied linguistics (pp. 11–37). London: Oxford University Press.
Lado, R. (1961). Language testing. London: Longmans.
Lantolf, J. (Ed.). (2000). Sociocultural theory and second language learning. Oxford: Oxford University Press.
Leung, C. & Lewkowicz, J. (2008). Assessing second/additional language of diverse populations. In E. Shohamy & N. H. Hornberger (Eds.), Encyclopedia of language and education: Volume 7. Language testing and assessment (2nd ed.) (pp. 301–317). New York: Springer Science & Business Media.
Luoma, S. (2004). Assessing speaking. Cambridge: Cambridge University Press.
Madsen, H. (1983). Techniques in testing. New York and Oxford: Oxford University Press.
Mark My Words (1997). Video Series. Melbourne: University of Melbourne Language Testing Research Centre.
McNamara, T. (2000). Language testing. Oxford: Oxford University Press.
McNamara, T. (2005). Introduction to Part V1: Second language testing and assessment. In E. Hinkel (Ed.), Handbook of research in second language teaching and learning (pp. 775–778). Mahwah, NJ: Lawrence Erlbaum.
McNamara, T. & Roever, C. (2006). Language testing: The social dimension. Malden, MA and Oxford: Blackwell.
Mousavi, A. (2002). An encyclopedic dictionary of language testing (3rd ed.). Taiwan: Tung Hua.
Oller, J. W., Jr. (1979). Language tests at school. London: Longman.
Pennycook, A. (2001). Critical applied linguistics: A critical introduction. Mahwah, NJ: Lawrence Erlbaum.
Purpura, J. (2004). Assessing grammar. Cambridge: Cambridge University Press.
Read, J. (2000). Assessing vocabulary. Cambridge: Cambridge University Press.
Rea-Dickins, P. (2008). Classroom-based language assessment. In E. Shohamy & N. H. Hornberger (Eds.), Encyclopedia of language and education: Volume 7. Language testing and assessment (2nd ed.) (pp. 257–271). New York: Springer Science & Business Media.
Sapir, E. (1921). Language: An introduction to the study of speech. New York: Harcourt Brace.
Shohamy, E. (1985). A practical handbook in language testing for the second language teacher. Shaked: Ramat Aviv, Israel.
Shohamy, E. (2001). The power of tests: A critical perspective on the uses of language tests. Harlow: Pearson.
Shohamy, E. & Hornberger, N. H. (Eds.). (2008). Encyclopedia of language and education: Volume 7. Language testing and assessment (2nd ed.). New York: Springer Science & Business Media.
Skehan, P. (1988). Language testing: Part 1. Language Teaching, 21(4), 211–221.
Skehan, P. (1989). Language testing: Part 2. Language Teaching, 22(1), 1–13.
Spolsky, B. (1995). Measured words. Oxford: Oxford University Press.
Tyler, L. (1963). Tests and measurements. Englewood Cliffs, NJ: Prentice-Hall.
Valette, R. (1967). Modern language tests: A handbook (1st ed.). New York: Harcourt Brace and World.
Valette, R. (1977). Modern language tests: A handbook (2nd ed.). New York: Harcourt Brace and World.
Weir, C. (2005). Language testing and validation: An evidence based approach. Houndmills, Basingstoke: Palgrave Macmillan.
Wood, R. (1991). Assessment and testing. Cambridge: Cambridge University Press.
