Prosody in Non-native Speakers of English Matchett
The Use of Accent Reduction Software to Improve Prosody in Non-native Speakers of English
F. Dee Matchett
TESL 533 Educational Technology Carson Newman University Dr. E. Cody-Mitchell
November 19, 2013
Abstract

Latin served as the international language for hundreds of years and still has a major influence on medical vocabulary and legalese. However, English has replaced Latin as the global language and now serves as the primary language of economic trade. English exerts its influence upon science and medicine as well; hence, professional journals are primarily published in English. It is also the language of the Internet and therefore exercises great influence throughout the World Wide Web. The smaller our world becomes in terms of communication and the blending of cultures, the more non-native speakers find the need to speak English in a readily comprehensible manner. This literature review examines the efficacy of using accent reduction software to improve the prosody of non-native speakers of English. The mental processes involved in the production of segmentals and suprasegmentals will be explained. The relationship between the prosody of the speaker and the perception of the listener will be noted, and the implications of how intelligibility affects both will be identified. Employment discrimination based upon foreign accent and intelligibility will be discussed in relation to the possible benefits of accent reduction. The various technologies employed in accent reduction and research regarding their efficacy will be examined. Finally, a comparison of currently available accent reduction software will be offered.
Following are acronyms and keyword definitions as used in this review:

CALL - Computer Assisted Language Learning
CAPT - Computer Assisted Pronunciation Training
ELL - English Language Learner
ESL - English as a Second Language; generally refers to the teaching of English when English is the primary language of instruction (location is usually an English-speaking country)
EFL - English as a Foreign Language; refers to the teaching of English when English is not the primary language of instruction (location is usually a non-English-speaking country)
L1 - native language
L2 - second language, in this case English
NSP - non-standard pronunciation
prosody - the rhythm, stress, and intonation of speech
segmentals - phonological units of language, i.e., vowels and consonants
suprasegmentals - elements of speech that extend over its segments, such as changes in pitch, loudness, duration, tone, and prominence
spectrogram - visual representation of the spectrum of frequencies in a sound that can indicate phoneme patterns of speech
waveform - visual representation of speech that can indicate prosody, rate, intensity, and loudness by tracking changes in air pressure over time as a sound is produced
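The difference between the waveform and spectrogram definitions can be illustrated with a short sketch. It is not taken from any of the software reviewed here. It assumes a synthetic 200 Hz tone in place of recorded speech, an 8000 Hz sampling rate, and hypothetical function names chosen only for this illustration. Loudness is read directly from the waveform, while frequency content requires the kind of frequency analysis that produces one column of a spectrogram.

```python
import cmath
import math

SAMPLE_RATE = 8000  # samples per second (an assumption for this sketch)


def synth_tone(freq_hz, amplitude, n_samples):
    """Synthesize a pure tone as a stand-in for a recorded speech waveform."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)
            for n in range(n_samples)]


def rms_loudness(frame):
    """Waveform view: root-mean-square amplitude, a rough loudness measure."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def magnitude_spectrum(frame):
    """Spectrogram view: DFT magnitudes for one frame (one spectrogram column)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2)]


def dominant_frequency(frame):
    """Frequency in Hz of the strongest non-DC spectral component."""
    spectrum = magnitude_spectrum(frame)
    peak_bin = max(range(1, len(spectrum)), key=spectrum.__getitem__)
    return peak_bin * SAMPLE_RATE / len(frame)


# A 50 ms frame of a 200 Hz tone (illustrative values only).
frame = synth_tone(freq_hz=200, amplitude=0.5, n_samples=400)
print(round(rms_loudness(frame), 3))
print(dominant_frequency(frame))
```

A real spectrogram repeats the frequency analysis over many short, overlapping frames and plots the results against time; the sketch shows a single frame only.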
Figure 1 Broca's Area and Wernicke's Area. Retrieved from http://thebrain.mcgill.ca, October 16, 2013

Research regarding the mental processes involved in the production of language has been ongoing since 1861, when the surgeon Paul Broca identified the area of speech formation by examining a deceased man's brain. The man, although unable to speak, had been able to hear and understand the speech of others. Broca rightly assumed that a lesion found in the brain had been responsible for the man's inability to articulate. The neurologist Carl Wernicke added to the knowledge of language production when he discovered another region of the brain that processed speech. People with lesions in this area could articulate, but their speech was largely incomprehensible. Both areas are located in the left hemisphere of the brain. (Bruno, 2002) (See Figure 1)

It was later discovered that numerous nerve fibers form a path of transmission between Broca's area and Wernicke's area. This connection allows Wernicke's area to analyze written words or auditory language, form contextual understandings, and transfer that information to Broca's area. Broca's area plans the pronunciation of words and sends that information to the motor cortex, which commands the muscle movements required for pronunciation. This is an oversimplification, since all these processes function simultaneously with a variety of other input such as semantic memory. Semantic memory stores definitions and the articulation patterns necessary to pronounce words, including tongue placement and mouth position. (Dubuch, 2002) This explanation does, however, provide us with a basic understanding of how the segmental aspects of speech are produced.
Figure 2: Lateralization of language processing. Retrieved from http://www.frontiersin.org/Journal/10.3389/fnene.2010.00013/full, October 21, 2013

The production of suprasegmentals, which give a language its stress patterns, rhythm, and intonation, requires assistance from a location in the right brain. Its interactive relationship with Broca's area and Wernicke's area has been difficult to describe adequately, but a dual pathway model proposed by researchers Angela Friederici and Kai Alter has gained acclaim. (Friederici & Alter, 2004) (Friederici, 2011) The model has been further substantiated by the findings of Yuri Saito. (Saito, Fukuhara, Aoyama, & Toshima, 2009) The complexity of this interaction is beyond the scope of this paper, but in essence the dual pathway model describes the synergistic production of the segmental and suprasegmental aspects of speech. (See Figure 2)

This additional right brain processing enables us to use speech for emotional expression. Raising voice pitch when surprised or lowering pitch when angry are both functions of right brain language prosody. We use the left brain, where the segmentals of speech are centered, for analysis and the right side of the brain, where suprasegmentals are centered, for the artistic expression of music, art, and dance. This correlates with English being a stress-timed language, quite musical in nature, with intricate patterns of rhythm, stress, and intonation that can be difficult for ELLs to master. (Romer-Trillo, 2012) Engaging the right brain should help learners assimilate the lyrical nature of the language.
In this examination of the literature regarding the intelligibility of non-native speakers of English, we will find indications that when these lyrical features are lacking, comprehensibility is negatively affected. Studies indicate that music and speech are decoded by shared brain processes and that the brain responds to the same psychoacoustic cues, namely loudness, tempo and speech rate, melodic and prosodic contour, spectral centroid, and sharpness. (Meng, 2009) (Courinho, 2013) These are all areas that can be addressed through accent reduction software and that therefore affect the right brain processing of suprasegmentals.

According to The New York View, a publication of Columbia University's Graduate School of Journalism, there is a marked increase in the number of non-native speakers seeking the services of accent modification coaches. (Cheng, 2012) Statistics from training services confirm this report. Sankin Speech Improvement, LLC, recently reported a 35% increase in clients seeking accent reduction services. (Sankin, 2013) A recent report from the US Census Bureau indicates that 21% of the American population speaks a language other than English at home. (Census, 2011) This figure is only one percentage point higher than in 2009, but the trend is still upward, as it has been for several decades. The American Community Survey report summarizes the findings as follows:

This report provides illustrative evidence of the continuing and growing role of non-English languages as part of the national fabric. Fueled by both long-term historic immigration patterns and more recent ones, the language diversity of the country has increased over the past few decades. As the nation continues to be a destination for people from other lands, this pattern of language diversity will also likely continue.
(Ryan, 2013)

Figure 3 US Census Bureau, American Community Survey. Retrieved from http://www.census.gov/prod/2013pubs/acs-22.pdf, October 31, 2013
From the 1995-1996 school year to the 2005-2006 school year, the state of Tennessee saw a 296% increase in ELL students, ranking Tennessee 6th in the nation for increase in ELL student enrollment. (Ariza, Morales-Jones, Yahyan, & Zainuddin, 2010) It would be a mistake to leave the language needs of our increasing population of non-native speakers inadequately served. As this population filters into the educational system and the work force, there is a need to address speech proficiency. The increase in demand for accent reduction services noted previously shows that second language speakers are aware of the need to acquire that proficiency. However, they may not be able to determine what specific factors in their speech are creating difficulty. (Derwing, 2003) The two primary factors needed for speech to be comprehensible are pronunciation and prosody. (Baker, 2007) This corresponds to the production of segmentals and suprasegmentals described earlier.

In a study of pronunciation and intelligibility (Levis & Levelle, 2011), speech recordings of native Spanish and Korean speakers of English were evaluated by pronunciation experts to determine what most impacted intelligibility. Although insufficient pronunciation skills did result in a loss of comprehension, panelists felt that correcting pronunciation would not improve intelligibility until the larger problems of rhythm, tempo, and word stress (misplaced or lacking) were addressed. These are all prosodic features of language.

Munro and Derwing (Munro & Derwing, 2000) studied the effect of accents upon native English listeners (random people, not pronunciation experts) to determine how they perceived differences between accented speech and their own speech, the difficulty they experienced in understanding accented speech, and how much accented speech the hearer actually understood.
The study showed that pronunciation was the least relevant factor in comprehensibility because listeners quickly adjusted to differences in English pronunciation: "This finding demonstrates empirically that the presence of a strong foreign accent does not necessarily result in reduced intelligibility or comprehensibility" (p. 19). A lack of prosody, on the other hand, is troublesome to the listener since "two foreign-accented utterances may both be fully understood (and therefore be perfectly intelligible), but one may require more processing time than another" (p. 19). This slower processing time can be frustrating to the listener and, as a result, be interpreted as a lack of fluency. In the study, listeners rated this type of speech as having lower comprehensibility, even though they could transcribe the speaker perfectly. Generally, this perception is due to the missing suprasegmental elements that give the English language its characteristic qualities of rhythm, stress, and intonation. Without these qualities, speech becomes laborious to decipher.

This is further supported by the research of Shiri Lev-Ari and Boaz Keysar at the University of Chicago, who introduce another effect of accentedness: credibility. (Lev-Ari & Keysar, 2010) They demonstrated that the difficulty listeners have processing accented speech gives them the perception that accented speakers are less truthful. Truthful statements seemed less credible to listeners when they were difficult to understand. This could place accented speakers in awkward positions when compared to their native-speaking counterparts and result in discrimination.
A study of the relationship between prosodic training and intelligibility among German speakers of English reaches a parallel conclusion, that in the ear of the hearer prosody greatly impacts meaning, and recommends "emphasizing that the goal of pronunciation training is to increase intelligibility and improve the effectiveness of communication, [thereby] one opens the door to including training beyond word-level pronunciation to include the prosodic level." (Jackson & O'Brien, 2011)

An interesting study was done on how prosody affects word meaning. Participants listened to phrases that contained novel words pronounced with matched or mismatched prosodic elements. In other words, pitch and stress were either correctly or incorrectly applied. Word meaning was correctly inferred much more frequently when prosody was appropriately matched. These findings suggest that speech contains reliable prosodic markers to word meaning and that listeners use these prosodic cues to differentiate meanings. (Nygaard, Herold, & Namy, 2009) Although this study was not focused on second language learning, the results suggest that an increase in the ability of NSP (non-standard pronunciation) speakers to use suprasegmentals in their speech would lead to an increase in comprehensibility.
Figure 4 Population Distributions in the Workforce: Anglo-American 64%, Hispanic 16%, African-American 12%, Asian 5%, Unspecified 3%. Source: Bureau of Labor Statistics

Figure 5 Distribution of Minorities in Corporate America: Anglo-American 96%, Hispanic 1.2%, African-American 0.8%, Asian 1.8%

In a brief published by the Center for Applied Linguistics that cites the findings of Munro and Derwing, O'Brien and Jackson, and Levis (among others), the recommendation is made that adult ELLs be made aware of the contribution of stress, intonation, and rhythm to comprehensibility and that teachers be encouraged to improve student pronunciation by not focusing on perfect pronunciation alone. The brief suggested that Computer Assisted Pronunciation Training (CAPT) could be utilized to address issues of prosody. (Schaetzel, 2009) These recommendations are steps in the right direction because, "Unfortunately, suprasegmental features such as stress and intonation are often treated by ESL teachers as peripheral frills and not as central to the conveying of meaning." (Avery & Ehrlich, 2006)

Accentedness not only causes difficulty for the hearer; the speaker can be greatly affected as well, especially in professional development. It is interesting to note that the Hispanic (16%) and Asian (5%) populations together make up 21% of the workforce. This is the same figure stated earlier as the percentage of the US population that speaks a language other than English at home, suggesting good representation in the workforce. However, representation within corporate America is unequal: Asian (1.8 percent); Latino (1.2 percent).
(Burns, Barton, & Kerby, 2012)

In the Journal of Cultural Diversity, an article debating the effect of foreign accentedness on upward mobility in the workplace refers to L2 speakers of English as the "invisible minority," stating that they are underrepresented and marginalized. (Akomolafe, 2013) A number of research studies substantiate these descriptions. In one study, 65 job recruiters were asked to rate their perception of the speech of applicants. Those with non-standard grammar and pronunciation were judged more negatively in terms of employability than those who spoke Standard English. (Atkins, 2000) In a study using a matched-guise technique that was developed to reveal people's attitudes, accented speakers were favored for less desirable jobs and speakers of Standard English were favored for more prestigious positions. The accented speakers were viewed as less efficient and their communication skills as less suitable for the more prestigious jobs. (Cargyle, 2000) These perceptions may relegate non-native speakers to lower levels of economic success, even though their skills and abilities qualify them for higher-level positions. Spanish-accented applicants in another matched-guise study were viewed as having a lower chance for promotion to management level and were perceived as less competent, although their qualifications were on par with those of native speakers of English. (Nguyen, 2010) (Jayesh Shah, Raouf Seifeldin, 2010)

Beyond discrimination are other issues created by communication problems in the workplace and in education. Even when non-native speakers are hired, they face difficulties on the job. The unintelligibility of their speech can negatively affect job performance and interaction with others. These types of problems are pertinent in the medical profession, where the number of foreign-born doctors and nurses has increased.
International medical graduates (IMGs) now make up 26% of US physicians. (Jayesh Shah, Raouf Seifeldin, 2010) Many work in rural, low-income areas avoided by native-born graduates. Their accented speech creates communication issues that can lead to misunderstandings and put patients at risk. (Khurana, 2013) Efforts to mitigate communication issues through accent modification courses are proving effective. An analysis of pre- and post-course performance data indicated the efficacy of the training. (Khurana, 2013) This resulted in the following recommendation:

Thus, communication training should be offered in tiers at several different levels in colleges, universities and healthcare institutions. It should be offered at subsidized rates to the students and faculty, with the bulk of cost absorbed by the employer that will benefit from increased employee productivity and patient or student satisfaction. Organizations that do not have on-site training capability may provide it through online training programs. (Khurana, 2013)

The same holds true for the nursing profession, which is experiencing difficulties with foreign-born nursing students whose attrition rates are high, primarily due to accent-related communication problems. In a study of 13 students with NSP, the faculty at the Long Island University School of Nursing experienced difficulty comprehending the students' speech patterns. There was concern that in a clinical setting these students would also encounter communication issues with patients, family members, and staff. A speech pathologist was hired to provide accent modification training. This training proved advantageous to the nursing students in improving their intelligibility. Twelve of the 13 students graduated.
The school felt it had effectively addressed what could have been potentially harmful patient safety issues and was able to greatly improve its retention of NSP nursing students. (Carr, 2012)

Another study of 15 nursing students in the invisible minority reported that the students felt lonely and isolated and were disappointed by the absence of acknowledgment of individuality from teachers, their peers' lack of understanding and knowledge about cultural differences, the lack of support from teachers, and the discrimination they faced. (Gardner, 2005) These factors have a negative effect on self-image that can hinder social integration. So, in addition to employment and educational barriers, socio-cultural barriers can be difficult for NSP speakers to surmount.

The Institute for Research on Public Policy reported on the social integration of foreign immigrants to Canada. (Derwing & Waugh, 2012) Deficiencies in pragmatic language skills were found to keep them from integrating more fully into Canadian society. Since immersion in the society of the second language facilitates fluency, acquiring prosody in English can be hindered by this lack of socialization. It limits the time that NSP speakers spend hearing and imitating native speakers. Left unbroken, this can become a cycle of poor language skills, social exclusion, isolation, poor self-image, and socio-economic suppression. Computer Assisted Pronunciation Training (CAPT) is one option for breaking that cycle.

Computer Assisted Language Learning (CALL) refers to the use of computer technology and software to teach all aspects of language: reading, writing, listening, and speaking. From CALL has grown a subset of technologies to improve speech: Computer Assisted Pronunciation Training (CAPT). Instead of having learners imitate voice recordings on cassette or CD in a language lab, instruction has trended towards CAPT software programs.
Much of the technology for CAPT has been borrowed from speech therapy research and adapted for language learners. Automatic Speech Recognition (ASR) technology has been adapted to map speech patterns for pronunciation comparison. (Qooco, 2009) The learner's pronunciation of a given text is analyzed against an accepted speech model and rated for accuracy. Some CAPT programs combine ASR with speech waveforms. Prosody, rate of speech, and loudness can be read from a waveform. (McGregor, 2002) Actual phonemes cannot be read from a waveform unless its frequency components are analyzed and displayed as a spectrogram. (Carmell, 2013) Reading a spectrogram accurately requires training; for this reason, some linguists have argued against such displays, contending that they are presented because of their flashy look, to impress the users. (Neri, Cucchiarini, Strik, & Boves, 2009) However, it does give the learner a general visual
comparison. The waveform of the learner's voice can be examined for conformity to that of the model voice. Since the addition of visual display has been shown to increase error recognition, waveforms can be a good learning aid, as noted in a study of the Kay-Pentax Computerized Speech Laboratory. Learners were able to use visual feedback from spectrograms to recognize gaps in their language production that they had not noticed with imitation exercises alone. (Pearson, 2011) A combination of aural and visual modalities increases effectiveness in speech production. (Dominic Massaro, Michael Cohen, Antoinette Gesi, 1993) This correlates with the previously mentioned stimulation of the right brain during the production of suprasegmentals and with the well-grounded pedagogical strategy of presenting instruction using a variety of modalities. In addition to aural and visual stimulation, by its very nature CAPT also engages the learner kinesthetically. CAPT software offering both aural and visual feedback is referred to as a Dual-Mode program.

Figure 6 Waveform and spectrogram of the same word, "compute." Retrieved from http://www.cslu.ogi.edu/tutordemos/SpectrogramReading/waveform.html

The limitation of ASR programs is that matching voice patterns alone provides insufficient feedback to learners regarding how to improve their accuracy. (Hismanoglu, 2011) Researchers are still at work to remedy this, but much has already been improved in the software available today. The trend is for learners to move from a passive role of imitation to producing authentic speech. This is closer to the natural way that language is acquired. (Eskenazi & Hansma, 1998) In 1995, Auralog introduced Talk To Me software, the first program able to process complete sentences. (LanguageOnLine, 2001) It mimicked authentic speech by using narrow parameters of response.
Responses were elicited from the learner that fell within the parameters that the system recognized. In this way, learners felt as if they were creating original utterances. (Eskenazi & Hansma, 1998) Some linguists criticized this level of feedback, saying, "Using artificially generated sentences does not necessarily put learners on the path to communicative ability with natural speech." (Godwin-Jones, 2009) This problem is being addressed through speech-interactive micro world technology, in which the learner enters a virtual world and authors new scenarios. (Holland, Kaplan, & Sabol, 1999)

One application of virtual world language learning is seen in a program entitled Virtual Pre-K (VPK) that interacts with parents and students. (Cummins, 2007) It boasts a variety of bilingual activities (English/Spanish) on a website parents can easily navigate, along with hands-on materials such as flashcards and CDs used in conjunction with the website. The program has been a great success: "They are coming back to school and sharing their enthusiasm, and they're doing it in both languages ... The parents are so eager for knowledge they can use to help their children." (p. 262)

The next step was a result of the FLUENCY project at Carnegie Mellon University, developed by Maxine Eskenazi using a SPHINX II ASR system. It allowed "automatic alignment of the predicted text with the incoming speech signal" (Eskenazi & Hansma, 1998) for pronunciation error analysis and correction. The system showed improvement in the recognition of speech prosody errors and an improved user interface. It also allowed users to select a "golden voice" to imitate. The idea was that learners could choose a voice closer to their own as a model. Males could select a voice with a lower F0 (pitch) and females a voice with a higher F0.
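The idea of choosing a model voice close to the learner's own pitch can be sketched in a few lines. The sketch below is not the FLUENCY project's method, which the sources reviewed here do not describe at this level of detail. It assumes a common textbook technique (autocorrelation) for estimating F0, a synthetic 125 Hz tone in place of a learner recording, and a hypothetical database of model voices with illustrative mean F0 values.

```python
import math

SAMPLE_RATE = 8000  # samples per second (an assumption for this sketch)


def estimate_f0(frame, f0_min=60.0, f0_max=400.0):
    """Estimate the pitch (F0, in Hz) of a voiced frame by autocorrelation."""
    lag_min = int(SAMPLE_RATE / f0_max)  # shortest pitch period considered
    lag_max = int(SAMPLE_RATE / f0_min)  # longest pitch period considered

    def autocorr(lag):
        # Similarity between the frame and a copy of itself shifted by `lag`.
        return sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))

    # The lag at which the frame best matches itself approximates one period.
    best_lag = max(range(lag_min, lag_max + 1), key=autocorr)
    return SAMPLE_RATE / best_lag


# Hypothetical model voices: name -> mean F0 in Hz (illustrative values only).
MODEL_VOICES = {
    "voice_a": 110.0,
    "voice_b": 135.0,
    "voice_c": 210.0,
    "voice_d": 240.0,
}


def closest_model_voice(learner_f0, voices=MODEL_VOICES):
    """Return the model voice whose mean F0 is nearest to the learner's F0."""
    return min(voices, key=lambda name: abs(voices[name] - learner_f0))


# Stand-in for a learner recording: 100 ms of a 125 Hz tone.
learner_frame = [math.sin(2 * math.pi * 125 * n / SAMPLE_RATE)
                 for n in range(800)]
learner_f0 = estimate_f0(learner_frame)
print(learner_f0, closest_model_voice(learner_f0))
```

Real speech is far less regular than a pure tone, so a practical system would need voicing detection and smoothing across frames. Comparing pitch in raw hertz is also a simplification made for brevity.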
In 2002, Katherine Probst, Yan Ke, and Maxine Eskenazi built upon the FLUENCY project to further develop the concept of a golden voice. They formed two hypotheses: first, that imitating a native speaker with voice features similar to the learner's would increase the learner's speech development; second, that allowing learners to choose the voice that was most intelligible to them would also lead to better speech development. In addition to gender selection, with this program the user could select voices with comparable pitch (F0) and rate of articulation, which is the speed at which the articulators (the lips, tongue, and other muscles) move. (Probst, Ke, & Eskenazi, 2002)

The second hypothesis proved to be erroneous. Learners made better progress when the system matched them to the most similar voice in the database than when they selected their own preference. This shows that the voice that appeals to a learner may not be the best choice for modeling. (Probst et al., 2002) Surprisingly, however, choosing a voice of the opposite gender showed a higher percentage of improvement than selecting the same gender. This was thought to be related to voice intelligibility. The first hypothesis proved true. Learners modeling a voice matched by the computer for similarity improved the most, showing a 43.3% improvement in pronunciation and prosody. A suitable F0 match proved to be the most important factor in improvement, at 47.2%. It should be noted that F0 is a suprasegmental characteristic involving right brain speech production.

The search for the best golden voice has continued. In 2006, Kwansun Cho and John G. Harris modified the voices of 10 Korean speakers of English using time warp and waveform overlap to morph the Korean voices with the voices of 10 American English speakers.
The resulting voices were then judged for accentedness and found to be 8.59 percent less accented than the original voices. (Cho & Harris, 2006)

In 2009, the Euronounce Project combined Pitch Line software with the AzAR tutoring system to integrate improved segmental and suprasegmental characteristics into a database of German speakers of Polish. (Demenko, Wagner, Cylwik, & Jokisch, 2009) The pitch contour of the learner's speech was displayed in waveform alongside a model teacher's voice. Users made graphical comparisons with the AzAR articulation diagram. The model voice resulting from the addition of Pitch Line was judged to be "very natural" (p. 4); however, the study concluded that, "Although they are promising, further experiments are indispensable to improve the obtained acoustic models especially for accented syllables." (p. 7) Of the 15 test subjects, 13 considered the new AzAR suitable for individual study and 2 were willing to use the program with teacher assistance.

Figure 7 AzAR template for pronunciation assessment

An exciting new concept towards the golden voice was introduced in 2007. Since the technology now existed to modify a speaker's voice, why not modify the learner's own voice and let it become the closest acoustical model possible?

Here we propose a voice transformation technique that can be used to generate the (arguably) ideal voice to imitate: the own voice of the learner with a native accent. Our work extends previous research, which suggests that providing learners with prosodically corrected versions of their utterances can be a suitable form of feedback in computer assisted pronunciation training. Our technique provides a conversion of both prosodic and segmental characteristics by means of a pitch-synchronous decomposition of speech into glottal excitation and spectral envelope.
We apply the technique to a corpus containing parallel recordings of foreign-accented and native-accented utterances, and validate the resulting accent conversions through a series of perceptual experiments. Our results indicate that the technique can reduce foreign accentedness without significantly altering the voice quality properties of the foreign speaker. (Felps, Bortfeld, & Gutierrez-Osuna, 2009)

A uniquely individual approach had appeared, and the results indicated that after this morphing technique was applied, the perception of accentedness in the learner's voice was greatly reduced. Still, there were problems integrating the application for the purpose of pronunciation learning. To some degree, the inevitable segmental errors of the learner were transferred to the modified voice. Not until 2007 was a method introduced that overcame this problem. Ruili Wang and Jingli Lu developed a system that morphed the features of the learner's voice with the teacher's voice in a way that eliminated learner pronunciation errors while retaining the voice qualities of the non-native speaker. "Because our voice modification is based on a teacher's voice, the resynthesized utterances can be free from segmental error." (Wang & Lu, 2011) The dreamed-of golden voice had become reality.

The patented process is now part of SpeedLingua software, which also incorporates the Tomatis Method (TOMATIS-Developpement, 2009). The learner engages in a receptive listening activity for 15 minutes prior to engaging in language learning activities. During this pre-exercise time, the learner hears music that is gradually filtered from the sound frequencies of their native language to those of the target language. This tunes the ear to hear the dominant rhythm and musical intonation of the language being practiced.
SpeedLingua is the only software available that preconditions the right side of the brain for language learning and then morphs the user's own voice so that learners can hear themselves speak the language as if they were native speakers while performing the learning exercises.

While the technological development of CAPT systems is intriguing, how does that development figure into the pedagogical aspects of language instruction? Can CAPT be truly effective and benefit both students and teachers? First, let's look at research regarding the efficacy of several CAPT educational software programs.

In 2003, a small study was conducted with a control group and an experimental group consisting of 9 middle-aged engineers with multiple language backgrounds: Arabic, Farsi (2), Hungarian, Polish (2), Romanian, Russian, and Somali. The project used Auralog's Talk to Me software. Intelligibility improved only in learners who entered the study with strong accentedness. Those with higher-level pronunciation and prosody skills showed no significant change. Considering that the subjects practiced with the software for a total of only 12.5 hours, this is still impressive, especially when one considers that these were adult learners and that, according to Lenneberg's Critical Period hypothesis, age can hinder pronunciation skills when acquiring a new language. (Lenneberg, 1967) It could be that those with higher-level skills had already reached their maximum potential for improvement before using the software, or they may have needed more practice hours to show any significant improvement.

Pronunciation Power software was used in a study of university students in Ankara, Turkey. (Seferoglu, 2005)
This study aimed to find out whether integrating accent reduction software in advanced English language classes at the university level would result in improvements in students' pronunciation at the segmental and suprasegmental levels. The difference between the experimental group's pre- and posttest scores was also found to be statistically significant. (pp. 303, 313)

A study of school children used PARLING software, which was developed for Italian students learning English. Through stories and games, the children learn to pronounce new words. The aim of the study was to determine whether learners using the software improved their pronunciation as effectively as those receiving traditional classroom instruction. The control group and the experimental group showed comparable improvement, indicating that practice time with the software was at least as effective as teacher-led instruction (Neri, Gerosa, Giuliani, & Mich, 2008).

Improvement in the application of technology to CAPT software can be seen in a recent study with three groups of EFL students aged 22-28: a control group, a group using Pronunciation Power (reviewed above), and a third group using NeoSpeech. NeoSpeech uses text-to-speech technology that allows the user to input any text and hear it spoken by a synthesized voice. The group using this technology scored higher on the posttest than either the control group or the group using Pronunciation Power (Kilickaya, 2008). The use of authentic language appears to have enhanced the learning of pronunciation skills.

Other improvements for CAPT are still under development and look promising, such as high variability phonetic training (HVPT), which exposes learners to multiple voices producing target sounds, instead of a single voice, to improve the learner's listening skills. Listening is another area of second language acquisition that is difficult to master and requires much exposure to the new language.
CAPT software with HVPT could facilitate listening skills, which, it is hoped, would in turn facilitate improvement in pronunciation. "Theoretically, having a more native-like perceptual system should promote gains in pronunciation accuracy" (Thomson, 2011). In the HVPT pilot study (Thomson, 2011), users were able to take the listening skills they acquired and discern meaning from the voices of speakers to whom they had not been previously exposed. This transfer of skills would be of great benefit to ELLs in real-life conversations.

The CAPT program developed for this study resulted in improved intelligibility scores not only in response to English vowel productions elicited using a voice that had previously been heard in training, but also in response to productions elicited using a novel voice. These results suggest that the program helped learners isolate relevant phonetic cues to vowel identity that were then generalizable to new speakers. (p. 758)
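
The high-variability design described above can be illustrated with a short sketch. The Python example below is only an assumption about how such a practice schedule might be organized; it is not the procedure used in Thomson's (2011) program. The talker labels, the target words, and the function name build_hvpt_trials are all hypothetical. The sketch shows the two ideas the passage relies on: every target is heard from several voices during training, and one voice is held back so that generalization to a novel speaker can be checked.

```python
import random
from itertools import product


def build_hvpt_trials(talkers, targets, held_out, repetitions=2, seed=0):
    """Return (training, test) trial lists for a high-variability design.

    Hypothetical helper: training trials pair every target with every
    talker except `held_out`; test trials use only the held-out talker.
    """
    if held_out not in talkers:
        raise ValueError("held_out must be one of the talkers")

    train_talkers = [t for t in talkers if t != held_out]
    rng = random.Random(seed)  # fixed seed keeps the schedule reproducible

    # Each target is presented by each training talker `repetitions` times.
    training = [
        {"talker": talker, "target": target}
        for talker, target in product(train_talkers, targets)
        for _ in range(repetitions)
    ]
    rng.shuffle(training)  # mix voices so no single talker dominates a block

    # The novel voice is heard only at test time.
    test = [{"talker": held_out, "target": target} for target in targets]
    return training, test


# Illustrative labels only; real material would be recorded sound files.
talkers = ["talker_a", "talker_b", "talker_c", "talker_d"]
targets = ["beat", "bit", "bait", "bet"]
training, test = build_hvpt_trials(talkers, targets, held_out="talker_d")
print(len(training), len(test))
```

With four talkers, four target words, and two repetitions, the schedule contains 24 training trials and 4 test trials, and every test trial uses the voice the learner has not heard before.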
Looking at the conclusions of all the studies referred to previously, several themes concerning the benefits and limitations of CAPT appear. CAPT is viewed as an adjunct to the teacher, not a replacement. CAPT's ability to provide drill practice is seen as an asset that frees teacher time to focus on specific pronunciation problems of individual students that CAPT does not address. As students develop less dependence on the teacher for pronunciation practice, more classroom time can be spent on interactive communication that develops conversational skills.

One teacher commented on what she liked most about CAPT: "it takes the personal element out of the feedback. Instead of telling students no and making them repeat over and over again, we were instead able to give them a positive goal to work towards" (Pearson, 2011). Students benefited from the computer's consistent feedback and endless patience. The ability of learners to work at their own pace, select from a variety of function options, and choose activities that tailor a program to their individual needs was another frequently mentioned asset. CAPT's flexibility allowed learners to make adjustments that matched their own learning speed:

In our experiments, we also noticed that some of the subjects, who preferred a slow version of speech material, tended to speed up the speech material a little or switch it back to the normal speed, when they had caught the pronunciation features in these utterances. This tendency reflects the fact that their objectives of second language learning are to perceive and produce natural speech with a regular speed. (Wang & Lu, 2011)
A stress-free environment was another area often commented upon. Since CAPT provides private instruction, correction in front of peers can be avoided. This reduces stress for students by eliminating the fear of error, which can inhibit them from taking the language risks necessary for learning. Teachers who feel unequipped to teach pronunciation also appreciate assistance from CAPT, so their stress level is reduced as well. CAPT can be especially helpful in EFL instruction to make up for the lack of exposure to a native English-speaking environment.

CAPT does have limitations. These are being overcome so steadily that only the comments from the three most recent studies examined in this review will be considered. All of them were concerned about user-friendly feedback. Except in a general manner, it is difficult for students to take the raw data provided by waveforms and spectrograms and turn it into meaningful information useful for making the needed changes in pronunciation or prosody: "this type of feedback is not in line with the requirement that feedback should first of all be easy to comprehend" (Neri et al., 2009). Hismanoglu echoes this concern and felt CAPT could be used more effectively by following his recommendation: "Teachers should be able to comprehend spectrograms, waveforms, and fundamental frequency contours to analyze students' articulations of target language words" (Hismanoglu, 2011). In addition, Demenko et al. (2009) list a number of technical difficulties: "weak speech signals, no extrapolation for voiceless sounds, not entirely correct/reliable F0 extraction." Hopefully, the technological issues will be remedied as advancements continue to be made.

While the prospect of making CAPT even more effective than it has already proven to be is attractive, training teachers in analysis is probably neither cost- nor time-efficient.
Budget constraints would make the cost of such training hard to cover. The chore of analyzing the database of each student's practice record seems overly burdensome given the already demanding schedules that teachers face. It would seem better to subcontract this task to linguistic experts who are well versed in reading waveforms and spectrograms and to rely on their recommendations to the teacher, but that creates another financial hurdle. Considering the fast pace of advancements in CALL technology over the last decade, improvement in feedback may be just around the corner and worth the wait.

The future of CAPT software appears limitless as researchers look at developing CAPT applications that will reflect "... a current understanding of how L2 pronunciation develops. In particular, it [this study] attempted to address constraints stemming from interactions between L1 and L2 categories, while also increasing the quantity and quality of phonetic experience beyond what is typically available to adult learners" (Thomson, 2011). Thomson also looked at the accessibility that web-based programs would offer and the learning potential that would be possible through wireless mobile devices. This would offer new avenues for research through remote monitoring as well. "The potential for CAPT to incorporate innovative research-based techniques is enormous, and still in its infancy" (p. 759). Researchers and teachers could collaborate on designing software programs by implementing "platforms that many ESL teachers and computer lab instructors already use. For example, the functionality needed for listening to sound files and selecting from among response alternatives is available in many popular Learning Management Systems, such as Moodle or Sakai" (p. 760). Most of the software researchers are using to develop CAPT programs is available as freeware.
With a little effort, teachers have the potential to create their own software programs tailor-made to the needs of their students. Some CAPT software even offers a teacher authoring feature within the program so that teachers can add their own lesson materials.

The significance of stimulating right-brain thinking for the processing of suprasegmentals, and the ability of CAPT to facilitate that process through aural, visual, and kinesthetic modalities, have been made evident in this review. The importance of prosody in making language intelligible to the listener, and research regarding the efficacy of CAPT for improving both the pronunciation of the segmental aspects of speech and the production of the suprasegmental elements of speech, have been delineated. The service that accent reduction training can render in opening better job opportunities and career advancement for non-native speakers has been identified. Hopefully, accent reduction training will help the business world embrace the diversity that people from a variety of cultures can offer. The use of CAPT as a tool to bridge the gap between students' needs and the limitations of a traditional classroom setting has been clarified. To assist administrators and teachers in selecting appropriate CAPT software, an evaluation rubric of currently available CAPT software is provided in Appendix 1.
Bibliography
Ariza, E. N., Morales-Jones, C. A., Yahya, N., & Zainuddin, H. (2000). Why TESOL?: Theories & Issues in Teaching English to Speakers of Other Languages in K-12 Classrooms. Dubuque, IA: Kendall Hunt.
Avery, P., & Ehrlich, S. (2006). Teaching American English Pronunciation. Oxford: Oxford University Press.
Akomolafe, S. (2013). FOREIGN-ACCENTED SPEAKERS AND. Journal of Cultural Diversity, 20(1), 49.
Atkins, C. P. (2000). Do Employment Recruiters Discriminate on the Basis of Nonstandard Dialect? Journal of Employment Counseling, 30(September 1993), 108–119.
Bruno, D. (2002). THE BRAIN FROM TOP TO BOTTOM history. Language Processing Areas in the Brain. Retrieved from http://thebrain.mcgill.ca/flash/d/d_10/d_10_cr/d_10_cr_lan/d_10_cr_lan.html
Burns, C., Barton, K., & Kerby, S. (2012). The State of Diversity in Today's Workforce (pp. 1–7). Washington, D.C. Retrieved from http://www.americanprogress.org/wp-content/uploads/issues/2012/07/pdf/diversity_brief.pdf
Cargyle, A. (2000). Evaluations of Employment Suitability. Journal of Employment Counseling, 37, 165–177.
Carr, S. (2012). IMPROVING COMMUNICATION THROUGH ACCENT MODIFICATION: Journal of Cultural Diversity, 19(3).
Census. (2011). Language Other Than English Spoken At Home. American FactFinder. Retrieved from http://factfinder2.census.gov/faces/nav/jsf/pages/searchresults.xhtml?refresh=t
Cheng, H. (2012). Accent Reduction Classes Now In Demand. The New York View. Retrieved October 26, 2013, from http://newyorkview.net/2012/08/accent-reduction-classes-now-in-demand/
Cho, K., & Harris, J. G. (2006). Towards an Automatic Foreign Accent Reduction Tool. In Speech Prosody (pp. 2–5).
Courinho, E. (2013). Psychoacoustic cues to emotion in speech prosody and music. Cognition and Emotion, 27(4), 658–684.
Cummins, J. (2007). Literacy, Technology, and Diversity. Boston: Pearson Education, Inc.
Demenko, G., Wagner, A., Cylwik, N., & Jokisch, O. (2009). An Audiovisual Feedback System for Acquiring L2 Pronunciation and L2 Prosody. In 2nd ISCA Workshop on Speech and Language Technology in Education (SLaTE) (Vol. 2, pp. 2–5).
Derwing, T. M. (2003). ELL perception of their accent. The Canadian Modern Language Review, 59(4), 21.
Derwing, T. M., & Waugh, E. (2012). IRPP Study (p. 36). Quebec.
Dominic Massaro, Michael Cohen, Antoinette Gesi, R. H. (1993). Bimodal Speech Perception: An Examination across Languages. Journal of Phonetics, 21, 445–478.
Eskenazi, M., & Hansma, S. (1998). The fluency pronunciation trainer. In Proceedings of the STiLL Workshop (p. 6). Pittsburgh: Language Technology Institute, Carnegie Mellon University. Retrieved from http://www.cs.cmu.edu/~max/mainpage_files/Esk-Hans-98.pdf
Felps, D., Bortfeld, H., & Gutierrez-Osuna, R. (2009). Foreign accent conversion in computer assisted pronunciation training. Speech Communication, 51(10), 920–932. doi:10.1016/j.specom.2008.11.004
Friederici, A. D. (2011). The brain basis of language processing: from structure to function. Physiological Reviews, 91(4), 1357–92. doi:10.1152/physrev.00006.2011
Friederici, A. D., & Alter, K. (2004). Lateralization of auditory language functions: a dynamic dual pathway model. Brain and Language, 89(2), 267–276. doi:10.1016/S0093-934X(03)00351-1
Gardner, J. (2005). Barriers influencing the success of racial and ethnic minority students in nursing programs. Journal of Transcultural Nursing, 16(2), 155–62. doi:10.1177/1043659604273546
Godwin-Jones, R. (2009). Emerging Technologies: Speech Tools and Technologies. Language Learning and Technology, 13(3), 4–11.
Hismanoglu, M. (2011). Computer Assisted Pronunciation Teaching: From Past to Present. In 4th International Online Language Conference (pp. 193–203).
Holland, V. M., Kaplan, J. D., & Sabol, M. A. (1999). Preliminary Tests of Language Learning in a Speech-Interactive Graphics Microworld. In Calico (Vol. 16, pp. 339–360). Miami: Calico Journal.
Jackson, C. N., & O'Brien, M. G. (2011). The interaction between prosody and meaning in second language speech production. Die Unterrichtspraxis/Teaching German, 44(1), 1–11. doi:10.1111/j.1756-1221.2011.00087.x
Jayesh Shah, Raouf Seifeldin, H. A. (2010). International Medical Graduates in American Medicine: Contemporary challenges and opportunities. American Medical Association. Retrieved from http://www.ama-assn.org/ama1/pub/upload/mm/18/img-workforce-paper.pdf
Khurana, P. (2013). Efficacy of Accent Modification Training for International Medical Professionals. Journal of University Teaching & Learning Practice, 10(2), 13.
Kilickaya, F. (2008). Improving Pronunciation via Accent Reduction and Text-to-speech Software. In World CALL International Conference (pp. 3–5).
LanguageOnLine. (2001). The History of TeLL Me More. Innovation for Language Learning. Retrieved from http://www.languageonline.in.th/history_en.htm
Lenneberg, E. (1967). Biological Foundations of Language. New York: John Wiley and Sons.
Lev-Ari, S., & Keysar, B. (2010). Why don't we believe non-native speakers? The influence of accent on credibility. Journal of Experimental Social Psychology, 46(6), 1093–1096. doi:10.1016/j.jesp.2010.05.025
Levis, J., & Levelle, K. (2011). Pronunciation and Intelligibility: Issues in Research and Practice. In Proceedings of the 2nd Annual Pronunciation in Second Language Learning and Teaching Conference (pp. 56–69). Iowa State University.
McGregor, A. (2002). Pronunciation Software Review.
Meng, H. (2009). Developing Speech Recognition and Synthesis Technologies to Support Computer-Aided Pronunciation Training for Chinese Learners of English. In 23rd Pacific Asia Conference on Language (pp. 40–42).
Munro, M. J., & Derwing, T. M. (2000). Foreign Accent, Comprehensibility, and Intelligibility in the Speech of Second Language Learners.
Neri, A., Cucchiarini, C., Strik, H., & Boves, L. (2009). The pedagogy-technology interface in Computer Assisted Pronunciation Training. In Computer Assisted Language Learning: Critical Concepts in Linguistics (Vol. IIV, pp. 140–164). doi:10.1076/call.15.5.441.13473
Neri, A., Gerosa, M., Giuliani, D., & Mich, O. (2008). The effectiveness of computer assisted pronunciation training for foreign language learning by children. Computer Assisted Language Learning. doi:10.1080/09588220802447651
Nguyen, L. T. (2010). Employment decisions as a function of an applicant. San Jose State University.
Nygaard, L. C., Herold, D. S., & Namy, L. L. (2009). The semantics of prosody: acoustic and perceptual evidence of prosodic correlates to word meaning. Cognitive Science, 33(1), 127–46. doi:10.1111/j.1551-6709.2008.01007.x
Pearson, P. (2011). Pronunciation and Intelligibility: Issues in Research and Practice. In Proceedings of the 2nd Annual Pronunciation in Second Language Learning and Teaching Conference (p. 169). Iowa State University.
Probst, K., Ke, Y., & Eskenazi, M. (2002). Enhancing foreign language tutors: In search of the golden speaker. Speech Communication, 37(3-4), 161–173. doi:10.1016/S0167-6393(01)00009-7
Qooco. (2009). About ASR. Qooco Chinese Learning. Retrieved February 11, 2013, from http://www.qoocochinese.com/web/help_4.htm
Romero-Trillo, J. (2012). Pragmatics and Prosody in English Language Teaching. Educational Linguistics, 15, 2314.
Ryan, C. (2013). Language Use in the United States: 2011 (p. 16). Washington, D.C. Retrieved from http://www.mla.org/map_main
Saito, Y., Fukuhara, R., Aoyama, S., & Toshima, T. (2009). Frontal brain activation in premature infants' response to auditory stimuli in neonatal intensive care unit. Early Human Development, 85(7), 471–4. doi:10.1016/j.earlhumdev.2009.04.004
Sankin, S. (2013). Accent Reduction Training Demand Is Increasing Sankinspeechimprovement. PRWeb. Retrieved October 26, 2013, from http://www.prweb.com/releases/accent-reduction-nyc/regional-accents/prweb10689963.htm
Schaetzel, K. (2009). Teaching Pronunciation to Adult English Language Learners. CAELA Network Brief. Retrieved from www.cal.org/caelanetwork
Seferoglu, G. (2005). Improving students' pronunciation through accent reduction software. British Journal of Educational Technology, 36(2), 303–316.
Thomson, R. (2011). Computer Assisted Pronunciation Training: Targeting Second Language Vowel Perception Improves Pronunciation. CALICO Journal, 28(3), 744–766.
TOMATIS-Developpement. (2009). The TOMATIS Method, a teaching process for listening (p. 14). Luxembourg: Tomatis Developpement S.A.
Wang, R., & Lu, J. (2011). Investigation of golden speakers for second language learners from imitation preference perspective by voice modification. Speech Communication, 53(2), 175–184. doi:10.1016/j.specom.2010.08.015
Wendy, B. (2007). Learning prosody and fluency characteristics of second language speech: The effect of experience on child learners' acquisition of five suprasegmentals. Applied Psycholinguistics. Retrieved from http://0-search.proquest.com.library.acaweb.org/docview/200859527/fulltextPDF/1414C9067752B26F3C2/3?accountid=9900
Logo Illustrations
Talk to Me. Copyright 2013, Informer Technologies, Inc. Retrieved November 3, 2013, from http://softadvice.informer.com/Talk_To_Me_Auralog.html
Pronunciation Power CD-ROM for Mac and Windows. Copyright 1995-2013 ESL.net. Retrieved September 15, 2013, from http://www.esl.net/pronunciation_power.html
NeoSpeech. www.neospeech.com 2013. All rights reserved. Retrieved November 3, 2013, from http://www.neospeech.com/
SpeedLingua. Copyright 2010. Retrieved November 13, 2013, from http://www.learnissimo.com