Practicing Pronunciation: Will Voice XML do for language learners what HTML did for collaborators?


  • Kenneth J. Luterbach East Carolina University
  • Diane Rodriguez East Carolina University



Keywords: practicing pronunciation, voice synthesis, voice recognition


This paper considers the utility of the Voice Extensible Markup Language (Voice XML) for language learning and, in particular, whether Voice XML might become as popular as HTML. First, the paper discusses the surprising popularity of HTML, which provides context for considering the potential of Voice XML. Second, it presents two voice scripts that demonstrate Voice XML tags and features. The first script uses voice synthesis only, whereas the second uses both voice synthesis and voice recognition. To give insight into the utility of Voice XML for instructional applications, the second script can be accessed by language learners who wish to practice pronouncing words in English. Technically, each voice script is a text file containing Voice XML tags. Once the file is stored on a web server and a telephone number is linked to it, a language learner can use a telephone to practice pronouncing words. Those implementation details are considered in the third section, which identifies one particular system that permits developers to test and deploy Voice XML scripts free of charge. Lastly, the paper concludes with a discussion of issues concerning the utility of Voice XML relative to HTML.
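
For illustration only, a voice script that combines voice synthesis and voice recognition might look roughly like the sketch below. It assumes VoiceXML 2.0 with an inline SRGS grammar and is not the authors' script: the form id, the field name, the prompt wording, and the target word "apple" are all hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical pronunciation-practice script; not taken from the paper. -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form id="practice">
    <field name="word">
      <!-- Voice synthesis: the platform reads this prompt aloud. -->
      <prompt>Please say the word: apple.</prompt>

      <!-- Voice recognition: only the target word is accepted. -->
      <grammar type="application/srgs+xml" version="1.0" root="target">
        <rule id="target">
          <one-of>
            <item>apple</item>
          </one-of>
        </rule>
      </grammar>

      <!-- The caller said something the grammar did not match. -->
      <nomatch>
        I did not recognize that. Please try again.
        <reprompt/>
      </nomatch>

      <!-- The caller said nothing before the timeout. -->
      <noinput>
        I did not hear anything.
        <reprompt/>
      </noinput>

      <!-- The utterance matched the grammar; confirm it to the learner. -->
      <filled>
        <prompt>Well done. You said <value expr="word"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```

A synthesis-only script in the same style would keep the prompt and drop the field, grammar, and event handlers. How closely a given platform's recognizer tracks a learner's pronunciation is outside what a sketch like this can show.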



Author Biographies

Kenneth J. Luterbach, East Carolina University

College of Education

Diane Rodriguez, East Carolina University

College of Education








Research papers