Content Adaptation for Language Learning: A Hybrid AI Approach

Jatin Arora

New Zealand

Victoria University of Wellington

Irina Elgort

https://orcid.org/0000-0002-4568-9951

New Zealand

Victoria University of Wellington

Junhong Zhao

https://orcid.org/0000-0001-7031-3828

New Zealand

Victoria University of Wellington

Accepted: 12/23/2025

Published: 12/26/2025

DOI: https://doi.org/10.4995/eurocall.2025.23898

Keywords:

Large Language Model, Artificial Intelligence, comprehensible input, simplification, second language learning

Supporting agencies:

This research was not funded

Abstract:

In learning a foreign language, access to comprehensible input is a critical success factor. However, at early stages, when learners are still below an intermediate-proficiency level, finding level-appropriate and engaging materials is highly problematic. Although the Internet abounds in text and multimedia materials in many languages, most of them are too difficult to be useful for lower-proficiency language learners. The present project aimed to establish whether the affordances of large language models (LLMs) can be harnessed to turn authentic audio, video, and text materials into comprehensible input for independent elementary-level language learners. The present article reports on the outcomes of a research and development project that adopts a hybrid approach to simplifying authentic materials, combining affordances of LLMs with careful prompt engineering and rule-based refinement. The article details the hybrid sequential pipeline system and the results of two rounds of evaluation: language teacher ratings and automated text analysis indices. Based on the outcome of these evaluations, it is concluded that the proposed approach can provide an efficient way of simplifying authentic content for and by lower-proficiency language learners. Directions for future research and development are also proposed.

