GenAI Models as Keyword Rankers: A Learner-centred Case Study for L2 Spanish
Submitted: 05/13/2025 | Accepted: 09/16/2025 | Published: 12/26/2025
Copyright (c) 2025 Jasper Degraeuwe

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Keywords:
Generative artificial intelligence, L2 Spanish, vocabulary learning, word lists
Abstract:
Frequency-based word lists form an important part of general-purpose vocabulary learning courses aimed at beginner and (lower-)intermediate learners of a foreign/second language (L2). For advanced learners and/or specific purposes, however, relying exclusively on these general word lists is unlikely to yield an adequate selection of vocabulary. As research in this latter area remains scarce (especially for languages other than English), the present study aims to fill (part of) the gap by investigating the use of Generative Artificial Intelligence (GenAI) models to automatically rank vocabulary items by how typical they are of a given topic, focusing on Spanish as the target language. I compile a dataset containing four domain-specific subsets of 200 vocabulary items (for the topics economics, health, law, and migration) and analyse how well GenAI-based rankings of these vocabulary items (obtained through zero-shot prompting) correlate with gold standard human rankings (provided by L2 learners). As the evaluation baseline, I use the rankings obtained with the Kullback-Leibler divergence (i.e., a statistical keyness measure based on word frequencies). With a top average Spearman’s ρ and Kendall’s weighted τ of 0.73, this first-of-its-kind study demonstrates that the tested GenAI models (Gemma, Llama, and Mistral) outperform the baseline by a large margin, showing great potential for use in the real-life creation of domain-specific vocabulary lists for L2 learning purposes.
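The evaluation described in the abstract (a frequency-based keyness baseline compared with human rankings through a rank correlation) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the study: the function names, the per-word Kullback-Leibler contribution used as the keyness score, the add-one smoothing, and the invented toy counts and human ranking.

```python
import math


def kl_keyness(target_freq, reference_freq):
    """Per-word contribution to KL(target || reference).

    Assumption: add-one smoothing so that unseen words do not
    produce a division by zero; the study may smooth differently.
    """
    vocab = set(target_freq) | set(reference_freq)
    t_total = sum(target_freq.values()) + len(vocab)
    r_total = sum(reference_freq.values()) + len(vocab)
    scores = {}
    for word in vocab:
        p = (target_freq.get(word, 0) + 1) / t_total
        q = (reference_freq.get(word, 0) + 1) / r_total
        scores[word] = p * math.log2(p / q)
    return scores


def to_ranks(scores):
    """Map each item to its rank (1 = highest score); assumes no tied scores."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {word: i + 1 for i, word in enumerate(ordered)}


def spearman_rho(ranks_a, ranks_b):
    """Spearman's rho for two tie-free rankings of the same items."""
    n = len(ranks_a)
    d2 = sum((ranks_a[w] - ranks_b[w]) ** 2 for w in ranks_a)
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


# Invented toy counts: a small "health" corpus versus a general corpus.
health_corpus = {"vacuna": 40, "paciente": 30, "casa": 20, "tiempo": 10}
general_corpus = {"vacuna": 2, "paciente": 5, "casa": 50, "tiempo": 43}

baseline_ranks = to_ranks(kl_keyness(health_corpus, general_corpus))

# Hypothetical human ranking of the same four items (1 = most typical).
human_ranks = {"paciente": 1, "vacuna": 2, "tiempo": 3, "casa": 4}

rho = spearman_rho(baseline_ranks, human_ranks)
print(baseline_ranks, round(rho, 2))
```

For real data with ties, or for the weighted variant of Kendall's τ, library routines such as `scipy.stats.spearmanr` and `scipy.stats.weightedtau` are likely a better choice than the hand-written formula above.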