Thematic vocabulary selection for didactic purposes: evaluation of a quantitative approach

Jasper Degraeuwe, Patrick Goethals


This study evaluates a quantitative approach to the thematic selection of vocabulary for didactic purposes. We describe in detail how three quantitative measures (absolute frequency, keyness and dispersion) are configured and combined to automate the selection of domain-specific vocabulary from a specialized corpus. We then evaluate whether the automatic selection is confirmed by the judgements of teachers of Spanish as a foreign language (SFL). The results of this evaluation experiment show that in more than 85% of the cases the output of the quantitative selection method is accepted by at least half of the teachers. This finding is also supported statistically: an interrater reliability test indicates substantial agreement (Cohen’s kappa = 0.61) between the judgements of the teachers and the automatic selection.


corpus linguistics; vocabulary learning; automatic vocabulary selection; thematic vocabulary selection; absolute frequency; keyness; dispersion; Spanish as a foreign language (SFL)
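
For readers unfamiliar with the agreement statistic reported in the abstract, the following is a minimal sketch of how Cohen's kappa can be computed for two sets of binary judgements. It is not the procedure used in the study: the function name `cohens_kappa`, the toy labels, and the idea of reducing the teachers' judgements to a single label per word are all assumptions made for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("Both raters must label the same non-empty item list.")
    n = len(rater_a)

    # Observed agreement: share of items on which both raters give the same label.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Expected chance agreement, from each rater's marginal label proportions.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    labels = set(count_a) | set(count_b)
    p_e = sum(count_a[label] * count_b[label] for label in labels) / (n * n)

    if p_e == 1.0:
        # Both raters used one identical label throughout; kappa is undefined,
        # and this sketch returns 1.0 by convention.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical toy data (NOT from the study): 1 = word selected/accepted, 0 = not.
automatic_selection = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
teacher_judgement = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]

kappa = cohens_kappa(automatic_selection, teacher_judgement)
print(round(kappa, 2))
```

Under the commonly cited Landis and Koch benchmarks, values between 0.61 and 0.80 are read as substantial agreement, which is the band into which the reported value of 0.61 falls.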

