What data for data-driven learning?

Alex Boulton

France

Nancy-Université

Crapel – ATILF / CNRS

Accepted: 07/30/2021

Published: 03/22/2012

DOI: https://doi.org/10.4995/eurocall.2012.16038

Keywords:

corpus, concordance, data-driven learning, DDL

Supporting agencies:

This research was not funded

Abstract:

Corpora have multiple affordances, not least for use by teachers and learners of a foreign language (L2) in what has come to be known as ‘data-driven learning’ or DDL. The corpus and the concordance interface were originally conceived by and for linguists, so other users need to adopt the role of ‘language researcher’ to make the most of them. Despite the alleged advantages of adopting this role, it creates a potential barrier, particularly for occasional or non-specialist users. While researchers debate the status of the ‘web-as-corpus’, the Internet represents a vast bank of data already familiar to most people; less discussed is the status of ‘Google-as-concordancer’, another familiar tool. This paper discusses some of the advantages and disadvantages of this approach from a pedagogical perspective.
