Toward the morpho-syntactic annotation of an Old English corpus with universal dependencies

Authors

Javier Martín Arista, Universidad de La Rioja

DOI:

https://doi.org/10.4995/rlyla.2022.16787

Keywords:

Universal Dependencies, Treebanks, Syntactic Annotation, Old English

Abstract

This article takes the first steps toward compiling a treebank of Old English compatible with the Universal Dependencies (UD) framework. Such a treebank will comprise morphological and syntactic annotation of Old English texts suitable for cross-linguistic comparison, diachronic analysis and natural language processing. The article therefore engages in four tasks: (i) identifying the Old English exponents of UD lexical categories; (ii) selecting the Old English exponents of UD morphological features; (iii) finding the areas of Old English morphology that require token indexing in the UD format; and (iv) assessing the relevance of the universal set of dependency relations. The data have been extracted from ParCorOEv2, an open-access annotated parallel corpus of Old English and English. The main conclusions are that the annotation format calls for two additional fields (gloss and morphological relatedness) and that enhanced dependencies are required to account for some syntactic phenomena.
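
As a rough illustration of what an annotation format with two additional fields might look like, the following Python sketch parses tab-separated token lines with the ten standard CoNLL-U columns plus two more. The column names GLOSS and MORPHREL, their position at the end of the line, and the three sample tokens (including the value given for morphological relatedness) are assumptions made for this example; they are not taken from the article or from ParCorOEv2.

```python
from dataclasses import dataclass
from typing import Dict

# The ten standard CoNLL-U columns, followed by two extra columns.
# GLOSS and MORPHREL are hypothetical names and positions used only in this sketch.
COLUMNS = ["ID", "FORM", "LEMMA", "UPOS", "XPOS", "FEATS",
           "HEAD", "DEPREL", "DEPS", "MISC", "GLOSS", "MORPHREL"]


@dataclass
class Token:
    id: int
    form: str
    lemma: str
    upos: str
    feats: Dict[str, str]
    head: int
    deprel: str
    deps: str       # enhanced dependencies, left unparsed here
    gloss: str      # hypothetical extra field
    morphrel: str   # hypothetical extra field


def parse_feats(field: str) -> Dict[str, str]:
    """Turn 'Case=Nom|Number=Sing' into a dict; '_' means no features."""
    if field == "_":
        return {}
    return dict(pair.split("=", 1) for pair in field.split("|"))


def parse_token(line: str) -> Token:
    """Parse one token line. Only plain integer IDs are handled;
    multiword-token ranges such as '1-2' are outside this sketch."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} columns, got {len(cols)}")
    row = dict(zip(COLUMNS, cols))
    return Token(
        id=int(row["ID"]), form=row["FORM"], lemma=row["LEMMA"],
        upos=row["UPOS"], feats=parse_feats(row["FEATS"]),
        head=int(row["HEAD"]), deprel=row["DEPREL"], deps=row["DEPS"],
        gloss=row["GLOSS"], morphrel=row["MORPHREL"],
    )


# Illustrative rows for 'se cyning rad' ('the king rode'); invented for this example.
ROWS = [
    ["1", "se", "se", "DET", "_",
     "Case=Nom|Gender=Masc|Number=Sing|PronType=Dem", "2", "det", "_", "_", "the", "_"],
    ["2", "cyning", "cyning", "NOUN", "_",
     "Case=Nom|Gender=Masc|Number=Sing", "3", "nsubj", "_", "_", "king", "cynn"],
    ["3", "rad", "ridan", "VERB", "_",
     "Mood=Ind|Number=Sing|Person=3|Tense=Past|VerbForm=Fin", "0", "root", "_", "_", "rode", "_"],
]
tokens = [parse_token("\t".join(row)) for row in ROWS]

for t in tokens:
    print(t.id, t.form, t.upos, t.deprel, t.head, t.gloss, t.morphrel)
```

Appending the extra fields after the standard ten keeps the first ten columns readable by tools that expect plain CoNLL-U, provided the extra columns are stripped first; whether the article places them there is not stated in the abstract.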

Author Biography

Javier Martín Arista, Universidad de La Rioja

Professor Javier Martín Arista teaches Old English and Linguistics at the University of La Rioja. He has published widely in linguistics journals. He is the PI of the Nerthus Project (www.nerthusproject.com), which deals with advanced corpus linguistics, computational linguistic analysis, digital humanities and electronic lexicography of Old English.

References

Böhmová, A., Hajič, J., Hajičová, E., & Hladká, B. (2003). "The Prague Dependency Treebank", in A. Abeillé (ed.) Treebanks: Building and Using Parsed Corpora. Dordrecht: Kluwer Academic Publishers, 103-127. https://doi.org/10.1007/978-94-010-0201-1_7

Campbell, A. (1987). Old English Grammar. Oxford: Oxford University Press.

Chomsky, N. (1970). "Remarks on Nominalization", in R. Jacobs & P. Rosenbaum (eds.) Readings in English Transformational Grammar. Waltham, MA.: Blaisdell, 184-221.

Croft, W., Nordquist, D., Looney, K., & Regan, M. (2017). "Linguistic typology meets Universal Dependencies", in M. Dickinson, J. Hajič, S. Kübler & A. Przepiórkowski (eds.) Proceedings of the 15th International Workshop on Treebanks and Linguistic Theories (TLT15), 63-75.

de Marneffe, M. C., Dozat, T., Silveira, N., Haverinen, K., Ginter, F., Nivre, J. & Manning, C. (2014). "Universal Stanford Dependencies: a cross-linguistic typology", in Proceedings of LREC 2014, 4585-4592.

de Marneffe, M. C., Manning, C., Nivre, J. & Zeman, D. (2021). "Universal Dependencies". Computational Linguistics 47/2, 255-308. https://doi.org/10.1162/coli_a_00402

Denison, D. (1993). English Historical Syntax. London: Longman.

Dik, S. C. (1997a). The Theory of Functional Grammar. Part 1: The Structure of the Clause. Berlin: Mouton de Gruyter. Edited by K. Hengeveld.

Dik, S. C. (1997b). The Theory of Functional Grammar. Part 2: Complex and Derived Constructions. Berlin: Mouton de Gruyter. Edited by K. Hengeveld. https://doi.org/10.1515/9783110218374

Fillmore, C. J. (1968). "The Case for Case", in E. Bach and R. Harms (eds.) Universals in Linguistic Theory. New York: Holt, Rinehart & Winston, 1-88.

Foley, W. (2007). "A typology of information packaging in the clause", in T. Shopen (ed.) Language Typology and Syntactic Description. Volume I: Clause Structure. Cambridge: Cambridge University Press, 362-446. https://doi.org/10.1017/CBO9780511619427.007

Foley, W. & Van Valin, R. D., Jr. (1984). Functional syntax and universal grammar. Cambridge: Cambridge University Press.

Greenberg, J. H. (1966). "Some universals of grammar with particular reference to the order of meaningful elements", in J. H. Greenberg (ed.), Universals of grammar. Cambridge, Mass.: MIT Press, 73-113.

Haspelmath, M. (2019). "Indexing and flagging, and head and dependent marking". Te Reo 62/1, 93-115.

Healey, A. (ed.), Wilkin, J. & Xiang, X. (2004). The Dictionary of Old English web corpus. Toronto: Dictionary of Old English Project, Centre for Medieval Studies, University of Toronto.

Hogg, R. M. & Fulk, R. D. (2011). A Grammar of Old English. Volume 2: Morphology. Blackwell. https://doi.org/10.1002/9781444327472

Hudson, R. A. (1984). Word Grammar. Blackwell.

Jurafsky, D. & Martin, J. H. Speech and Language Processing (3rd edition). Forthcoming.

Kaplan, R. & Bresnan, J. (1982). "Lexical-Functional Grammar: A formal system for grammatical representation", in J. Bresnan (ed.) The Mental Representation of Grammatical Relations. Cambridge, Mass.: MIT Press, 173-281.

Kastovsky, D. (1992). "Semantics and vocabulary", in R. Hogg (ed.) The Cambridge history of the English language I: The beginnings to 1066. Cambridge: Cambridge University Press, 290-408. https://doi.org/10.1017/CHOL9780521264747.006

Kübler, S. & Zinsmeister, H. (2015). Corpus Linguistics and Linguistically Annotated Corpora. London: Bloomsbury.

Lakoff, G. (1971). "On generative semantics", in D. D. Steinberg & L. A. Jakobovits (eds.), Semantics: An interdisciplinary reader in philosophy, linguistics and psychology. Cambridge: Cambridge University Press, 232-296.

Martín Arista, J. (2012). "The Old English prefix ge-: A panchronic reappraisal". Australian Journal of Linguistics 32/4, 411-433. https://doi.org/10.1080/07268602.2012.744264

Martín Arista, J. & Ojanguren López, A. E. (2018). "Grammaticalization and deflexion in progress. The past participle in the Old English passive". Studia Neophilologica 90/2, 155-175. https://doi.org/10.1080/00393274.2018.1463823

Martín Arista, J., Domínguez Barragán, S., García Fernández, L., Ruíz Narbona, E., Torre Alonso, R. & Vea Escarza, R. (comp.). (2021). ParCorOEv2. An open access annotated parallel corpus Old English-English. Nerthus Project, Universidad de La Rioja, www.nerthusproject.com.

McDonald, R. T., Nivre, J., Quirmbach-Brundage, Y., Goldberg, Y., Das, D., Ganchev, K., Hall, K., Petrov, S., Zhang, H., Täckström, O., Bedini, C., Bertomeu, N. & Lee, J. (2013). "Universal Dependency Annotation for Multilingual Parsing", in Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 92-97.

Mel'čuk, I., Bresnan, J., Asudeh, A., Toivonen, I. & Wechsler, S. (2016). Lexical-Functional Syntax. Chichester: Wiley-Blackwell. https://doi.org/10.1002/9781119105664

Mel'čuk, I. (1988). Dependency Syntax: Theory and Practice. State University of New York Press.

Nilsson, J., Nivre, J., & Hall, J. (2006). "Graph transformations in data-driven dependency parsing", in COLING 21 and ACL 44, 257-264. https://doi.org/10.3115/1220175.1220208

Nivre, J. (2015). "Towards a universal grammar for natural language processing", in A. Gelbukh (ed.) Computational Linguistics and Intelligent Text Processing. New York: Springer, 3-16. https://doi.org/10.1007/978-3-319-18111-0_1

Nivre, J. (2016). "Universal Dependencies: A Cross-Linguistic Perspective on Grammar and Lexicon", in Proceedings of the Workshop on Grammar and Lexicon: Interactions and Interfaces, 38-40.

Nivre, J., de Marneffe, M. C., Ginter, F., Hajič, J., Manning, C., Pyysalo, S., Schuster, S., Tyers, F. & Zeman, D. (2020). "Universal Dependencies v2: An evergrowing multilingual treebank collection", in Proceedings of the Twelfth International Conference on Language Resources and Evaluation (LREC 2020), 4027-4036.

Nivre, J., de Marneffe, M. C., Ginter, F., Goldberg, Y., Hajič, J., Manning, C., McDonald, R., Pyysalo, S., Schuster, S., Silveira, N., Tsarfaty, R. & Zeman, D. (2016). "Universal Dependencies v1: a multilingual treebank collection", in Proceedings of the 10th International Conference on Language Resources and Evaluation, 1659-1666.

Perlmutter, D. (ed.). (1983). Studies in Relational Grammar. Chicago: The University of Chicago.

Petré, P. (2014). Constructions and environments: Copular, passive, and related constructions in Old and Middle English. Oxford: Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199373390.001.0001

Petrov, S., Das, D. & McDonald, R. (2012). "A Universal Part-of-Speech Tagset", in Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), 2089-2096.

Pintzuk, S. & Plug, L. (eds.) (2001). The York-Helsinki Parsed Corpus of Old English Poetry [http://www-users.york.ac.uk/~lang18/pcorpus.html].

Ringe, D. & Taylor, A. (2014). The development of Old English. A linguistic history of English, vol. 2. Oxford: Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199207848.001.0001

Rissanen, M., Kytö, M., Kahlas-Tarkka, L., Kilpiö, M., Nevanlinna, S., Taavitsainen, I., Nevalainen, T. & Raumolin-Brunberg, H. (eds). (1991). The Helsinki Corpus of English Texts. Department of Modern Languages, University of Helsinki.

Taylor, A., Warner, A., Pintzuk, S. & Beths, F. (2003). The York-Toronto-Helsinki Parsed Corpus of Old English Prose [https://www-users.york.ac.uk/~lang22/YcoeHome1.htm].

Tesnière, L. (2015 [1959]). Elements of Structural Syntax. Translation by T. Osborne and S. Kahane of Tesnière (1959). Amsterdam: John Benjamins. https://doi.org/10.1075/z.185

Van Valin, R. D., Jr. & LaPolla, R. (1997). Syntax: structure, meaning and function. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9781139166799

Published

2022-07-28

Section

Articles