Learning verb inflection using Cilenis conjugators

Pablo Gamallo*, Marcos García*, Isaac González**, Marta Muñoz** and Iria del Río**
*CITIUS, Universidade de Santiago de Compostela, **Cilenis S.L.


1. Introduction

It is well known that verb inflection is far more complex in Romance languages than in English. While an English verb has only a handful of distinct forms, a single verb in Spanish or Portuguese can take up to 70 or 80 different inflected forms (not counting compound forms). Verb morphology is therefore perceived as one of the main acquisition challenges for L2 learners of Romance languages, so it is important to provide learners with appropriate linguistic tools. Automatic verb conjugators are very useful tools for helping L2 learners acquire verb inflection. They can be accessed from many types of devices: computers, tablets, smartphones, etc. Verb conjugators are part of a wide variety of language-learning tools, including dictionaries, translators, and multimedia resources, available in real time on the Internet. Learners receive powerful reinforcement by searching for relevant linguistic information at any time and in any place.

In this article, we describe a linguistic tool, named Cilenis Verb Conjugator, consisting of three related modules: Cilenis Conjuga, a verb conjugator for Spanish; Cilenis Conjugador, a verb conjugator for Portuguese; and Cilenis Conxugador, a verb conjugator for Galician. These three modules conjugate Spanish, Portuguese, and Galician verb infinitives, respectively. They are well suited to L2 learners as well as to students completing high school and college assignments. The main properties of our three verb conjugators are the following:

  • They are based on highly reliable linguistic information.

  • They were developed within an open source project.

  • They are accessed via search forms with user-friendly interfaces.

  • They can be installed on Android smartphones.

2. The strategy

Our linguistic tool was built on the basis of a free software project. The starting point of this project was the study, analysis, and improvement of three existing free conjugators:

  • The Spanish verb conjugator developed by the research group “Gramática del Español”, at the University of Santiago de Compostela.

  • The Portuguese conjugator, called Conjugue, written in Awk by Ricardo Ueda Karpischek. This tool was designed to inflect all verb forms contained in the Ispell dictionary.

  • The Galician verb conjugator, called Conshuga, which is a Perl script developed by the research team “ProLNat”, at the University of Santiago de Compostela.

To meet the project's requirements, we performed two tasks: a linguistic improvement of the existing tools, and a computational unification of the three conjugators into a single architecture.

First, the linguistic task mainly consisted of identifying and correcting grammatical and lexical errors found in the three conjugators, and of adding missing irregular verbs. We also added new items of information, namely alternative forms associated with the same verb inflection, and separate conjugations for ambiguous verbs that share the same infinitive form, for instance “acostar” in Spanish or “cumprir” in Portuguese. Concerning defective verbs, we made a considerable effort to unify the different, and sometimes controversial, criteria used by the authors of the three conjugators. Finally, special attention was paid to the language varieties of Portuguese. To deal with spelling variation in this language, we defined four different cases:

  • European Portuguese before the Spelling Agreement (“Acordo Ortográfico” of 1990).

  • European Portuguese following the Spelling Agreement.

  • Brazilian Portuguese before the Spelling Agreement.

  • Brazilian Portuguese following the Spelling Agreement.
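The four cases are the cross product of two independent choices: the regional variety and the spelling norm. A minimal sketch in Python of how such a combination could be modelled; the names `Region`, `Norm`, and `PortugueseVariety` are hypothetical and are not taken from the Cilenis code base:

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class Region(Enum):
    EUROPEAN = "European Portuguese"
    BRAZILIAN = "Brazilian Portuguese"


class Norm(Enum):
    PRE_AGREEMENT = "before the 1990 Spelling Agreement"
    POST_AGREEMENT = "following the 1990 Spelling Agreement"


@dataclass(frozen=True)
class PortugueseVariety:
    """One of the four spelling cases (hypothetical model, for illustration)."""
    region: Region
    norm: Norm

    def label(self) -> str:
        return f"{self.region.value}, {self.norm.value}"


# Enumerate every (region, norm) pair: 2 x 2 = 4 varieties.
ALL_VARIETIES = [PortugueseVariety(r, n) for r, n in product(Region, Norm)]

if __name__ == "__main__":
    for variety in ALL_VARIETIES:
        print(variety.label())
```

Keeping the two dimensions separate mirrors the two controls described for the Portuguese search form (one for the spelling norm, one for the regional variety).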

Second, the computational task consisted of building a single framework containing the linguistic data of the three existing conjugators. For this purpose, we defined a set of scripts transforming the different outputs of these conjugators into a single output format. This new unified output is the input of the visual modules used to develop the search interfaces in both the Web and Android environments.
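The unification step can be pictured as one small parser per legacy output format, all producing the same record. The sketch below is only an illustration under stated assumptions: the two input formats (`colon` and `tab`), the record layout, and the function names are invented here, since the actual output formats of the three original conjugators are not described in this article.

```python
import json
from typing import Callable, Dict, List

Forms = Dict[str, List[str]]


def parse_colon_format(text: str) -> Forms:
    """Parse lines such as 'present:canto,cantas,canta' (invented format)."""
    forms: Forms = {}
    for line in text.strip().splitlines():
        tense, _, values = line.partition(":")
        forms[tense.strip()] = [v.strip() for v in values.split(",") if v.strip()]
    return forms


def parse_tab_format(text: str) -> Forms:
    """Parse lines such as 'present<TAB>canto', one form per line (invented format)."""
    forms: Forms = {}
    for line in text.strip().splitlines():
        tense, form = line.split("\t")
        forms.setdefault(tense.strip(), []).append(form.strip())
    return forms


# One parser per hypothetical legacy format.
PARSERS: Dict[str, Callable[[str], Forms]] = {
    "colon": parse_colon_format,
    "tab": parse_tab_format,
}


def to_unified(lang: str, infinitive: str, raw_output: str, source_format: str) -> dict:
    """Convert one legacy output into the single (assumed) unified record."""
    return {
        "lang": lang,
        "infinitive": infinitive,
        "forms": PARSERS[source_format](raw_output),
    }


if __name__ == "__main__":
    record = to_unified("es", "cantar", "present:canto,cantas,canta", "colon")
    print(json.dumps(record, ensure_ascii=False))
```

With this design, a Web or Android front end only needs to understand the unified record, regardless of which legacy conjugator produced the data.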

The three modules of the Cilenis Conjugator framework are also open source and freely available from their corresponding Web pages. This allows other researchers or companies not only to install and use the conjugators in their own environments, but also to improve the software and/or enrich the linguistic information it contains. So far, and as far as we know, one company has taken advantage of our open-source project: GalApps has used our Galician module to implement an application for both smart phones with Android and.

3. The search interfaces

Two different types of search interfaces were implemented: Web forms written in HTML and PHP, and Android applications implemented in Java. The Android interfaces rely on APIs to search and retrieve results from the Web forms. We paid special attention to the design of the interfaces, to make them attractive, simple, and user-friendly (Figures 1-6).

Figure 1. Web form of the Spanish verb conjugator.


Figure 2. Web form of the Portuguese verb conjugator.


Figure 3. Web form of the Spanish verb conjugator.


Figure 4. Android app of the Spanish verb conjugator.


Figure 5. Android app of the Portuguese verb conjugator.

Among the elements and functionalities included in our search interfaces, we highlight the following:

  • The search/conjugate button and the text box, which are shared by all existing verb conjugators. The user types the infinitive of the verb she/he wishes to conjugate into the text box and then presses the search button.

  • Given a specific language, the system can be restricted to existing verbs only, that is, known verbs found in the main dictionaries of that language. To search for existing verbs only, the user must tick the corresponding checkbox.

  • Conversely, if the user leaves the existing-verbs checkbox unticked, the system can conjugate imaginary or misspelt verbs, provided their ending corresponds to an existing conjugation model. The Spanish conjugator Onoma also provides this and the previous functionality (Rello & Basterrechea, 2011).

  • As stated above, the Portuguese module allows the user to search in four different language varieties. To choose a specific variety, the user ticks or unticks the checkbox labelled “Acordo Ortográfico” (Spelling Agreement), and selects between “Port. Europeu” (European Portuguese) and “Port. Brasileiro” (Brazilian Portuguese).

  • Finally, when the user types an ambiguous verb infinitive, the system shows the different lexical units associated with that verb (see again Figure 3 above). Each lexical unit has a different verb inflection.
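The difference between searching only for existing verbs and conjugating any infinitive whose ending matches a conjugation model can be sketched as follows. This is a simplified illustration, not the Cilenis implementation: it covers only the present indicative of the three regular Spanish models, it ignores irregular verbs, and `KNOWN_VERBS` is a toy stand-in for a real dictionary. The function name and its `only_existing` parameter are hypothetical.

```python
from typing import List, Optional

# Present-indicative endings of the three regular Spanish conjugation models.
REGULAR_MODELS = {
    "ar": ["o", "as", "a", "amos", "áis", "an"],
    "er": ["o", "es", "e", "emos", "éis", "en"],
    "ir": ["o", "es", "e", "imos", "ís", "en"],
}

# Toy stand-in for the dictionary of attested verbs (assumption).
KNOWN_VERBS = {"cantar", "comer", "vivir"}


def conjugate_present(infinitive: str, only_existing: bool = True) -> Optional[List[str]]:
    """Return the six present-indicative forms, or None if the verb is rejected."""
    verb = infinitive.strip().lower()
    # With the "existing verbs" option on, unknown infinitives are refused.
    if only_existing and verb not in KNOWN_VERBS:
        return None
    stem, ending = verb[:-2], verb[-2:]
    endings = REGULAR_MODELS.get(ending)
    # Without it, any infinitive whose ending matches a model is conjugated.
    if not stem or endings is None:
        return None
    return [stem + e for e in endings]


if __name__ == "__main__":
    print(conjugate_present("pitufar"))                       # rejected: not in the toy dictionary
    print(conjugate_present("pitufar", only_existing=False))  # conjugated on the -ar model
```

The invented verb “pitufar” is used here because it appears among the Spanish Web queries reported in Table 1.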

4. Query analysis

To analyse the behaviour of the users of our three verb conjugators, we have stored all queries made so far in log files. These log files provide us with useful information to improve the system. For instance, since they show which verbs are searched most often, we can prioritise the revision of those verbs, where any error would affect the largest number of users. The files also allow us to check whether users take advantage of all the elements and functionalities of the conjugators, for instance whether they restrict their searches to existing verbs.
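A minimal sketch of this kind of log analysis is given below. The log format is an assumption made for illustration (one query per line, with a user identifier and the queried verb separated by a tab); the actual format of the Cilenis log files is not described here, and the function name `summarise` is hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def summarise(log_lines: Iterable[str], top_n: int = 20) -> Tuple[int, int, List[Tuple[str, int]]]:
    """Return (total queries, unique users, top_n most searched verbs)."""
    verbs: Counter = Counter()
    users = set()
    total = 0
    for line in log_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 2:
            continue  # skip malformed lines
        user_id, verb = fields
        total += 1
        users.add(user_id)
        verbs[verb.strip().lower()] += 1  # case-insensitive verb counts
    return total, len(users), verbs.most_common(top_n)


if __name__ == "__main__":
    sample = ["u1\tcomer\n", "u1\tir\n", "u2\tComer\n"]
    print(summarise(sample, top_n=2))
```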

Table 1 below shows the total number of search queries, the number of unique users, and the top 20 lists of the most searched verbs. The queries made from the Web forms are reported separately from those made from Android devices, except for Galician, since we have not developed the corresponding Android version. The totals and the numbers of unique users clearly show that many more queries are made from Android devices than from the Web applications. This is in line with the fact that the use of smartphones and tablets is still growing, while sales of laptops and PCs are reported to be in slight decline. Notice also that the total number of queries for Galician verbs is much higher than for Spanish and Portuguese, even though the number of unique users is clearly lower. This means that Galician users are more persistent: according to Table 1, they make about 15 queries per unique user, compared with roughly 3 to 4 queries per user in Portuguese and Spanish. The loyalty of Galician users is probably due to the fact that, unlike for Portuguese and Spanish, there are very few alternatives to our conjugator in Galician.

Table 1 also shows that most of the top 20 searched verbs have irregular forms, although some representative regular verbs (e.g. comer - to eat, cantar - to sing) also appear at the top of the lists. As expected, irregular verbs are more troublesome for both native speakers and second-language learners.
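The queries-per-user comparison follows directly from the figures reported in Table 1. A short worked computation, using only those figures (the variable and function names are ours):

```python
# Figures copied from Table 1.
TOTAL_QUERIES = {
    "PT Web": 12417, "PT Android": 42775,
    "ES Web": 2195, "ES Android": 33292,
    "GL": 105533,
}
UNIQUE_USERS = {
    "PT Web": 2804, "PT Android": 16664,
    "ES Web": 673, "ES Android": 7850,
    "GL": 6823,
}


def queries_per_user() -> dict:
    """Average number of queries made by each unique user, per source."""
    return {src: TOTAL_QUERIES[src] / UNIQUE_USERS[src] for src in TOTAL_QUERIES}


if __name__ == "__main__":
    for source, ratio in queries_per_user().items():
        print(f"{source}: {ratio:.1f} queries per unique user")
```

With these figures, the ratio is about 15.5 for Galician and lies between roughly 2.6 and 4.4 for the four Portuguese and Spanish sources.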

                PT Web          PT Android      ES Web          ES Android      GL
Total queries   12417           42775           2195            33292           105533
Unique users    2804            16664           673             7850            6823

Top 20 searched verbs (number of queries, verb)

PT Web          PT Android      ES Web          ES Android      GL
487 fazer       3189 comer      141 clipsetar   1211 ir         1369 ir
405 ver         1701 ir         108 comer       1211 comer      1264 ser
373 ser         1463 ser        67 mentar       1068 ser        1188 poder
339 ter         1376 ver        61 ser          1047 hacer      1069 vir
324 ir          1362 fazer      53 oir          917 tener       1067 ver
317 vir         1193 vir        51 ir           837 estar       1008 ter
296 poder       1029 ter        38 hacer        678 decir       938 dar
232 estar       872 estar       37 haber        665 ver         854 estar
229 dizer       872 amar        35 estar        601 poder       833 facer
183 amar        815 falar       34 pitufar      521 poner       705 haber
153 querer      694 poder       32 tener        520 saber       683 dicir
152 saber       645 dar         32 bailar       520 dar         609 saber
150 por         635 querer      31 poder        504 haber       560 caber
148 haver       618 haver       30 dar          487 venir       530 seguir
147 comer       574 trazer      28 ver          470 querer      517 ler
145 dar         492 andar       28 saber        430 salir       510 querer
138 gostar      478 dizer       26 venir        403 traer       510 andar
121 trazer      439 cantar      26 cantar       390 hablar      479 comer
120 falar       433 correr      24 poner        361 leer        470 poñer
112 ler         428 saber       22 querer       348 dormir      458 falar

Table 1. Log information on search queries made from both the Web application and Android devices.

5. Related work

There are many verb conjugators for Spanish and Portuguese, but few for Galician. For Portuguese, we found:

  • Conjugador Insite

  • Conjugador Só Português

  • Verbix for Portuguese

  • for Portuguese

  • Conjuga-me

  • Verbomatic

  • Flip

Flip and Cilenis Conjugador are the only conjugators offering the four varieties of Portuguese (Brazilian and European Portuguese, before and after the “Acordo Ortográfico”, or Spelling Agreement). The remaining conjugators offer either Brazilian or European Portuguese verb forms only. Some of them contain significant errors in verb inflection or rely on a limited list of verbs, to the point that even common verbs are missing.

Apart from Cilenis, none of the Portuguese conjugators can conjugate verbs that are not in its verb list. The conjugator most similar to Cilenis Conjugador is Flip, but its interface is slow and the user needs too many clicks to see the verb in the desired inflection.

For Spanish, we found many on-line conjugators, in some cases integrated into more general systems containing other languages. Some examples of Spanish conjugators are the following:

  • Reverso

  • Wordreference

  • Verbix for Spanish

  • Onoma

  • Conjuga

The most complete conjugators are Onoma and Conjuga (an open-source tool from the University of Santiago de Compostela). Onoma is oriented toward philology and is quite complex for ordinary users. Conjuga, in turn, is the core on which our Spanish Cilenis conjugator is built; we have corrected some of its errors and enlarged the number of verb inflections.

For Galician, there are only two options:

  • Verbix for Galician

  • Digalego

Neither of them provides information for all Galician verbs. Moreover, the interface of Digalego is a long list of verbs with no search engine, which makes any search cumbersome. By contrast, the main aim of our interface is to be easy to use by providing a minimalist search form.

As far as the Android market is concerned, we found a wide range of conjugation apps, especially for Spanish. Among these apps, “El Conjugador” has an interface in French, which is quite slow. “Spanish Verbs”, “4001 Spanish Verbs” and “Spanish Verbs Conjugator” are other options based on lists, some of which do not even provide all the verbal tenses. For Portuguese, we found only “Portuguese Verbs”, which, like many other apps, offers just a small list of verbs to conjugate. Finally, for Galician, the only option is Conxugalego, based on the core of Cilenis Conjugator, which was developed by the members of our research team ProLNat@GE, at the University of Santiago de Compostela.

6. Further Applications

Besides their use in L2 learning, in current work we have integrated our conjugators into other Natural Language Processing tools. In particular, they have been integrated into “Avalingua”, software aimed at automatically identifying and evaluating spelling, lexical, and grammatical errors in written language. The role of the verb conjugators within the Avalingua architecture is to generate a complete lexicon with all possible verb forms.
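Generating a full-form lexicon from a conjugator amounts to running it over a list of infinitives and indexing every resulting form. The sketch below illustrates the idea only; it does not reproduce the Avalingua architecture, and `build_verb_lexicon` and `toy_conjugate` are hypothetical names. The toy conjugator covers just the present indicative of regular Spanish -ar verbs.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Set


def build_verb_lexicon(
    infinitives: Iterable[str],
    conjugate: Callable[[str], List[str]],
) -> Dict[str, Set[str]]:
    """Map every inflected form to the set of infinitives that can produce it."""
    lexicon: Dict[str, Set[str]] = defaultdict(set)
    for infinitive in infinitives:
        lexicon[infinitive].add(infinitive)
        for form in conjugate(infinitive):
            lexicon[form].add(infinitive)
    return dict(lexicon)


def toy_conjugate(infinitive: str) -> List[str]:
    """Stand-in conjugator: present indicative of regular -ar verbs only."""
    stem = infinitive[:-2]
    return [stem + ending for ending in ("o", "as", "a", "amos", "áis", "an")]


if __name__ == "__main__":
    lexicon = build_verb_lexicon(["cantar", "bailar"], toy_conjugate)
    print(sorted(lexicon))
```

A checker built on such a lexicon can treat any word absent from it as a candidate error, and can use the stored infinitives to relate a form to its verb.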


