An insight into Twitter: a corpus based contrastive study in English and Spanish.


  • Irina Argüelles Álvarez Universidad Politécnica de Madrid
  • Alfonso Muñoz Muñoz Universidad Politécnica de Madrid



Keywords: Twitter, contrastive analysis, corpus linguistics


This paper studies the use of Spanish and English on the micro-blogging social network Twitter from a contrastive point of view. A quantitative research methodology is applied, first, to identify characteristics of language, organization and content that are common to the medium and, second, to find possible differences attributable to the use of a particular language. For the experiment, two corpora were constructed from Twitter language data: one in Spanish totalling 4,027,746 words and another with similar characteristics in English totalling 4,655,992 words. The results show a number of very general discourse and organizational features common to the two corpora, together with some particular characteristics that differentiate the use of English and Spanish in the medium.



