Learning IS-A relations from specialized-domain texts with co-occurrence measures


Pedro Urena, University of Granada




Keywords: ontology learning, ontology enrichment, taxonomy, corpus linguistics, co-occurrence


Ontology enrichment is a classification problem in which an algorithm assigns an input conceptual unit to the corresponding node of a target ontology. Conceptual enrichment is of great importance to both Knowledge Engineering and Natural Language Processing because it helps maximize the efficacy of intelligent systems, making them more adaptable to scenarios where information is produced by means of language. Following previous research on distributional semantics, this paper presents a case study of ontology enrichment using a feature-extraction method that relies on collocational information from corpora. The major advantage of this method is that it can help locate an input unit under its corresponding superordinate node in a taxonomy using a relatively small number of lexical features. To evaluate the proposed framework, this paper presents an experiment consisting of the automatic classification of a chemical substance in a taxonomy of toxicology.
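The general idea of placing a term under a superordinate node by comparing collocational profiles can be illustrated with a minimal sketch. The abstract does not state which co-occurrence measure, window size, or similarity function the paper uses, so everything below is an assumption made for illustration: positive pointwise mutual information (PMI) as the association measure, a two-token window, cosine similarity for comparison, and a toy corpus with hypothetical candidate nodes `solvent` and `metal`. The function names are likewise hypothetical.

```python
import math
from collections import Counter


def cooccurrence_counts(sentences, target, window=2):
    """Count tokens occurring within `window` positions of `target`."""
    counts = Counter()
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j]] += 1
    return counts


def pmi_profile(sentences, target, window=2):
    """Map each collocate of `target` to its positive PMI (assumed measure)."""
    word_freq = Counter(tok for s in sentences for tok in s)
    total = sum(word_freq.values())
    profile = {}
    for word, joint in cooccurrence_counts(sentences, target, window).items():
        pmi = math.log2((joint * total) / (word_freq[target] * word_freq[word]))
        if pmi > 0:
            profile[word] = pmi
    return profile


def cosine(a, b):
    """Cosine similarity between two sparse feature dictionaries."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(sentences, term, candidate_parents, window=2):
    """Return the candidate parent whose profile is most similar to `term`."""
    term_profile = pmi_profile(sentences, term, window)
    scores = {
        parent: cosine(term_profile, pmi_profile(sentences, parent, window))
        for parent in candidate_parents
    }
    return max(scores, key=scores.get), scores


# Toy corpus, invented purely for illustration (not data from the paper).
corpus = [
    "the solvent is volatile and flammable".split(),
    "benzene is volatile and flammable".split(),
    "the metal is dense and ductile".split(),
    "lead is dense and ductile".split(),
    "a volatile solvent evaporates quickly".split(),
    "benzene evaporates quickly".split(),
]

best_parent, similarity = classify(corpus, "benzene", ["solvent", "metal"])
print(best_parent, similarity)
```

In this sketch the input term is attached to the candidate node with which it shares the most strongly associated collocates. A realistic setting would need a large domain corpus, lemmatization, and frequency thresholds, none of which are modeled here.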



