Social tagging and blog-scraping as an alternative for updating controlled vocabularies: Practical application to a library and information science thesaurus

  • Gonzalo Mochón Bezares, Universidad Carlos III de Madrid
  • Eva Méndez Rodríguez, Universidad Carlos III de Madrid
  • Ángela Sorli Rojo, Consejo Superior de Investigaciones Científicas
Keywords: Social tagging, Thesauri maintenance, Blogs, Library and Information Science, Terminological extraction


The aim of this paper is to compare free-language tags, taken in our case from specialized blogs on information science, with the unstructured controlled language of keyword lists, in order to determine which is the better source of new terminology for the Librarianship and Documentation Thesaurus. To do this, author-assigned tags were extracted from 127 blogs on librarianship and information science using web scraping techniques and were compared with the descriptor and identifier lists of the ISOC Library and Documentation database (ISOC-BD). The analysis of the blog authors' tags yielded 186 new terms, while the database lists yielded only 130. It is concluded that free-language tags could be a better and faster way of contributing new terminology to controlled vocabularies than unstructured controlled-language lists.
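
The comparison described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the scraping or matching was done. The sketch assumes that blog tags are marked up as links with rel="tag" (a common blog convention) and that matching is a simple case-insensitive and whitespace-insensitive comparison. The names TagLinkParser, normalize and candidate_terms are hypothetical.

```python
from html.parser import HTMLParser


class TagLinkParser(HTMLParser):
    """Collect the text of <a rel="tag"> links (assumed tag markup)."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self._in_tag_link = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        # rel may hold several space-separated values, e.g. "category tag".
        rel = dict(attrs).get("rel") or ""
        if tag == "a" and "tag" in rel.split():
            self._in_tag_link = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_tag_link:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_tag_link:
            self.tags.append("".join(self._buffer))
            self._in_tag_link = False


def normalize(term):
    """Lowercase and collapse whitespace (assumed matching rule)."""
    return " ".join(term.lower().split())


def candidate_terms(html_pages, known_terms):
    """Return scraped tags that are absent from the known vocabulary."""
    known = {normalize(t) for t in known_terms}
    found = set()
    for page in html_pages:
        parser = TagLinkParser()
        parser.feed(page)
        parser.close()
        found.update(normalize(t) for t in parser.tags)
    found.discard("")  # ignore empty tag links
    return sorted(found - known)
```

Given already downloaded pages, candidate_terms(pages, thesaurus_terms) returns the normalized tags that the vocabulary lacks. In practice, candidates would still need manual review before being added to a thesaurus.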


Author Biographies

Gonzalo Mochón Bezares, Universidad Carlos III de Madrid
Faculty of Humanities, Communication and Documentation. Department of Library and Information Science. Student
Eva Méndez Rodríguez, Universidad Carlos III de Madrid
Faculty of Humanities, Communication and Documentation. Department of Library and Information Science. Professor
Ángela Sorli Rojo, Consejo Superior de Investigaciones Científicas
Department of Postgraduate Studies and Specialization. Director


