blog.mynarz.net: Publishing the vocabulary of the types of grey literature as linked data

This blog post is based on the poster presentation delivered at the Grey Literature 12 conference.

The aim of this post is to introduce the typology of grey literature we have started to develop at the National Technical Library. The vocabulary of the types of grey literature is a controlled vocabulary that is meant to be used to express that a document belongs to a certain document type. The design of the vocabulary is based on an analysis of six existing grey literature typologies. Thus, it can be seen as a formalization of the outcomes of the systematic examination done during this analysis.

The vocabulary has a loose structure with hierarchical relationships between the concepts that represent the types. Each type has a unique identifier (a URI, in this case) and a preferred label. Some types have labels in multiple languages and links to other types, both within the vocabulary itself and in external datasets. In the vocabulary's documentation, each type will be given a definition and a prototypical example of a document to which it applies.
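To make this structure more concrete, here is a minimal sketch in Python of how a type with a URI, language-tagged labels, and a parent type might be modelled. The `example.org` namespace, the two sample types, the Czech label, and the names `DocumentType` and `ancestors` are all assumptions made for illustration; they are not taken from the actual vocabulary.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical namespace; the vocabulary's real URIs are not shown in this post.
BASE = "http://example.org/grey-literature/"


@dataclass
class DocumentType:
    """One type in the vocabulary: an identifier, labels, and its links."""
    uri: str
    pref_label: Dict[str, str]                # language tag -> preferred label
    broader: Optional[str] = None             # URI of the parent type, if any
    links: List[str] = field(default_factory=list)  # URIs of related resources


def ancestors(uri: str, types: Dict[str, DocumentType]) -> List[str]:
    """Walk the hierarchy upwards and return the URIs of all broader types."""
    chain = []
    current = types[uri].broader
    while current is not None:
        chain.append(current)
        current = types[current].broader
    return chain


# Illustrative sample data only.
TYPES = {
    t.uri: t
    for t in [
        DocumentType(BASE + "report", {"en": "Report"}),
        DocumentType(
            BASE + "annual-report",
            {"en": "Annual report", "cs": "Výroční zpráva"},
            broader=BASE + "report",
        ),
    ]
}

print(ancestors(BASE + "annual-report", TYPES))
```

Keeping the parent as a URI rather than as a nested object mirrors the way the types refer to one another by identifier.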

I will briefly mention the technologies we have used in the vocabulary's development. The vocabulary is expressed in RDF as a SKOS concept scheme. RDF (Resource Description Framework) is a data model for expressing data with a graph structure, and SKOS (Simple Knowledge Organization System) is an RDF vocabulary for representing knowledge organisation systems, such as thesauri, codelists, or systematic classifications.
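As a rough illustration of what "a graph expressed in RDF using SKOS" means in practice, the sketch below builds a handful of triples and prints them in the line-based N-Triples syntax. The SKOS and RDF namespace URIs are the standard ones; the `example.org` URIs and the "Thesis" concept are assumptions for the sake of the example, and the serializer is deliberately simplified (it escapes only backslashes and double quotes in literals).

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/grey-literature/"  # hypothetical namespace


def iri(value: str) -> str:
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text: str, lang: str) -> str:
    """Format a language-tagged string literal (simplified escaping)."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


def to_ntriples(triples) -> str:
    """Serialize (subject, predicate, object) terms, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


scheme = BASE + "scheme"
thesis = BASE + "thesis"

# Each triple is one edge of the graph: subject -> predicate -> object.
triples = [
    (iri(scheme), iri(RDF_TYPE), iri(SKOS + "ConceptScheme")),
    (iri(thesis), iri(RDF_TYPE), iri(SKOS + "Concept")),
    (iri(thesis), iri(SKOS + "inScheme"), iri(scheme)),
    (iri(thesis), iri(SKOS + "prefLabel"), literal("Thesis", "en")),
]

NTRIPLES = to_ntriples(triples)
print(NTRIPLES)
```

In a real project an RDF library would normally handle parsing and serialization; plain strings are used here only to keep the example dependency-free.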

The vocabulary will be published as linked data. Linked data is a publication model for exposing structured data on the Web in a way that uses links between datasets to create a network of interlinked data. The vocabulary includes links to other vocabularies and datasets, such as the Bibliographic Ontology, the Dublin Core Metadata Initiative Type Vocabulary, or DBpedia, which represents structured information extracted from Wikipedia.
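The outbound links can be pictured as triples whose object lives in another dataset. The following sketch groups such links by the host of the target URI. The source URIs under `example.org` and the specific mappings are assumptions chosen for illustration, not the vocabulary's actual links; which predicate suits each target is a modelling decision this post does not cover.

```python
from collections import defaultdict
from urllib.parse import urlparse

SKOS = "http://www.w3.org/2004/02/skos/core#"
BASE = "http://example.org/grey-literature/"  # hypothetical namespace

# Illustrative links only: (type in our vocabulary, predicate, external resource).
LINKS = [
    (BASE + "thesis", SKOS + "closeMatch", "http://purl.org/ontology/bibo/Thesis"),
    (BASE + "thesis", SKOS + "closeMatch", "http://dbpedia.org/resource/Thesis"),
    (BASE + "report", SKOS + "broadMatch", "http://purl.org/dc/dcmitype/Text"),
]


def links_by_host(links):
    """Group links by the host name of the external resource they point to."""
    grouped = defaultdict(list)
    for subject, predicate, target in links:
        grouped[urlparse(target).netloc].append((subject, predicate, target))
    return dict(grouped)


GROUPED = links_by_host(LINKS)
for host, host_links in sorted(GROUPED.items()):
    print(host, len(host_links))
```

A summary like this gives a quick view of how densely the vocabulary is connected to each external dataset.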

The vocabulary is intended to be the product of cooperative development. The grey literature typology project is hosted on Google Code, which we chose because it supports collaborative development. At its core is a distributed version control system that tracks the different versions of the vocabulary submitted by members of the development team. Feedback can be incorporated by commenting on individual changes to the vocabulary and by reporting issues to be fixed in future versions. The Google Code site also includes a wiki that serves as the vocabulary's documentation.

The Working Group for Grey Literature Typology was established to develop this vocabulary. The aim of this informal group is to bring together experts from fields related to grey literature, knowledge organisation systems, and semantic web technologies to work collaboratively on the further evolution of the vocabulary. If you are interested in participating in the vocabulary's development or in using it, I encourage you to check out the project's website at Google Code.


2010-12-06
