Linked data principles

The following post is an excerpt from my thesis entitled Linked open data for public sector information.
Linked data principles govern the use of the semantic web technologies described in the previous sections. Unlike the technologies, the principles are not backed by any standards body, such as the World Wide Web Consortium. Instead, they are community-driven, and their sole enforcement mechanism is peer pressure. This may change in the near future, however, if the principles are incorporated into official policies and regulations, such as those that govern public sector institutions.
Linked data principles provide guidance for both data publishers and consumers. For publishers, they set out the best practices to comply with if their data is to be recognized as linked data. For consumers, the principles prescribe the behaviour they can expect when working with linked data, such as what happens when a linked data URI is resolved and content negotiation takes place.
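To make the consumer's side concrete, here is a minimal sketch of how a client might state its preferred representation when resolving a linked data URI. The URI and the function name build_request are hypothetical and serve only as illustration. The request is built but never sent, so the sketch shows the client's half of content negotiation only, not how any particular server responds.

```python
from urllib.request import Request

# Hypothetical resource URI, used only for illustration.
RESOURCE_URI = "http://example.org/id/ministry-of-finance"


def build_request(uri: str, wants_rdf: bool) -> Request:
    """Build (but do not send) an HTTP request that states the preferred format."""
    if wants_rdf:
        # A data consumer asks for an RDF serialization, preferring Turtle.
        accept = "text/turtle, application/rdf+xml;q=0.9"
    else:
        # A person with a browser asks for a human-readable page.
        accept = "text/html"
    return Request(uri, headers={"Accept": accept})


data_request = build_request(RESOURCE_URI, wants_rdf=True)
page_request = build_request(RESOURCE_URI, wants_rdf=False)

# Both requests target the same URI; only the Accept header differs.
print(data_request.full_url, "->", data_request.get_header("Accept"))
print(page_request.full_url, "->", page_request.get_header("Accept"))
```

The same name is thus used for both the data and the document about it. Which one comes back depends on what the client asks for.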
Compared with the principles of open data, there are fewer formulations of linked data principles. The original Linked Data Principles drafted by Tim Berners-Lee form a strong core that other, mostly derivative, sets of principles tend to cite or relate to.

Tim Berners-Lee's Linked Data Principles

The Linked Data Principles, written by Tim Berners-Lee in 2006, effectively define what linked data is. They set a touchstone for determining whether a dataset qualifies as “linked data” by covering all the necessary conditions that a dataset needs to fulfil in order to earn that label. These conditions are encapsulated in four succinct principles:
  1. Use URIs as names for things.
  2. Use HTTP URIs so that people can look up those names.
  3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL).
  4. Include links to other URIs, so that they can discover more things.
Berners-Lee, the inventor of the World Wide Web, sees the principles as natural for the Web. He recounts that in writing down the principles he only captured intentions that were already part of his original architecture for the Web.
After their creation, the principles were modified to a small extent to clarify certain issues and make some parts more explicit. For example, the original version from 2006 did not explicitly mention which technologies should be used to achieve the prescribed behaviour of the data. This was amended later, making it clear that the intended technologies were RDF and SPARQL.
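As a rough illustration of what the four principles ask of a dataset, the sketch below represents a few RDF statements as plain Python tuples and checks two things: that things are named with HTTP URIs (principles 1 and 2) and that some statements link to URIs on another host (principle 4). The triples, the example.org URIs, and the helper names are assumptions made for this example. Principle 3, which concerns what a server returns when a URI is looked up, is not checked here.

```python
from urllib.parse import urlparse

# Hypothetical triples (subject, predicate, object); the example.org URIs are made up.
TRIPLES = [
    ("http://example.org/id/prague",
     "http://www.w3.org/2000/01/rdf-schema#label",
     "Prague"),
    ("http://example.org/id/prague",
     "http://www.w3.org/2002/07/owl#sameAs",
     "http://dbpedia.org/resource/Prague"),
]


def is_http_uri(value: str) -> bool:
    """Principles 1 and 2: a name should be a URI using the HTTP(S) scheme."""
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def external_links(triples, own_host: str) -> list:
    """Principle 4: collect object URIs that point outside the publisher's own host."""
    return [
        obj
        for _, _, obj in triples
        if is_http_uri(obj) and urlparse(obj).netloc != own_host
    ]


subjects_ok = all(is_http_uri(subject) for subject, _, _ in TRIPLES)
links = external_links(TRIPLES, own_host="example.org")
print("all subjects are HTTP URIs:", subjects_ok)
print("links to other datasets:", links)
```

A literal such as "Prague" is not a URI and is therefore skipped when collecting links. Only the owl:sameAs statement connects this small dataset to another one.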

Five Stars of Linked Open Data

Four years after the inception of the original Linked Data Principles, Tim Berners-Lee proposed a more iterative take on publishing linked data in his Five Stars of Linked Open Data scheme. It contains five commandments for data producers that explain how to progressively improve the way their data is published.
  • ★ Publish data on the Web under an open licence (e.g., in PDF).
  • ★★ Publish data in a structured format (e.g., in Excel).
  • ★★★ Publish data in a non-proprietary format (e.g., in CSV).
  • ★★★★ Use URLs to identify data, so that it is linkable (e.g., in RDF).
  • ★★★★★ Link your data to other data to provide context.
A major change in this scheme is the recognition of the importance of open access to data, which is already required in order to earn the first star. The scheme emphasizes that adopting linked data principles creates space for continuous improvement. Data producers can start publishing data at a low up-front cost and subsequently invest more resources towards the goal of joining the pool of linked open data.
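Because each star builds on the previous ones, the scheme can be read as a simple cumulative check. The following sketch is one possible reading, not part of the scheme itself. The Dataset class, its field names, and the star_rating function are hypothetical, and I assume that a dataset earns a star only if it also meets every lower-star criterion.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of how a dataset is published."""
    open_licence: bool      # on the Web under an open licence
    structured: bool        # structured format
    non_proprietary: bool   # non-proprietary format
    uses_uris: bool         # URLs identify the data
    linked: bool            # links to other data


def star_rating(dataset: Dataset) -> int:
    """Count stars in order, stopping at the first criterion that is not met."""
    criteria = [
        dataset.open_licence,
        dataset.structured,
        dataset.non_proprietary,
        dataset.uses_uris,
        dataset.linked,
    ]
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars


# An openly licensed CSV file: structured and non-proprietary, but without URIs.
csv_dataset = Dataset(True, True, True, False, False)
# A well-linked RDF dataset that lacks an open licence earns no stars at all.
closed_rdf = Dataset(False, True, True, True, True)

print(star_rating(csv_dataset))
print(star_rating(closed_rdf))
```

The second example reflects the point that open access is a precondition for the first star: without an open licence, technical quality alone does not count towards the rating.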
There are several renditions of the Five Stars of Linked Open Data scheme besides the one by Tim Berners-Lee himself. For example, Ed Summers was among the first to publish the scheme, and Michael Hausenblas illustrated it with examples along with the costs and benefits associated with each step.

