2012-08-14

What is linked data?

The following post is an excerpt from my thesis entitled Linked open data for public sector information.
Linked data is a publication model for structured data on the Web. The term “linked data” was coined by Tim Berners-Lee in 2006 in a note in which he described the Linked Data Principles.
An essential feature of linked data is materialization of relationships. Linked data makes implicit relationships between the described things explicit by materializing them as data [1, p. 94]. Reified relationships expressed as links thus become part of machine-readable data amenable to automated processing. Traditionally, relationships in data are kept implicit as part of background knowledge, documentation, or software. In such cases, data is integrated at the application level with custom-crafted code or queries, and the effort of discovering relationships in disparate datasets is left to application developers and other data consumers. Materialization of relationships in linked data shifts this integration effort to the data level.
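The contrast between implicit and materialized relationships can be sketched in a few lines of Python. This is only an illustration, not part of the thesis: the records, field names, and `http://example.org/` URIs are all hypothetical, and plain tuples stand in for a real RDF library.

```python
# Hypothetical records from two separate datasets (illustrative only).
contracts = [{"id": "c1", "supplier_no": "42"}]
suppliers = [{"reg_no": "42", "name": "ACME"}]


def join_in_application(contracts, suppliers):
    """Implicit relationship: only this code 'knows' that
    contracts.supplier_no refers to suppliers.reg_no."""
    by_reg_no = {s["reg_no"]: s for s in suppliers}
    return [(c["id"], by_reg_no[c["supplier_no"]]["name"]) for c in contracts]


# Materialized relationship: the link itself is data, stored as a
# (subject, predicate, object) triple. The URIs are assumptions.
EX = "http://example.org/"
triples = {
    (EX + "contract/c1", EX + "vocab/supplier", EX + "supplier/42"),
    (EX + "supplier/42", EX + "vocab/name", "ACME"),
}


def objects(triples, subject, predicate):
    """Generic lookup that needs no knowledge of any particular dataset."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)
```

In the first variant the knowledge that `supplier_no` matches `reg_no` lives in the function body. In the second, the same relationship is an explicit link that any generic tool can read, for example with `objects(triples, EX + "contract/c1", EX + "vocab/supplier")`.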
While the current Web turned out to be mostly a web of documents, linked data drives the growth of a web of data. This web of data may describe not only documents but also data, abstract ideas, or physical objects, along with their materialized relationships. In this way, linked data offers a seamless integration of the web of documents and the web of things into the web of data. Marko Rodriguez suggests that “the web of data may emerge as the de facto medium for data representation, distribution and ultimately, processing” [2, p. 38].
Linked data is a fundamentally distributed publishing model that locates data in heterogeneous data spaces. Unlike the current data stores that may be likened to silos or terminal nodes, linked data spaces are mutually connected via hyperlinks, through which disparate data sources may be defragmented and integrated into a single, virtual global data space. For linked data, relationships with other data expressed via links are of fundamental value. To illustrate this point, in his note about linked data Tim Berners-Lee claims that “the value of your own information is very much a function of what it links to”.
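How links turn separate data spaces into one virtual data space can be illustrated with a small sketch. It assumes two hypothetical publishers (`a.example.org` and `b.example.org`) and again represents data as plain (subject, predicate, object) tuples; the names and the `follow` helper are made up for illustration.

```python
# Two hypothetical data spaces published by different parties.
A = "http://a.example.org/"
B = "http://b.example.org/"
SUPPLIER = A + "vocab/supplier"
NAME = B + "vocab/name"

# Space A links to a resource that is described in space B.
space_a = {(A + "contract/c1", SUPPLIER, B + "supplier/42")}
space_b = {(B + "supplier/42", NAME, "ACME")}


def follow(triples, start, path):
    """Follow a sequence of predicates from a start node and
    return the set of nodes reached at the end of the path."""
    frontier = {start}
    for predicate in path:
        frontier = {o for s, p, o in triples if s in frontier and p == predicate}
    return frontier


# Because both spaces use the same global identifier for the supplier,
# integrating them is a plain set union.
merged = space_a | space_b
```

Taken alone, `space_a` cannot answer who the supplier of the contract is: `follow(space_a, A + "contract/c1", [SUPPLIER, NAME])` returns an empty set. The same call on `merged` reaches `"ACME"`, because the link in space A points to an identifier that space B describes.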
Linked data may be seen as a pragmatic implementation of the vision of the so-called “semantic web”, that is, a web that communicates meaning in a way machines can operate on. Linked data has a mature and well-understood technology stack [3] composed of semantic web technologies. Most of these technologies are developed and standardized at the World Wide Web Consortium (W3C). In the following blog posts, the key technologies for linked data will be introduced: Uniform Resource Identifier (URI) for identifying data, Hypertext Transfer Protocol (HTTP) for interacting with data, and Resource Description Framework (RDF) for representing data.

References

  1. AYERS, Danny. Evolving the link. IEEE Internet Computing. January/February 2007, vol. 11, no. 1, p. 94-96. ISSN 1089-7801.
  2. RODRIGUEZ, Marko A. A reflection on the structure and process of the web of data. Bulletin of the American Society for Information Science and Technology. August/September 2009, vol. 35, no. 6. ISSN 1550-836.
  3. HEATH, Tom; BIZER, Chris. Linked data: evolving the Web into a global data space. 1st ed. Morgan & Claypool, 2011. Also available from WWW: http://linkeddatabook.com/book. ISBN 978-1-60845-430-3. DOI 10.2200/S00334ED1V01Y201102WBE001.
