2012-08-22

Linked data: use

The following post is an excerpt from my thesis entitled Linked open data for public sector information.
The flexible, application-agnostic nature of linked data makes it possible to employ it for a broad spectrum of uses. Linked data does not discriminate between types of use, as “Linked Data principles and publishing guidelines are designed to make structured data more amenable to ad hoc consumption on the Web” [1, p. 13].
Roy Fielding wrote that “the primary mechanisms for inducing reusability within architectural styles is reduction of coupling (knowledge of identity) between components and constraining the generality of component interfaces” [2, p. 35]. Fielding’s REST, covered in the previous blog post about HTTP, is based on uniform interfaces between components and thus abides by this recommendation. However, uniform interfaces come at the cost of efficiency, because such interfaces are optimized for the general case rather than for any particular application [Ibid., p. 82]. Since linked data is based on REST, it inherits this trade-off.
Linked data adopts separation of concerns and decouples content from presentation. In this way, it decouples data from the interfaces of upstream producers and downstream consumers, which enables variability without introducing interoperability costs. Since linked data is not application-specific, it may be used to power all kinds of applications.
Modelling of linked data is based on the reuse of existing models provided by RDF vocabularies and ontologies. A common approach to modelling of linked data is to mix various vocabularies and ontologies at will, cherry-picking their components to build a customized model suited for particular data.
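As a rough illustration of this cherry-picking, the sketch below describes one dataset with terms drawn from several vocabularies (Dublin Core Terms, FOAF, DCAT). It uses plain Python tuples instead of an RDF library, and it does not distinguish URIs from literals. The example.org identifiers and the helper vocabularies_used are assumptions made for the illustration only.

```python
# Namespace URIs of widely used vocabularies.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCTERMS = "http://purl.org/dc/terms/"
FOAF = "http://xmlns.com/foaf/0.1/"
DCAT = "http://www.w3.org/ns/dcat#"

# Hypothetical identifiers, not taken from any real dataset.
dataset = "http://example.org/dataset/budget-2012"
publisher = "http://example.org/org/ministry-of-finance"

# Each triple is (subject, predicate, object).
# Simplification: URIs and literals are both plain strings here.
triples = {
    (dataset, RDF + "type", DCAT + "Dataset"),
    (dataset, DCTERMS + "title", "State budget 2012"),
    (dataset, DCTERMS + "publisher", publisher),
    (publisher, RDF + "type", FOAF + "Organization"),
    (publisher, FOAF + "name", "Ministry of Finance"),
}


def vocabularies_used(graph):
    """Return the namespaces that the predicates of the graph come from."""
    namespaces = set()
    for _, predicate, _ in graph:
        # Split after the last '#' or '/' to separate namespace from local name.
        cut = max(predicate.rfind("#"), predicate.rfind("/")) + 1
        namespaces.add(predicate[:cut])
    return namespaces


mixed = vocabularies_used(triples)
```

Here mixed contains the RDF, Dublin Core Terms and FOAF namespaces: properties from three vocabularies describe the same two resources. DCAT appears only as a class in object position, so it is not counted among the predicates.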
The flexibility of the RDF data model makes it possible to query the data and reconfigure it for a particular use. Semantic web technologies open opportunities for reuse by offering “query interfaces for applications to access public information in a non-predefined way” [3]. This is more difficult to achieve for non-RDF data formats. For example, Fadi Maali argues that “providing the data in a fixed table structure, as in CSV files, makes it harder for consumers to re-arrange the data in a way that best fits their needs” [4, p. 86].
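To make the contrast with a fixed table more concrete, here is a minimal sketch of querying and re-arranging triples in plain Python. It is not SPARQL and not any library's API. The "ex:" names and the functions match and group_by_object are hypothetical and exist only for this illustration.

```python
# A tiny graph as a list of (subject, predicate, object) triples.
# The "ex:" names are hypothetical and only serve the illustration.
graph = [
    ("ex:school1", "ex:locatedIn", "ex:districtA"),
    ("ex:school2", "ex:locatedIn", "ex:districtB"),
    ("ex:school3", "ex:locatedIn", "ex:districtA"),
    ("ex:school1", "ex:pupils", 320),
]


def match(graph, s=None, p=None, o=None):
    """Yield triples that fit the pattern; None acts as a wildcard."""
    for triple in graph:
        if all(want is None or want == got
               for want, got in zip((s, p, o), triple)):
            yield triple


def group_by_object(graph, predicate):
    """Re-arrange triples with one predicate into {object: [subjects]}."""
    grouped = {}
    for subject, _, obj in match(graph, p=predicate):
        grouped.setdefault(obj, []).append(subject)
    return grouped


# The consumer decides on the shape of the result, not the publisher.
schools_by_district = group_by_object(graph, "ex:locatedIn")
about_school1 = list(match(graph, s="ex:school1"))
```

The same triples can be viewed per school or per district without the publisher having anticipated either layout, which is the kind of re-arrangement that a fixed CSV structure makes harder.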
Together, composing data models from parts of data models already known to applications and the flexibility to rearrange the data model to fit the application model facilitate generic consumption. This advantage is particularly manifest when applications combine multiple sources of linked data. Applications of this type are referred to as “meshups” since they are built on data sources that mesh with each other [5, p. 321]. Without linked data, this scenario would require manual integration effort at the application level, whereas linked data is already integrated at the data level.
The following paragraphs explain how linked data meets the concrete criteria for the use of open data.
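Integration at the data level can be sketched as a plain set union. When two sources use the same URI for the same thing, merging them needs no application-specific mapping. The sources, URIs and values below are invented for the illustration, and describe is a hypothetical helper.

```python
# Two hypothetical sources that happen to use the same URI for one area.
AREA = "http://example.org/area/AB1"
VOCAB = "http://example.org/vocab/"

source_a = {
    (AREA, VOCAB + "name", "Example District"),
    (AREA, VOCAB + "population", 15000),
}
source_b = {
    (AREA, VOCAB + "schoolCount", 4),
    (AREA, VOCAB + "name", "Example District"),  # duplicate triple
}

# Merging two sets of triples is a set union; duplicates collapse.
merged = source_a | source_b


def describe(graph, subject):
    """Collect everything known about a subject as {predicate: {objects}}."""
    description = {}
    for s, p, o in graph:
        if s == subject:
            description.setdefault(p, set()).add(o)
    return description


area_profile = describe(merged, AREA)
```

After the union, area_profile holds the name, the population and the school count under one identifier. The sketch assumes the sources already agree on URIs; when they do not, some alignment step is still needed.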

Non-proprietary data formats

RDF is a non-proprietary data format and its specifications are open and free for anyone to inspect and implement.

Standards

Linked data builds on web standards maintained by the W3C or the Internet Engineering Task Force (IETF). For an overview of standard specifications related to linked data see Linked Data Specifications maintained by Michael Hausenblas.

Machine readability

RDF serializations covered in the previous blog post on RDF are machine-readable. Specifications of RDF serializations have well-defined conformance criteria, which facilitate the development of standard parsers and make it possible for data to be validated for conformance, such as with the W3C RDF Validation Service.
RDF data is well-structured with a high level of granularity. RDF may be treated as a graph that can be broken down into individual triples, which allows access to data at a very detailed level.
Linked data makes explicit, machine-readable licensing possible by linking to licences. There are several RDF vocabularies that contain properties to do that, such as the Dublin Core Terms with dcterms:rights. For a structured representation of the licences themselves, the Creative Commons Rights Expression Language may be employed.
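A minimal sketch of how an application might discover a licence link is shown below. It checks dcterms:license first and falls back to dcterms:rights. The former is another Dublin Core Terms property that the text above does not mention, so its use here is an assumption of the example. The dataset URI and the function licence_of are hypothetical.

```python
DCTERMS = "http://purl.org/dc/terms/"

# Hypothetical dataset that links to a Creative Commons licence.
DATASET = "http://example.org/dataset/budget-2012"
CC_BY = "http://creativecommons.org/licenses/by/3.0/"

graph = {
    (DATASET, DCTERMS + "title", "State budget 2012"),
    (DATASET, DCTERMS + "license", CC_BY),
}


def licence_of(graph, resource):
    """Return the first licence or rights statement linked from a resource.

    dcterms:license is preferred; dcterms:rights is the fallback.
    Returns None when neither property is present.
    """
    for prop in (DCTERMS + "license", DCTERMS + "rights"):
        for s, p, o in graph:
            if s == resource and p == prop:
                return o
    return None


licence = licence_of(graph, DATASET)
```

Because the licence is itself identified by a URI, an application can compare it against a list of licences it accepts instead of parsing free-text terms of use.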

Safety

RDF cannot include executable content. Serializations of RDF are textual (with the exception of the proposed Binary RDF [6]), which promotes inspection and eases safety checks. However, the use of RDF in adversarial environments, which raises security problems such as RDF injection and the need for query sanitization, is an area in which little research has been conducted.
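As one small illustration of query sanitization, the sketch below escapes untrusted text before it is placed inside a double-quoted SPARQL string literal. It covers only the characters that cannot appear unescaped in such a literal (backslash, double quote, line feed, carriage return) and should not be read as a complete defence. The function name is hypothetical, and where a SPARQL client offers parameterised queries, those are likely the safer choice.

```python
# Characters that must be escaped inside a double-quoted SPARQL string.
_ESCAPES = {
    "\\": "\\\\",
    '"': '\\"',
    "\n": "\\n",
    "\r": "\\r",
}


def sparql_string_literal(value: str) -> str:
    """Wrap untrusted text as a double-quoted SPARQL string literal."""
    escaped = "".join(_ESCAPES.get(ch, ch) for ch in value)
    return '"' + escaped + '"'


# Input that tries to close the literal and append its own pattern.
user_input = 'Prague" . ?s ?p ?o . #'
query = (
    "SELECT ?city WHERE { ?city <http://example.org/vocab/name> "
    + sparql_string_literal(user_input)
    + " }"
)
```

With the escaping in place, the double quote in user_input stays inside the literal instead of terminating it, so the injected pattern is treated as part of the search string.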

References

  1. HOGAN, Aidan; UMBRICH, Jürgen; HARTH, Andreas; CYGANIAK, Richard; POLLERES, Axel; DECKER, Stefan. An empirical survey of linked data conformance. In Journal of Web Semantics [in print]. 2012. Also available from WWW: http://sw.deri.org/~aidanh/docs/ldstudy12.pdf. ISSN 1570-8268. DOI 10.1016/j.websem.2012.02.001.
  2. FIELDING, Roy Thomas. Architectural styles and the design of network-based software architectures. Irvine (CA), 2000. 162 p. Dissertation (PhD.). University of California, Irvine.
  3. ACAR, Suzanne; ALONSO, José M.; NOVAK, Kevin (eds.). Improving access to government through better use of the Web [online]. W3C Interest Group Note. May 12th, 2009 [cit. 2012-04-06]. Available from WWW: http://www.w3.org/TR/egov-improving/
  4. MAALI, Fadi. Getting to the five-star: from raw data to linked government data. Galway, 2011. Masters thesis (MSc.). National University of Ireland. Digital Enterprise Research Institute.
  5. OMITOLA, Tope; KOUMENIDES, Christos L.; POPOV, Igor O.; YANG, Yang; SALVADORES, Manuel; SZOMSZOR, Martin; BERNERS-LEE, Tim; GIBBINS, Nicholas; HALL, Wendy; SCHRAEFEL, Mc; SHADBOLT, Nigel. Put in your postcode, out come the data: a case study. In AROYO, Lora; ANTONIOU, Grigoris; HYVÖNEN, Eero; TEN TEIJE, Annette; STUCKENSCHMIDT, Heiner; CABRAL, Liliana; TUDORACHE, Tania (eds.). The semantic web: research and applications, 7th Extended Semantic Web Conference, Heraklion, Crete, Greece, May 30 – June 3, 2010, Proceedings, Part I. Heidelberg: Springer, 2010. Lecture notes in computer science, 6088. ISBN 978-3-642-13485-2.
  6. FERNÁNDEZ, Javier D.; MARTÍNEZ-PRIETO, Miguel A.; GUTIERREZ, Claudio; POLLERES, Axel. Binary RDF representation for publication and exchange (HDT) [online]. W3C Member Submission. March 30th, 2011 [cit. 2012-04-24]. Available from WWW: http://www.w3.org/Submission/2011/SUBM-HDT-20110330/
