Concepts of open data

The following post is an excerpt from my thesis entitled “Linked open data for public sector information”.
Open data is a set of practices for publishing data. There is no formally declared and adopted definition of open data, and the concept is not backed by any legal or standards body. Instead, it is essentially a community-driven effort [1, p. 1]. The meaning of the concept of open data stems from a shared discourse in the open data community.
This chapter deals with the core concepts of open data and describes the principles of open data that are guided by these concepts. The following parts cover open data in practice, including general characterizations of policies and implementations of open data in the public sector.
The concept of open data refers, as the name suggests, to two main abstract concepts: openness and data. It denotes the application of the principles of openness to data.


Data is the subject of open data. A description of the elusive notion of data may be constructed by juxtaposing its various facets.
One way to define data is through its content. According to the Suggested Upper Merged Ontology, a “data point” or “datum” is “an item of factual information derived from measurement or research.” Data covers observations, measurements, and records describing physical or social reality. It may also provide models and conceptualizations of reality for describing other data.
Another defining facet that contributes to the meaning of data is its form. First, data is primarily digital, which makes it “computable”; i.e., amenable to automatic computer processing. Second, data imposes a structure on its content that makes it sufficiently formalized to be processed in an automated manner. Whether something is perceived as data depends on the level at which it is used. For example, a sound recording is usually not thought of as data when it is used to convey words, but it may be treated as data when it is being converted to a different file format.
The content and format of data determine its affordances, that is, the uses data makes possible. Data is generative: it is open to a variety of types of use. For example, data is used for preservation, information exchange, or computation.
The types of use are not the only thing that shapes the prevalent perceptions of data. A common-sense interpretation of data is also influenced by the tools we use to work with data: how we store it, represent it, or interface with it. The context in which data is used shapes the mental image of data. For instance, people recall database tables or spreadsheets when they think about data.
Data is characterized by features that make it conducive to being opened. Digital data is easy and inexpensive to copy. Thus, it may be treated as a non-rivalrous public good [2]. Because data users work with their own copies, consumption of data does not diminish the ability of others to consume it. In other words, using data does not make it less useful. In fact, quite the opposite is true, as using data can make it more valuable. For example, one can extract valuable annotations about the ways data is being used, derived from users' implicit participation [3, p. 32]. In light of these properties, openness seems to be innate to data.


Openness is the intellectual foundation of open data. It is the quality of being open, an absence of restrictions and barriers, the goal of which is to achieve equal access and use.
The principle of openness is transdisciplinary. It is the driving force behind several movements, including open data. For example, Holger Kienle lists several related domains in which the concept of “openness” is applied [4]. They include areas such as “open access”, “open content”, “open knowledge”, and “open source”. What these related fields have in common is their concern with open distribution, which is based on the free transfer of digital information on the Web, unencumbered by the restrictions and barriers common in the physical world. In this way, all of these fields, including open data, may be treated as publication models that apply the principles of openness to various domains.
Open access focuses on literature, such as articles and pre-prints, that serves as a source for research. Open source applies the open publication model to software, with the particular aim of enabling free access to the software’s source code. Open content is a more general framework concerning the content of any type of creative work. Content is deemed open if it allows four types of use: reuse, revision, remix, and redistribution. In the same vein, open knowledge deals with any type of knowledge, regardless of its carrier, which is recognized as open only if anyone is free to use, reuse, and redistribute it, subject at most to the requirement to attribute it or share it alike under an analogous licence [5].
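The criterion for open knowledge quoted above can be read as a simple test: every required freedom must be granted, and the only conditions tolerated are attribution and share-alike. The following is a minimal sketch of that reading, not an implementation of any official definition. The `Licence` class, the `is_open` function, and the example licences are hypothetical names introduced here for illustration, and the sketch ignores the requirement that the freedoms apply to "anyone".

```python
from dataclasses import dataclass

# Freedoms that must all be granted, following the criterion quoted in the text.
REQUIRED_FREEDOMS = frozenset({"use", "reuse", "redistribute"})

# The only conditions the criterion tolerates ("at most attribution or share-alike").
ALLOWED_CONDITIONS = frozenset({"attribution", "share-alike"})


@dataclass(frozen=True)
class Licence:
    """Hypothetical, simplified description of a licence (not a real schema)."""
    name: str
    freedoms: frozenset
    conditions: frozenset = frozenset()


def is_open(licence: Licence) -> bool:
    """Return True if the licence grants every required freedom
    and imposes no condition beyond the allowed ones."""
    grants_all_freedoms = REQUIRED_FREEDOMS <= licence.freedoms
    only_allowed_conditions = licence.conditions <= ALLOWED_CONDITIONS
    return grants_all_freedoms and only_allowed_conditions


# Illustrative, made-up licences.
attribution_only = Licence(
    name="example-attribution",
    freedoms=frozenset({"use", "reuse", "redistribute"}),
    conditions=frozenset({"attribution"}),
)
non_commercial = Licence(
    name="example-non-commercial",
    freedoms=frozenset({"use", "reuse", "redistribute"}),
    conditions=frozenset({"attribution", "non-commercial"}),
)

print(is_open(attribution_only))  # attribution alone is within the allowed conditions
print(is_open(non_commercial))    # a non-commercial clause goes beyond them
```

Under this reading, a licence with a non-commercial clause fails the test even though it grants all three freedoms, because it adds a condition beyond attribution and share-alike.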


