2012-08-05

Legal openness of data

The following post is an excerpt from my thesis entitled Linked open data for public sector information.
Legal openness addresses the conditions of use. In other words, it covers what users are allowed to do with the data.
The default conditions of use for open data are declared by law. The main areas of legislation that impact open data include intellectual property rights and database rights [1, p. 138].
The control of intellectual property rights over data depends on the content of data. These rights affect only original creative works. Data, in most cases, does not satisfy this condition. It usually consists of facts and, according to the law, no one can claim ownership of facts. Moreover, data is not a product of creative work [2].
In the case of public sector information, data is a product of the pursuit of the public task. In such a case, public data may be explicitly declared exempt from copyright, as was proclaimed for US public data in the Copyright Act of 1976. The bottom line is that, in many cases, data may not be treated as private property, but rather as a common good.
Whether intellectual property rights are associated with data is an important distinction, because each answer introduces a completely different default state for data. Assessing which rights apply narrows down the ways in which rights holders may modify the conditions of use for data.
The impact of database rights on data is restricted by the law that is valid where the data is produced. Local legislation influences the intellectual property rights over data as well; however, these rights tend to be more universal because they are harmonized by a number of international treaties. Sui generis database rights apply especially in the member states of the European Union. In 1996, the EU issued Directive 96/9/EC on the legal protection of databases [3]. The directive grants rights to the creators of databases, protecting their intellectual contribution to the selection and arrangement of the database contents. The directive has since been transposed into the legal systems of many EU member countries.
With regard to the described rights, open data may in some cases be subject to the requirements of both. The content of data may be eligible for intellectual property rights protection, and the data as a whole may derive its protection from database rights. In such a situation, dual licensing may be applied, providing data content and data structure with different licences, each more appropriate for the given type of licensed work. However, it may prove difficult to find a clear boundary between the parts of data to be licensed separately. Dual licensing also raises the barrier to use of the data, since its users need to know the requirements of both licences. Due to these complexities, it may be easier to handle the legal variations with a universal waiver.
Possibilities for opening data may also be limited by implied contracts, such as exclusive licence agreements. Data bound by contracts may be difficult to work with, because users may not be aware of the contracts’ existence, or may find them difficult to interpret and abide by, especially if they are laypersons. The most usable solution for open data would be to have a single legal document that users need to consult in order to know what the conditions of use are, as explicit and unified rules simplify the use of data.
The legal recommendations found in open data principles usually advise modifying the default conditions under which data is available with a legal instrument that amends the conditions on the basis of contract law, such as a licence or a waiver. This recommendation serves two purposes. First, it provides explicit and comprehensive conditions of use that are valid for the data in question, shielding users from possibly complex and hard-to-interpret law. Second, and crucially for open data, it is the way in which previously restrictive conditions may be made more open by renouncing some rights.
There are two main types of legal tools used to amend the conditions of use of data: licences and waivers. Licences redefine how data may be used in accordance with the producer’s desires and users’ needs. Licences for open data are discussed in the subsequent section.
Waivers serve to waive rights associated with data. The purpose of legal waivers is to reconstruct the conditions of use that apply to works in the public domain. Yet in some countries, such as the Czech Republic, waiving intellectual property rights is not considered a valid legal act. In these countries, works may enter the public domain only naturally and not through a deliberate action. However, licences may be used to emulate the public domain by explicitly setting the same conditions of use.
With law, regulations, licences, and waivers, data producers are able to accomplish legal openness. Legal openness is a necessary precondition for achieving technical openness. Data that is technically open (e.g., online and in a structured format) but not legally open (e.g., with a prohibitive licence) is not open at all. Most of the data that is legally open can be made open in the technological respect, for example by screen-scraping, a technique that extracts data from web pages. In fact, increasing the technical openness of data is an example of reuse that is made possible by open legal conditions of use. In contrast, there is no way in which users of data can achieve legal openness of the data, since only data producers can do that.
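The asymmetry between legal and technical openness described above can be sketched in a few lines of Python. This is only an illustration: the `Dataset` record, its field names, and the two example datasets are assumptions made for the sketch, not part of any standard or tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """Hypothetical record; the field names are assumptions for illustration."""
    name: str
    legally_open: bool      # e.g. released under an open licence or a waiver
    technically_open: bool  # e.g. online and in a structured format


def is_open(dataset: Dataset) -> bool:
    # Technical openness without legal openness does not count as open.
    return dataset.legally_open and dataset.technically_open


def users_can_open(dataset: Dataset) -> bool:
    # Users may improve technical openness themselves (e.g. by screen-scraping),
    # but only the data producer can grant legal openness.
    return dataset.legally_open


# Two invented examples: one legally open but hard to process,
# one easy to process but under a prohibitive licence.
scraped = Dataset("tables on a web page", legally_open=True, technically_open=False)
locked = Dataset("structured feed", legally_open=False, technically_open=True)
```

Here `scraped` is not yet open but can be opened by its users, while `locked` cannot be opened by anyone except its producer. The sketch simplifies the text, which says that most, not all, legally open data can be made technically open.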

References

  1. VAN DER SLOOT, Bart. On the fabrication of sausages, or of open government and private data. eJournal of eDemocracy and Open Government [online]. 2011 [cit. 2012-03-15], vol. 3, no. 2, p. 136 — 154. ISSN 2075-9517. Available from WWW: http://www.jedem.org/article/view/68
  2. MILLER, Paul; STYLES, Rob; HEATH, Tom. Open Data Commons, a license for open data. In BIZER, Christian; HEATH, Tom; IDEHEN, Kingsley; BERNERS-LEE, Tim (eds.). Linked Data on the Web (LDOW 2008): proceedings of the WWW2008 Workshop on Linked Data on the Web, Beijing, China, April 22nd, 2008. Aachen: RWTH Aachen University, 2008. CEUR workshop proceedings, vol. 369. ISSN 1613-0073.
  3. EU. Directive 96/9/EC of the European Parliament and of the Council of 11 March 1996 on the legal protection of databases. Official Journal of the European Union. 1996, vol. 15, L 77, p. 20 — 28. Also available from WWW: http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:L:1996:077:0020:0028:EN:PDF. ISSN 1725-2555

2012-08-04

Concepts of open data

The following post is an excerpt from my thesis entitled Linked open data for public sector information.
Open data is a set of practices for publishing data. There is no formally declared and adopted definition of open data and it is not backed by any legal or standards body. Instead, it is essentially a community-driven effort [1, p. 1]. The meaning of the concept of open data stems from a shared discourse in the open data community.
This chapter deals with the core concepts of open data and describes the principles of open data that are guided by these concepts. The following parts cover open data in practice, including general characterizations of policies and implementations of open data in the public sector.
The concept of open data refers, as the name suggests, to two main abstract concepts. It denotes the application of the principles of openness to data.

Data

Data is the subject of open data. A description of the elusive notion of data may be constructed by juxtaposing its various facets.
One way of defining data is through its content. According to the Suggested Upper Merged Ontology, a “data point” or “datum” is “an item of factual information derived from measurement or research.” Data covers observations, measurements, and records describing physical or social reality. It may also provide models and conceptualizations of reality for describing other data.
Another defining facet that contributes to the meaning of data is its form. First, data is primarily digital, which makes it “computable”; i.e., amenable to automatic computer processing. Data imposes a structure on its content that makes it sufficiently formalized to allow for processing in an automated manner. The perception of this attribute depends on the level at which data is used. For example, while a sound recording is usually not thought of as data if it is used to convey words, it may be treated as data if it is being converted to a different file format.
The content and format of data determine its affordances, that is, the uses data makes possible. Data is generative, open to a variety of types of use. For example, data is used for preservation, information exchange, or computation.
It is not only the types of use that shape the prevalent perceptions of data. A common-sense interpretation of data is influenced by the tools we use to work with it: how we store it, represent it, or interface with it. The context in which data is used shapes the mental image of data. For instance, people recall database tables or spreadsheets when they think about data.
Data is characterized by features that make it conducive to being opened. Digital data is easy and inexpensive to copy. Thus, it may be treated as a non-rivalrous, public good [2]. Because data users work with their own copy, consumption of data does not diminish the ability of others to consume it. In other words, using data does not make it less useful. In fact, quite the opposite is true, as using data can make it more valuable. For example, one can extract valuable annotations about the ways data is being used, based on implicit participation [3, p. 32]. In the light of these properties, openness seems to be innate to data.

Openness

Openness is the intellectual foundation of open data. It is a quality of being open, an absence of restrictions and barriers, the goal of which is to achieve equal access and use.
The principle of openness is transdisciplinary. It is the driving force behind several movements, including open data. For example, Holger Kienle lists several related domains in which the concept of “openness” is applied [4]. They include areas such as “open access”, “open content”, “open knowledge”, and “open source”. What these related fields have in common is their concern with an open way of distribution, which is based on the free transfer of digital information on the Web, unencumbered by any common restrictions and barriers of the physical world. In this way, all of these fields, including open data, may be treated as publication models that apply the principles of openness to various domains.
Open access focuses on literature, such as articles and pre-prints, that serves as a research source. Open source applies the open publication model to software, with the particular aim of enabling free access to the software’s source code. Open content is a more general framework concerning the content of any type of creative work. Content is deemed open if it allows four types of use: reuse, revision, remix, and redistribution. In the same vein, open knowledge deals with any type of knowledge, notwithstanding its carrier, that is recognized as open only if anyone is free to use, reuse and redistribute it, requiring at most attribution or sharing it alike, under an analogous licence [5].

References

  1. HOWARD, Alex. Data for the public good [ebook]. 1st ed. Sebastopol: O’Reilly, 2012. ISBN 978-1-449-32976-1. Available from WWW: http://shop.oreilly.com/product/0636920025580.do
  2. EAVES, David. UK adopts open government license for everything: why it’s good and what it means [online]. October 1st, 2010 [cit. 2012-04-02]. Available from WWW: http://eaves.ca/2010/10/01/uk-adopts-open-government-license-for-everything-why-its-good-and-what-it-means/
  3. LATHROP, Daniel; RUMA, Laurel (eds.). Open government: collaboration, transparency, and participation in practice. Sebastopol: O’Reilly, 2010. ISBN 978-0-596-80435-0.
  4. KIENLE, Holger M. Open data: reverse engineering and maintenance perspective [online]. February 8th, 2012 [cit. 2012-03-08]. Available from WWW: http://arxiv.org/abs/1202.1656
  5. Open definition [online]. Version 1.1. November 2009 [cit. 2012-03-17]. Available from WWW: http://opendefinition.org/okd/

2012-08-03

Pricing models for disclosure of public sector information

The following post is an excerpt from my thesis entitled Linked open data for public sector information.
The disclosure of information might be subject to a charge. However, making access to public sector information conditional on payment may constitute a fundamental barrier.
The models for pricing public sector information may be divided into three groups. In the first model, cost recovery, public bodies act as private companies and try to recover the costs incurred in information production. In the second, the marginal cost model, public bodies charge only to recover the cost of information provision. To adopt the third model, open access, is to cease charging altogether and not require users of information to pay any price.

Cost recovery model

Public sector institutions are usually free to recoup some costs by charging users that access their information [1, p. 11]. When they employ the cost recovery pricing, they essentially behave the same way as for-profit companies.
Aside from the benefit of public bodies being able to sustain themselves, this model introduces a number of challenges. First, it discriminates against those that cannot afford to pay for access to the information of their interest. For example, full cost recovery may have an adverse effect on small and medium-sized enterprises that do not have the necessary resources to obtain the information they need in order to pursue their business plan. Second, a large share of the consumers of public sector information are other public sector bodies. If full cost recovery is demanded from public bodies, it reduces public sector information to an instrument for reallocating public funding.

Marginal cost model

Marginal cost pricing recoups only the costs of information provision. It is derived from the marginal cost of distribution, which reflects the cost of providing one further unit. This pricing model is recommended by the EU directive on the re-use of public sector information [2]. If public bodies adopt the marginal cost pricing model and start charging less for their information, they might see a surge of interest in the information, which might lead to a greater total income than when the bodies employ the full cost recovery model. The use of the Web brings this pricing model close to the model that applies no prices, because on the Web the marginal cost of distribution covering the reproduction of information is essentially zero.
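The difference between the pricing models can be made concrete with a small Python sketch. All figures are invented for illustration, and the way the production cost is spread over expected copies is an assumption about how cost recovery might be calculated, not a rule taken from the directive.

```python
def price_per_copy(model: str, production_cost: float,
                   marginal_cost: float, expected_copies: int) -> float:
    """Return a unit price under one of the three pricing models.

    The formulas are simplifying assumptions made for this sketch.
    """
    if model == "cost_recovery":
        # Spread the cost of producing the information over all expected
        # copies, then add the cost of providing each copy.
        return production_cost / expected_copies + marginal_cost
    if model == "marginal_cost":
        # Charge only the cost of providing one further copy.
        return marginal_cost
    if model == "open_access":
        # No charge at all.
        return 0.0
    raise ValueError(f"unknown pricing model: {model}")


def total_income(price: float, copies_sold: int) -> float:
    return price * copies_sold


# Invented scenario: production costs 10000, each copy costs 0.5 to provide.
full = price_per_copy("cost_recovery", 10000.0, 0.5, 100)
marginal = price_per_copy("marginal_cost", 10000.0, 0.5, 100)

# Assumed demand: 10 buyers at the full price, 4000 at the marginal price.
income_full = total_income(full, 10)
income_marginal = total_income(marginal, 4000)
```

With these assumed numbers, the lower price yields the higher total income, which is the effect the paragraph above describes as possible. Whether it occurs in practice depends on how strongly demand responds to price, which the sketch does not model.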

Open access model

In the open access model, a public body does not require payment for providing information to the public. This approach entails a significant reduction of the friction and administrative overhead associated with each individual transaction of public sector information. It is a non-discriminatory model, since it makes access to information independent of the user’s budget.
A common argument for no pricing is that public sector information has already been paid for from tax revenue and thus there should not be any additional charges [3, p. 55]. Pricing for information is seen as inconsistent with the established way of funding public sector bodies. The public sector should not run a business, and some contend that the civil service is too inflexible to do so [4].
Several alternative models to recover partial costs were proposed to substitute for the direct cost recovery. For example, one model suggested imposing a levy on requests for updates of public data, such as in business registers [1, p. 27].

References

  1. VICKERY, Graham. Review of the recent developments on PSI re-use and related market developments [online]. Final version. Paris, 2011 [cit. 2012-04-19]. Available from WWW: http://ec.europa.eu/information_society/policy/psi/docs/pdfs/report/psi_final_version_formatted.docx
  2. EU. Directive 2003/98/EC of the European Parliament and of the Council of 17 November 2003 on the re-use of public sector information. Official Journal of the European Union. 2003, vol. 46, L 345, p. 90 — 96. Also available from WWW: http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:L:2003:345:0090:0096:EN:PDF. ISSN 1725-2555.
  3. Beyond access: open government data & the right to (re)use public information [online]. Access Info Europe, Open Knowledge Foundation, January 7th, 2011 [cit. 2012-04-15]. Available from WWW: http://www.access-info.org/documents/Access_Docs/Advancing/Beyond_Access_7_January_2011_web.pdf
  4. ARTHUR, Charles; CROSS, Michael. Give us back our crown jewels. Guardian [online]. March 9th, 2006 [cit. 2012-03-09]. Available from WWW: http://www.guardian.co.uk/technology/2006/mar/09/education.epublic

2012-08-02

Disclosure of public sector information

The following post is an excerpt from my thesis entitled Linked open data for public sector information.
The regulations require public bodies to take on an obligation of providing access to information they possess. The EU directive on the re-use of public sector information holds the disclosure of public sector information to be a “fundamental instrument for extending the right to knowledge, which is a basic principle of democracy” [1, p. 92]. In the light of this assertion, public bodies should ensure wide dissemination and long-term preservation of the information they produce.

Scope of disclosure

Public sector information is an umbrella term for all content produced by public bodies [2, p. 5]. Nonetheless, there are several exceptions to this rule, when defining the information that should be disclosed.
Public sector information covers any non-personal data held, collected or produced by a public body as a part of the public task, with the exception of the information relating to national security [3, p. 6]. Therefore, disclosure of public sector information should not apply to information that would abrogate individual privacy rights or endanger national security [4]. However, when left unquestioned, the goal of national security may lead public sector bodies to be overprotective of some data. For example, for some time in the US data about dams were not available due to the fear of misuse for terrorist attacks [5, p. 330].
In the EU, several types of public sector information are exempted from the requirement of disclosure. Public sector information held by cultural heritage institutions, such as libraries, museums, and archives, currently falls under a different regime. It often has different qualities than the information from other parts of the public sector. This type of information is mostly static, held as a record, and not directly associated with the pursuit of public tasks [6, p. 7]. Similarly, public broadcasting and research information generated by educational institutions are usually exempt from the scope of the definition of public sector information. However, apart from the exceptions listed individually, all public sector information is subject to the requirement of disclosure.

Types of disclosure

The approaches to disclosure of public sector information are usually categorized either according to the extent of disclosed information or by the activity of the public body.
The information that gets released might be limited to a summary of the full information the public body possesses. Summary disclosure is used for informing about the decisions made by public bodies. On the other hand, full disclosure is used for informing the decisions of the public. For example, in the case of elections, the decisions of members of the public are based on information from public bodies. Based on the source of the initiative that drives the disclosure, there are two models of information provision in the public sector: reactive and proactive [7, p. 155].

Reactive disclosure

Reactive disclosure is an on-demand, passive dissemination of public sector information that “implies an (enforceable) right for a subject to access to information on request” [8]. It institutes a permission culture of freedom of information requests. Joshua Tauberer criticizes reactive disclosure, because it provides only “a very narrow view of the public sector that is based on the requested snap-shots of data” [9]. This model is characterized by strong information control and a lack of high-level political and bureaucratic support for open government; as such, reactive disclosure is unsuitable for realizing the vision of open government.

Proactive disclosure

Proactive disclosure is an active dissemination of public sector information that “means that the information is publicly available on the basis of a direct initiative of the public body” [8]. This type of disclosure may also be referred to as “suo motu” disclosure, from the Latin for “upon its own initiative” [10, p. 69]. Proactive disclosure thus requires a switch from “presumption of non-disclosure to presumption of openness” [Ibid., p. 66]. With such a presumption, public sector information is thought of as a public resource, as something to be shared. This way of disclosure is “suited for mediators” [9], who can transform the information and add value to it. An example of a model for proactive disclosure is open data, which will be discussed further in greater detail.

References

  1. EU. Directive 2003/98/EC of the European Parliament and of the Council of 17 November 2003 on the re-use of public sector information. Official Journal of the European Union. 2003, vol. 46, L 345, p. 90 — 96. Also available from WWW: http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:L:2003:345:0090:0096:EN:PDF. ISSN 1725-2555.
  2. SCHELLONG, Alexander; STEPANETS, Ekaterina. Unchartered waters: the state of open data in Europe [online]. CSC, 2011 [cit. 2012-04-12]. Public sector study series, 01/2011. Available from WWW: http://assets1.csc.com/de/downloads/CSC_policy_paper_series_01_2011_unchartered_waters_state_of_open_data_europe_English_2.pdf
  3. YIU, Chris. A right to data: fulfilling the promise of open public data in the UK [online]. Research note. March 6th, 2012 [cit. 2012-03-06]. Available from WWW: http://www.policyexchange.org.uk/publications/category/item/a-right-to-data-fulfilling-the-promise-of-open-public-data-in-the-uk
  4. GIGLER, Bjorn-Soren; CUSTER, Samantha; RAHEMTULLA, Hanif. Realizing the vision of open government data: opportunities, challenges and pitfalls [online]. World Bank, 2011 [cit. 2012-04-11]. Available from WWW: http://www.scribd.com/WorldBankPublications/d/75642397-Realizing-the-Vision-of-Open-Government-Data-Long-Version-Opportunities-Challenges-and-Pitfalls
  5. LATHROP, Daniel; RUMA, Laurel (eds.). Open government: collaboration, transparency, and participation in practice. Sebastopol: O’Reilly, 2010. ISBN 978-0-596-80435-0.
  6. VICKERY, Graham. Review of the recent developments on PSI re-use and related market developments [online]. Final version. Paris, 2011 [cit. 2012-04-19]. Available from WWW: http://ec.europa.eu/information_society/policy/psi/docs/pdfs/report/psi_final_version_formatted.docx
  7. FRANCOLI, Mary. What makes governments ‘open’?: sketching out models of open government. eJournal of eDemocracy and Open Government [online]. 2011 [cit. 2012-03-15], vol. 3, no. 2, p. 152 — 165. ISSN 2075-9517. Available from WWW: http://www.jedem.org/issue/view/5
  8. SOLDA-KUTZMANN, Donatella. Public sector information: a market without failure? In Share-PSI Workshop: Removing the Roadblocks to a Pan-European Market for Public Sector Information Re-use [online]. 2011 [cit. 2012-03-09]. Available from WWW: http://share-psi.eu/submitted-papers/
  9. TAUBERER, Joshua. Open government data: principles for a transparent government and an engaged public [online]. 2012 [cit. 2012-03-09]. Available from WWW: http://opengovdata.io/
  10. Beyond access: open government data & the right to (re)use public information [online]. Access Info Europe, Open Knowledge Foundation, January 7th, 2011 [cit. 2012-04-15]. Available from WWW: http://www.access-info.org/documents/Access_Docs/Advancing/Beyond_Access_7_January_2011_web.pdf

2012-08-01

Legal aspects of public sector information

The following post is an excerpt from my thesis entitled Linked open data for public sector information.
Public sector information is subject to jurisprudence based on different sources of law and regulations endowed with legal power. The law relevant for the disclosure of public sector information comes from multiple regulators and as such is a combination of both international law, including conventions or EU directives, and national law [1]. As a result, the conditions governing public sector information may be composed of rules coming from multiple layers. In effect, the legislation related to public sector information may pose equivocal requirements and ordinances that are difficult to adhere to.
The right of access to public sector information stems from a basic human freedom to seek and impart information. The right to information is enshrined in at least 50 national constitutions [2, p. 62]. Dedicated acts formalizing the right of access to public sector information are established in a large share of the countries that acknowledge the freedom of information.
The first legal act on access to public sector information, entitled “Freedom of the Press Act”, was passed in 1766 in Sweden [2, p. 57]. The right to know what the proceedings of the public sector are was recognized as early as 1969 by the Japanese Supreme Court [3]. Other countries followed suit by establishing the right to know and the right of access to information as part of citizens’ rights. During the following decades the adoption was rather slow, and in the mid-1980s only 11 countries had freedom of information laws [4, p. 264]. However, this area then experienced a sudden growth of interest, paired with an increasing number of countries recognizing the importance of access to information. By 2004 the number of countries that had enacted a freedom of information law increased to 59 [Ibid., p. 264].
The prevailing presumption in favour of secrecy has shifted to a presumption favouring maximum disclosure and public sector information that is open by default [5, p. 23]. In many countries, the default settings for access to public sector information have changed. Accessing public sector information is no longer perceived as a privilege; it is a right [6, p. 8].
My thesis focuses on the legal situation for public sector information in the European Union and its member countries. The EU legislation is most relevant for the European context, in which the thesis is situated, and it can prove to be a valid model for an official public policy that establishes rules for the domain of public sector information. Thus, the EU legislation is treated in more detail.

Legislation in the European Union

In the EU, public sector information legislation consists of EU directives and their local transpositions, which weave the directives’ regulations into the national law of the member countries. A key directive covering public sector information is the EU directive on the re-use of public sector information [7]. The directive “establishes a minimum set of rules governing the re-use and the practical means of facilitating re-use of existing documents held by public sector bodies” [Ibid., p. 93]. It requires public bodies to provide a mechanism for members of the public to request access to information produced by the bodies. The overarching tenet of the directive is non-discrimination, which manifests itself in stipulations including the prohibition of arrangements that grant exclusive rights of access to a particular entity, and the recommendation of marginal cost charging.
The planned amendment of the directive [8] extends the scope of public sector information to include information from the cultural heritage sector, such as libraries, archives, and museums. Furthermore, it strives to conflate the right to access with the right to reuse. It changes the charging models by declaring the marginal cost of reproduction the new default, while requiring public bodies that continue to charge more to provide a solid justification. The amendment also deals with the enforcement of the directive and proposes to set up an independent authority to oversee compliance with the principles of disclosure.

References

  1. KOUMENIDES, Christos L.; SALVADORES, Manuel; ALANI, Harith; SHADBOLT, Nigel R. Global integration of public sector information. In Proceedings of the WebSci10: Extending the Frontiers of Society On-line, April 26 — 27th, 2010, Raleigh (NC), US. Raleigh, 2010.
  2. Beyond access: open government data & the right to (re)use public information [online]. Access Info Europe, Open Knowledge Foundation, January 7th, 2011 [cit. 2012-04-15]. Available from WWW: http://www.access-info.org/documents/Access_Docs/Advancing/Beyond_Access_7_January_2011_web.pdf
  3. MENDEL, Toby. Freedom of information: an internationally protected human right. Comparative Media Law Journal. 2003, no. 1. Also available from WWW: http://www.juridicas.unam.mx/publica/rev/comlawj/cont/1/cts/cts3.htm
  4. BERTOT, John C.; JAEGER, Paul T.; GRIMES, Justin M. Using ICTs to create a culture of transparency: e-government and social media as openness and anti-corruption tools for societies. Government Information Quarterly. July 2010, vol. 27, iss. 3, p. 264 — 271. DOI 10.1016/j.giq.2010.03.001.
  5. LATHROP, Daniel; RUMA, Laurel (eds.). Open government: collaboration, transparency, and participation in practice. Sebastopol: O’Reilly, 2010. ISBN 978-0-596-80435-0.
  6. KUNDRA, Vivek. Digital fuel of the 21st century: innovation through open data and the network effect [online]. President and Fellows of Harvard College, 2012 [cit. 2012-03-15]. Discussion Paper Series, no. D-70. Available from WWW: http://shorensteincenter.org/wp-content/uploads/2012/03/d70_kundra.pdf
  7. EU. Directive 2003/98/EC of the European Parliament and of the Council of 17 November 2003 on the re-use of public sector information. Official Journal of the European Union. 2003, vol. 46, L 345, p. 90 — 96. Also available from WWW: http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:L:2003:345:0090:0096:EN:PDF. ISSN 1725-2555.
  8. EU. Proposal for a Directive of the European Parliament and of the Council amending Directive 2003/98/EC on re-use of public sector information [online]. Brussels, December 12th, 2011 [cit. 2012-04-30]. COM (2011) 877. 2011/0430/COD. Available from WWW: http://ec.europa.eu/information_society/policy/psi/docs/pdfs/directive_proposal/2012/proposal_directive.pdf

2012-07-31

What is public sector information?

The following post is an excerpt from my thesis entitled Linked open data for public sector information.
Access to the proceedings of the public sector is a fundamental underpinning of democracy. “Quality of public discussion would be significantly impoverished without the nourishment of information from public authorities” [1]. Moreover, economic and research activities in the private sector would be vastly impoverished if public sector information were kept concealed within the public sector. Reuse of public sector information in the private sector is a pivotal goal of its disclosure.
The disclosure of public sector information constitutes the subject matter of my thesis. In this blog post I try to delineate the scope of the domain described in the thesis by providing its basic conceptualization, along with lexical and extensional definitions of the concepts involved. To cater for this goal, this introductory post is concerned with definitions, describing what the concept of “public sector information” covers.
First, how can the borders of the public sector be circumscribed? Boundaries of the public sector are demarcated by private ownership. The institutions the public sector consists of are not private property [2, p. 5]. Instead, the public sector is publicly owned.
Other definitions of the public sector employ the viewpoints of policy control or financial control. A common way to define the public sector in law is an extensional definition that enumerates the public bodies falling within its scope.
However, the boundary between the public and the private sector is getting blurry, since many functions traditionally performed by public bodies have been outsourced through public-private partnerships. The public sector may also start to take on some characteristics of the private sector, such as its models of financial management.
The public sector consists of public bodies. A public body is an institution with legal personality that belongs to the public sector. It is set up under law by the state or another public sector body. Public bodies are established for the specific purpose of meeting needs in the general interest. They do not have a commercial character, and so the majority of their budgets is funded from tax revenue [3, p. 55]. Among the public bodies deemed most important from the perspective of the data they produce are cadastral offices, mapping agencies, statistical offices, and company registrars [4, p. 10].
Public bodies produce public sector information, or public data, which is the subject matter of this chapter. UK Public data transparency principles offer a working definition of “public data”. Public data is thought of as “the objective, factual, non-personal data on which public services run and are assessed, and on which policy decisions are based, or which is collected or generated in the course of public service delivery”. It is usually a by-product of the delivery of functions of public sector bodies, which makes it serve as an official public record as well [5]. The term “public sector data” is in most contexts used in the same way as “government data”, and the two can thus be treated as synonymous.
Given the generic definition of public sector information, enumerating all of the types of public data would be unnecessary. Instead, a few prototypical examples will be mentioned. In 2010, a survey by Socrata identified several high-value categories of data. Among the top-ranked categories were data about public safety, revenues and expenditures, and education. The most commonly used data categories in publicdata.eu, a catalogue of Europe’s public data, are “Finance and budgeting”, “Social questions”, and “Education and communication”. Other frequently mentioned types of public data include statistical and geospatial data, which are particularly important from the perspective of their reuse by businesses. Paul Clarke sorted public data into four categories:
  • Historical data, such as statistics
  • Planning data, including legal regulations in progress
  • Infrastructural data, for example, reference concepts such as postcodes
  • Operational data, covering real-time streaming data, e.g., traffic situation
Governments collect data on a plethora of topics, some of which may look obscure, such as the statistics of people injured by vending machines in the US [6]. Nevertheless, the collection of every dataset should be justified by its function in fulfilling the requirements of the public task and by its contribution as a source of improvements, such as increasing the safety of vending machines in the aforementioned example. The scope of public sector information follows the function of the public sector.

References

  1. MENDEL, Toby. Freedom of information: an internationally protected human right. Comparative Media Law Journal. 2003, no. 1. Also available from WWW: http://www.juridicas.unam.mx/publica/rev/comlawj/cont/1/cts/cts3.htm
  2. LIENERT, Ian. Where does the public sector end and the private sector begin? [online]. June 1st, 2009 [cit. 2012-04-29]. IMF working paper, no. 09/122. Available from WWW: http://www.imf.org/external/pubs/ft/wp/2009/wp09122.pdf
  3. The Council of the European Communities. Council Directive 93/37/EEC of 14 June 1993 concerning the coordination of procedures for the award of public works contracts. Official Journal of the European Communities. August 9th, 1993, vol. 36, L 199, p. 54–84. Also available from WWW: http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=CELEX:31993L0037:EN:PDF. ISSN 0378-6978.
  4. VICKERY, Graham. Review of the recent developments on PSI re-use and related market developments [online]. Final version. Paris, 2011 [cit. 2012-04-19]. Available from WWW: http://ec.europa.eu/information_society/policy/psi/docs/pdfs/report/psi_final_version_formatted.docx
  5. American Library Association. Key principles of government information [online]. Chicago, 1997–2012 [cit. 2012-04-07]. Available from WWW: http://www.ala.org/advocacy/govinfo/keyprinciples
  6. LOVLEY, Erika. The government has a database for most everything. Politico [online]. June 24th, 2009 [cit. 2012-04-07]. Available from WWW: http://www.politico.com/news/stories/0609/24118.html

2012-07-30

Linked open data for public sector information: sharing my thesis

The public sector records data about what it does and about the environment in which it operates. Nowadays, improved and automated ways of data collection lead to growth in the volume of data that is available in the public sector. Digitization allows the recorded data to be stored in a way that scales. Presently, researchers estimate that more than 5 exabytes are stored online every day [1]. Fortunately, there are scalable technologies for data storage and retrieval at our disposal.

The Web enables zero-cost reproduction of digital information, which makes it possible to share the information in a frictionless manner. Building on the premise that data deemed useful for the public sector is useful for the private sector as well, online exchange of public sector data allows its value to be maximized by reaching members of the public who may recycle it and reuse it for their own purposes. In fact, the increased access to and reuse of the disclosed public data is driven by the technologies that make it feasible [2].

Digital data may be represented in structured ways that make it machine-readable. Raw, machine-readable representations of data are amenable to automated processing and help retain the generative value of data, so that people and computers might use the data in ways that were not predefined. Machine readability makes possible a wide array of interactions with data that go far beyond displaying it. In this way, disclosure of public sector data in a machine-readable format allows members of the public to find new uses for the data.
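
To make the point about machine readability more tangible, here is a minimal sketch in Python. It is only an illustration: the dataset, its column names and all the figures are made up, and the helper `total_by` is a hypothetical name rather than part of any library. It assumes that a public body has published its spending as CSV, and shows that the same raw rows can be regrouped in ways the publisher did not have to anticipate.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample of a published dataset; the columns and figures are
# made up for illustration and do not come from any real source.
RAW_CSV = """department,category,amount
Transport,Roads,1200
Transport,Rail,800
Education,Schools,1500
Education,Libraries,300
"""


def total_by(rows, key, value="amount"):
    """Sum the numeric `value` column for each distinct value of `key`."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += int(row[value])
    return dict(totals)


# Parse the raw text into a list of dictionaries, one per data row.
rows = list(csv.DictReader(io.StringIO(RAW_CSV)))

# The same rows answer two different questions without any change
# to how the data was published.
by_department = total_by(rows, "department")
by_category = total_by(rows, "category")

print(by_department)
print(by_category)
```

Had the same figures been released only as a scanned report or a rendered chart, neither grouping could be computed without retyping the numbers first.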

Adoption of the available technologies for data representation and storage may prove to have a disruptive effect on the public sector. Graham Vickery emphasizes two technological developments that, in his opinion, completely redefined the possibilities for public sector information [3, p. 6]. First, he points to the technologies that enable digitization of public resources. Second, he highlights the role of broadband telecommunications that enable better access to public sector information.

The technologies for representing and exchanging data constitute the basic components for open disclosure of data. Open access to public sector data is considered a key ingredient of a government that is open. Open government is “the notion that the people have the right to access the documents and proceedings of government” [4, p. xix], which is necessary for an open society that “reflects the universal values of intellectual autonomy, equality and trust” [5, p. 8]. Coupled with the demand for openness of the public sector, the technologies stimulated numerous initiatives promoting open data worldwide. Open data is a set of practices for data disclosure that strives to provide for equal access to and equal use of the data.

The foundations of open data draw on related approaches. Driven by the recognition of freedom of information as a basic human right, open data transposes the principles of open access, close to those of open source, onto data. It complements the adoption of e-government, which promotes the use of information and communication technologies to improve government processes, and coincides with the call for government 2.0, which makes better use of online collaborative technologies to create a more participatory government.

The application of open data, and more specifically linked open data, to the information held by public sector bodies constitutes the main theme of my diploma thesis titled Linked open data for public sector information, of which I am going to share excerpts here, in the form of blog posts. I have decided to publish it in this way because it allows me to share short and focused pieces on specific topics rather than just publishing the whole thesis. I think of it as re-contextualization: information flows differently on the Web than in academia.

In the thesis, public sector information represents the content to which the principles of open data are applied using the technologies recommended by the linked data publication model. The goal of my thesis is twofold. The first part explores the competitive advantage of linked data for the release of public sector information under the terms of open data principles. The second part extrapolates the impact and challenges associated with the adoption of linked open data for public sector information.

I hope you will find it useful.

You can find the original full text of the thesis here.

Table of contents

  1. What is public sector information?
  2. Legal aspects of public sector information
  3. Disclosure of public sector information
  4. Pricing models for disclosure of public sector information
  5. Concepts of open data
  6. Legal openness of data
  7. Licences for open data
  8. Principles of open data: accessibility
  9. Principles of open data: use
  10. Qualities of open data
  11. Open data policies
  12. Open data for public sector information
  13. Open data infrastructure of the public sector
  14. Open data as a platform
  15. What is linked data?
  16. Technologies of linked data: URIs
  17. Technologies of linked data: HTTP
  18. Technologies of linked data: RDF
  19. Linked data principles
  20. Linked data: discoverability
  21. Linked data: accessibility
  22. Linked data: permanence
  23. Linked data: use
  24. Linked data: quality
  25. Linked open data in the public sector
  26. Impact of open data
  27. Impacts of open data: transparency
  28. Impacts of open data: accountability
  29. Impacts of open data: efficiency
  30. Impacts of open data: disintermediation
  31. Impacts of open data: participation
  32. Impacts of open data: business
  33. Impacts of open data: journalism
  34. Challenges of open data
  35. Challenges of open data: implementation
  36. Challenges of open data: information overload
  37. Challenges of open data: usability
  38. Challenges of open data: data literacy
  39. Challenges of open data: misinterpretation
  40. Challenges of open data: privacy
  41. Challenges of open data: data quality
  42. Challenges of open data: trust
  43. Challenges of open data: procured data
  44. Challenges of open data: summary

References

  1. WRUUCK, Patricia. 2012: the year of big data. European Public Policy Blog [online]. Brussels, May 1st, 2012 [cit. 2012-05-01]. Available from WWW: http://googlepolicyeurope.blogspot.com/2012/05/2012-year-of-big-data.html
  2. BERNERS-LEE, Tim; SHADBOLT, Nigel. Our manifesto for government data. Guardian Datablog [online]. January 21st, 2010 [cit. 2012-04-07]. Available from WWW: http://www.guardian.co.uk/news/datablog/2010/jan/21/timbernerslee-government-data
  3. VICKERY, Graham. Review of the recent developments on PSI re-use and related market developments [online]. Final version. Paris, 2011 [cit. 2012-04-19]. Available from WWW: http://ec.europa.eu/information_society/policy/psi/docs/pdfs/report/psi_final_version_formatted.docx
  4. LATHROP, Daniel; RUMA, Laurel (eds.). Open government: collaboration, transparency, and participation in practice. Sebastopol: O'Reilly, 2010. ISBN 978-0-596-80435-0.
  5. HALONEN, Antti. Being open about data: analysis of the UK open data policies and applicability of open data [online]. Report. London: Finnish Institute, 2012 [cit. 2012-04-05]. Available from WWW: http://www.finnish-institute.org.uk/images/stories/pdf2012/being%20open%20about%20data.pdf