2012-07-31

What is public sector information?

The following post is an excerpt from my thesis entitled Linked open data for public sector information.
Access to proceedings of the public sector is a fundamental underpinning of democracy. “Quality of public discussion would be significantly impoverished without the nourishment of information from public authorities” [1]. Moreover, economic and research activities in the private sector would be vastly impoverished if public sector information were kept concealed within the public sector. Reuse of public sector information in the private sector is a pivotal goal of its disclosure.
The disclosure of public sector information is the subject matter of my thesis. In this blog post I try to delineate the scope of the domain described in the thesis by providing its basic conceptualization, along with lexical and extensional definitions of the concepts involved. This introductory post therefore concentrates on definitions and describes what the concept of “public sector information” covers.
First, how can the borders of the public sector be circumscribed? Boundaries of the public sector are demarcated by private ownership. The institutions the public sector consists of are not private property [2, p. 5]. Instead, the public sector is publicly owned.
Other definitions of the public sector employ the viewpoints of policy control or financial control. A common way to define the public sector in law is an extensional definition that enumerates the public bodies falling within its scope.
However, the boundary between the public and private sectors is becoming blurred, since many of the functions traditionally performed by public bodies have been outsourced within public-private partnerships. The public sector may also start to take on some characteristics of the private sector, such as its models of financial management.
The public sector consists of public bodies. A public body is an institution with legal personality that belongs to the public sector. It is set up under law by the state or by another public sector body. Public bodies are established for the specific purpose of meeting needs in the general interest. They do not have a commercial character, and so the majority of their budgets is funded from tax revenue [3, p. 55]. Among the public bodies deemed most important from the perspective of the data they produce are cadastral offices, mapping agencies, statistical offices, and company registrars [4, p. 10].
Public bodies produce public sector information, or public data, which is the subject matter of this post. The UK Public data transparency principles offer a working definition of “public data”. Public data is thought of as “the objective, factual, non-personal data on which public services run and are assessed, and on which policy decisions are based, or which is collected or generated in the course of public service delivery”. It is usually a by-product of the delivery of functions of public sector bodies, which makes it serve as an official public record as well [5]. The term “public sector data” is in most contexts used in the same way as “government data”, and the two can thus be treated as synonymous.
Given the generic definition of public sector information, enumerating all types of public data would be unnecessary. Instead, a few prototypical examples will be mentioned. In 2010, a survey by Socrata identified several high-value categories of data. Among the top-ranked categories were data about public safety, revenues and expenditures, and education. The most commonly used data categories in publicdata.eu, a catalogue of Europe’s public data, are “Finance and budgeting”, “Social questions”, and “Education and communication”. Other frequently mentioned types of public data are statistical and geospatial data, which are particularly important from the perspective of their reuse by businesses. Paul Clarke sorted public data into four categories:
  • Historical data, such as statistics
  • Planning data, including legal regulations in progress
  • Infrastructural data, for example, reference concepts such as postcodes
  • Operational data, covering real-time streaming data, e.g., traffic situation
Governments collect data on a plethora of topics, some of which may look obscure, such as the statistics of people injured by vending machines in the US [6]. Nevertheless, the collection of every dataset should be justified by its function in fulfilling the requirements of the public task and by its contribution as a source of improvements, such as increasing the safety of vending machines in the aforementioned example. The scope of public sector information follows the function of the public sector.

References

  1. MENDEL, Toby. Freedom of information: an internationally protected human right. Comparative Media Law Journal. 2003, no. 1. Also available from WWW: http://www.juridicas.unam.mx/publica/rev/comlawj/cont/1/cts/cts3.htm
  2. LIENERT, Ian. Where does the public sector end and the private sector begin? [online]. June 1st, 2009 [cit. 2012-04-29]. IMF working paper, no. 09/122. Available from WWW: http://www.imf.org/external/pubs/ft/wp/2009/wp09122.pdf
  3. The Council of the European Communities. Council Directive 93/37/EEC of 14 June 1993 concerning the coordination of procedures for the award of public works contracts. Official Journal of the European Communities. August 9th, 1993, vol. 36, L 199, p. 54–84. Also available from WWW: http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=CELEX:31993L0037:EN:PDF. ISSN 0378-6978.
  4. VICKERY, Graham. Review of the recent developments on PSI re-use and related market developments [online]. Final version. Paris, 2011 [cit. 2012-04-19]. Available from WWW: http://ec.europa.eu/information_society/policy/psi/docs/pdfs/report/psi_final_version_formatted.docx
  5. American Library Association. Key principles of government information [online]. Chicago, 1997–2012 [cit. 2012-04-07]. Available from WWW: http://www.ala.org/advocacy/govinfo/keyprinciples
  6. LOVLEY, Erika. The government has a database for most everything. Politico [online]. June 24th, 2009 [cit. 2012-04-07]. Available from WWW: http://www.politico.com/news/stories/0609/24118.html

2012-07-30

Linked open data for public sector information: sharing my thesis

The public sector records data about what it does and about the environment in which it operates. Nowadays, improved and automated ways of collecting data increase the volume of data available in the public sector. Digitization makes it possible to store the recorded data in a way that scales. Presently, researchers estimate that more than 5 exabytes are stored online every day [1]. Fortunately, there are scalable technologies for data storage and retrieval at our disposal.

The Web enables zero-cost reproduction of digital information, which makes it possible to share the information in a frictionless manner. Building on the premise that data deemed useful for the public sector is useful for the private sector as well, online exchange of public sector data makes it possible to maximize its value by reaching members of the public who may recycle it and reuse it for their own purposes. In fact, the increased access to and reuse of disclosed public data is driven by the technologies that make it feasible [2].

Digital data may be represented in structured ways that make it machine-readable. Raw, machine-readable representations of data are amenable to automated processing and retain the generative value of data, so that people and computers can use the data in ways that were not defined in advance. Machine readability makes possible a wide array of interactions with data that go far beyond displaying it. In this way, disclosure of public sector data in a machine-readable format allows members of the public to find new uses for the data.
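
To make the point about non-predefined use concrete, here is a minimal sketch in Python. The dataset is entirely hypothetical: the column names (region, category, amount), the figures, and the helper total_by are invented for illustration and do not come from any real public body. The sketch only shows that raw, structured rows can be regrouped in ways the publisher need not have anticipated.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample standing in for a published public-sector dataset;
# the column names and figures are invented for illustration only.
RAW_CSV = """region,category,amount
North,Education,120
North,Public safety,80
South,Education,95
South,Public safety,60
"""


def total_by(rows, key):
    """Sum the 'amount' column grouped by an arbitrary column name."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += int(row["amount"])
    return dict(totals)


# Parse the raw text into a list of dictionaries, one per data row.
rows = list(csv.DictReader(io.StringIO(RAW_CSV)))

# A publisher might have had a per-region view in mind...
by_region = total_by(rows, "region")
# ...but the same raw rows also support a per-category view.
by_category = total_by(rows, "category")

print(by_region)
print(by_category)
```

Had the same figures been published only as a rendered per-region chart, the per-category view could not be derived without re-keying the numbers by hand.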

Adoption of the available technologies for data representation and storage may prove to have a disruptive effect on the public sector. Graham Vickery emphasizes two technological developments that, in his opinion, completely redefined the possibilities for public sector information [3, p. 6]. First, he points to the technologies that enable digitization of public resources. Second, he highlights the role of broadband telecommunications that enable better access to public sector information.

The technologies for representing and exchanging data constitute the basic components for open disclosure of data. Open access to public sector data is considered a key ingredient of a government that is open. Open government is “the notion that the people have the right to access the documents and proceedings of government” [4, p. xix], which is necessary for an open society that “reflects the universal values of intellectual autonomy, equality and trust” [5, p. 8]. Coupled with the demand for openness of the public sector, these technologies stimulated numerous initiatives promoting open data worldwide. Open data is a set of practices for data disclosure that strives to provide equal access to the data and equal use of it.

The foundations of open data draw from related approaches. Driven by the recognition of freedom of information as a basic human right, open data transposes the principles of open access, close to those of open source, onto data. It complements the adoption of e-government, which promotes the use of information and communication technologies to improve government processes, and coincides with the call for government 2.0, which makes better use of online collaborative technologies to create a more participatory government.

The application of open data, and more specifically linked open data, to the information held by public sector bodies constitutes the main theme of my diploma thesis titled Linked open data for public sector information, of which I am going to share excerpts here in the form of blog posts. I have decided to publish it in this way because it allows me to share short and focused pieces on specific topics rather than just publishing the whole thesis. I think of it as re-contextualization: information flows differently on the Web than in academia.

In the thesis, public sector information represents the content to which the principles of open data are applied using the technologies recommended by the linked data publication model. The goal of my thesis is twofold. The first part explores the competitive advantage of linked data for the release of public sector information under the terms of open data principles. The second part extrapolates the impact and challenges associated with the adoption of linked open data for public sector information.

I hope you will find it useful.

You can find the original full text of the thesis here.

Table of contents

  1. What is public sector information?
  2. Legal aspects of public sector information
  3. Disclosure of public sector information
  4. Pricing models for disclosure of public sector information
  5. Concepts of open data
  6. Legal openness of data
  7. Licences for open data
  8. Principles of open data: accessibility
  9. Principles of open data: use
  10. Qualities of open data
  11. Open data policies
  12. Open data for public sector information
  13. Open data infrastructure of the public sector
  14. Open data as a platform
  15. What is linked data?
  16. Technologies of linked data: URIs
  17. Technologies of linked data: HTTP
  18. Technologies of linked data: RDF
  19. Linked data principles
  20. Linked data: discoverability
  21. Linked data: accessibility
  22. Linked data: permanence
  23. Linked data: use
  24. Linked data: quality
  25. Linked open data in the public sector
  26. Impact of open data
  27. Impacts of open data: transparency
  28. Impacts of open data: accountability
  29. Impacts of open data: efficiency
  30. Impacts of open data: disintermediation
  31. Impacts of open data: participation
  32. Impacts of open data: business
  33. Impacts of open data: journalism
  34. Challenges of open data
  35. Challenges of open data: implementation
  36. Challenges of open data: information overload
  37. Challenges of open data: usability
  38. Challenges of open data: data literacy
  39. Challenges of open data: misinterpretation
  40. Challenges of open data: privacy
  41. Challenges of open data: data quality
  42. Challenges of open data: trust
  43. Challenges of open data: procured data
  44. Challenges of open data: summary

References

  1. WRUUCK, Patricia. 2012: the year of big data. European Public Policy Blog [online]. Brussels, May 1st, 2012 [cit. 2012-05-01]. Available from WWW: http://googlepolicyeurope.blogspot.com/2012/05/2012-year-of-big-data.html
  2. BERNERS-LEE, Tim; SHADBOLT, Nigel. Our manifesto for government data. Guardian Datablog [online]. January 21st, 2010 [cit. 2012-04-07]. Available from WWW: http://www.guardian.co.uk/news/datablog/2010/jan/21/timbernerslee-government-data
  3. VICKERY, Graham. Review of the recent developments on PSI re-use and related market developments [online]. Final version. Paris, 2011 [cit. 2012-04-19]. Available from WWW: http://ec.europa.eu/information_society/policy/psi/docs/pdfs/report/psi_final_version_formatted.docx
  4. LATHROP, Daniel; RUMA, Laurel (eds.). Open government: collaboration, transparency, and participation in practice. Sebastopol: O'Reilly, 2010. ISBN 978-0-596-80435-0.
  5. HALONEN, Antti. Being open about data: analysis of the UK open data policies and applicability of open data [online]. Report. London: Finnish Institute, 2012 [cit. 2012-04-05]. Available from WWW: http://www.finnish-institute.org.uk/images/stories/pdf2012/being%20open%20about%20data.pdf