2011-09-24

Open bibliographic data checklist

I have decided to write a few points that might be of interest to those thinking about publishing open bibliographic data. The following is a fragment of an open bibliographic data checklist, or, how to release your library's data into the public without a lawyer holding your hand.

I have been interested in open bibliographic data for a couple of years now, and I try to promote it at the National Technical Library, where we have so far released only one authority dataset: the Polythematic Structured Subject Heading System. The following points are based on my experience with this topic. So what should you pay attention to when opening your bibliographic data?

  • Make sure you are the sole owner of the data or make arrangements with the other owners. For instance, things may get complicated if the data were created collaboratively through shared cataloguing. If you are not in complete control of the data, start by consulting the other proprietors that have a stake in the datasets.
  • Check that the data you are about to release are not bound by contractual obligations. For example, you may publish a dataset under a Creative Commons licence, only to realize soon after that there are unresolved contracts with parties that helped fund the creation of the data years ago. You then need to discuss the issue with the parties involved to determine whether making the data open is a problem.
  • Read your country's legislation to find out what you are allowed to do with your data. For instance, in the Czech Republic it is not possible to put data into the public domain intentionally. The only way public domain content comes about is by the natural order of things, i.e., the author dies, leaves no heir, and after quite some time the work enters the public domain.
  • See if the data are copyrightable. If the data do not fall within the scope of your country's copyright law, they are not suitable for licensing under Creative Commons, since this set of licences draws its legal force from copyright law; it is an extension of copyright and builds on it. Facts are not copyrightable, and most bibliographic records are made of facts. However, some contain creative content, for example subject indexing or an abstract, and as such are appropriate for licensing based on copyright law. Your mileage may vary.
  • Consult the database act. Check if your country has a specific law dealing with the use of databases that might add further requirements needing your attention. For example, in some legal regimes databases are protected at another level, as an aggregation of individual data elements.
  • Different licensing options may apply to the content and to the structure of a dataset, for instance when database law imposes additional terms. You can opt for dual licensing and use two different licences, one for the dataset's content that is protected by copyright law (e.g., a Creative Commons licence), and one for the dataset's structure, to which copyright protection may not apply (e.g., Public Domain Dedication and License).
  • Choose a proper licence. A proper open licence is one that conforms to the Open Definition (and will not get you sued), so pick one of the OKD-Compliant licences. A good source of solid information about licences for open data is Open Data Commons.
  • BONUS: Tell your friends. Create a record in the Data Hub (formerly CKAN) and add it to the bibliographic data group to let others know that your dataset exists.

Although it may seem there are lots of things you need to check before releasing open bibliographic data, it is actually easy. It is a performative speech act: you only need to declare your data open to make them open.

<disclaimer>If you are unsure about any of the steps above, consult a lawyer. Note that the usual disclaimers apply to this post, i.e., IANAL.</disclaimer>
