We’ve just written about widespread frustration at the slow pace of the shift to open access publishing of academic papers, and about how some major funding organizations are trying to address that. Open access aims to make entire publications publicly available, and that is meeting considerable resistance from traditional publishers who derive their healthy profits from charging for subscriptions. Rather than continue to tackle publishers head-on, an interesting new project seeks instead to liberate only a particular part of each article, albeit an important one. The new Initiative for Open Citations (I4OC) seeks to promote the unrestricted availability of the lists of citations that form a key part of most academic articles:
Citations are the links that knit together our scientific and cultural knowledge. They are primary data that provide both provenance and an explanation for how we know facts. They allow us to attribute and credit scientific contributions, and they enable the evaluation of research and its impacts. In sum, citations are the most important vehicle for the discovery, dissemination, and evaluation of all scholarly knowledge.
As the number of scholarly publications is estimated to double every nine years, citations — and the computational systems that track them — enable researchers and the public to keep abreast of significant developments in any given field. For this to be possible, it is essential to have unrestricted access to bibliographic and citation data in machine-readable form.
The present scholarly communication system inadequately exposes the knowledge networks that already exist within our literature. Citation data are not usually freely available to access, they are often subject to inconsistent, hard-to-parse licenses, and they are usually not machine-readable.
The I4OC aims to address those problems by encouraging all publishers, whether open access or otherwise, to provide the data on citations found in their journals in a form that is structured, separable, and open. “Structured” here means that it is held in a common data format that is machine-readable. “Separable” means that the citation data can be accessed on its own, so that even for non-open access articles it is nonetheless freely available. And “open” means that it is either released as raw facts, and thus without a license, or placed under a CC0 public domain dedication that makes it quite clear that the citation data can be used for any purpose without needing permission.
As the I4OC home page explains, a key benefit of this new approach is increased discoverability of published articles, since even if they are not freely available, their citation data will be out in the open. Another is that citation data can be analyzed in new and complex ways thanks to its machine-readable nature. Finally, it may be possible to create new services and even new businesses based around the new data resource.
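To give a rough sense of what that kind of analysis might look like, here is a minimal Python sketch. It assumes citation data arrives as simple records linking a citing article to the articles it cites. The record layout, the function names, and the DOIs are all made up for illustration, and are not a format that I4OC or any publisher specifies.

```python
from collections import Counter

# Hypothetical citation records: each links a citing article to the
# articles it cites. The DOIs and the field names are invented for
# illustration only.
citations = [
    {"citing": "10.0000/example.a",
     "cited": ["10.0000/example.c", "10.0000/example.d"]},
    {"citing": "10.0000/example.b",
     "cited": ["10.0000/example.c"]},
    {"citing": "10.0000/example.c",
     "cited": ["10.0000/example.d"]},
]


def inbound_counts(records):
    """Count how many distinct articles cite each DOI."""
    counts = Counter()
    for record in records:
        # Use a set so a DOI repeated within one reference list
        # is only counted once for that citing article.
        counts.update(set(record["cited"]))
    return counts


def cited_by(records, doi):
    """Return the sorted DOIs of the articles that cite the given DOI."""
    return sorted(r["citing"] for r in records if doi in r["cited"])


counts = inbound_counts(citations)
for doi, n in counts.most_common():
    print(f"{doi} is cited by {n} article(s): {cited_by(citations, doi)}")
```

Even this toy version shows why the “structured” and “separable” requirements matter: the full texts never need to be touched to work out which articles are most cited, or to follow the chain from one paper to the papers that build on it.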
All of that is highly welcome, but the fact that a separate initiative was required to make it happen underlines the fact that too much of humanity’s knowledge remains locked up behind paywalls, where its full potential is hard to realize. The correct solution to that is not making one element available, but liberating the full texts as open access. And that means real open access, not the subverted kind that Richard Poynder analyzed in his compelling and troubling post “Copyright: the immoveable barrier that open access advocates underestimated”.