Having a finger on the pulse of data citation

Endorsing Data Citation
Nicely timed for the Data Citation Principles workshop at the IDCC meeting in San Francisco yesterday, the finalized Joint Declaration of Data Citation Principles has just been posted on the Force11 website. We of course endorse these, as data citation is an area we have been promoting and practicing since our formation, using it as a mechanism to incentivize and credit the early release of data from data producers. Most of the challenges have been cultural rather than technical, and despite some setbacks (for example from Nature Genetics), for over two years now we have had generally positive interactions with publishers, working closely with them to make sure our datasets are cited correctly according to DCC and DataCite guidelines. Thanks to working very closely with the editors of Genome Biology, our sorghum dataset was our first to be correctly cited in the references of a published paper, and BioMed Central now uses this example in the formatting instructions for all of their journals. We have blogged regularly on the topic, but for a more detailed overview of our and others' efforts in data citation, check out our paper in the BMC Research Notes "Data standardization, sharing and publication" series.

Amounting to more than a hill of beans: new data and functionality in GigaDB
Following in the footsteps of sorghum, the latest dataset to be published in GigaDB today is another agricultural crop important to food security in the developing world: the genome of the chickpea. As with sorghum, this is another useful example for data citation, being released just in time to showcase new functionality in our GigaScience GigaDB database. The latest release, out this week, includes a number of new features, including some minor improvements to formatting, browsing and the submission system, and the ability to contact dataset authors directly, but most relevant here is that it now has citation manager support. Using functionality handily provided by DataCite, we have added new buttons that allow citation information to be downloaded in RIS, BibTeX and text formats (see the blue boxes next to the citation information in the screen shot below), which are suitable for most citation manager software.
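For readers who want the same three formats from a script rather than from the buttons, the sketch below uses DOI content negotiation, where the resolver is asked for a citation format through the HTTP Accept header. This is an assumption about one way to get equivalent output, not a description of how the GigaDB buttons are implemented. The function names and the FORMATS table are made up for illustration, and the media types are the ones commonly documented for DOI content negotiation.

```python
import urllib.request

# Media types commonly documented for DOI content negotiation.
# Assumption: the resolver for this DOI honours all three.
FORMATS = {
    "ris": "application/x-research-info-systems",
    "bibtex": "application/x-bibtex",
    "text": "text/x-bibliography",
}


def build_citation_request(doi, fmt):
    """Build (but do not send) a request for a citation in the given format."""
    if fmt not in FORMATS:
        raise ValueError("unknown citation format: " + fmt)
    # The requested format is carried in the Accept header, not in the URL.
    return urllib.request.Request(
        "https://doi.org/" + doi,
        headers={"Accept": FORMATS[fmt]},
    )


def fetch_citation(doi, fmt, timeout=10):
    """Send the request and return the citation as text (needs network access)."""
    request = build_citation_request(doi, fmt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling fetch_citation("10.5524/100076", "bibtex") would request a BibTeX record for the chickpea dataset; the exact fields returned depend on the metadata registered for the DOI.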

Please let us know if you find any bugs in this new release. These new tools aim to make the process of citing data even simpler, reducing the technical barriers and leaving only cultural ones to overcome. We will not get into the etiquette of when to cite data or papers, but Sarah Callaghan does a fantastic job in her recent blog covering this topic. Our rationale is that if you feel that data generated in the course of research are just as valuable to the ongoing academic discourse as papers and monographs, then they should be treated in the same manner. We would encourage others to sign the declaration and help spread the practice of data citation further.


1. Edmunds, S., Pollard, T., Hole, B., & Basford, A. (2012). Adventures in data citation: sorghum genome data exemplifies the new gold standard. BMC Research Notes, 5 (1). DOI: 10.1186/1756-0500-5-223
2. Varshney, R.K. et al. (2014). Genomic data of the chickpea (Cicer arietinum). GigaScience Database. http://dx.doi.org/10.5524/100076
3. Force11 Data Citation Principles. http://www.force11.org/datacitation/