This week marks another success for the fledgling practice of data citation, with two datasets from our GigaScience database published in Nature Biotechnology. The genomes sequenced by our colleagues at the BGI for the Cynomolgus and Chinese rhesus macaques were initially released with DOIs at our launch in July, and were among the first genomes, unpublished at the time, to be released in this way. Data citation is an important concept: it allows data producers to obtain an early form of credit for releasing their work, speeds up research by encouraging early data release, and allows the impact and reuse of data to be tracked.
After the recent success of our first dataset being published in the New England Journal of Medicine (the genome of the recent outbreak strain of E. coli O104:H4), this is the first time one of our data DOIs has been accepted in a Nature journal. For data citation to work, the assistance of journals is key, and Nature Biotechnology has been particularly helpful in promoting the scheme, arguing in an editorial as far back as 2009 that novel forms of credit for data producers were needed, and suggesting DOIs as an ideal solution. The DataCite consortium was set up in late 2010 to do exactly that, and we would like to thank them and the British Library for their help in issuing these DOIs.
Macaque species are the most commonly used non-human primate models in medical research, and their genomes will hopefully aid human disease research and drug discovery. Looking at orthologues of human druggable protein domains in these species is aiding the potential therapeutic exploitation of their ‘druggable genome’, and has already led to BGI producing an exome sequencing platform for the species. In addition to the genome assemblies, the DOI landing pages include links to functionally annotated and coding sequence sets, as well as a link to a browser and database. Following the release of other datasets such as the CHO cell line genome, we are currently collecting another large batch of datasets for release, so watch this space for further news and announcements.
1. Yan, G. et al. Genome sequencing and comparison of two nonhuman primate animal models, the cynomolgus and Chinese rhesus macaques. Nat Biotech (2011). https://doi.org/10.1038/nbt.1992
2. Credit where credit is overdue. Nat Biotech 27, 579 (2009). https://doi.org/10.1038/nbt0709-579
To cite the two datasets please use the following citations:
3. Yan, G; Zhang, G; Fang, X; Zhang, Y; Li, C; Ling, F; Cooper, DN; Li, O; Li, Y; van Gool, AJ; Du, H; Chen, J; Chen, R; Zhang, P; Huang, Z; Thompson, JR; Meng, Y; Bai, Y; Wang, J; Zhuo, M; Wang, T; Huang, Y; Wei, L; Li, J; Wang, Z; Hu, H; Le, L; Stenson, PD; Li, B; Liu, X; Ball, EV; An, N; Huang, Q; Zhang, Y; Fan, W; Zhang, X; Li, Y; Wang, W; Katze, MG; Su, B; Nielsen, R; Yang, H; Wang, J; Wang, X; Wang, J (2011): Genomic data from the Chinese Rhesus Macaque (Macaca mulatta lasiota). GigaScience. doi:10.5524/100002
http://dx.doi.org/10.5524/100002
4. Yan, G; Zhang, G; Fang, X; Zhang, Y; Li, C; Ling, F; Cooper, DN; Li, O; Li, Y; van Gool, AJ; Du, H; Chen, J; Chen, R; Zhang, P; Huang, Z; Thompson, JR; Meng, Y; Bai, Y; Wang, J; Zhuo, M; Wang, T; Huang, Y; Wei, L; Li, J; Wang, Z; Hu, H; Le, L; Stenson, PD; Li, B; Liu, X; Ball, EV; An, N; Huang, Q; Zhang, Y; Fan, W; Zhang, X; Li, Y; Wang, W; Katze, MG; Su, B; Nielsen, R; Yang, H; Wang, J; Wang, X; Wang, J (2011): Genomic data from the Crab Eating Macaque/Cynomolgus Monkey (Macaca fascicularis). GigaScience. doi:10.5524/100003
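For anyone handling these citations programmatically, the short Python sketch below shows one way to turn the `doi:` form used in the dataset citations above into a resolvable link. The helper name `doi_to_url` and the default resolver prefix `https://doi.org/` are illustrative assumptions, not part of any GigaScience or DataCite tooling.

```python
def doi_to_url(doi: str, resolver: str = "https://doi.org/") -> str:
    """Turn a DOI string such as 'doi:10.5524/100002' into a resolver URL.

    Hypothetical helper for illustration; the resolver prefix is an assumption.
    """
    cleaned = doi.strip()
    # Drop an optional leading "doi:" label, as used in the citations above.
    if cleaned.lower().startswith("doi:"):
        cleaned = cleaned[4:].strip()
    # Every DOI begins with the directory indicator "10."
    if not cleaned.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    return resolver + cleaned


if __name__ == "__main__":
    # The two macaque dataset DOIs cited in this post.
    for doi in ("doi:10.5524/100002", "doi:10.5524/100003"):
        print(doi_to_url(doi))
```

Keeping the bare DOI in the citation and building the link only when needed means the citation stays valid even if the preferred resolver address changes.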
Hi – this is great stuff. Even better is linking to the articles too:
Credit where credit is overdue. Nat Biotech 27, 579 (2009).
http://dx.doi.org/10.1038/nbt0709-579
Yan, G. et al. Genome sequencing and comparison of two nonhuman primate animal models, the cynomolgus and Chinese rhesus macaques. Nat Biotech
http://dx.doi.org/10.1038/nbt.1992