The rate of species extinction has lent increasing urgency to the description of new species, but in this supposedly networked “big data” era the process of cataloging the rich tapestry of life has changed little since the time of Linnaeus. Fortunately, this process is finally being dragged into the 21st century: the description of animal species entered the electronic era last year with the acceptance of electronic taxonomy publication and registration with ZooBank, the official registry of the ICZN. Concerned by growing disappearance rates, scientists have been pushed towards a so-called ‘turbo taxonomy’ approach, in which species are described rapidly to support conservation management. A new collaboration between GigaScience and Pensoft Publishers pushes the boundaries opened up by the digital era still further, presenting an innovative holistic approach to describing new species. It creates a new kind of ‘specimen’, the ‘cybertype’: a 3D computer image that can be downloaded anywhere in the world it is needed, accompanied by a swathe of data types to suit modern biology, including the species’ transcriptome, DNA barcodes, and video of the live animal, in addition to the traditional morphological description. The approach is illustrated by the description of a new cave centipede species from a remote karst region of Croatia – the ‘cyber centipede’ Eupolybothrus cavernicolus.
The new approach combines and integrates data from several techniques, including next-generation molecular methods, barcoding, and novel computing and imaging technologies, and serves as a test of the model for big data collection, storage and management in biology. The study has just been published in Pensoft’s new Biodiversity Data Journal, with data hosted and curated in our GigaDB database. We have also published an editorial today that sheds additional light on the rationale and data organization process.
While acknowledging the necessity of fast descriptions, the authors of the new study present the other ‘extreme’ for taxonomic description: “a new species of the future”. An international team of scientists from Bulgaria, Croatia, the UK, Denmark, France, Italy, Greece and Germany, together with our partners at BGI Shenzhen and China National GeneBank, illustrates a holistic approach to the description of the new cave-dwelling centipede species Eupolybothrus cavernicolus.
Eupolybothrus cavernicolus has become the first eukaryotic species for which, in addition to the traditional morphological description, scientists have provided a transcriptomic profile sequenced by BGI, DNA barcoding data, detailed anatomical X-ray microtomography (micro-CT), and a movie of the living specimen to document important traits of its behaviour. This is the first time micro-CT scanning has been used in the description of a new species, and it yields a high-resolution morphological and anatomical dataset, the ‘cybertype’, that gives everyone virtual access to the specimen. As with the methylated nematode genome dataset we are hosting, which recently won the BMC open data award, we have again collaborated with the ISA community to provide the metadata in the interoperable ISA-TAB format, to maximize the discovery, exchange and informed integration of these diverse datasets.
Communicating the results of next-generation sequencing effectively requires the next generation of data publishing. Lyubomir Penev, managing director of Pensoft Publishers, comments on the issue: “It is not sufficient just to collect ‘big’ data. The real challenge comes at the point when data should be managed, stored, handled, peer-reviewed, published and distributed in a way that allows for re-use in the coming big data world”.
A digital message in a bottle
The recent “insect squishome” paper in GigaScience demonstrates that sequencing is moving beyond piecing together a species’ genetic blueprint into areas such as biodiversity research, with bulk-collected, high-throughput “metabarcoding” surveys of species bringing genomics, biomonitoring and species discovery closer together. The centipede example attempts to integrate data from these different sources and, through curation in the GigaDB database, to make them interoperable. It is exciting that you will no longer have to navigate caves or dig through museum collections to view this type specimen: multi-dimensional data, including its genetic blueprint, are freely available to everyone at the touch of a button, allowing you to see this “cyber centipede” alive and in three dimensions. Providing long-term storage of, and access to, digital data will be a serious challenge for the future, but the longevity of GenBank has shown that this is possible for several decades at least. While this new species’ subterranean lifestyle may protect it from some of the growing threats to its cousins on the surface, this new type of species description also shows how much previously uncharacterized information on a species’ behavior, internal structure, physiology and genetic make-up can potentially be preserved for future generations. While museum specimens can degrade, this “cybertype” specimen has the potential to be a digital message in a bottle for future generations that may not have access to the species. The video pieced together from the micro-CT sections (also available as individual images in GigaDB) is also suitably H.R. Giger-esque for a dataset released the same week as Halloween.
1. Stoev P, Komerički A, Akkari N, Liu S, Zhou X, Weigand AM, Hostens J, Hunter CI, Edmunds SC, Porco D, Zapparoli M, Georgiev T, Mietchen D, Roberts D, Faulwetter S, Smith V, Penev L (2013) Eupolybothrus cavernicolus Komerički & Stoev sp. n. (Chilopoda: Lithobiomorpha: Lithobiidae): the first eukaryotic species description combining transcriptomic, DNA barcoding and micro-CT imaging data. Biodiversity Data Journal 1: e1013. DOI: 10.3897/BDJ.1.e1013
2. Edmunds SC, Hunter CI, Smith V, Stoev P, Penev L (2013) Biodiversity research in the “big data” era: GigaScience and Pensoft work together to publish the most data-rich species description. GigaScience 2:14. DOI: 10.1186/2047-217X-2-14
3. Stoev P, Komerički A, Akkari N, Liu S, Zhou X, Weigand AM, Hostens J, Porco D, Penev L (2013) Transcriptomic, DNA barcoding, and micro-CT imaging data from an advanced taxonomic description of a novel centipede species (Eupolybothrus cavernicolus Komerički & Stoev, sp. n.). GigaScience Database. http://dx.doi.org/10.5524/100063