Bringing Wildlife Forensics into the Omics Era. Q&A with Alfred Arulandhu and Martijn Staats

Sequencers versus the smugglers.

CITES (the Convention on International Trade in Endangered Species of Wild Fauna and Flora) is one of the largest and oldest conservation and sustainable use agreements in existence, and provides a legal framework for protecting endangered plants and animals around the world. There are roughly 35,000 species listed on the three CITES appendices, and with increasing pressures on global biodiversity there are huge challenges in detecting, tracking and protecting species on this list from the illegal wildlife trade. DNA barcoding techniques can leverage the revolution in DNA sequencing technology to provide a high-throughput, quantitative and low-cost alternative to traditional morphology-based approaches, although there are challenges with “dark taxa” (see our editorial on the topic) and big gaps in the reference data for many branches of the tree of life. Systematic efforts to fill these gaps are under way, and we’ve recently published papers targeting South East Asian mammalian species (see Mohd Salleh et al.) and Australasian carnivore species (see Modave et al.).

To truly unleash the genomics revolution in the battle against wildlife smuggling and roll out these new techniques to frontline forensic labs, the techniques need to be cheap, reproducible, standardized and easy to use by non-research scientists. Attempting to fill that niche, new work published in GigaScience presents a pipeline for next-gen wildlife forensics that combines the cheap and ubiquitous benchtop MiSeq sequencer with CITESspeciesDetect, an easy-to-use web-based bioinformatics pipeline for detecting CITES-listed species. Carrying on from our many Q&A blogs, we quiz two of the authors, Alfred Arulandhu and Martijn Staats, about their work taking genomics into the wildlife protection world.

You both work at the RIKILT food safety institute at Wageningen University in the Netherlands, so why are you interested in endangered mostly tropical species?

The research was conducted in the EU-funded FP7 DECATHLON project, one of the goals of which was to develop a DNA metabarcoding approach that customs agencies can use in a routine setup to identify materials derived from endangered species in complex samples. For this, we collaborated closely with the Dutch and Bulgarian Customs Laboratories, with input from other European customs laboratories, which are tasked with implementing CITES regulations related to nearly 36,000 species during border controls. Customs agencies often seize batches of, for instance, traditional medicines and food supplements that are suspected of containing endangered species. Due to the highly processed nature of such products, it is often difficult to make positive identifications based on visual inspection. Having a standardized and reliable DNA-based method that allows positive identification of CITES-protected species in such complex products is very important in this respect, and this issue formed the basis of our research.

This work is about looking for CITES listed species in complex samples, so what kind of samples are we talking about?

This research mainly focused on traditional medicines, consisting of mixed materials derived from plant and animal species. Such products are often sold as e.g. powders, pills, capsules and tablets. Due to their highly processed nature the ingredients cannot be identified morphologically.

We’ve previously written, published (and blogged) about metabarcoding techniques, and their potential in bringing biodiversity and taxonomy research into the “big-data” era. What do you think are their advantages in conservation?

DNA metabarcoding is now part of the rich toolbox of forensic DNA analysis methods, and it will certainly contribute to enhancing investigations into crimes against protected natural resources. The major advantage of the work presented here is that it combines 12 informative biomarkers with a very strict NGS data analysis pipeline to identify species, including CITES species, more reliably than was previously achievable.

There are ~35,000 species that are classified and listed by CITES, so how many of them do you think your method can detect? What needs to be done to increase the number of species detected, and what are the ultimate limitations?

The method was designed to make use of a panel of 12 DNA barcode markers that have demonstrated universal applicability across a wide range of plant and animal taxa. However, accurate DNA barcoding depends on the use of a reference database that provides good taxonomic coverage. Unfortunately, the current under-representation of DNA barcodes from species protected by CITES, and from closely related species, still critically hampers their identification in many cases. Based on available databases, we estimated that only 18.8% of species on the CITES list have one or more DNA barcodes. This will improve as DNA barcoding campaigns continue, in particular through initiatives such as the Barcodes of Wildlife Project [GigaScience: related to this, see the new work also just published adding 30 novel SE Asian mammalian sequences to the database]. The continued efforts being put into building reference sequence databases such as the Barcode of Life Data Systems, where millions of barcode sequences are linked to voucher specimens, therefore remain essential. The presented method will automatically improve as the databases gradually improve, because the data analysis pipeline will make use of this growing dataset.
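As a rough illustration of how a coverage figure like the one above could be calculated, here is a minimal Python sketch. It assumes you already have two lists of species names, one for CITES-listed species and one for species with at least one reference barcode; the function name and the toy names are hypothetical and are not taken from the authors' analysis.

```python
def barcode_coverage(cites_species, barcoded_species):
    """Percentage of CITES-listed species that have at least one reference barcode.

    Both arguments are iterables of species names (assumed to use the same
    spelling and taxonomy; real data would need name reconciliation first).
    """
    listed = set(cites_species)
    if not listed:
        return 0.0
    covered = listed & set(barcoded_species)
    return 100.0 * len(covered) / len(listed)


# Toy example with placeholder names, not real data.
toy_cites = ["species A", "species B", "species C", "species D"]
toy_barcoded = ["species A", "species X"]
coverage = barcode_coverage(toy_cites, toy_barcoded)
```

With real checklists, most of the effort would go into matching synonyms and spelling variants between the two sources before the counting step.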

You validated this technique in 16 labs across the world. Were there any challenges in transferring these techniques and skills?

The participating enforcement agencies and laboratories were all highly experienced and proficient in advanced molecular analysis work, so performing DNA extraction and PCR according to the instructions in the SOP was not too much to ask. Difficulties, however, were experienced with interpreting the NGS results. Inconsistencies were observed among laboratories when interpreting the raw BLAST output that is used for identifying species. Individual participants were later given the opportunity to reanalyze their data using the online platform, called CITESspeciesDetect. The web interface enables a clear and structured presentation of the analysis results, and it automatically highlights any matches with CITES species, which helps tremendously with correctly interpreting the results.
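To give a feel for what automatically highlighting CITES matches in BLAST output might involve, here is a minimal Python sketch. It is not the CITESspeciesDetect code: the simplified three-column tab-separated input (query ID, matched species name, percent identity), the 98% identity cut-off, the function name and the tiny species set are all assumptions made for illustration.

```python
import csv
import io

# Illustrative lookup only; a real list would be loaded from an official
# CITES checklist export rather than hard-coded.
CITES_SPECIES = {"Panthera tigris", "Saussurea costus"}


def flag_cites_hits(blast_tsv, cites_species, min_identity=98.0):
    """Return (query_id, species, identity) for hits to CITES-listed species.

    blast_tsv is text with one hit per line and three tab-separated columns:
    query ID, matched species name, percent identity (an assumed, simplified
    layout, not a standard BLAST output format).
    """
    flagged = []
    for row in csv.reader(io.StringIO(blast_tsv), delimiter="\t"):
        if not row:
            continue  # skip blank lines
        query_id, species, identity = row[0], row[1], float(row[2])
        # Keep only confident hits whose species is on the CITES list.
        if identity >= min_identity and species in cites_species:
            flagged.append((query_id, species, identity))
    return flagged


# Toy input, not real results.
EXAMPLE_HITS = (
    "otu1\tPanthera tigris\t99.5\n"
    "otu2\tBos taurus\t100.0\n"
    "otu3\tSaussurea costus\t95.0\n"
)
flagged_hits = flag_cites_hits(EXAMPLE_HITS, CITES_SPECIES)
```

Applying one fixed rule to every hit is the point of the sketch: it removes the lab-to-lab variation that comes from reading raw BLAST tables by eye.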

On top of openly sharing all the code from your pipelines and the validation data and results (see the GigaDB entry) you went to the effort to put your SOPs into (see here). How difficult and how much extra time was needed to be able to do this, and do you think these efforts will be worth it? Have you seen any examples yet of this helping other labs use this tool, and can you talk about any successes of this in detection and enforcement?

Uploading the SOP into is simple and straightforward. Publishing protocols this way is relevant, because it allows disseminating our publicly funded work in an efficient way to a wider audience.

This used MiSeq data, but is it applicable to other sequencing platforms? What do you think of the potential of portable sequencers such as Oxford Nanopore MinIONs to be combined with web-based pipelines to take this sort of work right into the field?

While the CITESspeciesDetect pipeline was specifically designed for use with Illumina data, the general workflow can in principle be applied irrespective of the sequencing technology used. We selected Illumina technology for its ability to generate high-quality data, however, and the use of CITESspeciesDetect with data from other sequencing platforms still needs to be evaluated. The prospect of having hand-held portable sequencers that will enable Customs agencies to assess the presence of endangered species in complex products directly on location is very exciting. However, the current sample-to-answer procedure provided by Oxford Nanopore technology involves several separate sample preparation steps, which for the moment makes application of this technology in the field impractical.

Further Reading
Modave E, MacDonald AJ, Sarre SD. A single mini-barcode test to screen for Australian mammalian predators from environmental samples. GigaScience. 2017 Aug 1;6(8):1-13. doi: 10.1093/gigascience/gix052.

Mohd Salleh F, Ramos-Madrigal J, Peñaloza F, Liu S, Mikkel-Holger SS, Riddhi PP, Martins R, Lenz D, Fickel J, Roos C, Shamsir MS, Azman MS, Burton KL, Stephen JR, Wilting A, Gilbert MTP. An expanded mammal mitogenome dataset from Southeast Asia. GigaScience. 2017 Aug 1;6(8):1-8. doi: 10.1093/gigascience/gix053.

Arulandhu AJ, Hagelaar R, Staats M, Voorhuijzen MM, Prins TW, Scholtens IM, Costessi A, Duijsings D, Rechenmann F, Gaspar FB, Barreto Crespo MT, Holst-Jensen A, Birck M, Burns M, Haynes E, Hochegger R, Klingl A, Lundberg L, Natale C, Niekamp H, Perri E, Barbante A, Rosec J, Seyfarth R, Sovova T, Moorleghem CV, Ruth SV, Peelen T, Kok E. Development and validation of a multi-locus DNA metabarcoding method to identify endangered species in complex samples. GigaScience. 2017 Sept 1;6(9):1-8. doi: 10.1093/gigascience/gix080.