Open Science versus Ash Dieback (and the Tweenome revisited)

Bye Bye Bluebells
Bluebell woods, the dense carpets of violet–blue flowers found in ancient woodland, are a spectacular and famous springtime sight in Britain, but this picture postcard scene is threatened as never before. Ash dieback, a devastating fungal disease of ash trees caused by Chalara fraxinea, has swept across northern Europe and has now reached Britain, a country particularly susceptible to its potential onslaught, as the estimated 80 million ash trees make up 30 per cent of woodland across the country. Ash trees particularly encourage biodiversity, as their branches are ideally spaced for light to pass through and let the bluebells and other species on the forest floor grow. Spreading over mainland Europe since its discovery in Poland in 1992, the disease has been particularly virulent in Northern Europe, with 90% of ash trees in Denmark affected. It had been hoped that Britain could act as a bulwark against the disease, but in October last year, due to a slow government response and a shortage of qualified plant pathologists, everyone’s worst fears were realized when the fungus was found growing in mature trees in Eastern England for the first time.

Crowdsourcing and Open-Science to the Rescue!
With hope of quarantine and eradication now over, the latest government strategy is to slow the spread of the disease and to restructure the woodland with resistant trees. Keeping on top of an evolving, highly infectious, wind-borne pathogen is an onerous task, particularly with the lack of experts and scientists on the ground. Crowdsourcing, which opens up the fight against the pathogen to the global wisdom of the crowds and harnesses the rapid transfer of information on social networks and the “hive mind” of the web, is one potential way to address this. Already a team of developers and scientists have developed AshTag, a smartphone app that the public can use to report suspected cases of infection.

Following on from this, this week in GigaScience we publish a paper from a community of scientists taking an “open-source genomics” approach to engage the global genomics community in this fight. To kick-start genomic analyses of the pathogen and host, Dan MacLean and colleagues from OpenAshDB present a call to arms to the research community entitled “Crowdsourcing genomic analyses of ash and ash dieback – power to the people”. Taking the unusual step of releasing the initial genomics datasets as soon as they are produced, they have built a website (oadb.tsl.ac.uk) and a GitHub-based platform to share and analyse the data and results. While there have been attempts to crowdsource the response to human disease outbreaks before (see this excellent TEDx talk from Jennifer Gardy on H1N1), this is the first time it has been attempted on a plant disease of such importance. This open-source genomics approach also follows on and learns lessons from the deadly 2011 European E. coli O104:H4 outbreak, and the approach that our colleagues at the BGI and others followed to crowdsource the analysis of the pathogen's genome via Twitter, blogs and GitHub.

Using a collaborative GitHub-based platform structured very similarly to the one the Era7 team built upon the original E. coli data, released by the BGI as the first data DOI in our GigaDB database, the OpenAshDB project aims to take this open-science approach even further, with plans for collaborative authorship for contributors and to work with pre-print servers before publication of the final products. On top of the altruistic reasons of scientific curiosity and wanting to protect biodiversity and the environment, one of the key incentives for taking part in a project such as this is obviously the traditional one of obtaining scientific credit. While everything has to be quickly released into the public domain to maximize its use, contributions can still be tracked through GitHub via commit number and through traditional mechanisms such as citation. Working with DataCite we issued our first DOI for the E. coli genome, and the altmetrics community, using tools such as Impact Story, has shown it is also possible to track GitHub use through similar means (see this example for the E. coli GitHub).

The Tweenome Revisited
Crowdsourcing and open-science is an area we at GigaScience are keen to promote and support, and this paper comes on top of recent papers published on community sponsored/assembled Parrot genomes (AKA the People's Parrot), and personal genomics analysis via blogs. We have written previously on how the release of CC0 (the most open public domain waiver) E. coli genome data via Twitter by us and our collaborators at UMC Hamburg-Eppendorf enabled others (with special mention of early work from Nick Loman and the Era7 team, who helped get the ball rolling) to kick-start a burst of crowd-sourced, curiosity-driven analyses from bioinformaticians around the world. Dubbed by some as the first “Tweenome”, this project led to a high-profile paper in the New England Journal of Medicine, and now, over 18 months on, it is a good time to look back and see the downstream consequences, and whether any lessons can be learned for OpenAshDB and other projects with similar aims.

The paper has attracted over 110 citations so far (according to Google Scholar), and the study provided insight into the pathogenicity, evolution, and treatment of the pathogen as well as assisting platform comparison studies. Obviously the main aim of doing science in this accelerated way was to speed up diagnosis and treatments, and the E. coli data enabled the rapid development of diagnostic tests and anti-microbial agents. These are useful examples, as better diagnostic tests and potential therapies are obviously important downstream outcomes and goals that the OpenAshDB project could help enable.

Probably the project's biggest legacy is as an example of open-science, data-citation, and the use of CC0 data. Releasing the data under a CC0 license allowed truly open-source analysis, and the UK HPA and GitHub members followed suit in releasing their work in this way. Following this example, a team at Pacific Biosciences also released their related data in a similar manner, using the precedent set by their fellow E. coli data producers to release their data without wasting time on legal wrangling. The project has subsequently been used as a model for future UK and EU science policy, with the Royal Society in the UK citing the E. coli crowdsourcing as an example of “the power of intelligently open data”, and highlighting it on the cover of their influential “Science as an Open Enterprise” report.

We hope that the OpenAshDB project leaves a similar legacy, and, it being Ash Wednesday, we hope that many in the genomics community join the effort to study and fight this devastating ecological threat. It will not only enable future generations to continue to appreciate the beauty of bluebell woods, but also provide an example of how science can be more collaborative, faster and more efficient in this new era of open-science and open data. As the authors of the paper state in the working title of the article: power to the people!

References

1. MacLean, D; et al., Crowdsourcing genomic analyses of ash and ash dieback — power to the people. GigaScience 2013, 2:2
2. OpenAshDB Website: http://oadb.tsl.ac.uk/
3. Notes from a Tweenome: http://gigasciencejournal.com/blog/notes-from-an-e-coli-tweenome-lessons-learned-from-our-first-data-doi/
4. Rohde, H; et al., Open-Source Genomic Analysis of Shiga-Toxin–Producing E. coli O104:H4. N Engl J Med 2011, 365:718-724.
