GigaScience originally launched at the 2012 ISMB (Intelligent Systems for Molecular Biology) meeting in Long Beach, and every subsequent year ISMB has held a special place in our hearts (see all the previous write-ups in GigaBlog). The conference and the field itself have changed enormously since the first meeting 24 years ago, and we’ve seen a lot of developments just in the short time we’ve been attending. The original rationale of developing and applying advanced computational methods to biological problems still holds true, but as “big-data” generating technologies such as genomics have fed into the clinic, the talks and the field as a whole have taken on a more applied focus. At this week’s #ISMB16 this was very clear in the keynotes, with Ruth Nussinov presenting on cancer signaling, Sandrine Dudoit and Sarah Teichmann both covering single-cell RNA sequencing (a big topic this year), and Serafim Batzoglou and Søren Brunak touching on medical informatics.
Open Data Is the New Recycling.
The improvements to healthcare driven by the application of these informatic approaches are not just a software/computation issue: good quality data for the algorithms to crunch is also essential. The bigger and better described the datasets, the more that can be derived from them, and this makes deriving societal benefits an Open Data issue. We already have to hand much of the data that can answer many important scientific questions, but much of it is held in silos and not connected together. Serafim Batzoglou made this the focus of his talk, stating that the biggest obstacle holding back medical genomics is data availability. If you think recycling saves the world, he expounded, we should try data sharing. Nobody has ever died because of a data breach, but lives are being lost because of a lack of controls and pharmacogenomic information, particularly for certain populations, meaning we are making under-informed decisions on our health. Commercial health care providers and companies may be holding onto this information to monetize it, but academics can be just as much to blame here, hoarding data to blackmail their way onto the authorship of papers.
At the “How do I find human genomics data to power my research?” workshop we helped organise in Hong Kong last year, the near total lack of Chinese control data was the biggest complaint we got from our attendees. Policy makers, funders, and scientific journals are not doing their job in holding researchers to account. Serafim rightly stated that, as with recycling, this is a moral issue. Data hoarders (whether in academia, hospitals, or industry) are akin to polluters, and should be treated as such. People should not use data security as an excuse for control, and patients, citizens, and “genome bloggers” should have the legal right to make their data free, be it via the Personal Genome Project, OpenSNP, or new evolving forms of portable legal consent.
In the final keynote that followed, Søren Brunak’s work was a perfect example of where this can go if these barriers are broken down. He talked us through the concept of “disease trajectory”, where using huge, data-rich datasets allows our likely health to be more accurately modelled, predicted, and hopefully controlled. Denmark is the perfect place for this work, being obsessed with keeping healthcare data, and having an opt-out system for sharing eHealth Records (eHRs). With the 3rd highest eHR usage in the world, the entire country is effectively a giant cohort (see this article for more).
On top of human genetic data, other areas have an equally urgent life-and-death impact. At the pre-ISMB16 SIG (Special Interest Group) meetings, Jennifer Gardy’s keynote at BOSC on “The Open-Source Outbreak” really put across how open data and genomics will save us all from the next horrible infectious disease pandemic (see her slides here, video, and our write-up of BOSC).
The ISCB, the society behind ISMB, obviously agrees, this year launching a “Fight against Ebola” award. Of the 14 submissions, Mark Wass (editor of our Automated Functional Prediction series) won the $2,000 prize for work on conserved differences in protein sequence determining the human pathogenicity of Ebola and the non-pathogenic Reston virus (see his paper for more). This great effort will continue next year, but will be broader and become the “Computational solutions to emerging global threat challenge”.
Mick Watson’s keynote at CAMDA on “Can bioinformatics help feed the world?” highlighted that agriculture and food security are another critical area, due to climate change and population pressures. Mick presented this in the context of livestock microbiomes and food production, and our 3,000 Rice Genomes paper was a perfect demonstration of this. The 13.4 terabytes of data published on World Hunger Day in 2014 quadrupled the amount of rice genome data in the public domain. This paves the way for breeders to make more intelligent choices in strain selection, resulting in more accurate and rapid development of rice strains that are better suited to different agricultural environments in poor and environmentally stressed economies.
Happy Birthday to us.
Working hard to solve the world’s problems through computational biology is tiring work, and, like the computers they run, researchers often need some downtime to cool down. In what seems to be becoming a tradition, thanks to the generous support of BMC we hosted a birthday party and drinks reception on the first night of ISMB, allowing us to catch up with many old and new friends over drinks.
Handing out birthday gifts on the final day of the meeting, it was hard to follow our previous Bruce Lee/Kill Bill inspired GigaPanda t-shirts, but we think we managed it, with our new “Game of Omes” design disappearing from the BMC/Springer stand in minutes. With GigaPanda styling a new “Mother of Data” look, our Khaleesi of the Great Bamboo Sea (chosen as her genome was one of the first batch of datasets we published in GigaDB) can be seen posing with her new Open Data sidekick, the Bearded Dragon (one of the latest genomes we published).
The next meeting in Prague will be the 25th anniversary of the conference, and will be a great and popular venue for us to celebrate our 5th birthday. We hope to see many of you there.
UPDATE 18/7/16: Embedded Jennifer Gardy’s keynote video now that it has been posted on YouTube.