All systems go at ICSB 2014 and the Great GigaScience and Galaxy (G3) workshop

The 2014 International Conference on Systems Biology (ICSB) was hosted in Melbourne, Australia’s most livable city and its event, sport, culture and food capital, and GigaScience was proud to be one of the media partners. Stem cell biology was a major theme on the first day, which got off to a strong start with Huck Hui Ng (Executive Director of the Genome Institute of Singapore) giving a great overview of stem cell systems biology. Ng explained how his lab has set up high-throughput capabilities to further understand stem cell pluripotency. His lab has been able to show that maintenance of human embryonic stem cell stability is governed by proteins, transcription factors (TFs) and co-factors, mediators of INO80 splicing complexes, and human endogenous retrovirus subfamily H (HERVH). Andras Nagy (Mt Sinai Hospital) presented on the routes of reprogramming to alternative states of pluripotency, focusing on F (or fuzzy)-Class stem cells in comparison with C (compact)-Class stem cells, and on how F-class cells can form teratomas in mice. Nagy took this opportunity to highlight the Stemformatics public experiment portal, which describes how mouse and human stem cells differentiate, as well as the Project Grandiose Dataset, which includes transcriptome, proteome and epigenome data.

Further inspirational presentations showcased how broad a field encompassing “biological systems” can be, including subjects as diverse as genomic analysis of transmissible cancers, namely Devil Facial Tumour Disease in Tasmanian devils and canine transmissible venereal tumours in dogs, by Elizabeth Murchison (University of Cambridge). Kat Holt (University of Melbourne) was one of the major contributors of results to the crowdsourced analysis of the deadly 2011 German E. coli “sproutbreak” that we helped kick-start (see “notes from a tweenome”), and it was great to see some of her latest work on bacterial pathogen transmission, evolution and resistance. This was also timely: in the month that we released Oxford Nanopore data (download it here), it was good to see one potential application for this kind of portable technology in her group’s work to understand multidrug resistance in pathogens, in particular Shigella sonnei drug resistance and outbreaks across Vietnam.

Networks, Physiology and Imaging

Advances in imaging were a major theme at this year’s ICSB, with several impressive presentations showcasing Hollywood-grade short films. Mark Ellisman (University of California) highlighted the complexity of cell structure and neural connections, showcasing his work on producing large-scale brain maps and virtual cells with detailed maps of the endoplasmic reticulum and nuclei. Ellisman aims to integrate big biological data to create a “Google Earth” for the brain, with labels for every neuron added via crowdsourcing. Peter Hunter (University of Auckland) gave an impressive presentation on his group’s efforts to link molecular systems biology with clinical medicine using computational physiology approaches. Hunter works on multi-scale modelling, capturing imaging, molecular and structural data and overlaying these to produce a mathematical model of the heart. Other examples his group is working on include respiratory disease and musculoskeletal multi-scale modelling. Impressively, Hunter also links his multi-scale modelling data with medical informatics in a project termed “ApiNATOMY”, and is involved with the Virtual Physiological Human (VPH)-SHARE project, a European Commission funded initiative linking clinical workflows with eHealth records. He has also helped develop frameworks to handle multi-scale models in clinical settings, from imaging, through the computational processing of an organ or tissue model, to analysis, output, and surgical risk planning and assessment.

Introduced as the pioneer of the concept of Systems Medicine and the godfather of high-throughput sequencing, Lee Hood (Institute for Systems Biology) provided a perfect ending to the second day of talks with yet another inspirational keynote lecture. Hood emphasised the Holy Trinity of Systems Biology (biology, technology and computation), as well as the five conceptual pillars for Systems Medicine: 1) informational science, 2) infrastructure and culture, 3) holistic systems and dynamic experimental approaches, 4) emerging technologies and systems-driven strategies, and 5) pioneering analytical tools. Having seen his personalised medicine talks at many conference keynotes, we found it encouraging to see how Hood’s P4 Medicine movement has progressed. Practical examples presented included a large family genomics study that has so far identified 13 potential high-penetrance bipolar mutations in more than 6,000 genomes and 1,500 families with Inova; blood-based proteomic diagnostics for prion disease; the development of the first quantitative assay for a psychiatric disease, post-traumatic stress disorder in soldiers sent to Afghanistan; thought-provoking consumer-driven networking, described as patient-activated social networks, involving crowdsourcing efforts by patients to learn together how to use new medicines; and the 100K Wellness Project and its pilot, Pioneer 100.

Standards and Reproducibility

A hot topic of the meeting was reproducibility, and Michael Hucka (Caltech) chaired and presented in a dedicated track, “Reproducibility of computational research: methods to avoid madness”, on what is one of our biggest topics of interest. Hucka focussed on methodological rather than cultural issues, and acknowledged that because we have complete control of a computer there should, in theory, be fewer excuses than for wet-bench scientists. Despite this, reproducibility in computational biology is still mostly aspired to and rarely achieved. Things have improved, but we can do better. David Lovell (CSIRO) presented a practical example of what you can and can’t say about biological systems, based on his preprint in bioRxiv on how calculating proportionality is a valid alternative to correlation for relative data, emphasizing that we shouldn’t correlate proportions but use proportionality instead.
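
To make the point about relative data concrete, here is a minimal sketch of our own, not code or data from Lovell's preprint. It uses made-up abundances for four hypothetical components (a, b, c, d), converts them to proportions, and compares Pearson correlation with the variance of the log-ratio. The log-ratio variance is a simple quantity from compositional data analysis that we assume here as a stand-in for a proportionality measure; the statistic in the preprint may be defined differently.

```python
import math
import random
import statistics


def close(rows):
    """Turn each row of absolute abundances into proportions that sum to 1."""
    closed = []
    for row in rows:
        total = sum(row)
        closed.append([value / total for value in row])
    return closed


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def log_ratio_variance(xs, ys):
    """Variance of log(x/y): zero exactly when x is a constant multiple of y."""
    return statistics.pvariance([math.log(x / y) for x, y in zip(xs, ys)])


# Hypothetical absolute abundances (illustrative only, not real measurements).
rng = random.Random(0)  # fixed seed so the made-up data are repeatable
n = 200
a = [rng.uniform(1, 10) for _ in range(n)]    # generated independently of b
b = [rng.uniform(1, 10) for _ in range(n)]
c = [2 * x for x in a]                        # exactly proportional to a
d = [rng.uniform(1, 1000) for _ in range(n)]  # large component that dominates the total

# What a relative-abundance experiment would actually observe.
pa, pb, pc, _ = (list(col) for col in zip(*close(zip(a, b, c, d))))

print("corr(a, b), absolute:       ", round(pearson(a, b), 3))
print("corr(a, b), proportions:    ", round(pearson(pa, pb), 3))
print("var log(a/b), absolute:     ", log_ratio_variance(a, b))
print("var log(a/b), proportions:  ", log_ratio_variance(pa, pb))
print("var log(a/c), proportions:  ", log_ratio_variance(pa, pc))
```

The shared total cancels in every ratio, so the log-ratio variance is the same for absolute and relative data and drops to zero for the truly proportional pair (a, c). The Pearson correlation of the proportions carries no such guarantee: it can differ from the correlation of the underlying absolute values, because every proportion is divided by the same varying total.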

Last but not least, John Mattick (Director of the Garvan Institute of Medical Research) closed the conference with an impressive talk highlighting the massive hidden layer of RNA regulation in complex organisms. Summarising his work over the last few years on RNA transcripts, he showed just how mind-blowingly complicated RNA regulation of gene expression really is. Mattick ended with some food for thought: can we consider the eukaryotic genome as an RNA machine? Is RNA the only thing that makes us different? Given the vast range of topics covered at this meeting, it was difficult to include them all here; however, a major theme going forward was that the right tools will be needed to interpret big data and networks.

Tackling Irreproducible Research – The G3 Workshop

David Flanders and Nick Wong at the G3 workshop

Hosted at the University of Melbourne and kindly organised by Nicholas Wong (previously of the University of Melbourne, but now affiliated with Pacific Edge Diagnostics, NZ), “The Great GigaScience and Galaxy (G3) Workshop” was made possible thanks to its sponsors: the VLSCI, the Australian Bioinformatics Network, ResBaz, Illumina and The University of Melbourne. Scott Edmunds, Nicole Nogoy and Rob Davidson of GigaScience all took part in this one-day workshop, which aimed to raise awareness of issues surrounding open access, open data and irreproducible science, and of course of what GigaScience has been doing to improve transparency in publishing and tackle irreproducible research, including the technical aspects involved. The workshop featured a great range of talks, including one from David Vaux (Walter and Eliza Hall Institute), who gave an interactive, statistics-refresher-like presentation on his ten rules for reproducible data. Vaux shared his frustrations with how often figures and error bars are misrepresented or, in most cases, not labelled in several papers published in Nature, Cell and PNAS, leaving readers with no way to interpret the data. He highlighted that such mistakes arise from authors, reviewers and editors not reading the papers carefully, and that such carelessness promotes irreproducible science. GigaScience Commissioning Editor Nicole Nogoy presented an open access publishing 101 and raised awareness of the current challenges that open access and funders are facing; Executive Editor Scott Edmunds followed by highlighting the reproducibility crisis and the need for transparency; and Data Scientist Rob Davidson gave a great overview of open source tools for reproducible research.
Following nicely from Rob’s techy talk, Maria Doyle (Peter MacCallum Cancer Centre) gave a nice overview of Galaxy for reproducible research, and the talks ended with a funder’s perspective from Clive Morris (National Health and Medical Research Council) on what the NHMRC is doing to increase open access and open data in Australia. The presentations were followed by a positive discussion between the speakers and workshop participants, facilitated by David Flanders (VLSCI). Each speaker was asked to name one practical action participants could take right after the workshop, and participants were then asked to vote on which action they thought was best and wanted to know more about. The consensus was for better training, which was very topical coming the day after the University of Melbourne beta-trialled its first Data Carpentry workshop, and after our and ELIXIR-NL’s recent “bring your own data parties” (see the write up on this). The workshop concluded with two separate afternoon sessions: an Authorea boot camp and a hands-on Galaxy 101 training session. With its amazing weather, company, Victorian charm, excellent dining and cosmopolitan feel, Melbourne certainly left a soft spot in the hearts of three members of GigaScience’s Editorial Team.