The (genomics) view from the 57th floor

June 22, 2012

In a busy summer for meetings, this month we attended and presented at the Bio-IT World Asia conference in Singapore. In this era of more globalized biology, and to celebrate the 10th anniversary of the usually Boston-based conference series, Bio-IT World swapped their usual New England lobster for chili crab and headed east. The meeting’s proximity to the Singapore Biopolis (to which we paid a visit), and its location at the 57-story Marina Bay Sands resort (with its rooftop infinity pool, the world’s highest), were a winning combination in getting an impressive mix of scientists from around the Asia-Pacific and beyond (and Newt Gingrich) to attend and present. With tracks on IT infrastructure and the cloud, drug discovery informatics, bioinformatics, and NGS data management and interpretation, it matched GigaScience’s “big-data” scope very well, and a lot of interesting and relevant work was on display.

Chris Dagdigian from BioTeam, with one of the opening keynotes on “Bio-IT Trends from the Trenches” (slides here), set the scene for the whole meeting and summarized the issues very well; predictably, the message from this and other talks was fundamentally about problems of scale. With chemistries changing faster than data centers and research IT infrastructure can be refreshed, he was understandably pessimistic about sustainability, but the EBI’s work on the CRAM compression algorithm was cited as one possible hope for the future. Dag was typically blunt in tackling cloud computing, and was wary of any vendors lacking Amazon API compatibility. After his challenge to “cloud pretenders” that no APIs and no self-service does not equal a cloud, the many cloud talks had a tough act to follow. Rising to this challenge, Xing Xu presented our BGI Cloud colleagues’ new “Easy Genomics” cloud-based bioinformatics platform. With a user-friendly data analysis platform and connection to Aspera, the trial version presented included 6 GPU-based and cloud-optimized NGS-analysis tools, including BGI’s new SOAP Hadoop. Xing also plugged the current free trial of the service, which readers can sign up for if they are quick (for more see).

Another area close to our heart is transparency of research and ease of data reuse, and James Taylor from Galaxy focused his talk on what he feels is the crisis in genomics research: reproducibility. Giving an overview of the popular Galaxy workflow environment, he focused on how it aids the sharing of methods and helps close this reproducibility gap. The growing popularity of Galaxy was seen in the number of talks presenting work using it, with Andrew Lonie (VLSCI) presenting on the Australian Genomics Virtual Laboratory, which uses Galaxy and Biolinux in the Australian national research NeCTAR cloud. Tin-Lap Lee (our collaborator at CUHK) also promoted the Galaxy platform he is working on with us to handle data from the GigaScience journal and GigaDB.

Another subject of great importance to us is open data, and many public releases of human genomes were on display at the meeting. Other than the 69 public genomes presented by Richard Tearle from Complete Genomics, most of these were from less well studied Asian populations or related to Asian-specific diseases, and are extremely important for studying genetic variation and pharmacogenomics in populations in which drugs may not work so well. Jong Bhak presented on the Korean node of the Personal Genome Project, which has 38 genomes currently available for download, all under his interesting, completely open BioLicence, pioneered years before data licenses started to be considered for genomics data. Stephen Rudd from MGRC also presented on “MyGenome”, the Malaysian genome project, which has sequenced 26 genomes from 6 of the many ethnicities of Malaysia.

GigaScience (following a talk from the competitively named Teradata) presented on our recent successes in data citation and dissemination through GigaDB, in a very e-Science and big-data oriented session of the IT infrastructure and the cloud track. For more, check out the slides below. This and other Bio-IT World meetings this year can be followed on Twitter using the #BioIT12 tag, and the coverage from this meeting has been handily archived here. While the location for the next meeting hasn’t been decided yet, wherever it is, it will have a hard act to follow in the quality of science on display, and in the view from the swimming pool.

View more PowerPoint from GigaScience, BGI Shenzhen


Scott Edmunds
