Guardians of the Galaxy Workflow

While the Guardians of the Galaxy film franchise has just released its second film, the GigaScience Galaxy series has just published its 10th, 11th and 12th papers. And all without the need for expensive CGI, although we do have our GigaGalaxy server on standby for additional computational support.

For those not part of its large and growing user base, Galaxy is an open, web-based platform for data-intensive biomedical research that allows users to reproduce and share analyses. GigaScience aims to move publishing beyond static papers and to publish and provide credit for ALL research objects. Workflows are one of the key types of “executable data” we have been focussing on, alongside things like containers, virtual machines, and interactive notebooks. Because this aligns perfectly with our aims to increase the reproducibility and transparency of research, on top of promoting papers through our special series we have been using our own GigaGalaxy server (gigagalaxy.net) to assist with the hosting and implementation of Galaxy-based workflows and methods. A previous blog post explores in more detail how we are using this platform to help with visualization of, and interaction with, the data, workflows and analyses in our papers.

Galaxy Series, Volumes 1-12
The papers just out in our Galaxy series span many different areas, but all have Galaxy at their core and are typical of the broad range of papers we have been publishing. These latest examples include a new automated Galaxy tool registry utility, a Galaxy tool collection for cancer research, and a new genomic track analysis platform adapted and built from Galaxy.

There are a lot of bioinformatics tools out there, and the ELIXIR Tools and Data Services Registry (bio.tools) aims to provide a central information point for them all. With more than 80 publicly available Galaxy servers around the world, cataloguing even these is a challenge. Our new paper on ReGaTE (Registration of Galaxy Tools in ELIXIR) presents a new software utility that automates the process of registering the services available in a Galaxy instance with the ELIXIR tools registry.

The Morin lab have produced a collection of Galaxy tools that represent many popular algorithms for detecting somatic genetic alterations from cancer genome and exome data. New methods for parallelizing these tools within Galaxy to accelerate runtime have also been developed and tested in this new work. A paper just out today presents the GSuite HyperBrowser, a comprehensive platform for integrative analysis of track collections across the genome and epigenome. Its dynamic user interface is based upon Galaxy ProTo, an alternative tool definition API for the Galaxy framework.

Galaxy Aficionados Assemble
One of Galaxy's biggest strengths has been its community: the support and training it provides, the way it leverages the advantages of open source with a wide contributor base, and also its welcome for users and its ability to put on a good party. Key to this have been the yearly Galaxy Community Conferences, which we have been attending and participating in for a number of years, and which continue to be one of the highlights of our year. This year we are again silver sponsors. The upcoming GCC2017 conference in Montpellier has extended the submission deadline for its oral track until tomorrow, and its poster and demo track is open until 27th May. We initially launched our series at the 2013 conference, and we are renewing our call for papers for our special thematic series on studies utilizing large-scale datasets and workflows. Judging from the logo the wine should be good too, so we look forward to catching up with many of you there. The community is getting too large and dispersed for just one meetup a year, and regional meetings are starting too. In February we attended and presented at the first Galaxy Australasia meeting in Melbourne (see Scott’s slides), where the strong Japanese contingent were pitching for Japan to host the next meeting, as Australasia contains the word Asia.

The expansion of the Galaxy continues…
We have been supporting Galaxy for several years now, and as an open source and community-driven project it continues to evolve and grow rapidly. A recent guest blog from Björn Grüning, on behalf of The Intergalactic Utilities Commission, outlined their recent efforts integrating the Conda package manager as a new standard for tool dependencies in Galaxy. On top of the continuing development of our platform, the series continues to publish submissions both from the Galaxy Community Conferences and from wider Galaxy-related work. Submissions relating to talks and posters at GCC2017 will be eligible for a continuing 15% discount; just mention this in your submission letter. More papers are currently under review, so keep following the series page to see continuing additions to the series:

https://academic.oup.com/gigascience/pages/galaxy_series_data_intensive_reproducible_research

References

1. Albuquerque MA, Grande BM, Ritch EJ, Pararajalingam P, Jessa S, Krzywinski M, Grewal JK, Shah SP, Boutros PC, Morin RD. Enhancing Knowledge Discovery from Cancer Genomics Data with Galaxy. Gigascience. 2017 Mar 9. doi: 10.1093/gigascience/gix015

2. Doppelt-Azeroual O, Mareuil F, Deveaud E, Kalaš M, Soranzo N, van den Beek M, Grüning B, Ison J, Ménager H. ReGaTE, Registration of Galaxy Tools in Elixir. Gigascience. 2017 Apr 10. doi: 10.1093/gigascience/gix022

3. Simovski B, et al. GSuite HyperBrowser: integrative analysis of dataset collections across the genome and epigenome. Gigascience. 2017 Apr 27. doi: 10.1093/gigascience/gix032
