Sharing Genomic Data in the Precision Medicine Era

October 9, 2015

The Human Genetics Massive: #ASHG15 in Baltimore

This week the human genetics “tribe” (as NIH Director Francis Collins referred to “his people” here) has muscled out the Eastside and Westside crews to take over the Baltimore waterfront for the yearly American Society of Human Genetics (#ASHG15) meeting. The GigaScience editors have attended this, the world’s largest human genetics meeting, many times over the years, but standing in the cavernous auditorium this year really drove home how large the meeting, and the field as a whole, has grown. With attendance pushing well over 6,000 this year, Neil Risch said in his Presidential address that while this was the 5th time the meeting had come to Baltimore, making the city the joint top venue for hosting it, this would likely be the last time for a while, as the meeting has outgrown the self-proclaimed “Greatest City in America”. The explosion of people working in the field has mirrored the explosion of material for them to work on, generated by next (and now next-next) generation sequencing technologies. Over the society’s long history, membership has changed with the technological and molecular trends, moving from clinicians towards scientists, but as this genomic data starts to become routine in clinical applications, membership is likely to start moving back the other way.

The ASHG president has much to do with this increase in data scales: his seminal paper on “The Future of Genetic Studies of Complex Human Diseases”, published nearly two decades ago, predicted and promoted the rise of cohort and genome-wide association studies (GWAS). This paper was mentioned numerous times in the opening Presidential Symposium on “Genomic Epidemiology at Scale”, not least by Francis Collins in his talk on the US Precision Medicine Initiative. Announced at the US State of the Union Address this year, this initiative is building a US national research cohort of over one million people. Many of these participants will be gathered by combining and integrating data from ongoing projects at Kaiser Permanente and US Veterans Affairs, but on top of these “synthetic cohorts”, large numbers of volunteers are needed to make the cohort representative of the wider population as a whole. It was great to see the NIH Director stress the importance of data sharing, both with patients, who should have the right to access their own data, and with any researchers with a good research idea, who should be able to access it. Sharing is essential to empower and engage patients, and to enable the connections and sample sizes needed for this type of precision medicine to work. The NIH is currently working on a new federal policy on biospecimens. Topically for the Baltimore audience, these policies are being drafted with the family of Henrietta Lacks (of HeLa cell fame) in mind, and the fact that many of the family were present at the meeting got a large round of applause.

The importance of engaging with patients was passionately presented in the “art of science communication” track by “BRCActivist” and “Previvor” Andrea Downing (AKA Brave Bosom). Andrea became an ePatient activist after being diagnosed with a BRCA mutation, and trying to understand the implications and uncertainty surrounding her 87% risk of breast cancer. She very eloquently stressed to the audience that actionable genes are not actionable when you don’t have access to the information or medical care you need. A vocal campaigner against Myriad, she pointed out that while their gene patents are now over, there is still a big data-sharing problem, with most of the useful genetic data still in the proprietary databases of former patent holders. Patients are keen to participate and join studies, undergoing batteries of tests and filling out incredibly long surveys, but in most cases they get no information back in return. Even when they do receive results, there is no translation of what the results mean for them. Information is behind paywalls, a majority of patients have no access to genetic counselors, and their regular doctors are not trained and experienced enough to help.

It’s all about the data. The human genetics data re-up
As Lea Starita pointed out in her later plenary (and as was also obvious from walking through the exhibitor area), since the Myriad patent on BRCA gene testing was overturned, genetic testing has exploded, with lower costs leading to more companies testing more genes. Unfortunately, genes less well characterized than BRCA1/2 need new technologies to screen variants of unknown significance, and there are huge challenges in scaling out to new genes. Matchmaking and sharing of data between patients is needed to properly understand phenotypes, and genetic data needs to break out of the proprietary silos highlighted by Andrea Downing to enable this. Many of the problems are due to the perceived legal and ethical issues surrounding genetic data because of the possibility of re-identification, but many tools and schemes are trying to address this issue and open this data up. The Global Alliance for Genomics and Health (of which we at GigaScience are members) has tried to fill this niche with secure ways of sharing variant information through its Matchmaker Exchange, BRCA Challenge and Beacon Project. Nicely, they had both an information session and a booth at ASHG.

At GigaScience, we continue to experiment with and improve new ways of sharing our human genomic data: by setting up our own Beacon (with some initial test data at http://giga-beacon.org/), by making it easier to access any of our data held in external controlled access databases (e.g. this example in GigaDB making the data access committee application more straightforward), and through the newly announced collaboration with Repositive to index our human genomic data. Experimenting with this further, we are hosting a workshop with Repositive and OpenSNP on the 26th of October in Hong Kong.

Aimed at researchers struggling to find the data they need to power their research, this workshop will demonstrate a number of ways of searching for, finding and accessing human genomics data that has been consented for research use. In a similar manner to our “Bring Your Own Data Parties”, we will cover how data that you have in your lab can be made as useful as possible for yourself and your colleagues, and how this data may be the starting point for finding your next research collaboration. Representatives of Repositive, OpenSNP, and GigaScience will provide hands-on training on using tools and data resources such as openSNP, 1000 Genomes, Global Alliance Beacons and GigaDB. This will be followed by a more general interest evening lecture on “Hacking the human genome” (see the event page here). Both events are free and are hosted by MakerBay, and you can sign up on the Eventbrite page if you want to attend.

Scott Edmunds