Type any scientific term into any search engine and it's pretty much guaranteed that a Wikipedia article will be the first hit. Many in the scientific community have been sceptical that a free website maintained by untrained volunteers should dominate the global provision of knowledge, but a growing number of researchers are deciding that it is better to embrace a platform that enables the curation burden to be distributed and potentially "crowdsourced" by the global "hive mind". At last week's ISMB meeting in Long Beach, the biggest gathering of the bioinformatics community, organized by the International Society for Computational Biology (ISCB), this growing acceptance of wikis was clear from the many advocates presenting their work, and ISCB even announced a competition to improve the existing Wikipedia articles about any aspect of computational biology (see the release here, and further coverage in genomeweb).
With education and public engagement a key component of the ISCB's mission, this builds on work from the Computational Biology WikiProject, as well as the PLoS Computational Biology scheme to incentivize contributions to Wikipedia on computational biology themes by offering the academic reward of a Topic Page article in the journal (see the editorial here). Only one paper has been published so far, but many more are currently under peer review. Spencer Bliven, the first author to have his work published this way, promoted the scheme at the BOSC pre-ISMB satellite meeting (our report on that meeting here), with his topic paper on Cyclic Permutation forming the basis of, and linking to, this Wikipedia page.
Something Wiki This Way Comes
The idea of incentivizing and linking academic contributions to Wikipedia with co-ordinated publications is not a new one: the journal RNA Biology has a similar scheme that links publication of an article about non-coding RNAs to a Wikipedia entry. In the 3 years since its launch, 21 articles have come out of this scheme, although this is only a drop in the ocean compared to the >1500 human ncRNAs in miRBase. Wikipedia enthusiast Alex Bateman presented on this and other schemes for incentivizing Wikipedia contributions in a special session at the ISMB on harnessing community intelligence for bioinformatics. Other, more successful attempts to increase the amount of Wikipedia content on computational biology included setting PhD students and job applicants the task of contributing to articles on microRNAs. These produced significant increases in content (with the most productive job applicant getting the job), but such fun examples unfortunately seem unlikely to scale. The ISCB Wikipedia competition takes a different approach to motivation, to see whether it can increase the quantity as well as the quality of content. Alex also gave a very "Heavy Metal Umlaut" style overview (check this excellent video on the evolution of a Wikipedia entry) of his experiences opening up the Rfam database to Wikipedia contributions. Despite vandals and bots producing a lot of noise and small changes, it was reassuring to see that these were quickly rectified and that the vast majority of permanent and detailed changes were produced by Wikimedians and scientists.
Following on from his excellent talk last year (see our previous write-up), Andrew Su presented on one of the big success stories of this approach: Genewiki. The project has built up a massive crowdsourced user base of contributors and readers annotating human gene function for the 10,000 Wikipedia stubs it seeded, and work now seems to be moving towards making this data more computable and available in structured formats (see their BMC Genomics paper for more). A related, but more graphical, use of the wiki system was presented by Alexander Pico on WikiPathways, which rips out the text editing tools of Wikipedia and replaces them with pathway editing software. While steadily building up large amounts of high quality content, WikiPathways has also embraced the incentivisation-by-publication model by teaming up with our sister BMC data-oriented journal Open Network Biology to publish a subset of high quality peer-reviewed models. Using a different approach to harness the online collective, Firas Khatib presented a final talk on solving crystal structures with computer games. The Foldit computer game has built up an army of eager gamers devoting their time to solving structural biology challenges, and several structures, drug targets and algorithms have already been improved and published as a result, some citing these gamers as the "Foldit Contenders Group". It was interesting to see a video of work in progress on a more intuitive interface based on the Xbox Kinect, so as a further incentive it may soon be possible to solve structural biology challenges and get fit at the same time!
We've talked previously about the importance of data curation, and meeting the challenges posed by the ever-increasing volumes of data being churned out is an important part of GigaScience's scope and an area already covered in our launch issue. In most cases community annotation has been extremely difficult to get off the ground, but as we know from our involvement in crowdsourcing the analysis of the dataset from the deadly 2011 E. coli outbreak, which utilized a GitHub wiki, the wiki and gaming approaches seem to be particularly promising ways of doing this. We look forward to following and covering further examples of this in the future.
Image source: Andrew Laing cc wikipedia.