by Ben Goldacre
I am a doctor and a data geek. I worry that data geeks are too easily seduced by the glamour of laboratory science and forget about clinics. Randomised controlled trials are the best tool we have in medicine for finding out if a treatment works or not. Lots of trials are done. Unfortunately, the results of these trials can go missing in action after they are completed.
Missing data is always a challenge, but we also know that “negative results” are more likely to go missing. This means we have a biased sample, one that overestimates the benefits of treatments. To prevent this, people have set up registers of trial protocols, to be completed before trials begin. These have not been correctly used, and they are not matched to published trials, a step that would show up what data has been left unpublished.
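The matching step can be sketched in a few lines. This is only an illustration of the idea, not the project's actual method: the function name `unpublished_trials` and the trial identifiers are made up, and real registers and journals would need far messier record linkage than an exact match on a cleaned-up ID.

```python
def unpublished_trials(registered, published):
    """Return registered trial IDs that have no matching publication, sorted.

    Assumption: both sources share an identifier that only needs
    whitespace and letter case tidied before comparison.
    """
    def normalise(trial_id):
        return trial_id.strip().upper()

    published_ids = {normalise(p) for p in published}
    registered_ids = {normalise(r) for r in registered}
    return sorted(registered_ids - published_ids)


# Hypothetical identifiers, invented purely for illustration.
registered = ["TRIAL-001", "TRIAL-002", "TRIAL-003", "TRIAL-004"]
published = ["trial-001", " TRIAL-003 "]

missing = unpublished_trials(registered, published)
print(missing)
print(f"{len(missing)} of {len(registered)} registered trials have no matched publication")
```

A set difference is enough here because the question is simply which registered trials never appear in the published record; anything that needs fuzzy matching on titles or authors is outside this sketch.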
I will describe a small project to fix this, illustrate how that can lead on to fixing other similar problems in medicine, and make a cry for help.
In a research environment, under the current operating system, most data and figures collected or generated during your work are lost, intentionally tossed aside or classified as “junk”, or at worst trapped in silos or locked behind embargo periods. This stifles and limits scientific research at its core, making it much more difficult to validate or reproduce experiments, or even to stumble upon new breakthroughs that may be buried in your null results.
Changing this reality takes not only the right tools and technology to store, sift and publish data, but also a shift in the way we think of and value data as a scientific contribution in the research process. In the digital age, we’re not bound by the physical limitations of analog media such as the traditional scientific journal or research paper, nor should our data be locked into understandings based on those media.
This session will look at the socio-cultural context of data science in the research environment, specifically at the importance of publishing negative results through tools like FigShare – an open data project that fosters data publication, not only for supplementary information tied to publication, but also for all of the back-end information needed to reproduce and validate the work, as well as the negative results. We’ll hear about the broader cultural shift needed in how we incentivise better practices in the lab, and how companies like Digital Science are working to use technology to push those levers and address the social issue. The session will also include a look at the real-world implications in clinical research and medicine from Ben Goldacre, an epidemiologist who has been looking not only at the ethical consequences but also at issues in efficacy and validation.
28th February to 1st March 2012