
Are your research data publicly accessible?

 


 

From today, 1 July 2018, the International Committee of Medical Journal Editors (ICMJE) requires all journals that follow the ICMJE guidelines (the Recommendations for the Conduct, Reporting, Editing, and Publication of Scholarly Work in Medical Journals) to include a statement in clinical trial reports saying whether and how readers can access the study data. Many journals and publishers (including non-medical ones) have also recently introduced data accessibility policies, which range from encouraging data accessibility statements to requiring data uploads as part of manuscript submission and even compulsory peer review of data. Researchers need to know what this international trend is about and how it affects them.

 

Need for data sharing

It has long been a standard requirement that authors submitting work to journals should be prepared to provide the data underlying their study if the journal's editors or peer reviewers ask for them. The journal guidelines should say whether the policy refers to raw data (original data after anonymization) or processed data (after any conversions and analysis). In addition, journals have routinely required certain datasets, such as gene and protein sequences, to be uploaded to a public database.
 
The new international trend is to promote research transparency and to encourage or require authors to upload their study data as a routine part of publishing. The uploading of data files applies to a wide range of disciplines, data types, and databases. There are also general repositories for data uploads, including Dryad, Figshare, and Zenodo. The worldwide drive for data sharing has implications for improved record keeping by researchers and their institutions. Reasons for data sharing include the following:
 
(1) Reproducibility: data sharing allows data transparency for other researchers to verify calculations and results
 
(2) Further analysis: data sharing allows easier pooled analysis of published research, such as meta-analyses
 
(3) Further research: researchers acknowledge the advantages of "open" research/science, "open" innovation, and "open" data to speed up research efforts through sharing and collaboration
 
(4) Credit: uploaded data are assigned permanent identifiers (digital object identifiers [DOIs] or database accession numbers); by citing the datasets that they analyze further in their studies, authors give credit to other researchers, potentially broadening the ways in which researchers are assessed by their institutions and funders
 
(5) Mandates: some institutions, public funders (eg, government research councils and agencies), and foundations, charities, and non-governmental organizations (eg, Wellcome Trust, Bill & Melinda Gates Foundation) require data and research outputs from funded research to be made public as a condition of funding, both as a way of giving back to the public and to encourage reproducibility and open innovation
 
Data transparency is often, but not always, associated with Open Access publishing. Some funders require research publications to be published as Open Access articles. Even when a journal is not Open Access (eg, a traditional subscription journal), if it encourages or requires public sharing of the underlying data, the data files themselves will have Open Access status. In other words, the authors retain copyright and the data are freely available, whether they are stored in an external repository or in the supplementary materials section of the journal's website. The data are usually assigned a Creative Commons Attribution (CC BY) license, meaning that anyone can reuse the data as long as a correct citation is given.
 
Some publishers, such as The Royal Society, help authors upload the data associated with an article to a public repository under an embargo (that is, the data cannot be accessed publicly until the article is published). Some publishers encourage authors of an article to make their data available in a repository and also to publish a brief accompanying "data descriptor" article in a data journal (for example, Nature's Scientific Data) to encourage data reuse.
 
 
 
The new ICMJE data sharing policy
 
Medical journals that state that they follow the ICMJE guidelines will now require a data sharing/accessibility statement in clinical trial reports. The Annals of Internal Medicine has had a similar policy for all research articles since 2007.

The new ICMJE policy has two parts:

  1. As of 1 July 2018 manuscripts submitted to ICMJE journals that report the results of clinical trials must contain a data sharing statement as described below.

  2. Clinical trials that begin enrolling participants on or after 1 January 2019 must include a data sharing plan in the trial's registration.... If the data sharing plan changes after registration this should be reflected in the statement submitted and published with the manuscript, and updated in the registry record.

Data sharing statements must indicate the following: whether individual deidentified participant data (including data dictionaries) will be shared; what data in particular will be shared; whether additional, related documents will be available (e.g., study protocol, statistical analysis plan, etc.); when the data will become available and for how long; by what access criteria data will be shared (including with whom, for what types of analyses, and by what mechanism).

 

The data sharing policy does not mean all data must be shared publicly (there can be exceptions if explained properly, eg, for ethical or legal reasons). It means that authors need to state if and how readers can access the data. For example, authors can say the data are not available, all the data are already in the article, some data are in public online repositories, or readers need to ask the authors for special permission. The questions that must be answered in the data sharing/availability statement are as follows:

  • Will individual participant data be available (including data dictionaries)?
     
  • What data in particular will be shared?
     
  • What other documents will be available (eg, Study Protocol, Statistical Analysis Plan, Informed Consent Form, Clinical Study Report, Analytic Code)?
     
  • When will data be available (start and end dates)?
     
  • Who can access or ask for the data?
     
  • For what types of analyses can readers ask to use the data?
     
  • By what mechanism will data be made available?
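
Authors who prepare submission documents with scripts or templates could treat the questions above as a simple checklist. The following is only a minimal sketch in Python, not an ICMJE tool or an official format: the class name DataSharingStatement, its field names, and the unanswered() helper are all hypothetical, chosen here to mirror the seven questions.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class DataSharingStatement:
    """Hypothetical checklist; one field per question in the list above."""
    participant_data_available: str = ""  # Will individual participant data be available?
    data_shared: str = ""                 # What data in particular will be shared?
    other_documents: str = ""             # What other documents will be available?
    availability_period: str = ""         # When will data be available (start and end dates)?
    who_can_access: str = ""              # Who can access or ask for the data?
    permitted_analyses: str = ""          # For what types of analyses?
    access_mechanism: str = ""            # By what mechanism will data be made available?

    def unanswered(self) -> List[str]:
        """Return the names of the questions that still have a blank answer."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Illustrative answers only; they are not taken from any real trial.
draft = DataSharingStatement(
    participant_data_available="Yes, deidentified, with data dictionary",
    data_shared="Data underlying the results reported in the article",
    other_documents="Study protocol",
)

# Lists the questions the draft statement has not yet addressed.
print(draft.unanswered())
```

A check like this only confirms that every question has some answer; whether the answers satisfy a particular journal still depends on that journal's instructions for authors.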
 
 

TOP guidelines

Since 2015, many journals, both medical and non-medical, have adopted a more general set of guidelines on transparency and data sharing/accessibility, called the Transparency and Openness Promotion (TOP) guidelines.

 
To increase reproducibility and transparency, the TOP guidelines require journals to have policies in place and/or to require author statements in all research articles, in the following areas:
 
  • Citation standards (journal policy on citation of data, code, and materials)
  • Data transparency (ranging from a data sharing/accessibility statement to mandatory uploading of data to a repository)
  • Analytic methods (analytic code) transparency
  • Research materials transparency
  • Design and analysis transparency
  • Study preregistration (journal policy on whether the study protocol should be submitted in advance)
  • Analysis plan preregistration (journal policy on whether the analysis protocol should be submitted in advance)
  • Replication (journal policy on submission of replication studies)
 
Examples of data sharing/accessibility statements
 
Dataset available on request. Please contact the corresponding authors.
The interview guide is available from the first author on request.
No additional data are available.
No additional unpublished data are available.
Data used in this analysis were provided by Xxxxx under license and cannot be shared with other parties.
All relevant data are within this article; the data are owned by Xxxxx and can be accessed by contacting Xxxxx.
An extended version of the dataset is available as supplementary material.
All data are in the public domain.
Data are available by request via www.clinicalstudydatarequest.com.
Data are publicly available online on Dryad (DOI: xxxxxxxxxxx).
 
 
Plan carefully
 
Researchers should check their target journals carefully for relevant data sharing and transparency policies, as well as all other submission/publishing and ethical requirements. Some journals require data to be peer reviewed too; failure to submit requested files or to give advance access to data files will lead to automatic manuscript rejection. The number of journals adopting or enforcing data sharing policies or requiring author statements is set to increase. The latest version (15 January 2018) of the Principles of Transparency and Best Practice in Scholarly Publishing, which ethical journals follow, states the following:
 
A journal shall also have policies on publishing ethics. These should be clearly visible on its website, and should refer to: i) Journal policies on authorship and contributorship; ii) How the journal will handle complaints and appeals; iii) Journal policies on conflicts of interest / competing interests; iv) Journal policies on data sharing and reproducibility; v) Journal’s policy on ethical oversight; vi) Journal’s policy on intellectual property; and vii) Journal’s options for post-publication discussions and corrections.
 
 
Researchers should check funding and institutional requirements and plan carefully, because any data sharing requirements will affect data handling/archiving procedures during the research itself. To help coordinate the process, funders and institutions may now require researchers to submit detailed data management and sharing plans.
 
 

Dr Trevor Lane
Education and Engagement Consultant
Edanz Group

tlane@edanzgroup.com
