NIH Data Sharing Plan Guide: Writing Data Sharing Plans

DMPTool and Templates

DMPTool is developed by the California Digital Library to help researchers prepare data management and sharing plans for specific funding agencies, including NIH. Virginia Tech is a participating institution, and anyone in the Virginia Tech research community can create an account on the DMPTool site. When you are asked to 'select your institution', choose Virginia Tech and you will be directed to VT's login page. We encourage VT researchers to use DMPTool for the following benefits:

  • DMPTool provides instructions that help you fulfill the data management and sharing requirements of your grant.
  • DMPTool's review function enables librarians to effectively provide data management consultation services.
  • DMP templates available via DMPTool are customized for Virginia Tech researchers.

What to Include in Your Data Sharing Plan

The following five questions come from the NIH document Key Elements to Consider in Preparing a Data Sharing Plan Under NIH Extramural Support.

  1. What data will you share?
  2. Who will have access to the data?
  3. Where will you make the data available?
  4. When will you share the data?
  5. How will researchers locate and access the data?

In accordance with these five elements, we recommend working through the specific questions below to identify what content to include in your data sharing plan.

Key Element #1: What Data Will You Share?

To optimize the benefits of data sharing, final research data should be shared along with the metadata and descriptors that make the data meaningful and usable by other researchers:

  • What type of data will be collected and shared (e.g., genomic, physiological, clinical, medical history, etc.)?
  • Will the study include unique data that cannot be readily duplicated (e.g., large surveys that are too expensive to replicate; studies of unique populations, such as centenarians; studies conducted at unique times, such as a natural disaster; studies of rare phenomena, such as rare metabolic diseases; etc.)?
  • Will individual-level data or raw data also be shared, and if so, will the whole data set be shared?
  • Will aggregate data (e.g., summary statistics or tables) also be shared? Will the analytical methods used (tools and parameters) be defined?
  • What data quality control measures will be implemented?
  • What data documentation will be shared (e.g., metadata, descriptors, schema) so that others can understand and use the dataset and to prevent misuse, misinterpretation, or confusion?
  • What commonly accepted data standards or standardized vocabularies will be used to enable others to interpret the data and improve interoperability with other data systems?
  • What format will be used to encode the data? Will this format be consistent with extant, commonly used, and/or open format standards?
  • In addition to final research data, what other data will be available?
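
As one way to act on the documentation and format questions above, the following minimal sketch checks that every column in a shared CSV file is described in a data dictionary, and serializes that dictionary as JSON so it can travel with the data. The variable names, units, and the helper functions are hypothetical and chosen only for illustration; they do not come from the NIH guidance or from any repository's required schema.

```python
import csv
import io
import json

# Hypothetical data dictionary for illustration; replace with your own variables.
DATA_DICTIONARY = [
    {"name": "participant_id", "type": "string", "units": None,
     "description": "Study-assigned ID (not a personal identifier)"},
    {"name": "age_years", "type": "integer", "units": "years",
     "description": "Age at enrollment"},
    {"name": "systolic_bp", "type": "number", "units": "mmHg",
     "description": "Resting systolic blood pressure"},
]


def undocumented_columns(csv_text, dictionary):
    """Return CSV header columns that have no entry in the data dictionary."""
    header = next(csv.reader(io.StringIO(csv_text)))
    documented = {entry["name"] for entry in dictionary}
    return [col for col in header if col not in documented]


def dictionary_as_json(dictionary):
    """Serialize the data dictionary in an open, widely readable format."""
    return json.dumps({"variables": dictionary}, indent=2)


# Made-up sample data with one column missing from the dictionary.
sample_csv = (
    "participant_id,age_years,systolic_bp,visit_date\n"
    "P001,67,128,2020-01-15\n"
)

print(undocumented_columns(sample_csv, DATA_DICTIONARY))
```

Running a check like this before deposit flags columns (here, the hypothetical `visit_date`) that other researchers would have no way to interpret. Many repositories define their own metadata requirements, which take precedence over an ad hoc layout like this one.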

Key Element #2: Who Will Have Access to the Data?

To maximize the benefits of data sharing, data should be shared as broadly as possible to the extent consistent with applicable laws, regulations, rules, and policies. In describing who will have access to data, a data sharing plan should indicate:

  • Will the general public have access to some or all of the data?
  • Will access to certain data or certain components of the data be restricted to qualified researchers, e.g., to address specific rules, laws, regulations, or policies (e.g., IRBs, human subjects, informed consent, etc.)?
  • If data access is restricted, what are the justifications/criteria for restricting access (e.g., relevant laws (local, state, federal, etc.), regulations, rules, institutional policies, IRB approvals, and consent documents)?
  • What will researchers who seek to obtain data need to do to comply with any data access restrictions?
  • Are there any limitations on release of data that may be considered "sensitive"?
  • What data sharing agreements will be necessary to appropriately restrict the transfer of protected, sensitive, or confidential data to others and to require that data be used only for research purposes?
  • Who will be operationally responsible for ensuring that no personally identifiable information is made available (e.g., principal investigator, independent curator)?
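
Whoever takes on that responsibility may want a simple automated screen as a first pass. The sketch below drops columns whose names appear on a list of direct identifiers before a file is shared. The identifier list and the function name are assumptions made for illustration. A name-based screen like this is not de-identification: it does not catch identifiers stored under other column names or combinations of fields that could identify a person, so it supplements rather than replaces review under your IRB and institutional policies.

```python
import csv
import io

# Hypothetical list of column names treated as direct identifiers.
DIRECT_IDENTIFIERS = {"name", "email", "phone", "ssn",
                      "street_address", "date_of_birth"}


def strip_identifier_columns(csv_text, identifiers=DIRECT_IDENTIFIERS):
    """Return (cleaned CSV text, list of removed column names)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = [c for c in reader.fieldnames if c.lower() not in identifiers]
    removed = [c for c in reader.fieldnames if c.lower() in identifiers]

    out = io.StringIO()
    # extrasaction="ignore" silently skips the columns that are not in `kept`.
    writer = csv.DictWriter(out, fieldnames=kept,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue(), removed


# Made-up record for illustration only.
raw = (
    "participant_id,Name,Email,age_years\n"
    "P001,Jane Doe,jane@example.org,67\n"
)
cleaned, removed = strip_identifier_columns(raw)
print(removed)
print(cleaned)
```

Returning the list of removed columns alongside the cleaned data gives the responsible person a record of what was withheld, which can also be cited in the plan itself.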

Key Element #3: Where Will You Make the Data Available?

To minimize additional administrative workloads for sharing of data, data repositories with common standards and an established infrastructure dedicated to the appropriate distribution of data would generally be ideal for data sharing:

  • Will an existing database, data repository, data enclave, or archive be used to store and disseminate the data (e.g., dbGaP, National Database for Autism Research (NDAR)), and if so, how are the policies and procedures in place for others to access the data consistent with applicable NIH policies? This table lists NIH-supported data repositories that make data accessible for reuse.
  • Will a new repository need to be developed, and if so, who/what will maintain the repository?
  • Will the data be distributed directly by an investigator to those who request it (e.g., through an electronic file)?

Key Element #4: When Will You Share the Data?

To optimize the timely and broadest usage of data, data should be made available as soon as possible and for as long as possible:

  • Indicate the schedule for release of data:
    • What data, if any, will be shared prior to publication?
    • What data will be shared upon acceptance of publication?
    • If using a repository, when will data be submitted to the repository?
  • Will data from ongoing longitudinal studies be released in increments as data become available?
  • Will the timing of data sharing be specifically linked to other relevant policies concerning the timing of release of data (e.g., NIH GWAS policy, specific requirements in the funding opportunity announcement (FOA))?
  • How will data maintenance and access be ensured after the award ends?
    • Will there be support for continued sharing of data (e.g., through grant applications, administrative supplements, or other sources) or planned migration of data to another database, data repository, etc.?

Key Element #5: How Will Researchers Locate and Access the Data?

To optimize usage of the data, researchers need to be able to easily identify locations of relevant data and to be able to easily access the data:

  • What steps will be taken to help researchers know that the data sets exist?
    • Will registries, repositories, indexes, word-of-mouth, publications, and/or other approaches be used to publicize the availability and accessibility of the data?
    • Will these be linked and cross-referenced so other researchers can readily find them?
  • How will the data be accessed (e.g., web service, FTP, etc.)?

Need Help?

We invite you to contact us with questions, for assistance in completing data sharing plans, or for options to include in your plan, such as data repositories in your field, standardized vocabularies and formats for data, Creative Commons licensing choices that specifically describe how data sets may be used by others, and data publication options.

  • Andi Ogier, Associate Director, Data Services Unit
  • Ginny Pannabecker, Health Sciences Research Support Coordinator; Health, Life Science & Scholarly Communication Librarian