NIH Data Sharing Plan Guide: Data Basics


Definition by the U.S. Office of Management and Budget

The U.S. Office of Management and Budget defines research data as "the recorded factual material commonly accepted in the scientific community as necessary to validate research findings" (OMB Circular A-110, last revised in 1993 and amended in 1999). More recently, in February 2013, the Office of Science and Technology Policy (OSTP) in the Executive Office of the President published a memo entitled "Increasing Access to the Results of Federally Funded Scientific Research," which uses the same definition. The OSTP memo also reiterates that "research data does not include laboratory notebooks, preliminary analyses, drafts of scientific papers, plans for future research, peer review reports, communications with colleagues, or physical objects, such as laboratory specimens."

Definition by the National Institutes of Health (NIH)

The NIH Grants Policy Statement defines research data as “recorded information, regardless of the form or medium on which it may be recorded, and includes writings, films, sound recordings, pictorial reproductions, drawings, designs, or other graphic representations, procedural manuals, forms, diagrams, work flow charts, equipment descriptions, data files, data processing or computer programs (software), statistical records, and other research data.”

Definition by Virginia Tech

Virginia Tech's Policy 13015 (last revised in 2001) applies to ownership and retention of research data, results, and related records, but it does not define research data.

Examples of data include "a sequence of bits, a table of numbers, the characters on a page, the recording of sounds made by a person speaking, or a moon rock specimen" (see the Digital Curation Centre Glossary).

Research data include sensor data, instrument data, geospatial data, collated or aggregated data, observational data, experimental data, simulation data, numerical data, tabular data, textual data, audio/visual data, or any other representation of information that can be communicated and reinterpreted by an expert. The type of data will affect your decisions about file organization, back-up formats, and short- and long-term access.

Data formats are usually determined by how you gather and process data, i.e., by the software used for data collection and analysis. Formats are also determined by: 1) the norms and conventions of your discipline; 2) the options you choose for storing and sharing data; and 3) preferred options for preservation. Some formats are better suited to long-term preservation than others; open, well-documented, non-proprietary formats (for example, plain-text CSV rather than a proprietary spreadsheet format) are generally preferred.
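
As a small illustration of choosing a format with long-term access in mind, the sketch below writes tabular records to plain-text CSV using only the Python standard library. It assumes that CSV is an acceptable format in your discipline and repository, which this guide does not state. The function name `records_to_csv`, the column names, and the sample readings are hypothetical and exist only for the example.

```python
import csv
import io


def records_to_csv(records, fieldnames):
    """Serialize a list of dict records to CSV text with a header row.

    Hypothetical helper for illustration; not part of any NIH or
    institutional tooling.
    """
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()        # first line documents the column names
    writer.writerows(records)   # one line per record
    return buffer.getvalue()


# Made-up sensor readings, used only to show the output shape.
readings = [
    {"timestamp": "2013-02-22T10:00:00Z", "sensor_id": "A1", "temperature_c": 21.4},
    {"timestamp": "2013-02-22T10:05:00Z", "sensor_id": "A1", "temperature_c": 21.6},
]

csv_text = records_to_csv(readings, ["timestamp", "sensor_id", "temperature_c"])
print(csv_text)
```

The resulting text can be saved with a `.csv` extension and opened without the software that produced the data. Units and column meanings are not carried by the file itself, so they would still need to be described in accompanying documentation.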

