Research Data Management Guide: Metadata
What Is Metadata?
Metadata is a standardized set of information about your data. It can be used to document, describe and annotate research data. Well-created and well-maintained metadata is key to the long-term use, re-use, integration and interoperability of data.
Why Is Metadata Important?
Metadata provides critical standardized information that creates meaningful links between the data itself and potential uses of the data. Metadata can also provide details about the context in which the data was created, including time, location, creator, and instrument, among other fields. Using a rich and well-formed metadata standard to describe the data with relevant keywords, terms and phrases makes research data more easily searchable, accessible and usable.
How Is Metadata Different from Data Documentation?
Metadata is a form of documentation that follows a particular standard agreed upon by a specific community or group. If you need more general information about documenting your data, see our page on data documentation.
Are Metadata Standards Different by Disciplines?
A metadata standard (or schema) is a fixed set of elements agreed upon by a community or group. Some scientific disciplines and data repositories have established their own metadata standards for data sets; others use general standards, such as Dublin Core (DC) or the Metadata Object Description Schema (MODS). These general standards can often be adopted as-is or adapted to fit a particular research project and situation.
The Research Data Alliance Working Group has created a comprehensive Metadata Directory to help researchers identify subject-specific metadata standards.
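As an illustration of what a record in a general standard can look like, here is a minimal Python sketch that writes a handful of Dublin Core elements as XML. The namespace URI is the one used by the Dublin Core element set; the `record` wrapper element, the `build_dc_record` helper and all field values are assumptions made up for this example, and a real repository may expect a different wrapper or additional elements.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields):
    """Return an XML string with one dc:* element per (name, value) pair."""
    # "record" is a hypothetical wrapper element, not part of Dublin Core.
    root = ET.Element("record")
    for name, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical values describing an imaginary data set.
example_fields = [
    ("title", "Stream temperature readings, 2021"),
    ("creator", "Example, Researcher"),
    ("date", "2021-06-30"),
    ("subject", "hydrology"),
    ("format", "text/csv"),
]

record_xml = build_dc_record(example_fields)
print(record_xml)
```

Keeping the fields as a list of pairs allows an element such as `subject` to be repeated, which Dublin Core permits.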
Three Basic Categories of Metadata Standards
- Descriptive metadata describes and identifies information resources, enabling searching and retrieval;
  e.g. unique identifiers, physical attributes (media, dimensions, condition), and bibliographic attributes (title, author/creator, language, keywords).
- Structural metadata expresses how the components of a set of associated data relate to one another, such as how the tables in a database are linked or that photograph B was included in a manuscript;
  e.g. tags such as title page, table of contents, chapters, parts, errata, index, sub-object relationship.
- Administrative metadata addresses preservation practices, information about ownership and rights, and technical metadata about formats;
  e.g. technical data such as scanner type and model, resolution, bit depth, color space, file format, compression and light source; ownership and rights data such as owner, copyright date, copying and distribution limitations, and license information; and preservation activities (refreshing cycles, migration, etc.).
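The three categories above can be sketched as a simple Python structure that keeps descriptive, structural and administrative fields apart. This is only an illustration: the `DatasetMetadata` class, its `missing_categories` helper and every field name and value are hypothetical and do not come from any particular metadata standard.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class DatasetMetadata:
    """Hypothetical container grouping metadata by the three categories."""
    descriptive: dict = field(default_factory=dict)
    structural: dict = field(default_factory=dict)
    administrative: dict = field(default_factory=dict)

    def missing_categories(self):
        """Return the names of categories that have no fields yet."""
        return [name for name, values in asdict(self).items() if not values]


# Hypothetical example record for an imaginary data set.
record = DatasetMetadata(
    descriptive={
        "identifier": "example-0001",
        "title": "Stream temperature readings, 2021",
        "keywords": ["hydrology", "temperature"],
    },
    structural={
        "parts": ["readme.txt", "data.csv"],
        "relationships": {"data.csv": "described by readme.txt"},
    },
    administrative={
        "file_format": "text/csv",
        "license": "CC BY 4.0",
        "owner": "Example Lab",
    },
)

print(record.missing_categories())
```

A check like `missing_categories` can serve as a quick reminder that a record covers all three categories before the data is deposited.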