
An Introduction to Geospatial Mapping: Geospatial Mapping Data

This guide introduces Virginia Tech community members to the concepts of ‘Geospatial Mapping’ and ‘Geographic Information System (GIS)’ and shows how to make maps using different geospatial mapping software.

Types, Sources & Formats of Geospatial Data

The two primary types of geospatial data are vector and raster data.

Vector Data
Vector data represents geographic features on the Earth’s surface using x and y coordinates (vertices). Features are represented in three ways:

  • Points - A point is represented by a single x, y coordinate on the Earth's surface. There can be many points in a vector point file, each with its own unique location. Examples of features represented by points include restaurants, hospitals, tourist attractions, electric poles, etc.

[Figure: points]

  • Lines - Lines are composed of two or more points that are connected, creating a continuous path. For instance, roads, rivers, and railway tracks are features that may be represented by a line. A line is composed of a series of segments; each “bend” in the road or stream is a vertex that has a defined x, y location.

[Figure: lines]

  • Polygons (areas) - A polygon consists of 3 or more vertices that are connected and closed. Examples of features represented by polygons are buildings, boundaries, land parcels, lakes, etc. Polygons are often used to represent areas on a map, and can have attributes like area, perimeter, and type of land use.

[Figure: polygons]

In vector data, each point, line, or polygon is defined by specific coordinates, making it precise and suitable for representing features with clear boundaries. This type of data is commonly used in geospatial mapping to create maps and analyze spatial relationships. It's like connecting the dots on a map to create a detailed picture of the world around us. 
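As a rough illustration of the three vector types, the sketch below stores a point, a line, and a polygon as plain coordinate lists and derives length, area, and perimeter from the vertices. The coordinates are made up and are assumed to be in a projected coordinate system measured in meters; the function names are hypothetical and are not part of any GIS package.

```python
import math

# Hypothetical coordinates in a projected system (units: meters).
point = (3.0, 4.0)                                            # a single x, y pair
line = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]                  # ordered vertices
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]    # ring, closed implicitly


def line_length(vertices):
    """Sum the straight-line length of each segment between consecutive vertices."""
    return sum(math.dist(a, b) for a, b in zip(vertices, vertices[1:]))


def polygon_area(vertices):
    """Planar area via the shoelace formula; the last vertex connects back to the first."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(vertices):
    """Length of the closed ring, including the segment back to the first vertex."""
    return line_length(vertices + vertices[:1])


print(line_length(line))           # 11.0
print(polygon_area(polygon))       # 12.0
print(polygon_perimeter(polygon))  # 14.0
```

These planar formulas only make sense for projected coordinates; latitude and longitude values would need to be projected first.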

Vector data has a backend database, normally referred to as an “attribute table”, that stores attributes of each feature, such as the length of a road or the height of a building. This allows more detailed and precise information to be recorded about specific locations.

[Figure: attribute table]

The image above shows a polygon and its attributes as represented in the attribute table.
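To make the link between geometry and attribute table concrete, here is a minimal sketch in which each feature pairs a polygon with one row of attributes. The building names, field names, and values are invented for illustration, and `select_by_attribute` is a hypothetical helper rather than a function from any GIS software.

```python
# Each feature pairs a geometry with one row of the attribute table.
# All names and values below are made up for illustration.
features = [
    {"geometry": [(0, 0), (40, 0), (40, 30), (0, 30)],
     "attributes": {"name": "Building A", "land_use": "academic", "height_m": 18.0}},
    {"geometry": [(60, 0), (90, 0), (90, 20), (60, 20)],
     "attributes": {"name": "Building B", "land_use": "residential", "height_m": 9.5}},
    {"geometry": [(0, 50), (30, 50), (30, 80), (0, 80)],
     "attributes": {"name": "Building C", "land_use": "academic", "height_m": 24.0}},
]


def select_by_attribute(features, field, value):
    """Return the features whose attribute `field` equals `value`."""
    return [f for f in features if f["attributes"].get(field) == value]


# Print the attribute table: one row per feature, one column per field.
fields = ["name", "land_use", "height_m"]
print(" | ".join(fields))
for feature in features:
    print(" | ".join(str(feature["attributes"][name]) for name in fields))

academic = select_by_attribute(features, "land_use", "academic")
print([f["attributes"]["name"] for f in academic])  # ['Building A', 'Building C']
```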


Raster Data
Raster data represents real-world features in a pixelated, or gridded, format in which each pixel (grid cell) corresponds to a specific geographic location and stores one or more values. Pixel values can be continuous, such as elevation (where all values share the same unit of measurement, e.g. meters or yards), or categorical, such as land use (e.g. residential vs. agricultural). Common examples of raster data include land cover, scanned maps, elevation and terrain models, and satellite/aerial imagery.

A geospatial raster is like a digital photo, but with additional spatial information that connects the data to a particular location on the Earth's surface. This includes information about the extent and size of the raster, the number of rows and columns, and the coordinate reference system used to define the location of each pixel.
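The sketch below shows, under simplifying assumptions, how the extent, cell size, and row/column counts tie each pixel to a location. It assumes a north-up grid with square cells; the `RasterGrid` class, the origin coordinates, the CRS code, and the elevation values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RasterGrid:
    """Hypothetical north-up raster with square cells."""
    x_min: float      # x coordinate of the left (west) edge
    y_max: float      # y coordinate of the top (north) edge
    cell_size: float  # width and height of one cell, in CRS units
    n_rows: int
    n_cols: int
    crs: str          # coordinate reference system label

    def extent(self):
        """Return (x_min, y_min, x_max, y_max) of the whole grid."""
        x_max = self.x_min + self.n_cols * self.cell_size
        y_min = self.y_max - self.n_rows * self.cell_size
        return (self.x_min, y_min, x_max, self.y_max)

    def cell_center(self, row, col):
        """Map a (row, col) index to the coordinates of the cell center."""
        x = self.x_min + (col + 0.5) * self.cell_size
        y = self.y_max - (row + 0.5) * self.cell_size
        return (x, y)

    def cell_index(self, x, y):
        """Map a coordinate to the (row, col) of the cell that contains it."""
        col = int((x - self.x_min) // self.cell_size)
        row = int((self.y_max - y) // self.cell_size)
        if not (0 <= row < self.n_rows and 0 <= col < self.n_cols):
            raise ValueError("coordinate falls outside the raster extent")
        return (row, col)


# Made-up 3 x 4 elevation raster (meters) with 30 m cells.
grid = RasterGrid(x_min=500000.0, y_max=4120000.0, cell_size=30.0,
                  n_rows=3, n_cols=4, crs="EPSG:32617")
elevation = [
    [610, 612, 615, 619],
    [608, 611, 614, 618],
    [605, 609, 613, 616],
]

row, col = grid.cell_index(500075.0, 4119950.0)
print(grid.extent())           # (500000.0, 4119910.0, 500120.0, 4120000.0)
print(grid.cell_center(0, 0))  # (500015.0, 4119985.0)
print(elevation[row][col])     # 614
```

Rows are counted from the top of the grid, which is why the y coordinate decreases as the row index increases.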

[Figure: raster pixels] [Figure: vector and raster layers]

The figure on the right shows how each pixel in an image occupies its own cell and holds its own value, which is how a raster dataset represents geospatial information. The figure on the left shows how vector and raster data come together in a map.
To learn more about vector and raster data you can watch this video: Raster and vector data.


Comparison between Vector and Raster Data
Advantages and disadvantages of working with vector and raster data
Each geospatial data type has its uses. Depending on a project's needs, one may use one type exclusively or a combination of both. Here are some advantages and disadvantages of vector and raster data:

Vector Data | Raster Data
Easy to perform qualitative analysis. | Great for quantitative analysis and mathematical modeling.
More effective in representing discrete data than continuous data. Discrete data is geographic data that only occurs in specific locations. Examples include point and line GIS data such as tree locations, rivers, and streets. | Can accommodate both continuous and discrete data. Continuous data is geographic data that has no clearly defined boundaries. Examples include elevation, slope, temperature, and precipitation datasets.
Easy to update and maintain, because individual features can be edited. For example, a road can be added to or removed from an existing dataset. | Updating a raster dataset typically requires reproducing it completely.
Allows multiple attributes to be stored for each represented feature. | Does not capture all the characteristics of the represented features.
Generally smaller in size, allowing for quicker processing. | Can become very large, affecting processing speed and data storage requirements.
Resolution is not determined by cells, which gives a more visually appealing output map. | Resolution is determined by cells, so outputs can have a pixelated look and feel.
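To give a sense of how quickly raster datasets grow as cells get smaller, the following back-of-the-envelope sketch counts cells and estimates uncompressed storage for a square study area. The 10 km by 10 km area, the 4 bytes per cell, and the single band are assumptions chosen for illustration; real files are usually compressed and carry extra metadata.

```python
import math


def raster_cell_count(width_m, height_m, cell_size_m):
    """Number of cells needed to cover the area at the given cell size."""
    n_cols = math.ceil(width_m / cell_size_m)
    n_rows = math.ceil(height_m / cell_size_m)
    return n_rows * n_cols


def uncompressed_bytes(width_m, height_m, cell_size_m, bytes_per_cell=4, bands=1):
    """Rough storage estimate: cells x bytes per cell x number of bands."""
    return raster_cell_count(width_m, height_m, cell_size_m) * bytes_per_cell * bands


# Assumed study area: 10 km x 10 km, one band, 4 bytes per cell.
for cell_size in (30, 10, 1):
    cells = raster_cell_count(10_000, 10_000, cell_size)
    size_mb = uncompressed_bytes(10_000, 10_000, cell_size) / 1_000_000
    print(f"{cell_size:>2} m cells: {cells:>11,} cells, about {size_mb:,.1f} MB uncompressed")
```

Cutting the cell size in half quadruples the number of cells, which is why fine-resolution imagery over large areas becomes expensive to store and process.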


 

Geospatial data can be obtained from various sources, each providing unique information about the Earth's surface and its attributes. These sources can be broadly classified into primary geospatial data and secondary geospatial data.

  • Primary geospatial data is collected directly, from sources such as ground/fieldwork surveys (including Global Positioning System (GPS) surveys), aerial photography, and satellite remote sensing.
  • Secondary geospatial data is acquired by converting existing maps or other documents into a suitable digital form, or by processing primary geospatial data collected by others. This data is gathered from published sources like maps and official statistics, or from external sources that have already collected data.

The chart above shows the different sources of geospatial data.

Geospatial data formats are standardized ways of storing, organizing, and exchanging geographic information. They are crucial for use in GIS and geospatial mapping applications. Geospatial data formats are used to store different types of geographic information, and how they are represented on a map depends on the format and the content of the data. Vector data is typically displayed as points, lines, or polygons, while raster data is shown as images or grids of pixels. The representation of the data is often controlled by geospatial mapping software.
Common vector data formats:

Vector Data File Format | File Name Extension | Description
Shapefile | .shp, .shx, .sbx, .dbf, .prj | Widely used vector format, developed by Esri with an openly published specification, that can be opened in most geospatial mapping software. Shapefiles are often distributed in a zipped (.zip) folder because a single shapefile consists of several files, such as .shp (geometry), .shx (index), and .dbf (attribute data). The folder needs to be unzipped before use in ArcGIS Pro and other geospatial applications; in QGIS, the zipped folder can be dragged in and opened directly.
GeoJSON | .geojson | A lightweight, text-based format for encoding geographic data structures using JavaScript Object Notation (JSON). It is commonly used for web mapping and easy data interchange.
GeoPackage (GPKG) | .gpkg | Modern geospatial file format that combines both the spatial data (geometry) and attribute data (like names or values of objects) in a single file. GPKG is well-suited for mobile applications, web mapping, and desktop GIS software.
GPS eXchange Format | .gpx | Used for GPS data, recording waypoints, tracks, and routes.
KML/KMZ | .kml or .kmz | Keyhole Markup Language (KML) is an XML-based format for expressing geographic annotation and visualization. KMZ is a compressed version of KML, often used for Google Earth applications.
File Geodatabase | .gdb | An Esri proprietary format used in ArcGIS software for organizing and managing vector, raster, and attribute data in a structured way.
AutoCAD Files | .dwg, .dxf | Computer-Aided Design (CAD) formats that can store both 2D and 3D geospatial data, often used in architecture, engineering, and construction.
OpenStreetMap | .osm | OpenStreetMap data is used in GIS and mapping applications and can be freely accessed and edited by anyone. It is typically stored in XML-based formats.
Comma Separated Values | .csv | Information is organized into rows and columns, with the values in each row separated by commas. CSV files can be opened with various software applications, including spreadsheet programs like Microsoft Excel and Google Sheets. When a CSV file contains coordinates, it can also be opened in geospatial mapping tools.
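As an illustration of how two of these formats relate, the sketch below converts a small CSV table with coordinate columns into a GeoJSON FeatureCollection using only the Python standard library. The site names, coordinates, and column names are made up, and `csv_to_geojson` is a hypothetical helper. Note that GeoJSON stores each position as [longitude, latitude], the reverse of the order many spreadsheets use.

```python
import csv
import io
import json

# Made-up sample table; the column names are an assumption.
csv_text = """name,category,latitude,longitude
Site A,library,37.2290,-80.4190
Site B,trailhead,37.2350,-80.4250
"""


def csv_to_geojson(text, lon_field="longitude", lat_field="latitude"):
    """Turn CSV rows with coordinate columns into GeoJSON point features."""
    features = []
    for row in csv.DictReader(io.StringIO(text)):
        lon = float(row.pop(lon_field))
        lat = float(row.pop(lat_field))
        features.append({
            "type": "Feature",
            # GeoJSON positions are ordered [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            # The remaining columns become the feature's attributes.
            "properties": dict(row),
        })
    return {"type": "FeatureCollection", "features": features}


collection = csv_to_geojson(csv_text)
print(json.dumps(collection, indent=2))
```

The printed text could be saved with a .geojson extension and loaded in a mapping tool; the sketch assumes the coordinates are latitude and longitude in decimal degrees.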

Common raster data formats:

Raster Data File Format | File Name Extension | Description
GeoTIFF (Georeferenced Tagged Image File Format) | .tif, .tiff | A raster format that embeds geographic metadata into the image. It is used for georeferenced images, satellite imagery, and elevation data.
Digital Elevation Model | .ddf | A type of geospatial data that represents the elevation of the Earth's terrain. It provides the height of the terrain at different locations and can be used to create a 3D model of the landscape.
Multiresolution Seamless Image Database (MrSID) | .sid | A geospatial raster format used for efficiently storing and delivering large images, particularly aerial and satellite imagery.
Enhanced Compressed Wavelet | .ecw | A geospatial raster image format designed for efficient compression of large geospatial datasets, including satellite and aerial imagery commonly used in web mapping and geospatial analysis.
Tagged Image File Format | .tiff | Often used to store elevation data, aerial or satellite imagery, and other raster-based geographic information, allowing for accurate and high-resolution representation of the Earth's surface.
NetCDF | .nc | Commonly used for storing multidimensional gridded (raster) scientific data, including climate and environmental datasets.
Portable Network Graphics | .png | A raster image format that can be used to store and display maps, diagrams, and other visual elements. PNG files are lightweight and support transparency, making them suitable for overlaying map layers or creating custom map legends.
Joint Photographic Experts Group | .jpg, .jpeg | A widely used image format that is sometimes used for geospatial data because of its efficient compression of raster images. JPEG files are often used when file size needs to be minimized, making them suitable for quick loading in web-based mapping applications.
Portable Document Format | .pdf | Can be used for geospatial data in the form of maps, reports, and other documents. PDFs can embed geospatial information, including maps, annotations, and coordinate data.

Note:

  1. ArcGIS supports many, but not all, of these formats. The supported formats are listed here: https://www.esri.com/en-us/arcgis/products/arcgis-data-interoperability/supported-formats
  2. Some of these formats (for example .tiff, .nc, .png, .jpeg, .pdf) are not specific to geospatial information but are commonly used to store it.

Metadata Standards
Metadata is information about data that describes its characteristics, origin, content, and other relevant details. Metadata standards provide a structured format and guidelines for documenting and organizing metadata, ensuring consistency and interoperability across different datasets. Metadata standards facilitate data discovery, understanding, and proper use.

Examples of metadata standards:

  • Open Geospatial Consortium (OGC) Standards: OGC standards often include metadata requirements, such as the Web Map Service (WMS) metadata standards.
  • ISO 19115/19139: ISO 19115 defines a metadata model for describing geospatial data, and ISO 19139 provides an XML encoding of that model. The model covers various aspects such as data identification, data quality, spatial and temporal information, data lineage, and contact information. ISO 19115 provides a comprehensive framework for organizing metadata to support data management and discovery.
  • Federal Geographic Data Committee (FGDC) Content Standard for Digital Geospatial Metadata (CSDGM): This standard, developed by the U.S. FGDC, provides guidelines for creating metadata for geospatial data. It outlines required and optional metadata elements to be included in the documentation, ensuring consistency and interoperability of metadata across different datasets. The FGDC now encourages a transition from the CSDGM to the ISO 19115 standard, which was first published in 2003.
  • ISO 19110 - Feature Cataloguing Methodology: This standard focuses on describing the characteristics and behaviors of geographic features and is an essential part of geospatial metadata.
  • ISO 19119 - Services Metadata: This standard defines metadata elements for describing geospatial services, such as web map services and geoprocessing services.

Note: you can read more about the ISO 191** suite of geospatial metadata standards here.

Metadata standards enable users to understand the content, context, and limitations of geospatial data. They facilitate data sharing, data integration, and interoperability among different data sources, making it easier to locate and evaluate suitable datasets for specific purposes.

Common Geospatial Metadata Elements

Metadata Element | Description
Title | The name or title of the dataset, providing a brief but descriptive identifier.
Abstract | A concise summary or description of the dataset, highlighting its purpose and key characteristics.
Keywords | Keywords or phrases that describe the content of the dataset, aiding in search and discovery.
Temporal Extent | The time period covered by the dataset, specifying the start and end dates or other temporal references.
Spatial Extent | The geographic area covered by the dataset, often defined by bounding coordinates (i.e. the limits of coverage of a dataset, expressed as latitude and longitude values in the order westernmost, easternmost, northernmost, and southernmost) or by a polygon.
Data Format | The file format or data structure used for the dataset, such as Shapefile, GeoJSON, or GeoTIFF.
Data Quality | Information about the quality, accuracy, and reliability of the data, including details on accuracy measures and error sources.
Lineage | A record of the data's history and origin, including information about data sources, processing steps, and transformations.
Use Constraints | Any restrictions or limitations on how the data can be used, including licensing and copyright information.
Access and Distribution | Details on how users can access and obtain the dataset, including URLs, download links, and access methods.
Metadata Contact | Information about the person or organization responsible for the metadata, including their contact details.
Data Contact | Information about the person or organization responsible for the dataset itself, including their contact details.
Spatial Reference System | The coordinate system or projection used to represent the data's spatial reference.
Scale Denominator | A measure of the map scale or level of detail of the dataset.
Supplemental Information | Additional information that provides context or background about the dataset.
Citation | A reference or citation that can be used to properly attribute the dataset when it is used.
Maintenance Information | Information about the dataset's update frequency, version history, and data maintenance practices.

These metadata elements are crucial for ensuring that geospatial data repositories are well-documented and that potential users can effectively discover, evaluate, and use the data. Metadata standards and profiles may define additional elements and specific requirements for geospatial datasets.
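As a small illustration of how a few of these elements might be recorded, the sketch below builds a metadata record as a Python dictionary, derives the spatial extent from a list of coordinates, and checks the record against a checklist. The dataset, its values, and the checklist are invented for this example and do not follow the schema of any particular metadata standard.

```python
# Made-up (longitude, latitude) vertices of a hypothetical dataset.
coordinates = [(-80.43, 37.22), (-80.41, 37.23), (-80.40, 37.21), (-80.42, 37.24)]


def bounding_coordinates(coords):
    """Spatial extent as westernmost, easternmost, northernmost, southernmost values."""
    lons = [lon for lon, _ in coords]
    lats = [lat for _, lat in coords]
    return {"west": min(lons), "east": max(lons),
            "north": max(lats), "south": min(lats)}


# Illustrative record; the keys mirror some of the elements described above.
record = {
    "title": "Sample tree inventory",
    "abstract": "Point locations of surveyed trees (illustrative data only).",
    "keywords": ["trees", "inventory", "sample"],
    "temporal_extent": {"start": "2023-01-01", "end": "2023-12-31"},
    "spatial_extent": bounding_coordinates(coordinates),
    "data_format": "GeoJSON",
    "spatial_reference_system": "EPSG:4326",
}

# Assumption: a project-specific checklist, not taken from any standard.
REQUIRED = ["title", "abstract", "keywords", "spatial_extent", "use_constraints"]


def missing_elements(record, required):
    """List the required elements that are absent or empty in the record."""
    return [name for name in required if not record.get(name)]


print(record["spatial_extent"])
print(missing_elements(record, REQUIRED))  # ['use_constraints']
```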

Geospatial Data Quality
Geospatial data quality refers to the accuracy, completeness, consistency, and reliability of spatial data. It encompasses several aspects, including positional accuracy, attribute accuracy, logical consistency, and completeness. Data quality standards help assess and document the fitness for use of geospatial data. They ensure that data users can make informed decisions based on reliable and accurate information.

Examples of geospatial data quality standards:

  • National Standard for Spatial Data Accuracy (NSSDA): This standard, classified as a Data Usability Standard and developed by the U.S. Federal Geographic Data Committee, provides guidelines for assessing the positional accuracy of geospatial data. It specifies a statistical testing methodology for estimating positional accuracy against an independent source of higher accuracy and for reporting the result at the 95% confidence level. It does not set accuracy thresholds itself; these are left to the agencies and projects that use the data.
  • International Organization for Standardization (ISO) 19157: This ISO standard specifies concepts, principles, and methods for assessing and reporting geospatial data quality. It defines a framework for evaluating data quality based on dimensions such as completeness, positional accuracy, attribute accuracy, logical consistency, and temporal accuracy.
  • Open Geospatial Consortium (OGC) Standards: OGC publishes various standards related to geospatial data, including data quality standards like the OGC® SensorML, which addresses quality of observation data.
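As a worked example of positional accuracy assessment, the sketch below follows the NSSDA approach of comparing dataset coordinates with higher-accuracy reference coordinates at check points, computing the root-mean-square error (RMSE), and reporting horizontal accuracy at the 95% confidence level as 1.7308 × RMSE_r. That factor assumes the errors in x and y are similar in size. The check-point coordinates are made up, and a real test would use many more points (the NSSDA recommends at least 20).

```python
import math

# Made-up check points in a projected CRS (meters): dataset vs. reference positions.
dataset_points = [(101.0, 201.0), (149.0, 251.0), (201.0, 299.0), (249.0, 349.0)]
reference_points = [(100.0, 200.0), (150.0, 250.0), (200.0, 300.0), (250.0, 350.0)]


def rmse(errors):
    """Root-mean-square of a list of signed errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def horizontal_accuracy_95(dataset, reference):
    """NSSDA horizontal accuracy at 95% confidence, assuming RMSE_x is close to RMSE_y."""
    dx = [d[0] - r[0] for d, r in zip(dataset, reference)]
    dy = [d[1] - r[1] for d, r in zip(dataset, reference)]
    rmse_x, rmse_y = rmse(dx), rmse(dy)
    rmse_r = math.hypot(rmse_x, rmse_y)  # combined radial RMSE
    return 1.7308 * rmse_r


accuracy = horizontal_accuracy_95(dataset_points, reference_points)
print(f"Tested {accuracy:.2f} meters horizontal accuracy at 95% confidence level")
```

With these sample values the errors are 1 m in each direction at every check point, so RMSE_x = RMSE_y = 1 m and the reported accuracy is about 2.45 m.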

Geospatial Data Curator

Imma Mwanja
Contact:
Newman Library Room 3010
560 Drillfield Dr
Blacksburg, VA 24061
(540)-231-8665