Web Archiving

Resources

Other Resources

  • Common Crawl: an organization with the goal of "democratizing access to web information by producing and maintaining an open repository of web crawl data that is universally accessible and analyzable."
  • Library of Congress Guide to Creating Preservable Websites: guidance on building websites so that they can be archived more effectively and comprehensively.
  • Perma.cc: a service that provides persistent short links to archived copies of web pages used in citations. Virginia Tech is a registrar. See the Perma.cc LibGuide for information on using this service, which relies on archived web pages.
  • WebCite: an archiving system for web references (web citations), intended to keep cited references in scholarly works accessible.