Data resources for social science: Research datasets for secondary analysis
What's on this page
This part of the guide identifies key collections of data that can be exported and analyzed in analytical software. In general, you can select microdata based on narrative descriptions and data documentation. See the aggregate data tab for data in tables, which often can be exported and combined for analysis.
Resources on this page are grouped under these headings: Overview of issues in locating data; Major data repositories; VT Library data services; International demographic/economic data collections; US-centric data repositories; Governance data collections; International electoral data portals; US social & opinion data collections; International social survey archives; Miscellaneous data collections; and guidance on citing data you use.
Data can be hard to find and to work with. Statista is a large, reasonably easy-to-use tool for finding statistics from around the world. Sage Data [formerly called Data Planet] is even more comprehensive, but it helps to be familiar with the ways statistical agencies organize and describe their data if you want to use it effectively.
The VT Libraries have experts to help you: a team of informatics consultants is available to help with methodology, interpretation, visualization, and management/curation of your research data. Some tabs in this guide are maintained by members of the library's data service group.
Demographic and economic datasets: international
IPUMS - Integrated Public Use Microdata Series, International
IPUMS provides census and survey data from around the world integrated across time and space. IPUMS integration and documentation make it easy to study change, conduct comparative research, merge information across data types, and analyze individuals within family and community context. Data and services are available free of charge.
Eurostat
Eurostat, the statistical office of the European Union, aims to compile and make available statistics at European level that enable comparisons between countries and regions. Eurostat coordinates the statistical activities of the institutions and bodies of the Union, in particular with a view to ensuring consistency and quality of the data and minimising reporting burden.
Indices of Social Development
Create downloadable, country-level reports drawing on 200 indicators to track how different societies perform along six indices of social development. The indices allow estimating the effects of social development for a large range of countries on indicators like economic growth, human development, and governance.
NationMaster
Visualizations to compare countries in more than two dozen categories, drawing on a vast compilation of data from hundreds of sources.
Comparative Agendas Project
Data about policymakers rather than citizens. CAP monitors policy processes by tracking the actions that governments take in response to the challenges they face, classifying policy activities into a single, universal and consistent coding scheme. These activities can take many different forms, including debating a problem, delivering speeches (eg, the Queen’s speech in the United Kingdom), holding hearings, introducing or enacting laws (eg, Bills and Public Laws in the United States), or issuing judicial rulings (eg, rulings from the European Court of Justice).
International public opinion: Election studies
Research guide from Princeton University identifies many open-access repositories of election data at the global, regional, and national levels.
Election Passport
Free resource. Election Passport provides free access to a rich dataset of constituency election results in over 100 countries and territories throughout the world. The data are unusually complete, including details that are often difficult to locate, such as votes won by very small parties and independents and, frequently, candidate names. Additional elections are regularly added.
Global Elections Database
Free resource. Global Elections Database (formerly known as the Constituency-Level Elections Dataset, 2007) provides information on the results of both national and subnational elections around the world. These data are presented at two levels of analysis, allowing users to quickly identify the results of elections within a country as a whole or within particular constituencies or districts of a country. All parties are included in the database regardless of the number of votes that they won. The data are based on countries' official election results and have been amassed from various government institutions. The data are accessible in multiple formats: spreadsheets, tables, and GIS maps.
- Access to these data requires you to create a free, personal account, which then allows you to save customized datasets for future reference and to receive automatic updates to the data when they become available.
Constituency-Level Elections Archive (CLEA)
Free resource. Constituency-Level Elections Archive (CLEA) is a repository of detailed election results at the constituency level for lower house legislative elections from around the world. Its purpose is to preserve and consolidate these valuable data in one comprehensive and reliable resource that is ready for analysis and publicly available at no cost for research, education, and policy-making.
MIT Election Data and Science Lab (MEDSL)
Clearinghouse for datasets that can fuel studies on US elections at all levels.
- Tools + Resources section lists open data sources recommended by MEDSL
- Research section offers "explainers" and academic papers
Cite the data you use
How to Cite Data (Michigan State)
Comprehensive libguide from MSU Libraries provides general rules for data citation, with examples for citing datasets and tabular data in principal style manuals used in social science.
Annotation for Transparent Inquiry (ATI) at a Glance
ATI is a new approach to connecting readers of qualitative and mixed-methods research to the underlying data, such as those curated by the Qualitative Data Repository at Syracuse University. ATI facilitates transparency by allowing scholars to “annotate” specific passages in an article. Annotations amplify the text and, when possible, include a link to one or more data sources underlying a claim; data sources are housed in a repository. (VT's institutional membership in the QDR is provided by the University Libraries.)
Temporary, trial access only -- use while you can
The University Libraries at Virginia Tech regularly secure short-term, trial access to online resources in order to gauge their appropriateness to our university's teaching and research missions. These trials run in October, February, and sometimes April.
Trials are listed in a sidebar in the main Databases A-Z directory.
Each entry includes a link to a user survey. Other subject librarians and I invite you to email us your detailed assessments of resources. Responses from the Virginia Tech community are vital to the library's deliberations about whether and when to acquire or enhance databases and the like.
As appropriate, I will list trials and user survey links in a resource trials tab in this and my other libguides. Entries for trials that I include in the body of my libguides will go away when the trial period ends.
Data portals and repositories around the world
If you want to start by searching for variables:
Google Dataset Search
Discovery tool for numerical and geospatial data, but (like VT Discovery Search) its reliability depends on how dataset providers comply with technical standards for describing data (ie, metadata).
Harvard's Dataverse
Searchable archive of datasets and data-related articles. Part of the international "Dataverse Project," which is both a network of data repositories and a project to develop open source research data repository software.
ICPSR Bibliography of Data-related Literature
ICPSR Bibliography of Data-related Literature is a freely available, searchable database of citations to published and unpublished scholarly works. The database currently contains over 93,000 citations, with hundreds more added every month. Each citation has two-way links: out to the publication and into ICPSR’s study catalog, providing access to the data being analyzed in the publications. Because of these linkages, the Bibliography facilitates data discovery and literature searches by social scientists, students, librarians, journalists, policymakers, and funding agencies.
If you prefer to start by browsing by topic or place:
Our World in Data
"The mission of Our World in Data is to make data and research on the world’s largest problems understandable and accessible." Leverage interpretative essays and data visualizations of very long-run trends in policy problems to inform your own (re)search. Broadly organized around the UN Sustainable Development Goals, themes include health, food provision, the growth and distribution of incomes, violence, rights, wars, culture, energy use, education, and environmental changes.
Produced by the Oxford Martin Programme on Global Development at the University of Oxford.
International Social Survey Programme
The ISSP is a cross-national survey program conducting annual surveys in a broad group of countries. The survey asks questions on a variety of topics. You can download full datasets or analyze online through the GESIS Archive.
European Data Portal
Gateway to public sector information available on public data portals across European countries. Organized by topic. Also provides information regarding the provision of data and the benefits of re-using data.
re3Data.org Registry of Research Data Repositories
re3data.org is a global registry of research data repositories that covers research data repositories from different academic disciplines. It presents repositories for the permanent storage and access of data sets to researchers, funding bodies, publishers and scholarly institutions. The registry is funded by the German Research Foundation (DFG). Offers an interesting visual subject browse.
Pew Research Center
Provides data and analysis on the issues, attitudes and trends shaping the United States and the world. Datasets in several areas, including U.S. Politics & Policy, Journalism & Media, Internet, Science & Tech, Religion & Public Life, Hispanic Trends, Global Attitudes & Trends, Social & Demographic Trends. Create a free account to download Pew datasets.
SEDAC: NASA Socioeconomic Data & Applications Center
"Focusing on human interactions in the environment, SEDAC has as its mission to develop and operate applications that support the integration of socioeconomic and earth science data and to serve as an 'Information Gateway' between earth sciences and social sciences." Search, browse, download data, maps, other tools and resources. Hosted by CIESIN at Columbia University.
OpenDOAR
Global directory of academic open-access repositories (some will include datasets, but will also have many other information formats, such as articles, white papers, and more). Browse to repositories by global region and country. Search is limited to repository names, not to repository contents.
SHARE database
Comprehensive, multi-university inventory that aims to make research widely discoverable, accessible, and reusable. This link filters SHARE to find records only of research datasets.
ARDA Data Archive - Association of Religion Data Archives
The ARDA Data Archive is a collection of surveys, polls, and other data submitted by researchers and made available online by the Association of Religion Data Archives. You can browse files by category, alphabetically, view the newest additions, or search for a file. Once you select a file you can preview the results, read about how the data were collected, review the survey questions asked, save selected survey questions to your own file, and/or download the data file.
US (and mostly US) data collections
- ICPSR: Inter-University Consortium for Political and Social Research
  ICPSR is a large, searchable repository for social and behavioral science research datasets, covering political science, sociology, economics, demography, and interdisciplinary areas. ICPSR also curates and distributes data from public sources (like federal statistical agencies) with many value-added features, maintaining several topical archives in the areas of demography, criminal justice, mental health, aging, child care, and education.
- Virginia Tech's institutional membership entitles members of the VT community both to download datasets and to deposit their research data for permanent curation and access; create a free ICPSR "My data" account and log in with it in order to download data.
- Some datasets have access/use restrictions that may require approval by VT's institutional review board (among other offices) and by ICPSR prior to access; in some cases researchers are required to work only in secure "data enclaves." For highly sensitive data, such approvals can add months to the beginning of the research timeline. Restricted data at ICPSR are conspicuously marked. (These restrictions are to protect research respondents' identities in areas like drug use, sexuality, and criminal behavior.)
VT faculty and students qualify for discounts on ICPSR's summer training program of workshops and courses on social science research methods.
- Data-PASS Dataverse
  Searchable portal brings together contents of several major repositories of social data. Dataverse is a web application for sharing, preserving, citing, exploring, and analyzing research data. It facilitates making data available to others, and allows you to replicate others' work. Each Dataverse repository hosts multiple dataverses. Each dataverse contains datasets or other dataverses, and each dataset contains descriptive metadata and data files (including documentation and code that accompanies the data). Part of Harvard Dataverse.
- Qualitative Data Repository
  Based at Syracuse University, QDR selects, ingests, curates, archives, manages, durably preserves, and provides access to digital data used in qualitative and multi-method social inquiry. The repository develops and publicizes common standards and methodologically informed practices for these activities, as well as for the reuse and citation of qualitative data.
VT Libraries provides Tech's institutional membership in QDR. (In fact, Virginia Tech is the QDR's very first institutional member.)
- VTechData
  Virginia Tech’s institutional data repository is a platform for depositing and providing public access to datasets and related research products created by Virginia Tech faculty, staff, and students. Other research universities may offer similar repositories.
- GeoData
  GeoData is a discovery tool for geospatial data, primarily for Virginia, comprising not only datasets purchased as a part of the library collection, but also data created, collected, or digitized from printed maps at Virginia Tech. GeoData is an implementation of the inter-institutional GeoBlacklight collaboration, curated by the VT Libraries' Geospatial Data Consultant.
- U.S. Government Information: Stats/Data
  A handy point of departure, this libguide from UC San Diego identifies key data providers and major statistical publications from US federal agencies. (Also includes some databases restricted to UCSD.)
- Data.gov
  The home of the US government’s open data. Here you will find data compiled by federal agencies, tools, and resources to conduct research, develop web and mobile applications, design data visualizations, and more. Browse by topic from the landing page or access the searchable data catalog from the Data menu at the top of the page. Files may be in TXT, HTML, XLS, CSV, or other formats.
- See the Federal Committee on Statistical Methodology (FCSM) site for technical standards and guidelines behind the federal data.
- United States Census Bureau Data Repository
  The US Census Bureau Data Repository preserves and disseminates survey instruments, specifications, data dictionaries, codebooks, and other materials provided by the US Census Bureau. ICPSR, the host of this data repository, has also listed additional Census-related data collections from its larger holdings.
- ResearchDataGov.ORG: Application Portal for Restricted Data for Federal Statistics
  ResearchDataGov is a web portal and application system for discovering and requesting restricted-access microdata from various US federal statistical agencies. These data must be accessed and used only within a Federal Statistical Research Data Center -- Virginia Tech has an arrangement for VT faculty researchers to apply to use the FSRDC at Georgetown University.
Access to these data will take several months, not moments: it requires application, then approval by the federal agency(ies) that generated the datasets -- as well as the Georgetown FSRDC administrator; Tech's Institute for Society, Culture, and Environment; and other campus offices. See VT application procedures at ISCE.
- DataRefuge
  Project to safeguard US federal agencies' data and their associated user interfaces to assure that reliable copies remain available to researchers. Initial concentration has been in environmental and climate data.
- DataLumos
  Some government digital data were distributed on disk or tape and not posted online, and some data that were available have been moved or taken down over the years. DataLumos is ICPSR's archive for valuable US government agencies' social data resources.
- Data/stats sources in other VT research guides
  You can search for data sources and statistics resources in other VT Libraries' research guides. Here is a basic starter list. Sort and filter it in various ways and use its search box as a point of departure for more.
- Correlates of State Policy
  The Correlates of State Policy Project aims to compile, disseminate, and encourage the use of data relevant to US state policy research, tracking policy differences across and change over time in the 50 states. Comprises more than 900 variables from various sources, assembled into one large dataset. These cross-state and cross-time datasets are free and publicly available for academics, policy analysts, students, policymakers, and the research community. From the Institute for Public Policy and Social Research at Michigan State University.
Virginia and nearby state official data portals
- Virginia Open Data Portal
- Virginia Geographic Information Network
- Maryland's Open Data Portal
- Maryland's Mapping and GIS Data Portal
- Open Data DC
- DC Map Data
- LINC: Log Into North Carolina
- NC OneMap
- Tennessee Open Data Portal
- Transparent Tennessee OpenMaps
- TNMap Open Data Portal
- [Kentucky] KyGovMaps Open Data Portal
- Map West Virginia
- WV State GIS Data Clearinghouse
US social/opinion surveys
Roper Center Public Opinion Archives (with iPOLL)
Provided by the Roper Center for Public Opinion Research at Cornell University, Roper iPoll is the largest collection of public opinion poll data with results from 1935 to the present. Roper iPoll contains nearly 800,000 questions and over 23,000 datasets from both U.S. and international polling firms.
Surveys cover many topics, large and small, including social issues, politics, pop culture, international affairs, science, the environment, and much more. When available, results charts, demographic crosstabs and full datasets are provided for immediate download. Coverage is 1930s-present.
American National Election Studies
ANES has aimed since 1948 to provide data that support rich hypothesis testing about American voting behavior, maximize methodological excellence, measure many variables, and promote comparisons across people, contexts, and time. The site offers a variable search tool, informational guides, and ANES study reports.
US General Social Survey
GSS gathers data on contemporary American society in order to monitor and explain trends and constants in attitudes, behaviors, and attributes. The GSS contains a standard core of demographic, behavioral, and attitudinal questions, plus topics of special interest, such as civil liberties, crime and violence, intergroup tolerance, morality, national spending priorities, psychological well-being, social mobility, and stress and traumatic events. Hundreds of trends have been tracked since 1972. In addition, since the GSS adopted questions from earlier surveys, trends can be followed for up to 70 years. Datasets may be downloaded or analyzed online with GSS Data Explorer.
State and local polls at UNC Dataverse
Odom Institute's collection of regional and state polls and also the Louis Harris Data Center.
International social survey datasets
WorldPublicOpinion.org
WorldPublicOpinion.org presents articles that summarize polling data and analyses from numerous sources, with links to questionnaires and results. Full datasets can be downloaded from http://drum.lib.umd.edu/handle/1903/10117. WorldPublicOpinion.org is an international collaborative project managed by the Program for Public Consultation at the University of Maryland.
World Values Survey: Data & Documentation
Global study of changing values and their impact on social and political life consists of nationally representative surveys conducted in almost 100 countries which contain almost 90 percent of the world’s population, using a common questionnaire. The WVS is the largest non-commercial, cross-national, time series investigation of human beliefs and values ever executed, currently including interviews with almost 400,000 respondents. Open access.
Global Barometer (and regional partners)
This project represents the largest, most careful and systematic comparative survey of attitudes and values toward politics, power, reform, democracy and citizens' political actions in Africa, Asia, Latin America and the Arab region. It is based on a common module of questions contained in regional barometer surveys. Regional barometers:
Canadian Opinion Research Archive at Queen's University
Direct online access to over 25 years of public opinion survey data, collected by major survey research firms in Canada. In addition, CORA archives and provides access to the individual-level data files from most Canadian Election Studies since 1965. Search for survey questions and results frequencies from the data analysis page.
Latin American Databank (Roper Center)
LAD provides a portal for Latin American datasets acquired, processed and archived by the Roper Center for Public Opinion Research. This valuable collection includes data from public opinion surveys conducted by the survey research community in Latin America and the Caribbean, including universities, institutes, individual scholars, private polling and public opinion research firms.
Mansfield Asian Opinion Poll Database
Monitors key public opinion trends in Northeast Asia. Translations of opinion polls on key policy-related issues from major media organizations and other agencies in Japan, South Korea, and China.
European Social Survey
ESS is a cross-national survey that "measures the attitudes, beliefs and behavior patterns of diverse populations in more than thirty nations."
- ESS - National Pages links to the European Social Survey in the participating countries in the local language.
UK Data Service Variable & Question Bank
Includes major UK government-sponsored surveys, cross-national surveys, longitudinal studies, UK census data, international aggregate data, business data, and qualitative data.
College Librarian for Social Sciences & History
560 Drillfield Dr
Blacksburg, VA 24061
Email is the best way to contact me with questions or appointment requests.
Office hours (walk-in and/or Zoom): T 1-3:00 pm, W-Th 2-4:00 pm (Eastern time), and by appointment.
Data support in VT Libraries
Data Consulting Lab (DCL)
VT Libraries provide a limited number of computers loaded with specialized analytical applications that are not available through the internet (and may be expensive for individual purchase) -- along with consultants to help you gather, analyze, represent, and curate your data.
The DCL (Newman Library 3010, near the main elevators) provides Stata, RStudio, ArcGIS, and ERDAS, among others.
Open hours depend on staffing and can vary by semester; to schedule a consultation or reserve time on a workstation, email or stop by. Some remote access can be reserved at times when the PCs are not in use.
Data & informatics consultants
VT Libraries' in-house consultants for social and natural sciences, engineering, and arts/visualization
Statistical consulting by SAIG
Tech's Statistical Applications and Innovations Group offers walk-in consulting hours in the Newman Library Data Transformation Lab (room 3010) four afternoons a week to address your quick questions or to help with research projects requiring less than 30 minutes of assistance. Walk-in hours are available only when classes are in session.
Data management & curation (VT Libraries)
The Tech Libraries offer data management and curation support for researchers throughout the research lifecycle, from the planning stages through publishing and disseminating research.
Virtual Computer Labs (TLOS remote access)
Tech's Technology-enhanced Learning and Online Strategies (TLOS) office transitioned 250 computers in its campus labs to virtual-only access via the VT VPN as a Covid protection measure. This page tells you how.
This list of applications on those TLOS lab machines shows which ones are available remotely. TLOS licenses for statistical software often expire every August, and updates may be delayed.
Ask a Librarian
Data about governance
QoG: Quality of Government Institute
Data and analyses "on the causes, consequences and nature of Good Governance and the Quality of Government (QoG) -- that is, trustworthy, reliable, impartial, uncorrupted and competent government institutions. Our research addresses the questions of how to create and maintain high quality government institutions and how the quality of such institutions influences public policy and socio-economic conditions in a broader sense." Based at the University of Gothenburg, Sweden.
Authority Indices (Gary Marks et al)
Provides analyses, index scores, and data documentation on two levels of governance:
- Regional Authority Index. RAI tracks regional authority on an annual basis from 1950 to 2010 in 81 countries. Datasets include annual scores for 231 regional governments/tiers and 81 countries for 1950-2010.
- International Authority Index. MIA measures delegation and pooling of international authority for 76 international governmental organizations for 1950-2010. The MIA data are annual.
Miscellaneous data collections
National Center for Charitable Statistics: Data & Tools
NCCS is a national clearinghouse of data on the nonprofit sector in the United States. This open-access version of NCCS Webster contains a variety of tools and reports to help you learn more about the nonprofit sector: find a nonprofit organization in your area, view IRS Form 990 images, analyze financial data on the sector, look at trends in charitable giving, or download data.
GDELT Project
"GDELT Project is an open platform for research and analysis of global society" through mining news media from around the world, in 100 languages, since 1979. The "big data" project offers a free cloud-based analysis service, Google BigQuery, and -- for advanced users -- dataset downloads.