Data extraction, sometimes referred to as data collection or data abstraction, refers to the process of extracting and organizing the information from each included (relevant) study.
The synthesis approach(es) (e.g., meta-analysis, framework synthesis) that you intend to use will inform data extraction.
As with every other stage of a systematic review, two data extractors should extract data from each included reference. The exact procedure may vary according to your resource capacity. For example, when managing a large corpus, you may have a team of 10 extractors working in 5 pairs, with each pair extracting data from a portion of the included material.
Note: experience in the field does not necessarily increase the accuracy of this process. For more on this topic, see Horton et al. (2010), 'Systematic review data extraction: cross-sectional study showed that experience did not increase accuracy', and Jones et al. (2005), 'High prevalence but low impact of data extraction and reporting errors were found in Cochrane systematic reviews'.
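When two extractors work on the same reference, their records need to be compared so that disagreements can be resolved. The sketch below shows one possible way to flag discrepancies field by field. The function name, the field names, and the example values are hypothetical and are not taken from any particular review tool.

```python
def find_discrepancies(extractor_a, extractor_b):
    """Compare two extractors' records for one study, field by field.

    Returns a dict mapping each disagreeing field to (value_a, value_b).
    A field missing from one record is reported as None on that side.
    """
    discrepancies = {}
    for field in sorted(set(extractor_a) | set(extractor_b)):
        value_a = extractor_a.get(field)
        value_b = extractor_b.get(field)
        if value_a != value_b:
            discrepancies[field] = (value_a, value_b)
    return discrepancies


# Hypothetical records for a single included study (illustrative values only)
record_a = {"study_id": "S01", "design": "RCT", "n_total": 120, "outcome": "mortality"}
record_b = {"study_id": "S01", "design": "RCT", "n_total": 102, "outcome": "mortality"}

for field, (a, b) in find_discrepancies(record_a, record_b).items():
    print(f"{field}: extractor A = {a!r}, extractor B = {b!r}")
```

Flagged fields would then go back to the pair (or to a third reviewer) for resolution against the source article. The comparison here is an exact match, so free-text fields may need normalizing before they are compared.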
Defining ahead of time which measure(s) of effect will be relevant and useful is important, especially if you hope to pursue a meta-analysis. Although it is unlikely that all of your studies will report the same measure of effect (e.g., odds ratio, relative risk ratio), many of these measures can be transformed or converted to the one you need for your meta-analysis.
If converting effect sizes, be sure to provide enough detail about this process in your manuscript that another team could replicate it. It is best to collect the original outputs from articles before converting effect sizes. There are tools available for converting effect sizes, such as the Campbell Collaboration's tool for calculating or converting effect sizes and the effect size converter from MIT.
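As an illustration of what such a conversion involves, the sketch below applies two commonly used approximations: an odds ratio to a standardized mean difference, d = ln(OR) × √3 / π, and d to a correlation, r = d / √(d² + a) with a = (n1 + n2)² / (n1 × n2). Choosing these particular formulas is an assumption made for the example. Your review should use, and report, whichever conversions suit its data. The function names and input values are hypothetical.

```python
import math


def odds_ratio_to_d(odds_ratio):
    """Approximate a standardized mean difference (d) from an odds ratio."""
    if odds_ratio <= 0:
        raise ValueError("odds ratio must be positive")
    return math.log(odds_ratio) * math.sqrt(3) / math.pi


def d_to_r(d, n1, n2):
    """Approximate a correlation (r) from d, given the two group sizes."""
    a = (n1 + n2) ** 2 / (n1 * n2)  # equals 4 when the groups are the same size
    return d / math.sqrt(d ** 2 + a)


# Keep the originally reported value next to the converted one, so the
# conversion can be audited and replicated (illustrative numbers only).
reported = {"study_id": "S01", "measure": "OR", "value": 2.0, "n1": 50, "n2": 50}
d = odds_ratio_to_d(reported["value"])
converted = {
    **reported,
    "d": round(d, 4),
    "r": round(d_to_r(d, reported["n1"], reported["n2"]), 4),
}
print(converted)
```

Storing the reported measure and the converted values in the same record follows the advice above to collect original outputs first.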
Data extraction is often performed using a single form to extract data from all included (relevant) studies in a uniform manner. Because the data extraction stage is driven by the scope and goals of a systematic review, there is no gold standard or one-size-fits-all approach to developing a data extraction form.
However, there are templates and guidance available to help in the creation of your forms.
Because it is standard to include the data extraction form in the supplemental material of a systematic review and/or meta-analysis, you may also consult the forms developed or used in similar published or in-progress reviews.
As is the case with the critical appraisal, the type of data you are able to extract will also depend on the study design. Therefore, it is likely that the exact data you extract from each individual article will vary somewhat.
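One way to picture a single form that still accommodates different study designs is a record with a fixed set of core fields plus an open section for design-specific items. The sketch below is a minimal illustration under that assumption. Every field name is hypothetical and would be replaced by the variables defined in your own protocol.

```python
from dataclasses import dataclass, field, asdict
from typing import Optional


@dataclass
class ExtractionRecord:
    # Core fields collected uniformly from every included study
    study_id: str
    first_author: str
    publication_year: int
    study_design: str
    extractor_initials: str
    funding_source: Optional[str] = None
    # Items that only apply to some study designs
    design_specific: dict = field(default_factory=dict)
    # One entry per reported outcome
    outcomes: list = field(default_factory=list)


# Hypothetical entries for two different study designs
rct = ExtractionRecord(
    study_id="S01", first_author="Author A", publication_year=2012,
    study_design="RCT", extractor_initials="XY",
    design_specific={"randomization_method": "not reported"},
)
cohort = ExtractionRecord(
    study_id="S02", first_author="Author B", publication_year=2014,
    study_design="cohort", extractor_initials="XY",
    design_specific={"follow_up_months": 12},
)

for record in (rct, cohort):
    print(asdict(record))
```

The core fields keep extraction uniform across studies, while `design_specific` lets the exact items vary by design without requiring a separate form for each one.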
Cochrane | One form for randomized controlled trials (RCTs) only; one form for RCTs and non-RCTs
Joanna Briggs Institute (JBI) | Several forms located in each relevant chapter:
Data extracted from each reference is presented as a summary table or summary of findings table and described in the narrative.
A summary table, like the examples seen below, provides readers with a quick-glance summary of the study details that are important to the systematic review and/or meta-analysis. As with the other stages of a review, what you collect and report will depend on the scope of the review and the type of synthesis you plan to conduct.
Qualitative Data Only
Summary table from: Bin-Reza, F., Lopez Chavarrias, V., Nicoll, A., & Chamberland, M. E. (2012). The use of masks and respirators to prevent transmission of influenza: a systematic review of the scientific evidence. Influenza and other respiratory viruses, 6(4), 257–267. doi:10.1111/j.1750-2659.2011.00307.x
Quantitative Data (meta-analysis)
Summary table from: Simpson, S. S., Rorie, M., Alper, M., Schell‐Busey, N., Laufer, W. S., & Smith, N. C. (2014). Corporate Crime Deterrence: A Systematic Review. Campbell Systematic Reviews, 10(1), 1–105. https://doi.org/10.4073/csr.2014.4
It may be appropriate to include more than one summary table. For example, one table may present basic information about the study, such as author names, year of publication, year(s) the study was conducted, study design, and funding agency. Another table may present details more specific to the qualitative synthesis. A third table may present information specifically relevant to the meta-analysis, such as effect sizes and confidence intervals. Additionally, it is best practice to have one summary table for each outcome.
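If the extracted results are stored as one row per study and outcome, producing one summary table per outcome is a matter of grouping. The sketch below shows one possible approach using plain dictionaries. The column names and all values are hypothetical placeholders, not data from any real study.

```python
from collections import defaultdict


def group_by_outcome(rows):
    """Group extracted result rows into one list per outcome."""
    tables = defaultdict(list)
    for row in rows:
        tables[row["outcome"]].append(row)
    return dict(tables)


def render_table(rows, columns):
    """Render rows as a plain-text table with padded columns."""
    widths = [max([len(col)] + [len(str(r[col])) for r in rows]) for col in columns]
    header = "  ".join(col.ljust(w) for col, w in zip(columns, widths))
    lines = [header, "  ".join("-" * w for w in widths)]
    for r in rows:
        lines.append("  ".join(str(r[col]).ljust(w) for col, w in zip(columns, widths)))
    return "\n".join(lines)


# Hypothetical extracted results (placeholder values only)
rows = [
    {"study": "Study A", "outcome": "infection", "measure": "OR", "estimate": 0.80, "ci": "0.60 to 1.05"},
    {"study": "Study B", "outcome": "infection", "measure": "OR", "estimate": 0.95, "ci": "0.70 to 1.30"},
    {"study": "Study A", "outcome": "adherence", "measure": "RR", "estimate": 1.10, "ci": "0.90 to 1.35"},
]

tables = group_by_outcome(rows)
for outcome, outcome_rows in tables.items():
    print(f"Outcome: {outcome}")
    print(render_table(outcome_rows, ["study", "measure", "estimate", "ci"]))
    print()
```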
Chapter 5: Collecting data
Chapter 6: Choosing effect measures and computing estimates of effect
Conducting systematic reviews of intervention questions II: Relevance screening, data extraction, assessing risk of bias, presenting the results and interpreting the findings. Sargeant JM, O’Connor AM. Zoonoses Public Health. 2014 Jun;61 Suppl 1:39-51. doi: 10.1111/zph.12124. PMID: 24905995
Study designs and systematic reviews of interventions: building evidence across study designs. Sargeant JM, Kelton DF, O’Connor AM. Zoonoses Public Health. 2014 Jun;61 Suppl 1:10-7. doi: 10.1111/zph.12127. PMID: 24905992
Randomized controlled trials and challenge trials: Design and criterion for validity. Sargeant JM, Kelton DF, O’Connor AM. Zoonoses Public Health. 2014;61 Suppl 1:18-27. PMID: 24905993
C43. Using data collection forms (protocol & review / final manuscript)
C44. Describing studies (review / final manuscript)
C45. Extracting study characteristics and outcome data in duplicate (protocol & review / final manuscript)
C46. Making maximal use of data (protocol & review / final manuscript)
C47. Examining errata (review / final manuscript)
C49. Choosing intervention groups in multi-arm studies (protocol & review / final manuscript)
C50. Checking accuracy of numeric data in the review (review / final manuscript)
"...forms should be developed a priori and included in the published or otherwise available review protocol as an appendix or as online supplementary materials"
"...level of reviewer experience has not been shown to affect extraction error rates. As such, additional strategies planned to reduce errors, such as training of reviewers and piloting of extraction forms should be described."
"...in the absence of complete descriptions of treatments, outcomes, effect estimates, or other important information, reviewers may consider asking authors for this information. Whether reviewers plan to contact authors of included studies and how this will be done (such as a maximum of three email attempts) to obtain missing information should be documented in the protocol."
List and define all variables for which data will be sought (such as PICO items, funding sources) and any pre-planned data assumptions and simplifications
"...describe assumptions they intend to make if they encounter missing or unclear information and explain how they plan to deal with such data or lack thereof"
List and define all outcomes for which data will be sought, including prioritisation of main and additional outcomes, with rationale
Consider specifying which outcome domains were considered the most important for interpreting the review’s conclusions (such as “critical” versus “important” outcomes) and provide rationale for the labelling (such as “a recent core outcome set identified the outcomes labelled ‘critical’ as being the most important to patients”) (Item 10a)
If the review examines the effects of interventions, consider presenting an additional table that summarises the intervention details for each study