Research Impact Metrics

A guide for those wanting to use research impact metrics for evaluation, analytics, and reviews, e.g., promotion & tenure.

Citation Metrics for Individual Research Outputs

Can Apply To

Journal articles and preprints

Metric Definition

The number of times that a journal article or preprint has appeared in the reference list of other articles and books.

Metric Calculation

Many citation databases use a combination of text-mining and manual classification to build their lists of citations, based upon the reference lists of articles and books that they index. However, the scope of these databases varies, with Web of Science being the most selective (in terms of the quantity of journals and disciplines covered) and Google Scholar being the least selective (indexing a great deal of non-peer-reviewed content and various research output types). Outside of Google Scholar and Microsoft Academic, it is difficult to track citations to preprints.

Data Sources

Citations are mined from the references sections of articles published in a manually curated list of journals, or in the case of Google Scholar, from any scholarly web domain.

Appropriate Use Cases

There are many diverse reasons why scholars cite each other’s work, so it’s impossible to say that there is one way that citations should be interpreted. The closest one can get is to say that citations are a measure of influence amongst other scholars, and that influence can sometimes be negative (especially in the humanities). Citations to journal articles are generally better applied to the evaluation of STEM research, given the dearth of coverage of humanities, arts, and social sciences research in most citation databases.

Limitations

One needs to read the context of a citation to understand its true meaning. Many factors can impact citation counts including database coverage, differences in publishing patterns across disciplines, citation accrual times, self-citation rates, the age of the publication, observation period, or journal status.

Database coverage. Citation databases like Web of Science and Scopus have been recognized to have limited coverage of humanities, arts, and social sciences research as compared to the sciences, as well as limited coverage of local and specialized journals, especially those written in languages other than English.

Discipline-specific publishing norms. Authorship norms differ between disciplines (some fields regularly have dozens of authors for a paper, where others tend to have single-author papers), which means that citations cannot always measure the full extent of an author’s contributions towards a work. Citations also accrue at different rates across disciplines, depending on the publishing volume and other norms. For example, a paper in oncology may accrue 10 citations in the first year after publication, while a paper in philosophy may take several years to accrue as many citations.

Self-citation rates. Author self-citation is an essential part of scholarly communication and can impact citation counts. Identifying the number of self-citations provides supplementary information about the citations themselves.
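As a rough illustration of reporting self-citations alongside a raw count, the sketch below splits a paper's citing works into self-citations and external citations. It assumes a simple convention (a citing work is a self-citation if it shares at least one author name with the cited paper); citation databases may define this differently, and all names and data here are hypothetical.

```python
def split_self_citations(cited_authors, citing_papers):
    """Count self-citations and external citations for one cited paper.

    Assumption: a citing paper is a self-citation if it shares at least
    one (normalized) author name with the cited paper.
    """
    cited = {name.strip().lower() for name in cited_authors}
    self_count = 0
    for authors in citing_papers:
        citing = {name.strip().lower() for name in authors}
        if cited & citing:  # any shared author
            self_count += 1
    total = len(citing_papers)
    return {"total": total, "self": self_count, "external": total - self_count}


# Hypothetical example data: authors of the cited paper and of each citing paper.
cited_authors = ["A. Rivera", "B. Chen"]
citing_papers = [
    ["B. Chen", "C. Okafor"],       # shares an author: self-citation
    ["D. Novak"],
    ["E. Haddad", "F. Lindqvist"],
    ["A. Rivera"],                  # shares an author: self-citation
]

summary = split_self_citations(cited_authors, citing_papers)
print(summary)
```

Matching on name strings is fragile in practice (name variants, common names), so persistent author identifiers would be a safer key where they are available.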

Age of publication. Citations are impacted by the age of the paper. More recently published papers have had less time to accrue citations. Most papers “receive a growing number of citations to arrive at a peak somewhere between two and six years after publication before the citation count decreases, while some receive most of the citations within a year or two, others are cited constantly for a long period, and still others remain unmarked before a sudden wave of citations arrives seven or ten years afterwards.”

Observation period. If citations are counted only within a limited number of years after publication, a publication’s citation count may be lower than its all-time total.
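The effect of the observation period can be sketched with a small counting function. This is a minimal example under assumed conventions (the window runs from the publication year through the publication year plus a fixed number of years, inclusive); the function name and the years used are hypothetical.

```python
def citations_in_window(publication_year, citing_years, window_years=None):
    """Count citations received within an observation period.

    window_years=None counts every citation. Otherwise citations are counted
    from the publication year through publication_year + window_years,
    inclusive (an assumed convention; tools differ).
    """
    if window_years is None:
        return len(citing_years)
    last_year = publication_year + window_years
    return sum(1 for year in citing_years if publication_year <= year <= last_year)


# Hypothetical paper published in 2015, with the year of each citing work.
citing_years = [2016, 2016, 2017, 2018, 2018, 2019, 2021, 2023]

all_time = citations_in_window(2015, citing_years)
three_year = citations_in_window(2015, citing_years, window_years=3)
print(all_time, three_year)  # 8 citations overall, 5 within 2015-2018
```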

Inappropriate Use Cases

Citation counts should never be interpreted as a direct measure of research quality and should not be used as a measure of positive reputation for individual researchers. Citation counts should not be used to compare papers of different age (i.e. publication year), type (i.e. articles, reviews, etc.) or subject areas. A metric more suited for this type of comparison is the Field Normalized Citation Impact.

Available Metric Sources

Popular sources of citations to journal articles include Dimensions, Google Scholar, Scopus, Web of Science, and Microsoft Academic.

Transparency

Citations are only as transparent as the availability of the citing article or book allows them to be. One may not always be able to read citations in context, given the prevalence of subscription journals to which reviewers are not guaranteed access. Most databases that report citations report the full list of citing articles, at the very least, linking through to full-text articles where possible (even if only for subscribing institutions).

Website

n/a

Timeframe

In theory, it is possible to track citations to journal articles as far back as the advent of the scientific journal. While some coverage exists prior to 1900, coverage for Scopus and Web of Science is strongest for 1900 – present.

The explanation and interpretation of this metric comes directly from the Metric Toolkit, CC BY.  

Can Apply To

Scholarly monographs and trade books, and chapters of scholarly monographs

Metric Definition

The number of times that a book or chapter has appeared in the reference list of other articles and books.

Metric Calculation

Many citation databases use a combination of text-mining and manual classification to build their lists of citations. However, the scope of these databases varies, with some databases indexing citations to far fewer books than others.

Data Sources

Book Citation Index (available through Web of Science), Google Books, Google Scholar, Scopus

Appropriate Use Cases

As with journal articles, there are many reasons why scholars cite each other’s work, so it’s impossible to say that there is one way that citations should be interpreted. The closest one can get is to say that citations are a measure of influence amongst other scholars, and that influence can sometimes be negative (especially in the humanities). Citations to journal articles are generally better applied to the evaluation of STEM research, given the dearth of coverage of humanities, arts, and social sciences research in most citation databases.

Limitations

One needs to read the context of a citation to understand its true meaning. In general, it is more difficult to find comprehensive citations to a monograph or to its chapters than for a journal article, due to the limited scope of major book citation databases. The current means of calculating citations (tracking a monograph’s appearance in a reference list) do not account for how often a monograph is cited within another text. For example, some works are cited many times throughout a text and are thus central to another scholar’s research, while other works are cited only once. Most citation databases that include books share the limitations of other citation databases: they favor English-language research and newer monographs, missing local and regional research published in other languages, as well as older monographs.

Inappropriate Use Cases

Citations should never be interpreted as a direct measure of quality. Raw citation counts should not be used as a measure of positive reputation for individual researchers.

Available Metric Sources

Book Citation Index (available through Web of Science), Dimensions, Google Books, Google Scholar, Scopus

Transparency

Citations are only as transparent as the availability of the citing article or book allows them to be. One may not always be able to read citations in context, given the prevalence of subscription journals to which reviewers are not guaranteed access. Most databases that report citations report the full list of citing articles and books, at the very least, linking through to full-text articles and books where possible (even if only for subscribing institutions).

Website

n/a

Timeframe

In theory, it is possible to track citations to books published hundreds of years ago. But in practice, most databases only track citations for books published in the 20th century and beyond.

The explanation and interpretation of this metric comes directly from the Metric Toolkit, CC BY. 

Can Apply To

Research data sets and journal articles that describe them.

Metric Definition

The number of times a journal article or book has referenced a data set.

Metric Calculation

Data citations are sometimes collected only in the formal sense (i.e., with the data set being listed in the References section of a paper, alongside journal articles). They can also be calculated in the informal sense (i.e., linked to from within the Methods section of a paper). It varies from tool to tool.

Data Sources

Web of Science via Data Citation Index, Google Scholar (rare)

Appropriate Use Cases

Data citations should be used to understand how often research data has been reused in others’ studies, thereby indicating advancement of the field. Some fields (e.g., crystallography and genomics) practice data citation at higher rates than others, and therefore evaluation of research from those fields may be more suitable scenarios for using data citations.

Limitations

Data citation is still relatively rarely practiced, with only half of journals providing instruction for how to cite data and more than 88% of all Data Citation Index records going uncited. Lack of formal referencing poses a challenge for using data citations from tools that only count such formal references in their data citation metrics. Critics of data citation claim that data citations merely mimic existing metrics that do not “recognize all players involved in the life cycle of those data from collection to publication”. Disciplinary coverage in the Data Citation Index (as of 2017) is skewed, favoring the life sciences (48% of records) over the social sciences (20%), physical sciences (23%), arts & humanities (7%), and multidisciplinary research (2%). Note that the Data Citation Index tracks citations for datasets and also related data studies (defined as “a description of studies or experiments held in repositories with the associated data which have been used in the data study”) as they are cited in articles indexed by the Web of Science databases. The availability of data should be taken into account when attempting to make comparisons for data citation rates against other data sets, as in some disciplines, open access data is cited at higher rates (up to 69% higher for cancer research).

Inappropriate Use Cases

Citation counts should never be interpreted as a direct measure of quality. Raw citation counts should not be used as a measure of positive reputation for individual researchers.

Available Metric Sources

Data Citation Index, Google Scholar (rare)

Transparency

Varies by provider. The Data Citation Index is fully transparent regarding the data repositories it indexes. The Data Citation Index white paper, “Recommended practices to promote scholarly data citation and tracking” (n.d.), describes how the Web of Science can find properly formed citations to datasets in order to calculate citations for the DCI. Google Scholar can index any content that conforms to their formatting guidelines, but is designed to primarily index journal articles, monographs, and other “print” outputs.

Website

n/a

Timeframe

In theory, data sets from any year can be referenced in scholarly literature. Google Scholar’s temporal scope is unknown. Data Citation Index includes citations to data from 1900 onwards.

The explanation and interpretation of this metric comes directly from the Metric Toolkit, CC BY. 

Can Apply To

Software packages and papers describing software packages

Metric Definition

The number of times a piece of software or code (or a paper that describes software or code) has been cited as a resource in a journal article or book.

Metric Calculation

Like data, software can be cited formally (in the references section of a paper) or informally (linked to from the methods section of a paper). Google Scholar searches the scholarly web for all mentions of a software package by name; Depsy searches PubMed Central Europe and ADS for mentions of a software package by name.

Data Sources

Google Scholar

Appropriate Use Cases

Citations to software can be interpreted to understand the influence of a software package, and in many cases the reuse of that software package in other researchers’ analyses.

Limitations

Software packages are much less likely to be cited directly than articles about software packages are. Only around a third of citations to software are formal, so attempts to count software citations using existing tools may miss important mentions of software in research articles.

Inappropriate Use Cases

Citations to software should not be interpreted to measure quality.

Available Metric Sources

Google Scholar

Transparency

While all sources link to the papers that cite software, the availability of these papers to the end user varies due to journal article paywalls. Though Google Scholar might link to a paper that cites a piece of software, the end user may not be able to read that paper to see the context of that citation, if they do not have a subscription to the journal in which the citing paper was published.

Website

n/a

Timeframe

In theory, software of any age can be cited in the research literature.

The explanation and interpretation of this metric comes directly from the Metric Toolkit, CC BY. 

Can Apply To

Primarily journal articles, but also other kinds of research outputs, such as book chapters and conference proceedings that are sufficiently covered by abstract and citation databases.

Metric Definition

The Field Normalized Citation Impact (FNCI) is the ratio between the actual citations received by a publication and the average number of citations received by all other similar publications. The latter is referred to as the expected number of citations. Similar publications are ones in the same subject category, of the same type (i.e. article, review, book chapter, etc.), and of the same age (i.e. publication year).

Metric Calculation

An FNCI is calculated by dividing the number of citations a publication received by the average number of citations to publications in a database published in the same year, of the same type, and within the same subject category. When multiple publications are being considered, the ratio between the actual and average citations is calculated for each publication first. A typical indicator is the Mean Normalized Citation Score (MNCS), which is the mean of the FNCI values of all publications included in the analysis. Publications can also be assigned to more than one subject category. In these cases, the publication and its citation counts are usually distributed proportionally across the relevant subject categories.
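The calculation described above can be sketched in a few lines. This is an illustrative example, not any database's actual implementation: the records are hypothetical, each publication has a single subject category, and the expected value is taken as the plain average over every publication in the reference set that shares the same subject, type, and year.

```python
from collections import defaultdict
from statistics import mean


def group_key(pub):
    """Publications are comparable if subject, type, and year all match."""
    return (pub["subject"], pub["type"], pub["year"])


def expected_citations(reference_set):
    """Average citation count for each (subject, type, year) group."""
    groups = defaultdict(list)
    for pub in reference_set:
        groups[group_key(pub)].append(pub["citations"])
    return {key: mean(counts) for key, counts in groups.items()}


def fnci(pub, expected):
    """Actual citations divided by the expected citations for the group."""
    baseline = expected[group_key(pub)]
    if baseline == 0:
        return None  # ratio is undefined if the whole group is uncited
    return pub["citations"] / baseline


# Hypothetical reference set standing in for a citation database.
reference_set = [
    {"id": "p1", "subject": "oncology", "type": "article", "year": 2020, "citations": 30},
    {"id": "p2", "subject": "oncology", "type": "article", "year": 2020, "citations": 10},
    {"id": "p3", "subject": "oncology", "type": "article", "year": 2020, "citations": 20},
    {"id": "p4", "subject": "philosophy", "type": "article", "year": 2020, "citations": 3},
    {"id": "p5", "subject": "philosophy", "type": "article", "year": 2020, "citations": 1},
]

expected = expected_citations(reference_set)

# Publications under evaluation: one oncology paper and one philosophy paper.
portfolio = [reference_set[0], reference_set[3]]
scores = {pub["id"]: fnci(pub, expected) for pub in portfolio}
mncs = mean(scores.values())  # Mean Normalized Citation Score

print(scores)  # both papers score 1.5 despite 30 vs. 3 raw citations
print(mncs)
```

In this toy data the oncology paper (30 citations against an expected 20) and the philosophy paper (3 citations against an expected 2) receive the same score, which is the kind of cross-field comparison raw counts cannot support.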

Data Sources

The FNCI is dependent on extensive citation and indexing information available in citation databases, such as Scopus and Web of Science. In addition to citation counts, these sources classify publications by year, type, and subject.

Appropriate Use Cases

The FNCI was conceived to facilitate the benchmarking of citation performance across groups of different size, disciplinary scope, and age, such as large research groups, institutions, or geographic regions. It is meant to correct for the effects that different disciplinary patterns of scholarly communication and publication age can have on non-normalized metrics, such as citation counts. The global mean of the FNCI is 1.0, so it is easy to compare a set of values to a benchmark. For example, an FNCI of 1.50 means 50% more cited than the world average, whereas an FNCI of 0.75 means 25% less cited than the world average.

Limitations

The FNCI is typically presented as a mean value (e.g., Mean Normalized Citation Score) for an aggregation of papers (e.g., individual scholars, a journal, a university, etc.), which can be strongly influenced by outliers. The distribution of citations across publications is often highly skewed. Most publications in a sample will receive relatively low citation attention, while a small set will accumulate high citation rates. Indicators based on highly cited publications are a common alternative to mean-based indicators. Additionally, the FNCI may be sensitive to the field classification system chosen for the analysis, particularly when the classification is not at the publication level but at the journal level (e.g., Web of Science Subject Categories or the Scopus All Science Journal Classification (ASJC)).
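The sensitivity of a mean-based indicator to outliers can be seen with a small, made-up set of FNCI values. The numbers below are hypothetical, and the threshold-based share is only a simplified stand-in for indicators based on highly cited publications, which are usually defined against field-specific percentile thresholds.

```python
from statistics import mean, median

# Hypothetical FNCI values for ten publications: nine at or below the
# world average of 1.0, plus one very highly cited outlier.
fnci_values = [0.2, 0.3, 0.4, 0.5, 0.5, 0.6, 0.8, 0.9, 1.0, 14.8]

mncs = mean(fnci_values)                     # about 2.0, pulled up by one paper
median_fnci = median(fnci_values)            # about 0.55, closer to the typical paper
mncs_without_outlier = mean(fnci_values[:-1])

# Simplified alternative: share of publications above an assumed threshold.
threshold = 1.0
share_above = sum(1 for value in fnci_values if value > threshold) / len(fnci_values)

print(round(mncs, 2), round(median_fnci, 2), round(mncs_without_outlier, 2), share_above)
```

Here the mean suggests the set is cited at roughly twice the world average, while the median and the share above the threshold show that only one of the ten publications exceeds it.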

Inappropriate Use Cases

As with most citation analyses, citation-based metrics should not be interpreted as a direct measure of research quality.

Available Metric Sources

Article-level FNCI values are available in Scopus (as the Field-Weighted Citation Impact, FWCI), and other bibliometric sources (e.g., Web of Science, Dimensions, Google Scholar) make it possible to calculate similar field-normalized citation-based metrics.

Transparency

The FNCI’s calculation is a well-known methodology in bibliometric practice. The FNCI is dependent upon a publication’s classification by discipline, publication type, and year. While the year and type of publication are recorded and verifiable, discipline assignments are not always available.

Website

N/A

Timeframe

The FNCI may be subject to different ‘citation windows.’ Typically, it uses citation data from the year of publication plus 3 years, although longer (i.e., more than 3 years) or variable citation windows (i.e., considering all subsequent publication years available after the publication year) are possible.

The explanation and interpretation of this metric comes directly from the Metric Toolkit, CC BY.