
Clinical performance comparators in audit and feedback: a review of theory and evidence



Background
Audit and feedback (A&F) is a common quality improvement strategy with highly variable effects on patient care. It is unclear how A&F effectiveness can be maximised. Since the core mechanism of action of A&F depends on drawing attention to a discrepancy between actual and desired performance, we aimed to understand current and best practices in the choice of performance comparator.


Methods
We described current choices for performance comparators by conducting a secondary review of randomised trials of A&F interventions and identifying the associated mechanisms that might have implications for effective A&F by reviewing theories and empirical studies from a recent qualitative evidence synthesis.


Results
We found across 146 trials that feedback recipients’ performance was most frequently compared against the performance of others (benchmarks; 60.3%). Other comparators included recipients’ own performance over time (trends; 9.6%) and target standards (explicit targets; 11.0%), and 13% of trials used a combination of these options. In studies featuring benchmarks, 42% compared against mean performance. Eight (5.5%) trials provided a rationale for using a specific comparator. We distilled mechanisms of each comparator from 12 behavioural theories, 5 randomised trials, and 42 qualitative A&F studies.


Conclusions
Clinical performance comparators in published literature were poorly informed by theory and did not explicitly account for mechanisms reported in qualitative studies. Based on our review, we argue that there is considerable opportunity to improve the design of performance comparators by (1) providing tailored comparisons rather than benchmarking everyone against the mean, (2) limiting the amount of comparators being displayed while providing more comparative information upon request to balance the feedback’s credibility and actionability, (3) providing performance trends but not trends alone, and (4) encouraging feedback recipients to set personal, explicit targets guided by relevant information.

Peer Review reports


Background
Audit and feedback (A&F), a summary of clinical performance over a specified period of time, is one of the most widely applied quality improvement interventions in medical practice. A&F appears to be most successful when provided by a supervisor or colleague, delivered more than once, in both verbal and written form, when baseline performance is low, and when it includes explicit targets and an action plan [1, 2]. However, reported effects vary greatly across studies and little is known about how to enhance its effectiveness [3]. To advance the science of A&F, the field has called for theory-informed research on how best to design and deliver A&F interventions [4, 5]. Numerous hypotheses and knowledge gaps have been proposed that require further research to address outstanding uncertainty [5, 6]. One area of uncertainty is the choice of performance comparator included in feedback reports.

Although it is feasible to provide clinical performance feedback without an explicit comparison [7, 8], feedback is typically provided in the context of a performance comparator: a standard or benchmark to which the recipient’s observed performance level can be compared. Comparators play an important role in helping feedback recipients to identify discrepancies between current and desirable practice [9] and improve self-assessments [10]. While most often performance is compared against the average of a peer group [11], many other potential comparators have been proposed in the literature. The choice of comparator may have important implications for what message is conveyed by the feedback, and therefore how recipients react to it [12]. For instance, if a physician’s performance level has improved since the previous audit but remains well below national average, comparing against the physician’s previous level would suggest that there is no need for change, whereas comparing against the national average would suggest the opposite. At the same time, existing psychological theories suggest that the mechanisms by which recipients respond to feedback are complex, making it less obvious that recipients adopt an ‘externally imposed’ performance comparator as a personal target [7, 13]. Empirical studies show that, instead, recipients may reject feedback recommendations to pursue other levels of performance [14, 15]. To date, little evidence informs A&F intervention designers about which comparators should be chosen under what circumstances and how they should be delivered to the recipients [5, 16].

We aim to inform choices regarding performance comparators in A&F interventions and help identify causal mechanisms for change. Our objective was to (1) describe choices for delivering clinical performance comparators in published A&F interventions and (2) identify the associated mechanisms from theories and empirical studies that might have implications for effective A&F.


Methods
To identify current choices for performance comparators, we examined all A&F interventions evaluated in the 146 unique trials included in the 2012 Cochrane review [1] and the 2017 systematic review of electronic A&F [2]. The Cochrane review spanned 1982–2011; the systematic review spanned 2006–2016. Both reviews searched the Cochrane Central Register of Controlled Trials, MEDLINE, EMBASE, and CINAHL. We developed a data extraction sheet and guide to extract details about delivered comparators from all included studies: what comparators were delivered, their origin, the specific values delivered, and the rationale for their use. Two reviewers (WG and BB) piloted the guide and sheet on 10 initial studies and then on 10 additional studies, after each of which we improved terms and definitions. WG and BB independently extracted the data; disagreements were resolved through discussion.

To identify the potential mechanisms associated with each of the different comparators that have implications for effective A&F, we reviewed existing behaviour change theories and evidence from empirical A&F studies. Candidate theories were identified from a systematic review of theories used in randomised trials of A&F [17], contact with experts, and a supplemental theory-focused literature search following the methodology detailed by Booth and Carroll [18] (Additional file 1). Empirical studies were the randomised trials included in the two reviews [1, 2] and the qualitative evaluation studies included in the systematic review and meta-synthesis recently undertaken by part of the study team [19]. We included theories and empirical studies if they described explanations of why, how, or when a behaviour may or may not occur as a result of the comparator choice within the context of receiving clinical performance feedback. From the included theories and randomised trials, we summarised relevant predictions and evidence. From the qualitative studies, we extracted and coded excerpts in batches using Framework Analysis [20] and Realist Evaluation [21, 22] (see details in [19]). We used an iterative process to formulate mechanisms for each comparator and to refine and generalise them across the included theories and empirical studies [23, 24].

The consolidated results were discussed, refined, and agreed with the team. The 10-member study team has extensive expertise in designing and evaluating A&F interventions, behaviour change, implementation science, and health psychology. Three authors (HC, NI, JB) previously reviewed or were involved in reviewing 140 randomised A&F trials [1, 11], 3 authors (BB, SvdV, NP) reviewed 7 randomised trials of electronic A&F [2], and 4 authors (WG, BB, SvdV, NP) reviewed 65 qualitative studies of A&F [19]. The team also included clinicians with experience as feedback recipients or feedback designers.

In the ‘Results’ section, we present the descriptions and frequency with which performance comparators have been used in randomised trials of A&F interventions, followed by the comparators’ mechanisms as supported by theory and empirical evidence.


Results
Table 1 summarises the key characteristics of the included 146 RCTs [1, 2] and 65 qualitative evaluation studies [19] of A&F interventions. We found that 98 of the 146 (67.1%) included A&F interventions used performance comparators within feedback messages; the remaining 48 intervention trials either explicitly stated they did not use a comparator or did not mention it. Possible comparators included the performance achieved by other health professionals (benchmarks, n = 88; 60.3%), recipients’ own historical performance (trends, n = 17; 9.6%), or target standards (explicit targets, n = 16; 11.0%). Several interventions used more than 1 type of comparator (n = 19; 13.0%). Only 8 (5.5%) trials reported a rationale for using their specified comparator. We included 12 theories relating to general feedback mechanisms [7, 9, 25], goal-setting [13], guideline adherence [26], psychology [27,28,29,30], and sociology [31,32,33], and incorporated empirical findings from 5 randomised controlled trials and 42 qualitative studies to help explain comparator mechanisms and their potential effects on clinical performance. Table 2 provides these mechanisms and their theoretical and empirical support. Table 3 shows the details and frequencies of the comparators delivered in A&F interventions.

Table 1 Study characteristics
Table 2 Potential mechanisms and effects of clinical performance comparators and their theoretical and empirical support
Table 3 Performance comparators used in the 146 included audit and feedback interventions


Benchmarks
In 88 (60.3%) interventions, the feedback included benchmarks, i.e. comparisons of recipients’ achieved performance against that of other health professionals or peers. Benchmarks could be characterised by the group of peers being compared against (reference group) and by how the group’s performance was represented (summary statistic). We identified 7 theories, 5 trials, and 32 qualitative studies that suggested mechanisms relevant to benchmarking (Table 2). Although benchmarks do not necessarily state explicitly what levels recipients are expected to achieve, they may be perceived as targets that recipients use for improvement. In fact, they can harness competition between recipients (Social Comparison Theory [31]) and motivate recipients to change behaviour if they see others behaving differently (Persuasion Theory [27] and Social Norms Theory [33]) or to maintain their status in a group of high-performing clinicians (Reference Group Theory [32]). Recipients who observe that others are achieving a certain level of performance may find it easier to conceive that they can too. While a wide array of qualitative studies support these theoretical mechanisms [34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52], Feedback Intervention Theory [7] counters that benchmarking weakens the effects of feedback by directing recipients’ attention away from the task at hand (i.e. the clinical performance issue in question, such as prescribing appropriate medication). Two trials comparing feedback with versus without benchmarks, however, both found small increases in effectiveness [53, 54]. Qualitative studies furthermore showed that benchmarks induced positive emotions (e.g. reassurance, satisfaction) when recipients observed they were performing better than or similar to others [39, 49, 55,56,57,58,59], or negative emotions (e.g. embarrassment) and consequent feedback rejection when recipients performed at the lower end of the distribution [49, 58].
In 1 A&F trial, involving an intervention to increase use of a preferred drug, Schectman et al. [60] explicitly chose not to include benchmarks because overall use was low and they expected benchmarking to discourage greater use.

Reference group

Benchmarks were typically drawn from the performance of peers in the same region (n = 39; 24.7%), state or province (n = 26; 17.8%), or country (n = 21; 14.4%), or, in the case of individualised feedback, from other health professionals within the same unit, hospital, or department (n = 12; 8.2%). In 3 (2.1%) cases, benchmarks concerned similar-type peers, such as only teaching hospitals or only non-teaching hospitals. Finally, in 19 (13.0%) cases, comparisons to multiple peer groups were provided, such as the region and the country, or only teaching hospitals and all hospitals in the province. Qualitative studies reported that recipients were more likely to accept the benchmark when they considered its reference group relevant and comparable [36, 39, 40, 51, 52, 61,62,63], as also hypothesised by Reference Group Theory [32]. This suggests that regional comparisons are typically preferred over national ones, and that comparisons that differentiate between types of peers may be more effective than those that do not. Conversely, recipients rejected feedback when they felt that the comparison was irrelevant or unfair, such as when they perceived inadequate case-mix adjustment or patient stratification [36, 39, 52, 62, 63].

Summary statistic

The most common benchmark value was the group mean (n = 37; 25.3%). Other summary statistics used were the mean of the top 10% of peers (n = 7; 4.8%; also known as the achievable benchmark of care, or ABC benchmark, defined as the mean performance achieved by the top 10% best performers of the group [64]), the median (n = 6; 4.1%), various other percentiles such as the 75th or 80th percentile (n = 6; 4.1%), and the recipient’s rank or percentile rank in the group (n = 4; 2.7%). In contrast to using a summary statistic as the value of a benchmark, feedback in 26 (17.8%) interventions presented the individual performance scores achieved by peers in the group, e.g. in a bar chart, histogram, or table. In 22 (15.1%) cases, it was not reported or was unclear how peer performance was represented. Despite the mean being the most popular choice, others have used higher levels, e.g. the 80th percentile or the top 10% of peers, as these can more clearly demonstrate discrepancies between actual and desired performance for the majority of feedback recipients [65,66,67]. Benchmarking against the mean reveals such discrepancies to at most half of the recipients and may not lead to the desired intentions to achieve the highest standards of care (Control Theory [9]). This was also supported by several qualitative studies in which recipients were not prompted to improve because their performance was ‘in the middle’ [35, 59, 68], or were dissatisfied by comparison against the mean because they did not consider it the gold standard [35, 62]. In a randomised trial comparing two variations of benchmarks, Kiefe et al. [65] found that comparing to the top 10% of peers led to greater feedback effectiveness than comparing to the mean. However, Schneider et al. [66] found that identifying the top performers in the context of a quality circle did not improve the effectiveness of feedback.
Consistent with Goal-setting Theory [13], some low performers considered such high benchmarks unachievable and questioned or disengaged from the feedback [35, 62]; they may have benefitted more from comparing to the mean.
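To make concrete how these summary statistics differ, the sketch below computes the group mean, the median, the achievable benchmark of care (mean of the top 10% of performers), and a recipient's percentile rank from a set of peer scores. The function name and the data are our illustrative assumptions, not drawn from any included trial.

```python
import statistics

def benchmark_summaries(scores, recipient_score):
    """Common benchmark summary statistics for a peer group (illustrative)."""
    ranked = sorted(scores, reverse=True)
    top_n = max(1, len(ranked) // 10)  # size of the top 10% of peers
    return {
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        # achievable benchmark of care (ABC): mean of the top 10% performers
        "abc": statistics.mean(ranked[:top_n]),
        # share of peers the recipient outperforms, as a percentile rank
        "percentile_rank": 100 * sum(s < recipient_score for s in scores) / len(scores),
    }

peer_scores = [52, 60, 61, 65, 70, 72, 75, 78, 83, 91]  # hypothetical adherence rates (%)
print(benchmark_summaries(peer_scores, recipient_score=70))
```

Note how a recipient scoring 70 sits just below the mean (70.7) yet far below the ABC benchmark (91): the choice of summary statistic changes whether a discrepancy is revealed at all.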

Feedback in 3 (2.1%) interventions presented individual peers’ performance scores while making the identities of those peers visible to recipients. In two cases, this concerned all peers [69, 70]; in the other, only the top performer was identified [66]. This approach may be effective because it allows recipients to choose the most relevant peers for comparison (Reference Group Theory [32]) and further increases their sense of competition, knowing that their own performance is reported to others (Social Comparison Theory [31]). However, qualitative studies have reported that recipients experienced such open reporting as threatening and therefore preferred anonymous data [44, 48, 61, 71, 72].

Multiple benchmarks

Sixteen (11.0%) interventions used a combination of benchmarks, such as the mean and standard deviation, the median and the top 10%, or peers’ individual scores and the interquartile range. Several qualitative studies have indicated that providing multiple benchmarks (that is, against multiple groups, multiple summary statistics, or peers’ individual performance scores) may enhance the credibility of feedback because it helps recipients assess variation between professionals and judge whether potential discrepancies are clinically significant [37, 40, 57, 59, 73, 74]. However, it also increases the complexity of the feedback message, making it more difficult to understand whether performance requires attention as there are multiple values to which recipients can compare (Feedback Intervention Theory [7]). Multiple benchmarks also allow recipients to make downward social comparisons, a defensive tendency in which they compare themselves against a group or individual that they consider ‘worse off’ in order to feel better about themselves (Social Comparison Theory [31]). In contrast, comparing against a group or individual perceived as superior can facilitate self-evaluation and improvement [31].


Trends
Feedback in 17 (9.6%) interventions included trends, i.e. comparisons to recipients’ own previously achieved performance over a specified period (reference period). We identified 2 theories and 12 qualitative studies that suggested mechanisms relevant to trends (Table 2). For example, Foster et al. [75] provided one-time feedback 6 months after the start of a multifaceted educational programme to increase adherence to asthma guidelines, in which recipients’ current performance was compared to that at baseline. Rantz et al. [76] provided feedback that included trends displayed as a line graph of recipients’ performance over the previous 5 quarters. Trends allow recipients to monitor themselves and assess the rate of change in their performance over time. Feedback Intervention Theory [7] and theory on self-regulation [30] refer to this as velocity feedback and indicate that rapid rates of improvement lead to more goal achievement and satisfaction, whereas constant or delayed improvement rates ultimately lead to withdrawal. Empirical studies found that recipients who observed deteriorating performance were often prompted to take corrective action [37,38,39, 44, 46, 50, 51, 55, 77,78,79,80,81,82,83]. Upward trends made successful change observable to recipients, which promoted satisfaction and other positive emotions [44,45,46, 77,78,79,80]. Feedback messages that include performance at multiple time points may also enhance the credibility of the message, whereas a single instance of low current performance might be dismissed as a ‘snapshot’ and explained away as chance or seasonal effects [39, 45]. However, past performance does not clearly guide improvement: it tells recipients where they came from but not where they should end up. This may be one of the reasons that 13 of the 17 studies provided additional comparators (benchmarks or explicit targets).
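The velocity idea can be sketched in a few lines: velocity feedback is simply the period-to-period change in a performance trend, rather than the level itself. The data below are hypothetical quarterly rates, not from any included study.

```python
def velocity(trend):
    """Period-to-period change in a performance trend ('velocity feedback').

    Illustrative helper; rounding keeps the float arithmetic readable.
    """
    return [round(later - earlier, 4) for earlier, later in zip(trend, trend[1:])]

# hypothetical quarterly performance rates: improving, and at an increasing rate
print(velocity([0.55, 0.58, 0.63, 0.70]))
```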

Reference period

The reference period used to display trends, described by the number of time points and the intervals between past performance measurements, was typically consistent with the number of times and the frequency with which feedback was provided. Most often, trends displayed quarterly (n = 7; 4.8%) or monthly (n = 4; 2.7%) performance; other variants were weekly (n = 2; 1.4%), biyearly (n = 2; 1.4%), or yearly (n = 1; 0.7%). While qualitative studies reported that recipients valued ‘regular updates’, the exact frequency preferred by recipients typically depended on the clinical topic and the number of observations (e.g. patients) available for each audit [37, 39, 45, 46, 82, 83].

Explicit targets

In 16 (11.0%) interventions, health professionals received feedback with an explicit target: a specific level of achievement that recipients are explicitly expected to attain. Targets could be characterised by the person or party setting the target (source) and the level at which it is set (value). Seven theories and 6 qualitative studies suggested mechanisms relevant to targets (Table 2). The use of explicit targets reduces the complexity of feedback messages because it makes it easier for recipients to know what needs to be attained and whether a corrective response is necessary (Control Theory [9], Goal-setting Theory [13], Feedback Intervention Theory [7]). Two qualitative studies confirmed this [84, 85]. Explicit targets can be based on expert opinion, healthcare policies, performance data (e.g. benchmarks or trends), or a combination of these. The main difference between explicit targets on the one hand and benchmarks and trends on the other is that the latter two, despite potentially revealing important discrepancies with desired practice, may not explicitly judge current performance, leaving it to recipients to determine whether their performance is acceptable or not.


Source
Targets were set by an external party (externally set targets; n = 11) or locally by feedback recipients themselves (self-set targets; n = 5); two interventions used both. External targets were set by an expert panel (n = 3; 2.1%), investigators (n = 5; 3.4%), or guidelines or government (n = 3; 2.1%); in 1 (0.7%) case, the source was unclear. While powerful target-setting sources can influence recipients’ decisions to take action, theory by Ilgen et al. [25] predicts that feedback from a source with low power or credibility is easily rejected. Cabana’s model of guideline adherence [26] indicates that physicians may have various reasons for non-adherence to a recommended target, such as disagreement or lack of self-efficacy or outcome expectancy. Accepting a message indicating that performance is below a target requires recipients to acknowledge that they are underperforming. However, this might conflict with the self-perception of being a capable and competent health professional, a situation referred to as cognitive dissonance (Theory of Cognitive Dissonance [28]). The theory states that recipients might find it easier to resolve this conflict by rejecting the externally imposed target rather than questioning their own competency, even if the feedback holds compelling and meaningful information. Two qualitative studies reported similar responses by recipients due to cognitive dissonance [68, 84]. Self-affirmation Theory [29] explains that such defensive responses arise, in part, from the motivation to maintain self-integrity. Affirmations of alternative domains of self-worth unrelated to the provoking threat (e.g. by also emphasising high performance on other care aspects) can help recipients deal with threatening information without resorting to defensive responses [29].

When feedback recipients set performance targets themselves (self-set targets), they are more likely to commit to and make progress towards those targets (Goal-setting Theory [13]). Qualitative studies have shown that feedback with self-set targets may decrease the consistency in clinical performance across recipients [85, 86], in particular if the targets are not supported by an external information source (e.g. benchmarking). Furthermore, recipients might adapt their targets to their performance, eliminating discrepancies rather than improving performance (Feedback Intervention Theory [7]).


Value
Ambitious targets are more effective than easy ones as long as they are achievable (Goal-setting Theory [13] and Feedback Intervention Theory [7]). However, it might prove difficult to define a single target that is perceived as both ambitious and achievable by all recipients of a feedback intervention. Six (4.1%) interventions used absolute targets, or criterion-referenced targets, which are typically determined at or before baseline and do not change over time. For example, in Sommers et al. [87], an expert panel set a specific target (between 80 and 90%) for each quality indicator. Rantz et al. [76] provided 2 explicit targets to distinguish between good and excellent performance (e.g. 16% vs 6% rate of falls). In another 6 (4.1%) interventions, the targets related to benchmarking against best practice. For example, in Goff et al. [88], researchers set explicit targets at the 80th percentile of participants’ baseline performance. Finally, 3 (2.1%) interventions set targets based on trends. For example, Fairbrother et al. [89] awarded financial bonuses to recipients who achieved 20% improvement from baseline, and Curran et al. [90] fed back statistical process control charts with control limits derived from the unit’s past performance to define out-of-control performance. With absolute targets, it is possible for all recipients to pass or fail depending on their achieved performance level, whereas relative targets, by definition, present discrepancies to only a subset of recipients. Relative targets based on benchmarking may be considered unfair by recipients performing just below them, in particular when the distribution of performance scores is narrow and differences between health professionals are clinically insignificant [59, 84]. Incremental targets demonstrate discrepancies to all recipients but may be unachievable when baseline performance is already high.
Absolute targets are simple to understand but can become outdated when achieved by most recipients, and should be reset in response to changing performance levels to remain appropriate [91]. Relative targets based on benchmarking can be automatically adjusted when the provider group’s performance changes. This facilitates continuous quality improvement (i.e. targets increase as the group improves), but their changing nature also creates uncertainty for recipients as to which performance levels should be targeted to guide improvement efforts [72]. However, in the included studies, relative targets were all set once and did not change.
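The statistical process control approach to trend-based targets can be sketched as follows. The 3-sigma limits and the quarterly data are common-default illustrative assumptions, not details reported by the trial cited above.

```python
import statistics

def control_limits(past_rates, sigmas=3):
    """Lower and upper control limits derived from a unit's own past
    performance (illustrative; 3-sigma is a conventional default)."""
    centre = statistics.mean(past_rates)
    spread = statistics.stdev(past_rates)
    return centre - sigmas * spread, centre + sigmas * spread

past = [0.62, 0.65, 0.63, 0.66, 0.64, 0.61]  # hypothetical quarterly rates
low, high = control_limits(past)
current = 0.48
# a new observation outside the limits signals 'out-of-control' performance
print("out of control" if not (low <= current <= high) else "in control")
```

Because the limits are recomputed from the unit's own history, the target moves with the unit: stable performance yields narrow limits, so even modest deteriorations become flagged.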


Discussion
In an effort to inform the design and delivery of more reliably effective A&F, we reviewed 146 randomised trials to identify choices for delivering clinical performance comparators. Ninety-eight (67.1%) included one or more comparators. Health professionals’ performance was compared against the performance of others (benchmarks; 60.3%), the recipient’s own historical performance (trends; 9.6%), expected standards of achievement (explicit targets; 11.0%), or a combination of these (13.0%). Only 8 trials (5.5%) stated a rationale for using the specific comparators. We identified 12 behavioural theories and evidence from 5 randomised trials and 42 qualitative studies from which we distilled explanations of the mechanisms through which different comparators may support quality improvement.

Comparison to existing literature

In a re-analysis of the earlier Cochrane review by Jamtvedt et al. [92] (118 trials), Hysong [93] found no effect of adding benchmarks to A&F, regardless of whether or not identities of peers were known to recipients. While our findings suggest that benchmarking should increase the effectiveness of A&F by harnessing the social dynamics between recipients, there remain unanswered questions with respect to how benchmarks could work best. In line with our results, two empirical studies of A&F [14, 15] demonstrated that benchmarking against the mean and the top 10% of performers influences recipients’ intentions to improve on quality indicators, even though these intentions are not always translated into effective action [94, 95]. Still, study participants ignored some benchmarks because they were too high or the indicator lacked priority [14].

The effect of explicit targets has been previously investigated by Gardner et al. [96] in their re-analysis of the Jamtvedt review [92]. Gardner’s results were inconclusive at the time because very few studies explicitly described their use of targets, but the 2012 update of the review [1] showed that target setting, in particular in combination with action planning, increased the effectiveness of A&F. The role of involving recipients in setting targets themselves remains uncertain in healthcare settings [97, 98]. An empirical study [15] showed that recipients may set their targets regardless of any benchmarks or trends and—potentially unrealistically—high, even when confronted with benchmarks of the top 10% reflecting much lower standards [15].

Brehaut et al. [5] recently advocated a single comparator that effectively communicates the key message. While multiple comparators may indeed send complex and mixed messages to recipients, we found that multiple comparators, when well considered and well presented, may benefit the effectiveness of A&F [99]. This underlines the complexity of this area and the need for more research.

Implications for practice and research

Our findings can guide the design of A&F interventions with respect to the choice of performance comparator in feedback messages. We have identified a wide variety of comparators that may be included in feedback messages, as well as mechanisms and outcomes that potentially occur as a consequence of those comparators: what message the feedback conveys (i.e. whether and how it reflects discrepancies with desirable practice), how recipients might respond, and ultimately the effectiveness of A&F. Many of the mechanisms we identified originate from behavioural science, which offers a wealth of theoretical and empirical evidence that feedback designers do not often take into account [4, 17]. The exact way in which a comparator modifies that response and the intervention’s effectiveness depends on various factors relating to the individual recipient or team, organisation, patient population, and/or clinical performance topic, in addition to whether and how the comparator reveals a discrepancy with current practice [19]. A&F designers should explicitly consider these factors and the mechanisms we presented, and offer justification for their choice of comparator.

A single type of comparator that works for all recipients and for all care processes or outcomes targeted by the A&F intervention may not exist. Comparators should be designed to maximise feedback acceptance in the context of raising standards of care via multiple means. Based on our findings, we have four suggestions for choosing comparators:

  1.

    Step away from benchmarking against the mean and consider tailored performance comparisons

Benchmarks work by leveraging the social dynamics between recipients, the main mechanisms of which have been described by Social Comparison Theory [31] and Reference Group Theory [32]. However, 42% of the A&F interventions in this study that used benchmarking involved comparisons to the group mean. These theories predict, and qualitative and quantitative evidence has demonstrated, that such comparisons are unlikely to raise performance levels comprehensively across feedback recipients. We recommend that recipients compare themselves to high-performing others that are both relevant and comparable to the recipient. However, if benchmarks are too high, they may be perceived as unachievable by low performers and lead to feedback rejection or other unintended consequences. For example, the authors of a recent A&F study to reduce high-risk prescribing in nursing homes reasoned that benchmarking against the top 10% might risk unintended discontinuation of appropriate medications and therefore compared against the top quartile instead [100]. A solution to this problem may lie in tailoring feedback messages to individual recipients or practices [12], for example by comparing low performers to the mean or median and others to the top 10%.
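One way the tailoring idea could be operationalised is sketched below. The threshold (the peer median) and the two summary statistics chosen are illustrative assumptions, not a validated tailoring rule.

```python
import statistics

def tailored_benchmark(recipient_score, peer_scores):
    """Pick a comparator for one recipient: the peer median for low
    performers, the mean of the top 10% for the rest (illustrative rule)."""
    median = statistics.median(peer_scores)
    if recipient_score < median:
        return ("median", median)  # an achievable next step
    ranked = sorted(peer_scores, reverse=True)
    top = ranked[: max(1, len(ranked) // 10)]
    return ("top 10% mean", statistics.mean(top))  # an aspirational target

peers = [52, 60, 61, 65, 70, 72, 75, 78, 83, 91]  # hypothetical peer scores (%)
print(tailored_benchmark(60, peers))
print(tailored_benchmark(80, peers))
```

A low performer (60) is shown a reachable median, while a mid-range performer (80) is stretched toward the top 10%, so every recipient sees a discrepancy that is plausibly achievable.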

  2.

    Balance the credibility and actionability of the feedback message

Qualitative studies have found feedback credibility and actionability to be important characteristics that should be properly balanced when choosing comparators. Based on a single comparator, health professionals may explain negative feedback away as a coincidental ‘snapshot’ of low performance, or question the data quality or fairness of the comparison [101]. Offering multiple performance comparators may help recipients assess whether there are true discrepancies with desired practice. For example, trends reveal whether low performance was a one-off or has been consistent over time, and multiple benchmarks (e.g. individual peer scores) indicate performance in light of the variation between health professionals. Although providing multiple comparators may therefore increase the credibility of the feedback, it also increases its complexity and cognitive load and might send mixed messages to recipients. For example, if a health professional’s performance has improved over time but remains below the top 10% of practices, a feedback message suggesting that improvement is needed might be inconsistent with the professional’s interpretation that ‘the numbers are improving so no further change is necessary’ [5]. Hence, feedback should clearly convey the key message (i.e. whether or not improvement is recommended), limiting the amount of information (e.g. comparators) presented to increase actionability, while allowing recipients to view more detailed comparative information on request to increase credibility.

  3.

    Provide performance trends, but not trends alone

Trends enable recipients to monitor performance and progress over multiple time points, and many qualitative studies have shown that recipients are likely to act upon observed performance changes. In fact, Feedback Intervention Theory [7] and theory on self-regulation [30] suggest that the rate of performance change (i.e. velocity) may be a more important motivator for change than the distance between performance and a goal (i.e. discrepancy). Trends also increase the credibility of feedback and enable a quality improvement cycle in which recipients continuously self-assess their performance and decide whether or not to act. Trends therefore add substantial value to feedback and should be an explicit part of feedback messages. However, since trends only provide information about past performance and not about the goal, they should be accompanied by other comparators (i.e. a benchmark or explicit target) that provide explicit direction for further improvement.
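The distinction between velocity and discrepancy can be made concrete with a toy calculation (the indicator values and target below are invented for illustration):

```python
def discrepancy(current, target):
    """Distance between current performance and the goal."""
    return target - current

def velocity(series):
    """Average change per feedback period across the observed trend."""
    return (series[-1] - series[0]) / (len(series) - 1)

# Invented example: adherence to a care indicator over five audit cycles.
trend = [60, 64, 69, 73, 78]
target = 90

print(discrepancy(trend[-1], target))  # gap still to close: 12
print(velocity(trend))                 # improving 4.5 points per cycle
```

A trend-only display communicates the velocity (4.5 points per cycle) but not the discrepancy; only the addition of a benchmark or explicit target makes the remaining 12-point gap visible.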

  4.

    Encourage feedback recipients to set personal, explicit targets guided by relevant information

Goal-setting Theory [13], and various theories that extend it, predict that explicit targets reduce feedback complexity because they set specific, measurable goals. However, qualitative studies report that unless externally set targets come from a broadly recognised, credible authority (e.g. national guidelines) or are linked to financial incentives, accreditation, or penalties, they may not be acceptable to a subset of recipients. We therefore recommend that feedback recipients be encouraged to set their own targets, guided by relevant information drawn from guidelines, expert opinion, and performance data, against which explicit comparisons can be made in the feedback. Feedback providers can collaborate with recipients to ensure that targets are appropriate. Although recipients may consequently pursue different targets, this approach enables them to commit to self-chosen targets that are both achievable and appropriate for them, which reduces the chance of feedback rejection.

Strengths and limitations

To our knowledge, we are the first to systematically consider existing relevant theories and empirical evidence to fill a key knowledge gap regarding the use of clinical performance comparators in A&F interventions [4, 6]. Few past studies have explicitly built on extant theory and previous research [17]. This work helps advance the science in the field by summarising practical considerations for comparator choice in A&F design.

There are also several limitations. In using the 2012 Cochrane review of A&F and the 2017 systematic review of electronic A&F to identify current choices for performance comparators, we were limited to randomised controlled trials evaluated in research settings. Other study designs, and A&F used in routine (non-research) healthcare settings, might have yielded other types and/or frequencies of performance comparators. In particular, because A&F in research settings likely emphasises performance improvement while routine A&F may focus more on performance monitoring, we expect that the comparators and mechanisms we identified are aimed more at activating recipients to improve practice than at merely supporting recipients in assessing their performance. Another limitation is the quality of reporting and the inconsistent terminology for comparators, particularly in older studies [11, 102]. One way in which this might have manifested is that it was often unclear to what extent performance comparators were delivered as explicit targets. For example, studies that used a particular benchmark may have added an explicit message that recipients were expected to achieve that standard, making the benchmark an explicit target as well, without reporting it as such. As a result, despite the prominence of targets in existing feedback theories [7, 9, 13], we found limited evidence about the use of explicit targets.

Our review was limited to performance comparators at an aggregated level. When feedback is provided about individual patient cases, patient-level comparators may be included, which allow feedback recipients to make performance comparisons for each patient [103]. We also did not explore the influence of the way in which comparators were displayed or represented in feedback messages. Finally, we did not use meta-regression to examine and quantify the effects of each comparator, because such an analysis would be vastly underpowered given the large variety in comparator use across trials.

Unanswered questions and future research

Colquhoun et al. have generated a list of 313 theory-informed hypotheses that suggest conditions for more effective interventions, of which 26 relate to comparators [6]. Our research delivers some important pieces of the puzzle of designing and delivering effective A&F, but many other pieces are still missing. To move the science forward, more of these hypotheses should be tested. Within the domain of performance comparators, theory-informed head-to-head trials comparing different types of comparators (e.g. [100, 104]) are needed to help identify successful comparators tested under similar conditions.


Conclusions

Published A&F interventions have typically used benchmarks, historic trends, and explicit targets as performance comparators. The choice of comparator seemed rarely motivated by theory or evidence, even though abundant literature about feedback mechanisms exists in behavioural and social science theories and empirical studies. Most interventions benchmarked against mean performance, which is unlikely to comprehensively raise standards of care. There appears to be considerable opportunity to design better performance comparators to increase the effectiveness of A&F. Designers of A&F interventions need to explicitly consider the mechanisms of comparators and offer justification for their choice.



Abbreviations

A&F: Audit and feedback


References

1. Ivers N, Jamtvedt G, Flottorp S, Young JM, Odgaard-Jensen J, French SD, et al. Audit and feedback: effects on professional practice and healthcare outcomes. Cochrane Database Syst Rev. 2012;6:CD000259.
2. Tuti T, Nzinga J, Njoroge M, Brown B, Peek N, English M, et al. A systematic review of electronic audit and feedback: intervention effectiveness and use of behaviour change theory. Implement Sci. 2017;12:61.
3. Ivers NM, Grimshaw JM, Jamtvedt G, Flottorp S, O’Brien MA, French SD, et al. Growing literature, stagnant science? Systematic review, meta-regression and cumulative analysis of audit and feedback interventions in health care. J Gen Intern Med. 2014.
4. Ivers NM, Sales A, Colquhoun H, Michie S, Foy R, Francis JJ, et al. No more “business as usual” with audit and feedback interventions: towards an agenda for a reinvigorated intervention. Implement Sci. 2014;9:14.
5. Brehaut JC, Colquhoun HL, Eva KW, Carroll K, Sales A, Michie S, et al. Practice feedback interventions: 15 suggestions for optimizing effectiveness. Ann Intern Med. 2016;164:435–41.
6. Colquhoun HL, Carroll K, Eva KW, Grimshaw JM, Ivers N, Michie S, et al. Advancing the literature on designing audit and feedback interventions: identifying theory-informed hypotheses. Implement Sci. 2017;12:117.
7. Kluger AN, DeNisi A. The effects of feedback interventions on performance: a historical review, a meta-analysis, and a preliminary feedback intervention theory. Psychol Bull. 1996;119:254–84.
8. Eva KW, Armson H, Holmboe E, Lockyer J, Loney E, Mann K, et al. Factors influencing responsiveness to feedback: on the interplay between fear, confidence, and reasoning processes. Adv Health Sci Educ Theory Pract. 2012;17:15–26.
9. Carver CS, Scheier MF. Control theory: a useful conceptual framework for personality-social, clinical, and health psychology. Psychol Bull. 1982;92:111–35.
10. Davis DA, Mazmanian PE, Fordis M, Van Harrison R, Thorpe KE, Perrier L. Accuracy of physician self-assessment compared with observed measures of competence: a systematic review. JAMA. 2006;296:1094–102.
11. Colquhoun H, Michie S, Sales A, Ivers N, Grimshaw JM, Carroll K, et al. Reporting and design elements of audit and feedback interventions: a secondary review. BMJ Qual Saf. 2016.
12. Landis-Lewis Z, Brehaut JC, Hochheiser H, Douglas GP, Jacobson RS. Computer-supported feedback message tailoring: theory-informed adaptation of clinical audit and feedback for learning and behavior change. Implement Sci. 2015;10:12.
13. Locke EA, Latham GP. Building a practically useful theory of goal setting and task motivation: a 35-year odyssey. Am Psychol. 2002;57:705–17.
14. Gude WT, van Engen-Verheul MM, van der Veer SN, de Keizer NF, Peek N. How does audit and feedback influence intentions of health professionals to improve practice? A laboratory experiment and field study in cardiac rehabilitation. BMJ Qual Saf. 2017;26:279–87.
15. Gude WT, Roos-Blom M-J, van der Veer SN, Dongelmans DA, de Jonge E, Francis JJ, et al. Health professionals’ perceptions about their clinical performance and the influence of audit and feedback on their intentions to improve practice: a theory-based study in Dutch intensive care units. Implement Sci. 2018;13:33.
16. Foy R, Eccles MP, Jamtvedt G, Young J, Grimshaw JM, Baker R. What do we know about how to do audit and feedback? Pitfalls in applying evidence from a systematic review. BMC Health Serv Res. 2005;5:50.
17. Colquhoun HL, Brehaut JC, Sales A, Ivers N, Grimshaw J, Michie S, et al. A systematic review of the use of theory in randomized controlled trials of audit and feedback. Implement Sci. 2013;8:66.
18. Booth A, Carroll C. Systematic searching for theory to inform systematic reviews: is it feasible? Is it desirable? Health Inf Libr J. 2015;32:220–35.
19. Brown B, Gude W, Blakeman T, van der Veer S, Ivers N, Francis J, et al. Clinical performance feedback intervention theory (CP-FIT): a new theory for designing, implementing, and evaluating feedback in health care based on a systematic review and meta-synthesis of qualitative research. Implement Sci. 2019.
20. Ritchie J, Spencer L. Qualitative data analysis for applied policy research. Anal Qual Data. 2010:173–94.
21. Pawson R, Tilley N. Realistic evaluation. 2007.
22. Pawson R. Evidence-based policy: a realist perspective. London: SAGE Publications; 2006.
23. Brehaut JC, Eva KW. Building theories of knowledge translation interventions: use the entire menu of constructs. Implement Sci. 2012;7:114.
24. Byng R, Norman I, Redfern S. Using realistic evaluation to evaluate a practice-level intervention to improve primary healthcare for patients with long-term mental illness. Evaluation. 2005;11:69–93.
25. Ilgen DR, Fisher CD, Taylor MS. Consequences of individual feedback on behavior in organizations. J Appl Psychol. 1979;64:349–71.
26. Cabana MD, Rand CS, Powe NR, Wu AW, Wilson MH, Abboud PA, et al. Why don’t physicians follow clinical practice guidelines? A framework for improvement. JAMA. 1999;282:1458–65.
27. Cialdini RB. Influence: the psychology of persuasion. New York, NY: HarperCollins Publishers Inc; 1993.
28. Festinger L. A theory of cognitive dissonance. Stanford: Stanford University Press; 1957.
29. Steele CM. The psychology of self-affirmation: sustaining the integrity of the self. Adv Exp Soc Psychol. 1988;21:261–302.
30. Johnson RE, Howe M, Chang C-H (Daisy). The importance of velocity, or why speed may matter more than distance. Organ Psychol Rev. 2013;3:62–85.
31. Festinger L. A theory of social comparison processes. Hum Relations. 1954;7:117–40.
32. Merton RK. Contributions to the theory of reference group behavior. Soc Theory Soc Struct. 1968:279–334.
33. Berkowitz AD. The social norms approach: theory, research, and annotated bibliography. 2004.
34. Dixon-Woods M, Redwood S, Leslie M, Minion J, Martin GP, Coleman JJ. Improving quality and safety of care using technovigilance: an ethnographic case study of secondary use of data from an electronic prescribing and decision support system. Milbank Q. 2013;91:424–54.
35. Guldberg TL, Vedsted P, Lauritzen T, Zoffmann V. Suboptimal quality of type 2 diabetes care discovered through electronic feedback led to increased nurse-GP cooperation. A qualitative study. Prim Care Diabetes. 2010;4:33–9.
36. Yi SG, Wray NP, Jones SL, Bass BL, Nishioka J, Brann S, et al. Surgeon-specific performance reports in general surgery: an observational study of initial implementation and adoption. J Am Coll Surg. 2013;217.
37. Eldh AC, Fredriksson M, Halford C, Wallin L, Dahlström T, Vengberg S, et al. Facilitators and barriers to applying a national quality registry for quality improvement in stroke care. BMC Health Serv Res. 2014;14.
38. Jeffs L, Doran D, Hayes L, Mainville C, VanDeVelde-Coke S, Lamont L, et al. Implementation of the National Nursing Quality Report Initiative in Canada: insights from pilot participants. J Nurs Care Qual. 2015;30:E9–16.
39. Ross JS, Williams L, Damush TM, Matthias M. Physician and other healthcare personnel responses to hospital stroke quality of care performance feedback: a qualitative study. BMJ Qual Saf. 2016;25:441–7.
40. Taylor A, Neuburger J, Walker K, Cromwell D, Groene O. How is feedback from national clinical audits used? Views from English National Health Service trust audit leads. J Health Serv Res Policy. 2016;21:91–100.
41. Lloyd M, Watmough S, O’Brien S, Furlong N, Hardy K. Formalized prescribing error feedback from hospital pharmacists: doctors’ attitudes and opinions. Br J Hosp Med. 2015;76:713–8.
42. Lippert ML, Kousgaard MB, Bjerrum L. General practitioners’ uses and perceptions of voluntary electronic feedback on treatment outcomes – a qualitative study. BMC Fam Pract. 2014;15:193.
43. Wilkinson EK, McColl A, Exworthy M, Roderick P, Smith H, Moore M, et al. Reactions to the use of evidence-based performance indicators in primary care: a qualitative study. Qual Saf Health Care. 2000;9:166–74.
44. Johnston S, Green M, Thille P, Savage C, Roberts L, Russell G, et al. Performance feedback: an exploratory study to examine the acceptability and impact for interdisciplinary primary care teams. BMC Fam Pract. 2011;12.
45. Mannion R, Goddard M. Impact of published clinical outcomes data: case study in NHS hospital trusts. BMJ. 2001;323:260–3.
46. Palmer C, Bycroft J, Healey K, Field A, Ghafel M. Can formal collaborative methodologies improve quality in primary health care in New Zealand? Insights from the EQUIPPED Auckland collaborative. J Prim Health Care. 2012;4:328–36.
47. Vachon B, Désorcy B, Camirand M, Rodrigue J, Quesnel L, Guimond C, et al. Engaging primary care practitioners in quality improvement: making explicit the program theory of an interprofessional education intervention. BMC Health Serv Res. 2013;13.
48. Paskins Z, John H, Hassell A, Rowe I. The perceived advantages and disadvantages of regional audit: a qualitative study. Clin Gov. 2010;15:200–9.
49. Payne VL, Hysong SJ. Model depicting aspects of audit and feedback that impact physicians’ acceptance of clinical performance feedback. BMC Health Serv Res. 2016;16.
50. Eldh AC, Fredriksson M, Vengberg S, Halford C, Wallin L, Dahlström T, et al. Depicting the interplay between organisational tiers in the use of a national quality registry to develop quality of care in Sweden. BMC Health Serv Res. 2015;15.
51. Chadwick LM, Macphail A, Ibrahim JE, Mcauliffe L, Koch S, Wells Y. Senior staff perspectives of a quality indicator program in public sector residential aged care services: a qualitative cross-sectional study in Victoria, Australia. Aust Health Rev. 2016;40:54–62.
52. de Vos MLG, van der Veer SN, Graafmans WC, de Keizer NF, Jager KJ, Westert GP, et al. Process evaluation of a tailored multifaceted feedback program to improve the quality of intensive care by using quality indicators. BMJ Qual Saf. 2013;22:233–41.
53. Wones RG. Failure of low-cost audits with feedback to reduce laboratory test utilization. Med Care. 1987;25:78–82.
54. Søndergaard J, Andersen M, Vach K, Kragstrup J, Maclure M, Gram LF. Detailed postal feedback about prescribing to asthma patients combined with a guideline statement showed no impact: a randomised controlled trial. Eur J Clin Pharmacol. 2002;58:127–32.
55. Seip B, Frich JC, Hoff G. Doctors’ experiences with a quality assurance programme. Clin Gov. 2012;17:297–306.
56. Shepherd N, Meehan TJ, Davidson F, Stedman T. An evaluation of a benchmarking initiative in extended treatment mental health services. Aust Health Rev. 2010;34:328–33.
57. McLellan L, Dornan T, Newton P, Williams SD, Lewis P, Steinke D, et al. Pharmacist-led feedback workshops increase appropriate prescribing of antimicrobials. J Antimicrob Chemother. 2016;71:1415–25.
58. Powell AA, White KM, Partin MR, Halek K, Hysong SJ, Zarling E, et al. More than a score: a qualitative study of ancillary benefits of performance measurement. BMJ Qual Saf. 2014;23:651–8.
59. Boyce MB, Browne JP, Greenhalgh J. Surgeons’ experiences of receiving peer benchmarked feedback using patient-reported outcome measures: a qualitative study. Implement Sci. 2014;9.
60. Schectman JM, Kanwal NK, Scott Schroth W, Elinsky EG. The effect of an education and feedback intervention on group-model and network-model health maintenance organization physician prescribing behavior. Med Care. 1995;33:139–44.
61. Cameron M, Penney G, MacLennan G, McLeer S, Walker A. Impact on maternity professionals of novel approaches to clinical audit feedback. Eval Health Prof. 2007;30:75–95.
62. Søndergaard J, Andersen M, Kragstrup J, Hansen H, Freng Gram L. Why has postal prescriber feedback no substantial impact on general practitioners’ prescribing practice? A qualitative study. Eur J Clin Pharmacol. 2002;58:133–6.
63. Dixon-Woods M, Leslie M, Bion J, Tarrant C. What counts? An ethnographic study of infection data reported to a patient safety program. Milbank Q. 2012;90:548–91.
64. Kiefe CI, Weissman NW, Allison JJ, Farmer R, Weaver M, Dale Williams O. Identifying achievable benchmarks of care: concepts and methodology. Int J Qual Health Care. 1998;10:443–7.
65. Kiefe CI, Allison JJ, Williams OD, Person SD, Weaver MT, Weissman NW. Improving quality improvement using achievable benchmarks for physician feedback: a randomized controlled trial. JAMA. 2001;285:2871–9.
66. Schneider A, Wensing M, Biessecker K, Quinzler R, Kaufmann-Kolle P, Szecsenyi J. Impact of quality circles for improvement of asthma care: results of a randomized controlled trial. J Eval Clin Pract. 2008;14:185–90.
67. Ferguson TB, Peterson ED, Coombs LP, Eiken MC, Carey ML, Grover FL, et al. Use of continuous quality improvement to increase use of process measures in patients undergoing coronary artery bypass graft surgery: a randomized controlled trial. JAMA. 2003;290:49–56.
68. Grando V, Rantz M, Maas M. Nursing home staff’s views on quality improvement interventions: a follow up study. J Gerontol Nurs. 2007;33:40–7.
69. Baker R, Smith JF, Lambert PC. Randomised controlled trial of the effectiveness of feedback in improving test ordering in general practice. Scand J Prim Health Care. 2003;21:219–23.
70. Filardo G, Nicewander D, Herrin J, Edwards J, Galimbertti P, Tietze M, et al. A hospital-randomized controlled trial of a formal quality improvement educational program in rural and small community Texas hospitals: one year results. Int J Qual Health Care. 2009;21:225–32.
71. McFadyen C, Lankshear S, Divaris D, Berry M, Hunter A, Srigley J, et al. Physician level reporting of surgical and pathology performance indicators: a regional study to assess feasibility and impact on quality. Can J Surg. 2015;58:31–40.
72. Kirschner K, Braspenning J, Jacobs JEA, Grol R. Experiences of general practices with a participatory pay-for-performance program: a qualitative study in primary care. Aust J Prim Health. 2013;19:102–6.
73. Groene O, Klazinga N, Kazandjian V, Lombrail P, Bartels P. The World Health Organization Performance Assessment Tool for quality improvement in hospitals (PATH): an analysis of the pilot implementation in 37 hospitals. Int J Qual Health Care. 2008;20:155–61.
74. Veillard JHM, Schiøtz ML, Guisset A-L, Brown AD, Klazinga NS. The PATH project in eight European countries: an evaluation. Int J Health Care Qual Assur. 2013;26:703–13.
75. Foster JM, Hoskins G, Smith B, Lee AJ, Price D, Pinnock H. Practice development plans to improve the primary care management of acute asthma: randomised controlled trial. BMC Fam Pract. 2007;8:23.
76. Rantz MJ, Popejoy L, Petroski GF, Madsen RW, Mehr DR, Zwygart-Stauffacher M, et al. Randomized clinical trial of a quality improvement intervention in nursing homes. Gerontologist. 2001;41:525–38.
77. Morrell C, Harvey G, Kitson A. Practitioner based quality improvement: a review of the Royal College of Nursing’s dynamic standard setting system. Qual Health Care. 1997;6:29–34.
78. Siddiqi K, Volz A, Armas L, Otero L, Ugaz R, Ochoa E, et al. Could clinical audit improve the diagnosis of pulmonary tuberculosis in Cuba, Peru and Bolivia? Trop Med Int Health. 2008;13:566–78.
79. Siddiqi K, Newell J. What were the lessons learned from implementing clinical audit in Latin America? Clin Gov. 2009;14:215–25.
80. Nessim C, Bensimon CM, Hales B, Laflamme C, Fenech D, Smith A. Surgical site infection prevention: a qualitative analysis of an individualized audit and feedback model. J Am Coll Surg. 2012;215:850–7.
81. Gort M, Broekhuis M, Regts G. How teams use indicators for quality improvement – a multiple-case study on the use of multiple indicators in multidisciplinary breast cancer teams. Soc Sci Med. 2013;96:69–77.
82. Jeffs L, Beswick S, Lo J, Lai Y, Chhun A, Campbell H. Insights from staff nurses and managers on unit-specific nursing performance dashboards: a qualitative study. BMJ Qual Saf. 2014;23:1001–6.
83. Grant AM, Guthrie B, Dreischulte T. Developing a complex intervention to improve prescribing safety in primary care: mixed methods feasibility and optimisation pilot study. BMJ Open. 2014;4:e004153.
84. Damschroder LJ, Robinson CH, Francis J, Bentley DR, Krein SL, Rosland AM, et al. Effects of performance measure implementation on clinical manager and provider motivation. J Gen Intern Med. 2014;29:877–84.
85. Simms RA, Ping H, Yelland A, Beringer AJ, Fox R, Draycott TJ. Development of maternity dashboards across a UK health region; current practice, continuing problems. Eur J Obstet Gynecol Reprod Biol. 2013;170:119–24.
86. Kristensen H, Hounsgaard L. Evaluating the impact of audits and feedback as methods for implementation of evidence in stroke rehabilitation. Br J Occup Ther. 2014;77:251–9.
87. Sommers LS, Sholtz R, Shepherd RM, Starkweather DB. Physician involvement in quality assurance. Med Care. 1984;22:1115–38.
88. Goff DC, Gu L, Cantley LK, Sheedy DJ, Cohen SJ. Quality of care for secondary prevention for patients with coronary heart disease: results of the Hastening the Effective Application of Research through Technology (HEART) trial. Am Heart J. 2003;146:1045–51.
89. Fairbrother G, Hanson KL, Friedman S, Butts GC. The impact of physician bonuses, enhanced fees, and feedback on childhood immunization coverage rates. Am J Public Health. 1999;89:171–5.
90. Curran E, Harper P, Loveday H, Gilmour H, Jones S, Benneyan J, et al. Results of a multicentre randomised controlled trial of statistical process control charts and structured diagnostic tools to reduce ward-acquired meticillin-resistant Staphylococcus aureus: the CHART Project. J Hosp Infect. 2008;70:127–35.
91. Reeves D, Doran T, Valderas JM, Kontopantelis E, Trueman P, Sutton M, et al. How to identify when a performance indicator has run its course. BMJ. 2010;340:c1717.
92. Jamtvedt G, Young JM, Kristoffersen DT, O’Brien MA, Oxman AD. Audit and feedback: effects on professional practice and health care outcomes. Cochrane Database Syst Rev. 2006:CD000259.
93. Hysong SJ. Meta-analysis: audit and feedback features impact effectiveness on care quality. Med Care. 2009;47:356–63.
94. Gude WT, van Engen-Verheul MM, van der Veer SN, Kemps HMC, Jaspers MWM, de Keizer NF, et al. Effect of a web-based audit and feedback intervention with outreach visits on the clinical performance of multidisciplinary teams: a cluster-randomized trial in cardiac rehabilitation. Implement Sci. 2016;11:160.
95. Gude WT, van der Veer SN, de Keizer NF, Coiera E, Peek N. Optimizing digital health informatics interventions through unobtrusive quantitative process evaluations. Stud Health Technol Inform. 2016;228:594–8.
96. Gardner B, Whittington C, McAteer J, Eccles MP, Michie S. Using theory to synthesise evidence from behaviour change interventions: the example of audit and feedback. Soc Sci Med. 2010;70:1618–25.
97. Medical audit in general practice. I: Effects on doctors’ clinical behaviour for common childhood conditions. North of England study of standards and performance in general practice. BMJ. 1992;304:1480–4.
98. Nasser M, Oxman AD, Paulsen E, Fedorowicz Z. Local consensus processes: effects on professional practice and health care outcomes. Cochrane Database Syst Rev. 2008;4.
99. Gude WT, Roos-Blom MJ, van der Veer SN, de Jonge E, Peek N, Dongelmans DA, et al. Electronic audit and feedback intervention with action implementation toolbox to improve pain management in intensive care: protocol for a laboratory experiment and cluster randomised trial. Implement Sci. 2017;12:68.
100. Ivers NM, Desveaux L, Presseau J, Reis C, Witteman HO, Taljaard MK, et al. Testing feedback message framing and comparators to address prescribing of high-risk medications in nursing homes: protocol for a pragmatic, factorial, cluster-randomized trial. Implement Sci. 2017;12:86.
101. van der Veer SN, de Keizer NF, Ravelli ACJ, Tenkink S, Jager KJ. Improving quality of care. A systematic review on how medical registries provide information feedback to health care providers. Int J Med Inform. 2010;79:305–23.
102. Hoffmann TC, Glasziou PP, Boutron I, Milne R, Perera R, Moher D, et al. Better reporting of interventions: template for intervention description and replication (TIDieR) checklist and guide. BMJ. 2014;348:g1687.
103. Dowding D, Randell R, Gardner P, Fitzpatrick G, Dykes P, Favela J, et al. Dashboards for improving patient care: review of the literature. Int J Med Inform. 2015;84:87–100.
104. Elouafkaoui P, Young L, Newlands R, Duncan EM, Elders A, Clarkson JE, et al. An audit and feedback intervention for reducing antibiotic prescribing in general dental practice: the RAPiD cluster randomised controlled trial. PLoS Med. 2016;13:e1002115.



Funding

This research was supported by NIHR Greater Manchester Patient Safety Translational Research Centre and NIHR Manchester Biomedical Research Centre. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR, or the Department of Health and Social Care.

Availability of data and materials

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Author information




All authors contributed to the study conception and participated in critically appraising and revising the intellectual content of the manuscript. WG was primarily and BB secondarily responsible for the data extraction and the manuscript draft. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Wouter T. Gude.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Additional files

Additional file 1:

Identifying behaviour change theories (DOCX 175 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.



Cite this article

Gude, W.T., Brown, B., van der Veer, S.N. et al. Clinical performance comparators in audit and feedback: a review of theory and evidence. Implementation Sci 14, 39 (2019).



Keywords

  • Benchmarking
  • Medical audit
  • Feedback
  • Quality improvement