  • Systematic review
  • Open access

The effectiveness and acceptability of evidence synthesis summary formats for clinical guideline development groups: a mixed-methods systematic review



Background

Clinical guideline development often involves a rigorous synthesis of evidence involving multidisciplinary stakeholders with different priorities and knowledge of evidence synthesis; this makes communicating findings complex. Summary formats are typically used to communicate the results of evidence syntheses; however, there is little consensus on which formats are most effective and acceptable for different stakeholders.


Methods

This mixed-methods systematic review (MMSR) aimed to evaluate the effectiveness and acceptability (e.g. preferences for and attitudes towards) of evidence synthesis summary formats for guideline development group (GDG) members. We followed the PRISMA 2020 guideline and the Joanna Briggs Institute Manual for Evidence Synthesis for MMSRs. We searched six databases (inception to April 20, 2021) for randomised controlled trials (RCTs), RCTs with a qualitative component, and qualitative studies. Screening, data extraction, and quality appraisal were performed in duplicate. Qualitative findings were synthesised using meta-aggregation, and quantitative findings were described narratively.


Results

We identified 17,240 citations and screened 54 full-text articles, resulting in 22 eligible articles (20 unique studies): 4 articles reported the results of 5 RCTs, one of which also had a qualitative component, and the other 18 articles reported the results of 16 qualitative studies. Data were therefore extracted from 5 trials and 17 qualitative studies. Studies were geographically heterogeneous and included a variety of stakeholders and summary formats. All 5 RCTs assessed knowledge or understanding, with 3 reporting improvement with newer formats. The qualitative analysis identified 6 categories of recommendations: ‘presenting information’, ‘tailoring information’ for end users, ‘trust in producers and summary’, ‘knowledge required’ to understand findings, ‘quality of evidence’, and properly ‘contextualising information’. Across these categories, the synthesis resulted in 126 recommendations for practice. Nine recommendations were supported by both quantitative and qualitative evidence and 116 by qualitative evidence only. A majority focused on how to present information (n = 64) and tailor content for different end users (n = 24).


Conclusions

This MMSR provides guidance on how to improve evidence summary structure and layout, which evidence synthesis producers can use to communicate more effectively with GDGs. The study findings will inform the co-creation of evidence summary format prototypes based on GDG members’ needs.

Trial registration

The protocol for this project was previously published, and the project was preregistered on Open Science Framework (Clyne and Sharp, Evidence synthesis and translation of findings for national clinical guideline development: addressing the needs and preferences of guideline development groups, 2021; Sharp and Clyne, Evidence synthesis summary formats for decision-makers and Clinical Guideline Development Groups: A mixed-methods systematic review protocol, 2021).



Background

Clinical guidelines are an important tool for the practice of evidence-based medicine. Often involving rigorous syntheses of the best available evidence, clinical guidelines (CGs) aim to improve healthcare in a cost-effective manner by assisting decision-making for clinicians and policymakers [1,2,3]. Guideline development groups (GDGs) comprise multidisciplinary decision-makers such as healthcare professionals, methodologists, and patient representatives, who engage in a guideline development process that may involve formal consensus methods amongst these stakeholders. Research on group decision-making within the guideline context indicates that these stakeholders have different priorities and understandings of knowledge and research evidence [4,5,6].

In creating guidelines, GDGs need to consider evidentiary factors (such as quality, quantity, and consistency) alongside complex trade-offs between competing benefits and harms, side effects, and risks of various disease management options [7]. The methodological expertise and research knowledge of a GDG can thus influence the quality of a guideline [8] and therefore guideline uptake. Evidence syntheses, such as systematic reviews, may be infrequently used by healthcare managers and policymakers due to intrinsic factors such as format and content and extrinsic factors such as a lack of awareness and of the skills to seek, appraise, and interpret systematic reviews [9, 10]. For patients involved in guideline development, the strong focus on research evidence can hinder active participation in discussions [11]. Review or evidence synthesis summaries have been proposed as a way to improve the uptake and usefulness of evidence syntheses for decision-makers [9, 10].

Evidence synthesis summaries come in a variety of different formats such as one-page plain language reports, policy briefs, summary of findings tables, visual abstracts or infographics, and more. While summaries may be more easily understandable than complete systematic reviews [12, 13], review summaries are often too long and complex and may require additional work to effectively ‘translate’ the evidence for policymakers [14]. Given the different priorities and knowledge bases of GDG members [4,5,6], it is reasonable that different stakeholders would have preferences for different formats. Accordingly, research has shown that there is no clear consensus on the most effective way to communicate to all members [12, 13].

It is critical to identify the best summary formats to ensure the best possible communication within multidisciplinary GDGs as they interpret evidence syntheses and develop clinical guidelines to support evidence-based decision-making [15]. This study aimed to evaluate the effectiveness and acceptability (e.g. preferences for and attitudes towards) of different evidence synthesis summary formats amongst GDG members. The objectives were as follows: (1) how and to what degree do different summary formats (digital, visual, audio) for presenting evidence synthesis findings impact the end user’s understanding of the review findings? and (2) what are the end users’ preferences for and attitudes towards these formats? To support a multifaceted view of the guideline development process, we conducted a mixed-methods systematic review (MMSR), as this method offers a deeper understanding of findings, more easily identifies discrepancies in the evidence, and is more useful for decision-makers [16, 17]. The MMSR approach also allows one to examine different aspects of a particular phenomenon — i.e. the effects that summary formats may have on knowledge or decision-making and how acceptable these formats are to users [18].


Methods

We conducted an MMSR according to a preregistered and published protocol [19, 20], following the guidance of the Joanna Briggs Institute (JBI) Manual for Evidence Synthesis, using a convergent segregated approach [17], and the PRISMA 2020 checklist (Additional file 1) [21].

Study designs and eligibility criteria

Studies were eligible if they were randomised controlled trials (RCTs) comparing alternative summary formats for evidence syntheses, RCTs with a supplemental qualitative component, or qualitative studies such as focus groups, interviews, or open-ended surveys. Per our protocol, we restricted inclusion to these study designs because we chose to focus on the performance and impact of summary formats in optimal settings, and RCTs are the most appropriate design to evaluate effectiveness [20]. We did not include observational studies, as confounding was likely to be extensive given the complexity of the stakeholders, evidence synthesis types, and summary formats involved.

Eligible participants were those who could be involved in clinical guideline development groups (e.g. healthcare professionals, policymakers, patient representatives, researchers, methodologists), and eligible outcomes related to the effectiveness and acceptability (e.g. views and preferences) of summary formats. We excluded studies involving students, journalists, or the general public, as communication to these populations is more complex. Members of the general public were included if they were patient representatives involved in a guideline development group. The use of evidence synthesis summary formats to inform clinicians’ and patients’ decision-making regarding individual care was not the focus of this review [20].

Search strategy and study selection

We searched six databases from inception to April 20, 2021 (Additional file 2): Ovid MEDLINE (Medical Literature Analysis and Retrieval System Online), Embase, APA (American Psychological Association) PsycINFO, CINAHL (Cumulative Index to Nursing and Allied Health Literature), Web of Science, and the Cochrane Library. The search strategy was purposefully sensitive rather than specific. All titles, abstracts, and full texts were independently double screened (DAB, BC, JQ, MKS, BT) using Covidence [22]. Disagreements were discussed between the two lead reviewers (BC, MKS) until consensus was achieved. The complete list of eligible articles and of potentially relevant studies with exclusion justifications is available on the project’s OSF page [19]. We used the citationchaser Shiny application to perform backwards citation identification [23, 24]. One reviewer (MKS) manually screened citations that the application was unable to include (e.g. reports without a DOI).

Data extraction and appraisal of studies

The data extraction form was piloted by two reviewers (MKS, DAB) on one article, required changes were discussed, and the final data extraction was performed using this form and the TIDieR (Template for Intervention Description and Replication) checklist [25]. Study quality was assessed using the JBI Critical Appraisal Checklist for Qualitative Research and the JBI Checklist for RCTs, as appropriate [26]. An assessment of the overall certainty of evidence using the GRADE or ConQual approach is not recommended [17, 27] for JBI MMSRs because the data from the separate quantitative and qualitative evidence streams are transformed and integrated. All data extraction was performed independently in duplicate (DAB, BC, JQ, MKS). Disagreements were discussed with the lead author (MKS) and resolved by consensus. The data extraction forms are available on OSF [19].

Analysis and synthesis of findings

As an insufficient number of quantitative studies was included, we were unable to perform a meta-analysis or, as planned, the Harbord test for publication bias [28], Egger’s test [29], or assessments of statistical heterogeneity [30]. As established in our protocol [20], since we could not perform a meta-analysis, a narrative synthesis was performed instead.

Qualitative findings were synthesised using the pragmatic meta-aggregation approach, which allows reviewers to present the findings of included studies as intended by the original authors [31, 32]. Meta-aggregation seeks to enable generalisable statements in the form of recommendations to guide practitioners and policymakers. Findings (defined as verbatim extracts of the authors’ analytical interpretation of the results or data) from the “Results” sections of manuscripts with accompanying illustrations (direct quotations or statements from participants) were coded as ‘unequivocal’. Findings with no illustration, or with an illustration lacking a clear association, were ‘equivocal/credible’. Findings that were not supported by the data were ‘unsupported’. Interpretations of the study results given by the study authors were not coded, to avoid interfering with the transformation and integration process when combining the quantitative and qualitative evidence in an MMSR [31, 33].

NVivo 12 was used to analyse the results from the primary qualitative studies and accompanying illustrations [34]. One author (MKS) performed the initial line-by-line coding of equivocal, unequivocal, and unsupported findings, which was checked by a second reviewer (BC) [17, 35]. MKS is a mixed-methods researcher with a background in psychoepidemiology and metaresearch, whereas BC is a health services researcher with extensive experience in evidence synthesis and in working with guideline development groups. These findings were then synthesised into categories based on similarity in meaning. Categories were proposed by MKS, reviewed by BC, and refined through discussion. All findings were double coded to categories by both reviewers, and MKS distilled the findings into actionable recommendations for practice, which were then reviewed by BC. As recommended by JBI, we did not differentiate between equivocal and unequivocal findings when aggregating them into categories. These coding steps are detailed in Fig. 1, and an example of the late-stage synthesis steps is shown in Fig. 2.

Fig. 1 Mixed Methods Synthesis Steps and Results

Fig. 2 Qualitative Synthesis Example

To synthesise findings from both the qualitative and quantitative evidence, we followed the JBI guidance for MMSRs and used a convergent segregated approach: we conducted separate quantitative and qualitative syntheses and then integrated the findings of each [17, 36]. We juxtaposed the synthesised quantitative and qualitative findings and then organised the linked findings into a single line of reasoning to produce an overall configured analysis [18]. This integration process identifies areas of convergence, inconsistency, or contradiction [37]. The final table of recommendations was agreed upon through discussion by the entire multidisciplinary author team. Since overall assessments of the certainty of evidence using the GRADE or ConQual approach are not recommended for MMSRs, we created a cutpoint (support from ≥ 3 evidence streams) as a blunt proxy for the level of evidence, to produce a more usable set of recommendations.


Results

Search results

After deduplication of identified records, we screened 17,240 titles and abstracts, the majority of which were excluded (n = 17,185). This yield rate is slightly lower than previous estimates, likely due to the breadth of stakeholders, summary formats, and outcomes of interest [38, 39]. We reviewed 54 full-text articles and identified 22 articles for inclusion, all of which underwent backwards citation screening (Fig. 3). The search strategy output and the files detailing reasons for inclusion/exclusion are available on OSF [19]. Of note, many studies had multiple phases or participant groups. We included such a study if we could clearly separate the methods and results for the eligible phase and/or group and, where possible, extracted information only from that phase/group.

Fig. 3 PRISMA Flow Diagram

Characteristics of included studies

Our final sample included 22 full-text articles representing 20 unique studies: 16 qualitative studies, 4 RCTs, and 1 mixed-methods RCT with a qualitative component (Tables 1 and 2), involving 908 total participants from a variety of stakeholder groups (Table 1). Many studies involved a multidisciplinary mix of participants such as researchers, health professionals, and policymakers [40, 41, 43,44,45, 47,48,49,50, 54,55,56, 59,60,61], although some had homogenous groups of clinicians [51, 52, 57] or decision-makers [42, 46, 53]. Most of the evidence syntheses summarised were systematic reviews, but one study related specifically to network meta-analyses (NMA), one to diagnostic test accuracy (DTA) reviews, and one to updating reviews. Seven studies involved an international mix of participants [42, 48, 53, 54, 58, 60, 61], five were from Canada [43, 46, 47, 51, 52], three from the USA [44, 45, 49, 55, 56], two from Croatia [41, 59], two from England [40, 57], and one from Kenya [50]. Most were funded by national agencies [41,42,43, 45,46,47, 49, 51, 52, 55, 56, 59] such as the Canadian Institutes of Health Research [43, 47, 51, 52] or the Agency for Healthcare Research and Quality [45, 46, 49, 55, 56].

Table 1 Included qualitative studies
Table 2 Included randomised controlled trials

The TIDieR checklist was used to gather the intervention data detailed in Tables 1, 2, and 3. The majority of included qualitative studies conducted either focus groups [41, 43, 49, 51, 52] or one-on-one semi-structured interviews [40, 42, 44,45,46,47, 50, 53,54,55,56,57,58, 62] (Table 1). RCTs were conducted either via an online survey [59, 60] or through in-person workshops (Tables 2 and 3) [50, 61]. A wide variety of summary formats was tested, including de novo summary prototypes [43, 46, 47, 49,50,51,52,53, 57], Grading of Recommendations, Assessment, Development and Evaluation (GRADE) Summary of Findings (SoF) evidence tables [42, 48, 50, 54, 58], MAGICapp [55, 56], Tableau [55, 56], evidence flowers [40], plain language summaries [41], and infographics [41]. Summary formats covered a wide variety of clinical topics (Tables 1 and 2).

Table 3 Description of interventions in randomised controlled trials

Quality appraisal

We found the quality of reporting for the qualitative studies to be quite poor (Additional file 3). The main weaknesses across these studies included not providing information on philosophical perspectives (11/17) [40, 41, 43,44,45,46,47, 49,50,51, 53, 55, 56], not locating the researcher culturally or theoretically (15/17) [40,41,42, 46,47,48,49,50,51,52,53,54, 56,57,58], and not addressing the influence of the researcher on the research (15/17) [40,41,42, 44,45,46,47,48,49,50,51,52,53,54,55,56, 58]. Several interview or focus group studies also did not provide clear direct quotes from participants (6/17) [43, 47, 49, 51, 55, 56, 62]. On the other hand, the four quantitative articles were mostly reported clearly, with a low risk of bias [50, 59,60,61]. The main weakness related to the descriptions of blinding of treatment assignment for outcome assessors and those delivering treatment (2/4) [50, 61].

Quantitative analysis

The summary formats tested across the five included RCTs (reported across four papers) are described in detail in Table 3. Four RCTs compared alternative versions of SoF tables against a format in current practice and/or a standard systematic review [50, 60, 61], and one study compared an infographic to a plain language summary (PLS) and a scientific abstract (SA) [59]. Studies were largely multidisciplinary, and results were not presented by stakeholder group. An exception was the study by Buljan et al. (2018), which conducted separate trials with patient representatives (‘consumers’) and doctors. There were no differences between the groups in knowledge scores for either the PLS or the infographic format. However, patient representatives reported lower satisfaction (user-friendliness) and a poorer reading experience with both formats when compared to doctors. As the quantitative studies used a variety of scales and summary formats, we could only summarise results narratively.

In preparation for the mixed-methods synthesis, we identified 74 individual findings from the quantitative studies (Additional file 4) and synthesised these into four main areas relating to the review outcomes of Knowledge/Understanding, Satisfaction/Reading Experience, Accessibility/Ease of Use, and Preference (Fig. 1). These individual findings helped identify areas of convergence, inconsistency, or contradiction with the qualitative findings and recommendations described later.

Knowledge or understanding

All five RCTs assessed knowledge or understanding as an outcome (Table 4). No studies employed standardised measures, choosing instead to use study-specific questions. Two articles, reporting the results of three studies, found that the new format improved knowledge or understanding [60, 61]. Carrasco-Labra et al. reported that, compared to a standard SoF table, a new format of SoF table with seven alternative items improved understanding [60]. Of the seven items testing understanding, three showed similar results, two showed small differences favouring the new format, and two (understanding risk difference and the quality of the evidence associated with a treatment effect) showed large differences favouring the new format [63% (95% CI: 55, 71) and 62% (95% CI: 52, 71) more correct answers, respectively]. In two small RCTs, Rosenbaum et al. found that the inclusion of an SoF table in a review improved understanding and the rapid retrieval of key findings compared to reviews with no SoF table [61]. In the second RCT, there were large differences in the proportions that correctly answered questions about risk in the control group (44% vs. 93%, P = 0.003) and risk in the intervention group (11% vs. 87%, P < 0.001). Two studies reported no significant differences between formats in knowledge or understanding [50, 59].

Table 4 Quantitative results

Ease of use/accessibility

All five RCTs provided some assessment of ease of use and accessibility, measured in a variety of ways (Table 4). Buljan et al. reported that user-friendliness was higher for an infographic compared to a PLS for both doctors and patient representatives [patients’ median infographic score: 30.0 (95% CI: 25.5–34.5) vs. PLS: 21.0 (19.0–25.0); doctors’ median infographic score: 36.0 (30.9–40.0) vs. PLS: 29.0 (26.8–36.2)] [59], while Carrasco-Labra et al. reported that, in six out of seven domains, participants rated the information in the alternative SoF table as more accessible overall (MD 0.3, SE 0.11, P = 0.001) [60]. Opiyo et al.’s graded-entry SoF formats were associated with a higher mean composite score for the clarity and accessibility of information about the quality of evidence (adjusted mean difference 0.52, 95% CI: 0.06 to 0.99) [50]. In two small RCTs, Rosenbaum et al. found that participants with the SoF format were more likely to respond that the main findings were accessible [61]. The second RCT demonstrated that, in general, participants with the SoF format spent less time finding answers to key questions than those without.


Satisfaction/reading experience

Two studies assessed satisfaction (Table 4). Buljan et al. reported that both patients and doctors rated an infographic better for reading experience than a PLS, even though it did not improve knowledge [patients’ median infographic score: 33.0 (95% CI: 28.0–36.0) vs. PLS: 22.5 (19.0–27.4); doctors’ median infographic score: 37.0 (26.8–41.3) vs. PLS: 24.0 (21.3–27.2)] [59]. Carrasco-Labra et al. reported that participants were more satisfied with the new format of SoF tables (5/6 questions where the largest proportion was in favour of the alternative SoF tables) [60].


Preference

Two studies assessed user preference (Table 4). Carrasco-Labra et al. reported that participants consistently preferred the new format of SoF tables (MD 2.8, SD 1.6) [60]. Similarly, Rosenbaum et al. reported that participants overall preferred the alternative (or new) format of SoF tables compared to the current formats (MD/SD: 2.8/1.6) [61].

Qualitative analysis

From the 16 qualitative studies and 1 RCT with a supplemental qualitative component, line-by-line coding identified 542 equivocal and unequivocal findings within the “Results” sections of the articles. No unsupported findings were identified (Fig. 1). We further synthesised these initial 542 findings into 393 findings across 6 categories, defined as follows (Fig. 4):

  1. Presenting information (comments on the content, structure, and style of the summary format)

  2. Tailoring information (inherently linked to the presentation of information but more focused on accommodating end users’ different learning styles, backgrounds, and needs to appropriately tailor content)

  3. Contextualising findings (properly framing the findings themselves within the relevant context by providing information such as setting, cost constraints, and the ability to implement findings)

  4. Trust in producers and summary (end users’ perceptions of credibility markers of the work as a whole — such as transparency, funding sources, and clear references — i.e. that the work was rigorously done by qualified individuals)

  5. Quality of evidence (focused on the assessment of study quality and the totality of the evidence, including how assessments were reached and information about rating)

  6. Knowledge required to understand findings (educational information that should be added to summaries due to comprehension difficulties or gaps in end users’ knowledge base)

Fig. 4 Categories of Recommendations

These 393 synthesised findings were then reviewed again by two authors (MKS, BC) to produce 126 recommendations for practice, which, where possible, are presented by targeted GDG member (Additional files 5 and 6) and by specific type of evidence synthesis, such as NMA (n = 22), DTA reviews (n = 2), and updating reviews (n = 8). A total of 94 recommendations could apply broadly across types of evidence synthesis. As previously mentioned, most studies contained diverse multidisciplinary participants. When quotes from participants were reported, they were often not attributed to a specific stakeholder, and several studies included no direct quotes from participants at all. However, where possible, recommendations are presented according to group membership (noted by superscripts). The 126 individual recommendations from the qualitative synthesis are available in Additional file 5, alongside the citation(s) supporting each, whether it also had mixed-methods support, and which end user may have expressed it.

A majority of recommendations related to Presenting Information (n = 64) or Tailoring Information for the end user (n = 24). For example, items under the Presenting Information category include ‘use bullet points’, ‘flag important information by bolding/highlighting’, use ‘greyscale-friendly colours’, and ‘avoid abbreviations’. Tailoring Information included guidance on how to create bespoke customised documents with ‘easily extractable information to forward to colleagues’ and the importance of ‘clarifying the audience’ that the report is for and about. Several items regarding the presentation of numerical and statistical findings were identified across categories. For example, under Presenting Information, it was suggested to ‘use absolute numbers, not probabilities’ and to ‘decrease numeric/statistical data’, whereas the Contextualising Findings category suggested ‘interpretation aids for statistics’ and noted that policy/decision-makers are ‘not interested in methodology’. The Knowledge Required category highlighted the lack of awareness of abbreviations, recommending to ‘avoid abbreviations (e.g. RR for relative risk, CI for confidence intervals)’ altogether. Some of these items are intrinsically linked, as the Knowledge Required recommendations highlighted that certain items like ‘forest plots are difficult to understand’ for readers, so providing ‘interpretation of statistical results’ and ‘defining statistical terms’ can be helpful.

Mixed-methods synthesis

The four outcome areas for the quantitative evidence (e.g. Knowledge, Satisfaction) were also covered by the qualitative evidence. However, due to the large heterogeneity in stakeholders, formats, and assessment methods, it was difficult to determine whether the qualitative evidence helped explain differences in the size or direction of effects in the quantitative studies.

From the 74 individual quantitative findings (Additional file 4), we identified 17 that converged with at least one of the 126 qualitative recommendations (Additional file 5). Some of these 17 items supported the same recommendation (e.g. several findings supported the use of summary of findings tables), so in total these 17 quantitative findings supported 9 qualitative recommendations. Some of these items are inherently linked, as SoF tables (4) often use the GRADE rating scale (8); similarly, the items about assessments of quality (7 and 9) likely refer to GRADE as well. The 9 recommendations with mixed-methods support are marked with an asterisk in Figs. 6, 7, and 8 (Additional file 6) and include providing a clear summary report that:

  1. Is structured

  2. Is brief

  3. Provides information on the standard steps and nature of the review

  4. Presents results in summary of findings (SoF) tables

  5. Defines statistical terms

  6. Provides interpretations of statistical results

  7. Includes assessments of quality

  8. Describes the rating scale (GRADE)

  9. Describes how the authors arrived at their assessments of quality

Throughout our recommendations, there are items which may appear at face value to be contradictory. However, they simply accommodate different learning styles (e.g. ‘use summary of findings tables’ and ‘use narrative summaries’); thus, these are considered complementary. Relatedly, some items expressed by different groups echoed the end users’ different needs. For example, the ‘Abstract Methods Results and Discussion (AMRaD) format’ was advocated by clinicians, whereas ‘avoid academic formatting’ was expressed by policy/decision-makers. Additionally, some items are similar but were expressed for very different purposes — for example, ‘including author’s names’ appears in both the Presenting Information and the Trust in Producers and Summary categories, as some participants flagged this as a clear indicator of their trust in the quality of the work, whereas others simply wanted the information for general factual transparency purposes (Additional file 6: Figs. 6, 7, 8).

As an overall aim of an MMSR is to provide actionable recommendations, and in an effort to strike a balance between the 9 recommendations with mixed-methods support and the 94 broadly applicable recommendations from the qualitative literature, we reviewed all recommendations (Additional file 5) and took a pragmatic approach, narrowing the list down to those with three or more supporting studies (or mixed-methods support) (Additional file 7). Using this approach, there were the aforementioned 9 recommendations with mixed-methods support and 20 recommendations with supporting evidence from three or more studies (Fig. 5). Most of these recommendations were from the Presenting Information category (n = 12), e.g. ‘give publication date’, ‘use bullet points’, and ‘detail key messages’. Three focused on contextualising information (e.g. ‘framed within local context’, ‘effective intervention details to help implementation’), two on Trust in Producers and Summary (e.g. ‘put logos on first page’, ‘include author’s names’), one was from the Knowledge Required category (‘avoid field-specific or technical jargon’), and one from the Tailoring Information category (‘choice and control over the amount of detail received’).
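As a rough illustration of the pragmatic cutpoint described above, the selection rule (retain a recommendation if it has mixed-methods support or is backed by three or more studies) can be sketched in a few lines of Python. This is a hypothetical sketch: the field names and toy records below are illustrative only, not the authors’ actual extraction format.

```python
# Hypothetical sketch of the pragmatic selection rule: keep a
# recommendation if it has mixed-methods support OR is supported
# by at least `min_studies` studies. Field names are illustrative.

def select_recommendations(recommendations, min_studies=3):
    """Keep recommendations with mixed-methods support or >= min_studies backers."""
    return [
        r for r in recommendations
        if r["mixed_methods_support"] or len(r["supporting_studies"]) >= min_studies
    ]

# Toy example records (not real extracted data):
example = [
    {"text": "use bullet points", "mixed_methods_support": False,
     "supporting_studies": ["study A", "study B", "study C"]},
    {"text": "is brief", "mixed_methods_support": True,
     "supporting_studies": ["study A"]},
    {"text": "use icons", "mixed_methods_support": False,
     "supporting_studies": ["study A"]},
]

selected = select_recommendations(example)
# 'use icons' is dropped: no mixed-methods support and only one supporting study.
```

Applied to the review's actual dataset, this kind of filter yields the shortlist reported above (9 mixed-methods-supported plus 20 multiply supported recommendations).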

Fig. 5 Recommendations with Mixed Methods or at least 3 supporting evidence streams


Discussion

This mixed-methods systematic review synthesised the evidence on the effectiveness and acceptability of different evidence synthesis summary formats. The quantitative results suggest that alternative versions of SoF tables improved knowledge or understanding compared to a current format and/or a standard systematic review. However, assessments of study quality revealed that half of the included trials had poor reporting related to the blinding of outcome assessors and those delivering treatment. There was insufficient evidence to establish a ‘gold-standard’ summary format amongst end users; however, the qualitative studies offered a wealth of data such that we could synthesise findings into 126 actionable recommendations across six thematic areas. Thirty-two of the 126 recommendations were for specific types of reviews (e.g. NMA, DTA, and updating reviews). Ninety-four items could be applied broadly to a variety of evidence synthesis types, and nine had mixed-methods support. A further 21 of the actionable recommendations were also supported by at least three different studies, a proxy measure adopted to indicate items with a larger evidence base. These 30 recommendations can be used to promote more effective communication with different stakeholders. To help with potential implementation, we also delineated findings by review type and stakeholder group where possible, as there was some evidence that end users had different preferences.

The interventions included in our review were diverse, with a variety of outcome measures. The majority of studies tested de novo summary prototypes, making it difficult to draw comparisons. However, five studies assessed GRADE SoF tables, and a substantial portion of our recommendations pertain to summary of findings tables and GRADE ratings. Indeed, there were enough findings concerning the quality assessment of studies and the use of the GRADE scale to warrant a dedicated category, ‘Quality of Evidence’, in the final recommendations. Previous work on US National Guidelines Clearinghouse clinical practice guidelines published between 2011 and 2018 found that the GRADE scale was used inconsistently, and only 1 in 10 (7/67, 10.4%) guidelines explicitly reported considering all criteria for assessing the certainty of the evidence [63]. As reflected in three of our nine recommendations with mixed-methods support, GRADE is an important factor in evidence summary formats. Recent work has highlighted that there are many improvements to be made in the consistency of presenting GRADE symbols and explaining the recommendations [64]. This aligns with seven articles in our review which supported the need to be explicit about how the scale is used, recommending to ‘provide distinct explanations of rating scale (GRADE)’. Four studies also supported detailing ‘how authors arrived at assessments of quality’ (Additional file 5). Many of the included interventions were in a traditional academic style, in that they were largely text based. Accordingly, numerous recommendations addressed how to ‘flag important’ and ‘avoid dense’ information through ‘structured’, ‘brief’, and ‘concise’ formats with ‘prominent subheadings’.
Many recommendations, such as ‘including quality assessments of evidence/study quality’, ‘provide distinct explanations of rating scale’, ‘choice and control over the amount of detail received’, and ‘structured’ information with ‘intervention details to help implementation’, also align with several items on the dissemination checklist for Cochrane reviews [65].

The need for structured presentation of information is also supported by previous work. Brandt et al. found that 181 internal medicine and general practice physicians had a clear preference for multilayered guideline presentation formats [66]. Short menu formats and visual aids have been shown to improve performance when participants are presented with both conditional probability and natural frequency formats [67]. One study found that, across different levels of objective numeracy and education, fact boxes (i.e. simple tabular messages) were more engaging than normal text; they also led to greater comprehension and slightly better knowledge recall after 6 weeks compared with the same information presented as text [15].

Apart from MAGICApp and Tableau, no interactive summary formats were identified in our review, nor did we identify any studies using audio-visual strategies such as podcasts or videos. There is some evidence that video abstracts are more effective than graphical and traditional abstracts for comprehension, understanding, and reading experience [68]. Audio summaries also show promising results: university staff who listened to a podcast summary of a Cochrane review had the highest rates of comprehension compared with those who read a plain language summary or abstract [69]. Future research should explore and test these formats with GDG members.

Many general tenets were supported by multiple studies involving multidisciplinary stakeholders. For example, concerns about the presentation of numerical and statistical results generated recommendations across several of our categories. Similar to our findings, Cochrane’s Plain Language Expectations for Authors of Cochrane Summaries (PLEACS) standards recommend presenting numerical information in terms of absolute effects and as natural frequencies [70]. A 2017 meta-analysis also supported the use of natural frequencies: correct-interpretation rates rose to 24% when information was presented as natural frequencies, compared with only 4% for a probability format, although three-quarters of participants still failed to obtain the correct solution with either presentation [67]. On the other hand, a 2020 study by Buljan et al. found that numerical presentation (and framing) had no effect on consumers’ and biomedical students’ understanding of health information in plain language summaries [71]. Previous research established that the literacy required for even plain language summaries (over 10 to 15 years of education) is higher than the recommended US 6th-grade (11 or 12 years old) reading level [72]. All of this prior work reinforces the idea that effective interaction with evidence synthesis summaries requires certain baseline knowledge. This review has provided specific knowledge areas to address, as detailed in the Knowledge Required category (e.g. the need to define terms; explain methodologies, grading scales, and statistics; and generally provide a supplemental explanation sheet to end users). Initiatives such as the International Guideline Development Credentialing and Certification Program (INGUIDE) [73] may also help address some of these knowledge needs by ensuring that guideline development group members have the necessary competencies.
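The natural-frequency finding above lends itself to a short worked illustration. The sketch below (using the classic hypothetical screening-test numbers common in the risk-communication literature, not figures from this review) shows that the two framings are mathematically identical; the cited evidence concerns only how well readers interpret each framing:

```python
# Worked example of probability vs natural-frequency framing.
# The prevalence/accuracy figures are classic illustrative numbers
# from the risk-communication literature, NOT data from this review.

def ppv_from_probabilities(prevalence: float, sensitivity: float,
                           false_positive_rate: float) -> float:
    """Positive predictive value via Bayes' theorem, probability framing."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * false_positive_rate
    return true_pos / (true_pos + false_pos)

def ppv_from_natural_frequencies(population: int, prevalence: float,
                                 sensitivity: float,
                                 false_positive_rate: float) -> float:
    """Same calculation, but expressed as counts of people."""
    with_condition = population * prevalence                         # 10 of 1,000
    true_pos = with_condition * sensitivity                          # 8 test positive
    false_pos = (population - with_condition) * false_positive_rate  # ~95 false alarms
    return true_pos / (true_pos + false_pos)                         # 8 / (8 + 95)

prob = ppv_from_probabilities(0.01, 0.80, 0.096)
freq = ppv_from_natural_frequencies(1000, 0.01, 0.80, 0.096)
assert abs(prob - freq) < 1e-9   # identical result from both framings
print(f"{prob:.1%}")             # prints 7.8%
```

The count-based version makes the relevant comparison visible at a glance (8 true positives against roughly 95 false positives), which is the usual explanation for why natural frequencies improve interpretation.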

Our recommendations are proposals for consideration, not strict rules for practice, especially as the evidence base supporting many of them is weak and not all may be practical for resource-limited teams. The nine recommendations with mixed-methods support could be considered essential for any summary format producer, while the additional 20 items supported by three or more evidence streams are desirable considerations. However, the included studies on which these recommendations are based often did not discuss the time or resources required to actually produce the summary format(s), which could make implementation difficult. For example, the inclusion of certain items, particularly those related to ‘contextualising findings’, may require additional work or expertise that some may consider outside the scope of a typical review [53]. Nonetheless, these suggestions should not be ignored: research has shown that context is rarely provided in sufficient detail in existing reviews and guidelines [74], and applying evidence synthesis findings to local contexts is a major weakness reported by some health technology assessment (HTA) units trying to promote healthcare decision-making [75].

The strengths of this study include the mixed-methods approach and an extensive search strategy. However, our study has several limitations. Firstly, we did not include observational studies, although during screening we excluded only a few studies on the basis of study design (Fig. 3) [76]. The main limitations of our findings relate to the completeness of reporting in the included studies. Several articles did not provide a copy of, or access to, the summary format(s) tested, so it was sometimes difficult to properly contextualise their results. Additionally, it was often difficult to attribute a finding to a specific stakeholder group, as included studies often did not report the group membership of the participants quoted. This meant that many of our recommendations are non-specific, as we were unable to fully decipher what works for whom and under which circumstances. Stakeholders involved in guideline development have different styles of reasoning and knowledge bases to draw from [6]; therefore, drawing stakeholder-specific conclusions is complex. Even within one group (e.g. patient representatives), one size does not fit all when presenting recommendations [77]. We therefore recommend that future work with multidisciplinary stakeholders denote group membership when reporting quotes from participants, as this was a deficit in our included studies. For example, while there is some reporting guidance on what public or patient versions of clinical guidelines should include [78], a step in the process is still missing: it remains unclear what works best for patient representatives involved in clinical guideline development groups. Lastly, we excluded studies in the general population and students; studies have shown that plain language summaries improved understanding in these populations [79, 80].


Our results provide valuable information that can be used to improve existing formats and inform future research aimed at developing more effective evidence synthesis summary formats. The nine recommendations with mixed-methods support can be considered essential for any summary format producer. The additional 20 items, each supported by three or more evidence streams, can be considered desirable, with further exploration needed into the full set of 126 items. Future research should test these proposed recommendations amongst the different guideline development group members to determine which items are particularly important for which stakeholder. Our research team plans to conduct a prioritisation exercise for these recommendations so that we can use them as guidance for focus group workshops with GDG members. Furthermore, other mediums of summary formats not identified in this review, such as podcasts or video abstracts and summaries, could be explored.

Availability of data and materials

The study was previously preregistered on the Open Science Framework, and the protocol was published in HRB Open Research [19, 76]. The datasets generated and/or analysed during the current study are available on Open Science Framework (OSF). Data are available under the terms of the Creative Commons Attribution 4.0 International license (CC-BY 4.0).


  1. Eccles MP, Grimshaw JM, Shekelle P, Schünemann HJ, Woolf S. Developing clinical practice guidelines: target audiences, identifying topics for guidelines, guideline group composition and functioning and conflicts of interest. Implement Sci. 2012;7:60.

  2. Woolf S, Schünemann HJ, Eccles MP, Grimshaw JM, Shekelle P. Developing clinical practice guidelines: types of evidence and outcomes; values and economics, synthesis, grading, and presentation and deriving recommendations. Implement Sci. 2012;7:61.

  3. Qaseem A, Forland F, Macbeth F, Ollenschläger G, Phillips S, van der Wees P. Guidelines International Network: toward international standards for clinical practice guidelines. Ann Intern Med. 2012;156:525–31.

  4. Wieringa S, Engebretsen E, Heggen K, Greenhalgh T. Clinical guidelines and the pursuit of reducing epistemic uncertainty. An ethnographic study of guideline development panels in three countries. Soc Sci Med. 2021;272:113702.

  5. Calderón C, Rotaeche R, Etxebarria A, Marzo M, Rico R, Barandiaran M. Gaining insight into the clinical practice guideline development processes: qualitative study in a workshop to implement the GRADE proposal in Spain. BMC Health Serv Res. 2006;6:138.

  6. Wieringa S, Dreesens D, Forland F, Hulshof C, Lukersmith S, Macbeth F, et al. Different knowledge, different styles of reasoning: a challenge for guideline development. BMJ Evid Based Med. 2018;23:87–91.

  7. Kastner M, Bhattacharyya O, Hayden L, Makarski J, Estey E, Durocher L, et al. Guideline uptake is influenced by six implementability domains for creating and communicating guidelines: a realist review. J Clin Epidemiol. 2015;68:498–509.

  8. Wiercioch W, Akl EA, Santesso N, Zhang Y, Morgan RL, Yepes-Nuñez JJ, et al. Assessing the process and outcome of the development of practice guidelines and recommendations: PANELVIEW instrument development. CMAJ. 2020;192:E1138–45.

  9. Wallace JW, Charles B, Mike C. Improving the uptake of systematic reviews: a systematic review of intervention effectiveness and relevance. BMJ Open. 2014;4:e005834.

  10. Tricco AC, Roberta C, Thomas SM, Motiwala SS, Shannon S, Kealey MR, et al. Barriers and facilitators to uptake of systematic reviews by policy makers and health care managers: a scoping review. Implement Sci. 2016;11:4.

  11. van de Bovenkamp HM, Zuiderent-Jerak T. An empirical study of patient participation in guideline development: exploring the potential for articulating patient knowledge in evidence-based epistemic settings. Health Expect. 2015;18:942–55.

  12. Perrier L, Mrklas K, Lavis JN, Straus SE. Interventions encouraging the use of systematic reviews by health policymakers and managers: a systematic review. Implement Sci. 2011;6:43.

  13. Petkovic J, Welch V, Jacob MH, Yoganathan M, Ayala AP, Cunningham H, et al. The effectiveness of evidence summaries on health policymakers and health system managers use of evidence from systematic reviews: a systematic review. Implement Sci. 2016;11:162.

  14. Chambers DWP, Thompson C, Hanbury A, Farley K, Light K. Maximizing the impact of systematic reviews in health care decision making: a systematic scoping review of knowledge-translation resources. Milbank Q. 2011;89:131–56.

  15. Brick C, McDowell M, Freeman ALJ. Risk communication in tables versus text: a registered report randomized trial on ‘fact boxes’. R Soc Open Sci. 2020;7:190876.

  16. Bressan V, Bagnasco A, Aleo G, Timmins F, Barisone M, Bianchi M, et al. Mixed-methods research in nursing – a critical review. J Clin Nurs. 2017;26:2878–90.

  17. Lizarondo L, Stern C, Carrier J, Godfrey C, Rieger K, Salmond S, et al. Chapter 8: Mixed methods systematic reviews. In: Aromataris E, Munn Z, editors. JBI Manual for Evidence Synthesis. Joanna Briggs Institute; 2020.

  18. Lizarondo L, Stern C, Apostolo J, Carrier J, de Borges K, Godfrey C, et al. Five common pitfalls in mixed methods systematic reviews – lessons learned. J Clin Epidemiol. 2022;S0895435622000750.

  19. Clyne B, Sharp M. Evidence synthesis and translation of findings for national clinical guideline development: addressing the needs and preferences of guideline development groups. OSF; 2021.

  20. Sharp MK, Tyner B, Baki DABA, Farrell C, Devane D, Mahtani KR, et al. Evidence synthesis summary formats for clinical guideline development group members: a mixed-methods systematic review protocol. HRB Open Res. 2021.

  21. Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. J Clin Epidemiol. 2021.

  22. Covidence: better systematic review management [Internet]. Covidence.

  23. Haddaway NR. citationchaser: an R package for forward and backward citation chasing in academic searching [Internet]. Zenodo; 2021.

  24. nealhaddaway/citationchaser [Internet]. 2021.

  25. Hoffmann TC, Glasziou PP, Boutron I, Milne R, Perera R, Moher D, et al. Better reporting of interventions: template for intervention description and replication (TIDieR) checklist and guide. BMJ. 2014;348:g1687.

  26. Critical appraisal tools [Internet]. Joanna Briggs Institute.

  27. 8.5.2 Mixed methods systematic review using a convergent segregated approach to synthesis and integration [Internet]. JBI Manual for Evidence Synthesis, JBI Global Wiki.

  28. Harbord RM, Egger M, Sterne JAC. A modified test for small-study effects in meta-analyses of controlled trials with binary endpoints. Stat Med. 2006;25:3443–57.

  29. Egger M, Smith GD, Schneider M, Minder C. Bias in meta-analysis detected by a simple, graphical test. BMJ. 1997;315:629–34.

  30. Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, et al. Cochrane Handbook for Systematic Reviews of Interventions [Internet]. Cochrane; 2020.

  31. Lockwood C, Munn Z, Porritt K. Qualitative research synthesis: methodological guidance for systematic reviewers utilizing meta-aggregation. JBI Evid Implement. 2015;13:179–87.

  32. Tufanaru C. Theoretical foundations of meta-aggregation: insights from Husserlian phenomenology and American pragmatism. Adelaide: The Joanna Briggs Institute, The University of Adelaide; 2015.

  33. Hannes K, Lockwood C. Pragmatism as the philosophical foundation for the Joanna Briggs meta-aggregative approach to qualitative evidence synthesis. J Adv Nurs. 2011;67:1632–42.

  34. NVivo: qualitative data analysis software [Internet]. 2021.

  35. Textual data synthesis [Internet]. JBI Manual for Evidence Synthesis, JBI Global Wiki.

  36. Hong QN, Pluye P, Bujold M, Wassef M. Convergent and sequential synthesis designs: implications for conducting and reporting systematic reviews of qualitative and quantitative evidence. Syst Rev. 2017;6:61.

  37. Sandelowski M, Voils CI, Barroso J. Defining and designing mixed research synthesis studies. Res Sch. 2006;13:29.

  38. Borah R, Brown AW, Capers PL, Kaiser KA. Analysis of the time and workers needed to conduct systematic reviews of medical interventions using data from the PROSPERO registry. BMJ Open. 2017;7:e012545.

  39. Sampson M, Tetzlaff J, Urquhart C. Precision of healthcare systematic review searches in a cross-sectional sample. Res Synth Methods. 2011;2:119–25.

  40. Babatunde OO, Tan V, Jordan JL, Dziedzic K, Chew-Graham CA, Jinks C, et al. Evidence flowers: an innovative, visual method of presenting “best evidence” summaries to health professional and lay audiences. Res Synth Methods. 2018;9:273–84.

  41. Buljan I, Tokalić R, Roguljić M, Zakarija-Grković I, Vrdoljak D, Milić P, et al. Comparison of blogshots with plain language summaries of Cochrane systematic reviews: a qualitative study and randomized trial. Trials. 2020;21:426.

  42. Busert LK, Mütsch M, Kien C, Flatz A, Griebler U, Wildner M, et al. Facilitating evidence uptake: development and user testing of a systematic review summary format to inform public health decision-making in German-speaking countries. Health Res Policy Syst. 2018;16.

  43. Dobbins M, DeCorby K, Twiddy T. A knowledge transfer strategy for public health decision makers. Worldviews Evid Based Nurs. 2004;1:120–8.

  44. Hartling L, Guise J-M, Hempel S, Featherstone R, Mitchell MD, Motu’apuaka ML, et al. EPC methods: AHRQ end-user perspectives of rapid reviews. Agency for Healthcare Research and Quality (US); 2016.

  45. Hartling L, Guise JM, Hempel S, Featherstone R, Mitchell MD, Motu’apuaka ML, et al. Fit for purpose: perspectives on rapid reviews from end-user interviews. Syst Rev. 2017;6.

  46. Hartling L, Gates A, Pillay J, Nuspl M, Newton AS. Development and usability testing of EPC evidence review dissemination summaries for health systems decisionmakers. Agency for Healthcare Research and Quality (US); 2018.

  47. Marquez C, Johnson AM, Jassemi S, Park J, Moore JE, Blaine C, et al. Enhancing the uptake of systematic reviews of effects: what is the best format for health care managers and policy-makers? A mixed-methods study. Implement Sci. 2018;13.

  48. Mustafa RA, Wiercioch W, Santesso N, Cheung A, Prediger B, Baldeh T, et al. Decision-making about healthcare related tests and diagnostic strategies: user testing of GRADE evidence tables. PLoS One. 2015;10.

  49. Newberry SJ, Shekelle PG, Vaiana M, Motala A. Reporting the findings of updated systematic reviews of comparative effectiveness: how do users want to view new information? Agency for Healthcare Research and Quality (US); 2013.

  50. Opiyo N, Shepperd S, Musila N, Allen E, Nyamai R, Fretheim A, et al. Comparison of alternative evidence summary and presentation formats in clinical guideline development: a mixed-method study. PLoS One. 2013;8.

  51. Perrier L, Kealey MR, Straus SE. A usability study of two formats of a shortened systematic review for clinicians. BMJ Open. 2014;4:e005919.

  52. Perrier L, Kealey MR, Straus SE. An iterative evaluation of two shortened systematic review formats for clinicians: a focus group study. J Am Med Inform Assoc. 2014;21:e341–6.

  53. Rosenbaum SE, Glenton C, Wiysonge CS, Abalos E, Mignini L, Young T, et al. Evidence summaries tailored to health policy-makers in low- and middle-income countries. Bull World Health Organ. 2011;89:54–61.

  54. Rosenbaum SE, Glenton C, Nylund HK, Oxman AD. User testing and stakeholder feedback contributed to the development of understandable and useful summary of findings tables for Cochrane reviews. J Clin Epidemiol. 2010;63:607–19.

  55. Smith CJ, Jungbauer RM, Totten AM. Visual evidence: increasing usability of systematic reviews in health systems guidelines development. Appl Clin Inform. 2019;10:743–50.

  56. Totten AM, Smith C, Dunham K, Jungbauer RM, Graham E. Improving access to and usability of systematic review data for health systems guidelines development. Agency for Healthcare Research and Quality (US); 2019.

  57. Steele R. Mental health clinicians’ views of summary and systematic review utility in evidence-based practice. Health Info Libr J.

  58. Yepes-Nuñez JJ, Li SA, Guyatt G, Jack SM, Brozek JL, Beyene J, et al. Development of the summary of findings table for network meta-analysis. J Clin Epidemiol. 2019;115:1–13.

  59. Buljan I, Malički M, Wager E, Puljak L, Hren D, Kellie F, et al. No difference in knowledge obtained from infographic or plain language summary of a Cochrane systematic review: three randomized controlled trials. J Clin Epidemiol. 2018;97:86–94.

  60. Carrasco-Labra A, Brignardello-Petersen R, Santesso N, Neumann I, Mustafa RA, Mbuagbaw L, et al. Improving GRADE evidence tables part 1: a randomized trial shows improved understanding of content in summary of findings tables with a new format. J Clin Epidemiol. 2016;74:7–18.

  61. Rosenbaum SE, Glenton C, Oxman AD. Summary-of-findings tables in Cochrane reviews improved understanding and rapid retrieval of key information. J Clin Epidemiol. 2010;63:620–6.

  62. Mustafa R, Wiercioch W, Brozek J, Lelgemann M, Buehler D, Garg A, et al. Enhancing the acceptance and implementation of GRADE summary tables for evidence about diagnostic tests. BMJ Qual Saf. 2013;22:A36.

  63. Dixon C, Dixon PE, Sultan S, Mustafa R, Morgan RL, Murad MH, et al. Guideline developers in the United States were inconsistent in applying criteria for appropriate Grading of Recommendations, Assessment, Development and Evaluation use. J Clin Epidemiol. 2020;124:193–9.

  64. Klugar M, Kantorová L, Pokorná A, Líčeník R, Dušek L, Schünemann HJ, et al. Visual transformation for guidelines presentation of the strength of recommendations and the certainty of evidence. J Clin Epidemiol. 2022;143:178–85.

  65. Cochrane. Cochrane checklist and guidance for disseminating findings from Cochrane intervention reviews. Version 1.0; 2019. p. 92.

  66. Brandt L, Vandvik PO, Alonso-Coello P, Akl EA, Thornton J, Rigau D, et al. Multilayered and digitally structured presentation formats of trustworthy recommendations: a combined survey and randomised trial. BMJ Open. 2017;7.

  67. Meta-analysis of the effect of natural frequencies on Bayesian reasoning [Internet]. PsycNET.

  68. Bredbenner K, Simon SM. Video abstracts and plain language summaries are more effective than graphical abstracts and published abstracts. PLoS One. 2019;14:e0224697.

  69. Maguire LCM. How much do you need: a randomised experiment of whether readers can understand the key messages from summaries of Cochrane reviews without reading the full review. J R Soc Med. 2014;107:444–9.

  70. Plain Language Expectations for Authors of Cochrane Summaries (PLEACS) [Internet]. Cochrane Editorial and Publishing Policy Resource.

  71. Buljan I, Tokalić R, Roguljić M, Zakarija-Grković I, Vrdoljak D, Milić P, et al. Framing the numerical findings of Cochrane plain language summaries: two randomized controlled trials. BMC Med Res Methodol. 2020;20.

  72. Karačić J, Dondio P, Buljan I, Hren D, Marušić A. Languages for different health information readers: multitrait-multimethod content analysis of Cochrane systematic reviews textual summary formats. 2019;19:1–9.

  73. International Guideline Credentialing & Certification Program [Internet].

  74. Booth A, Moore G, Flemming K, Garside R, Rollins N, Tunçalp Ö, et al. Taking account of context in systematic reviews and guidelines considering a complexity perspective. BMJ Glob Health. 2019;4:e000840.

  75. Poder TG, Rhainds M, Bellemare CA, Deblois S, Hammana I, Safianyk C, et al. Experiences of using Cochrane systematic reviews by local HTA units. Int J Health Policy Manag. 2020.

  76. Sharp M, Clyne B. Evidence synthesis summary formats for decision-makers and clinical guideline development groups: a mixed-methods systematic review protocol. OSF; 2021.

  77. Fearns N, Walker L, Graham K, Gibb N, Service D. User testing of a Scottish Intercollegiate Guideline Network public guideline for the parents of children with autism. BMC Health Serv Res. 2022;22:77.

  78. Wang X, Chen Y, Akl EA, Tokalić R, Marušić A, Qaseem A, et al. The reporting checklist for public versions of guidelines: RIGHT-PVG. Implement Sci. 2021;16:10.

  79. Santesso N, Rader T, Nilsen ES, Glenton C, Rosenbaum S, Ciapponi A, et al. A summary to communicate evidence from systematic reviews to the public improved understanding and accessibility of information: a randomized controlled trial. 2014;68:182–90.

  80. Alderdice F, McNeill J, Lasserson TJ, Beller E, Carroll M, Hundley V, et al. Do Cochrane summaries help student midwives understand the findings of Cochrane systematic reviews: the BRIEF randomised trial. Syst Rev. 2016;5:40.



MKS and BC are supported by a Health Research Board (HRB) Emerging Investigator Award (EIA-2019-09). The HRB is not involved in the project’s protocol, analysis plan, data collection, analysis, or interpretation of study results.

Author information

Authors and Affiliations



Study conceptualisation, BC, MON, MR, and SMS. Methodology development, MKS, DD, KRM, SMS, MON, MR, and BC. Data curation, MKS, DAB, JQ, BT, and BC. Formal analysis, MKS and BC. Writing—reviewing and editing, MKS, DAB, JQ, BT, DD, KRM, SMS, MO, MR, and BC. Supervision, MKS and BC. Project administration, MKS and BC. Funding acquisition, BC, MON, MR, SMS, and KRM. The authors read and approved the final manuscript.

Corresponding author

Correspondence to Melissa K. Sharp.

Ethics declarations

Ethics approval and consent to participate

There were no human participants involved in this project, so ethical approval was not necessary. All data are publicly available.

Consent for publication

Not applicable

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

PRISMA checklist.

Additional file 2.

Search strategy results.

Additional file 3.

Data extraction workbook.

Additional file 4.

Quantitative findings.

Additional file 5.

Qualitative synthesis recommendations.

Additional file 6:

 Figures 6, 7, and 8. Recommendations for Practice.

Additional file 7.

Qualitative synthesis recommendations (with at least 3 supporting studies or mixed methods support).

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit the Creative Commons website. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article


Cite this article

Sharp, M.K., Baki, D.A.B.A., Quigley, J. et al. The effectiveness and acceptability of evidence synthesis summary formats for clinical guideline development groups: a mixed-methods systematic review. Implementation Sci 17, 74 (2022).
