- Open Access
The GRADE evidence-to-decision framework: a report of its testing and application in 15 international guideline panels
Implementation Science volume 11, Article number: 93 (2015)
Judgments underlying guideline recommendations are seldom recorded and presented in a systematic fashion. The GRADE Evidence-to-Decision Framework (EtD) offers a transparent way to record and report guideline developers’ judgments. In this paper, we report the experiences with the EtD frameworks in 15 real guideline panels.
Following the guideline panel meetings, we asked methodologists participating in the panel to provide feedback regarding the EtD framework. They were instructed to consider their own experience and the feedback collected from the rest of the panel. Two investigators independently summarized the responses and jointly interpreted the data using pre-specified domains as a coding system. We asked methodologists to review the results and provide further input to improve the structure of the EtDs iteratively.
The EtD framework was well received, and the comments were generally positive. Methodologists felt that in a real guideline panel, the EtD framework helps structure a complex process through relatively simple steps in an explicit and transparent way. However, some sections (e.g., “values and preferences” and “balance between benefits and harms”) required further development and clarification, which were addressed in the current version of the EtD framework.
The use of an EtD framework in guideline development offers a structured and explicit way to record and report the judgments and discussion of guideline panels during the formulation of recommendations. In addition, it facilitates the formulation of recommendations, the assessment of their strength, and the identification of gaps in research.
Clinical practice guidelines are an efficient way to bridge the gap between research evidence, expert experience, and decision-making [1–4]. In order to be trustworthy, guidelines need to be explicit regarding the methods used to summarize the evidence and rate its certainty (also known as quality of the evidence or confidence in the effect estimates) and about the judgments involved in moving from evidence to recommendations [5–7]. Unfortunately, such judgments are rarely recorded and presented in a systematic way.
The Grading of Recommendations Assessment, Development and Evaluation (GRADE) working group (www.gradeworkinggroup.org), a collaborative of over 500 scientists, clinicians, and people with other backgrounds, has developed an approach to assessing the certainty in the body of evidence summarized in systematic reviews that support a decision or guideline recommendation, called the GRADE approach [8, 9]. This approach is used by over 90 organizations, including the World Health Organization, the National Institute for Health and Care Excellence, the Canadian Task Force on Preventive Health Care, numerous professional organizations, and the Cochrane Collaboration. There are over 20,000 citations to GRADE’s methodological work, and thousands of recommendations have been developed following the GRADE approach. GRADE can be used to evaluate different types of evidence, including evidence for intervention effects (including multiple treatment comparisons), test accuracy, prognosis, and resources.
In the context of its “Developing and Evaluating Communication Strategies to Support Informed Decisions and Practice Based on Evidence (DECIDE)” project, GRADE has developed strategies for the targeted communication of evidence-based recommendations to different stakeholders. One such strategy is the GRADE Evidence to Decision (EtD) Framework, meant to structure the development of both recommendations (e.g., clinical recommendations) and decisions (e.g., coverage or public health decisions) using the GRADE approach. Each EtD framework includes detailed sections describing (a) the question and background, (b) an assessment of the evidence, and (c) the conclusions. A summary of the effects of alternative management strategies on patient-important outcomes, data about patients’ values and preferences, and information about resource utilization are part of the assessment section [11–13]. The EtD framework also features information regarding the acceptability and feasibility of the alternative management strategies, as well as their impact on health equity. These data are obtained from current systematic reviews when available or, more often, from updates of previous reviews or de novo systematic reviews. Finally, the EtD framework provides an explicit record of the judgments made by guideline panelists that ultimately determine the direction and strength of recommendations or decisions. In this paper, we report on the first experience with the EtD framework for clinical recommendations in real guideline panels.
The EtD frameworks
The EtD frameworks evaluated in this study included different versions at various stages of their development during the DECIDE project. The earliest version of the full GRADE EtD framework was used in 2012 in the World Health Organization (WHO) guidelines for the treatment of multidrug-resistant tuberculosis. The last version of the EtD applied in this study was used in a series of guidelines for the Ministry of Health in Saudi Arabia in 2014 (http://www.moh.gov.sa/depts/Proofs/Pages/Guidelines.aspx). The EtDs were developed based on GRADE evidence to decision tables first utilized in a WHO guideline on avian influenza [15, 16]. The original evidence to decision table (rather than framework) only included five decision criteria [16, 17]. Table 1 describes the iterative changes that were made to the EtD framework during this project. Additional file 1: Table S1 offers an example of the later EtD frameworks tested in this study, with 11 criteria and the question and conclusion sections. Four panels utilized online versions of the interactive EtD frameworks through GRADE’s software GRADEpro, while the other panels used paper-based versions (Table 2). All EtDs were transferred to GRADEpro for final agreement with panel members and publication of the guideline.
We describe the main features of the EtD here (Additional file 1: Table S1). The header of the table provides details about the clinical question, clearly outlining its components following the PICO framework, and any relevant background information. The first column offers the factors or criteria being evaluated, framed as questions (e.g., is the problem a priority?). The second column identifies the judgments required of panels and allows these judgments to be recorded for each criterion. The relevant research evidence and the additional considerations upon which the judgments are based are documented in the third and fourth columns, respectively. While the research evidence should be based on systematic reviews of the literature, the additional considerations column allows capturing specific considerations and information about the local context in which the recommendation will be applied. Finally, the framework captures the judgments leading to the direction and strength of the recommendation or decision. In addition, there is free space to include subgroup and implementation considerations, as well as information regarding future updates, monitoring, and research priorities. In the four guidelines that used GRADEpro, GRADE’s online tool to assess evidence and develop recommendations, the information was shown on a large screen during the panel meeting, and judgments and additional considerations were added online to the EtD during the panel meetings. The version of the EtD in GRADEpro corresponded to the stage of the EtD development (shown in Additional file 1: Table S1), while the current version of GRADEpro that is available on the web (www.gradepro.org) includes the fully developed static and interactive EtD.
We used the EtD frameworks in 15 international guideline panels (Table 2). Two guideline panels were organized by international agencies (World Health Organization and World Allergy Organization); 12 panels were in the context of national guideline programs (10 in the Kingdom of Saudi Arabia, 1 in Colombia, and 1 in Spain); and 1 panel was part of an effort to establish methods for developing guidance on rare diseases (Italy and Germany).
All guidelines were developed between 2012 and 2014. The guidelines for Saudi Arabia were produced by ten panels, five of them meeting in parallel followed by consecutive meetings of the other five panels. While the preparatory work for these panels was extensive and preceded the panel meetings, the panel meetings lasted 2 days each. Two or three methodologists supervised two panels each sequentially over a total of 4 days. The other guideline panels met over 1 or 2 days and were also led by pairs of methodologists. Following the guideline panel meetings, we asked the methodologists leading the panel to provide feedback regarding the EtD framework. Specifically, three investigators (WW, PAC, HJS) developed a spreadsheet with pre-specified domains, and we asked the methodologists to provide free text comments on each domain and also free text comments about their overall experience with the framework (Additional file 2). Two of the three investigators (WW and HJS) who developed the data collection sheet also participated as methodologists in guideline panels and therefore provided comments and feedback on the EtD framework. We chose this method for collecting the data because it could be implemented quickly, just after the panel meetings. We instructed the methodologists to consider their own experience with the EtD framework and also the feedback collected from the rest of the panelists during the guideline panel meeting. After a first round of feedback, we provided a summary of the results to the methodologists for further input.
Two investigators (IN and RBP) independently identified themes from the feedback obtained from the methodologists. They used the pre-specified domains as a coding system. Then, the two investigators jointly interpreted the data and organized the themes into the following categories: general comments about the framework and comments about specific sections within the framework. All participants were invited to review and comment on a draft report of the results.
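The layout described above can be pictured as a simple data structure. The following is a minimal sketch in Python, not the GRADEpro data model: the class names, field names, and example values are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Criterion:
    """One row of the EtD table (hypothetical field names)."""
    question: str                        # first column: criterion framed as a question
    judgment: Optional[str] = None       # second column: the panel's recorded judgment
    research_evidence: str = ""          # third column: evidence from systematic reviews
    additional_considerations: str = ""  # fourth column: e.g., local context


@dataclass
class EtDFramework:
    """Header, assessment rows, and conclusions of a single EtD (illustrative only)."""
    population: str                      # PICO components from the table header
    intervention: str
    comparison: str
    outcomes: List[str]
    background: str = ""
    criteria: List[Criterion] = field(default_factory=list)
    direction: Optional[str] = None      # for or against the intervention
    strength: Optional[str] = None       # strong or weak

    def unjudged(self) -> List[str]:
        """Return the questions for which no judgment has been recorded yet."""
        return [c.question for c in self.criteria if c.judgment is None]


# Placeholder content, not taken from any of the 15 guidelines.
etd = EtDFramework(
    population="adults with condition X",
    intervention="treatment A",
    comparison="no treatment",
    outcomes=["mortality", "adverse events"],
    criteria=[
        Criterion("Is the problem a priority?", judgment="Yes"),
        Criterion("Is the option feasible to implement?"),
    ],
)
print(etd.unjudged())
```

Keeping the judgment separate from the evidence and the additional considerations mirrors the four-column layout, and a helper such as `unjudged` shows how a panel could check that every criterion has been addressed before moving to the conclusions.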
Context and methodologists
The topics covered by guideline panels were diverse, including cardiovascular diseases (5 guidelines), asthma and allergy (3 guidelines), infectious diseases (2 guidelines), cancer screening and diagnosis (2 guidelines), and others (3 guidelines). Most of the guidelines were focused on adults (11 guidelines) (Table 2).
Ten methodologists led the guideline development on the 15 panels, and for all guidelines, a second methodologist provided backup support. Seven of the panel leads were physicians, two dentists, and one registered nurse, all of them with postgraduate training in health research methods or a related discipline. All guideline methodologists had formal training and experience in conducting systematic reviews and in using the GRADE approach to formulate recommendations. They were all members of the GRADE working group. For 13 out of the 15 guidelines, one highly experienced guideline methodologist led the panel discussions supported by another guideline methodologist. Most of the highly experienced methodologists had led multiple systematic reviews and high level guideline panels at major international organizations (e.g., WHO) or professional societies with large guideline programs (e.g., American Thoracic Society or American College of Chest Physicians). The gender distribution was six males and four females, and the mean age was 37.8 years (standard deviation 6.3). All of them provided feedback about the EtD framework.
General comments about the EtD framework
Table 3 summarizes the results by themes. In general, the EtD framework was well received and the comments were mostly positive. The methodologists felt that in a real guideline panel, the EtD framework helps structure a complex group process through relatively simple steps in an explicit and transparent way. Further, two methodologists commented that the explicitness of the framework could help panels form an accurate idea of the underlying evidence and protect against inappropriate recommendations.
Two methodologists stated that the wording of the questions and answer options was suboptimal for recommendations comparing two active interventions (instead of one active intervention against placebo/no treatment). Also, three methodologists raised concerns regarding the length of the framework and the relevance of some sections in specific circumstances. In particular, considerations regarding feasibility, accessibility, and equity were considered less relevant in the context of clinical recommendations. Finally, two methodologists stated that in order to take full advantage of the EtD framework, it is necessary to be familiar with the GRADE approach and with the framework itself. Therefore, for new users, previous training may be helpful for successful implementation and use. For example, one methodologist stated, “Until you see the process you don’t appreciate all the real work and value of the forms and filling out information before the panel meeting, etc. You must go through the process and use the [EtD] and see the outcome.”
Comments about specific sections
The sections “values and preferences” and “balance of benefits and harms” posed difficulties in several panels. Regarding the “values and preferences” section, the GRADE approach requires guideline panelists to judge whether there is important variability in how patients value the main outcomes and to what extent there is important uncertainty about this. If either is present, panels should consider issuing a weak recommendation. The main concern with this question was that the answer options did not differentiate between variability and uncertainty in patients’ values and preferences, creating some confusion among panelists. Four methodologists suggested dividing this question into two.
Regarding the balance between benefits and harms, this section features three questions requiring panelists to judge the magnitude of desirable and undesirable effects, as well as the relation between the two. Panelists struggled to judge the size of the benefits and harms consistently. Two methodologists considered the three questions redundant and suggested dropping the questions about the size of the benefits and harms and keeping only the one addressing their balance.
This is the first study evaluating the use of the GRADE EtD framework in the context of real guideline development. Given the widespread application of GRADE, this study provides important information and rationale for utilizing EtDs in guidelines. In general, the response to the use of the EtD framework was positive, although some sections (i.e., “values and preferences” and “balance between benefits and harms”) required improvements, in particular in the choice of answer options. These improvements have been implemented in the current final version of the GRADE EtD frameworks (www.gradepro.org) [11–13].
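The distinction the methodologists asked for can be made concrete with a short sketch. The Python function below is an illustration only, under the assumption that variability and uncertainty are recorded as two separate yes/no judgments; the function and parameter names are hypothetical and do not come from the EtD or GRADEpro.

```python
def consider_weak_recommendation(important_variability: bool,
                                 important_uncertainty: bool) -> bool:
    """Flag that the panel should consider a weak recommendation.

    Hypothetical helper: either important variability in how patients value
    the main outcomes, or important uncertainty about those values, is enough
    to raise the flag. It prompts discussion; it does not set the strength.
    """
    return important_variability or important_uncertainty


# Recording the two judgments separately keeps the reason for the flag visible.
print(consider_weak_recommendation(important_variability=True,
                                   important_uncertainty=False))
```

The flag covers only this one criterion. The final strength of a recommendation still depends on the panel's judgments across all criteria in the framework.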
There are some limitations to this study. The use of iteratively developed formats of the EtD framework in different guideline panels could have introduced some variability in the provided answers. However, it was necessary as part of the development and improvement of the EtDs. For example, early testing with the World Health Organization and World Allergy Organization panels revealed difficulties with providing responses to the available answer options. In order to make best use of the EtDs, they were modified for use in the work of later guideline panels. While methodologists were prospectively instructed before the panel meetings to record challenges with the EtDs during the panel meetings, there was no independent evaluation of challenges with the EtD frameworks. However, the data were recorded without delay and we believe that recall of the encountered issues does not lead to bias as all methodologists had an interest in improving the frameworks.
The strengths of this study include the strategies taken to ensure the trustworthiness of our findings: prospective data collection, independent and duplicate analysis, and member checking of the conclusions. The most important strength of this work is the application of the EtD frameworks in the context of real guideline development by international panels in a broad range of topics, including diagnostic and therapeutic questions.
Our findings strongly suggest that the EtD framework can enhance the transparency of guideline developers’ judgments, allowing decision makers to truly assess the recommendation. This is corroborated not only by this work but also by the adoption of the EtD approach by major organizations such as WHO. What happens within guideline panels is seldom reported in current guidelines [22–26]. Hence, when recommendations are analyzed at a deeper level, some decisions of the panel might seem arbitrary and unjustified. Providing details about the reasons and judgments behind such decisions enriches the recommendations and may also facilitate the adaptation of international guidelines to local contexts. In the EtDs, each criterion is considered by working through the EtD sequentially and then in a summary before formulating a recommendation. This avoids unnecessarily moving back and forth between criteria and wasting time. In addition, each criterion requires a judgment that determines the direction and strength of a recommendation, which is often not found in guidelines. This information will facilitate updating (by seeing judgments and evaluating if they should change) and adaptation of guidelines (by making others aware of the judgments).
The EtD framework also offers the opportunity to identify and potentially avoid inappropriate recommendations. One potential problem of recommendations is the mismatch between the strength of the recommendation and the underlying information about the certainty in effect estimates (also known as confidence in the evidence or quality of the evidence), the balance between the benefits and harms, values and preferences, and resource considerations. The EtD framework, on the one hand, allows users to judge to what extent the recommendation corresponds to the underlying information and, on the other hand, may help prevent guideline developers from inappropriately grading the strength of the recommendation. Finally, the EtD framework may also be a valuable research tool, since it would allow a detailed comparison of conflicting recommendations from different organizations.
In conclusion, the use of an EtD framework in guideline development offers an explicit and transparent way to record and report the judgments and discussion of guideline panelists. The EtD framework may facilitate the assessment of recommendations and also their potential adaptation and implementation. As a result of this work, a new section was added to the EtD framework. This section allows summarizing the judgments of guideline panelists just prior to the deliberations regarding the direction and strength of recommendations. The latest version of the EtD resulting from this and other work, including an interactive version, is available in GRADE’s online GRADEpro Guideline Development Tool (www.gradepro.org) and is described elsewhere [11–13]. While the current formats of EtD frameworks are already used widely, future use and research will inform changes that will have to be made to the EtDs. To name some areas of relevance, additional research could identify how to best integrate issues arising from potential conflicts of interest, how to structure research recommendations, how to address agreement of panel members with information provided by systematic reviews, how to document alterations in judgments about the certainty in the evidence by panel members as a result of indirectness of the evidence, and how to structure implementation considerations. The GRADE working group will further evaluate and develop the EtD frameworks. In doing so, the evaluation of the EtD frameworks in real guideline development processes should continue to provide information for improvement.
Woolf SH, Grol R, Hutchinson A, Eccles M, Grimshaw J. Clinical guidelines: potential benefits, limitations, and harms of clinical guidelines. BMJ. 1999;318(7182):527–30.
Eccles MP, Grimshaw JM, Shekelle P, Schunemann HJ, Woolf S. Developing clinical practice guidelines: target audiences, identifying topics for guidelines, guideline group composition and functioning and conflicts of interest. Implement Sci. 2012;7:60.
Shekelle P, Woolf S, Grimshaw JM, Schunemann HJ, Eccles MP. Developing clinical practice guidelines: reviewing, reporting, and publishing guidelines; updating guidelines; and the emerging issues of enhancing guideline implementability and accounting for comorbid conditions in guideline development. Implement Sci. 2012;7:62.
Woolf S, Schunemann HJ, Eccles MP, Grimshaw JM, Shekelle P. Developing clinical practice guidelines: types of evidence and outcomes; values and economics, synthesis, grading, and presentation and deriving recommendations. Implement Sci. 2012;7:61.
Brouwers MC, Kho ME, Browman GP, Burgers JS, Cluzeau F, Feder G, Fervers B, Graham ID, Grimshaw J, Hanna SE, et al. AGREE II: advancing guideline development, reporting and evaluation in health care. J Clin Epidemiol. 2010;63(12):1308–11.
Institute of Medicine (US) Committee on Standards for Developing Trustworthy Clinical Practice Guidelines; Editors: Robin Graham, Michelle Mancher, Dianne Miller Wolman, Sheldon Greenfield, and Earl Steinberg. Washington (DC): National Academies Press (US); 2011 (available at: http://www.nap.edu/openbook.php?record_id=13058)
Schunemann HJ, Wiercioch W, Etxeandia I, Falavigna M, Santesso N, Mustafa R, Ventresca M, Brignardello-Petersen R, Laisaar KT, Kowalski S, et al. Guidelines 2.0: systematic development of a comprehensive checklist for a successful guideline enterprise. CMAJ. 2014;186(3):E123–142.
Atkins D, Best D, Briss PA, Eccles M, Falck-Ytter Y, Flottorp S, Guyatt GH, Harbour RT, Haugh MC, Henry D, et al. Grading quality of evidence and strength of recommendations. BMJ. 2004;328(7454):1490.
Schünemann HJ, Best D, Vist G, Oxman AD. Letters, numbers, symbols and words: how to communicate grades of evidence and recommendations. CMAJ. 2003;169(7):677–80.
Treweek S, Oxman AD, Alderson P, Bossuyt PM, Brandt L, Brozek J, Davoli M, Flottorp S, Harbour R, Hill S, et al. Developing and Evaluating Communication Strategies to Support Informed Decisions and Practice Based on Evidence (DECIDE): protocol and preliminary results. Implement Sci. 2013;8:6.
Schunemann HJ, Mustafa R, Brozek J, Santesso N, Alonso-Coello P, Guyatt G, Scholten R, Langendam M, Leeflang MM, Akl EA, et al. GRADE Guidelines: 16. Development of the GRADE Evidence to Decision (EtD) frameworks for tests in clinical practice and public health. J Clin Epidemiol 2016. doi: 10.1016/j.jclinepi.2016.01.032.
Alonso-Coello P, Schunemann H, Moberg J, Brignardello-Petersen R, Akl E, Davoli M, Treweek S, Mustafa R, Rada G, Rosenbaum S, Morelli A, Guyatt GH, Oxman AD. GRADE Evidence to Decision frameworks: 1. Introduction. BMJ in press.
Alonso-Coello P, Oxman AD, Moberg J, Brignardello-Petersen R, Akl E, Davoli M, Treweek S, Mustafa R, Vandvik P, Meerpohl J, Guyatt GH, Schunemann H. GRADE Evidence to Decision frameworks: 2. Clinical practice guidelines. BMJ in press.
World Health Organization. The use of bedaquiline in the treatment of multidrug-resistant tuberculosis. Interim policy guidance. 2013. http://apps.who.int/iris/bitstream/10665/75146/1/9789241548441_eng.pdf. Accessed 8 Dec 2014.
Schunemann HJ, Hill SR, Kakad M, Bellamy R, Uyeki TM, Hayden FG, Yazdanpanah Y, Beigel J, Chotpitayasunondh T, Del Mar C, et al. WHO Rapid Advice Guidelines for pharmacological management of sporadic human infection with avian influenza A (H5N1) virus. Lancet Infect Dis. 2007;7(1):21–31.
Schunemann HJ, Hill SR, Kakad M, Vist GE, Bellamy R, Stockman L, Wisloff TF, Del Mar C, Hayden F, Uyeki TM, et al. Transparent development of the WHO rapid advice guidelines. PLoS Med. 2007;4(5):e119.
Santesso N, Schunemann H, Blumenthal P, De Vuyst H, Gage J, Garcia F, Jeronimo J, Lu R, Luciani S, Quek SC, et al. World Health Organization Guidelines: use of cryotherapy for cervical intraepithelial neoplasia. Int J Gynaecol Obstet. 2012;118(2):97–102.
Saudi Arabian Clinical Practice Guideline on Allergic Rhinitis in Asthma (available at: http://www.moh.gov.sa/endepts/Proofs/Pages/Guidelines.aspx)
Schünemann HJ, Brozek J. GRADEpro Guideline Development Tool. Hamilton, Canada: McMaster University; 2015.
Schünemann H, Wiercioch W, Brozek J, Etxeandia-Ikobaltzetaen I, Mustafa R, Manja V, Brignardello-Petersen R, Neumann I, Falavigna M, Al-Hazzani W, et al. GRADE Evidence to Decision Frameworks for adoption, adaptation and de novo development of trustworthy recommendations: GRADE-ADOLOPMENT. J Clin Epidemiol 2016, accepted for publication.
WHO handbook for guideline development. Geneva, Switzerland: World Health Organization; 2014.
Alonso-Coello P, Irfan A, Sola I, Gich I, Delgado-Noguera M, Rigau D, Tort S, Bonfill X, Burgers J, Schunemann H. The quality of clinical practice guidelines over the last two decades: a systematic review of guideline appraisal studies. Qual Saf Health Care. 2010;19(6):e58.
Raine R, Sanderson C, Black N. Developing clinical guidelines: a challenge to current methods. BMJ. 2005;331(7517):631–3.
Oxman AD, Fretheim A, Schunemann HJ. Improving the use of research evidence in guideline development: introduction. Health Res Policy Syst. 2006;4:12.
Oxman AD, Schunemann HJ, Fretheim A. Improving the use of research evidence in guideline development: 14. Reporting guidelines. Health Res Policy Syst. 2006;4:26.
Wilson KC, Irwin RS, File Jr TM, Schunemann HJ, Guyatt GH, Rabe KF. Reporting and publishing guidelines: article 12 in Integrating and coordinating efforts in COPD guideline development. An official ATS/ERS workshop report. Proc Am Thorac Soc. 2012;9(5):293–7.
Neumann I, Santesso N, Akl EA, Rind DM, Vandvik PO, Alonso-Coello P, Agoritsas T, Mustafa RA, Alexander PE, Schunemann H, et al. A guide for health professionals to interpret and use recommendations in guidelines developed with the GRADE approach. J Clin Epidemiol. 2016;72:45–55.
Alexander PE, Brito JP, Neumann I, Gionfriddo MR, Bero L, Djulbegovic B, Stoltzfus R, Montori VM, Norris SL, Schunemann HJ, et al. World Health Organization strong recommendations based on low-quality evidence (study quality) are frequent and often inconsistent with GRADE guidance. J Clin Epidemiol. 2016;72:98–106.
Schunemann HJ. Interpreting GRADE's levels of certainty or quality of the evidence: GRADE for statisticians, considering review information size or less emphasis on imprecision? J Clin Epidemiol. 2016;75:6–15.
We thank all panelists and methodologists of the guidelines in which this EtD framework was tested.
This project has been funded by the European Union’s Seventh Framework Programme for research, technological development and dissemination under grant agreement no. 258583 (www.decide-collaboration.eu), the Ministry of Health of the Kingdom of Saudi Arabia, and the McMaster GRADE center.
HJS, PAC and WW conceived of the study and participated in its design and coordination. IN and RBP participated in its design and conducted the analysis. ACL, CC, EA, RAM, WA, IEI, MXR, MF, NS, JB and AI provided substantial contributions to the interpretation of the findings. IN and HJS drafted the manuscript. PAC, WW, RBP, ACL, CC, EA, RAM, WA, IEI, MXR, MF, NS, JB and AI revised the manuscript critically. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Neumann, I., Brignardello-Petersen, R., Wiercioch, W. et al. The GRADE evidence-to-decision framework: a report of its testing and application in 15 international guideline panels. Implementation Sci 11, 93 (2015). https://doi.org/10.1186/s13012-016-0462-y
- Clinical practice guidelines
- Evidence to decisions framework