E-learning interventions are comparable to user's manual in a randomized trial of training strategies for the AGREE II

Background: Practice guidelines (PGs) are systematically developed statements intended to assist in patient and practitioner decisions. The AGREE II is the revised tool for PG development, reporting, and evaluation, comprising 23 items, two global rating scores, and a new User's Manual. In this study, we sought to develop, execute, and evaluate the impact of two internet interventions designed to accelerate the capacity of stakeholders to use the AGREE II.
Methods: Participants were randomized to one of three training conditions. 'Tutorial': participants proceeded through the online tutorial with a virtual coach and reviewed a PDF copy of the AGREE II. 'Tutorial + Practice Exercise': in addition to the Tutorial, participants also appraised a 'practice' PG. For the practice PG appraisal, participants received feedback on how their scores compared to expert norms and formative feedback if scores fell outside the predefined range. 'AGREE II User's Manual PDF (control condition)': participants reviewed a PDF copy of the AGREE II only. All participants evaluated a test PG using the AGREE II. Outcomes of interest were learners' performance, satisfaction, self-efficacy, mental effort, time-on-task, and perceptions of AGREE II.
Results: No differences emerged between training conditions on any of the outcome measures.
Conclusions: We believe these results can be explained by better than anticipated performance of the AGREE II PDF materials (control condition) or the participants' level of health methodology and PG experience rather than the failure of the online training interventions. Some data suggest the online tools may be useful for trainees new to this field; however, this requires further study.


Background
Evidence-based practice guidelines (PGs) are systematically developed statements aimed at assisting clinicians and patients to make decisions about appropriate healthcare for specific clinical circumstances [1] and to inform decisions made by policy makers [2][3][4]. While PGs have been shown to have a moderate impact on behavior [5], their potential for benefit is only as good as the PGs themselves [6][7][8]. The AGREE II, a revised version of the original tool [9], is an instrument designed to direct the development, reporting, and evaluation of PGs [10][11][12][13]. The AGREE II consists of 23 items grouped into six quality domains, two overall assessment items, and extensive supporting documentation to facilitate its appropriate application (i.e., User's Manual).
International adoption of the original AGREE Instrument and interest in the revised version have been significant and attest to the potential value of this tool [14]. The AGREE II was designed for many different types of users and for users with varied expertise. Given the breadth and heterogeneity of the AGREE II's stakeholder group, efforts to promote and facilitate its application are complex. The internet is a key medium to reach a vast, varied, and global audience. However, passive internet dissemination alone, even with a primed and interested audience, will not fully optimize the tool's application and use. Our interest was to explore educational interventions and to leverage technical platforms to accelerate an effective application process. E-learning (internet-based training) provides a potentially effective, standardized, and cost-efficient model for training in the use of AGREE II. A recent meta-analysis and systematic review showed large effect sizes for internet-based instruction (clinical and methodological content areas) on outcomes with health-profession learners [15,16]. Improved learning outcomes seemed to be associated with designs that included interactivity, practice exercises, repetition, and feedback. Thus, e-learning appeared to be a promising solution for our context. While the evidence base underpinning the efficacy and design principles of e-learning training materials is well established [17][18][19][20][21][22][23], there remain questions regarding the optimal application and combination of these principles for particular interventions. In this study, we wanted to design and test two e-learning interventions, a tutorial alone versus a tutorial plus an interactive practice exercise, against a more traditional learning form to determine their impact on outcomes related to the AGREE II.
Our primary research question was: compared to just reading the User's Manual, does the addition of an online tutorial program, with or without a practice exercise with feedback, improve learners' performance and increase learners' satisfaction and self-efficacy with the AGREE II? Based on the results of systematic reviews [15,16], we hypothesized that the training platform that included the tutorial plus the practice exercise with feedback would be superior to the User's Manual alone. For exploratory purposes, we also examined whether differences existed across the outcome measures between the two e-learning intervention groups.

Methods
This study was funded by the Canadian Institutes of Health Research and received ethics approval from the Hamilton Health Sciences/Faculty of Health Sciences Research Ethics Board (REB #09-398; McMaster University, Hamilton, Ontario, Canada). Key evidence-based principles in the science of technical training, multimedia learning, and cognitive psychology were used to develop the two training platforms [17][18][19][20][21][22][23].

Study design and intervention
A single factorial design with three levels of training intervention was implemented (see Figure 1).

Tutorial
Participants received access to a password-protected website where they were presented with a seven-minute multimedia tutorial presentation with an overview of the AGREE II conducted by a 'virtual coach.' Following the tutorial, the participants were granted access to a PDF copy of the AGREE II and were instructed to review the User's Manual before proceeding to the test PG.

Tutorial + practice exercise
Participants received access to a password-protected website where they received the same tutorial presentation described above and access to the AGREE II User's Manual. They were then presented with the practice exercise that required participants to read a sample or 'practice' PG and appraise it using the AGREE II. Upon entering each AGREE II score, participants were provided immediate feedback on how their score compared to the mean of four experts. If their score fell outside a predefined range, participants received two-stage formative feedback to guide the appraisal process. At the conclusion of their review, participants received a summary of their performance in appraising the practice PG compared to expert norms. Participants then proceeded to read and appraise the test PG.

User's manual
Participants assigned to the control condition received PDF copies of the AGREE II User's Manual for review before proceeding to the test PG. The User's Manual is a 56-page document. It provides an overview of the AGREE enterprise and general instructions on how to use the tool. Then, for each of the 23 core items, it presents a definition of the concept and examples, advice on where the information can be found within a PG document, and the specific criteria and considerations for scoring. It concludes with the two global rating measures.

Participants and process
Following our sample size calculation reported in the detailed protocol previously published [14], we required 20 participants per group to have at least 80% power to detect a performance advantage of as little as ± 0.79 standard deviations for either of the intervention groups compared to the passive learning group. Methodologists, clinicians, policy makers, and trainees were sought from guideline programs, professional directories, and the Guidelines International Network (G-I-N) community. Because our previous research showed virtually no differences in AGREE II performance as a function of type of users, we did not account for this factor in our study design [11][12][13].
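The detectable effect size quoted above can be approximately reproduced with a normal-approximation calculation for a comparison of two group means. The sketch below is illustrative only: the one-sided significance level of 0.05 and the normal approximation are assumptions that are not stated in this article (the actual calculation is reported in the published protocol [14]), and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def min_detectable_effect(n_per_group, alpha=0.05, power=0.80):
    """Smallest standardized mean difference (in SD units) detectable
    between two equal-sized groups.

    Assumptions (not taken from the article): normal approximation,
    one-sided test at level `alpha`.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)      # quantile for the target power
    return (z_alpha + z_power) * sqrt(2 / n_per_group)


effect = min_detectable_effect(20)
print(round(effect, 2))  # 0.79 under these assumptions
```

With a two-sided test or a t-distribution correction, the detectable difference for 20 participants per group would be somewhat larger, so the figure depends on the assumptions chosen.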
A total of 107 interested individuals registered with the Scientific Office. After receiving a letter of invitation and screening for their eligibility, 87 participants were randomized to one of the three training conditions using a computer-generated randomization sequence (1:1:1 ratio). Individuals were eligible for study participation if they had no or limited experience and exposure to the original AGREE Instrument or the AGREE II. To assess this, participants were asked to first complete an online eligibility questionnaire. Here, they were asked about the type(s) of previous experience they had with the original AGREE and AGREE II (as a tool to inform guideline development, guideline reporting, guideline evaluation, and other) and the extent of this experience (never, 1 to 5 guidelines, 6 to 10 guidelines, 11 to 15 guidelines, 16 to 20 guidelines, 20+ guidelines). They were also asked if they had participated in any AGREE-related research study previously (yes, no, uncertain). Participants who answered they had not participated in an AGREE-related research study and who had little to no AGREE or AGREE II experience (defined as never using either instrument or using it on a maximum of 1 to 5 guidelines) were eligible to participate. These individuals were then randomized to group and received access to an individualized password-protected web-based study platform. Participants completed their specific training intervention, evaluated one of ten test PGs using the AGREE II, and completed a series of post-test Learner's Scales and a demographics survey. Participants were blinded to the study conditions, our research questions, and hypothesis.

[Figure 1. Sequence of steps by training condition. Tutorial group: Tutorial, then Review/Assess Test-PG. Tutorial + Practice Exercise group: Tutorial, Practice Exercise, then Review/Assess Test-PG. Remaining condition: Review/Assess Test-PG.]
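A computer-generated 1:1:1 allocation sequence of the kind described above could be produced as follows. This is a minimal sketch, not the procedure actually used in the study: the balanced-shuffle approach, the seed value, and the function and arm names are assumptions made for illustration.

```python
import random

ARMS = ["Tutorial", "Tutorial + Practice Exercise", "User's Manual"]


def allocation_sequence(n_participants, arms=ARMS, seed=None):
    """Return a shuffled list of arm labels with (near-)equal arm sizes."""
    rng = random.Random(seed)  # seeded generator so the sequence is reproducible
    # Repeat the arm list enough times to cover everyone, then trim to size.
    repeats = -(-n_participants // len(arms))  # ceiling division
    sequence = (list(arms) * repeats)[:n_participants]
    rng.shuffle(sequence)
    return sequence


sequence = allocation_sequence(87, seed=2011)  # seed is arbitrary
counts = {arm: sequence.count(arm) for arm in ARMS}
print(counts)
```

With 87 participants and three arms, this balanced approach yields 29 allocations per arm; a simple coin-flip style assignment would not guarantee equal group sizes.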

Practice guidelines
Eleven PGs were selected for this study: one served as the practice PG for participants randomized to the Tutorial + Practice Exercise group and, to facilitate generalizability of results, the remaining ten served as the test PGs. Participants were randomized to one of the ten test PGs. Practice guideline was not a factor of analytic interest. Eligibility criteria for the 11 PGs are described in detail in the previously published protocol [14]: PGs had to be English-language documents published from 2002 onward, fall within the clinical areas of cancer, cardiovascular, or critical care, be 50 pages or less, and together represent a range of quality.

AGREE II performance
The AGREE II consists of survey items and a User's Manual [11][12][13]. Twenty-three items are grouped into six domains of PG quality: scope and purpose, stakeholder involvement, rigour of development, clarity of presentation, applicability, and editorial independence. Items are answered using a 7-point response scale ('strongly disagree' to 'strongly agree'). Standardized domain scores are calculated, enabling construction of a performance score profile that permits direct comparisons across the domains or items. The AGREE II survey items conclude with two global measures answered using a 7-point scale: one item targeting the PG's overall quality and the second targeting the appraiser's intention to use the PG. The User's Manual provides explicit direction for each of the 23 items and the two overall items, as noted above. Participant performance served as the primary outcome.
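As an illustration of how a standardized domain score can be computed, the sketch below rescales the summed item ratings for one domain to a 0 to 100% range between the minimum and maximum possible totals. This min-max scaling is the approach generally attributed to the AGREE II User's Manual; because the formula is not spelled out in this article, it should be treated as an assumption. The function name and the example ratings are hypothetical.

```python
def standardized_domain_score(ratings, scale_min=1, scale_max=7):
    """Scale summed ratings for one domain to a 0-100% range.

    `ratings` holds one list of item ratings per appraiser, all for the
    same domain, each rating on the 7-point response scale.
    """
    n_appraisers = len(ratings)
    n_items = len(ratings[0])
    if any(len(row) != n_items for row in ratings):
        raise ValueError("every appraiser must rate the same items")

    obtained = sum(sum(row) for row in ratings)
    min_possible = scale_min * n_items * n_appraisers
    max_possible = scale_max * n_items * n_appraisers
    return 100 * (obtained - min_possible) / (max_possible - min_possible)


# Hypothetical ratings: four appraisers, three items in the domain.
example = [[5, 6, 6], [6, 6, 7], [2, 4, 3], [3, 3, 2]]
score = standardized_domain_score(example)
print(f"{score:.1f}%")
```

Scaling by the possible range, rather than averaging raw ratings, keeps domains with different numbers of items on a common footing, which is what allows a profile to be compared across domains.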

Learner's scale
In addition to the primary outcome of performance on the test PG, a series of secondary measures, known as the Learner's scale, was also collected. This scale comprised a Learner Satisfaction scale (i.e., satisfaction with the learning opportunity), a Self-Efficacy scale (i.e., belief one can succeed), a Mental Effort scale (i.e., cognitive effort to complete a task), and Time-on-Task. With the exception of Time-on-Task, which was a self-report measure, all items were answered using a 7-point response scale. The questions included in the Learner's scale were inspired by previous work done in this field [17][18][19][20][21][22][23]. Specific reliability and validity testing of the items and subscales was not undertaken.

AGREE II perceptions
Participants were asked to rate the usefulness of the AGREE II (for development, reporting, and evaluation) and the User's Manual using a 7-point scale.

Demographics and AGREE II Experience scale
Participants were asked about their backgrounds including experience with the PG enterprise, the original AGREE instrument and the AGREE II.

Outcomes and analyses

Primary measures
Two performance measures served as the primary outcomes. First, the Performance - Distance Function captures the difference between participants' domain scores and those of expert norms. Expert norms were derived by members of the AGREE Next Steps research team who appraised the test PGs used in this study. Four expert appraisers rated each guideline. Mean standardized scores were used to construct the expert performance score profiles. Thus, the measure of distance (i.e., difference in scores between participants and experts) for each AGREE II domain was calculated by squaring the difference between the participants' and the experts' profile domain ratings. A series of one-way analyses of variance was subsequently conducted to examine differences in the distance function as a function of training intervention.
Second, performance was measured by examining the proportion of participants who met minimum performance competencies with the AGREE II tool [14]. A Pass/Fail algorithm designed for another study [14] was used here to calculate the performance level for participants randomized to the condition with the practice PG.
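The distance function described above can be sketched as a per-domain squared difference between a participant's standardized domain scores and the expert mean profile. The domain keys, the scores, and the function name below are hypothetical and serve only to show the arithmetic.

```python
DOMAINS = [
    "scope_and_purpose",
    "stakeholder_involvement",
    "rigour_of_development",
    "clarity_of_presentation",
    "applicability",
    "editorial_independence",
]


def distance_profile(participant, expert_mean):
    """Squared difference between participant and expert scores, per domain."""
    return {d: (participant[d] - expert_mean[d]) ** 2 for d in DOMAINS}


# Invented standardized domain scores (0-100) for one test PG.
participant = dict(zip(DOMAINS, [80, 60, 55, 90, 40, 75]))
expert_mean = dict(zip(DOMAINS, [85, 60, 65, 85, 50, 75]))

distances = distance_profile(participant, expert_mean)
for domain, value in distances.items():
    print(domain, value)
```

Squaring removes the sign of the difference, so over-rating and under-rating relative to the experts count equally, and larger departures are penalized more heavily than small ones.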

Secondary measures
The Learner's scale served as the core secondary measure. To this end, a series of multivariate one-way analyses of variance was conducted to examine differences in participants' satisfaction, self-efficacy, and mental effort as a function of training intervention. A series of analyses of variance was conducted to examine differences in participants' self-reported Time-on-Task and in participants' reported perceptions of the AGREE II.
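For readers less familiar with the analysis, the F statistic of a one-way analysis of variance compares between-group variability with within-group variability. The sketch below computes it from first principles for three groups; the data are invented, the function name is hypothetical, and obtaining a p-value would additionally require the F distribution (for example via `scipy.stats.f_oneway`).

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variability of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variability of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented scores for three training conditions.
tutorial = [1, 2, 3]
tutorial_practice = [2, 3, 4]
manual = [3, 4, 5]
f_stat, df_b, df_w = one_way_anova_f([tutorial, tutorial_practice, manual])
print(f_stat, df_b, df_w)
```

An F value near 1 indicates that group means differ about as much as would be expected from within-group noise alone.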

Results
There were no changes to any of the outcomes once the trial commenced.

Participants (Table 1 and Figure 2)
Letters of invitation were sent to 107 individuals, of whom 87 were eligible to participate (12 were excluded based on past experience with the AGREE Instrument and eight did not respond to the letter of invitation). Sixty participants completed the study (response rate = 69%), 20 per condition. The majority of participants were female, between the ages of 25 and 65, and had some level of health methods training.

Performance - distance function (Table 2)
There were no significant differences in any of the domain distance functions between the three training groups (p > 0.05 for all comparisons).

Performance - pass/fail criteria
Eighty-six percent of the individuals in the Tutorial + Practice Exercise training intervention arm passed the online training with the practice PG.

Training satisfaction and self-efficacy (Table 3)
Participants reported high levels of training satisfaction (means 6.0+) and self-efficacy (means 5.4+). There were no significant differences in any measure as a function of training condition (p > 0.05 for all comparisons). The Tutorial, Tutorial + Practice Exercise, and review of the PDF training options were recommended by 80%, 60%, and 60% of participants, respectively (p > 0.05 for all comparisons).

Mental effort (Table 4)
The multivariate analysis of variance failed to show a difference in participants' reporting of mental effort as a function of training condition. With the exception of one measure (the AGREE II was mentally demanding), the univariate analyses of variance also failed to show significant differences.

Time-on-task (Table 5)
There were no significant differences as a function of training condition in either the time spent by participants reviewing the PDF version of the AGREE II or the time taken to complete the test PG (p > 0.05 for all comparisons).

AGREE II perceptions (Table 6)
Participants reported favourable perceptions about the AGREE II as a tool to facilitate the development, reporting, and evaluation of PGs; they also reported favourable perceptions about the AGREE II User's Manual in enhancing skills with its application. No significant differences were found for any outcome as a function of training intervention condition.

Discussion
In this study, we tested two internet-based electronic training interventions against a traditional training method using a PDF version of the User's Manual to determine their effects on various measures related to performance on and attitudes toward the AGREE II. The goal was to identify the best strategy to facilitate the AGREE II's appropriate and effective uptake by its stakeholders. In contrast to our hypotheses, participants randomized to the training condition that included the Tutorial + Practice Exercise did not demonstrate superior performance with the AGREE II, greater satisfaction with the training experience, higher levels of self-efficacy, or more positive attitudes toward the tool than did participants randomized to the other two conditions. One potential explanation is that our randomization did not work properly and that the groups differed in participants' experience with health research methodology and/or the AGREE or the AGREE II. Our demographic data (see Table 1) suggest that participants allocated to the control condition may have been more likely than participants in the other conditions to have had minimal exposure, rather than no exposure, to the tools. The inclusion of direct pretest measures to more accurately capture guideline performance before training exposure and to ensure baseline characteristics of the participants do not vary on this factor may be warranted in future studies.
A second potential explanation for our findings is that our interventions did not work. This explanation, however, is not well supported. First, each intervention arm aligned with design characteristics found in other studies and systematic reviews to be effective training features, such as immediate feedback, interactivity, and repetition [15,16]. Second, although the data are subjective, they do show that participants liked all of our interventions; for example, satisfaction and self-efficacy measures were extremely high, well above the mid-point of the 7-point response scale. One may therefore conclude that our control condition (i.e., review of the PDF version of the AGREE II only) was very effective and that there is a ceiling effect on performance measures and other outcomes.
Exploring these conclusions further, a significant component in the revision of the AGREE II was the reworking of the User's Manual and its written training resource component. As described, the document provides descriptions, examples, and explicit direction for how to evaluate a PG report using AGREE II. The comprehensive nature of the PDF version of the AGREE II User's Manual may be quite sufficient for many potential users. In fact, previous research, like this study, found high support for the User's Manual among participants [13].
While this study failed to demonstrate superiority of the online electronic training interventions, we do not believe they should be abandoned altogether. While we were successful in screening participants so that they had little-to-no experience with the AGREE II or the original version of the tool, virtually all participants had some experience in health methods (e.g., systematic review, critical appraisal) and many had experience with the PG enterprise (see Table 1). This selection bias may represent a limitation to the study that also compromises the interpretability of the findings. Specifically, it may be that the online training interventions would be of benefit to the truly novice participant: individuals with no experience with the AGREE II, PGs in general, or health research methodology, for example, trainees and students in the field of health services research.
There are some previous data to support this. In the separate project that developed the pass-fail algorithm used in this study, most of the participants were trainees early in their post-graduate careers with considerably less experience in health methods or PGs. In contrast to the pass rate of 86% reported in this study, the initial pass rate for those participants was 73%, suggesting the training may be better suited for novice users. Future research studies recruiting these types of participants are warranted. Indeed, educational research supports the notion of adapting instructional methods based on individual differences in prior knowledge. In general, the literature suggests that good instructional design techniques may be of more importance for low prior knowledge than for high prior knowledge learners [19,22]. Redundant content should usually be eliminated for more experienced learners. It is possible that the more knowledgeable learners in our study experienced unnecessary extra cognitive load from the additional e-learning instructional interventions, when the control materials of the User's Manual were sufficient. There may even be expertise reversal effects, where a given instructional method that works well for novice learners [24] is less effective or even detrimental for individuals with more expertise [25]. In this study, it is possible that either the ceiling effect or the detrimental effects of redundancy led to no difference from the control condition. Further investigation is required to assess whether efficient instruction on the AGREE II for more advanced learners will require different methods than training designed for entry-level learners.
In summary, our study did not demonstrate that our two online AGREE II electronic training interventions improved outcomes over the control condition. We believe this can be explained in part by the better than expected performance of the control condition (i.e., the current standard of the PDF AGREE II, namely the User's Manual) and in part by the level of experience among the participants with health methods and PGs. Future research may demonstrate that the two online training interventions are best suited to, and effective for, very novice users who are new to the area of PGs and the AGREE II Enterprise. The training interventions are available through the AGREE Enterprise Web site [26].