Psychometric properties of implementation measures for public health and community settings and mapping of constructs against the Consolidated Framework for Implementation Research: a systematic review
Implementation Science volume 11, Article number: 148 (2016)
Recent reviews have synthesised the psychometric properties of measures developed to examine implementation science constructs in healthcare and mental health settings. However, no reviews have focussed primarily on the properties of measures developed to assess innovations in public health and community settings. This review identified quantitative measures developed in public health and community settings, examined their psychometric properties, and described how the domains of each measure align with the five domains and 37 constructs of the Consolidated Framework for Implementation Research (CFIR).
MEDLINE, PsycINFO, EMBASE, and CINAHL were searched to identify publications describing the development of measures to assess implementation science constructs in public health and community settings. The psychometric properties of each measure were assessed against recommended criteria for validity (face/content, construct, criterion), reliability (internal consistency, test-retest), responsiveness, acceptability, feasibility, and revalidation and cross-cultural adaptation. Relevant domains were mapped against implementation constructs defined by the CFIR.
Fifty-one measures met the inclusion criteria. The majority of these were developed in schools, universities, or colleges and other workplaces or organisations. Overall, most measures did not adequately assess or report psychometric properties. Forty-six percent of measures using exploratory factor analysis reported >50 % of variance was explained by the final model; none of the measures assessed using confirmatory factor analysis reported root mean square error of approximation (<0.06) or comparative fit index (>0.95). Fifty percent of measures reported Cronbach’s alpha of <0.70 for at least one domain; 6 % adequately assessed test-retest reliability; 16 % of measures adequately assessed criterion validity (i.e. known-groups); 2 % adequately assessed convergent validity (r > 0.40). Twenty-five percent of measures reported revalidation or cross-cultural validation. The CFIR constructs most frequently assessed by the included measures were relative advantage, available resources, knowledge and beliefs, complexity, implementation climate, and other personal resources (assessed by more than ten measures). Five CFIR constructs were not addressed by any measure.
This review highlights gaps in the range of implementation constructs that are assessed by existing measures developed for use in public health and community settings. Moreover, measures with robust psychometric properties are lacking. Without rigorous tools, the factors associated with the successful implementation of innovations in these settings will remain unknown.
In the field of implementation science, a considerable number of theories and frameworks are being used to better understand implementation processes and guide the development of strategies to improve the implementation of health innovations [1–3]. Many of these theories and frameworks, however, have not been tested empirically. As such, examining the utility of theories and frameworks has been recognised as critical to advance the field of implementation science.
The assessment of implementation theories and frameworks necessitates robust measures of their theoretical constructs. Psychometric properties important for measures of implementation research have been proposed and include the following: reliability (internal consistency and test-retest); validity (construct and criterion); broad application (validated in different settings and cultures); and sensitivity to change (responsiveness). Tools which are acceptable, feasible, and display face and content validity are also particularly useful for researchers in real-world settings. Furthermore, the psychometric characteristics of measures that assess a comprehensive range of implementation constructs have been highlighted as a particular priority area of research.
A number of reviews of implementation measures exist [6–13]. Such reviews indicate that the quality of existing measures of implementation constructs is limited. A review by Brennan and colleagues, for example, identified 41 instruments designed to assess factors hypothesised to influence quality improvement in primary care. The review found that while most studies reported the internal consistency of instruments, very few assessed the construct validity of the measures using factor analysis. Similarly, in a review of the psychometric properties of research utilisation measures used in health care, Squires and colleagues found that, of the 97 identified studies (60 unique measures), only 31 reported internal consistency and only 3 reported test-retest reliability. Twenty percent of the included measures had not undergone any type of validity testing, and no studies reported on measure acceptability.
There are a number of limitations of previous reviews. Most do not provide comprehensive details of the psychometric properties of included measures [7, 8, 12] or address only a small number of constructs or outcomes relevant to implementation science [8, 10]. Additionally, the majority of these reviews primarily focus on measures developed for use in healthcare settings [6, 9, 11, 13]. Evidence from the field of psychometric research has suggested that, even when administered to similar population groups, changes in measure reliability and validity can occur when a measure developed in one setting is applied to another setting with different characteristics [14, 15].
Currently, a comprehensive review of measures of implementation constructs is being conducted by the Society for Implementation Research Collaboration (SIRC) Instrument Review Project [16, 17]. The SIRC review addresses some of the limitations of past reviews by extracting a range of psychometric properties from identified measures and assessing a more comprehensive range of outcomes and constructs relevant to implementation science. The outcomes of interest in the SIRC review are taken from Proctor and colleagues’ Implementation Outcomes Framework (IOF) and focus on the appropriateness, acceptability, feasibility, adoption, penetration, cost, fidelity, and sustainability of the intervention itself. The constructs of interest for the review are drawn from the Consolidated Framework for Implementation Research (CFIR), which outlines factors or conditions deemed important to support the successful implementation of an intervention. The constructs are grouped under five domains which describe the following: (1) Intervention characteristics (details of the intervention itself); (2) Outer setting (factors of influence which are external to an organisation); (3) Inner setting (internal characteristics of an organisation such as culture and learning climate); (4) Characteristics of individuals (actions and behaviours of individuals within the organisation); and (5) Process (systems and pathways within an organisation).
To date, the SIRC review has uncovered 420 instruments related to 34 of the CFIR constructs and 104 instruments related to Proctor and colleagues’ IOF [16, 17]. At present, the data are available for the measures relevant to the inner setting domain of the CFIR and the IOF. However, while comprehensive, the SIRC review only pertains to measures primarily applied to healthcare or mental health care settings, where the individuals responsible for implementing health-related interventions are most likely to be healthcare professionals [16, 17]. In the field of public health, the implementation of health-related interventions often occurs in non-clinical settings, with non-healthcare professionals responsible for implementing these changes. Therefore, there is a need to identify measures which have been developed specifically to measure constructs important for the implementation of health-related interventions in community settings, where the primary role of the organisations and individuals is not healthcare delivery.
To our knowledge, no previous reviews of measures of implementation constructs have focussed on instruments designed for use in a broad range of community settings. Such measures are of particular interest to public health researchers who are utilising implementation theories or frameworks to support evidence-based practice in these settings. As such, the aim of this study was to (1) systematically review the literature to identify measures of implementation constructs which have been developed in community settings; (2) describe each measure’s psychometric properties; and (3) describe how the domains of each measure align with the five domains and 37 constructs of the CFIR.
Scope of this review
The focus of this review was to identify, from peer-reviewed literature, measures which have been developed for use in community-based (non-clinical) settings, and which measure constructs aligned to the CFIR. These measures were then examined to determine their psychometric properties and identify which of the CFIR constructs they captured. In this review, ‘measures’ are defined as surveys, questionnaires, instruments, tools, or scales which contain individual items that are answered or scored using predefined response options. ‘Constructs’ are defined as the broad attributes or characteristics which these items (usually grouped into domains) are attempting to capture. The constructs of interest were chosen to align with the CFIR, as this framework is the most comprehensive: it draws together numerous theories developed to guide the planning and evaluation of implementation research and combines them into one uniform theory with overarching domains.
A systematic search and review was conducted to address the broad question of ‘what psychometrically robust measures are currently available to assess implementation research in public health and community settings’. A comprehensive search of peer-reviewed publications was conducted using four electronic databases and the quality of identified measures was assessed using well-established, pre-defined psychometric criteria.
Publications were included if they (1) were peer-reviewed journal articles reporting original research results; (2) reported research from non-clinical settings; (3) reported details regarding the development of a measure; (4) described a measure which assessed at least one of the 37 CFIR constructs; (5) described a measure which was being applied to a specific innovation or intervention; and (6) used statistical methods to assess the measures’ factor structure.
In this review, clinical settings included the following: hospitals, general practices, allied health facilities such as physiotherapy or dental practices, rehabilitation centres, psychiatric facilities, and any other settings where the delivery of health or mental health care was the primary focus. Non-clinical settings included schools, universities, private businesses, childcare centres, correctional facilities, and any other settings where the delivery of health or mental health care was not the primary focus. Given that an aim of the study was to map the domains of included measures against constructs within the CFIR, it was important that measures displayed a minimum level of construct validity via exploratory or confirmatory factor analysis.
Duplicate abstracts were excluded from the review, as were abstracts describing reviews, editorials, commentaries, protocols, conference abstracts, and dissertations. Publications which reported on measures developed using qualitative methods only were also ineligible.
A search of MEDLINE, PsycINFO, EMBASE, and CINAHL databases was conducted to identify publications describing the development of measures to assess factors relevant to the implementation of innovations. These four databases were selected as they index journals from the field of implementation science and provide extensive coverage of research across a range of public health and community settings, such as schools, pharmacies, businesses, nursing homes, sporting clubs, and childcare facilities.
Prior to the database searches being conducted, four authors met to ensure that the chosen keywords accurately captured the constructs of interest and that keywords were combined using the correct Boolean operators. The core search terms comprised keywords that related to measurement, the psychometric properties of instruments, the levels at which the measurement could occur (e.g. organisational or individual) and the goals of research implementation. These keywords were as follows: [questionnaire or measure or scale or tool] AND [psychometric or reliability or validity or acceptability] AND [organisation* or institut* or service or staff or personnel] AND [implement* or change or adopt* or sustain*].
Similar to the strategy used in the SIRC review [16, 17], the core search terms were combined with five more keyword searches designed to capture the constructs within each of the five CFIR domains: (1) Intervention Characteristics [strength or quality or advantage or adapt* or complex* or pack* or cost]; (2) Outer Setting [needs or barrier* or facilitate* or resource* or network or external or peer or compet* or poli* or regulation* or guideline* or incentive*]; (3) Inner Setting [structur* or communication or cultur* or value* or climate or tension or risk* or reward* or goal* or feedback or commitment or leadership or knowledge*]; (4) Characteristics of Individuals [belief* or attitude* or self-efficacy or skill* or identi* or trait* or ability* or motivat*]; or (5) Process [plan* or market or train or manager or team or champion or execut* or evaluat*].
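The way the core terms and the domain-specific terms fit together can be sketched in a few lines of Python. This is an illustration only: the keyword lists are copied from the description above, but the function names, the parenthesised `OR`/`AND` syntax, and the assumption that each CFIR domain block was combined with the core terms by `AND` in a separate search are ours, and real database interfaces each have their own query syntax.

```python
# Core keyword blocks as listed in the search description (terms within a
# block are joined with OR, blocks are joined with AND).
CORE_BLOCKS = [
    ["questionnaire", "measure", "scale", "tool"],
    ["psychometric", "reliability", "validity", "acceptability"],
    ["organisation*", "institut*", "service", "staff", "personnel"],
    ["implement*", "change", "adopt*", "sustain*"],
]

# One additional keyword block per CFIR domain.
DOMAIN_BLOCKS = {
    "Intervention Characteristics": [
        "strength", "quality", "advantage", "adapt*", "complex*", "pack*", "cost",
    ],
    "Outer Setting": [
        "needs", "barrier*", "facilitate*", "resource*", "network", "external",
        "peer", "compet*", "poli*", "regulation*", "guideline*", "incentive*",
    ],
    "Inner Setting": [
        "structur*", "communication", "cultur*", "value*", "climate", "tension",
        "risk*", "reward*", "goal*", "feedback", "commitment", "leadership",
        "knowledge*",
    ],
    "Characteristics of Individuals": [
        "belief*", "attitude*", "self-efficacy", "skill*", "identi*", "trait*",
        "ability*", "motivat*",
    ],
    "Process": [
        "plan*", "market", "train", "manager", "team", "champion", "execut*",
        "evaluat*",
    ],
}


def or_block(terms):
    """Join the terms of one block with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(domain):
    """Hypothetical helper: core blocks AND the block for one CFIR domain."""
    blocks = CORE_BLOCKS + [DOMAIN_BLOCKS[domain]]
    return " AND ".join(or_block(block) for block in blocks)


# One query string per CFIR domain (five in total).
queries = {domain: build_query(domain) for domain in DOMAIN_BLOCKS}
```

Printing `queries["Process"]` shows the four core blocks followed by the Process block, which makes it easy to check that no block has been dropped or joined with the wrong operator.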
The keyword search terms were repeated for all four databases. Keyword searches were limited to the English language; however, no limit was placed on the year of publication, as measurement tools often evolve over many years. Medical Subject Headings (MeSH) were not used in the literature search, as keyword searches have been found to have higher sensitivity, being more successful than subject searching in identifying relevant publications.
Identification of eligible publications
One author coded all abstracts according to the inclusion and exclusion criteria. A second author cross-checked 10 % of the abstracts to confirm they had been correctly classified. Full-text versions of publications were obtained for included abstracts. To ensure that no relevant tools had been missed, previous systematic reviews [7, 8, 10] were also screened for relevant measures, as were tools included on the SIRC Instrument Review Project website. Copies of publications for any additional measures that met the inclusion criteria were obtained. Full-text versions of all eligible publications were then obtained and screened to identify the names and acronyms of all relevant measures they described. The reference lists of all eligible publications were also screened for any additional measures, and Google Scholar was used to conduct cited reference searches. A final literature search was conducted by ‘measure name’ and ‘author names’, using Google Scholar. This search strategy ensured that as many publications as possible were found that related to the psychometric development and validation and revalidation and cross-cultural adaptation of identified measures.
Extraction of data from eligible publications
The properties of each measure were extracted from all full-text publications relating to the development of the measure using data reported in the manuscript text, tables, or figures. Extracted data included: (1) the research setting, sample, and characteristics of the intervention or innovation being assessed; (2) psychometric properties including face and content validity, construct and criterion validity, internal consistency, test-retest reliability, responsiveness, acceptability, and feasibility; and (3) whether the measure had undergone a process of revalidation or cross-cultural adaptation.
The psychometric properties of each measure were independently assessed by two authors using the same criteria described in previous systematic reviews [23, 24] and according to the guidelines for the development and use of tests, including the Standards for Educational and Psychological Testing [5, 25, 26]. The Standards provides a frame of reference to ensure all relevant issues are addressed when developing a measure and allows the quality of measures to be evaluated by those who wish to use them. Following the assessment of psychometric properties, two authors then independently coded each publication to determine which measure domains corresponded with which CFIR constructs. When discrepancies emerged, a third author assisted in reaching consensus.
Setting, sample, and characteristics of the innovation being assessed
Details regarding the country and setting where the measure was developed, characteristics of the innovation or intervention being assessed, response rate, sample size, and demographic characteristics of the sample (gender and profession) who completed the measure were described.
Face and content validity
An instrument is said to have face validity if both the administrators and those who complete it agree that it measures what it was designed to measure. To have content validity, the description of the measure’s development needed to include: (1) the process by which items were selected; (2) who assessed the measure’s content; and (3) what aspects of the measure were revised [14, 28]. Information regarding any theories or frameworks that the measure was developed to test, as well as whether items were adapted from previously validated measures, was also extracted.
Construct and criterion validity
A measure was classified as having good internal structure (construct validity) if exploratory factor analysis (EFA) was performed with eigenvalues set at >1 [14, 29] and >50 % of the variance was explained, or confirmatory factor analysis (CFA) was performed with a root mean square error of approximation (RMSEA) of <0.06 and a comparative fit index (CFI) of >0.95 [31, 32]. The number of items and domains in the measure following factor analysis was recorded. Additional construct validity was determined by assessing whether the measure had convergent validity (correlations (r) >0.40) with similar instruments or divergent validity (correlations (r) <0.30) with dissimilar instruments. Criterion validity was determined by assessing whether the measure was able to obtain different scores for sub-populations with known differences (known-groups validity).
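The construct validity thresholds above can be written as simple checks, sketched here in Python. The function names and input format are hypothetical and were not part of the review's procedure; comparing the absolute value of the correlation is also our assumption, made so that negative correlations are handled.

```python
def efa_meets_criteria(min_retained_eigenvalue, variance_explained_pct):
    """EFA criterion: factors retained at eigenvalue > 1 and > 50 % variance explained."""
    return min_retained_eigenvalue > 1.0 and variance_explained_pct > 50.0


def cfa_meets_criteria(rmsea, cfi):
    """CFA criterion: RMSEA < 0.06 and CFI > 0.95."""
    return rmsea < 0.06 and cfi > 0.95


def has_convergent_validity(r):
    """Convergent validity: correlation with a similar instrument above 0.40.

    Using abs(r) is an assumption of this sketch.
    """
    return abs(r) > 0.40


def has_divergent_validity(r):
    """Divergent validity: correlation with a dissimilar instrument below 0.30.

    Using abs(r) is an assumption of this sketch.
    """
    return abs(r) < 0.30


# Illustrative values only, not data from any included measure.
example_efa = efa_meets_criteria(min_retained_eigenvalue=1.2, variance_explained_pct=55.0)
example_cfa = cfa_meets_criteria(rmsea=0.05, cfi=0.96)
```

Note that all comparisons are strict, so a model explaining exactly 50 % of the variance, or with an RMSEA of exactly 0.06, would not meet the criterion as stated.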
Internal consistency and test-retest reliability
To meet the criteria for internal consistency, correlations for a measure’s subscales and total scale needed to have a Cronbach’s alpha (α) of >0.70 or a Kuder-Richardson 20 (KR-20) of >0.70 for dichotomous response scales. For test-retest reliability, the measure needed to have undergone a repeated administration with the same sample within 2–14 days. Agreement between scores from the two administrations needed to be calculated, with item, subscale, and total scale correlations having (1) a Cohen’s kappa coefficient (κ) of >0.60 for nominal or ordinal response scales; (2) a Pearson correlation coefficient (r) of >0.70 for interval response scales [14, 28]; or (3) an intraclass correlation coefficient (ICC) of >0.70 for interval response scales [14, 28].
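For readers less familiar with the internal consistency statistic, the following Python sketch computes Cronbach's alpha from a small response matrix using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores), and then applies the >0.70 threshold and the 2–14 day retest window. The function names and the example data are hypothetical; the included studies reported alpha themselves and it was not recomputed in this review.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for rows of respondents and columns of items.

    Uses sample variances throughout; needs at least two items and two respondents.
    """
    k = len(scores[0])                      # number of items
    items = list(zip(*scores))              # transpose: one tuple per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


def meets_internal_consistency(alpha, threshold=0.70):
    """Internal consistency criterion: alpha (or KR-20) above 0.70."""
    return alpha > threshold


def retest_interval_acceptable(days):
    """Test-retest criterion: second administration within 2 to 14 days."""
    return 2 <= days <= 14


# Made-up responses: four respondents answering three perfectly consistent items.
example_scores = [
    [1, 1, 1],
    [2, 2, 2],
    [3, 3, 3],
    [4, 4, 4],
]
example_alpha = cronbach_alpha(example_scores)
```

Because the three example items are identical, the result is the upper limit of the statistic; real scales fall below this, and a domain counts as adequate here only when its alpha exceeds 0.70.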
Responsiveness, acceptability, feasibility, revalidation, and cross-cultural adaptation
A measure’s potential to detect change over time was confirmed if it could show a moderate effect size (>0.5) for a given change [14, 28, 36], and if it had minimal floor and ceiling effects (less than 5 % of the sample achieved the highest or lowest scores). To determine acceptability and feasibility (burden associated with using the measure), data on the following were extracted: proportion of missing items, time needed to complete, and time needed to interpret and score. Data from publications reporting the revalidation of a measure with additional samples, or in different languages or cultures, were also extracted.
The domains of each included measure were assessed to determine whether the factors they measured corresponded with one or more of the 37 CFIR constructs. A brief summary of each of the CFIR constructs is presented in Additional file 1. The mapping process was domain-focused (i.e. mapping the overall measure domains to constructs) rather than item-focused (i.e. mapping individual items to constructs) to ensure that the overall construct was well captured. Within a measure, only one domain needed to be judged by the reviewers to address a CFIR construct. Therefore, it was possible that a measure with five domains might only have one of its domains mapped to a CFIR construct. Similarly, a measure with three domains might have all contributing to the same CFIR construct. In the latter scenario, the construct was only counted once.
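The responsiveness criteria can be illustrated with a short Python sketch. The review does not specify which effect size statistic each study used, so the standardised mean change below (mean difference divided by the baseline standard deviation) is an assumption chosen for illustration, as are the function names and example scores.

```python
from statistics import mean, stdev


def standardised_effect_size(baseline, follow_up):
    """Mean change divided by the baseline standard deviation (assumed statistic)."""
    return (mean(follow_up) - mean(baseline)) / stdev(baseline)


def floor_ceiling_proportions(scores, min_score, max_score):
    """Proportion of respondents at the lowest and at the highest possible score."""
    n = len(scores)
    floor = sum(score == min_score for score in scores) / n
    ceiling = sum(score == max_score for score in scores) / n
    return floor, ceiling


def has_minimal_floor_ceiling(scores, min_score, max_score, limit=0.05):
    """Criterion: fewer than 5 % of respondents at either extreme."""
    floor, ceiling = floor_ceiling_proportions(scores, min_score, max_score)
    return floor < limit and ceiling < limit


# Made-up scores for five respondents before and after a hypothetical intervention.
baseline_scores = [10, 12, 14, 16, 18]
follow_up_scores = [12, 14, 16, 18, 20]
example_effect = standardised_effect_size(baseline_scores, follow_up_scores)
is_responsive = example_effect > 0.5
```

In practice both checks would be needed: a measure can show a large effect size and still be compromised if many respondents already sit at the top or bottom of the scale.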
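The counting rule described above (a construct is counted once per measure, however many of that measure's domains map to it) can be sketched as follows. The measure names, domain names, and mappings are invented for illustration and do not correspond to any measure included in the review.

```python
from collections import Counter

# Hypothetical reviewer judgements: measure -> domain -> mapped CFIR constructs.
domain_mappings = {
    "Measure A": {  # five domains, only one maps to a CFIR construct
        "domain 1": ["relative advantage"],
        "domain 2": [],
        "domain 3": [],
        "domain 4": [],
        "domain 5": [],
    },
    "Measure B": {  # three domains, all mapping to the same construct
        "domain 1": ["available resources"],
        "domain 2": ["available resources"],
        "domain 3": ["available resources"],
    },
}


def constructs_per_measure(mappings):
    """Collapse each measure's domains into a set, so duplicates count once."""
    return {
        measure: {c for constructs in domains.values() for c in constructs}
        for measure, domains in mappings.items()
    }


def measures_per_construct(mappings):
    """Number of measures that address each CFIR construct at least once."""
    counts = Counter()
    for constructs in constructs_per_measure(mappings).values():
        counts.update(constructs)
    return counts


construct_counts = measures_per_construct(domain_mappings)
```

Using a set per measure is what enforces the "counted once" rule: Measure B contributes one count to available resources even though three of its domains map there.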
Descriptive statistics (frequencies and proportions) were used to report the number of domains from the included measures which were mapped to each of the CFIR constructs and CFIR domains. Frequencies and proportions were also used to describe the number of measures which met various psychometric criteria.
Identified measures of implementation constructs
The initial searches of MEDLINE, PsycINFO, EMBASE, and CINAHL identified 8547 potentially relevant publications. Of these, 5195 were duplicates, leaving 3352 publication abstracts to be coded. Of these 3352 publications, 3317 did not meet the inclusion criteria (see Fig. 1 for PRISMA diagram), leaving 35 eligible publications. The process of identifying measures included in systematic reviews related to the current review [7, 8, 10], and a secondary literature search by measure or author name, led to the inclusion of an additional 30 publications. A total of 65 full-text publications were retained which described 51 unique measures.
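The screening flow reported above is internally consistent, which can be checked with a few lines of arithmetic. The variable names are ours; the numbers are those stated in the text.

```python
# Counts as reported for the screening process.
identified = 8547            # records from the four database searches
duplicates = 5195            # duplicate records removed
excluded_on_criteria = 3317  # abstracts not meeting the inclusion criteria
additional = 30              # publications added from other reviews and name searches

abstracts_coded = identified - duplicates
eligible_from_search = abstracts_coded - excluded_on_criteria
full_text_retained = eligible_from_search + additional

print(abstracts_coded, eligible_from_search, full_text_retained)
```

The three derived values match the 3352 abstracts coded, 35 eligible publications, and 65 retained full-text publications given in the text.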
Psychometric properties of measures
Setting, sample, and characteristics of the innovation being assessed
Table 1 outlines the details of the setting, sample, and characteristics of the innovation being assessed by each measure. The majority of measures were developed in the USA (n = 28), with Canada and Australia also having developed three or more measures each. Sixteen measures were developed for use in school settings [38–52], six for use in universities or colleges [53–60], three for use in pharmacies [61–63], two for use in police or correctional facilities [64, 65], two for use in nursing homes [66, 67], six for use with whole communities or in multiple settings [68–75], and sixteen measures were developed for use in workplace settings or other organisations (e.g. utility companies, IT service providers, human services) [76–92]. A broad range of innovations or interventions were assessed, with technology-focussed innovations featuring prominently. Sample sizes in each study ranged from 31 to 1358, and response rates ranged from 15 to 98 %. Sample characteristics (i.e. gender and profession of participants) were inconsistently reported across the studies.
Face and content validity
Almost all measures (n = 47) had undergone a process of face and content validation. The development of 36 measures was guided by an existing theory or framework (Additional file 2). No measures were specifically designed to address all constructs considered important for the implementation of innovations by the CFIR. Twenty-six measures had adapted at least some of their items from pre-existing instruments (Additional file 2).
Construct and criterion validity
The internal structure of 45 instruments was determined via EFA (11 of these also used CFA [42, 49, 52, 54, 55, 59, 65, 67, 77, 78, 82, 91–93]), and six studies used CFA alone [39, 40, 68, 72, 75, 83, 94] (Additional file 3). For studies which conducted EFA, 46 % reported that >50 % of the variance was explained by the final factor model. None of the studies that used CFA alone reported acceptable RMSEA (<0.06) or CFI (>0.95). Across all measures, the number of items ranged from 9 to 149, and the number of factors (domains) ranged from 1 to 20. Eight measures were tested for criterion validity for sub-populations with known differences. These measures demonstrated capacity to distinguish between a number of groups with known differences, including the amount of teaching experience, familiarisation with technology, age, and managers and non-managers. Only two measures [41, 82] reported testing for convergent/divergent validity against existing instruments, although only one met the required threshold of having significant positive or negative correlations >0.40 or <0.30 with an external measure. In this instance, these relationships were only reported for some individual domains rather than the total score of the scale.
Internal consistency and test-retest reliability
Fifty of the 51 included measures reported on the internal consistency of either the total scale or the individual domains (Additional file 4). The internal consistency of both the total scale and the domains was reported for four measures [40, 61, 66, 76], the internal consistency of the total scale only was reported for five measures (all alphas >0.70) [47, 49, 51, 75, 83], and the internal consistency for the scale domains only was reported for the remaining 41 measures. Twenty measures achieved a Cronbach’s alpha of >0.70 for all of their domains [38, 40, 41, 48, 50–52, 54, 59, 60, 63, 76, 79, 81, 84, 85, 87, 89, 90, 95, 96], indicating that more than 50 % of measures did not meet the acceptable threshold for at least one domain. Three measures were examined for test-retest reliability [47, 73, 84]. The administration period was acceptable (2–14 days) for all measures, and adequate test-retest reliability (Pearson’s correlations >0.70) was achieved for all measures, with the exception of one domain (awareness, r = 0.65) in the Stages of Concern Questionnaire.
Responsiveness, acceptability, feasibility, revalidation, and cross-cultural adaptation
Seventeen measures reported acceptability and feasibility, with five studies reporting the time that it took to complete the measure (range 10–70 min; M = 34.6 min) [39, 64, 73, 81, 90] and six studies reporting the proportion of missing items observed following the measure administration (range 1.5–5 %) [52, 59, 63, 67, 75, 84] (Additional file 5). Seven studies examined responsiveness in relation to effect sizes [38, 47, 67, 69, 75, 93, 97], and all but one reported an effect size above the threshold criterion of 0.5, indicating that these measures are capable of detecting moderate size change (Additional file 5). No studies reported floor or ceiling effects. Thirteen measures were revalidated in new settings and with different populations across a number of additional studies [55, 77, 91, 96, 98–112].
A summary of the psychometric criteria reported by the included measures can be seen in Table 2.
Mapping of measure domains that align with the 37 constructs of the CFIR
The number of measure domains that mapped onto the CFIR constructs ranged from 1 to 19. Relative advantage, networks and communications, culture, implementation climate, learning climate, readiness for implementation, available resources, and reflecting and evaluating were the constructs most frequently addressed by the included measures. Five of the CFIR constructs were not addressed by any measure (Additional file 6). These five constructs were as follows: intervention source, tension for change, engaging, opinion leaders, and champions.
To our knowledge, this is the first systematic review to describe the psychometric properties of measures developed to assess innovations and implementation constructs specifically in public health and community settings. Overall, the psychometric properties of included measures were typically inadequately assessed or not reported. No single measure reported on all key psychometric quality indicators. The majority of studies assessed face, content, construct validity, and internal consistency. However, criterion validity (known-groups), test-retest reliability, and acceptability and feasibility were rarely reported. Only seven measures had responsiveness to change assessed. These findings mirror those of previous reviews [7, 13] that found that few measures demonstrated test-retest reliability, acceptability, or criterion validity.
When measures did report psychometric data, it was typically below the widely accepted thresholds defined in this review. Almost half of the measures that reported undertaking EFA reported that their final factor model explained <50 % of the variance. Furthermore, none of the measures that used CFA alone reported satisfactory RMSEA (<0.06) or CFI (>0.95). This suggests that a notable proportion of the implementation measures currently available for use in non-clinical settings are not particularly robust or are prone to misspecification of fit. That only eight of the 51 measures explored criterion validity using known-groups is also concerning. The lack of attention to known-groups validity limits the confidence we can place in these measures being able to detect how groups within community settings (e.g. experienced teachers vs. new teachers) vary with regard to implementation of innovation. This is important for identifying which aspects of an intervention or innovation might need to be adjusted to ensure more robust implementation in the future.
Internal consistency was frequently reported but only 40 % of measures reported that all scale domains had a Cronbach’s alpha >0.70, highlighting a need for further refinement of scale items and revalidation. Only three measures assessed test-retest reliability, another area requiring much greater attention in future studies. Those studies that did assess test-retest reliability performed well, meeting the vast majority of threshold criteria. However, the stability of these types of measures over time remains unclear. Acceptability and feasibility data were reported for just 33 % of the measures. Mean completion time for measures was almost 35 min. Although shorter questionnaires have been shown to improve response rates, it is unclear what the optimal survey length is while still maintaining the survey validity. Rates of missing data ranged from 1.5 to <5 %, which according to Schafer is acceptable given missing data rates of less than 5 % are likely to be inconsequential. Only 25 % of measures had been revalidated or validated in a different culture. This limits the generalisability of the measures and poses a significant barrier to research translation within potentially underserved communities or cultures.
Without more comprehensive assessment of the psychometric properties of these instruments, the ability to ascertain the utility of theories or frameworks to support the implementation of innovations in public health and community settings is limited. For example, understanding the responsiveness of measures is essential for evaluating implementation interventions and ensuring that changes in constructs over time can be detected [116, 117]. Having measures which are acceptable and feasible is also important to the conduct of rigorous research, particularly in more pragmatic research studies [5, 18]. Low survey response rates or high rates of attrition due to onerous research methods can introduce bias and compromise study internal and external validity [118, 119].
Alignment of measure domains with constructs of the CFIR
While some of the CFIR constructs were addressed by domains from multiple measures in this study, five constructs were not assessed by any measure. These were intervention source, tension for change, engaging, opinion leaders, and champions. The development of psychometrically robust measures which can assess these constructs in public health and community settings may be a priority area of research for the field.
The most frequently addressed constructs appeared to fall within the ‘inner setting’ and ‘characteristics of individuals’ domains, suggesting that the focus of measures to date has been on understanding only the immediate environment where the innovation or intervention will be implemented. Measures addressing ‘outer setting’ or ‘process’ constructs appeared less frequently than measures addressing constructs from the other domains. The development of future measures should target these domains of the CFIR to ensure a greater breadth and depth of understanding of all factors which may influence the implementation of evidence into practice in public health and community settings.
Comparison of the current review with the SIRC Instrument Review Project
Despite the similarity in review methodologies utilised by the current review and that undertaken by SIRC, few measures have been reported by both reviews. This is not surprising: although the SIRC review captured some measures developed in education or workplace settings, other public health and community settings were not addressed. Furthermore, the SIRC review used much broader inclusion criteria with regard to measures of CFIR constructs. For example, for the construct of ‘self-efficacy’, the SIRC review includes all measures of self-efficacy, regardless of the context in which self-efficacy is being examined. In contrast, the current review only includes measures which assess self-efficacy in the context of an individual’s perceived ability to implement the target innovation.
Despite these differences, the use of a common framework (CFIR) for examining constructs captured by different measures in the current review promotes consistency and complements the findings of the SIRC review.
It is possible that not all existing implementation measures in public health and community settings were captured by this review. The keywords used to identify measures were limited to ‘questionnaire’, ‘measure’, ‘scale’, or ‘tool’; other possible terms such as ‘instrument’ and ‘test’ were not used. These terms were excluded due to the likelihood of identifying non-relevant publications related to clinical practice (e.g. surgical instruments, immunologic tests). However, the exclusion of these keywords may have meant that some relevant publications were not identified during the database search. Additionally, the review did not assess measures published in the grey literature, and only studies published in English were included. However, it is likely that those measures which were identified represent the best available evidence, given their publication in peer-reviewed journals and indexing in four scientific databases. The choice of psychometric properties extracted from publications about each measure may have also limited the findings. For example, for studies that utilised CFA, only data pertaining to the RMSEA and CFI were recorded, based on recommendations by Schmitt. Included publications may have reported additional CFA metrics (such as goodness of fit (GFI) or the normed fit index (NFI)); however, these were not included in this review.
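For readers less familiar with the two CFA indices recorded in this review, the sketch below shows how RMSEA and CFI are commonly derived from chi-square statistics and how the review thresholds (RMSEA <0.06, CFI >0.95) could be applied. It is illustrative only: the function names and input values are hypothetical, and the RMSEA denominator uses N − 1, whereas some software uses N.

```python
import math


def rmsea(chi2, df, n):
    """Root mean square error of approximation for a fitted model.

    Assumes the common form sqrt(max(chi2 - df, 0) / (df * (n - 1))).
    """
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model, df_model, chi2_baseline, df_baseline):
    """Comparative fit index relative to a baseline (null) model."""
    model_excess = max(chi2_model - df_model, 0.0)
    baseline_excess = max(chi2_baseline - df_baseline, model_excess)
    if baseline_excess == 0:
        return 1.0  # neither model shows misfit beyond its degrees of freedom
    return 1.0 - model_excess / baseline_excess


def meets_review_criteria(rmsea_value, cfi_value):
    """Apply the thresholds used in this review: RMSEA <0.06 and CFI >0.95."""
    return rmsea_value < 0.06 and cfi_value > 0.95


# Hypothetical fit statistics, for illustration only.
good_fit = meets_review_criteria(rmsea(60, 50, 201), cfi(60, 50, 1050, 50))
poor_fit = meets_review_criteria(rmsea(100, 50, 201), cfi(100, 50, 1050, 50))
```

Both indices depend only on the model and baseline chi-square values, their degrees of freedom, and the sample size, so they can often be recomputed from a publication that reports those quantities even when the indices themselves are omitted.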
Despite these limitations, the findings from this review are likely to be of value to public health researchers who are looking to identify measures with robust psychometric properties that can be used to assess implementation constructs. There are, however, a small number of constructs for which no measure could be identified. Developing measures which can assess these five remaining constructs will be an important consideration for future research.
Existing measures of implementation constructs for use in public health and community settings require additional testing to enhance their reliability and validity. Further research is also needed to revalidate these measures in different settings and populations. At present, no single measure, or combination of measures, can be used to assess all constructs of the CFIR in public health and community settings. The development of new measures which can assess the broader range of implementation constructs across all of the CFIR domains should continue to be a priority for the field.
Abbreviations
CFA: confirmatory factor analysis
CFI: comparative fit index
CFIR: Consolidated Framework for Implementation Research
CINAHL: Cumulative Index to Nursing and Allied Health Literature
EFA: exploratory factor analysis
EMBASE: Excerpta Medica database
GFI: goodness of fit
IOF: Implementation Outcomes Framework
MEDLINE: Medical Literature Analysis and Retrieval System Online
MeSH: Medical Subject Headings
NFI: normed fit index
PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses
RMSEA: root mean square error of approximation
SIRC: Society for Implementation Research Collaboration
Davies P, Walker A, Grimshaw J. A systematic review of the use of theory in the design of guideline dissemination and implementation strategies and interpretation of the results of rigorous evaluations. Implement Sci. 2010;5:14.
Nilsen P. Making sense of implementation theories, models and frameworks. Implement Sci. 2015;10:53.
Tabak RG, Khoong EC, Chambers D, Brownson RC. Bridging research and practice: models for dissemination and implementation research. Am J Prev Med. 2012;43:337–50.
Martinez RG, Lewis CC, Weiner BJ. Instrumentation issues in implementation science. Implement Sci. 2014;9:118.
Rabin BA, Purcell P, Naveed S, Moser RP, Henton MD, Proctor EK, Brownson RC, Glasgow RE. Advancing the application, quality and harmonization of implementation science measures. Implement Sci. 2012;7:119.
Brennan SE, Bosch M, Buchan H, Green SE. Measuring organizational and individual factors thought to influence the success of quality improvement in primary care: a systematic review of instruments. Implement Sci. 2012;7:121.
Chaudoir SR, Dugan AG, Barr CH. Measuring factors affecting implementation of health innovations: a systematic review of structural, organizational, provider, patient, and innovation level measures. Implement Sci. 2013;8:22.
Chor KHB, Wisdom JP, Olin SCS, Hoagwood KE, Horwitz SM. Measures for predictors of innovation adoption. Adm Policy Ment Health. 2014;42:545–73.
Scott T, Mannion R, Davies H, Marshall M. The quantitative measurement of organizational culture in health care: a review of the available instruments. Health Serv Res. 2003;38:923–45.
Weiner BJ, Amick H, Lee SYD. Conceptualization and measurement of organizational readiness for change: a review of the literature in health services research and other fields. Med Care Res Rev. 2008;65:379–436.
King T, Byers JF. A review of organizational culture instruments for nurse executives. J Nurs Adm. 2007;37:21.
Emmons KM, Weiner B, Fernandez ME, Tu SP. Systems antecedents for dissemination and implementation: a review and analysis of measures. Health Educ Behav. 2012;39:87–105.
Squires JE, Estabrooks CA, Gustavsson P, Wallin L. Individual determinants of research utilization by nurses: a systematic review update. Implement Sci. 2011;6:1.
McDowell I. Measuring health: a guide to rating scales and questionnaires. New York: Oxford University Press; 2006.
Hersen M. Clinician’s handbook of adult behavioral assessment. Boston: Elsevier Academic Press; 2006.
Lewis CC, Fischer S, Weiner BJ, Stanick C, Kim M, Martinez RG. Outcomes for implementation science: an enhanced systematic review of instruments using evidence-based rating criteria. Implement Sci. 2015;10:155.
Lewis CC, Stanick CF, Martinez RG, Weiner BJ, Kim M, Barwick M, Comtois KA. The Society for Implementation Research Collaboration Instrument Review Project: a methodology to promote rigorous evaluation. Implement Sci. 2015;10:2.
Proctor E, Silmere H, Raghavan R, Hovmand P, Aarons G, Bunger A, Griffey R, Hensley M. Outcomes for implementation research: conceptual distinctions, measurement challenges, and research agenda. Adm Policy Ment Health. 2011;38:65–76.
Damschroder LJ, Aron DC, Keith RE, Kirsh SR, Alexander JA, Lowery JC. Fostering implementation of health services research findings into practice: a consolidated framework for advancing implementation science. Implement Sci. 2009;4:50.
The SIRC Instrument Review Project (IRP): A systematic review and synthesis of implementation science instruments [http://www.societyforimplementationresearchcollaboration.org/sirc-projects/sirc-instrument-project]
Sampson M, McGowan J, Cogo E, Grimshaw J, Moher D, Lefebvre C. An evidence-based practice guideline for the peer review of electronic search strategies. J Clin Epidemiol. 2009;62:944–52.
Jenuwine E, Floyd J. Comparison of Medical Subject Headings and text-word searches in MEDLINE to retrieve studies on sleep in healthy individuals. J Med Libr Assoc. 2004;92:349–53.
Clinton-McHarg T, Carey M, Sanson-Fisher R, Shakeshaft A, Rainbird K. Measuring the psychosocial health of adolescent and young adult (AYA) cancer survivors: a critical review. Health Qual Life Outcomes. 2010;8:25.
Tzelepis F, Rose SK, Sanson-Fisher RW, Clinton-McHarg T, Carey ML, Paul CL. Are we missing the Institute of Medicine’s mark? A systematic review of patient-reported outcome measures assessing quality of patient-centred cancer care. BMC Cancer. 2014;14:41.
American Educational Research Association, American Psychological Association, National Council on Measurement in Education. Standards for educational and psychological testing. Washington, DC: American Educational Research Association; 2014.
Mokkink L, Terwee C, Patrick D, Alonso J, Stratford P, Knol D, Bouter L, de Vet H. The COSMIN checklist for assessing the methodological quality of studies on measurement properties of health status measurement instruments: an international Delphi study. Qual Life Res. 2010;19:539–49.
Anastasi A, Urbina S. Psychological testing. Upper Saddle River: Prentice Hall; 1997.
Lohr KN, Aaronson NK, Alonso J, Audrey-Burnam M, Patrick DL, Perrin EB, Roberts JS. Evaluating quality-of-life and health status instruments: development of scientific review criteria. Clin Ther. 1996;18:979–92.
Kaiser HF. Directional statistical decisions. Psychol Rev. 1960;67:160–7.
Tabachnick BG, Fidell LS. Using multivariate statistics. Boston: Pearson; 2013.
Hu L, Bentler PM. Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives. Struct Equation Model. 1999;6:1–55.
Schmitt TA. Current methodological considerations in exploratory and confirmatory factor analysis. J Psychoeduc Assess. 2011;29:304–21.
Cohen J. Statistical power analysis for the behavioral sciences. Hillsdale: Lawrence Erlbaum Associates; 1988.
Rubin A, Bellamy J. Practitioner’s guide to using research for evidence-based practice. Hoboken: John Wiley & Sons; 2012.
Marx RG, Menezes A, Horovitz L, Jones EC, Warren RF. A comparison of two time intervals for test-retest reliability of health status instruments. J Clin Epidemiol. 2003;56:730–5.
Streiner DL, Norman GR. Health measurement scales: a practical guide to their development and use. New York: Oxford University Press; 2008.
Pedhazur EJ, Schmelkin LP. Measurement, design, and analysis: an integrated approach. Hillsdale: Lawrence Erlbaum Associates; 1991.
Aldridge JM, Laugksch RC, Fraser BJ. School-level environment and outcomes-based education in South Africa. Learn Environ Res. 2006;9:123–47.
Bowen GL, Rose RA, Ware WB. The reliability and validity of the school success profile learning organization measure. Eval Program Plann. 2006;29:97–104.
Canfield JP, Teasley ML, Abell N, Randolph KA. Validation of a Mckinney-Vento Act implementation scale. Res Soc Work Pract. 2012;22:410–9.
Chatterji M. Measuring leader perceptions of school readiness for reforms: use of an iterative model combining classical and Rasch methods. J Appl Meas. 2001;3:455–85.
Deschesnes M, Trudeau F, Kebe M. Psychometric properties of a scale focusing on perceived attributes of a health promoting school approach. Can J Public Health. 2009;100:389–92.
Gingiss PL, Gottlieb NH, Brink SG. Measuring cognitive characteristics associated with adoption and implementation of health innovations in schools. Am J Health Promot. 1994;8:294–301.
Gingiss PL, Gottlieb NH, Brink SG. Increasing teacher receptivity toward use of tobacco prevention education programs. J Drug Educ. 1994;24:163–76.
Hayes DM. Toward the development and validation of a curriculum coordinator Role-efficacy Belief Instrument for sexuality education. J Sex Educ Ther. 1992;18:127–35.
Hume A, McIntosh K. Construct validation of a measure to assess sustainability of school-wide behavior interventions. Psychol Sch. 2013;50:1003–14.
Kingery PM, Holcomb JD, Jibaja-Rusth M, Pruitt BE, Buckner WP. The health teaching self-efficacy scale. J Health Educ. 1994;25:68–76.
Lambert LG, Monroe A, Wolff L. Mississippi elementary school teachers’ perspectives on providing nutrition competencies under the framework of their school wellness policy. J Nutr Educ Behav. 2010;42:271–6.
McIntosh K, MacKay LD, Hume AE, Doolittle J, Vincent CG, Horner RH, Ervin RA. Development and initial validation of a measure to assess factors related to sustainability of school-wide positive behavior support. J Posit Behav Interv. 2011;13:208–18.
Mellin EA, Bronstein L, Anderson-Butcher D, Amorose AJ, Ball A, Green J. Measuring interprofessional team collaboration in expanded school mental health: model refinement and scale development. J Interprof Care. 2010;24:514–23.
Steckler A, Goodman RM, McLeroy KR, Davis S, Koch G. Measuring the diffusion of innovative health promotion programs. Am J Health Promot. 1992;6:214–24.
Tuytens M, Devos G. Teachers’ perception of the new teacher evaluation policy: a validity study of the policy characteristics scale. Teach Teach Educ. 2009;25:924–30.
Atkinson NL. Developing a questionnaire to measure perceived attributes of eHealth innovations. Am J Health Behav. 2007;31:612–21.
Chung KC. Gender, culture and determinants of behavioural intents to adopt mobile commerce among the Y generation in transition economies: evidence from Kazakhstan. Behav Inf Technol. 2014;33:743–56.
Chung K-C, Holdsworth DK. Culture and behavioural intent to adopt mobile commerce among the Y Generation: comparative analyses between Kazakhstan, Morocco and Singapore. Young Consumers. 2012;13:224–41.
Davis FD. Perceived usefulness, perceived ease of use, and user acceptance of information technology. MIS Q. 1989;13:319–40.
Pillay H, Irving K, McCrindle A. Developing a diagnostic tool for assessing tertiary students’ readiness for online learning. Int J Learn Tech. 2006;2:92–104.
Pillay H, Irving K, Tones M. Validation of the diagnostic tool for assessing tertiary students’ readiness for online learning. High Educ Res Dev. 2007;26:217–34.
Saeed KA, Abdinnour S. Understanding post-adoption IS usage stages: an empirical assessment of self-service information systems. Inf Syst J. 2013;23:219–44.
Talukder M, Quazi A. The impact of social influence on individuals’ adoption of innovation. J Org Comp Elect Com. 2011;21:111–35.
Fang Y, Yang S, Feng B, Ni Y, Zhang K. Pharmacists’ perception of pharmaceutical care in community pharmacy: a questionnaire survey in Northwest China. Health Soc Care Community. 2011;19:189–97.
Kansanaho HM, Puumalainen II, Varunki MM, Airaksinen MSA, Aslani P. Attitudes of Finnish community pharmacists toward concordance. Ann Pharmacother. 2004;38:1946–53.
Roberts AS, Benrimoj SI, Chen TF, Williams KA, Aslani P. Practice change in community pharmacy: quantification of facilitators. Ann Pharmacother. 2008;42:861–8.
Cochran JK, Bromley ML, Swando MJ. Sheriff’s deputies’ receptivity to organizational change. Policing. 2002;25:507–29.
Taxman FS, Henderson C, Young D, Farrell J. The impact of training interventions on organizational readiness to support innovations in juvenile justice offices. Adm Policy Ment Health. 2014;41:177–88.
Christensson L, Unosson M, Bachrach-Lindstrom M, Ek AC. Attitudes of nursing staff towards nutritional nursing care. Scand J Caring Sci. 2003;17:223–31.
Randall R, Nielsen K, Tvedt SD. The development of five scales to measure employees’ appraisals of organizational-level stress management interventions. Work Stress. 2009;23:1–23.
Boothroyd RA, Greenbaum PE, Wang W, Kutash K, Friedman RM. Development of a measure to assess the implementation of children’s systems of care: the systems of care implementation survey (SOCIS). J Behav Health Serv Res. 2011;38:288–302.
Chang SE, Pan YHV. Exploring factors influencing mobile users’ intention to adopt multimedia messaging service. Behav Inf Technol. 2011;30:659–72.
Collis B, Pals N. A model for predicting an individual’s use of a telematics application for a learning-related purpose. Int J Educ Telecommunications. 2000;6:63–103.
Collis B, Peters O, Pals N. A model for predicting the educational use of information and communication technologies. Instruct Sci. 2001;29:95–125.
Greenbaum PE, Wang W, Boothroyd R, Kutash K, Friedman RM. Multilevel confirmatory factor analysis of the systems of care implementation survey (SOCIS). J Behav Health Serv Res. 2011;38:303–26.
Hall GE, George AA, Rutherford WL. Measuring stages of concern about innovation: a manual for use of the SOC questionnaire. Austin: Research and Development Center for Teacher Education, The University of Texas; 1977.
Hall GE, George A, Rutherford W. Measuring stages of concern about the innovation: a manual for use of the SoC questionnaire. Austin: Research and Development Center for Teacher Education, The University of Texas; 1979.
Monthuy-Blanc J, Bouchard S, Maiano C, Seguin M. Factors influencing mental health providers’ intention to use telepsychotherapy in first nations communities. Transcult Psychiatry. 2013;50:323–43.
Bess KD, Perkins DD, McCown DL. Testing a measure of organizational learning capacity and readiness for transformational change in human services. J Prev Interv Community. 2010;39:35–49.
Bouckenooghe D, Devos G, Van den Broeck H. Organizational change questionnaire-climate of change, processes, and readiness: development of a new instrument. J Psychol. 2009;143:559–99.
Caldwell DF, O’Reilly CA. The determinants of team-based innovation in organizations the role of social influence. Small Group Res. 2003;34:497–517.
Chwelos P, Benbasat I, Dexter AS. Research report: empirical test of an EDI adoption model. Inf Syst Res. 2001;12:304–21.
Dahlan N, Ramayah T, Mei LL. Readiness to adopt data mining technologies: an exploratory study of telecommunication employees in Malaysia. In: Karagiannis D, Reimer U, editors. Practical aspects of knowledge management, vol. 2569. Berlin: Springer-Verlag; 2002. p. 75–86.
Hanusaik N, O’Loughlin JL, Kishchuk N, Eyles J, Robinson K, Cameron R. Building the backbone for organisational research in public health systems: development of measures of organisational capacity for chronic disease prevention. J Epidemiol Community Health. 2007;61:742–9.
Holt DT, Armenakis AA, Feild HS, Harris SG. Readiness for organizational change: the systematic development of a scale. J Appl Behav Sci. 2007;43:232–55.
Judge TA, Thoresen CJ, Pucik V, Welbourne TM. Managerial coping with organizational change: a dispositional perspective. J Appl Psychol. 1999;84:107.
Jung J, Nitzsche A, Neumann M, Wirtz M, Kowalski C, Wasem J, Stieler-Lorenz B, Pfaff H. The Worksite Health Promotion Capacity Instrument (WHPCI): development, validation and approaches for determining companies’ levels of health promotion capacity. BMC Public Health. 2010;10:1–10.
Molla A, Licker PS. eCommerce adoption in developing countries: a model and instrument. Inf Manage. 2005;42:877–99.
Molla A, Licker PS. Perceived e-readiness factors in e-commerce adoption: an empirical investigation in a developing country. Int J Electron Commerce. 2005;10:83–110.
Moore GC, Benbasat I. Development of an instrument to measure the perceptions of adopting an information technology innovation. Inf Syst Res. 1991;2:192–222.
Peltier JW, Schibrowsky JA, Zhao Y. Understanding the antecedents to the adoption of CRM technology by small entrepreneurs vs owner-managers. Int Small Bus J. 2009;27:307–36.
Ravichandran T. Swiftness and intensity of administrative innovation adoption: an empirical study of TQM in information systems. Decision Sci. 2000;31:691.
Saffu K, Walker JH, Hinson R. An empirical study of perceived strategic value and adoption constructs: the Ghanaian case. Manag Decis. 2007;45:1083–101.
Strating MMH, Nieboer AP. Norms for creativity and implementation in healthcare teams: testing the group innovation inventory. Int J Qual Health Care. 2010;22:275–82.
Zeitz G, Johannesson R, Ritchie JE. An employee survey measuring total quality management practices and culture development and validation. Group Org Manag. 1997;22:414–44.
Taxman FS, Young DW, Wiersema B, Rhodes A, Mitchell S. The national criminal justice treatment practices survey: multilevel survey methods and procedures. J Subst Abuse Treat. 2007;32:225–38.
Lin SP, Yang HY. Exploring key factors in the choice of e-health using an asthma care mobile service model. Telemed e-Health. 2009;15:884–90.
Davis FD, Bagozzi RP, Warshaw PR. User acceptance of computer-technology—a comparison of 2 theoretical-models. Manage Sci. 1989;35:982–1003.
Venkatesh V, Davis FD. A theoretical extension of the technology acceptance model: four longitudinal field studies. Manage Sci. 2000;46:186–204.
Molla A, Licker PL. PERM: a model of eCommerce adoption in developing countries. In: Khosrow-Pour M, editor. Issues and trends of information technology management in contemporary organizations, vol. 1. Hershey: Idea Group Publishing; 2002. p. 527–30.
Al-Hudhaif SA, Alkubeyyer A. E-commerce adoption factors in Saudi Arabia. Int J Bus Manage. 2011;6:122.
Aldridge JM, Fraser BJ. Teachers’ views of their school climate and its relationship with teacher self-efficacy and job satisfaction. Learn Environ Res. 2016;19:291.
Alharbi S, Drew S. Using the technology acceptance model in understanding academics’ behavioural intention to use learning management systems. Int J Adv Comput Sci Appl (IJACSA). 2014;5.
Bailey Jr DB, Palsha SA. Qualities of the Stages of Concern Questionnaire and implications for educational innovations. J Educ Res. 1992;85:226–32.
Berkowitz R, Bowen G, Benbenishty R, Powers JD. A cross-cultural validity study of the school success profile learning organization measure in Israel. Child Schools. 2013;35:137–46.
Canfield JP. The McKinney-Vento Act implementation scale: a second validation study. J Child Poverty. 2014;20:47–64.
Cheung D, Hattie J, Ng D. Reexamining the Stages of Concern Questionnaire: a test of alternative models. J Educ Res. 2001;94:226–36.
Christensson L, Bachrach-Lindstrom M. Adapting “the staff attitudes to nutritional nursing care scale” to geriatric nursing care. J Nutr Health Aging. 2009;13:102–7.
Cetinkaya B. Understanding teachers in the midst of reform: teachers’ concerns about reformed sixth grade mathematics curriculum in Turkey. Eurasia J Math Sci Technol Educ. 2012;8:155–66.
Godoe P, Johansen T. Understanding adoption of new technologies: technology readiness and technology acceptance as an integrated concept. J Eur Psychol Stud. 2012;3:38–52.
Knapp P, Raynor DK, Thistlethwaite JE, Jones MB. A questionnaire to measure health practitioners’ attitudes to partnership in medicine taking: LATCon II. Health Expect. 2009;12:175–86.
Richardson JW. Technology adoption in Cambodia: measuring factors impacting adoption rates. J Int Dev. 2011;23:697–710.
Shotsberger PG, Crawford AR. On the elusive nature of measuring teacher change: an examination of the stages of concern questionnaire. Eval Res Educ. 1999;13:3–17.
Tan J, Tyler K, Manica A. Business-to-business adoption of eCommerce in China. Inf Manage. 2007;44:332–51.
Van Den Berg R, Ros A. The permanent importance of the subjective reality of teachers during educational innovation: a concerns-based approach. Am Educ Res J. 1999;36:879–906.
Edwards P, Roberts I, Clarke M, DiGuiseppi C, Pratap S, Wentz R, Kwan I, Cooper R. Methods to increase response rates to postal questionnaires. Cochrane Database Syst Rev. 2007;2:MR000008.
Schafer J. Multiple imputation: a primer. Stat Methods Med Res. 1999;8:3–15.
Macfarlane A, O’Reilly-de Brun M, de Brun T, Dowrick C, O’Donnell C, Mair F, Spiegel W, van den Muijsenbergh M, van Weel Baumgarten E, Lionis C, et al. Healthcare for migrants, participatory health research and implementation science—better health policy and practice through inclusion. The RESTORE project. Eur J Gen Pract. 2014;20:148–52.
Husted JA, Cook RJ, Farewell VT, Gladman DD. Methods for assessing responsiveness: a critical review and recommendations. J Clin Epidemiol. 2000;53:459–68.
Guyatt G, Walter S, Norman G. Measuring change over time—assessing the usefulness of evaluative instruments. J Chronic Dis. 1987;40:171–8.
Groves RM. Nonresponse rates and nonresponse bias in household surveys. Public Opin Q. 2006;70:646–75.
Armstrong JS, Overton TS. Estimating nonresponse bias in mail surveys. J Mark Res. 1977;14:396–402.
Rogers EM. Diffusion of innovations. 3rd ed. New York: Free Press; 1983.
Hall GE, Hord SM. Change in schools: facilitating the process. Albany: State University of New York Press; 1987.
Bandura A. Social foundations of thought and action: a social cognitive theory. Englewood Cliffs: Prentice-Hall; 1986.
Bronstein LR. A model for interdisciplinary collaboration. Soc Work. 2003;48:297–306.
Bronstein LR. Index of interdisciplinary collaboration. Soc Work Res. 2002;26:113–26.
Litwin G, Stringer R. Motivation and organizational climate. Manage Int Rev. 1969;9:163.
Taylor J, Bowers D. Survey of organizations: a machine scored standardized questionnaire. Ann Arbor: Institute for Social Research, University of Michigan; 1972.
Greenhalgh T, Robert G, Macfarlane F, Bate P, Kyriakidou O. Diffusion of innovations in service organizations: systematic review and recommendations. Milbank Q. 2004;82:581–629.
Oldenburg B, Parcel GS. Diffusion of innovations. In: Glanz K, Rimer BK, Viswanath K, editors. Health behavior and health education: theory, research, and practice. San Francisco: John Wiley & Sons; 2002. p. 312–34.
Goldman KD. Perceptions of innovations as predictors of implementation levels: the diffusion of a nation wide health education campaign. Health Educ Behav. 1994;21:433–45.
Parcel GS, O’Hara-Tompkins NM, Harrist RB, Basen-Engquist KM, McCormick LK, Gottlieb NH, Eriksen MP. Diffusion of an effective tobacco prevention program. Part II: evaluation of the adoption phase. Health Educ Res. 1995;10:297–307.
Lafferty CK. Diffusion of an asset building innovation in three Portage County school districts: A model of individual change. Kent: Kent State University; 2001.
Fullan M. The new meaning of educational change. New York: Teachers College Press; 2001.
Bandura A, McClelland DC. Social learning theory. Englewood Cliffs: Prentice-Hall; 1977.
Gibson S, Dembo MH. Teacher efficacy—a construct-validation. J Educ Psychol. 1984;76:569–82.
Lewin K. Field theory in social science: selected theoretical papers. New York: Harper; 1951.
Moos RH. The social climate scales: an overview. Palo Alto: Consulting Psychologists Press; 1974.
Fisher DL, Fraser BJ. School climate and teacher professional development. South Pac J Teach Educ. 1991;19:17–32.
Fisher DL, Fraser BJ. Validity and use of school environment instruments. J Classroom Interact. 1991;26(2):13–8.
Bowen GL. Organizational culture profile. Chapel Hill: Bowen & Associates; 1997.
Orthner DK, Cook PC, Sabah Y, Rosenfeld J. Measuring organizational learning in human services. Development and validation of the organizational learning capacity assessment. Miami: Workforce Issues in Social Work; 2005.
Cameron KS, Bright D, Caza A. Exploring the relationships between organizational virtuousness and performance. Am Behav Sci. 2004;47:766–90.
McIntosh K, Horner RH, Sugai G. Sustainability of systems-level evidence-based practices in schools: current knowledge and future directions. In: Sailor W, Dunlap G, Sugai G, Horner RH, editors. Handbook of positive behavior support. Berlin: Springer Science & Business Media; 2009. p. 327–52.
Nysveen H, Pedersen PE, Thorbjornsen H. Intentions to use mobile services: antecedents and cross-service comparisons. J Acad Mark Sci. 2005;33:330–46.
Hupcey JE, Penrod J, Morse JM, Mitcham C. An exploration and advancement of the concept of trust. J Adv Nurs. 2001;36:282–93.
Bauer HH, Barnes SJ, Reichardt T, Neumann MM. Driving consumer acceptance of mobile marketing: a theoretical framework and empirical study. J Electron Commerce Res. 2005;6:181–92.
Hofstede G. Culture’s consequences: international differences in work-related values. Beverly Hills: Sage Publications; 1980.
Yoo B, Donthu N. The effects of marketing education and individual cultural values on marketing ethics of students. J Mark Educ. 2002;24:92–103.
Bolton T. Perceptual factors that influence the adoption of videotex technology: results of the channel 2000 field test. J Broadcasting. 1983;27:141–53.
Bandura A. Self-efficacy mechanism in human agency. Am Psychol. 1982;37:122–47.
Beach LR, Mitchell TR. A contingency model for the selection of decision strategies. Acad Manage Rev. 1978;3:439–49.
Johnson EJ, Payne JW. Effort and accuracy in choice. Manage Sci. 1985;31:395–414.
Payne JW. Contingent decision behavior. Psychol Bull. 1982;92:382.
Swanson EB. Measuring user attitudes in MIS research: a review. Omega. 1982;10:157–65.
Swanson EB. Information channel disposition and use. Decision Sci. 1987;18:131–45.
Saga VL, Zmud RW. The nature and determinants of IT acceptance, routinization, and infusion. In: Levine L, editor. Diffusion, transfer, and implementation of information technology. Pittsburgh: North-Holland; 1993. p. 67–86.
Fishbein M. A theory of reasoned action: some applications and implications. Nebr Symp Motiv. 1979;27:65–116.
Frambach RT, Schillewaert N. Organizational innovation adoption—a multi-level framework of determinants and opportunities for future research. J Bus Res. 2002;55:163–76.
The authors wish to acknowledge Ms. Tara Payling for assistance in identifying the included publications.
The study was supported by infrastructure funding from the University of Newcastle, Hunter Medical Research Institute, and Hunter New England Population Health.
Availability of data and materials
All data generated or analysed during this study are included in the published article and its supplementary information files.
LW, SY, and TCM developed the aims, the methodology, and the search terms. TCM, FT, TR, AF, ES, MK, and JYO contributed to the abstract and full-text screening and data extraction. TCM led the authorship of the manuscript. All other authors contributed to authorship and all approved the final manuscript.
The authors declare that they have no competing interests.
Consent for publication
Ethics approval and consent to participate
Summary of the CFIR domains and constructs used to support mapping of included measure domains. (DOCX 32.8 kb)
Clinton-McHarg, T., Yoong, S.L., Tzelepis, F. et al. Psychometric properties of implementation measures for public health and community settings and mapping of constructs against the Consolidated Framework for Implementation Research: a systematic review. Implementation Sci 11, 148 (2016). https://doi.org/10.1186/s13012-016-0512-5