Table 2 Evidence-based assessment criteria

From: The Society for Implementation Research Collaboration Instrument Review Project: A methodology to promote rigorous evaluation

Criterion

Description

Reliability information

 

 0

None (N): α values are not yet available or are only available for subscales

 1

Minimal/emerging (M): α values of <0.60

 2

Adequate (A): α values of 0.60–0.69

 3

Good (G): α values of 0.70–0.79

 4

Excellent (E): α values of ≥0.80

NA

Internal consistency measures are not applicable to this measure, or classical test theory anchors are not appropriate because results are reported using item response theory

Structural validity

 

 0

None (N): no exploratory or confirmatory factor analysis has been performed, no Item Response Theory tests of (uni-)dimensionality have been conducted, or percent variance explained is not reported

 1

Minimal/emerging (M): the sample consisted of less than five times the number of items and an exploratory factor analysis explained less than 25% of the variance

 2

Adequate (A): the sample consisted of five times the number of items but is less than 100 in total and an exploratory factor analysis explained less than 50% of the variance or a confirmatory factor analysis revealed an RMSEA of 0.08 to 0.05 or CFI = 0.90 to 0.95

 3

Good (G): the sample consisted of five times the number of items and is greater than or equal to 100 in total or the sample consisted of five to seven times the number of items but is less than 100 in total and in either case an exploratory factor analysis explained less than 50% of the variance or a confirmatory factor analysis revealed an RMSEA of 0.05 to 0.03 or CFI = 0.95 to 0.97

 4

Excellent (E): the sample consisted of seven times the number of items and is greater than 100 in total and an exploratory analysis explained greater than 50% of the variance or a confirmatory factor analysis revealed an RMSEA of <0.03 or CFI > 0.97

Criterion (predictive) validity information

 

 0

None (N): predictive validity not yet tested or failed to be detected in evaluation

 1

Minimal/emerging (M): evidence of small correlation (r range: 0.10 to 0.29) between measure and scores on another test (measuring a distinct construct of interest or outcome) administered at some point in the future

 2

Adequate (A): evidence of medium correlation (r range: 0.30 to 0.49) between measure and scores on another test (measuring a distinct construct of interest or outcome) administered at some point in the future

 3

Good (G): evidence of strong correlation (r range: 0.50 to 1.00) between measure and scores on another test (measuring a distinct construct of interest or outcome) administered at some point in the future

 4

Excellent (E): evidence of medium-strong correlation (r range: 0.30 or higher) between measure and scores on at least two other tests (measuring a distinct construct of interest or outcome) administered at some point in the future

Norms

 

 0

None (N): norms are not yet available

 1

Minimal/emerging (M): measures of central tendency and distribution for the total score (and subscales if relevant) based only on a small (n < 30) sample are available

 2

Adequate (A): measures of central tendency and distribution for the total score (and subscales if relevant) based on a moderate (n = 30–49) sample are available

 3

Good (G): measures of central tendency and distribution for the total score (and subscales if relevant) based on a medium (n = 50–99) sample are available

 4

Excellent (E): measures of central tendency and distribution for the total score (and subscales if relevant) based on a large (n > 100) sample are available

Responsiveness (sensitivity to change)

 

 0

None (N): the measure has either not been administered both pre- and post-implementation to evaluate sensitivity to change or it has been administered and it did not demonstrate responsiveness (change) across an implementation process

 1

Minimal/emerging (M): the measure demonstrated change over time based on a small (n < 50) sample

 2

Adequate (A): the measure demonstrated either clinically or statistically significant change over time based on a medium sample (n > 50 but < 100)

 3

Good (G): the measure demonstrated change over time reflective of both clinically and statistically significant change based on a large sample (n > 100)

 4

Excellent (E): the measure demonstrated both clinically and statistically significant change over time based on at least two large (n > 100) samples

Usability (measure length)

 

 0

None (N): the measure is not in the public domain

 1

Minimal/emerging (M): the measure has greater than 100 items

 2

Adequate (A): the measure has greater than 50 items but fewer than 100

 3

Good (G): the measure has greater than 10 items but fewer than 50

 4

Excellent (E): the measure has fewer than 10 items
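
The reliability and norms bands in the table are simple threshold rules, so they can be sketched as a small scoring helper. The Python below is a minimal illustration and not part of the SIRC methodology. The names `rate_reliability`, `rate_norms`, and `RATING_LABELS` are hypothetical. Two choices are assumptions rather than statements from the table: an α value that falls between the printed bands (for example 0.695) is assigned by the lower bound of each band, and a sample of exactly n = 100, which the norms criterion leaves unspecified, is treated as large.

```python
from typing import Optional

# Rating anchors shared by the criteria in the table.
RATING_LABELS = {
    0: "None",
    1: "Minimal/emerging",
    2: "Adequate",
    3: "Good",
    4: "Excellent",
}


def rate_reliability(alpha: Optional[float], applicable: bool = True) -> Optional[int]:
    """Rate internal consistency from a total-scale alpha value.

    Returns None for the NA case (internal consistency not applicable,
    or results reported using item response theory).
    Pass alpha=None when no alpha exists or only subscale alphas exist.
    """
    if not applicable:
        return None
    if alpha is None:
        return 0
    # Assumption: bands are assigned by their lower bound.
    if alpha >= 0.80:
        return 4
    if alpha >= 0.70:
        return 3
    if alpha >= 0.60:
        return 2
    return 1


def rate_norms(sample_size: Optional[int]) -> int:
    """Rate norms from the size of the sample behind the reported
    central tendency and distribution. Pass None when no norms exist."""
    if sample_size is None:
        return 0
    # Assumption: n == 100 is treated as large; the table says n > 100.
    if sample_size >= 100:
        return 4
    if sample_size >= 50:
        return 3
    if sample_size >= 30:
        return 2
    return 1


if __name__ == "__main__":
    score = rate_reliability(0.83)
    print(score, RATING_LABELS[score])
```

The other criteria combine several conditions (sample size relative to item count, fit indices, number of samples), so a helper for them would need more inputs and more explicit decisions about the boundary cases the table leaves open.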