
Table 1 Explanation of steps taken to fit the data to the Rasch model

From: Applying modern measurement approaches to constructs relevant to evidence-based practice among Canadian physical and occupational therapists

Threshold order

The response options should be logically ordered, such that endorsing a more optimal response option situates the person at a higher level of the latent trait. That is, a person with higher ability (for example, a clinician knowledgeable in EBP) is expected to select higher response options on an ordinal scale. At lower ability, more clinicians should endorse a lower response level, and fewer should endorse a higher response level. If the thresholds are disordered, the response options need to be rescored, sometimes reducing the responses to binary. The number of thresholds equals the number of response options minus 1 and reflects the number of “jumps” the person has to make for each item.
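As a minimal sketch of the ordering check described above, assuming the threshold locations (in logits) have already been estimated by Rasch software. The function names and example values are hypothetical and do not come from the study:

```python
def count_thresholds(n_response_options):
    """Number of thresholds ("jumps") for an item: response options minus 1."""
    return n_response_options - 1


def thresholds_are_ordered(thresholds):
    """Return True when each threshold lies strictly above the previous one.

    `thresholds` is assumed to be a list of estimated threshold locations
    (logits) for one item, listed from the lowest to the highest category.
    """
    return all(lower < upper for lower, upper in zip(thresholds, thresholds[1:]))


# Hypothetical 5-option item: 4 thresholds, ordered as expected.
ordered_item = [-1.2, -0.3, 0.4, 1.5]
# Hypothetical item whose second and third thresholds are reversed.
disordered_item = [-1.2, 0.4, -0.3, 1.5]

print(count_thresholds(5))
print(thresholds_are_ordered(ordered_item))
print(thresholds_are_ordered(disordered_item))
```

An item flagged as disordered by such a check would be a candidate for rescoring, for example by collapsing adjacent response options.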

Fit to the Rasch model

The items should line up hierarchically, such that items needing little ability to endorse at the most optimal response level are at the low end and items requiring more ability to endorse are higher. Overall goodness of model fit is indicated by a non-significant chi-square test (p > 0.05) after a Bonferroni adjustment for the number of items. The fit of each item and each person is as important as, or even more important than, overall fit. Item and person fit are indicated when fit residual (deviance from pure linearity) values are within ± 2.5 and the chi-square test for fit is non-significant (p > 0.05). Items that fail these criteria need to be examined carefully to ensure their importance in scoring the latent trait. A fit residual greater than + 2.5 indicates the item does not fit the latent trait; a fit residual less than − 2.5 indicates the item overfits and may be redundant.
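The fit criteria above can be sketched as a simple screening rule. This is an illustration only: the fit residuals and chi-square p values are assumed to come from Rasch software, the function names and labels are hypothetical, and whether the Bonferroni-adjusted level is also applied to item-level tests is left to the caller through the `alpha` parameter:

```python
def bonferroni_alpha(alpha, n_items):
    """Significance level after a Bonferroni adjustment for the number of items."""
    return alpha / n_items


def classify_fit(fit_residual, chi_square_p, alpha=0.05, limit=2.5):
    """Label one item (or person) using the cut-offs stated in the table.

    fit_residual > +limit  -> does not fit the latent trait ("misfit")
    fit_residual < -limit  -> overfits, possibly redundant ("overfit")
    chi_square_p <= alpha  -> significant chi-square test ("chi-square misfit")
    """
    if fit_residual > limit:
        return "misfit"
    if fit_residual < -limit:
        return "overfit"
    if chi_square_p <= alpha:
        return "chi-square misfit"
    return "fits"


# Hypothetical values for illustration.
print(bonferroni_alpha(0.05, 10))
print(classify_fit(fit_residual=3.1, chi_square_p=0.20))
print(classify_fit(fit_residual=-2.8, chi_square_p=0.40))
print(classify_fit(fit_residual=0.6, chi_square_p=0.30))
```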

Unidimensionality

A requirement of the Rasch model is that a single latent trait is being measured. This is assessed using a principal component analysis (PCA) of the fit residuals. The person-ability estimates derived from the two most disparate sets of items (those with the highest positive and negative loadings on the first factor) are compared pair-wise using independent t tests. For a set of items to be considered unidimensional, fewer than 5% of t values should fall outside ± 1.96. When this proportion is greater than 5%, a binomial test of proportions is used to calculate the 95% confidence interval (CI) around the observed proportion of significant t tests. Unidimensionality is still supported if the 5% value falls within the 95% CI.
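The decision rule above can be sketched as follows. The t values are assumed to have been computed already, and the interval uses a normal approximation to the binomial, which is an assumption here; the software used in the study may compute the interval differently. All function names are hypothetical:

```python
import math


def proportion_significant(t_values, critical=1.96):
    """Share of t values falling outside +/- critical."""
    outside = sum(1 for t in t_values if abs(t) > critical)
    return outside / len(t_values)


def binomial_ci(proportion, n, z=1.96):
    """Approximate 95% CI for a proportion (normal approximation, an assumption)."""
    half_width = z * math.sqrt(proportion * (1 - proportion) / n)
    return max(0.0, proportion - half_width), min(1.0, proportion + half_width)


def supports_unidimensionality(t_values, threshold=0.05):
    """True if fewer than 5% of t tests are significant, or if 5% lies in the CI."""
    proportion = proportion_significant(t_values)
    if proportion < threshold:
        return True
    lower, upper = binomial_ci(proportion, len(t_values))
    return lower <= threshold <= upper


# Hypothetical example: 7 of 100 t values fall outside +/- 1.96.
t_values = [0.5] * 93 + [2.5] * 7
print(proportion_significant(t_values))
print(binomial_ci(0.07, 100))
print(supports_unidimensionality(t_values))
```

In this hypothetical case the observed proportion exceeds 5%, but the interval still contains 5%, so the rule would not reject unidimensionality.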

Response dependency

The Rasch model requires that each item provide unique information. Pair-wise residual correlations (after controlling for the latent trait) greater than 0.3 could indicate a lack of independence of the responses, which inflates the reliability. Solutions include creating a super-item, which combines the response options across items, or choosing the one item that best suits the testing context.
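A minimal sketch of the screening step described above, assuming the item residuals (one value per person, after controlling for the latent trait) are already available. The item names, residual values, and function names are hypothetical:

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length lists of residuals."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant series; treated as 0 here
    return cov / math.sqrt(var_x * var_y)


def dependent_pairs(residuals, cutoff=0.3):
    """Return item pairs whose residual correlation exceeds the cut-off."""
    flagged = []
    for item_a, item_b in combinations(residuals, 2):
        r = pearson(residuals[item_a], residuals[item_b])
        if r > cutoff:
            flagged.append((item_a, item_b, r))
    return flagged


# Hypothetical residuals for three items across four people.
residuals = {
    "item1": [1.0, 2.0, 3.0, 4.0],
    "item2": [1.1, 2.0, 3.2, 3.9],
    "item3": [1.0, -1.0, 1.0, -1.0],
}
for item_a, item_b, r in dependent_pairs(residuals):
    print(item_a, item_b, round(r, 2))
```

Pairs flagged this way would then be candidates for a super-item or for retaining only one of the two items.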

Differential item functioning (DIF)

The items should have the same ordering of difficulty across all groups of people being measured, as defined by personal factors such as, in this study, profession (PT or OT), gender, and language. DIF is an indicator of item bias. Typically, DIF is indicated by a significant F test from a two-way analysis of variance. A caution is that with large sample sizes, anything may be significant; with small sample sizes, nothing may be significant. A close visual inspection of the item characteristic curve plotted by the level of each factor will support, or not, the information from the statistical approach. Two options are available for items with DIF: deletion or split scoring.

Targeting

An ideally targeted measure should include a set of items that spans the full range of the theoretical latent construct (− 4 to + 4 logits) and have a mean location of 0 with a standard deviation (SD) of 1. Ideally, the person estimates from this measure should be centred on location 0 with a SD of 1.
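Targeting can be summarised by comparing the distribution of person estimates with that of item locations. The sketch below assumes both sets of locations (in logits) have been exported from Rasch software; the function name and the example values are hypothetical:

```python
from statistics import mean, stdev


def targeting_summary(person_locations, item_locations):
    """Summarise person and item locations (logits) for a targeting check."""
    return {
        "person_mean": mean(person_locations),
        "person_sd": stdev(person_locations),
        "item_mean": mean(item_locations),
        "item_sd": stdev(item_locations),
        "item_range": (min(item_locations), max(item_locations)),
    }


# Hypothetical locations for illustration.
summary = targeting_summary(
    person_locations=[-1.0, 0.0, 1.0],
    item_locations=[-4.0, 0.0, 4.0],
)
print(summary)
```

A person mean far from 0, or an item range much narrower than the spread of person estimates, would suggest the measure is poorly targeted to the sample.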

Discrimination or person separation

This indicates how well people are differentiated by the spread of the item difficulty. The person separation index (PSI) is interpreted like a Cronbach’s alpha. The larger the index, the better the discrimination, which facilitates the measurement of change. Values > 0.9 are suitable for measuring within-person change; values > 0.7 are suitable for detecting group differences.