From: Statistical considerations in a systematic review of proxy measures of clinical behaviour
Report | n<sub>i</sub> | n<sub>j</sub> | n<sub>k</sub> | Statistics used | Notes |
---|---|---|---|---|---|
Item-by-item comparisons: items treated as distinct | |||||
Flocke, 2004[9] | 10 | 19 | 138 | | |
Stange, 1998[19] | 79 | 32 | 138 | | |
Ward, 1996[20] | 2 | 26 | 41 | Sensitivity = a/(a + c) | |
Wilson, 1994[21] | 3 | 20 | 16 | | |
Zuckerman, 1975[22] | 15 | 17 | 3 | | |
Stange, 1998[19] | 79 | 32 | 138 | | |
Ward, 1996[20] | 2 | 26 | 41 | | |
Wilson, 1994[21] | 3 | 20 | 16 | Specificity = d/(b + d) | |
Zuckerman, 1975[22] | 15 | 17 | 3 | | |
Dresselhaus, 2000*[8] | 7 | 8 | 20 | Agreement: comparison of (i) (a + b)/T and (ii) (a + c)/T | Agreement was assessed by comparing the proportion of recommended behaviours performed as measured by the direct and proxy measures. Three reports performed hypothesis tests, using analysis of variance [8], Cochran's Q-test [15], and McNemar's test [18]. |
Gerbert, 1988[11] | 4 | 3 | 63 | | |
Pbert, 1999*[15] | 15 | 9 | 12 | | |
Rethans, 1987*[18] | 24 | 1 | 25 | | |
Wilson, 1994[21] | 3 | 20 | 16 | | |
Gerbert, 1988*[11] | 4 | 3 | 63 | kappa = 2(ad - bc)/{(a + c)(c + d) + (b + d)(a + b)} | All three reports used kappa-statistics to summarise agreement; two reports [11, 15] also used them for hypothesis testing. |
Pbert, 1999*[15] | 15 | 9 | 12 | | |
Stange, 1998[19] | 79 | 32 | 138 | | |
Gerbert, 1988[11] | 4 | 3 | 63 | Disagreement = (i) c/T (ii) b/T (iii) (b + c)/T | Disagreement was assessed as the proportion of items recorded as performed by one measure but not by the other. |
Item-by-item comparisons: items treated as interchangeable within categories of behaviour | |||||
Luck, 2000[12] | NR | 8 | 20 | | |
Page, 1980[14] | 16-17 | 1 | 30 | Sensitivity = a/(a + c) | |
Rethans, 1994[17] | 25-36 | 3 | 35 | | |
Luck, 2000[12] | NR | 8 | 20 | Specificity = d/(b + d) | |
Page, 1980[14] | 16-17 | 1 | 30 | | |
Gerbert, 1986[10] | 20 | 3 | 63 | Convergent validity = (a + d)/T | Convergent validity was assessed as the proportion of items showing agreement. |
Page, 1980[14] | 16-17 | 1 | 30 | | |
Comparisons of summary scores for each consultation: summary scores were the number (or proportion) of recommended items performed | |||||
Luck, 2000*[12] | NR | 8 | 20 | | Analysis of variance to compare means of scores on direct measure and proxy. |
Pbert, 1999*[15] | 15 | 9 | 12 | | |
Rethans, 1987*[18] | 24 | 1 | 25 | Summary score | Paired t-tests to compare means of scores on direct measure and proxy. |
Pbert, 1999*[15] | 15 | 9 | 12 | | Pearson correlation of the scores on direct measure and proxy. |
Comparisons of summary scores for each clinician: summary scores were the number (or proportion) of recommended items performed | |||||
O'Boyle, 2001[13] | 1 | NA | 120 | | Comparison of means of scores on direct measure and proxy. |
O'Boyle, 2001*[13] | 1 | NA | 120 | Summary score | Pearson correlation of scores on direct measure and proxy. |
Rethans, 1994*[17] | 25-36 | 3 | 25 | | |
Comparisons of summary scores for each consultation: summary scores were weighted sums of the number of recommended items performed | |||||
Peabody, 2000*[16] | 21 | 8 | 28 | | Analysis of variance to compare means of scores on direct measure and proxy. |
Page, 1980*[14] | 16-17 | 1 | 30 | Summary score | Pearson correlation of scores on direct measure and proxy. |
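The item-by-item statistics in the table above all derive from the same 2×2 cross-classification of items, where a = recorded as performed by both the direct and proxy measure, b = proxy only, c = direct only, d = neither, and T = a + b + c + d. A minimal sketch of how these quantities relate, using hypothetical counts (not data from any of the reviewed reports):

```python
def item_by_item_stats(a, b, c, d):
    """Compute the 2x2 item-by-item statistics listed in the table.

    Cell labels: a = performed on both direct and proxy measures,
    b = proxy only, c = direct only, d = neither; T = a + b + c + d.
    """
    T = a + b + c + d
    return {
        "sensitivity": a / (a + c),            # Sensitivity = a/(a + c)
        "specificity": d / (b + d),            # Specificity = d/(b + d)
        "proportion_proxy": (a + b) / T,       # proportion performed, proxy
        "proportion_direct": (a + c) / T,      # proportion performed, direct
        "convergent_validity": (a + d) / T,    # proportion of items agreeing
        "disagreement": (b + c) / T,           # Disagreement = (b + c)/T
        # kappa = 2(ad - bc)/{(a + c)(c + d) + (b + d)(a + b)}
        "kappa": 2 * (a * d - b * c)
                 / ((a + c) * (c + d) + (b + d) * (a + b)),
    }

# Hypothetical 2x2 counts, chosen only to illustrate the formulas:
stats = item_by_item_stats(a=40, b=5, c=10, d=45)
```

With these counts, sensitivity is 40/50 = 0.8, specificity 45/50 = 0.9, and kappa 3500/5000 = 0.7; the agreement comparison contrasts (a + b)/T = 0.45 (proxy) with (a + c)/T = 0.50 (direct).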