
Table 2 Statistical methods used in the included papers to compare direct and proxy measures of behaviour

From: Statistical considerations in a systematic review of proxy measures of clinical behaviour

Report ni nj nk Statistics used Notes
Item-by-item comparisons: items treated as distinct
Flocke, 2004[9] 10 19 138   
Stange, 1998[19] 79 32 138   
Ward, 1996[20] 2 26 41 Sensitivity = a/(a + c)  
Wilson, 1994[21] 3 20 16   
Zuckerman, 1975[22] 15 17 3   
Stange, 1998[19] 79 32 138   
Ward, 1996[20] 2 26 41   
Wilson, 1994[21] 3 20 16 Specificity = d/(b + d)  
Zuckerman, 1975[22] 15 17 3   
Dresselhaus, 2000*[8] 7 8 20 Agreement: comparison of (i) (a + b)/T and (ii) (a + c)/T Agreement was assessed by comparing the proportion of recommended behaviours performed as measured by the direct and proxy measures. Three reports performed hypothesis tests, using analysis of variance [8], Cochran's Q-test [15], and McNemar's test [18].
Gerbert, 1988[11] 4 3 63
Pbert, 1999*[15] 15 9 12
Rethans, 1987*[18] 24 1 25
Wilson, 1994[21] 3 20 16
Gerbert, 1988*[11] 4 3 63 kappa = 2(ad - bc)/{(a + c)(c + d) + (b + d)(a + b)} All three reports used kappa-statistics to summarise agreement; two reports [11, 15] also used them for hypothesis testing.
Pbert, 1999*[15] 15 9 12
Stange, 1998[19] 79 32 138
Gerbert, 1988[11] 4 3 63 Disagreement = (i) c/T (ii) b/T (iii) (b + c)/T Disagreement was assessed as the proportion of items recorded as performed by one measure but not by the other.
Item-by-item comparisons: items treated as interchangeable within categories of behaviour
Luck, 2000[12] NR 8 20   
Page, 1980[14] 16-17 1 30 Sensitivity = a/(a + c)  
Rethans, 1994[17] 25-36 3 35   
Luck, 2000[12] NR 8 20 Specificity = d/(b + d)
Page, 1980[14] 16-17 1 30
Gerbert, 1986[10] 20 3 63 Convergent validity = (a + d)/T Convergent validity was assessed as the proportion of items showing agreement.
Page, 1980[14] 16-17 1 30
Comparisons of summary scores for each consultation: summary scores were the number (or proportion) of recommended items performed
Luck, 2000*[12] NR 8 20 Summary score: Σi xijk Analysis of variance to compare means of scores on direct measure and proxy.
Pbert, 1999*[15] 15 9 12
Rethans, 1987*[18] 24 1 25 Paired t-tests to compare means of scores on direct measure and proxy.
Pbert, 1999*[15] 15 9 12 Pearson correlation of the scores on direct measure and proxy.
Comparisons of summary scores for each clinician: summary scores were the number (or proportion) of recommended items performed
O'Boyle, 2001[13] 1 NA 120 Summary score: Σi Σj xijk Comparison of means of scores on direct measure and proxy.
O'Boyle, 2001*[13] 1 NA 120 Pearson correlation of scores on direct measure and proxy.
Rethans, 1994*[17] 25-36 3 25
Comparisons of summary scores for each consultation: summary scores were weighted sums of the number of recommended items performed
Peabody, 2000*[16] 21 8 28 Summary score: Σi ωi xijk Analysis of variance to compare means of scores on direct measure and proxy.
Page, 1980*[14] 16-17 1 30 Pearson correlation of scores on direct measure and proxy.
  1. a, b, c, d and T are defined in Table 1; i = item, j = consultation, k = clinician; ni = average number of items per consultation; nj = average number of consultations per clinician; nk = average number of clinicians assessed; ωi = weight for the ith item; xijk = 0 if the item is not performed and xijk = 1 if the item is performed.
  2. NR = Not reported; NA = Not applicable.
  3. * This study used this method for hypothesis testing.
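As a worked illustration of the item-by-item statistics listed in the table, the following Python sketch computes them from the 2 × 2 cell counts a, b, c and d defined in Table 1 (not reproduced here). The assumed cell layout and the function name item_by_item_statistics are illustrative, not taken from the paper.

```python
# Minimal sketch, assuming the usual 2x2 layout for item-level counts:
#   a = item recorded as performed by both the direct and the proxy measure
#   b = performed on the proxy measure only
#   c = performed on the direct measure only
#   d = recorded as not performed on both measures
#   T = a + b + c + d

def item_by_item_statistics(a: int, b: int, c: int, d: int) -> dict:
    """Statistics from Table 2 for a single 2x2 table of item counts."""
    T = a + b + c + d
    return {
        "sensitivity": a / (a + c),                  # a/(a + c)
        "specificity": d / (b + d),                  # d/(b + d)
        "proportion_performed_proxy": (a + b) / T,   # (a + b)/T
        "proportion_performed_direct": (a + c) / T,  # (a + c)/T
        "convergent_validity": (a + d) / T,          # (a + d)/T
        "disagreement_direct_only": c / T,           # c/T
        "disagreement_proxy_only": b / T,            # b/T
        "disagreement_total": (b + c) / T,           # (b + c)/T
        # kappa for a 2x2 table, as given in the table above
        "kappa": 2 * (a * d - b * c) / ((a + c) * (c + d) + (b + d) * (a + b)),
    }

# Hypothetical counts, for illustration only:
# item_by_item_statistics(a=40, b=5, c=10, d=45)
```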
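Similarly, for the summary-score comparisons in the lower part of the table, a paired t-test and a Pearson correlation of the direct and proxy scores could be obtained with scipy.stats as sketched below; the example scores and the function name compare_summary_scores are hypothetical.

```python
# Minimal sketch of the consultation-level summary-score comparisons, assuming
# one summary score per consultation for each measure, in matching order.
from scipy import stats

def compare_summary_scores(direct_scores, proxy_scores):
    """Paired t-test and Pearson correlation of direct vs. proxy summary scores."""
    t_stat, t_p = stats.ttest_rel(direct_scores, proxy_scores)  # paired t-test
    r, r_p = stats.pearsonr(direct_scores, proxy_scores)        # Pearson correlation
    return {"paired_t": (t_stat, t_p), "pearson_r": (r, r_p)}

# Hypothetical scores, for illustration only:
# compare_summary_scores([5, 7, 6, 8, 4], [4, 7, 5, 8, 5])
```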