# Table 2 Means for all measured variables across Studies 1, 2 and 3 (standard deviations given in brackets)

Study 1: Effect of using BCTTv1 (between-group)

Study 2: Effect of training and using BCTTv1 (between-group)

Study 3: Effect of training and using BCTTv1 (within-group)

| Research question | Measure | Video | Study 1: Untrained + no taxonomy (n = 24) | Study 1: Untrained + taxonomy (n = 18) | Study 2: Untrained + no taxonomy (n = 29) | Study 2: Trained + taxonomy (n = 56) | Study 3: Untrained + no taxonomy (n = 39) | Study 3: Trained + taxonomy (n = 39) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | ‘I can clearly visualise what was delivered in the intervention’ | | 0.83 (1.44) | 1.47 (1.18) | 1.76 (0.62) | 0.88 (1.07) | −0.13 (1.20) | 0.64 (1.48) |
| | | 1 | 1.75 (0.60) | 0.94 (1.40) | 1.71 (0.82) | 0.55 (1.19) | 0.26 (1.09) | 0.83 (1.52) |
| | | 2 | 0.38 (1.53) | 1.90 (0.81) | 1.45 (0.64) | 0.51 (1.23) | −0.53 (1.21) | 0.45 (1.44) |
| | ‘I can clearly visualise how the intervention was delivered’ | | 1.02 (1.37) | 1.53 (1.16) | 1.14 (1.14) | 0.14 (1.09) | −0.17 (1.19) | 0.59 (1.61) |
| | | 1 | 1.94 (0.62) | 0.94 (1.18) | 1.00 (1.03) | 0.00 (1.26) | 0.29 (1.14) | 0.68 (1.73) |
| | | 2 | 0.56 (1.42) | 2.00 (0.94) | 0.70 (1.64) | −0.42 (1.05) | −0.63 (1.04) | 0.50 (1.53) |
| | ‘Someone would be able to replicate what was delivered in the intervention’ | | 0.69 (1.56) | 1.39 (1.30) | 1.64 (0.76) | 1.02 (0.86) | −0.18 (1.22) | 0.40 (1.50) |
| | | 1 | 1.56 (0.42) | 0.87 (1.41) | 1.53 (0.92) | 1.05 (1.09) | 0.16 (1.07) | 0.38 (1.39) |
| | | 2 | 0.25 (1.74) | 1.80 (1.11) | 1.55 (0.69) | 0.81 (0.81) | −0.53 (1.29) | 0.43 (1.63) |
| | ‘Someone would be able to replicate how the intervention was delivered’ | | 0.69 (1.56) | 1.42 (1.10) | 1.31 (0.88) | 0.47 (1.04) | −0.25 (1.17) | 0.45 (1.47) |
| | | 1 | 1.50 (0.71) | 1.00 (1.10) | 1.18 (0.92) | 0.43 (1.20) | 0.11 (1.14) | 0.45 (1.41) |
| | | 2 | 0.28 (1.72) | 1.75 (1.03) | 1.05 (0.93) | 0.28 (1.01) | −0.61 (1.11) | 0.45 (1.56) |
| 2 | Reliability of BCT identification (PABAK between coders) | | 0.86 (0.05) | 0.88 (0.05) | 0.84 (0.07) | 0.87 (0.06) | 0.85 (0.06) | 0.86 (0.06) |
| | | 1 | 0.87 (0.06) | 0.88 (0.06) | 0.84 (0.05) | 0.85 (0.04) | 0.86 (0.06) | 0.85 (0.06) |
| | | 2 | 0.86 (0.05) | 0.88 (0.05) | 0.84 (0.05) | 0.83 (0.09) | 0.85 (0.06) | 0.85 (0.06) |
| 3 | Validity of BCT identification (PABAK between coders and developer consensus) | | 0.69 (0.05) | 0.69 (0.06) | 0.67 (0.05) | 0.70 (0.05) | 0.67 (0.06) | 0.66 (0.06) |
| | | 1 | 0.65 (0.03) | 0.65 (0.07) | 0.66 (0.04) | 0.69 (0.05) | 0.63 (0.04) | 0.61 (0.04) |
| | | 2 | 0.72 (0.04) | 0.70 (0.05) | 0.66 (0.04) | 0.72 (0.05) | 0.72 (0.03) | 0.71 (0.04) |
| 4 | Sufficiency of time allocated for task | - | 5.43 (1.74) | 5.56 (1.32) | 6.06 (1.43) | 4.90 (1.41) | 5.36 (1.46) | 4.20 (1.90) |
| | | - | 4.71 (0.73) | 4.88 (1.03) | 4.72 (1.27) | 4.30 (1.22) | 5.00 (1.03) | 4.80 (0.68) |
| | Difficult vs. easy | - | 4.25 (0.89) | 3.56 (1.32) | 4.40 (0.89) | 3.75 (1.25) | 3.86 (1.41) | 3.38 (1.31) |
| | Worthless vs. useful | - | 6.50 (0.53) | 6.25 (0.93) | 6.00 (1.00) | 4.70 (2.08) | 6.43 (0.65) | 6.00 (1.43) |
| | | - | 6.75 (0.46) | 6.25 (0.86) | 6.00 (0.82) | 4.60 (1.98) | 6.29 (0.61) | 6.41 (1.04) |
| | Undesirable vs. desirable | - | 6.50 (0.76) | 5.31 (2.00) | 6.40 (0.89) | 5.85 (1.04) | 5.86 (1.29) | 6.38 (0.71) |
| | Description will be clear | - | 5.70 (1.49) | 5.20 (1.57) | 5.67 (0.52) | 5.68 (1.00) | 6.21 (0.80) | 5.54 (1.35) |
| | Description will be replicable | - | 5.70 (1.83) | 5.07 (1.49) | 5.50 (1.05) | 5.37 (1.26) | 6.07 (0.92) | 5.49 (1.30) |

1. For research questions 1 and 2, all items had response options from −3 ‘strongly disagree’ to +3 ‘strongly agree’; for research question 3, all items had response options from 1 ‘strongly disagree’ to 7 ‘strongly agree’
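
The reliability and validity rows report PABAK (prevalence-adjusted bias-adjusted kappa). As a minimal sketch of how such a score might be computed, assume each coder's output is a binary present/absent vector over the same list of BCTs; for two categories PABAK then reduces to 2 × (observed agreement) − 1. The function names, the example data and the pairwise averaging are illustrative assumptions, since the table does not state how agreement was aggregated into the reported means.

```python
from itertools import combinations
from statistics import mean, stdev


def pabak(coding_a, coding_b):
    """PABAK for two binary codings: 2 * observed agreement - 1."""
    if len(coding_a) != len(coding_b) or not coding_a:
        raise ValueError("codings must be non-empty and of equal length")
    agreements = sum(a == b for a, b in zip(coding_a, coding_b))
    return 2 * agreements / len(coding_a) - 1


def pairwise_pabak(codings):
    """PABAK for every pair of coders (assumed aggregation scheme)."""
    return [pabak(a, b) for a, b in combinations(codings, 2)]


# Hypothetical data: three coders, ten BCTs, 1 = BCT judged present.
coders = [
    [1, 0, 0, 1, 0, 0, 0, 1, 0, 0],
    [1, 0, 0, 1, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 1, 0, 0, 0, 1, 0, 0],
]

scores = pairwise_pabak(coders)
# Same "mean (SD)" layout as the table cells.
print(f"{mean(scores):.2f} ({stdev(scores):.2f})")
```

Under the same assumptions, a validity-style score could be obtained by passing one coder's vector and a consensus vector to `pabak`.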