
Statistical approach.

Change scores were calculated for both primary outcomes (fatigue and physical functioning) by subtracting the 20- and 70-week scores from the baseline scores. First, we examined simple descriptive statistics, by therapist and therapy type (PR vs SL), for both outcomes and for measures of the therapeutic alliance. Since (as a result of the trial design) the controls are common to all six therapist by therapy type combinations, the efficacy of these six combinations can be compared without reference to the controls. For example, the effect of pragmatic rehabilitation when delivered by therapist 1 is the difference between the average outcome for PR delivered by therapist 1 and the average outcome in the controls. The effect of pragmatic rehabilitation when delivered by therapist 2 is the difference between the average outcome for PR delivered by therapist 2 and the average outcome in the controls. Hence the difference in efficacy between these two therapists at delivering PR is simply the difference between the corresponding average outcomes for the two therapists (the average outcome for the common controls drops out).
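The cancellation of the common control mean can be written out explicitly. The notation below is introduced here for illustration only and is not the authors': \bar{Y}_{PR,k} denotes the average outcome for PR delivered by therapist k, and \bar{Y}_{C} the average outcome in the common controls.

```latex
\begin{align*}
\Delta_{PR,1} &= \bar{Y}_{PR,1} - \bar{Y}_{C}, \\
\Delta_{PR,2} &= \bar{Y}_{PR,2} - \bar{Y}_{C}, \\
\Delta_{PR,1} - \Delta_{PR,2}
  &= \left(\bar{Y}_{PR,1} - \bar{Y}_{C}\right) - \left(\bar{Y}_{PR,2} - \bar{Y}_{C}\right)
   = \bar{Y}_{PR,1} - \bar{Y}_{PR,2}.
\end{align*}
```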
To identify whether there was a treatment by therapist interaction, we examined whether there were differences in the therapist effects on outcome, both (i) separately for the two types of treatment, and (ii) across both types of treatment. These were evaluated simultaneously in a regression (ANCOVA) analysis including the interaction of therapist and treatment. The model was run for both primary outcomes, fatigue and physical functioning, at both outcome time points. In line with the trial’s original protocol and write-up, treatment effects were evaluated separately for the 20- and 70-week outcomes.
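As a rough illustration of this kind of model, the sketch below fits an ANCOVA with a therapist by treatment interaction and computes sandwich standard errors. It is not the authors' code or software. Everything in it is an assumption made for the example: the data are simulated, the variable names and codings (three therapists coded 0 to 2, treatment coded 0 for SL and 1 for PR) are hypothetical, only treated patients are modelled, and the HC0 form of the sandwich estimator is used because the text does not say which variant was applied.

```python
import numpy as np

def design_matrix(therapist, treatment, baseline, ambulatory, london_me):
    """ANCOVA design matrix with a therapist-by-treatment interaction.

    Hypothetical coding: therapist in {0, 1, 2}; treatment 0 = SL, 1 = PR.
    """
    therapist = np.asarray(therapist)
    treatment = np.asarray(treatment, dtype=float)
    t2 = (therapist == 1).astype(float)  # dummy for the second therapist
    t3 = (therapist == 2).astype(float)  # dummy for the third therapist
    columns = [
        np.ones_like(treatment),              # intercept
        treatment,                            # PR vs SL for the reference therapist
        t2, t3,                               # therapist main effects
        treatment * t2, treatment * t3,       # treatment-by-therapist interaction
        np.asarray(baseline, dtype=float),    # baseline score of the outcome
        np.asarray(ambulatory, dtype=float),  # stratifier: ambulatory status
        np.asarray(london_me, dtype=float),   # stratifier: London ME criteria met
    ]
    return np.column_stack(columns)

def ols_robust(X, y):
    """OLS coefficients with HC0 (sandwich) standard errors."""
    y = np.asarray(y, dtype=float)
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    meat = X.T @ (X * resid[:, None] ** 2)  # sum of e_i^2 * x_i x_i'
    cov = xtx_inv @ meat @ xtx_inv
    return beta, np.sqrt(np.diag(cov))

# Simulated data only; none of these values come from the trial.
rng = np.random.default_rng(0)
n = 180
therapist = rng.integers(0, 3, n)
treatment = rng.integers(0, 2, n)
baseline = rng.normal(30, 5, n)
ambulatory = rng.integers(0, 2, n)
london_me = rng.integers(0, 2, n)
change = 2.0 * treatment + 0.1 * baseline + rng.normal(0, 3, n)

X = design_matrix(therapist, treatment, baseline, ambulatory, london_me)
beta, se = ols_robust(X, change)
ci_lower, ci_upper = beta - 1.96 * se, beta + 1.96 * se  # approximate 95% CIs
```

In this layout the two interaction coefficients (positions 4 and 5 of `beta`) carry the treatment by therapist interaction; a joint test of those two terms would be the natural next step.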
We then used regression (ANCOVA) models to evaluate whether there were differences between the therapists in average patient-rated therapeutic alliance. We compared the therapeutic alliance levels for the different therapists when they were delivering each of the two types of treatment, and tested for a treatment by therapist interaction. We evaluated whether the treatment effects for outcome were related to the therapist’s average patient-rated therapeutic alliance scores. The regression model was then repeated using only the task element of alliance, for comparison with the work of Heins et al. If a relationship between the treatment effects and the average patient-rated therapeutic alliance score for a therapist was found, a causal (instrumental variable) analysis was planned.
In all of the regression models outlined in the preceding two paragraphs, in addition to the effect of type of intervention, the therapist and their statistical interaction, the regression models included, as an additional covariate, the baseline score of whichever dependent variable was being regressed. Furthermore, the two variables which were used for stratification of the FINE trial sample, namely ambulatory status (whether or not the patient used a mobility aid on most days) and whether the London ME criteria were met, were also included in all of these regression models. These covariates were included to allow for chance imbalance, to control for their effects and to increase precision; the effects of these covariates are not reported here. The model regressing patient-rated therapeutic alliance scores included the baseline fatigue score as a covariate to control for the effect of illness severity, but it did not include baseline therapeutic alliance, as it is not meaningful to measure therapeutic alliance before the patient has met the therapist. Standard errors and 95% confidence intervals for parameters were calculated using robust (sandwich) estimators, allowing for the effects of possible skewness (lack of normality) in the data.
Power was calculated for each regression model using XLSTAT in order to determine whether we had sufficient power to detect a moderate-sized effect. The standardised effect size used was Cohen’s D, which provides information about the size of the difference between two means, taking into account the variability in the data. A Cohen’s D of 0.25 is conventionally considered a moderate effect. Our power calculation revealed that, given the sample size and interactions, we had 0.91 power to detect a moderate effect size (Cohen’s D = 0.25) in the main intervention effects, and 0.84 power to detect a moderate effect size in the intervention by therapist interactions, with a 0.05 two-sided significance level.
This research is adequately powered to detect a moderate effect size.
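To make the relationship between effect size, sample size and power concrete, the sketch below uses a normal approximation for a two-sided comparison of two group means. It is a simplification for illustration, not the XLSTAT procedure used for the regression models here, and it is not expected to reproduce the 0.91 and 0.84 figures. The function name and the group sizes in the usage lines are hypothetical and are not the trial's.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    d is the standardised difference between two means; n_per_group is the
    (equal) number of participants in each group. Normal approximation only.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    shift = d * sqrt(n_per_group / 2)  # expected value of the test statistic
    # Probability of rejecting in either tail when the true difference is d.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

# Hypothetical group sizes, for illustration only.
for n in (50, 100, 200):
    print(n, round(two_sample_power(0.25, n), 3))
```

With a true difference of zero the function returns the significance level itself, which is a quick sanity check on the formula.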
