After years of waiting, the long-term follow-up results of the GETSET study have finally been published. The control group that received no intervention did just as well as the group that received guided graded exercise self-help. This isn’t the first time a control group has caught up over time: a similar pattern was seen in the FINE, PACE, FITNET, and Qure studies. This blog post explores the intriguing implications of these follow-up findings.
Graded exercise therapy (GET) and cognitive behavioral therapy (CBT) are both controversial treatments for patients suffering from myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS). For many years GET and CBT were promoted as evidence-based treatments, but patient organizations have strongly objected to this. They point to multiple surveys in which patients report GET and CBT to be unhelpful or even harmful. In more recent years, the methodological weaknesses of GET and CBT trials have been highlighted. This has prompted prestigious institutions such as the Centers for Disease Control and Prevention (CDC) in the US and the National Institute for Health and Care Excellence (NICE) in the UK to change course and no longer recommend GET or CBT as effective treatments for ME/CFS.
Response bias or genuine improvements in health?
For some, this change has been difficult to accept. After all, most randomized trials show that the group that receives GET or CBT reports fewer symptoms than the control group. The crux of the matter is whether those changes reflect genuine improvements in health or simply response bias caused by inherent weaknesses in how these trials were set up.
Proponents of the latter view argue that GET and CBT trials fell far short of the gold standard in medicine, namely the randomized, blinded, controlled trial. In such a study, there is a control group that receives a treatment almost identical to the intervention being tested, except for its core ingredient. In drug trials, for example, participants in the control group receive a placebo pill that looks and tastes just the same as the drug. Neither patients nor therapists know which pill is the drug and which is the placebo. In other words, they are ‘blinded to treatment allocation’ to make sure that their expectations do not influence the results.
GET and CBT trials for people with ME/CFS have been quite poor at including a convincing control condition. Often GET and CBT were compared to ‘treatment as usual’. In practice, this meant that the control group received no active intervention at all or significantly less contact with health professionals. That alone might have influenced how patients reported their symptoms. In addition, GET and CBT trials have the major limitation that patients and therapists are not blinded because it is practically infeasible to do so. In such cases, guidelines recommend including objective outcomes that are not easily influenced by optimism or expectations. So instead of only using symptom questionnaires, one would also use step counts, employment figures, data on disability benefits, and fitness tests to check if patients are really doing better.
Unfortunately, GET and CBT trials in ME/CFS have done almost the exact opposite. They have focused on subjective outcomes such as fatigue questionnaires and ignored objective outcomes as much as possible. The reason might be that objective outcomes do not show much improvement after GET and CBT. This, in turn, would support the conclusion that changes in symptom questionnaires reflect response bias rather than genuine improvements in health. After all, it is rather difficult to explain how exercise therapy could increase patients’ physical functioning if it fails to objectively increase their activity level or fitness.
There might be a simpler explanation: maybe patients reported their physical functioning to be a little bit better because they were encouraged to focus on positive things. Maybe they would very much like to think they got better given the investments in time and energy they made in trying GET or CBT. Or maybe they simply didn’t want to disappoint their therapist with whom they worked closely for several weeks.
The control group catches up over time
Besides the lack of improvement on objective outcomes, there is an additional argument that supports this view: long-term follow-up data. One might expect response bias to be larger directly after treatment ends than many months afterward. So one way to check whether GET or CBT leads to genuine improvements in health is to see if people still report feeling better several months after the trial ended.
In GET and CBT studies, there is a trend for the control group to catch up over time. A good example is the FINE trial, where patients in the intervention group reported a beneficial effect directly after treatment ended but not one year later. At that point, the control group did just as well.
Something similar happened in the PACE trial. The GET and CBT groups reported improvements directly after treatment ended and half a year later (possibly because they received “booster sessions” in between). Those results were touted in newspaper articles with headlines such as “Got ME? Just get out and exercise, say scientists” and “ME patients should go out and exercise for best hope of recovery, finds study.” But when the long-term follow-up results were published a couple of years later, neither the GET nor the CBT group fared better than the control group. Whether or not you had received weeks of intensive treatment with GET or CBT no longer seemed to make a difference.
Last month, the final results of GETSET were published. This trial looked at the effectiveness of a self-help guide to graded exercise. The data tell a familiar story. Immediately after treatment ended, patients in the exercise group reported less fatigue and better physical functioning than the control group. The results were reported in The Lancet as an encouraging success. But now the follow-up results have been released, and once again there was no longer any significant difference between the two groups.
Another good example is a study called FITNET, which looked at internet-based CBT for adolescents with ME/CFS. When the first results were published in The Lancet in 2012, it looked like a great success: 63 percent of participants in the internet-based CBT group were said to be recovered, compared to only 8 percent in the control group! At long-term follow-up, however, there were no longer any statistically significant differences: usual care led to similar recovery rates as internet-based CBT.
Then there is the Qure study. This trial tested the effectiveness of CBT for patients with Q-fever fatigue syndrome. Initially, a significant difference in fatigue scores was highlighted between the CBT and placebo groups. One year later, however, the improvements in the CBT group had disappeared: patients who received CBT had relapsed, and the placebo group performed just as well at long-term follow-up.
The ambiguous, the bad, and the unreported
Those were the biggest and most recent trials of GET and CBT. There are older and smaller ones that report significant benefits at long-term follow-up (examples here, here, and here), but their results are ambiguous and were published without a public trial protocol specifying how analyses would be conducted. This makes the findings difficult to interpret because they could have been affected by publication bias, p-hacking, or other methodological lapses.
Then there are also GET and CBT studies whose follow-up findings were never published but where we have some clues that they existed and that the results were not too positive. In a 1999 paper, for example, Fred Friedberg commented on a CBT trial from Oxford University, saying that “outcome improvements have begun to decline 17 months after treatment termination (Michael Sharpe, personal communication, October 12, 1998).”
Another big CBT trial was conducted by the Dutch research team of Gijs Bleijenberg. Two people noted that the follow-up results of this trial were disappointing and contradicted the post-treatment findings. In a 2001 letter to The Lancet, Kenneth M Lassen criticized the fact that these follow-up results were not published. He wrote:
“I find disturbing the lack of full disclosure. At the AACFS, a question was asked about the length of benefit for CBT. The presenter stated that the natural course and CBT groups did not differ significantly 3 years after treatment.”
Similarly, in 2004 Elke Van Hoof remarked about that same trial that there was a “report by one of the coresearchers that the effects of CBT were no longer present after 3 years (Bleijenberg G, communication, Fifth International Research, Clinical and Patient Conference).”
Finally, there is a third group of studies that have triumphantly claimed that treatment effects were ‘maintained’ at long-term follow-up, but without comparing the results to a control group (examples here and here). That is not very enlightening, because in most trials that did have a control group, treatment benefits were also ‘maintained’, yet the control group did just as well.
It seems that a lack of treatment effect at long-term follow-up is a rather consistent finding in trials on GET and CBT for people with ME/CFS. This contrasts with the self-reported improvements shortly after treatment ended.
There are several possible explanations for this pattern, such as loss of statistical power or additional treatment received after the trial ended. The current data, however, do not support these explanations. Large trials such as FINE, PACE, and GETSET still had substantial sample sizes at follow-up, and the differences between groups became rather small; in the long-term follow-up of GETSET, for example, the groups were almost identical. This suggests that lack of statistical power was not the main reason why the groups no longer differed significantly.
Next up is the possibility that patients in the control group received additional therapy after the trial endpoint and that this accounts for why they improved and no longer differed from the intervention group. In the GETSET trial, most participants did receive additional treatment after the trial endpoint but there was no significant difference in the number of participants who received additional therapy sessions between the intervention and control group. The authors concluded that “there is no evidence that the improvements observed in the SMC group were due to them having received more exposure to therapy than the GES group after trial completion.” Similarly, the PACE trial found little evidence that improvements in the control group after the trial endpoint were due to additional therapy. The reanalysis of the PACE trial data by Wilshire et al. looked at this more closely and concluded that “the disappearance of group differences at long-term follow-up cannot be attributed to the effects of additional post-trial therapy.”
A more straightforward explanation is that GET and CBT simply do not work and that the improvements reported at the end of treatment reflect response bias rather than genuine improvements in health. This makes follow-up results yet another obstacle for proponents of GET and CBT who claim that these treatments are effective. Not only do they have to explain why patient surveys and objective outcomes indicate that GET and CBT do not work, they also have to come up with a plausible explanation for the lack of improvements at long-term follow-up.