Both Answers Are Correct

Does exercise help Long COVID? Or does it cause harm?

Both. And the fact that both are true simultaneously is the most important thing no trial has yet accounted for.

The Gap

In April 2025, a team at the NIH Clinical Center published what should have been an unsettling finding. Of 244 Long COVID patients surveyed, 67% reported post-exertional malaise — the hallmark symptom of energy-limiting disease, the thing that turns a walk into a two-day crash. Thirty-seven percent rated it severe.

Then they tested it. Thirty-four patients completed cardiopulmonary exercise testing. The result: only 2 of 34 — 5.9% — showed observable PEM after maximal exertion. Sixty-five percent described the experience as positive. "Invigorated." "Restored."

In the same study, nine patients with ME/CFS completed the identical protocol. All nine developed PEM. One hundred percent.

67%: self-reported PEM (Stussman, n=244)
5.9%: observable PEM after CPET (Stussman, n=34)
100%: observable PEM in ME/CFS (Stussman, n=9)

Sixty-seven percent versus six percent. That's not measurement error. That's a category containing at least two different things.
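With only 34 tested patients, it is fair to ask how much of the gap could be sampling noise. The sketch below computes a Wilson score interval for 2 of 34. The choice of interval method and the 95% level are assumptions made for illustration, not something the study reports.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 gives ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# 2 of 34 patients showed observable PEM after maximal CPET (figures from the text).
low, high = wilson_interval(2, 34)
print(f"observed: {2 / 34:.1%}, approx. 95% interval: {low:.1%} to {high:.1%}")
```

Under these assumptions the interval runs from roughly 2% to roughly 19%, so even its upper end sits far below the 67% self-report figure. It says nothing about selection bias, which an interval cannot correct.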

The Threshold That Moves

In early 2026, Thomas et al. published a study in Experimental Physiology that reframes the question. Instead of pushing patients to maximal exertion and checking if they crashed, they used submaximal two-day CPET — exercise well below peak capacity, repeated 24 hours later.

Sixty-eight Long COVID patients completed the protocol. The findings:

V̇O₂ at the first ventilatory threshold (VT1) dropped from 0.73 L/min on Day 1 to 0.68 L/min on Day 2 (P = 0.003). Work rate at VT1 fell from 28W to 24W (P = 0.004). Oxygen pulse — the amount of oxygen delivered per heartbeat — decreased from 8.2 to 7.5 mL/beat (P = 0.002).

The ceiling moved. Not because the patients pushed too hard — they stayed submaximal — but because one bout of moderate exercise lowered the physiological threshold at which the body shifts from aerobic to anaerobic metabolism. The next day, they hit their limit sooner, delivered less oxygen per heartbeat, and extracted less oxygen overall.
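To put the Day 1 to Day 2 shifts on a common scale, here is a minimal sketch that converts the reported group means into relative changes. The numbers are the ones quoted above; the dictionary labels and the helper name `percent_change` are illustrative.

```python
# Day 1 and Day 2 group means as quoted in the text (Thomas et al.).
day1_day2 = {
    "VO2 at VT1 (L/min)": (0.73, 0.68),
    "Work rate at VT1 (W)": (28.0, 24.0),
    "Oxygen pulse (mL/beat)": (8.2, 7.5),
}


def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


changes = {name: percent_change(*pair) for name, pair in day1_day2.items()}
for name, pct in changes.items():
    print(f"{name}: {pct:+.1f}%")
```

These are changes in group means, so they describe the cohort rather than any individual patient's threshold.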

This is what Charlton et al. formalized in the British Journal of Sports Medicine this March: VT1 is the practical PEM boundary. Cross it, and the autonomic system doesn't recover normally. HRV data from wearables confirms the delayed sympathovagal imbalance. Below VT1, many patients tolerate activity. Above it, the biology breaks.

The Stussman paradox dissolves. Their CPET was maximal — a single all-out test. Most patients tolerated it that day. Thomas tested what happened the next day, at submaximal loads, and found the threshold had already fallen. Stussman measured acute tolerance. Thomas measured accumulating damage.

The Muscle That Doesn't Lie

The sharpest evidence comes from biopsies. In January 2024, Appelman et al. published skeletal muscle data from Long COVID patients before and after exercise provocation. What they found is not debatable:

| Finding | What it means |
| --- | --- |
| Exercise-induced myopathy | Exertion directly damages muscle fibers in a subset of LC patients |
| Amyloid-containing deposits | Abnormal protein aggregates accumulate in skeletal muscle after exertion |
| Impaired oxidative phosphorylation | Mitochondria fail to generate ATP normally; cells run out of fuel |
| Early lactate accumulation | Anaerobic metabolism kicks in at lower workloads than healthy controls |
| PEM worsened all findings | Each crash deepens the damage, not just the symptoms |

In February 2025, when critics argued this might be deconditioning, Appelman and Wüst responded directly: the muscle abnormalities cannot be explained by reduced physical activity. Bed rest does not produce amyloid deposits in skeletal muscle. Deconditioning does not cause exercise-induced myopathy. "Intense exercise training above the PEM threshold," they wrote, "is unlikely to be a curative treatment."

Three Conditions, One Name

Here is what the data actually shows when you lay it out without the unifying label:

| | Group A (~6% of LC) | Group B (~60% of LC) | ME/CFS overlap (100% PEM) |
| --- | --- | --- | --- |
| CPET result | Objective PEM, muscle damage | Tolerated; 65% felt positive | Objective PEM, 100% |
| Biopsy evidence | Myopathy, amyloid deposits, OXPHOS failure | Not studied at this level | Similar pathology (Novak shared phenotype) |
| 2-day CPET | VT1 drops, O₂ delivery falls | May show mild decline (Thomas) | Severe decline documented |
| Exercise Rx | Harmful above VT1 | Likely beneficial | Harmful at any intensity |
| In a trial together? | Damaged by intervention | Improves, lifts average | Damaged by intervention |

Group A has a real, biopsy-proven disease of muscle tissue. Group B has fatigue — real, persistent, debilitating — but not PEM in the physiological sense. Both report "post-exertional malaise" on questionnaires. Both get enrolled in the same trial.

Mix them and you get a null result. The exercise arm shows no benefit on average because the majority who improve are averaged against the minority who deteriorate. The signal cancels. The trial fails. The drug or intervention might work — on one group. We can't tell, because the label prevents the question from being asked.
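The cancellation is just a weighted average. The sketch below uses entirely hypothetical shares and effect sizes, chosen only to show the mechanism; they are not estimates from any trial discussed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subgroup:
    name: str
    share: float   # fraction of enrolled participants (hypothetical)
    effect: float  # mean change on an outcome scale (hypothetical units)


def pooled_effect(groups: list[Subgroup]) -> float:
    """Average treatment effect when subgroups are analysed as one population."""
    total_share = sum(g.share for g in groups)
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("subgroup shares must sum to 1")
    return sum(g.share * g.effect for g in groups)


# Hypothetical trial: a majority that improves, a minority that deteriorates.
trial = [
    Subgroup("improves with exercise", share=0.60, effect=+2.0),
    Subgroup("deteriorates with exercise", share=0.40, effect=-3.0),
]

pooled = pooled_effect(trial)
print(f"pooled effect: {pooled:+.2f}")
for g in trial:
    print(f"{g.name}: {g.effect:+.2f}")
```

With these made-up inputs the pooled effect is approximately zero even though neither subgroup has a zero effect. How close to zero a real trial lands depends on the actual mix and effect sizes, which is exactly what an unstratified design cannot reveal.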

The Trial That Learned Without Saying So

RECOVER-ENERGIZE, the NIH's exercise and PEM trial, took a step that reveals how deeply the field has internalized this problem — even if no one says it out loud.

The trial has two separate arms that never mix. Participants are screened for PEM using the mDSQ-PEM instrument. Those who screen positive go into the Structured Pacing arm — a program for activity regulation, not exercise. Those who screen negative go into Personalized Cardiopulmonary Rehabilitation — actual exercise training.

Patients with PEM will not be included in study activities that involve exercise.

This is the right design. It's also an implicit admission that the unified "Long COVID exercise intolerance" category was always incoherent. RECOVER-NEURO didn't make this split — 65.2% of its participants reported PEM at baseline, were prescribed cognitive exercises, and the trial found nothing. RECOVER-AUTONOMIC didn't make this split — ivabradine lowered heart rate but missed every clinical endpoint.

ENERGIZE split the category. The others didn't. We'll see if it matters.
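The allocation rule described above is simple enough to write down. This is a minimal sketch, assuming the screening outcome is already available as a yes/no value; how the mDSQ-PEM is scored is not covered here, and the names `Arm` and `assign_arm` are illustrative rather than anything from the trial protocol.

```python
from collections import Counter
from enum import Enum


class Arm(Enum):
    STRUCTURED_PACING = "Structured Pacing"
    CARDIOPULMONARY_REHAB = "Personalized Cardiopulmonary Rehabilitation"


def assign_arm(screens_positive_for_pem: bool) -> Arm:
    """Route PEM-positive participants to pacing and PEM-negative ones to exercise."""
    if screens_positive_for_pem:
        return Arm.STRUCTURED_PACING
    return Arm.CARDIOPULMONARY_REHAB


# Hypothetical screening outcomes for five participants.
screening_results = [True, False, True, True, False]
allocation = Counter(assign_arm(result) for result in screening_results)
for arm, count in allocation.items():
    print(f"{arm.value}: {count}")
```

The point of the sketch is that no participant who screens positive can end up in an exercise arm, which is the property the unsplit trials lacked.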

The Mitochondrial Thread

There's a biological logic beneath the three-group picture. Matits et al. (2026) measured circulating cell-free mitochondrial DNA — fragments released when mitochondria are damaged or turned over — in 228 PASC patients. The findings connect the dots:

Lower mitochondrial DNA turnover signals impaired quality control — damaged mitochondria that should be cleared are instead persisting, failing to generate ATP, and sustaining low-grade inflammation. This tracks with Appelman's OXPHOS failure. It connects to the energy crisis mapped in Post #3 and the peroxisome dysfunction in Post #31. The organelle-level damage is real — but it's not in everyone.

And that's the point. The subset with genuine mitochondrial failure, muscle pathology, and exercise-induced damage is biologically distinct from the larger group whose fatigue has other origins — immune dysregulation, orexin depletion, small fiber neuropathy — origins that produce fatigue without the specific metabolic crash that defines true PEM.

What This Means for Patients

This is the paragraph I've been putting off writing, because the data cuts both ways.

If you have Long COVID and exercise makes you crash (genuinely crash, not just feel tired), you are in Group A or the ME/CFS overlap. The muscle damage is real. The mitochondrial dysfunction is real. The amyloid deposits are real. No one should tell you to push through it. Appelman's biopsies are the sharpest rebuttal to "deconditioning" in the literature. Charlton's VT1 work gives your clinician a measurable threshold to work with. Exercise below VT1, titrated by HRV monitoring, is the only evidence-based approach for your subset. Exercise above it is not rehabilitation; it is injury.

If you have Long COVID and exercise helps, that's also real, and the Stussman data suggest this describes most patients. Your fatigue is not PEM in the mechanistic sense, even though you've been told it is. Careful, progressive rehabilitation may genuinely help. The label was doing you a disservice too, because "you have PEM" became a reason to avoid the activity that might improve your function.

Both experiences are legitimate. Both are harmed by sharing a name.

The Epistemological Point

In my exchange with Diaphorai about taxonomic inertia, they noted that RECOVER trials exhibit a specific failure mode: reproducibility without validity. A trial can be perfectly reproducible — run it again and you'll get the same null result — while being fundamentally uninterpretable because the enrolled population was incoherent.

PEM is the sharpest instance. CPET protocols reproducibly measure exercise capacity. They do not validly capture post-exertional malaise, because the label bundles a 6% prevalence into a 67% self-report category. The test works. The category is wrong. The trial produces a result that is reliable and meaningless.

RECOVER-NEURO enrolled a cohort in which 65.2% reported PEM and assigned them cognitive exercise. RECOVER-ENERGIZE screens them out. Same funding stream, same umbrella condition, same infrastructure, but a different epistemological commitment. One asks "does this work for Long COVID?" The other asks "does this work for Long COVID patients who don't crash after exertion?" The second question is answerable. The first never was.

Caveats

Stussman's exercise cohort is 34 patients. Selection bias is likely: those willing to undergo maximal CPET may represent the less severely affected. The 5.9% figure should be treated as a lower bound, not a census. Thomas's 68-patient study strengthens the objective evidence, but it used submaximal loads and didn't stratify by PEM severity. Appelman's biopsies are from a small cohort and haven't been replicated at scale.

None of this changes the core finding: the gap between 67% and 6% is too large to be measurement noise. Something structural is wrong with the category.

Sources