The same load that built them last week is breaking them this week.
An average effect hides the one thing a coach most needs — the condition under which it flips its sign.
Two athletes. Same prescription — a hard three-week block, load ramped about 30%. One comes out the far side sharper than they have been all season. The other comes out flat, then sick, then hurt. Run the numbers across your whole roster and the relationship between load and next-week performance is a shrug: a small positive effect, wide error bars, nothing you'd stake a plan on.
That average is true. It is also useless — and quietly misleading. It is the arithmetic mean of two opposite truths, and it describes neither athlete. The block built the first one and broke the second, and the average splits the difference into a number that happened to no one.
An average effect is a lie of composition: it can be positive for everyone, negative for everyone, or, most often, a blend of two regimes the mean was never equipped to tell apart.
The problem with one number
01 The condition is the coaching
What separated those two athletes wasn't the load. It was the state they were in when it landed. The first athlete absorbed the block on the back of recovered autonomic function — HRV sitting at or above their own baseline. The second took the same load while their HRV was suppressed, already carrying fatigue the prescription didn't know about.
So the honest answer to "does load help?" isn't yes or no. It's: it depends on HRV — and that dependency has a name.
02 What an effect modifier is
An effect modifier is a variable that changes the strength — or the sign — of a relationship between two others. Load drives performance, but HRV-relative-to-baseline modifies that effect: above the line, more load helps; below it, the same load hurts. The modifier doesn't cause the outcome itself. It sets the terms under which the cause operates.
Plotted against the modifier, the effect isn't a flat line at its average. It's a slope that crosses zero — and the place it crosses is the only number on the chart a coach can actually use.
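One common way to write this down, offered here as an assumption rather than as the engine's actual model, is a linear interaction. The symbols are illustrative: P is next-week performance, L is load, H is HRV relative to the athlete's own baseline, and the betas are coefficients fitted within one athlete.

```latex
% Assumed linear interaction model (illustrative, not confirmed by the source)
P_{t+1} = \beta_0 + \beta_1 L_t + \beta_2 H_t + \beta_3 \, L_t H_t + \varepsilon_t

% The effect of load is no longer one number; it is a line in H
\frac{\partial P_{t+1}}{\partial L_t} = \beta_1 + \beta_3 H_t

% The crossing: the HRV level at which the effect of load changes sign
H^{*} = -\frac{\beta_1}{\beta_3}, \qquad \beta_3 \neq 0
```

Under this sketch, with a positive interaction coefficient, load helps when H is above H* and hurts when it is below. A single pooled slope averages the effect over whatever HRV values happened to be in the data, which is how two opposite regimes blend into a shrug.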
03 Find the threshold; don't assume it
The trap most tools fall into is hard-coding the breakpoint — a fixed HRV percentage, the same for everyone, borrowed from a population study. But the threshold is itself a within-athlete quantity. Where one athlete's effect flips might be five points of HRV above where another's does.
So the engine estimates the threshold from the athlete's own history rather than importing it. That's slower, and it's the entire point: a breakpoint that's true on average is, once again, a number that happened to no one.
- Estimate the effect of load separately across the range of the modifier — not one slope, but how the slope changes.
- Locate where it crosses zero, with an interval around that crossing, not a single confident point.
- Hold the whole thing to the same gate as any other claim: if the data above or below the line is too thin, the threshold is returned as insufficient, not guessed.
"Should we add load?" stops being a question with one answer. It becomes "where is this athlete relative to their threshold today?" — which is a question the data can actually answer.
04 Why an AI coach can't skip this
A language model asked "does load improve performance?" will answer fluently in either direction, because both are defensible on the average. That fluency is the danger: it reads the same whether the model has the evidence or is improvising.
Wiring an agent to the conditional effect closes that gap. Instead of one global verdict, the agent gets a finding bound to a condition — and a license that travels with it:
- ADVISE: when HRV is above the threshold and the interval clears zero. Recommend the load.
- HYPOTHESIZE: near the crossing, where the sign is uncertain. Surface it as a working hypothesis; don't push.
- DECLINE: when the athlete sits in a regime the data hasn't covered. Say nothing yet.
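As a rough sketch, the mapping from a conditional finding to a license could look like the function below. The function name, its arguments, and the idea of representing coverage as a min/max HRV range are assumptions made for illustration. The source also does not say what happens when the effect is clearly negative below the threshold, so this sketch conservatively falls back to HYPOTHESIZE there.

```python
def license_for(hrv_today, threshold, effect_interval, covered_range):
    """Map today's state to ADVISE, HYPOTHESIZE, or DECLINE.

    hrv_today       -- HRV relative to the athlete's own baseline
    threshold       -- estimated crossing, or None if it came back insufficient
    effect_interval -- (low, high) interval for the load effect at today's HRV
    covered_range   -- (min, max) HRV observed in this athlete's history
    """
    cov_min, cov_max = covered_range
    # No usable threshold, or a regime the data has not covered: say nothing yet.
    if threshold is None or not (cov_min <= hrv_today <= cov_max):
        return "DECLINE"
    low, _high = effect_interval
    # Above the threshold with an interval that clears zero: recommend the load.
    if hrv_today > threshold and low > 0:
        return "ADVISE"
    # Assumption: every other covered case, including a clearly negative
    # effect, is surfaced as a working hypothesis rather than pushed.
    return "HYPOTHESIZE"
```

For example, `license_for(5.0, 2.0, (0.01, 0.04), (-4, 8))` returns `"ADVISE"`, while the same finding with `hrv_today=12.0` falls outside the covered range and returns `"DECLINE"`.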
The agent never has to bluff a single answer to a question that has two, because the question was never single to begin with.
05 What we actually ship
Not "load helps performance (p<.05)." That sentence is the thing we're arguing against. What leaves the engine is closer to:
Load's effect on the next benchmark depends on HRV: about −0.056 when HRV is suppressed, flipping to +0.022 when it's recovered. An exploratory modifier estimated from 19 athlete-days, surfaced but not yet confirmed.
A claim with its conditions attached
It's a longer sentence than a readiness score. It's also the difference between a number a coach can defend to an athlete and a number that, three weeks from now, quietly broke one of them.
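One way to keep a claim from being separated from its conditions is to make them part of the same record. The sketch below is hypothetical: the class name, field names, and wording of `render` are invented for illustration, and only the numbers come from the example sentence above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ConditionalFinding:
    cause: str
    outcome: str
    modifier: str
    effect_suppressed: float  # effect when the modifier is suppressed
    effect_recovered: float   # effect when the modifier is recovered
    n_days: int               # athlete-days behind the estimate
    status: str               # e.g. "exploratory" or "confirmed"

    def render(self) -> str:
        """Render the finding as one sentence with its conditions attached."""
        return (
            f"{self.cause}'s effect on {self.outcome} depends on {self.modifier}: "
            f"about {self.effect_suppressed:+.3f} when {self.modifier} is suppressed, "
            f"{self.effect_recovered:+.3f} when it's recovered "
            f"({self.status}, {self.n_days} athlete-days)."
        )

finding = ConditionalFinding(
    cause="Load", outcome="the next benchmark", modifier="HRV",
    effect_suppressed=-0.056, effect_recovered=0.022,
    n_days=19, status="exploratory",
)
print(finding.render())
```

Because the record is frozen and has no field for a single pooled effect, there is no way to print the headline number without the regime it belongs to.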
- effect modifier: A condition that may change a relationship's strength or sign. AthDash tests the condition before promoting the modifier.
- within-athlete: Worked out inside one athlete's own history, not borrowed from a cohort they may not resemble.
- confidence interval: The range the true effect plausibly sits in. We ship it, never a bare number.
- license: How far an AI coach is allowed to act on a finding, DECLINE → ACT.