Field Notes
Method

The same load that built them last week is breaking them this week.

An average effect hides the one thing a coach most needs — the condition under which it flips its sign.

Two athletes. Same prescription — a hard three-week block, load ramped about 30%. One comes out the far side sharper than they have been all season. The other comes out flat, then sick, then hurt. Run the numbers across your whole roster and the relationship between load and next-week performance is a shrug: a small positive effect, wide error bars, nothing you'd stake a plan on.

That average is true. It is also useless — and quietly misleading. It is the arithmetic mean of two opposite truths, and it describes neither athlete. The block built the first one and broke the second, and the average splits the difference into a number that happened to no one.
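To see the arithmetic, here is a minimal sketch with invented numbers (not AthDash data): two athletes on identical loads, one whose performance rises 0.5 units per unit of load and one whose performance falls 0.4. The helper `ols_slope` and every value in it are assumptions made up for illustration.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical weekly loads and responses, invented for illustration only.
loads = [1.0, 2.0, 3.0, 4.0, 5.0]
built = [0.5 * x for x in loads]     # athlete who absorbed the block
broken = [-0.4 * x for x in loads]   # athlete who did not

slope_built = ols_slope(loads, built)
slope_broken = ols_slope(loads, broken)
# Pool both athletes into one regression, as a roster-wide average would.
slope_pooled = ols_slope(loads + loads, built + broken)

print(round(slope_built, 3), round(slope_broken, 3), round(slope_pooled, 3))
```

The pooled slope lands near +0.05: a faint positive that matches neither athlete's actual response.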

An average effect is a lie of composition: it can be positive for everyone, negative for everyone, or, most often, a blend of two regimes the mean was never equipped to tell apart.
The problem with one number
AthDash chart showing the same weekly load reducing performance when HRV is suppressed and improving performance when HRV is recovered.
Same prescription, different athlete state. The useful coaching answer is the condition where the effect changes sign.

01 The condition is the coaching

What separated those two athletes wasn't the load. It was the state they were in when it landed. The first athlete absorbed the block on the back of recovered autonomic function — HRV sitting at or above their own baseline. The second took the same load while their HRV was suppressed, already carrying fatigue the prescription didn't know about.

So the honest answer to "does load help?" isn't yes or no. It's: it depends on HRV — and that dependency has a name.

02 What an effect modifier is

An effect modifier is a variable that changes the strength — or the sign — of a relationship between two others. Load drives performance, but HRV-relative-to-baseline modifies that effect: above the line, more load helps; below it, the same load hurts. The modifier doesn't cause the outcome itself. It sets the terms under which the cause operates.

Plotted against the modifier, the effect isn't a flat line at its average. It's a slope that crosses zero — and the place it crosses is the only number on the chart a coach can actually use.
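One common way to write this down, assuming the effect changes linearly with the modifier (a simplifying assumption, not necessarily the model AthDash fits), is a regression with an interaction term. Here $h$ stands for HRV relative to the athlete's own baseline, and the $\beta$ symbols are illustrative names:

```latex
\begin{aligned}
\text{perf} &= \beta_0 + \beta_1\,\text{load} + \beta_2\,h + \beta_3\,(\text{load}\cdot h) + \varepsilon \\
\frac{\partial\,\text{perf}}{\partial\,\text{load}} &= \beta_1 + \beta_3\,h \\
h^{*} &= -\frac{\beta_1}{\beta_3} \qquad (\beta_3 \neq 0)
\end{aligned}
```

The second line is the conditional effect of load, and $h^{*}$ is where it crosses zero. With $\beta_3 > 0$, the effect is negative below $h^{*}$ and positive above it.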

Chart: conditional effect of load on next-day performance, by HRV vs. baseline. The effect crosses zero at the HRV threshold: load hurts below it, load helps above it.
Below the threshold, adding load costs performance; above it, the same load builds it. The average effect — a faint positive — sits in the gap and describes neither regime.

03 Find the threshold; don't assume it

The trap most tools fall into is hard-coding the breakpoint — a fixed HRV percentage, the same for everyone, borrowed from a population study. But the threshold is itself a within-athlete quantity. Where one athlete's effect flips might be five points of HRV above where another's does.

So the engine estimates the threshold from the athlete's own history rather than importing it. That's slower, and it's the entire point: a breakpoint that's true on average is, once again, a number that happened to no one.

  • Estimate the effect of load separately across the range of the modifier — not one slope, but how the slope changes.
  • Locate where it crosses zero, with an interval around that crossing, not a single confident point.
  • Hold the whole thing to the same gate as any other claim: if the data above or below the line is too thin, the threshold is returned as insufficient, not guessed.
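The three steps above can be sketched in plain Python. This is one possible approach under stated assumptions, not the engine's actual method: it assumes a linear interaction model, uses a percentile bootstrap for the interval, and gates on a hypothetical `min_per_side` count. The function names, the default of 8 days per side, and the synthetic history are all invented for illustration.

```python
import random

def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_crossing(days):
    """Fit perf = b0 + b1*load + b2*hrv + b3*load*hrv by least squares.

    Returns the HRV value where the load effect b1 + b3*hrv equals zero.
    """
    rows = [[1.0, load, hrv, load * hrv] for load, hrv, _ in days]
    ys = [perf for _, _, perf in days]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(4)]
    _, b1, _, b3 = solve(xtx, xty)
    if abs(b3) < 1e-12:
        raise ValueError("effect does not vary with HRV")
    return -b1 / b3

def estimate_threshold(days, min_per_side=8, n_boot=300, seed=0):
    """Return a threshold with a bootstrap interval, or 'insufficient'."""
    try:
        point = fit_crossing(days)
    except ValueError:
        return {"status": "insufficient"}
    below = sum(1 for _, hrv, _ in days if hrv < point)
    if min(below, len(days) - below) < min_per_side:
        return {"status": "insufficient"}  # too thin on one side: do not guess
    rng = random.Random(seed)
    crossings = []
    for _ in range(n_boot):
        sample = [rng.choice(days) for _ in days]  # resample days with replacement
        try:
            crossings.append(fit_crossing(sample))
        except ValueError:
            continue  # degenerate resample, skip it
    if len(crossings) < n_boot // 2:
        return {"status": "insufficient"}
    crossings.sort()
    lo = crossings[int(0.025 * len(crossings))]
    hi = crossings[min(len(crossings) - 1, int(0.975 * len(crossings)))]
    return {"status": "ok", "threshold": point, "interval": (lo, hi)}

# Synthetic athlete history: (load, hrv_vs_baseline, next_performance).
# The true crossing is placed at hrv = 2.0; all values are invented.
noise = random.Random(1)
history = []
for i in range(40):
    hrv = -10 + 0.5 * i
    load = 1 + (2 * i) % 5
    perf = 0.02 * (hrv - 2.0) * load + noise.gauss(0, 0.05)
    history.append((load, hrv, perf))

print(estimate_threshold(history))
```

Because the input is one athlete's own days, the crossing is a within-athlete quantity by construction. A production version would likely need a more flexible model than a single linear interaction, and a more careful check of how well each regime is covered.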
What this changes

"Should we add load?" stops being a question with one answer. It becomes "where is this athlete relative to their threshold today?" — which is a question the data can actually answer.

04 Why an AI coach can't skip this

A language model asked "does load improve performance?" will answer fluently in either direction, because both are defensible on the average. That fluency is the danger: it reads the same whether the model has the evidence or is improvising.

Wiring an agent to the conditional effect closes that gap. Instead of one global verdict, the agent gets a finding bound to a condition — and a license that travels with it:

  • ADVISE when HRV is above the threshold and the interval clears zero — recommend the load.
  • HYPOTHESIZE near the crossing, where the sign is uncertain — surface it as a working hypothesis, don't push.
  • DECLINE when the athlete sits in a regime the data hasn't covered — say nothing yet.

The agent never has to bluff a single answer to a question that has two, because the question was never single to begin with.
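The license rules above can be sketched as a small decision function. The name `license_for`, its inputs, and the order of the checks are assumptions for illustration. The article also does not say which license applies when the effect is clearly negative, so this sketch conservatively falls back to HYPOTHESIZE there; that choice is an assumption too.

```python
from enum import Enum

class License(Enum):
    ADVISE = "ADVISE"
    HYPOTHESIZE = "HYPOTHESIZE"
    DECLINE = "DECLINE"

def license_for(hrv_today, threshold, effect_interval, covered_range):
    """Map today's state to a license (hypothetical helper).

    hrv_today       -- today's HRV relative to the athlete's baseline
    threshold       -- estimated HRV value where the load effect crosses zero
    effect_interval -- (low, high) interval for the load effect at hrv_today
    covered_range   -- (min, max) HRV values seen in the athlete's history
    """
    low, _high = effect_interval
    cov_min, cov_max = covered_range
    # Outside anything the data has covered: say nothing yet.
    if not (cov_min <= hrv_today <= cov_max):
        return License.DECLINE
    # Above the threshold with an interval that clears zero: recommend the load.
    if hrv_today > threshold and low > 0:
        return License.ADVISE
    # Otherwise the sign is uncertain, or the case is not named in the article
    # (assumption): surface it as a working hypothesis rather than advice.
    return License.HYPOTHESIZE

print(license_for(5.0, 2.0, (0.01, 0.04), (-10.0, 10.0)).value)
print(license_for(2.5, 2.0, (-0.01, 0.02), (-10.0, 10.0)).value)
print(license_for(15.0, 2.0, (0.01, 0.04), (-10.0, 10.0)).value)
```

Checking coverage first means a confident-looking interval can never override the fact that the athlete is in territory the data has not seen.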

05 What we actually ship

Not "load helps performance (p<.05)." That sentence is the thing we're arguing against. What leaves the engine is closer to:

Load's effect on the next benchmark depends on HRV: about −0.056 when HRV is suppressed, flipping to +0.022 when it's recovered. An exploratory modifier estimated from 19 athlete-days, surfaced but not yet confirmed.
A claim with its conditions attached

It's a longer sentence than a readiness score. It's also the difference between a number a coach can defend to an athlete and a number that, three weeks from now, quietly broke one of them.
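A claim like that is easier to keep honest if its conditions live in the same record as the number. A minimal sketch of such a record follows; the class and field names are hypothetical and are not AthDash's schema, and only the figures are taken from the example above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ConditionalFinding:
    """A finding that carries its condition, sample size, and status."""
    cause: str
    outcome: str
    modifier: str
    effect_suppressed: float   # effect when the modifier is suppressed
    effect_recovered: float    # effect when the modifier is recovered
    n_athlete_days: int
    status: str                # e.g. "exploratory" or "confirmed"

    def flips_sign(self) -> bool:
        """True when the two regimes have opposite signs."""
        return self.effect_suppressed * self.effect_recovered < 0

    def render(self) -> str:
        """Render the claim with its conditions attached."""
        return (
            f"{self.cause}'s effect on {self.outcome} depends on {self.modifier}: "
            f"about {self.effect_suppressed:+.3f} when suppressed, "
            f"{self.effect_recovered:+.3f} when recovered "
            f"({self.status}, n={self.n_athlete_days} athlete-days)."
        )

finding = ConditionalFinding(
    cause="Load",
    outcome="the next benchmark",
    modifier="HRV",
    effect_suppressed=-0.056,
    effect_recovered=0.022,
    n_athlete_days=19,
    status="exploratory",
)
print(finding.render())
```

Making the record immutable (`frozen=True`) is a design choice for the sketch: the sample size and status cannot be dropped or edited after the finding is created.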

In plain terms
effect modifier
A condition that may change a relationship's strength or sign. AthDash tests the condition before promoting the modifier.
within-athlete
Worked out inside one athlete's own history — not borrowed from a cohort they may not resemble.
confidence interval
The range the true effect plausibly sits in. We ship it, never a bare number.
license
How far an AI coach is allowed to act on a finding, ranging from DECLINE to ACT.