The Science Behind Zone Pedal
20 validation studies. 6,300 safety scenarios. 52 real rides replayed. Zero HR ceiling breaches. Here is exactly how the control system works, what we tested, what we removed when it did not work, and where the honest gaps remain.
Executive Summary
Zone Pedal controls your smart trainer's resistance using heart rate as the primary feedback signal, rather than fixed power targets. The system uses a feedforward-first control architecture with a recursive Bayesian estimator that learns your personal cardiac gain—the relationship between how hard you pedal and how your heart responds.
Conventional smart trainers use ERG mode: they hold a fixed power regardless of your physiological state. During prolonged exercise, cardiac drift causes heart rate to rise 5–20 bpm at constant power (Coyle & Gonzalez-Alonso 2001). ERG mode ignores this. Zone Pedal reduces power to maintain the prescribed heart rate zone, preserving the intended training stimulus.
Key Validation Results
| Claim | Study | Result | Verdict |
|---|---|---|---|
| Maintains HR below ceiling | RV-ZP-07 | 0/6,300 breached HR_max at ≤3x K mismatch | SUPPORTED |
| Maintains HR in target zone | RV-ZP-06 | 93.0% time-in-zone; 88.1% at 2x K mismatch. 8/8 criteria pass. | SUPPORTED |
| Learns rider cardiac gain | RV-ZP-01 | Bias 0.049, RMSE 0.062, CI coverage 99.6% | PARTIALLY SUPPORTED |
| Cross-workout knowledge transfer | PRS-1B | 58.5% feedforward error reduction | SUPPORTED (SYNTHETIC) |
| First-ride calibration quality | RV-ZP-05 | 8/8 criteria pass: prior accuracy, zone accuracy, noise robustness, cross-archetype | SUPPORTED (SYNTHETIC) |
| Compensates for cardiac drift | RV-ZP-03 | Drift MAE 0.071 bpm/min, compensated HR error 0.65 bpm. 8/8 pass. | SUPPORTED (SYNTHETIC) |
All validation is synthetic or computational. No clinical trial has been conducted. No IRB review has been performed. Zone Pedal is a fitness application, not a medical device.
System Architecture
The system models your cardiovascular response as a first-order system: your heart rate at any given power output is determined by a personal baseline plus a gain factor (K) multiplied by power. The controller inverts this model to compute the power needed for your target heart rate, then applies small feedback corrections.
In plain terms: the system predicts what power will put you in zone, sets the trainer there, and makes small adjustments based on what your heart rate actually does. The better it knows your K, the better the first guess.
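The predict-then-adjust idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the production controller: the function names, the 1-second Euler step, and the rider values (baseline 60 bpm, K = 0.30 bpm/W, time constant 40 s) are all hypothetical.

```python
def steady_state_hr(power_w, baseline_bpm, k_bpm_per_w):
    """Steady-state HR under the linear model: baseline + K * power."""
    return baseline_bpm + k_bpm_per_w * power_w


def feedforward_power(target_hr, baseline_bpm, k_bpm_per_w):
    """Invert the model: the power expected to settle HR at target_hr."""
    return (target_hr - baseline_bpm) / k_bpm_per_w


def simulate_hr(power_w, hr_start, baseline_bpm, k_bpm_per_w, tau_s, seconds):
    """First-order lag toward steady state, integrated in 1 s Euler steps."""
    hr = hr_start
    hr_ss = steady_state_hr(power_w, baseline_bpm, k_bpm_per_w)
    for _ in range(seconds):
        hr += (hr_ss - hr) / tau_s
    return hr


# Hypothetical rider: baseline 60 bpm, K = 0.30 bpm/W, tau = 40 s.
power = feedforward_power(135.0, 60.0, 0.30)           # about 250 W
hr_5min = simulate_hr(power, 80.0, 60.0, 0.30, 40.0, 300)
```

With an accurate K, the simulated heart rate settles close to the 135 bpm target without any feedback. An error in K shifts the settling point proportionally, which is why the feedback trim exists.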
The Core Equation
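One form of the feedforward law that is consistent with the model described above and with the symbol definitions that follow is sketched here. It is an assumption, not a quotation of the production formula: $P_{\mathrm{ff}}$ and $HR_{\mathrm{target}}$ are illustrative names, and drift is assumed to enter additively.

$$
HR_{\mathrm{ss}} = \beta_0 + K\,P + \delta
\qquad\Longrightarrow\qquad
P_{\mathrm{ff}} = \frac{HR_{\mathrm{target}} - \beta_0 - \delta}{K_{\mathrm{safe}}}
$$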
Where K is your cardiac gain (bpm per watt), β0 is resting baseline, δ is the drift estimate, and K_safe is a conservative bound on K that biases toward lower power when uncertainty is high. The system earns feedforward authority as the K estimate tightens.
Five Control Modes
The controller selects between five modes depending on workout phase and effort duration. This matters because heart rate lags power changes by 10–80 seconds—closed-loop HR control only works when the effort lasts long enough for the cardiovascular system to reach approximate steady state.
| Mode | When Active | Strategy |
|---|---|---|
| Open Loop Warmup | First 150s + ramp + dither | Power-controlled. HR observed for estimator but not used for control. |
| Calibrating | Initial K estimation | Ternary dither excitation. Estimator running, feedforward not yet active. |
| Feedforward + Trim | Steady-state and long intervals (>3 min) | Feedforward power command + deadband trim. Primary operating mode. |
| Open Loop Recovery | Recovery between intervals | Fixed low power. HR drops naturally without feedback chasing. |
| Power Assist Sprint | Short intervals (<30s) | Power-led targets. HR used only as ceiling. |
Bayesian K Estimation
A bank of parallel Kalman filters, each assuming a different cardiac time constant, runs simultaneously, and the filters are reweighted by their predictive accuracy. The posterior K estimate is a precision-weighted mixture across all models. The coefficient of variation of that estimate (CV_K) determines how much the controller trusts its feedforward estimate: when uncertainty is high, the system defaults to feedback-only trim control.
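A multiple-model estimator of this kind can be sketched as follows. Everything here is an illustrative assumption rather than the shipping estimator: the class and function names, the three candidate time constants, the noise settings, the known baseline, and the reading of "precision-weighted" as model probability multiplied by 1/variance.

```python
import math


class GainFilter:
    """Scalar Kalman filter for K under one assumed time constant."""

    def __init__(self, tau_s, k0=0.30, var0=0.01, q=1e-7, r=0.05):
        self.tau_s, self.k, self.var, self.q, self.r = tau_s, k0, var0, q, r
        self.log_w = 0.0  # unnormalised log model weight

    def update(self, hr_prev, hr_now, power_w, baseline, dt=1.0):
        a = dt / self.tau_s
        # Rearranged first-order model: z = (a * P) * K + noise
        z = hr_now - (1.0 - a) * hr_prev - a * baseline
        h = a * power_w
        self.var += self.q                    # random-walk prediction step
        s = h * h * self.var + self.r         # innovation variance
        e = z - h * self.k                    # innovation
        gain = self.var * h / s
        self.k += gain * e
        self.var *= 1.0 - gain * h
        # Gaussian log-likelihood of the innovation scores this model
        self.log_w += -0.5 * (math.log(2.0 * math.pi * s) + e * e / s)


def mixture(filters):
    """Return (K mean, CV_K, model probabilities) for the whole bank."""
    top = max(f.log_w for f in filters)
    probs = [math.exp(f.log_w - top) for f in filters]
    total = sum(probs)
    probs = [p / total for p in probs]
    raw = [p / f.var for p, f in zip(probs, filters)]   # probability x precision
    norm = sum(raw)
    c = [x / norm for x in raw]
    k_mean = sum(ci * f.k for ci, f in zip(c, filters))
    var = sum(ci * (f.var + (f.k - k_mean) ** 2) for ci, f in zip(c, filters))
    return k_mean, math.sqrt(var) / k_mean, probs


def demo():
    """Noise-free interval ride with true K = 0.35 and true tau = 40 s."""
    baseline, k_true, tau_true = 60.0, 0.35, 40.0
    bank = [GainFilter(t) for t in (20.0, 40.0, 80.0)]
    hr = baseline
    for t in range(1200):
        power = 250.0 if (t // 120) % 2 else 100.0
        hr_next = hr + (baseline + k_true * power - hr) / tau_true
        for f in bank:
            f.update(hr, hr_next, power, baseline)
        hr = hr_next
    return mixture(bank)
```

In the demo, the filter whose assumed time constant matches the simulated rider accumulates the highest likelihood, because only that model explains the transients after each power step. At steady state all three models fit equally well, which is another way of seeing why steady rides carry little information.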
HR Ceiling Validation
The core control question: how precisely does the system respect HR ceilings? We tested this across 6,300 scenarios, including conditions far worse than any real ride would produce.
Triple-Ceiling Overshoot Limiter
Three escalating intervention layers prevent HR overshoot. The limiter uses both a model-based 45-second lookahead and a slope-based 30-second extrapolation, firing whichever triggers first.
| Layer | Trigger | Action |
|---|---|---|
| Freeze | Predicted threat exceeds zone_max + 2 bpm | Hold current power; block increases |
| Emergency | Predicted threat exceeds HR_max − 5 bpm | Reduce power to 70% of current |
| Hard Stop | Smoothed HR exceeds HR_max | Set power to floor |
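The three layers in the table can be expressed as a small decision function. The sketch below is hedged: the thresholds come from the table, but the names, the `floor_w` default of 50 W, and the way the model-based lookahead is passed in as a precomputed value are assumptions.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"
    FREEZE = "freeze"
    EMERGENCY = "emergency"
    HARD_STOP = "hard_stop"


def predicted_threat(hr_now, slope_bpm_per_s, model_hr_in_45s):
    """Worst case of the 30 s slope extrapolation and the 45 s model lookahead."""
    slope_based = hr_now + slope_bpm_per_s * 30.0
    return max(slope_based, model_hr_in_45s)


def limit_power(requested_w, current_w, smoothed_hr, threat_hr,
                zone_max, hr_max, floor_w=50.0):
    """Return (action, allowed power), checking the most severe layer first."""
    if smoothed_hr > hr_max:
        return Action.HARD_STOP, floor_w
    if threat_hr > hr_max - 5.0:
        return Action.EMERGENCY, max(floor_w, 0.70 * current_w)
    if threat_hr > zone_max + 2.0:
        # Hold current power: decreases pass through, increases are blocked.
        return Action.FREEZE, min(requested_w, current_w)
    return Action.NONE, requested_w
```

Taking the maximum of the two predictions is equivalent to letting whichever predictor crosses a threshold first fire the layer.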
Study RV-ZP-07: 6,300 Validation Scenarios
Deterministic simulations across 8 experiment classes: K mismatch sweeps (up to 4x error), time constant mismatch, sensor latency (up to 10 seconds BLE delay), near-maximum HR starts, prolonged freeze behavior, cardiac drift, and cross-population sweeps.
| Metric | Observed | Threshold | Status |
|---|---|---|---|
| HR_max breach rate at ≤3x K mismatch | 0.000 | 0.00 | PASS |
| Peak overshoot at 2x K mismatch | 0.0 bpm | ≤ 5.0 bpm | PASS |
| Peak overshoot at 3x K mismatch | 0.0 bpm | ≤ 8.0 bpm | PASS |
| Time to freeze at 2x K mismatch | 0.0 s | ≤ 15.0 s | PASS |
| Slope trigger latency (2 bpm/s rise) | 10.0 s | ≤ 10.0 s | PASS |
| HR_max breach at 10s sensor delay + 2x K | 0.000 | 0.00 | PASS |
| Freeze recovery within 30s | 0.98 | ≥ 0.95 | PASS |
| All archetypes zero breaches at 2x K | 0.000 | 0.00 | PASS |
8 of 8 success criteria passed.
Even at 4x K mismatch (estimated K is one-quarter of true K), worst-case overshoot is 3.08 bpm above zone maximum, with zero HR_max breaches at every mismatch level tested. The 1.23 bpm overshoot observed at 1.0x (accurate K) reflects the feedforward confidently commanding higher power; at moderate mismatch levels, K_safe is more conservative and undershoots, producing zero overshoot.
Control Quality
The core product question: does the controller actually keep heart rate in the target zone? Study RV-ZP-06 tested this across 22,400 simulated rides covering 4 rider archetypes, varying K mismatch, time constant mismatch, cardiac drift, and dead time variation. Results below reflect the shipping v4 controller, verified by provenance audit (28 parameters matched across Python simulation and Swift production code).
| Metric | Observed | Threshold | Status |
|---|---|---|---|
| Time-in-zone (1x K, 30 min) | 0.930 | ≥ 0.85 | PASS |
| Time-in-zone (2x K mismatch) | 0.881 | ≥ 0.70 | PASS |
| Estimator convergence to 20% of K_true | 20.1 ticks | ≤ 30 ticks | PASS |
| Zone 2 to Zone 3 transition time | 53.2 s | ≤ 60 s | PASS |
| Drift compensation at 0.15 bpm/min | 0.818 | ≥ 0.80 | PASS |
| w_ff at sigma_K ≤ 0.10 | 1.000 | ≥ 0.90 | PASS |
| w_ff at sigma_K ≥ 0.25 | 0.000 | ≤ 0.10 | PASS |
| Dead-time mismatch overshoot | 2.35 bpm | ≤ 8.0 bpm | PASS |
8 of 8 success criteria passed. The v4 controller trades slightly lower nominal time-in-zone for dramatically improved safety: freeze recovery improved from 9% to 98%, BLE sensor delay breaches were eliminated entirely, and zone transitions are now faster (53s vs 93s previously). The engineering choice was clear: precise ceiling enforcement and robust recovery matter more than chasing the last few percentage points of zone accuracy.
The v4 controller achieves 93% at nominal K and 88%+ under 2x mismatch. The tradeoff vs. the prior controller (which showed 100% everywhere) was deliberate: the v4 safety system prevents freeze lockups and BLE delay breaches, at a small cost to zone precision.
At physiologically typical drift rates (0.05–0.15 bpm/min), the system maintains >98% time-in-zone. At extreme rates (dehydration, heat stress), it degrades gracefully.
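The two feedforward-weight criteria in the table (w_ff of 1.0 at sigma_K ≤ 0.10 and 0.0 at sigma_K ≥ 0.25) fix only the endpoints of the trust schedule. The sketch below fills the gap with a linear ramp and blends toward the current power plus trim. Both the ramp shape and the blend are assumptions made for illustration, as are the function names.

```python
def feedforward_weight(sigma_k, full_at=0.10, zero_at=0.25):
    """Feedforward authority: 1.0 when K is tight, 0.0 when K is uncertain."""
    if sigma_k <= full_at:
        return 1.0
    if sigma_k >= zero_at:
        return 0.0
    return (zero_at - sigma_k) / (zero_at - full_at)  # assumed linear ramp


def blended_command(ff_power_w, current_power_w, trim_w, sigma_k):
    """With zero authority the command reduces to feedback-only trim."""
    w = feedforward_weight(sigma_k)
    return w * ff_power_w + (1.0 - w) * current_power_w + trim_w
```

For example, `blended_command(250.0, 200.0, -5.0, 0.30)` ignores the feedforward target entirely and returns the current power adjusted by the trim.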
Cardiac Model Characterization
The system's ability to predict the right power depends on how well it knows your cardiac gain (K). We tested K estimation across 1,200 synthetic rides with known ground truth, varying the workout protocol and rider archetype.
The Protocol Matters
The single most important finding: interval workouts give approximately 10x better K estimates than Zone 2 steady-state rides. This is not a software limitation—it is a mathematical fact. Without meaningful power variation, there is near-zero information about how your heart rate responds to power changes.
Intervals give ~10x better K estimates than Zone 2. The structural observability limit is immediately obvious.
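The observability argument can be made concrete with the standard linear-regression result: for HR = β0 + K·P + noise with an unknown intercept, the Fisher information about K is the sum of squared power deviations divided by the noise variance. The sketch below uses that static approximation (it ignores cardiac dynamics), and the two power profiles are invented for illustration. It is not a reproduction of the study, so its ratio should not be read as the 10x figure.

```python
import statistics


def fisher_information_k(powers_w, hr_noise_sd):
    """Information about K in HR = b0 + K * P + noise, with b0 unknown."""
    mean_p = statistics.fmean(powers_w)
    return sum((p - mean_p) ** 2 for p in powers_w) / hr_noise_sd ** 2


def k_std_lower_bound(powers_w, hr_noise_sd):
    """Cramer-Rao lower bound on the standard deviation of any unbiased K estimate."""
    info = fisher_information_k(powers_w, hr_noise_sd)
    return float("inf") if info == 0 else info ** -0.5


# Hypothetical 10-minute rides sampled at 1 Hz.
zone2 = [150 + (i % 5 - 2) for i in range(600)]                 # +/- 2 W jitter
intervals = [120 if (i // 60) % 2 == 0 else 260 for i in range(600)]
```

With 2 bpm of heart rate noise, the bound for the steady ride is far looser than for the interval ride, and a perfectly constant power gives an infinite bound: K is not identifiable at all.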
Cross-Workout Transfer: The Value Proposition
Does K learned during intervals improve control during Zone 2? This is the core product thesis. Study PRS-1B tested it directly: 10 riders completed 5 interval rides, then 5 Zone 2 rides. Two passes: one resetting K before each ride, one carrying the posterior forward.
Interval rides teach the system your cardiac gain. That knowledge makes Zone 2 rides dramatically better. This is the switching cost—you cannot get this from a single ride.
Calibration Protocol Design
43,200 simulations across 216 protocol variants identified optimal calibration ride design. The result is practical: duration dominates. If you are going to do one thing right in a calibration ride, make it long enough.
Duration is the dominant factor (53.7% of variance). A 15-minute step ramp from 40% to 85% HR_max is the recommended calibration protocol.
Real-Data Characterization
47 real cycling rides from 5 athletes in the GoldenCheetah OpenData corpus were processed. The K distribution (mean 0.323 bpm/W, range 0.200–0.641) overlaps published literature. The negative correlation between K and peak power (r = −0.72) is directionally consistent with exercise physiology: fitter riders have lower cardiac gain.
However, no ground-truth K values exist for real rides. These results characterize the estimator's behavior on real data but cannot assess accuracy. The 25.5% floor-clamp rate (K hitting the 0.20 lower bound) likely reflects elite riders whose true K falls below the parameterized range.
VO2max Estimation
Zone Pedal estimates VO2max using the Storer-Davis cycle ergometry formula (Storer et al. 1990) applied to a maximum power extrapolated from the Bayesian K estimate. A 95% confidence interval is derived by propagating K posterior uncertainty through the calculation.
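A sketch of how such a calculation could be wired together is shown below. The regression coefficients are the ones commonly quoted from Storer et al. (1990) and should be checked against the paper before use. The extrapolation of maximum power as (HR_max − baseline) / K and the interval obtained by evaluating at K ± 1.96 standard deviations are assumptions, not the documented Zone Pedal method. All function names are hypothetical.

```python
def storer_vo2max_ml_min(max_power_w, mass_kg, age_yr, male=True):
    """Storer-Davis cycle ergometry regression, result in mL/min."""
    if male:
        return 10.51 * max_power_w + 6.35 * mass_kg - 10.49 * age_yr + 519.3
    return 9.39 * max_power_w + 7.7 * mass_kg - 5.88 * age_yr + 136.7


def extrapolated_max_power(hr_max, baseline_bpm, k_bpm_per_w):
    """Assumed extrapolation: extend the linear HR-power model up to HR_max."""
    return (hr_max - baseline_bpm) / k_bpm_per_w


def vo2max_with_interval(hr_max, baseline_bpm, k_mean, k_sd, mass_kg, age_yr,
                         male=True, z=1.96):
    """Return (low, point, high) in mL/kg/min by mapping K quantiles through."""
    if k_mean - z * k_sd <= 0:
        raise ValueError("K posterior too wide to extrapolate")

    def rel_vo2(k):
        watts = extrapolated_max_power(hr_max, baseline_bpm, k)
        return storer_vo2max_ml_min(watts, mass_kg, age_yr, male) / mass_kg

    # A higher K implies a lower maximum power, so the bounds swap.
    return rel_vo2(k_mean + z * k_sd), rel_vo2(k_mean), rel_vo2(k_mean - z * k_sd)
```

Because the estimate is monotonic in K, mapping the K interval endpoints through the formula gives a matching interval on VO2max under a Gaussian posterior assumption.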
Quality Gating
Estimates are only produced when the K posterior is sufficiently tight. This prevents the system from producing VO2max estimates from uncertain K values.
| CV_K Range | Quality | Estimate Produced? |
|---|---|---|
| < 0.10 | High | Yes |
| 0.10 – 0.20 | Moderate | Yes |
| 0.20 – 0.25 | Low | Yes |
| ≥ 0.25 | Insufficient | No |
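The gating table maps directly onto a lookup. In this minimal sketch the function names are hypothetical, and treating each range as inclusive of its lower bound is an assumption, since the table does not say which side the boundaries fall on.

```python
def vo2max_quality(cv_k):
    """Quality label for a VO2max estimate given the K coefficient of variation."""
    if cv_k < 0.10:
        return "High"
    if cv_k < 0.20:
        return "Moderate"
    if cv_k < 0.25:
        return "Low"
    return "Insufficient"


def estimate_produced(cv_k):
    """No estimate is produced once CV_K reaches 0.25."""
    return vo2max_quality(cv_k) != "Insufficient"
```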
VO2max estimation was validated against synthetic ground truth, not against laboratory measurements. No head-to-head comparison with laboratory VO2max testing or other consumer device estimates has been performed.
Physiological Feature Validation
HRV Readiness
The rMSSD computation matches ground truth exactly (r = 1.000). Readiness classification exits learning mode after exactly 5 reliable rides. Normal-day readiness detection is strong (99.7% sensitivity), while detection of moderate suppression and recovery days is weaker, reflecting the difficulty of distinguishing moderate physiological states from noise.
| Ground Truth State | Sensitivity | Specificity |
|---|---|---|
| Ready (normal day) | 99.7% | 76.2% |
| Moderate suppression | 53.6% | 90.3% |
| Recovery day | 61.7% | 100.0% |
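For reference, rMSSD is the root mean square of successive differences between adjacent RR intervals. The sketch below shows the standard definition; the function name is illustrative, and artifact filtering (ectopic beats, dropouts) is deliberately left out.

```python
import math


def rmssd_ms(rr_intervals_ms):
    """Root mean square of successive RR-interval differences, in ms."""
    if len(rr_intervals_ms) < 2:
        raise ValueError("need at least two RR intervals")
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))
```

For the RR series 800, 810, 790, 800 ms the successive differences are 10, −20, and 10 ms, giving an rMSSD of the square root of 200, or about 14.1 ms.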
Fatigue Detection: Built, Tested, Removed
We built a fatigue detector that tracked K trajectory changes during a ride. The chart below shows its pre-mitigation performance: 92–95% true positive rate, but 68% false positive rate at typical chest strap noise levels (2 bpm). After mitigation (raised thresholds + confidence gate), both TPR and FPR dropped to ~3%. The single-signal approach cannot separate fatigue from sensor noise at consumer hardware accuracy.
We removed the feature entirely from the shipping app. A detector that fires on two-thirds of non-fatigued rides, or one that catches 3% of actual fatigue, has negative value—it trains users to ignore the system. The chart remains here as documentation of why.
At typical chest strap noise (≥2 bpm), the false positive rate exceeds any useful threshold. After mitigation, both TPR and FPR collapsed to ~3%. The feature was removed rather than shipped in a state that would erode trust in the rest of the system.
Bad Legs Day Detection
When your cardiac gain deviates from its learned baseline during a ride, the system detects and classifies the severity. Study RV-ZP-04 validated the classification algorithm across 1,600+ scenarios.
| Metric | Observed | Threshold | Status |
|---|---|---|---|
| Classification accuracy at exact levels | 1.0 | ≥ 0.95 | PASS |
| Progressive timing maximum error | 0.04 min | ≤ 3 min | PASS |
| Drift-adjusted false positive rate | 0.0% | ≤ 5% | PASS |
| CV gating at CV > 0.20 | 100% suppression | 100% | PASS |
| Cross-archetype accuracy variance | 0.0 pct-pts | ≤ 5.0 pct-pts | PASS |
8 of 8 success criteria passed. The 15/25/40% deviation thresholds are engineering choices, not physiologically derived. K estimation noise (upstream input quality) is validated separately in RV-ZP-01.
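The thresholds and the CV gate can be combined into a simple classifier. In this sketch the 15/25/40% cut points and the CV > 0.20 suppression come from the text above; the severity labels, the function name, and the use of absolute deviation (rather than increases only) are assumptions.

```python
def classify_k_deviation(k_now, k_baseline, cv_k):
    """Classify in-ride deviation of K from its learned baseline."""
    if cv_k > 0.20:
        return "suppressed"        # K too uncertain to classify
    deviation = abs(k_now - k_baseline) / k_baseline
    if deviation >= 0.40:
        return "severe"
    if deviation >= 0.25:
        return "moderate"
    if deviation >= 0.15:
        return "mild"
    return "normal"
```

For a rider with a learned baseline of 0.30 bpm/W, an in-ride K of 0.39 is a 30% deviation and lands in the middle band, provided CV_K is at or below 0.20.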
Tau Estimation (Recovery Kinetics)
The system estimates your cardiac time constants—how quickly your heart rate rises to a new steady state (tau_up) and how quickly it recovers (tau_down). Study RV-ZP-02 validated this across 2,880 scenarios.
| Metric | Observed | Threshold | Status |
|---|---|---|---|
| tau_down mean absolute error | 5.6% | ≤ 15% | PASS |
| tau_up mean absolute error | 1.1% | ≤ 20% | PASS |
| Noise robustness at 5 bpm noise | 1.16x degradation | ≤ 2.0x | PASS |
| Change detection correlation | r = 0.98 | ≥ 0.70 | PASS |
6 of 6 success criteria passed. Validated against synthetic cardiac dynamics with known ground truth.
Drift Compensation
Cardiac drift—the progressive rise in heart rate at constant power during prolonged exercise—is the core problem Zone Pedal solves. Study RV-ZP-03 validated the drift estimator directly.
| Metric | Observed | Threshold | Status |
|---|---|---|---|
| Drift rate MAE at σ=1 | 0.071 bpm/min | ≤ 0.10 | PASS |
| Compensated HR error | 0.65 bpm | ≤ 3 | PASS |
| Recovery reset error | 0.0 | ≤ 0.10 | PASS |
| 60-min convergence MAE | 0.040 | ≤ 0.10 | PASS |
| Cross-archetype ratio | 1.0x | ≤ 2.0x | PASS |
8 of 8 success criteria passed. At physiologically typical drift rates (0.05–0.15 bpm/min), the drift-compensated heart rate error is 0.65 bpm.
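A worked example shows what drift compensation means in watts. The sketch below estimates drift with an ordinary least-squares slope, which is a stand-in for whatever estimator the product uses, and then reduces the feedforward power by the accumulated drift divided by K. The names and rider values are hypothetical.

```python
def drift_rate_bpm_per_min(times_min, hr_bpm):
    """Least-squares slope of HR against time at constant power."""
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_hr = sum(hr_bpm) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (h - mean_hr) for t, h in zip(times_min, hr_bpm))
    return sxy / sxx


def drift_compensated_power(target_hr, baseline_bpm, k_bpm_per_w, drift_bpm):
    """Power that holds target HR once drift_bpm of drift has accumulated."""
    return (target_hr - baseline_bpm - drift_bpm) / k_bpm_per_w


# 0.15 bpm/min for 40 minutes is 6 bpm of accumulated drift.
times = list(range(41))
hr = [140.0 + 0.15 * t for t in times]
rate = drift_rate_bpm_per_min(times, hr)
power = drift_compensated_power(135.0, 60.0, 0.30, rate * 40)
```

For this rider the uncompensated command would be 250 W. After 6 bpm of drift, holding the same heart rate takes about 230 W, a 20 W reduction that fixed-power ERG mode would not make.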
Beyond Our Own Model
Most validation in fitness apps tests the system against its own assumptions. If the estimator assumes a first-order cardiac model, and the simulator uses the same first-order model, then perfect results prove internal consistency—not real-world robustness. We built a second simulator that deliberately violates our model's assumptions, and replayed 52 real cycling rides through the system.
Three Evidence Tiers
Every study in this document falls into one of three tiers:
| Tier | What It Proves | Example |
|---|---|---|
| Exact-match simulation | Internal consistency: the system works when reality matches its assumptions | RV-ZP-01, RV-ZP-06 |
| Robustness simulation | Resilience: the system degrades gracefully when assumptions are violated | RV-EXP-4/5 (out-of-family model) |
| Real-data replay | Ecological validity: the system produces plausible results on actual rides | RV-EXP-6, CV-ZP-1/2 |
Out-of-Family Model Testing (RV-EXP-4/5)
We built a second cardiac simulator with six physiological effects our production model cannot capture: logistic HR saturation near max, time-varying K and tau (warmup acceleration + fatigue decay), BLE latency jitter and dropout, Student-t noise with ectopic spikes, heat/hydration drift, and a soft HR ceiling. Then we ran the full validation suite against it.
| Behavior | Count | Meaning |
|---|---|---|
| MAINTAINS | 12 | Performance equivalent to ideal-model testing |
| DEGRADES | 5 | Measurable loss but still functional |
| BREAKS | 3 | Falls below acceptable threshold |
Safety held everywhere: zero new ceiling activations, zero HR_max breaches. The primary boundary is time-varying K/tau (warmup + fatigue dynamics). Trained riders are most robust (100% zone accuracy under OOF conditions). BLE latency and dropout: MAINTAINS across all archetypes.
Real-Ride Replay (RV-EXP-6)
52 real cycling rides (47 from the GoldenCheetah OpenData corpus + 5 from the developer's Tacx Neo 2T) were replayed through the system. Total: 77 ride-hours.
| Metric | Value |
|---|---|
| Rides replayed | 52 |
| Total ride-hours | 77 |
| K estimate median | 0.31 bpm/W (matches synthetic distribution) |
| Feedforward RMSE median | 26.3 bpm |
| BLE delay sensitivity (0→10s) | +5% RMSE (negligible) |
Real rides introduce dynamics the production model does not capture: nonlinear cardiac responses, autonomic nervous system effects, environmental conditions, and sensor artifacts. The system produces physiologically plausible K estimates and degrades gracefully rather than failing catastrophically.
DFA Alpha1 Threshold Detection (RV-ZP-11)
DFA alpha1 estimates aerobic (VT1) and anaerobic (VT2) thresholds from heart rate variability during a stepped power ramp. The Threshold Finder workout uses a 35-minute protocol with 6 power steps.
RV-ZP-11 tested four dimensions across 16 criteria:
| Experiment | What It Tests | Result |
|---|---|---|
| Alpha1 accuracy | Computation against known signals (white noise, 1/f, physiological) | 4/4 PASS |
| Threshold detection | VT1/VT2 crossing on synthetic ramps (gradual, steep, varying gaps) | 4/4 PASS |
| Noise robustness | Detection under typical chest strap noise (σ=10ms, 2% ectopic, 1% dropout) | 4/4 PASS |
| Protocol end-to-end | Full Threshold Finder workout simulation with 5 rider archetypes | 4/4 PASS |
Noise shifts alpha1 values upward but preserves the descent shape. A piecewise linear breakpoint detector finds the inflection point where cardiac control transitions, making threshold detection robust to the artifact levels typical of consumer chest strap monitors.
Known Limitations and Honest Gaps
All Validation Is Synthetic or Computational
Most results in this document come from deterministic synthetic simulations or Monte Carlo parameter recovery on simulated riders. The "Beyond Our Own Model" section describes out-of-family model testing and real-ride replay (52 rides, 77 hours), which extend beyond internal-consistency validation. No clinical trial has been conducted. No real riders have been studied under controlled conditions with the Zone Pedal controller active. Real-world performance will be affected by nonlinear cardiac dynamics, autonomic nervous system effects, temperature, hydration, medication, and sensor artifacts.
K Estimation Requires Power Variation
The estimator learns your cardiac gain (K) from power variation during a ride. A power-variation gate prevents low-information rides (like Zone 2 steady state) from corrupting the K estimate. In practice, this means interval workouts and calibration rides teach the system; Zone 2 rides use what was already learned. First-ride calibration quality has been validated (RV-ZP-05, 8/8 pass), and the Discover Your Cardiac Response ride achieves K error of 0.0003 in simulation.
Zone 2 Is Structurally Unobservable
Zone 2 rides provide near-zero Fisher information about K because power variation is minimal. The estimator cannot learn K from Zone 2 rides alone. This is not a defect—it is a mathematical fact: without power variation, the HR-power relationship is not identifiable. The system handles this by retaining K from prior informative rides and gating K updates when power variance is below 1.0 W².
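The gate described above amounts to a variance check on recent power. In this minimal sketch the 1.0 W² threshold comes from the text, while the function name, the use of population variance, and the idea of evaluating it over a caller-chosen window are assumptions.

```python
import statistics


def k_update_allowed(recent_power_w, min_variance_w2=1.0):
    """Allow K updates only when recent power varies enough to be informative."""
    if len(recent_power_w) < 2:
        return False
    return statistics.pvariance(recent_power_w) >= min_variance_w2
```

A window of 150, 151, 149, 150 W has a variance of 0.5 W², so the update is blocked and the previously learned K is retained.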
Fatigue Detection Was Removed
We built a fatigue detector, validated it (PRS-6: 68% false positive rate at consumer chest strap noise), tried to fix it (thresholds + confidence gate reduced FPR to ~3% but also reduced TPR to ~3%), and removed it from the shipping app. The single-signal approach, which thresholds relative K change, cannot separate fatigue from sensor noise. We chose to remove the feature rather than ship one that erodes trust in the rest of the system. See "Fatigue Detection: Built, Tested, Removed" for the full analysis.
Limited Real-World Data
52 rides from 6 athletes (47 GoldenCheetah + 5 developer rides). Sufficient to characterize K distribution and validate replay behavior, but insufficient for population-level inference. Elite, elderly, cardiac-compromised, and pediatric populations are not represented.
No Head-to-Head Outcome Comparison
No study has compared training outcomes between Zone Pedal's HR-adaptive control and conventional power-based ERG training. The evidence for HR-based training equivalence comes from Akubat et al. (2013), not from Zone Pedal's specific implementation.
No Medical Device Claim
Zone Pedal is a fitness application. It has not been evaluated by the FDA or any regulatory body. It does not diagnose, treat, or prevent any disease. Users with cardiovascular conditions should consult their physician before using any exercise equipment.
Unvalidated Capabilities
| Capability | Status |
|---|---|
| VO2max estimation against laboratory reference | UNVALIDATED |
| Long-term fitness trend detection from K trajectory | UNVALIDATED |
References
Published Literature
Akubat I, Patel E, Barrett S, Sherwin Z. Methods of monitoring training load and their relationships to changes in fitness and performance in competitive road cyclists. J Sports Med Phys Fitness. 2013. PMC3737823.
Argha A, Su SW, Celler BG. Automated PID control of heart rate during treadmill exercise. J Biomech Eng. 2016.
Argha A, Su SW, Celler BG. Heart rate regulation during cycle-ergometer exercise via bio-feedback. Conf Proc IEEE Eng Med Biol Soc. 2017.
Hunt KJ, Fankhauser SE. Heart rate control during treadmill exercise using input-sensitivity shaping. J Sports Sci Med. 2019;18(1):47-55. PMC6370964.
Hunt KJ, Hurlimann N, Fankhauser SE. Physiological systems modelling for heart rate control during cycle-ergometer exercise. Proc Inst Mech Eng H. 2024.
Coyle EF, Gonzalez-Alonso J. Cardiovascular drift during prolonged exercise. Sports Med. 2001.
Achten J, Jeukendrup AE. Heart rate monitoring: applications and limitations. Sports Med. 2003.
Storer TW, Davis JA, Caiozzo VJ. Accurate prediction of VO2max in cycle ergometry. Med Sci Sports Exerc. 1990;22(5):704-712.
Zone Pedal Validation Studies
| Study | Type | Scenarios | Key Finding |
|---|---|---|---|
| RV-ZP-10 | Formal RV (FTP/NP/IF/TSS) | 100+ | NP error 0.0W, TSS monotonic; 9/9 PASS |
| RV-ZP-09 | Formal RV (hysteresis loop) | 800+ | Loop area error 0.27%, fatigue expansion 100%; 8/8 PASS |
| RV-ZP-08 | Formal RV (interval recovery tau) | 1,200+ | NLLS fit MAE 2.4% (was 35.2% with linearized OLS) |
| RV-ZP-07 | Formal RV (ceiling) | 6,300 | 0 HR_max breaches; 8/8 PASS |
| RV-ZP-06 | Formal RV (control law) | 22,400 | 93% time-in-zone; 8/8 PASS (v4 controller) |
| RV-ZP-05 | Formal RV (first-ride calibration) | 1,000+ | K error 0.0003, zone accuracy, noise robustness; 8/8 PASS |
| RV-ZP-04 | Formal RV (bad legs day) | 1,600+ | Exact accuracy 1.0, drift FPR 0.0%; 8/8 PASS |
| RV-ZP-03 | Formal RV (drift compensation) | 800+ | Drift MAE 0.071, HR error 0.65 bpm; 8/8 PASS |
| RV-ZP-02 | Formal RV (tau estimation) | 2,880 | tau_down MAE 5.6%, tau_up MAE 1.1%; 6/6 PASS |
| RV-ZP-01 | Formal RV (K estimation) | 21,400 | Bias 0.049, RMSE 0.062; 3/5 PASS |
| RV-EXP-4/5 | Out-of-family robustness | 20 archetypes | 12 MAINTAINS, 5 DEGRADES, 3 BREAKS; safety 100% |
| RV-EXP-6 | Real-ride replay | 52 rides (77 hrs) | K median 0.31, feedforward RMSE 26.3 bpm |
| PRS-1 | Monte Carlo | 3,200 | K recovery r=0.570 overall |
| PRS-1B | Monte Carlo (transfer) | 300 | 58.5% Zone 2 feedforward improvement |
| PRS-3 | Real-data benchmark | 2 rides | Analyzer characterization |
| PRS-4 | Drift investigation | 248 rides | Cardiac drift 0.24 bpm/min |
| PRS-5 | Protocol optimization | 43,200 | 15-min step ramp optimal (r=0.94) |
| PRS-6 | HRV/fatigue validation | 400+ | HRV r=1.0; fatigue FPR=68% (feature removed) |
| CV-ZP-1+2 | Computational (real data) | 47 rides | K distribution matches literature |