VALIDATION & METHODOLOGY

The Science Behind Zone Pedal

20 validation studies. 6,300 safety scenarios. 52 real rides replayed. Zero HR ceiling breaches. Here is exactly how the control system works, what we tested, what we removed when it did not work, and where the honest gaps remain.

20 Validation studies
0 HR ceiling breaches
93% Time-in-zone
58.5% Cross-workout improvement

01 — Overview

Executive Summary

Zone Pedal controls your smart trainer's resistance using heart rate as the primary feedback signal, rather than fixed power targets. The system uses a feedforward-first control architecture with a recursive Bayesian estimator that learns your personal cardiac gain—the relationship between how hard you pedal and how your heart responds.

Conventional smart trainers use ERG mode: they hold a fixed power regardless of your physiological state. During prolonged exercise, cardiac drift causes heart rate to rise 5–20 bpm at constant power (Coyle & Gonzalez-Alonso 2001). ERG mode ignores this. Zone Pedal reduces power to maintain the prescribed heart rate zone, preserving the intended training stimulus.

Key Validation Results

Claim | Study | Result | Verdict
Maintains HR below ceiling | RV-ZP-07 | 0/6,300 breached HR_max at ≤3x K mismatch | SUPPORTED
Maintains HR in target zone | RV-ZP-06 | 93.0% time-in-zone; 88.1% at 2x K mismatch. 8/8 criteria pass. | SUPPORTED
Learns rider cardiac gain | RV-ZP-01 | Bias 0.049, RMSE 0.062, CI coverage 99.6% | PARTIALLY SUPPORTED
Cross-workout knowledge transfer | PRS-1B | 58.5% feedforward error reduction | SUPPORTED (SYNTHETIC)
First-ride calibration quality | RV-ZP-05 | 8/8 criteria pass: prior accuracy, zone accuracy, noise robustness, cross-archetype | SUPPORTED (SYNTHETIC)
Compensates for cardiac drift | RV-ZP-03 | Drift MAE 0.071 bpm/min, compensated HR error 0.65 bpm. 8/8 pass. | SUPPORTED (SYNTHETIC)

All validation is synthetic or computational. No clinical trial has been conducted. No IRB review has been performed. Zone Pedal is a fitness application, not a medical device.


02 — How It Works

System Architecture

The system models your cardiovascular response as a first-order system: your heart rate at any given power output is determined by a personal baseline plus a gain factor (K) multiplied by power. The controller inverts this model to compute the power needed for your target heart rate, then applies small feedback corrections.

In plain terms: the system predicts what power will put you in zone, sets the trainer there, and makes small adjustments based on what your heart rate actually does. The better it knows your K, the better the first guess.
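A minimal sketch of the first-order response described above, for intuition only. The time constant, the forward-Euler update, and the rider numbers are assumptions made for illustration; they are not taken from the production model.

```python
def simulate_hr(power_w, beta0, k, tau_s, hr0, dt_s=1.0):
    """Simulate HR as a first-order lag toward HR_ss = beta0 + K * P."""
    hr = hr0
    trace = []
    for p in power_w:
        hr_ss = beta0 + k * p                 # steady-state HR for this power
        hr += (dt_s / tau_s) * (hr_ss - hr)   # forward-Euler step toward steady state
        trace.append(hr)
    return trace


# Hypothetical rider: baseline 70 bpm, K = 0.32 bpm/W, time constant 40 s.
trace = simulate_hr([150.0] * 600, beta0=70.0, k=0.32, tau_s=40.0, hr0=70.0)
print(round(trace[39], 1), round(trace[-1], 1))
```

After ten minutes at 150 W the simulated heart rate settles at the steady-state value 70 + 0.32 × 150 = 118 bpm; after one time constant it has covered roughly two-thirds of that rise, which is why the first guess matters more than fast feedback.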

The Core Equation

HR_ss(P) = β0 + K · P
Cardiac response model
P_ff = (HR_target − β0 − δ) / K_safe
Feedforward power command

Here K is your cardiac gain (bpm per watt), β0 is the resting baseline, δ is the drift estimate, and K_safe is a conservative bound on K that biases toward lower power when uncertainty is high (K_safe is the divisor, so a larger K_safe yields a lower commanded power). The system earns feedforward authority as the K estimate tightens.
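The feedforward command can be sketched in a few lines. The form of K_safe used here (posterior mean plus a multiple of the posterior standard deviation) and the power clamps are assumptions for illustration; the text above only states that K_safe is a conservative bound.

```python
def k_safe(k_mean, k_std, z=1.0):
    """Conservative gain: inflate K by z standard deviations (assumed form)."""
    return k_mean + z * k_std


def feedforward_power(hr_target, beta0, drift, k_mean, k_std,
                      p_min=0.0, p_max=400.0):
    """P_ff = (HR_target - beta0 - delta) / K_safe, clamped to assumed trainer limits."""
    p = (hr_target - beta0 - drift) / k_safe(k_mean, k_std)
    return max(p_min, min(p_max, p))


# Same target, same mean K; only the uncertainty and the drift estimate change.
uncertain = feedforward_power(135.0, 70.0, 0.0, k_mean=0.32, k_std=0.08)
confident = feedforward_power(135.0, 70.0, 0.0, k_mean=0.32, k_std=0.01)
drifted = feedforward_power(135.0, 70.0, 5.0, k_mean=0.32, k_std=0.01)
print(round(uncertain, 1), round(confident, 1), round(drifted, 1))
```

With these hypothetical numbers the uncertain estimate commands about 34 W less than the confident one, and a 5 bpm drift estimate removes roughly another 15 W. That is the mechanism by which power falls as drift accumulates while heart rate stays on target.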

Five Control Modes

The controller selects between five modes depending on workout phase and effort duration. This matters because heart rate lags power changes by 10–80 seconds—closed-loop HR control only works when the effort lasts long enough for the cardiovascular system to reach approximate steady state.

Mode | When Active | Strategy
Open Loop Warmup | First 150s + ramp + dither | Power-controlled. HR observed for estimator but not used for control.
Calibrating | Initial K estimation | Ternary dither excitation. Estimator running, feedforward not yet active.
Feedforward + Trim | Steady-state and long intervals (>3 min) | Feedforward power command + deadband trim. Primary operating mode.
Open Loop Recovery | Recovery between intervals | Fixed low power. HR drops naturally without feedback chasing.
Power Assist Sprint | Short intervals (<30s) | Power-led targets. HR used only as ceiling.

Bayesian K Estimation

A bank of parallel Kalman filters, each assuming a different cardiac time constant, runs simultaneously, and the filters are reweighted by their predictive accuracy. The posterior K estimate is a precision-weighted mixture across all models. The coefficient of variation (CV_K) determines how much the controller trusts its feedforward estimate: when uncertainty is high, the system defaults to feedback-only trim control.
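The idea of a reweighted filter bank can be illustrated with a deliberately simplified sketch. Everything specific here is an assumption: β0 is treated as known, each filter is a scalar Kalman filter on K only, the three time constants are arbitrary, the mixture is weighted by model probability rather than by the precision weighting the text describes, and the rider is noise-free synthetic data.

```python
import math


class ScalarKFilter:
    """Scalar Kalman filter on K for one assumed time constant tau."""

    def __init__(self, tau_s, k0=0.35, var0=0.04, meas_var=1.0, dt_s=1.0):
        self.alpha = dt_s / tau_s          # fraction moved toward steady state per step
        self.k, self.var, self.meas_var = k0, var0, meas_var

    def update(self, hr_prev, hr_now, power, beta0):
        """Update K from one HR step; return the Gaussian likelihood of that step."""
        h = self.alpha * power             # sensitivity of hr_now to K
        predicted = hr_prev + self.alpha * (beta0 - hr_prev) + h * self.k
        innov_var = h * h * self.var + self.meas_var
        resid = hr_now - predicted
        gain = self.var * h / innov_var
        self.k += gain * resid
        self.var *= 1.0 - gain * h
        return (math.exp(-0.5 * resid * resid / innov_var)
                / math.sqrt(2.0 * math.pi * innov_var))


def bank_step(filters, weights, hr_prev, hr_now, power, beta0):
    """Run every filter on the same sample and renormalize the model weights."""
    likelihoods = [f.update(hr_prev, hr_now, power, beta0) for f in filters]
    raw = [w * lik + 1e-300 for w, lik in zip(weights, likelihoods)]  # avoid all-zero
    total = sum(raw)
    return [r / total for r in raw]


def mixture_k(filters, weights):
    """Return (mixture mean of K, coefficient of variation CV_K)."""
    mean = sum(w * f.k for w, f in zip(weights, filters))
    var = sum(w * (f.var + (f.k - mean) ** 2) for w, f in zip(weights, filters))
    return mean, math.sqrt(var) / mean


# Synthetic rider: K = 0.30 bpm/W, tau = 40 s, alternating 100 W / 200 W blocks.
TRUE_K, TRUE_TAU, BETA0 = 0.30, 40.0, 70.0
filters = [ScalarKFilter(tau) for tau in (20.0, 40.0, 80.0)]
weights = [1.0 / 3.0] * 3
hr = BETA0
for t in range(1200):
    power = 100.0 if (t // 120) % 2 == 0 else 200.0
    hr_next = hr + (1.0 / TRUE_TAU) * (BETA0 + TRUE_K * power - hr)
    weights = bank_step(filters, weights, hr, hr_next, power, BETA0)
    hr = hr_next

k_hat, cv_k = mixture_k(filters, weights)
print(round(k_hat, 3), round(cv_k, 3), [round(w, 3) for w in weights])
```

In this toy setup the filter whose assumed time constant matches the simulated rider ends up with nearly all of the weight, because the mismatched filters cannot explain both the transitions and the plateaus with a single K.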


03 — HR Ceiling Validation

HR Ceiling Validation

The core control question: how precisely does the system respect HR ceilings? We tested this across 6,300 scenarios, including conditions far worse than any real ride would produce.

Triple-Ceiling Overshoot Limiter

Three escalating intervention layers prevent HR overshoot. The limiter uses both a model-based 45-second lookahead and a slope-based 30-second extrapolation, firing whichever triggers first.

Layer | Trigger | Action
Freeze | Predicted threat exceeds zone_max + 2 bpm | Hold current power; block increases
Emergency | Predicted threat exceeds HR_max − 5 bpm | Reduce power to 70% of current
Hard Stop | Smoothed HR exceeds HR_max | Set power to floor
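The three layers can be sketched as a single decision function. The thresholds come from the table above; the function signature, the evaluation order (most severe layer first), the floor power, and the choice to let a freeze still accept power decreases are assumptions for illustration.

```python
def limiter(hr_smoothed, hr_slope_bpm_s, model_pred_45s, zone_max, hr_max,
            power_now, power_requested, power_floor=50.0):
    """Return (layer, power) after applying the three ceiling layers."""
    slope_pred_30s = hr_smoothed + 30.0 * hr_slope_bpm_s
    # Taking the larger prediction means either predictor alone can trigger a layer.
    threat = max(model_pred_45s, slope_pred_30s)

    if hr_smoothed > hr_max:
        return "hard_stop", power_floor
    if threat > hr_max - 5.0:
        return "emergency", 0.70 * power_now
    if threat > zone_max + 2.0:
        return "freeze", min(power_requested, power_now)  # block increases only
    return "none", power_requested


# Hypothetical rider: zone ceiling 150 bpm, HR_max 185 bpm, riding at 200 W.
print(limiter(150.0, 0.0, 150.0, 150.0, 185.0, 200.0, 220.0))
print(limiter(150.0, 0.2, 151.0, 150.0, 185.0, 200.0, 220.0))
print(limiter(170.0, 0.5, 172.0, 150.0, 185.0, 200.0, 220.0))
print(limiter(186.0, 0.0, 186.0, 150.0, 185.0, 200.0, 220.0))
```

In the second call the model-based lookahead sees no threat, but the slope extrapolation (150 + 30 × 0.2 = 156 bpm) crosses zone_max + 2, so the freeze fires on the slope signal alone.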

Study RV-ZP-07: 6,300 Validation Scenarios

Deterministic simulations across 8 experiment classes: K mismatch sweeps (up to 4x error), time constant mismatch, sensor latency (up to 10 seconds BLE delay), near-maximum HR starts, prolonged freeze behavior, cardiac drift, and cross-population sweeps.

Metric | Observed | Threshold | Status
HR_max breach rate at ≤3x K mismatch | 0.000 | 0.00 | PASS
Peak overshoot at 2x K mismatch | 0.0 bpm | ≤ 5.0 bpm | PASS
Peak overshoot at 3x K mismatch | 0.0 bpm | ≤ 8.0 bpm | PASS
Time to freeze at 2x K mismatch | 0.0 s | ≤ 15.0 s | PASS
Slope trigger latency (2 bpm/s rise) | 10.0 s | ≤ 10.0 s | PASS
HR_max breach at 10s sensor delay + 2x K | 0.000 | 0.00 | PASS
Freeze recovery within 30s | 0.98 | ≥ 0.95 | PASS
All archetypes zero breaches at 2x K | 0.000 | 0.00 | PASS

8 of 8 success criteria passed.

[Chart] K Mismatch vs. Maximum Overshoot (RV-ZP-07: how wrong can the model be before ceiling enforcement degrades?)

Even at 4x K mismatch (estimated K is one-quarter of true K), worst-case overshoot is 3.08 bpm above zone maximum, with zero HR_max breaches at all mismatch levels. The 1.23 bpm overshoot at 1.0x (no mismatch) arises because, with an accurate K, the feedforward confidently commands higher power; at moderate mismatch levels, K_safe is more conservative and undershoots, producing zero overshoot.


04 — Does It Work?

Control Quality

The core product question: does the controller actually keep heart rate in the target zone? Study RV-ZP-06 tested this across 22,400 simulated rides covering 4 rider archetypes, varying K mismatch, time constant mismatch, cardiac drift, and dead time variation. Results below reflect the shipping v4 controller, verified by provenance audit (28 parameters matched across Python simulation and Swift production code).

Metric | Observed | Threshold | Status
Time-in-zone (1x K, 30 min) | 0.930 | ≥ 0.85 | PASS
Time-in-zone (2x K mismatch) | 0.881 | ≥ 0.70 | PASS
Estimator convergence to 20% of K_true | 20.1 ticks | ≤ 30 ticks | PASS
Zone 2 to Zone 3 transition time | 53.2 s | ≤ 60 s | PASS
Drift compensation at 0.15 bpm/min | 0.818 | ≥ 0.80 | PASS
w_ff at sigma_K ≤ 0.10 | 1.000 | ≥ 0.90 | PASS
w_ff at sigma_K ≥ 0.25 | 0.000 | ≤ 0.10 | PASS
Dead-time mismatch overshoot | 2.35 bpm | ≤ 8.0 bpm | PASS

8 of 8 success criteria passed. The v4 controller trades slightly lower nominal time-in-zone for dramatically improved safety: freeze recovery improved from 9% to 98%, BLE sensor delay breaches were eliminated entirely, and zone transitions are now faster (53s vs 93s previously). The engineering choice was clear: precise ceiling enforcement and robust recovery matter more than chasing the last few percentage points of zone accuracy.
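The two w_ff rows in the table above fix the endpoints of the feedforward weight: full authority at sigma_K ≤ 0.10 and none at sigma_K ≥ 0.25. A sketch of one schedule that satisfies both endpoints follows. The linear ramp between them and the blending with a trim-only command are assumptions; the shipping schedule is not described here.

```python
def feedforward_weight(sigma_k, full_at=0.10, zero_at=0.25):
    """Map K uncertainty to a feedforward weight in [0, 1] (linear ramp assumed)."""
    if sigma_k <= full_at:
        return 1.0
    if sigma_k >= zero_at:
        return 0.0
    return (zero_at - sigma_k) / (zero_at - full_at)


def blended_power(p_feedforward, p_trim_only, sigma_k):
    """Blend the feedforward command with a feedback-only command (hypothetical)."""
    w = feedforward_weight(sigma_k)
    return w * p_feedforward + (1.0 - w) * p_trim_only


for sigma in (0.05, 0.175, 0.30):
    print(sigma, round(feedforward_weight(sigma), 3))
```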

[Chart] Time-in-Zone: Robustness Across Riders and Model Error (RV-ZP-06: 4 archetypes x 5 K mismatch levels)

The v4 controller achieves 93% at nominal K and 88%+ under 2x mismatch. The tradeoff vs. the prior controller (which showed 100% everywhere) was deliberate: the v4 safety system prevents freeze lockups and BLE delay breaches, at a small cost to zone precision.

[Chart] Cardiac Drift Compensation (RV-ZP-06: time-in-zone vs. drift severity)

At physiologically typical drift rates (0.05–0.15 bpm/min), the system maintains >98% time-in-zone. At extreme rates (dehydration, heat stress), it degrades gracefully.


05 — Learning Your Heart

Cardiac Model Characterization

The system's ability to predict the right power depends on how well it knows your cardiac gain (K). We tested K estimation across 1,200 synthetic rides with known ground truth, varying the workout protocol and rider archetype.

The Protocol Matters

The single most important finding: interval workouts give approximately 10x better K estimates than Zone 2 steady-state rides. This is not a software limitation—it is a mathematical fact. Without meaningful power variation, there is near-zero information about how your heart rate responds to power changes.
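The underlying argument can be written out for the simplest case. The block below assumes steady-state samples with independent Gaussian noise and a jointly estimated baseline, which ignores the cardiac lag the real estimator has to handle; it is the standard least-squares slope variance, not the estimator used in RV-ZP-01.

```latex
% Steady-state samples: HR_i = \beta_0 + K P_i + \varepsilon_i,
% with \varepsilon_i \sim \mathcal{N}(0, \sigma^2) independent.
\[
\operatorname{Var}(\hat{K}) \;=\; \frac{\sigma^2}{\sum_{i=1}^{n} (P_i - \bar{P})^2}
\qquad\Longrightarrow\qquad
\operatorname{SE}(\hat{K}) \;=\; \frac{\sigma}{\operatorname{SD}(P)\,\sqrt{n}}
\]
```

The standard error of K is inversely proportional to the spread of power during the ride. As an illustrative example (numbers chosen for the arithmetic, not taken from the study), a steady ride whose power varies with a standard deviation of 5 W gives a standard error ten times larger than an interval ride of the same length whose power varies with a standard deviation of 50 W.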

[Chart] K Estimation Accuracy by Protocol (RV-ZP-01: mean absolute bias in bpm/W for the trained archetype)

Intervals give ~10x better K estimates than Zone 2. The structural observability limit is immediately obvious.

Cross-Workout Transfer: The Value Proposition

Does K learned during intervals improve control during Zone 2? This is the core product thesis. Study PRS-1B tested it directly: 10 riders completed 5 interval rides, then 5 Zone 2 rides. Two passes: one resetting K before each ride, one carrying the posterior forward.

[Chart] Cross-Workout Transfer: Intervals Make Zone 2 Better (PRS-1B: feedforward power error during Zone 2 rides)

Interval rides teach the system your cardiac gain. That knowledge makes Zone 2 rides dramatically better. This is the switching cost—you cannot get this from a single ride.

Calibration Protocol Design

A sweep of 43,200 simulations across 216 protocol variants identified the optimal calibration ride design. The result is practical: duration dominates. If you are going to do one thing right in a calibration ride, make it long enough.

[Chart] What Makes a Good Calibration Ride (PRS-5: variance decomposition across 216 protocol variants)

Duration is the dominant factor (53.7% of variance). A 15-minute step ramp from 40% to 85% HR_max is the recommended calibration protocol.

Real-Data Characterization

47 real cycling rides from 5 athletes in the GoldenCheetah OpenData corpus were processed. The K distribution (mean 0.323 bpm/W, range 0.200–0.641) overlaps published literature. The negative correlation between K and peak power (r = −0.72) is directionally consistent with exercise physiology: fitter riders have lower cardiac gain.

0.323 Mean K (bpm/W) from 47 real rides
r = −0.72 K vs peak power correlation

However, no ground-truth K values exist for real rides, so these results characterize the estimator's behavior on real data but cannot assess its accuracy. The 25.5% floor-clamp rate (K hitting the 0.20 lower bound) likely reflects elite riders whose true K falls below the parameterized range.


06 — Fitness Estimation

VO2max Estimation

Zone Pedal estimates VO2max using the Storer-Davis cycle ergometry formula (Storer et al. 1990) applied to a maximum power extrapolated from the Bayesian K estimate. A 95% confidence interval is derived by propagating K posterior uncertainty through the calculation.
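A sketch of how such an estimate and its interval might be computed. The regression coefficients are the ones commonly quoted from Storer et al. (1990) and should be checked against the paper before reuse. The extrapolation of maximum power from K, the Gaussian posterior, and the Monte Carlo propagation are assumptions for illustration; the text above does not specify how Zone Pedal performs either step.

```python
import random


def storer_vo2max_ml_min(w_max, mass_kg, age_yr, male=True):
    """Storer-Davis cycle ergometry regression, absolute VO2max in mL/min."""
    if male:
        return 10.51 * w_max + 6.35 * mass_kg - 10.49 * age_yr + 519.3
    return 9.39 * w_max + 7.70 * mass_kg - 5.88 * age_yr + 136.7


def vo2max_with_ci(k_mean, k_std, beta0, hr_max, mass_kg, age_yr,
                   male=True, n_samples=5000, seed=0):
    """Return (point estimate, lower, upper) in mL/kg/min with a 95% interval."""

    def relative_vo2(k):
        w_max = (hr_max - beta0) / k   # assumed extrapolation: power at HR_max
        return storer_vo2max_ml_min(w_max, mass_kg, age_yr, male) / mass_kg

    rng = random.Random(seed)
    draws = [rng.gauss(k_mean, k_std) for _ in range(n_samples)]
    samples = sorted(relative_vo2(k) for k in draws if k > 0.0)
    lo = samples[int(0.025 * len(samples))]
    hi = samples[int(0.975 * len(samples)) - 1]
    return relative_vo2(k_mean), lo, hi


# Hypothetical 40-year-old, 75 kg male rider with K = 0.32 +/- 0.02 bpm/W.
point, lo, hi = vo2max_with_ci(0.32, 0.02, beta0=70.0, hr_max=185.0,
                               mass_kg=75.0, age_yr=40)
print(f"{point:.1f} mL/kg/min (95% interval {lo:.1f} to {hi:.1f})")
```

Because maximum power scales with 1/K, the interval is asymmetric: an underestimate of K inflates the VO2max estimate more than an equal overestimate deflates it.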

Quality Gating

Estimates are only produced when the K posterior is sufficiently tight. This prevents the system from producing VO2max estimates from uncertain K values.

CV_K Range | Quality | Estimate Produced?
< 0.10 | High | Yes
0.10 – 0.20 | Moderate | Yes
0.20 – 0.25 | Low | Yes
≥ 0.25 | Insufficient | No
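The gate in the table above amounts to a small lookup. In this sketch the shared boundaries (0.10 and 0.20) are assigned to the lower-quality tier, which is an assumption; the table does not say which side they fall on.

```python
def vo2max_quality(cv_k):
    """Return (quality label, whether an estimate is produced) for a given CV_K."""
    if cv_k < 0.10:
        return "High", True
    if cv_k < 0.20:
        return "Moderate", True
    if cv_k < 0.25:
        return "Low", True
    return "Insufficient", False


for cv in (0.05, 0.15, 0.22, 0.30):
    print(cv, vo2max_quality(cv))
```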

VO2max estimation was validated against synthetic ground truth, not against laboratory measurements. No head-to-head comparison with laboratory VO2max testing or other consumer device estimates has been performed.


07 — Beyond Control

Physiological Feature Validation

HRV Readiness

The rMSSD computation matches ground truth exactly (r = 1.000). Readiness classification exits learning mode after exactly 5 reliable rides. Normal-day readiness detection is strong (99.7% sensitivity, at 76.2% specificity), while detection of moderate suppression (53.6% sensitivity) and recovery days (61.7% sensitivity) is weaker, which reflects the difficulty of distinguishing moderate physiological states from noise.

Ground Truth State | Sensitivity | Specificity
Ready (normal day) | 99.7% | 76.2%
Moderate suppression | 53.6% | 90.3%
Recovery day | 61.7% | 100.0%
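For reference, rMSSD is the root mean square of successive differences between RR intervals. A minimal version follows; it applies the standard definition and omits the artifact filtering a production pipeline would need.

```python
import math


def rmssd(rr_ms):
    """Root mean square of successive RR-interval differences, in milliseconds."""
    if len(rr_ms) < 2:
        raise ValueError("need at least two RR intervals")
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Successive differences are 10, -20, 15 ms, so rMSSD = sqrt(725 / 3).
print(round(rmssd([800.0, 810.0, 790.0, 805.0]), 2))
```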

Fatigue Detection: Built, Tested, Removed

We built a fatigue detector that tracked K trajectory changes during a ride. The chart below shows its pre-mitigation performance: 92–95% true positive rate, but 68% false positive rate at typical chest strap noise levels (2 bpm). After mitigation (raised thresholds + confidence gate), both TPR and FPR dropped to ~3%. The single-signal approach cannot separate fatigue from sensor noise at consumer hardware accuracy.

We removed the feature entirely from the shipping app. A detector that fires on two-thirds of non-fatigued rides, or one that catches 3% of actual fatigue, has negative value—it trains users to ignore the system. The chart remains here as documentation of why.

[Chart] Fatigue Detection: Why We Removed It (PRS-6 dry run: pre-mitigation baseline showing the fundamental signal-to-noise problem)

At typical chest strap noise (≥2 bpm), the false positive rate exceeds any useful threshold. After mitigation, both TPR and FPR collapsed to ~3%. The feature was removed rather than shipped in a state that would erode trust in the rest of the system.

Bad Legs Day Detection

When your cardiac gain deviates from its learned baseline during a ride, the system detects and classifies the severity. Study RV-ZP-04 validated the classification algorithm across 1,600+ scenarios.

Metric | Observed | Threshold | Status
Classification accuracy at exact levels | 1.0 | ≥ 0.95 | PASS
Progressive timing maximum error | 0.04 min | ≤ 3 min | PASS
Drift-adjusted false positive rate | 0.0% | ≤ 5% | PASS
CV gating at CV > 0.20 | 100% suppression | 100% | PASS
Cross-archetype accuracy variance | 0.0 pct-pts | ≤ 5.0 pct-pts | PASS

8 of 8 success criteria passed (five are listed above). The 15/25/40% deviation thresholds are engineering choices, not physiologically derived. K estimation noise (upstream input quality) is validated separately in RV-ZP-01.
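The thresholding logic can be sketched as follows. The 15/25/40% thresholds and the CV > 0.20 gate come from the text above; the severity labels, the use of absolute relative deviation, and the inclusive comparisons are hypothetical choices made for this example.

```python
def classify_bad_legs(k_now, k_baseline, cv_k):
    """Classify the in-ride deviation of K from its learned baseline."""
    if cv_k > 0.20:
        return "suppressed"            # estimate too uncertain to classify
    deviation = abs(k_now - k_baseline) / k_baseline
    if deviation >= 0.40:
        return "severe"
    if deviation >= 0.25:
        return "moderate"
    if deviation >= 0.15:
        return "mild"
    return "normal"


print(classify_bad_legs(0.31, 0.30, 0.10))
print(classify_bad_legs(0.36, 0.30, 0.10))
print(classify_bad_legs(0.45, 0.30, 0.10))
print(classify_bad_legs(0.45, 0.30, 0.30))
```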

Tau Estimation (Recovery Kinetics)

The system estimates your cardiac time constants—how quickly your heart rate rises to a new steady state (tau_up) and how quickly it recovers (tau_down). Study RV-ZP-02 validated this across 2,880 scenarios.

Metric | Observed | Threshold | Status
tau_down mean absolute error | 5.6% | ≤ 15% | PASS
tau_up mean absolute error | 1.1% | ≤ 20% | PASS
Noise robustness at 5 bpm noise | 1.16x degradation | ≤ 2.0x | PASS
Change detection correlation | r = 0.98 | ≥ 0.70 | PASS

6 of 6 success criteria passed (four are listed above). Validated against synthetic cardiac dynamics with known ground truth.

Drift Compensation

Cardiac drift—the progressive rise in heart rate at constant power during prolonged exercise—is the core problem Zone Pedal solves. Study RV-ZP-03 validated the drift estimator directly.

Metric | Observed | Threshold | Status
Drift rate MAE at σ=1 | 0.071 bpm/min | ≤ 0.10 | PASS
Compensated HR error | 0.65 bpm | ≤ 3 | PASS
Recovery reset error | 0.0 | ≤ 0.10 | PASS
60-min convergence MAE | 0.040 | ≤ 0.10 | PASS
Cross-archetype ratio | 1.0x | ≤ 2.0x | PASS

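8 of 8 success criteria passed (five are listed above). At physiologically typical drift rates (0.05–0.15 bpm/min), the estimator recovers the drift rate with a mean absolute error of 0.071 bpm/min, and the drift-compensated HR error is 0.65 bpm.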


08 — Stress-Testing Assumptions

Beyond Our Own Model

Most validation in fitness apps tests the system against its own assumptions. If the estimator assumes a first-order cardiac model, and the simulator uses the same first-order model, then perfect results prove internal consistency—not real-world robustness. We built a second simulator that deliberately violates our model's assumptions, and replayed 52 real cycling rides through the system.

Three Evidence Tiers

Every study in this document falls into one of three tiers:

Tier | What It Proves | Example
Exact-match simulation | Internal consistency: the system works when reality matches its assumptions | RV-ZP-01, RV-ZP-06
Robustness simulation | Resilience: the system degrades gracefully when assumptions are violated | RV-EXP-4/5 (out-of-family model)
Real-data replay | Ecological validity: the system produces plausible results on actual rides | RV-EXP-6, CV-ZP-1/2

Out-of-Family Model Testing (RV-EXP-4/5)

We built a second cardiac simulator with six physiological effects our production model cannot capture: logistic HR saturation near max, time-varying K and tau (warmup acceleration + fatigue decay), BLE latency jitter and dropout, Student-t noise with ectopic spikes, heat/hydration drift, and a soft HR ceiling. Then we ran the full validation suite against it.

Behavior | Count | Meaning
MAINTAINS | 12 | Performance equivalent to ideal-model testing
DEGRADES | 5 | Measurable loss but still functional
BREAKS | 3 | Falls below acceptable threshold

Safety held everywhere: zero new ceiling activations and zero HR_max breaches. The primary boundary is time-varying K/tau (warmup + fatigue dynamics). Trained riders are the most robust (100% zone accuracy under out-of-family conditions). BLE latency and dropout were rated MAINTAINS across all archetypes.

Real-Ride Replay (RV-EXP-6)

52 real cycling rides (47 from the GoldenCheetah OpenData corpus + 5 from the developer's Tacx Neo 2T) were replayed through the system. Total: 77 ride-hours.

Metric | Value
Rides replayed | 52
Total ride-hours | 77
K estimate median | 0.31 bpm/W (matches synthetic distribution)
Feedforward RMSE median | 26.3 bpm
BLE delay sensitivity (0→10s) | +5% RMSE (negligible)

Real rides introduce dynamics the production model does not capture: nonlinear cardiac responses, autonomic nervous system effects, environmental conditions, and sensor artifacts. The system produces physiologically plausible K estimates and degrades gracefully rather than failing catastrophically.

DFA Alpha1 Threshold Detection (RV-ZP-11)

DFA alpha1, the short-term scaling exponent from detrended fluctuation analysis of heart rate variability, is used to estimate the aerobic (VT1) and anaerobic (VT2) thresholds during a stepped power ramp. The Threshold Finder workout uses a 35-minute protocol with 6 power steps.

RV-ZP-11 tested four dimensions across 16 criteria:

Experiment | What It Tests | Result
Alpha1 accuracy | Computation against known signals (white noise, 1/f, physiological) | 4/4 PASS
Threshold detection | VT1/VT2 crossing on synthetic ramps (gradual, steep, varying gaps) | 4/4 PASS
Noise robustness | Detection under typical chest strap noise (σ=10ms, 2% ectopic, 1% dropout) | 4/4 PASS
Protocol end-to-end | Full Threshold Finder workout simulation with 5 rider archetypes | 4/4 PASS

Noise shifts alpha1 values upward but preserves the descent shape. A piecewise linear breakpoint detector finds the inflection point where cardiac control transitions, making threshold detection robust to the artifact levels typical of consumer chest strap monitors.


09 — What We Do Not Know

Known Limitations and Honest Gaps

All Validation Is Synthetic or Computational

Most results in this document come from deterministic synthetic simulations or Monte Carlo parameter recovery on simulated riders. Section 08 describes out-of-family model testing and real-ride replay (52 rides, 77 hours) which extend beyond internal-consistency validation. No clinical trial has been conducted. No real riders have been studied under controlled conditions with the Zone Pedal controller active. Real-world performance will be affected by nonlinear cardiac dynamics, autonomic nervous system effects, temperature, hydration, medication, and sensor artifacts.

K Estimation Requires Power Variation

The estimator learns your cardiac gain (K) from power variation during a ride. A power-variation gate prevents low-information rides (like Zone 2 steady state) from corrupting the K estimate. In practice, this means interval workouts and calibration rides teach the system; Zone 2 rides use what was already learned. First-ride calibration quality has been validated (RV-ZP-05, 8/8 pass), and the Discover Your Cardiac Response ride achieves K error of 0.0003 in simulation.

Zone 2 Is Structurally Unobservable

Zone 2 rides provide near-zero Fisher information about K because power variation is minimal. The estimator cannot learn K from Zone 2 rides alone. This is not a defect—it is a mathematical fact: without power variation, the HR-power relationship is not identifiable. The system handles this by retaining K from prior informative rides and gating K updates when power variance is below 1.0 W².
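The gate itself is a one-line check. The 1.0 W² threshold is from the text above; the window contents and the use of population variance over a recent power window are assumptions for illustration.

```python
from statistics import pvariance


def k_update_allowed(power_window_w, min_var_w2=1.0):
    """Allow K updates only when recent power varies enough to be informative."""
    return len(power_window_w) >= 2 and pvariance(power_window_w) >= min_var_w2


steady = [150.0, 150.5, 149.5, 150.0]    # variance 0.125 W^2: gated
intervals = [100.0, 200.0, 100.0, 200.0]  # variance 2500 W^2: allowed
print(k_update_allowed(steady), k_update_allowed(intervals))
```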

Fatigue Detection Was Removed

We built a fatigue detector, validated it (PRS-6: 68% false positive rate at consumer chest strap noise), tried to fix it (thresholds + confidence gate reduced FPR to 3% but also reduced TPR to 3%), and removed it from the shipping app. The single-signal approach—thresholding relative K change—cannot separate fatigue from sensor noise. We chose to remove the feature rather than ship one that erodes trust in the rest of the system. See Section 07 for the full analysis.

Limited Real-World Data

52 rides from 6 athletes (47 GoldenCheetah + 5 developer rides). Sufficient to characterize K distribution and validate replay behavior, but insufficient for population-level inference. Elite, elderly, cardiac-compromised, and pediatric populations are not represented.

No Head-to-Head Outcome Comparison

No study has compared training outcomes between Zone Pedal's HR-adaptive control and conventional power-based ERG training. The evidence for HR-based training equivalence comes from Akubat et al. (2013), not from Zone Pedal's specific implementation.

No Medical Device Claim

Zone Pedal is a fitness application. It has not been evaluated by the FDA or any regulatory body. It does not diagnose, treat, or prevent any disease. Users with cardiovascular conditions should consult their physician before using any exercise equipment.

Unvalidated Capabilities

Capability | Status
VO2max estimation against laboratory reference | UNVALIDATED
Long-term fitness trend detection from K trajectory | UNVALIDATED

10 — Sources

References

Published Literature

Akubat I, Patel E, Barrett S, Sherwin Z. Methods of monitoring training load and their relationships to changes in fitness and performance in competitive road cyclists. J Sports Med Phys Fitness. 2013. PMC3737823.

Argha A, Su SW, Celler BG. Automated PID control of heart rate during treadmill exercise. J Biomech Eng. 2016.

Argha A, Su SW, Celler BG. Heart rate regulation during cycle-ergometer exercise via bio-feedback. Conf Proc IEEE Eng Med Biol Soc. 2017.

Hunt KJ, Fankhauser SE. Heart rate control during treadmill exercise using input-sensitivity shaping. J Sports Sci Med. 2019;18(1):47-55. PMC6370964.

Hunt KJ, Hurlimann N, Fankhauser SE. Physiological systems modelling for heart rate control during cycle-ergometer exercise. Proc Inst Mech Eng H. 2024.

Coyle EF, Gonzalez-Alonso J. Cardiovascular drift during prolonged exercise. Sports Med. 2001.

Achten J, Jeukendrup AE. Heart rate monitoring: applications and limitations. Sports Med. 2003.

Storer TW, Davis JA, Caiozzo VJ. Accurate prediction of VO2max in cycle ergometry. Med Sci Sports Exerc. 1990;22(5):704-712.

Zone Pedal Validation Studies

Study | Type | Scenarios | Key Finding
RV-ZP-10 | Formal RV (FTP/NP/IF/TSS) | 100+ | NP error 0.0W, TSS monotonic; 9/9 PASS
RV-ZP-09 | Formal RV (hysteresis loop) | 800+ | Loop area error 0.27%, fatigue expansion 100%; 8/8 PASS
RV-ZP-08 | Formal RV (interval recovery tau) | 1,200+ | NLLS fit MAE 2.4% (was 35.2% with linearized OLS)
RV-ZP-07 | Formal RV (ceiling) | 6,300 | 0 HR_max breaches; 8/8 PASS
RV-ZP-06 | Formal RV (control law) | 22,400 | 93% time-in-zone; 8/8 PASS (v4 controller)
RV-ZP-05 | Formal RV (first-ride calibration) | 1,000+ | K error 0.0003, zone accuracy, noise robustness; 8/8 PASS
RV-ZP-04 | Formal RV (bad legs day) | 1,600+ | Exact accuracy 1.0, drift FPR 0.0%; 8/8 PASS
RV-ZP-03 | Formal RV (drift compensation) | 800+ | Drift MAE 0.071, HR error 0.65 bpm; 8/8 PASS
RV-ZP-02 | Formal RV (tau estimation) | 2,880 | tau_down MAE 5.6%, tau_up MAE 1.1%; 6/6 PASS
RV-ZP-01 | Formal RV (K estimation) | 21,400 | Bias 0.049, RMSE 0.062; 3/5 PASS
RV-EXP-4/5 | Out-of-family robustness | 20 archetypes | 12 MAINTAINS, 5 DEGRADES, 3 BREAKS; safety 100%
RV-EXP-6 | Real-ride replay | 52 rides (77 hrs) | K median 0.31, feedforward RMSE 26.3 bpm
PRS-1 | Monte Carlo | 3,200 | K recovery r=0.570 overall
PRS-1B | Monte Carlo (transfer) | 300 | 58.5% Zone 2 feedforward improvement
PRS-3 | Real-data benchmark | 2 rides | Analyzer characterization
PRS-4 | Drift investigation | 248 rides | Cardiac drift 0.24 bpm/min
PRS-5 | Protocol optimization | 43,200 | 15-min step ramp optimal (r=0.94)
PRS-6 | HRV/fatigue validation | 400+ | HRV r=1.0; fatigue FPR=68% (feature removed)
CV-ZP-1+2 | Computational (real data) | 47 rides | K distribution matches literature