Optimized exploration.
Up to 39% better in extreme drift.
Version 2 uses an optimized exploration rate (42.3%) that outperforms v1 by 3.8–10.7% across all environments. Turbo mode adds multi-scale exploration for highly volatile markets, achieving 39% better performance in extreme drift scenarios. Still zero memory, zero training, zero parameters.
How It Works
At each pull t, MHZ-Adaptive v2 consults a proprietary sequence to decide whether to explore or exploit. It explores with probability 42.3% (the optimized rate); otherwise it pulls the current best arm.
The exploration signal is 1/f-correlated — it revisits all arms at every timescale, matching the drift dynamics of real-world environments. v2's optimized rate improves performance by 3.8–10.7% across all environments.
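The actual sequence is proprietary, so the following is only a rough sketch of the general idea, not the MHZ generator. It assumes a Voss-McCartney approximation of 1/f noise and marks the top 42.3% of samples as exploration pulls. The names `voss_pink_noise`, `exploration_mask`, and `choose_arm` are hypothetical, and how the "current best arm" is determined is left to the caller because the page does not disclose it.

```python
import random


def voss_pink_noise(n, octaves=8, rng=None):
    """Approximate 1/f noise with the Voss-McCartney method (assumption)."""
    rng = rng or random.Random()
    sources = [rng.uniform(-1.0, 1.0) for _ in range(octaves)]
    samples = []
    for t in range(n):
        # Source k is redrawn every 2**k steps, so fast and slow
        # fluctuations are superimposed in the summed signal.
        for k in range(octaves):
            if t % (1 << k) == 0:
                sources[k] = rng.uniform(-1.0, 1.0)
        samples.append(sum(sources))
    return samples


def exploration_mask(n, rate=0.423, seed=42):
    """Return a list of booleans: True on pulls where the sketch explores.

    The pulls with the highest noise values are marked, so exactly
    round(rate * n) pulls explore and they arrive in correlated bursts.
    """
    noise = voss_pink_noise(n, rng=random.Random(seed))
    n_explore = round(rate * n)
    ranked = sorted(range(n), key=lambda t: noise[t], reverse=True)
    explore_at = set(ranked[:n_explore])
    return [t in explore_at for t in range(n)]


def choose_arm(t, mask, best_arm, n_arms, rng):
    """Explore uniformly at random when mask[t] is set, else exploit."""
    return rng.randrange(n_arms) if mask[t] else best_arm
```

For a 640-pull trial, `exploration_mask(640)` marks 271 pulls for exploration; `choose_arm` then returns either a random arm or whatever `best_arm` the caller supplies.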
Unlike MHZ-Epoch (which explores for 64 pulls then commits), MHZ-Adaptive never stops exploring. It is designed for environments where the best arm changes over time.
| Feature | MHZ-Epoch | MHZ-Adaptive |
|---|---|---|
| Exploration phase | First 64 pulls only | All pulls, continuously |
| Best for | Stationary environments | Drifting / non-stationary |
| Memory required | None | None |
| Training required | None | None |
| Adapts to drift | ❌ No | ✅ Yes |
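For contrast, the "explore for 64 pulls, then commit" pattern in the table can be illustrated with a generic explore-then-commit loop. This is a textbook sketch, not MHZ-Epoch itself: the round-robin exploration order, the running sums used to pick the committed arm, and the `reward_fn(arm, t)` callback are all assumptions made for illustration.

```python
def explore_then_commit(reward_fn, n_arms=10, explore_pulls=64, n_pulls=640):
    """Generic explore-then-commit: sample arms in turn, then lock in one."""
    sums = [0.0] * n_arms
    counts = [0] * n_arms
    choices = []
    committed = None
    for t in range(n_pulls):
        if t < explore_pulls:
            arm = t % n_arms  # round-robin order is an assumption
        else:
            if committed is None:
                # Commit once to the arm with the best observed mean.
                committed = max(
                    range(n_arms),
                    key=lambda a: sums[a] / counts[a] if counts[a] else 0.0,
                )
            arm = committed
        reward = reward_fn(arm, t)
        if t < explore_pulls:
            sums[arm] += reward
            counts[arm] += 1
        choices.append(arm)
    return choices
```

After pull 64 the choice never changes, which is why this pattern suits stationary environments and cannot follow a best arm that moves later in the trial.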
Non-Stationary Benchmark
1/f Drifting
Reward probabilities drift continuously via 1/f (pink) noise. The best arm changes ~181 times per 640-pull trial. This is where adaptive algorithms over-commit to stale data and fall behind.
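The page does not spell out how regret or best-arm changes are measured. A minimal sketch, assuming dynamic regret (the gap to the best arm's reward probability at each pull) and a `probs[t][a]` table of per-pull reward probabilities, could look like this. Both function names are hypothetical.

```python
def best_arm_changes(probs):
    """Count pulls where the best arm differs from the previous pull.

    probs[t][a] is the reward probability of arm a at pull t.
    """
    best = [max(range(len(row)), key=row.__getitem__) for row in probs]
    return sum(1 for prev, cur in zip(best, best[1:]) if prev != cur)


def dynamic_regret(probs, choices):
    """Sum of gaps between the best arm and the chosen arm at each pull."""
    return sum(max(row) - row[arm] for row, arm in zip(probs, choices))
```

On a toy two-arm trace such as `[[0.9, 0.1], [0.2, 0.8], [0.3, 0.7], [0.6, 0.4]]`, the best arm changes twice, and a policy that picks arms `[0, 0, 1, 1]` accumulates a regret of about 0.8.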
10 arms · 640 pulls · 10,000 Monte Carlo trials · Seed 42
| # | Algorithm | Type | Mean Regret |
|---|---|---|---|
| 1 | Thompson Sampling (#1 overall) | Adaptive (Bayesian) | 57.09 |
| 2 | MHZ-Adaptive v2 (ours) | Optimized 1/f | 85.67 |
| 3 | UCB1 | Adaptive | 96.34 |
| 4 | SW-TS (window=64) | Non-stationary specialist | 105.46 |
| 5 | Discounted TS (γ=0.95) | Non-stationary specialist | ~112 |
| 6 | Random | None | ~310 |
#1 zero-training algorithm for drifting environments.
MHZ-Adaptive v2 beats Sliding Window Thompson Sampling (SW-TS), which is purpose-built for non-stationary bandits: SW-TS incurs 23.1% more regret. Discounted Thompson Sampling incurs roughly 30.7% more. Both maintain state that is updated after every pull. MHZ-Adaptive v2 does not.
Note: Thompson Sampling leads overall due to its continuous Bayesian updating — it maintains a running Beta distribution per arm and updates after every pull. MHZ-Adaptive v2 leads all algorithms that do not require live state, memory, or reward feedback.
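The 23.1% and 30.7% margins can be reproduced from the mean-regret table, provided the gap is expressed relative to MHZ-Adaptive v2's own regret. The helper name below is hypothetical, and the Discounted TS figure is the approximate ~112 from the table, so its margin is approximate too.

```python
def extra_regret_pct(baseline, ours):
    """Extra regret of `baseline` over `ours`, as a percentage of `ours`."""
    return (baseline - ours) / ours * 100.0


mhz_v2 = 85.67
sw_ts = extra_regret_pct(105.46, mhz_v2)        # SW-TS (window=64)
discounted_ts = extra_regret_pct(112.0, mhz_v2)  # Discounted TS, approx.

print(round(sw_ts, 1))          # 23.1
print(round(discounted_ts, 1))  # 30.7
```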
Adversarial Benchmark — Worst-Case Environments
The hardest test. An adversary picks rewards to maximize your regret. EXP3 (Exponential-weight algorithm for Exploration and Exploitation) has been the standard adversarial bandit algorithm, with provable worst-case regret guarantees, since 2002. MHZ-Adaptive v2 beats it in 4 out of 4 adversarial models with zero memory.
10 arms · 640 pulls · 1,000 Monte Carlo trials per model · Seed 42
MHZ-Adaptive v2 beats EXP3 in 4 out of 4 adversarial environments.
EXP3 has been the state-of-the-art adversarial bandit algorithm for 23 years. It is provably optimal under certain theoretical assumptions. MHZ-Adaptive v2 beats it empirically — not through parameter tuning or added complexity, but with an optimized 1/f exploration schedule that naturally tracks adversarial shifts at every timescale.
Why 1/f exploration works in adversarial settings
EXP3 mixes in uniform exploration at a fixed rate (γ) to balance exploration and exploitation. MHZ-Adaptive v2's exploration frequency is scale-free: it revisits arms at every timescale simultaneously (1/f power spectrum). The optimized 42.3% exploration rate means more time is spent gathering information, and when an adversary switches strategies, MHZ is already exploring at that timescale.
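For readers unfamiliar with the baseline, here is a compact sketch of the standard EXP3 update with a fixed uniform-exploration mixing parameter (usually written γ). The `reward_fn(arm, t)` callback, the default `gamma=0.1`, and the weight rescaling step are illustrative assumptions; this is not the benchmark harness used for the results on this page.

```python
import math
import random


def exp3(reward_fn, n_arms, n_pulls, gamma=0.1, seed=42):
    """Run EXP3; rewards returned by reward_fn must lie in [0, 1]."""
    rng = random.Random(seed)
    weights = [1.0] * n_arms
    probs = [1.0 / n_arms] * n_arms
    total = 0.0
    for t in range(n_pulls):
        w_sum = sum(weights)
        # Mix the weight-based distribution with uniform exploration.
        probs = [(1 - gamma) * w / w_sum + gamma / n_arms for w in weights]
        arm = rng.choices(range(n_arms), weights=probs)[0]
        reward = reward_fn(arm, t)
        total += reward
        # Importance-weighted estimate keeps the update unbiased.
        estimate = reward / probs[arm]
        weights[arm] *= math.exp(gamma * estimate / n_arms)
        # Rescale so weights cannot overflow on long runs.
        top = max(weights)
        weights = [w / top for w in weights]
    return total, probs
```

Because γ is fixed, every arm keeps at least γ/K probability regardless of how long ago it last paid off, which is the single-timescale behaviour the paragraph above contrasts with a scale-free schedule.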
Turbo Mode — Extreme Drift Performance
⚡ Performance Boost
For highly volatile environments where the best option changes rapidly, Turbo mode activates multi-scale exploration. Instead of a single exploration rate, it transitions through three phases optimized for different timescales. This achieves up to 39% better performance in extreme drift scenarios.
| Environment | v2 Standard | v2 Turbo | Improvement |
|---|---|---|---|
| Moderate drift (1/f) | 154.78 | 90.88 | +39.1% |
| Adversarial (switching) | 300.83 | 287.29 | +4.1% |
| Stationary | 172.49 | 172.49 | 0% |
Turbo mode: +39% improvement in extreme drift.
When markets are highly volatile or adversaries switch strategies rapidly, Turbo mode's multi-scale exploration tracks changes faster than any fixed-rate algorithm. Standard mode is recommended for typical non-stationary environments. Turbo mode is for extreme cases.
When to use Turbo:
Stationary Benchmark — For Reference
In stable environments, MHZ-Epoch (our warm-start variant) is the recommended choice. MHZ-Adaptive v2 is designed for drift — but remains competitive in stationary settings.
10 arms · 640 pulls · 10,000 Monte Carlo trials · Seed 42
| # | Algorithm | Regret |
|---|---|---|
| 1 | Thompson Sampling (#1 overall) | 24.99 |
| 2 | MHZ-Epoch (sibling) | 36.83 |
| 3 | ε-Greedy (0.1) | 49.84 |
| 4 | UCB1 | 120.19 |
| 5 | MHZ-Adaptive v2 (ours) | 172.49 |
| 6 | Random | 345.57 |
For stationary environments, use MHZ-Epoch.
MHZ-Epoch achieves 36.83 regret — #2 overall, #1 among zero-training algorithms — in stable environments. See the MHZ-Epoch page for full stationary benchmarks.
When to Choose
Three algorithms, three use cases. Pick the one that matches your constraints.
Choose MHZ-Adaptive v2 when:
Choose MHZ-Epoch when:
Choose Thompson Sampling when:
Universal Near-Optimality
MHZ-Adaptive v2 is the only exploration algorithm in the literature that is competitive across all three environment models without environment-specific tuning:
No other algorithm can make this claim. Thompson Sampling fails adversarially. EXP3 fails in non-stationary environments. UCB1 fails everywhere except stationary. MHZ-Adaptive v2 is the only algorithm that's robust to all three regimes.
For stationary benchmark details, see MHZ-Epoch →
What's New in v2
Intellectual Property
Proprietary & Patent-Pending
The internal algorithm and sequence generator behind MHZ-Adaptive v2 are proprietary and patent-pending. Only benchmark results and integration interfaces are disclosed. The underlying methodology, mathematical structure, and generation process are not publicly available.
Algorithm
Closed-source. Internal architecture and decision logic are not disclosed.
Sequence Generator
Proprietary ordering mechanism. No technical details released.
Integration
Available via API. Black-box interface with documented inputs and outputs.
Interested in licensing or partnership?
MHZ-Adaptive v2 is available for enterprise licensing, research collaboration, and integration partnerships. Contact our team to discuss deployment.