MHZ-Epoch — Fixed Exploration Schedule

MHZ-Epoch
Strongest Fixed Exploration Schedule

69% lower cumulative regret than UCB1 over a 640-pull run, using only 64 structured exploration pulls. Second only to adaptive Thompson Sampling on the stationary benchmark. Designed for real-world drifting conditions.

36.83

Cumulative Regret • 10,000 Trials • Stationary Bandit

Patent Pending
#2 Overall Rank · Stationary Bandit
#1 Fixed Schedule · Non-Adaptive
-69% Regret Reduction vs UCB1
10K Trials · Benchmark Length

Stationary Results

Regret comparison against standard baselines on the stationary stochastic bandit benchmark (10 arms, 640 pulls, 10,000 trials).

+47.4% vs Thompson Sampling (24.99 → 36.83)
-26.1% vs ε-Greedy (49.84 → 36.83)
-69.4% vs UCB1 (120.19 → 36.83)
-89.3% vs Random (345.57 → 36.83)
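The percentages above follow directly from the leaderboard values. A short check, assuming each figure is the relative change in cumulative regret from the baseline to MHZ-Epoch (the function name is illustrative):

```python
def relative_change(baseline: float, ours: float) -> float:
    """Percent change in cumulative regret going from `baseline` to `ours`."""
    return (ours - baseline) / baseline * 100.0


MHZ_EPOCH = 36.83  # cumulative regret reported for MHZ-Epoch
BASELINES = {
    "Thompson Sampling": 24.99,
    "ε-Greedy": 49.84,
    "UCB1": 120.19,
    "Random": 345.57,
}

changes = {
    name: round(relative_change(regret, MHZ_EPOCH), 1)
    for name, regret in BASELINES.items()
}

for name, pct in changes.items():
    print(f"{pct:+.1f}% vs {name}")
```

A positive value means MHZ-Epoch accumulated more regret than that baseline; a negative value means less.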

Strongest fixed exploration schedule tested

MHZ-Epoch achieves 36.83 cumulative regret using only 64 structured exploration pulls out of 640, then switches to greedy exploitation. It beats every other baseline tested by a wide margin, including the adaptive ε-Greedy (49.84) and UCB1 (120.19). Thompson Sampling (24.99), which adapts on every pull, remains the overall leader on stationary benchmarks.
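The general shape described above, a fixed exploration prefix followed by greedy exploitation, can be sketched as follows. This is not the MHZ-Epoch algorithm: the proprietary sequence is not disclosed, so a plain round-robin schedule stands in for it. The Bernoulli reward means and the choice to keep updating estimates during the greedy phase are also assumptions made for illustration.

```python
import random


def fixed_schedule_then_greedy(means, schedule, horizon, rng):
    """Play `schedule` verbatim, then pull the empirically best arm each step.

    Returns (list of pulled arms, cumulative pseudo-regret).
    """
    n_arms = len(means)
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    best_mean = max(means)
    pulls, regret = [], 0.0

    def empirical_mean(arm):
        # Arms never pulled rank last during the greedy phase.
        return sums[arm] / counts[arm] if counts[arm] else float("-inf")

    for t in range(horizon):
        if t < len(schedule):
            arm = schedule[t]  # fixed, pre-computed exploration
        else:
            arm = max(range(n_arms), key=empirical_mean)  # greedy
        reward = 1.0 if rng.random() < means[arm] else 0.0  # Bernoulli reward
        counts[arm] += 1
        sums[arm] += reward
        pulls.append(arm)
        regret += best_mean - means[arm]
    return pulls, regret


# Assumed reward means; the benchmark's actual distributions are not published.
ASSUMED_MEANS = [0.05 + 0.1 * i for i in range(10)]
# Stand-in schedule: round-robin over 10 arms, truncated to 64 pulls.
# NOT the proprietary MHZ-Epoch sequence.
STAND_IN_SCHEDULE = [t % 10 for t in range(64)]

pulls, regret = fixed_schedule_then_greedy(
    ASSUMED_MEANS, STAND_IN_SCHEDULE, 640, random.Random(0)
)
print(f"cumulative pseudo-regret: {regret:.2f}")
```

Because the schedule is an input, any pre-computed 64-pull ordering can be swapped in without touching the loop.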

Benchmark Leaderboard

Stationary multi-armed bandit · 10 arms · 640 pulls · 10,000 trials · cumulative regret (lower is better).

#   Algorithm                        Regret
1   Thompson Sampling (#1 overall)   24.99
2   MHZ-Epoch (ours)                 36.83
3   ε-Greedy (0.1)                   49.84
4   UCB1                             120.19
5   Random                           345.57
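For readers who want to run the public baselines themselves, here is a minimal sketch of the four textbook policies on a 10-arm, 640-pull Bernoulli bandit. The reward means, seed, and trial count are assumptions, and the benchmark's actual reward distributions are not published, so this harness will not reproduce the leaderboard numbers exactly.

```python
import math
import random


def run_policy(select, means, horizon, rng):
    """Run one trial and return its cumulative pseudo-regret."""
    n = len(means)
    counts, sums = [0] * n, [0.0] * n
    best = max(means)
    regret = 0.0
    for t in range(horizon):
        arm = select(t, counts, sums, rng)
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        regret += best - means[arm]
    return regret


def random_policy(t, counts, sums, rng):
    return rng.randrange(len(counts))


def epsilon_greedy(epsilon=0.1):
    def select(t, counts, sums, rng):
        if rng.random() < epsilon:
            return rng.randrange(len(counts))
        # Unpulled arms are tried first (treated as optimistic).
        return max(
            range(len(counts)),
            key=lambda a: sums[a] / counts[a] if counts[a] else float("inf"),
        )
    return select


def ucb1(t, counts, sums, rng):
    for arm, c in enumerate(counts):
        if c == 0:
            return arm  # pull every arm once before using the index
    total = sum(counts)
    return max(
        range(len(counts)),
        key=lambda a: sums[a] / counts[a] + math.sqrt(2 * math.log(total) / counts[a]),
    )


def thompson(t, counts, sums, rng):
    # Beta(1 + successes, 1 + failures) posterior for Bernoulli rewards.
    samples = [
        rng.betavariate(1 + sums[a], 1 + counts[a] - sums[a])
        for a in range(len(counts))
    ]
    return max(range(len(counts)), key=samples.__getitem__)


def mean_regret(select, means, horizon=640, trials=200, seed=0):
    """Average cumulative pseudo-regret over independent trials."""
    rng = random.Random(seed)
    return sum(run_policy(select, means, horizon, rng) for _ in range(trials)) / trials


ASSUMED_MEANS = [0.05 + 0.1 * i for i in range(10)]  # hypothetical arm means
for name, policy in [
    ("Thompson Sampling", thompson),
    ("ε-Greedy (0.1)", epsilon_greedy(0.1)),
    ("UCB1", ucb1),
    ("Random", random_policy),
]:
    print(f"{name}: {mean_regret(policy, ASSUMED_MEANS):.2f}")
```

Raising `trials` toward 10,000 tightens the averages at the cost of runtime.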

Non-Stationary Benchmarks In Progress

The stationary benchmark above measures performance on fixed reward distributions — an environment where Thompson Sampling's per-pull adaptation gives it a natural advantage. Thompson wins there. That result is expected.

Real-world applications such as markets, user preferences, clinical efficacy, and ad performance are non-stationary: reward distributions drift. Adaptive methods can over-commit to stale estimates and stop exploring. MHZ-Epoch's persistent multi-scale exploration structure is engineered to maintain performance when environments shift, the scenario where Thompson Sampling's increasingly concentrated posterior can become a liability.

Benchmarks on drifting and adversarial bandits are currently running. Results will be published here.
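To make "drifting" concrete, here is one common way to model a non-stationary bandit: each arm's mean follows a clipped Gaussian random walk, and regret is measured against the best arm at each step. This is an illustrative sketch only. The class name, drift model, and parameters are assumptions and do not describe the benchmark suite referred to above.

```python
import random


class DriftingBernoulliBandit:
    """Bernoulli bandit whose arm means take a small random-walk step per pull."""

    def __init__(self, means, drift_std=0.01, seed=0):
        self.means = list(means)
        self.drift_std = drift_std  # assumed per-step drift scale
        self.rng = random.Random(seed)

    def pull(self, arm):
        """Return (reward, instantaneous regret), then drift every arm's mean."""
        # Dynamic regret: compare with the best arm at this moment.
        regret = max(self.means) - self.means[arm]
        reward = 1.0 if self.rng.random() < self.means[arm] else 0.0
        self.means = [
            min(1.0, max(0.0, m + self.rng.gauss(0.0, self.drift_std)))
            for m in self.means
        ]
        return reward, regret


# A policy that never re-explores: always pull the arm that was best at t = 0.
env = DriftingBernoulliBandit([0.05 + 0.1 * i for i in range(10)], drift_std=0.02)
stale_arm = 9
total_regret = 0.0
for _ in range(640):
    _, step_regret = env.pull(stale_arm)
    total_regret += step_regret
print(f"dynamic regret of the stale choice: {total_regret:.2f}")
```

In a stationary setting this stale choice would incur zero regret; under drift, regret accumulates whenever another arm overtakes it.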

Target Domains

Environments where a fixed, pre-computed exploration schedule offers structural advantages.

Training Efficiency

Reduce wasted compute cycles. Converge on optimal configurations faster during hyperparameter search and model selection.

Recommendation Systems

Identify the best content, product, or action to surface with fewer exploration rounds and lower opportunity cost.

Non-Stationary Environments

Persistent multi-scale exploration structure designed for drifting reward distributions, where adaptive methods can over-commit to stale estimates.

Edge Devices

Lightweight sequential decisions with minimal state. No neural network overhead. Fixed schedule runs anywhere.

A/B Testing

Structured exploration with deterministic allocation. Reach statistical significance with a pre-computed ordering.

Resource Allocation

Dynamic budget distribution across competing options. Minimize regret in portfolio, bid, and scheduling problems.

Intellectual Property

Proprietary & Patent-Pending

The internal algorithm and sequence generator behind MHZ-Epoch are proprietary and patent-pending. Only benchmark results and integration interfaces are disclosed. The underlying methodology, mathematical structure, and generation process are not publicly available.

Algorithm

Closed-source. Internal architecture and decision logic are not disclosed.

Sequence Generator

Proprietary ordering mechanism. No technical details released.

Integration

Available via API. Black-box interface with documented inputs and outputs.

Interested in licensing or partnership?

MHZ-Epoch is available for enterprise licensing, research collaboration, and integration partnerships. Contact our team to discuss deployment.

MHZ-Epoch Sequential Decision Algorithm · © 2025 OmegaForge (Medici Group) · Berlin, Germany · Patent Pending