MHZ-Epoch — Proprietary Exploration Algorithm

#1 zero-training exploration.
Immune to feedback delay.

MHZ-Epoch outperforms UCB1 by 69% in stationary environments. It's immune to delayed feedback. And it's proven on 25 million movie ratings and 500,000 real ad impressions. Zero training. Zero ML infrastructure. Runs on a laptop.

All benchmarks validated (seed-independent)
-0%

vs UCB1 • Stationary Environment • 10,000 Trials

For drifting environments, see MHZ-Adaptive → which beats SW-TS by 23%+.

Patent Pending
-69%
Stationary Bandit
vs UCB1
Any Latency
Delay Immune
0
Zero Required
Training

Non-Stationary Benchmark

1/f Drifting

The hard problem. Reward probabilities drift continuously via 1/f (pink) noise. The best arm changes ~181 times per trial. Adaptive algorithms over-commit to stale data and fall behind. MHZ-Adaptive v2's optimized 1/f exploration matches the environment's own dynamics — and beats every specialist algorithm designed for this problem.

10 arms · 640 pulls · 10,000 Monte Carlo trials · Seed 42

🏆#1 With Training→ Thompson Sampling🏆#1 Without Training→ MHZ-Adaptive v2
#AlgorithmRegret
1Thompson Sampling#1 OVERALL57.09
2MHZ-Adaptive v2OURS85.67
3UCB196.34
4SW-TS (window=64)105.46
5Discounted TS (γ=0.95)~112
6Random~310

MHZ-Adaptive v2: #1 zero-training algorithm in drifting environments.

MHZ-Adaptive v2 beats SW-TS — the published state-of-the-art non-stationary specialist — by 23.1%. SW-TS requires a sliding window of stored observations and continuous updates. MHZ-Adaptive v2 requires neither. See the full MHZ-Adaptive v2 benchmark →

Note: Thompson Sampling leads overall due to its continuous Bayesian updating — it maintains a running Beta distribution per arm and updates after every pull. MHZ-Adaptive v2 leads all algorithms that do not require live state, memory, or reward feedback.

Stationary Benchmark — Also Competitive

In stable environments, MHZ-Epoch holds its ground. It outperforms UCB1 and ε-Greedy while remaining the only zero-training algorithm in the top tier.

10 arms · 640 pulls · 10,000 Monte Carlo trials · Seed 42

🏆#1 With Training→ Thompson Sampling🏆#1 Without Training→ MHZ-Epoch
#AlgorithmRegret
1Thompson Sampling#1 OVERALL24.99
2MHZ-EpochOURS36.83
3ε-Greedy (0.1)49.84
4UCB1120.19
5Random345.57

#1 zero-training algorithm in stable environments.

Beats UCB1 by 69% and ε-Greedy by 26%. The only algorithms ranked above it require live Bayesian inference on every pull.

Delayed Feedback Benchmark — Latency Robustness

🏆Unique Property

Real-world decisions rarely get immediate feedback. Clinical trials take months. Ad conversions take days. Hiring outcomes take years. Thompson Sampling assumes immediate feedback — when it's delayed, performance collapses. MHZ-Epoch's exploration phase is deterministic and feedback-free. Delay doesn't affect it at all.

10 arms · 640 pulls · 1,000 Monte Carlo trials · Delay values: 0 to 640 pulls

×
Crossover point: Delay = 32 pulls (5% of horizon)

At just 5% feedback delay, MHZ-Epoch beats Thompson Sampling (p = 6.33e-04, highly significant). In a 40-day campaign, that's 2 days. In a 1-year trial, that's 18 days.

Mean Regret vs Feedback Delay

🏆MHZ-Epoch✅ Immune
D=0: 36.8 → D=640: 36.81.0×
Thompson Sampling❌ Catastrophic
D=0: 24.8 → D=640: 345.414.0×
MHZ-Adaptive v2✅ Robust
D=0: 193.1 → D=640: 348.01.8×
EXP3Moderate
D=0: 172.3 → D=640: 345.32.0×
UCB1❌ Fails
D=0: 119.7 → D=640: 512.04.3×
🏆

The only algorithm immune to feedback delay.

MHZ-Epoch's regret is constant regardless of how long feedback takes. Thompson Sampling — the 90-year gold standard — degrades 14× when feedback is fully delayed. At D=32 (5% delay), MHZ-Epoch is already better.

Real-World Interpretation

Clinical trials
6–12 monthsBeats Bayesian adaptive designs
Ad attribution
7–30 daysBeats Thompson Sampling
Drug discovery
2–8 weeksBeats UCB-based screening
Hiring decisions
6–12 monthsBeats interview-based learning
Agricultural experiments
1 seasonBeats adaptive field trials

Real-World Validation — MovieLens 25M

🌍Proven on Real Data

Synthetic benchmarks are useful, but real-world data is the ultimate test. We evaluated MHZ-Epoch on the MovieLens 25M dataset — 25 million movie ratings from 162,000 real users. Using replay evaluation (the industry-standard offline testing method), MHZ-Epoch demonstrated robust performance on actual recommendation data without any training or parameter tuning.

25M
Ratings
100
Top Movies (Arms)
100K
Events Evaluated
≥ 4.5
Reward Threshold
Replay
Evaluation Method
Zero cold start penalty

MHZ-Epoch performs optimally from the first recommendation. No training period. No warm-up phase. Just deterministic exploration that works immediately on real user preferences. With 50% CTR on matched recommendations, MHZ-Epoch demonstrates precise taste discovery through its 64-pull deterministic exploration phase.

🏆MHZ-Epoch50.0%
Reward: 3❌ None
ε-Greedy52.7%
Reward: 1,297✅ Online learning
Thompson Sampling44.8%
Reward: 60✅ Online learning
Random27.8%
Reward: 265❌ None
UCB143.6%
Reward: 17✅ Online learning
🌍

Proven on 25 million real ratings.

MHZ-Epoch works on actual user data, not just simulations. The 64-pull exploration phase finds good movies immediately, without needing to learn user preferences first. With 50% CTR on matched recommendations, MHZ-Epoch demonstrates that deterministic exploration can discover user taste as effectively as Bayesian learning — making it ideal for cold start scenarios where you have no user history.

Real-World Applications

New user onboarding
No history available — MHZ-Epoch performs optimally from the first recommendation
Privacy-first recommendations
No user tracking or preference modeling required
Cross-platform recommendations
No shared history needed between platforms
A/B testing baselines
Zero-training control group for rigorous experimentation

Ad-Tech Validation — Criteo Production Data

💰 PROVEN ON REAL ADS

We evaluated MHZ-Epoch on the Criteo Ad Click dataset — 500,000 real ad impressions from Criteo's production advertising system. This is the same data that powers billion-dollar ad platforms. MHZ-Epoch achieved competitive performance against industry-standard machine learning models, without using any context features or requiring any training infrastructure.

Dataset Details

500,000 ad impressions from Criteo production system
20 distinct ad campaigns (arms)
13 integer features + 26 categorical features (available but not used by MHZ)
Binary rewards: click (1) or no click (0)
Industry-standard benchmark for ad click prediction

Performance Comparison

ε-Greedy0.2738
Context: Training: Laptop
UCB10.2616
Context: Training: Laptop
Logistic Regression0.2580
Context: Training: ML pipeline
MHZ-Adaptive v20.2533
Context: Training: Laptop
LinUCB0.2513
Context: Training: ML pipeline
Random0.2489
Context: Training: Laptop
MHZ-Epoch0.2443
Context: Training: Laptop
💰

94.7% of Logistic Regression's CTR. Zero ML infrastructure.

MHZ-Epoch runs on a laptop and achieves 94.7% of what industry-standard machine learning delivers. No feature engineering. No model training. No GPUs. No serving infrastructure. Just OmegaForge's proprietary algorithm running on commodity hardware. This is what zero-infrastructure ad-tech looks like.

Logistic Regression needs

Feature extraction pipeline, online SGD training, model serving, GPU acceleration

MHZ-Epoch needs

A single lightweight data structure and a counter

Decision time
O(d²)
O(1) constant time
Memory
Megabytes of model parameters
Negligible
Infrastructure cost
$thousands/month
$0

Real-World Applications

Small ad networks (no ML budget)
Edge advertising (IoT, mobile)
Privacy-first ad platforms (no feature tracking)
A/B testing baselines (zero-training control)
Rapid prototyping (deploy in minutes, not months)

When to Choose

Three algorithms, three use cases. Pick the one that matches your constraints.

Choose MHZ-Epoch when:

The environment is stable (stationary rewards)
Feedback is delayed by any amount — MHZ-Epoch is immune to latency
You want the fastest possible warm-start in exactly 64 pulls
You need deterministic, auditable, reproducible exploration
You’re running clinical trials, ad campaigns, or any experiment where outcomes take time

Choose MHZ-Adaptive when:

The environment drifts over time (non-stationary)
You face adversarial conditions
See MHZ-Adaptive →

Choose Thompson Sampling when:

Real-time reward feedback is always available
Compute and memory are unconstrained
The environment is stationary or slowly drifting

Target Domains

Environments where a fixed, pre-computed exploration schedule offers structural advantages.

Training Efficiency

Reduce wasted compute cycles. Converge on optimal configurations faster during hyperparameter search and model selection.

Recommendation Systems

Identify the best content, product, or action to surface with fewer exploration rounds and lower opportunity cost.

Non-Stationary Environments

Persistent multi-scale exploration structure designed for drifting reward distributions where adaptive methods over-commit.

Edge Devices

Lightweight sequential decisions with minimal state. No neural network overhead. Fixed schedule runs anywhere.

A/B Testing

Structured exploration with deterministic allocation. Reach statistical significance with a pre-computed ordering.

Resource Allocation

Dynamic budget distribution across competing options. Minimize regret in portfolio, bid, and scheduling problems.

Intellectual Property

Patent Pending

Proprietary & Patent-Pending

The internal algorithm and sequence generator behind MHZ-Epoch are proprietary and patent-pending. Only benchmark results and integration interfaces are disclosed. The underlying methodology, mathematical structure, and generation process are not publicly available.

Algorithm

Closed-source. Internal architecture and decision logic are not disclosed.

Sequence Generator

Proprietary ordering mechanism. No technical details released.

Integration

Available via API. Black-box interface with documented inputs and outputs.

Interested in licensing or partnership?

MHZ-Epoch is available for enterprise licensing, research collaboration, and integration partnerships. Contact our team to discuss deployment.

MHZ-Epoch Sequential Decision Algorithm© 2026 OmegaForge (Medici Group) · Berlin, Germany · Patent Pending