
Arena Score
A single score (0-100) that summarizes the quality of a backtest — so you don't have to interpret every individual metric.
Example: how the Arena Score appears in your backtest results.
Why 4 dimensions?
A single metric is always misleading. A 40% CAGR sounds great, but not if the strategy made only 5 trades and Buy & Hold would have returned 60%. Arena Score deliberately weighs four dimensions: Return, Efficiency (risk-adjusted return), Consistency, and Sample Size (the statistical foundation).
Treating Sample Size as its own dimension is Arena Score's unique contribution over other tools. A 5-trade backtest can never score more than 25 points, no matter how good CAGR and Win Rate look. This automatically encourages statistically sound tests.
The 4 dimensions
Return (0-30)
How much your strategy outperforms the average Buy & Hold investor (not just the day-1 buyer).
| Condition | Points | Meaning |
|---|---|---|
| Outperformance < 0% | 0 | Worse than average B&H |
| Outperformance < 5% | 5–10 | Slightly better |
| Outperformance < 15% | 10–20 | Clearly better |
| Outperformance < 30% | 20–27 | Much better |
| Outperformance ≥ 30% | 30 | Exceptional |
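
The Return table can be read as a band lookup. The following is a minimal sketch, not the tool's actual implementation: the function name `return_points_band` is hypothetical, and because the table gives point ranges (for example 5–10) without saying how points are placed inside a range, the sketch returns the range itself instead of guessing an interpolation rule.

```python
from typing import Tuple


def return_points_band(outperformance_pct: float) -> Tuple[int, int]:
    """Return the (min, max) Return points for a given outperformance in percent.

    Assumption: each row of the table is an upper-exclusive threshold.
    The position inside a band is not specified by the source, so the
    whole band is returned.
    """
    if outperformance_pct < 0:
        return (0, 0)      # worse than the average Buy & Hold investor
    if outperformance_pct < 5:
        return (5, 10)     # slightly better
    if outperformance_pct < 15:
        return (10, 20)    # clearly better
    if outperformance_pct < 30:
        return (20, 27)    # much better
    return (30, 30)        # exceptional, full points
```

For example, an outperformance of 3% falls in the 5–10 point band, while 30% or more earns the full 30 points.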
Efficiency (0-25)
Risk-adjusted return via MAR Ratio: CAGR ÷ |Max Drawdown|. Higher = less pain per percent of gain.
| Condition | Points | Meaning |
|---|---|---|
| MAR < 0.3 | 0 | Very volatile |
| MAR < 0.5 | 5 | Weak |
| MAR < 1.0 | 10 | Acceptable |
| MAR < 1.5 | 17 | Good |
| MAR < 2.0 | 21 | Very good |
| MAR ≥ 2.0 | 25 | Excellent |
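
To make the Efficiency lookup concrete, here is a small sketch that computes the MAR Ratio and maps it to points using the thresholds in the table. The function name `efficiency_points` and its parameters are hypothetical, and the handling of a zero drawdown is an assumption, since the source does not cover that case.

```python
def efficiency_points(cagr_pct: float, max_drawdown_pct: float) -> int:
    """Map MAR = CAGR / |Max Drawdown| to 0-25 Efficiency points."""
    if max_drawdown_pct == 0:
        # Assumption: the source does not define MAR for a zero drawdown,
        # so this sketch refuses to guess.
        raise ValueError("MAR is undefined when max drawdown is 0")

    mar = cagr_pct / abs(max_drawdown_pct)

    # (upper-exclusive MAR threshold, points), in the order of the table
    bands = [(0.3, 0), (0.5, 5), (1.0, 10), (1.5, 17), (2.0, 21)]
    for upper, points in bands:
        if mar < upper:
            return points
    return 25  # MAR >= 2.0
```

A strategy with 20% CAGR and a 25% max drawdown has a MAR of 0.8 and would earn 10 points under this reading.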
Consistency (0-25)
Win Rate (0-10 pts) + Profit Factor (0-15 pts). Rewards strategies where winners reliably dominate losers.
| Condition | Points | Meaning |
|---|---|---|
| PF < 1.0 | PF: 0 | Loses money |
| PF < 1.3 | PF: 3 | Barely profitable |
| PF < 1.5 | PF: 7 | Solid |
| PF < 2.0 | PF: 11 | Strong |
| PF ≥ 2.0 | PF: 15 | Excellent: €2 profit per €1 loss |
| WR < 40% | WR: 0 | Low win rate |
| WR 50–55% | WR: 8 | Above chance |
| WR ≥ 55% | WR: 10 | High win rate |
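
The two Consistency components can be sketched as separate lookups. This is an illustration under stated assumptions, not the tool's code: the function names are hypothetical, and since the table lists no points for win rates from 40% up to 50%, the sketch returns `None` for that band instead of inventing a value.

```python
from typing import Optional


def profit_factor_points(profit_factor: float) -> int:
    """Map Profit Factor to 0-15 points using the PF rows of the table."""
    bands = [(1.0, 0), (1.3, 3), (1.5, 7), (2.0, 11)]
    for upper, points in bands:
        if profit_factor < upper:
            return points
    return 15  # PF >= 2.0


def win_rate_points(win_rate_pct: float) -> Optional[int]:
    """Map Win Rate to 0-10 points using the WR rows of the table.

    Returns None for 40% <= WR < 50%, which the table does not list.
    Assumption: a win rate of exactly 55% belongs to the top row.
    """
    if win_rate_pct < 40:
        return 0
    if win_rate_pct >= 55:
        return 10
    if win_rate_pct >= 50:
        return 8
    return None  # band not specified in the source table
```

Under this reading, a Profit Factor of 1.4 with a 52% win rate would contribute 7 + 8 = 15 Consistency points.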
Sample Size (0-20)
Statistical significance, built from two components: absolute trade count (0-10) plus test duration (0-10). Both must be solid to reach the maximum. Hard floor: fewer than 5 trades or less than 1 year scores 0. The duration component accounts for the fact that weekly/monthly strategies naturally generate fewer trades, but each trade spans more market time.
| Condition | Points | Meaning |
|---|---|---|
| <5 trades OR <1 year | 0 | Hard floor: score is automatically F |
| Trades 5-9 | +2 | Minimal (trade count) |
| Trades 10-19 | +5 | Basic (trade count) |
| Trades 20-29 | +7 | Solid (trade count) |
| Trades 30-49 | +9 | Strong (trade count) |
| Trades ≥50 | +10 | Robust (trade count max) |
| Years 1-2 | +2 | Short (duration) |
| Years 2-3 | +5 | Medium-term (duration) |
| Years 3-4 | +7 | Multi-year (duration) |
| Years 4-6 | +9 | Long-term (duration) |
| Years ≥6 | +10 | Multi-cycle (duration max) |
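
The Sample Size rules combine a hard floor with two additive lookups, which a short sketch makes easier to follow. The function name `sample_size_points` is hypothetical. The duration rows share their boundary values (2, 3, 4 years), so the sketch assumes each band includes its lower bound and excludes its upper bound; the source does not state this.

```python
def sample_size_points(trades: int, years: float) -> int:
    """Return 0-20 Sample Size points from trade count and test duration."""
    # Hard floor from the table: either condition alone zeroes the dimension.
    if trades < 5 or years < 1:
        return 0

    def lookup(value: float, bands: list, top: int) -> int:
        # bands holds (upper-exclusive bound, points); top applies beyond the last bound
        for upper, points in bands:
            if value < upper:
                return points
        return top

    trade_points = lookup(trades, [(10, 2), (20, 5), (30, 7), (50, 9)], 10)
    # Assumption: shared boundaries (e.g. exactly 2 years) go to the higher band.
    year_points = lookup(years, [(2, 2), (3, 5), (4, 7), (6, 9)], 10)
    return trade_points + year_points
```

For instance, 25 trades over 3.5 years would score 7 + 7 = 14, while 100 trades over half a year scores 0 because of the hard floor.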
The grades
| Grade | Score |
|---|---|
| Excellent | 85-100 |
| Very Good | 70-84 |
| Good | 55-69 |
| Mixed | 40-54 |
| Weak | 25-39 |
| Not Recommended | 0-24 |
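
The grade bands above translate directly into a lookup from score to label. This is a minimal sketch with a hypothetical function name, `arena_grade`; it assumes the total score is an integer between 0 and 100.

```python
def arena_grade(score: int) -> str:
    """Map a 0-100 Arena Score to its grade label."""
    if not 0 <= score <= 100:
        raise ValueError("Arena Score must be between 0 and 100")

    # (lowest score in the band, label), checked from the top band down
    bands = [
        (85, "Excellent"),
        (70, "Very Good"),
        (55, "Good"),
        (40, "Mixed"),
        (25, "Weak"),
    ]
    for floor, label in bands:
        if score >= floor:
            return label
    return "Not Recommended"  # 0-24
```

A score of 72 would be labeled "Very Good", and anything below 25 falls into "Not Recommended".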
Where do you see the Score?
- Backtest result: right above the metrics, expandable
- My Backtests: column "Score", sortable
- Strategy Insights: average score per strategy and interval
- Compare tab (multi-strategy backtest): as primary sort
⚠ Important
Arena Score only evaluates the past backtest. A high score does not guarantee future performance — but a low score is a reliable warning signal. Use it as a first filter, not as a decision engine.