Most failed backtests fail in predictable ways. The same mistakes show up across operators learning the discipline. This guide walks through the common pitfalls and how to avoid them — saving you live-capital tuition.
1. Curve-fitting (overfitting)
Tuning parameters until they produce great results on a specific historical window. The “tuned” parameters are fit to the noise of that window, not to a generalizable market characteristic.
Symptom: the backtest looks great; live performance underperforms substantially.
Mitigation: walk-forward testing. Tune on in-sample data, validate on out-of-sample data. Never peek at out-of-sample results during tuning.
Red flags: you “optimized” by sweeping more than 3 parameters; you tuned until you got the result you wanted; you can’t explain why the tuned parameters work.
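As a rough illustration of the in-sample / out-of-sample discipline, the sketch below generates rolling walk-forward splits over bar indices. The function name `walk_forward_splits` and the window sizes are assumptions made for this example; they are not part of the Backtester.

```python
from typing import Iterator, Tuple


def walk_forward_splits(
    n_bars: int, train_size: int, test_size: int
) -> Iterator[Tuple[range, range]]:
    """Yield (in_sample, out_of_sample) index ranges that roll forward in time.

    Hypothetical helper: tune parameters only on in_sample, then score them
    once on the out_of_sample block that immediately follows it.
    """
    start = 0
    while start + train_size + test_size <= n_bars:
        in_sample = range(start, start + train_size)
        out_of_sample = range(start + train_size, start + train_size + test_size)
        yield in_sample, out_of_sample
        start += test_size  # advance by one out-of-sample block


splits = list(walk_forward_splits(n_bars=1000, train_size=500, test_size=100))
for in_sample, out_of_sample in splits:
    # The out-of-sample block always starts where the in-sample block ends,
    # so tuning never sees the bars it is later validated on.
    print(f"tune on {in_sample.start}-{in_sample.stop - 1}, "
          f"validate on {out_of_sample.start}-{out_of_sample.stop - 1}")
```

Each out-of-sample block is scored with parameters chosen before it was seen, which is the property that single-window tuning lacks.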
2. Look-ahead bias
Your strategy accidentally uses information that wouldn’t have been available at the decision moment.
Common form: an indicator’s “current value” is computed using data that hadn’t yet arrived in real time. Closing-bar values, end-of-period statistics, or post-hoc adjustments leak into a “real-time” signal.
Symptom: results that seem too good to be true. A +200% annual return with a -2% max drawdown is almost certainly look-ahead-biased.
Mitigation: the unCoded Backtester handles this carefully: every decision uses only data up to the candle close. But if you’re computing custom indicators or doing post-hoc analysis, the bias can re-emerge.
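To make the "end-of-period statistics" case concrete, here is a minimal sketch of a custom signal written two ways. Both function names are hypothetical. The leaky version compares each close to the mean of the whole series; the causal version uses only closes up to and including the current candle.

```python
from statistics import mean
from typing import List, Sequence


def leaky_signal(closes: Sequence[float]) -> List[bool]:
    # BUG (look-ahead): the mean includes candles that close after bar i.
    full_mean = mean(closes)
    return [c > full_mean for c in closes]


def causal_signal(closes: Sequence[float]) -> List[bool]:
    # Each decision at bar i sees only closes[0..i].
    signals = []
    running_sum = 0.0
    for i, c in enumerate(closes):
        running_sum += c
        signals.append(c > running_sum / (i + 1))
    return signals


closes = [100.0, 102.0, 101.0, 105.0, 110.0]
altered_future = closes[:3] + [50.0, 40.0]  # same past, different future

# A causal signal cannot change when only future candles change.
print(causal_signal(closes)[:3] == causal_signal(altered_future)[:3])
print(leaky_signal(closes)[:3] == leaky_signal(altered_future)[:3])
```

Replaying a signal with the future candles altered, as above, is a cheap check for leaks in custom indicators: past decisions that change have seen data they should not have.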
3. Survivorship bias
Backtesting only on symbols that exist today. The symbols that were delisted (because their projects failed) aren’t in your universe, but in 2021 you might have allocated to them.
Symptom: backtest results that look great because you’re testing on a “winner” universe.
Mitigation: backtest on majors only: BTCUSDT, ETHUSDT, SOLUSDT, BNBUSDT, symbols whose continued existence is high-confidence. Don’t extrapolate from major-symbol backtests to long-tail altcoins.
4. Ignoring max drawdown in favor of total return
A +50% total return looks great. The same backtest with a -40% max drawdown is terrible: most operators capitulate during a -40% drawdown and crystallize the loss.
Symptom: optimizing for total return without checking drawdown. The live operator panic-closes at the bottom and never realizes the “good” return.
Mitigation: always read total return and max drawdown together. Ask “could I emotionally hold through this drawdown for the recovery?” If not, the strategy is wrong for you regardless of total return.
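The two numbers can be read off the same equity curve. The sketch below computes max drawdown as the worst peak-to-trough decline; the equity values are made up to reproduce the +50% / -40% case described above.

```python
from typing import Sequence


def max_drawdown(equity: Sequence[float]) -> float:
    """Worst peak-to-trough decline as a negative fraction (0.0 if none)."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)                 # highest equity seen so far
        worst = min(worst, value / peak - 1.0)  # decline from that peak
    return worst


# Illustrative equity curve, not real backtest output.
equity = [100.0, 120.0, 72.0, 90.0, 150.0]
total_return = equity[-1] / equity[0] - 1.0
mdd = max_drawdown(equity)
print(f"total return {total_return:+.0%}, max drawdown {mdd:+.0%}")
```

The headline return only materializes for an operator who held through the drop from 120 to 72.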
5. Fee underestimation
Backtesting with unrealistic fee assumptions (0.025% instead of 0.075%, or no fees at all). The reported return is much higher than live would be.
Symptom: live performance is 5-10% worse than the backtest predicted, attributed to “bad luck” or a “different regime” when it’s actually just realistic fees eating P&L.
Mitigation: use the realistic fee for your venue. Binance with BNB: 0.075%. Binance without: 0.10%. Coinbase small account: 0.40%-0.60%. Check your venue and your tier.
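A back-of-the-envelope sketch of how the fee assumption moves the result. It assumes, for simplicity, that every fill trades the full account equity and that the fee is charged on both the entry and the exit. The gross return and trade count are illustrative, and `net_return_after_fees` is a hypothetical helper.

```python
def net_return_after_fees(gross_return: float, round_trips: int, fee_rate: float) -> float:
    """Compound a per-fill fee over all fills (assumes full-equity position size)."""
    fills = 2 * round_trips  # one entry fill and one exit fill per round trip
    return (1.0 + gross_return) * (1.0 - fee_rate) ** fills - 1.0


gross = 0.30   # illustrative gross annual return
trips = 100    # illustrative number of round trips

optimistic = net_return_after_fees(gross, trips, fee_rate=0.00025)  # 0.025%
realistic = net_return_after_fees(gross, trips, fee_rate=0.00075)   # 0.075%
print(f"net at 0.025%: {optimistic:+.1%}, net at 0.075%: {realistic:+.1%}")
```

Smaller position sizes shrink the drag proportionally, but the gap between the two fee assumptions remains.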
6. Slippage underestimation
Backtesting with zero slippage on every order. Reported returns are higher than reality, especially on illiquid symbols or large position sizes.
Symptom: the backtest claims +30% annual; live produces +22%. The difference is largely slippage on real fills.
Mitigation: use a non-zero slippage parameter. For majors at moderate size, 0.05% slippage is reasonable. For altcoins or large size, use 0.2% or more.
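One simple way to model slippage, sketched here as an assumption rather than a description of how the Backtester does it, is to move every fill against you by a fixed fraction: buys fill slightly higher, sells slightly lower.

```python
def fill_price(quoted_price: float, side: str, slippage: float) -> float:
    """Adjust a quoted price by a fractional slippage that always hurts the trader."""
    if side == "buy":
        return quoted_price * (1.0 + slippage)
    if side == "sell":
        return quoted_price * (1.0 - slippage)
    raise ValueError(f"unknown side: {side!r}")


def round_trip_return(entry: float, exit_price: float, slippage: float) -> float:
    """Return of a long round trip after slippage on both fills."""
    buy = fill_price(entry, "buy", slippage)
    sell = fill_price(exit_price, "sell", slippage)
    return sell / buy - 1.0


# Illustrative trade: bought at a quote of 100, sold at a quote of 102.
for slip in (0.0, 0.0005, 0.002):  # 0%, 0.05%, 0.2%
    print(f"slippage {slip:.2%}: {round_trip_return(100.0, 102.0, slip):+.3%}")
```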
7. Insufficient sample size
A backtest with fewer than 20 trades is not statistically meaningful. The result could be coincidence.
Symptom: confident decision-making based on 5-10 trades. Live performance diverges wildly from “expectations.”
Mitigation: aim for more than 50 trades in any backtest segment. Lengthen the window or pick a higher-frequency mode if you’re not getting there.
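To see why small samples mislead, the sketch below puts a 95% Wilson score interval around an observed win rate. Using win rate and the Wilson interval is a choice made for this illustration; the same reasoning applies to other per-trade statistics.

```python
import math
from typing import Tuple


def wilson_interval(wins: int, trades: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a win rate (z = 1.96 gives roughly 95%)."""
    if trades <= 0:
        raise ValueError("trades must be positive")
    p = wins / trades
    denom = 1.0 + z * z / trades
    center = (p + z * z / (2 * trades)) / denom
    half = z * math.sqrt(p * (1 - p) / trades + z * z / (4 * trades * trades)) / denom
    return center - half, center + half


# Same observed 60% win rate, very different certainty.
for wins, trades in ((6, 10), (60, 100)):
    lo, hi = wilson_interval(wins, trades)
    print(f"{wins}/{trades} wins: plausible win rate {lo:.0%} to {hi:.0%}")
```

With 10 trades the interval comfortably includes a coin flip, which is what "could be coincidence" means in practice.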
8. Single-window testing
Backtesting on only one historical window (e.g., the most recent 12 months). The strategy is fit to that window’s regime, not validated for robustness.
Symptom: a strategy that worked in 2023 fails in 2026.
Mitigation: test multiple windows: bull (e.g., 2020-21), bear (2022), chop (2023), and recent. If the strategy survives all of them, you have evidence of robustness. If it fails on some, decide whether you can stomach those regimes.
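A small sketch of the multi-window check: run the same parameters on each window, then list the windows that miss thresholds you fixed in advance. The `WindowResult` type, the thresholds, and the metric values are all invented for illustration and are not real backtest output.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class WindowResult:
    total_return: float  # fraction, e.g. 0.40 for +40%
    max_drawdown: float  # negative fraction, e.g. -0.15 for -15%


def failing_windows(
    results: Dict[str, WindowResult], min_return: float, max_drawdown_limit: float
) -> List[str]:
    """Names of windows where the strategy misses either threshold."""
    return [
        name
        for name, r in results.items()
        if r.total_return < min_return or r.max_drawdown < max_drawdown_limit
    ]


# Made-up metrics for one parameter set across three regimes.
results = {
    "bull (2020-21)": WindowResult(total_return=0.40, max_drawdown=-0.15),
    "bear (2022)": WindowResult(total_return=-0.05, max_drawdown=-0.35),
    "chop (2023)": WindowResult(total_return=0.03, max_drawdown=-0.12),
}
print(failing_windows(results, min_return=0.0, max_drawdown_limit=-0.25))
```

An empty list is evidence of robustness; a non-empty one names the regimes you need to decide whether you can stomach.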
9. Wrong fees / fee tier assumptions
Using “BNB-discounted Binance” fees when you don’t actually have BNB top-up. Or using “VIP 4” fees when you’re at VIP 0.
Mitigation: be specific. Match the fee assumption to your actual operator state.
10. Not accounting for operator behavior
The backtest assumes a perfectly disciplined operator who never deviates. Live operators panic, override, change settings, take vacations.
Symptom: live performance is consistently worse than the backtest because operator-induced deviations subtract from returns.
Mitigation: simulate worst-case operator behavior. “What if I panic-close half my positions during a -15% drawdown?” Stress-test your psychology, not just your strategy.
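The panic-close question can be roughed out on a series of per-period strategy returns. This sketch assumes the operator halves exposure the first time drawdown reaches the threshold and never restores it. That is a deliberately pessimistic model chosen for illustration, and the return series is made up.

```python
from typing import Optional, Sequence


def simulate_operator(
    period_returns: Sequence[float],
    panic_drawdown: Optional[float] = -0.15,
    cut_to: float = 0.5,
) -> float:
    """Total return when exposure is cut to `cut_to` once drawdown hits the threshold.

    Pass panic_drawdown=None to model a perfectly disciplined operator.
    """
    equity = peak = 1.0
    exposure = 1.0
    for r in period_returns:
        equity *= 1.0 + exposure * r
        peak = max(peak, equity)
        if panic_drawdown is not None and equity / peak - 1.0 <= panic_drawdown:
            exposure = cut_to  # operator panics and stays under-exposed
    return equity - 1.0


returns = [0.10, -0.20, 0.30]  # illustrative: a sharp drop followed by a recovery
disciplined = simulate_operator(returns, panic_drawdown=None)
panicked = simulate_operator(returns)
print(f"disciplined {disciplined:+.1%}, panic-close {panicked:+.1%}")
```

The panicking operator takes the full drop but only half of the recovery, which is the gap between backtest and live that this pitfall describes.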
11. Data quality issues
Historical candle data may have gaps, anomalies, or rounding artifacts that bias the backtest.
Mitigation: use venue-source historical data (the Backtester pulls from venues directly). Avoid third-party aggregated feeds that may have data quality problems.
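If you do bring your own candle data, a quick sanity check for gaps is cheap. The sketch below assumes candles carry open timestamps in milliseconds at a fixed interval; `find_gaps` is a hypothetical helper, not a Backtester function.

```python
from typing import List, Sequence, Tuple


def find_gaps(timestamps_ms: Sequence[int], interval_ms: int) -> List[Tuple[int, int]]:
    """Return (previous, current) timestamp pairs that are not exactly one interval apart."""
    gaps = []
    for prev, cur in zip(timestamps_ms, timestamps_ms[1:]):
        if cur - prev != interval_ms:
            gaps.append((prev, cur))  # missing, duplicated, or out-of-order candle
    return gaps


HOUR_MS = 3_600_000
# Illustrative hourly series with the 03:00 candle missing.
candles = [0, HOUR_MS, 2 * HOUR_MS, 4 * HOUR_MS]
print(find_gaps(candles, HOUR_MS))
```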
12. Time-of-day mismatches
The backtest uses UTC timestamps; your live operation runs in your local timezone. Time-window conditions (e.g., “trade only during US market hours”) need consistent timezone handling.
Mitigation: be explicit about timezone assumptions in time-based conditions. Verify that backtest and live use the same timezone reference.
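One way to keep a time-window condition consistent is to store timestamps in UTC and convert explicitly to the timezone the rule is defined in. In the sketch below, "US market hours" is assumed to mean 09:30 to 16:00 New York time on weekdays, with holidays ignored. That definition is this example's assumption.

```python
from datetime import datetime, time, timezone
from zoneinfo import ZoneInfo

NEW_YORK = ZoneInfo("America/New_York")


def in_us_market_hours(candle_close_utc: datetime) -> bool:
    """True if a timezone-aware timestamp falls in 09:30-16:00 New York time, Mon-Fri."""
    if candle_close_utc.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    local = candle_close_utc.astimezone(NEW_YORK)  # handles daylight saving shifts
    return local.weekday() < 5 and time(9, 30) <= local.time() < time(16, 0)


# The same UTC clock time lands on different sides of the open in winter and summer.
winter = datetime(2024, 1, 16, 13, 45, tzinfo=timezone.utc)  # 08:45 in New York
summer = datetime(2024, 7, 15, 13, 45, tzinfo=timezone.utc)  # 09:45 in New York
print(in_us_market_hours(winter), in_us_market_hours(summer))
```

Hard-coding the window as a fixed UTC range would silently shift by an hour when daylight saving changes, which is one way backtest and live can disagree.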
13. Different exchange behavior than backtest assumes
Each exchange has its own quirks: partial fills, retry behavior, error codes. Backtests typically assume idealized exchange behavior.
Mitigation: forward-test on the same venue you’ll deploy live. The backtest validates the strategy logic; the forward-test catches venue-specific frictions.
14. Comparing strategies with different costs as if equivalent
“BasicMode shows +25% annual; Tsl2Sell shows +30%.” But BasicMode has 200 trades; Tsl2Sell has 8. After fees and slippage, the comparison shifts.
Mitigation: compare strategies after fees and slippage are deducted, so that the comparison accounts for trade frequency.
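A rough way to see the shift is to compare each strategy's average gross edge per trade with the cost of one round trip. This sketch rests on several assumptions: the quoted returns are gross of costs, "trades" means round trips, each trade uses the full account, and a simple arithmetic average is good enough. The 0.075% fee and 0.05% slippage are the figures used earlier in this guide.

```python
def average_net_edge_per_trade(
    gross_return: float, round_trips: int, fee_rate: float, slippage: float
) -> float:
    """Average gross return per round trip minus the cost of entering and exiting."""
    round_trip_cost = 2 * (fee_rate + slippage)  # pay fee and slippage on both fills
    return gross_return / round_trips - round_trip_cost


FEE, SLIP = 0.00075, 0.0005

basic = average_net_edge_per_trade(0.25, 200, FEE, SLIP)  # BasicMode figures from the text
tsl = average_net_edge_per_trade(0.30, 8, FEE, SLIP)      # Tsl2Sell figures from the text
print(f"BasicMode net edge per trade: {basic:+.3%}")
print(f"Tsl2Sell net edge per trade:  {tsl:+.3%}")
```

Under these assumptions the high-frequency strategy's thin per-trade edge is smaller than its per-trade cost, while the low-frequency strategy barely notices the friction.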
15. Not running the full validation pipeline
Backtest → walk-forward → shadow → forward-test → scale up. Operators who skip steps end up paying tuition with real capital.
Mitigation: discipline. Each step has its purpose.
1. Define what 'good' looks like before running backtests
Decide what max drawdown you can stomach, what total return makes the strategy worth running, and what win rate range is acceptable, all before you see backtest results. Defining criteria before testing prevents post-hoc rationalization.
2. Use realistic fees and slippage
Match your venue, your tier, your typical order size. Don’t optimize away realistic frictions.
3. Test on multiple windows
Bear, chop, bull, recent. Same parameters across all windows. Look for regime robustness.
4. Walk-forward when tuning
If you’re adjusting parameters, walk-forward catches curve-fitting. Don’t peek at out-of-sample.
5. Major-symbol-only backtesting
Avoid survivorship bias. Test on BTCUSDT, ETHUSDT, SOLUSDT, etc.
6. Sample size matters
Aim for more than 50 trades per segment. Lengthen the window if you’re not getting there.
7. Drawdown over total return
Focus on max drawdown more than total return. Total return is the headline; drawdown is what kills operators.
8. Forward-test on small live capital after backtest
Even after extensive backtesting, forward-test on $1,500-$3,000 for 2-4 weeks before scaling up. Real fills, real frictions, real operator emotions.
A strategy that’s mathematically optimal but causes you to panic-close during drawdowns is a strategy that’s wrong for you. Match strategies to your actual stress tolerance, not to theoretical optimum.