Architecture
CNN + LSTM
Proximal Policy Optimisation
Symbol
XAUUSDT
Gold / USD - 1h bars
Best Return
+7.18%
In-sample · MLP baseline · 1M steps
Status
Training
v3 - reward reshaping run
Latest Equity Curve
Training History
| Run | Steps | Return | Sharpe | Max DD | Notes |
|---|---|---|---|---|---|
| MLP baseline | 1M | +7.18% | 0.443 | -6.46% | Close+Open only |
| CNN+LSTM v1 | 1M | +80.15% | 0.242 | -60.15% | Overlevered |
| CNN+LSTM v2 | 3M | +3.03% | 0.181 | -2.88% | Too conservative |
| CNN+LSTM v3 | 3M | - | - | - | In progress |
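
The Return, Sharpe and Max DD columns can all be derived from an equity curve. The sketch below shows one common way to compute them. It is not necessarily how the figures in the table were produced: the function names, the per-bar simple returns, the zero risk-free rate and the `periods_per_year` scaling are all assumptions made for illustration.

```python
import math


def total_return(equity):
    """Fractional return from the first to the last equity value."""
    return equity[-1] / equity[0] - 1.0


def sharpe_ratio(equity, periods_per_year=1.0):
    """Mean over sample std of per-step simple returns, risk-free rate assumed 0.

    periods_per_year is a hypothetical scaling knob; 1.0 leaves the ratio
    unannualised.
    """
    rets = [b / a - 1.0 for a, b in zip(equity, equity[1:])]
    mean = sum(rets) / len(rets)
    var = sum((r - mean) ** 2 for r in rets) / (len(rets) - 1)
    if var == 0.0:
        return 0.0
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)


def max_drawdown(equity):
    """Largest peak-to-trough loss, returned as a negative fraction."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = min(worst, value / peak - 1.0)
    return worst


# Toy equity curve, not taken from any run in the table.
curve = [100.0, 110.0, 99.0, 120.0]
print(f"return={total_return(curve):+.2%}  max_dd={max_drawdown(curve):+.2%}")
```

Reading the table with these definitions in mind, a run such as CNN+LSTM v1 can show a high return and still score a low Sharpe and a deep drawdown, because the ratio penalises the volatility of the path rather than its end point.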
Model Architecture
| Component | Detail |
|---|---|
| Features | Close-norm, Ret(1/5/20), Vol-norm, RSI-14 → 6 channels × 40 bars |
| CNN branch | Conv1d(2×64, k=3) → GlobalAvgPool → 64 dims |
| LSTM branch | 2-layer LSTM hidden=64 → last hidden → 64 dims |
| Account state | Balance / Equity / Margin → 32 dims |
| Orders | Entry price, volume, profit → 32 dims |
| Policy head | MLP [256 → 128] → 3 actions |
| Total params | ~311k |
| Training device | CPU · 8 parallel workers |
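
To make the table's dimensions concrete, here is a minimal PyTorch sketch of how the branches could be wired together. It is an illustration under assumptions, not the project's actual code: the class name `CnnLstmPolicy`, the ReLU activations, the `padding=1` choice, the single-linear-layer embeddings for account state and orders, and the concatenation into a 192-dim vector are all guesses. The PPO value head is omitted, so the parameter count of this sketch is not expected to match the ~311k in the table.

```python
import torch
import torch.nn as nn


class CnnLstmPolicy(nn.Module):
    """Hypothetical two-branch market encoder with account/order embeddings."""

    def __init__(self, n_channels=6, n_account=3, n_order=3, n_actions=3):
        super().__init__()
        # CNN branch: two Conv1d layers, 64 filters each, kernel size 3.
        self.cnn = nn.Sequential(
            nn.Conv1d(n_channels, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(64, 64, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        # LSTM branch: 2 layers, hidden size 64, batch-first input.
        self.lstm = nn.LSTM(n_channels, 64, num_layers=2, batch_first=True)
        # Assumed embeddings: one linear layer each, 3 inputs -> 32 dims.
        self.account = nn.Sequential(nn.Linear(n_account, 32), nn.ReLU())
        self.orders = nn.Sequential(nn.Linear(n_order, 32), nn.ReLU())
        # Policy head: 192 -> 256 -> 128 -> action logits.
        self.head = nn.Sequential(
            nn.Linear(64 + 64 + 32 + 32, 256),
            nn.ReLU(),
            nn.Linear(256, 128),
            nn.ReLU(),
            nn.Linear(128, n_actions),
        )

    def forward(self, bars, account, orders):
        # bars: (batch, channels, window), e.g. (B, 6, 40)
        cnn_feat = self.cnn(bars).mean(dim=-1)         # global average pool -> (B, 64)
        _, (h_n, _) = self.lstm(bars.transpose(1, 2))  # LSTM takes (B, window, channels)
        lstm_feat = h_n[-1]                            # top layer's final hidden state -> (B, 64)
        x = torch.cat(
            [cnn_feat, lstm_feat, self.account(account), self.orders(orders)],
            dim=1,
        )
        return self.head(x)                            # action logits -> (B, 3)


model = CnnLstmPolicy()
logits = model(torch.randn(4, 6, 40), torch.randn(4, 3), torch.randn(4, 3))
print(logits.shape)
```

Feeding the same 6 × 40 window to both branches lets the convolution pick up short local patterns while the LSTM summarises the whole sequence. The two 64-dim summaries are then joined with the account and order embeddings before the head.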