data['ML_Signal'] = 0
data.loc[X_test.index, 'ML_Signal'] = y_pred  # Only trade on predictions
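To see how such a signal column could turn into strategy returns, here is a minimal sketch on made-up numbers. The `Strategy_Returns` column, the three-row "test set", and the one-row shift are assumptions for illustration: the shift reflects the idea that a prediction made from day t's features can only be traded on day t+1.

```python
import pandas as pd

# Synthetic daily returns standing in for data['Returns'] (made-up numbers).
data = pd.DataFrame(
    {"Returns": [0.01, -0.02, 0.015, 0.005, -0.01, 0.02]},
    index=pd.date_range("2024-01-01", periods=6, freq="D"),
)

# Pretend the last three rows are the test set and the model predicted 1, 0, 1.
test_index = data.index[-3:]
y_pred = [1, 0, 1]

data["ML_Signal"] = 0
data.loc[test_index, "ML_Signal"] = y_pred  # rows outside the test set stay flat

# Assumption: the signal on row t earns the return recorded on row t+1,
# so shift the signal forward by one row before multiplying.
data["Strategy_Returns"] = data["ML_Signal"].shift(1).fillna(0) * data["Returns"]

print(data)
```

Without the shift, the strategy would earn the return of the same day whose data produced the prediction, which is a common source of look-ahead bias.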
Without clean data, your algorithm is garbage-in, garbage-out.
Using the ta library:
import ta
data['rsi'] = ta.momentum.RSIIndicator(data['Close'], window=14).rsi()
data['macd'] = ta.trend.MACD(data['Close']).macd()
data['bb_high'] = ta.volatility.BollingerBands(data['Close']).bollinger_hband()
data['bb_low'] = ta.volatility.BollingerBands(data['Close']).bollinger_lband()
data['volume_ratio'] = data['Volume'] / data['Volume'].rolling(20).mean()
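If you want to know what an indicator like RSI actually computes, a rough pandas-only sketch is below. It follows the common Wilder-style formulation (exponentially smoothed average gains and losses); the function name `rsi_sketch` is made up, and whether the values match the ta library's output exactly is not verified here.

```python
import pandas as pd

def rsi_sketch(close: pd.Series, window: int = 14) -> pd.Series:
    """Wilder-style RSI sketch; illustrative, not the ta library's code."""
    delta = close.diff()
    gain = delta.clip(lower=0)       # positive moves, zero otherwise
    loss = (-delta).clip(lower=0)    # size of negative moves, zero otherwise
    # Wilder smoothing is an exponential average with alpha = 1 / window.
    avg_gain = gain.ewm(alpha=1 / window, min_periods=window, adjust=False).mean()
    avg_loss = loss.ewm(alpha=1 / window, min_periods=window, adjust=False).mean()
    rs = avg_gain / avg_loss
    return 100 - 100 / (1 + rs)

# Sanity checks on synthetic prices: a steadily rising series pins RSI at 100,
# a steadily falling one pins it at 0.
rising = pd.Series(range(1, 31), dtype=float)
falling = pd.Series(range(30, 0, -1), dtype=float)
rsi_up = rsi_sketch(rising)
rsi_down = rsi_sketch(falling)
print(rsi_up.iloc[-1], rsi_down.iloc[-1])
```

The first rows are NaN because the smoothed averages need `window` observations before they produce a value, which matters when you later drop incomplete rows.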
Add target variable (future return):
data['target'] = data['Close'].shift(-1) / data['Close'] - 1 # next day return
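One side effect worth seeing on a toy example: `shift(-1)` leaves the last row without a target, and rolling features leave the first rows without values. The sketch below uses made-up prices and a hypothetical `sma_2` feature, drops the incomplete rows, and then splits by time rather than at random (the 80/20 ratio is an assumption, not a recommendation).

```python
import pandas as pd

# Made-up closing prices for illustration.
data = pd.DataFrame(
    {"Close": [100.0, 102.0, 101.0, 104.0, 103.0]},
    index=pd.date_range("2024-01-01", periods=5, freq="D"),
)

data["sma_2"] = data["Close"].rolling(2).mean()               # NaN in the first row
data["target"] = data["Close"].shift(-1) / data["Close"] - 1  # NaN in the last row

# Keep only rows where both the feature and the target are known.
clean = data.dropna()

# Chronological split: earlier rows train, later rows test.
split = int(len(clean) * 0.8)
train, test = clean.iloc[:split], clean.iloc[split:]

print(clean)
print(len(train), len(test))
```

Splitting by position keeps every training row earlier in time than every test row, so the model is never fit on data from the period it is evaluated on.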
data = yf.download('AAPL', start='2020-01-01', end='2024-01-01')
data['Returns'] = data['Close'].pct_change()
print(data.head())
Alternatives: Alpha Vantage (free tier), Polygon.io (professional), Binance API (crypto).
Instead of predicting price, teach an agent to maximize the equity curve. Using Stable-Baselines3:
import gym  # Stable-Baselines3 2.x expects gymnasium instead: import gymnasium as gym
from stable_baselines3 import PPO

class TradingEnv(gym.Env):
    # Define state (portfolio, prices), actions (buy/sell/hold), rewards (PnL)
    pass
env = TradingEnv(data)
model = PPO('MlpPolicy', env, verbose=1)
model.learn(total_timesteps=10000)
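To make the state/action/reward idea concrete, here is a dependency-free sketch of what such an environment might track. `SimpleTradingEnv` is a made-up name and deliberately does not subclass `gym.Env`: to train with PPO you would still need to subclass the real base class and declare `observation_space` and `action_space`. It also ignores transaction costs and position sizing, and it returns the older four-value `step` result (Gymnasium returns five values).

```python
class SimpleTradingEnv:
    """Illustrative only: one asset, position is either flat (0) or long one unit (1)."""

    HOLD, BUY, SELL = 0, 1, 2

    def __init__(self, prices):
        self.prices = list(prices)
        self.reset()

    def reset(self):
        self.t = 0
        self.position = 0
        return self._obs()

    def _obs(self):
        # State: current price and current position.
        return (self.prices[self.t], self.position)

    def step(self, action):
        if action == self.BUY:
            self.position = 1
        elif action == self.SELL:
            self.position = 0
        # Reward: PnL of the position held over the next price move.
        reward = self.position * (self.prices[self.t + 1] - self.prices[self.t])
        self.t += 1
        done = self.t == len(self.prices) - 1
        return self._obs(), reward, done, {}


env = SimpleTradingEnv([100, 101, 99, 102])
obs = env.reset()
total_pnl, done = 0, False
for action in [env.BUY, env.SELL, env.BUY]:
    obs, reward, done, info = env.step(action)
    total_pnl += reward

print(total_pnl, done)
```

Here the scripted actions stay long during both up moves and flat during the drop, so the cumulative reward is the sum of the two gains. An RL agent would have to learn that pattern from the rewards alone.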
Moving from backtest to live trading requires an execution engine that connects to a broker via API (e.g., Alpaca, Interactive Brokers, Binance). Key components: