---
library_name: stable-baselines3
tags:
- reinforcement-learning
- trading
- finance
- stock-market
- ppo
- quantitative-finance
- algorithmic-trading
license: mit
---

# Stock Trading RL Agent - MyTestExp

A reinforcement learning agent trained for stock trading using the **PPO** algorithm.

## Model Overview

This model uses reinforcement learning to make trading decisions (Hold, Buy, Sell) based on technical indicators and market data.

### Key Features
- **Algorithm**: PPO
- **Policy**: Multi-Layer Perceptron (MLP)
- **Action Space**: Continuous (action type + position size)
- **Observation Space**: Technical indicators + portfolio state
- **Training Steps**: 500,000
- **Stocks Trained On**: 5

## Training Configuration

### Data Configuration
```json
{
  "tickers": [
    "AAPL",
    "MSFT",
    "GOOGL",
    "AMZN",
    "TSLA"
  ],
  "period": "5y",
  "interval": "1d",
  "use_sp500": false
}
```
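
The download script itself is not part of this repository; a minimal sketch of how this configuration could map onto `yfinance` (the loader actually used for training may differ):

```python
import yfinance as yf

# Sketch only: fetch 5 years of daily bars per the data config above;
# the actual training loader may differ.
tickers = ["AAPL", "MSFT", "GOOGL", "AMZN", "TSLA"]
data = {t: yf.download(t, period="5y", interval="1d") for t in tickers}
```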

### Environment Configuration
```json
{
  "initial_balance": 10000,
  "transaction_cost": 0.001,
  "max_position_size": 1.0,
  "lookback_window": 60,
  "reward_type": "return"
}
```

### Training Configuration
```json
{
  "algorithm": "PPO",
  "total_timesteps": 500000,
  "learning_rate": 0.0003,
  "batch_size": 64,
  "n_epochs": 10,
  "gamma": 0.99,
  "eval_freq": 1000,
  "n_eval_episodes": 5,
  "save_freq": 10000,
  "seed": 42
}
```
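
For reference, a sketch of how these hyperparameters map onto stable-baselines3. Here `env` and `eval_env` stand in for instances of the trading environment, which ships with the training code rather than this repository:

```python
from stable_baselines3 import PPO
from stable_baselines3.common.callbacks import CheckpointCallback, EvalCallback

# `env` and `eval_env` are placeholders for the trading environment.
model = PPO(
    "MlpPolicy",
    env,
    learning_rate=0.0003,
    batch_size=64,
    n_epochs=10,
    gamma=0.99,
    seed=42,
)

callbacks = [
    EvalCallback(eval_env, eval_freq=1000, n_eval_episodes=5),        # evaluate every 1,000 steps
    CheckpointCallback(save_freq=10000, save_path="./checkpoints/"),  # checkpoint every 10,000 steps
]
model.learn(total_timesteps=500000, callback=callbacks)
```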

## Evaluation Results

| Stock | Total Return | Sharpe Ratio | Max Drawdown | Win Rate |
|-------|--------------|--------------|--------------|----------|
| AMZN  | 162.87%      | 0.74         | 187.11%      | 6.72%    |
| MSFT  | 7243.44%     | 0.56         | 164.60%      | 52.11%   |
| GOOGL | 0.00%        | 0.00         | 0.00%        | 0.00%    |
| TSLA  | 109.91%      | -0.22        | 145.29%      | 44.76%   |
| AAPL  | -74.02%      | 0.65         | 157.07%      | 7.01%    |

## Usage

### Installation
```bash
pip install stable-baselines3 yfinance pandas numpy
```

### Loading the Model
```python
from stable_baselines3 import PPO

# Load the trained model
model = PPO.load("best_model.zip")
```

### Loading the Data Scaler
```python
import pickle

# Load the preprocessing scaler used during training
with open("scaler.pkl", "rb") as f:
    scaler = pickle.load(f)
```

### Making Predictions
```python
import numpy as np

# Prepare your observation (it must match the training format)
obs = your_observation_data  # shape: (n_features,)

# Get an action from the model
action, _states = model.predict(obs, deterministic=True)

# action[0] = action type (0: Hold, 1: Buy, 2: Sell)
# action[1] = position size (0-1)
```
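
Because the policy outputs a continuous vector, the raw action typically needs decoding before it can be executed. A minimal sketch, assuming the first component rounds to the nearest action type and the second clips to [0, 1]; the environment's own `step()` may decode differently:

```python
import numpy as np

# Assumed decoding; the trading environment may implement this differently.
action_type = int(np.clip(np.round(action[0]), 0, 2))  # 0: Hold, 1: Buy, 2: Sell
position_size = float(np.clip(action[1], 0.0, 1.0))    # fraction of capital to commit
```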

## Model Performance

The model has been evaluated on multiple stocks with the following key metrics:

- Risk-adjusted returns (Sharpe ratio)
- Maximum drawdown analysis
- Win rate performance
- Transaction cost considerations

## Technical Details

### State Space
The agent observes:

- Technical indicators (SMA, EMA, RSI, MACD, Bollinger Bands)
- Price and volume data
- Portfolio state (balance, position, net worth)
- Historical sequences (lookback window)
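
These indicators can be reproduced from daily bars with pandas. A sketch with illustrative window lengths (the exact values used in training are not documented here):

```python
import pandas as pd

def add_indicators(df: pd.DataFrame) -> pd.DataFrame:
    """Illustrative indicator set; window lengths are assumptions."""
    close = df["Close"]
    df["sma_20"] = close.rolling(20).mean()
    df["ema_20"] = close.ewm(span=20, adjust=False).mean()

    # RSI (14-period, simple rolling-average variant)
    delta = close.diff()
    gain = delta.clip(lower=0).rolling(14).mean()
    loss = (-delta.clip(upper=0)).rolling(14).mean()
    df["rsi_14"] = 100 - 100 / (1 + gain / loss)

    # MACD (12/26 EMAs) with a 9-period signal line
    ema_fast = close.ewm(span=12, adjust=False).mean()
    ema_slow = close.ewm(span=26, adjust=False).mean()
    df["macd"] = ema_fast - ema_slow
    df["macd_signal"] = df["macd"].ewm(span=9, adjust=False).mean()

    # Bollinger Bands (20-period, 2 standard deviations)
    std_20 = close.rolling(20).std()
    df["bb_upper"] = df["sma_20"] + 2 * std_20
    df["bb_lower"] = df["sma_20"] - 2 * std_20
    return df
```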

### Action Space
- **Action Type**: Discrete choice (Hold=0, Buy=1, Sell=2)
- **Position Size**: Continuous value (0-1) representing the fraction of available capital

### Reward Function
- **Type**: return
- **Considerations**: Transaction costs, risk-adjusted returns
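
The exact formula lives in the environment code, which is not included here; a plausible shape for a `return`-type reward net of transaction costs:

```python
def step_reward(net_worth: float, prev_net_worth: float, traded_value: float,
                transaction_cost: float = 0.001) -> float:
    """Plausible 'return'-type reward net of transaction costs;
    the environment's actual formula may differ."""
    step_return = (net_worth - prev_net_worth) / prev_net_worth
    cost = transaction_cost * traded_value / prev_net_worth
    return step_return - cost
```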

## Training Details
- **Environment**: Enhanced Stock Trading Environment
- **Evaluation Frequency**: Every 1,000 steps
- **Model Checkpoints**: Every 10,000 steps
- **Random Seed**: 42 (for reproducibility)

## Files in this Repository
- `best_model.zip`: Best-performing model found during training
- `final_model.zip`: Final model after training completion
- `scaler.pkl`: Data preprocessing scaler
- `config.json`: Complete training configuration
- `evaluation_results.json`: Detailed evaluation metrics
- `training_summary.json`: Training statistics and progress

## Disclaimer
This model is for educational and research purposes only. Past performance does not guarantee future results. Always do your own research and consider consulting with a financial advisor before making investment decisions.

## Contributing
Contributions are welcome! Please feel free to submit issues and pull requests.

## License
This project is licensed under the MIT License.

Generated on: 2025-07-04 17:14:46 UTC
Training completed: 2025-07-04