Adilbai committed on
Commit ee338ff · verified · 1 Parent(s): 35c73ef

Add model card

---
library_name: stable-baselines3
tags:
- reinforcement-learning
- trading
- finance
- stock-market
- ppo
- quantitative-finance
- algorithmic-trading
license: mit
---

# 🚀 Stock Trading RL Agent - MyTestExp

A reinforcement learning agent for stock trading, trained with the **PPO** algorithm.

## 🎯 Model Overview

This model uses reinforcement learning to make trading decisions (Hold, Buy, Sell) based on technical indicators and market data.

### 🔧 Key Features
- **Algorithm**: PPO
- **Policy**: Multi-Layer Perceptron (MLP)
- **Action Space**: Continuous (action type + position size)
- **Observation Space**: Technical indicators + portfolio state
- **Training Steps**: 500,000
- **Stocks Trained On**: 5

## 📈 Training Configuration

### Data Configuration
```json
{
  "tickers": [
    "AAPL",
    "MSFT",
    "GOOGL",
    "AMZN",
    "TSLA"
  ],
  "period": "5y",
  "interval": "1d",
  "use_sp500": false
}
```
### Environment Configuration
```json
{
  "initial_balance": 10000,
  "transaction_cost": 0.001,
  "max_position_size": 1.0,
  "lookback_window": 60,
  "reward_type": "return"
}
```
### Training Configuration
```json
{
  "algorithm": "PPO",
  "total_timesteps": 500000,
  "learning_rate": 0.0003,
  "batch_size": 64,
  "n_epochs": 10,
  "gamma": 0.99,
  "eval_freq": 1000,
  "n_eval_episodes": 5,
  "save_freq": 10000,
  "seed": 42
}
```

## 📊 Evaluation Results

| Stock | Total Return | Sharpe Ratio | Max Drawdown | Win Rate |
|-------|--------------|--------------|--------------|----------|
| AMZN  | 162.87%      | 0.74         | 187.11%      | 6.72%    |
| MSFT  | 7243.44%     | 0.56         | 164.60%      | 52.11%   |
| GOOGL | 0.00%        | 0.00         | 0.00%        | 0.00%    |
| TSLA  | 109.91%      | -0.22        | 145.29%      | 44.76%   |
| AAPL  | -74.02%      | 0.65         | 157.07%      | 7.01%    |

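For context on how columns like these are typically derived, here is a minimal sketch computing Sharpe ratio, max drawdown, and win rate from a hypothetical daily return series (risk-free rate assumed zero, 252 trading days per year; this is not the card's actual evaluation code):

```python
import numpy as np

# Hypothetical daily portfolio returns from a backtest
returns = np.array([0.01, -0.02, 0.015, 0.03, -0.01, 0.005])

# Annualized Sharpe ratio (risk-free rate assumed zero)
sharpe = returns.mean() / returns.std() * np.sqrt(252)

# Max drawdown measured on the cumulative equity curve
equity = np.cumprod(1 + returns)
peak = np.maximum.accumulate(equity)
max_drawdown = ((peak - equity) / peak).max()

# Win rate: fraction of profitable steps
win_rate = (returns > 0).mean()
```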
## 🚀 Usage

### Installation
```bash
pip install stable-baselines3 yfinance pandas numpy
```

### Loading the Model
```python
from stable_baselines3 import PPO

# Load the trained model
model = PPO.load("best_model.zip")
```
### Loading the Data Scaler
```python
import pickle

# Load the fitted preprocessing scaler
with open("scaler.pkl", "rb") as f:
    scaler = pickle.load(f)
```
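The scaler's exact type depends on the training pipeline; assuming a standard zero-mean, unit-variance scaler over the per-step feature vector, its effect can be sketched in plain NumPy:

```python
import numpy as np

# Hypothetical raw feature rows (time steps x indicators)
raw = np.array([[101.2, 55.0,  0.8],
                [ 99.7, 48.3, -0.2],
                [100.5, 51.1,  0.3]])

# Equivalent of a StandardScaler fitted on `raw`
# (in practice, call scaler.transform(raw) with the loaded scaler.pkl)
obs = (raw - raw.mean(axis=0)) / raw.std(axis=0)
```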
### Making Predictions
```python
import numpy as np

# Prepare your observation (must match the training feature format)
obs = your_observation_data  # shape: (n_features,)

# Get an action from the model
action, _states = model.predict(obs, deterministic=True)

# action[0] = action type (0: Hold, 1: Buy, 2: Sell)
# action[1] = position size (0-1)
```
## 📊 Model Performance

The model has been evaluated on multiple stocks with the following key metrics:

- Risk-adjusted returns (Sharpe ratio)
- Maximum drawdown analysis
- Win rate performance
- Transaction cost considerations

## 🛠️ Technical Details

### State Space
The agent observes:

- Technical indicators (SMA, EMA, RSI, MACD, Bollinger Bands)
- Price and volume data
- Portfolio state (balance, position, net worth)
- Historical sequences (lookback window)
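The exact indicator set and window lengths come from the training pipeline and are not specified in this card; as an illustration, a few of the listed indicators can be computed with pandas (hypothetical prices and windows):

```python
import pandas as pd

# Hypothetical closing prices
close = pd.Series([100.0, 101.5, 99.8, 102.2, 103.0,
                   101.1, 104.5, 105.2, 104.0, 106.3])

sma = close.rolling(window=5).mean()           # simple moving average
ema = close.ewm(span=5, adjust=False).mean()   # exponential moving average

# RSI in a simple rolling-mean form
delta = close.diff()
gain = delta.clip(lower=0).rolling(window=5).mean()
loss = (-delta.clip(upper=0)).rolling(window=5).mean()
rsi = 100 - 100 / (1 + gain / loss)
```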
### Action Space
- **Action Type**: Discrete choice (Hold=0, Buy=1, Sell=2)
- **Position Size**: Continuous value (0-1) representing the fraction of available capital
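The environment's exact decoding of the raw policy output is not shown in this card; a plausible sketch (hypothetical `decode_action` helper) of turning the two continuous outputs into a trade:

```python
import numpy as np

def decode_action(action):
    """Illustrative decoding of the policy's continuous output (hypothetical helper)."""
    action_type = int(np.clip(np.round(action[0]), 0, 2))  # 0=Hold, 1=Buy, 2=Sell
    position_size = float(np.clip(action[1], 0.0, 1.0))    # fraction of available capital
    return action_type, position_size

decode_action(np.array([1.3, 0.75]))  # a Buy with 75% of available capital
```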
### Reward Function
- **Type**: return
- **Considerations**: Transaction costs, risk-adjusted returns
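The environment's actual reward code is not included in this card; a minimal sketch of a return-based reward that nets out transaction costs (hypothetical `step_reward` helper, using the configured `transaction_cost` of 0.001):

```python
def step_reward(prev_worth, new_worth, traded_value, cost_rate=0.001):
    """Illustrative return-based reward: fractional change in net worth
    after charging a fee on the traded notional (hypothetical helper)."""
    cost = traded_value * cost_rate
    return (new_worth - cost - prev_worth) / prev_worth

# e.g. worth grows 10000 -> 10100 after trading 5000 of notional
step_reward(10000.0, 10100.0, 5000.0)
```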
## 📝 Training Details
- **Environment**: Enhanced Stock Trading Environment
- **Evaluation Frequency**: Every 1,000 steps
- **Model Checkpoints**: Every 10,000 steps
- **Random Seed**: 42 (for reproducibility)
## 📋 Files in this Repository
- `best_model.zip`: Best-performing model during training
- `final_model.zip`: Final model after training completion
- `scaler.pkl`: Data preprocessing scaler
- `config.json`: Complete training configuration
- `evaluation_results.json`: Detailed evaluation metrics
- `training_summary.json`: Training statistics and progress
## ⚠️ Disclaimer
This model is for educational and research purposes only. Past performance does not guarantee future results. Always do your own research and consider consulting with a financial advisor before making investment decisions.

## 🤝 Contributing
Contributions are welcome! Please feel free to submit issues and pull requests.

## 📄 License
This project is licensed under the MIT License.

Generated on: 2025-07-04 17:14:46 UTC
Training completed: 2025-07-04